Apify
Review
Audited by ClawScan on May 18, 2026.
Overview
This is a coherent Apify scraping plugin, but it deserves Review because it can broaden agent tool permissions, exposes untrusted Actor documentation to the agent, and has avoidable API-key boundary weaknesses.
Install only if you are comfortable giving the agent Apify account access and possible scraping/billing authority. Before setup, avoid enabling group:plugins unless you really want all plugin tools allowed, keep your Apify token out of terminal logs, and review or restrict Actor choices because third-party Actor documentation and cached prior runs can influence later agent behavior.
Findings (5)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Running the setup wizard may grant the agent access to more plugin tools than the user intended.
The Apify setup path always treats all tools as selected, then writes group:plugins into tools.alsoAllow instead of only allowing the apify tool. That can broaden agent access to all plugin tools in the environment, not just this integration.
const selectedTools = ALL_TOOLS.map((t) => t.name);
const allSelected = true;
...
const toolsToAdd = allSelected ? ["group:plugins"] : selectedTools;
for (const t of toolsToAdd) {
  if (!cfg.tools.alsoAllow.includes(t)) {
    cfg.tools.alsoAllow.push(t);
  }
}
Change setup to add only "apify" by default, or clearly warn before adding "group:plugins" and explain that it can allow other plugin tools.
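A minimal sketch of the narrower default, assuming the config shape implied by the snippet above. The `AgentConfig` interface, the `allowTools` helper, and its `allowWholePluginGroup` flag are hypothetical names introduced for illustration and are not part of the reviewed plugin.

```typescript
// Hypothetical config shape: only tools.alsoAllow appears in the reviewed snippet.
interface AgentConfig {
  tools: { alsoAllow: string[] };
}

// Assumed tool identifier, taken from the recommendation ("apify").
const APIFY_TOOL = "apify";

/**
 * Adds individual tool names to the allow-list.
 * The group wildcard is written only when the caller opts in explicitly,
 * for example after the wizard has shown a warning and the user confirmed.
 */
function allowTools(
  cfg: AgentConfig,
  toolNames: string[] = [APIFY_TOOL],
  allowWholePluginGroup = false,
): AgentConfig {
  const toAdd = allowWholePluginGroup ? ["group:plugins"] : toolNames;
  for (const t of toAdd) {
    // Skip entries that are already present so repeated setup runs stay idempotent.
    if (!cfg.tools.alsoAllow.includes(t)) {
      cfg.tools.alsoAllow.push(t);
    }
  }
  return cfg;
}
```

With this shape, a default setup run touches only the Apify entry, and widening access to every plugin tool becomes a deliberate, visible choice.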
A malicious or compromised Actor listing could influence the agent’s next steps, such as what Actor to run or what inputs to provide.
Actor README and Store descriptions are external marketplace content returned directly to the agent. Unlike collected dataset items, these fields are not wrapped with untrusted-content markers, so instructions embedded in Actor docs could be over-trusted by the agent.
const readme = actorDef?.readme ?? build.readme ?? null;
...
readme: readme ? String(readme).slice(0, 3000) : null,
...
if (actor.description) lines.push(actor.description.slice(0, 200));
Wrap Actor READMEs/descriptions with the same untrusted-content boundary used for scrape results and instruct agents not to treat Actor documentation as commands.
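The wrapping step could look roughly like the sketch below. The marker strings and the `wrapUntrusted` helper are assumptions made for illustration; the review does not show the boundary format the plugin actually uses for scrape results, so a real fix should reuse that existing format instead.

```typescript
// Hypothetical marker strings; the plugin's real boundary format is not shown in the review.
const UNTRUSTED_OPEN = "<<<UNTRUSTED_EXTERNAL_CONTENT>>>";
const UNTRUSTED_CLOSE = "<<<END_UNTRUSTED_EXTERNAL_CONTENT>>>";

/**
 * Wraps marketplace text (README, Store description) in an untrusted-content
 * boundary and truncates it, mirroring the slice() limits in the snippet above.
 */
function wrapUntrusted(
  text: string | null | undefined,
  maxLen: number,
): string | null {
  if (!text) return null;
  // Neutralize marker look-alikes so the content cannot close the boundary early.
  // Substituting brackets (rather than deleting) cannot re-create a marker.
  const cleaned = String(text)
    .replace(/<<</g, "[[[")
    .replace(/>>>/g, "]]]");
  return `${UNTRUSTED_OPEN}\n${cleaned.slice(0, maxLen)}\n${UNTRUSTED_CLOSE}`;
}

// Usage, assuming the same length limits as the reviewed code:
//   readme: wrapUntrusted(readme, 3000)
//   description: wrapUntrusted(actor.description, 200)
```

The markers only help if the agent's instructions also state that anything inside the boundary is data, not commands.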
If the base URL is mis-set or poisoned, the Apify token could be sent to an unintended host.
The base URL validation uses startsWith, so a lookalike host such as https://api.apify.com.evil.example would pass the check. Because the API token is then supplied to the client for that baseUrl, a bad config could route the token outside Apify.
export const ALLOWED_APIFY_BASE_URL_PREFIX = "https://api.apify.com";
...
if (!url.startsWith(ALLOWED_APIFY_BASE_URL_PREFIX)) {
  throw new Error(...);
}
...
return new ApifyClient({
  token: apiKey,
  baseUrl,
Parse the URL and require the exact origin https://api.apify.com, or remove configurable baseUrl unless it is truly needed.
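An origin-based check might be sketched as follows, using the standard `URL` class available in Node.js and browsers. The constant `ALLOWED_APIFY_ORIGIN` and the function `assertApifyBaseUrl` are hypothetical names, not identifiers from the plugin.

```typescript
// Assumed constant name; the value is the origin named in the recommendation.
const ALLOWED_APIFY_ORIGIN = "https://api.apify.com";

/**
 * Returns the base URL unchanged when it points at the exact Apify API origin,
 * and throws otherwise. Comparing the parsed origin (scheme + host + port)
 * rejects lookalike hosts that a startsWith() prefix check would accept.
 */
function assertApifyBaseUrl(raw: string): string {
  let parsed: URL;
  try {
    parsed = new URL(raw);
  } catch {
    throw new Error(`Invalid Apify base URL: ${raw}`);
  }
  if (parsed.origin !== ALLOWED_APIFY_ORIGIN) {
    throw new Error(`Apify base URL must use origin ${ALLOWED_APIFY_ORIGIN}`);
  }
  // Embedded credentials do not change the origin, so reject them separately.
  if (parsed.username !== "" || parsed.password !== "") {
    throw new Error("Apify base URL must not contain credentials");
  }
  return raw;
}
```

Under this check, `https://api.apify.com/v2` is accepted, while `https://api.apify.com.evil.example` and `https://api.apify.com@evil.example` are both rejected because their parsed origins differ from the allowed one.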
Anyone who can see the terminal output or logs from setup might see the Apify API token.
The manual setup fallback prints the full API key into the terminal config snippet. This is local and user-directed, but it can leave the secret in terminal scrollback or captured logs.
console.log(` apiKey: "${apiKey}"`);
Prefer direct config writing or environment variables, and consider changing the CLI to mask the key or ask the user to paste it manually into their config.
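If the CLI keeps a printed fallback, masking could be sketched like this. `maskSecret` and `renderConfigSnippet` are hypothetical helpers, and the token in the usage note is a made-up placeholder rather than a real key format.

```typescript
/**
 * Returns a fixed-width mask followed by the last few characters of the secret.
 * Short secrets are masked entirely so the tail cannot reveal most of the value,
 * and the fixed mask width avoids leaking the secret's length.
 */
function maskSecret(secret: string, visibleTail = 4): string {
  const MASK = "********";
  if (secret.length < visibleTail * 3) return MASK;
  return MASK + secret.slice(-visibleTail);
}

/** Builds the manual-setup snippet without ever embedding the full key. */
function renderConfigSnippet(apiKey: string): string {
  return [
    `  apiKey: "${maskSecret(apiKey)}"`,
    "  // Replace the masked value with your token when editing the config file.",
  ].join("\n");
}
```

For a placeholder such as `apify_api_EXAMPLE1234567890`, `maskSecret` yields `********7890`, which is enough for the user to confirm which token was entered without leaving the secret in scrollback.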
Recent scrape targets, inputs, run IDs, or result summaries may appear in later tool responses even if the user did not ask for them again.
The repository documentation describes automatic reuse of prior run inputs/results in later responses. Although bounded by TTL and size, this is persistent cross-call context that is not clearly disclosed in SKILL.md and could leak prior scrape targets or contaminate later tasks.
Every response auto-includes a `previousRuns` field with a compact summary of cached scrape results... At collect time, the original run input is fetched... and stored in the cache payload... Last 10 entries shown.
Document this cache in user-facing docs, provide a clear disable/clear option, and avoid auto-injecting previous run details unless explicitly requested.
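One possible shape for an opt-in cache with a clear option is sketched below. The `RunCache` class, the `buildResponse` helper, and the `includePreviousRuns` flag are assumptions for illustration; only the `previousRuns` field name, the TTL bound, and the 10-entry limit come from the quoted documentation.

```typescript
interface CachedRun {
  runId: string;
  summary: string;
  storedAt: number; // epoch milliseconds
}

/** Hypothetical TTL-bounded cache with an explicit clear() for users. */
class RunCache {
  private entries: CachedRun[] = [];
  private readonly ttlMs: number;
  private readonly maxEntries: number;
  private readonly now: () => number;

  constructor(ttlMs: number, maxEntries = 10, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  add(runId: string, summary: string): void {
    this.entries.push({ runId, summary, storedAt: this.now() });
  }

  /** Drops expired entries, then returns at most maxEntries of the newest ones. */
  recent(): CachedRun[] {
    const cutoff = this.now() - this.ttlMs;
    this.entries = this.entries.filter((e) => e.storedAt >= cutoff);
    return this.entries.slice(-this.maxEntries);
  }

  clear(): void {
    this.entries = [];
  }
}

/** Attaches previousRuns only when the caller explicitly asks for it. */
function buildResponse<T>(
  result: T,
  cache: RunCache,
  includePreviousRuns = false,
): { result: T; previousRuns?: CachedRun[] } {
  return includePreviousRuns
    ? { result, previousRuns: cache.recent() }
    : { result };
}
```

Defaulting the flag to `false` keeps earlier scrape targets out of unrelated responses, while `clear()` gives users a direct way to discard cached context between tasks.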
