Apify

Review

Audited by ClawScan on May 18, 2026.

Overview

This is a coherent Apify scraping plugin, but it merits a Review rating because it can broaden agent tool permissions, exposes untrusted Actor documentation to the agent, and has avoidable API-key boundary weaknesses.

Install only if you are comfortable giving the agent Apify account access and possible scraping/billing authority. Before setup, avoid enabling group:plugins unless you really want all plugin tools allowed, keep your Apify token out of terminal logs, and review or restrict Actor choices because third-party Actor documentation and cached prior runs can influence later agent behavior.

Findings (5)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

Running the setup wizard may grant the agent access to more plugin tools than the user intended.

Why it was flagged

The Apify setup path always treats all tools as selected, then writes group:plugins into tools.alsoAllow instead of only allowing the apify tool. That can broaden agent access to all plugin tools in the environment, not just this integration.

Skill content
const selectedTools = ALL_TOOLS.map((t) => t.name);
const allSelected = true;
...
const toolsToAdd = allSelected ? ["group:plugins"] : selectedTools;
for (const t of toolsToAdd) {
  if (!cfg.tools.alsoAllow.includes(t)) {
    cfg.tools.alsoAllow.push(t);
  }
}

Recommendation

Change setup to add only "apify" by default, or clearly warn before adding "group:plugins" and explain that it can allow other plugin tools.
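
A minimal sketch of the narrower default, written against the excerpt above. The `AgentConfig` shape, the `allowApifyTool` helper, and the `allowAllPlugins` flag are hypothetical names introduced for illustration; only `tools.alsoAllow`, `"apify"`, and `"group:plugins"` come from this review.

```typescript
// Hypothetical config shape; only `tools.alsoAllow` is taken from the excerpt.
interface AgentConfig {
  tools: { alsoAllow: string[] };
}

// Tool name as given in the recommendation.
const APIFY_TOOL_NAME = "apify";

// Adds only the apify tool unless the caller explicitly opts in to the
// broader "group:plugins" entry (for example, after a confirmation prompt).
function allowApifyTool(cfg: AgentConfig, allowAllPlugins = false): AgentConfig {
  const toolsToAdd = allowAllPlugins ? ["group:plugins"] : [APIFY_TOOL_NAME];
  for (const t of toolsToAdd) {
    if (!cfg.tools.alsoAllow.includes(t)) {
      cfg.tools.alsoAllow.push(t);
    }
  }
  return cfg;
}
```

With this shape, the broad grant becomes a deliberate choice by the user rather than a side effect of running setup.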

What this means

A malicious or compromised Actor listing could influence the agent’s next steps, such as what Actor to run or what inputs to provide.

Why it was flagged

Actor README and Store descriptions are external marketplace content returned directly to the agent. Unlike collected dataset items, these fields are not wrapped with untrusted-content markers, so instructions embedded in Actor docs could be over-trusted by the agent.

Skill content
const readme = actorDef?.readme ?? build.readme ?? null;
...
readme: readme ? String(readme).slice(0, 3000) : null,
...
if (actor.description) lines.push(actor.description.slice(0, 200));

Recommendation

Wrap Actor READMEs/descriptions with the same untrusted-content boundary used for scrape results and instruct agents not to treat Actor documentation as commands.
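
One way the wrapping could look is sketched below. The marker strings and the `wrapUntrusted` helper are assumptions made for illustration; the plugin's actual boundary format for scrape results is not shown in the reviewed excerpt and should be reused if it exists.

```typescript
// Hypothetical boundary markers; substitute the markers the plugin
// already uses for scrape results.
const UNTRUSTED_OPEN = "<<<UNTRUSTED_EXTERNAL_CONTENT>>>";
const UNTRUSTED_CLOSE = "<<<END_UNTRUSTED_EXTERNAL_CONTENT>>>";

// Truncates external text and wraps it in boundary markers.
// Returns null for missing or empty input, mirroring the excerpt.
function wrapUntrusted(text: string | null | undefined, maxLen: number): string | null {
  if (!text) return null;
  let cleaned = String(text);
  let previous: string;
  // Remove embedded markers until none remain, so the content cannot
  // close the boundary early or open a fake one.
  do {
    previous = cleaned;
    cleaned = cleaned.split(UNTRUSTED_OPEN).join("").split(UNTRUSTED_CLOSE).join("");
  } while (cleaned !== previous);
  return UNTRUSTED_OPEN + "\n" + cleaned.slice(0, maxLen) + "\n" + UNTRUSTED_CLOSE;
}
```

Under this sketch, the README field in the excerpt would become `wrapUntrusted(readme, 3000)` and the description line `wrapUntrusted(actor.description, 200)`.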

What this means

If the base URL is mis-set or poisoned, the Apify token could be sent to an unintended host.

Why it was flagged

The base URL validation uses startsWith, so a lookalike host such as https://api.apify.com.evil.example would pass the check. Because the API token is then supplied to the client for that baseUrl, a bad config could route the token outside Apify.

Skill content
export const ALLOWED_APIFY_BASE_URL_PREFIX = "https://api.apify.com";
...
if (!url.startsWith(ALLOWED_APIFY_BASE_URL_PREFIX)) {
  throw new Error(...);
}
...
return new ApifyClient({
  token: apiKey,
  baseUrl,

Recommendation

Parse the URL and require the exact origin https://api.apify.com, or remove configurable baseUrl unless it is truly needed.
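
The exact-origin check can be sketched with the standard `URL` class. `ALLOWED_APIFY_ORIGIN` and `assertApifyBaseUrl` are hypothetical names, and rejecting embedded credentials is an extra precaution that the recommendation does not strictly require.

```typescript
// Exact origin to allow, as named in the recommendation.
const ALLOWED_APIFY_ORIGIN = "https://api.apify.com";

// Throws unless baseUrl parses and its origin (scheme + host + port)
// is exactly the allowed one. Returns the URL unchanged on success.
function assertApifyBaseUrl(baseUrl: string): string {
  let parsed: URL;
  try {
    parsed = new URL(baseUrl);
  } catch {
    throw new Error("Invalid Apify base URL: " + baseUrl);
  }
  // Comparing the parsed origin rejects lookalikes that a string
  // prefix check accepts, e.g. https://api.apify.com.evil.example.
  if (parsed.origin !== ALLOWED_APIFY_ORIGIN) {
    throw new Error("Apify base URL must have origin " + ALLOWED_APIFY_ORIGIN);
  }
  // Assumption: user:pass@ credentials in the URL are never expected here.
  if (parsed.username !== "" || parsed.password !== "") {
    throw new Error("Apify base URL must not contain credentials");
  }
  return baseUrl;
}
```

Because the comparison runs on the parsed origin, `https://api.apify.com/v2` passes, while `https://api.apify.com.evil.example` and `https://api.apify.com@evil.example` are both rejected.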

What this means

Anyone who can see the terminal output or logs from setup might see the Apify API token.

Why it was flagged

The manual setup fallback prints the full API key into the terminal config snippet. This is local and user-directed, but it can leave the secret in terminal scrollback or captured logs.

Skill content
console.log(`          apiKey: "${apiKey}"`);

Recommendation

Prefer direct config writing or environment variables, and consider changing the CLI to mask the key or ask the user to paste it manually into their config.
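
A small masking helper is one way to keep the full token out of scrollback. `maskSecret` is a hypothetical function, and the choice to reveal only the last four characters is an assumption, not something the plugin specifies.

```typescript
// Returns a fixed-width mask plus the last few characters, so the user
// can recognise which key is configured without the full secret being
// printed. Short secrets are masked entirely.
function maskSecret(secret: string, visible = 4): string {
  if (secret.length <= visible * 2) return "****";
  return "****" + secret.slice(-visible);
}

// Example: print a placeholder line instead of the real key.
// The sample value is made up for illustration.
const exampleKey = "apify_api_abcdef123456";
console.log(`          apiKey: "${maskSecret(exampleKey)}"  // paste your full token here`);
```

The fixed-width mask also avoids revealing the length of the token.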

What this means

Recent scrape targets, inputs, run IDs, or result summaries may appear in later tool responses even if the user did not ask for them again.

Why it was flagged

The repository documentation describes automatic reuse of prior run inputs/results in later responses. Although bounded by TTL and size, this is persistent cross-call context that is not clearly disclosed in SKILL.md and could leak prior scrape targets or contaminate later tasks.

Skill content
Every response auto-includes a `previousRuns` field with a compact summary of cached scrape results... At collect time, the original run input is fetched... and stored in the cache payload... Last 10 entries shown.

Recommendation

Document this cache in user-facing docs, provide a clear disable/clear option, and avoid auto-injecting previous run details unless explicitly requested.
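
An opt-in version of the behaviour might look like the sketch below. The `CachedRun` and `ToolResponse` shapes, the `buildResponse` helper, and the `includePreviousRuns` option are all hypothetical; only the `previousRuns` field name and the 10-entry limit come from the quoted documentation.

```typescript
// Hypothetical shapes for illustration only.
interface CachedRun {
  runId: string;
  summary: string;
}

interface ToolResponse {
  result: unknown;
  previousRuns?: CachedRun[];
}

const MAX_PREVIOUS_RUNS = 10; // matches "Last 10 entries shown"

// Attaches cached run summaries only when the caller asks for them,
// instead of injecting them into every response.
function buildResponse(
  result: unknown,
  cache: CachedRun[],
  opts: { includePreviousRuns?: boolean } = {},
): ToolResponse {
  const response: ToolResponse = { result };
  if (opts.includePreviousRuns) {
    response.previousRuns = cache.slice(-MAX_PREVIOUS_RUNS);
  }
  return response;
}
```

Defaulting the option to off keeps earlier scrape targets out of unrelated tasks while still letting the agent request the history explicitly.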