Firecrawl MCP Server: Web Scraping & Search for AI Coding Agents

An AI agent that can generate perfect code but can’t read a webpage is like a chef who can cook anything but can’t order ingredients. The Firecrawl MCP Server fixes this — it gives your AI agent the ability to scrape, search, and process any website into structured data.

With 110K+ GitHub stars, Firecrawl is one of the most widely deployed MCP servers in production. It powers everything from automated documentation scrapers to competitive research agents to personalized news digests.

What Is Firecrawl MCP?#

Firecrawl is an open-source web scraping platform specifically designed for AI/LLM consumption. Its MCP server exposes 13+ tools that let any MCP-compatible AI client:

Scrape any URL into clean, LLM-ready markdown (strips ads, nav, clutter)
Search the web with full page content returned
Crawl entire websites following link structures
Map a website’s architecture (discover all paths)
Deep research with autonomous agent to synthesize across multiple sources
Interact with pages — click buttons, fill forms, navigate

Without Firecrawl MCP:

1
$ claude "Read the docs for Library X and tell me how to install it"
2
  → Agent: "I can't access the internet directly. Can you paste the docs?"
3
  ← You: manually copy-paste from the docs site

With Firecrawl MCP:

1
$ claude "Read the docs for Library X and tell me how to install it"
2
  → Agent: *scrapes Library X's documentation site*
3
  → "npm install library-x. Then add this to your config..."

Installation#

Step 1: Get an API Key#

Sign up at firecrawl.dev
Get your API key from the dashboard: https://www.firecrawl.dev/app/api-keys

Firecrawl offers a free tier (500 pages/month) — enough for personal use and evaluation.

Step 2: Install#

Claude Code (npx method):

1
claude mcp add firecrawl \
2
  --env FIRECRAWL_API_KEY=fc-YOUR_API_KEY \
3
  -- npx -y firecrawl-mcp

Remote URL (simplest — no local process):

1
claude mcp add firecrawl \
2
  --transport http \
3
  firecrawl https://mcp.firecrawl.dev/YOUR_API_KEY/v2/mcp

Cursor (.cursor/mcp.json):

1
{
2
  "mcpServers": {
3
    "firecrawl-mcp": {
4
      "command": "npx",
5
      "args": ["-y", "firecrawl-mcp"],
6
      "env": {
7
        "FIRECRAWL_API_KEY": "fc-YOUR_API_KEY"
8
      }
9
    }
10
  }
11
}

VS Code + Copilot (.vscode/mcp.json): Same JSON structure as Cursor.

Step 3: Verify#

1
# In Claude Code
2
claude "Scrape https://example.com and summarize it"
3
# Or search
4
claude "Search the web for latest Flutter 4.0 release notes"

The 13+ Tools#

Firecrawl exposes a comprehensive set of tools:

Tool	What It Does	Use Case
firecrawl_scrape	Scrape a single URL into markdown	Quick doc lookup
firecrawl_batch_scrape	Scrape multiple URLs in one batch	Multi-page research
firecrawl_crawl	Crawl an entire website	Documentation sites
firecrawl_search	Web search with full page content	Finding latest info
firecrawl_map	Discover all URLs on a site	Site architecture
firecrawl_deep_research	Autonomous multi-source research	Competitive analysis
firecrawl_extract	Extract structured data from pages	Pricing tables, specs
firecrawl_click	Click elements on a page	Login flows, navigation
firecrawl_fill	Fill form fields	Search forms, signups
firecrawl_screenshot	Capture page screenshots	Visual verification
firecrawl_session	Manage browser sessions	Persistent auth

Real-World Workflows#

Workflow 1: Documentation Scraper for AI Context#

The most common use case — feeding documentation into your AI agent’s context:

1
$ claude "I need to use the Stripe Checkout API.
2
   Scrape their docs and create a usage example."
3

4
Agent:
5
  1. firecrawl_scrape → https://docs.stripe.com/api/checkout/sessions
6
  2. firecrawl_scrape → https://docs.stripe.com/payments/checkout
7
  3. firecrawl_scrape → https://docs.stripe.com/api/checkout/errors
8
  4. Synthesizes: "Here's how Stripe Checkout works:
9
     - Create a session via /v1/checkout/sessions
10
     - Required params: line_items, mode, success_url, cancel_url
11
     - Example code in Node.js: ..."

Save this as a Skill for repeat use:

1
# /fetch-docs Skill
2

3
When asked about a library or API:
4
1. Firecrawl_scrape the official docs homepage
5
2. Firecrawl_map to discover all doc paths
6
3. Crawl the relevant section
7
4. Return a structured summary with code examples

Workflow 2: Competitive Research#

1
$ claude "Research three alternatives to Vercel for Next.js hosting."
2

3
Agent:
4
  1. firecrawl_search → "Next.js hosting alternatives 2026"
5
  2. firecrawl_extract → pricing tables from the top 5 results
6
  3. firecrawl_deep_research → "Compare Vercel vs Netlify vs Railway"
7
  4. Builds a comparison table:
8
     | Platform | Price | Next.js Support | Edge Functions | Deploy Speed |
9
     |---|---|---|---|---|
10
     | Vercel | $20/mo | Native | Yes | 5s |
11
     | Netlify | $19/mo | Plugin | Yes | 8s |
12
     | Railway | $5/mo | Container | No | 12s |

Workflow 3: Monitoring + Reporting#

1
$ claude "Every morning, scrape Hacker News front page and synthesize the
2
   top 5 AI-tagged stories into a summary."
3

4
Agent runs daily (via cron + Skill):
5
  1. firecrawl_scrape → news.ycombinator.com (filter: AI tagged)
6
  2. firecrawl_scrape → each story's content
7
  3. Synthesizes: "Today's AI highlights:
8
     - MCP SDK hits 100M downloads
9
     - New model: Claude Opus 4.5 released
10
     - Microsoft open-sources VibeVoice 2.0"

Workflow 4: Building a RAG Knowledge Base#

1
$ claude "I need to build a searchable knowledge base from the Tailwind CSS docs."
2

3
Agent:
4
  1. firecrawl_map → tailwindcss.com/docs → discovers 200+ doc pages
5
  2. firecrawl_crawl → scrapes all 200 pages into markdown
6
  3. Writes each page to a separate file
7
  4. Saves to a local docs folder → ready for vector embedding

Firecrawl vs. Other Approaches#

Method	Pros	Cons
Firecrawl MCP	LLM-ready output, browser interaction, deep research	API key needed, rate limits
manually paste URL content	Free, no setup	Slow, breaks agent flow
curl/wget	Free, scriptable	Raw HTML, no LLM optimization
Browser Use MCP	Full browser control	Heavier, slower
Brave Search MCP	Pure search, lightweight	No page interaction

Security Considerations#

Firecrawl is a read-scraping tool by nature, but:

API key management: Store FIRECRAWL_API_KEY as an environment variable, never hardcode in config files
Rate limits respect: The free tier is 500 pages/month. Monitor usage to avoid surprises
Prompt injection risk: Since Firecrawl returns web content, a malicious website could inject instructions into the agent’s context. This is primarily an issue if your agent has write-access to systems. Mitigation: keep write scopes separate
Self-hosting option: Firecrawl is fully open-source. You can self-host it behind your own infrastructure for complete control

Self-Hosting#

Firecrawl is fully open-source and can be self-hosted via Docker:

1
git clone https://github.com/firecrawl/firecrawl
2
cd firecrawl
3
docker compose up

Then configure the MCP server to point to your self-hosted instance:

1
{
2
  "mcpServers": {
3
    "firecrawl-selfhosted": {
4
      "command": "npx",
5
      "args": ["-y", "firecrawl-mcp"],
6
      "env": {
7
        "FIRECRAWL_API_KEY": "fc-selfhosted-key",
8
        "FIRECRAWL_BASE_URL": "http://localhost:3002"
9
      }
10
    }
11
  }
12
}

Troubleshooting#

“Scrape returns empty content”

Check if the site blocks scrapers (Cloudflare, bot detection)
Try with --allow-bots flag or self-hosted Chrome
The site might be JavaScript-rendered — use crawl instead of scrape

“Rate limit exceeded”

Wait or upgrade your plan
Reduce crawl depth
Use search instead of scrape where possible

“Deep research takes too long”

The autonomous research agent visits multiple pages — this is expected
Try narrow queries: “Compare X vs Y on pricing only”

Summary#

Firecrawl MCP is the essential web-reader for AI agents. It transforms the web from “you’ll have to copy-paste that” to “let me fetch and analyze that for you.”

Install it second — right after GitHub MCP. Between the two, your agent can read your codebase (GitHub MCP) and the entire internet (Firecrawl MCP). Everything else is additive.

Next in this series: Context7 MCP — fixing stale LLM training data by injecting the latest docs.

Series: Practical MCP Servers for Developers — 2026 Edition. Day 2 of 6.