CrewAI

Give your CrewAI agents the ability to scrape web pages, extract structured data, and take screenshots. AlterLab ships these capabilities as ready-to-use CrewAI tools that you add directly to a crew.

Why AlterLab + CrewAI?

CrewAI agents need real-time web data to make informed decisions. AlterLab handles JavaScript rendering, anti-bot bypass, and content extraction, so your agents receive clean, structured data from the pages they visit.

Installation

Bash

pip install crewai-tools-alterlab crewai

The crewai-tools-alterlab package provides CrewAI-compatible tools that wrap the AlterLab API. Each tool follows CrewAI's BaseTool interface, so agents can use them directly.

Available Tools

Tool	Purpose	Returns
`AlterLabScrapeTool`	Scrape a URL and return clean content	Markdown or text content
`AlterLabExtractTool`	Extract structured data from a page using a JSON schema	JSON matching your schema
`AlterLabScreenshotTool`	Take a full-page screenshot of a URL	Base64-encoded PNG image
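
Since the screenshot tool is listed as returning a Base64-encoded PNG, you will usually want to decode it before saving or displaying it. The following is a minimal sketch using only the standard library. `save_png` is a hypothetical helper, not part of the package, and whether the tool returns a bare Base64 string or a `data:` URI is an assumption, so the helper tolerates both.

```python
import base64
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def save_png(b64_data: str, path: str) -> int:
    """Decode a Base64 PNG string, write it to `path`, return bytes written."""
    # Assumption: the string may carry a "data:image/png;base64," prefix.
    if b64_data.startswith("data:"):
        b64_data = b64_data.split(",", 1)[1]
    raw = base64.b64decode(b64_data)
    # Fail early if the payload is not actually a PNG.
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG image")
    Path(path).write_bytes(raw)
    return len(raw)
```

Checking the PNG signature before writing catches the case where the tool returned an error message or some other text instead of image data.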

Basic Usage

Single Tool

Use the scrape tool directly to fetch clean web content:

Python

from crewai_tools_alterlab import AlterLabScrapeTool

# Initialize the tool
scrape_tool = AlterLabScrapeTool(api_key="your_api_key")

# Use it directly
result = scrape_tool.run("https://example.com/blog/ai-trends")
print(result)
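
The example above passes the API key as a string literal. In real projects you may prefer to keep the key out of source code. Here is a small sketch of reading it from the environment. The variable name `ALTERLAB_API_KEY` and the helper `load_api_key` are assumptions chosen for illustration, not names documented by the package.

```python
import os


def load_api_key(env_var: str = "ALTERLAB_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error."""
    # ALTERLAB_API_KEY is an assumed variable name, not one the package defines.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

You would then construct the tool with `AlterLabScrapeTool(api_key=load_api_key())`, using the same `api_key` argument shown above.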

Full Crew Example

Create a crew with a research agent that can scrape web pages:

Python

from crewai import Agent, Task, Crew
from crewai_tools_alterlab import AlterLabScrapeTool, AlterLabExtractTool

# Initialize tools
scrape_tool = AlterLabScrapeTool(api_key="your_api_key")
extract_tool = AlterLabExtractTool(api_key="your_api_key")

# Create a research agent with scraping capabilities
researcher = Agent(
    role="Web Researcher",
    goal="Gather and analyze information from web pages",
    backstory="You are an expert web researcher who finds "
              "and synthesizes information from multiple sources.",
    tools=[scrape_tool, extract_tool],
    verbose=True,
)

# Define a research task
research_task = Task(
    description=(
        "Research the latest trends in AI agent frameworks. "
        "Scrape these pages and summarize the key findings:\n"
        "- https://docs.crewai.com/\n"
        "- https://docs.autogen.org/\n"
    ),
    expected_output="A summary of AI agent framework trends "
                    "with key features and differences.",
    agent=researcher,
)

# Run the crew
crew = Crew(
    agents=[researcher],
    tasks=[research_task],
    verbose=True,
)

result = crew.kickoff()
print(result)

Structured Extraction

Use the extract tool to pull structured data from web pages:

Python

from crewai import Agent, Task, Crew
from crewai_tools_alterlab import AlterLabExtractTool

extract_tool = AlterLabExtractTool(
    api_key="your_api_key",
    extraction_schema={
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "price": {"type": "string"},
            "rating": {"type": "number"},
            "features": {
                "type": "array",
                "items": {"type": "string"},
            },
        },
    },
)

# Agent that extracts product data
product_analyst = Agent(
    role="Product Analyst",
    goal="Extract and compare product information from web pages",
    backstory="You analyze product pages and extract structured data "
              "for comparison reports.",
    tools=[extract_tool],
)

task = Task(
    description="Extract product details from https://example.com/product "
                "and format a comparison report.",
    expected_output="Structured product data with title, price, "
                    "rating, and features.",
    agent=product_analyst,
)

crew = Crew(agents=[product_analyst], tasks=[task])
result = crew.kickoff()
print(result)
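
Before passing extracted data downstream, it can help to confirm that it matches the schema you asked for. The sketch below is a deliberately small checker that understands only the `type`, `properties`, and `items` keywords used in the schema above. `mismatches` is a hypothetical helper, and the assumption that the extracted result arrives as a JSON string is mine; for full JSON Schema support a dedicated validator library is the better choice.

```python
import json

# Maps the JSON Schema type names used above to Python types.
_TYPES = {"string": str, "number": (int, float), "array": list, "object": dict}


def mismatches(data, schema, path="$"):
    """Return the paths where `data` does not match `schema`.

    Handles only "type", "properties", and "items".
    """
    expected = _TYPES.get(schema.get("type"))
    # bool is a subclass of int in Python, so it is rejected explicitly;
    # otherwise True would pass as a "number".
    if expected and (not isinstance(data, expected) or isinstance(data, bool)):
        return [path]
    problems = []
    if isinstance(data, dict):
        for key, sub in schema.get("properties", {}).items():
            if key in data:  # absent keys are allowed: nothing is "required"
                problems += mismatches(data[key], sub, f"{path}.{key}")
    elif isinstance(data, list) and "items" in schema:
        for i, item in enumerate(data):
            problems += mismatches(item, schema["items"], f"{path}[{i}]")
    return problems


schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "price": {"type": "string"},
        "rating": {"type": "number"},
        "features": {"type": "array", "items": {"type": "string"}},
    },
}

# Hand-written sample standing in for an extraction result.
sample = json.loads('{"title": "Widget", "rating": "4.5", "features": ["fast"]}')
print(mismatches(sample, schema))  # ['$.rating']
```

An empty list means every field that is present has the expected type; missing fields are not reported because the schema above declares none as required.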

Tips & Best Practices

Use markdown format (the default) for content that agents need to reason about. It preserves structure while being concise.
Set cost controls when agents may scrape many pages autonomously. Use `max_tier` to limit per-page costs.
Combine tools: use the scrape tool for broad content gathering and the extract tool when you need specific structured fields.
Use JS mode for SPAs and dynamic sites. Pass `mode="js"` to the tool constructor for JavaScript-rendered pages.
Pair with other integrations: use AlterLab tools in CrewAI alongside your existing LangChain or LlamaIndex RAG pipelines for agents that can both search the web and query knowledge bases.
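
As a complement to the cost-control tip, you can also cap how many requests your own code issues. The sketch below wraps any callable that takes a URL and refuses to run it past a fixed limit. `CallBudget` is a hypothetical helper, not part of the package, and it only guards calls made through the wrapper; it does not limit calls an agent makes to the tool internally, which is what `max_tier` and similar settings are for.

```python
from typing import Callable


class CallBudget:
    """Wrap a callable and refuse to run it more than `limit` times."""

    def __init__(self, func: Callable[[str], str], limit: int) -> None:
        self.func = func
        self.limit = limit
        self.calls = 0  # number of calls forwarded so far

    def __call__(self, url: str) -> str:
        # Stop before forwarding the call once the budget is spent.
        if self.calls >= self.limit:
            raise RuntimeError(f"call budget of {self.limit} exhausted")
        self.calls += 1
        return self.func(url)
```

For example, `guarded = CallBudget(scrape_tool.run, limit=20)` would let a script call `guarded(url)` at most 20 times before raising, where `scrape_tool.run` is the method used in the Single Tool example.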

Last updated: June 2026
