Integration
AI Framework
CrewAI
Give your CrewAI agents the ability to scrape web pages, extract structured data, and take screenshots. AlterLab tools integrate directly into CrewAI crews as ready-to-use tools.
Why AlterLab + CrewAI?
CrewAI agents need real-time web data to make informed decisions. AlterLab handles JavaScript rendering, anti-bot bypass, and content extraction — so your agents get clean, structured data from any website.
Installation
Bash
pip install crewai-tools-alterlab crewaiThe crewai-tools-alterlab package provides CrewAI-compatible tools that wrap the AlterLab API. Each tool follows CrewAI's BaseTool interface, so agents can use them directly.
Available Tools
| Tool | Purpose | Returns |
|---|---|---|
AlterLabScrapeTool | Scrape a URL and return clean content | Markdown or text content |
AlterLabExtractTool | Extract structured data from a page using a JSON schema | JSON matching your schema |
AlterLabScreenshotTool | Take a full-page screenshot of a URL | Base64-encoded PNG image |
Basic Usage
Single Tool
Use the scrape tool directly to fetch clean web content:
Python
from crewai_tools_alterlab import AlterLabScrapeTool
# Initialize the tool
scrape_tool = AlterLabScrapeTool(api_key="your_api_key")
# Use it directly
result = scrape_tool.run("https://example.com/blog/ai-trends")
print(result)Full Crew Example
Create a crew with a research agent that can scrape web pages:
Python
from crewai import Agent, Task, Crew
from crewai_tools_alterlab import AlterLabScrapeTool, AlterLabExtractTool
# Initialize tools
scrape_tool = AlterLabScrapeTool(api_key="your_api_key")
extract_tool = AlterLabExtractTool(api_key="your_api_key")
# Create a research agent with scraping capabilities
researcher = Agent(
role="Web Researcher",
goal="Gather and analyze information from web pages",
backstory="You are an expert web researcher who finds "
"and synthesizes information from multiple sources.",
tools=[scrape_tool, extract_tool],
verbose=True,
)
# Define a research task
research_task = Task(
description=(
"Research the latest trends in AI agent frameworks. "
"Scrape these pages and summarize the key findings:\n"
"- https://docs.crewai.com/\n"
"- https://docs.autogen.org/\n"
),
expected_output="A summary of AI agent framework trends "
"with key features and differences.",
agent=researcher,
)
# Run the crew
crew = Crew(
agents=[researcher],
tasks=[research_task],
verbose=True,
)
result = crew.kickoff()
print(result)Structured Extraction
Use the extract tool to pull structured data from web pages:
Python
from crewai import Agent, Task, Crew
from crewai_tools_alterlab import AlterLabExtractTool
extract_tool = AlterLabExtractTool(
api_key="your_api_key",
extraction_schema={
"type": "object",
"properties": {
"title": {"type": "string"},
"price": {"type": "string"},
"rating": {"type": "number"},
"features": {
"type": "array",
"items": {"type": "string"},
},
},
},
)
# Agent that extracts product data
product_analyst = Agent(
role="Product Analyst",
goal="Extract and compare product information from web pages",
backstory="You analyze product pages and extract structured data "
"for comparison reports.",
tools=[extract_tool],
)
task = Task(
description="Extract product details from https://example.com/product "
"and format a comparison report.",
expected_output="Structured product data with title, price, "
"rating, and features.",
agent=product_analyst,
)
crew = Crew(agents=[product_analyst], tasks=[task])
result = crew.kickoff()
print(result)Tips & Best Practices
- Use markdown format (the default) for content that agents need to reason about. It preserves structure while being concise.
- Set cost controls when agents may scrape many pages autonomously. Use
max_tierto limit per-page costs. - Combine tools — use the scrape tool for broad content gathering and the extract tool when you need specific structured fields.
- Use JS mode for SPAs and dynamic sites. Pass
mode="js"to the tool constructor for JavaScript-rendered pages. - Pair with other integrations — use AlterLab tools in CrewAI alongside your existing LangChain or LlamaIndex RAG pipelines for agents that can both search the web and query knowledge bases.