How to Give Your AI Agent Access to Product Hunt Data

Learn how to give your AI agent reliable, structured access to public Product Hunt data via AlterLab’s Extract and Search APIs, with anti‑bot handling and ready‑to‑use JSON.

This guide covers accessing publicly available data. Always review a site's robots.txt and Terms of Service before automated access.

TL;DR

Give your AI agent structured Product Hunt data by calling AlterLab’s Extract API with a URL and a JSON schema. The API handles anti‑bot measures, renders JavaScript, and returns clean JSON ready for LLMs. Use the Search API for query‑based results and the MCP server for seamless agent tool calls.

Why AI agents need Product Hunt data

Product Hunt surfaces daily launches, upvotes, and comments across tech categories. AI agents can use this stream for:

  • Product launch intelligence: detect new tools as they appear and feed signals into trend models.
  • Trend detection: aggregate votes and topics over time to surface emerging technologies.
  • Competitive monitoring: track rivals’ launches, pricing mentions, and user feedback in near real‑time.

These use cases rely on fresh, structured data. Raw HTML adds noise and wastes token budget.

Why raw HTTP requests fail for agents

Direct requests to Product Hunt often encounter:

  • Rate limits that block automated traffic after a few calls.
  • JavaScript‑driven content that requires a headless browser to render.
  • Bot detection mechanisms (CAPTCHAs, fingerprinting) that return challenges or empty responses.
  • Failed requests whose error output and retry logic fill the LLM context window instead of useful data.

Each failure forces the agent to spend tokens on error handling, reducing the space for actual reasoning.
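
To make that overhead concrete, here is a minimal sketch of the retry loop an agent would otherwise have to carry itself. It is an illustration only: `fetch_with_backoff`, its parameters, and the "empty body means blocked" rule are assumptions for this example, not part of any AlterLab or Product Hunt API.

```python
import time
from typing import Callable, Optional


def fetch_with_backoff(
    fetch: Callable[[], str],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call `fetch` until it returns a non-empty body or attempts run out.

    `fetch` is any zero-argument callable that performs the HTTP request
    and returns the response body (hypothetical; supply your own).
    """
    for attempt in range(max_attempts):
        try:
            body = fetch()
            if body:  # assumption: an empty body is treated as a blocked request
                return body
        except OSError:
            pass  # network-level failure; fall through to the retry delay
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...
    return None
```

Every branch here is logic the agent has to carry and reason about before it sees any Product Hunt data.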

Connecting your agent to Product Hunt via AlterLab

AlterLab’s Extract API (/api/v1/extract) fetches a URL, executes any needed JavaScript, bypasses anti‑bot layers, and returns data matching a provided schema. No HTML parsing is required on the agent side.

Python example

Python
import alterlab

client = alterlab.Client("YOUR_API_KEY")

# Define the shape of data you need from a Product Hunt post
schema = {
    "title": "string",
    "tagline": "string",
    "votes": "integer",
    "topics": ["string"],
    "url": "string"
}

result = client.extract(
    url="https://www.producthunt.com/posts/example-tool",
    schema=schema
)

print(result.data)  # => {'title': 'Example Tool', 'tagline': '…', …}

The client handles authentication and retries, and returns the extracted fields as a Python dict. The agent can pass result.data straight into an LLM prompt.

cURL equivalent

Bash
curl -X POST https://api.alterlab.io/api/v1/extract \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.producthunt.com/posts/example-tool",
    "schema": {
      "title": "string",
      "tagline": "string",
      "votes": "integer",
      "topics": ["string"],
      "url": "string"
    }
  }'

The response body is JSON with a data field containing the extracted fields.
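
Before passing the extracted fields to an LLM, it can help to check that they match the schema you asked for. The sketch below is one way to do that with the standard library. It assumes only the type names used in the examples above ("string", "integer", and a one-element list such as ["string"]). The `schema_errors` helper and the sample response body are hypothetical.

```python
import json
from typing import Any, Dict, List

# Maps the type names used in the schema examples to Python types.
TYPE_MAP = {"string": str, "integer": int}


def schema_errors(data: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable mismatches between data and schema."""
    errors = []
    for field, expected in schema.items():
        if field not in data:
            errors.append(f"missing field: {field}")
            continue
        value = data[field]
        if isinstance(expected, list):
            # A one-element list such as ["string"] means "list of strings".
            item_type = TYPE_MAP[expected[0]]
            if not isinstance(value, list) or not all(
                isinstance(v, item_type) for v in value
            ):
                errors.append(f"{field}: expected list of {expected[0]}")
        elif isinstance(value, bool) or not isinstance(value, TYPE_MAP[expected]):
            # bool is a subclass of int in Python, so reject it explicitly.
            errors.append(f"{field}: expected {expected}")
    return errors


# Illustrative response body; the values are made up for this example.
body = '{"data": {"title": "Example Tool", "votes": 120, "topics": ["AI"]}}'
payload = json.loads(body)
demo_schema = {"title": "string", "votes": "integer", "topics": ["string"]}
print(schema_errors(payload["data"], demo_schema))
```

An empty list means every requested field is present with the expected type. Anything else is worth logging before the data reaches a prompt.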

Using the Scrape API for raw HTML

If you need the full page (e.g., to run custom selectors), AlterLab also offers a Scrape API (/api/v1/scrape). It returns the rendered HTML after anti‑bot bypass, useful for debugging or when a schema isn’t sufficient.

Python
html = client.scrape(url="https://www.producthunt.com/posts/example-tool")
# html is a string you can feed to BeautifulSoup or lxml if needed

Most agent workflows prefer the Extract API because it eliminates the parsing step and reduces token usage.
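
If you do take the raw-HTML route and want to avoid a third-party dependency, Python's built-in html.parser is enough for simple selectors. The sketch below collects the text of h1 and h2 elements from an HTML string. Which elements hold the fields you care about on a real Product Hunt page is an assumption you would need to verify, and `HeadingCollector` and `extract_headings` are names invented for this example.

```python
from html.parser import HTMLParser
from typing import List


class HeadingCollector(HTMLParser):
    """Collect the visible text of every <h1> and <h2> element."""

    def __init__(self) -> None:
        super().__init__()
        self._depth = 0  # greater than 0 while inside an h1/h2
        self._buffer: List[str] = []
        self.headings: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2"):
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in ("h1", "h2") and self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Join text fragments and collapse runs of whitespace.
                text = " ".join("".join(self._buffer).split())
                if text:
                    self.headings.append(text)
                self._buffer.clear()

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def extract_headings(html: str) -> List[str]:
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings
```

Calling extract_headings on the string returned by the Scrape API would give a list of heading texts. That is far smaller than the full page, but it is still a parsing step the Extract API would otherwise spare you.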

Using the Search API for Product Hunt queries

AlterLab’s Search API (/api/v1/search) lets you query Product Hunt’s public listings with a natural‑language or keyword query and returns structured results, so you don’t have to construct search URLs manually.

Python example

Python
response = client.search(
    query="open source AI agents",
    limit=10
)

for item in response.data:
    print(item["title"], item["votes"])

The API internally converts the query to a Product Hunt search URL, applies the same anti‑bot handling, and extracts a list of objects matching a default schema (title, url, votes, topics, etc.). Adjust the limit parameter to control batch size.

cURL example

Bash
curl -X POST https://api.alterlab.io/api/v1/search \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "open source AI agents",
    "limit": 10
  }'

The response contains a data array; each entry is ready for LLM consumption.
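
If you would rather not depend on the client library, the cURL call above can be reproduced with Python's standard library. This sketch only builds the request object. The endpoint, header names, and body fields are taken from the cURL example, and `build_search_request` is a helper name made up for illustration.

```python
import json
import urllib.request


def build_search_request(
    api_key: str, query: str, limit: int = 10
) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the cURL example."""
    body = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        "https://api.alterlab.io/api/v1/search",
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request("YOUR_KEY", "open source AI agents", limit=10)

# To send it (requires a valid key and network access):
# with urllib.request.urlopen(req, timeout=30) as resp:
#     results = json.load(resp)["data"]
```

Keeping construction separate from sending makes the request easy to inspect or unit-test without touching the network.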

MCP integration

AlterLab provides an MCP (Model Context Protocol) server that exposes the Extract and Search APIs as tools for agents built with Claude, GPT‑4o, or Cursor. By registering the MCP server, your agent can call alterlab_extract or alterlab_search as native functions, eliminating boilerplate HTTP code.

See the full walkthrough: AlterLab for AI Agents.
The MCP server handles authentication and retries, and returns typed objects that fit directly into LLM tool‑call schemas.

Building a product launch intelligence pipeline

Here’s an end‑to‑end example that demonstrates how an agent can turn raw Product Hunt data into actionable insight.

  1. Agent triggers a search for recent posts in the “developer tools” category.
  2. AlterLab returns a list of structured objects (title, votes, topics, URL).
  3. Agent filters for posts with votes > 50 and topics containing “AI”.
  4. For each candidate, the agent calls the Extract API to get detailed fields like description, maker name, and website.
  5. Agent compiles a short briefing and feeds it into an LLM prompt: “Summarize the top three AI‑focused developer tool launches today and suggest why they might gain traction.”
  6. LLM outputs a concise brief that the agent can store in a knowledge base or send to a stakeholder.

Pipeline code sketch

Python
import alterlab
from typing import List, Dict

client = alterlab.Client("YOUR_API_KEY")

def hunt_ai_tools() -> List[Dict]:
    # Steps 1-2: search for recent AI-related posts
    search_res = client.search(query="AI developer tools", limit=20)
    # Step 3: keep posts with more than 50 votes and an AI-related topic
    candidates = [
        item for item in search_res.data
        if item.get("votes", 0) > 50
        and any("AI" in topic for topic in item.get("topics", []))
    ]

    detailed = []
    for post in candidates:
        # Step 4: extract richer fields from the post page
        extracted = client.extract(
            url=post["url"],
            schema={
                "title": "string",
                "description": "string",
                "maker": "string",
                "website": "string",
                "topics": ["string"]
            }
        )
        detailed.append(extracted.data)
    return detailed

if __name__ == "__main__":
    tools = hunt_ai_tools()
    # Pass `tools` to your LLM framework for summarization
    print(tools[:3])  # preview

The pipeline requires no custom HTML parsing and no manual retry loops, and it leaves fetching to AlterLab. Confirming that your use complies with Product Hunt’s terms remains your responsibility.
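
Step 5 of the walkthrough, compiling the briefing, can be sketched as a small formatting function. It assumes each record carries the fields requested in the extract schema (title, description, maker, website, topics) and simply takes the first three records, so any ranking should happen before the call. `build_briefing_prompt` is a hypothetical helper, not part of the AlterLab client.

```python
from typing import Dict, List


def build_briefing_prompt(tools: List[Dict]) -> str:
    """Turn up to three extracted records into a compact LLM prompt."""
    lines = []
    for tool in tools[:3]:
        topics = ", ".join(tool.get("topics", []))
        line = (
            f"- {tool.get('title', 'Untitled')} "
            f"(by {tool.get('maker', 'unknown')}): "
            f"{tool.get('description', '')} "
            f"[topics: {topics}] {tool.get('website', '')}"
        )
        lines.append(line.rstrip())  # drop the trailing space if no website
    instruction = (
        "Summarize the top three AI-focused developer tool launches today "
        "and suggest why they might gain traction."
    )
    return instruction + "\n\nLaunches:\n" + "\n".join(lines)
```

One line per launch keeps the prompt short, which leaves more of the context window for the model's reasoning.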

Key takeaways

  • Use AlterLab’s Extract API to turn any Product Hunt page into clean, schema‑driven JSON.
  • Leverage the Search API for keyword‑based discovery without building URLs.
  • Integrate via the MCP server for seamless tool calls in agent frameworks.
  • Focus agent tokens on reasoning, not on handling anti‑bot or parsing overhead.
  • Always verify robots.txt and Terms of Service; AlterLab automates compliant fetching but responsibility remains with the user.

Frequently Asked Questions

Accessing publicly available data is generally permitted, but agents must review Product Hunt’s robots.txt and Terms of Service, respect rate limits, and avoid private or login‑protected information.
AlterLab automatically rotates proxies, solves CAPTCHAs, and renders JavaScript, returning structured data so agents receive reliable responses without retries or parsing overhead.
AlterLab charges per successful request; see the pricing page for volume discounts. Agent workloads typically pay only for the data they retrieve, with no minimum commitment.