Supabase Integration
Scrape any website and store results directly in your Supabase database. Edge Function examples, pg_cron scheduling, and full Python and Node.js SDK walkthroughs included.
Overview
AlterLab turns any URL into structured data. Supabase gives you a managed Postgres database, Edge Functions for serverless compute, and pg_cron for scheduling — with no infrastructure to manage. Together, they cover the full scrape-to-database pipeline in minutes.
Serverless Scraping
Call AlterLab from Supabase Edge Functions — no servers, no queue management. Scale to zero when idle.
Scheduled Pipelines
Use pg_cron to trigger scrape jobs on any schedule — hourly price checks, daily news ingestion, weekly content audits.
Structured Output
Extract typed JSON from any page using a JSON Schema. Store prices, titles, inventory, and more as real columns.
Quickstart
Create a Supabase table and start storing scraped pages in under five minutes.
Step 1 — Create the database table
Open the SQL editor in your Supabase dashboard and run:
-- Create a table for scraped pages
create table scraped_pages (
id uuid primary key default gen_random_uuid(),
url text unique not null,
markdown text,
html text,
scraped_at timestamptz default now(),
tier_used int,
cost numeric(10, 6),
created_at timestamptz default now()
);
-- url already has an index from its unique constraint;
-- add one for recency queries
create index on scraped_pages (scraped_at desc);
Step 2 — Add your AlterLab API key as a secret
In your Supabase project, go to Project Settings → Edge Functions → Secrets and add:
ALTERLAB_API_KEY=sk_live_your_key_here
Edge Functions
Deploy a Supabase Edge Function that calls AlterLab and upserts the result into your table. The upsert pattern means re-scraping a URL updates the existing row rather than creating duplicates.
// supabase/functions/scrape-and-store/index.ts
import { createClient } from "jsr:@supabase/supabase-js@2";
const supabase = createClient(
Deno.env.get("SUPABASE_URL")!,
Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!,
);
Deno.serve(async (req) => {
const { url } = await req.json();
// Scrape the URL with AlterLab
const response = await fetch("https://api.alterlab.io/v1/scrape", {
method: "POST",
headers: {
"X-API-Key": Deno.env.get("ALTERLAB_API_KEY")!,
"Content-Type": "application/json",
},
body: JSON.stringify({
url,
formats: ["markdown", "html"],
waitFor: 1000,
}),
});
if (!response.ok) {
return new Response(
JSON.stringify({ error: "Scrape failed", status: response.status }),
{ status: 502, headers: { "Content-Type": "application/json" } },
);
}
const result = await response.json();
// Insert into Supabase
const { data, error } = await supabase
.from("scraped_pages")
.upsert({
url,
markdown: result.markdown,
html: result.html,
scraped_at: new Date().toISOString(),
tier_used: result.meta?.tier,
cost: result.meta?.cost,
}, { onConflict: "url" })
.select();
if (error) {
return new Response(
JSON.stringify({ error: error.message }),
{ status: 500, headers: { "Content-Type": "application/json" } },
);
}
return new Response(
JSON.stringify({ success: true, data }),
{ headers: { "Content-Type": "application/json" } },
);
});
Deploy the function
supabase functions deploy scrape-and-store --no-verify-jwt
# Invoke manually to test
curl -i --location --request POST \
'https://your-project.supabase.co/functions/v1/scrape-and-store' \
--header 'Authorization: Bearer YOUR_ANON_KEY' \
--header 'Content-Type: application/json' \
--data '{"url":"https://example.com"}'
For JavaScript-heavy pages, add waitFor: 2000 and "renderJs": true to the scrape request body. AlterLab will use a full headless browser with anti-bot bypass automatically.
Example: Price Monitor Edge Function
A more complete example that uses AlterLab's structured extraction to pull typed price data and store it in a history table:
// supabase/functions/price-monitor/index.ts
// Called by pg_cron every hour to check product prices
import { createClient } from "jsr:@supabase/supabase-js@2";
const supabase = createClient(
Deno.env.get("SUPABASE_URL")!,
Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!,
);
const PRODUCTS = [
{ name: "Widget Pro", url: "https://shop.example.com/widget-pro" },
{ name: "Gadget X", url: "https://shop.example.com/gadget-x" },
];
// Simple price extraction schema
const EXTRACT_SCHEMA = {
type: "object",
properties: {
price: { type: "number", description: "Current price in USD" },
currency: { type: "string", description: "Currency code, e.g. USD" },
in_stock: { type: "boolean", description: "Is the product in stock?" },
},
};
Deno.serve(async () => {
const results = [];
for (const product of PRODUCTS) {
const res = await fetch("https://api.alterlab.io/v1/scrape", {
method: "POST",
headers: {
"X-API-Key": Deno.env.get("ALTERLAB_API_KEY")!,
"Content-Type": "application/json",
},
body: JSON.stringify({
url: product.url,
extract: { schema: EXTRACT_SCHEMA },
}),
});
if (!res.ok) continue;
const data = await res.json();
const extracted = data.extract ?? {};
await supabase.from("price_history").insert({
product_name: product.name,
url: product.url,
price: extracted.price,
currency: extracted.currency ?? "USD",
in_stock: extracted.in_stock,
checked_at: new Date().toISOString(),
});
results.push({ product: product.name, ...extracted });
}
return new Response(JSON.stringify({ checked: results.length, results }), {
headers: { "Content-Type": "application/json" },
});
});
Scheduling with pg_cron
pg_cron is a PostgreSQL extension built into Supabase. It runs SQL on a cron schedule directly inside the database — no external cron infrastructure required.
Calling Edge Functions on a schedule also requires the pg_net extension for HTTP calls from SQL.
-- Enable pg_cron (once per project, via Supabase Dashboard → Extensions)
-- create extension if not exists pg_cron;
-- Schedule the Edge Function every hour
select cron.schedule(
'scrape-price-monitor', -- job name (unique)
'0 * * * *', -- every hour at :00
$$
select net.http_post(
url := current_setting('app.supabase_url') || '/functions/v1/price-monitor',
headers := jsonb_build_object(
'Authorization', 'Bearer ' || current_setting('app.service_role_key'),
'Content-Type', 'application/json'
),
body := '{}'::jsonb
) as request_id;
$$
);
-- View scheduled jobs
select jobid, schedule, command from cron.job;
-- Remove a job
select cron.unschedule('scrape-price-monitor');
Common schedules
| Schedule | cron | Use case |
|---|---|---|
| Every hour | 0 * * * * | Price monitoring |
| Every 15 min | */15 * * * * | Stock / inventory checks |
| Daily at 6am UTC | 0 6 * * * | News ingestion, content sync |
| Weekly on Monday | 0 8 * * 1 | Lead enrichment, SEO audits |
Python SDK
Use the AlterLab Python SDK with the Supabase Python client for scripts, data pipelines, and backend services.
Installation
pip install alterlab supabase
Scrape and store
import alterlab
from supabase import create_client, Client
from datetime import datetime
# Initialize clients
scraper = alterlab.Client(api_key="sk_live_...")
supabase: Client = create_client(
"https://your-project.supabase.co",
"your-service-role-key",
)
def scrape_and_store(url: str) -> dict:
"""Scrape a URL and store the result in Supabase."""
# Scrape with AlterLab
result = scraper.scrape(
url,
formats=["markdown"],
wait_for=1000,
)
# Upsert into Supabase (update if URL already exists)
response = (
supabase.table("scraped_pages")
.upsert({
"url": url,
"markdown": result.markdown,
"scraped_at": datetime.utcnow().isoformat() + "Z",
"tier_used": result.meta.get("tier"),
"cost": result.meta.get("cost"),
}, on_conflict="url")
.execute()
)
return response.data[0] if response.data else {}
# Batch scrape a list of URLs
urls = [
"https://news.ycombinator.com",
"https://techcrunch.com",
"https://theverge.com",
]
for url in urls:
row = scrape_and_store(url)
print(f"Stored {url} → id={row.get('id')}")
Node.js SDK
The AlterLab Node.js SDK works natively in Deno (Edge Functions) and Node.js runtimes. Use it with the Supabase JS client for TypeScript projects.
Installation
npm install @alterlab/sdk @supabase/supabase-js
Scrape and store
import AlterLab from "@alterlab/sdk";
import { createClient } from "@supabase/supabase-js";
const scraper = new AlterLab({ apiKey: process.env.ALTERLAB_API_KEY! });
const supabase = createClient(
process.env.SUPABASE_URL!,
process.env.SUPABASE_SERVICE_ROLE_KEY!,
);
async function scrapeAndStore(url: string) {
// Scrape with AlterLab
const result = await scraper.scrape(url, {
formats: ["markdown"],
waitFor: 1000,
});
// Upsert into Supabase
const { data, error } = await supabase
.from("scraped_pages")
.upsert(
{
url,
markdown: result.markdown,
scraped_at: new Date().toISOString(),
tier_used: result.meta?.tier,
cost: result.meta?.cost,
},
{ onConflict: "url" },
)
.select()
.single();
if (error) throw error;
return data;
}
// Price monitor: check products on an interval
const urls = [
"https://example.com/product/a",
"https://example.com/product/b",
];
for (const url of urls) {
const row = await scrapeAndStore(url);
console.log(`Stored ${url} → id=${row.id}`);
}
Common Patterns
Price monitoring
Scrape product pages with structured extraction (extract.schema) to pull price, currency, and stock status into a price_history table. Use Supabase Realtime to push alerts when prices drop.
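The Realtime wiring depends on your client setup, but the drop-detection logic itself is small. A minimal sketch — the helper name and the 1% threshold are ours, not part of either API:

```typescript
// Decide whether a new price_history row should trigger an alert.
// A drop only counts if it exceeds a relative threshold, which
// filters out rounding noise between scrapes.
interface PricePoint {
  product_name: string;
  price: number;
}

function isPriceDrop(
  prev: PricePoint,
  next: PricePoint,
  minDropPct = 1,
): boolean {
  if (prev.product_name !== next.product_name) return false;
  const dropPct = ((prev.price - next.price) / prev.price) * 100;
  return dropPct >= minDropPct;
}

console.log(
  isPriceDrop(
    { product_name: "Widget Pro", price: 100 },
    { product_name: "Widget Pro", price: 89 },
  ),
); // prints: true
```

Wire this into a Supabase Realtime subscription on price_history, comparing each inserted row against the previous row for the same product.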
Lead enrichment
Trigger an Edge Function via Supabase Database Webhooks when a new lead is inserted. The function scrapes the lead's website and appends company size, tech stack, and description back to the row.
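Database Webhooks POST a JSON payload with the event type, table, and inserted record. A sketch of the parsing step — the payload shape follows Supabase's documented webhook format, while the leads table name and Lead fields are our assumptions:

```typescript
// Minimal lead row shape for this example (assumed, not a real schema).
interface Lead {
  id: number;
  website: string | null;
}

// Subset of the Supabase Database Webhook payload we care about.
interface WebhookPayload {
  type: "INSERT" | "UPDATE" | "DELETE";
  table: string;
  record: Lead;
}

// Return the URL to scrape for a new lead, or null if there is
// nothing to enrich. Normalizes bare domains to https URLs.
function leadUrlToScrape(payload: WebhookPayload): string | null {
  if (payload.type !== "INSERT" || payload.table !== "leads") return null;
  const site = payload.record.website?.trim();
  if (!site) return null;
  return site.startsWith("http") ? site : `https://${site}`;
}
```

Inside the Edge Function, pass the returned URL to the same AlterLab scrape call shown earlier, then write the extracted fields back with supabase.from("leads").update(...).eq("id", payload.record.id).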
Content aggregation
Schedule a pg_cron job to scrape news sources and blogs daily. Store full Markdown content in Supabase and use pgvector with OpenAI embeddings for semantic search across all ingested articles.
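Long articles usually need to be split into chunks before embedding. A minimal sketch of that step, assuming paragraph-boundary chunking — the function name and size limit are ours; the embedding call and pgvector insert are left out:

```typescript
// Split scraped Markdown into roughly fixed-size chunks on paragraph
// boundaries, so each chunk stays within an embedding model's
// practical input size. maxChars is illustrative.
function chunkMarkdown(markdown: string, maxChars = 1500): string[] {
  const paragraphs = markdown.split(/\n{2,}/);
  const chunks: string[] = [];
  let current = "";
  for (const para of paragraphs) {
    // Start a new chunk when appending this paragraph would overflow.
    if (current && current.length + para.length + 2 > maxChars) {
      chunks.push(current);
      current = "";
    }
    current = current ? `${current}\n\n${para}` : para;
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Each chunk gets its own embedding row, so a search hit points at the relevant passage rather than a whole article.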
Competitor tracking
Monitor competitor pricing pages, job listings, and feature announcements on a schedule. Use Supabase Edge Functions to send a Slack notification via webhook when changes are detected.
Error Handling
AlterLab returns standard HTTP status codes. For production pipelines, implement exponential backoff for 429 (rate limit) and 503 (temporary failure) responses.
// Exponential backoff for transient AlterLab errors
async function scrapeWithRetry(
url: string,
maxRetries = 3,
): Promise<ScrapeResult> {
for (let attempt = 0; attempt < maxRetries; attempt++) {
try {
return await scraper.scrape(url, { formats: ["markdown"] });
} catch (err: unknown) {
const isRetryable =
err instanceof Error &&
(err.message.includes("429") || err.message.includes("503"));
if (!isRetryable || attempt === maxRetries - 1) throw err;
// Exponential backoff: 1s, 2s, 4s
await new Promise((r) => setTimeout(r, 1000 * 2 ** attempt));
}
}
throw new Error("Max retries exceeded");
}
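The fixed 1s/2s/4s schedule above can cause synchronized retries when many invocations hit a shared rate limit at once; adding jitter spreads them out. A sketch of full-jitter backoff — the helper name and defaults are ours:

```typescript
// Full-jitter backoff: wait a random duration between 0 and the
// exponential cap, so simultaneous failures don't all retry at
// the same instant.
function backoffMs(attempt: number, baseMs = 1000, capMs = 30_000): number {
  const exp = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * exp;
}
```

To use it, replace the setTimeout delay in scrapeWithRetry with backoffMs(attempt).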