BYOK Extraction
Use your own LLM API key for extraction. AlterLab orchestrates the pipeline and charges a flat invocation fee — your token costs stay between you and your provider.
Zero markup on tokens
How It Works
Register your provider key
Add your OpenAI, Anthropic, OpenRouter, or Groq API key to your AlterLab account. Keys are stored encrypted and never logged.
Send extraction request
POST content + an extraction_prompt to /api/v1/extract. AlterLab cleans the content, builds the LLM prompt, and calls your registered provider.
Receive structured output
Get clean, typed JSON back — with token usage, model name, and latency surfaced in the response for cost tracking.
Pay your provider separately
Token costs are billed directly by your LLM provider. AlterLab deducts only its flat invocation fee from your balance.
Supported Providers
| Provider | Key Format | Best For | Notes |
|---|---|---|---|
| OpenAI | sk-... | GPT-4o, GPT-4o-mini — general extraction | High accuracy, moderate cost. JSON mode enforced. |
| Anthropic | sk-ant-... | Claude Haiku, Sonnet — complex reasoning | Excellent instruction following. Best for nuanced prompts. |
| OpenRouter | sk-or-... | Access 100+ models via one key | Route to cheapest available model. Good for cost optimization. |
| Groq | gsk_... | Llama 3, Mixtral — ultra-fast inference | Fastest latency. Best for high-throughput pipelines. |
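Because each provider's keys start with a distinct prefix, a quick client-side check can catch a key pasted under the wrong provider before you register it. The sketch below is illustrative only: guess_provider is a hypothetical helper, not part of the AlterLab API, and it assumes the prefixes in the table above are the only ones in use.

```python
from typing import Optional

# Prefixes taken from the Supported Providers table. More specific prefixes
# come first because Anthropic and OpenRouter keys also start with "sk-".
KEY_PREFIXES = [
    ("sk-ant-", "Anthropic"),
    ("sk-or-", "OpenRouter"),
    ("gsk_", "Groq"),
    ("sk-", "OpenAI"),
]


def guess_provider(api_key: str) -> Optional[str]:
    """Return the provider whose key prefix matches, or None if none does."""
    for prefix, provider in KEY_PREFIXES:
        if api_key.startswith(prefix):
            return provider
    return None


print(guess_provider("sk-ant-example"))  # Anthropic
print(guess_provider("gsk_example"))     # Groq
```

The order of the list matters: checking "sk-" first would label every Anthropic and OpenRouter key as OpenAI.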
Model selection
The model that handled each request is reported in the response as model_used.
Cost Model
BYOK extraction has two cost components: an AlterLab invocation fee and the token cost billed by your provider.
| Cost Component | Amount | Billed By | Notes |
|---|---|---|---|
| AlterLab base extraction | $0.0025 / call | AlterLab balance | Applies to all /api/v1/extract calls, including algorithmic |
| LLM invocation fee | +$0.001 / call | AlterLab balance | Only when extraction_prompt is provided |
| Token cost | Provider rates | Your provider | Billed by OpenAI / Anthropic / OpenRouter / Groq directly |
| Large content surcharge | +$0.0025 / call | AlterLab balance | Content > 200K characters |
Example cost breakdown
Extracting product data from a 50K-character HTML page using GPT-4o-mini:
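The AlterLab side of this example can be worked out from the cost table alone. The sketch below is a hedged illustration: alterlab_fee is a hypothetical helper, and it assumes the large content surcharge applies to any call whose content exceeds 200K characters. The token cost is left out because it depends on your provider's current rates.

```python
from decimal import Decimal

# Amounts from the Cost Model table, in USD per call.
BASE_FEE = Decimal("0.0025")
LLM_INVOCATION_FEE = Decimal("0.001")
LARGE_CONTENT_SURCHARGE = Decimal("0.0025")
LARGE_CONTENT_THRESHOLD = 200_000  # characters


def alterlab_fee(content_chars: int, has_prompt: bool) -> Decimal:
    """Estimate the amount deducted from the AlterLab balance for one call."""
    fee = BASE_FEE
    if has_prompt:
        fee += LLM_INVOCATION_FEE
    if content_chars > LARGE_CONTENT_THRESHOLD:
        fee += LARGE_CONTENT_SURCHARGE
    return fee


# 50K-character page with an extraction_prompt: base fee plus LLM fee,
# no surcharge because 50K is below the 200K threshold.
print(alterlab_fee(content_chars=50_000, has_prompt=True))  # 0.0035
```

Under these assumptions the call costs $0.0035 on the AlterLab balance, plus whatever your provider bills for the GPT-4o-mini tokens.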
Token usage in response
Every response includes model_used and extraction metadata so you can track costs accurately across providers.
When to Use LLM Extraction
AlterLab runs algorithmic extraction by default — it is faster, cheaper, and deterministic. LLM extraction is invoked only when you provide an extraction_prompt.
| Scenario | Recommended | Reason |
|---|---|---|
| HTML with Schema.org / Open Graph metadata | Schema or profile (no prompt) | Algorithmic — faster, $0.0025 total, deterministic |
| E-commerce product pages | Profile: product (no prompt) | Profile templates handle standard product data structures |
| Plain text, OCR output, transcripts | LLM with prompt + schema | No HTML structure to parse — LLM understands natural language |
| Summarization, sentiment, classification | LLM with prompt | Requires reasoning — algorithmic extraction cannot infer semantics |
| Complex multi-field extraction with context | LLM with prompt + schema | Schema ensures typed output; prompt provides reasoning context |
Register Your API Key
Register your LLM provider key via the dashboard or API. Keys are encrypted at rest and are never returned in API responses or logs.
Via Dashboard
- Go to Dashboard → Settings → LLM Keys
- Click Add Provider Key
- Select your provider (OpenAI, Anthropic, OpenRouter, Groq)
- Paste your API key and give it a label
- Click Save — the key is tested and encrypted immediately
Key permissions
A key scoped to model.request is sufficient. Never use organization admin keys.
Your First Extraction
Once your key is registered, add an extraction_prompt to any extraction request. The LLM is invoked automatically when a prompt is present.
import requests

# article_html: the HTML string to extract from (supplied by your own code)
response = requests.post(
    "https://api.alterlab.io/api/v1/extract",
    headers={"X-API-Key": "YOUR_ALTERLAB_KEY"},
    json={
        "content": article_html,
        "content_type": "html",
        # Providing extraction_prompt is what triggers LLM extraction
        "extraction_prompt": (
            "Extract the article title, author name, publish date, "
            "and a 2-sentence summary of the main argument."
        ),
        # The schema constrains the field names and types of the output
        "extraction_schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "author": {"type": "string"},
                "published_date": {"type": "string"},
                "summary": {"type": "string"},
            },
        },
    },
)
response.raise_for_status()  # stop early on HTTP errors
data = response.json()
print(f"Model: {data['model_used']}")  # e.g. "gpt-4o-mini"
print(f"Method: {data['extraction_method']}")  # "llm"
print(data["formats"]["json"])
Provider Examples
The provider is determined by which key you have registered. The extraction request syntax is identical regardless of provider — only the key registration differs.
Use Case: Batch classification with Groq (high throughput)
import asyncio
import aiohttp

API_KEY = "YOUR_ALTERLAB_KEY"


async def classify(session, text):
    """Send one support message for classification and return the parsed response."""
    async with session.post(
        "https://api.alterlab.io/api/v1/extract",
        headers={"X-API-Key": API_KEY},
        json={
            "content": text,
            "content_type": "text",
            "extraction_prompt": (
                "Classify this customer support message. "
                "Determine the category, urgency, and whether it needs human review."
            ),
            "extraction_schema": {
                "type": "object",
                "properties": {
                    "category": {"type": "string"},
                    "urgency": {"type": "string"},
                    "needs_human": {"type": "boolean"},
                },
            },
        },
    ) as resp:
        return await resp.json()


async def main(tickets):
    # One shared session; all requests are issued concurrently
    async with aiohttp.ClientSession() as session:
        tasks = [classify(session, t) for t in tickets]
        return await asyncio.gather(*tasks)


# ticket_texts: list of message strings (supplied by your own code)
results = asyncio.run(main(ticket_texts))
for r in results:
    print(r["formats"]["json"])  # {"category": "...", "urgency": "...", "needs_human": ...}
Use Case: Deep analysis with Anthropic Claude
import requests

# Claude excels at nuanced extraction with complex context
# legal_document_text: the plain-text document (supplied by your own code)
response = requests.post(
    "https://api.alterlab.io/api/v1/extract",
    headers={"X-API-Key": "YOUR_ALTERLAB_KEY"},
    json={
        "content": legal_document_text,
        "content_type": "text",
        "extraction_prompt": (
            "Extract all contractual obligations from this document. "
            "For each obligation, identify: who is obligated, what they must do, "
            "the deadline if specified, and any penalties for non-compliance."
        ),
        # An array of objects, one per obligation found in the document
        "extraction_schema": {
            "type": "object",
            "properties": {
                "obligations": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "party": {"type": "string"},
                            "obligation": {"type": "string"},
                            "deadline": {"type": "string"},
                            "penalty": {"type": "string"},
                        },
                    },
                }
            },
        },
    },
)
response.raise_for_status()  # stop early on HTTP errors
result = response.json()
print(f"Extracted by: {result['model_used']}")
for ob in result["formats"]["json"]["obligations"]:
    print(f"- {ob['party']}: {ob['obligation']}")
Best Practices
Use algorithmic extraction first
For HTML content with semantic structure (Schema.org, Open Graph, standard product markup), omit extraction_prompt and use extraction_schema or extraction_profile alone. This avoids the $0.001 LLM invocation fee and all provider token costs, and the output is deterministic.
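One way to make the algorithmic path the default is to build request bodies through a small helper that only adds extraction_prompt when you pass one explicitly. This is a minimal sketch: build_extract_payload is a hypothetical helper, and the profile value "product" is assumed from the scenario table rather than confirmed against the API reference.

```python
from typing import Any, Dict, Optional


def build_extract_payload(
    content: str,
    content_type: str = "html",
    schema: Optional[dict] = None,
    profile: Optional[str] = None,
    prompt: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a request body; the LLM is only invoked if `prompt` is given."""
    payload: Dict[str, Any] = {"content": content, "content_type": content_type}
    if schema is not None:
        payload["extraction_schema"] = schema
    if profile is not None:
        payload["extraction_profile"] = profile
    if prompt is not None:
        payload["extraction_prompt"] = prompt
    return payload


# Algorithmic request: no prompt, so no LLM fee and no token cost.
algorithmic = build_extract_payload("<html>...</html>", profile="product")
print(sorted(algorithmic))  # ['content', 'content_type', 'extraction_profile']
```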
Always pair prompts with schemas
LLM output without a schema may return inconsistent field names or types across calls. Adding extraction_schema forces typed, structured output that matches your data model.
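If you want to confirm on your side that a response matches your data model, a lightweight check against the same schema is enough for flat objects. The sketch below is an assumption-laden illustration: type_mismatches is a hypothetical helper that inspects only top-level fields, and a full JSON Schema validator would be the better choice for nested structures.

```python
from typing import List

# Mapping from JSON Schema type names to Python types.
JSON_TYPES = {
    "string": str,
    "boolean": bool,
    "number": (int, float),
    "integer": int,
    "array": list,
    "object": dict,
}


def type_mismatches(result: dict, schema: dict) -> List[str]:
    """List top-level fields that are missing or have the wrong type."""
    problems = []
    for name, spec in schema.get("properties", {}).items():
        if name not in result:
            problems.append(f"{name}: missing")
            continue
        declared = spec.get("type")
        expected = JSON_TYPES.get(declared)
        value = result[name]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        bool_as_number = isinstance(value, bool) and declared in ("number", "integer")
        if expected and (bool_as_number or not isinstance(value, expected)):
            problems.append(f"{name}: expected {declared}, got {type(value).__name__}")
    return problems


schema = {
    "type": "object",
    "properties": {
        "category": {"type": "string"},
        "urgency": {"type": "string"},
        "needs_human": {"type": "boolean"},
    },
}
print(type_mismatches({"category": "billing", "urgency": "high", "needs_human": "yes"}, schema))
# ['needs_human: expected boolean, got str']
```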
Write concise prompts
Prompts are included in the LLM context window, which affects token cost. Keep prompts under 500 characters for simple extraction tasks; the 2,000-character limit leaves room for genuinely complex multi-step reasoning.
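A length check before sending makes these thresholds hard to overlook. This is a sketch under assumptions: check_prompt is a hypothetical client-side helper, and raising an error above 2,000 characters is a choice made here, not documented API behavior.

```python
SIMPLE_PROMPT_TARGET = 500  # suggested ceiling for simple extraction tasks
PROMPT_LIMIT = 2_000        # maximum prompt length mentioned above


def check_prompt(prompt: str) -> str:
    """Return "ok" or "long"; raise ValueError if the prompt exceeds the limit."""
    length = len(prompt)
    if length > PROMPT_LIMIT:
        raise ValueError(f"prompt is {length} characters; limit is {PROMPT_LIMIT}")
    if length > SIMPLE_PROMPT_TARGET:
        return "long"  # acceptable, but worth trimming for simple tasks
    return "ok"


print(check_prompt("Extract the product name and price."))  # ok
```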
Truncate content before sending
AlterLab truncates content to 30K characters before passing it to the LLM. For very large pages, extract the relevant section first to reduce token cost and improve extraction accuracy.
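One simple way to extract the relevant section is to cut a window around a keyword you expect near the data you care about. The helper below is a hypothetical sketch, not an AlterLab feature; focus_content and the choice to keep 10% of the window as leading context are assumptions made for illustration.

```python
LLM_CONTENT_LIMIT = 30_000  # characters passed to the LLM, per the note above


def focus_content(text: str, keyword: str, limit: int = LLM_CONTENT_LIMIT) -> str:
    """Return at most `limit` characters, starting shortly before `keyword`."""
    if len(text) <= limit:
        return text
    pos = text.find(keyword)
    if pos == -1:
        # Keyword not found: fall back to the start of the document.
        return text[:limit]
    start = max(0, pos - limit // 10)  # keep a little leading context
    return text[start:start + limit]


page = "a" * 40_000 + "PRICE" + "b" * 40_000
section = focus_content(page, "PRICE")
print(len(section), "PRICE" in section)  # 30000 True
```

Without this step, the keyword in the example sits beyond the first 30K characters and would be cut off before the model sees it.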
Use field descriptions for disambiguation
When your schema has fields with ambiguous names (e.g., value, name), add JSON Schema description to each field. The LLM uses these to resolve ambiguity.
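As an illustration, the schema below uses the standard JSON Schema description keyword on two deliberately vague field names. The field meanings are made up for this example; substitute descriptions that match your own data.

```python
# Each description tells the model which of several plausible readings is meant.
extraction_schema = {
    "type": "object",
    "properties": {
        "name": {
            "type": "string",
            "description": "Product name as shown in the page title, not the brand or seller",
        },
        "value": {
            "type": "number",
            "description": "Current sale price as a number, without the currency symbol",
        },
    },
}

for field, spec in extraction_schema["properties"].items():
    print(f"{field}: {spec['description']}")
```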
Monitor your provider usage
Token costs accumulate independently on your provider account. Set spending limits directly with your provider (for example, OpenAI or Anthropic usage limits, or OpenRouter budget caps) to prevent unexpected costs.