How to Bypass Cloudflare Bot Protection with Puppeteer in 2026
Yash Dubey
February 19, 2026
Cloudflare protects roughly 20% of all websites on the internet. If you are scraping anything at scale, you will hit Cloudflare's bot detection sooner or later. Puppeteer is the tool most developers reach for first, and it is also the tool Cloudflare has spent the most effort detecting.
This guide covers exactly what Cloudflare looks for, how to make Puppeteer less detectable, and where the limits of DIY solutions actually are.
How Cloudflare Detects Puppeteer
Before writing a single line of bypass code, you need to understand what you are actually fighting. Cloudflare does not rely on a single detection method. It layers several independent signals, six of which are covered below, and makes a composite decision.
TLS Fingerprinting
Every TLS ClientHello message carries a fingerprint. The TLS version, the order of cipher suites, the supported extensions, the elliptic curves, and the point formats together form a signature that identifies the client software. Headless Chrome's TLS fingerprint is different from regular Chrome's. Cloudflare compares your ClientHello against a database of known browser fingerprints using JA3/JA4 hashing.
This is the hardest signal to fake. Page traffic goes through the network stack of the Chromium build that Puppeteer launches, not through Node.js, and you cannot change that stack from JavaScript. If the JA3 hash of your connection does not match a real browser, Cloudflare can flag you during the handshake, before your first HTTP request is even sent.
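To make the idea of a TLS fingerprint concrete, here is a minimal sketch of how a JA3-style hash is derived: five ClientHello fields are joined into one string and hashed with MD5. The function name `ja3Fingerprint` and the numeric field values are illustrative assumptions, not captured from a real browser.

```javascript
const crypto = require('node:crypto');

// Build a JA3-style string from ClientHello fields and hash it with MD5.
// Layout: version,ciphers,extensions,curves,pointFormats (lists joined by "-").
function ja3Fingerprint(hello) {
  const ja3String = [
    hello.tlsVersion,
    hello.cipherSuites.join('-'),
    hello.extensions.join('-'),
    hello.ellipticCurves.join('-'),
    hello.pointFormats.join('-'),
  ].join(',');
  const hash = crypto.createHash('md5').update(ja3String).digest('hex');
  return { ja3String, hash };
}

// Illustrative values only (assumption), not a real browser's ClientHello.
const baseline = ja3Fingerprint({
  tlsVersion: 771,
  cipherSuites: [4865, 4866, 4867],
  extensions: [0, 23, 65281, 10, 11],
  ellipticCurves: [29, 23, 24],
  pointFormats: [0],
});

// Same cipher suites in a different order, as another client might send them.
const reordered = ja3Fingerprint({
  tlsVersion: 771,
  cipherSuites: [4866, 4865, 4867],
  extensions: [0, 23, 65281, 10, 11],
  ellipticCurves: [29, 23, 24],
  pointFormats: [0],
});

console.log(baseline.ja3String);
console.log(baseline.hash === reordered.hash); // false
```

Reordering a single cipher suite changes the hash, which is why a client cannot pass as a browser just by sending the right headers.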
JavaScript Challenge (5-Second Shield)
The classic "Checking your browser" interstitial. Cloudflare serves a page that runs JavaScript to fingerprint the browser environment, then sets a cf_clearance cookie if the check passes. The JavaScript looks for headless browser indicators: navigator.webdriver being true, missing plugins, incorrect screen dimensions, and dozens of other signals.
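The kind of environment check described here can be sketched as a function over a plain object that stands in for `navigator` and `window`. This is a hypothetical illustration of the idea, not Cloudflare's actual challenge code; the function name, the property names on `env`, and the specific checks are assumptions.

```javascript
// Hypothetical sketch: collect the headless indicators found in an environment.
// `env` is a plain object standing in for navigator/window values.
function collectHeadlessSignals(env) {
  const signals = [];
  if (env.webdriver === true) signals.push('navigator.webdriver is true');
  if (!env.plugins || env.plugins.length === 0) signals.push('no plugins');
  if (/HeadlessChrome/.test(env.userAgent || '')) signals.push('headless user agent');
  if (env.outerWidth === 0 || env.outerHeight === 0) signals.push('zero-sized window');
  if (!env.languages || env.languages.length === 0) signals.push('empty languages');
  return signals;
}

// Values resembling an unpatched headless session (illustrative).
const headlessEnv = {
  webdriver: true,
  plugins: [],
  userAgent: 'Mozilla/5.0 (X11; Linux x86_64) HeadlessChrome/120.0.0.0 Safari/537.36',
  outerWidth: 0,
  outerHeight: 0,
  languages: [],
};

// Values resembling a regular desktop session (illustrative).
const desktopEnv = {
  webdriver: false,
  plugins: [{ name: 'PDF Viewer' }],
  userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0.0.0 Safari/537.36',
  outerWidth: 1920,
  outerHeight: 1080,
  languages: ['en-US', 'en'],
};

console.log(collectHeadlessSignals(headlessEnv).length); // 5
console.log(collectHeadlessSignals(desktopEnv).length); // 0
```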
Managed Challenges
Cloudflare's adaptive challenge system. Instead of always showing a CAPTCHA, it silently evaluates the client. If the browser environment looks human enough, the challenge resolves automatically. If it does not, the user gets a Turnstile widget or a full CAPTCHA.
Turnstile
Cloudflare's replacement for reCAPTCHA. Turnstile performs invisible proof-of-work challenges and collects behavioral signals (mouse movement, keyboard timing, interaction patterns). It is embedded on pages as a widget and increasingly replaces managed challenges.
HTTP/2 Fingerprinting
Beyond TLS, Cloudflare analyzes HTTP/2 connection parameters: SETTINGS frame values, WINDOW_UPDATE sizes, header compression (HPACK) behavior, and stream priority. Headless Chrome has distinct HTTP/2 behavior that does not match regular Chrome.
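An HTTP/2 fingerprint can be pictured as a string built from the connection parameters listed above. The sketch below is loosely modeled on the publicly described format of SETTINGS, WINDOW_UPDATE, priority frames, and pseudo-header order joined by `|`. The function name, the exact separators, and all values are assumptions for illustration, not what Cloudflare computes.

```javascript
// Hypothetical sketch of an HTTP/2 connection fingerprint string:
// SETTINGS | WINDOW_UPDATE increment | PRIORITY frames | pseudo-header order
function http2Fingerprint(conn) {
  const settings = conn.settings.map(([id, value]) => `${id}:${value}`).join(';');
  const priority = conn.priorityFrames.length > 0 ? conn.priorityFrames.join(',') : '0';
  const pseudoHeaders = conn.pseudoHeaderOrder.join(',');
  return [settings, conn.windowUpdate, priority, pseudoHeaders].join('|');
}

// Illustrative values (assumption) for a browser-like client.
const browserLike = {
  settings: [[1, 65536], [2, 0], [4, 6291456], [6, 262144]],
  windowUpdate: 15663105,
  priorityFrames: [],
  pseudoHeaderOrder: ['m', 'a', 's', 'p'], // :method, :authority, :scheme, :path
};

// Illustrative values (assumption) for a generic HTTP library.
const libraryLike = {
  settings: [[3, 100], [4, 65535]],
  windowUpdate: 1073676289,
  priorityFrames: [],
  pseudoHeaderOrder: ['m', 'p', 's', 'a'],
};

console.log(http2Fingerprint(browserLike));
console.log(http2Fingerprint(browserLike) === http2Fingerprint(libraryLike)); // false
```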
Canvas and WebGL Fingerprinting
The JavaScript challenge renders invisible canvas elements and checks WebGL renderer strings. Headless Chrome reports different GPU information than headed Chrome, and canvas rendering produces slightly different pixel values.
Step 1: Basic Puppeteer Setup
Start with a clean Puppeteer installation. The goal is to build up defenses incrementally so you can see what each layer actually fixes.
const puppeteer = require('puppeteer');

async function scrape(url) {
  const browser = await puppeteer.launch({
    headless: true, // new headless mode; releases before Puppeteer v22 spelled this 'new'
    args: [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--disable-dev-shm-usage',
      '--disable-accelerated-2d-canvas',
      '--disable-gpu',
      '--window-size=1920,1080',
    ],
  });
  const page = await browser.newPage();
  await page.setViewport({ width: 1920, height: 1080 });
  await page.setUserAgent(
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
      '(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'
  );
  await page.goto(url, { waitUntil: 'networkidle2', timeout: 60000 });
  const html = await page.content();
  await browser.close();
  return html;
}

scrape('https://target-site.com').then(console.log);

This will fail on any Cloudflare-protected site. The navigator.webdriver flag is set, the headless Chrome fingerprint is exposed, and the TLS signature does not match. But it gives you a baseline to build from.
Step 2: puppeteer-extra-plugin-stealth
The puppeteer-extra ecosystem is the de facto standard for Puppeteer fingerprint evasion. The stealth plugin patches about a dozen known detection vectors.
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');

puppeteer.use(StealthPlugin());

async function scrape(url) {
  const browser = await puppeteer.launch({
    headless: true, // new headless mode; releases before Puppeteer v22 spelled this 'new'
    args: [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--disable-dev-shm-usage',
      '--window-size=1920,1080',
    ],
  });
  const page = await browser.newPage();
  await page.setViewport({ width: 1920, height: 1080 });
  await page.goto(url, { waitUntil: 'networkidle2', timeout: 60000 });
  const html = await page.content();
  await browser.close();
  return html;
}

The stealth plugin handles:
- Removing the navigator.webdriver flag
- Faking navigator.plugins and navigator.mimeTypes
- Spoofing chrome.runtime and chrome.loadTimes
- Patching Permissions.query for notifications
- Fixing navigator.languages to include multiple entries
- Overriding HTMLMediaElement.canPlayType codecs
- Spoofing WebGL vendor and renderer strings
This gets you past basic checks, but Cloudflare's managed challenges and Turnstile will still catch you. The stealth plugin has not kept pace with Cloudflare's detection evolution in 2025-2026.
The overall workflow has four steps:
- Install the stealth plugin: npm install puppeteer-extra puppeteer-extra-plugin-stealth
- Patch the browser fingerprint: the stealth plugin applies 12+ patches to hide automation indicators
- Handle the Cloudflare challenge: wait for the cf_clearance cookie after the JavaScript challenge resolves
- Extract data: read the page content after Cloudflare clears the request
Step 3: Waiting for Cloudflare Challenges
Cloudflare challenges take time. If you navigate and immediately try to read the page, you will get the challenge HTML instead of the actual content.
async function waitForCloudflare(page, url) {
  // page.waitForTimeout() was removed in Puppeteer v22, so use a plain timer
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

  await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 60000 });

  // Wait for Cloudflare challenge to resolve
  const maxWait = 30000;
  const start = Date.now();
  while (Date.now() - start < maxWait) {
    const title = await page.title();
    // Cloudflare challenge pages have specific titles
    if (
      title.includes('Just a moment') ||
      title.includes('Checking your browser') ||
      title.includes('Attention Required')
    ) {
      await sleep(1000);
      continue;
    }
    // Check if cf_clearance cookie exists
    const cookies = await page.cookies();
    const hasClearance = cookies.some((c) => c.name === 'cf_clearance');
    if (hasClearance) {
      // Wait a bit more for page to fully load after challenge
      await sleep(2000);
      break;
    }
    await sleep(500);
  }
  // If the loop timed out, this is still the challenge HTML
  return await page.content();
}

This loop waits for the challenge to resolve by checking the page title and looking for the cf_clearance cookie. On sites using only the JS challenge (not Turnstile), this sometimes works with the stealth plugin. On sites with managed challenges, it usually times out.
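The decision made on each pass of that loop can be pulled out into a pure function, which makes it testable without launching a browser. This is a minimal sketch; `classifyChallengeState` and the state names are hypothetical, and the title strings are the ones used in the loop above.

```javascript
// Titles that the wait loop treats as "still on the challenge page".
const CHALLENGE_TITLES = ['Just a moment', 'Checking your browser', 'Attention Required'];

// Returns 'challenge' while the interstitial is showing, 'cleared' once the
// cf_clearance cookie is present, and 'pending' when neither is true yet.
function classifyChallengeState(title, cookies) {
  if (CHALLENGE_TITLES.some((t) => title.includes(t))) return 'challenge';
  if (cookies.some((c) => c.name === 'cf_clearance')) return 'cleared';
  return 'pending';
}

console.log(classifyChallengeState('Just a moment...', [])); // challenge
console.log(classifyChallengeState('Product list', [{ name: 'cf_clearance' }])); // cleared
console.log(classifyChallengeState('Product list', [])); // pending
```

The 'pending' state matters in practice: a page that never receives a cf_clearance cookie keeps the loop polling until maxWait expires.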
Step 4: Cookie Persistence and Session Reuse
Cloudflare's cf_clearance cookie is valid for a set duration (usually 15-30 minutes). Instead of solving the challenge on every request, save and reuse cookies.
const fs = require('fs');

async function saveCookies(page, filePath) {
  const cookies = await page.cookies();
  fs.writeFileSync(filePath, JSON.stringify(cookies, null, 2));
}

async function loadCookies(page, filePath) {
  if (!fs.existsSync(filePath)) return false;
  const cookies = JSON.parse(fs.readFileSync(filePath, 'utf8'));
  const now = Date.now() / 1000;
  // Keep session cookies (expires unset or negative) and cookies that have not expired
  const validCookies = cookies.filter(
    (c) => c.expires === undefined || c.expires < 0 || c.expires > now
  );
  if (validCookies.length === 0) return false;
  await page.setCookie(...validCookies);
  return true;
}

async function scrapeWithCookies(url) {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();
  const cookieFile = './cf_cookies.json';

  // Try loading existing cookies first
  await loadCookies(page, cookieFile);
  await page.goto(url, { waitUntil: 'networkidle2', timeout: 60000 });

  // Check if we still hit the challenge
  const title = await page.title();
  if (title.includes('Just a moment')) {
    // Cookies expired or invalid, need to solve challenge again
    await waitForCloudflare(page, url);
  }

  await saveCookies(page, cookieFile);
  const html = await page.content();
  await browser.close();
  return html;
}

Session reuse cuts down on challenge solves, but the cookies are tied to your IP address. If your IP changes (which happens with proxy rotation), the cookies become invalid.
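Because the cookies are tied to an IP address, one way to keep sessions from overwriting each other under proxy rotation is to store a separate cookie file per proxy. The sketch below assumes that approach. The names `cookieFileForProxy` and `filterUnexpired`, the `cookies` directory, and the proxy URLs are hypothetical, and it assumes session cookies are reported with a negative or missing `expires`.

```javascript
const crypto = require('node:crypto');
const path = require('node:path');

// Derive a stable cookie file name from the proxy in use, so each exit IP
// keeps its own cf_clearance cookie.
function cookieFileForProxy(proxyUrl, dir = 'cookies') {
  const label = proxyUrl || 'direct';
  const key = crypto.createHash('sha256').update(label).digest('hex').slice(0, 12);
  return path.join(dir, `cf_cookies_${key}.json`);
}

// Keep session cookies (expires missing or negative) and unexpired cookies.
function filterUnexpired(cookies, nowSeconds = Date.now() / 1000) {
  return cookies.filter(
    (c) => c.expires === undefined || c.expires < 0 || c.expires > nowSeconds
  );
}

const fileA = cookieFileForProxy('http://proxy-a.example:8000');
const fileB = cookieFileForProxy('http://proxy-b.example:8000');
console.log(fileA !== fileB); // true

const kept = filterUnexpired(
  [
    { name: 'cf_clearance', expires: 2000 },
    { name: 'stale', expires: 500 },
    { name: 'session', expires: -1 },
  ],
  1000
);
console.log(kept.map((c) => c.name)); // [ 'cf_clearance', 'session' ]
```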
Step 5: Request Interception
Cloudflare fingerprints you based on what your browser requests and how it handles responses. Request interception lets you modify headers and block fingerprinting scripts.
async function setupInterception(page) {
  await page.setRequestInterception(true);
  page.on('request', (request) => {
    const url = request.url();

    // Don't block Cloudflare challenge scripts
    if (
      url.includes('/cdn-cgi/challenge-platform') ||
      url.includes('challenges.cloudflare.com')
    ) {
      request.continue();
      return;
    }

    // Block unnecessary resources to speed up loading
    const blockedTypes = ['image', 'media', 'font'];
    if (blockedTypes.includes(request.resourceType())) {
      request.abort();
      return;
    }

    // The sec-fetch-* values below describe a top-level navigation, so leave
    // scripts, XHR, and other subresource requests untouched
    if (!request.isNavigationRequest()) {
      request.continue();
      return;
    }

    // Modify headers to look more like a real browser.
    // The versions in sec-ch-ua must match the browser you actually launch.
    const headers = {
      ...request.headers(),
      'accept-language': 'en-US,en;q=0.9',
      'sec-ch-ua': '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
      'sec-ch-ua-mobile': '?0',
      'sec-ch-ua-platform': '"Windows"',
      'sec-fetch-dest': 'document',
      'sec-fetch-mode': 'navigate',
      'sec-fetch-site': 'none',
      'sec-fetch-user': '?1',
      'upgrade-insecure-requests': '1',
    };
    request.continue({ headers });
  });
}

A warning: blocking Cloudflare's challenge scripts will prevent the clearance cookie from being issued. Only block resources you know are safe to skip. And be careful with header overrides. If the sec-ch-ua version does not match your actual browser version, Cloudflare catches the mismatch.
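The version mismatch described in that warning is easy to check before sending anything. The sketch below compares the Chrome major version in a User-Agent string with the one in a sec-ch-ua header. The helper names are hypothetical and the regular expressions assume the common Chrome formats shown in this article.

```javascript
// Extract the Chrome major version from a User-Agent string, or null.
function chromeMajorFromUserAgent(userAgent) {
  const match = /Chrome\/(\d+)\./.exec(userAgent);
  return match ? Number(match[1]) : null;
}

// Extract the Chrome/Chromium major version from a sec-ch-ua header, or null.
function chromeMajorFromSecChUa(secChUa) {
  const match = /"(?:Google Chrome|Chromium)";v="(\d+)"/.exec(secChUa);
  return match ? Number(match[1]) : null;
}

// True only when both versions are present and equal.
function clientHintsMatch(userAgent, secChUa) {
  const uaMajor = chromeMajorFromUserAgent(userAgent);
  const hintMajor = chromeMajorFromSecChUa(secChUa);
  return uaMajor !== null && uaMajor === hintMajor;
}

const secChUa = '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"';
const ua120 =
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
  '(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36';
const ua131 = ua120.replace('Chrome/120', 'Chrome/131');

console.log(clientHintsMatch(ua120, secChUa)); // true
console.log(clientHintsMatch(ua131, secChUa)); // false
```

In a Puppeteer script, the User-Agent string to compare against could come from `await browser.userAgent()`, so the override is validated against the browser that was actually launched.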
Step 6: Using a Real Browser Profile
One of the more effective approaches is running Puppeteer with a persistent browser profile that has real browsing history, cached data, and stored cookies.
const path = require('path');

async function scrapeWithProfile(url) {
  const userDataDir = path.join(__dirname, 'chrome-profile');
  const browser = await puppeteer.launch({
    headless: false, // headed mode is harder to detect
    userDataDir, // persistent profile
    args: [
      '--no-sandbox',
      '--window-size=1920,1080',
      '--disable-blink-features=AutomationControlled',
    ],
    ignoreDefaultArgs: ['--enable-automation'],
  });
  const page = await browser.newPage();

  // Emulate human-like behavior
  await page.evaluateOnNewDocument(() => {
    // Override the webdriver property
    Object.defineProperty(navigator, 'webdriver', {
      get: () => undefined,
    });
    // Add realistic plugins
    Object.defineProperty(navigator, 'plugins', {
      get: () => [
        { name: 'Chrome PDF Plugin', filename: 'internal-pdf-viewer' },
        { name: 'Chrome PDF Viewer', filename: 'mhjfbmdgcfjbbpaeojofohoefgiehjai' },
        { name: 'Native Client', filename: 'internal-nacl-plugin' },
      ],
    });
  });

  await page.goto(url, { waitUntil: 'networkidle2', timeout: 60000 });
  const html = await page.content();
  await browser.close();
  return html;
}

Using headless: false with --disable-blink-features=AutomationControlled and a persistent profile gets past more checks than headless mode. The trade-off is that you need a display server (Xvfb on Linux) and it uses more memory per instance.
Step 7: Handling Turnstile Challenges
Turnstile is the hardest Cloudflare challenge to bypass programmatically. It collects behavioral data and performs proof-of-work challenges. There is no reliable Puppeteer-only solution for Turnstile in 2026.
async function handleTurnstile(page) {
  // page.waitForTimeout() was removed in Puppeteer v22, so use a plain timer
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

  // Wait for the Turnstile iframe to appear
  const turnstileFrame = await page.waitForSelector(
    'iframe[src*="challenges.cloudflare.com"]',
    { timeout: 10000 }
  ).catch(() => null);
  if (!turnstileFrame) return true; // No Turnstile, proceed

  // Get the iframe content
  const frame = await turnstileFrame.contentFrame();
  if (!frame) return false;

  // Wait for the checkbox to appear
  const checkbox = await frame.waitForSelector(
    '#cf-turnstile-response, .cf-turnstile-wrapper input',
    { timeout: 10000 }
  ).catch(() => null);
  if (!checkbox) return false;

  // Simulate human mouse movement toward the checkbox
  const box = await checkbox.boundingBox();
  if (!box) return false;
  await page.mouse.move(
    box.x + box.width / 2 + (Math.random() * 10 - 5),
    box.y + box.height / 2 + (Math.random() * 10 - 5),
    { steps: 25 }
  );
  await sleep(200 + Math.random() * 300);
  await checkbox.click();

  // Wait for Turnstile to process
  await sleep(5000);

  // Check if challenge was solved
  const cookies = await page.cookies();
  return cookies.some((c) => c.name === 'cf_clearance');
}

This code shows the approach, but it has a low success rate in practice. Turnstile's behavioral analysis is sophisticated enough to distinguish scripted clicks from human interaction. The proof-of-work component also gets harder when Cloudflare suspects automation.
The Detection vs Evasion Arms Race
Here is the reality of bypassing Cloudflare with Puppeteer in 2026:
| Feature | Detection Layer | Puppeteer+Stealth | Headed+Profile | Scraping API |
|---|---|---|---|---|
| navigator.webdriver | | | | |
| Plugin/MIME spoofing | | | | |
| JS Challenge (basic) | | | | |
| TLS Fingerprinting | | | | |
| HTTP/2 Fingerprinting | | | | |
| Managed Challenges | | | | |
| Turnstile | | | | |
| Canvas/WebGL | | | | |
The fundamental problem is not code quality. It is architectural. Puppeteer drives the browser over the Chrome DevTools Protocol (CDP), and Cloudflare can detect CDP connections. The stealth plugin patches visible JavaScript APIs, but it cannot change the TLS stack, the HTTP/2 behavior, or the underlying connection characteristics that Cloudflare checks at the network level.
Every time the community finds a new bypass, Cloudflare patches it. The average lifespan of a new stealth technique in 2026 is about 48 hours before detection signatures are updated.
Combining Everything: A Production-Grade Attempt
Here is a complete script that combines all the techniques above into a single, reasonably robust scraper:
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
const fs = require('fs');
const path = require('path');

puppeteer.use(StealthPlugin());

// page.waitForTimeout() was removed in Puppeteer v22, so use a plain timer
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

class CloudflareScraper {
  constructor(options = {}) {
    this.profileDir = options.profileDir || path.join(__dirname, 'chrome-profile');
    this.cookieFile = options.cookieFile || path.join(__dirname, 'cookies.json');
    this.maxRetries = options.maxRetries || 3;
    this.browser = null;
  }

  async init() {
    this.browser = await puppeteer.launch({
      headless: false,
      userDataDir: this.profileDir,
      args: [
        '--no-sandbox',
        '--disable-setuid-sandbox',
        '--disable-dev-shm-usage',
        '--window-size=1920,1080',
        '--disable-blink-features=AutomationControlled',
        '--lang=en-US,en',
      ],
      ignoreDefaultArgs: ['--enable-automation'],
    });
  }

  async createPage() {
    const page = await this.browser.newPage();
    await page.setViewport({ width: 1920, height: 1080 });
    await page.evaluateOnNewDocument(() => {
      Object.defineProperty(navigator, 'webdriver', { get: () => undefined });
      // Bind the original method so it keeps its Permissions receiver;
      // calling it unbound throws "Illegal invocation"
      const permissions = window.navigator.permissions;
      const originalQuery = permissions.query.bind(permissions);
      permissions.query = (parameters) =>
        parameters.name === 'notifications'
          ? Promise.resolve({ state: Notification.permission })
          : originalQuery(parameters);
      window.chrome = { runtime: {}, loadTimes: () => ({}) };
    });
    await this.loadCookies(page);
    return page;
  }

  async scrape(url) {
    let lastError = new Error('No attempts were made');
    for (let attempt = 0; attempt < this.maxRetries; attempt++) {
      let page;
      try {
        page = await this.createPage();
        await page.goto(url, {
          waitUntil: 'domcontentloaded',
          timeout: 60000,
        });
        const resolved = await this.waitForChallenge(page);
        if (!resolved) {
          // Throw so the failure is recorded and the retry delay applies
          throw new Error('Cloudflare challenge did not resolve');
        }
        await this.saveCookies(page);
        return await page.content();
      } catch (err) {
        lastError = err;
        console.error(`Attempt ${attempt + 1} failed: ${err.message}`);
        await sleep(2000 * (attempt + 1));
      } finally {
        // Close the page on success and on failure so tabs do not leak
        if (page) await page.close().catch(() => {});
      }
    }
    throw lastError;
  }

  async waitForChallenge(page) {
    const maxWait = 30000;
    const start = Date.now();
    while (Date.now() - start < maxWait) {
      const title = await page.title();
      if (
        !title.includes('Just a moment') &&
        !title.includes('Checking your browser') &&
        !title.includes('Attention Required')
      ) {
        return true;
      }
      const hasTurnstile = await page.$('iframe[src*="challenges.cloudflare.com"]');
      if (hasTurnstile) {
        console.warn('Turnstile detected. Automated bypass unlikely.');
        return false;
      }
      await sleep(1000);
    }
    return false;
  }

  async saveCookies(page) {
    const cookies = await page.cookies();
    fs.writeFileSync(this.cookieFile, JSON.stringify(cookies, null, 2));
  }

  async loadCookies(page) {
    if (!fs.existsSync(this.cookieFile)) return;
    const cookies = JSON.parse(fs.readFileSync(this.cookieFile, 'utf8'));
    const now = Date.now() / 1000;
    // Keep session cookies (expires unset or negative) and unexpired cookies
    const valid = cookies.filter(
      (c) => c.expires === undefined || c.expires < 0 || c.expires > now
    );
    if (valid.length > 0) await page.setCookie(...valid);
  }

  async close() {
    if (this.browser) await this.browser.close();
  }
}

// Usage
(async () => {
  const scraper = new CloudflareScraper();
  await scraper.init();
  try {
    const html = await scraper.scrape('https://target-site.com');
    console.log(`Got ${html.length} bytes`);
  } finally {
    await scraper.close();
  }
})();

This is about as far as you can get with Puppeteer alone. It works against sites using only the JavaScript challenge. Against sites with Turnstile or aggressive managed challenges, the success rate drops below 20%.
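The retry loop in the script waits 2000 * (attempt + 1) milliseconds between attempts, which is a linear backoff. A minimal sketch of that schedule as a standalone function is shown below, with optional jitter added as an assumption so that parallel workers do not retry in lockstep. The name `retryDelayMs` and its parameters are hypothetical.

```javascript
// Linear backoff matching the scraper's 2000 * (attempt + 1) schedule.
// jitterRatio spreads each delay by up to +/- that fraction (0 disables it).
// `random` is injectable so the function can be tested deterministically.
function retryDelayMs(attempt, baseMs = 2000, jitterRatio = 0, random = Math.random) {
  const delay = baseMs * (attempt + 1);
  const jitter = delay * jitterRatio * (random() * 2 - 1);
  return Math.round(delay + jitter);
}

console.log([0, 1, 2].map((attempt) => retryDelayMs(attempt))); // [ 2000, 4000, 6000 ]

// With 25% jitter, the first delay falls between 1500 and 2500 ms.
console.log(retryDelayMs(0, 2000, 0.25, () => 0)); // 1500
console.log(retryDelayMs(0, 2000, 0.25, () => 1)); // 2500
```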
When to Use Puppeteer and When to Use an API
Puppeteer is the right tool when you are scraping sites with no bot protection or basic JavaScript challenges. It gives you full control over the browser, and the stealth plugin handles the easy stuff. For hobby projects, small-scale data collection, or targets that do not use Cloudflare, Puppeteer works great.
But Cloudflare's detection has outpaced what the open-source stealth community can keep up with. The core problem is that Puppeteer cannot modify its TLS fingerprint, HTTP/2 behavior, or low-level network characteristics. These are the signals Cloudflare relies on most heavily in 2026.
If you are scraping Cloudflare-protected sites at production scale, you will spend more time maintaining evasion patches than building your actual product.
The API Alternative
Scraping APIs solve the Cloudflare problem at the infrastructure level. Instead of patching a headless browser, they use custom-built HTTP stacks with real browser TLS fingerprints, residential proxy networks, and challenge-solving pipelines.
AlterLab handles Cloudflare bypass (including Turnstile) at the API level. You send a URL, get back clean HTML. No browser fingerprinting, no stealth plugins, no cookie management. The API routes through real browser fingerprints and residential IPs, maintaining a 98%+ success rate against Cloudflare.
// Using AlterLab API instead of Puppeteer
// Wrapped in an async function because top-level await is not available in CommonJS scripts
(async () => {
  const response = await fetch('https://api.alterlab.io/v1/scrape', {
    method: 'POST',
    headers: {
      'X-API-Key': 'your_api_key',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      url: 'https://cloudflare-protected-site.com',
      formats: ['html', 'markdown'],
    }),
  });
  const data = await response.json();
  console.log(data.content);
})();

The trade-off is cost vs control. DIY gives you full control but demands ongoing maintenance. An API costs per request but eliminates the infrastructure burden entirely.
Summary
Bypassing Cloudflare with Puppeteer is increasingly an uphill battle. Here is what works, what sometimes works, and what does not:
Works: Stealth plugin for basic JS challenges, cookie persistence, human-like interaction patterns, headed mode with persistent profiles.
Sometimes works: Request interception with correct headers, combining multiple stealth techniques, using --disable-blink-features=AutomationControlled.
Does not work: Any approach that ignores TLS fingerprinting, headless mode against managed challenges, automated Turnstile solving with Puppeteer alone.
For production scraping against Cloudflare, the honest answer is that Puppeteer alone is not enough. You either need to invest serious engineering time in building custom browser infrastructure with modified TLS stacks, or use a scraping API that has already solved these problems at the infrastructure level.
Pick the approach that matches your scale, budget, and how much time you want to spend fighting bot detection instead of building your product.