
How to Bypass Cloudflare Bot Protection with Puppeteer in 2026


Yash Dubey · February 19, 2026


Cloudflare protects roughly 20% of all websites on the internet. If you are scraping anything at scale, you will hit Cloudflare's bot detection sooner or later. Puppeteer is the tool most developers reach for first, and it is also the tool Cloudflare has spent the most effort detecting.

This guide covers exactly what Cloudflare looks for, how to make Puppeteer less detectable, and where the limits of DIY solutions actually are.

20%: Websites Behind Cloudflare

5+: Detection Layers

~48 hrs: Stealth Patch Lifespan

How Cloudflare Detects Puppeteer

Before writing a single line of bypass code, you need to understand what you are actually fighting. Cloudflare does not use one detection method. It layers five or six independent signals and makes a composite decision.

TLS Fingerprinting

Every TLS ClientHello message carries a fingerprint. The order of cipher suites, supported extensions, elliptic curves, and related parameters creates a recognizable signature. The Chromium build that Puppeteer drives can present a fingerprint that differs from the Chrome release your User-Agent claims. Cloudflare compares your ClientHello against a database of known browser fingerprints using JA3/JA4 hashing.

This is the hardest signal to fake. Page traffic goes through the bundled Chromium's own network stack, not Node.js, and you cannot change that stack from JavaScript. If the JA3 hash of your connection does not match a real browser, Cloudflare can flag you before your first HTTP request even arrives.
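To make the fingerprint idea concrete, here is a minimal sketch of how a JA3-style hash is derived from ClientHello fields. The field values are illustrative placeholders rather than a capture from a real browser, and `ja3String` and `ja3Hash` are hypothetical helper names.

```javascript
const crypto = require('crypto');

// JA3 joins five ClientHello fields with commas;
// multiple values inside one field are joined with dashes.
function ja3String({ version, ciphers, extensions, curves, pointFormats }) {
  return [
    version,
    ciphers.join('-'),
    extensions.join('-'),
    curves.join('-'),
    pointFormats.join('-'),
  ].join(',');
}

// The JA3 fingerprint is the MD5 hex digest of that string.
function ja3Hash(hello) {
  return crypto.createHash('md5').update(ja3String(hello)).digest('hex');
}

// Illustrative values only (assumption: not taken from a real browser)
const hello = {
  version: 771,
  ciphers: [4865, 4866, 4867],
  extensions: [0, 23, 65281, 10, 11],
  curves: [29, 23, 24],
  pointFormats: [0],
};

console.log(ja3String(hello));
console.log(ja3Hash(hello));
```

Because the hash covers ordering as well as content, two clients that offer the same cipher suites in a different order produce different JA3 values.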

JavaScript Challenge (5-Second Shield)

The classic "Checking your browser" interstitial. Cloudflare serves a page that runs JavaScript to fingerprint the browser environment, then sets a cf_clearance cookie if the check passes. The JavaScript looks for headless browser indicators: navigator.webdriver being true, missing plugins, incorrect screen dimensions, and dozens of other signals.
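As a rough illustration of the kind of environment checks involved, the sketch below inspects a navigator-like object and a screen-like object. The specific checks and the `headlessIndicators` name are assumptions made for illustration; Cloudflare's real challenge script is obfuscated and inspects far more than this.

```javascript
// Returns the names of simple automation indicators found on a
// navigator-like object and a screen-like object (both plain objects here).
function headlessIndicators(nav, screenInfo) {
  const found = [];
  if (nav.webdriver === true) found.push('webdriver');
  if (!nav.plugins || nav.plugins.length === 0) found.push('no-plugins');
  if (!nav.languages || nav.languages.length === 0) found.push('no-languages');
  if (/HeadlessChrome/.test(nav.userAgent || '')) found.push('headless-ua');
  if (screenInfo.width === 0 || screenInfo.height === 0) found.push('zero-screen');
  return found;
}

// Example: an environment that exposes several indicators at once
const suspicious = headlessIndicators(
  { webdriver: true, plugins: [], languages: ['en-US'], userAgent: 'HeadlessChrome/120.0.0.0' },
  { width: 1920, height: 1080 }
);
console.log(suspicious);
```

A real challenge combines many such signals into a composite decision instead of failing on any single one.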

Managed Challenges

Cloudflare's adaptive challenge system. Instead of always showing a CAPTCHA, it silently evaluates the client. If the browser environment looks human enough, the challenge resolves automatically. If it does not, the user gets a Turnstile widget or a full CAPTCHA.

Turnstile

Cloudflare's alternative to reCAPTCHA. Turnstile performs invisible proof-of-work challenges and collects behavioral signals (mouse movement, keyboard timing, interaction patterns). It is embedded on pages as a widget and is increasingly used in place of full-page managed challenges.

HTTP/2 Fingerprinting

Beyond TLS, Cloudflare analyzes HTTP/2 connection parameters: SETTINGS frame values, WINDOW_UPDATE sizes, header compression (HPACK) behavior, and stream priority. Headless Chrome has distinct HTTP/2 behavior that does not match regular Chrome.

Canvas and WebGL Fingerprinting

The JavaScript challenge renders invisible canvas elements and checks WebGL renderer strings. Headless Chrome reports different GPU information than headed Chrome, and canvas rendering produces slightly different pixel values.

Step 1: Basic Puppeteer Setup

Start with a clean Puppeteer installation. The goal is to build up defenses incrementally so you can see what each layer actually fixes.

JAVASCRIPT

const puppeteer = require('puppeteer');

async function scrape(url) {
  const browser = await puppeteer.launch({
    headless: true, // 'new' was dropped in Puppeteer v22; true now selects the new headless mode
    args: [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--disable-dev-shm-usage',
      '--disable-accelerated-2d-canvas',
      '--disable-gpu',
      '--window-size=1920,1080',
    ],
  });

  const page = await browser.newPage();

  await page.setViewport({ width: 1920, height: 1080 });
  await page.setUserAgent(
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
    '(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'
  );

  await page.goto(url, { waitUntil: 'networkidle2', timeout: 60000 });

  const html = await page.content();
  await browser.close();
  return html;
}

scrape('https://target-site.com').then(console.log);

This will fail on most Cloudflare-protected sites. The navigator.webdriver flag is set, the headless Chrome fingerprint is exposed, and the hardcoded Chrome/120 User-Agent may not match the network-level fingerprint of the Chromium build actually running. But it gives you a baseline to build from.

Step 2: puppeteer-extra-plugin-stealth

The puppeteer-extra ecosystem is the de facto standard for Puppeteer fingerprint evasion. The stealth plugin patches about a dozen known detection vectors.

JAVASCRIPT

const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');

puppeteer.use(StealthPlugin());

async function scrape(url) {
  const browser = await puppeteer.launch({
    headless: true, // 'new' was dropped in Puppeteer v22; true now selects the new headless mode
    args: [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--disable-dev-shm-usage',
      '--window-size=1920,1080',
    ],
  });

  const page = await browser.newPage();
  await page.setViewport({ width: 1920, height: 1080 });

  await page.goto(url, { waitUntil: 'networkidle2', timeout: 60000 });
  const html = await page.content();

  await browser.close();
  return html;
}

The stealth plugin handles:

Removing navigator.webdriver flag
Faking navigator.plugins and navigator.mimeTypes
Spoofing chrome.runtime and chrome.loadTimes
Patching Permissions.query for notifications
Fixing navigator.languages to include multiple entries
Overriding HTMLMediaElement.canPlayType codecs
Spoofing WebGL vendor and renderer strings

This gets you past basic checks, but Cloudflare's managed challenges and Turnstile will still catch you. The stealth plugin has not kept pace with Cloudflare's detection evolution in 2025-2026.

The overall flow with the stealth plugin:

Install Stealth Plugin: npm install puppeteer-extra puppeteer-extra-plugin-stealth
Patch Browser Fingerprint: the stealth plugin applies 12+ patches to hide automation indicators
Handle CF Challenge: wait for the cf_clearance cookie after the JavaScript challenge resolves
Extract Data: read page content after Cloudflare clears the request

Step 3: Waiting for Cloudflare Challenges

Cloudflare challenges take time. If you navigate and immediately try to read the page, you will get the challenge HTML instead of the actual content.

JAVASCRIPT

// page.waitForTimeout() was removed in Puppeteer v22, so use a plain timer
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function waitForCloudflare(page, url) {
  await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 60000 });

  // Wait for Cloudflare challenge to resolve
  const maxWait = 30000;
  const start = Date.now();

  while (Date.now() - start < maxWait) {
    const title = await page.title();

    // Cloudflare challenge pages have specific titles
    if (
      title.includes('Just a moment') ||
      title.includes('Checking your browser') ||
      title.includes('Attention Required')
    ) {
      await sleep(1000);
      continue;
    }

    // Check if cf_clearance cookie exists
    const cookies = await page.cookies();
    const hasClearance = cookies.some(c => c.name === 'cf_clearance');

    if (hasClearance) {
      // Wait a bit more for page to fully load after challenge
      await sleep(2000);
      break;
    }

    await sleep(500);
  }

  return await page.content();
}

This loop waits for the challenge to resolve by checking the page title and looking for the cf_clearance cookie. On sites using only the JS challenge (not Turnstile), this sometimes works with the stealth plugin. On sites with managed challenges, it usually times out.
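The same wait-and-check pattern can be factored into a small reusable helper. This is a sketch under stated assumptions: `isChallengeTitle`, `delay`, and `pollUntil` are hypothetical names, and the title strings are the ones used in the loop above.

```javascript
// Title fragments taken from the challenge check above
const CHALLENGE_TITLES = ['Just a moment', 'Checking your browser', 'Attention Required'];

function isChallengeTitle(title) {
  return CHALLENGE_TITLES.some((fragment) => title.includes(fragment));
}

const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Polls an async predicate until it returns true or the timeout elapses.
// Resolves to true on success and false on timeout.
async function pollUntil(predicate, { timeout = 30000, interval = 500 } = {}) {
  const deadline = Date.now() + timeout;
  while (Date.now() < deadline) {
    if (await predicate()) return true;
    await delay(interval);
  }
  return false;
}

// Usage with a Puppeteer page (not executed here):
// const cleared = await pollUntil(async () => !isChallengeTitle(await page.title()));
```

Returning a boolean instead of the page content lets the caller decide what to do on timeout, such as retrying or giving up.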

Step 4: Cookie Persistence and Session Reuse

Cloudflare's cf_clearance cookie is valid for a set duration chosen by the site owner (usually 15-30 minutes). Instead of solving the challenge on every request, save and reuse cookies.

JAVASCRIPT

const fs = require('fs');

async function saveCookies(page, filePath) {
  const cookies = await page.cookies();
  fs.writeFileSync(filePath, JSON.stringify(cookies, null, 2));
}

async function loadCookies(page, filePath) {
  if (!fs.existsSync(filePath)) return false;

  const cookies = JSON.parse(fs.readFileSync(filePath, 'utf8'));
  const now = Date.now() / 1000;

  // Filter out expired cookies (Puppeteer marks session cookies with expires: -1, so keep those)
  const validCookies = cookies.filter(
    c => !c.expires || c.expires < 0 || c.expires > now
  );

  if (validCookies.length === 0) return false;

  await page.setCookie(...validCookies);
  return true;
}

async function scrapeWithCookies(url) {
  const browser = await puppeteer.launch({ headless: true }); // 'new' was dropped in Puppeteer v22
  const page = await browser.newPage();
  const cookieFile = './cf_cookies.json';

  // Try loading existing cookies first
  const hasCookies = await loadCookies(page, cookieFile);

  await page.goto(url, { waitUntil: 'networkidle2', timeout: 60000 });

  // Check if we still hit the challenge
  const title = await page.title();
  if (title.includes('Just a moment')) {
    // Cookies expired or invalid, need to solve challenge again
    await waitForCloudflare(page, url);
  }

  await saveCookies(page, cookieFile);
  const html = await page.content();

  await browser.close();
  return html;
}

Session reuse cuts down on challenge solves, but the cookies are tied to your IP address and, in practice, to the User-Agent that solved the challenge. If your IP changes (which happens with proxy rotation), the cookies become invalid.
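Before reusing a saved session, it helps to check whether the stored clearance cookie is still live. A minimal sketch, assuming cookies in the shape Puppeteer returns (an `expires` value in seconds since the epoch, with -1 for session cookies); `isCookieLive` and `hasLiveClearance` are hypothetical helper names.

```javascript
// A cookie is live if it has no expiry, is a session cookie (expires < 0),
// or expires in the future. Times are in seconds since the epoch.
function isCookieLive(cookie, nowSeconds = Date.now() / 1000) {
  return !cookie.expires || cookie.expires < 0 || cookie.expires > nowSeconds;
}

// True when the list contains a cf_clearance cookie that has not expired.
function hasLiveClearance(cookies, nowSeconds = Date.now() / 1000) {
  return cookies.some((c) => c.name === 'cf_clearance' && isCookieLive(c, nowSeconds));
}

// Example with a fixed clock so the result does not depend on the current time
const saved = [
  { name: 'cf_clearance', value: 'abc', expires: 2000 },
  { name: 'session', value: 'xyz', expires: -1 },
];
console.log(hasLiveClearance(saved, 1000));
console.log(hasLiveClearance(saved, 3000));
```

Passing the clock in as a parameter keeps the check deterministic and easy to test.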

Step 5: Request Interception

Cloudflare fingerprints you based on what your browser requests and how it handles responses. Request interception lets you modify headers and block fingerprinting scripts.

JAVASCRIPT

async function setupInterception(page) {
  await page.setRequestInterception(true);

  page.on('request', (request) => {
    const url = request.url();

    // Don't block Cloudflare challenge scripts
    if (
      url.includes('/cdn-cgi/challenge-platform') ||
      url.includes('challenges.cloudflare.com')
    ) {
      request.continue();
      return;
    }

    // Block unnecessary resources to speed up loading
    const blockedTypes = ['image', 'media', 'font'];
    if (blockedTypes.includes(request.resourceType())) {
      request.abort();
      return;
    }

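    // Only the top-level navigation should carry document-style fetch metadata.
    // Applying these values to scripts, XHR, or iframes creates an inconsistent fingerprint.
    const isTopLevelNavigation =
      request.isNavigationRequest() && request.frame() === page.mainFrame();
    if (!isTopLevelNavigation) {
      request.continue();
      return;
    }

    // Modify headers to look more like a real browser
    const headers = {
      ...request.headers(),
      'accept-language': 'en-US,en;q=0.9',
      'sec-ch-ua': '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
      'sec-ch-ua-mobile': '?0',
      'sec-ch-ua-platform': '"Windows"',
      'sec-fetch-dest': 'document',
      'sec-fetch-mode': 'navigate',
      'sec-fetch-site': 'none',
      'sec-fetch-user': '?1',
      'upgrade-insecure-requests': '1',
    };

    request.continue({ headers });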
  });
}

A warning: blocking Cloudflare's challenge scripts will prevent the clearance cookie from being issued. Only block resources you know are safe to skip. And be careful with header overrides. If the sec-ch-ua version does not match your actual browser version, Cloudflare catches the mismatch.
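One way to guard against that mismatch is to verify the two values agree before sending them. The sketch below is an illustrative check, not part of Puppeteer; `chromeMajorFromUA`, `majorsFromSecChUa`, and `headersConsistent` are hypothetical helper names.

```javascript
// Extracts the Chrome major version from a User-Agent string, or null if absent.
function chromeMajorFromUA(userAgent) {
  const match = /Chrome\/(\d+)\./.exec(userAgent);
  return match ? Number(match[1]) : null;
}

// Parses a sec-ch-ua value into a { brand: majorVersion } map.
function majorsFromSecChUa(secChUa) {
  const majors = {};
  for (const m of secChUa.matchAll(/"([^"]+)";v="(\d+)"/g)) {
    majors[m[1]] = Number(m[2]);
  }
  return majors;
}

// True when the Chromium brand version in sec-ch-ua equals the UA's Chrome major.
function headersConsistent(userAgent, secChUa) {
  const uaMajor = chromeMajorFromUA(userAgent);
  const brands = majorsFromSecChUa(secChUa);
  return uaMajor !== null && brands['Chromium'] === uaMajor;
}

const ua =
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
  '(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36';
const hints = '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"';
console.log(headersConsistent(ua, hints));
```

This only checks the header strings against each other. The version the running browser reports (for example via `browser.version()`) still has to match both.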

Step 6: Using a Real Browser Profile

One of the more effective approaches is running Puppeteer with a persistent browser profile that has real browsing history, cached data, and stored cookies.

JAVASCRIPT

const path = require('path');

async function scrapeWithProfile(url) {
  const userDataDir = path.join(__dirname, 'chrome-profile');

  const browser = await puppeteer.launch({
    headless: false,  // headed mode is harder to detect
    userDataDir,      // persistent profile
    args: [
      '--no-sandbox',
      '--window-size=1920,1080',
      '--disable-blink-features=AutomationControlled',
    ],
    ignoreDefaultArgs: ['--enable-automation'],
  });

  const page = await browser.newPage();

  // Emulate human-like behavior
  await page.evaluateOnNewDocument(() => {
    // Override the webdriver property
    Object.defineProperty(navigator, 'webdriver', {
      get: () => undefined,
    });

    // Add realistic plugins
    Object.defineProperty(navigator, 'plugins', {
      get: () => [
        { name: 'Chrome PDF Plugin', filename: 'internal-pdf-viewer' },
        { name: 'Chrome PDF Viewer', filename: 'mhjfbmdgcfjbbpaeojofohoefgiehjai' },
        { name: 'Native Client', filename: 'internal-nacl-plugin' },
      ],
    });
  });

  await page.goto(url, { waitUntil: 'networkidle2', timeout: 60000 });
  const html = await page.content();

  await browser.close();
  return html;
}

Using headless: false with --disable-blink-features=AutomationControlled and a persistent profile gets past more checks than headless mode. The trade-off is that you need a display server (Xvfb on Linux) and it uses more memory per instance.

Step 7: Handling Turnstile Challenges

Turnstile is the hardest Cloudflare challenge to bypass programmatically. It collects behavioral data and performs proof-of-work challenges. There is no reliable Puppeteer-only solution for Turnstile in 2026.

JAVASCRIPT

async function handleTurnstile(page) {
  // Wait for the Turnstile iframe to appear
  const turnstileFrame = await page.waitForSelector(
    'iframe[src*="challenges.cloudflare.com"]',
    { timeout: 10000 }
  ).catch(() => null);

  if (!turnstileFrame) return true; // No Turnstile, proceed

  // Get the iframe content
  const frame = await turnstileFrame.contentFrame();
  if (!frame) return false;

  // Wait for the checkbox to appear
  const checkbox = await frame.waitForSelector(
    '#cf-turnstile-response, .cf-turnstile-wrapper input',
    { timeout: 10000 }
  ).catch(() => null);

  if (!checkbox) return false;

  // Simulate human mouse movement toward the checkbox
  const box = await checkbox.boundingBox();
  if (!box) return false;

  await page.mouse.move(
    box.x + box.width / 2 + (Math.random() * 10 - 5),
    box.y + box.height / 2 + (Math.random() * 10 - 5),
    { steps: 25 }
  );

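  // page.waitForTimeout() was removed in Puppeteer v22, so use a plain timer
  await new Promise((resolve) => setTimeout(resolve, 200 + Math.random() * 300));
  await checkbox.click();

  // Wait for Turnstile to process
  await new Promise((resolve) => setTimeout(resolve, 5000));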

  // Check if challenge was solved
  const cookies = await page.cookies();
  return cookies.some(c => c.name === 'cf_clearance');
}

This code shows the approach, but it has a low success rate in practice. Turnstile's behavioral analysis is sophisticated enough to distinguish scripted clicks from human interaction. The proof-of-work component also gets harder when Cloudflare suspects automation.

The Detection vs Evasion Arms Race

Here is the reality of bypassing Cloudflare with Puppeteer in 2026:

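[Table: detection layers compared by approach. Columns: Detection Layer, Puppeteer+Stealth, Headed+Profile, Scraping API. Rows: navigator.webdriver, Plugin/MIME spoofing, JS Challenge (basic), TLS Fingerprinting, HTTP/2 Fingerprinting, Managed Challenges, Turnstile, Canvas/WebGL. The per-cell pass/fail markers are not recoverable.]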

The fundamental problem is not code quality. It is architectural. Puppeteer drives the browser over the Chrome DevTools Protocol (CDP), and detection scripts can pick up side effects of an attached CDP session. The stealth plugin patches visible JavaScript APIs, but it cannot change the TLS stack, HTTP/2 behavior, or the underlying connection characteristics that Cloudflare checks at the network level.

Every time the community finds a new bypass, Cloudflare patches it. The average lifespan of a new stealth technique in 2026 is about 48 hours before detection signatures are updated.

Combining Everything: A Production-Grade Attempt

Here is a complete script that combines all the techniques above into a single, reasonably robust scraper:

JAVASCRIPT

const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
const fs = require('fs');
const path = require('path');

puppeteer.use(StealthPlugin());

class CloudflareScraper {
  constructor(options = {}) {
    this.profileDir = options.profileDir || path.join(__dirname, 'chrome-profile');
    this.cookieFile = options.cookieFile || path.join(__dirname, 'cookies.json');
    this.maxRetries = options.maxRetries || 3;
    this.browser = null;
  }

  async init() {
    this.browser = await puppeteer.launch({
      headless: false,
      userDataDir: this.profileDir,
      args: [
        '--no-sandbox',
        '--disable-setuid-sandbox',
        '--disable-dev-shm-usage',
        '--window-size=1920,1080',
        '--disable-blink-features=AutomationControlled',
        '--lang=en-US,en',
      ],
      ignoreDefaultArgs: ['--enable-automation'],
    });
  }

  async createPage() {
    const page = await this.browser.newPage();
    await page.setViewport({ width: 1920, height: 1080 });

    await page.evaluateOnNewDocument(() => {
      Object.defineProperty(navigator, 'webdriver', { get: () => undefined });

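      // Bind to the Permissions object; calling the unbound method throws "Illegal invocation"
      const originalQuery = window.navigator.permissions.query.bind(window.navigator.permissions);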
      window.navigator.permissions.query = (parameters) =>
        parameters.name === 'notifications'
          ? Promise.resolve({ state: Notification.permission })
          : originalQuery(parameters);

      window.chrome = { runtime: {}, loadTimes: () => ({}) };
    });

    await this.loadCookies(page);
    return page;
  }

  async scrape(url) {
    let lastError;

    for (let attempt = 0; attempt < this.maxRetries; attempt++) {
      try {
        const page = await this.createPage();

        await page.goto(url, {
          waitUntil: 'domcontentloaded',
          timeout: 60000,
        });

        const resolved = await this.waitForChallenge(page);
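        if (!resolved) {
          // Record a reason so the final throw is never undefined
          lastError = new Error(`Challenge not resolved for ${url}`);
          await page.close();
          continue;
        }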

        await this.saveCookies(page);

        const html = await page.content();
        await page.close();
        return html;

      } catch (err) {
        lastError = err;
        console.error(`Attempt ${attempt + 1} failed: ${err.message}`);
        await new Promise(r => setTimeout(r, 2000 * (attempt + 1)));
      }
    }

    throw lastError;
  }

  async waitForChallenge(page) {
    const maxWait = 30000;
    const start = Date.now();

    while (Date.now() - start < maxWait) {
      const title = await page.title();

      if (
        !title.includes('Just a moment') &&
        !title.includes('Checking your browser') &&
        !title.includes('Attention Required')
      ) {
        return true;
      }

      const hasTurnstile = await page.$('iframe[src*="challenges.cloudflare.com"]');
      if (hasTurnstile) {
        console.warn('Turnstile detected. Automated bypass unlikely.');
        return false;
      }

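      // page.waitForTimeout() was removed in Puppeteer v22, so use a plain timer
      await new Promise((resolve) => setTimeout(resolve, 1000));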
    }

    return false;
  }

  async saveCookies(page) {
    const cookies = await page.cookies();
    fs.writeFileSync(this.cookieFile, JSON.stringify(cookies, null, 2));
  }

  async loadCookies(page) {
    if (!fs.existsSync(this.cookieFile)) return;
    const cookies = JSON.parse(fs.readFileSync(this.cookieFile, 'utf8'));
    const now = Date.now() / 1000;
    // Puppeteer marks session cookies with expires: -1, so keep those as well
    const valid = cookies.filter(c => !c.expires || c.expires < 0 || c.expires > now);
    if (valid.length > 0) await page.setCookie(...valid);
  }

  async close() {
    if (this.browser) await this.browser.close();
  }
}

// Usage
(async () => {
  const scraper = new CloudflareScraper();
  await scraper.init();

  try {
    const html = await scraper.scrape('https://target-site.com');
    console.log(`Got ${html.length} bytes`);
  } finally {
    await scraper.close();
  }
})();

This is about as far as you can get with Puppeteer alone. It works against sites using only the JavaScript challenge. Against sites with Turnstile or aggressive managed challenges, the success rate drops below 20%.

When to Use Puppeteer and When to Use an API

[Chart: Bypass Success Rate by Protection Level]

Puppeteer is the right tool when you are scraping sites with no bot protection or basic JavaScript challenges. It gives you full control over the browser, and the stealth plugin handles the easy stuff. For hobby projects, small-scale data collection, or targets that do not use Cloudflare, Puppeteer works great.

But Cloudflare's detection has outpaced what the open-source stealth community can keep up with. The core problem is that Puppeteer cannot modify its TLS fingerprint, HTTP/2 behavior, or low-level network characteristics. These are the signals Cloudflare relies on most heavily in 2026.

If you are scraping Cloudflare-protected sites at production scale, you will spend more time maintaining evasion patches than building your actual product.

The API Alternative

Scraping APIs solve the Cloudflare problem at the infrastructure level. Instead of patching a headless browser, they use custom-built HTTP stacks with real browser TLS fingerprints, residential proxy networks, and challenge-solving pipelines.

AlterLab handles Cloudflare bypass (including Turnstile) at the API level. You send a URL, get back clean HTML. No browser fingerprinting, no stealth plugins, no cookie management. The API routes through real browser fingerprints and residential IPs, maintaining a 98%+ success rate against Cloudflare.

JAVASCRIPT

// Using AlterLab API instead of Puppeteer
const response = await fetch('https://api.alterlab.io/v1/scrape', {
  method: 'POST',
  headers: {
    'X-API-Key': 'your_api_key',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    url: 'https://cloudflare-protected-site.com',
    formats: ['html', 'markdown'],
  }),
});

const data = await response.json();
console.log(data.content);

The trade-off is cost vs control. DIY gives you full control but demands ongoing maintenance. An API costs per request but eliminates the infrastructure burden entirely.
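For use inside a larger script, the request can be wrapped in a function that also checks the HTTP status. This is a hedged sketch: the endpoint, headers, request body, and the `content` response field come from the example above, while the `scrapeViaApi` name, the injectable `fetchImpl` parameter, and the error handling are assumptions. It relies on the global `fetch` available in Node.js 18 and later.

```javascript
// fetchImpl is injectable so the function can be exercised without network access.
async function scrapeViaApi(targetUrl, apiKey, fetchImpl = fetch) {
  const response = await fetchImpl('https://api.alterlab.io/v1/scrape', {
    method: 'POST',
    headers: {
      'X-API-Key': apiKey,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      url: targetUrl,
      formats: ['html', 'markdown'],
    }),
  });

  // Surface HTTP-level failures instead of trying to parse an error body
  if (!response.ok) {
    throw new Error(`Scrape request failed with HTTP ${response.status}`);
  }

  const data = await response.json();
  return data.content;
}

// Usage (not executed here):
// scrapeViaApi('https://cloudflare-protected-site.com', 'your_api_key').then(console.log);
```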

Summary

Bypassing Cloudflare with Puppeteer is increasingly an uphill battle. Here is what works, what sometimes works, and what does not:

Works: Stealth plugin for basic JS challenges, cookie persistence, human-like interaction patterns, headed mode with persistent profiles.

Sometimes works: Request interception with correct headers, combining multiple stealth techniques, using --disable-blink-features=AutomationControlled.

Does not work: Any approach that ignores TLS fingerprinting, headless mode against managed challenges, automated Turnstile solving with Puppeteer alone.

For production scraping against Cloudflare, the honest answer is that Puppeteer alone is not enough. You either need to invest serious engineering time in building custom browser infrastructure with modified TLS stacks, or use a scraping API that has already solved these problems at the infrastructure level.

Pick the approach that matches your scale, budget, and how much time you want to spend fighting bot detection instead of building your product.
