Integrating Till with Puppeteer

Till can be integrated with an existing Puppeteer scraper with only minor code changes.

Please follow the steps below.

Step 1: Install Till

Follow the Till installation instructions.

Step 2: Modify your Puppeteer project

Next, modify your existing Puppeteer project so that the browser sends its traffic through Till.

The following example script launches the browser with Till as its proxy and adds a header that forces a cache miss:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    // Ignore certificate errors for HTTPS traffic that passes through the proxy
    ignoreHTTPSErrors: true,
    acceptInsecureCerts: true,
    args: [
      '--proxy-server=http://localhost:2933', // Connect to Till
      '--ignore-certificate-errors',
      '--ignore-certificate-errors-spki-list',
    ],
  });

  const page = await browser.newPage();

  await page.setExtraHTTPHeaders({
    // Add the header to force a Cache Miss on Till
    'X-DH-Cache-Freshness': 'now',
  });

  await page.goto('https://fetchtest.datahen.com/echo/request');

  // Print the page content returned by the request
  const txt = await page.content();
  console.log(txt);

  await browser.close();
})();

Note: To see a working example, you can visit this link.
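
If the proxy address or the cache behavior needs to vary between environments, the launch options and headers could be built by small helper functions. The following is only a minimal sketch and is not part of Till. The names `buildLaunchOptions` and `buildTillHeaders` and the `TILL_PROXY` environment variable are hypothetical. Only the proxy address `http://localhost:2933` and the `X-DH-Cache-Freshness: now` header are taken from the script above.

```javascript
// Default Till proxy address, as used in the example script.
const DEFAULT_TILL_PROXY = 'http://localhost:2933';

// Hypothetical helper: builds Puppeteer launch options that route traffic
// through Till. TILL_PROXY is an assumed environment variable name.
function buildLaunchOptions(proxyUrl = process.env.TILL_PROXY || DEFAULT_TILL_PROXY) {
  return {
    headless: true,
    ignoreHTTPSErrors: true,
    acceptInsecureCerts: true,
    args: [
      `--proxy-server=${proxyUrl}`,
      '--ignore-certificate-errors',
    ],
  };
}

// Hypothetical helper: returns the extra headers to send with each request.
// When forceCacheMiss is true, the header from the example script is added.
function buildTillHeaders(forceCacheMiss = false) {
  return forceCacheMiss ? { 'X-DH-Cache-Freshness': 'now' } : {};
}
```

The results could then be passed to `puppeteer.launch(buildLaunchOptions())` and `page.setExtraHTTPHeaders(buildTillHeaders(true))`.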

Step 3: Run your script

Run your Puppeteer project as you normally would.

Note: If you don't have an existing Puppeteer project to try with Till, you can try our working example here.

Step 4: Verify that it works

Visit the Till UI at http://localhost:2980/requests to confirm that your new requests appear in the request log.
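
Before opening the browser, it can help to check from a script that the Till UI is reachable. The following is a minimal sketch using Node's built-in `http` module. The helper name `fetchStatus` is hypothetical, and the sketch only checks that something answers at the URL. It does not inspect the request log itself.

```javascript
const http = require('http');

// Hypothetical helper: resolves with the HTTP status code returned by the
// given URL, or rejects if the connection fails (for example, when nothing
// is listening on that port).
function fetchStatus(url) {
  return new Promise((resolve, reject) => {
    http
      .get(url, (res) => {
        res.resume(); // Discard the body; only the status code is needed
        resolve(res.statusCode);
      })
      .on('error', reject);
  });
}
```

Calling `fetchStatus('http://localhost:2980/requests')` resolves with the status code of the response. A rejection most likely means that Till is not running or that its UI is listening on a different port.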

[Figure: Request Log UI]