Introduction

Web scraping allows you to extract data from websites automatically. With n8n and the CustomJS Scraper node, you can scrape even complex, dynamic websites that require user interactions such as clicking or typing.

Why Use the Scraper Node in n8n?

  • Extract data from sites that require JavaScript to render.
  • Automate data collection for market research or price monitoring.
  • Capture screenshots of websites after performing actions.

n8n Template

Link to n8n Workflow

Prerequisites

  • A self-hosted n8n instance
  • The CustomJS PDF Toolkit node installed (this toolkit provides the Scraper node)
  • The URL of the target website

Workflow Overview

The workflow consists of the following steps:

  1. Get URL – Provide the URL of the website to scrape.
  2. Scrape Website – Use the Scraper node to perform actions and extract data.
  3. Process Data – Use the extracted HTML or screenshot in the next steps of your workflow.

Step-by-Step Guide

1. Get the Target URL

Use a Start node with a fixed URL, or fetch URLs dynamically from another source, such as a Google Sheet.
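If the URLs arrive from an upstream node, a small Code node can normalize them into the field the Scraper node reads. A minimal sketch, written as a plain function so the logic is easy to follow; the field names `url` and `link` and the fallback value are assumptions, not part of the node's documented contract:

```javascript
// Hypothetical helper: make sure every incoming item carries a `url` field.
// Prefer an explicit `url`, fall back to a `link` column (e.g. from a
// Google Sheet), then to a fixed fallback URL.
function normalizeUrls(items, fallback = 'https://example.com') {
  return items.map((item) => ({
    json: {
      url: item.json.url || item.json.link || fallback,
    },
  }));
}
```

Inside an n8n Code node, the equivalent would be along the lines of `return normalizeUrls($input.all());`.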

2. Configure the Scraper Node

  • Add the Scraper node from the CustomJS Toolkit.
  • Set the Website URL.
  • Define user actions (e.g., click('#button'), type('#search', 'my query'), wait(2000)).
  • Choose the output: Raw HTML or Screenshot (PNG).
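When the action sequence varies per run, you can assemble it in a Code node before the Scraper node. A minimal sketch that builds the call-style actions shown above from data; the exact format the Scraper node accepts is an assumption, so adapt the output to its actions field:

```javascript
// Hypothetical sketch: turn a data-driven list of steps into one
// action per line, matching the click/type/wait call style above.
function buildActions(steps) {
  return steps
    .map(({ action, args }) =>
      `${action}(${args
        .map((a) => (typeof a === 'string' ? `'${a}'` : a))
        .join(', ')})`)
    .join('\n');
}

const script = buildActions([
  { action: 'click', args: ['#cookie-accept'] }, // selector is an example
  { action: 'type', args: ['#search', 'my query'] },
  { action: 'wait', args: [2000] }, // wait 2 seconds for results to render
]);
```

Keeping the steps as data makes it easy to reuse one workflow for several target pages that need slightly different interactions.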

3. Use the Scraped Data

  • If you extracted HTML, you can parse it with the HTML Extract node.
  • If you took a screenshot, you can save it to a file or send it in a notification.
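For simple one-off values, you can also post-process the raw HTML in a Code node instead of the HTML Extract node. A minimal sketch that pulls the page title with a regex; for anything structural, prefer the HTML Extract node and its CSS selectors:

```javascript
// Hypothetical sketch: extract the <title> text from raw HTML.
// A regex is acceptable for a single well-known tag; it is not a
// substitute for a real HTML parser on nested markup.
function extractTitle(html) {
  const match = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
  return match ? match[1].trim() : null;
}
```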

Final Thoughts

Automating web scraping with n8n and the Scraper node lets you collect and process web data at scale, even from complex or interactive sites. This saves time and ensures you always have up-to-date information for your business or research.