Scrape, crawl, and discover the web.

YakSpider is a configurable scraping and discovery platform with a tiered bypass ladder, wizard-driven runs, scheduling and change monitoring, pluggable outputs, and a REST API. It handles everything from a single page to a web-wide index.

Tiered fetching

Escalate from a plain HTTP request to stealth headers, then proxies, then a headless browser, going only as far as a site actually requires.
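
The escalation described above can be sketched roughly as a loop over fetch tiers, cheapest first. This is a minimal illustration, not YakSpider's implementation: the names FetchResult, looks_blocked, fetch_with_escalation, and the stub fetchers are all hypothetical, and the set of "blocked" status codes is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FetchResult:
    status: int
    body: str


Fetcher = Callable[[str], FetchResult]

# Assumption: these statuses mean the current tier was blocked or throttled.
BLOCKED_STATUSES = {403, 429, 503}


def looks_blocked(result: FetchResult) -> bool:
    """Treat a blocking status code or an empty body as a failed tier."""
    return result.status in BLOCKED_STATUSES or not result.body.strip()


def fetch_with_escalation(
    url: str, tiers: list[tuple[str, Fetcher]]
) -> Optional[tuple[str, FetchResult]]:
    """Try each tier in order and stop at the first one that is not blocked."""
    for name, fetch in tiers:
        result = fetch(url)
        if not looks_blocked(result):
            return name, result
    return None  # every tier was blocked


# Stand-in fetchers so the sketch runs without network access.
def plain_http(url: str) -> FetchResult:
    return FetchResult(403, "")


def stealth_headers(url: str) -> FetchResult:
    return FetchResult(200, "<html>ok</html>")


def headless_browser(url: str) -> FetchResult:
    raise AssertionError("never reached: an earlier tier succeeded")


TIERS = [
    ("plain_http", plain_http),
    ("stealth_headers", stealth_headers),
    ("headless_browser", headless_browser),
]
```

Because the loop returns as soon as a tier succeeds, the more expensive tiers (here the headless browser stub) are never invoked for sites that do not need them.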

Crawl or discover

Crawl known sites within scope, or discover brand-new domains across the web to build an index of what you're after.
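
One plausible way to keep a crawl "within scope" is a hostname check against an allow-list of domains. The sketch below assumes that definition of scope; the in_scope function and the allowed_domains parameter are illustrative names, not part of YakSpider.

```python
from urllib.parse import urlparse


def in_scope(url: str, allowed_domains: set[str]) -> bool:
    """Return True if the URL's host is an allowed domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(
        host == domain or host.endswith("." + domain)
        for domain in allowed_domains
    )
```

Matching on "." + domain rather than a bare suffix avoids treating notexample.com as part of example.com.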

Extraction rules

Pull structured records with CSS, XPath, or JSON selectors. No code required: configure the rules in the wizard.
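
To make the idea of a selector-based rule concrete, here is a small sketch of JSON selection using dotted paths. The rule format (field name mapped to a dotted path) and the select and extract functions are assumptions for illustration; the wizard's actual rule syntax may differ.

```python
import json
from typing import Any


def select(data: Any, path: str) -> Any:
    """Follow a dotted path; numeric segments index into lists. None if missing."""
    current = data
    for segment in path.split("."):
        if isinstance(current, list) and segment.isdigit():
            index = int(segment)
            if index >= len(current):
                return None
            current = current[index]
        elif isinstance(current, dict) and segment in current:
            current = current[segment]
        else:
            return None
    return current


def extract(document: str, rules: dict[str, str]) -> dict[str, Any]:
    """Apply each rule (field name -> dotted path) to a JSON document."""
    data = json.loads(document)
    return {field: select(data, path) for field, path in rules.items()}


page = '{"product": {"name": "Widget", "offers": [{"price": 9.5}]}}'
rules = {
    "name": "product.name",
    "price": "product.offers.0.price",
    "sku": "product.sku",  # absent from the page, so it resolves to None
}
record = extract(page, rules)
```

Returning None for a missing path, instead of raising, lets one rule set run across pages that do not all share the same fields.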

Schedules & monitoring

Run once or on a cron schedule, and get notified as soon as a scheduled re-scrape detects that a page has changed.
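
A common way to detect that a page changed between scheduled runs is to compare content fingerprints. The sketch below assumes that approach (SHA-256 over whitespace-normalized text); fingerprint and has_changed are hypothetical names, and YakSpider's own change detection may work differently.

```python
import hashlib
from typing import Optional


def fingerprint(content: str) -> str:
    """Hash the content after collapsing whitespace, so reflowed markup is ignored."""
    normalized = " ".join(content.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous: Optional[str], content: str) -> tuple[bool, str]:
    """Compare against the stored fingerprint; return (changed, new_fingerprint).

    A page seen for the first time (previous is None) is not reported as changed.
    """
    current = fingerprint(content)
    return (previous is not None and previous != current), current
```

A scheduler would store the returned fingerprint after each run and send a notification only when the first element of the tuple is True.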

Pluggable outputs

Store results, POST a signed webhook, or push to object storage. Wire up your own destinations.
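
For readers unfamiliar with signed webhooks, the usual pattern is an HMAC over the request body that the receiver recomputes and compares. The sketch below assumes HMAC-SHA256 with a shared secret; the signing scheme, the secret value, and the function names are illustrative assumptions, not YakSpider's documented format.

```python
import hashlib
import hmac
import json


def sign_payload(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, signature)


# Hypothetical shared secret and payload, for illustration only.
secret = b"example-shared-secret"
body = json.dumps({"url": "https://example.com/", "changed": True}).encode("utf-8")
signature = sign_payload(secret, body)
```

The receiver should verify against the raw bytes it received, before parsing the JSON, since re-serializing can change the byte sequence and invalidate the signature.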

API + recipes

Drive everything from a REST API, or start from a ready-made recipe for OpenAPI specs, safety data sheets, or product pricing.