Scrape, crawl, and discover the web.
YakSpider is a configurable scraping and discovery platform with a tiered bypass ladder, wizard-driven runs, scheduling and change monitoring, pluggable outputs, and a REST API. It handles everything from a single page to a web-wide index.
Tiered fetching
Escalate from a plain HTTP request to stealth headers, proxies, and a headless browser, but only as far as a site actually requires.
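The escalation idea can be sketched in a few lines. This is an illustration only, not YakSpider's implementation: `FetchResult`, `looks_blocked`, `fetch_with_ladder`, and the block heuristic are all hypothetical, and each tier is passed in as a plain callable so the sketch runs without network access.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class FetchResult:
    status: int
    body: str


# A tier is a label plus a callable that fetches a URL (hypothetical shape).
Tier = Tuple[str, Callable[[str], FetchResult]]


def looks_blocked(result: FetchResult) -> bool:
    """Assumed heuristic: 403/429/503 or an empty body means the tier was blocked."""
    return result.status in (403, 429, 503) or not result.body.strip()


def fetch_with_ladder(url: str, tiers: Sequence[Tier]) -> Tuple[str, FetchResult]:
    """Try tiers cheapest-first and stop at the first one that is not blocked."""
    last: Optional[Tuple[str, FetchResult]] = None
    for name, fetch in tiers:
        result = fetch(url)
        last = (name, result)
        if not looks_blocked(result):
            return last
    if last is None:
        raise ValueError("no tiers configured")
    # Every tier was blocked; hand back the final attempt for inspection.
    return last
```

Ordering the tiers from cheapest to most expensive means a headless browser is only started when the lighter tiers have already failed.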
Crawl or discover
Crawl known sites within scope, or discover brand-new domains across the web to build an index of what you're after.
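As a rough illustration of what "within scope" can mean for a crawl, the sketch below accepts a URL only if its host matches an allowed domain (or a subdomain of one) and its path starts with a given prefix. The function name `in_scope` and its parameters are assumptions for this example, not YakSpider settings.

```python
from urllib.parse import urlparse


def in_scope(url: str, allowed_hosts: set, path_prefix: str = "/") -> bool:
    """Return True if url is http(s), on an allowed host, and under path_prefix."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    # Accept the exact host or any subdomain of it, but not lookalike domains.
    host_ok = any(host == h or host.endswith("." + h) for h in allowed_hosts)
    path = parts.path or "/"
    return host_ok and path.startswith(path_prefix)
```

Matching on `"." + h` rather than a bare suffix keeps `notexample.com` out of scope when only `example.com` is allowed.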
Extraction rules
Pull structured records with CSS, XPath, or JSON selectors. No code required: configure the rules in the wizard.
Schedules & monitoring
Run a job once or on a cron schedule, and get notified as soon as a re-scrape detects that a page has changed.
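One common way to detect a change between two scrapes is to compare content fingerprints. The sketch below assumes a SHA-256 hash over whitespace-normalized text; `fingerprint` and `has_changed` are hypothetical names, and how YakSpider itself compares pages is not stated here.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace, so reflowed markup is not a change."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint: str, current_text: str) -> bool:
    """Compare the stored fingerprint with the freshly scraped text."""
    return fingerprint(current_text) != previous_fingerprint
```

Storing only the fingerprint between runs keeps the comparison cheap; a diff of the full text would be needed to report what changed.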
Pluggable outputs
Store results, POST a signed webhook, or push to object storage. Wire up your own destinations.
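A signed webhook lets the receiver confirm that a payload came from the sender and was not altered. The sketch below assumes an HMAC-SHA256 signature over the raw request body, sent as a hex string in a header; the scheme, the function names, and any header name are assumptions for illustration, since the source does not specify how YakSpider signs its webhooks.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Sender side: compute a hex HMAC-SHA256 signature over the raw body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Receiver side: recompute the signature and compare in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, received_signature)
```

The receiver should verify against the exact bytes it received, before parsing the JSON, because re-serializing the payload can change the byte sequence and invalidate the signature.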
API + recipes
Drive everything from a REST API, or start from a ready-made recipe: OpenAPI specs, safety data sheets, product pricing.