OpenSource-Hub

crawl4ai

Library

unclecode/crawl4ai

Open-source, LLM-friendly web crawler for clean Markdown output.

Overview

Crawl4AI is an open-source web crawler designed to produce clean, structured Markdown for LLM consumption. It features asynchronous crawling, caching, anti-bot detection, and adaptive intelligence, deployable anywhere without API keys.
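To make the idea of asynchronous crawling with caching and Markdown output concrete, here is a minimal standard-library sketch. It is not Crawl4AI's API or implementation: `MarkdownExtractor`, `html_to_markdown`, `fetch`, `crawl`, and the in-memory `FAKE_SITE` pages are all hypothetical names invented for illustration, and the network is simulated so the example runs offline.

```python
import asyncio
import re
from html.parser import HTMLParser


class MarkdownExtractor(HTMLParser):
    """Tiny HTML-to-Markdown converter: headings, paragraphs, and links only."""

    SKIP = {"script", "style", "nav"}  # tags whose text content is dropped

    def __init__(self):
        super().__init__()
        self.parts = []      # output fragments, joined at the end
        self.skip_depth = 0  # > 0 while inside a skipped tag
        self.href = None     # link target while inside <a>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        if tag in ("h1", "h2", "h3"):
            self.parts.append("\n\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.parts.append("\n\n")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag == "a" and not self.skip_depth:
            self.parts.append(f"]({self.href or ''})")
            self.href = None

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.parts.append(re.sub(r"\s+", " ", data))  # collapse whitespace


def html_to_markdown(html):
    parser = MarkdownExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


# Hypothetical in-memory "site" standing in for real HTTP responses.
FAKE_SITE = {
    "https://example.test/a": (
        "<html><head><style>p{color:red}</style></head><body>"
        "<h1>Title A</h1><p>Hello <a href='/b'>world</a>.</p>"
        "<script>track()</script></body></html>"
    ),
    "https://example.test/b": "<h2>Page B</h2><p>Second page.</p>",
}


async def fetch(url):
    await asyncio.sleep(0.01)  # simulated network latency
    return FAKE_SITE[url]


async def crawl(urls, cache):
    """Convert all URLs concurrently, reusing cached Markdown when present."""

    async def one(url):
        if url not in cache:
            cache[url] = html_to_markdown(await fetch(url))
        return cache[url]

    return await asyncio.gather(*(one(u) for u in urls))


if __name__ == "__main__":
    pages = asyncio.run(crawl(list(FAKE_SITE), cache={}))
    print("\n\n---\n\n".join(pages))
```

The cache here is a plain dictionary keyed by URL, so two concurrent requests for the same uncached URL would both fetch it; a production crawler would typically deduplicate in-flight requests and persist the cache.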

README Preview

# 🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper.

---
#### 🚀 Crawl4AI Cloud API: Closed Beta (Launching Soon)
Reliable, large-scale web extraction, now built to be _**drastically more cost-effective**_ than any of the existing solutions.

👉 **Apply [here](https://forms.gle/E9MyPaNXACnAMaqG7) for early access**  
_We’ll be onboarding in phases and working closely with early users.
Limited slots._

---

Crawl4AI turns the web into clean, LLM ready Markdown for RAG, agents, and data pipelines. Fast, controllable, battle tested by a 50k+ star community.

[✨ Check out latest update v0.8.6](#-recent-updates)

✨ **New in v0.8.6**: Security hotfix: replaced `litellm` with `unclecode-litellm` due to a PyPI supply chain compromise. If you're on v0.8.5, please upgrade immediately.

✨ Recent v0.8.5: Anti-Bot Detection, Shadow DOM & 60+ Bug Fixes! Automatic 3-tier anti-bot detection with proxy escalation, Shadow DOM flattening, deep crawl cancellation, config defaults API, consent popup removal, and critical security patches. [Release notes →](https://github.com/unclecode/crawl4ai/blob/main/docs/blog/release-v0.8.5.md)

✨ Previous v0.8.0: Crash Recovery & Prefetch Mode! Deep crawl crash recovery with `resume_state` and `on_state_change` callbacks for long-running crawls. New `prefetch=True` mode for 5-10x faster URL discovery. [Release notes →](https://github.com/unclecode/crawl4ai/blob/main/docs/blog/release-v0.8.0.md)

✨ Previous v0.7.8: Stability & Bug Fix Release! 11 bug fixes addressing Docker API issues, LLM extraction improvements, URL handling fixes, and dependency updates. [Rele