What This Repo Claims
Fast, LLM-friendly web crawler that converts any page to clean markdown.
Supports fit_markdown — a filtered version of the page with noise removed —
ideal for RAG pipelines.
50k+ stars. Actively maintained.
Original Test (PARTIAL verdict)
Tested on crawl4ai v0.8.6, using result.fit_markdown to access the filtered
markdown output. It returned 0 characters on all pages tested.
Root cause (identified on re-test): result.fit_markdown was deprecated in v0.5
and fully removed in v0.8.6. The correct path is
result.markdown.fit_markdown. The deprecation is documented in the official
docs but not prominently surfaced in the README or quickstart examples at the
time of original testing.
This was not a bug in crawl4ai. It was an API migration.
Re-Test
Environment:
- macOS, Python 3.13, crawl4ai 0.8.6
- Clean repoverifiertest user
- No prior crawl4ai state
Test code:
import asyncio

from crawl4ai import AsyncWebCrawler, CrawlerRunConfig
from crawl4ai.content_filter_strategy import PruningContentFilter
from crawl4ai.markdown_generation_strategy import DefaultMarkdownGenerator


async def test():
    config = CrawlerRunConfig(
        markdown_generator=DefaultMarkdownGenerator(
            content_filter=PruningContentFilter()
        )
    )
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun('https://example.com', config=config)
        print('raw_markdown length:', len(result.markdown.raw_markdown or ''))
        print('fit_markdown length:', len(result.markdown.fit_markdown or ''))
        print('fit_markdown preview:', (result.markdown.fit_markdown or '')[:300])


asyncio.run(test())
Output:
raw_markdown length: 166
fit_markdown length: 166
fit_markdown preview: # Example Domain
This domain is for use in documentation examples without needing
permission. Avoid use in operations.
Learn more
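Both lengths are 166, so on this page the filter returned content but removed nothing. That may simply reflect how little boilerplate example.com contains. A minimal sketch for labeling such results from the two printed lengths follows; the function name classify_fit and its labels are hypothetical and not part of crawl4ai.

```python
def classify_fit(raw_len: int, fit_len: int) -> str:
    """Label a filtered-markdown result by comparing it to the raw length."""
    if fit_len == 0:
        return "empty"        # nothing came back from the filtered path
    if fit_len >= raw_len:
        return "not_reduced"  # filter returned content but removed nothing
    return "reduced"          # filter removed some of the raw markdown


print(classify_fit(166, 166))  # lengths from the re-test output
print(classify_fit(166, 0))    # the 0-character result from the original test
```

With the re-test numbers this prints not_reduced, and with the original 0-character result it prints empty. Running the same check against a noisier page would show whether the filter actually trims content.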
What Works
- Installs cleanly with no errors
- Crawler starts and fetches pages
- result.markdown.raw_markdown returns full page markdown
- result.markdown.fit_markdown returns populated filtered markdown (on example.com it matched the raw output at 166 characters)
- PruningContentFilter integrates correctly with DefaultMarkdownGenerator
- No API key required
API Migration Note
If you are upgrading from crawl4ai < v0.5, or your code still uses the
deprecated attribute, update the access path:
# OLD — removed in v0.8.6
result.fit_markdown
# NEW — correct path
result.markdown.fit_markdown
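For code that has to run against both older and newer releases, the lookup can be wrapped in a small helper. This is a sketch under stated assumptions: get_fit_markdown is a hypothetical name, the SimpleNamespace objects are stand-ins rather than real crawl4ai result types, and the fallback assumes the legacy attribute either holds a string or is absent.

```python
from types import SimpleNamespace


def get_fit_markdown(result) -> str:
    """Return filtered markdown from the new path, else the legacy one, else ''."""
    # New path: result.markdown.fit_markdown
    markdown = getattr(result, "markdown", None)
    fit = getattr(markdown, "fit_markdown", None)
    if isinstance(fit, str) and fit:
        return fit
    # Legacy path: result.fit_markdown (assumed absent on newer releases)
    legacy = getattr(result, "fit_markdown", None)
    return legacy if isinstance(legacy, str) else ""


# Stand-in objects mimicking the two result shapes; not real crawl4ai objects.
new_style = SimpleNamespace(markdown=SimpleNamespace(fit_markdown="# Example Domain"))
old_style = SimpleNamespace(markdown="# raw page", fit_markdown="# Example Domain")

print(get_fit_markdown(new_style))
print(get_fit_markdown(old_style))
```

Both calls print the same heading line. Returning an empty string when neither path yields text keeps callers simple, but it also hides the kind of silent failure the original test ran into, so a length check on the result is still worth keeping.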
Verdict: SOLID
The core claim holds. crawl4ai converts web pages to clean, LLM-ready
markdown. fit_markdown works correctly using the current API. The original
PARTIAL verdict was a testing error, not a crawl4ai defect.
*This review follows RepoVerifier Standard v1.0.
[Read the standard →](https://repoverifier.dev/about)*