SOLID

crawl4ai — Does "LLM-Ready Markdown From Any Webpage" Actually Work?

Claim tested

crawl4ai converts web pages to LLM-ready markdown. The original review gave a PARTIAL verdict because fit_markdown returned empty output. A re-test confirmed that the cause was a deprecated API path, not a crawl4ai bug: with the correct result.markdown.fit_markdown path in v0.8.6, fit_markdown works correctly. Verdict updated from PARTIAL to SOLID.

Criteria Scorecard

| Criterion | Score |
|---|---|
| install_works | true |
| claim_testable | true |
| readme_accurate | true |
| creator_notified | false |
| errors_documented | true |
| claim_tested_clean_env | true |
| verdict_matches_evidence | true |


Environment

os: macOS
fix: result.markdown.fit_markdown
python: 3.13
crawl4ai: 0.8.6
test_url: https://example.com
original_issue: result.fit_markdown deprecated and removed in v0.8.6
api_key_required: false
original_verdict: PARTIAL

Full Review

What This Repo Claims



Fast, LLM-friendly web crawler that converts any page to clean markdown.
Supports fit_markdown — a filtered version of the page with noise removed —
ideal for RAG pipelines.

50k+ stars. Actively maintained.

Original Test (PARTIAL verdict)



Tested on crawl4ai v0.8.6. Used result.fit_markdown to access filtered
markdown output. Returned 0 characters on all pages tested.

Root cause (identified on re-test): result.fit_markdown was deprecated
in v0.5 and fully removed in v0.8.6. The correct path is
result.markdown.fit_markdown. The deprecation is documented in the official
docs but not prominently surfaced in the README or quickstart examples at the
time of original testing.

This was not a bug in crawl4ai. It was an API migration.

Re-Test



Environment:
  • macOS, Python 3.13, crawl4ai 0.8.6

  • Clean repoverifiertest user

  • No prior crawl4ai state
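
Because the result depends on the installed version, it can help to confirm the environment before re-running the test. The following is a minimal sketch using the standard library's importlib.metadata; the helper name installed_version is hypothetical and not part of crawl4ai.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # This review was run against crawl4ai 0.8.6; other versions may behave differently.
    print("crawl4ai:", installed_version("crawl4ai"))
```

A result of None means the package is not installed in the active environment.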


Test code:
import asyncio
from crawl4ai import AsyncWebCrawler, CrawlerRunConfig
from crawl4ai.content_filter_strategy import PruningContentFilter
from crawl4ai.markdown_generation_strategy import DefaultMarkdownGenerator

async def test():
    config = CrawlerRunConfig(
        markdown_generator=DefaultMarkdownGenerator(
            content_filter=PruningContentFilter()
        )
    )
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun('https://example.com', config=config)
        print('raw_markdown length:', len(result.markdown.raw_markdown or ''))
        print('fit_markdown length:', len(result.markdown.fit_markdown or ''))
        print('fit_markdown preview:', (result.markdown.fit_markdown or '')[:300])

asyncio.run(test())


Output:
raw_markdown length: 166
fit_markdown length: 166
fit_markdown preview: # Example Domain
This domain is for use in documentation examples without needing
permission. Avoid use in operations.
Learn more
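
Both lengths in this run are 166, which means the filter removed nothing from this particular page. One way to make that visible across test URLs is to compute the fraction of characters pruned. This is a minimal sketch; reduction_ratio is a hypothetical helper, not a crawl4ai function.

```python
def reduction_ratio(raw_markdown: str, fit_markdown: str) -> float:
    """Fraction of characters removed by filtering; 0.0 means nothing was pruned."""
    if not raw_markdown:
        return 0.0
    return 1.0 - len(fit_markdown) / len(raw_markdown)


# Lengths from the run above: both outputs are 166 characters.
print(reduction_ratio("x" * 166, "x" * 166))  # prints 0.0
```

A ratio of 0.0 on a minimal page like example.com shows that the correct path returns content, but it does not by itself show how much noise the filter strips on a cluttered page.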

What Works



  • Installs cleanly with no errors

  • Crawler starts and fetches pages

  • result.markdown.raw_markdown returns full page markdown

  • result.markdown.fit_markdown returns filtered markdown (on example.com it matches raw_markdown at 166 characters, presumably because the page has little to prune)

  • PruningContentFilter integrates correctly with DefaultMarkdownGenerator

  • No API key required


API Migration Note



If you are upgrading from crawl4ai < v0.5, update your code:

# OLD — removed in v0.8.6
result.fit_markdown

# NEW — correct path
result.markdown.fit_markdown
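
For code that has to run against more than one crawl4ai version, a small accessor can try the current path first and fall back to the legacy one. The sketch below is illustrative only: get_fit_markdown is a hypothetical helper, and the SimpleNamespace objects are stand-ins rather than real crawl result objects. It assumes that older results expose fit_markdown directly on the result.

```python
from types import SimpleNamespace


def get_fit_markdown(result) -> str:
    """Read filtered markdown from the current path, falling back to the legacy one."""
    # Current path: result.markdown.fit_markdown
    markdown = getattr(result, "markdown", None)
    fit = getattr(markdown, "fit_markdown", None)
    if isinstance(fit, str) and fit:
        return fit
    # Legacy path: result.fit_markdown (reported in this review as removed in v0.8.6).
    # getattr with a default also covers attributes that raise AttributeError.
    legacy = getattr(result, "fit_markdown", None)
    return legacy if isinstance(legacy, str) else ""


# Stand-in objects for illustration only (assumed shapes, not real crawl4ai results).
new_style = SimpleNamespace(markdown=SimpleNamespace(fit_markdown="# Example Domain"))
old_style = SimpleNamespace(markdown="# raw page", fit_markdown="# Example Domain")

print(get_fit_markdown(new_style))
print(get_fit_markdown(old_style))
```

Returning an empty string when neither path yields content keeps callers simple, but it also hides the difference between "nothing was filtered in" and "wrong attribute", so logging the fallback may be worthwhile in a real pipeline.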


Verdict: SOLID



The core claim holds. crawl4ai converts web pages to clean, LLM-ready
markdown. fit_markdown works correctly using the current API. The original
PARTIAL verdict was a testing error, not a crawl4ai defect.

*This review follows RepoVerifier Standard v1.0.
[Read the standard →](https://repoverifier.dev/about)*