The free Common Crawl WebGraph API (March 2026) from CustomDatasets.com is now live.

I’ve also pushed two new GitHub repos to get you started quickly.

The stats:

  • Total domains : 442,345,490
  • Total hosts : 5,789,009,175

The live database sizes:

  • Domains DB (raw) : 260.29 GB
  • Hosts DB (raw) : 900.03 GB
  • Total (raw) : 1,160.32 GB

If you buy the database, you’ll get these compressed DB files:

  • Domains DB (zst) : 75.07 GB
  • Hosts DB (zst) : 302.45 GB
  • Total (zst) : 377.52 GB
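
For anyone sizing disks before buying, here is a minimal sketch of the arithmetic behind the two size lists above. It only divides the raw figures by the compressed figures quoted in this post; the names SIZES_GB, RATIOS and compression_ratio are illustrative and not part of any CustomDatasets tooling.

```python
# Sizes in GB, copied from the raw and zst figures listed above.
SIZES_GB = {
    "domains": {"raw": 260.29, "zst": 75.07},
    "hosts": {"raw": 900.03, "zst": 302.45},
    "all": {"raw": 1160.32, "zst": 377.52},
}


def compression_ratio(raw_gb: float, zst_gb: float) -> float:
    """Return how many times larger the raw DB is than its compressed file."""
    return raw_gb / zst_gb


# Hypothetical helper table: one ratio per database.
RATIOS = {
    name: compression_ratio(size["raw"], size["zst"])
    for name, size in SIZES_GB.items()
}

if __name__ == "__main__":
    for name, ratio in RATIOS.items():
        raw = SIZES_GB[name]["raw"]
        zst = SIZES_GB[name]["zst"]
        print(f"{name}: {zst:.2f} GB download, {raw:.2f} GB unpacked ({ratio:.2f}x)")
```

By this arithmetic the domains DB shrinks roughly 3.5x and the hosts DB roughly 3x. If you keep the compressed files next to the unpacked databases (an assumption about your setup), budget for the sum of both totals.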

Remember that this is a totally free and open API, with no registration required.

Search it for free now in your browser.


Updates:

    • March 2026 link graph data added (updated this week).
    • Hostname history expanded: now covers January 2020 → present.
    • Domain-level history continues to cover 2017 → present.
    • New GitHub repository with example API clients (Go, PHP, Python, C, Bash/curl); no registration or API keys needed.
    • New database server + client implementations (Go, PHP, Python) for customers working with the full database.

If you need lots of historical PageRank data, the database is also available for purchase.


Build your own free API client. Examples in Go, PHP, Python, C, Bash/curl:

https://github.com/CustomDatasets/webgraph-api-clients

Build your own server + client for the (paid) database. Examples in Go, PHP, Python:

https://github.com/CustomDatasets/webgraph-database-server

Buy the full database with monthly updates:

https://customdatasets.com/webgraph/db/