Hi everyone, we recently had an internal discussion about how to handle automated traffic, especially the balance between “good bots” and “bad bots.” Anubis came up during that meeting, so I am very glad to see feedback from the community about both this tool and Cloudflare.
I am sharing this insightful article by OAPEN: “Traffic management and bot protection for the OAPEN Library and DOAB: Implementing Cloudflare and Anubis.”
Here is a short summary of the key takeaways:
Automated traffic (bots and AI agents) now accounts for 60-80% of all web requests, surpassing human traffic for the first time. For open science infrastructures like OAPEN and DOAB, this surge threatens system stability, inflates costs, and distorts metrics.
To ensure sustainable and equitable access, OAPEN and DOAB have implemented a layered traffic management strategy rather than resorting to blanket blocking:
- DOAB uses Cloudflare to filter high-volume traffic and mitigate overload, virtually eliminating “503 - service unavailable” errors.
- OAPEN deployed Anubis, an open-source anti-bot tool that uses lightweight proof-of-work challenges to filter automated scraping while keeping access seamless for humans.
The goal is to maintain a delicate balance: protecting infrastructure performance and stability while keeping allowlists up to date so that “good bots” (scholarly indexers, search engines, and research AI tools) can continue to discover and reuse open content.