Ruby gem to make sure that an IP really belongs to a bot, typically a search engine.
Suppose you have a Web request and you would like to check that it is not disguised:

    bot = Legitbot.bot(userAgent, ip)

`bot` will be `nil` if no bot signature was found in the `User-Agent`. Otherwise, it will be an object with methods:

    bot.detected_as # => :google
    bot.valid?      # => true
    bot.fake?       # => false
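
For instance, calling it with a Googlebot-style request might look like the following sketch. The `User-Agent` string and IP address are purely illustrative; the IP is assumed to fall inside Google's published crawler ranges:

    require "legitbot"

    # Illustrative values; the IP is assumed to belong to Google's crawl ranges.
    bot = Legitbot.bot(
      "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
      "66.249.66.1"
    )

    bot.detected_as # => :google
    bot.valid?      # => true, if the IP really belongs to Google
    bot.fake?       # => false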
Sometimes you already know which search engine to expect. For example, you might be using rack-attack:
Rack::Attack.blocklist("fake Googlebot") do |req|
req.user_agent =~ %r(Googlebot) && Legitbot::Google.fake?(req.ip)
end
Or if you do not like all those ghoulish crawlers stealing your content, evaluating it and getting ready to invade your site with spammers, then block them all:

    Rack::Attack.blocklist 'fake search engines' do |request|
      Legitbot.bot(request.user_agent, request.ip)&.fake?
    end
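
The same checks work in the other direction too. Here is a minimal allowlist sketch, assuming the per-engine modules expose a `valid?` class method as the counterpart of the `fake?` method shown above:

    # Let verified Googlebot traffic bypass throttling rules.
    # Assumes Legitbot::Google.valid? exists alongside Legitbot::Google.fake?.
    Rack::Attack.safelist("verified Googlebot") do |req|
      req.user_agent =~ %r(Googlebot) && Legitbot::Google.valid?(req.ip)
    end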
The gem follows semantic versioning, with the following clarifications:
- MINOR version is incremented when support for new bots is added.
- PATCH version is incremented when validation logic for a bot changes (IP list updated, for example).
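
Given this scheme, a pessimistic version constraint in a Gemfile picks up both new bot signatures and IP list updates without pulling in breaking changes (the version number below is only an example):

    # Gemfile; the version shown is illustrative.
    gem "legitbot", "~> 1.0"

Supported bots: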
- Ahrefs
- Alexa
- Amazon AdBot
- Applebot
- Baidu spider
- Bingbot
- DuckDuckGo bot
- Facebook crawler
- Google crawlers
- IAS
- OpenAI GPTBot
- Oracle Data Cloud Crawler
- Petal search engine
- Twitterbot; the list of IPs is on the Troubleshooting page
- Yandex robots
- You.com
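
To act only on particular engines from this list, one option is to branch on `detected_as`. A sketch, assuming the returned symbols follow the engine names (only `:google` is confirmed above); `request` and the handler methods are hypothetical:

    bot = Legitbot.bot(request.user_agent, request.ip)

    case bot&.detected_as
    when nil
      serve_page                               # hypothetical: not a known bot signature
    when :google, :bing                        # :bing is an assumed symbol name
      bot.valid? ? serve_page : block_request  # hypothetical helpers
    else
      throttle_crawler                         # hypothetical helper for other crawlers
    end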
Licensed under the Apache License 2.0.
- Play Framework variant in Scala: play-legitbot
- Article *When (Fake) Googlebots Attack Your Rails App*
- Voight-Kampff is a Ruby gem that detects bots by `User-Agent`
- crawler_detect is a Ruby gem and Rack middleware that detects crawlers by a few different request headers, including `User-Agent`
- Project Honeypot's http:BL can not only classify an IP as a search engine, but also label it as suspicious and report the number of days since its last activity. My implementation of the protocol in Scala is here.
- CIDRAM is a PHP routing manager with built-in support to validate bots.