CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent. It can currently detect hundreds of bots/spiders/crawlers.
Run composer require jaybizzle/crawler-detect 1.*
or add "jaybizzle/crawler-detect": "1.*" to the require section of your composer.json.
use Jaybizzle\CrawlerDetect\CrawlerDetect;

$CrawlerDetect = new CrawlerDetect;

// Check the user agent of the current 'visitor'
if ($CrawlerDetect->isCrawler()) {
    // true if crawler user agent detected
}

// Pass a user agent as a string
if ($CrawlerDetect->isCrawler('Mozilla/5.0 (compatible; Sosospider/2.0; +http://help.soso.com/webspider.htm)')) {
    // true if crawler user agent detected
}

// Output which regex rules matched (if any)
var_dump($CrawlerDetect->getMatches());
If you find a bot/spider/crawler user agent that CrawlerDetect fails to detect, please submit a pull request with the regex pattern added to the $crawlers array in CrawlerDetect.php and the failing user agent added to tests/crawlers.txt.
Failing that, just create an issue with the user agent you have found, and we'll take it from there :)
If you would like to use this with Laravel 5, please see Laravel-Crawler-Detect
Massive performance gains! Over 77% faster in some cases! Firstly, common strings are removed from the user agent so the regex parser needs fewer steps to find a match, i.e. there is no point matching against terms such as Mozilla, Android, Chrome, etc., as these strings are never going to identify a bot. Secondly, as we have the generic regex pattern [a-z0-9\\-_]*((?<!cu)bot|crawler|archiver|transcoder|spider), there is no point having any other entries in our regex array that contain the terms bot, spider, crawler, etc.
See #42 for some simple benchmarks.
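The two optimizations described above can be illustrated with a minimal, standalone sketch. This is not CrawlerDetect's internal code: the function names stripCommonTokens and isLikelyCrawler are hypothetical, and the exclusion list contains only the three example terms mentioned above. Only the generic regex pattern is taken from this README.

```php
<?php

/**
 * Remove tokens that show up in ordinary browser user agents and can
 * never identify a bot, so the regex has less text to scan.
 * NOTE: hypothetical helper; the list is an illustrative subset only.
 *
 * @param string $userAgent
 * @return string
 */
function stripCommonTokens($userAgent)
{
    $exclusions = array('Mozilla', 'Android', 'Chrome');

    // str_ireplace is case-insensitive and accepts an array of needles.
    return trim(str_ireplace($exclusions, '', $userAgent));
}

/**
 * Apply the generic pattern from the README to the reduced user agent.
 * NOTE: hypothetical helper; the real class also checks many named patterns.
 *
 * @param string $userAgent
 * @return bool
 */
function isLikelyCrawler($userAgent)
{
    // (?<!cu)bot is a negative lookbehind: "bot" matches unless preceded by "cu".
    $generic = '/[a-z0-9\\-_]*((?<!cu)bot|crawler|archiver|transcoder|spider)/i';

    return preg_match($generic, stripCommonTokens($userAgent)) === 1;
}
```

With this sketch, the Sosospider user agent from the usage example is flagged because it contains "spider", while a plain desktop Chrome user agent is not. Any bot whose name already contains one of the generic terms is covered by the single pattern, which is why such entries no longer need their own regex.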
v1.0.20
v1.0.19
- Added 'Traackr.com'
v1.0.18
- Added 'W3C Validators'
- Fixed some regexes
v1.0.17
- Added 'getprismatic.com'
- Added 'LongURL API'
v1.0.16
- Added 'MagpieRSS'
- Added 'ScoutURLMonitor'
v1.0.15
- Added 3 new bots - see #30 (thanks to @romaricdrigon)
- Added 'Pattern'
v1.0.14
- Added 10 new bots - see #27 (thanks to @romaricdrigon)
v1.0.13
- Added 'Google favicon' (thanks to @castevinz)
v1.0.12
- Added a generic bot detector regular expression
[a-z0-9\\-_]*((?<!cu)bot|crawler|archiver|transcoder|spider)
v1.0.11
- Made compatible with PHP 5.3 (thanks to @bLeveque42)
v1.0.10
v1.0.9
- Added '007ac9 Crawler'
- Added 'Airmail'
- Added 'Anemone'
- Added 'Butterfly'
- Added 'Content Crawler'
- Added 'Digg'
- Added 'DomainAppender'
- Added 'EasouSpider'
- Added 'ElectricMonk'
- Added 'InAGist'
- Added 'IODC'
- Added 'iZSearch'
- Added 'FRCrawler'
- Added 'LinksCrawler'
- Added 'Lipperhey Link Explorer'
- Added 'ltx71'
- Added 'MetaURI'
- Added 'MSIECrawler'
- Added 'ocrawler'
- Added 'Online Website Link Checker'
- Added 'OpenWebSpider'
- Added 'ow.ly'
- Added 'PercolateCrawler'
- Added 'Robosourcer'
- Added 'Scrapy'
- Added 'SpiderMan'
- Added 'SSL-Crawler'
- Added 'UnwindFetchor'
- Added 'urlresolver'
- Added 'XML Sitemaps Generator'
- Added 'Y!J-ASR'
- Added 'YisouSpider'
v1.0.8
- Added 'bl.uk_lddc_bot'
- Added 'classbot'
- Added 'CoPubbot'
- Added 'Domain Re-Animator Bot'
- Added 'Healthbot'
- Added 'IstellaBot'
- Added 'LinkpadBot'
- Added 'lufsbot'
- Added 'PaperLiBot'
- Added 'Plukkie'
- Added 'SearchmetricsBot'
- Added 'TrueBot'
- Added 'UnisterBot'
- Added 'YioopBot'
- Added 'Insitesbot'
- Added 'xintellibot'
- Added 'NerdyBot'
- Added 'NextGenSearchBot'
- Added 'ScreenerBot'
- Added 'ShowyouBot'
v1.0.7
- Added 'SurdotlyBot'
- Added 'AddThis'
- Tweaked some regex patterns
- Fixed #10
v1.0.6
- Added 'findxbot'
- Added 'SemrushBot'
- Added 'yoozBot'
v1.0.5
- Added 'GigablastOpenSource'
- Added 'MegaIndex.ru'
- Added 'Pingdom.com_bot'
- Added 'WeSEE:Ads/PageBot'
v1.0.4
- Added 'CrawlBot'
- Added 'Flamingo_SearchEngine'
- Added 'python-requests'
- Added 'Seznam screenshot-generator'
- Added 'SklikBot'
- Added 'trendictionbot'
v1.0.3
- Added 'BUbiNG'
- Added 'Qwantify'
- Added 'archive.org_bot'
- Added 'Applebot'
- Added 'TweetmemeBot'
v1.0.2
- Added 'AbiLogicBot'
- Added 'Link Valet'
- Added 'Mrcgiguy'
- Added 'LinkExaminer'
- Added 'LinksManager.com_bot'
- Added 'Notifixious'
- Added 'online link validator'
- Added 'Ploetz + Zeller'
- Added 'InfoWizards Reciprocal Link System PRO'
- Added 'REL Link Checker'
- Added 'SiteBar'
- Added 'Vivante Link Checker'
v1.0.1
- Added 'Yahoo Link Preview' bot
Parts of this class are based on the brilliant MobileDetect.