Crawler Detect is a PHP class for detecting bots/crawlers/spiders via the user agent

castevinz/Crawler-Detect

 
 

CrawlerDetect


CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent. It can currently detect hundreds of bots/spiders/crawlers.

Installation

Run composer require jaybizzle/crawler-detect 1.* or add "jaybizzle/crawler-detect": "1.*" to the require section of your composer.json.

Usage

use Jaybizzle\CrawlerDetect\CrawlerDetect;

$CrawlerDetect = new CrawlerDetect;

// Check the user agent of the current 'visitor'
if($CrawlerDetect->isCrawler()) {
	// true if crawler user agent detected
}

// Pass a user agent as a string
if($CrawlerDetect->isCrawler('Mozilla/5.0 (compatible; Sosospider/2.0; +http://help.soso.com/webspider.htm)')) {
	// true if crawler user agent detected
}

// Output which regex rules matched (if any)
var_dump($CrawlerDetect->getMatches());

Contributing

If you find a bot/spider/crawler user agent that CrawlerDetect fails to detect, please submit a pull request with the regex pattern added to the $crawlers array in CrawlerDetect.php, and add the failing user agent to tests/crawlers.txt.

Failing that, just create an issue with the user agent you have found, and we'll take it from there :)
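
Before opening a pull request, it can help to confirm that a candidate pattern really catches the user agent you found. The following is a minimal sketch, not part of the library: findUndetected() is a hypothetical helper, 'ExampleFetcher' and its user agent are made up for illustration, and it assumes patterns are joined with | and matched case-insensitively.

```php
<?php
/**
 * Return the user agents that none of the given regex fragments match.
 * Hypothetical helper for illustration; not part of CrawlerDetect.
 */
function findUndetected(array $userAgents, array $patterns): array
{
    // Assumption: fragments are combined with "|" and matched case-insensitively.
    $regex = '/' . implode('|', $patterns) . '/i';

    $missed = [];
    foreach ($userAgents as $userAgent) {
        $userAgent = trim($userAgent);
        if ($userAgent === '') {
            continue; // skip blank lines
        }
        if (preg_match($regex, $userAgent) !== 1) {
            $missed[] = $userAgent;
        }
    }

    return $missed;
}

// 'ExampleFetcher' is a made-up candidate pattern.
$patterns = ['Sosospider', 'ExampleFetcher'];

$userAgents = [
    'Mozilla/5.0 (compatible; Sosospider/2.0; +http://help.soso.com/webspider.htm)',
    'ExampleFetcher/1.0 (made-up user agent)',
    'SomethingElse/3.1 (made-up user agent)',
];

// Lists only the user agents that still need a pattern.
print_r(findUndetected($userAgents, $patterns));
```

If tests/crawlers.txt holds one user agent per line (an assumption about its format), the list could be loaded with file('tests/crawlers.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES).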

Laravel Package

If you would like to use this with Laravel 5, please see Laravel-Crawler-Detect.

Changelog

v1.1.0

Massive performance gains: over 77% faster in some cases. First, common strings are now removed from the user agent before matching, so the regex engine takes fewer steps to find a match. There is no point matching against terms such as Mozilla, Android or Chrome, because these strings never identify a bot. Second, because the generic pattern [a-z0-9\\-_]*((?<!cu)bot|crawler|archiver|transcoder|spider) already catches them, bots whose names contain the terms bot, spider, crawler etc. no longer need their own entries in the regex array.

See #42 for some simple benchmarks.
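
The two-step idea described for v1.1.0 can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's actual code: isCrawlerSketch() is a hypothetical function, the $exclusions list is a shortened made-up stand-in for the real one, and '/' delimiters with case-insensitive matching are assumed.

```php
<?php
// Hypothetical, shortened exclusion list; the library's real list is not reproduced here.
$exclusions = ['Mozilla', 'Android', 'Chrome', 'Safari', 'AppleWebKit'];

// Generic pattern from the changelog. In a single-quoted PHP string,
// "\\-" produces the regex escape "\-".
$generic = '[a-z0-9\\-_]*((?<!cu)bot|crawler|archiver|transcoder|spider)';

function isCrawlerSketch(string $userAgent, array $exclusions, string $pattern): bool
{
    // Step 1: strip terms that can never identify a bot, so the
    // pattern has less text to scan.
    $quoted = array_map(function ($term) {
        return preg_quote($term, '/');
    }, $exclusions);
    $reduced = preg_replace('/' . implode('|', $quoted) . '/i', '', $userAgent);

    // Step 2: nothing left means nothing can match.
    if (trim($reduced) === '') {
        return false;
    }

    // Step 3: run the (much shorter) pattern against what remains.
    return preg_match('/' . $pattern . '/i', $reduced) === 1;
}

$spider  = 'Mozilla/5.0 (compatible; Sosospider/2.0; +http://help.soso.com/webspider.htm)';
$browser = 'Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 Chrome/50.0 Safari/537.36';

var_dump(isCrawlerSketch($spider, $exclusions, $generic));  // bool(true)
var_dump(isCrawlerSketch($browser, $exclusions, $generic)); // bool(false)
```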

v1.0.20

  • Added more bots; see #40 and #41

v1.0.19

v1.0.18

v1.0.17

v1.0.16

v1.0.15

v1.0.14

v1.0.13

v1.0.12

  • Added a generic bot detector regular expression [a-z0-9\\-_]*((?<!cu)bot|crawler|archiver|transcoder|spider)
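
To show what the generic expression does, here is a small sketch. matchesGeneric() is a hypothetical helper, the second user agent is made up, and '/' delimiters with the case-insensitive flag are assumptions. The negative lookbehind (?<!cu) rejects "bot" when it is directly preceded by "cu", so a device name such as CUBOT is not treated as a crawler.

```php
<?php
// Generic pattern from the changelog. In a single-quoted PHP string,
// "\\-" produces the regex escape "\-".
$generic = '[a-z0-9\\-_]*((?<!cu)bot|crawler|archiver|transcoder|spider)';

// Hypothetical helper; assumes '/' delimiters and case-insensitive matching.
function matchesGeneric(string $userAgent, string $pattern): bool
{
    return preg_match('/' . $pattern . '/i', $userAgent) === 1;
}

// Contains "Googlebot", so the "bot" alternative matches.
$crawler = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';

// Made-up device user agent: the only "bot" follows "cu", so the lookbehind rejects it.
$phone = 'Mozilla/5.0 (Linux; Android 4.4; CUBOT ONE)';

var_dump(matchesGeneric($crawler, $generic)); // bool(true)
var_dump(matchesGeneric($phone, $generic));   // bool(false)
```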

v1.0.11

v1.0.10

v1.0.9

v1.0.8

v1.0.7

v1.0.6

v1.0.5

v1.0.4

v1.0.3

v1.0.2

v1.0.1

Parts of this class are based on the brilliant MobileDetect
