Commit

feat: Block AI user agents (#1145)
* feat: Block AI user agents

* fix the syntax and format of robots.txt

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
pghorpade and github-actions[bot] authored Jun 14, 2024
1 parent b7a4f16 commit 7b878bf
Showing 1 changed file with 95 additions and 2 deletions.
97 changes: 95 additions & 2 deletions public/robots_allow.txt
@@ -1,4 +1,97 @@
Sitemap: https://digital.library.ucla.edu/sitemap.xml

User-agent: *
Disallow:
User-agent: AdsBot-Google
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Applebot
Disallow: /

User-agent: AwarioRssBot
Disallow: /

User-agent: AwarioSmartBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: DataForSeoBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: FacebookBot
Disallow: /

User-agent: FriendlyCrawler
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: GoogleOther
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: img2dataset
Disallow: /

User-agent: ImagesiftBot
Disallow: /

User-agent: magpie-crawler
Disallow: /

User-agent: Meltwater
Disallow: /

User-agent: omgili
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: peer39_crawler
Disallow: /

User-agent: peer39_crawler/1.0
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: PiplBot
Disallow: /

User-agent: scoop.it
Disallow: /

User-agent: Seekr
Disallow: /

User-agent: YouBot
Disallow: /
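
A quick way to sanity-check rules like the ones in this diff is to feed them to Python's standard `urllib.robotparser` and ask whether a given user agent may fetch a URL. The sketch below is illustrative only: `ROBOTS_EXCERPT` is a hand-copied subset of the file (not the full list), `ExampleBrowser` is a made-up agent name, and `build_parser` / `is_allowed` are hypothetical helper names that are not part of the repository.

```python
from urllib.robotparser import RobotFileParser

# Hand-copied excerpt of the rules added in this commit (NOT the full file).
ROBOTS_EXCERPT = """\
Sitemap: https://digital.library.ucla.edu/sitemap.xml

User-agent: ClaudeBot
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /
"""

# Site root taken from the Sitemap line; used only to build URLs to check.
SITE_ROOT = "https://digital.library.ucla.edu/"


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str = SITE_ROOT) -> bool:
    """Return True if the parsed rules let user_agent fetch url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_EXCERPT)
    # "ExampleBrowser" is a hypothetical agent that no rule in the excerpt names.
    for agent in ("GPTBot", "ClaudeBot", "PerplexityBot", "ExampleBrowser"):
        status = "allowed" if is_allowed(parser, agent) else "blocked"
        print(f"{agent}: {status}")
```

Note that `urllib.robotparser` matches agent names case-insensitively by substring, which may differ from how an individual crawler interprets the file, and that robots.txt is advisory: it only affects crawlers that choose to honor it.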
