Don't crawl URLs on x.com
The crawler currently skips share links to twitter.com; as of cfpb/consumerfinance.gov@846e03a, the CFPB website links to x.com instead. This change adds the equivalent x.com share-link pattern to the crawl exclusion list.
chosak authored Jul 3, 2024
1 parent c21128f commit 26061f3
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions crawler/wpull_plugin.py
@@ -27,6 +27,7 @@
 [
 r"^https://www.facebook.com/dialog/share\?.*",
 r"^https://twitter.com/intent/tweet\?.*",
+r"^https://x.com/intent/tweet\?.*",
 r"^https://www.linkedin.com/shareArticle\?.*",
 ],
 )
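
For readers unfamiliar with how a list like this is typically consumed, here is a minimal sketch in Python. It assumes the patterns are joined into one compiled regular expression and checked against each candidate URL; the names `SKIP_URL_PATTERNS`, `SKIP_URL_RE`, and `should_skip` are hypothetical, since the diff does not show the surrounding code in `crawler/wpull_plugin.py`.

```python
import re

# Hypothetical name: the diff shows only the list literal, not the variable
# or call that wraps it in crawler/wpull_plugin.py.
SKIP_URL_PATTERNS = [
    r"^https://www.facebook.com/dialog/share\?.*",
    r"^https://twitter.com/intent/tweet\?.*",
    r"^https://x.com/intent/tweet\?.*",
    r"^https://www.linkedin.com/shareArticle\?.*",
]

# Assumption: the patterns are combined into a single alternation so each
# URL needs only one regex match.
SKIP_URL_RE = re.compile("|".join(f"(?:{p})" for p in SKIP_URL_PATTERNS))


def should_skip(url: str) -> bool:
    """Return True if the URL matches any exclusion pattern."""
    return SKIP_URL_RE.match(url) is not None


if __name__ == "__main__":
    for candidate in (
        "https://x.com/intent/tweet?text=hello",
        "https://x.com/CFPB",
    ):
        print(candidate, should_skip(candidate))
```

Under these assumptions, the added pattern excludes only `x.com/intent/tweet?...` share links; a profile URL such as `https://x.com/CFPB` would not match any pattern in the list.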