In Google's webmaster tools, we are receiving crawl errors for weird URLs.
These URLs are extracted from pages like this:
https://cofacts.g0v.tw/article/2yje6no2cqv2v
「www.sushiexpress.com.tw,爭鮮也已經出面回應這是假訊息。」 (roughly: "www.sushiexpress.com.tw, Sushi Express has also come forward to say this is false information.") is being extracted as a URL in its entirety, and `http://` is not prepended, so the crawler treats it as a relative link and goes to https://cofacts.g0v.tw/article/www.sushiexpress.com.tw,⋯⋯ instead.

We should fix the URL scraping logic, prepend `http://` or `https://` when needed, and rewrite the `hyperlinks` field of all articles and replies to remove the wrongly extracted URLs.
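
To make the proposed fix concrete, here is a minimal sketch in Python. It is not the project's actual scraper: `URL_PATTERN`, `normalize_url`, `extract_urls`, and `clean_hyperlinks` are hypothetical names, and the shape of a `hyperlinks` entry (a dict with a `url` key) is an assumption. The idea is to stop a match at the first non-ASCII character (such as the full-width comma `,`), default to `http://` when no scheme is present, and drop stored hyperlinks that the corrected extractor would no longer produce.

```python
import re

# Assumed rule: a URL starts with a scheme or "www." and continues only over
# printable ASCII, so CJK text and full-width punctuation end the match.
URL_PATTERN = re.compile(r"(?:https?://|www\.)[!-~]+", re.IGNORECASE | re.ASCII)

# ASCII punctuation that usually belongs to the sentence, not to the URL.
TRAILING_PUNCTUATION = ".,;:!?)\"'"


def normalize_url(candidate: str) -> str:
    """Prepend a scheme when missing (http:// is an assumed default)."""
    if re.match(r"https?://", candidate, re.IGNORECASE):
        return candidate
    return "http://" + candidate


def extract_urls(text: str) -> list[str]:
    """Return absolute URLs found in text, in order of appearance."""
    urls = []
    for match in URL_PATTERN.finditer(text):
        candidate = match.group(0).rstrip(TRAILING_PUNCTUATION)
        urls.append(normalize_url(candidate))
    return urls


def clean_hyperlinks(text: str, hyperlinks: list[dict]) -> list[dict]:
    """Keep only stored hyperlinks whose url the fixed extractor still yields.

    The {"url": ...} record shape is hypothetical.
    """
    valid = set(extract_urls(text))
    return [link for link in hyperlinks if link["url"] in valid]


text = "「www.sushiexpress.com.tw,爭鮮也已經出面回應這是假訊息。」"
print(extract_urls(text))  # ['http://www.sushiexpress.com.tw']
```

`clean_hyperlinks` only removes bad entries; adding the corrected URL back (and fetching its metadata) would still require re-running the scraper on the affected articles and replies.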