bestrecipes.com.au scraper broken #1354

Surfoo · 2024-11-03T18:22:51Z

Pre-filing checks

I have searched for open issues that report the same problem
I have checked that the bug affects the latest version of the library

The URL of the recipe(s) that are not being scraped correctly

The results you expect to see
I don't know.

The results (including any Python error messages) that you are seeing
I couldn't run the scraper; I hit an import error first:

$ python -m pipx install recipe-scrapers --include-deps
'recipe-scrapers' already seems to be installed. Not modifying existing installation in '/home/johndoe/.local/pipx/venvs/recipe-scrapers'. Pass '--force' to force installation.

$ python
Python 3.12.6 (main, Sep  8 2024, 13:18:56) [GCC 14.2.1 20240805] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> from recipe_scrapers import scrape_html
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'recipe_scrapers'
>>>


jknndy · 2024-11-03T20:29:57Z

Hi @Surfoo, it looks like the issue you’re experiencing is related to importing the recipe_scrapers library rather than the specific URLs. The ModuleNotFoundError: No module named 'recipe_scrapers' error suggests that Python wasn’t able to locate recipe_scrapers at all before attempting to access bestrecipes.

Could you share any additional output or error messages, if available, that might clarify the environment setup? It may also help to check if recipe_scrapers is installed in the same environment where you’re running the script.
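For reference, a minimal sketch of that environment check, using only the standard library. The pipx output above shows the package living in its own virtual environment (`/home/johndoe/.local/pipx/venvs/recipe-scrapers`), which a plain `python` started from the shell would not normally see. The helper name `describe_install` is made up for this illustration and is not part of recipe_scrapers.

```python
import importlib.util
import sys


def describe_install(module_name="recipe_scrapers"):
    """Report which interpreter is running and whether it can see the module.

    `describe_install` is a hypothetical helper for this thread only.
    """
    # find_spec returns None when a top-level module cannot be located.
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        "module_found": spec is not None,
        "module_location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    for key, value in describe_install().items():
        print(f"{key}: {value}")
```

If `module_found` is `False` and `interpreter` does not point inside the pipx venv, the install and the interpreter are simply in different environments.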

Surfoo · 2024-11-03T20:49:04Z

I tried to help by following the Getting Started part in the readme. Which command would you like me to execute? I don't know Python.

I ran into the bug with the mealie app, which uses recipe_scrapers in its backend. Here is the log:

mealie            | INFO     2024-11-03T19:03:47 - HTTP Request: GET https://www.bestrecipes.com.au/recipes/mini-marsala-fruit-cakes-recipe/kwlyzyae "HTTP/1.1 403 Forbidden"

jknndy · 2024-11-07T00:02:55Z

Sorry for the delay here, I am traveling for work so free time is rare! The 403 error makes me believe this could be related to the way mealie is attempting to access the site.

@jayaddison could you weigh in here? I believe this is similar to the other bots-protection issue opened recently.

jayaddison · 2024-11-07T22:32:21Z

Initially: yes, it seems likely that this could be some form of bot protection (network request filtering). I'll try to confirm that soon. @Surfoo did you manage to find a way to get that import to work? We generally suggest using pip here, but I'd expect pipx to work equally well.

Surfoo · 2024-11-08T22:19:29Z

Hello,
No, I haven't tried again since last time.

jayaddison · 2024-11-12T19:04:09Z

I can confirm that I'm able to scrape the first recipe (the Satay Chicken one) from HTML successfully, so this does indeed seem to be some kind of network-request-filtering problem (aka bots-protection).
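A rough way to tell the two failure modes apart is to fetch the page separately and look at the status code before any parsing happens. The sketch below uses only the standard library; `fetch_html` and its `opener` parameter are hypothetical names for this illustration, not part of recipe_scrapers or mealie.

```python
import urllib.error
import urllib.request


def fetch_html(url, opener=urllib.request.urlopen):
    """Return (status_code, html_or_None) for a URL.

    Hypothetical helper: a 403 here points at request filtering on the
    site's side, not at a bug in the scraper's parsing.
    """
    request = urllib.request.Request(url)
    try:
        with opener(request, timeout=30) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status, body
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 403.
        return error.code, None
```

If this returns `(403, None)`, as in the mealie log above, the HTML never reached the scraper. If HTML saved by other means (for example from a browser) parses fine when passed to `scrape_html`, the scraper itself is most likely not the broken part.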

Surfoo added the bug label Nov 3, 2024

jayaddison added the bots-protection label (A form of bot protection is preventing the fetching of the recipe's HTML) Nov 7, 2024