Katana doesn't seem to work when trying to scrape certain dynamically loaded pages, especially those that have a dynamic map #520

PixelNinja2023 · 2023-07-17T07:33:34Z

katana version: v1.0.2

Current Behavior:

The body text returned when scraping this specific page does not include any of the information about the petrol stations/shops.

Expected Behavior:

Since katana's headless mode drives a real browser, I assumed that loading the page with JavaScript enabled and the correct input parameters would return the entire body text, as if I were loading it in a normal browser.

Steps To Reproduce:

I run the following command in my terminal to scrape the body text of the page:

/root/go/bin/katana -timeout 10 -headless -d 2 -flc /config.yaml -f address \
  -u https://www.oil-tankstellen.de/tankstellen-tankstationen/tankstellenfinder-kraftstoffpreise-benzinpreise/ \
  -no-incognito -ns -jc -ct 10

where config.yaml contains the following basic regex rule, intended to extract the entire page body:

name: address
type: regex
regex:
- '(.*)'
group: 1
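
A side note on the rule itself, as a minimal sketch: assuming katana compiles the rule with Go's standard regexp package (not confirmed here), '.' does not match a newline by default, so '(.*)' yields one match per line rather than a single match covering the whole body. The helper captureGroup and the sample body below are hypothetical and only illustrate the regex behaviour; they are not katana code.

```go
package main

import (
	"fmt"
	"regexp"
)

// captureGroup returns the requested capture group from every match of
// pattern in body. Hypothetical helper for illustration; not part of katana.
func captureGroup(pattern, body string, group int) []string {
	re := regexp.MustCompile(pattern)
	var out []string
	for _, m := range re.FindAllStringSubmatch(body, -1) {
		if group < len(m) {
			out = append(out, m[group])
		}
	}
	return out
}

func main() {
	// Made-up multi-line body standing in for a fetched page.
	body := "<html>\n<div id=\"map\"></div>\n</html>"

	// Without the s flag, '.' stops at each newline.
	perLine := captureGroup(`(.*)`, body, 1)
	fmt.Printf("(.*): %d matches, first is %q\n", len(perLine), perLine[0])

	// With (?s), '.' also matches newlines, so one match spans the body.
	whole := captureGroup(`(?s)(.*)`, body, 1)
	fmt.Printf("(?s)(.*): first match equals body: %v\n", whole[0] == body)
}
```

If that assumption holds, a rule such as '(?s)(.*)' would be the closer fit for "the entire page body". It would not by itself make station data appear if that data is only loaded by JavaScript after the initial response.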

Is there any way I can scrape the entire page, including the information for all the Oil! petrol stations, using katana?

PixelNinja2023 added the "Type: Bug" label Jul 17, 2023
