Unable to scrape JavaScript website #891
Hey @nihaludin! I'm a bot here to help you with bugs, questions, and contributions. Let's tackle this issue together! To address the timeout error you're experiencing, consider the following adjustments:
Here's an updated configuration suggestion:
graph_config = {
    "llm": {
        "api_key": OPENAI_API_KEY,
        "model": "openai/gpt-4o",
    },
    "verbose": True,
    "headless": False,
    "loader_kwargs": {
        "requires_js_support": True,
        "timeout": 180,  # Increase timeout
        "retry_limit": 3,
        "slow_mo": 20000
    }
}
These steps should help mitigate the timeout issue. If the problem persists, consider testing with a different URL that has known dynamic content. To continue talking to Dosu, mention @dosu.
I increased the timeout parameter to 1000, but I'm still getting the same error @dosu
If increasing the timeout to 1000 seconds didn't resolve the issue, consider the following steps:
If these steps don't resolve the issue, it might be helpful to log more detailed information about the scraping attempts to diagnose the problem further.
@VinciGit00 I'm not able to fetch details even after setting "requires_js_support": True. Is there any other approach to overcome this issue? Thanks!
Why did you navigate to https://www.example.com/? It does not require JavaScript.
@VinciGit00 Apologies, it's just an example, not the actual website. I'm working with an e-commerce website that relies on JavaScript.
OK, please send the complete code.
`graph_config = {
}` `smart_scraper = SmartScraperGraph( result = smart_scraper.run()`
@VinciGit00 Please find the code above. Thanks!
I'm trying to scrape this website, @VinciGit00. Any help would be appreciated. Thanks!
Hi @ALL,
I’m encountering a similar issue and would appreciate your help. Initially, I received the message, "You need to enable JavaScript to run this app."
However, after enabling requires_js_support: True in my configuration, I am now encountering a timeout error.
--- Executing Fetch Node ---
--- (Fetching HTML from: https://www.example.com/) ---
Attempt 1 failed: Page.goto: Timeout 30000ms exceeded.
Call log:
navigating to "https://www.example.com/", waiting until "networkidle"
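The call log shows the navigation waiting for "networkidle", a state that pages which keep making background requests may never reach. As a diagnostic outside ScrapeGraphAI, the page could be loaded directly with Playwright, waiting for a specific element instead. This is only a minimal sketch: `goto_options`, `fetch_rendered_html` and `ready_selector` are hypothetical names introduced here, and the 60-second default is an assumption, not a value from this thread.

```python
def goto_options(timeout_seconds=60, wait_until="domcontentloaded"):
    """Build keyword arguments for Playwright's page.goto (its timeout is in milliseconds)."""
    allowed = {"commit", "domcontentloaded", "load", "networkidle"}
    if wait_until not in allowed:
        raise ValueError(f"unsupported wait_until: {wait_until!r}")
    return {"wait_until": wait_until, "timeout": timeout_seconds * 1000}


def fetch_rendered_html(url, ready_selector, timeout_seconds=60):
    """Load url in headless Chromium and return the HTML once ready_selector appears."""
    # Imported here so goto_options can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, **goto_options(timeout_seconds))
            # Wait for a concrete element rather than for the network to go quiet.
            page.wait_for_selector(ready_selector, timeout=timeout_seconds * 1000)
            return page.content()
        finally:
            browser.close()
```

If `fetch_rendered_html` returns the expected markup for the target site, the timeout is more likely tied to the "networkidle" wait than to the site blocking the browser.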
Here’s my current configuration:
graph_config = {
    "llm": {
        "api_key": OPENAI_API_KEY,
        "model": "openai/gpt-4o",
    },
    "verbose": True,
    "headless": False,
    "loader_kwargs": {
        "requires_js_support": True,
        "timeout": 120,
        "retry_limit": 3,
        "slow_mo": 20000
    }
}
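Note that the log reports a 30000 ms timeout even though the configuration sets "timeout": 120, which may mean the configured value is not reaching the navigation call. A quick way to check this after each change is to compare the two values. The sketch below assumes the configured timeout is in seconds (as the bot's reply implies); the function names are hypothetical.

```python
import re

# Matches the Playwright error line shown in the log above.
TIMEOUT_RE = re.compile(r"Timeout (\d+)ms exceeded")


def logged_timeout_ms(log_text):
    """Return the timeout in ms reported by the first matching log line, or None."""
    match = TIMEOUT_RE.search(log_text)
    return int(match.group(1)) if match else None


def timeout_was_applied(log_text, configured_seconds):
    """True if the logged timeout equals the configured one (assumed to be in seconds)."""
    observed = logged_timeout_ms(log_text)
    return observed is not None and observed == configured_seconds * 1000


log = "Attempt 1 failed: Page.goto: Timeout 30000ms exceeded."
print(logged_timeout_ms(log))         # 30000
print(timeout_was_applied(log, 120))  # False
```

A result of False here suggests that raising "timeout" further is unlikely to change the outcome until the setting is actually passed through.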