Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

max_requests_per_second cannot be zero otherwise it will break here :… #100

Merged
merged 2 commits into from
Feb 13, 2024

Conversation

spacewaterbear
Copy link
Contributor

Env

python3.11

Error description

by running
cmon download --match_type=domain --limit=100 html_output example.com html

I got this error

/cmoncrawl/processor/pipeline/downloader.py", line 97, in __init__
    self.throttler = Throttler(int(1000 / max_requests_per_second))
                                   ~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
ZeroDivisionError: division by zero

because in cmoncrawl/integrations/download.py, the default value is 0 if ouput is html

max_requests_per_second = (
        args.max_requests_per_second if mode == DownloadOutputFormat.RECORD else 0
    )

Solution

set default value to 1 when html mode chosen

… cmoncrawl/processor/pipeline/downloader.py", line 97, in __init__

    self.throttler = Throttler(int(1000 / max_requests_per_second))
@hynky1999
Copy link
Owner

Great spot!
However the correct fix should remove the mode check entirely and directly use the value which was parsed, since it's set in both of the modes.

I can fix that myself or you can do that! (let me know)

max_requests_per_second = (
        args.max_requests_per_second if mode == DownloadOutputFormat.RECORD else 0
    )

->

max_requests_per_second = args.max_requests_per_second

(Better to directly set it as the variable is not really needed anymore)

Thanks for the contribution :)

@hynky1999 hynky1999 merged commit b324fe2 into hynky1999:main Feb 13, 2024
2 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants