looping through books several times #48

Open
crodgersfl opened this issue Feb 5, 2022 · 0 comments

I ran the main script and it reports that I have 374 books. It then fetches the list of books (37 pages at 10 per page). When it starts downloading, the progress bar shows that it is trying to download 6818 books. After downloading just over 350 books, it starts listing the same books again and skipping the downloads because the files already exist. It is as if the list of books was loaded about 18 times over (6818 / 374 ≈ 18.2).

Here is the output with Verbose turned on, cutting out some repetitive sections and showing a random error in between:

chris@chris-VB-LinuxMint:~/Linux_Shared/packtpub-downloader-master$ python3 main.py -e user@domain -p PaSsWoRd -d /packtpub_books -b pdf,epub,mobi,code -v -s
You are in!
You have 374 books
Getting list of books...
0%| | 0/37 [00:00<?, ?Pages/s]https://services.packtpub.com/entitlements-v1/users/me/products?sort=createdAt:DESC&offset=10&limit=10
3%|████▌ | 1/37 [00:00<00:27, 1.31Pages/s]https://services.packtpub.com/entitlements-v1/users/me/products?sort=createdAt:DESC&offset=20&limit=10
5%|█████████▏ | 2/37 [00:01<00:29, 1.18Pages/s]https://services.packtpub.com/entitlements-v1/users/me/products?sort=createdAt:DESC&offset=30&limit=10

......
97%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍ | 36/37 [00:23<00:00, 1.78Pages/s]https://services.packtpub.com/entitlements-v1/users/me/products?sort=createdAt:DESC&offset=370&limit=10
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 37/37 [00:24<00:00, 1.98Pages/s]
Downloading books...
/packtpub_books/Digital_Forensics_and_Incident_Response/Digital_Forensics_and_Incident_Response.epub already exists, skipping.
/packtpub_books/Digital_Forensics_and_Incident_Response/Digital_Forensics_and_Incident_Response.mobi already exists, skipping.
/packtpub_books/Digital_Forensics_and_Incident_Response/Digital_Forensics_and_Incident_Response.pdf already exists, skipping.
/packtpub_books/Beginning_Data_Science_with_Python_and_Jupyter/Beginning_Data_Science_with_Python_and_Jupyter.code already exists, skipping.
/packtpub_books/Beginning_Data_Science_with_Python_and_Jupyter/Beginning_Data_Science_with_Python_and_Jupyter.epub already exists, skipping.
/packtpub_books/Beginning_Data_Science_with_Python_and_Jupyter/Beginning_Data_Science_with_Python_and_Jupyter.mobi already exists, skipping.
/packtpub_books/Beginning_Data_Science_with_Python_and_Jupyter/Beginning_Data_Science_with_Python_and_Jupyter.pdf already exists, skipping.
/packtpub_books/Extreme_C/Extreme_C.code already exists, skipping.
/packtpub_books/Extreme_C/Extreme_C.epub already exists, skipping.
/packtpub_books/Extreme_C/Extreme_C.mobi already exists, skipping.
/packtpub_books/Extreme_C/Extreme_C.pdf already exists, skipping.
/packtpub_books/Wireframing_Essentials/Wireframing_Essentials.epub already exists, skipping.
/packtpub_books/Wireframing_Essentials/Wireframing_Essentials.mobi already exists, skipping.
/packtpub_books/Wireframing_Essentials/Wireframing_Essentials.pdf already exists, skipping.
/packtpub_books/TensorFlow_Reinforcement_Learning_Quick_Start_Guide/TensorFlow_Reinforcement_Learning_Quick_Start_Guide.code already exists, skipping.
/packtpub_books/TensorFlow_Reinforcement_Learning_Quick_Start_Guide/TensorFlow_Reinforcement_Learning_Quick_Start_Guide.epub already exists, skipping.
/packtpub_books/TensorFlow_Reinforcement_Learning_Quick_Start_Guide/TensorFlow_Reinforcement_Learning_Quick_Start_Guide.mobi already exists, skipping.
/packtpub_books/TensorFlow_Reinforcement_Learning_Quick_Start_Guide/TensorFlow_Reinforcement_Learning_Quick_Start_Guide.pdf already exists, skipping.
0%| | 5/6818 [00:12<4:28:38, 2.37s/Book]Starting to download /packtpub_books/Splunk_7_x_Quick_Start_Guide/Splunk_7_x_Quick_Start_Guide.code
Finished /packtpub_books/Splunk_7_x_Quick_Start_Guide/Splunk_7_x_Quick_Start_Guide.code
Starting to download /packtpub_books/Splunk_7_x_Quick_Start_Guide/Splunk_7_x_Quick_Start_Guide.epub
Finished /packtpub_books/Splunk_7_x_Quick_Start_Guide/Splunk_7_x_Quick_Start_Guide.epub
Starting to download /packtpub_books/Splunk_7_x_Quick_Start_Guide/Splunk_7_x_Quick_Start_Guide.mobi
Finished /packtpub_books/Splunk_7_x_Quick_Start_Guide/Splunk_7_x_Quick_Start_Guide.mobi
/packtpub_books/Splunk_7_x_Quick_Start_Guide/Splunk_7_x_Quick_Start_Guide.pdf already exists, skipping.
0%|▏ | 6/6818 [00:17<6:20:58, 3.36s/Book]Starting to download /packtpub_books/Salesforce_Platform_Developer_I_Certification_Guide/Salesforce_Platform_Developer_I_Certification_Guide.code
Finished /packtpub_books/Salesforce_Platform_Developer_I_Certification_Guide/Salesforce_Platform_Developer_I_Certification_Guide.code
Starting to download /packtpub_books/Salesforce_Platform_Developer_I_Certification_Guide/Salesforce_Platform_Developer_I_Certification_Guide.epub
Finished /packtpub_books/Salesforce_Platform_Developer_I_Certification_Guide/Salesforce_Platform_Developer_I_Certification_Guide.epub
......

Eventually the same list of books is attempted over and over (and skipped) while the counter keeps climbing into the thousands.

Starting to download /packtpub_books/Learning_Drupal_8/Learning_Drupal_8.pdf
Finished /packtpub_books/Learning_Drupal_8/Learning_Drupal_8.pdf
5%|████████▊ | 373/6818 [1:02:11<7:54:10, 4.41s/Book]Starting to download /packtpub_books/Functional_C3/Functional_C3.code
Finished /packtpub_books/Functional_C3/Functional_C3.code
Starting to download /packtpub_books/Functional_C3/Functional_C3.epub
Finished /packtpub_books/Functional_C3/Functional_C3.epub
Starting to download /packtpub_books/Functional_C3/Functional_C3.mobi
Finished /packtpub_books/Functional_C3/Functional_C3.mobi
Starting to download /packtpub_books/Functional_C3/Functional_C3.pdf
Finished /packtpub_books/Functional_C3/Functional_C3.pdf
/packtpub_books/Hands-On_Dashboard_Development_with_Shiny/Hands-On_Dashboard_Development_with_Shiny.code already exists, skipping.
/packtpub_books/Hands-On_Dashboard_Development_with_Shiny/Hands-On_Dashboard_Development_with_Shiny.epub already exists, skipping.
/packtpub_books/Hands-On_Dashboard_Development_with_Shiny/Hands-On_Dashboard_Development_with_Shiny.mobi already exists, skipping.
/packtpub_books/Hands-On_Dashboard_Development_with_Shiny/Hands-On_Dashboard_Development_with_Shiny.pdf already exists, skipping.
/packtpub_books/Enterprise_Augmented_Reality_Projects/Enterprise_Augmented_Reality_Projects.code already exists, skipping.
/packtpub_books/Enterprise_Augmented_Reality_Projects/Enterprise_Augmented_Reality_Projects.epub already exists, skipping.
/packtpub_books/Enterprise_Augmented_Reality_Projects/Enterprise_Augmented_Reality_Projects.mobi already exists, skipping.
/packtpub_books/Enterprise_Augmented_Reality_Projects/Enterprise_Augmented_Reality_Projects.pdf already exists, skipping.
/packtpub_books/Continuous_Delivery_with_Docker_and_Jenkins/Continuous_Delivery_with_Docker_and_Jenkins.code already exists, skipping.
/packtpub_books/Continuous_Delivery_with_Docker_and_Jenkins/Continuous_Delivery_with_Docker_and_Jenkins.epub already exists, skipping.
/packtpub_books/Continuous_Delivery_with_Docker_and_Jenkins/Continuous_Delivery_with_Docker_and_Jenkins.mobi already exists, skipping.
/packtpub_books/Continuous_Delivery_with_Docker_and_Jenkins/Continuous_Delivery_with_Docker_and_Jenkins.pdf already exists, skipping.

......

When I killed it with Ctrl-C, I saw:
/packtpub_books/Azure_for_Architects_-_Second_Edition/Azure_for_Architects_-_Second_Edition.code already exists, skipping.
/packtpub_books/Azure_for_Architects_-_Second_Edition/Azure_for_Architects_-_Second_Edition.epub already exists, skipping.
/packtpub_books/Azure_for_Architects_-_Second_Edition/Azure_for_Architects_-_Second_Edition.mobi already exists, skipping.
/packtpub_books/Azure_for_Architects_-_Second_Edition/Azure_for_Architects_-_Second_Edition.pdf already exists, skipping.
30%|████████████████████████████████████████████████▋ | 2064/6818 [2:12:51<3:21:20, 2.54s/Book]^CTraceback (most recent call last):
  File "/home/chris/.local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 377, in _make_request
    httplib_response = conn.getresponse(buffering=True)
TypeError: getresponse() got an unexpected keyword argument 'buffering'

Yes, it got up to book #2064 of 6818 (with all of the skips in between).

Is there a way to de-duplicate the book list? We probably want to preserve the order so that the most recently added books download first; that way you can kill the script once it reaches the repeated skipping of books you have already downloaded.
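
For illustration, here is a minimal sketch of an order-preserving de-duplication. It assumes the book list is a plain Python list of dicts and that each entry carries a unique identifier under a key such as `productId`; both the function name `dedupe_books` and that key name are assumptions, not something confirmed from the script's source.

```python
def dedupe_books(books, key="productId"):
    """Return a new list with repeated books removed.

    The first occurrence of each book is kept, so the original
    (most recent first) ordering is preserved.
    `key` is an assumed identifier field; adjust it to whatever
    the real entries contain.
    """
    seen = set()
    unique = []
    for book in books:
        identifier = book.get(key)
        if identifier is None:
            # No identifier to compare on: keep the entry rather than guess.
            unique.append(book)
            continue
        if identifier in seen:
            continue  # already queued once, skip the repeat
        seen.add(identifier)
        unique.append(book)
    return unique


if __name__ == "__main__":
    # Hypothetical sample data, only to show the behaviour.
    sample = [
        {"productId": "a1", "productName": "Extreme C"},
        {"productId": "b2", "productName": "Learning Drupal 8"},
        {"productId": "a1", "productName": "Extreme C"},
    ]
    print(len(dedupe_books(sample)))
```

If the list really is being loaded several times over, calling something like this once, right before the download loop, would bring the count back down to the number of distinct books without changing their order.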

I am not familiar with Python, but I used to code in C. Is there a bit of code I could add to dump the book list to a file before the downloads start, so that I can see what it is about to loop through? Alternatively, could we add an option (maybe -x or --extrainfo) that dumps the data to a file and leaves it there after the run completes? It would write to the same writable location as the target directory: remove the file if it already exists, create a new dump file, write the books data into it, close it, and, if verbose is on, print the file name and location ("dumped the books database to xxx").
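
The dump steps described above can be sketched roughly as follows. This is not taken from the script itself: the function name `dump_books`, the default file name `books_dump.json`, and the assumption that the book list is JSON-serializable (a list of dicts) are all hypothetical, and the sketch does not wire up the proposed -x / --extrainfo option.

```python
import json
import os


def dump_books(books, target_dir, filename="books_dump.json", verbose=False):
    """Write the book list to a JSON file inside target_dir and return its path."""
    path = os.path.join(target_dir, filename)

    # Remove a dump left over from a previous run.
    if os.path.exists(path):
        os.remove(path)

    # Create the new file, write the data, and close it (the `with` block
    # closes the file automatically).
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(books, handle, indent=2, ensure_ascii=False)

    if verbose:
        print(f"Dumped the books database to {path}")
    return path
```

Calling it just before the download loop, for example `dump_books(books, "/packtpub_books", verbose=True)`, would leave a readable file showing exactly which entries the script is about to iterate over, including any repeats.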
