Use concurrent batch pipeline for ~30x speed up #236

Merged: 10 commits merged on Oct 7, 2024

Conversation

@dfulu (Member) commented Jul 24, 2024

Add the PVNet concurrent datapipe to the concurrent batch creation and backtest scripts. The new datapipe is already in ocf_datapipes and is about 30x faster than the old datapipe for creating concurrent batches.

@dfulu dfulu requested a review from zakwatts July 24, 2024 13:18
@dfulu dfulu requested a review from Sukh-P July 24, 2024 13:18
@dfulu dfulu changed the title Concurrent pipeline Use concurrent pipeline for ~30x speed up Jul 24, 2024
@dfulu dfulu changed the title Use concurrent pipeline for ~30x speed up Use concurrent batch pipeline for ~30x speed up Jul 24, 2024

codecov bot commented Jul 24, 2024

Codecov Report

Attention: Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Please upload report for BASE (main@4473d33). Learn more about missing BASE report.

| Files | Patch % | Lines |
|---|---|---|
| pvnet/models/base_model.py | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #236   +/-   ##
=======================================
  Coverage        ?   58.48%           
=======================================
  Files           ?       29           
  Lines           ?     1879           
  Branches        ?        0           
=======================================
  Hits            ?     1099           
  Misses          ?      780           
  Partials        ?        0           


@zakwatts (Contributor)

Looks good!

@zakwatts (Contributor)

Might be worth getting @AUdaltsova and @Sukh-P to review as well, since they use datapipes quite a bit.

Resolved review threads (outdated):
pvnet/models/base_model.py (1 thread)
scripts/backtest_uk_gsp.py (2 threads)

@Sukh-P (Member) commented Jul 24, 2024

Great work adding this in and getting the speedup! Just checking: has this been rerun and tested locally?

@dfulu (Member, Author) commented Jul 24, 2024

> Great work adding this in and getting the speedup! Just checking: has this been rerun and tested locally?

Yeah, it has been tested locally on a small set of examples. I'm going to leave it open for now, since I am making concurrent batches to train on and Zak is doing a backtest, just in case more bugs come out of the woodwork during those.

@dfulu dfulu merged commit c93996c into main Oct 7, 2024
3 checks passed
@dfulu dfulu deleted the concurrent_pipeline branch October 7, 2024 10:33