Commit
Reduced number of files by merging data using batch_size
One observation is that the current logic creates multiple files, which is okay, but these files do not contain many entries. It could be more efficient to keep storing entries until a threshold is reached, say 5000 or 10000 (like batch_size in load_multi_timeline_for_range). If this default batch size threshold isn't reached, keep adding to the same file. Keep updating end_ts, while start_ts remains the same.

----

Found an edge case.

Incremental export is fine. Let's say we have chosen full export. The sample data has 1906 entries, and in batch testing I'm setting batch_size_limit to 500. Now, when the code executes:
- current_end_ts will be set to initEndTs, which is the current time minus the FUZZ time as set by the pipeline queries.
- new_entries will have all 1906 entries, which is more than the batch_size_limit.
- BOTH the batch_size_limit check and the current_end_ts check will be TRUE.
- It will export the oversized batch (more than the limit) and also delete the entries.
- While this seems fine, it will cause issues when we attempt to restore data whose size exceeds the batch size.

Hence, we need a way to handle this, perhaps by:
- Setting current_end_ts to the ts value of the entry at index batch_size_limit - 1.
- Fetching entries up to this point only.
- Then fetching the next batch of entries.

Essentially, unlike the incremental scenario where we increment current_end_ts by 3600 seconds, here we need to advance current_end_ts to the ts value of the entry at the next batch_size_limit - 1 index.

--------

Working on this, but tests for it are still pending. Also, the batch size is still being exceeded.
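The proposed handling of the full-export edge case can be sketched roughly as follows. This is a minimal illustration and not the code in this commit: the function name iter_export_batches and the entry shape (a dict with a "ts" key) are assumptions made for the example; only batch_size_limit, current_end_ts, and the 1906/500 figures come from the message above.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical minimal entry shape: {"ts": <epoch seconds>}
Entry = Dict[str, float]


def iter_export_batches(
    entries: List[Entry], batch_size_limit: int
) -> Iterator[Tuple[float, float, List[Entry]]]:
    """Yield (start_ts, current_end_ts, batch) with at most batch_size_limit entries.

    current_end_ts is the ts of the last entry in each batch, i.e. the entry
    at index batch_size_limit - 1 of the remaining entries when the batch is full.
    """
    if batch_size_limit < 1:
        raise ValueError("batch_size_limit must be at least 1")

    # Sort by timestamp so that each batch covers a contiguous time range.
    ordered = sorted(entries, key=lambda e: e["ts"])

    for start in range(0, len(ordered), batch_size_limit):
        batch = ordered[start:start + batch_size_limit]
        yield batch[0]["ts"], batch[-1]["ts"], batch


if __name__ == "__main__":
    # Same sizes as the sample data in the message: 1906 entries, limit 500.
    sample = [{"ts": float(i)} for i in range(1906)]
    for start_ts, end_ts, batch in iter_export_batches(sample, 500):
        print(start_ts, end_ts, len(batch))
```

The sketch slices by index, so a batch can never exceed the limit. If the real fetch is instead a query over a ts range ending at current_end_ts, entries that share the boundary ts value could still be pulled into the same batch, so that case would need separate handling.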
Mahadik, Mukul Chandrakant authored and Mahadik, Mukul Chandrakant committed Aug 31, 2024
1 parent 63f7985, commit 34ab73d
Showing 3 changed files with 191 additions and 139 deletions.