Splitting on a pattern performance stream vs fold #2837

Open
harendra-kumar opened this issue Sep 23, 2024 · 0 comments

Stream-based split operations are much faster than the fold-based ones, especially for multi-element sequence patterns.

Stream ops:

All.Data.Stream/o-1-space.FileSplitSeqUtf8.S.splitOnSeqUtf8 abcdefghijklmnopqrstuvwxyz (1/10)        173741.00
All.Data.Stream/o-1-space.FileSplitSeqUtf8.splitOnSeqUtf8 abcdefgh (1/10)                            187702.00
All.Data.Stream/o-1-space.FileSplitSeq.splitOnSuffixSeq suffix lf                                     94668.50
All.Data.Stream/o-1-space.FileSplitSeq.splitOnSuffixSeq suffix empty pattern                          66416.00
All.Data.Stream/o-1-space.FileSplitSeq.splitOnSuffixSeq suffix crlf                                  148361.00
All.Data.Stream/o-1-space.FileSplitSeq.splitOnSuffixSeq suffix abcdefghijklmnopqrstuvwxyz (1/10)      41291.60
All.Data.Stream/o-1-space.FileSplitSeq.splitOnSeq infix lf                                           185537.00
All.Data.Stream/o-1-space.FileSplitSeq.splitOnSeq infix empty pattern                                 68133.50
All.Data.Stream/o-1-space.FileSplitSeq.splitOnSeq infix crlf                                         216108.00
All.Data.Stream/o-1-space.FileSplitSeq.splitOnSeq infix catcatcatcatcat                              399620.00
All.Data.Stream/o-1-space.FileSplitSeq.splitOnSeq infix abcdefghijklmnopqrstuvwxyz                   378587.00
All.Data.Stream/o-1-space.FileSplitSeq.splitOnSeq infix abcdefghi                                    358539.00
All.Data.Stream/o-1-space.FileSplitSeq.splitOnSeq infix abcdefgh                                     186022.00
All.Data.Stream/o-1-space.FileSplitSeq.splitOnSeq infix aaaa                                         205258.00
All.Data.Stream/o-1-space.FileSplitSeq.splitOnSeq infix aa                                           206173.00
All.Data.Stream/o-1-space.FileSplitSeq.splitOnSeq infix a                                            184371.00
All.Data.Stream/o-1-space.FileSplitSeq.splitOnSeq infix 100k long pattern                            326212.00
All.Data.Stream/o-1-space.FileSplitSeq.splitWithSuffixSeq suffix crlf                                162380.00
All.Data.Stream/o-1-space.FileSplitSeq.splitWithSuffixSeq suffix abcdefghijklmnopqrstuvwxyz (1/10)    45155.90
All.Data.Stream/o-1-space.FileSplitElem.splitOn infix lf                                             174255.00

Fold based operations:

All.Data.Fold/o-1-space.FileSplitSeqUtf8.takeEndBySeq_ infix abcdefghijklmnopqrstuvwxyz (1/10)   200549.00
All.Data.Fold/o-1-space.FileSplitSeqUtf8.takeEndBySeq_ infix abcdefgh (1/10)                     208258.00
All.Data.Fold/o-1-space.FileSplitSeq.takeEndBySeq_ suffix lf                                     227089.00
All.Data.Fold/o-1-space.FileSplitSeq.takeEndBySeq_ suffix empty pattern                           64978.60
All.Data.Fold/o-1-space.FileSplitSeq.takeEndBySeq_ suffix crlf                                   441353.00
All.Data.Fold/o-1-space.FileSplitSeq.takeEndBySeq_ suffix abcdefghijklmnopqrstuvwxyz (1/10)       99868.90
All.Data.Fold/o-1-space.FileSplitSeq.takeEndBySeq_ infix lf                                      228112.00
All.Data.Fold/o-1-space.FileSplitSeq.takeEndBySeq_ infix empty pattern                            66797.40
All.Data.Fold/o-1-space.FileSplitSeq.takeEndBySeq_ infix crlf                                    473822.00
All.Data.Fold/o-1-space.FileSplitSeq.takeEndBySeq_ infix catcatcatcatcat                         880502.00
All.Data.Fold/o-1-space.FileSplitSeq.takeEndBySeq_ infix abcdefghijklmnopqrstuvwxyz              909916.00
All.Data.Fold/o-1-space.FileSplitSeq.takeEndBySeq_ infix abcdefghi                               881699.00
All.Data.Fold/o-1-space.FileSplitSeq.takeEndBySeq_ infix abcdefgh                                475380.00
All.Data.Fold/o-1-space.FileSplitSeq.takeEndBySeq_ infix aaaa                                    474110.00
All.Data.Fold/o-1-space.FileSplitSeq.takeEndBySeq_ infix aa                                      477609.00
All.Data.Fold/o-1-space.FileSplitSeq.takeEndBySeq_ infix a                                       227431.00
All.Data.Fold/o-1-space.FileSplitSeq.takeEndBySeq_ infix 100k long pattern                       888842.00
All.Data.Fold/o-1-space.FileSplitSeq.takeEndBySeq suffix crlf                                    478244.00
All.Data.Fold/o-1-space.FileSplitSeq.takeEndBySeq suffix abcdefghijklmnopqrstuvwxyz (1/10)        95273.50
All.Data.Fold/o-1-space.FileSplitElem.takeEndBy_ infix (splitOn)                                 151155.00
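
To put a number on the gap, here is a rough sketch that computes the fold/stream ratio for a few of the infix cases listed above. The figures are copied from the two listings. Pairing a `splitOnSeq` row with the `takeEndBySeq_` row for the same pattern is an assumption on my part, and the unit of the figures is not stated in the output, so only the ratios are meaningful here.

```python
# Figures copied verbatim from the benchmark listings above.
# Assumption: a stream benchmark and a fold benchmark with the same
# pattern name measure the same workload, so their ratio is meaningful.
# Each entry is (stream splitOnSeq infix, fold takeEndBySeq_ infix).
pairs = {
    "lf": (185537.00, 228112.00),
    "crlf": (216108.00, 473822.00),
    "abcdefgh": (186022.00, 475380.00),
    "abcdefghi": (358539.00, 881699.00),
    "100k long pattern": (326212.00, 888842.00),
}


def slowdown(stream: float, fold: float) -> float:
    """Return how many times larger the fold figure is than the stream figure."""
    return fold / stream


ratios = {name: slowdown(s, f) for name, (s, f) in pairs.items()}

for name, ratio in ratios.items():
    print(f"{name:>18}: fold/stream = {ratio:.2f}")
```

On these rows the single-element `lf` pattern shows a much smaller gap than the multi-element patterns, where the fold figure is more than double the stream figure.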

harendra-kumar added the type:performance (impacts performance) label on Sep 23, 2024
harendra-kumar added this to the 0.12.0 milestone on Sep 23, 2024