Releases: IBM/spark-s3-shuffle
Maintenance release: Create builds for Spark 3.5.x
Move files using NIO if the shuffle dir is mounted as a local file system.
What's Changed
- Add block sizes to benchmarks and use an array-backed buffer for uploading to S3. by @pspoerri in #80
- Move files using NIO if the shuffle dir is mounted as a local file system. by @pspoerri in #81
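The NIO move from #81 can be illustrated with a minimal Java sketch. This is not the plugin's code: the class `LocalShuffleMove` and its methods are hypothetical, and the sketch assumes that "mounted as a local file system" means the shuffle root has a `file` URI scheme or no scheme at all.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; not part of spark-s3-shuffle.
public class LocalShuffleMove {
    // Assumption: a root with scheme "file" (or no scheme) is locally mounted.
    public static boolean isLocal(URI root) {
        String scheme = root.getScheme();
        return scheme == null || "file".equals(scheme);
    }

    // Move src to dst with NIO, creating parent directories and
    // replacing any file that already exists at dst.
    public static Path move(Path src, Path dst) throws IOException {
        Path parent = dst.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        return Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("shuffle-demo");
        Path src = Files.write(dir.resolve("block.data.tmp"), new byte[] {1, 2, 3});
        Path dst = dir.resolve("out").resolve("block.data");
        if (isLocal(dir.toUri())) {
            move(src, dst);
        }
        System.out.println("moved=" + Files.exists(dst) + " sourceLeft=" + Files.exists(src));
    }
}
```

On a local mount, a move within the same file store is typically a rename, which avoids copying the shuffle data through a stream.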
Full Changelog: v0.9.4...v0.9.5
Fix timeout issue on S3A filesystem.
What's Changed
- Fix S3A timeout issue and avoid memory leak. by @pspoerri in #75
- Increase default maxConcurrencyTask to 10. by @pspoerri in #76
- Improve documentation and add a tuning guide and disable fallback fetch in the benchmarks. by @pspoerri in #77
- Bump the default version to 0.9.4 by @pspoerri in #78
- Revert: Modify the multipart upload size. by @pspoerri in #79
Full Changelog: v0.9.3...v0.9.4
Automatically adapt shuffle fetch concurrency based on I/O wait time.
What's Changed
- Increase example block size to 128 MiB. by @pspoerri in #67
- Dynamically adapt the number of threads based on the I/O wait time. by @pspoerri in #68
- Measure I/O statistics when writing shuffle data. by @pspoerri in #69
- Enable Prometheus integration for non-NFS examples. by @pspoerri in #71
- Improve shuffle storage path for efficient lookup and delete. by @pspoerri in #70
- Enable Spark fetching mechanism (optional) and improve reading/writing of index and checksum files. by @pspoerri in #72
- Bump Version in the Dockerfiles and Config to 0.9.3. by @pspoerri in #73
- Log Stage and Task ID to understand I/O bottlenecks. by @pspoerri in #74
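The idea behind #68, adapting the thread count to the observed I/O wait time, can be sketched as a simple hill-climbing controller. The release notes do not describe the actual heuristic, so everything here is an assumption: the class `AdaptiveConcurrency`, the one-thread step size, and the rule of reversing direction whenever the mean wait gets worse are all hypothetical.

```java
// Hypothetical controller for illustration; not the plugin's implementation.
public class AdaptiveConcurrency {
    private final int min;
    private final int max;
    private int threads;
    private int direction = 1;                        // +1 grows the pool, -1 shrinks it
    private double lastMeanWait = Double.MAX_VALUE;   // no measurement yet

    public AdaptiveConcurrency(int min, int max, int initial) {
        this.min = min;
        this.max = max;
        this.threads = Math.max(min, Math.min(max, initial));
    }

    // Feed the mean I/O wait of the last measurement window (any time unit).
    // If the wait got worse since the previous window, reverse direction.
    // Then step by one thread, clamped to [min, max].
    public int update(double meanWait) {
        if (meanWait > lastMeanWait) {
            direction = -direction;
        }
        lastMeanWait = meanWait;
        threads = Math.max(min, Math.min(max, threads + direction));
        return threads;
    }

    public int threads() {
        return threads;
    }

    public static void main(String[] args) {
        AdaptiveConcurrency c = new AdaptiveConcurrency(1, 10, 2);
        double[] waits = {100.0, 80.0, 120.0, 90.0};
        for (double w : waits) {
            System.out.println("wait=" + w + " threads=" + c.update(w));
        }
    }
}
```

A caller would measure the wait time per fetch window, call `update`, and resize its thread pool to the returned value. A production heuristic would likely smooth the measurements to avoid oscillating on noise.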
Full Changelog: v0.9.2...v0.9.3
v0.9.2
Remove unused configuration options.
What's Changed
- Remove unused configuration options. Fix NFS example config. Create test harness. by @pspoerri in #65
Full Changelog: v0.9...v0.9.1
Buffer streams in parallel.
What's Changed
- WIP: Fix performance regression and rename maxBufferSize to bufferInputSize. by @pspoerri in #54
- Enable local testing. by @pspoerri in #56
- Buffer streams in parallel up to a threshold. by @pspoerri in #57
- Configure readahead if the filesystem supports it. by @pspoerri in #62
- Enable tests if the Scala version is 2.12. by @pspoerri in #63
Full Changelog: v0.8.2...v0.9
v0.9-spark3.1
0.9 release for Spark 3.1. The features
were omitted.
What's Changed
- Make Spark-3.1 branch consistent with main. by @pspoerri in #42
- Update Spark-3.1 branch to 0.9 by @pspoerri in #64
Full Changelog: v0.8-spark3.1...v0.9-spark3.1
Allow configuration of buffer sizes to optimize I/O on distributed file systems.
What's Changed
- Use SBT Build Info plugin to populate version number. by @pspoerri in #48
- Integrate a JVM-Profiler. by @pspoerri in #49
- Allow configuration of buffer sizes to optimize I/O on distributed file systems. by @pspoerri in #50
- Fix Scala 2.13 builds in SBT. by @pspoerri in #52
Full Changelog: v0.8.1...v0.8.2
Fix MetadataFetchFailedException when pods are crashing.
What's Changed
- Register blocks in FallbackStorage by default to avoid the MetadataFetchFailedException. by @pspoerri in #45
- Disable caching when listing shuffle indices. by @pspoerri in #44
- Enable/disable caching of partition lengths with a configuration variable by @pspoerri in #46
Full Changelog: v0.8...v0.8.1