Fix/SK-1060 | Move mnist data to scaleout public bucket #712
This pull request includes significant changes to the data handling and dependencies in the `examples/mnist-pytorch/client` project. The most important changes involve replacing the use of `torchvision` with direct data downloads, updating the data loading logic, and modifying the environment dependencies.

Data Handling Improvements:

- `examples/mnist-pytorch/client/data.py`: Replaced `torchvision` dataset download with direct data download using `requests`. This includes setting a random split ID and downloading the corresponding data file from Scaleout's own public bucket.
- `examples/mnist-pytorch/client/data.py`: Updated the `load_data` function to dynamically find the correct data split ID and verify the existence of the data file.
- `examples/mnist-pytorch/client/data.py`: Removed the `splitset` and `split` functions, as the data splitting is now handled by the pre-downloaded data files.

Dependency Updates:

- `examples/mnist-pytorch/client/python_env.yaml`: Removed `torchvision` from the dependencies and updated `numpy` versions to ensure compatibility with different platforms and Python versions.
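
For reviewers who want a feel for the data-handling flow described above, here is a minimal sketch, not the code from this PR. The bucket URL, the number of splits, the `<split_id>.pt` file naming, and the helper names `split_filename`, `download_split`, and `find_split_file` are all assumptions made for illustration; only the use of `requests`, the random split ID, and the existence check come from the description.

```python
import random
from pathlib import Path
from typing import Optional

# Hypothetical values: the PR description does not state the bucket URL,
# the number of available splits, or how the files are named.
BASE_URL = "https://example.com/mnist-splits"
N_SPLITS = 10


def split_filename(split_id: int) -> str:
    """Build the (assumed) file name for a given split ID."""
    return f"{split_id}.pt"


def download_split(data_dir: Path, split_id: Optional[int] = None) -> Path:
    """Download one pre-split data file, picking a random split if none is given."""
    # Imported here so the path helpers below work without requests installed.
    import requests

    if split_id is None:
        split_id = random.randint(0, N_SPLITS - 1)

    data_dir.mkdir(parents=True, exist_ok=True)
    target = data_dir / split_filename(split_id)

    response = requests.get(f"{BASE_URL}/{split_filename(split_id)}", timeout=30)
    response.raise_for_status()  # fail loudly on a missing or forbidden file
    target.write_bytes(response.content)
    return target


def find_split_file(data_dir: Path) -> Path:
    """Locate the downloaded split file and verify that it exists."""
    candidates = sorted(data_dir.glob("*.pt"))
    if not candidates:
        raise FileNotFoundError(f"No split file found in {data_dir}")
    return candidates[0]
```

In this sketch a loader would call `find_split_file` first and fail early if the download step has not been run, which mirrors the existence check the description attributes to `load_data`.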