A naive Python implementation (no distributed computing) to mimic and understand the MapReduce paradigm.
MapReduce is a programming model for processing and generating big data sets with a parallel, distributed algorithm on a cluster. A MapReduce system is usually composed of three functions (or steps), illustrated by the sketch after this list:
- Map: The map function, also referred to as the map task, processes a single key/value input pair and produces a set of intermediate key/value pairs.
- Shuffle: The shuffle function transfers the intermediate data from the mappers to the reducers. It is a mandatory step, since the grouped output of the shuffle serves as the input for the reduce tasks.
- Reduce: The reduce function, also referred to as the reduce task, consists of taking all key/value pairs produced in the map phase that share the same intermediate key and producing zero, one, or more data items.
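A minimal single-process word-count sketch of the three steps (function names, type hints, and the sample documents are illustrative, not the exact code in this repository):

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple


def map_function(document: str) -> Iterator[Tuple[str, int]]:
    # Map: emit an intermediate (word, 1) pair for every word in the document.
    for word in document.lower().split():
        yield word, 1


def shuffle_function(pairs: Iterator[Tuple[str, int]]) -> Dict[str, List[int]]:
    # Shuffle: group every intermediate value under its key, so each reduce
    # call sees all the values emitted for one word.
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_function(key: str, values: List[int]) -> Tuple[str, int]:
    # Reduce: collapse the list of values for a key into a single count.
    return key, sum(values)


documents = ["the quick brown fox", "the lazy dog", "the quick dog"]
intermediate = (pair for doc in documents for pair in map_function(doc))
word_counts = dict(
    reduce_function(key, values)
    for key, values in shuffle_function(intermediate).items()
)
print(word_counts)
# {'the': 3, 'quick': 2, 'brown': 1, 'fox': 1, 'lazy': 1, 'dog': 2}
```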
Use the Pipfile to install packages in the virtualenv:
pipenv install
pipenv install --dev
Run the MapReduce example:
pipenv run wordcount
Run unit and integration tests:
pipenv run test
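For illustration, a pytest-style test for a word-count implementation like the sketch above might look as follows (the word_count module and function names are assumptions for the example, not this repository's actual layout):

```python
# Hypothetical end-to-end test; module and function names are assumed.
# Run with: pipenv run test
from word_count import map_function, reduce_function, shuffle_function


def test_word_count_end_to_end():
    pairs = map_function("to be or not to be")
    grouped = shuffle_function(pairs)
    counts = dict(reduce_function(key, values) for key, values in grouped.items())
    assert counts == {"to": 2, "be": 2, "or": 1, "not": 1}
```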
- Python | Programming language
- Pipenv | Dependency management
- Pytest | Testing
- pre-commit | Managing and maintaining hooks
- GitHub Actions | CI/CD
- clean-text | Data cleaning (see the sketch below)
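For example, clean-text can normalize raw documents before the map step; a minimal sketch (the exact cleaning options used in this project are an assumption):

```python
from cleantext import clean

raw_document = "The QUICK brown fox!!! Jumped, over the lazy dog."
# Lowercase the text and strip punctuation before mapping; these options
# are illustrative and may differ from the ones used in this repository.
normalized = clean(raw_document, lower=True, no_punct=True)
print(normalized)  # roughly: "the quick brown fox jumped over the lazy dog"
```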
- Made with ❤️ by @vittoriopolverino