Skip to content

el-tegy/stream_persona

Repository files navigation

stream_persona

Build status Python Version Dependencies Status

Code style: black Security: bandit Pre-commit Semantic Versions License Coverage Report

stream_persona is a comprehensive data engineering solution designed to seamlessly capture, process, store, and analyze data streams. Leveraging state-of-the-art technologies such as Apache Kafka, Apache Airflow, Apache Spark, and Apache Cassandra, this architecture ensures efficient data flow and processing capabilities.

Very first steps

Initialize your code

  1. Initialize git inside your repo:
cd stream_persona && git init
  1. If you don't have Poetry installed run:
make poetry-download
  1. Initialize poetry and install pre-commit hooks:
make install
make pre-commit-install
  1. Run the codestyle:
make codestyle
  1. Upload initial code to GitHub:
git add .
git commit -m ":tada: Initial commit"
git branch -M main
git remote add origin https://github.com/el-tegy/stream_persona.git
git push -u origin main

Set up bots

  • Set up Dependabot to ensure you have the latest dependencies.
  • Set up Stale bot for automatic issue closing.

Poetry

Want to know more about Poetry? Check its documentation.

Details about Poetry

Poetry's commands are very intuitive and easy to learn, like:

  • poetry add numpy@latest
  • poetry run pytest
  • poetry publish --build

etc

Building and releasing your package

Building a new version of the application contains steps:

  • Bump the version of your package poetry version <version>. You can pass the new version explicitly, or a rule such as major, minor, or patch. For more details, refer to the Semantic Versions standard.
  • Make a commit to GitHub.
  • Create a GitHub release.
  • And... publish πŸ™‚ poetry publish --build

🎯 What's next

Well, that's up to you πŸ’ͺ🏻. I can only recommend the packages and articles that helped me.

  • Typer is great for creating CLI applications.
  • Rich makes it easy to add beautiful formatting in the terminal.
  • Pydantic – data validation and settings management using Python type hinting.
  • Loguru makes logging (stupidly) simple.
  • tqdm – fast, extensible progress bar for Python and CLI.
  • IceCream is a little library for sweet and creamy debugging.
  • orjson – ultra fast JSON parsing library.
  • Returns makes you function's output meaningful, typed, and safe!
  • Hydra is a framework for elegantly configuring complex applications.
  • FastAPI is a type-driven asynchronous web framework.

Articles:

πŸš€ Features

Development features

Deployment features

Open source community features

Installation

pip install -U stream_persona

or install with Poetry

poetry add stream_persona

Then you can run

stream_persona --help

or with Poetry:

poetry run stream_persona --help

Makefile usage

Makefile contains a lot of functions for faster development.

1. Download and remove Poetry

To download and install Poetry run:

make poetry-download

To uninstall

make poetry-remove

2. Install all dependencies and pre-commit hooks

Install requirements:

make install

Pre-commit hooks coulb be installed after git init via

make pre-commit-install

3. Codestyle

Automatic formatting uses pyupgrade, isort and black.

make codestyle

# or use synonym
make formatting

Codestyle checks only, without rewriting files:

make check-codestyle

Note: check-codestyle uses isort, black and darglint library

Update all dev libraries to the latest version using one comand

make update-dev-deps
4. Code security

make check-safety

This command launches Poetry integrity checks as well as identifies security issues with Safety and Bandit.

make check-safety

5. Type checks

Run mypy static type checker

make mypy

6. Tests with coverage badges

Run pytest

make test

7. All linters

Of course there is a command to rule run all linters in one:

make lint

the same as:

make test && make check-codestyle && make mypy && make check-safety

8. Docker

make docker-build

which is equivalent to:

make docker-build VERSION=latest

Remove docker image with

make docker-remove

More information about docker.

9. Cleanup

Delete pycache files

make pycache-remove

Remove package build

make build-remove

Delete .DS_STORE files

make dsstore-remove

Remove .mypycache

make mypycache-remove

Or to remove all above run:

make cleanup

πŸ“ˆ Releases

You can see the list of available releases on the GitHub Releases page.

We follow Semantic Versions specification.

We use Release Drafter. As pull requests are merged, a draft release is kept up-to-date listing the changes, ready to publish when you’re ready. With the categories option, you can categorize pull requests in release notes using labels.

List of labels and corresponding titles

Label Title in Releases
enhancement, feature πŸš€ Features
bug, refactoring, bugfix, fix πŸ”§ Fixes & Refactoring
build, ci, testing πŸ“¦ Build System & CI/CD
breaking πŸ’₯ Breaking Changes
documentation πŸ“ Documentation
dependencies ⬆️ Dependencies updates

You can update it in release-drafter.yml.

GitHub creates the bug, enhancement, and documentation labels for you. Dependabot creates the dependencies label. Create the remaining labels on the Issues tab of your GitHub repository, when you need them.

πŸ›‘ License

License

This project is licensed under the terms of the MIT license. See LICENSE for more details.

πŸ“ƒ Citation

@misc{stream_persona,
  author = {Elisee TEGUE},
  title = {stream_persona is a comprehensive data engineering solution designed to seamlessly capture, process, store, and analyze data streams. Leveraging state-of-the-art technologies such as Apache Kafka, Apache Airflow, Apache Spark, and Apache Cassandra, this architecture ensures efficient data flow and processing capabilities.},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/el-tegy/stream_persona}}
}

Credits πŸš€ Your next Python package needs a bleeding-edge project structure.

This project was generated with python-package-template

About

No description, website, or topics provided.

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •