rezAWARE

This README is for projects that use the rezaware platform for AI/ML and augmented BI pipelines. The platform is designed to integrate data wrangling, mining, and visualization into a single coherent project. Here we introduce ways of getting started with the platform framework. See the WIKI for comprehensive documentation on the rezaware methodology, functional components, and behaviour.

NOTE: the instructions and content are specific to Debian distros and were tested on Ubuntu 20.04.


Starting a New Project

  1. Create an empty git repository with the desired project name; e.g., MyNewProj.
    • This presupposes that you have git installed and initialized on your computer.
  2. Clone MyNewProj into a desired directory location; for example
    • cd ~/all_rez_projects/
    • git clone https://github.com/<my_git_user_name>/MyNewProj.git
  3. Move into the newly created project folder
    • cd ~/all_rez_projects/MyNewProj
  4. Now clone and initialize rezaware platform as a submodule
    • git submodule add -b main https://github.com/waidyanatha/rezaware.git rezaware
    • git submodule init; this copies the mapping from the .gitmodules file into the local ./.git/config file
  5. Navigate into the rezaware folder and run setup to initialize the project with AI/ML functional app classes
    • cd rezaware
    • Run the setup for rezaware and the setup for the apps as two separate commands
      • python3 -m 000_setup --app=rezaware --with_ini_files; the --with_ini_files flag is important because it instructs 000_setup.py to build the rezaware app together with the python __init__.py and app.ini files needed for seamless package integration
      • python3 -m 000_setup; at the outset the wrangler, mining, and visuals modules folders contain no code, so the python __init__.py and app.ini files cannot be built yet. Without the --with_ini_files flag, the process simply generates the app folder structure and a default app.cfg file.
    • You have now created MyNewProj with the rezaware platform framework and can begin coding.
    • Note that you need to configure app.cfg in the mining, wrangler, and visuals apps
      • each time you add or remove a module package, app.cfg must be updated to match
      • any other project-specific parameters must also be changed.
  6. Change back to the project directory
    • cd .. or cd ~/all_rez_projects/MyNewProj
  7. Add the submodule and initialize
    • git add .gitmodules rezaware/
    • git init
  8. Install dependencies with python poetry.
    • The pyproject.toml file is created by the previous 000_setup.py step
    • poetry --version will confirm if poetry dependency manager is installed
    • If required, follow the poetry installation docs
    • Create the lock file with poetry lock
    • Install dependencies with poetry install
    • Confirm the installation and environment with poetry shell; it opens a shell in the default virtual environment, shown as (rezaware-py3.10)
  9. (Optional) Include a README.md file, if one does not already exist
    • echo "# Welcome to MyNewProj" >> README.md
  10. Add and commit all newly created files and folders in MyNewProj
    • git add .
    • git commit -m "added rezaware submodule and setup project"
  11. Push the submodule and new commits to the repo
    • git push origin main
    • Check your github project in the browser; you will see a folder rezaware @ xxxxxxx, where xxxxxxx is the first 7 characters of the rezaware.git commit hash
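
Before committing in step 10, it can help to confirm that the submodule mapping recorded in .gitmodules is the one you intended. The following is a minimal sketch using only the Python standard library; the helper name read_submodules is hypothetical (not part of rezaware), and SAMPLE is an assumption about what git submodule add -b main typically writes.

```python
import configparser


def read_submodules(gitmodules_text):
    """Return {submodule_name: {key: value}} parsed from .gitmodules text."""
    # git indents the keys; strip every line so that configparser never
    # treats an indented key as a continuation of the previous value.
    cleaned = "\n".join(line.strip() for line in gitmodules_text.splitlines())
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cleaned)

    submodules = {}
    for section in parser.sections():
        # Section headers look like: submodule "rezaware"
        if section.startswith("submodule "):
            name = section[len("submodule "):].strip('"')
            submodules[name] = dict(parser[section])
    return submodules


# ASSUMED sample content; in a real project read the .gitmodules file instead.
SAMPLE = '''[submodule "rezaware"]
\tpath = rezaware
\turl = https://github.com/waidyanatha/rezaware.git
\tbranch = main
'''

if __name__ == "__main__":
    for name, settings in read_submodules(SAMPLE).items():
        print(name, settings)
```

In a project folder you would pass the file content instead of SAMPLE, for example read_submodules(open(".gitmodules").read()).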

Test the new Project

Run pytest by executing the command in your terminal prompt

  • pytest
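
By default pytest collects files named test_*.py or *_test.py and runs the functions in them whose names start with test_. As an illustration only, a test file could look like the sketch below; the file location and the helper clean_column_names are hypothetical and do not come from rezaware.

```python
# Hypothetical file: wrangler/tests/test_example.py


def clean_column_names(names):
    """Lower-case column names and replace inner spaces with underscores."""
    return [name.strip().lower().replace(" ", "_") for name in names]


def test_clean_column_names():
    # pytest reports a failure if this assertion does not hold
    assert clean_column_names([" Room Rate ", "City"]) == ["room_rate", "city"]
```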

Update rezaware from remote repo

From time to time you will need to update the rezaware submodule in your project.

  1. Change your directory to the MyNewProj folder
    • cd ~/all_rez_projects/MyNewProj
  2. Fetch the latest changes from the rezaware.git repository and merge them into the current MyNewProj branch.
    • git submodule update --remote --merge
  3. Update the repo in github:
    • git commit -s -am "updating rezaware submodule with latest"
    • git push origin main
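
If you update the submodule regularly, the three commands above can be wrapped in a small script. This is a sketch under stated assumptions, not a rezaware utility: the function name update_rezaware and its dry_run parameter are hypothetical, and the git commands are exactly those listed in the steps.

```python
import subprocess


def update_rezaware(project_dir, dry_run=True):
    """Return the git commands for the update; run them when dry_run is False."""
    commands = [
        ["git", "submodule", "update", "--remote", "--merge"],
        ["git", "commit", "-s", "-am", "updating rezaware submodule with latest"],
        ["git", "push", "origin", "main"],
    ]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError at the first failing command
            subprocess.run(cmd, cwd=project_dir, check=True)
    return commands


if __name__ == "__main__":
    # Dry run only: print what would be executed in the project folder.
    for cmd in update_rezaware("."):
        print(" ".join(cmd))
```

Note that git commit exits with a non-zero status when there is nothing to commit, so with dry_run=False the script stops before the push if the submodule was already up to date.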

Reconfiguring existing project

When you add a new module package to the mining, wrangler, or visuals app folders and define it in the app.cfg file, the __init__.py and app.ini framework files need to be updated. To do so, simply run 000_setup.py again:

  • cd ~/all_rez_projects/MyNewProj/rezaware; navigate into the rezaware folder
  • python3 -m 000_setup --with_ini_files; this re-configures all the apps
  • Alternatively, python3 -m 000_setup --app=wrangler,mining re-configures only the specified apps
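
Before re-running the setup, you may want to check what an app.cfg currently declares. The sketch below assumes that app.cfg uses INI-style syntax that Python's configparser can read; the helper name load_app_cfg and every section and key name in SAMPLE_CFG are hypothetical placeholders, not the real rezaware settings.

```python
import configparser


def load_app_cfg(cfg_text):
    """Return the config as {section: {key: value}}."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return {section: dict(parser[section]) for section in parser.sections()}


# HYPOTHETICAL content; the real app.cfg sections and keys may differ.
SAMPLE_CFG = """
[MODULES]
etl = loader,cleaner

[LOGGER]
level = INFO
"""

if __name__ == "__main__":
    for section, pairs in load_app_cfg(SAMPLE_CFG).items():
        print(section, pairs)
```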

About the Post Setup Artifacts

  1. Mining - Artificial Intelligence (AI) and Machine Learning (ML) analytical methods
  2. Wrangler - automated pipelines for data extract, transform, and load (ETL) processing
  3. Visuals - interactive dashboards with visual analytics for Business Intelligence (BI)
  4. utils.py - contains a set of framework functions useful for all apps
  5. app.cfg - defines the app-specific configuration as section-wise key/value pairs
  6. Folders - each of the mining, wrangler, and visuals folders will contain a set of subfolders
    • dags - organizing airflow or other scheduler pipelines scripts
    • data - specific parametric data and tmp files
    • db - database scripts for creating the schema, tables, and initial data
    • logs - log files created by each module package
    • modules - managing the package functional class libraries
    • notebooks - jupyter notebooks for developing and testing pipeline scripts
    • tests - pytest scripts for applying unit & functional tests for any of the packages
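
To verify that the setup produced the folder layout listed above, a short check such as the following can be used. It is a sketch only: the function name missing_folders is hypothetical, and the root directory that holds the mining, wrangler, and visuals folders is left as a parameter because it depends on your project.

```python
from pathlib import Path

APPS = ("mining", "wrangler", "visuals")
SUBFOLDERS = ("dags", "data", "db", "logs", "modules", "notebooks", "tests")


def missing_folders(root):
    """Return the expected app subfolders that do not exist under root."""
    root = Path(root)
    return [
        str(Path(app) / sub)
        for app in APPS
        for sub in SUBFOLDERS
        if not (root / app / sub).is_dir()
    ]


if __name__ == "__main__":
    # Assumes the current directory holds the three app folders.
    for folder in missing_folders("."):
        print("missing:", folder)
```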

Deprecated

  1. (Recommended) you may also consider setting up an Anaconda environment with python-3.8.10 to avoid any distutils issues.
    • create a new environment using the requirements.txt file that is in the rezaware folder:
      • conda create --name rezenv python=3.8.10 --file requirements.txt
    • Thereafter, check that all packages listed in requirements.txt were installed
      • conda list will print a list to stdout
    • Activate your conda environment
      • e.g. conda activate rezenv