Home
-
Clone the repo and open it in VSCode.
-
Open the terminal in VSCode and make sure that you are in the folder:
nlp-sdg
-
Create a new Python environment to host the project.
conda create -n nlp-sdg python==3.8.8 -y
-
Activate your conda environment
conda activate nlp-sdg
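To confirm that the activated environment is the one you just created, you can check the interpreter version from Python. This is a minimal sketch, assuming the project targets Python 3.8 as in the conda command above; the function name `is_expected_python` is hypothetical and not part of the project.

```python
import sys

# Assumption: the project targets Python 3.8, matching the conda command above.
EXPECTED_MAJOR_MINOR = (3, 8)


def is_expected_python(version_info=sys.version_info, expected=EXPECTED_MAJOR_MINOR):
    """Return True when the interpreter's major.minor version matches `expected`."""
    return tuple(version_info[:2]) == tuple(expected)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_expected_python():
        print(f"OK: running Python {current}")
    else:
        print(f"Warning: running Python {current}, expected 3.8.x")
```

If the warning appears, the conda environment is probably not active in the current terminal.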
-
Since this is a kedro project, the first thing you will need to do is install kedro in your environment.
pip install kedro
The first time you do this, you may see error messages about missing packages. This is expected: the next step installs them.
-
Now that you have installed kedro, you must install all the project dependencies.
pip install -r src/requirements.txt
-
We are now ready to build and run the pipeline. Before doing that, make sure that you have the Kaggle train.csv dataset under the data/01_raw/ folder.
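A quick way to verify the dataset location before building is a small check like the one below. It is only a sketch, assuming it is run from the project root (the nlp-sdg folder); the helper name `find_missing_raw_files` is hypothetical and not part of the project.

```python
from pathlib import Path

# Assumption: the script is run from the project root (the nlp-sdg folder).
RAW_DATA_DIR = Path("data") / "01_raw"


def find_missing_raw_files(raw_dir=RAW_DATA_DIR, required=("train.csv",)):
    """Return the names of required files that are not present in `raw_dir`."""
    raw_dir = Path(raw_dir)
    return [name for name in required if not (raw_dir / name).is_file()]


if __name__ == "__main__":
    missing = find_missing_raw_files()
    if missing:
        print(f"Missing from {RAW_DATA_DIR}: {', '.join(missing)}")
    else:
        print("train.csv found; ready to build.")
```

If the file is reported as missing, place train.csv in data/01_raw/ before running the build step.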
-
With all your dependencies installed and the training dataset in the correct folder, you can now build the Docker image that will house your kedro project.
kedro docker build
Here is some additional info regarding kedro-docker
-
Now use the command below to run the dummy pipeline:
kedro docker run
-
The pipeline will start running and you should see logs in your terminal. Here is some sample output: