Process running ETL pipeline 0.0.0 (fr/en)

Scraping script for https://books.toscrape.com/ (beta version). All details are in the mission folder.

How to run the Python script locally (English translation of the French instructions)

1. Open a terminal: "PowerShell" on Windows or "Terminal" on Mac

2. Move to a working directory (e.g. "My Documents"):

# navigating in a terminal:
  pwd               # print the working directory
  ls                # list the contents of the directory
  cd ..             # go up to the parent folder
  cd 'folder_name'  # move into a child folder

3. Create a new folder "etl_script" to hold the Python ETL script, using the "mkdir" command:

  mkdir etl_script

4. Move into the "etl_script" directory using the "cd" command:

  cd etl_script

5. Clone the repo by entering the following command, then move into the cloned folder (git names it OC_P2 after the repository URL):

  git clone https://github.com/Nidal94320/OC_P2.git
  cd OC_P2

6. Create a virtual environment with the command:

  python -m venv env

7. Activate the virtual environment you just created:

  env/Scripts/activate      # on Windows
  source env/bin/activate   # on Mac

8. Install the Python packages:

  pip install -r requirements.txt

9. Run the script by entering the command "python main.py" (allow 5 to 10 minutes for it to finish):

  python main.py

10. Access the data (both paths are relative to the repository folder):

  cd data     # book data for each category, in CSV format
  cd data/img # book images

How to run the Python script locally (in English)

1. Open a shell ("PowerShell" on Windows, "Terminal" on Mac)

2. Navigate to a working directory (e.g. "My Documents"):

# to navigate in a terminal:
  pwd               # print working directory
  ls                # list the folder's contents
  cd ..             # navigate to the parent folder
  cd 'folder_name'  # navigate to a child folder

3. Create a new folder "etl_script" to hold the repo:

  mkdir etl_script

4. Navigate to etl_script/:

  cd etl_script

5. Clone the repo, then move into the cloned folder (git names it OC_P2 after the repository URL):

  git clone https://github.com/Nidal94320/OC_P2.git
  cd OC_P2

6. Create a new virtual environment with the command:

  python -m venv env

7. Activate the virtual environment you just created:

  env/Scripts/activate      # on Windows
  source env/bin/activate   # on Mac

8. Install the required Python packages:

  pip install -r requirements.txt

9. Run the script (allow 5 to 10 minutes for it to finish):

  python main.py

10. Get the data (both paths are relative to the repository folder):

  cd data     # book data as CSV files (more info on reading CSV files: https://www.youtube.com/watch?v=XsTvCcejcYE)
  cd data/img # book images
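
Once the script has finished, the CSV files can also be inspected from Python. Below is a minimal sketch, assuming the files in data/ are comma-separated, UTF-8 encoded, and begin with a header row (none of which this README states). `summarize_csv_files` is a hypothetical helper, not part of the repository.

```python
import csv
from pathlib import Path


def summarize_csv_files(data_dir: Path) -> dict[str, int]:
    """Map each CSV file name in data_dir to its number of data rows."""
    summary = {}
    for csv_path in sorted(data_dir.glob("*.csv")):
        # newline="" is the csv module's recommended way to open files.
        with csv_path.open(newline="", encoding="utf-8") as handle:
            # DictReader treats the first row as the header (an assumption here).
            reader = csv.DictReader(handle)
            summary[csv_path.name] = sum(1 for _ in reader)
    return summary
```

Run from the repository folder, `summarize_csv_files(Path("data"))` would return one entry per CSV file, which is a quick way to check that every category was exported.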

Feedback/Question

If you have any feedback or questions, please reach out to me at [email protected]
