Ecotaxa Webscraping

Description

This repository contains scripts and tools for web scraping and API interaction with the Ecotaxa platform. The primary goal is to automate the extraction of plankton and other microorganism data, specifically image metadata, to facilitate research and analysis.

Ecotaxa is a web application dedicated to the visual exploration and management of planktonic data. Accessing the platform programmatically requires an understanding of the Ecotaxa API, its authentication mechanisms, and data extraction techniques. This project aims to package these aspects into a user-friendly set of scripts.

Getting Started

Prerequisites

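Required third-party Python libraries: requests, beautifulsoup4, selenium, tqdm. The json, csv, and os modules are also used, but they ship with the Python standard library and need no separate installation.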
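
Before running the scripts, it can help to confirm that the third-party libraries are importable. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the only non-obvious detail is that beautifulsoup4 is imported under the name `bs4`.

```python
from importlib.util import find_spec

# Import names of the third-party prerequisites; beautifulsoup4 is imported as bs4.
REQUIRED_MODULES = ["requests", "bs4", "selenium", "tqdm"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names in `modules` that cannot be found for import."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All prerequisites are importable.")
```

Any module reported as missing can then be installed with pip under its package name (for example, `beautifulsoup4` for `bs4`).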

Usage

Clone this repository to your local machine:

git clone https://github.com/PlanktoScope/Ecotaxa-webscraping.git

Navigate to the cloned directory:
```
cd Ecotaxa-webscraping
```
Web scraping using the Ecotaxa API (specify the project ID and your own Ecotaxa credentials):
```
ecotaxa_api_history.py
```
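
The authentication step that an API-based script depends on can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code of `ecotaxa_api_history.py`: it assumes the public instance at `https://ecotaxa.obs-vlfr.fr/api`, a `POST /login` endpoint that accepts a JSON body with `username` and `password` and returns a token as a JSON string, and Bearer-token authorization on later calls. The environment variable names `ECOTAXA_USER` and `ECOTAXA_PASSWORD` and both function names are hypothetical. The session object is passed in by the caller, so any object with a `requests.Session`-like interface works.

```python
import os

# Assumed base URL of the public Ecotaxa instance; adjust for another server.
API_BASE = "https://ecotaxa.obs-vlfr.fr/api"


def read_credentials(env=os.environ):
    """Read credentials from environment variables (hypothetical names)."""
    return env["ECOTAXA_USER"], env["ECOTAXA_PASSWORD"]


def login(session, username, password, base_url=API_BASE):
    """Request a token and attach it to the session as a Bearer header.

    Assumes POST {base_url}/login takes a JSON body with `username` and
    `password` and answers with the token as a JSON string.
    """
    response = session.post(
        f"{base_url}/login",
        json={"username": username, "password": password},
        timeout=30,
    )
    response.raise_for_status()  # stop early on bad credentials or server errors
    token = response.json()
    session.headers.update({"Authorization": f"Bearer {token}"})
    return token


# Example usage (requires the `requests` package and real credentials):
#
#     import requests
#     with requests.Session() as session:
#         login(session, *read_credentials())
#         # later calls made through `session` carry the Authorization header
```

Reading credentials from the environment rather than hard-coding them keeps them out of version control.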
Web scraping using Selenium (specify the project ID and your own Ecotaxa credentials):
```
ecotaxa_scraping_v3.py
```
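
The browser-based approach can be sketched in a similar way. This is a simplified illustration, not the logic of `ecotaxa_scraping_v3.py`: the project URL pattern `https://ecotaxa.obs-vlfr.fr/prj/<project ID>` is an assumption, the function and class names are hypothetical, and the login step and any waiting for dynamically loaded content are omitted. To stay dependency-free, the HTML is parsed with the standard library's `HTMLParser`; BeautifulSoup would serve equally well.

```python
from html.parser import HTMLParser

# Assumed URL pattern of a project page on the public Ecotaxa instance.
PROJECT_URL = "https://ecotaxa.obs-vlfr.fr/prj/{project_id}"


class ImageSrcCollector(HTMLParser):
    """Collect the `src` attribute of every <img> tag that has one."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def extract_image_sources(html):
    """Return the image URLs found in an HTML string, in document order."""
    collector = ImageSrcCollector()
    collector.feed(html)
    return collector.sources


def scrape_project_images(driver, project_id):
    """Load a project page in a Selenium-like driver and list its image URLs."""
    driver.get(PROJECT_URL.format(project_id=project_id))
    return extract_image_sources(driver.page_source)


# Example usage (requires the `selenium` package and a browser driver;
# 1234 is a placeholder project ID):
#
#     from selenium import webdriver
#     driver = webdriver.Chrome()
#     try:
#         print(scrape_project_images(driver, project_id=1234))
#     finally:
#         driver.quit()
```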

License

This project is licensed under the Apache-2.0 license.

Copyright Wassim Chakroun and PlanktoScope project contributors.
