Update docs #872

Merged
merged 11 commits into from
Jan 8, 2025
24 changes: 23 additions & 1 deletion docs/getting_started/comparison.rst
@@ -17,4 +17,26 @@ There are many open-source projects for training machine learning models. We see

`Torchgeo <https://github.com/microsoft/torchgeo>`_ is a Python library written by developers at Microsoft to help automate remote sensing machine learning. Torchgeo has general structures, but the documents and general structure are focused on raster-based remote sensing, especially using earth-facing satellite data. Torchgeo has a number of useful datasets and curates pretrained models for remote sensing applications. The Torchgeo audience is generally more experienced with machine learning than the average DeepForest user.

We hope to continue to connect with both Roboflow and Torchgeo to improve interoperability among all model types and training. The future of open-source depends on collaboration, and we welcome users from all packages to submit ideas on how best to serve the community and reduce any duplication and wasted effort. There are many packages that hold useful individual models (e.g., `DetectTree2 <https://github.com/PatBall1/detectree2>`_) related to individual scientific publications. Our hope with DeepForest is to wrap general routines beyond individual research projects to make machine learning applications to environmental monitoring easier.
We hope to continue to connect with both Roboflow and Torchgeo to improve interoperability among all model types and training. The future of open-source depends on collaboration, and we welcome users from all packages to submit ideas on how best to serve the community and reduce any duplication and wasted effort. There are many packages that hold useful individual models (e.g., `DetectTree2 <https://github.com/PatBall1/detectree2>`_) related to individual scientific publications. Our hope with DeepForest is to wrap general routines beyond individual research projects to make machine learning applications to environmental monitoring easier.

Similar tools
-------------

There are many open-source projects for training machine learning models. We see DeepForest as a complement to many existing and excellent packages.

* Roboflow

The `supervision <https://supervision.roboflow.com/latest/>`_, `inference <https://inference.roboflow.com/>`_ and related packages within Roboflow's ecosystem are well executed and used throughout DeepForest. The inference engine underlying Roboflow requires a connection to Roboflow, a computer vision software company; this requires an API key and comes with a range of commercial and license structures. We think of DeepForest as a small set of curated models targeted towards the ecological and environmental monitoring community. Finding robust models among the thousands of Roboflow projects is challenging. Roboflow is designed to be an all-encompassing ecosystem, whereas DeepForest is intentionally small and aimed at existing pipelines.

* Torchgeo

`Torchgeo <https://github.com/microsoft/torchgeo>`_ is a Python library written by developers at Microsoft to help automate remote sensing machine learning. Torchgeo has general structures, but the documents and general structure are focused on raster-based remote sensing, especially using earth-facing satellite data. Torchgeo has a number of useful datasets and curates pretrained models for remote sensing applications. The Torchgeo audience is generally more experienced with machine learning than the average DeepForest user.

We hope to continue to connect with both Roboflow and Torchgeo to improve interoperability among all model types and training. The future of open-source depends on collaboration, and we welcome users from all packages to submit ideas on how best to serve the community and reduce any duplication and wasted effort. There are many packages that hold useful individual models (e.g., `DetectTree2 <https://github.com/PatBall1/detectree2>`_) related to individual scientific publications. Our hope with DeepForest is to wrap general routines beyond individual research projects to make machine learning applications to environmental monitoring easier.

* AIDE

`AIDE <https://github.com/microsoft/aerial_wildlife_detection>`_ is two things in one: a tool for manually annotating images and a tool for training and running machine (deep) learning models. Those two things are coupled in an active learning loop: the human annotates a few images, the system trains a model, that model is used to make predictions and to select more images for the human to annotate, etc.

More generally, AIDE is a modular Web framework for labeling image datasets with AI assistance. AIDE is configurable for a variety of tasks, but it is particularly intended for ecological applications, such as accelerating wildlife surveys that use aerial images.

AIDE was developed by B. Kellenberger, and while it hasn't been updated in a while, it's still a great tool.
6 changes: 6 additions & 0 deletions docs/getting_started/install.rst
@@ -7,6 +7,12 @@ DeepForest has Windows, Linux and OSX prebuilt wheels on pypi. We
*strongly* recommend using a conda or virtualenv to create a clean
installation container.

For example:

::

    conda create -n DeepForest python=3.11
    conda activate DeepForest

::

    pip install DeepForest
12 changes: 4 additions & 8 deletions docs/getting_started/intro_tutorials/02_model_loader.md
@@ -26,23 +26,19 @@ The `load_model` function loads a pretrained model from Hugging Face using the r

### Example Usage

#### Load a Model
#### Load a Model and Predict an Image

```python
from deepforest import main
from deepforest import get_data
import matplotlib.pyplot as plt

from deepforest.visualize import plot_results
# Initialize the model class
model = main.deepforest()

# Load a pretrained tree detection model from Hugging Face
model.load_model(model_name="weecology/deepforest-tree", revision="main")

sample_image_path = get_data("OSBS_029.png")
img = model.predict_image(path=sample_image_path, return_plot=True)

plt.imshow(img[:,:,::-1])
plt.show()

img = model.predict_image(path=sample_image_path)
plot_results(img)
```
18 changes: 6 additions & 12 deletions docs/getting_started/intro_tutorials/03_use_pretrained_model.rst
@@ -3,24 +3,18 @@ How do I use a pretrained model to predict an image?

.. code-block:: python
from deepforest import main, get_data
import matplotlib.pyplot as plt
# Initialize the model
from deepforest import main
from deepforest import get_data
from deepforest.visualize import plot_results
# Initialize the model class
model = main.deepforest()
# Load a pretrained tree detection model from Hugging Face
model.load_model(model_name="weecology/deepforest-tree", revision="main")
# Get the sample image path and predict image
sample_image_path = get_data("OSBS_029.png")
img = model.predict_image(path=sample_image_path, return_plot=True)
# predict_image returns plot in BlueGreenRed (opencv style), but matplotlib likes RedGreenBlue
# Switch the channel order for correct display
plt.imshow(img[:,:,::-1])
plt.show()
img = model.predict_image(path=sample_image_path)
plot_results(img)
.. image:: ../../../www/getting_started1.png
:align: center
@@ -4,7 +4,7 @@ How do I predict on large geospatial tiles?
Predict a tile
~~~~~~~~~~~~~~

Large tiles covering wide geographic extents cannot fit into memory during prediction and would yield poor results due to the density of bounding boxes. Often provided as geospatial .tif files, remote sensing data is best suited for the ``predict_tile`` function, which splits the tile into overlapping windows, performs prediction on each of the windows, and then reassembles the resulting annotations.
Large tiles covering wide geographic extents cannot fit into memory during prediction and would yield poor results due to the density of bounding boxes. Often provided as geospatial .tif files, remote sensing data is best suited for the ``predict_tile`` function, which splits the tile into overlapping windows, performs prediction on each of the windows, and then reassembles the resulting annotations. Overlapping detections are removed based on the ``iou_threshold`` parameter.

Let’s show an example with a small image. For larger images, patch_size should be increased.
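
As a rough illustration of the windowing and overlap-removal idea described above, here is a minimal pure-Python sketch. It is not DeepForest's implementation: the helper names `sliding_windows`, `iou`, and `suppress_overlaps` are hypothetical, the `patch_overlap` fraction is an assumption made for this example, and the real `predict_tile` may compute windows and resolve duplicates differently. It only shows why a tree detected in two overlapping windows ends up as a single box once an IoU threshold is applied.

```python
def sliding_windows(width, height, patch_size, patch_overlap):
    """Return (x, y, w, h) windows covering a width x height image."""
    # Distance between window origins; the overlap is a fraction of patch_size.
    stride = max(1, int(patch_size * (1 - patch_overlap)))
    windows = []
    for y in range(0, height, stride):
        for x in range(0, width, stride):
            # Windows at the right and bottom edges are clipped to the image.
            w = min(patch_size, width - x)
            h = min(patch_size, height - y)
            windows.append((x, y, w, h))
    return windows


def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def suppress_overlaps(boxes, scores, iou_threshold):
    """Keep the highest-scoring box among any group overlapping above the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return [boxes[i] for i in kept]


windows = sliding_windows(width=1000, height=1000, patch_size=400, patch_overlap=0.25)

# Made-up detections: the first two boxes are the same tree seen from two
# overlapping windows, the third is a separate tree.
boxes = [(290, 100, 340, 150), (292, 102, 342, 152), (600, 600, 650, 650)]
scores = [0.9, 0.6, 0.8]
merged = suppress_overlaps(boxes, scores, iou_threshold=0.15)
print(len(windows), merged)
```

A lower threshold removes more near-duplicates but can also drop genuinely adjacent objects, which is why the threshold is exposed as a parameter.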

4 changes: 2 additions & 2 deletions docs/getting_started/intro_tutorials/index.rst
@@ -12,7 +12,7 @@ Getting started tutorials
.. toctree::
:maxdepth: 1

01_load_sample_data
02_model_loader.md
03_use_pretrained_model
04_predict_large_tile
04_predict_large_tile
load_sample_data
@@ -1,4 +1,4 @@
How do I use the package Sample data?
How do I use the package sample data?
=====================================

Sample data
22 changes: 3 additions & 19 deletions docs/getting_started/overview.rst
@@ -7,7 +7,7 @@ Package overview
What is DeepForest?
*******************

DeepForest is a python package for training and predicting ecological objects in airborne imagery. DeepForest comes with prebuilt models for immediate use and fine-tuning by annotating and training custom models on your own data. DeepForest models can also be extended to species classification based on new data. DeepForest is designed for:
DeepForest is a Python package for training models and predicting ecological objects in airborne imagery. DeepForest comes with prebuilt models for immediate use and for fine-tuning through annotating and training custom models on your own data. DeepForest models can also be extended to classification (e.g., species) based on new data. DeepForest is designed for:

1. Applied researchers with limited machine learning experience
2. Applications with limited data that can be supported by prebuilt models
@@ -38,7 +38,7 @@ Practical Intro to Computer Vision in Ecology Research

Where can I get help, learn from others, and report bugs?
---------------------------------------------------------
Given the enormous array of forest types and image acquisition environments, it is unlikely that your image will be perfectly predicted by a prebuilt model. Below are some tips and general guidelines to improve predictions.
Given the enormous array of taxa, backgrounds, and image acquisition environments, it is unlikely that your image will be perfectly predicted by a prebuilt model. Check out the 'training', 'annotation', and 'predicting' sections of the documentation for more information on how to improve predictions using your own data.

Get suggestions on how to improve a model by using the `discussion board <https://github.com/weecology/DeepForest/discussions>`_. Please be aware that only feature requests or bug reports should be posted on the issues page. The most helpful thing you can do is leave feedback on the DeepForest `issue page`_. No feature, issue, or positive affirmation is too small. Please do it now!

@@ -64,7 +64,7 @@ DeepForest is an open-source python project that depends on user contributions.

* Making recommendations to the API and workflow. Please open an issue for anything that could help reduce friction and improve user experience.
* Leading implementations of new features. Check out the 'good first issue' tag on the repo and get in touch with the maintainers and tell us about your skills.
* Data contributions! The DeepForest backbone tree and bird models are not perfect. Please consider posting any annotations you make on Zenodo, or sharing them with DeepForest maintainers. Open an `issue <https://github.com/weecology/DeepForest/issues>`_ and tell us about the RGB data and annotations. For example, we are collecting tree annotations to create an `open-source benchmark <https://milliontrees.idtrees.org/>`_. Please consider sharing data to make the models stronger and benefit you and other users.
* Data contributions! The DeepForest backbone models are not perfect. Please consider posting any annotations you make on Zenodo, or sharing them with DeepForest maintainers. Open an `issue <https://github.com/weecology/DeepForest/issues>`_ and tell us about the RGB data and annotations. For example, we are collecting tree annotations to create an `open-source benchmark <https://milliontrees.idtrees.org/>`_. Please consider sharing data to make the models stronger and benefit you and other users.

Citation
--------
@@ -80,22 +80,6 @@ The second is the paper describing the particular model. See `Prebuilt Setup <..

.. _issue page: https://github.com/weecology/DeepForest/issues

Similar tools
-------------

There are many open-source projects for training machine learning models. We see DeepForest as a complement to many existing and excellent packages.

* Roboflow

The `supervision <https://supervision.roboflow.com/latest/>`_, `inference <https://inference.roboflow.com/>`_ and related packages within Roboflow's ecosystem are well executed and used throughout DeepForest. The inference machine underlying Roboflow requires connection to Roboflow, a computer vision software company which requires an API key, and has a range of commercial and license structures. We think of DeepForest as a small set of curated models that are targeted towards the ecological and environmental monitoring community. Finding robust models is challenging amongst the thousands of Roboflow projects. Roboflow is designed to be an all-encompassing ecosystem, whereas DeepForest is intentionally small and aimed at existing pipelines.

* Torchgeo

`Torchgeo <https://github.com/microsoft/torchgeo>`_ is a Python library written by developers at Microsoft to help automate remote sensing machine learning. Torchgeo has general structures, but the documents and general structure are focused on raster-based remote sensing, especially using earth-facing satellite data. Torchgeo has a number of useful datasets and curates pretrained models for remote sensing applications. The Torchgeo audience is generally more experienced with machine learning than the average DeepForest user.

We hope to continue to connect with both Roboflow and Torchgeo to improve interoperability among all model types and training. The future of open-source depends on collaboration, and we welcome users from all packages to submit ideas on how best to serve the community and reduce any duplication and wasted effort. There are many packages that hold useful individual models (e.g., `DetectTree2 <https://github.com/PatBall1/detectree2>`_) related to individual scientific publications. Our hope with DeepForest is to wrap general routines beyond individual research projects to make machine learning applications to environmental monitoring easier.


License
-------

5 changes: 3 additions & 2 deletions docs/user_guide/01_Reading_data.md
@@ -4,11 +4,12 @@ The most time-consuming part of many open-source projects is getting the data in

## Annotation Geometries and Coordinate Systems

DeepForest was originally designed for bounding box annotations. As of DeepForest 1.4.0, point and polygon annotations are also supported. There are two ways to format annotations, depending on the annotation platform you are using. `read_file` can read points, polygons, and boxes, in both image coordinate systems (relative to image origin at top-left 0,0) as well as projected coordinates on the Earth's surface. The `read_file` method also appends the location of the current image directory as an attribute. To access this attribute use
DeepForest was originally designed for bounding box annotations. As of DeepForest 1.4.0, point and polygon annotations are also supported. There are two ways to format annotations, depending on the annotation platform you are using. `read_file` can read points, polygons, and boxes, both in image coordinates (relative to the image origin at top-left 0,0) and in projected coordinates on the Earth's surface. The `read_file` method also stores the location of the current image directory in the `root_dir` attribute of its result:

```
from deepforest import get_data, utilities

filename = get_data("OSBS_029.csv")
df = utilities.read_file(filename)
df.root_dir  # directory containing the annotated images
```

**Note:** For CSV files, coordinates are expected to be in the image coordinate system, not projected coordinates (such as latitude/longitude or UTM).
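
To make the note above concrete, here is a small sketch that writes box annotations in image (pixel) coordinates and sanity-checks them against the image size. The file name `plot_01.png`, the pixel values, the 400 x 400 image size, and the helper `in_image_coordinates` are all hypothetical, and the column layout (`image_path`, `xmin`, `ymin`, `xmax`, `ymax`, `label`) is assumed here for illustration; check it against your own annotation export before relying on it.

```python
import csv
import io

# Hypothetical box annotations in image (pixel) coordinates.
rows = [
    {"image_path": "plot_01.png", "xmin": 10, "ymin": 20, "xmax": 60, "ymax": 80, "label": "Tree"},
    {"image_path": "plot_01.png", "xmin": 120, "ymin": 40, "xmax": 180, "ymax": 95, "label": "Tree"},
]
fieldnames = ["image_path", "xmin", "ymin", "xmax", "ymax", "label"]


def in_image_coordinates(row, image_width, image_height):
    """True if the box lies inside the image and has positive area."""
    return (0 <= row["xmin"] < row["xmax"] <= image_width
            and 0 <= row["ymin"] < row["ymax"] <= image_height)


# Write the annotations to an in-memory CSV.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

# A UTM-style easting/northing fails the check for a 400 x 400 pixel image.
projected_row = {"xmin": 404211.9, "ymin": 3285102.4, "xmax": 404215.2, "ymax": 3285106.0}
checks = [in_image_coordinates(r, 400, 400) for r in rows]
projected_ok = in_image_coordinates(projected_row, 400, 400)
print(csv_text)
```

A check like this catches the common mistake of saving projected coordinates into a CSV, since eastings and northings fall far outside the pixel range of the image.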
14 changes: 7 additions & 7 deletions docs/user_guide/02_prebuilt.md
@@ -1,6 +1,6 @@
# Prebuilt models

DeepForest has a few prebuilt models.
DeepForest comes with prebuilt models to help you get started. These models are available on Hugging Face and are loaded using the `load_model` function. They are intended as a starting point for further training, rather than as a general-purpose tool for new imagery.

## Tree Crown Detection model

@@ -38,6 +38,12 @@ We have created a [GPU colab tutorial](https://colab.research.google.com/drive/1

For more information, or specific questions about the bird detection, please create issues on the [BirdDetector repo](https://github.com/weecology/BirdDetector)

## Livestock Detectors model

This model has a single label 'cattle' trained on drone imagery of cows, sheep and other large mammals in agricultural settings. The model was trained on data from [insert countries and other metadata about landscapes].

![image](../../www/livestock-example.png)

## Crop Classifiers model

### Alive/Dead trees model
@@ -55,12 +61,6 @@ Table S1 Confusion matrix for the Alive/Dead model in Weinstein et al. 2023

Citation: Weinstein, Ben G., et al. "Capturing long‐tailed individual tree diversity using an airborne imaging and a multi‐temporal hierarchical model." Remote Sensing in Ecology and Conservation 9.5 (2023): 656-670.

## Livestock Detectors model

This model has a single label 'cattle' trained on drone imagery of cows, sheep and other large mammals in agricultural settings. The model was trained on data from [insert countries and other metadata about landscapes].

![image](../../www/livestock-example.png)

## Want more pretrained models?

Please consider contributing your data to open source repositories, such as Zenodo or lila.science. The more data we gather, the more we can combine the annotation and data collection efforts of hundreds of researchers to build models available to everyone. We welcome suggestions on what models and data are most urgently [needed](https://github.com/weecology/DeepForest/discussions).
2 changes: 1 addition & 1 deletion docs/user_guide/06_multi_species.md
@@ -8,7 +8,7 @@ m = main.deepforest(config_args={"num_classes":2},label_dict={"Alive":0,"Dead":1
```

It is often, but not always, useful to start with a prebuilt model when trying to identify multiple species. This helps the model focus on learning the multiple classes and not waste data and time re-learning bounding boxes. To do this, load the backbone and box-prediction portions of the release model, but create a classification model for more than one species.
Here is an example using the alive/dead tree data stored in the package, but the same logic applies to the bird detector.
Here is an example using the alive/dead tree data stored in the package, but the same logic applies to other detectors.

``` python
# Initialize new Deepforest model ( the model that you will train ) with your classes