Metrics for tools described in the analytical flexibility review paper
Clone this repository:
git clone https://github.com/neurodatascience/analytical-flexibility-tool-metrics.git
Move to the newly created directory:
cd analytical-flexibility-tool-metrics
One of the Python packages (condastats) needs to be installed with conda (there is a version on PyPI, but its installation did not work), so running the code requires creating a conda environment. Please refer to the official instructions on how to install Anaconda or Miniconda for your operating system.
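To check whether conda is already available, you can run:
conda --version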
Once conda is installed, the next step is to create a new conda environment with the appropriate Python version. Here we call the environment metrics, but it can be any name. The code runs with Python 3.11.6; it might run with other versions, but this has not been tested.
conda create --name metrics python=3.11.6
Activate the environment:
conda activate metrics
Then, the condastats package can be installed with:
conda install -c conda-forge condastats
Finally, the other dependencies can be installed via pip. Assuming we are still in the analytical-flexibility-tool-metrics directory, the command is:
pip install -r requirements.txt
The latest versions of the required packages will most likely work. The exception is pandas, because some condastats functions crash with pandas versions later than 2.0.0. In case something does not work, the exact versions used when developing the code are:
- condastats 0.2.1
- matplotlib 3.8.2
- pandas 1.5.3
- pypistats 1.5.0
- python-dotenv 1.0.0
- requests 2.31.0
- seaborn 0.13.0
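If needed, one way (a convenience, not a required step) to reproduce those exact versions is to pin them explicitly:
conda install -c conda-forge condastats=0.2.1
pip install matplotlib==3.8.2 pandas==1.5.3 pypistats==1.5.0 python-dotenv==1.0.0 requests==2.31.0 seaborn==0.13.0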
The main script is code/generate_figures.py:
usage: generate_figures.py [-h] [--tools FPATH_TOOLS] [--figs-dir DPATH_FIGS]
[--load-metrics METRICS_CSV_IN]
[--save-metrics METRICS_CSV_OUT] [--overwrite]
options:
-h, --help show this help message and exit
--tools FPATH_TOOLS path to CSV file containing information about tools
(default: <PATH_TO_REPO>/data/tools.csv)
--figs-dir DPATH_FIGS
path to output figures directory (default:
<PATH_TO_REPO>/figs)
--load-metrics METRICS_CSV_IN
path to read metrics CSV file (optional). Note: --load-
metrics and --save-metrics cannot both be specified. Also,
if --load-metrics is specified, --tools is ignored
--save-metrics METRICS_CSV_OUT
path to write metrics CSV file (optional). Note: --load-
metrics and --save-metrics cannot both be specified
--overwrite overwrite existing figures (and metrics file if applicable)
All examples assume the working directory is the root directory of this repo.
./code/generate_figures.py --overwrite
./code/generate_figures.py --load-metrics data/metrics.csv --overwrite
./code/generate_figures.py --tools <PATH_TO_TOOLS_FILE> --figs-dir <PATH_TO_FIGS_DIR>
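The computed metrics can also be saved to a CSV file for later reuse with --load-metrics (the output path is just an example):
./code/generate_figures.py --save-metrics <PATH_TO_METRICS_FILE>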
See data/tools.csv for an example input file. The expected columns are described below; an illustrative sketch of a possible input file follows the list.
- tool_name: Name of the tool as it appears in the figures.
- review_paper_section: Used to group tools into separate figures and to determine the name of the saved image file.
- doi: Digital Object Identifier, such that https://www.doi.org/{doi} will resolve.
- github: GitHub repository owner and name (e.g., neurodatascience/analytical-flexibility-tool-metrics).
- gitlab: GitLab project ID, which can be found in the raw HTML source of the project's GitLab page (look for something like project-id or data-project-id). See here (might be outdated).
- docker1: Container image name on DockerHub.
- docker2: Same as docker1, for tools with container images published in two different places (e.g., because they were moved).
- github_container: Package name for the GitHub container registry. Note: it seems there is no GitHub API that provides this information at the time of writing (see here), so this is currently not implemented.
- pypi: Package name for Python packages distributed through PyPI.
- conda: Package name for Python packages distributed through a conda channel.
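For illustration only, a header row and one data row might look as follows; the column order, the DOI, and the image and package names are made-up placeholders (the GitHub value reuses the example above), and data/tools.csv remains the authoritative reference:

tool_name,review_paper_section,doi,github,gitlab,docker1,docker2,github_container,pypi,conda
ExampleTool,Section 3,10.1000/xyz123,neurodatascience/analytical-flexibility-tool-metrics,,exampleorg/exampletool,,,exampletool,exampletool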
The script generates figures (by default in the figs directory) with panels for each of the computed metrics (or metric groups). Here is an example of a complete figure:
The metrics used in each panel, and the tools.csv columns they rely on, are listed below; a minimal sketch of the kind of API queries involved follows the list.
- Citations over time: Cumulative number of citations obtained from the OpenCitations API.
  - Column(s) used: doi
- Code repository metrics: Number of stars and forks obtained from the GitHub and/or GitLab APIs.
  - Column(s) used: github, gitlab
- Container pulls: Number of container downloads obtained from the DockerHub API.
  - Column(s) used: docker1, docker2
- Python package downloads in the last 180 days: Number of downloads from PyPI (obtained using pypistats) and/or conda (obtained using condastats).
  - Column(s) used: pypi, conda
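To give a concrete sense of the raw numbers behind these panels, here is a small standalone sketch (not the actual implementation in code/generate_figures.py) that queries the GitHub and DockerHub APIs with requests; the repository name reuses the example above and the DockerHub image name is a placeholder.

```python
import requests

# Stars and forks for a GitHub repository (the "github" column).
# Unauthenticated GitHub API requests are rate-limited.
repo = "neurodatascience/analytical-flexibility-tool-metrics"
gh = requests.get(f"https://api.github.com/repos/{repo}", timeout=10).json()
print("stars:", gh["stargazers_count"], "forks:", gh["forks_count"])

# Cumulative pulls for a DockerHub image ("library/ubuntu" is only a placeholder).
image = "library/ubuntu"
dh = requests.get(f"https://hub.docker.com/v2/repositories/{image}/", timeout=10).json()
print("pulls:", dh["pull_count"])
```

Citation counts (OpenCitations), GitLab stars and forks, and PyPI/conda download counts are gathered in an analogous way through their respective APIs or client packages (pypistats and condastats).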