Contributing guidelines

The main branch is the single source of truth for the paper. All analyses and figures should be reproducible using only the code and access to neuPrint-CNS.

New contributor guide

To get started, there are a number of documents that can help you:

  • Getting started with python: How to set up Python, a virtual environment, and dependencies
  • Notebook Editors: Two alternative ways to edit the Jupyter notebooks, which contain the majority of our analysis
  • Git basics: Basic steps for getting started with git and GitHub

If you haven't contributed before, you should add your name, email, and ORCID to the list below.

Code submission

Keep the code that you want to submit in a branch. The changes for each branch should be as self-contained and small as possible – in the best case one feature per branch.

Before submission, make sure that all your notebooks run. If you are using intermediate data files, regenerate them before running your code.

Once you have confirmed that your code works, remove the output from the notebooks, since the output is often not parsed on GitHub and makes the files unnecessarily big. In Jupyter Lab you can remove the output via the Cell → All Output → Clear menu; in VS Code, hit the Clear All Outputs button.
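If you prefer to clear outputs from a script rather than from an editor menu, the following is a minimal sketch of one way to do it. It is not part of the repository: the function names `clear_outputs` and `clear_notebook_file` are hypothetical, and the code assumes notebooks in the standard nbformat 4 JSON layout, where each code cell carries `outputs` and `execution_count` fields.

```python
import json
from pathlib import Path


def clear_outputs(notebook: dict) -> dict:
    """Remove outputs and execution counts from all code cells (nbformat 4 layout)."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_notebook_file(path: Path) -> None:
    """Rewrite a .ipynb file in place without its outputs."""
    notebook = json.loads(path.read_text(encoding="utf-8"))
    cleaned = clear_outputs(notebook)
    path.write_text(
        json.dumps(cleaned, indent=1, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
```

Calling `clear_notebook_file(Path("my_notebook.ipynb"))` on a hypothetical notebook leaves the source cells untouched and only empties the outputs, so the resulting diff on GitHub stays small.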

Now commit your latest changes, push them to GitHub, and open a Pull Request (PR). The PR should contain a short explanation of what your submission does.

At least one other person should review the code before it is accepted into the main branch. The submitter and the reviewer are jointly responsible for making sure the code works for others as well.

Data management

There should be a direct path from the neuPrint database to the results. If intermediate data files are required, the branch that uses them needs to contain the code that generates them.

On the rare occasion that an intermediate file is not generated from the neuPrint database but instead requires manual intervention, make sure to document in detail how you arrived at the data file. This explanation should be in a README.md file in the same directory as the intermediate data file.

Currently there are two locations for data files: ./cache/* and ./results/*. The cache directory should not end up on GitHub while the results directory is shared with the code.

The cache folder should be transparent (= invisible) to anyone interacting with the notebook. The folder is used by some functions and methods to store data that takes a long time to download from the database, for example the synapses of a certain cell type, a skeleton, or certain volumes. If the cache directory is deleted, there should be no need to run the code any differently; certain tasks might just take a bit longer to complete. The cache directory manages its own subfolders. If you write a new caching function, make sure to create a unique subdirectory and don't interfere with existing files.
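The caching behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the project's actual caching code: the helper `cached_json`, its parameters, and the JSON file format are all hypothetical, and the cache root is assumed to be the `./cache` directory mentioned earlier. The sketch only suits data that can be serialized to JSON.

```python
import json
from pathlib import Path
from typing import Callable

CACHE_ROOT = Path("cache")  # assumed location, matching ./cache/* above


def cached_json(
    subdir: str,
    key: str,
    compute: Callable[[], object],
    cache_root: Path = CACHE_ROOT,
) -> object:
    """Return compute()'s result, storing it as cache_root/subdir/key.json.

    `subdir` should be unique to the calling function so that it never
    touches files written by other caching functions.
    """
    cache_file = cache_root / subdir / f"{key}.json"
    if cache_file.exists():
        # Cache hit: skip the slow computation or download entirely.
        return json.loads(cache_file.read_text(encoding="utf-8"))
    result = compute()
    # Cache miss: create the subdirectory on demand, so a deleted cache
    # directory only costs time and never requires a change to the caller.
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(result), encoding="utf-8")
    return result
```

A caller would wrap its slow query in `compute`, for example `cached_json("my_new_function", "some_cell_type", fetch)` with a hypothetical `fetch` function. Because the directory is recreated on a miss, deleting the cache changes only the run time, not the result.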

The results directory contains files that are generated by one script and then used by another. It also contains manually generated files. Make sure that the origin of these files is clearly documented. The results directory also contains the figures for the paper and other *.pdf files. If a file's name is clearly visible in the script that generates it, it is not necessary to mention its origin in the README.md file, but it doesn't hurt.

Orphaned files (files that have no known origin) within the results directory get cleaned out regularly.

Updates and cleanup

If you regularly work with the OL-Connectome code, you are encouraged to do the following two tasks once every other week:

  • remove the cache directory (you can run make clean or delete it manually)
  • reinstall the dependencies: newer code often relies on new functions from libraries, and requirements.txt is updated at least once a week
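As a rough sketch of the manual route for these two tasks, the helpers below delete the cache directory and build the pip command for reinstalling the dependencies. Both function names are hypothetical, the cache is assumed to live in a `cache` folder directly under the project root, and nothing here reflects what `make clean` actually does in the repository.

```python
import shutil
import sys
from pathlib import Path


def remove_cache(project_root: Path) -> bool:
    """Delete project_root/cache if present; return True if it was removed."""
    cache_dir = project_root / "cache"  # assumed cache location
    if not cache_dir.is_dir():
        return False
    shutil.rmtree(cache_dir)
    return True


def reinstall_command(requirements: Path) -> list[str]:
    """Build the pip command that installs everything listed in the requirements file."""
    # Using sys.executable targets the interpreter of the active virtual environment.
    return [sys.executable, "-m", "pip", "install", "-r", str(requirements)]
```

To run the reinstall, pass the returned list to `subprocess.run(..., check=True)` from within your virtual environment.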

List of Code Contributors