Most insect neurons are morphological highly stereotyped: they look almost exactly the same across animals - often down to medium size branches. We also know that some neurons are clearly unique (i.e. there is only ever one exemplar per hemisphere) while others come in multiple, highly similar copies.
This knowledge is largely based on genetic driver lines that can be used to label neurons with similar (but not necessarily known) genetic fingerprints that then turn out to also have very similar morphology. Combined, these two clues (genetics + morphology) provide strong support for declaring a "cell type".
The advent of connectomes comes with a plethora of dense data: all of the sudden we have morphologies for every single neuron neuron in the fly brain - something that is next to impossible by genetic means.
Great! So let's just find similar looking neurons and make them a cell type. Alas it's not that easy. Where do you make the cut off? How similar do two neurons have to be, to be consider the same type? Can we find the same groups across data sets?
That last point is of particular importance: we can define cell types based on a single dataset - e.g. using confirmed types as landmarks - but they are of limited use unless these types persist across animals. To address this, we need to be able to identify neurons across data sets, and assess variability and groupings to define sensible types.
Why we bother with this if it's so complicated? Because it's much easier to think about a few thousand different cell types than about a hundred thousand individual neurons.
Three complete sets of antennal lobe project neurons (ALPNs, ~335 neurons per set) reconstructed in two different Drosophila brains:
- FAFB (full adult fly brain) right hemisphere
- FAFB left hemisphere
- Janelia hemibrain ("FIB") right hemisphere
- Neurons can only match within the same developmental (hemi-)lineage. This
reduces the group size to ~90 in the worst case and to 1 in the best case.
See
/data/meta.csv
for that information.
- Every now and then a neuron has a developmental hiccup and does something unexpected. That can range from a missing or additional branch to taking a different tract. This can lead to bad scores between otherwise "correct" matches.
- Some ALPNs are truncated in the hemibrain. We tried to compensate for this during the NBLAST but we currently don't factor in how much we loose due to truncation.
- In some cases - typically when there are multiple sister neurons of the same type - there can be variations in numbers. Hence a 1:1:1 matching is not always possible.
- For some neurons the lineage annotation might not be accurate. This is because the primary neurite bundles for some lineages (like l2PNs) do not tightly co-fasciculate, and some neurons can look ambiguous.
/data/nblast_scores.csv
contains an all-by-all NBLAST similarity score matrix.
See /data/nblast.ipynb
for details on how these data were generated but in brief:
- FAFB and hemibrain (FIB) skeletons were transformed to the JRC2018F template brain
- left FAFB skeletons were additionally mirrored to the right
- small twigs (< 5 microns) were pruned off
- skeletons were resampled to 1 micron and turned into dotprops (points + tangent vectors)
- for FIB vs FAFB NBLASTs, the FAFB dotprops were truncated to the hemibrain volume
- for FAFB vs FAFB NBLASTs, the full dotprops were used
- scores were combined into one big score matrix
Important: These scores are normalized but not symmetrized. Typically we use the mean of the forward and reverse scores which can be obtains like so:
>>> import pandas as pd
>>> scores = pd.read_csv('nblast_scores.csv')
>>> scores_mean = (scores + scores.T) / 2
/data/meta.csv
contains meta data for each neuron:
id
: ID that match index/columns innblast_scores.csv
name
: name of the neuron; this can be used as sanity checklineage
: constraint for matching/typinglabel
: labels of known types; we expect mostly pairs/groups where labels matchis_canonical
: whether neuron labels are well establishedntype
: whether neuron is uni- or multiglomerular projection neuronsource
: to which dataset this neuron belongs
The data provided in this repository is based on three publications:
- Schlegel, Bates et al., bioRxiv (2020)
- Bates, Schlegel et al., Current Biology (2020)
- Scheffer et al., eLife (2020)
In particular, Figure 4 and S6 in Schlegel, Bates et al. represent our attempts at cross-matching these ALPNs.