Commit
0 parents
commit fd3f780
Showing
25 changed files
with
1,126 additions
and
0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
MIT License | ||
|
||
Copyright (c) 2020 Calico | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in all | ||
copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
SOFTWARE. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
HOST=127.0.0.1 | ||
TEST_PATH=./tests | ||
|
||
PROJECT_NAME=MyProject | ||
PROJECT_NEW_NAME=MyProject | ||
FILES_WITH_PROJECT_NAME=Makefile .github/workflows/build-docs.yml README.md docs/index.md mkdocs.yml setup.py | ||
FILES_WITH_PROJECT_NAME_LC=Makefile README.md tests/test_myproject.py | ||
PROJECT_NAME_LC=$(shell echo ${PROJECT_NAME} | tr '[:upper:]' '[:lower:]') | ||
PROJECT_NEW_NAME_LC=$(shell echo ${PROJECT_NEW_NAME} | tr '[:upper:]' '[:lower:]') | ||
|
||
clean-pyc: | ||
find . -name '*.pyc' -exec rm --force {} + | ||
find . -name '*.pyo' -exec rm --force {} + | ||
find . -name '*~' -exec rm --force {} + | ||
|
||
clean-build: | ||
rm --force --recursive build/ | ||
rm --force --recursive dist/ | ||
rm --force --recursive *.egg-info | ||
|
||
rename-project: | ||
@echo renaming "${PROJECT_NAME}" to "${PROJECT_NEW_NAME}" | ||
@sed -i '' -e "s/${PROJECT_NAME}/${PROJECT_NEW_NAME}/g" ${FILES_WITH_PROJECT_NAME} | ||
@echo renaming "${PROJECT_NAME_LC}" to "${PROJECT_NEW_NAME_LC}" | ||
@sed -i '' -e "s/${PROJECT_NAME_LC}/${PROJECT_NEW_NAME_LC}/g" ${FILES_WITH_PROJECT_NAME_LC} && \ | ||
git mv src/${PROJECT_NAME_LC} src/${PROJECT_NEW_NAME_LC} && \ | ||
git mv tests/test_${PROJECT_NAME_LC}.py tests/test_${PROJECT_NEW_NAME_LC}.py | ||
@echo Project renamed | ||
|
||
lint: | ||
flake8 --exclude=.tox | ||
|
||
test: | ||
pytest --verbose --color=yes $(TEST_PATH) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
<p> | ||
<a href="https://docs.calicolabs.com/python-template"><img alt="docs: Calico Docs" src="https://img.shields.io/badge/docs-Calico%20Docs-28A049.svg"></a> | ||
<a href="https://github.com/psf/black"><img alt="Code style: black" src="https://img.shields.io/badge/code%20style-black-000000.svg"></a> | ||
</p> | ||
|
||
# DISH Scoring | ||
|
||
![](https://github.com/calico/myproject) | ||
|
||
## Overview | ||
|
||
This tool applies ML models to the analysis of DEXA images for measuring | ||
bone changes that are related to diffuse idiopathic skeletal hyperostosis (DISH). | ||
|
||
## [DISH Analysis: Methods Description](docs/analysis.md) | ||
|
||
## [Developer Documentation](docs/developer.md) | ||
|
||
## Installation | ||
The recommended build environment for the code is to have [Anaconda](https://docs.anaconda.com/anaconda/install/) installed and then to create a conda environment for python 3 as shown below: | ||
|
||
``` | ||
conda create -n dish python=3.7 | ||
``` | ||
|
||
Once created, activate the environment and install all the needed libraries as follows: | ||
|
||
``` | ||
conda activate dish | ||
pip install -r requirements.txt | ||
``` | ||
|
||
## Usage | ||
An example of a recommended invocation of the code: | ||
|
||
``` | ||
python scoreSpines.py -i <dir of imgs> -o <out file> --aug_flip --aug_one | ||
``` | ||
### [Detailed Usage Instructions](docs/getstarted.md) | ||
|
||
|
||
## License | ||
|
||
See LICENSE | ||
|
||
## Maintainers | ||
|
||
See CODEOWNERS |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
0.0.1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,192 @@ | ||
[Back to home.](../README.md) | ||
|
||
# DISH Scoring: Methods Description | ||
|
||
The pipeline described here scores the extent of hyperostosis that can be observed in a | ||
lateral dual-energy X-ray absorptiometry (DEXA) scan image of a human torso. As described | ||
by [Kuperus et al (2018)](https://www.jrheum.org/content/jrheum/45/8/1116.full.pdf), such | ||
hyperostosis can create bridges between vertebrae that limit flexibility and ultimately | ||
contribute to diffuse idiopathic skeletal hyperostosis (DISH). | ||
|
||
The analysis occurs in three steps: | ||
1. Identification of anterior intervertebral junctions; | ||
2. Scoring the hyperostosis of each intervertebral junction; | ||
3. Summing the bridge scores across the spine. | ||
|
||
Details on each of those steps are given below, along with the final performance of the | ||
system against hold-out test data generated by human annotators. | ||
|
||
## Step 1: Identification of anterior intervertebral junctions. | ||
|
||
Pathological DISH involves the linking of vertebrae by bony outgrowths that traverse | ||
intervertebral gaps. Its pathology results from the summed effects of hyperostosis | ||
between all adjacent pairs of vertebrae in the spine. The first step in the analysis of | ||
DISH was therefore the identification of the anterior portions of the intervertebral | ||
gaps along the entire spine. These are the loci where DISH-relevant bridges can form | ||
that are visible in lateral DEXA images. An object-detection model was applied to this | ||
task. It was trained by transfer learning from the | ||
**ssd_mobilenet_v1** model, using annotations similar to those shown below: | ||
|
||
<img src="imgs/objDetTrainExamples.jpeg" alt="examples of object-detection training annotations" height="100%" width="100%"> | ||
|
||
A set of 160 images was annotated by this author, which included 2,271 boxes drawn | ||
around vertebral junctions. The average number of boxes per image (14.2) is used | ||
to define the threshold for junction annotation: for each image being evaluated, | ||
the 14 highest-confidence annotations returned by the object detector will be used. | ||
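
The thresholding rule above can be sketched as follows. This is a minimal illustration, not the repository's code: the `(confidence, box)` tuple layout and the name `top_detections` are assumptions made for the example.

```python
from typing import List, Tuple

# A detection is (confidence, (x_min, y_min, x_max, y_max)); this layout is an assumption.
Detection = Tuple[float, Tuple[float, float, float, float]]

# Average boxes per annotated image was 2,271 / 160 = 14.2, so 14 boxes are kept.
N_JUNCTIONS = 14


def top_detections(detections: List[Detection], n: int = N_JUNCTIONS) -> List[Detection]:
    """Return the n highest-confidence detections, best first."""
    return sorted(detections, key=lambda d: d[0], reverse=True)[:n]
```

If the detector returns fewer than 14 candidates, this sketch simply returns all of them.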
|
||
The annotated images were separated into training and test sets | ||
of 100 and 60 images, respectively. Training-set images were augmented by horizontal | ||
flipping (all images in the study set are right-facing), inwards adjustment of image borders, | ||
brightness, and contrast. In addition, in order to simulate artifacts observed at low frequency | ||
across the study set, augmentation was performed by drawing large black or white blocks randomly | ||
along the image edges. The final augmented training set included 1200 images and 10,244 boxes. | ||
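
Horizontal flipping has to be applied to the annotation boxes as well as to the pixels. The sketch below shows one way to do that with plain Python lists; the box layout `(x_min, y_min, x_max, y_max)` and the function name are assumptions for illustration, not the augmentation code used in this study.

```python
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in pixels; this layout is an assumption.
BoxCoords = Tuple[int, int, int, int]


def flip_horizontal(
    image: List[List[int]], boxes: List[BoxCoords]
) -> Tuple[List[List[int]], List[BoxCoords]]:
    """Mirror an image (a list of pixel rows) and its boxes left-to-right."""
    width = len(image[0])
    flipped_image = [row[::-1] for row in image]
    # A box spanning [x_min, x_max) maps to [width - x_max, width - x_min).
    flipped_boxes = [(width - x1, y0, width - x0, y1) for (x0, y0, x1, y1) in boxes]
    return flipped_image, flipped_boxes
```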
|
||
Performance of the object detector was evaluated in the 60-image test set using | ||
intersection-over-union (IoU) for the 14 top-scoring predicted boxes versus all of the | ||
annotated boxes, allowing each predicted box's intersection to only be counted for its | ||
most-overlapping annotated counterpart. The average IoU across the 60 test images was | ||
**68.9% (SD 5.9%)**. | ||
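
One possible reading of that IoU calculation is sketched below. It is an interpretation rather than the evaluation script itself: each predicted box contributes only its largest intersection with any annotated box, and the union is taken as the summed areas minus that intersection, which assumes that boxes within each set do not overlap one another. All names are hypothetical.

```python
from typing import List, Tuple

BoxCoords = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


def area(box: BoxCoords) -> float:
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def intersection(a: BoxCoords, b: BoxCoords) -> float:
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, width) * max(0.0, height)


def set_iou(predicted: List[BoxCoords], annotated: List[BoxCoords]) -> float:
    """IoU of two box sets, counting each predicted box against its best match only."""
    inter = sum(
        max((intersection(p, a) for a in annotated), default=0.0) for p in predicted
    )
    union = sum(area(p) for p in predicted) + sum(area(a) for a in annotated) - inter
    return inter / union if union > 0 else 0.0
```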
|
||
## Step 2: Scoring the hyperostosis of each intervertebral junction. | ||
|
||
For each intervertebral junction, a numeric score was to be assigned according to the criteria | ||
described by [Kuperus et al (2018)](https://www.jrheum.org/content/jrheum/45/8/1116.full.pdf) | ||
in Figure 2 of that manuscript. Those authors provide examples and descriptions of hyperostosis | ||
between adjacent vertebral bodies, scored on a 0-3 scale in terms of both "bridge" and "flow". | ||
I automated that scoring using an image classification model, paying greater attention to the | ||
"Bridge score" scale than to the "Flow score" scale. This model classified images of individual bridges, | ||
i.e. images extracted from the source image | ||
using the 14 top-scoring boxes defined by the object detection model described above. Four | ||
categories were established and named numerically with reference to the bridge score | ||
("br0", "br1", "br2", and "br3"), corresponding to the severity | ||
of hyperostosis: | ||
|
||
<img src="imgs/scoreExamples.jpeg" alt="examples of bridge score categories" height="50%" width="50%"> | ||
|
||
For the training and testing of this image classification model, the object detection model was | ||
used to draw boxes (top-scoring 14 per image) across 893 DEXA spine images. Each of the resulting | ||
12,502 box images was manually classified as described above. For the test set, 200 of the DEXA | ||
images (comprising 2800 bridge images) were randomly selected; the remaining 693 DEXA images (9,702 | ||
bridge images) made up the pre-augmentation training set. The categories (named "br0", "br1", "br2", | ||
and "br3", corresponding to the bridge scores) were not evenly balanced (shown for the total annotation set): | ||
|
||
| Class | Count | % | | ||
| ----- | ----: | --: | | ||
| br0 | 10270 | 82.15 | | ||
| br1 | 1704 | 13.63 | | ||
| br2 | 356 | 2.85 | | ||
| br3 | 172 | 1.38 | | ||
|
||
For the training set, the full data set was augmented first using a horizontal flip. | ||
In the following augmentation steps, imbalance between the classes was reduced by down-sampling | ||
from the "br0" and "br1" classes (including in the selection of non-augmented boxes). For each | ||
augmentation step, a separate randomly-selected subset of the available boxes (bridge images) was sampled, ensuring | ||
maximum diversity of images but nonetheless consistent proportions of augmentation treatments across | ||
the classes. The use of only 10% of "br0" boxes and 25% of "br1" boxes resulted in the following proportions: | ||
|
||
| Class | Input % | Sampled % | Final % | | ||
| ----- | ------: | ------: | ------: | | ||
| br0 | 82.15 | 10 | 51.8 | | ||
| br1 | 13.63 | 25 | 21.5 | | ||
| br2 | 2.85 | 100 | 18.0 | | ||
| br3 | 1.38 | 100 | 8.7 | | ||
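
The "Final %" column follows from the input percentages and the sampled fractions by renormalizing. The short calculation below reproduces it; the variable and function names are illustrative only.

```python
# Input class percentages and the fraction of each class that was sampled.
input_pct = {"br0": 82.15, "br1": 13.63, "br2": 2.85, "br3": 1.38}
sampled_fraction = {"br0": 0.10, "br1": 0.25, "br2": 1.00, "br3": 1.00}


def final_percentages(input_pct, sampled_fraction):
    """Renormalize class shares after down-sampling each class."""
    kept = {cls: input_pct[cls] * sampled_fraction[cls] for cls in input_pct}
    total = sum(kept.values())
    return {cls: 100.0 * kept[cls] / total for cls in kept}


final = final_percentages(input_pct, sampled_fraction)
for cls, pct in final.items():
    print(f"{cls}: {pct:.1f}")
# br0: 51.8
# br1: 21.5
# br2: 18.0
# br3: 8.7
```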
|
||
Bridge images were extracted during the augmentation process, allowing the box itself to be randomly | ||
modified. The following augmentation combinations were performed: 1) non-augmented; 2) random tilt up to 30 deg.; | ||
3) random adjustment of the box edge positions by up to 20% of the box width or height; 4) tilt & edge; 5) tilt & | ||
brightness; 6) edge & brightness; 7) tilt & contrast; 8) edge & contrast. Augmentation therefore increased the | ||
training set size by 8-fold, resulting in the following counts for bridge images by class: | ||
|
||
| Class | Count | | ||
| ----- | ----: | | ||
| br0 | 12752 | | ||
| br1 | 5272 | | ||
| br2 | 4496 | | ||
| br3 | 2112 | | ||
|
||
Training was performed using transfer learning from the **efficientnet/b1** model. Evaluated using the | ||
test set described above, the Cohen's kappa value for the final model was 0.405 with the following | ||
confusion matrix (rows=human, cols=model): | ||
|
||
| | br0 | br1 | br2 | br3 | total | | ||
| ---- | ----:| ---:| ---:| ---:| -----:| | ||
| **br0** | 2102 | 102 | 31 | 65 | 2300 | | ||
| **br1** | 195 | 119 | 31 | 40 | 385 | | ||
| **br2** | 8 | 12 | 29 | 26 | 75 | | ||
| **br3** | 1 | 1 | 5 | 33 | 40 | | ||
| **total** | 2306 | 234 | 96 | 164 | | | ||
|
||
**Cohen's kappa (test set) = 0.405** | ||
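
Cohen's kappa can be recomputed from a confusion matrix as a consistency check. The sketch below is not taken from the repository, and the function name is hypothetical. The cell values are entered so that every row and column sums to the totals stated in the table above (2300, 385, 75, 40 by row; 2306, 234, 96, 164 by column), which is an assumption of this sketch.

```python
from typing import List


def cohens_kappa(matrix: List[List[int]]) -> float:
    """Cohen's kappa for a square confusion matrix (rows=human, cols=model)."""
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Observed agreement: fraction of items on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    # Chance agreement: product of the marginal frequencies, summed over classes.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (observed - expected) / (1.0 - expected)


# Rows and columns are ordered br0, br1, br2, br3.
confusion = [
    [2102, 102, 31, 65],
    [195, 119, 31, 40],
    [8, 12, 29, 26],
    [1, 1, 5, 33],
]
kappa = cohens_kappa(confusion)
print(f"{kappa:.3f}")
```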
|
||
Due to the numeric nature of the classes, the model was also evaluated against the test set using | ||
Pearson correlation (using the numeric values of each class "br0", "br1", "br2", and "br3"): | ||
|
||
**Pearson correlation (test set) = 0.581** | ||
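
Treating the class names as the numbers 0 to 3, the Pearson correlation between human and model calls can be computed as in the sketch below. The helper names and the five-item toy data are illustrative assumptions and are not drawn from the test set.

```python
import math
from typing import Sequence

CLASS_VALUE = {"br0": 0, "br1": 1, "br2": 2, "br3": 3}


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy example: class calls for five bridge images (hypothetical data).
human = ["br0", "br1", "br3", "br2", "br0"]
model = ["br0", "br0", "br3", "br1", "br1"]
r = pearson([CLASS_VALUE[c] for c in human], [CLASS_VALUE[c] for c in model])
```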
|
||
## Step 3: Summing the bridge scores across the spine. | ||
|
||
The final output value of the model evaluates overall DISH-like hyperostosis across the spine. | ||
Final evaluation | ||
was performed using a hold-out set of 200 DEXA images that were scored by three independent raters | ||
(evaluation was performed using the mean rater score for each DEXA image). | ||
Those raters used the same bridge-score scheme described above, with the appearance of DISH-related | ||
bony outgrowth scored as either a 1, 2 or 3 (bridges without observable outgrowth implicitly received | ||
a score of 0). For each DEXA image, those numeric scores were summed to produce the final DISH score. | ||
|
||
In addition to the final hold-out test used for model evaluation, the independent raters also produced | ||
a training set of 199 images (**Rater Training**) that was used to compare alternative ML models and | ||
alternative strategies for interpretation of the ML model output. The classification model's test set | ||
annotations, aggregated across each DEXA image, were used for the same purpose (**Preliminary Training**). | ||
In the case of Rater Training, the performances of the | ||
object-detection and classification models were being evaluated simultaneously. In the case of | ||
Preliminary Training, only the performance of the classification model (and the interpretation of | ||
its output) was being evaluated. | ||
|
||
For each DEXA image, the top-scoring 14 boxes from the object-detection model were used to define | ||
sub-images that were scored by the classification model, both described above. Initially, the numbers | ||
associated with the class assigned to each of the 14 bridge images ("br0", "br1", "br2", "br3") were summed | ||
to produce the model-derived DISH score. Two modifications were added to this process, described below. | ||
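
The unmodified summation can be written in a few lines. This is a sketch with hypothetical names, assuming that the classifier's call for each bridge image is available as a class-name string.

```python
from typing import Iterable

CLASS_SCORE = {"br0": 0, "br1": 1, "br2": 2, "br3": 3}


def dish_score(bridge_classes: Iterable[str]) -> int:
    """Sum the numeric bridge scores of the classes assigned to one DEXA image."""
    return sum(CLASS_SCORE[cls] for cls in bridge_classes)
```

For example, an image with ten "br0" bridges, two "br1" bridges, one "br2" bridge and one "br3" bridge would score 0 + 2 + 2 + 3 = 7.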
|
||
First, bridges assigned | ||
a score of 1 ("br1") were re-evaluated and assigned a decimal score in the interval \[0-1\]. That value | ||
was calculated as the fraction of confidence scores assigned by the model to classes "br1", "br2", and "br3". | ||
This had the general effect of down-weighting "br1" assignments, which frequently were made spuriously (see | ||
the confusion matrix above), unless they looked more like "br2"/"br3" instances (which provide a rare source | ||
for mis-classification) than they looked like "br0" instances (which provide an abundant source for | ||
mis-classification). This modification is referred to below as the "augmentation of one" (**Aug.One**). | ||
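
A sketch of one reading of **Aug.One** follows. It assumes the classifier returns one confidence value per class and interprets "the fraction of confidence scores" as the share of total confidence that falls on "br1", "br2" and "br3"; that interpretation and all names are assumptions made for illustration.

```python
from typing import Dict

CLASSES = ("br0", "br1", "br2", "br3")


def bridge_score(confidences: Dict[str, float], aug_one: bool = True) -> float:
    """Score one bridge image from its per-class confidence values."""
    top = max(CLASSES, key=lambda cls: confidences[cls])
    if top == "br1" and aug_one:
        # Replace the flat score of 1 with the share of confidence on br1/br2/br3.
        total = sum(confidences[cls] for cls in CLASSES)
        non_zero = confidences["br1"] + confidences["br2"] + confidences["br3"]
        return non_zero / total
    return float(CLASSES.index(top))
```

Under this reading, a "br1" call whose remaining confidence sits mostly on "br0" contributes well under 1 to the image score, while other calls keep their integer scores.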
|
||
Second, the training of both models on horizontally-flipped images, despite the invariance of right-facing | ||
images in the study set for which this tool was being developed, allowed the implementation of a | ||
horizontal-flip data augmentation strategy during evaluation. Each DEXA image was scored twice: once in | ||
its original orientation, once in its horizontally-flipped orientation. The output score was taken as the | ||
average of those two scores. This allowed the impact of both models' idiosyncrasies to be minimized. | ||
This modification is referred to below as "**Aug.Flip**". | ||
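
The flip-averaging step can be sketched as below. The image is represented as a list of pixel rows and `score_fn` stands in for the full detect-and-classify pipeline; both are assumptions made to keep the example self-contained.

```python
from typing import Callable, List

Image = List[List[int]]  # rows of pixel values; a stand-in for a real image type


def score_with_flip(image: Image, score_fn: Callable[[Image], float]) -> float:
    """Average the scores of an image and of its horizontal mirror."""
    flipped = [row[::-1] for row in image]
    return (score_fn(image) + score_fn(flipped)) / 2.0
```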
|
||
Pearson correlation coefficients: | ||
|
||
| Modification | Prelim. Tr. | Rater Tr. | | ||
| ------------ | :---------- | :-------- | | ||
| None | 0.832 | 0.821 | | ||
| **Aug.One** | 0.824 | 0.834 | | ||
| **Aug.Flip** | 0.839 | 0.838 | | ||
| **Aug.One + Aug.Flip** | 0.828 | **0.850** | | ||
|
||
Use of both **Aug.One** and **Aug.Flip** was the strategy selected for the final application of | ||
the model. Here is a plot of performance versus the Rater Training set: | ||
|
||
<img src="imgs/trainRegress.jpeg" alt="performance versus Rater Training set" height="45%" width="45%"> | ||
|
||
## Final performance evaluation. | ||
|
||
The Rater Test set provided the basis for the final evaluation of the full DISH scoring tool, | ||
as described above, and it was considered after the model | ||
had been applied to all study images. The tool's performance on that set is shown below: | ||
|
||
<img src="imgs/testRegress.jpeg" alt="performance versus Rater Test set" height="45%" width="45%"> | ||
|
||
**Pearson correlation (Rater Test set) = 0.774** | ||
|
||
## References | ||
|
||
Kuperus et al (2018) "The natural course of diffuse idiopathic skeletal | ||
hyperostosis in the thoracic spine of adult males." *The Journal of Rheumatology.* 45:1116-1123. doi:10.3899/jrheum.171091 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
[Back to home.](../README.md) | ||
|
||
# Developer Documentation | ||
|
||
## Module Organization | ||
|
||
This system is implemented using a data-centric abstraction model. There | ||
is a set of classes responsible for the analysis of data, | ||
a set of classes responsible for input & output of progress and data, | ||
and a pair of classes responsible for managing the overall workflow. The following | ||
classes are implemented, with the indicated dependencies on one another: | ||
|
||
![Module Dependency Diagram](imgs/mdd_modules.jpeg) | ||
|
||
### Data analysis: | ||
|
||
**DishScorer** executes the scoring of DISH in given DEXA images. It applies ML models using stored instances of **TfObjectDetector** and **TfClassifier**, | ||
and it interprets the results of those analyses, executing all augmentation options. | ||
|
||
**TfObjectDetector** applies a stored tensorflow object-detection model to an image, returning a series of confidence-scored **Box** instances. | ||
|
||
**Box** records the edges of a box on an x-y plane. Instances can also carry score values for themselves, as well as other arbitrary data ('labels'). | ||
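
To illustrate the kind of record described here, a box with edges, an optional score and free-form labels could be modelled as below. The field and method names are assumptions for illustration and may not match the attributes of the actual **Box** class.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Box:
    """Hypothetical sketch of a box record; names are not the repository's API."""

    x_min: float
    y_min: float
    x_max: float
    y_max: float
    score: Optional[float] = None  # e.g. detector confidence, if known
    labels: Dict[str, Any] = field(default_factory=dict)  # arbitrary extra data

    def width(self) -> float:
        return self.x_max - self.x_min

    def height(self) -> float:
        return self.y_max - self.y_min
```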
|
||
**TfClassifier** applies a stored tensorflow image-classification model to an image, returning an instance of **TfClassResult**. | ||
|
||
**TfClassResult** records the confidence scores assigned to each of a set of competing classes, as is output by an image-classification ML model. | ||
|
||
### I/O Functions: | ||
|
||
**ImgLister** defines an interface for reading from a set of images (in this case, DEXA spines) to be | ||
analyzed. Two implementing classes are provided: **ImgDirLister** iterates through all image files | ||
in a given directory, while **ImgFileLister** iterates through a set of image files listed in a text | ||
file (allowing the image files to be distributed across many directories). | ||
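
The interface-plus-two-implementations pattern described here might look like the sketch below. It only illustrates the design: the method signatures, the accepted file extensions and the one-path-per-line list format are assumptions, not the repository's actual code.

```python
import os
from abc import ABC, abstractmethod
from typing import Iterator

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")  # assumed set of image types


class ImgLister(ABC):
    """Source of image file paths to be analyzed."""

    @abstractmethod
    def __iter__(self) -> Iterator[str]:
        ...


class ImgDirLister(ImgLister):
    """Yields every image file found directly inside one directory."""

    def __init__(self, directory: str) -> None:
        self._directory = directory

    def __iter__(self) -> Iterator[str]:
        for name in sorted(os.listdir(self._directory)):
            if name.lower().endswith(IMAGE_EXTENSIONS):
                yield os.path.join(self._directory, name)


class ImgFileLister(ImgLister):
    """Yields image paths listed one per line in a text file."""

    def __init__(self, list_file: str) -> None:
        self._list_file = list_file

    def __iter__(self) -> Iterator[str]:
        with open(self._list_file) as handle:
            for line in handle:
                path = line.strip()
                if path:  # skip blank lines
                    yield path
```

Code that consumes an `ImgLister` can then loop over it without knowing where the paths come from.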
|
||
**ProgressWriter** defines a listener interface for reporting progress of the DISH scoring tool across a data | ||
set. Two implementing classes are provided: **DotWriter** prints dots to the shell as progress is made, while | ||
**NullDotWriter** does nothing (this allows results printed to stdout to be uncluttered by progress reports). | ||
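
A listener of this kind can be sketched as follows. The method names `step` and `finish` and the choice of output stream are assumptions made for the example; the real classes may expose a different interface.

```python
import sys
from abc import ABC, abstractmethod
from typing import Optional, TextIO


class ProgressWriter(ABC):
    """Listener notified as each image is processed."""

    @abstractmethod
    def step(self) -> None:
        ...

    @abstractmethod
    def finish(self) -> None:
        ...


class DotWriter(ProgressWriter):
    """Writes one dot per processed image, then a newline at the end."""

    def __init__(self, stream: Optional[TextIO] = None) -> None:
        # The default stream is an assumption of this sketch.
        self._stream = stream if stream is not None else sys.stderr

    def step(self) -> None:
        self._stream.write(".")
        self._stream.flush()

    def finish(self) -> None:
        self._stream.write("\n")
        self._stream.flush()


class NullDotWriter(ProgressWriter):
    """Silently ignores all progress events."""

    def step(self) -> None:
        pass

    def finish(self) -> None:
        pass
```

Because both classes share one interface, the scoring loop can call `step()` unconditionally and leave the choice of reporter to the caller.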
|
||
### Task management: | ||
|
||
**ImgDirScorer** executes DISH scoring across a series of DEXA images, defined by its stored **ImgLister** instance. | ||
|
||
**PerformanceAnalyzer** executes DISH scoring across a series of images stored in an annotation file (listing a | ||
score for each image). Rather than output those scores, it prints the results of a statistical analysis of the | ||
correlative relationship between the given & determined values. | ||
|
||
Additional documentation is found within the code itself. | ||
|
||
## Support data files: ML models | ||
|
||
In addition to the source code, this pipeline requires two tensorflow | ||
saved-model files (`.pb`) and accompanying | ||
label files. These represent the ML models that are described in | ||
the [methods](analysis.md) documentation. | ||
|
||
In the table below, for each ML model, the model & corresponding label file | ||
are indicated, with a brief description of the model's purpose: | ||
|
||
| Model File | Label File | Purpose | | ||
| ------------------------ | ---------- | ------------- | | ||
| bridgeDetectorModel.pb | bridgeDetectorLabels.pbtxt | Object detector of anterior side of gaps between adjacent vertebrae. | | ||
| bridgeScoreModel.pb | bridgeScoreLabels.txt | Image classification model for identifying the extent of hyperostosis for a given gap between vertebrae. | | ||
|
||
## I/O File Formats | ||
|
||
See input/output descriptions in the [usage instructions](getstarted.md). |