Merge upstream into fork
splichte authored and Ubuntu committed Feb 11, 2019
2 parents 70cde01 + a7622f2 commit a362541
Showing 12 changed files with 168 additions and 139 deletions.
11 changes: 3 additions & 8 deletions .travis.yml
@@ -1,12 +1,11 @@
language: c
env:
- BRANCH=master
- BRANCH=devel

before_install:
- export PATH="nim-$BRANCH/bin/nim/bin:${PATH:+:$PATH}"
- export LD_LIBRARY_PATH=./htslib/
- export PATH="nim-$BRANCH/bin${PATH:+:$PATH}"
- export PATH="$(pwd)/nimble/src:$PATH"
- export PATH="$TRAVIS_BUILD_DIR/nim-$BRANCH/bin:$PATH"
- export NIM_LIB_PREFIX=$TRAVIS_BUILD_DIR/nim-$BRANCH/

install:
- bash ./scripts/install.sh
@@ -15,10 +14,6 @@ script:
- bash functional-tests.sh
- nim c -d:release --cc:$CC mosdepth.nim
- ./mosdepth -h
cache:
directories:
- nim-master
- nim-devel
branches:
except:
- gh-pages
15 changes: 15 additions & 0 deletions CHANGES.md
@@ -1,3 +1,18 @@
0.2.5 (dev)
=====
+ remove dependency on PCRE

0.2.4
=====
+ Add optional `--include-flag` to allow counting only reads that have some bits in the specified flag set.
This will only be used rarely--e.g. to count only supplemental reads, use `-F 0 --include-flag 2048`.
+ Fix case when only a single argument was given to --quantize
+ add --read-groups option to allow specifying that only certain read-groups should be used in the depth calculation. (#60)
+ add --fast-mode that does not look at internal cigar operations like (I)insertions or (D)eletions, but does consider soft and
hard-clips at the end of the alignment. Also does not correct for mate overlap. This makes mosdepth as much as **2X faster for
CRAM** and is likely the desired mode for people using the depth for CNV or general coverage values as drops in coverage
due to CIGAR operations are often not of interest for coverage-based analyses.

0.2.3
=====
+ fix bug in region.dist with chromosomes in bam header, but without any reads. thanks (@vladsaveliev for reporting)
22 changes: 19 additions & 3 deletions README.md
@@ -49,11 +49,14 @@ Common Options:
Other options:
-F --flag <FLAG> exclude reads with any of the bits in FLAG set [default: 1796]
-i --include-flag <FLAG> only include reads with any of the bits in FLAG set. default is unset. [default: 0]
-x --fast-mode dont look at internal cigar operations or correct mate overlaps (recommended for most use-cases).
-q --quantize <segments> write quantized output see docs for description.
-Q --mapq <mapq> mapping quality threshold [default: 0]
-T --thresholds <thresholds> for each interval in --by, write number of bases covered by at
least threshold bases. Specify multiple integer values separated
by ','.
-R --read-groups <string> only calculate depth for these comma-separated read groups IDs.
-h --help show help
```
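As a rough illustration of how `-F` and `--include-flag` combine, the sketch below applies both filters to a SAM flag value. It follows the wording of the help text above and is not mosdepth's actual code; the function name `keep_read` and its argument order are hypothetical. The default exclude value 1796 is the sum of the SAM bits 4 (unmapped), 256 (secondary), 512 (failed QC) and 1024 (duplicate), and 2048 marks a supplementary alignment.

```shell
# keep_read FLAG [EXCLUDE] [INCLUDE]
# Hypothetical helper: returns 0 (keep) or 1 (drop) for a SAM flag value.
keep_read() {
    flag=$1
    exclude=${2:-1796}   # same default as -F
    include=${3:-0}      # same default as --include-flag (0 = unset)

    # Drop the read if ANY excluded bit is set.
    [ $(( flag & exclude )) -eq 0 ] || return 1

    # With no include flag, every remaining read is kept.
    [ "$include" -eq 0 ] && return 0

    # Otherwise keep the read only if ANY included bit is set.
    [ $(( flag & include )) -ne 0 ]
}

# Mirrors `-F 0 --include-flag 2048`: only supplementary reads pass.
if keep_read 2048 0 2048; then echo "supplementary read kept"; fi
if ! keep_read 99 0 2048; then echo "ordinary paired read dropped"; fi
```

Under these assumptions, a properly paired read with flag 99 passes the default filter, while the same read marked as a duplicate (flag 1123) does not.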
@@ -86,12 +89,16 @@ The distribution of depths will go to `sample-output.mosdepth.dist.txt`
For 500-base windows

```
mosdepth -n --by 500 sample.wgs $sample.wgs.bam
mosdepth -n --fast-mode --by 500 sample.wgs $sample.wgs.cram
```

`-n` means don't output per-base data, this will make `mosdepth`
a bit faster as there is some cost to outputting that much text.

--fast-mode avoids the extra calculations of mate pair overlap and cigar operations,
and also allows htslib to extract less data from CRAM, providing a substantial speed
improvement.
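
To make the effect of skipping mate-overlap correction concrete, here is a minimal sketch that computes depth from alignment start and end coordinates only, which is roughly the information fast mode relies on. It is an illustration rather than mosdepth's implementation: `fast_mode_depth` is a hypothetical helper, and the two intervals are made-up mates (0-based, half-open) that overlap between positions 6 and 42.

```shell
# fast_mode_depth: read "start<TAB>end" lines on stdin and print
# "start<TAB>end<TAB>depth" segments, counting every alignment in full
# (no CIGAR parsing, no mate-overlap correction).
fast_mode_depth() {
    # Turn each interval into a +1 event at its start and a -1 event at its end.
    awk -F'\t' '{ print $1 "\t1"; print $2 "\t-1" }' |
    sort -k1,1n |
    # Sweep the events in coordinate order, emitting a segment whenever the
    # position changes and the running depth is non-zero.
    awk -F'\t' '
        NR > 1 && $1 != prev && depth > 0 { print prev "\t" $1 "\t" depth }
        { depth += $2; prev = $1 }
    '
}

# Two hypothetical overlapping mates.
printf '0\t42\n6\t80\n' | fast_mode_depth
# prints:
# 0	6	1
# 6	42	2
# 42	80	1
```

The overlapping stretch is reported at depth 2 because each mate is counted separately; with overlap correction the same stretch would be counted once.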

### Callable regions example

To create a set of "callable" regions as in [GATK's callable loci tool](https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_gatk_tools_walkers_coverage_CallableLoci.php):
@@ -128,6 +135,12 @@ This also forces the output to have 5 decimals of precision rather than the defa

The simplest way is to [![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat-square)](http://bioconda.github.io/recipes/mosdepth/README.html)

It can also be installed with `brew` as `brew install brewsci/bio/mosdepth` or used via docker with quay:
```
docker pull quay.io/biocontainers/mosdepth:0.2.4--he527e40_0
docker run -v /hostpath/:/opt/mount quay.io/biocontainers/mosdepth:0.2.4--he527e40_0 mosdepth -n --fast-mode -t 4 --by 1000 /opt/mount/sample /opt/mount/$bam
```

Unless you want to install [nim](https://nim-lang.org), simply download the
[binary from the releases](https://github.com/brentp/mosdepth/releases).

@@ -156,6 +169,9 @@ Then pass that path to mosdepth just like we did with htslib
LD_LIBRARY_PATH=~/src/pcre-8.41/.libs/:/~/src/htslib/ mosdepth -h
```

If you still see an error about `could not import: pcre_free_study` then
for some, the solution has been to do: `ln -s /usr/local/lib/libpcre.so /usr/local/lib/libpcre.so.3`

If you do want to install from source, see the [travis.yml](https://github.com/brentp/mosdepth/blob/master/.travis.yml)
and the [install.sh](https://github.com/brentp/mosdepth/blob/master/scripts/install.sh).

@@ -166,8 +182,8 @@ If you use archlinux, you can [install as a package](https://aur.archlinux.org/p
This is **useful for QC**.

The `$prefix.mosdepth.global.dist.txt` file contains, a cumulative distribution indicating the
proportion of total bases (or the proportion of the `--by` for `$prefix.mosdepth.region.dist.txt) that were covered
for at least a given coverage value. It does this for each chromosom, and for then
proportion of total bases (or the proportion of the `--by` for `$prefix.mosdepth.region.dist.txt`) that were covered
for at least a given coverage value. It does this for each chromosome, and for the
whole genome.

Each row will indicate:
26 changes: 24 additions & 2 deletions functional-tests.sh
@@ -11,7 +11,7 @@ test -e ssshtest || wget -q https://raw.githubusercontent.com/ryanlayer/ssshtest
set -o nounset

set -e
nim c --boundChecks:off mosdepth.nim
nim c --boundChecks:on -x:on mosdepth.nim
set +e
exe=./mosdepth
bam=/data/human/NA12878.subset.bam
@@ -27,6 +27,14 @@ assert_equal "$(zgrep ^MT t.per-base.bed.gz)" "MT 0 80 1
MT 80 16569 0"
assert_equal "$(zgrep -w ^1 t.per-base.bed.gz)" "1 0 249250621 0"

run overlapFastMode $exe t --fast-mode tests/ovl.bam
assert_equal "$(zgrep ^MT t.per-base.bed.gz)" "MT 0 6 1
MT 6 42 2
MT 42 80 1
MT 80 16569 0"
assert_exit_code 0


run missing_chrom $exe -c nonexistent --by 20000 t tests/ovl.bam
assert_in_stderr "[mosdepth] chromosome nonexistent not found"
assert_exit_code 1
@@ -54,6 +62,9 @@ assert_equal "$(zgrep ^MT t.quantized.bed.gz)" "MT 0 80 1:1000
MT 80 16569 0:1"
assert_equal "$(zgrep -w ^1 t.quantized.bed.gz)" "1 0 249250621 0:1"

run single-quant $exe -q 60 t tests/nanopore.bam
assert_exit_code 0


rm -f t.thresholds.bed.gz*
run threshold_test $exe --by 100 -T 0,1,2,3,4,5 -c MT t tests/ovl.bam
@@ -84,10 +95,21 @@ assert_exit_code 1
assert_in_stderr "skipping bad bed line:MT 2"
assert_in_stderr "invalid integer: asdf"

run big_chrom $exe t tests/big.bam

$exe -n t tests/ovl.bam
run test_read_group $exe -n tt tests/ovl.bam -R GT04008021_119
assert_equal $(cat tt.mosdepth.global.dist.txt | wc -l) 4
assert_equal $(diff tt.mosdepth.global.dist.txt t.mosdepth.global.dist.txt | wc -l) 0
assert_exit_code 0

run test_missing_read_group $exe -n tt tests/ovl.bam -R MISSING
assert_equal "$(cat tt.mosdepth.global.dist.txt)" "MT 0 1.00
total 0 1.00"

run big_chrom $exe t tests/big.bam
assert_exit_code 0

rm -f tt.mosdepth.region.dist.txt
rm -f t.mosdepth.region.dist.txt
run empty_tids $exe t -n --thresholds 1,5 --by tests/empty-tids.bed tests/empty-tids.bam
assert_exit_code 0