-
Notifications
You must be signed in to change notification settings - Fork 6
/
combineprobforallelecombination.xml
67 lines (46 loc) · 2.45 KB
/
combineprobforallelecombination.xml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
<tool id="combineproballelecom" name="Combine read profile probabilities " version="2.0.0">
<description>from the same allele combination</description>
<command interpreter="python2.7">combinedprobforallelecombination.py $input > $output </command>
<inputs>
<param name="input" type="data" label="Select microsatellite length profile" />
</inputs>
<outputs>
<data name="output" format="tabular" />
</outputs>
<tests>
<!-- Test data with valid values -->
<test>
<param name="input" value="probvalueforhetero_out.txt"/>
<output name="output" file="combineprob_out.txt"/>
</test>
</tests>
<help>
.. class:: infomark
**What it does**
- This tool will combine the read profile probabilities for each allele combination in the input and calculates the probability to detect heterozygote for each allele combination and each depth.
**Citation**
When you use this tool, please cite **Fungtammasan A, Ananda G, Hile SE, Su MS, Sun C, Harris R, Medvedev P, Eckert K, Makova KD. 2015. Accurate Typing of Short Tandem Repeats from Genome-wide Sequencing Data and its Applications, Genome Research**
**Input**
The input format is the same as output from **Evaluate the probability of the allele combination to generate read profile** tool.
- Column 1 = location of STR locus.
- Column 2 = length profile (length of STR in each read that mapped to this location in comma separated format).
- Column 3 = motif of STR in this locus. The input file can contain more than three columns.
- Column 4 = homozygote/heterozygote label.
- Column 5 = log based 10 of (the probability of homozygote/the probability of heterozygote)
- Column 6 = Allele for most probable homozygote.
- Column 7 = Allele 1 for most probable heterozygote.
- Column 8 = Allele 2 for most probable heterozygote.
- Column 9 = Probability of the allele combination to generate given read profile.
- Column 10 = Number of possible rearrangement of given read profile.
- Column 11 = Probability of the allele combination to generate read profile with any rearrangement (Product of column 9 and column 10)
- Column 12 = Read depth
Only columns 2,3,4,7,8,11 were used in calculation.
**Output**
The output will contain the following header and columns
- Line 1 header: read_depth allele heterozygous_prob motif
- Column 1 = read depth
- Column 2 = allele combination
- Column 3 = probability to detect heterozygote of that allele combination
- Column 4 = motif
</help>
</tool>