Changes to vcf2scoary (#63)
Bug fixes in vcf2scoary: Should now split correctly even when there are commas in values.
AdmiralenOla authored Oct 4, 2017
1 parent 36d2e76 commit b713e10
Showing 4 changed files with 20 additions and 22 deletions.
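The commit message says the split should now work even when values contain commas. A minimal sketch of that idea follows, using the same lookahead pattern that appears in the `vcf2scoary.py` hunk of this commit. The helper name `split_meta_fields` and the sample `##INFO` body are hypothetical and only serve as illustration; they are not part of Scoary.

```python
import re

# Split on a comma only if an even number of double quotes follows it,
# i.e. only if the comma sits outside a quoted value.
QUOTE_AWARE_COMMA = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')


def split_meta_fields(body):
    """Hypothetical helper: split the <...> body of a VCF meta line into fields."""
    return QUOTE_AWARE_COMMA.split(body.strip("<>"))


# Made-up meta line body whose Description contains a comma.
meta = '<ID=DP,Number=1,Type=Integer,Description="Total depth, all samples">'

naive = meta.strip("<>").split(",")   # breaks the Description in two
aware = split_meta_fields(meta)       # keeps the Description intact

print(naive)
print(aware)
```

The lookahead assumes that double quotes in the line are balanced; an unbalanced quote would change which commas count as separators.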
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -1,4 +1,7 @@
# CHANGELOG
+v1.6.16 (Aug 2017)
+- Bug fixes to vcf2scoary
+
v1.6.15 (Jul 2017)
- Apparently 1.6.14 did not fix the pip issue for all users. Deleted the function that seemed to cause the pip crash

16 changes: 2 additions & 14 deletions README.md
@@ -32,20 +32,8 @@ Scoary is designed to take the gene_presence_absence.csv file from [Roary](https

## What's new?

-**LATEST VERSION - 1.6.15**
-- Some people still have problems with pip installs. Removed the readme() function from setup
-
-**1.6.14**
-- Fixed a bug where scoary could not be upgraded using pip
-
-**1.6.13**
-- Fixed bug where handling of converted VCF files would fail due to non-Roary column names
-- Fixed the ALL keyword used with --include_input_columns
-
-**1.6.12**
-- Convert VCF files to Roary/Scoary format, allowing analysis on a wide range of variants (SNPs, indels, structural variations etc)
-- Grab columns from the Roary input and put in the output (To get strain-specific protein names, for example)
-- Scoary now comes with a manual, located under docs/tex/scoary_manual.pdf
+**LATEST VERSION - 1.6.16**
+- Bug fixes to vcf2scoary

All changes are logged in the [CHANGELOG](CHANGELOG.md)

2 changes: 1 addition & 1 deletion scoary/__init__.py
@@ -1,4 +1,4 @@
-__version__ = '1.6.15'
+__version__ = '1.6.16'
__author__ = 'Ola Brynildsrud'
__credits__ = ['Ola Brynildsrud']
__license__ = 'GPL3'
21 changes: 14 additions & 7 deletions scoary/vcf2scoary.py
@@ -101,7 +101,7 @@ def main():
sys.exit("Unable to locate input file %s" % args.vcf)

with open(args.vcf,'rU') as vcffile, open(args.out,'w') as outfile:
-lines = csv.reader(vcffile, delimiter="\t")
+lines = csv.reader(vcffile, delimiter='\t', quotechar='"')
metainfo = {"##INFO" : {},
"##FILTER" : {},
"##FORMAT" : {},
@@ -125,7 +125,7 @@ def main():
# Capture list output for complex tags
if infoline[0] in metainfo:
ID=re.search(r'ID=(\w+)',infoline[1]).group(1)
-infolist = re.split(',',infoline[1].strip("<>"))
+infolist = re.split(',(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)',infoline[1].strip("<>"))
metainfo[infoline[0]][ID] = {}
# Enter all elements in infolist into appropriate dic
for e in infolist:
@@ -203,11 +203,18 @@ def writeLine(line, outfile):

 def fixdummy(line,c):
     newline = line[:]
-    for x in range(len(line)):
-        if int(line[x]) == c:
-            newline[x] = "1"
-        else:
-            newline[x] = "0"
+    try:
+        for x in range(len(line)):
+            if line[x] == ".":
+                # Missing data get entered as reference / no presence
+                newline[x] = "0"
+            elif int(line[x]) == c:
+                newline[x] = "1"
+            else:
+                newline[x] = "0"
+    except ValueError:
+        print(newline, c)
+        sys.exit(-1)
     return newline

########
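The `fixdummy` change above can be read as "one presence/absence row per allele index, with `.` treated as absent". The following is a standalone sketch of that behaviour under stated assumptions: the name `fix_dummy`, the sample genotype calls, and the idea of calling it once per alternate allele are illustrative and not taken from the Scoary source. Unlike the committed code, which prints the row and exits on a `ValueError`, this sketch lets the exception propagate.

```python
def fix_dummy(calls, allele_index):
    """Hypothetical rewrite: return a "0"/"1" row for one allele index.

    A "." (missing call) is entered as "0", i.e. reference / no presence.
    Any other non-integer value raises ValueError.
    """
    row = []
    for call in calls:
        if call == ".":
            row.append("0")
        elif int(call) == allele_index:
            row.append("1")
        else:
            row.append("0")
    return row


# Made-up genotype calls for five samples at one site with two ALT alleles.
calls = ["0", "1", ".", "2", "1"]

alt1 = fix_dummy(calls, 1)
alt2 = fix_dummy(calls, 2)

print(alt1)
print(alt2)
```

Before this commit, a `.` in the input would have reached `int(".")` and raised an uncaught `ValueError`; the new branch avoids that by checking for `.` first.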