New flu subtypes (H5NX, H7NX, H9NX) (#620)
* Add new subtypes: H5NX, H7NX, H9NX. Also, add putative signal peptides for B subtypes (B-vic, B-yam)

* Update flu ingest structure

* Use configurable min_date for global_seq_data analysis

* Shorten sequence mutation table name to accommodate longer partition names

* Add new subtypes to flu config files
atc3 authored May 1, 2023
1 parent c91fdb3 commit 4598e5a
Showing 69 changed files with 1,613 additions and 117 deletions.
9 changes: 9 additions & 0 deletions config/config_flu_genbank.yaml
@@ -80,6 +80,9 @@ report_group_references:
B-yam: B-Phuket-3073-2013
H1N1: A-Wisconsin-67-2022
H3N2: A-Darwin-6-2021
H5NX: A-Goose-Guangdong-1-96
H7NX: A-Shanghai-02-2013
H9NX: A-Hong-Kong-1073-99

# Surveillance plot options
# see: workflow_main/scripts/surveillance.py
@@ -95,6 +98,9 @@ surv_group_references:
B-yam: B-Phuket-3073-2013
H1N1: A-Wisconsin-67-2022
H3N2: A-Darwin-6-2021
H5NX: A-Goose-Guangdong-1-96
H7NX: A-Shanghai-02-2013
H9NX: A-Hong-Kong-1073-99

# ---------------
# DATABASE
@@ -134,6 +140,9 @@ default_references:
B-yam: B-Phuket-3073-2013
H1N1: A-Wisconsin-67-2022
H3N2: A-Darwin-6-2021
H5NX: A-Goose-Guangdong-1-96
H7NX: A-Shanghai-02-2013
H9NX: A-Hong-Kong-1073-99

# Home page
show_home_banner: false
9 changes: 9 additions & 0 deletions config/config_flu_gisaid.yaml
@@ -84,6 +84,9 @@ report_group_references:
B-yam: B-Phuket-3073-2013
H1N1: A-Wisconsin-67-2022
H3N2: A-Darwin-6-2021
H5NX: A-Goose-Guangdong-1-96
H7NX: A-Shanghai-02-2013
H9NX: A-Hong-Kong-1073-99

# Surveillance plot options
# see: workflow_main/scripts/surveillance.py
@@ -99,6 +102,9 @@ surv_group_references:
B-yam: B-Phuket-3073-2013
H1N1: A-Wisconsin-67-2022
H3N2: A-Darwin-6-2021
H5NX: A-Goose-Guangdong-1-96
H7NX: A-Shanghai-02-2013
H9NX: A-Hong-Kong-1073-99

# ---------------
# DATABASE
@@ -139,6 +145,9 @@ default_references:
B-yam: B-Phuket-3073-2013
H1N1: A-Wisconsin-67-2022
H3N2: A-Darwin-6-2021
H5NX: A-Goose-Guangdong-1-96
H7NX: A-Shanghai-02-2013
H9NX: A-Hong-Kong-1073-99

# Home page
show_home_banner: false
9 changes: 9 additions & 0 deletions config/config_flu_gisaid_dev.yaml
@@ -84,6 +84,9 @@ report_group_references:
B-yam: B-Phuket-3073-2013
H1N1: A-Wisconsin-67-2022
H3N2: A-Darwin-6-2021
H5NX: A-Goose-Guangdong-1-96
H7NX: A-Shanghai-02-2013
H9NX: A-Hong-Kong-1073-99

# Surveillance plot options
# see: workflow_main/scripts/surveillance.py
@@ -99,6 +102,9 @@ surv_group_references:
B-yam: B-Phuket-3073-2013
H1N1: A-Wisconsin-67-2022
H3N2: A-Darwin-6-2021
H5NX: A-Goose-Guangdong-1-96
H7NX: A-Shanghai-02-2013
H9NX: A-Hong-Kong-1073-99

# ---------------
# DATABASE
@@ -139,6 +145,9 @@ default_references:
B-yam: B-Phuket-3073-2013
H1N1: A-Wisconsin-67-2022
H3N2: A-Darwin-6-2021
H5NX: A-Goose-Guangdong-1-96
H7NX: A-Shanghai-02-2013
H9NX: A-Hong-Kong-1073-99

# Home page
show_home_banner: false
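
The three config files above each add the same H5NX, H7NX, and H9NX entries to `report_group_references`, `surv_group_references`, and `default_references`. A small consistency check along the following lines could catch a subtype that was added to one map but not the others. This is only a sketch: the helper name `missing_subtypes` is hypothetical, and the dictionary lists only the entries visible in the diff context, not the full contents of the real files.

```python
# Reference maps as they appear in the hunks above. Only the entries visible
# in the diff context are listed; the real files contain more keys.
VISIBLE_REFERENCES = {
    "B-yam": "B-Phuket-3073-2013",
    "H1N1": "A-Wisconsin-67-2022",
    "H3N2": "A-Darwin-6-2021",
    "H5NX": "A-Goose-Guangdong-1-96",
    "H7NX": "A-Shanghai-02-2013",
    "H9NX": "A-Hong-Kong-1073-99",
}

SECTIONS = (
    "report_group_references",
    "surv_group_references",
    "default_references",
)


def missing_subtypes(config, sections=SECTIONS):
    """Return {section: [subtypes absent from it]} for sections with gaps."""
    # Collect every subtype mentioned in any of the sections.
    all_subtypes = set()
    for section in sections:
        all_subtypes.update(config.get(section, {}))

    # Report the subtypes each section lacks relative to that union.
    gaps = {}
    for section in sections:
        absent = sorted(all_subtypes - set(config.get(section, {})))
        if absent:
            gaps[section] = absent
    return gaps


# Stand-in for a parsed YAML config (hypothetical; built from the diff context).
config = {section: dict(VISIBLE_REFERENCES) for section in SECTIONS}
print(missing_subtypes(config))  # empty dict when all three maps agree
```

In practice the `config` dictionary would come from parsing one of the YAML files rather than being built inline.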
4 changes: 2 additions & 2 deletions services/server/cg_server/db_seed/seed.py
@@ -480,7 +480,7 @@ def seed_database(conn, schema="public"):
# Clean up the reference name as a SQL ident - no dots
reference_name_sql = reference_name.replace(".", "_")

reference_partition_name = f"{table_name}_{reference_name_sql}"
reference_partition_name = f"seqmut_{mutation_field}_{reference_name_sql}"

# Create reference partition
cur.execute(
@@ -511,7 +511,7 @@ def seed_database(conn, schema="public"):
"""
).format(
date_partition_name=sql.Identifier(
f"{table_name}_{reference_name_sql}_{i}"
f"seqmut_{mutation_field}_{reference_name_sql}_{i}"
),
reference_partition_name=sql.Identifier(
reference_partition_name
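
The two hunks above replace the `{table_name}_...` prefix with `seqmut_{mutation_field}_...` in both the reference partition name and the per-date partition name. The commit message says only that the name was shortened "to accommodate longer partition names". One plausible reason, stated here as an assumption, is PostgreSQL's identifier limit: names longer than 63 bytes (with the default `NAMEDATALEN`) are truncated, so long partition names that differ only in their suffix could collide. The sketch below rebuilds the new naming scheme and checks it against that limit. The function names, the `"dna"` value for `mutation_field`, and the partition count are hypothetical.

```python
# PostgreSQL truncates identifiers longer than NAMEDATALEN - 1 bytes
# (63 with the default build).
PG_MAX_IDENTIFIER_BYTES = 63


def partition_names(mutation_field, reference_name, num_date_partitions):
    """Build reference and date partition names following the new scheme."""
    # Same clean-up as in seed.py: no dots in the SQL identifier.
    reference_name_sql = reference_name.replace(".", "_")
    reference_partition = f"seqmut_{mutation_field}_{reference_name_sql}"
    # Assumption: date partitions are numbered 0..n-1; the diff only shows
    # that an index `i` is appended.
    date_partitions = [
        f"{reference_partition}_{i}" for i in range(num_date_partitions)
    ]
    return reference_partition, date_partitions


def too_long(names):
    """Return the names that exceed PostgreSQL's default identifier limit."""
    return [n for n in names if len(n.encode("utf-8")) > PG_MAX_IDENTIFIER_BYTES]


# "dna" and the partition count of 4 are hypothetical values for illustration.
ref_part, date_parts = partition_names("dna", "A-Goose-Guangdong-1-96", 4)
print(ref_part)
print(too_long([ref_part, *date_parts]))
```

A check like `too_long` could run before the `CREATE TABLE ... PARTITION OF` statements so that an over-long name fails loudly instead of being truncated by the database.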
Binary file modified static_data/flu/alignments/B-vic_4_FLBHAAA.dna
Binary file not shown.
Binary file modified static_data/flu/alignments/B-yam_4_MK715607.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H5N1_1_NC_007357.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H5N1_2_NC_007358.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H5N1_3_NC_007359.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H5N1_4_NC_007362.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H5N1_5_NC_007360.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H5N1_6_NC_007361.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H5N1_7_NC_007363.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H5N1_8_NC_007364.dna
Binary file not shown.
Binary file not shown.
Binary file added static_data/flu/alignments/H7N2_2_NC_026423.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H7N2_3_NC_026424.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H7N2_4_NC_026425.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H7N2_5_NC_026426.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H7N2_6_NC_026429.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H7N2_7_NC_026427.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H7N2_8_NC_026428.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H9N2_1_NC_004910.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H9N2_2_NC_004911.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H9N2_3_NC_004912.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H9N2_4_NC_004908.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H9N2_5_NC_004905.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H9N2_6_NC_004909.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H9N2_7_NC_004907.dna
Binary file not shown.
Binary file added static_data/flu/alignments/H9N2_8_NC_004906.dna
Binary file not shown.
Binary file added static_data/flu/alignments/HA_all.praln
Binary file not shown.
18 changes: 9 additions & 9 deletions static_data/flu/genes.csv
@@ -130,7 +130,7 @@ H5NX,A-Goose-Guangdong-1-96,NS1,8,15..707,1,0,[],
H7NX,A-Shanghai-02-2013,PB2,1,1..2280,1,0,[],
H7NX,A-Shanghai-02-2013,PB1,2,1..2274,1,0,[],
H7NX,A-Shanghai-02-2013,PA,3,1..2151,1,0,[],
H7NX,A-Shanghai-02-2013,HA,4,1..1683,1,16,"[{""name"": ""signal peptide"", ""ranges"": [[1, 18]]}, {""name"": ""HA1"", ""ranges"": [[19, 339]]}, {""name"": ""HA2"", ""ranges"": [[340, 560]]}]",
H7NX,A-Shanghai-02-2013,HA,4,1..1683,1,18,"[{""name"": ""signal peptide"", ""ranges"": [[1, 18]]}, {""name"": ""HA1"", ""ranges"": [[19, 339]]}, {""name"": ""HA2"", ""ranges"": [[340, 560]]}]",
H7NX,A-Shanghai-02-2013,NP,5,1..1497,1,0,[],
H7NX,A-Shanghai-02-2013,NA,6,1..1398,1,0,[],
H7NX,A-Shanghai-02-2013,M1,7,1..759,1,0,[],
@@ -140,55 +140,55 @@ H7NX,A-Shanghai-02-2013,NS1,8,1..654,1,0,[],
H9NX,A-Hong-Kong-1073-99,PB2,1,28..2307,1,0,[],
H9NX,A-Hong-Kong-1073-99,PB1,2,24..2300,1,0,[],
H9NX,A-Hong-Kong-1073-99,PA,3,21..2171,1,0,[],
H9NX,A-Hong-Kong-1073-99,HA,4,1..1714,1,16,"[{""name"": ""signal peptide"", ""ranges"": [[1, 18]]}, {""name"": ""HA1"", ""ranges"": [[19, 338]]}, {""name"": ""HA2"", ""ranges"": [[339, 560]]}]",
H9NX,A-Hong-Kong-1073-99,HA,4,1..1714,1,18,"[{""name"": ""signal peptide"", ""ranges"": [[1, 18]]}, {""name"": ""HA1"", ""ranges"": [[19, 338]]}, {""name"": ""HA2"", ""ranges"": [[339, 560]]}]",
H9NX,A-Hong-Kong-1073-99,NP,5,40..1536,1,0,[],
H9NX,A-Hong-Kong-1073-99,NA,6,1..1404,1,0,[],
H9NX,A-Hong-Kong-1073-99,M1,7,33..791,1,0,[],
H9NX,A-Hong-Kong-1073-99,M2,7,33..59;748..1014,1,0,[],
H9NX,A-Hong-Kong-1073-99,NEP,8,27..56;529..864,1,0,[],
H9NX,A-Hong-Kong-1073-99,NS1,8,27..719,1,0,[],
B-yam,B-Massachusetts-02-2012,HA,4,1..1755,1,0,[],
B-yam,B-Massachusetts-02-2012,HA,4,1..1755,1,15,"[{""name"": ""signal peptide"", ""ranges"": [[1, 15]]}, {""name"": ""HA1"", ""ranges"": [[16, 361]]}, {""name"": ""HA2"", ""ranges"": [[362, 584]]}]",
B-yam,B-Massachusetts-02-2012,NA,6,1..1401,1,0,[],
B-yam,B-Massachusetts-02-2012,NB,6,3..296,1,0,[],
B-yam,B-Massachusetts-02-2012,NEP,8,1..33;689..1024,1,0,[],
B-yam,B-Massachusetts-02-2012,NS1,8,1..846,1,0,[],
B-yam,B-Phuket-3073-2013,PB2,1,1..2313,1,0,[],
B-yam,B-Phuket-3073-2013,PB1,2,1..2259,1,0,[],
B-yam,B-Phuket-3073-2013,PA,3,1..2181,1,0,[],
B-yam,B-Phuket-3073-2013,HA,4,1..1755,1,0,[],
B-yam,B-Phuket-3073-2013,HA,4,1..1755,1,15,"[{""name"": ""signal peptide"", ""ranges"": [[1, 15]]}, {""name"": ""HA1"", ""ranges"": [[16, 361]]}, {""name"": ""HA2"", ""ranges"": [[362, 584]]}]",
B-yam,B-Phuket-3073-2013,NP,5,1..1683,1,0,[],
B-yam,B-Phuket-3073-2013,NA,6,34..1434,1,0,[],
B-yam,B-Phuket-3073-2013,NB,6,27..329,1,0,[],
B-yam,B-Phuket-3073-2013,M,7,1..747,1,0,[],
B-yam,B-Phuket-3073-2013,BM2,7,747..1076,1,0,[],
B-yam,B-Phuket-3073-2013,NEP,8,1..33;689..1024,1,0,[],
B-yam,B-Phuket-3073-2013,NS1,8,1..846,1,0,[],
B-yam,B-Wisconsin-01-2010,HA,4,1..1755,1,0,[],
B-yam,B-Wisconsin-01-2010,HA,4,1..1755,1,15,"[{""name"": ""signal peptide"", ""ranges"": [[1, 15]]}, {""name"": ""HA1"", ""ranges"": [[16, 361]]}, {""name"": ""HA2"", ""ranges"": [[362, 584]]}]",
B-yam,B-Wisconsin-01-2010,NA,6,1..1401,1,0,[],
B-yam,B-Wisconsin-01-2010,NB,6,3..296,1,0,[],
B-yam,B-Wisconsin-01-2010,NEP,8,1..33;689..1024,1,0,[],
B-yam,B-Wisconsin-01-2010,NS1,8,1..846,1,0,[],
B-vic,B-Austria-1359417-2021,PB2,1,1..2313,1,0,[],
B-vic,B-Austria-1359417-2021,PB1,2,1..2259,1,0,[],
B-vic,B-Austria-1359417-2021,PA,3,1..2181,1,0,[],
B-vic,B-Austria-1359417-2021,HA,4,1..1749,1,0,[],
B-vic,B-Austria-1359417-2021,HA,4,1..1749,1,15,"[{""name"": ""signal peptide"", ""ranges"": [[1, 15]]}, {""name"": ""HA1"", ""ranges"": [[16, 359]]}, {""name"": ""HA2"", ""ranges"": [[360, 582]]}]",
B-vic,B-Austria-1359417-2021,NP,5,1..1683,1,0,[],
B-vic,B-Austria-1359417-2021,NA,6,8..1408,1,0,[],
B-vic,B-Austria-1359417-2021,NB,6,1..303,1,0,[],
B-vic,B-Austria-1359417-2021,M,7,1..747,1,0,[],
B-vic,B-Austria-1359417-2021,BM2,7,747..1076,1,0,[],
B-vic,B-Austria-1359417-2021,NEP,8,1..36;692..1027,1,0,[],
B-vic,B-Austria-1359417-2021,NS1,8,1..849,1,0,[],
B-vic,B-Brisbane-60-2008,HA,4,34..1791,1,0,[],
B-vic,B-Colorado-06-2017,HA,4,34..1785,1,0,[],
B-vic,B-Brisbane-60-2008,HA,4,34..1791,1,15,"[{""name"": ""signal peptide"", ""ranges"": [[1, 15]]}, {""name"": ""HA1"", ""ranges"": [[16, 362]]}, {""name"": ""HA2"", ""ranges"": [[363, 585]]}]",
B-vic,B-Colorado-06-2017,HA,4,34..1785,1,15,"[{""name"": ""signal peptide"", ""ranges"": [[1, 15]]}, {""name"": ""HA1"", ""ranges"": [[16, 360]]}, {""name"": ""HA2"", ""ranges"": [[361, 583]]}]",
B-vic,B-Colorado-06-2017,NA,6,54..1454,1,0,[],
B-vic,B-Colorado-06-2017,NB,6,47..349,1,0,[],
B-vic,B-Colorado-06-2017,M,7,25..771,1,0,[],
B-vic,B-Colorado-06-2017,BM2,7,771..1100,1,0,[],
B-vic,B-Washington-02-2019,PB2,1,10..2322,1,0,[],
B-vic,B-Washington-02-2019,PB1,2,8..2266,1,0,[],
B-vic,B-Washington-02-2019,PA,3,16..2196,1,0,[],
B-vic,B-Washington-02-2019,HA,4,20..1768,1,0,[],
B-vic,B-Washington-02-2019,HA,4,20..1768,1,15,"[{""name"": ""signal peptide"", ""ranges"": [[1, 15]]}, {""name"": ""HA1"", ""ranges"": [[16, 359]]}, {""name"": ""HA2"", ""ranges"": [[360, 582]]}]",
B-vic,B-Washington-02-2019,NP,5,47..1729,1,0,[],
B-vic,B-Washington-02-2019,NA,6,40..1440,1,0,[],
B-vic,B-Washington-02-2019,NB,6,33..335,1,0,[],
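
In every changed HA row above, the seventh field now equals the end of the `signal peptide` range (18 for H7NX and H9NX, 15 for the B lineages), and the domain ranges run back to back from the signal peptide through HA1 to HA2. The column headers are not part of this diff, so the meaning of the seventh field is inferred from the rows alone. Under that assumption, the sketch below parses two of the new rows with the standard `csv` and `json` modules and checks both properties. The function name `check_ha_row` is hypothetical.

```python
import csv
import io
import json

# Two of the new rows, copied from the diff above.
ROWS = '''H7NX,A-Shanghai-02-2013,HA,4,1..1683,1,18,"[{""name"": ""signal peptide"", ""ranges"": [[1, 18]]}, {""name"": ""HA1"", ""ranges"": [[19, 339]]}, {""name"": ""HA2"", ""ranges"": [[340, 560]]}]",
B-vic,B-Colorado-06-2017,HA,4,34..1785,1,15,"[{""name"": ""signal peptide"", ""ranges"": [[1, 15]]}, {""name"": ""HA1"", ""ranges"": [[16, 360]]}, {""name"": ""HA2"", ""ranges"": [[361, 583]]}]",
'''


def check_ha_row(row):
    """Check one genes.csv row: domain contiguity and the seventh field."""
    # Assumption: field 7 (index 6) is a residue offset tied to the signal
    # peptide; the header is not shown in the diff.
    offset = int(row[6])
    # Field 8 (index 7) holds the domain list as JSON.
    domains = json.loads(row[7])

    # Flatten all ranges in listed order and require each to start right
    # after the previous one ends.
    ranges = [tuple(r) for d in domains for r in d["ranges"]]
    contiguous = all(
        nxt[0] == prev[1] + 1 for prev, nxt in zip(ranges, ranges[1:])
    )

    # Last residue of the signal peptide, if the row defines one.
    signal_end = next(
        (d["ranges"][-1][1] for d in domains if d["name"] == "signal peptide"),
        None,
    )
    return {
        "reference": row[1],
        "contiguous": contiguous,
        "offset_matches_signal": signal_end == offset,
    }


results = [check_ha_row(row) for row in csv.reader(io.StringIO(ROWS))]
for result in results:
    print(result)
```

The `csv` module handles the doubled quotes (`""`) inside the JSON field, so the domain list can be passed straight to `json.loads` after parsing.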