Refining multiple metadata fields exceptions #19

Open
wants to merge 20 commits into
base: main
8 changes: 5 additions & 3 deletions README.md
@@ -61,23 +61,25 @@ where
* `--genome_info METADATA_FILE`: genomes metadata file in tsv format
* `-m, --mags, -b, --bins`: select for bin or MAG upload. If in doubt, look at [their definition according to ENA](<https://ena-docs.readthedocs.io/en/latest/submit/assembly/metagenome.html>)
* `--out`: output folder (default: working directory)
-* `--force`: forces reset of sample xmls generation
-* `--live`: registers genomes on ENA's live server. Omitting this option lets you validate samples beforehand (the upload command then needs the `-test` option for the test submission to work)
+* `--force`: forces reset of submission xml and ENA backup
+* `--live`: registers genomes on ENA's live server. Omitting this option lets you validate samples beforehand (the upload command then needs the `-test` option for test submissions to work)
* `--webin WEBIN_ID`: webin id (format: Webin_XXXXX)
* `--password PASSWORD`: webin password
* `--centre_name CENTRE_NAME`: name of the centre generating and uploading genomes
* `--tpa`: if uploading TPA (Third PArty) generated genomes

It is recommended to validate your genomes in test mode (i.e. without `--live` in the registration step and with `-test` during the upload) before attempting the final upload. Launching the registration in test mode will add a timestamp to the genome name to allow multiple executions of the test process.
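
The test-mode naming described above could be sketched as follows. This is only an illustration: the function name `registration_name`, its parameters, and the timestamp format are assumptions, not taken from the genome uploader's code.

```python
from datetime import datetime, timezone


def registration_name(genome_name, live=False, now=None):
    """Return the name to register (hypothetical helper, not the tool's API).

    In live mode the name is unchanged. In test mode a timestamp suffix is
    appended so the same genome can be registered repeatedly on the test server.
    """
    if live:
        return genome_name
    now = now or datetime.now(timezone.utc)
    # Assumed suffix format; the real tool may format the timestamp differently.
    return "{}_{}".format(genome_name, now.strftime("%Y%m%d%H%M%S"))
```

Because each test run gets a different suffix, repeated validations should not collide with names registered in earlier test runs.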

-Sample xmls won't be regenerated automatically if a previous xml already exists. If any metadata or value in the tsv table changes, `--force` will allow xml regeneration.
+`submission.xml` and `ENA_backup.json` won't be regenerated automatically if files from a previous run already exist. To regenerate them, use `--force`.
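
A minimal sketch of the reuse-unless-forced behaviour described here, assuming a JSON backup file. `load_or_build_backup` and its `build` callback are hypothetical names used for illustration; the actual implementation in the package may be structured differently.

```python
import json
from pathlib import Path


def load_or_build_backup(backup_path, build, force=False):
    """Reuse an existing backup unless force is set (illustrative only).

    `build` is a caller-supplied function that produces fresh metadata,
    for example by querying ENA; it is only called when needed.
    """
    path = Path(backup_path)
    if path.exists() and not force:
        # A previous run left a backup: reuse it instead of rebuilding.
        with path.open() as handle:
            return json.load(handle)
    data = build()
    with path.open("w") as handle:
        json.dump(data, handle)
    return data
```

With this shape, changing the metadata table has no effect on later runs until `force=True` is passed, which matches the documented need for `--force`.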

### Produced files:
The script produces the following files and folders:
```bash
bin_upload/MAG_upload
├── manifests
│ └── ...
├── manifests_test # folder generated for validation in test mode
│ └── ...
├── ENA_backup.json # backup file to prevent re-download of metadata from ENA. Regeneration can be forced with --force
├── genome_samples.xml # xml generated to register samples on ENA before the upload
├── registered_bins/MAGs.tsv # list of genomes registered on ENA in live mode - needed for manifest generation
```
2 changes: 1 addition & 1 deletion genomeuploader/__init__.py
@@ -1 +1 @@
-__version__ = '2.1.1'
+__version__ = '2.2.0'
6 changes: 3 additions & 3 deletions genomeuploader/constants.py
@@ -452,7 +452,7 @@
"Glorioso Islands",
"Greece",
"Greenland",
-"GrENAda",
+"Grenada",
"Guadeloupe",
"Guam",
"Guatemala",
@@ -568,11 +568,11 @@
"Ross Sea",
"Russia",
"Rwanda",
-"Saint HelENA",
+"Saint Helena",
"Saint Kitts and Nevis",
"Saint Lucia",
"Saint Pierre and Miquelon",
-"Saint Vincent and the GrENAdines",
+"Saint Vincent and the Grenadines",
"Samoa",
"San Marino",
"Sao Tome and Principe",
2 changes: 1 addition & 1 deletion genomeuploader/ena.py
@@ -106,7 +106,7 @@ def get_run(self, run_accession, webin, password, attempt=0, search_params=None)
except (IndexError, TypeError, ValueError):
raise ValueError("Could not find run {} in ENA.".format(run_accession))
except:
-raise Exception("Could not query ENA API: {}".format(response.text))
+raise Exception("Could not query ENA API for run {}: {}".format(run_accession, response.text))

return run
