Reference store sequences as reference #53

Open
dgoswamia opened this issue Jan 19, 2024 · 1 comment

dgoswamia commented Jan 19, 2024

How do I use sequences stored in a reference store as the reference for a workflow run? I followed a few of the workshops, which showed how to create a reference store and import a human reference sequence, but I have not seen how to point to a stored reference in parameter-input.json.

For example, in the GATK workflow workshop, the workflow was run on FASTQ sequences registered as a read set, using the following parameter-input.json with a reference stored in S3:

```
{
  "sample_name": "NA12878_20K",
  "fastq_1": "omics://account-id.storage.region.amazonaws.com/sequence-store-id/readSet/read-set-id/source1",
  "fastq_2": "omics://account-id.storage.region.amazonaws.com/sequence-store-id/readSet/read-set-id/source2",
  "readgroup_name": "NA12878",
  "library_name": "Solexa-NA12878",
  "platform_name": "Illumina",
  "run_date": "2016-09-01T02:00:00+0200",
  "sequencing_center": "ABCD",
  "ref_fasta": "s3://aws-genomics-static-<AWS_REGION>/omics-workshop/data/references/hg38/Homo_sapiens_assembly38.fasta",
  "dbSNP_vcf": "s3://aws-genomics-static-<AWS_REGION>/omics-workshop/data/references/hg38/Homo_sapiens_assembly38.dbsnp138.vcf",
  "Mills_1000G_indels_vcf": "s3://aws-genomics-static-<AWS_REGION>/omics-workshop/data/references/hg38/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz",
  "known_indels_vcf": "s3://aws-genomics-static-<AWS_REGION>/omics-workshop/data/references/hg38/Homo_sapiens_assembly38.known_indels.vcf.gz",
  "scattered_calling_intervals_archive": "s3://aws-genomics-static-<AWS_REGION>/omics-workshop/intervals.tar.gz",
  "gatk_docker": "AWS_ACCOUNT_ID.dkr.ecr.<AWS_REGION>.amazonaws.com/gatk:4.1.9.0",
  "gotc_docker": "AWS_ACCOUNT_ID.dkr.ecr.<AWS_REGION>.amazonaws.com/genomes-in-the-cloud:2.4.7-1603303710"
}
```

Do you have any suggestions on how to update this parameter-input.json so that "ref_fasta" points to the following reference in the reference store?

omics:us-east-1:account_ID:referenceStore/3242349265/reference/8625408453

Contributor

djemec commented Mar 19, 2024

Similar to a read set in a sequence store, you can refer to a file in a reference store using:
omics://account-id.storage.region.amazonaws.com/reference-store-id/reference/reference-id/source or
omics://account-id.storage.region.amazonaws.com/reference-store-id/reference/reference-id/index
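
As a minimal sketch of how that pattern could be applied to the question above, the snippet below builds the URI and sets it as "ref_fasta". The helper name `omics_reference_uri` and the account ID `123456789012` are hypothetical placeholders; the store and reference IDs are the ones quoted in the question, and the URI layout is assumed to be exactly the one given in this comment. Whether a given workflow also needs the `/index` URI passed as a separate parameter depends on the workflow definition.

```python
import json


def omics_reference_uri(account_id, region, reference_store_id, reference_id, part="source"):
    """Build an omics:// URI for a reference store file (hypothetical helper).

    Follows the pattern from the comment above:
    omics://account-id.storage.region.amazonaws.com/reference-store-id/reference/reference-id/<part>
    """
    if part not in ("source", "index"):
        raise ValueError("part must be 'source' or 'index'")
    return (
        f"omics://{account_id}.storage.{region}.amazonaws.com/"
        f"{reference_store_id}/reference/{reference_id}/{part}"
    )


# Only one of the existing parameters is shown here; the rest stay unchanged.
params = {"sample_name": "NA12878_20K"}

# 123456789012 is a placeholder account ID; replace it with your own.
params["ref_fasta"] = omics_reference_uri(
    account_id="123456789012",
    region="us-east-1",
    reference_store_id="3242349265",
    reference_id="8625408453",
)

print(json.dumps(params, indent=2))
```

The resulting "ref_fasta" value can be pasted into parameter-input.json in place of the s3:// path.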
