BCDC Cloud Computing Environment Charter #1
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
# BCDC Cloud Computing Environment (Terra) | ||
## Description | ||
We connect, develop, add value to, and train our community around the BCDC Cloud Computing Environment. This cloud-native environment provides cloud storage, scalable data processing pipelines, and horizontally scalable data-science environments, among other capabilities. | ||
|
||
# Roles | ||
[Principal Investigator] Timothy Tickle ([email protected]) | ||
[Pipelines Development Product Manager] Kylee Degatano ([email protected]) | ||
[Customer Success and Education] Salin Thomas ([email protected]) | ||
[Project Manager] Cara Mason ([email protected]) | ||
|
||
## Definitions | ||
**pipeline:** A collection of one or more functional tasks that operate on input data, either transforming it or deriving features that are often used to interpret it. In high-throughput settings, these tasks are often automated and run in batches. | ||
|
||
## Objectives | ||
- To create and maintain a cloud-native, horizontally scalable environment for use by the BICCN community. | ||
- To work with community members to bring data processing pipelines used by the community or of value to the community into the cloud computing environment so they may be operated at scale. | ||
- To use data-driven feedback from the consortium to update data processing pipelines, focusing on improvements to both scientific and engineering performance. These updates will be versioned and documented. | ||
|
||
## In-scope | ||
Should something be considered in scope to maintain and upgrade pipelines depending on updates to genomes etc (not sure if this happens regularly enough to impact us for this grant)
Yes, absolutely
Added.
For consideration: To support the enrichment of the integrated dataset within the BCDC Cell Registry to facilitate linkage between datasets, modalities and updates through integration of secondary analysis and features extraction pipelines into the computing environment supporting access to data in the BCDC Cell Registry and data archives and publishing results back to the BCDC Cell Registry and data archives.
Will add. |
||
- Connect data flow from BICCN Ingest services and archives to the BCDC Cloud Computing Environment to enable the processing of sequence-based data. | ||
I think this is related to my question on the first objective. |
||
- Work with community members to translate community pipelines and pipelines of high value to cloud-native, horizontally scalable data processing pipelines. | ||
- Work with the BICCN community to improve pipelines currently found in the BCDC Cloud Computing Environment. | ||
- Train BICCN community members, those wanting to process consortium data, and those who want to use a consortium-derived pipeline (on private or public data) to use the BCDC Cloud Computing Environment. | ||
|
||
## Out-of-scope | ||
- Data operations or processing data on behalf of the BICCN community. | ||
- Running or maintaining pipelines not compatible with the BCDC Cloud Computing Environment. | ||
|
||
# Communication | ||
## Slack Channels | ||
[BICCN Joint Analysis/pipelines](https://biccn-joint-analysis.slack.com/messages/pipelines) | ||
To speak directly to the implementation team and community members around pipelines in general. | ||
|
||
[BICCN Joint Analysis/methylation-pipelines](https://biccn-joint-analysis.slack.com/messages/methylation-pipelines) | ||
To speak directly to the implementation team and community members around methylation pipelines. | ||
|
||
## GitHub repositories | ||
[CEMBA](https://github.com/BICCN/CEMBA) | ||
Contains the implementation for the snmC-seq pipeline. | ||
|
||
[Snap-ATAC](https://github.com/HumanCellAtlas/skylab/tree/master/pipelines/snap-atac) | ||
Contains the implementation for the Snap-ATAC scATAC-seq pipeline. |
Is this limited to processing of molecular *omics data, or could they cover other modalities in the future? (I think the current members know this, but a new member might not)
will state and clarify potential other areas.