Issue
Part 2 of the tutorial never mentions gene identifier mapping explicitly, yet several steps need it. In particular, the tutorial works with gene expression data and selects, intersects, and manipulates gene sets based on their relevance to the analysis.
Example Scenario:
Consider Task 2 of the BuDDI analysis tutorial, where genes are identified by Ensembl IDs such as 'ENSG00000000003', 'ENSG00000000005', 'ENSG00000000419', 'ENSG00000000457', and the goal is to convert them to gene names in a format like 'MIR1302-2HG', 'FAM138A', 'OR4F5', 'AL627309.1', 'AL627309.3'.
In this situation, the single-cell data from the matched tissue already contains the gene mapping.
Suggested Approach:
When converting Ensembl IDs to gene names, a mapping may not be available for every gene. To address this, it is recommended to take the gene mapping from the single-cell matched tissue, as it likely contains a more comprehensive set of mappings.
The example below builds the mapping as a Pandas DataFrame with one column for gene names ("Name") and one for Ensembl identifiers ("Ens"). The mapping can be built as follows:
import pandas as pd

# Assumes `adata` (the single-cell AnnData object) and `data_path` are already defined
# Create an empty DataFrame with columns for gene names and Ensembl IDs
gene_maps = pd.DataFrame(columns=["Name", "Ens"])
# Populate the "Name" column with gene names from the single-cell AnnData object
gene_maps["Name"] = adata.var.index
# Populate the "Ens" column with Ensembl IDs from the AnnData object
gene_maps["Ens"] = adata.var["gene_ids"].values
# Save the gene mapping DataFrame to a CSV file
gene_maps.to_csv(f'{data_path}/gene_maps.csv')
# Extract the gene names for later use
gene_ids = gene_maps["Name"]
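Once the mapping exists, it could be applied to a table keyed by Ensembl IDs roughly as sketched below. This is only an illustration, not the tutorial's own code: `bulk_df`, `ens_to_name`, `unmapped_ids`, and `bulk_named` are hypothetical names, and the small `gene_maps` table here is a toy stand-in for the one built from `adata.var`. Genes without a mapping are collected and left out rather than guessed.

```python
import pandas as pd

# Toy stand-in for the gene_maps table built from the AnnData object
# (a few illustrative rows only, not the real mapping file).
gene_maps = pd.DataFrame({
    "Name": ["MIR1302-2HG", "FAM138A", "OR4F5"],
    "Ens": ["ENSG00000243485", "ENSG00000237613", "ENSG00000186092"],
})

# Hypothetical bulk expression table whose columns are Ensembl IDs.
bulk_df = pd.DataFrame(
    [[1.0, 0.0, 3.0, 2.0]],
    columns=[
        "ENSG00000243485",
        "ENSG00000237613",
        "ENSG00000186092",
        "ENSG00000000003",  # deliberately absent from the toy mapping
    ],
)

# Build an Ensembl ID -> gene name lookup; drop repeated IDs so it is unambiguous.
ens_to_name = gene_maps.drop_duplicates("Ens").set_index("Ens")["Name"]

# Separate the columns that have a mapping from those that do not.
mapped = bulk_df.columns.isin(ens_to_name.index)
unmapped_ids = bulk_df.columns[~mapped].tolist()

# Keep only mapped genes and rename their columns to gene names.
bulk_named = bulk_df.loc[:, mapped].rename(columns=ens_to_name.to_dict())

print("Unmapped Ensembl IDs:", unmapped_ids)
print(bulk_named.columns.tolist())
```

Reporting the unmapped IDs makes it easy to see how many genes would be lost before intersecting gene sets in later steps.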