Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mus mus cox map #1452

Closed

Conversation

peterdfields
Copy link
Contributor

Hi!

I'm creating this pull request to add a genetic map object for the mouse reference. Please let me know how to proceed with review of the object document and the map files.

@codecov
Copy link

codecov bot commented Mar 12, 2023

Codecov Report

Attention: 5 lines in your changes are missing coverage. Please review.

Comparison is base (d89010f) 99.85% compared to head (5201252) 99.72%.
Report is 7 commits behind head on main.

❗ Current head 5201252 differs from pull request most recent head ddc7b7b. Consider uploading reports for the commit ddc7b7b to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1452      +/-   ##
==========================================
- Coverage   99.85%   99.72%   -0.13%     
==========================================
  Files         121      118       -3     
  Lines        4178     4037     -141     
  Branches      588      566      -22     
==========================================
- Hits         4172     4026     -146     
- Misses          3        8       +5     
  Partials        3        3              
Files Coverage Δ
stdpopsim/catalog/MusMus/genetic_maps.py 0.00% <0.00%> (ø)

... and 8 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@nspope
Copy link
Collaborator

nspope commented Mar 13, 2023

Awesome, thanks @peterdfields! We'll need @andrewkern to upload the map (tar'd, gzipped) to AWS. Then, you can fill in the correct URL.

One other quick comment -- to actually get the map registered in stdpopsim, you'll need to add a line to catalog/MusMus/__init__.py that does the import -- see for example here.

One other thing to keep in mind, is that I think we'll want to update the per-chromosome recombination rates so as to match the averages from this map (at least, this is what we've done for other species in the catalog). These can be calculated via msprime.RateMap.mean_rate. But perhaps hold off on that until @igronau is done with the QC tests for the initial MusMus PR.

@igronau
Copy link
Contributor

igronau commented Mar 13, 2023

I'm not sure we need to update the species model. The rec rates in that model were computed by the same publication (Cox et al. 2009). It's possible that averaging over the rates in the map leads to slightly different rates than the one they report in the table. Still, I think it makes sense to use the values that they report in the species model. So, unless this is the clear standard we wish to use in all species with recombination maps, I would leave the current values as is.

@nspope
Copy link
Collaborator

nspope commented Mar 13, 2023

Oh got it-- I didn't realize the map was from the same publication (although it sounds like these have been lifted over to a new assembly version). Not requiring an exact match between per-chrom rates and map averages is fine by me (if they're close enough-- we should check). But may be worth bringing up in meeting to see what others think.

@andrewkern
Copy link
Member

hey @peterdfields -- is the map tarred up somewhere that I can access it? I'll basically have to download the tarball and then upload to AWS.... annoying glitch in our system here...

@peterdfields
Copy link
Contributor Author

Awesome, thanks @peterdfields! We'll need @andrewkern to upload the map (tar'd, gzipped) to AWS. Then, you can fill in the correct URL.

One other quick comment -- to actually get the map registered in stdpopsim, you'll need to add a line to catalog/MusMus/__init__.py that does the import -- see for example here.

One other thing to keep in mind, is that I think we'll want to update the per-chromosome recombination rates so as to match the averages from this map (at least, this is what we've done for other species in the catalog). These can be calculated via msprime.RateMap.mean_rate. But perhaps hold off on that until @igronau is done with the QC tests for the initial MusMus PR.

@nspope I have added the entry to catalog/MusMus/init.py. @igronau and @nspope I am working on comparing the whole chromosome value vs. the average of the lifted map. At present they're not lining up so precisely. In the summary provided at: https://github.com/kbroman/CoxMapV3 they're providing cM and position. The basic arithmetic I'm applying is to divide the first cM value by the bp position, and then for all subsequent positions dividing the difference in cM between the next and last position and bp in the next and last position. After averaging these values and multiplying by 1x10^6, thats the values I'm comparing to those reported in the original map. Does this seem like a sensible approach? I might need to put in an issue request to track down more details about the conversion.

@andrewkern sorry for the delay, once I can figure out the average difference I can send the map directory along.

@igronau
Copy link
Contributor

igronau commented Mar 19, 2023

@peterdfields , the approach sounds reasonable, but I'm not sure why you need to do this. We have chromosome-average recombination rates and we have the detailed recombination maps. I understand that there is a script that can compute chromosome-average recombination rates for a given map. I think that @nspope suggested applying the script to the map and comparing the results to the average rates published by Cox et al.

@peterdfields
Copy link
Contributor Author

@peterdfields , the approach sounds reasonable, but I'm not sure why you need to do this. We have chromosome-average recombination rates and we have the detailed recombination maps.

@igronau I should have been more precise, sorry about that. The reason I'm doing the calculation is because as the map is presented without the column of cM/Mb required by the submission format. You can see the csv with the map data here.

@igronau
Copy link
Contributor

igronau commented Mar 20, 2023

I see. Your approach sounds okay to me, but maybe consult with @andrewkern , who did this for other species.

@peterdfields
Copy link
Contributor Author

@igronau should I wait for this issue to be resolved before working on the demographic model and DFE submissions?

@igronau
Copy link
Contributor

igronau commented Mar 25, 2023

No, I think you can go ahead and implement the demographic model and DFE model, since they involve different files and separate QC processes. Ping me when you're done and I can QC those as well :-)

@peterdfields
Copy link
Contributor Author

I see. Your approach sounds okay to me, but maybe consult with @andrewkern , who did this for other species.

@andrewkern would you be able to have a look at what might be going wrong with the calculation? Let me know if you have any questions or if other details would be helpful.

@petrelharp
Copy link
Contributor

@peterdfields
Copy link
Contributor Author

@petrelharp @igronau okay, just let me know if there's anything I can help with.

@igronau
Copy link
Contributor

igronau commented Nov 1, 2023

@peterdfields , I had a look at this. You can create an msprime.RateMap object by applying the following function call:
msprime.RateMap.read_hapmap("cox_v3_map.csv",position_col=5,map_col=2)

Note that you first have to split cox_v3_map.csv into 20 different files (one per chrom). I checked and this creates a valid rates object that specifies a rate for each interval along the chromosome. This computes rates in reasonable orders of magnitude, and by checking a few of them, they follow the expected logic. Please give this a try and let me know if you have any questions, or encounter unexpected outcomes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants