-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mus mus cox map #1452
Mus mus cox map #1452
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #1452 +/- ##
==========================================
- Coverage 99.85% 99.72% -0.13%
==========================================
Files 121 118 -3
Lines 4178 4037 -141
Branches 588 566 -22
==========================================
- Hits 4172 4026 -146
- Misses 3 8 +5
Partials 3 3
☔ View full report in Codecov by Sentry. |
Awesome, thanks @peterdfields! We'll need @andrewkern to upload the map (tar'd, gzipped) to AWS. Then, you can fill in the correct URL. One other quick comment -- to actually get the map registered in stdpopsim, you'll need to add a line to One other thing to keep in mind, is that I think we'll want to update the per-chromosome recombination rates so as to match the averages from this map (at least, this is what we've done for other species in the catalog). These can be calculated via |
I'm not sure we need to update the species model. The rec rates in that model were computed by the same publication (Cox et al. 2009). It's possible that averaging over the rates in the map leads to slightly different rates than the one they report in the table. Still, I think it makes sense to use the values that they report in the species model. So, unless this is the clear standard we wish to use in all species with recombination maps, I would leave the current values as is. |
Oh got it-- I didn't realize the map was from the same publication (although it sounds like these have been lifted over to a new assembly version). Not requiring an exact match between per-chrom rates and map averages is fine by me (if they're close enough-- we should check). But may be worth bringing up in meeting to see what others think. |
hey @peterdfields -- is the map tarred up somewhere that I can access it? I'll basically have to download the tarball and then upload to AWS.... annoying glitch in our system here... |
@nspope I have added the entry to catalog/MusMus/init.py. @igronau and @nspope I am working on comparing the whole chromosome value vs. the average of the lifted map. At present they're not lining up so precisely. In the summary provided at: https://github.com/kbroman/CoxMapV3 they're providing cM and position. The basic arithmetic I'm applying is to divide the first cM value by the bp position, and then for all subsequent positions dividing the difference in cM between the next and last position and bp in the next and last position. After averaging these values and multiplying by 1x10^6, thats the values I'm comparing to those reported in the original map. Does this seem like a sensible approach? I might need to put in an issue request to track down more details about the conversion. @andrewkern sorry for the delay, once I can figure out the average difference I can send the map directory along. |
@peterdfields , the approach sounds reasonable, but I'm not sure why you need to do this. We have chromosome-average recombination rates and we have the detailed recombination maps. I understand that there is a script that can compute chromosome-average recombination rates for a given map. I think that @nspope suggested applying the script to the map and comparing the results to the average rates published by Cox et al. |
@igronau I should have been more precise, sorry about that. The reason I'm doing the calculation is because as the map is presented without the column of cM/Mb required by the submission format. You can see the csv with the map data here. |
I see. Your approach sounds okay to me, but maybe consult with @andrewkern , who did this for other species. |
@igronau should I wait for this issue to be resolved before working on the demographic model and DFE submissions? |
No, I think you can go ahead and implement the demographic model and DFE model, since they involve different files and separate QC processes. Ping me when you're done and I can QC those as well :-) |
@andrewkern would you be able to have a look at what might be going wrong with the calculation? Let me know if you have any questions or if other details would be helpful. |
@igronau will look at this, referring to https://tskit.dev/msprime/docs/latest/api.html#msprime.RateMap.read_hapmap |
@petrelharp @igronau okay, just let me know if there's anything I can help with. |
@peterdfields , I had a look at this. You can create an Note that you first have to split cox_v3_map.csv into 20 different files (one per chrom). I checked and this creates a valid rates object that specifies a rate for each interval along the chromosome. This computes rates in reasonable orders of magnitude, and by checking a few of them, they follow the expected logic. Please give this a try and let me know if you have any questions, or encounter unexpected outcomes. |
ddc7b7b
to
b8c2329
Compare
Hi!
I'm creating this pull request to add a genetic map object for the mouse reference. Please let me know how to proceed with review of the object document and the map files.