About the sequencing chemistry in sequel #1

Open
ZhangBio opened this issue Jun 23, 2022 · 10 comments

@ZhangBio

Hi! Happy to see a Sequel version of SMALR.
On Sequel, the chemistries seem to be described as "sequencing kit vX.X" instead.
How should I set "--model" if I only know the version of the sequencing kit? Do you know how the sequencing kit versions relate to SP2-C2 or SP3-C3?

@ZhangBio ZhangBio changed the title About the sequencing chemistry is sequel About the sequencing chemistry in sequel Jun 23, 2022
@ZhangBio
Author

For example, I know there are Sequel II sequencing kit 1.0 and Sequel II sequencing kit 2.0.
They sound different, but could they both use SP3-C3?

@GDelevoye
Owner

GDelevoye commented Jun 23, 2022

Hi,

It was never 100% clear to me.

My understanding is:

  • RS II = P6-C4
  • Sequel I = SP2-C2
  • Sequel II v1 : No model
  • Sequel II v2 : SP2-C2 works OK, SP3-C3 works better

Apart from the difference between Sequel II v1 and Sequel II v2, it seems that for a given sequencer, different sequencing kits can be used interchangeably without having to switch the in silico control model.
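
The summary above can be sketched as a small lookup table. This is only an illustration built from what is said in this thread, not an official PacBio mapping; the names `MODEL_BY_PLATFORM` and `suggest_model` are hypothetical and are not part of SMALR.

```python
# Hypothetical lookup built only from the summary in this thread.
# It is NOT an official PacBio mapping.
MODEL_BY_PLATFORM = {
    "RS II": ["P6-C4"],
    "Sequel I": ["SP2-C2"],
    "Sequel II v1": [],                    # no good model known
    "Sequel II v2": ["SP3-C3", "SP2-C2"],  # preferred model first
}


def suggest_model(platform):
    """Return the preferred model name for a platform, or None if none is known."""
    try:
        candidates = MODEL_BY_PLATFORM[platform]
    except KeyError:
        raise ValueError(f"unknown platform: {platform!r}")
    return candidates[0] if candidates else None
```

For instance, `suggest_model("Sequel II v2")` gives the first entry of that list, and `suggest_model("Sequel II v1")` returns `None` so the caller has to decide what to do without a model.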

I opened an issue at the PacBio KineticsTools two years ago about a similar subject; see here

There, one PacBio developer says (I quote):

rhallPB commented on 28 Jan 2020

Note also, S2-P2 model has been shown to be effective with Sequel II chemistry version 2.0. We don't have a good model for Sequel II chemistry version 1.0.

This is consistent with the summary I made above.

@GDelevoye
Owner

I never thought anyone would be interested in my software. Please let me know if I can help with your analysis in any way.

@ZhangBio
Author

Thank you very much for your reply!
The kinetics features of the P6-C4, P5-C3 and P4-C2 models provided in SMRT Analysis v2.3 seem to differ a lot. It would be much easier if the Sequel chemistries could share the same model. There is so little information on the internet that your reply is really precious.

@ZhangBio
Author

Actually, I would expect tools like SMALR to get more attention, since methylation heterogeneity in prokaryotes could be very important. This software will definitely be helpful when people want to analyze today's Sequel data!

@GDelevoye
Owner

GDelevoye commented Jun 24, 2022

Yes, information on the subject is indeed very hard to find.

As you say, P6-C4, P5-C3 and P4-C2 are very different, but I presumed (maybe wrongly) that these were just incremental upgrades, with P6-C4 simply being the "best" (?).

I will do my best to find the right sources in the near future and compile them in the README.

Until then, what I can say with certainty is that the SP2-C2 model worked great on our Sequel I E. coli data.

I never tested it on Sequel II data. If you have SMSN data produced with a Sequel II sequencer that can be used as a benchmark, I would be glad to help.

I am leaving this issue open for the moment, and I will close it when I find more information on the models versus sequencing kits. Maybe I could even just parse the header to match them automatically; I have not done it yet because I thought no one else would use the software.
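
A minimal sketch of what such header parsing could look like, under two assumptions that are not confirmed in this thread: that the `@RG` line of a PacBio BAM header carries a `DS` field made of semicolon-separated `KEY=VALUE` pairs including `SEQUENCINGKIT`, and that a kit-to-model table exists. The table below uses placeholder kit identifiers, not real part numbers, and both function names are hypothetical.

```python
def parse_rg_description(rg_line):
    """Extract KEY=VALUE pairs from the DS field of one @RG header line."""
    for field in rg_line.rstrip("\n").split("\t"):
        if field.startswith("DS:"):
            pairs = {}
            for item in field[3:].split(";"):
                key, sep, value = item.partition("=")
                if sep:  # ignore items that are not KEY=VALUE
                    pairs[key] = value
            return pairs
    return {}


# Placeholder table: these kit identifiers are NOT real part numbers.
KIT_TO_MODEL = {"EXAMPLE-KIT-A": "SP2-C2", "EXAMPLE-KIT-B": "SP3-C3"}


def model_from_header(rg_line, table=KIT_TO_MODEL):
    """Return the model mapped to the SEQUENCINGKIT value, or None if unknown."""
    kit = parse_rg_description(rg_line).get("SEQUENCINGKIT")
    return table.get(kit)
```

Returning `None` for an unknown kit keeps the decision with the user, who can still pass the model explicitly.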

You can also have a look at this repo, where I did some reverse engineering of the in silico control:

https://github.com/EMeyerLab/ipdtools

I did this at a time when SP3-C3 was not yet released, and before the model formats changed, but I'm reasonably confident that the repo is still valid as of June 2022.

@GDelevoye GDelevoye self-assigned this Jun 24, 2022
@GDelevoye GDelevoye added documentation Improvements or additions to documentation enhancement New feature or request question Further information is requested labels Jun 24, 2022
@GDelevoye GDelevoye pinned this issue Jun 24, 2022
@ZhangBio
Author

One way to check is to compare "tMean" and "modelPrediction" on WGA data: if the two values are well correlated, the correct model is being used. Maybe I'll try this later when I have time.
But a recent paper shows that the correlation between observed IPD and predicted IPD in WGA data is not good, and I don't know what is wrong.
https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-022-08471-2
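
The check described above could be sketched as follows, assuming a per-base CSV with `tMean`, `modelPrediction` and `coverage` columns. The log transform, the coverage threshold and the function names are illustrative choices, not something taken from ipdSummary or SMALR.

```python
import csv
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def model_fit(csv_lines, min_coverage=25):
    """Correlate log(tMean) with log(modelPrediction) over well-covered sites."""
    observed, predicted = [], []
    for row in csv.DictReader(csv_lines):
        t_mean = float(row["tMean"])
        prediction = float(row["modelPrediction"])
        # Skip poorly covered sites and values the log cannot handle.
        if int(row["coverage"]) < min_coverage or t_mean <= 0 or prediction <= 0:
            continue
        observed.append(math.log(t_mean))
        predicted.append(math.log(prediction))
    return pearson(observed, predicted)
```

`model_fit` accepts any iterable of CSV lines, for example an open file handle. Note that a constant multiplicative offset between the two columns does not lower this correlation, so a systematic bias would have to be checked separately.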

@GDelevoye
Owner

Hi, sorry for the late answer.

In my experience there is a systematic bias between the observed IPDs and the modelPrediction which, as you mention, more or less prevents this kind of check...

My own verification on my own data was that, using the SP2-C2 model on my E. coli data, almost all the DNA modifications that I could detect were located in either GATC or EcoK sites, which are indeed known for their abundance of 6mA. But I do not have access to more recent SMSN Sequel II data. I would be glad if someone could provide me with some.

I have a bit of time to take care of this issue at the moment.
Are you still interested in using the software? Do you have any data that may help?
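
This kind of sanity check can be sketched for the GATC motif as follows. It is a minimal illustration, assuming detected sites are given as 0-based forward-strand coordinates with a "+" or "-" strand label; the function names are hypothetical, and EcoK sites would need an additional pattern that is not covered here.

```python
def in_gatc(seq, pos, strand):
    """True if a modified A at 0-based forward coordinate `pos` lies in a GATC site.

    On the "+" strand the A is the second base of GATC. On the "-" strand the
    modified A pairs with the T, which is the third base of the forward GATC.
    """
    start = pos - 1 if strand == "+" else pos - 2
    return start >= 0 and seq[start:start + 4].upper() == "GATC"


def fraction_in_gatc(seq, calls):
    """Fraction of (pos, strand) calls that fall on the A of a GATC site."""
    calls = list(calls)
    if not calls:
        return 0.0
    hits = sum(in_gatc(seq, pos, strand) for pos, strand in calls)
    return hits / len(calls)
```

A fraction close to 1 on E. coli data would match the observation above; a low fraction would suggest either a poorly suited model or modifications outside GATC.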

Guillaume

@GDelevoye
Owner

Perhaps this repo, which I made a few years ago, could help to test your suggested solution.

@ZhangBio ZhangBio closed this as completed Jan 6, 2023
@ZhangBio
Author

ZhangBio commented Jan 6, 2023

Sorry for the late reply.
I used some artificial data from previous studies, in which the "positive" samples are treated with an MTase and the corresponding controls are WGA data. But "A" sites in the MTase-treated group do not always seem to be predicted as methylated. I can't tell whether the problem lies with ipdsummary or with the efficiency of the enzyme treatment.
https://www.ncbi.nlm.nih.gov/sra/SRX12017172[accn]
https://www.ncbi.nlm.nih.gov/sra/SRX9611878[accn]

@ZhangBio ZhangBio reopened this Jan 6, 2023