plotAbundance improvements #132
Comments
Also consider plotting more than 20 (maybe 25) taxa with discrete colors. As seen in the plots above, the colors are on a continuous scale, which makes them hard to read. The color scale is discrete only if there are 20 or fewer taxa.
Also related: microbiome/OMA#197
There are three options to display sample names without cluttering.
Thanks. A couple more things came to mind while generating plots for a project:
Sometimes the user wants to define the order of taxa. For instance, there might be specific taxa that the user wants listed first. For example, figure 3 here plots "Other" first: https://www.researchgate.net/publication/347867791_The_Urinary_Microbiome_in_Postmenopausal_Women_with_Recurrent_Urinary_Tract_Infections/figures In the plot below, for instance, Firmicutes is plotted first. I am not sure what the best way to achieve the desired behavior is. (Maybe we could check whether the values are factors and take the order from their levels?)
When we want to display the sample type, for instance, the type is plotted as colors. However, it might be better to show it as its own facet. Below is our current solution:
Figure 2 behind this link shows how the same thing is achieved with facets: https://www.researchgate.net/publication/347867791_The_Urinary_Microbiome_in_Postmenopausal_Women_with_Recurrent_Urinary_Tract_Infections/figures
Sometimes we have samples that are drawn from the same patient (for instance, at varying time points). Currently, we do not have a method for that kind of plot. The best that can be done currently is this:
but as you can see, the samples do not match. (Maybe we could add the missing samples, for instance to sampletype2 in the figure above?) @Daenarys8 Can you check whether you can find solutions for these? We can then discuss further how to implement them.
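The idea of taking the taxa order from factor levels could be sketched roughly as follows. This is only an illustration in base R: `taxa_order` is a hypothetical helper, not an existing miaViz function, and the fallback of ordering by total abundance is an assumption about what a sensible default might be.

```r
# Hypothetical helper (not part of miaViz): decide the plotting order of taxa.
taxa_order <- function(taxa, abundance) {
    if (is.factor(taxa)) {
        # Respect the order encoded in the factor levels,
        # dropping levels that do not occur in the data.
        return(levels(droplevels(taxa)))
    }
    # Assumed default: order by total abundance, largest first.
    totals <- vapply(split(abundance, taxa), sum, numeric(1))
    names(totals)[order(totals, decreasing = TRUE)]
}

# Toy data, made up for illustration only.
df <- data.frame(
    taxon = c("Firmicutes", "Other", "Bacteroidetes",
              "Firmicutes", "Other", "Bacteroidetes"),
    abundance = c(0.5, 0.1, 0.4, 0.6, 0.2, 0.2),
    stringsAsFactors = FALSE
)

# Character input: falls back to the abundance-based order.
default_order <- taxa_order(df$taxon, df$abundance)

# Factor input: the user-defined level order wins, so "Other" comes first.
df$taxon <- factor(df$taxon, levels = c("Other", "Firmicutes", "Bacteroidetes"))
user_order <- taxa_order(df$taxon, df$abundance)
```

With this approach the user would control the order without a new parameter, at the cost of having to convert the taxa column to a factor beforehand.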
Looks very nice. Perhaps 1 is enough. I still have to test it. 2. Looks good. As you can see from my plot, sample 10 is missing from sampletype2. You are correct that it was not there in the first place (we do not have a sample for "sample10" - "sampletype2"). However, because of the missing sample, the samples are misaligned in the plots. The plot would be tidier if sampletype2 and sampletype1 were aligned with each other. (It would be easier to read and, in practice, we would no longer need the sample labels.) However, I am wondering what the best way to showcase paired samples is. One option is to add an "empty sample" in place of each missing sample (here "sample10" - "sampletype2"). Can you check whether this has already been solved in some papers? We could then take the idea from them.
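The "empty sample" option might look something like the sketch below, assuming the abundances are available in long format. The column names (`patient`, `sampletype`, `taxon`, `abundance`) and the toy values are made up for illustration and do not reflect how plotAbundance stores its data internally.

```r
# Toy long-format data: sample10 has no sampletype2 measurement.
abund <- data.frame(
    patient    = c("sample9", "sample9", "sample10"),
    sampletype = c("sampletype1", "sampletype2", "sampletype1"),
    taxon      = "Firmicutes",
    abundance  = c(0.7, 0.6, 0.5),
    stringsAsFactors = FALSE
)

# Build every patient x sampletype x taxon combination ...
full_grid <- expand.grid(
    patient    = unique(abund$patient),
    sampletype = unique(abund$sampletype),
    taxon      = unique(abund$taxon),
    stringsAsFactors = FALSE
)

# ... and left-join the observed values onto it.
# Combinations without a measurement end up with NA abundance.
padded <- merge(full_grid, abund, all.x = TRUE)

# The placeholder row for the missing pair.
missing_rows <- padded[is.na(padded$abundance), ]
```

After padding, every sample type has a row for every patient, so the groups could share the same x positions. Whether the placeholder is kept as `NA` or set to zero would depend on how the plotting layer treats missing values.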
That also orders the data based on a certain feature. However, my collaborator wants the "unidentified" taxa to be at the bottom of the graph. We could add an additional parameter to
The idea of .features_plotter is to visualize a colData variable. However, it can also visualize continuous variables, which facets cannot. To me, facets look better for categorical variables. However, for some people the current option might look better. That is why I think we should have an option for this. Maybe,
As already mentioned, we should handle missing samples if the user wants to visualize paired samples. There could be
Can you create a draft that takes these into account? Let's then discuss what the best approach is, as this might be a slightly complex issue that requires restructuring the function.
The point was that sample information is now plotted as a separate plot. However, these groups could also be plotted as facets. Facets work only for categorical variables, not for numeric ones, which is why we should keep the current functionality as well. One problem is that having many different options makes the function more complex for the user.
That might be the easiest and most transparent solution. However, we should check that the elements of the vector match the features. If the user wants to agglomerate the data, it might not be clear what those names are. We could disable the vector option if the user wants to agglomerate. (The same solution could work for columns as well.)
Sounds good. There could be an informative warning if the user tries to do both.
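A rough sketch of the checks discussed above, in base R. The function name `check_order`, its arguments, and the message texts are all hypothetical and only illustrate the logic: ignore the vector with a warning when agglomeration is requested, and otherwise verify that every listed name is an actual feature.

```r
# Hypothetical validation helper (not the miaViz API).
check_order <- function(order, features, agglomerate = FALSE) {
    if (is.null(order)) {
        return(NULL)
    }
    if (agglomerate) {
        # Names may no longer match after agglomeration, so ignore the vector.
        warning("A custom order is ignored when the data is agglomerated.",
                call. = FALSE)
        return(NULL)
    }
    unknown <- setdiff(order, features)
    if (length(unknown) > 0) {
        stop("Not found among features: ",
             paste(unknown, collapse = ", "), call. = FALSE)
    }
    # Listed features first, the remaining ones after, in their original order.
    c(order, setdiff(features, order))
}

features <- c("Firmicutes", "Other", "Bacteroidetes")
final_order <- check_order("Other", features)
```

The same kind of check could presumably be reused for a vector of sample (column) names.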
@Daenarys8 Would you be able to create a draft for these?
I am currently working on this and will hopefully have something out tomorrow.
When sample names are plotted, one cannot read them because they overlap each other.
Some other functions seem to have an angle_x_text parameter, but plotAbundance does not have an option to rotate text.
Also, we could consider whether sample names could be specified from colData(tse). For example, paired samples currently must have unique names, but a better option would be to allow shared names so that one can easily see which samples are drawn from the same patient.
If the user wants to compare abundances between groups, or if the samples are paired, for instance, our solution might be suboptimal.
It might be hard to read the plot when there are multiple groups (space between groups might help).
Another option would be to plot abundances as shown here in figure 1b