Add a Systematic Text-Mining Component to the Review? #1022

Open
swamidass opened this issue Nov 2, 2020 · 3 comments

@swamidass
Contributor

So, what do you think about doing some quantitative analysis of publications? E.g., look at the number of abstracts that use deep learning in particular domains? Count the number of authors? Look at key themes?

It seems that the field has just exploded, and adding a systematic component would be a very strong addition and could be a good way to direct the paper. This would guide how we should update the text. Enough of us do informatics that I expect someone here already has PubMed abstracts downloaded and has the requisite experience in text mining. Given how large the review is (and how well cited), it might also be valuable to look at who has cited the review and the papers cited in the review. Here, we may need to reach out to someone who has citation data, but it could really be worth it.

What do you think @cgreene and @agitter?
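
To make the "number of abstracts that use deep learning" idea concrete, here is a minimal sketch of the counting step. It assumes the abstracts are already downloaded and available as `(year, abstract_text)` pairs. The term list, the function name, and the record layout are hypothetical placeholders, not an agreed design, and a real analysis would need a curated vocabulary and per-domain filters.

```python
import re
from collections import Counter

# Hypothetical term list for illustration only; not a vetted vocabulary.
DEEP_LEARNING_PATTERN = re.compile(
    r"\b(deep learning|deep neural network|convolutional neural network"
    r"|recurrent neural network)s?\b",
    re.IGNORECASE,
)


def count_matching_abstracts_by_year(records):
    """Count abstracts per year that mention at least one listed term.

    `records` is an iterable of (year, abstract_text) pairs. Records whose
    abstract is missing or empty are skipped. Each abstract is counted at
    most once, no matter how many terms it contains.
    """
    counts = Counter()
    for year, abstract in records:
        if abstract and DEEP_LEARNING_PATTERN.search(abstract):
            counts[year] += 1
    return counts
```

Calling `count_matching_abstracts_by_year(records)` on the downloaded corpus gives a per-year tally that could be regenerated automatically whenever the corpus is refreshed.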

@agitter
Collaborator

agitter commented Nov 2, 2020

I like the idea. It would help us objectively assess what has changed in the domain since v1, adding to our own subjective takes.

The @greenelab has tools and data for https://greenelab.github.io/preprint-similarity-search/ that could contribute to this.

If someone does want to work on this, I'd like to think about how we could automate it so we could discuss a snapshot of the results but continue generating current versions. We have some good examples of this type of automated analysis in a Manubot manuscript that we could follow.

@swamidass
Contributor Author

Great. I like the idea and would like to participate in the design of the experiments and the write-up.

How do we get citation data? Are there any good preprocessed versions of PubMed we can work from, so as to avoid reinventing the wheel?

@swamidass
Contributor Author

swamidass commented Nov 2, 2020

This might be a winner...

https://opencitations.net/index/coci
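
A rough sketch of how "who has cited the review" could be pulled from COCI, for discussion only. It assumes a REST endpoint of the form `/api/v1/citations/{doi}` under the COCI URL above, returning a JSON list of rows that each carry a `creation` date string. Both the endpoint path and the field name are assumptions to check against the current OpenCitations documentation, since they may have changed. The helper names are hypothetical.

```python
import json
from collections import Counter
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL for the COCI REST API; verify before relying on it.
COCI_API = "https://opencitations.net/index/coci/api/v1"


def citations_url(doi):
    """Build the (assumed) COCI URL listing citations to `doi`."""
    return f"{COCI_API}/citations/{quote(doi, safe='/')}"


def fetch_citations(doi, timeout=30):
    """Download and decode the citation rows for `doi` (network call)."""
    with urlopen(citations_url(doi), timeout=timeout) as response:
        return json.load(response)


def citing_works_by_year(rows):
    """Tally citing works by the year prefix of their 'creation' date.

    Rows without a usable four-digit year are ignored.
    """
    counts = Counter()
    for row in rows:
        year = (row.get("creation") or "")[:4]
        if year.isdigit():
            counts[int(year)] += 1
    return counts
```

With the review's DOI in hand, `citing_works_by_year(fetch_citations(doi))` would give a citations-per-year series, and the same rows could feed the abstract-level analysis once the citing DOIs are mapped to PubMed records.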
