Use case examples - open science code availability for reproducibility #2200
PipaFlores started this conversation in General
Hello everyone,
I would like to share this library with colleagues and researchers in my field. A big struggle, however, is setting up coding environments, scripts, and pipelines. In this sense, I think the available notebooks do a good job of introducing the procedure.
For a deeper look, however, it would be useful to gather papers or results that have shared their datasets and scripts (covering data wrangling, topic modeling, and visualization). These would show how the library is used in actual scientific work.
The documented use cases already include several interesting examples, but unfortunately none of them share their topic modeling scripts. This is somewhat expected. Still, the growth of open science means more scripts and datasets are becoming available, and many of them go unnoticed.
Researchers also seem to approach this method differently, producing varied outputs beyond the standard BERTopic results, so concrete examples and scripts from published research would show how the method is applied in practice.
I am asking for pedagogical reasons: I would like to encourage higher education students to reproduce others' research as a way to broaden their scientific perspectives. I will share whatever I find here, and I encourage anyone passing by to do the same.
Thanks
My findings:
Exploring the topics, sentiments and hate speech in the Spanish information environment: This is the gold standard of what I am advocating for. It provides data and scripts in a clear, well-explained structure. Unfortunately, it is mostly valuable for Spanish speakers, as the data and comments are in that language.
Unveiling the dynamics of AI applications: A review of reviews using scientometrics and BERTopic modeling: Has data but no code. Still seems interesting to reproduce.
Bibliometric and Scientometric Python Library: A special-purpose library that uses BERTopic, useful as an example of implementation.
Others' contributions:
Usage example with open-ended surveys: Provides a nice workflow and deeper insight into dimensionality reduction and clustering with HDBSCAN without getting too technical. (Made by Kevin Reuning)