A semi-supervised dataset of 14 million samples for mental disorder detection.
We have now released a total of 700,000 post IDs with their associated labels.
We have now released a total of 500,000 post IDs with their associated labels. Although we intended to release the entire dataset, including the post content, in a single release, we are currently constrained by Reddit's community guidelines, which prevent us from doing so.
We are pleased to announce the release of approximately 33,000 samples, each containing an ID and its corresponding label. The full dataset is available upon request, subject to a data usage agreement. The terms and conditions for accessing the full dataset will be provided in due course.
If you find this dataset useful in your research, we kindly ask you to cite our paper:
@inproceedings{raihan2024mentalhelp,
title={MentalHelp: A Multi-Task Dataset for Mental Health in Social Media},
author={Raihan, Nishat and Puspo, Sadiya Sayara Chowdhury and Farabi, Shafkat and Bucur, Ana-Maria and Ranasinghe, Tharindu and Zampieri, Marcos},
booktitle={Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
pages={11196--11203},
year={2024}
}