[Sample Data] Request a list of all repositories in the open-digger dataset, in the format "GithubId/repository name". #1408
Comments
Hi, you can use the data in the file for your research, or you can use the labeled data we released. Both contain the repository list.
@Zzzzzhuzhiwei I think @ZhangChunXian is requesting the full repo and user lists that OpenDigger exports, which are not currently in the OpenDigger sample data or exported data. I think we can do this in the monthly export task to a |
I see! If we export this file, it might be too large. There are currently 328,032,951 distinct repositories in ClickHouse.
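To put the size concern in rough numbers, here is a back-of-the-envelope sketch. The repository count is the figure quoted above; the average name lengths are assumptions picked for illustration, not values measured from the dataset.

```python
# Rough size estimate for a plain-text list with one "owner/repo" name per line.
REPO_COUNT = 328_032_951  # figure quoted in the comment above


def estimated_size_gib(repo_count: int, avg_name_bytes: int) -> float:
    """Approximate uncompressed size in GiB of a newline-delimited name list."""
    # Each line costs the name itself plus one byte for the newline.
    return repo_count * (avg_name_bytes + 1) / 1024**3


if __name__ == "__main__":
    # Assumed average name lengths in bytes (hypothetical, not measured).
    for avg in (20, 25, 30):
        size = estimated_size_gib(REPO_COUNT, avg)
        print(f"avg {avg} bytes/name -> ~{size:.1f} GiB uncompressed")
```

Under these assumptions the uncompressed list would be several GiB, which is consistent with the worry that it is too large for a simple export; compression would likely shrink it considerably.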
The user file will be about 5 MB, so I think this is feasible for the monthly export task. /self-assign
@ZhangChunXian Thanks for the issue. The lists have been exported to |
Usage
For personal research
Extract SQL
I would like a list of all repositories in the open-digger dataset, in the format "GithubId/repository name", such as "X-lab2017/open-digger". I'd appreciate it if you could provide it.
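For anyone consuming such a list, a minimal sketch of splitting each line into owner and repository name might look like the following. The regular expression is an assumed approximation of GitHub naming rules, not an official definition, and `parse_full_name` is a hypothetical helper name.

```python
import re

# Assumed pattern: an owner login and a repository name separated by exactly
# one slash. This approximates GitHub's naming rules; it is not authoritative.
FULL_NAME = re.compile(r"([A-Za-z0-9-]+)/([A-Za-z0-9._-]+)")


def parse_full_name(line: str):
    """Return (owner, repo) for a well-formed line, or None otherwise."""
    match = FULL_NAME.fullmatch(line.strip())
    return (match.group(1), match.group(2)) if match else None


if __name__ == "__main__":
    print(parse_full_name("X-lab2017/open-digger"))
```

Lines that do not fit the pattern return `None`, so malformed entries can be counted or skipped rather than silently accepted.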
Does this dataset need to be updated regularly?
No response