Add Support for Storing Repo Labels #1683
- Adds two new tables, `repo_labels` and `labels`, to store label data.
- Created a new service class to store and associate labels for repos. If the same `label` exists on multiple `repos`, it does not make duplicate `label` records.
- Add simple rake task to iterate through all `repos` `in_batches` and fetch labels one by one.
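To illustrate the "no duplicate `label` records" behavior described above, here is a minimal plain-Ruby sketch. It is not the PR's service class: `InMemoryLabelStore`, `find_or_create_label`, and `associate` are hypothetical names, and in-memory structs stand in for the `labels` and `repo_labels` tables.

```ruby
# Hypothetical in-memory stand-ins for the `labels` and `repo_labels` tables.
Label = Struct.new(:id, :name)
RepoLabel = Struct.new(:repo_id, :label_id)

class InMemoryLabelStore
  attr_reader :labels, :repo_labels

  def initialize
    @labels = {}       # downcased label name => Label
    @repo_labels = []  # join rows linking repos to labels
  end

  # Reuse an existing label with the same downcased name, otherwise create it.
  def find_or_create_label(name)
    key = name.downcase
    @labels[key] ||= Label.new(@labels.size + 1, key)
  end

  # Link a repo to a label unless that link already exists.
  def associate(repo_id, label)
    row = RepoLabel.new(repo_id, label.id)
    @repo_labels << row unless @repo_labels.include?(row)
    row
  end
end

store = InMemoryLabelStore.new
{ 1 => ['Bug', 'hacktoberfest'], 2 => ['bug'] }.each do |repo_id, names|
  names.each { |name| store.associate(repo_id, store.find_or_create_label(name)) }
end

puts store.labels.size       # 2 label records, "bug" is shared by both repos
puts store.repo_labels.size  # 3 repo/label links
```

In a Rails app the same idea would typically be backed by unique indexes on the label name and on the repo/label pair, but that is an assumption about the schema rather than something shown in this conversation.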
return unless github_bub_response.success?

remote_labels.each do |label_hash|
  label_name = label_hash['name'].downcase
The issue stated that we wanted to only store `hacktoberfest` related labels. I'm happy to add a `next unless` line here if that's still the case, but this is pretty simple as is.
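For reference, the `next unless` guard mentioned above could look roughly like this. The sample `remote_labels` data and the substring rule are assumptions made for illustration; the issue does not spell out exactly which labels count as Hacktoberfest related.

```ruby
# Hypothetical sample of the label hashes returned for one repo.
remote_labels = [
  { 'name' => 'Hacktoberfest' },
  { 'name' => 'bug' },
  { 'name' => 'hacktoberfest-accepted' }
]

stored = []
remote_labels.each do |label_hash|
  label_name = label_hash['name'].downcase
  # Assumed rule: keep only labels whose name mentions "hacktoberfest".
  next unless label_name.include?('hacktoberfest')

  stored << label_name
end

p stored
```

With the sample data, only the two Hacktoberfest labels end up in `stored`, and `bug` is skipped.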
desc 'fetch and assign labels for repos'
task fetch_labels_and_assign: :environment do
  Repo.find_each(batch_size: 100) do |repo|
    RepoLabelAssigner.new(repo: repo).create_and_associate_labels!
I considered creating a new background job here and enqueueing it for 10 repos at a time, but I was worried about rate limiting and didn't know the concurrency of background processing in production. I also wasn't sure what the total count of repos in production was.
The repo count is on the main page: it's ~6,000. The way we do API requests is that we rotate through everyone's tokens on the service for each request.
CodeTriage/config/initializers/git_hub_bub.rb
Lines 48 to 54 in 31bd0e7
GitHubBub::Request.set_before_callback do |request|
  if request.token?
    # Request is authorized, do nothing
  else
    request.token = code_triage_random_api_key_store.call
  end
end
It's a good idea not to make a bunch of extra requests, but we're not that constrained. If this is N API requests where N is the number of repos in the system, that's absolutely no problem.
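As a rough, self-contained illustration of the callback quoted above, the sketch below assigns a token from a shared pool to any request that arrives without one. `FakeRequest`, `TOKENS`, and `random_api_key_store` are hypothetical stand-ins; the real `GitHubBub::Request` and CodeTriage's key store are not reproduced here, and picking a token at random is an assumption based on the callable's name.

```ruby
# Hypothetical request object standing in for GitHubBub::Request.
FakeRequest = Struct.new(:token) do
  def token?
    !token.nil?
  end
end

# Hypothetical pool of user tokens; the real app draws them from its own store.
TOKENS = %w[token-a token-b token-c].freeze
random_api_key_store = -> { TOKENS.sample }

before_callback = lambda do |request|
  # Leave already-authorized requests alone; otherwise borrow a token from the pool.
  request.token = random_api_key_store.call unless request.token?
  request
end

anonymous  = before_callback.call(FakeRequest.new(nil))
authorized = before_callback.call(FakeRequest.new('my-own-token'))

puts TOKENS.include?(anonymous.token)  # the anonymous request now carries a pooled token
puts authorized.token                  # the authorized request keeps its own token
```

Because each unauthenticated request can draw a different token, one request per repo spreads the load across many users' rate limits instead of a single account's.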
require 'test_helper'

class RepoLabelsAssignerTest < ActiveSupport::TestCase
Outside of my comfort zone here (I'm used to RSpec) so if this needs to change please let me know!
This is awesome, thanks a ton!
@schneems thanks for accepting! It seems like Hacktoberfest wants you to add
Updated
@schneems thanks for the opportunity to contribute!
Description
- Adds two new tables, `repo_labels` and `labels`, to store label data.
- Created a new service class to store and associate labels for repos. If the same `label` exists on multiple `repos`, it does not make duplicate `label` records.
- Add simple rake task to iterate through all `repos` `in_batches` and fetch labels one by one.
Related Issue
Partially resolved #1680
Motivation and Context
People should be able to target Hacktoberfest
How Has This Been Tested?
Some simple integration testing of the newly created PORO
Types of changes
Checklist: