Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update sample_public_name for RVI Program - Bait Capture study #4594

Merged
merged 20 commits into from
Jan 29, 2025

Conversation

sabrine33
Copy link
Contributor

No description provided.

Copy link

codecov bot commented Jan 13, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.41%. Comparing base (7491e60) to head (b498de3).
Report is 21 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4594      +/-   ##
===========================================
+ Coverage    89.38%   89.41%   +0.03%     
===========================================
  Files         1406     1406              
  Lines        30298    30298              
===========================================
+ Hits         27081    27091      +10     
+ Misses        3217     3207      -10     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@sabrine33
Copy link
Contributor Author

Notes for testing on training :

  1. Before deploying, we can verify the number of samples belonging to 'RVI Program - Bait Capture' with a sample_public_name of NULL (the SQL will update only those where the public name is not set) using this SQL:
SELECT count(*) FROM  sample_metadata
                          JOIN samples ON sample_metadata.sample_id = samples.id
                          JOIN study_samples ON sample_metadata.sample_id = study_samples.sample_id
                          JOIN studies ON study_samples.study_id = studies.id
WHERE sample_metadata.sample_public_name IS NULL
  AND studies.name = 'RVI Program - Bait Capture' AND sanger_sample_id LIKE 'RVI%';
  1. Keep a record of the returned number.
  2. Deploy on training.
  3. Run this SQL to validate that the samples' public name is successfully updated:
SELECT count(*) FROM  sample_metadata
                          JOIN samples ON sample_metadata.sample_id = samples.id
                          JOIN study_samples ON sample_metadata.sample_id = study_samples.sample_id
                          JOIN studies ON study_samples.study_id = studies.id
WHERE sample_metadata.sample_public_name = samples.sanger_sample_id
  AND studies.name = 'RVI Program - Bait Capture' AND sanger_sample_id LIKE 'RVI%';
  1. Validate that the returned number equals the one from step 2.

Copy link
Contributor

@harrietc52 harrietc52 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SQL looks good, and the query to check before and after deployment looks good too . I’m just wondering, as the migration is in a change block, what would happen if this migration were to be rolled back? I am not sure, in previous similar situations, whether updates like this are best in a rake task, which can be ran the one from the command line on the server. Just a thought, maybe gather opinions from others with more Rails knowledge

@sabrine33
Copy link
Contributor Author

sabrine33 commented Jan 17, 2025

@harrietc52 thank you for reviewing it, it is a good point! I created it as a rake task in the first place and @KatyTaylor suggested to to use the ORM so the update triggers the event to update the mlwh accordingly.

The change is a new terminology for the up/down - it covers both. So if case of the rollback ruby on Rials will make sure to roll the changes smoothly

Copy link
Contributor

@dasunpubudumal dasunpubudumal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I see it, the query retrieves records for which the sample_public_name is null, and sanger_sample_id starts with RVI, and study name is RVI Program - Bait Capture, and sets the sample_public_name to sanger_sample_id.

A set of small questions:

  1. Do we need to check for records in which sample_public_name is empty as well? I assume the field is of type string (VARCHAR).
  2. Do we need to set autocommit manually? As far as I know, Rails migrations are automatically wrapped around a transaction1.
  3. What would happen we need to rollback the migration? I think this migration is reversible but better to check2.

Footnotes

  1. Rails Migrations Guide

  2. Irreversible Migrations

@sabrine33
Copy link
Contributor Author

As far as I see it, the query retrieves records for which the sample_public_name is null, and sanger_sample_id starts with RVI, and study name is RVI Program - Bait Capture, and sets the sample_public_name to sanger_sample_id.

A set of small questions:

  1. Do we need to check for records in which sample_public_name is empty as well? I assume the field is of type string (VARCHAR).
  2. Do we need to set autocommit manually? As far as I know, Rails migrations are automatically wrapped around a transaction1.
  3. What would happen we need to rollback the migration? I think this migration is reversible but better to check2.

Footnotes

  1. Rails Migrations Guide
  2. Irreversible Migrations

@sabrine33 sabrine33 closed this Jan 21, 2025
@sabrine33 sabrine33 reopened this Jan 21, 2025
@sabrine33
Copy link
Contributor Author

Thank you @dasunpubudumal for revieing it, The sample_public_name check ensures that user-specified names are not overridden. I was unaware that data updates are excluded from Rails rollbacks, so I will likely need to update this accordingly.

Copy link
Contributor

@dasunpubudumal dasunpubudumal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't tested it, but looks good. Better to do this in UAT and record the time it takes.

@sabrine33 sabrine33 merged commit 6c137ad into develop Jan 29, 2025
23 checks passed
@sabrine33 sabrine33 deleted the Y24-300-rake-task branch January 29, 2025 14:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Y24-300 - Addition of anonymous ids for public data sharing - existing data
3 participants