🔨 (db) migrate Dataset and Source to knex #3131

Merged: 4 commits into master from db-types-migrate-dataset-and-source on Feb 27, 2024

Conversation

sophiamersmann
Member

@sophiamersmann sophiamersmann commented Jan 24, 2024

  • Makes the Dataset and Source TypeORM classes obsolete by switching db access to knex

  • This PR does not do any clean-up, e.g. SQL queries in the router files are not moved, although they ultimately should live in the db folder

  • I added two helper functions that can be used as drop-in replacements for queryMysql and mysqlFirst that are often used in our codebase, knexRaw and knexRawFirst (both live in db.ts)

  • Routes that I could easily test are tested, but not all of the code is!
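
For readers unfamiliar with the two helpers, here is a minimal sketch of what drop-in replacements of this kind might look like. It is not the code from db.ts: `RawQueryable` is a hypothetical stand-in for the Knex type so that the example runs without a database, the parameter order is an assumption, and `fakeKnex` exists only for illustration. The sketch relies on `raw` resolving to a pair whose first element is the row array when the MySQL driver is used, which matches the `[0]` indexing in the snippet reviewed further down.

```typescript
// Hypothetical stand-in for the part of Knex that the helpers rely on.
// With the MySQL driver, `raw` resolves to a [rows, fields] pair.
interface RawQueryable {
    raw(sql: string, bindings: unknown[]): Promise<[unknown[], unknown]>
}

// Runs a raw SQL query and returns all rows.
const knexRaw = async <TRow = unknown>(
    knex: RawQueryable,
    str: string,
    params: unknown[] = []
): Promise<TRow[]> => {
    const [rows] = await knex.raw(str, params)
    return rows as TRow[]
}

// Runs a raw SQL query and returns the first row, or undefined if there is none.
const knexRawFirst = async <TRow = unknown>(
    knex: RawQueryable,
    str: string,
    params: unknown[] = []
): Promise<TRow | undefined> => {
    const rows = await knexRaw<TRow>(knex, str, params)
    return rows[0]
}

// Fake connection (illustration only): returns one row per binding.
const fakeKnex: RawQueryable = {
    raw: async (_sql, bindings) => [
        bindings.map((id) => ({ id, name: `dataset-${id}` })),
        [],
    ],
}
```

A call site would then read `await knexRawFirst<{ id: number; name: string }>(fakeKnex, "SELECT id, name FROM datasets WHERE id = ?", [7])`, with the connection always passed explicitly.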

@danyx23 let me know if this is how you imagined it, or if I should do anything differently.

@coderabbitai ignore

coderabbitai bot commented Jan 24, 2024

Important

Auto Review Skipped

Auto reviews are disabled on base/target branches other than the default branch. Please add the base/target branch pattern to the list of additional branches to be reviewed in the settings.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository.

To trigger a single review, invoke the @coderabbitai review command.


@sophiamersmann sophiamersmann changed the title from "🔨 (types) migrate Dataset and Source to knex" to "🔨 (db) migrate Dataset and Source to knex" Jan 26, 2024
@sophiamersmann sophiamersmann force-pushed the db-types-migrate-dataset-and-source branch from 605992e to ce15753 Compare January 26, 2024 09:12
Contributor

@danyx23 danyx23 left a comment

Thanks a lot for test driving the knex approach! This looks very good overall. There are two things that I would change on the db-types PR learning from this and two things I'd ask you to change.

The two changes I'd like to see in this PR are:

  • not make the knex instance optional in the knexRaw and knexRawFirst helper functions
  • and to wrap all API functions we touch in a transaction scope even if we only read data as we go through them. The reasoning here is that we want reads to be consistent and it's probably better to just implement a standard pattern everywhere rather than having to evaluate if it is necessary each time.

The two things I'll change on the other PR are:

  • bump the knex version so that we can create readOnly transactions, which are a bit more performant than generic ones
  • add a migration to make name on datasets mandatory rather than nullable
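
The "transaction scope everywhere" pattern requested above can be sketched as follows. This is a hedged illustration rather than the code that was merged: `Database`, `Queryable`, and `TransactionConfig` are hypothetical stand-ins that mirror the shape of Knex's `transaction(handler, config)` call, the `readOnly` option is assumed to be available only after the version bump mentioned above, and `readOnlyTransaction`, `getDatasetNames`, and `fakeDb` are made-up names.

```typescript
// Hypothetical stand-ins for the parts of Knex used here.
interface TransactionConfig {
    readOnly?: boolean
}
interface Queryable {
    raw(sql: string, bindings: unknown[]): Promise<[unknown[], unknown]>
}
interface Database {
    transaction<T>(
        handler: (trx: Queryable) => Promise<T>,
        config?: TransactionConfig
    ): Promise<T>
}

// Hypothetical wrapper so that every read-only API function opens its
// transaction scope in the same way.
const readOnlyTransaction = <T>(
    db: Database,
    fn: (trx: Queryable) => Promise<T>
): Promise<T> => db.transaction(fn, { readOnly: true })

// Hypothetical API function: all reads go through the transaction handle,
// so they observe one consistent snapshot.
async function getDatasetNames(db: Database): Promise<string[]> {
    return readOnlyTransaction(db, async (trx) => {
        const [rows] = await trx.raw("SELECT name FROM datasets ORDER BY id", [])
        return (rows as { name: string }[]).map((row) => row.name)
    })
}

// Fake database (illustration only) that records the config it was given.
const seenConfigs: (TransactionConfig | undefined)[] = []
const fakeTrx: Queryable = {
    raw: async () => [[{ name: "a" }, { name: "b" }], []],
}
const fakeDb: Database = {
    async transaction<T>(
        handler: (trx: Queryable) => Promise<T>,
        config?: TransactionConfig
    ): Promise<T> {
        seenConfigs.push(config)
        return handler(fakeTrx)
    },
}
```

The point of the wrapper is that the database handle is obtained once at the entry point and only the transaction handle is passed further down.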

db/model/Source.ts Outdated Show resolved Hide resolved
db/db.ts Outdated
str: string,
params?: any[],
knex?: Knex<any, any[]>
): Promise<TRow[]> => (await (knex ?? knexInstance()).raw(str, params ?? []))[0]
Contributor

I understand that this is convenient but I really think we should only get the knexInstance at the top level entry point and not implicitly in a helper function. This stuff can lead to some nested code using the global knex instance where it actually should be using the knex instance that is handed down from the transaction scope.

Of course, in theory you could also just call knexInstance() in a nested utility function, but then this is an explicit call that is easy to find via grep.

Contributor

So I'm strongly in favour of making the knex parameter not optional
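
To make the concern concrete, here is a toy model of the failure mode. It does not use knex at all: `Conn`, `globalConn`, `beginTransaction`, `readOptional`, and `readRequired` are hypothetical names, and the in-memory maps only imitate the fact that a transaction sees its own uncommitted writes while the global connection does not.

```typescript
// Toy connection: anything that can read a value by key.
interface Conn {
    read(key: string): string | undefined
}

// Committed data, visible to every connection.
const committed = new Map<string, string>([["dataset:1", "old name"]])
const globalConn: Conn = { read: (key) => committed.get(key) }

// A toy transaction sees its own pending writes on top of committed data.
function beginTransaction(): Conn & { write(key: string, value: string): void } {
    const pending = new Map<string, string>()
    return {
        write: (key, value) => {
            pending.set(key, value)
        },
        read: (key) => pending.get(key) ?? committed.get(key),
    }
}

// Optional connection: silently falls back to the global one.
const readOptional = (key: string, conn?: Conn) =>
    (conn ?? globalConn).read(key)

// Required connection: the caller cannot forget to pass the transaction.
const readRequired = (conn: Conn, key: string) => conn.read(key)

const trx = beginTransaction()
trx.write("dataset:1", "new name")

// Forgetting to pass trx still compiles, but reads outside the transaction.
const stale = readOptional("dataset:1") // "old name"
// Here the transaction has to be handed down explicitly.
const fresh = readRequired(trx, "dataset:1") // "new name"
```

With the required parameter, leaving out the connection becomes a compile error instead of a silent read from the wrong scope.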

adminSiteServer/apiRouter.ts Outdated Show resolved Hide resolved
db/db.ts Outdated
export const knexRawFirst = async <TRow = unknown>(
str: string,
params?: any[],
knex?: Knex<any, any[]>
Contributor

Ditto here, of course.

@danyx23 danyx23 force-pushed the db-types-migrate-dataset-and-source branch 3 times, most recently from 486942e to ce49d9f Compare January 27, 2024 00:19
@danyx23 danyx23 changed the base branch from db-types to datasets-table-name-field-nullable January 27, 2024 01:29
@danyx23 danyx23 force-pushed the db-types-migrate-dataset-and-source branch 2 times, most recently from a834c76 to 435d84a Compare January 29, 2024 08:38
@danyx23 danyx23 force-pushed the datasets-table-name-field-nullable branch from 67b9b99 to c28f073 Compare January 29, 2024 14:11
@danyx23 danyx23 force-pushed the db-types-migrate-dataset-and-source branch from 435d84a to a35b059 Compare January 29, 2024 14:11
@danyx23 danyx23 force-pushed the datasets-table-name-field-nullable branch from c28f073 to 253ae3b Compare January 29, 2024 14:15
@danyx23 danyx23 force-pushed the db-types-migrate-dataset-and-source branch from a35b059 to 009d6fb Compare January 29, 2024 14:15
Base automatically changed from datasets-table-name-field-nullable to master January 31, 2024 08:39
This PR has had no activity within the last two weeks. It is considered stale and will be closed in 3 days if no further activity is detected.

@github-actions github-actions bot added the stale label Feb 15, 2024
@danyx23 danyx23 force-pushed the db-types-migrate-dataset-and-source branch from 009d6fb to 8c71da0 Compare February 22, 2024 13:31
@danyx23 danyx23 force-pushed the db-types-migrate-dataset-and-source branch from c538a40 to 92444de Compare February 27, 2024 12:07
Member Author

@sophiamersmann sophiamersmann left a comment

Good to go I think!

Before the name field was made non-nullable in the db, I added a check to make sure a dataset name is given. We could get rid of that now :)

return dataPackage
}

export function isDatasetWithName(
Member Author

I think we can remove this check (and all its references) now since we've made the dataset name non-nullable in the db.

})

try {
- await removeDatasetFromGitRepo(dataset.name, dataset.namespace, {
+ await removeDatasetFromGitRepo(dataset.name!, dataset.namespace, {
Member Author

Again, one of my changes... The '!' shouldn't be necessary anymore
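
As a rough illustration of why both the guard and the `!` become redundant once the column is non-nullable, consider the sketch below. The row types, `hasName`, and `repoPath` are hypothetical and only loosely modelled on the snippets above; the real `isDatasetWithName` signature is not shown in this thread.

```typescript
// Hypothetical row shape while `name` was still nullable in the db.
interface DatasetRowNullable {
    id: number
    name: string | null
    namespace: string
}

// Hypothetical row shape after the migration made `name` mandatory.
interface DatasetRow {
    id: number
    name: string
    namespace: string
}

// Hypothetical type guard of the kind that was needed before the migration.
function hasName(
    dataset: DatasetRowNullable
): dataset is DatasetRowNullable & { name: string } {
    return dataset.name !== null
}

// Stand-in for a function that requires a non-null name.
function repoPath(name: string, namespace: string): string {
    return `${namespace}/${name}`
}

const before: DatasetRowNullable = { id: 1, name: "population", namespace: "owid" }
const after: DatasetRow = { id: 1, name: "population", namespace: "owid" }

// Before: the compiler only accepts `before.name` after a guard (or a `!`).
const pathBefore = hasName(before)
    ? repoPath(before.name, before.namespace)
    : undefined

// After: the type already guarantees a string, so no guard and no `!`.
const pathAfter = repoPath(after.name, after.namespace)
```

Dropping the guard also removes the `undefined` branch that callers would otherwise have to handle.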

Contributor

danyx23 commented Feb 27, 2024

Merge activity

  • Feb 27, 10:59 AM EST: @danyx23 started a stack merge that includes this pull request via Graphite.
  • Feb 27, 10:59 AM EST: @danyx23 merged this pull request with Graphite.

@danyx23 danyx23 merged commit a2b1345 into master Feb 27, 2024
19 of 24 checks passed
@danyx23 danyx23 deleted the db-types-migrate-dataset-and-source branch February 27, 2024 15:59
2 participants