Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support new agate Integer data_type in adapter code #9004

Merged
merged 3 commits into from
Nov 7, 2023

Conversation

gshank
Copy link
Contributor

@gshank gshank commented Nov 6, 2023

resolves #8895

Problem

In #8153 we added an agate Integer data type, but did not include conversion methods that are used when constructing a csv table.

Solution

Add Integer type to 'convert_agate_type' and add new 'convert_integer_type' method.

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX
  • This PR includes type annotations for new and modified functions

@gshank gshank requested review from a team as code owners November 6, 2023 15:42
@cla-bot cla-bot bot added the cla:yes label Nov 6, 2023
Copy link

codecov bot commented Nov 6, 2023

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (7fddd6e) 86.51% compared to head (bce7d56) 86.50%.
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9004      +/-   ##
==========================================
- Coverage   86.51%   86.50%   -0.02%     
==========================================
  Files         179      179              
  Lines       26508    26526      +18     
==========================================
+ Hits        22934    22945      +11     
- Misses       3574     3581       +7     
Flag Coverage Δ
integration 83.38% <83.33%> (-0.03%) ⬇️
unit 64.79% <66.66%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
core/dbt/adapters/sql/impl.py 93.47% <100.00%> (+0.14%) ⬆️
core/dbt/adapters/base/impl.py 82.86% <66.66%> (-0.07%) ⬇️

... and 5 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -0,0 +1,6 @@
kind: Fixes
body: Fix exception running empty seed file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR and this changelog indicate we're solving multiple issues, can we add a second changelog to cover new data_type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't solve multiple issues, it's all the same issue. The problem with the empty seed file was happening because of the lack of support for the new Integer data type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you mean additional info in the existing changelog? Sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup! not questioning the PR just want to make sure our changelogs cover everything that's being changed

@colin-rogers-dbt
Copy link
Contributor

do we need to beef up our tests around dbt-show or docs to catch this? https://github.com/dbt-labs/dbt-core/blob/main/tests/functional/show/test_show.py#L104C3-L104C3

@gshank
Copy link
Contributor Author

gshank commented Nov 7, 2023

I think the "show" tests are adequate. The problem was that the show command was fixed by changing infrastructure that also affected the code used to create an empty table for an empty seed. I added a new seed test, but probably that test should be moved to the adapter zone seed tests.

@mikealfare
Copy link
Contributor

mikealfare commented Nov 7, 2023

I want to consider different scenarios with how adapter maintainers may have implemented this, and see how this solution would potentially affect them. (I'm basically processing out loud here.)

  1. The adapter has overwritten convert_agate_type.
  • In this scenario, the adapter remains stable because they don't actually pick up this change.
  1. The adapter has not overwritten convert_agate_type, but has an existing method convert_integer_type.
  • This scenario is extremely unlikely. And in the event that this occurred, I think it's fair to assume what convert_integer_type will be used for given the existing naming convention. If this breaks an adapter, I think it's fair to assume the adapter maintainer should update the adapter.
  1. The adapter has not overwritten convert_agate_type, nor is there an existing method convert_integer_type. Hence these changes are fully implemented. And 'integer' is a valid type for the database platform.
  • This is a likely scenario, and is the scenario we're solving for. We want this.
  1. The adapter has not overwritten convert_agate_type, nor is there an existing method convert_integer_type. Hence these changes are fully implemented. However, 'integer' is not a valid type for the database platform.
  • This is a less likely scenario, but not an unlikely scenario. This will break the adapter unexpectedly as Integer types will all of a sudden get converted whereas in the past they would have just flowed through. I'm concerned about this scenario.

@gshank / @colin-rogers-dbt, are there any other scenarios that you can think of? And in particular, are we comfortable with scenario 4 above, where we will break adapters which have not tried to implement something like this in the past and which do not support "integer" as a valid type?

@gshank
Copy link
Contributor Author

gshank commented Nov 7, 2023

The Integer agate type is already out in 1.7.0, so this fixes a regression. Right now, as far as I know, the main fallout from that is that empty seeds generate incorrect SQL (they have None as a type instead of "integer"). It's possible there are other edge cases that we just haven't encountered yet. So integer types were not actually "flowing through", they were failing.

I did put "convert_integer_type" into both base/impl.py and sql/impl.py, to cover as much ground as possible. I tried reusing the existing "convert_number_type" and having Integer inherit from Number, but there is code buried in the middle of agate that expects an actual float, not an integer, so that was a dead end.

@mikealfare
Copy link
Contributor

I'll focus again on scenario 4 above, since I think that's the only problematic scenario. Given the following observations:

  • this is broken (hence we need to do something)
  • Integer is only present in 1.6+
  • the adapter would need to have implemented Integer to be impacted by scenario 4
  • in the event scenario 4 is an issue, the fix is trivial (overwrite convert_integer_type to return the appropriate data type)
  • combining convert_integer_type with convert_number_type does not address the concern of attempting to use an invalid type of "integer" (and makes the code more complicated)

I think I'm comfortable with the approach. To confirm, are the observations above valid?

@colin-rogers-dbt
Copy link
Contributor

To confirm, are the observations above valid?

@mikealfare I believe so, I'm comfortable with this approach

@gshank
Copy link
Contributor Author

gshank commented Nov 7, 2023

Yes, those observations are valid.

Although it's not just that combining the integer and number types makes the code complicated, with the current method of calling MaxPrecision it can't be gotten to work... We'd have to update every "convert_number_type" method to do ... something else.

Thanks for all of the attention!

@gshank gshank merged commit 46b9a1d into main Nov 7, 2023
48 checks passed
@gshank gshank deleted the 8895-agate_integer_conversion branch November 7, 2023 18:43
Copy link
Contributor

github-actions bot commented Nov 7, 2023

The backport to 1.6.latest failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.6.latest 1.6.latest
# Navigate to the new working tree
cd .worktrees/backport-1.6.latest
# Create a new branch
git switch --create backport-9004-to-1.6.latest
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 46b9a1d621400be6204baed2680033ac0d5f1045
# Push it to GitHub
git push --set-upstream origin backport-9004-to-1.6.latest
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.6.latest

Then, create a pull request where the base branch is 1.6.latest and the compare/head branch is backport-9004-to-1.6.latest.

github-actions bot pushed a commit that referenced this pull request Nov 7, 2023
Add test for empty seed file with only headers

(cherry picked from commit 46b9a1d)
gshank added a commit that referenced this pull request Nov 8, 2023
…code (#9004)

Add test for empty seed file with only headers

(cherry picked from commit 46b9a1d)
gshank added a commit that referenced this pull request Nov 8, 2023
…code (#9004) (#9036)

Add test for empty seed file with only headers

(cherry picked from commit 46b9a1d)
@aranke aranke mentioned this pull request Jul 12, 2024
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-3259] [Regression] Running empty seed file raises unhandled exception
3 participants