[Feature] Bump pydantic to >=2.0.0 #134

Closed
3 tasks done
Tracked by #140 ...
NiallRees opened this issue Aug 11, 2023 · 16 comments · Fixed by #238
Labels: enhancement (New feature or request), tech_debt

Comments

@NiallRees

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward change to existing dbt-semantic-interfaces functionality, rather than a Big Idea better suited to a discussion

Describe the feature

This package currently blocks users of dbt-core from upgrading their Python projects to pydantic 2.0.0, which brings well-documented performance improvements.

Describe alternatives you've considered

No response

Who will this benefit?

No response

Are you interested in contributing this feature?

No response

Anything else?

No response

@NiallRees NiallRees added the enhancement New feature or request label Aug 11, 2023
@Jstein77

Jstein77 commented Aug 14, 2023

Thanks for adding this issue Niall! Let me chat with the team about what it would take to upgrade pydantic.

@QMalcolm
Collaborator

Thank you for the call out Niall 🙂 We've actually been waiting for the release of pydantic 2.0 for a while, specifically because of the speed gains. Glad to see it's finally happened 🚀

We should absolutely look at moving to pydantic 2 for this project. I was hoping it would be straightforward. However, it looks like it's going to be a bit hands-on. A number of things have been deprecated, Optional fields no longer default to None automatically, int/bool values are no longer gracefully coerced when parsing raw input into a string field, and so on. Full migration guide details can be found here.
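Two of the behaviour changes mentioned above can be reproduced with a small sketch. It assumes pydantic>=2 is installed; the Dimension model and its fields are hypothetical and are not taken from dbt-semantic-interfaces.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Dimension(BaseModel):
    """Hypothetical model, for illustration only."""

    name: str
    # Pydantic 2: Optional[...] without a default is a REQUIRED field
    # that merely allows None as a value.
    description: Optional[str]
    # An explicit default restores the Pydantic 1 behaviour.
    label: Optional[str] = None


def error_types(**kwargs):
    """Return (field location, error type) pairs for a failed validation."""
    try:
        Dimension(**kwargs)
    except ValidationError as exc:
        return [(err["loc"], err["type"]) for err in exc.errors()]
    return []


# `description` is missing, so validation fails under Pydantic 2.
missing = error_types(name="revenue")

# An int is no longer coerced to str for a `str` field.
coerced = error_types(name=123, description=None)

# Passing everything explicitly validates cleanly.
ok = error_types(name="revenue", description=None)
```

Under Pydantic 1 the first two calls would have validated without errors, which is why a migration tends to surface many failures at once.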

@QMalcolm
Collaborator

I've started a branch qmalcolm--pydantic-2-support

@NiallRees
Author

Thanks @QMalcolm !

@elongl

elongl commented Sep 18, 2023

Hi @QMalcolm, can I help with anything?
There's also the bump-pydantic tool, by the way.

@r-richmond

Echoing the previous comment. @QMalcolm and/or other DBT folks is there anything that you need help with on this?
I saw we had this as a wishlist item for 0.3 but it was missed from 0.4

@r-richmond r-richmond mentioned this issue Nov 6, 2023
@gnilrets

I want to put dbt in a dagster environment where we're using pydantic 2.0. Can't do that until this is resolved.

@christeefy

Curious to hear on the status of this work as well, and if there's anything we can do to help!

@18heim

18heim commented Nov 13, 2023

Looking forward to this dev !

@QMalcolm
Collaborator

Hoooboy do I need to pay more attention to my github notifications 😅 We absolutely want to get dbt-semantic-interfaces migrated to Pydantic 2. However, we don't currently have a good idea of where that falls on our timeline. That said, we'd absolutely welcome community contributions to make it happen.

@r-richmond

That said, we'd absolutely welcome community contributions to make it happen.

Thanks for clarifying that @QMalcolm. I gave it a quick attempt and didn't get too far given my other time constraints. For anyone else who might want to try a community contribution, I can say that getting the dev environment up and running with tests was pretty smooth, but expect the migration itself to be a bit more work than you think :).

@esciara

esciara commented Nov 24, 2023

Hi,

I wanted to help out, so I had a look.

The change does not seem straightforward at all.

@QMalcolm I tried to follow the same approach as you and started from your branch on my own fork. I did the following:

  • I rebased with main to be up-to-date.
  • I updated the dependency to pydantic~=2.5 to start with the latest version.
  • I ran the bump-pydantic tool as suggested by @elongl .
  • I organised the changes in commits, mimicking the way you did it to separate the changes in stages.
  • I committed the code that still needs work (as suggested by bump-pydantic) in a "WIP" commit.

A lot of tests are not passing. I dug into some of them.

One of the main issues is that the deprecation of __get_validators__ has a big impact, especially because you used it to instantiate objects during deserialisation not from a dict or another instance of the same class, but from base types such as str (here is an example with PydanticWhereFilter). I could not find a way to emulate that with __get_pydantic_core_schema__ (supposedly the replacement for __get_validators__).

Another issue is that, for some reason, validation fails on XxxAsPydantic classes because it thinks the instances passed in do not have the expected type, even though they seem to. This probably has to do with them being created dynamically. Here is the result of running:
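One possible way to accept a bare str for a model-typed field in Pydantic 2 is a "before" model validator, sketched below. This assumes pydantic>=2 and is not necessarily what the linked PR ended up doing; the WhereFilter and Metric classes are hypothetical stand-ins, not the real PydanticWhereFilter.

```python
from typing import Any, Optional

from pydantic import BaseModel, model_validator


class WhereFilter(BaseModel):
    """Hypothetical stand-in for a filter that can be written as a plain string."""

    where_sql_template: str

    @model_validator(mode="before")
    @classmethod
    def _accept_plain_string(cls, data: Any) -> Any:
        # A "before" validator sees the raw input, so a bare str can be
        # turned into the dict shape the model expects.
        if isinstance(data, str):
            return {"where_sql_template": data}
        return data


class Metric(BaseModel):
    """Hypothetical parent model with a nested filter field."""

    name: str
    filter: Optional[WhereFilter] = None


# The nested field is given as a plain string rather than a dict.
metric = Metric.model_validate({"name": "bookings", "filter": "is_instant = true"})
template = metric.filter.where_sql_template
```

The dict and instance inputs keep working because the validator passes anything that is not a str through unchanged.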

test_dataclass_serialization.py::test_nested_dataclass FAILED            [100%]
test_dataclass_serialization.py:93 (test_nested_dataclass)

[..]

>           return PydanticModel(**field_values)
E           pydantic_core._pydantic_core.ValidationError: 1 validation error for NestedDataclassAsPydantic
E           field1
E             Input should be a valid dictionary or instance of SimpleDataclassAsPydantic [type=model_type, input_value=SimpleDataclassAsPydantic(field0=1), input_type=SimpleDataclassAsPydantic]
E               For further information visit https://errors.pydantic.dev/2.5/v/model_type

../dbt_semantic_interfaces/dataclass_serialization.py:223: ValidationError

Any guidance or clues?

@esciara

esciara commented Nov 24, 2023

PR done. All tests passing. 😉

@QMalcolm
Collaborator

Bit of an update 😅 This is a copy and paste of my comment on the PR

As we've been reviewing this we also realized we are likely going to have to support both Pydantic 1 and Pydantic 2 for quite some time. About 60 percent of people using Pydantic are still on Pydantic 1. Pydantic 2 is very exciting. It's faster, and we plan to support it in 0.5 of DSI. Although we'd love to cut directly to just supporting Pydantic 2, that'll likely cause immediate pain for the community.

In regards to this PR, we still want to get it in. Currently we're investigating whether the changes in this PR will also work with any Pydantic 1 versions (hopefully at least 1.10.x). If that is not the case, we're then going to investigate if that would be possible with some slight tweaks to this PR. If that does not work, there are some crunchier ways for us to support both which we'd likely do in a separate PR if it came to that.
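Supporting both major versions usually starts with detecting which one is installed. The sketch below shows one generic way to do that with the standard library; it is an illustration only, not the approach the maintainers chose, and the function names and returned labels are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_major(package: str) -> Optional[int]:
    """Return the installed major version of `package`, or None if absent."""
    try:
        return int(version(package).split(".")[0])
    except PackageNotFoundError:
        return None


def choose_code_path(major: Optional[int]) -> str:
    """Pick a (hypothetical) compatibility branch from the major version."""
    if major is None:
        return "not-installed"
    return "v1-api" if major < 2 else "v2-api"


# Evaluated once at import time; a library could branch on this value
# when deciding which validator hooks to define.
PYDANTIC_MAJOR = installed_major("pydantic")
CODE_PATH = choose_code_path(PYDANTIC_MAJOR)
```

Branching on the version keeps one code base, at the cost of carrying two sets of validator definitions until Pydantic 1 support can be dropped.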

@WillAyd

WillAyd commented Jan 2, 2024

Looking forward to this change - the associated PR looks great. In case it's helpful to have examples where this is problematic, Airflow 2.7.0 onwards uses pydantic 2.4.2 in its constraint files.

https://raw.githubusercontent.com/apache/airflow/constraints-2.7.2/constraints-3.11.txt

@QMalcolm
Collaborator

QMalcolm commented Jan 3, 2024

With the merging of #238, pydantic 2 support will soon, but not immediately, be available in dbt-semantic-interfaces and, transitively, in dbt-core. This is because dbt-core 1.7 depends on dbt-semantic-interfaces 0.4.latest. To get the changes of #238 out in a 0.4.x version we first need to backport them to the 0.4.latest branch and then do a patch release from that branch, which will result in dbt-semantic-interfaces 0.4.3. At that point fresh installs of dbt-core 1.7 will automatically pick up the changes from #238.
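Why a patch release is picked up automatically can be sketched with a simplified compatible-release check in the style of PEP 440's `~=` operator. The pin `~=0.4.2` used below is an assumption for illustration; the thread only says that dbt-core 1.7 depends on the 0.4 line. The helper handles plain numeric versions only.

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Parse a plain numeric version such as '0.4.3' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_compatible_release(candidate: str, base: str) -> bool:
    """Simplified `~=base` check: candidate >= base and same prefix.

    For base '0.4.2' this means >= 0.4.2 and == 0.4.* (PEP 440 semantics,
    ignoring pre-releases and other suffixes).
    """
    cand, pinned = parse(candidate), parse(base)
    prefix_len = len(pinned) - 1
    return cand >= pinned and cand[:prefix_len] == pinned[:prefix_len]


# Assumed pin, for illustration only.
ASSUMED_PIN = "0.4.2"

picks_up_patch = is_compatible_release("0.4.3", ASSUMED_PIN)
picks_up_minor = is_compatible_release("0.5.0", ASSUMED_PIN)
```

Under that assumption a fresh install resolves to 0.4.3 once it is published, while a 0.5.0 release would not be selected, which is why the change has to be backported to the 0.4 branch.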

@QMalcolm QMalcolm self-assigned this Jan 4, 2024