Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(bindings/python): generate python operator constructor types #5457

Merged
merged 15 commits into from
Dec 27, 2024

Conversation

trim21
Copy link
Contributor

@trim21 trim21 commented Dec 25, 2024

Which issue does this PR close?

Closes #.

Rationale for this change

What changes are included in this PR?

A python pyi generator

This PR requires #5463

a problem I found:

FsConfig.root is Option<String> which is actually required.

pub struct FsConfig {
/// root dir for backend
pub root: Option<String>,

let root = match self.config.root.map(PathBuf::from) {
Some(root) => Ok(root),
None => Err(Error::new(
ErrorKind::ConfigInvalid,
"root is not specified",
)),
}?;

Are there any user-facing changes?

Yes, add python typing.

@trim21 trim21 force-pushed the gen-pyi branch 10 times, most recently from eaadb04 to 34c2d63 Compare December 25, 2024 18:46
Copy link
Member

@messense messense left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM, but I think we should revist the _bool and _int type aliases later (due to the use of Operator::from_iter that only accepts a iterator of (String, String)), IMHO at least the _int type should accepts int.

@trim21

This comment was marked as off-topic.

@trim21
Copy link
Contributor Author

trim21 commented Dec 26, 2024

Mostly LGTM, but I think we should revist the _bool and _int type aliases (due to the use of Operator::from_iter that only accepts a iterator of (String, String)), IMHO at least the _int type should accepts int.

pyo3 doesn't do implicit type conversion here:

Traceback (most recent call last):
  File "C:\Users\Trim21\proj\test\f.py", line 4, in <module>
    opendal.Operator("redis", db=1)
pyo3_runtime.PanicException: must be valid hashmap: PyErr { type: <class 'TypeError'>, value: TypeError("'int' object cannot be converted to 'PyString'"), traceback: None }

@trim21 trim21 changed the title dev: generate python types feat: generate python operator constructor types Dec 26, 2024
@Xuanwo
Copy link
Member

Xuanwo commented Dec 26, 2024

FsConfig.root is Option<String> which is actually required.

I think we can fix it from the core side.

dev/src/generate/binding_python.rs Outdated Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we just generate the .pyi code here. Would it work better if we had classes or types like S3Config on the Python side? This way, we could include more information, such as comments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, cc @trim21 @messense what do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other concern here is that I would prefer not to maintain .pyi files ourselves in the future. We should use tools like a stub file generator, as mentioned in PyO3/pyo3#510, such as https://crates.io/crates/pyo3-stub-gen (raising by @trim21).

Instead, we can simply keep the struct in Python with comments and let other tools generate the .pyi files for us.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it work better if we had classes or types like S3Config on the Python side? This way, we could include more information, such as comments.

I think having Python config types is better, but it can be done later.

Use pyo3-stub-gen would solve typing issue in python side (assuming it can satisfy our needs, I've never tried it so I'm not sure), not sure about other bindings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we may need to rewrite some rust code to make it work with pyo3-stub-gen

dev/src/generate/binding_python.rs Outdated Show resolved Hide resolved
dev/src/generate/parser.rs Outdated Show resolved Hide resolved
@Zheaoli
Copy link
Member

Zheaoli commented Dec 26, 2024

For me, the PR mostly LGTM, but some idea

  1. We need to think more about _int type, a str type but with integer semantic
  2. If we can drop support for the version under 3.12, we may can use typedict for kwargs, FYI https://peps.python.org/pep-0692/

@trim21
Copy link
Contributor Author

trim21 commented Dec 26, 2024

For me, the PR mostly LGTM, but some idea

  1. We need to think more about _int type, a str type but with integer semantic
  2. If we can drop support for the version under 3.12, we may can use typedict for kwargs, FYI https://peps.python.org/pep-0692/

We do not need to drop py<3.12 to use this feature, pyi file is not runtime code, so even in old python as long as type checker support pep692, it's still fine to use it. (I'm not sure how old version type checker will behave if it doesn't support pep692)

For example, we generate XConfig to opendal/config.py and import them from opendal/__init__.pyi and unpack them.

from typing import NotRequired, TypeAlias
from typing_extensions import TypedDict

# `true`/`false`` in any case, for example, `true`/`True`/`TRUE` `false`/`False`/`FALSE`
_bool: TypeAlias = str
# a str represent a int, for example, `"10"`/`"0"`
_int: TypeAlias = str

# a human readable duration string
# see https://docs.rs/humantime/latest/humantime/fn.parse_duration.html
# for more details
_duration: TypeAlias = str

# A "," separated string, for example `"127.0.0.1:1,127.0.0.1:2"`
_strings: TypeAlias = str


class S3Config(TypedDict):
    bucket: str
    root: NotRequired[str]
    enable_versioning: NotRequired[_bool]
    endpoint: NotRequired[str]
    region: NotRequired[str]
    access_key_id: NotRequired[str]
    secret_access_key: NotRequired[str]
    session_token: NotRequired[str]
    role_arn: NotRequired[str]
    external_id: NotRequired[str]
    role_session_name: NotRequired[str]
    disable_config_load: NotRequired[_bool]
    disable_ec2_metadata: NotRequired[_bool]
    allow_anonymous: NotRequired[_bool]
    server_side_encryption: NotRequired[str]
    server_side_encryption_aws_kms_key_id: NotRequired[str]
    server_side_encryption_customer_algorithm: NotRequired[str]
    server_side_encryption_customer_key: NotRequired[str]
    server_side_encryption_customer_key_md5: NotRequired[str]
    default_storage_class: NotRequired[str]
    enable_virtual_host_style: NotRequired[_bool]
    # deprecated: Please use `delete_max_size` instead of `batch_max_operations`
    batch_max_operations: NotRequired[_int]
    delete_max_size: NotRequired[_int]
    disable_stat_with_override: NotRequired[_bool]
    checksum_algorithm: NotRequired[str]
    disable_write_with_if_match: NotRequired[_bool]
from typing import overload, Literal, Unpack

from opendal.config import S3Config

class _Base:
    """this is not a real base class but typing mixin,

    The services list here is support by opendal pypi wheel.
    """

    @overload
    def __init__(
        self,
        scheme: Literal["s3"],
        **kwargs: Unpack[S3Config],
    ) -> None: ...
    @overload
    def __init__(self, scheme: str, **kwargs: str) -> None: ...

@trim21 trim21 changed the title feat: generate python operator constructor types feat(python): generate python operator constructor types Dec 27, 2024
@trim21 trim21 changed the title feat(python): generate python operator constructor types feat(bindings/python): generate python operator constructor types Dec 27, 2024
@trim21
Copy link
Contributor Author

trim21 commented Dec 27, 2024

I guess rust code is fine for now? Still need to resolve what kind of typing we want to generate.

Copy link
Member

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @trim21 for working this, really great! Also thanks @messense and @Zheaoli's review. Let's move on!

@Xuanwo Xuanwo merged commit 6ccef00 into apache:main Dec 27, 2024
61 checks passed
@Xuanwo
Copy link
Member

Xuanwo commented Dec 27, 2024

Hi, @trim21 this will be large task, would you like to create a tracking issue for the progress?

@trim21
Copy link
Contributor Author

trim21 commented Dec 27, 2024

Thank you @trim21 for working this, really great! Also thanks @messense and @Zheaoli's review. Let's move on!

I thought we still have some opinion on _int _duration type alias?

@trim21 trim21 deleted the gen-pyi branch December 27, 2024 06:02
@messense
Copy link
Member

I thought we still have some opinion on _int _duration type alias?

It's not a blocking issue, this improves the status quo of typing for python binding already. The rest can be followed up later. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants