-
Notifications
You must be signed in to change notification settings - Fork 504
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: initial Operator::from_uri
implementation
#5482
base: main
Are you sure you want to change the base?
Conversation
/// | ||
/// Ok(()) | ||
/// ``` | ||
pub fn from_uri( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this method be called via_uri
?
The from_iter
method returns an OperatorBuilder
instead of Operator
opendal/core/src/types/operator/builder.rs
Line 147 in 6eed396
) -> Result<OperatorBuilder<impl Access>> { |
and the via_iter
method returns an Operator
(as this from_uri
method does)
opendal/core/src/types/operator/builder.rs
Line 185 in 6eed396
) -> Result<Operator> { |
Should we include two new methods? via_uri
and from_uri
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from_uri
is good enough, all via_xxx
API may be removed in the future.
} | ||
} | ||
|
||
fn operator_factory_from_configurator<C: Configurator>() -> OperatorFactory { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
included this private function to reduce boilerplate of operator factory generation. Any suggestion about it or is it ok to have it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to remove operator_factory_from_configurator
.
@@ -0,0 +1,226 @@ | |||
use std::cell::LazyCell; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this file inside the crate::types::operator::registry
module. Is it ok? I thought about adding a crate::types::operator_registry
, but it seemed better this way.
|
||
// TODO: thread local or use LazyLock instead? this way the access is lock-free | ||
// TODO: should we expose the `GLOBAL_OPERATOR_REGISTRY` as public API at `crate::types::operator::GLOBAL_OPERATOR_REGISTRY`? | ||
thread_local! { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the preferred way of having a global static variable such as this?
I prefer to have it thread_local
so there is not need for a LazyLock
, we can use LazyCell
instead (LazyCell
is lock-free but LazyLock
is not, it synchronizes access through threads)
core/src/types/operator/registry.rs
Outdated
// TODO: thread local or use LazyLock instead? this way the access is lock-free | ||
// TODO: should we expose the `GLOBAL_OPERATOR_REGISTRY` as public API at `crate::types::operator::GLOBAL_OPERATOR_REGISTRY`? | ||
thread_local! { | ||
pub static GLOBAL_OPERATOR_REGISTRY: LazyCell<OperatorRegistry> = LazyCell::new(|| OperatorRegistry::with_enabled_services()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MSRV check fails due to the usage of LazyCell
. Should we update the MSRV or use another thing instead?
I see this TODO
about the once_cell
crate usage:
Line 248 in 52c96bb
# TODO: remove once_cell when lazy_lock is stable: https://doc.rust-lang.org/std/cell/struct.LazyCell.html |
If the replacement is planned, I think it would be better to use LazyCell
than once_cell::Lazy
in new code like this one, to not increase technical debt.
@@ -255,6 +255,9 @@ reqwest = { version = "0.12.2", features = [ | |||
serde = { version = "1", features = ["derive"] } | |||
serde_json = "1" | |||
tokio = { version = "1.27", features = ["sync", "io-util"] } | |||
# TODO: I added this dependency in order to not re-implement the Url::query_pairs function. | |||
# Is it okay to include it? (we are using that crate in the ofs binary https://github.com/apache/opendal/blob/main/bin/ofs/Cargo.toml#L45 | |||
url = "2.5.4" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use http::uri
instead. During the development of core
, there is no need to reference other download stream projects. Simply make the best decision based on the current core
code.
@@ -137,6 +137,51 @@ pub trait Configurator: Serialize + DeserializeOwned + Debug + 'static { | |||
}) | |||
} | |||
|
|||
// TODO: should we split `from_uri` into two functions? `from_uri` and `from_uri_opts`? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't worth it.
// Register only services in `Scheme::enabled()` | ||
|
||
pub struct OperatorRegistry { | ||
// TODO: add Arc<Mutex<...>> to make it cheap to clone + thread safe? or is it not needed? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense to me; we rarely change it or keep the lock engaged for long.
} | ||
|
||
impl OperatorRegistry { | ||
pub fn new() -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's initialize the registry here directly.
} | ||
} | ||
|
||
fn operator_factory_from_configurator<C: Configurator>() -> OperatorFactory { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to remove operator_factory_from_configurator
.
// ``` | ||
// and the apply_for_all_services macro would gate every statement behind the corresponding feature gate | ||
// This seems to not be the place where we should have a "list of enabled services". | ||
#[cfg(feature = "services-aliyun-drive")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe every service should have its own registration functions. There are two reasons:
- The service may have different schemes for registration. For example, S3 registers as
s3
,s3a
,minio
,r2
, and so on, while GCS registers asgcs
,gs
, and so forth. - Provide a register that simplifies integration with OpenDAL for external users.
Relates to #5445
Left some doubts as
// TODO
in the source. I have little experience contributing to this repo so I'm sorry if there are a lot of doubts about this. Just want to be sure all the changes of this PR aligns with the current codebase. Please take a look to all theTODOs
I left when reviewing.I would like to add more tests, but I don't know in which place those should be placed. The
core/tests
folder seems like a good place, but I don't find any place suitable, as placing those incore/tests/behaviour
seems weird to me. But as this implies various components, maybe we can have acore/tests/integration
? Although I would like to write some unit tests atcore/src/types/builder.rs
,core/src/types/operator/builder.rs
andcore/src/types/operator/registry.rs
, but didn't any existing unit tests there.In this PR I implemented a single
Configurator::from_uri
method, which will serve as default and takes only the query parameters as options. Services which need a more specific configuration such as s3 or azblob can be implemented in follow-up PRs.I also have pending documentating all the newly added public API, but will do that after an initial review round.
Thank you very much.