-
Notifications
You must be signed in to change notification settings - Fork 263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PHPLIB-1180: Add basic infrastructure for codecs #1125
PHPLIB-1180: Add basic infrastructure for codecs #1125
Conversation
36c0c20
to
37eb67c
Compare
8fa0518
to
74cf9bb
Compare
@GromNaN thank you for the feedback, especially on the architecture doc. I've reworded a lot in the document and better outlined future work. I've also added descriptions to the codec interfaces and the methods therein. |
74cf9bb
to
78dc173
Compare
src/Codec/Encoder.php
Outdated
use MongoDB\Exception\InvalidArgumentException; | ||
|
||
/** | ||
* @internal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it internal?
I expect there will be cases where you only need to encode or decode, and not both. For example: you want to decode the result of an aggregation with a particular structure: you don't need to write the code to encode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I was aware of an encoder-only use case for the aggregation pipeline builder, but that would use a more specific interface extending Encoder
. The use case of only needing to Decode documents for results from an aggregation pipeline completely slipped my mind. I'll remote the internal
designation from the two interfaces.
@@ -86,7 +87,8 @@ public function provideProjectClassNames() | |||
continue; | |||
} | |||
|
|||
$classNames[][] = 'MongoDB\\' . str_replace(DIRECTORY_SEPARATOR, '\\', substr($file->getRealPath(), strlen($srcDir) + 1, -4)); | |||
$className = 'MongoDB\\' . str_replace(DIRECTORY_SEPARATOR, '\\', substr($file->getRealPath(), strlen($srcDir) + 1, -4)); | |||
$classNames[$className][] = $className; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted this this ensures the class names gets used as a more descriptive data provider name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could yield the result directly.
$classNames[$className][] = $className; | |
yield $className => [$className]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An evil code shortcut:
$classNames[$className][] = $className; | |
yield $className => $className = 'MongoDB\\' . str_replace(DIRECTORY_SEPARATOR, '\\', substr($file->getRealPath(), strlen($srcDir) + 1, -4)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd leave that out to not have more changes than necessary here. That said, I've created PHPLIB-1191 to track making data providers static and refactoring them to return Generator
instances instead of arrays.
tests/Codec/ArrayCodecTest.php
Outdated
$value = [ | ||
'magic', | ||
'cigam', | ||
]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On my first read through the file, I didn't realize that the magic
and cigam
strings were being encoded and decoded, respectively. I think this would benefit from a comment (at least on the first test case).
Alternatively, replace the string values with something more self-documenting, such as "encoded" and "decoded", so it's clear that decode()
ends up changing the only "encoded" value to "decoded" and leaves the other "decoded" value as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I've updated the test values.
78dc173
to
bc40f96
Compare
@jmikola @GromNaN Thank you for your reviews. I went through the comments and have made numerous changes. I've removed the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed the ArrayCodec and ObjectCodec classes
Glad to hear. I wasn't sure what those were actually used for (within the scope of this PR).
Just a few more comments/thoughts about the exception classes.
use function get_debug_type; | ||
use function sprintf; | ||
|
||
class UnencodableValueException extends InvalidArgumentException implements Exception |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some thoughts on this.
For GridFS, we lumped its three exceptions under a MongoDB\GridFS\Exception
namespace. Is that something we'd want to do for Codec as well, assuming these exceptions will only be used by classes in MongoDB\Codec
?
I think the only trade-off is that it makes the use
statement a bit longer, but it does convey that these are codec exceptions.
And a separate question for @GromNaN, since you suggested using dedicated exceptions. Do you feel strongly about having a separate exception for each method, or could there be a common UnsupportedValueException thrown by both encode()
and decode()
? I can't predict all ways the codec feature will be used, but I don't expect it'd be common for applications to encode and decode within the same try/catch
scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏻 for moving to MongoDB\Codec\Exception
namespace.
This exceptions are LogicException
, meaning they may pop in dev and never in prod. The message is more important than the exception class. I don't really know if it's better to have 1 or 2 classes.
I find it useful that @alcaeus added the $value
property, so that it can be inspected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In hindsight, I regret that we introduced so many exceptions in the initial release of PHPC. There are cases where different exceptions made sense (e.g. CommandException, ServerException, WriteException), but there are an equal number of mistakes (e.g. SSLConnectionException).
In this case, both exceptions expose a $value
property and there should be little confusion about the context. I'd very much be in favor of using a common exception to reduce the cognitive and documentation overhead.
private $value; | ||
|
||
/** @param mixed $value */ | ||
public function __construct($value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we end up sharing exceptions for encode
and decode
, I'd suggest making the constructor private and creating factory methods like we do for other exceptions.
I might suggest doing this regardless, as it keeps this non-internal class more flexible. By overriding the constructor here, you're making it difficult to ever pass original constructor arguments to InvalidArgumentException (e.g. code, previous exception).
My suggested factory method names would be:
- If a common UnsupportedValueException:
invalidEncodableType()
andinvalidDencodableType()
- If separate classes:
invalidType()
bc40f96
to
283eb0c
Compare
283eb0c
to
09f5475
Compare
@@ -0,0 +1,16 @@ | |||
<?php | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget to add copyright copypasta to all of the source files:
/*
* Copyright 2023-present MongoDB, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
Not needed in tests and I think we can skip this for the stubs, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM once you add copyright notices to source files and respond to my question about the CodecLibrary constructor.
private $encoders = []; | ||
|
||
/** @param Decoder|Encoder $items */ | ||
public function __construct(...$items) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before we settle on this signature, I just want to confirm that you don't expect we'd ever need to solicit additional params in the constructor (e.g. options array).
If so, consider making this an array argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so - this particular library is a first-come-first-serve library. If we have the need for different behaviour, I'd rather create a separate library class for the purpose.
use function get_debug_type; | ||
use function sprintf; | ||
|
||
class UnsupportedValueException extends InvalidArgumentException implements Exception |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed this wasn't in the Codec namespace. SGTM as this seems generally useful for casts where the argument isn't invalid simply because of its type and the actual value might be relevant.
For the existing cases in InvalidArgumentException, I think we only care about the type itself so there's no need to attach the value to the exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack - I figured a new namespace for this one exception might be overkill.
instance to guarantee that data always encodes to a BSON document and decodes to a PHP object. | ||
|
||
All operations that return documents will use the codec to decode the documents into PHP objects. This includes | ||
the various `find` and `findAndModify` operations in collections as well as the `aggregate` and `watch` operations in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alcaeus: outstanding question above. It doesn't affect my review, but I'd be curious to know before this gets picked up in a future PR.
This commit also uses dataset names for better visibility during debugging
The new method name is more specific and helps prevent naming conflicts
09f5475
to
6a70032
Compare
PHPLIB-1180
This follows the prototyping work done in #1059.
This PR only introduces the codec interfaces and two standard codecs to traverse arrays and objects.Support for lazy BSON deserialisation will be added in a separate PR.Compared to the prototype PR, the following changes have been made to the codecs:
encodeIfSupported
anddecodeIfSupported
methods (suggested during review)null
in CodecLibrary has been removedOther comments in the prototype PR concerned other areas (such as lazy BSON structures), those will be worked into future pull requests.