Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PHPLIB-1209: Add tutorial to show working with Codecs #1154

Merged
merged 18 commits into from
Sep 15, 2023

Conversation

alcaeus
Copy link
Member

@alcaeus alcaeus commented Sep 7, 2023

PHPLIB-1209

The tutorial uses the same classes and functionality we show in other BSON tutorials, namely the Person class to show serialisation (this is used to show the behaviour of Persistable and Serializable interfaces), and our example of storing a date as UTCDateTime with timezone information.

@alcaeus alcaeus requested a review from jmikola September 7, 2023 07:52
@alcaeus alcaeus self-assigned this Sep 7, 2023
Copy link
Member

@GromNaN GromNaN left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need an example with a nested document person.city

docs/tutorial/codecs.txt Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
Comment on lines 281 to 311
return MongoDB\BSON\Document::fromPHP([
'street' => $value->street,
'postCode' => $value->postCode,
'city' => $value->city,
'country' => $value->country,
]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would performance be better without an intermediate array?

Suggested change
return MongoDB\BSON\Document::fromPHP([
'street' => $value->street,
'postCode' => $value->postCode,
'city' => $value->city,
'country' => $value->country,
]);
$doc = new MongoDB\BSON\Document();
$doc->set('street', $value->street);
$doc->set('postCode, $value->postCode);
$doc->set('city', $value->city);
$doc->set('country', $value->country);
return $doc;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The BSON document class is currently read-only, so the only way to create one is through fromPHP or the other factory methods. The reason for this is that we only keep a bson_t* internally, which is essentially a byte stream of the BSON. In-place replacement only works for types with the same size, e.g. overwriting an int with another int. As soon as values grow or shrink in size, the BSON needs to be rewritten (i.e. create a new bson_t*, copy over everything except the property being changed, then write the changed property). Since the main motivation for adding the Document and PackedArray classes was to provide a way to access BSON data without decoding it to PHP objects, we refrained from adding a BSON writing mechanism. That said, there is PHPC-2172 which tracks creating an implementation of a BSON writer.

docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
@alcaeus alcaeus force-pushed the phplib-1209-codec-tutorial branch from aeec90d to 0037220 Compare September 11, 2023 06:50
Copy link
Member

@GromNaN GromNaN left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
@alcaeus alcaeus force-pushed the phplib-1209-codec-tutorial branch from 87094fe to 36f4c51 Compare September 11, 2023 08:50
docs/tutorial.txt Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
@alcaeus alcaeus force-pushed the phplib-1209-codec-tutorial branch from 36f4c51 to d193e7b Compare September 12, 2023 11:13
@alcaeus alcaeus requested review from GromNaN and jmikola September 13, 2023 08:40
@alcaeus alcaeus force-pushed the phplib-1209-codec-tutorial branch from 2c7ffb4 to 7123d47 Compare September 13, 2023 08:41
* the correct type. This is a robust approach to avoid decoding document that are not supported and would cause
* exceptions.
*
* For large documents, this can be inefficient as we're inspecting the entire document 4 times (once for each
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking...

It's 8 times if you call decodeIfPossible as the decode calls canDecode.

Also, we don't know which document will be tested. It's not because this codec works for small documents that this function will not check huge ones.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct. Inspecting large documents will be slow, but since we've removed libraries users are unlikely to run into such scenarios as they'd be using specific codecs for specific fields.

public function decode(mixed $value): DateTimeImmutable
{
if (! $this->canDecode($value)) {
throw UnsupportedValueException::invalidDecodableValue($value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you propose to add a method to the trait that wraps this generic 3 lines?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Will add in a separate PR.


return new Person(
$value->get('name'),
$this->dateTimeCodec->decode($value->get('createdAt')),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you get a nested BSON document, does it duplicates the data in memory or is it a reference to the parent data?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It duplicates the data in memory. Each Document and PackedArray instance is tied to its own bson_t* so we don't have to start refcounting the libmongoc BSON structures.

@@ -0,0 +1,7 @@
<?php

// Returns a document using the default type map
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you set a typemap, does it disable the codec?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good point. Specifying an operation-level typeMap option also overrides the collection-level codec. I've added a sentence explaining this and updated the example to show both ways.

returned, an exception will be thrown.

You can disable codec usage for a specific operation or use a different codec (e.g. to decode the result of an
aggregation pipeline) by specifying the ``null`` for the ``codec`` option for any operation. This will override the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
aggregation pipeline) by specifying the ``null`` for the ``codec`` option for any operation. This will override the
aggregation pipeline) by specifying ``null`` for the ``codec`` option for any operation. This will override the

docs/examples/codecs/handling-data-types/DateTimeCodec.php Outdated Show resolved Hide resolved
docs/examples/codecs/handling-data-types/DateTimeCodec.php Outdated Show resolved Hide resolved
docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
Comment on lines 19 to 21
other methods (e.g. type maps), codecs allow for greater customization and handling of different data types. They allow
separating the logic for BSON encoding and decoding from the domain classes, allowing to decode BSON into plain old PHP
objects (POPOs).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enjoy this gem I came across while googling for "POPOs": http://www.javaleaks.org/open-source/php/plain-old-php-object.html

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll have an army of lawyers suing you for making POPOs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha. Leaving this unresolved so that someone can eventually stumble upon this while cave diving into old pull requests.

docs/tutorial/codecs.txt Outdated Show resolved Hide resolved
handles a particular data type in any document. This can be achieved by implementing the ``MongoDB\Codec\Codec``
interface.

The following example defines a codec to store ``DateTimeInterface`` instances as BSON dates including the timezone:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The following example defines a codec to store ``DateTimeInterface`` instances as BSON dates including the timezone:
The following example defines a codec that stores ``DateTimeInterface`` instances as an embedded document containing a BSON date and accompanying timezone string. Those same embedded documents can then be translated back into a ``DateTimeInterface`` during BSON decoding.

This drops the colon but should explain the entire example you're about to present. I think it's worth describing the encode and decode steps in prose here, as the remainder of the section is pretty light on words.

docs/tutorial/custom-types.txt Outdated Show resolved Hide resolved
@alcaeus alcaeus force-pushed the phplib-1209-codec-tutorial branch from 7123d47 to e1f63ed Compare September 14, 2023 06:44
@alcaeus alcaeus requested a review from jmikola September 14, 2023 07:05
@alcaeus alcaeus merged commit 9fd7457 into mongodb:master Sep 15, 2023
@alcaeus alcaeus deleted the phplib-1209-codec-tutorial branch September 15, 2023 08:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants