Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PHPLIB-1249 Init code generator project for aggregation builder #1174

Merged
merged 31 commits into from
Oct 4, 2023

Conversation

GromNaN
Copy link
Member

@GromNaN GromNaN commented Sep 25, 2023

Fix PHPLIB-1249

Reimplement aggregation builder prototype.

The branch feature/agg-builder will be the target of all the PRs of the aggregation builder feature.

Use nette/php-generator to generate operator code.
Inspired by AsyncAWS Code Generator.

[
'configFile' => __DIR__ . '/stages.yaml',
'generatorClass' => ValueClassGenerator::class,
'namespace' => MongoDB\Aggregation\Stage::class,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About Aggregation Builder namespacing. I'd like to put all the classes in the same MongoDB\Operator namespace. They are grouped under this name this in the documentation: https://www.mongodb.com/docs/manual/reference/operator/

MongoDB\Operator\Query
MongoDB\Operator\Project
MongoDB\Operator\Update
MongoDB\Operator\Stage
MongoDB\Operator\Pipeline

(update operator example is for future, not implemented in this project).

Each of this would be classes with static factory methods for value-object classes in their sub-namespace. So we will have:

MongoDB\Operator\Query\AndQuery
MongoDB\Operator\Query\EqQuery
MongoDB\Operator\Query\ElemMatchQuery
...
MongoDB\Operator\Project\ElemMatchProjection
...
MongoDB\Operator\Update\SetUpdate
MongoDB\Operator\Update\PopUpdate
...
MongoDB\Operator\Stage\BucketStage
MongoDB\Operator\Stage\ProjectStage
MongoDB\Operator\Stage\SetStage
...
MongoDB\Operator\Aggregation\AbsAggregation
MongoDB\Operator\Aggregation\BinarySizeAggregation
MongoDB\Operator\Aggregation\EqAggregation
MongoDB\Operator\Aggregation\BinarySizeAggregation
MongoDB\Operator\Aggregation\FirstNAggregation
MongoDB\Operator\Aggregation\MaxNAggregation

@jmikola what do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we already have the MongoDB\Operation namespace and there could be confusion between MongoDB\Operation\Aggregate and MongoDB\Operator\Aggregation.

We can choose MongoDB\Builder as namespace. Like "query builder", "aggregation builder", "update builder" ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the intention of repeating the namespace component as a class name suffix? Is that to avoid requiring use as if multiple operators are included in the same file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid name restrictions (And and Match are not allowed as class names) and avoid use as when needing to use operators with the same name in the same file (Query\Eq and Aggregation\Eq).

But this classes will not be used outside of this library or ORMs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have alternative suggestions for the main namespace MongoDB\Builder\ and MongoDB\Builder\Aggregation for aggregation pipeline operators, I'm all ears.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the current namespaces are fine, especially if users are unlikely to interface with these directly (assuming they use factory methods). We can always revisit later if needed.

generator/config/config.php Outdated Show resolved Hide resolved
generator/config/query-operators.yaml Show resolved Hide resolved
generator/README.md Outdated Show resolved Hide resolved
generator/src/AbstractGenerator.php Outdated Show resolved Hide resolved
generator/src/AbstractGenerator.php Outdated Show resolved Hide resolved
@GromNaN GromNaN marked this pull request as ready for review September 28, 2023 08:23
@GromNaN GromNaN requested review from jmikola and alcaeus September 28, 2023 08:23
public function testPipeline(): void
{
$pipeline = new Pipeline(
Stage::match(Aggregation::eq('$foo', 1)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the $match stage shouldn't accept the aggregation pipeline operators, except within $expr.

tests/Builder/BuiderCodecTest.php Outdated Show resolved Hide resolved
Comment on lines 73 to 80
Stage::project([
'items' => Aggregation::filter(
input: Expression::fieldPath('items'),
cond: Aggregation::gte(Expression::variable('item.price'), 100),
as: 'item',
limit: $limit,
),
]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider the named argument form:

Suggested change
Stage::project([
'items' => Aggregation::filter(
input: Expression::fieldPath('items'),
cond: Aggregation::gte(Expression::variable('item.price'), 100),
as: 'item',
limit: $limit,
),
]),
Stage::project(
items: Aggregation::filter(
input: Expression::fieldPath('items'),
cond: Aggregation::gte(Expression::variable('item.price'), 100),
as: 'item',
limit: $limit,
),
),

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tracked in PHPLIB-1267

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made the change in this branch.

Comment on lines 165 to 167
if ($val !== null) {
$result->{$key} = $val;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assumption that fields with a null value can be skipped may not always be true, just to keep in mind.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need a special "not specified" object to mark unspecified values...

}

// A pipeline is encoded as a list of stages
if ($value instanceof Pipeline) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check and all subsequent checks should be handled by their own encoder; we can then leverage an encoder map to encode values.

Copy link
Member Author

@GromNaN GromNaN Sep 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a draft implementation to get an overview of how operators and stages are dumped. I'll work on specificities on this tickets:

Tracked by PHPLIB-1257

src/Builder/Pipeline.php Outdated Show resolved Hide resolved
Exclude generated files from phpcs

Add Interface suffix to interfaces to prevent confusion with factories

Update codec to be encoder only
@GromNaN GromNaN changed the base branch from agg-builder to feature/agg-builder September 28, 2023 20:06
'implements' => [ExpressionInterface::class],
'types' => ['mixed'],
],
// @todo check for use-case
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tracked by PHPLIB-1251

'implements' => [ExpressionInterface::class],
'types' => ['string'],
],
// @todo if replaced by a string, it must start with $$
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tracked by PHPLIB-1265

@@ -1,5 +1,6 @@
<?xml version="1.0"?>
<psalm
phpVersion="8.0"
Copy link
Member Author

@GromNaN GromNaN Sep 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be configured per directory.
Tracked by PHPLIB-1268

@@ -59,6 +82,11 @@
<code><![CDATA[$mergedDriver['platform']]]></code>
</MixedAssignment>
</file>
<file src="src/Codec/EncodeIfSupported.php">
<InvalidReturnType>
<code>($value is NativeType ? BSONType : $value)</code>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alcaeus error details:

ERROR: InvalidReturnType
at src/Codec/EncodeIfSupported.php:46:22
The declared return type '(BSONType:MongoDB\Codec\EncodeIfSupported as mixed)|(TGeneratedFromParam0:fn-mongodb\codec\encodeifsupported::encodeifsupported as mixed)' for MongoDB\Codec\EncodeIfSupported::encodeIfSupported is incorrect, got '(TGeneratedFromParam0:fn-mongodb\codec\encodeifsupported::encodeifsupported as mixed)|array<array-key, mixed>|stdClass|string' (see https://psalm.dev/011)
     * @psalm-return ($value is NativeType ? BSONType : $value)

I'm having trouble using the template annotations you've defined.

generator/README.md Outdated Show resolved Hide resolved
generator/composer.json Outdated Show resolved Hide resolved
generator/composer.json Show resolved Hide resolved
generator/config/expressions.php Show resolved Hide resolved
generator/config/operators.php Show resolved Hide resolved
src/Builder/BuilderEncoder.php Show resolved Hide resolved
src/Builder/BuilderEncoder.php Show resolved Hide resolved
src/Builder/BuilderEncoder.php Outdated Show resolved Hide resolved
src/Builder/Expression.php Show resolved Hide resolved
{
$pipeline = new Pipeline(
// @todo array is accepted by the stage class, but we expect an object. The driver accepts both.
Stage::match((object) ['author' => 'dave']),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the object-cast necessary here? The match() factory method and Match constructor only say they accept mixed $query, so I don't see anything that enforces the argument type.

Does that mean the implementation for Match is incomplete, or is that intentional? I'm curious what would happen if you passed a scalar value here. Would it just encode to an invalid stage like { "$match": 1 }?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a perverse effect to convert all associative arrays into object with objectify to improve readability of expected result. The value could be an associative array here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying that the (object) cast was just done to make the test assertion more readable (presumably in the event it fails)? I wasn't thinking of objectify() when I wrote the above comment and was more just generally curious about how Stage::match() reacts to an invalid value being passed in since it only uses mixed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed this was a temporary issue, and the type system will be improved down the line.

Comment on lines 24 to 26
'int' => ['scalar' => true, 'types' => ['int', BSON\Int64::class]],
'number' => ['scalar' => true, 'types' => ['int', BSON\Int64::class]],
'decimal' => ['scalar' => true, 'types' => ['float', BSON\Decimal128::class]],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of things here:

  • double is missing from the list; it would correspond to PHP's float type
  • number can be any of the numeric types (i.e. int, Int64, float, Decimal128)
  • An int is a valid value to be given whenever a decimal type is accepted.

Given that, I think this should look as follows:

Suggested change
'int' => ['scalar' => true, 'types' => ['int', BSON\Int64::class]],
'number' => ['scalar' => true, 'types' => ['int', BSON\Int64::class]],
'decimal' => ['scalar' => true, 'types' => ['float', BSON\Decimal128::class]],
'int' => ['scalar' => true, 'types' => ['int', BSON\Int64::class]],
'double' => ['scalar' => true, 'types' => ['int', BSON\Int64::class, 'float']],
'decimal' => ['scalar' => true, 'types' => ['int', BSON\Int64::class, 'float', BSON\Decimal128::class]],
'number' => ['scalar' => true, 'types' => ['int', BSON\Int64::class, 'float', BSON\Decimal128::class]],

I'll note that a 64-bit integer value may be truncated when converted to a double, but I believe that shouldn't be an issue in this case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

type: expression
isVariadic: true
variadicMin: 1
- name: eq
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not an issue here, but we should consider splitting this file into multiple smaller files in the future.

generator/config/stages.yaml Outdated Show resolved Hide resolved
@GromNaN GromNaN merged commit 60908f0 into mongodb:feature/agg-builder Oct 4, 2023
4 of 12 checks passed
@GromNaN GromNaN deleted the PHPLIB-1249 branch October 4, 2023 10:44
@GromNaN
Copy link
Member Author

GromNaN commented Oct 4, 2023

Tests failing due to PHP version incompatibility. This will be fixed by PHPLIB-1268

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants