Alt SPIKE: New objectid #596

hacksparrow · 2020-09-03T13:11:38Z

No description provided.

lib/mongodb.js

bajtos

You are making good progress 👏🏻

I have few more comments to discuss, PTAL 👇🏻

SPIKE.md

bajtos · 2020-09-07T09:34:24Z

SPIKE.md

+All changes are limited to `loopback-connetor-mongodb` only, Juggler is not affected.
+
+- Before querying database:
+  - Convert `id` value to `ObjectID`.


Is it possible to use an id value that's not an ObjectID, e.g. a number? I vaguely remember that in such case, LoopBack 3 may be using a new property _id to hold the actual database id (e.g. {_id: objectIdGeneratedByMongo, id: 123}).

Can you please verify?

This item caught my attention because I find it a bit odd that we have to treat id (primary key) values differently from other properties, considering that both primary keys and other properties must be defined the same way (mongodb: {dataType: 'ObjectID'}).

Also if we have special treatment for primary keys, then shouldn't we also apply the same rules to foreign keys?

Have you considered a simpler design, where we apply the same rules for all properties, regardless of whether they are a primary key or not? Is there any edge case that not work with that solution?

The special treatment for primary key is the existing behavior, let me update the implementation to apply a common rule for all.

Is it possible to use an id value that's not an ObjectID, e.g. a number?

Yes, but using non-ObjectID _id it comes with unexpected behaviors, so not worth using it.

Is it possible to use an id value that's not an ObjectID, e.g. a number?

Yes, but using non-ObjectID _id it comes with unexpected behaviors, so not worth using it.

I am not referring to non-ObjectID _id field. I mean a model that's using a number as the primary key on the LoopBack side.

Here is some code you can add to lib/objectid.test.js to better understand what's going on.

context.only('numeric id', function() { const User = ds.createModel( 'UserNum', { id: {type: 'number', id: true, generated: false}, email: {type: 'string'}, }, {strictObjectIDCoercion: true}, ); beforeEach(() => User.deleteAll()); it('is supported and uses _id: ObjectID in background', async () => { console.log('id definition', User.definition.properties.id); const u = await User.create({id: 1, email: '[email protected]'}); console.log(u); console.log(u.toJSON()); const found = await User.findById(u.id); console.log('found', found); }); });

Console output:

id definition { type: [Function: Number], id: true, generated: false, updateOnly: false } { id: 1, email: '[email protected]' } { id: 1, email: '[email protected]' } found { email: '[email protected]', id: 1 }

Interestingly enough, when I change id definition and remove generated: false, and don't provide any id value when creating a new user, then the connector somehow starts storing ObjectID values in the numeric id property.

id definition { type: [Function: Number], id: true, updateOnly: false } { email: '[email protected]', id: 5f5b71894004d51b2a65de00 } { id: 5f5b71894004d51b2a65de00, email: '[email protected]' } found { email: '[email protected]', id: NaN }

This looks like a bug to me, notice that the model instance returned by findById has id: NaN.

@bajtos

Is it possible to use an id value that's not an ObjectID, e.g. a number? I vaguely remember that in such case, LoopBack 3 may be using a new property _id to hold the actual database id (e.g. {_id: objectIdGeneratedByMongo, id: 123}).

Yes, it possible to use an id value that's not an ObjectID, but the custom id field is not saved in the database, it is logically translated in the model from _id to whatever name was specified.

when I change id definition and remove generated: false, and don't provide any id value when creating a new user, then the connector somehow starts storing ObjectID values in the numeric id property.

This happens on the master branch too. We should throw an error saying id is required, like how specifying the id for generated: 1 throws an error.

Need some clarifications:

If the model is defined as {id: {id: true, type: 'string', generated: true }}, should it be interpreted as {id: {id: true, type: 'string', mongodb: {dataType: 'objectID'}}}? After all, the autogenerated id will be an ObjectID.

How should {id: {id: true, type: 'number', generated: true }} be treated? It is essentially saying {id: {id: true, type: 'number', mongodb: {dataType: 'objectID'}}}.

I have created a branch - https://github.com/strongloop/loopback-connector-mongodb/tree/non-objectid-id - for quickly testing the current behavior on master. File of interest: https://github.com/strongloop/loopback-connector-mongodb/blob/non-objectid-id/test/new-objectid.test.js.

Yes, it possible to use an id value that's not an ObjectID, but the custom id field is not saved in the database, it is logically translated in the model from _id to whatever name was specified.

Interesting, I thought the model would have two properties - user-defined id and database-generated _id. Thank you for taking a look and finding out what's the actual behavior.

If the model is defined as {id: {id: true, type: 'string', generated: true }}, should it be interpreted as {id: {id: true, type: 'string', mongodb: {dataType: 'objectID'}}}? After all, the autogenerated id will be an ObjectID.

One option is to silently (auto-magically) assume mongodb: {dataType: 'objectID'}. A better solution may be to throw a descriptive error and asks users to specify mongodb: {dataType: 'objectID'} in the property definition.

How should {id: {id: true, type: 'number', generated: true }} be treated? It is essentially saying {id: {id: true, type: 'number', mongodb: {dataType: 'objectID'}}}.

IMO, {type: 'number', mongodb: {dataType: 'objectID'}} is not a valid property definition, similarly for {type: 'date', mongodb: {dataType: 'objectID'}}, etc.

I am proposing to throw an error when such property definition is detected.

Connectors can provide define, defineProperty and defineForeignKey functions that are called by juggler's DataSource object to notify the connector about new models/properties/foreign-keys defined by the application. I think it would be best to implement these methods in MongoDB connector to validate property definitions at setup (define) time, instead of deferring the validation until the first DAO method is called.

This looks like a bug to me, notice that the model instance returned by findById has id: NaN.

Can we address this in a separate issue since the fix lies in Juggler? We will be able to make faster progress by not including changes in Juggler. From my initial look, it seems to be caused by the difference in behavior of the various DAO methods as noted here.

Sure, that's fine.

However, I think my proposal to reject property definitions like {type: 'number', mongodb: {dataType: 'objectID'}} would solve the issue and it is something we can implement inside the MongoDB connector.

I think it would be best to implement these methods in MongoDB connector to validate property definitions at setup (define) time, instead of deferring the validation until the first DAO method is called.

Let's do it in a follow up story. Let's focus specifically on ObjectID-to-string conversion and vice versa in this spike. If we include validation in the connector, it will further drag on the spike.

I feel it's important to get the developer experience right. If a user accidentally mis-configures their models, the framework should be helpful and tell them what they did wrong and how to fix it.

I am ok to work on this in a follow-up story, please make sure to include it the spike proposal. Should we start with a spike to better understand the complexities involved?

If a user accidentally mis-configures their models, the framework should be helpful and tell them what they did wrong and how to fix it.

100% agreed, and best scoped in a separate story.

Should we start with a spike to better understand the complexities involved?

I think so, validations are not going to be simple case of if-else when we have inferred and nested properties.

bajtos · 2020-09-07T09:43:05Z

lib/mongodb.js

@@ -401,6 +403,12 @@ MongoDB.prototype.fromDatabase = function(modelName, data) {
      if (data[p] instanceof mongodb.Binary) {
        // Convert the Binary into String
        data[p] = data[p].toString();
+      // SPIKE: properties that should be converted to string from ObjectID
+      // SPIKE: do the same for toDatabase
+      } else if (p === 'id' || prop.mongodb && prop.mongodb.dataType.match(/objectid/i)) {


Please keep in mind that primary keys are not required to be called id. If you really need to identify primary keys here (which I think you should not need to), then you need to check if prop.id is set (I think). I also vaguely remember that prop.id can be a number to describe multi-column index (e.g. on GitHub, userId would have id: 1 and repoId would have id: 2 to define (userId, repoId) as the primary key). I am not sure if those numeric values are indexed from 0 or 1 though.

bajtos · 2020-09-07T09:45:55Z

lib/mongodb.js

@@ -448,6 +458,16 @@ MongoDB.prototype.toDatabase = function(modelName, data) {
        coordinates: [data[p].lng, data[p].lat],
        type: 'Point',
      };
+    // SPIKE: Ideally coercePropertyValue should handle this but it does not have enough info
+    } else if (prop && prop.type === String) {


Please try find a way how to use the existing code for converting string to Decimal128.

bajtos · 2020-09-07T09:46:20Z

lib/mongodb.js

+      } else if (p === 'id' || prop.mongodb && prop.mongodb.dataType.match(/objectid/i)) {
+        if (data[p]) {
+          data[p] = data[p].toString();
+        }


Please try find a way how to use the existing code for converting string to Decimal128.

bajtos · 2020-09-07T09:48:14Z

lib/mongodb.js

@@ -1508,7 +1531,8 @@ MongoDB.prototype.count = function count(modelName, where, options, callback) {
 */
 MongoDB.prototype.replaceById = function replace(modelName, id, data, options, cb) {
  if (this.debug) debug('replace', modelName, id, data);
-  const oid = this.coerceId(modelName, id, options);
+  // SPIKE: every id should be converted to ObjectID before query
+  const oid = ObjectID(id);


See my comments above. IMO, the conversion to ObjectID should be applied only if the id property is defined with data type ObjectID.

It would be great if we could re-use existing bits from the infrastructure for coercing arbitrary model properties.

bajtos · 2020-09-07T09:49:29Z

test/new-objectid.test.js

+      name: String
+    }
+  );
+


Please add few tests to verify how nested properties are handled.

See the tests we added for Decimal128 in fb41e40 and 25122e0 for inspiration.

bajtos · 2020-09-11T12:49:10Z

SPIKE.md

+
+Specifically for `ObjectID`:
+- Properties that should be stored as `ObjectID` in MongoDB must be defined as `{type: 'string', mongodb: {dataType: 'ObjectID'}}`.
+- There will be no auto-magic coercion from `string` to `Object` for other properties (those defined without `dataType: 'ObjectID'`).


Suggested change

- There will be no auto-magic coercion from `string` to `Object` for other properties (those defined without `dataType: 'ObjectID'`).

- There will be no auto-magic coercion from `string` to `ObjectID` for other properties (those defined without `dataType: 'ObjectID'`).

jannyHou

The proposal and PoC looks great!

jannyHou · 2020-09-21T04:12:22Z

test/new-objectid.test.js

+  const Author = ds.createModel(
+    'Author',
+    {
+      id: {id: true, type: String, mongodb: {dataType: 'objectID'}},


nitpick: according to "if specified with {id: true}, need not add mongodb: {dataType: 'ObjectID'} (it will be automatically inferred)" the mongodb config here can be omitted.

jannyHou · 2020-09-21T04:14:56Z

test/new-objectid.test.js

+  });
+
+
+  // SPIKE: in these tests there is no way of knowing if objectID-like strings are actually stored


For the spike: we test the raw data from database for decimal properties like this, hope it helps.

Thanks Janny, added.

bajtos

Looks like we have a decent understanding of where we want to get, what ObjectID-related behavior we want to have 👍

Now we need to build a better understanding of how to get there. What is the first step (the first task/user story)? What is the next one? Which tasks must be implemented one after other because of dependencies, and which can be worked on in parallel, potentially by more than just one person?

Ideally, the proposed plan should have enough details that the creation of follow-up/implementation GitHub issues will become mostly copy & paste, and that anybody else on the team or even from our community can pick up any task and contribute the changes, without having to fully understand the entire spike document.

hacksparrow · 2020-09-21T13:20:17Z

@bajtos I will update the spike report with the follow up tasks.

hacksparrow · 2020-09-21T14:16:08Z

@bajtos added.

Follow-up Tasks

ObjectID string interface

SPIKE: Connector level model validation

Primary key property not validated in .create()

Refactor DAO methods

3 and 4 are somewhat related but not a requirement, they were discovered and created out of the first spike.

bajtos · 2020-09-25T06:30:54Z

@hacksparrow thank you for opening follow-up tasks, they look mostly good 👍🏻 I remember that we were discussing that in order to support the desired behavior, where PKs are always represented as strings at juggler level, we need to remove getDefaultIdType to prevent juggler from changing model definition and always setting PK type of ObjectID.

https://github.com/strongloop/loopback-connector-mongodb/blob/4551cc0428a6954091981eb8e4861e7c1c9b37d8/lib/mongodb.js#L351-L353

Is this still relevant? Which story is covering this part? Should we create a new one, since this change seems relatively isolated? I think this change will require us to update many unit tests, so it may be best to plan it independently, to keep the size of #598 smaller.

hacksparrow · 2020-09-25T07:45:07Z

@bajtos sure, let me create a separate story for removing getDefaultIdType(). Thanks for the suggestion!

hacksparrow · 2020-09-25T10:20:42Z

Created #600.

stale · 2020-12-25T13:13:41Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale · 2021-07-14T00:00:55Z

This issue has been closed due to continued inactivity. Thank you for your understanding. If you believe this to be in error, please contact one of the code owners, listed in the CODEOWNERS file at the top-level of this repository.

hacksparrow requested review from agnes512, dhmlau, emonddr and jannyHou as code owners September 3, 2020 13:11

This was referenced Sep 3, 2020

SPIKE: New objectid #591

Closed

Spike: robust handling of ObjectID type for MongoDB loopbackio/loopback-next#3456

Closed

alt

7d2f503

hacksparrow force-pushed the alt-objectid branch from 6732fe7 to 7d2f503 Compare September 3, 2020 13:19

bajtos reviewed Sep 4, 2020

View reviewed changes

lib/mongodb.js Show resolved Hide resolved

bajtos suggested changes Sep 7, 2020

View reviewed changes

feedback

8c6148c

bajtos reviewed Sep 11, 2020

View reviewed changes

Yaapa Hage added 3 commits September 14, 2020 19:50

update implementation

86e8411

non-objectid id

5364b80

tests

06a1805

jannyHou approved these changes Sep 21, 2020

View reviewed changes

Yaapa Hage added 2 commits September 21, 2020 14:07

feedback

199b1c2

feedback from Janny

d5128e6

bajtos suggested changes Sep 21, 2020

View reviewed changes

hacksparrow force-pushed the alt-objectid branch from 8f1dad7 to 8db6d8f Compare September 21, 2020 14:11

followups

d500f0a

hacksparrow force-pushed the alt-objectid branch from 8db6d8f to d500f0a Compare September 21, 2020 14:12

hacksparrow requested a review from bajtos September 22, 2020 10:28

Remove MongoDB.prototype.getDefaultIdType

799396a

hacksparrow force-pushed the alt-objectid branch from 7ccaae4 to 799396a Compare September 25, 2020 08:00

stale bot added the stale label Dec 25, 2020

stale bot closed this Jul 14, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Alt SPIKE: New objectid #596

Alt SPIKE: New objectid #596

hacksparrow commented Sep 3, 2020

bajtos left a comment

bajtos Sep 7, 2020

hacksparrow Sep 10, 2020

hacksparrow Sep 10, 2020

bajtos Sep 11, 2020 •

edited

Loading

hacksparrow Sep 15, 2020

bajtos Sep 15, 2020

bajtos Sep 15, 2020

hacksparrow Sep 18, 2020

bajtos Sep 21, 2020

hacksparrow Sep 21, 2020

bajtos Sep 7, 2020

bajtos Sep 7, 2020

bajtos Sep 7, 2020

bajtos Sep 7, 2020

bajtos Sep 7, 2020

bajtos Sep 11, 2020

jannyHou left a comment

jannyHou Sep 21, 2020

jannyHou Sep 21, 2020

hacksparrow Sep 21, 2020

bajtos left a comment

hacksparrow commented Sep 21, 2020

hacksparrow commented Sep 21, 2020

Follow-up Tasks

bajtos commented Sep 25, 2020

hacksparrow commented Sep 25, 2020

hacksparrow commented Sep 25, 2020

stale bot commented Dec 25, 2020

stale bot commented Jul 14, 2021

	- There will be no auto-magic coercion from `string` to `Object` for other properties (those defined without `dataType: 'ObjectID'`).
	- There will be no auto-magic coercion from `string` to `ObjectID` for other properties (those defined without `dataType: 'ObjectID'`).

		});


		// SPIKE: in these tests there is no way of knowing if objectID-like strings are actually stored

Alt SPIKE: New objectid #596

Alt SPIKE: New objectid #596

Conversation

hacksparrow commented Sep 3, 2020

bajtos left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bajtos Sep 11, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jannyHou left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bajtos left a comment

Choose a reason for hiding this comment

hacksparrow commented Sep 21, 2020

hacksparrow commented Sep 21, 2020

Follow-up Tasks

bajtos commented Sep 25, 2020

hacksparrow commented Sep 25, 2020

hacksparrow commented Sep 25, 2020

stale bot commented Dec 25, 2020

stale bot commented Jul 14, 2021

bajtos Sep 11, 2020 •

edited

Loading