Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alt SPIKE: New objectid #596

Closed
wants to merge 9 commits into from
Closed

Alt SPIKE: New objectid #596

wants to merge 9 commits into from

Conversation

hacksparrow
Copy link
Contributor

No description provided.

lib/mongodb.js Show resolved Hide resolved
Copy link
Member

@bajtos bajtos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are making good progress 👏🏻

I have few more comments to discuss, PTAL 👇🏻

SPIKE.md Show resolved Hide resolved
SPIKE.md Show resolved Hide resolved
SPIKE.md Outdated Show resolved Hide resolved
SPIKE.md Outdated Show resolved Hide resolved
SPIKE.md Outdated
All changes are limited to `loopback-connetor-mongodb` only, Juggler is not affected.

- Before querying database:
- Convert `id` value to `ObjectID`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use an id value that's not an ObjectID, e.g. a number? I vaguely remember that in such case, LoopBack 3 may be using a new property _id to hold the actual database id (e.g. {_id: objectIdGeneratedByMongo, id: 123}).

Can you please verify?

This item caught my attention because I find it a bit odd that we have to treat id (primary key) values differently from other properties, considering that both primary keys and other properties must be defined the same way (mongodb: {dataType: 'ObjectID'}).

Also if we have special treatment for primary keys, then shouldn't we also apply the same rules to foreign keys?

Have you considered a simpler design, where we apply the same rules for all properties, regardless of whether they are a primary key or not? Is there any edge case that not work with that solution?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The special treatment for primary key is the existing behavior, let me update the implementation to apply a common rule for all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use an id value that's not an ObjectID, e.g. a number?

Yes, but using non-ObjectID _id it comes with unexpected behaviors, so not worth using it.

Copy link
Member

@bajtos bajtos Sep 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use an id value that's not an ObjectID, e.g. a number?

Yes, but using non-ObjectID _id it comes with unexpected behaviors, so not worth using it.

I am not referring to non-ObjectID _id field. I mean a model that's using a number as the primary key on the LoopBack side.

Here is some code you can add to lib/objectid.test.js to better understand what's going on.

  context.only('numeric id', function() {
    const User = ds.createModel(
      'UserNum',
      {
        id: {type: 'number', id: true, generated: false},
        email: {type: 'string'},
      },
      {strictObjectIDCoercion: true},
    );

    beforeEach(() => User.deleteAll());
 
    it('is supported and uses _id: ObjectID in background', async () => {
      console.log('id definition', User.definition.properties.id);

      const u = await User.create({id: 1, email: '[email protected]'});
      console.log(u);
      console.log(u.toJSON());

      const found = await User.findById(u.id);
      console.log('found', found);
    });
  });

Console output:

id definition {
  type: [Function: Number],
  id: true,
  generated: false,
  updateOnly: false
}
{ id: 1, email: '[email protected]' }
{ id: 1, email: '[email protected]' }
found { email: '[email protected]', id: 1 }

Interestingly enough, when I change id definition and remove generated: false, and don't provide any id value when creating a new user, then the connector somehow starts storing ObjectID values in the numeric id property.

id definition { type: [Function: Number], id: true, updateOnly: false }
{ email: '[email protected]', id: 5f5b71894004d51b2a65de00 }
{ id: 5f5b71894004d51b2a65de00, email: '[email protected]' }
found { email: '[email protected]', id: NaN }

This looks like a bug to me, notice that the model instance returned by findById has id: NaN.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bajtos

Is it possible to use an id value that's not an ObjectID, e.g. a number? I vaguely remember that in such case, LoopBack 3 may be using a new property _id to hold the actual database id (e.g. {_id: objectIdGeneratedByMongo, id: 123}).

Yes, it possible to use an id value that's not an ObjectID, but the custom id field is not saved in the database, it is logically translated in the model from _id to whatever name was specified.

when I change id definition and remove generated: false, and don't provide any id value when creating a new user, then the connector somehow starts storing ObjectID values in the numeric id property.

This happens on the master branch too. We should throw an error saying id is required, like how specifying the id for generated: 1 throws an error.

Need some clarifications:

  • If the model is defined as {id: {id: true, type: 'string', generated: true }}, should it be interpreted as {id: {id: true, type: 'string', mongodb: {dataType: 'objectID'}}}? After all, the autogenerated id will be an ObjectID.
  • How should {id: {id: true, type: 'number', generated: true }} be treated? It is essentially saying {id: {id: true, type: 'number', mongodb: {dataType: 'objectID'}}}.

I have created a branch - https://github.com/strongloop/loopback-connector-mongodb/tree/non-objectid-id - for quickly testing the current behavior on master. File of interest: https://github.com/strongloop/loopback-connector-mongodb/blob/non-objectid-id/test/new-objectid.test.js.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it possible to use an id value that's not an ObjectID, but the custom id field is not saved in the database, it is logically translated in the model from _id to whatever name was specified.

Interesting, I thought the model would have two properties - user-defined id and database-generated _id. Thank you for taking a look and finding out what's the actual behavior.

If the model is defined as {id: {id: true, type: 'string', generated: true }}, should it be interpreted as {id: {id: true, type: 'string', mongodb: {dataType: 'objectID'}}}? After all, the autogenerated id will be an ObjectID.

One option is to silently (auto-magically) assume mongodb: {dataType: 'objectID'}. A better solution may be to throw a descriptive error and asks users to specify mongodb: {dataType: 'objectID'} in the property definition.

  • How should {id: {id: true, type: 'number', generated: true }} be treated? It is essentially saying {id: {id: true, type: 'number', mongodb: {dataType: 'objectID'}}}.

IMO, {type: 'number', mongodb: {dataType: 'objectID'}} is not a valid property definition, similarly for {type: 'date', mongodb: {dataType: 'objectID'}}, etc.

I am proposing to throw an error when such property definition is detected.

Connectors can provide define, defineProperty and defineForeignKey functions that are called by juggler's DataSource object to notify the connector about new models/properties/foreign-keys defined by the application. I think it would be best to implement these methods in MongoDB connector to validate property definitions at setup (define) time, instead of deferring the validation until the first DAO method is called.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a bug to me, notice that the model instance returned by findById has id: NaN.

Can we address this in a separate issue since the fix lies in Juggler? We will be able to make faster progress by not including changes in Juggler. From my initial look, it seems to be caused by the difference in behavior of the various DAO methods as noted here.

Sure, that's fine.

However, I think my proposal to reject property definitions like {type: 'number', mongodb: {dataType: 'objectID'}} would solve the issue and it is something we can implement inside the MongoDB connector.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be best to implement these methods in MongoDB connector to validate property definitions at setup (define) time, instead of deferring the validation until the first DAO method is called.

Let's do it in a follow up story. Let's focus specifically on ObjectID-to-string conversion and vice versa in this spike. If we include validation in the connector, it will further drag on the spike.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it's important to get the developer experience right. If a user accidentally mis-configures their models, the framework should be helpful and tell them what they did wrong and how to fix it.

I am ok to work on this in a follow-up story, please make sure to include it the spike proposal. Should we start with a spike to better understand the complexities involved?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a user accidentally mis-configures their models, the framework should be helpful and tell them what they did wrong and how to fix it.

100% agreed, and best scoped in a separate story.

Should we start with a spike to better understand the complexities involved?

I think so, validations are not going to be simple case of if-else when we have inferred and nested properties.

lib/mongodb.js Outdated
@@ -401,6 +403,12 @@ MongoDB.prototype.fromDatabase = function(modelName, data) {
if (data[p] instanceof mongodb.Binary) {
// Convert the Binary into String
data[p] = data[p].toString();
// SPIKE: properties that should be converted to string from ObjectID
// SPIKE: do the same for toDatabase
} else if (p === 'id' || prop.mongodb && prop.mongodb.dataType.match(/objectid/i)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep in mind that primary keys are not required to be called id. If you really need to identify primary keys here (which I think you should not need to), then you need to check if prop.id is set (I think). I also vaguely remember that prop.id can be a number to describe multi-column index (e.g. on GitHub, userId would have id: 1 and repoId would have id: 2 to define (userId, repoId) as the primary key). I am not sure if those numeric values are indexed from 0 or 1 though.

lib/mongodb.js Outdated
@@ -448,6 +458,16 @@ MongoDB.prototype.toDatabase = function(modelName, data) {
coordinates: [data[p].lng, data[p].lat],
type: 'Point',
};
// SPIKE: Ideally coercePropertyValue should handle this but it does not have enough info
} else if (prop && prop.type === String) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please try find a way how to use the existing code for converting string to Decimal128.

lib/mongodb.js Outdated
} else if (p === 'id' || prop.mongodb && prop.mongodb.dataType.match(/objectid/i)) {
if (data[p]) {
data[p] = data[p].toString();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please try find a way how to use the existing code for converting string to Decimal128.

lib/mongodb.js Outdated
@@ -1508,7 +1531,8 @@ MongoDB.prototype.count = function count(modelName, where, options, callback) {
*/
MongoDB.prototype.replaceById = function replace(modelName, id, data, options, cb) {
if (this.debug) debug('replace', modelName, id, data);
const oid = this.coerceId(modelName, id, options);
// SPIKE: every id should be converted to ObjectID before query
const oid = ObjectID(id);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comments above. IMO, the conversion to ObjectID should be applied only if the id property is defined with data type ObjectID.

It would be great if we could re-use existing bits from the infrastructure for coercing arbitrary model properties.

name: String
}
);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add few tests to verify how nested properties are handled.

See the tests we added for Decimal128 in fb41e40 and 25122e0 for inspiration.

SPIKE.md Outdated

Specifically for `ObjectID`:
- Properties that should be stored as `ObjectID` in MongoDB must be defined as `{type: 'string', mongodb: {dataType: 'ObjectID'}}`.
- There will be no auto-magic coercion from `string` to `Object` for other properties (those defined without `dataType: 'ObjectID'`).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- There will be no auto-magic coercion from `string` to `Object` for other properties (those defined without `dataType: 'ObjectID'`).
- There will be no auto-magic coercion from `string` to `ObjectID` for other properties (those defined without `dataType: 'ObjectID'`).

Copy link
Contributor

@jannyHou jannyHou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposal and PoC looks great!

const Author = ds.createModel(
'Author',
{
id: {id: true, type: String, mongodb: {dataType: 'objectID'}},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: according to "if specified with {id: true}, need not add mongodb: {dataType: 'ObjectID'} (it will be automatically inferred)" the mongodb config here can be omitted.

});


// SPIKE: in these tests there is no way of knowing if objectID-like strings are actually stored
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the spike: we test the raw data from database for decimal properties like this, hope it helps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Janny, added.

Copy link
Member

@bajtos bajtos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we have a decent understanding of where we want to get, what ObjectID-related behavior we want to have 👍

Now we need to build a better understanding of how to get there. What is the first step (the first task/user story)? What is the next one? Which tasks must be implemented one after other because of dependencies, and which can be worked on in parallel, potentially by more than just one person?

Ideally, the proposed plan should have enough details that the creation of follow-up/implementation GitHub issues will become mostly copy & paste, and that anybody else on the team or even from our community can pick up any task and contribute the changes, without having to fully understand the entire spike document.

@hacksparrow
Copy link
Contributor Author

@bajtos I will update the spike report with the follow up tasks.

@hacksparrow
Copy link
Contributor Author

@bajtos added.

Follow-up Tasks

  1. ObjectID string interface
  2. SPIKE: Connector level model validation
  3. Primary key property not validated in .create()
  4. Refactor DAO methods

3 and 4 are somewhat related but not a requirement, they were discovered and created out of the first spike.

@hacksparrow hacksparrow requested a review from bajtos September 22, 2020 10:28
@bajtos
Copy link
Member

bajtos commented Sep 25, 2020

@hacksparrow thank you for opening follow-up tasks, they look mostly good 👍🏻 I remember that we were discussing that in order to support the desired behavior, where PKs are always represented as strings at juggler level, we need to remove getDefaultIdType to prevent juggler from changing model definition and always setting PK type of ObjectID.

https://github.com/strongloop/loopback-connector-mongodb/blob/4551cc0428a6954091981eb8e4861e7c1c9b37d8/lib/mongodb.js#L351-L353

Is this still relevant? Which story is covering this part? Should we create a new one, since this change seems relatively isolated? I think this change will require us to update many unit tests, so it may be best to plan it independently, to keep the size of #598 smaller.

@hacksparrow
Copy link
Contributor Author

@bajtos sure, let me create a separate story for removing getDefaultIdType(). Thanks for the suggestion!

@hacksparrow
Copy link
Contributor Author

Created #600.

@stale
Copy link

stale bot commented Dec 25, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Dec 25, 2020
@stale
Copy link

stale bot commented Jul 14, 2021

This issue has been closed due to continued inactivity. Thank you for your understanding. If you believe this to be in error, please contact one of the code owners, listed in the CODEOWNERS file at the top-level of this repository.

@stale stale bot closed this Jul 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants