-
Notifications
You must be signed in to change notification settings - Fork 235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Alt SPIKE: New objectid #596
Conversation
6732fe7
to
7d2f503
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are making good progress 👏🏻
I have few more comments to discuss, PTAL 👇🏻
SPIKE.md
Outdated
All changes are limited to `loopback-connetor-mongodb` only, Juggler is not affected. | ||
|
||
- Before querying database: | ||
- Convert `id` value to `ObjectID`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to use an id value that's not an ObjectID
, e.g. a number? I vaguely remember that in such case, LoopBack 3 may be using a new property _id
to hold the actual database id (e.g. {_id: objectIdGeneratedByMongo, id: 123}
).
Can you please verify?
This item caught my attention because I find it a bit odd that we have to treat id
(primary key) values differently from other properties, considering that both primary keys and other properties must be defined the same way (mongodb: {dataType: 'ObjectID'}
).
Also if we have special treatment for primary keys, then shouldn't we also apply the same rules to foreign keys?
Have you considered a simpler design, where we apply the same rules for all properties, regardless of whether they are a primary key or not? Is there any edge case that not work with that solution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The special treatment for primary key is the existing behavior, let me update the implementation to apply a common rule for all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to use an id value that's not an ObjectID, e.g. a number?
Yes, but using non-ObjectID _id
it comes with unexpected behaviors, so not worth using it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to use an id value that's not an ObjectID, e.g. a number?
Yes, but using non-ObjectID
_id
it comes with unexpected behaviors, so not worth using it.
I am not referring to non-ObjectID _id
field. I mean a model that's using a number as the primary key on the LoopBack side.
Here is some code you can add to lib/objectid.test.js
to better understand what's going on.
context.only('numeric id', function() {
const User = ds.createModel(
'UserNum',
{
id: {type: 'number', id: true, generated: false},
email: {type: 'string'},
},
{strictObjectIDCoercion: true},
);
beforeEach(() => User.deleteAll());
it('is supported and uses _id: ObjectID in background', async () => {
console.log('id definition', User.definition.properties.id);
const u = await User.create({id: 1, email: '[email protected]'});
console.log(u);
console.log(u.toJSON());
const found = await User.findById(u.id);
console.log('found', found);
});
});
Console output:
id definition {
type: [Function: Number],
id: true,
generated: false,
updateOnly: false
}
{ id: 1, email: '[email protected]' }
{ id: 1, email: '[email protected]' }
found { email: '[email protected]', id: 1 }
Interestingly enough, when I change id
definition and remove generated: false
, and don't provide any id
value when creating a new user, then the connector somehow starts storing ObjectID values in the numeric id
property.
id definition { type: [Function: Number], id: true, updateOnly: false }
{ email: '[email protected]', id: 5f5b71894004d51b2a65de00 }
{ id: 5f5b71894004d51b2a65de00, email: '[email protected]' }
found { email: '[email protected]', id: NaN }
This looks like a bug to me, notice that the model instance returned by findById
has id: NaN
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to use an id value that's not an ObjectID, e.g. a number? I vaguely remember that in such case, LoopBack 3 may be using a new property _id to hold the actual database id (e.g. {_id: objectIdGeneratedByMongo, id: 123}).
Yes, it possible to use an id value that's not an ObjectID, but the custom id field is not saved in the database, it is logically translated in the model from _id
to whatever name was specified.
when I change id definition and remove generated: false, and don't provide any id value when creating a new user, then the connector somehow starts storing ObjectID values in the numeric id property.
This happens on the master branch too. We should throw an error saying id
is required, like how specifying the id
for generated: 1
throws an error.
Need some clarifications:
- If the model is defined as
{id: {id: true, type: 'string', generated: true }}
, should it be interpreted as{id: {id: true, type: 'string', mongodb: {dataType: 'objectID'}}}
? After all, the autogenerated id will be anObjectID
. - How should
{id: {id: true, type: 'number', generated: true }}
be treated? It is essentially saying{id: {id: true, type: 'number', mongodb: {dataType: 'objectID'}}}
.
I have created a branch - https://github.com/strongloop/loopback-connector-mongodb/tree/non-objectid-id - for quickly testing the current behavior on master
. File of interest: https://github.com/strongloop/loopback-connector-mongodb/blob/non-objectid-id/test/new-objectid.test.js.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it possible to use an id value that's not an ObjectID, but the custom id field is not saved in the database, it is logically translated in the model from
_id
to whatever name was specified.
Interesting, I thought the model would have two properties - user-defined id
and database-generated _id
. Thank you for taking a look and finding out what's the actual behavior.
If the model is defined as
{id: {id: true, type: 'string', generated: true }}
, should it be interpreted as{id: {id: true, type: 'string', mongodb: {dataType: 'objectID'}}}
? After all, the autogenerated id will be anObjectID
.
One option is to silently (auto-magically) assume mongodb: {dataType: 'objectID'}
. A better solution may be to throw a descriptive error and asks users to specify mongodb: {dataType: 'objectID'}
in the property definition.
- How should
{id: {id: true, type: 'number', generated: true }}
be treated? It is essentially saying{id: {id: true, type: 'number', mongodb: {dataType: 'objectID'}}}
.
IMO, {type: 'number', mongodb: {dataType: 'objectID'}}
is not a valid property definition, similarly for {type: 'date', mongodb: {dataType: 'objectID'}}
, etc.
I am proposing to throw an error when such property definition is detected.
Connectors can provide define
, defineProperty
and defineForeignKey
functions that are called by juggler's DataSource
object to notify the connector about new models/properties/foreign-keys defined by the application. I think it would be best to implement these methods in MongoDB connector to validate property definitions at setup (define) time, instead of deferring the validation until the first DAO method is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a bug to me, notice that the model instance returned by findById has id: NaN.
Can we address this in a separate issue since the fix lies in Juggler? We will be able to make faster progress by not including changes in Juggler. From my initial look, it seems to be caused by the difference in behavior of the various DAO methods as noted here.
Sure, that's fine.
However, I think my proposal to reject property definitions like {type: 'number', mongodb: {dataType: 'objectID'}}
would solve the issue and it is something we can implement inside the MongoDB connector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be best to implement these methods in MongoDB connector to validate property definitions at setup (define) time, instead of deferring the validation until the first DAO method is called.
Let's do it in a follow up story. Let's focus specifically on ObjectID
-to-string conversion and vice versa in this spike. If we include validation in the connector, it will further drag on the spike.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel it's important to get the developer experience right. If a user accidentally mis-configures their models, the framework should be helpful and tell them what they did wrong and how to fix it.
I am ok to work on this in a follow-up story, please make sure to include it the spike proposal. Should we start with a spike to better understand the complexities involved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a user accidentally mis-configures their models, the framework should be helpful and tell them what they did wrong and how to fix it.
100% agreed, and best scoped in a separate story.
Should we start with a spike to better understand the complexities involved?
I think so, validations are not going to be simple case of if
-else
when we have inferred and nested properties.
lib/mongodb.js
Outdated
@@ -401,6 +403,12 @@ MongoDB.prototype.fromDatabase = function(modelName, data) { | |||
if (data[p] instanceof mongodb.Binary) { | |||
// Convert the Binary into String | |||
data[p] = data[p].toString(); | |||
// SPIKE: properties that should be converted to string from ObjectID | |||
// SPIKE: do the same for toDatabase | |||
} else if (p === 'id' || prop.mongodb && prop.mongodb.dataType.match(/objectid/i)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please keep in mind that primary keys are not required to be called id
. If you really need to identify primary keys here (which I think you should not need to), then you need to check if prop.id
is set (I think). I also vaguely remember that prop.id
can be a number to describe multi-column index (e.g. on GitHub, userId
would have id: 1
and repoId
would have id: 2
to define (userId, repoId)
as the primary key). I am not sure if those numeric values are indexed from 0
or 1
though.
lib/mongodb.js
Outdated
@@ -448,6 +458,16 @@ MongoDB.prototype.toDatabase = function(modelName, data) { | |||
coordinates: [data[p].lng, data[p].lat], | |||
type: 'Point', | |||
}; | |||
// SPIKE: Ideally coercePropertyValue should handle this but it does not have enough info | |||
} else if (prop && prop.type === String) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please try find a way how to use the existing code for converting string
to Decimal128
.
lib/mongodb.js
Outdated
} else if (p === 'id' || prop.mongodb && prop.mongodb.dataType.match(/objectid/i)) { | ||
if (data[p]) { | ||
data[p] = data[p].toString(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please try find a way how to use the existing code for converting string
to Decimal128
.
lib/mongodb.js
Outdated
@@ -1508,7 +1531,8 @@ MongoDB.prototype.count = function count(modelName, where, options, callback) { | |||
*/ | |||
MongoDB.prototype.replaceById = function replace(modelName, id, data, options, cb) { | |||
if (this.debug) debug('replace', modelName, id, data); | |||
const oid = this.coerceId(modelName, id, options); | |||
// SPIKE: every id should be converted to ObjectID before query | |||
const oid = ObjectID(id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comments above. IMO, the conversion to ObjectID should be applied only if the id property is defined with data type ObjectID.
It would be great if we could re-use existing bits from the infrastructure for coercing arbitrary model properties.
name: String | ||
} | ||
); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SPIKE.md
Outdated
|
||
Specifically for `ObjectID`: | ||
- Properties that should be stored as `ObjectID` in MongoDB must be defined as `{type: 'string', mongodb: {dataType: 'ObjectID'}}`. | ||
- There will be no auto-magic coercion from `string` to `Object` for other properties (those defined without `dataType: 'ObjectID'`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- There will be no auto-magic coercion from `string` to `Object` for other properties (those defined without `dataType: 'ObjectID'`). | |
- There will be no auto-magic coercion from `string` to `ObjectID` for other properties (those defined without `dataType: 'ObjectID'`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposal and PoC looks great!
const Author = ds.createModel( | ||
'Author', | ||
{ | ||
id: {id: true, type: String, mongodb: {dataType: 'objectID'}}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: according to "if specified with {id: true}
, need not add mongodb: {dataType: 'ObjectID'}
(it will be automatically inferred)" the mongodb config here can be omitted.
test/new-objectid.test.js
Outdated
}); | ||
|
||
|
||
// SPIKE: in these tests there is no way of knowing if objectID-like strings are actually stored |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the spike: we test the raw data from database for decimal properties like this, hope it helps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Janny, added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we have a decent understanding of where we want to get, what ObjectID-related behavior we want to have 👍
Now we need to build a better understanding of how to get there. What is the first step (the first task/user story)? What is the next one? Which tasks must be implemented one after other because of dependencies, and which can be worked on in parallel, potentially by more than just one person?
Ideally, the proposed plan should have enough details that the creation of follow-up/implementation GitHub issues will become mostly copy & paste, and that anybody else on the team or even from our community can pick up any task and contribute the changes, without having to fully understand the entire spike document.
@bajtos I will update the spike report with the follow up tasks. |
8f1dad7
to
8db6d8f
Compare
8db6d8f
to
d500f0a
Compare
|
@hacksparrow thank you for opening follow-up tasks, they look mostly good 👍🏻 I remember that we were discussing that in order to support the desired behavior, where PKs are always represented as strings at juggler level, we need to remove Is this still relevant? Which story is covering this part? Should we create a new one, since this change seems relatively isolated? I think this change will require us to update many unit tests, so it may be best to plan it independently, to keep the size of #598 smaller. |
@bajtos sure, let me create a separate story for removing |
7ccaae4
to
799396a
Compare
Created #600. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
This issue has been closed due to continued inactivity. Thank you for your understanding. If you believe this to be in error, please contact one of the code owners, listed in the |
No description provided.