-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add basic ranking example. #7
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
examples/basic_ranking.py
Outdated
|
||
def __init__( | ||
self, | ||
user_embeddings_count, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
num_users, num_items
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, although I'm keeping the term candidate
instead of item
everywhere for consistency.
return ( | ||
# Inputs are user ids and movie ids | ||
{ | ||
"user_id": tf.strings.to_number(x["user_id"], out_type=tf.int32), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these IDs guaranteed to be sequential? If they're just random int-like, we would want to reindex them from 0 to num_users
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cheated because I know the dataset and they are sequential.
The original example was more general because it wasn't making any assumption on the IDs, not even that they were numbers, because it's using a StringLookup
layer as part of the model. StringLookup
is not multi-backend so we can't do that as part of the model, it would have to be part of preprocessing. But then it gets more complicated to demonstrate inference (in the retrieval example), because the reverse lookup is needed, so we would need:
- model.predict
- convert to Tensorflow
- reverse lookup
- convert to Python
- lookup movie title
We have 3 options:
- make it fully generic and use
StringLookup
outside of the model - assume IDs are ints and use an int to int mapping
- leave it as is (assume sequential IDs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leave it as is (assume sequential IDs)
That's ok if you clearly document this gotcha
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some detailed language about this shortcut.
Example was run with all 3 backends (tf, jax and torch).
48878ad
to
f4db280
Compare
Example was run with all 3 backends (tf, jax and torch).