Parameter Efficiency of ConvE #64
Comments
It is a bit surprising that DistMult is so large even though it scales linearly. The issue is that knowledge graphs can be large, and DistMult's parameter count scales with the size of the knowledge graph, while the convolution and the projection matrix in ConvE scale independently of it. If you run the models, the parameter count is printed, but let me recalculate it by hand for some numbers in the paper to convince you of this claim. In the paper, I claim that an embedding size of 128 for DistMult and 96 for ConvE is roughly equivalent in parameters for FB15k-237 (14541 entities and 237 relationships): For ConvE I did not include the bias terms and I used a 2D embedding of size
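The hand calculation described above can be sketched roughly as follows. The entity and relation counts and the embedding sizes (128 and 96) come from the comment. The ConvE-specific settings are assumptions made only for illustration, since the comment does not state them here: a hypothetical 8 x 12 reshaping of the 96-dimensional embedding, 32 filters of size 3 x 3 with no padding, and no bias terms. The function names are also hypothetical and are not taken from the repository.

```python
# FB15k-237 statistics quoted in the comment above.
NUM_ENTITIES = 14541
NUM_RELATIONS = 237


def embedding_params(num_entities, num_relations, dim):
    """Parameters in the entity and relation embedding tables.

    This part grows linearly with the size of the knowledge graph.
    """
    return (num_entities + num_relations) * dim


def conve_extra_params(dim, emb_height, emb_width, num_filters=32, kernel=3):
    """Convolution and projection parameters, ignoring bias terms.

    ASSUMPTION: the 2D shape, filter count, and kernel size are
    illustrative values, not settings confirmed by the comment.
    This part does not depend on the number of entities or relations.
    """
    assert emb_height * emb_width == dim
    # Subject and relation embeddings are reshaped to 2D and stacked.
    stacked_height = 2 * emb_height
    # Output size of a convolution with stride 1 and no padding.
    out_height = stacked_height - kernel + 1
    out_width = emb_width - kernel + 1
    conv_weights = num_filters * kernel * kernel
    # The projection maps the flattened feature maps back to `dim`.
    projection_weights = num_filters * out_height * out_width * dim
    return conv_weights + projection_weights


def distmult_total(dim=128):
    return embedding_params(NUM_ENTITIES, NUM_RELATIONS, dim)


def conve_total(dim=96, emb_height=8, emb_width=12):
    return (embedding_params(NUM_ENTITIES, NUM_RELATIONS, dim)
            + conve_extra_params(dim, emb_height, emb_width))


if __name__ == "__main__":
    print("DistMult, dim 128:", distmult_total())
    print("ConvE, dim 96 (assumed 8 x 12 layout):", conve_total())
```

Under these assumed settings the two totals land in the same range, because the smaller embedding size saves more parameters across the roughly 14.8k embedding rows than the fixed convolution and projection layers add. The counts printed when running the models remain the authoritative numbers.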
Really appreciate your reply! I got it. So the point is that ConvE can achieve performance similar to DistMult with fewer parameters, right?
Yes, that is correct! I also tried an alternating, checkerboard pattern between both embeddings, which takes the idea to the extreme, but this did not help more than just concatenating. My intuition is that sometimes you just want to model an entity or relationship on its own, meaning you want to model information that is relationship/entity independent. Having a separate region for this could help with modeling this kind of information.
In your paper, you claim ConvE uses fewer parameters than DistMult. But I think in your code DistMult only uses O(num_entities * embedding_dim + num_rels * embedding_dim) parameters and ConvE uses more. I am a bit confused about your claim, and I am afraid I missed something. Can you point out how to verify this claim? Thanks!