Build a tool to find closest match in DLMF for a given mathematical expression #1777
Comments
Pinging @physikerwelt, who I presume has thought about this before.
Apart from finding articles, such a normalized representation of mathematical concepts could perhaps also be a useful component of a tool for finding software that does something with these concepts, or even dedicated hardware (should it exist) for computing such things.
What is the context of this ticket?
@physikerwelt The background is that I am interested in browsing the literature by mathematical formulas, as per When I came across that paper, I wondered which mathematical systems similar to the one described by its equations might have been explored in other papers before, perhaps even in a completely different context. Yet I do not know of an efficient mechanism for finding such papers based purely on the formulas or expressions, or on some abstract representation thereof. DLMF at least helps with the abstract representation, yet I am not aware of it having been used for literature search, hence the ticket. As for defining similarity, I agree that there are multiple ways to go about it, and your paper illustrates this nicely. For now, I would be happy to use tooling based on any facet of similarity, or even a combined measure as per Zhang and Youssef. In short, if we have SwMATH to indicate which software was used in a useful subset of papers, it is probably not far-fetched to think about a system that indicates which formulas were used in such a set of papers. While exact matching of formulas may not work well in cases like my example above, something that maps onto a taxonomy like DLMF would seem like a good starting point.
My recommendation would be to take a math-optimized LLM, like Llemma, feed the formulas through the model, take the embeddings, and do a k-nearest-neighbor search in the embedding space. Given that recent LLMs have become quite good at handling math, I am confident that this would already produce reasonably good results.
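To make the nearest-neighbor step of that suggestion concrete, here is a minimal sketch in plain Python. It is not a working DLMF matcher: the `embed` function is a hypothetical stand-in that uses character trigram counts so the example runs without any model, and in the approach described above it would be replaced by embeddings taken from a math-capable LLM. The corpus entries and their labels are illustrative assumptions, not actual DLMF identifiers.

```python
import math
from collections import Counter


def embed(formula, n=3):
    """Placeholder embedding (assumption): character n-gram counts.

    In the suggested approach this would instead return a vector
    obtained from a math-capable language model.
    """
    s = formula.replace(" ", "")
    return Counter(s[i:i + n] for i in range(max(len(s) - n + 1, 1)))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest(query, corpus, k=3):
    """Return the k corpus entries most similar to the query formula."""
    q = embed(query)
    scored = [(label, cosine(q, embed(tex))) for label, tex in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical mini-corpus of LaTeX formulas; labels are made up for illustration.
corpus = {
    "gamma-integral": r"\Gamma(z) = \int_0^\infty t^{z-1} e^{-t} \, dt",
    "beta-integral": r"B(a,b) = \int_0^1 t^{a-1} (1-t)^{b-1} \, dt",
    "bessel-series": r"J_\nu(z) = \sum_{k=0}^\infty \frac{(-1)^k (z/2)^{2k+\nu}}{k! \Gamma(k+\nu+1)}",
}

results = nearest(r"\int_0^\infty x^{s-1} e^{-x} \, dx", corpus, k=2)
for label, score in results:
    print(f"{label}: {score:.3f}")
```

With a real model, only `embed` would change; for a large reference collection, the brute-force scan in `nearest` would likely be swapped for an approximate nearest-neighbor index.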
I am not sure whether this already exists, but if I have some expressions like the ones below (from here)
in a machine-friendly format, then it would be nice to see how they or their components could be mapped to the Digital Library of Mathematical Functions. Such a mapping could serve as a bridge to support finding other articles that contain similar mathematical constructs, as per