Deep learning for (auto-)formalization / theorem proving / mathematical and general purpose reasoning #1
I think this is part of a bigger program of autoformalization (with the internal Google code name N2Formal, which I think stands for 'natural to formal'). These threads on Zulip mention additional papers: Machine Learning for Theorem Proving, and AI and theorem proving continued. Szegedy is on Twitter and the leanprover Zulip server, and you may ask him there, but first take a look at his manifesto A Promising Path Towards Autoformalization and General Artificial Intelligence (CICM 2020). Although I can't find a paragraph where he directly addresses the question "why not just do mathematics in an informal way", I think I can get some clue from the manifesto.
As I understand it, he wants the math literature mainly to serve as goals the agent could try to achieve to get a reward (see Section 7) and as an anchor so that the agent doesn't stray too far from what humans find interesting, not so much as something to imitate.
By the way, Szegedy actually wants to take images as input instead of .tex files:
[1] Kevin Buzzard admitted that the talk was exaggerated to troll mathematicians, but the problem does exist. He has built a remarkable community of people interested in formalizing research-level mathematics around him, with a blog, a Twitter account, a Discord server, and a Twitch account (for live-streaming Lean coding / puzzle-solving speedruns). There is an event calendar in case of interest. Buzzard, McAllester, and Szegedy are among the speakers at AITP 2019 and 2020.
I think there is potential for human mathematicians to teach a lot of math to machines through a powerful language model like the one behind the OpenAI API [2], but the system would need some permanent external memory behind the scenes to store formalized theorems and proofs in a hierarchical, compositional manner, and the language model would need to assign names (= character/binary strings, like addresses or pointers, but of variable length) to these formalized objects and call them by their names as necessary. (This is exactly what a language model should do! Quantization as in VQ-VAE could do the job, but we would like short names for simple objects and long names for complex objects: this is language as compression and as a measure of complexity (Kolmogorov).) [3] This seems to me the only way to achieve lifelong few-shot learning and avoid catastrophic forgetting.

More generally, the language model could synthesize programs (and give names to them) and outsource some of its capabilities to those programs, freeing up the capacity of the model itself (e.g. it can relegate tasks like adding two 3-digit numbers to a calculator by issuing a command (calling a program) instead of doing it itself); see the sketch after the footnotes below. These programs could implement tools, models, and concepts/abstractions. This is how I intend to bridge the neural and symbolic worlds. The library of programs is considered part of the environment (in the RL setting) that the language model (the agent) has the power to modify, and here we see the role of language as the communication device between the language model and the program library. In training the model to outsource, some sort of self-awareness might emerge, as the model would need to be aware of a part of its computation graph along with its weights, and either copy them over or distill a simpler program out of them.

I think commercialization is a great move since it would provide a reward signal from real money; this is an advantage over un-/self-supervised learning from text (though fast feedback during dialogues can also serve as a reward signal). To engage more people, companies with many users who have purchased and possess a lot of virtual currency (e.g. in games) could use that as a substitute. People would not waste money teaching the agent useless stuff (or hate speech, which some chatbots have learned); they would treat the agent more like their student or child, expecting that the trained agent will be able to generate value for them by solving important problems or providing desired services, so that their investment is paid back. (As an example, Leela Zero has been called a "baby" by the volunteers who devoted compute to generating training data for the Go bot.) Areas of high-value scientific research may see the first instances of such systematic human-to-AI knowledge transfer. The agent could also pay humans in exchange for manual labor (to conduct lab work, for example), giving instructions in words or multimedia, making human limbs an extension of the agent itself. The moment it becomes easier to teach machines than humans will mark the beginning of the singularity era. In the latest news, OpenAI put online a proof assistant for Metamath and expect that "the proof assistant will help you be more productive building your Metamath proofs while helping us improve our model’s accuracy at the same time".

[2] cf.
OpenAI says they are going to continually improve their API, and Human Instruction-Following with Deep Reinforcement Learning via Transfer-Learning from Text from DeepMind (which reminds me of Hierarchical Decision Making by Generating and Following Natural Language Instructions and CraftAssist: A Framework for Dialogue-enabled Interactive Agents, both from Facebook last year). See also another manifesto, AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence, by Jeff Clune, OpenAI.
[3] cf. Neural symbolic reader, Neuro-symbolic concept learner, TP-transformer, TP-N2F, which, however, weren't exactly aiming at this goal.
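To make the external-memory / program-library idea above a bit more concrete, here is a minimal sketch in Python. It is only an illustration of the intent, not anything from Szegedy's paper or any existing system; the names (ProgramLibrary, register_program, register_theorem, call) are my own invention. The point is simply that the agent stores formalized objects and synthesized programs under variable-length names and issues commands by name instead of redoing the computation inside its own weights.

```python
# Hypothetical sketch: an external, persistent library that a language-model
# agent can read and modify. All class and method names here are invented for
# illustration; nothing below comes from an existing API.

class ProgramLibrary:
    """External memory holding named programs and formalized theorems."""

    def __init__(self):
        self._programs = {}   # name -> callable (synthesized or hand-written tool)
        self._theorems = {}   # name -> formal statement (e.g. a Lean/HOL term as text)

    def register_program(self, name, fn):
        # The agent "outsources" a capability by storing it under a name.
        self._programs[name] = fn

    def register_theorem(self, name, statement):
        # Formalized objects are stored compositionally and addressed by name;
        # simple objects would get short names, complex ones longer names
        # (language as compression, roughly in the Kolmogorov sense).
        self._theorems[name] = statement

    def call(self, name, *args):
        # The model emits a command like ("add3", 123, 456) rather than
        # performing the arithmetic itself.
        return self._programs[name](*args)

    def lookup(self, name):
        return self._theorems[name]


library = ProgramLibrary()

# Relegate 3-digit addition to a trivial "calculator" program.
library.register_program("add3", lambda a, b: a + b)

# Store a formalized fact under a short name and recall it later by that name.
library.register_theorem("nat.add_comm", "∀ a b : ℕ, a + b = b + a")

print(library.call("add3", 123, 456))      # 579, computed outside the model
print(library.lookup("nat.add_comm"))      # retrieved from external memory
```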
[after I pointed out Szegedy's document A Promising Path Towards Autoformalization and General Artificial Intelligence to /u/starspawn0, he reposted it as a topic on the same subreddit, and the following is my comment.] Thanks to /u/starspawn0 for making this more visible. I just want to add a link to my original post: https://www.reddit.com/r/thisisthewayitwillbe/comments/hcxymf/200604757_language_modeling_for_formal/fvq3372a See also here for a review of the document; I was following Jason Rute in calling it a "manifesto". Learning with AMIGo: Adversarially Motivated Intrinsic Goals just came out, and to me it looks like a step towards a Zero (tabula rasa) approach to math, and relevant to how mathematicians come up with research or pedagogical problems (hard open problems vs. routine test problems) and define new objects of study that lead to new, interesting, and challenging areas of research.
Additional relevant papers (continually updated):
- Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity, Berkeley (Pathak et al.), https://arxiv.org/abs/1902.05546, tweet https://twitter.com/skaasj/status/1272574484336324609
- CICM 2020:
- GPT-3: Language Models are Few-Shot Learners, https://arxiv.org/abs/2005.14165
- Biologically plausible approximations of gradient descent, deep learning and the brain:
Evolution of McAllester's set-theoretic dependent type theory as a language for mathematics: Tarski and Mentalese, Aug 30, 2013, https://machinethoughts.wordpress.com/2014/08/02/tarksi-and-mentalese/
[The following two posts are my reply to /u/starspawn0's comment on the paper Language Modeling for Formal Mathematics by Christian Szegedy et al., posted on the subreddit 'thisisthewayitwillbe', in which he asked "why they don't just try to train large language models to write proofs the way humans do, and worry about mapping it all to a more useful form (HOL/Isabelle-checkable) later".]