
TheoremLlama generates its NL-FL-aligned dataset with a bootstrapping technique that incorporates natural-language (NL) proofs into Lean4 code. This technique helps overcome the shortage of aligned data and improves an LLM's comprehension and execution of formal reasoning by integrating NL reasoning into Lean4 scenarios. The resulting dataset is called Open Bootstrapped Theorems (OBT).

Lean is a proof assistant and functional programming language used to rigorously verify mathematical proofs, so that a checked result is guaranteed to follow from its stated assumptions. It is based on the calculus of constructions with inductive types and allows mathematical definitions, theorems, and proofs to be formalized in a strict, machine-checkable language. Lean provides feedback, interprets expressions, and certifies the correctness of proofs.
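As a minimal sketch of what this looks like in practice, the Lean4 snippet below states a small theorem, lets Lean certify its proof, and then asks Lean to report a type and evaluate an expression. The theorem name `add_comm_example` is hypothetical and chosen only for illustration; `Nat.add_comm` is a lemma from Lean's core library.

```lean
-- A theorem statement: addition on natural numbers is commutative.
-- Lean accepts the theorem only if the proof term type-checks.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- Feedback: ask Lean to report the statement it has certified.
#check add_comm_example

-- Interpretation: ask Lean to evaluate an expression.
#eval 2 + 3  -- 5
```

If the proof term were wrong, Lean would reject the theorem with an error instead of accepting it, which is the sense in which it certifies correctness.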

LLMs make use of NL proofs in theorem proving by being trained on aligned Natural Language (NL) and Formal Language (FL) theorem-proving data. TheoremLlama is an end-to-end framework that specializes general-purpose LLMs for Lean4 theorem proving. It generates an NL-FL-aligned dataset, Open Bootstrapped Theorems (OBT), using a bootstrapping technique that embeds NL proofs in Lean4 code. This integration improves LLMs' comprehension and execution of formal reasoning, which leads to better results in formal theorem proving.
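To make the idea of NL-FL alignment concrete, here is a hedged illustration of a Lean4 proof with the natural-language reasoning written into the code as comments. It is not an entry from OBT, and the exact format used by TheoremLlama may differ; the theorem name `add_zero_example` and the comment layout are assumptions made for this sketch.

```lean
/-
NL statement: for natural numbers a and b, if a + 0 = b then a = b.
NL proof: adding zero to a number leaves it unchanged, so the
hypothesis a + 0 = b says exactly that a = b.
-/
theorem add_zero_example (a b : Nat) (h : a + 0 = b) : a = b := by
  -- NL step 1: adding zero changes nothing, so rewrite a + 0 to a in h.
  rw [Nat.add_zero] at h
  -- NL step 2: the hypothesis now reads a = b, which is the goal.
  exact h
```

Each formal tactic sits next to the informal step it carries out, so a model trained on such data sees the NL reasoning and the Lean4 code side by side.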