teorth · pitmonticone · Oct 2, 2024 · Oct 2, 2024 · Oct 2, 2024 · Oct 2, 2024
diff --git a/README.md b/README.md
@@ -19,6 +19,7 @@ Some automatically generated progress:
 - Oct 1, 2024: Another [~250k transitive implications](equational_theories/Generated/TrivialBruteforce) were proven by simple proof generation. Discussed in the blueprint [here](https://teorth.github.io/equational_theories/blueprint/sect0007.html).
 - Oct 1, 2024: [~500k transitive implications](equational_theories/Generated/EquationSearch) were proven by a custom tool that chooses hypotheses and leveraged previously found implications to search by using the implied equations as substitutions.  Discussed in the blueprint [here](https://teorth.github.io/equational_theories/blueprint/sect0009.html).
 - Oct 2, 2024: All but 7999 implications (99.958%) were [conjecturally resolved using the vampire ATP](equational_theories/Generated/Vampire).
+- Oct 2, 2024: 86 (potentially) new implications in `Subgraph.lean` conclusively (i.e. in Lean) resolved using the [lean-egg tactic](https://github.com/marcusrossel/lean-egg). Some of these were simply missed by the transitive closure computation and technically already implied, but some were genuinely new.
 
 Some statistics and data files from a given point in time:
 - Oct 2, 2024: [A list of unknown implications whose converse is proven](https://github.com/amirlb/equational_theories/blob/extract_implications_equivalence_creators_data/scripts/equivalence_creators.json), i.e. implications that would reduce the number of equivalence classes of equations. At the time of creation we had 2969 equivalence classes. This file contains 4526 unknown implications, if all hold then it will reduce the number of equivalence classes to 1300.

diff --git a/blueprint/src/chapter/egraphs.tex b/blueprint/src/chapter/egraphs.tex
@@ -0,0 +1,29 @@
+\chapter{E-Graphs}
+
+For proving implications, we used another technique called equality saturation~\cite{DBLP:journals/pacmpl/WillseyNWFTP21} with a tactic, to automatically construct proofs.
+
+\section{Methodology}
+
+The basic methodology of equality saturation is based on E-Graphs, a data structure that can store equivalence classes of terms efficiently.
+We used the lean-egg tactic (\url{https://github.com/marcusrossel/lean-egg}), based on equality saturation as a tactic, which (re)constructs a proof from the E-graph~\cite{DBLP:journals/pacmpl/KoehlerGBGTS24} in Lean.
+This means that we do not have to trust either the egg tool nor the tactic: if something goes wrong, Lean will not accept the constructed proof. In fact, we found issues with the proof reconstruction from the examples in this project.
+
+The lean-egg tactic works for equational reasoning, i.e. proving equalities as consequences of other equalities (potentially universally quantified), which is exactly what we need to prove implications of laws in Magmas.
+In many cases, we have laws of the form $x = y$, where neither set of variables in the left- and right-hand-side of the law is a subset of each other.
+In this case the laws cannot be used as rewrite rules: it's not clear what it would be rewritten to, since there are unknowns on both sides of the equation.
+For these cases we used a simple heuristic, where we instantiate the variables with terms found in the (proof) context, as those are likely to be important for proving the equality.
+
+\section{Results}
+
+Out of the possible implications between the $34$ equations considered in Chapter~\ref{subgraph-eq}, this method found an additional 86 implications that were not found before.
+Some of these seem to be missing in the computation of the transitive closure of implications of the equalities (an investigation is in progress), but some of these are genuinely new theorems, and the \texttt{lean-egg} tactic finds good proofs of these (these can be rewritten using \texttt{calc} style with a different tactic, \texttt{calcify}: \url{https://github.com/nomeata/lean-calcify}). An example of this is the following proof, found by lean-egg:
+
+\begin{theorem}[14 implies 23]\label{14_implies_23} \uses{eq23,eq14}\lean{Subgraph.Equation14_implies_Equation23}\leanok  Definition \ref{eq14} is equivalent to Definition \ref{eq23}.
+\end{theorem}
+
+\begin{proof}
+
+$$ x = (x \circ x) \circ (x \circ (x \circ x)) = (x \circ x) \circ x $$
+\end{proof}
+
+It was also able to (re)prove Theorem~\ref{1689_equiv_2}, albeit with a manually-provided hint (guide, in the sense of~\cite{DBLP:journals/pacmpl/KoehlerGBGTS24}).
diff --git a/blueprint/src/chapter/equation_search.tex b/blueprint/src/chapter/equation_search.tex
@@ -1,6 +1,6 @@
 \chapter{Equation Search}
 
-\href{https://github.com/teorth/equational_theories/tree/main/equational_theories/Generated/EquationSearch}{Approximately ~500k transitive implications were proven by a custom tool leveraging the implication graph}. After previous brute force had derived many implications expressible as a small number of rewrites, this equation search leveraged the implication graph to search further. It began at a hypothesis, and then attempted to perform substitutions by equations it transitively implied. Due to it's naive implementation, it can not perform all forms of substutions, it is limited to substutions that match the exact text of another equation. This search method benefits from being able to start from a hypothesis and can search for any number of goals, rather than having to search the combined space of all possible hypotheses and goals, and can 'reach' farther positions in the search graph than simple rewriting alone.
+\href{https://github.com/teorth/equational_theories/tree/main/equational_theories/Generated/EquationSearch}{Approximately ~500k transitive implications were proven by a custom tool leveraging the implication graph}. After previous brute force had derived many implications expressible as a small number of rewrites, this equation search leveraged the implication graph to search further. It began at a hypothesis, and then attempted to perform substitutions by equations it transitively implied. Due to it's naive implementation, it can not perform all forms of substitutions, it is limited to substitutions that match the exact text of another equation. This search method benefits from being able to start from a hypothesis and can search for any number of goals, rather than having to search the combined space of all possible hypotheses and goals, and can 'reach' farther positions in the search graph than simple rewriting alone.
 
 An example proof illustrates the logic it uses:
 

diff --git a/blueprint/src/content.tex b/blueprint/src/content.tex
@@ -17,6 +17,7 @@
 \input{chapter/trivial_auto.tex}
 \input{chapter/all_small_magmas.tex}
 \input{chapter/equation_search.tex}
+\input{chapter/egraphs.tex}
 
 \bibliographystyle{plain} % We choose the "plain" reference style
 \bibliography{references}
diff --git a/blueprint/src/print.pdf b/blueprint/src/print.pdf
diff --git a/blueprint/src/references.bib b/blueprint/src/references.bib
@@ -43,3 +43,43 @@ @article {Kisielewicz2
        DOI = {10.1007/s000120050057},
        URL = {https://doi.org/10.1007/s000120050057},
 }
+
+@article{DBLP:journals/pacmpl/KoehlerGBGTS24,
+  author       = {Thomas Koehler and
+                  Andr{\'{e}}s Goens and
+                  Siddharth Bhat and
+                  Tobias Grosser and
+                  Phil Trinder and
+                  Michel Steuwer},
+  title        = {Guided Equality Saturation},
+  journal      = {Proc. {ACM} Program. Lang.},
+  volume       = {8},
+  number       = {{POPL}},
+  pages        = {1727--1758},
+  year         = {2024},
+  url          = {https://doi.org/10.1145/3632900},
+  doi          = {10.1145/3632900},
+  timestamp    = {Thu, 29 Feb 2024 20:54:53 +0100},
+  biburl       = {https://dblp.org/rec/journals/pacmpl/KoehlerGBGTS24.bib},
+  bibsource    = {dblp computer science bibliography, https://dblp.org}
+}
+
+@article{DBLP:journals/pacmpl/WillseyNWFTP21,
+  author       = {Max Willsey and
+                  Chandrakana Nandi and
+                  Yisu Remy Wang and
+                  Oliver Flatt and
+                  Zachary Tatlock and
+                  Pavel Panchekha},
+  title        = {egg: Fast and extensible equality saturation},
+  journal      = {Proc. {ACM} Program. Lang.},
+  volume       = {5},
+  number       = {{POPL}},
+  pages        = {1--29},
+  year         = {2021},
+  url          = {https://doi.org/10.1145/3434304},
+  doi          = {10.1145/3434304},
+  timestamp    = {Sun, 12 Feb 2023 18:49:14 +0100},
+  biburl       = {https://dblp.org/rec/journals/pacmpl/WillseyNWFTP21.bib},
+  bibsource    = {dblp computer science bibliography, https://dblp.org}
+}