forked from mikeshulman/catlog
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathdedsys.tex
559 lines (457 loc) · 49 KB
/
dedsys.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
\chapter{Deductive systems}
\label{chap:dedsys}
The purpose of this appendix is to explain the formal apparatus underlying type theory from a mathematical perspective, giving precise meanings to words like ``judgment'', ``rule'', ``derivation'', and ``binder''.
This is rarely explained in detail, yet in my experience the unfamiliar terminology is a large part of what makes type theory difficult for mathematicians to understand.
Formally speaking, this appendix should come before \cref{chap:unary}.
However, its technicalities seem unlikely to be appreciated without some concrete exposure to the ideas that it is trying to explain, so I have placed it as an appendix instead.
I encourage the reader to skip back and forth between it and the main text as needed.
I should say that probably not all type theorists would agree with the meanings assigned herein to words like ``judgment''.
Constructive type theory also has a philosophical/foundational aspect that I will not attempt to explain or engage with.
The purpose of this appendix, like that of the entire book, is to explain type theory \emph{only} in its role as a language for reasoning about categorical structures, without meaning thereby to disparage its other roles or regard them as unimportant.
\section{Trees and free algebras}
\label{sec:trees}
As remarked in \cref{sec:generalities}, our perspective on type theory is that it presents \emph{free categorical structures} in a particularly convenient way.
Since categorical structures are particular kinds of \emph{algebraic} structures, it seems reasonable to think first about what free algebraic structures look like in general.
In this section we begin by considering ``algebras without axioms''.
A \textbf{signature} $\sig$ is a set $\sig_1$ of \textbf{operations} with a function $\ay:\sig_1 \to \dN$ assigning to each operation a natural number\footnote{Everything in this chapter works just as well if arities are arbitrary cardinal numbers (except that in \cref{sec:axioms} we would require the axiom of choice).
However, for simplicity we restrict to the case of finite arities, since that is where our ultimate interest lies.
On the other hand, there may certainly be infinitely many operations.} called its \textbf{arity}.
A $\sig$-\textbf{algebra} is a set $A$ together with, for each $m\in\sig_1$, a function $\act m :A^{\ay(m)} \to A$.
There is an obvious notion of $\sig$-algebra morphism, forming a category.
Algebras over a signature are a very ``primordial'' sort of algebraic structure, with arbitrary operations but no axioms allowed.
For instance, if $\sig_1=\{e,m\}$ with $\ay(e)=0$ and $\ay(m)=2$, then $\sig$-algebras are \emph{pointed magmas}: sets equipped with a basepoint and a binary operation.
We will see how to add axioms in \cref{sec:axioms}.
Free $\sig$-algebras are conveniently described in terms of {trees}.
A \textbf{tree} is a set whose elements are called \textbf{nodes}, together with a binary relation between them called \textbf{edge existence} (a ``relational graph'') that is connected and loop-free.
A tree is \textbf{rooted} if it is equipped with a chosen node called the \textbf{root}.
In a rooted tree every node $x$ has a unique (non-retracing) path to the root; if $x$ is not the root, this path goes through a unique edge connected to $x$ that we call \textbf{outgoing}, and the node at the other end of that edge is the \textbf{parent} of $x$.
The non-outgoing edges connected to $x$ are called \textbf{incoming}, and the nodes they connect it to are called its \textbf{children}.
A node is a \textbf{descendant} of $x$ if its path to the root passes through $x$, which is to say it is a child of a child of a\dots of $x$.
A node $x$ together with all its descendants forms another rooted tree with $x$ as the root.
% The \textbf{height} of a node is the length of its path to the root; this is always a natural number, and the height of a child of $x$ is one more than the height of $x$.
A \textbf{branch} is a non-retracing path starting at the root; a rooted tree is \textbf{well-founded} if there are no infinite branches.
If $\sig$ is a signature, then a $\sig$-\textbf{labeled tree} is a rooted tree equipped with a \textbf{labeling} function from nodes to $\sig_1$, along with for every node $x$ labeled by $m\in\cO$, a bijection from the incoming edges of $x$ to $\{1,\dots,\ay(m)\}$ (hence, in particular, that there are exactly $\ay(m)$ such edges).
There is an obvious notion of \emph{isomorphism} between labeled trees.
We write $W\sig$ for the set of all isomorphism classes of \emph{well-founded} $\sig$-labeled trees.
(Note that $W\sig$ is empty unless there is at least one nullary operation.)
Then $W\sig$ has a $\sig$-algebra structure defined as follows: given $m\in\sig_1$ and a list of trees $t_1,\dots,t_{\ay(m)}$, define a tree $\act m (t_1,\dots,t_{\ay(m)})$ with nodes $\{\star\} \sqcup \bigsqcup_i t_i$, where $\star$ is the root, with label $m$, and its children are the roots of the trees $t_i$.
The central fact is that $W\sig$ is the initial $\sig$-algebra.
We will give a classical set-theoretic proof of this for the comfort of a certain kind of reader, but readers of a different kind, or who already believe this fact, are welcome to skip the proof.
(From a constructive type-theoretic point of view, $W\sig$ and its initiality are sometimes a fundamental axiom not reducible to sets.)
\begin{thm}\label{thm:tree-ind}
Suppose $P\subseteq W\sig$ has the property that for any $m$ and trees $t_1,\dots,t_{\ay(m)}$ such that each $t_i\in P$, then also $\act m(t_1,\dots,t_{\ay(m)})\in P$.
Then $P= W\sig$.
\end{thm}
\begin{proof}
Suppose not, so there is a well-founded $\sig$-labeled tree not in $P$.
Let $m$ be the label of its root and $t_1,\dots,t_{\ay(m)}$ its children; then our given tree is (isomorphic to) $\act m(t_1,\dots,t_{\ay(m)})$.
By the contrapositive of our assumption, therefore, there must be some $i$ such that $t_i\notin P$.
Iterating, we obtain an infinite branch, contradicting well-foundedness.
\end{proof}
\begin{thm}\label{thm:tree-rec}
For any $\sig$-algebra $A$, there is a unique $\sig$-algebra morphism $W\sig \to A$.
\end{thm}
\begin{proof}
TODO: standard argument.
\end{proof}
Now that we have \emph{initial} $\sig$-algebras, note that \emph{free} $\sig$-algebras can be constructed by a simple modification.
Given $\sig$ and any set $X$, define a new signature $\sig[X]$ by $\sig[X]_1 = \sig_1 \sqcup X$, where $\ay(x)=0$ for all $x\in X$.
Then a $\sig[X]$-algebra is just a $\sig$-algebra together with a map from $X$ into its underlying set, so the initial such algebra is exactly the free $\sig$-algebra on $X$.
Thus, the forgetful functor from $\sig$-algebras to sets has a left adjoint.
A different way to express \cref{thm:tree-rec} is that \emph{given an arbitrary set $A$, to define a function $W\sig \to A$ it suffices to define a $\sig$-algebra structure on $A$}.
This may seem like a trivial reformulation, but it better reflects the way we use it to describe type theories.
In yet other words, we may define a function $f:W\sig \to A$ by specifying $f(\act m(t_1,\dots,t_{\ay(m)}))$ for each $m$, assuming recursively that $f(t_1),\dots,f(t_{\ay(m)})$ have already been defined.
Formally this is the same as specifying a $\sig$-algebra structure on $A$ --- the definition of ``$f(\act m(t_1,\dots,t_{\ay(m)}))$'' given the ``values of $f(t_1),\dots,f(t_{\ay(m)})$'' is precisely the action $\act m$ on $A$ --- but it often matches our thought processes best.
\subsection*{Exercises}
\begin{ex}\label{ex:wf-rigid}
Prove that a well-founded $\sig$-labeled tree has no nonidentity automorphisms.
Thus, the passage to isomorphism classes in the definition of $W\sig$ is ``categorically harmless''.
\end{ex}
\begin{ex}\label{ex:natw}
Exhibit a signature $\sig$ such that $W\sig \cong \dN$ and \cref{thm:tree-ind} reduces to ordinary mathematical induction.
\end{ex}
\section{Indexed trees}
\label{sec:indexed-trees}
The signatures and algebras in \cref{sec:trees} have only one underlying set, or \emph{sort}, but sometimes algebraic structures have more than one sort.
As a simple example, we could consider a set together with a group acting on that set to be a single algebraic structure; then the group and the set are two sorts.
Categories could be regarded as having two sets, namely objects and arrows; but it is generally better to treat them differently.
Specifically, for a fixed set $\cO$, we regard categories with object set \cO as an algebraic structure whose set of sorts is $\cO\times \cO$.
Thus each hom-set is a separate sort, and each triple $A,B,C$ gives a different binary composition operation
\[ \circ_{A,B,C} : (\map(B,C),\map(A,B)) \to \map(A,C) \]
This may seem a little odd, but as we will see it makes sense.
To deal with multi-sorted algebraic structures in general, we augment our signatures with a set $\sig_0$ of \textbf{sorts} together with, for each operation $m\in\sig_1$, an \textbf{output sort} $c_m\in\sig_0$ and also a list of \textbf{input sorts} $d_{m,1},\dots,d_{m,\ay(m)}$.
For brevity we write such an operation as $m:(d_{m,1},\dots,d_{m,\ay(m)}) \to c_m$.
From now on we call these \emph{multi-sorted signatures} simply \textbf{signatures}; the simpler signatures of \cref{sec:trees} we re-christen \textbf{one-sorted signatures}.
(In fact, a multi-sorted signature is essentially the same as a ``multigraph'', \cref{defn:multigraph}.)
For a multi-sorted signature $\sig$, a $\sig$-\textbf{algebra} is a $\sig_0$-indexed family of sets $\{A_i\}_{i\in\sig_0}$ together with for each $m\in\sig_1$ a function $A_{d_{m,1}}\times \cdots \times A_{d_{m,\ay(m)}} \to A_{c_m}$.
For instance, if $\sig_0 = \{g,s\}$ and $\sig_1=\{m,t\}$ with
$m : (g,g) \to g$ and $t : (g,s) \to s$,
then an indexed algebra consists of two sets $A_g$ and $A_s$, a binary operation on $A_g$, and an action of $A_g$ on $A_s$.
Similarly, we define a $\sig$-\textbf{labeled tree} as before, with the additional requirement that if $x$ is the $k^{\mathrm{th}}$ child of $y$, and $x$ is labeled by $m\in\sig_1$ while $y$ is labeled by $p\in\sig_1$, then $c_m = d_{p,k}$.
For each $i\in\sig_0$, let $W\sig_i$ be the set of isomorphism classes of $\sig$-labeled trees for which the output sort of the root is $i$.
Then $\{W\sig_i\}_{i\in\sig_0}$ has a similar tautological $\sig$-algebra structure, and is the initial one.
\subsection*{Exercises}
\begin{ex}\label{ex:multi-sorted-W}
Prove that $\{W\sig_i\}_{i\in\sig_0}$ is the initial $\sig$-algebra.
\end{ex}
\section{Free algebras with axioms}
\label{sec:axioms}
Of course, most algebraic structures of interest contain axioms as well as operations; for instance, multiplication in a group or monoid must be associative and unital.
The free monoid on a set $X$ is naturally regarded as a quotient of the free pointed magma on $X$ that forces associativity and unitality to hold.
It turns out that we can construct free algebras of this sort quite generally by defining an equivalence relation \emph{as another indexed free algebra}.
Making this completely precise in general is a bit technical, so we will begin with a concrete example.
Suppose we want to generate the free semigroup on a set $X$.
Let $\F\bMag X$ denote the free magma on $X$, constructed as in \cref{sec:trees}.
(A magma is a set with a single binary operation; a semigroup is a magma whose operation is associative.)
Now define a signature $\sig^{\equivsym}$ with $\sig^{\equivsym}_0 = \F\bMag X\times \F\bMag X$ and the following operations.
\begin{itemize}
\item For each $x\in \F\bMag X$, a nullary operation $() \to (x,x)$.
\item For each $x,y\in \F\bMag X$, a unary operation $((x,y)) \to (y,x)$.
\item For each $x,y,z\in \F\bMag X$, a binary operation $((x,y),(y,z)) \to (x,z)$.
\item For each $x,y,z,w\in \F\bMag X$, a binary operation \[((x,y),(z,w)) \to (x\cdot z,y\cdot w),\] where $\cdot$ denotes the binary magma operation on $\F\bMag X$.
\item For each $x,y,z\in \F\bMag X$, a nullary operation \[() \to (x\cdot (y\cdot z),(x\cdot y)\cdot z).\]
\end{itemize}
An algebra for this signature is an $(\F\bMag X\times \F\bMag X)$-indexed family of sets $R(x,y)$ equipped with elements and operations
\begin{align*}
e_x &\in R(x,x)\\
R(x,y) &\to R(y,x)\\
R(x,y)\times R(y,z) &\to R(x,z)\\
R(x,y) \times R(z,w) &\to R(x\cdot z,y\cdot w)\\
a_{x,y,z} &\in R(x\cdot (y\cdot z),(x\cdot y)\cdot z)
\end{align*}
Now for any such $R$, ``$R(x,y)$ is nonempty'' is a binary \emph{relation} on $\F\bMag X$, which we abusively denote also by $R(x,y)$.
The above elements and operations imply that this is an equivalence relation that is a congruence for the magma operation and moreover relates $x\cdot (y\cdot z)$ to $(x\cdot y)\cdot z$ for all $x,y,z$.
And conversely, if we have any such binary relation $\sim$, we can construct an indexed algebra $R$ by setting $R(x,y) = \unit$ if $x\sim y$ and $R(x,y)=\emptyset$ otherwise.
Let $\equivsym$ denote the binary relation obtained as above from nonemptiness of the \emph{initial} algebra for this indexed signature.
\begin{thm}\label{thm:free-monoid}
The quotient of $\F\bMag X$ by $\equivsym$ is the free semigroup generated by $X$.
\end{thm}
\begin{proof}
First we show that it is a semigroup.
Given $u,v\in\F\bMag X/\equivsym$, choose representatives $x,y\in\F\bMag X$ for them, and let $u\cdot v$ be the equivalence class of $x\cdot y$.
Since $\equiv$ is a congruence for the magma operation, the result is independent of the choice of representatives; thus $\F\bMag X/\equivsym$ is a magma.
Now given $u,v,w \in \F\bMag X/\equivsym$, choose representatives $x,y,z$; then since $x\cdot (y\cdot z)\equiv (x\cdot y)\cdot z$, we have $u\cdot (v\cdot w) = (u\cdot v)\cdot w$.
Thus $\F\bMag X/\equivsym$ is a semigroup
Now let $M$ be any other semigroup and $\psi:X\to M$ a map.
Since $M$ is in particular a magma, we have a unique induced magma morphism $\phi : \F\bMag X\to M$.
Define a binary relation $R$ on $\F\bMag X$ by saying that $R(x,y)$ means $\phi(x)=\phi(y)$.
Since $\phi$ is a magma morphism and $M$ is a semigroup, $R$ can be regarded as an algebra for the above indexed signature.
Thus it admits a map from the initial such algebra.
Hence, if $x\equiv y$, then $R(x,y)$, i.e.\ $\phi(x)=\phi(y)$; so $\phi$ factors through $\F\bMag X/\equivsym$.
It is straightforward to check that this factorization is a semigroup morphism and is the unique such extending $\psi$.
\end{proof}
In the general case, we proceed as follows.
Suppose $\sig$ is a (multi-sorted) signature and we have additionally a set $\axes$ of \textbf{axioms}, each of which is a pair $(a,b)$ of elements of the free algebra $W\sig[V]_i$ for some $i\in\sig_0$ and some finite set $V$.
Then for any $\sig$-algebra $A$, any axiom $a,b\in W\sig[V]_i$, and any function $g:V\to A$ (picking out some finite set of elements of $A$), we have an induced $\sig$-algebra map $\gbar:W\sig[V]\to A$.
We define a $(\sig,\axes)$-\textbf{algebra} to be a $\sig$-algebra $A$ such that $\gbar(a)=\gbar(b)$ for any $(a,b)\in\axes$ and $g:V\to A$.
For instance, associativity in a magma is represented by the axiom
\[ \left(
\vcenter{\xymatrix@-1.5pc{ & m \ar@{-}[dl] \ar@{-}[dr] \\ x && m \ar@{-}[dl] \ar@{-}[dr] \\ & y && z }},
\vcenter{\xymatrix@-1.5pc{ && m \ar@{-}[dl] \ar@{-}[dr] \\ & m \ar@{-}[dl] \ar@{-}[dr] && z \\ x && y }}
\right)
\in W\sig[\{x,y,z\}]
\]
The $(\sig,\axes)$-algebras in this case are exactly semigroups.
Now, given a set $X$, we define a signature $\sig^{\equivsym}$ with
\[\sig^{\equivsym}_0 = \setof{(i,x,y) | i\in \sig_0; x,y\in W\sig[X]_i}\]
and the following operations:
\begin{itemize}
\item For each $x\in W\sig[X]_i$, a nullary operation $() \to (i,x,x)$.
\item For each $x,y\in W\sig[X]_i$, a unary operation $((i,x,y)) \to (i,y,x)$.
\item For each $x,y,z\in W\sig[X]_i$, a binary operation $((i,x,y),(i,y,z)) \to (i,x,z)$.
\item For each operation $m:(d_{m,1},\dots,d_{m,\ay(m)}) \to c_m$ in $\sig$, and each collection of pairs of elements $x_k,y_k \in W\sig[X]_{d_{m,k}}$ for $1\le k\le \ay(m)$, an operation
\begin{multline*}
((d_{m,1},x_1,y_1),\dots,(d_{m,\ay(m)},x_{\ay(m)},y_{\ay(m)})) \\\too
(c_m, \act m(x_1,\dots,x_{\ay(m)}), \act m(y_1,\dots,y_{\ay(m)})).
\end{multline*}
\item For each axiom $a,b\in W\sig[V]_i$ in $\axes$ and each function $g:V\to W\sig[X]$ with unique extension $\gbar:W\sig[V]\to W\sig[X]$, a nullary operation
\[ () \to (i,\gbar(a),\gbar(b)). \]
\end{itemize}
Let $\equivsym_i$ be the binary relation on $W\sig[X]_i$ defined by $a\equiv_i b$ if the sort $(i,a,b)$ is nonempty in the initial $\sig^{\equivsym}$-algebra.
\begin{thm}\label{thm:tree-quotient}
Each $\equivsym_i$ is an equivalence relation and a congruence for the $\sig$-algebra structure, and the quotients $W\sig[X]_i/\equivsym_i$ form the free $(\sig,\axes)$-algebra.\qed
\end{thm}
As in \cref{sec:trees}, we will usually think of this theorem slightly differently: to define a family of maps $f_i:W\sig[X]_i/\equivsym_i\to A_i$, it suffices to define each $f_{c_m}(\act m(t_1,\dots,t_{\ay(m)}))$ assuming recursively that $f_{d_{m,1}}(t_1),\dots, f_{d_{m,\ay(m)}}(t_{\ay(m)})$ have been defined, and also to check that for any axiom $(a,b)\in W\sig[V]_i$ and $g:V\to W\sig[X]_i$ we have $f_i(\gbar(a))=f_i(\gbar(b))$.
\subsection*{Exercises}
\begin{ex}\label{ex:tree-quotient}
Prove \cref{thm:tree-quotient}.
\end{ex}
\begin{ex}\label{ex:infop-ac}
Why is the axiom of choice required to generalize \cref{thm:tree-quotient} to the case of infinitary operations?
\end{ex}
\section{Rules and deductive systems}
\label{sec:rules}
The basic machinery of type theory is an iteration and reformulation of the preceding sections in different language, simultaneously introducing convenient notations.
We consider a sequence of signatures $\sig^{(1)},\sig^{(2)},\dots,\sig^{(n)}$ for which the sorts of $\sig^{(k)}$ are defined in terms of the initial algebras $W\sig^{(j)}$ for the previous signatures $j<k$.
For instance, we might have $\sig^{(2)}_0 = W\sig^{(1)} \times W\sig^{(1)}$.
A particularly important special case is when $\sig^{(k)}$ is $(\sig^{(j)})^{\equivsym}$ for some $j<k$ and some set of axioms, as in \cref{sec:axioms}.
Each sort in one of the signatures $\sig^{(k)}$ is called a \textbf{judgment}.
We write $\cJ$ for a generic judgment, but we use more specific and congenial notation in particular cases, such as:
\begin{itemize}
\item When categories with object set \cO are regarded as an $(\cO\times\cO)$-sorted theory as mentioned in \cref{sec:indexed-trees}, the sort $(A,B)$ is usually written $A\types B$.
This signature (with an $\equivsym$ on top of it) corresponds to the cut-ful type theory for categories from \cref{sec:category-cutful}.
The cut-free type theory for categories has different operations but the same sorts, and uses the same notation.
\item If $\sig^{(1)}$ is a one-sorted signature regarded as describing the \emph{objects} of some categorical structure, then we denote its sort by ``$\mathsf{type}$''.
We generally then have $\sig^{(2)}_0 = W\sig^{(1)} \times W\sig^{(1)}$ (for a unary type theory), with sorts again written as $A\types B$, where now $A$ and $B$ are elements of the initial $\sig^{(1)}$-algebra rather than elements of a fixed set \cO.
\item The multicategorical and polycategorical theories of \cref{chap:simple,chap:polycats} use a similar notation $\Gamma\types\Delta$ for sorts $(\Gamma,\Delta)$ where one or both of $\Gamma$ and $\Delta$ is a list rather than a single item.
\item If $\sig^{(k)}=(\sig^{(j)})^{\equivsym}$, then its sort $(\cJ,x,y)$ is usually written $x\equiv y : \cJ$.
\end{itemize}
In general, each operation $m:(\cJ_1,\dots,\cJ_n) \to \cJ'$ in one of the signatures $\sig^{(k)}$ is called a \textbf{rule}, and written
\[ \inferrule*[right=$m$]{\cJ_1 \\ \cdots \\ \cJ_n}{\cJ'}. \]
The input judgments $\cJ_1,\dots,\cJ_n$ of a rule are called its \textbf{premises}, and the output judgment $\cJ'$ is called its \textbf{conclusion}.
Finally, each element of $W\sig^{(k)}$ is called a \textbf{derivation} (sometimes a derivation \emph{of} its root judgment) and written by placing rules on top of each other to form its tree structure.
For instance, if \cJ denotes the single sort of the signature for semigroups, then the associativity axiom of a monoid is
\[
\inferrule*[Right=$m$]{\inferrule*[Right=$x$]{\qquad}{\cJ} \\
\inferrule*[Right=$m$]{\inferrule*[Right=$y$]{\qquad}{\cJ}\\\inferrule*[Right=$z$]{\qquad}{\cJ}}{\cJ}}{\cJ}
\qquad \equiv \qquad
\inferrule*[Right=$m$]{\inferrule*[Right=$m$]{\inferrule*[Right=$x$]{\qquad}{\cJ}\\\inferrule*[Right=$y$]{\qquad}{\cJ}}{\cJ} \\
\inferrule*[Right=$z$]{\qquad}{\cJ}}{\cJ}
\]
Note the rules with empty premises, corresponding to nullary operations.
Similarly, for the cut-ful type theory for categories, associativity is the collection of axioms (one for each $A,B,C\in\cO$)
\begin{multline*}
\inferrule*[right=$\circ_{A,B,D}$]{\inferrule*[Right=$x$]{\qquad}{A\types B} \\
\inferrule*[Right=$\circ_{B,C,D}$]{\inferrule*[Right=$y$]{\qquad}{B\types C}\\\inferrule*[Right=$z$]{\qquad}{C\types D}}{B\types D}}{A\types D}
\\ \equiv \qquad
\inferrule*[right=$\circ_{A,C,D}$]{\inferrule*[right=$\circ_{A,B,C}$]{\inferrule*[Right=$x$]{\qquad}{A\types B}\\\inferrule*[Right=$y$]{\qquad}{B\types C}}{A\types C} \\
\inferrule*[Right=$z$]{\qquad}{C\types D}}{A\types D}
\end{multline*}
The whole sequence of signatures $\sig^{(1)},\sig^{(2)},\dots,\sig^{(n)}$ is called a \textbf{deductive system}.
Thus, for instance, the signature $\sig[X]$ for semigroups under a fixed set $X$, together with the axiom-signature for monoids under $X$ on top of it, form a single deductive system.
Some deductive systems (probably not all) deserve to be called \emph{type theories}; but we will not attempt to give any definition of this class except by the examples we consider (throughout the entire book).
\begin{rmk}
To be sure, not all type theories fit exactly into the picture presented here.
In particular, \emph{dependent} type theories break the clean ``stratification'' of a deductive system $\sig^{(1)},\sig^{(2)},\dots,\sig^{(n)}$, since in the judgment $\types A\type$ the type $A$ can now contain terms from the ``higher level'' judgment $\Gamma\types M:B$.
Thus the whole system must be defined by one big mutual induction (in type-theoretic lingo it is an ``inductive-inductive definition'').
The general idea is similar, however.
\end{rmk}
\section{Terms}
\label{sec:terms}
Since the judgments in each signature $\sig^{(k)}$ in a deductive system are defined in terms of the \emph{elements} of $W\sig^{(j)}$ for $j<k$, and the latter are rooted trees, the notation would rapidly get unwieldy if each $\cJ$ in a rule contained within it some number of derivation trees.
Thus, we generally represent derivations by \textbf{terms}, which are a more concise syntax containing enough information to reconstruct the derivation.
For instance, the expressions $x\cdot (y\cdot z)$ and $(x\cdot y)\cdot z$ for the two sides of associativity are terms, in which we have represented the rule $m$ by an infix operation ``$\cdot$''.
% TODO: This is actually describing a new inductive family with terms as an additional parameter, thereby defining the graph of the annotation function.
If $M$ is a term representing a derivation of the judgment \cJ, we generally write $M:\cJ$.
(A notable exception is that if \cJ is the sort of a one-sorted $\sig^{(1)}$ presenting the objects of a category, as mentioned above, we usually write ``$A\type$'' or ``$\types A\type$'' rather than ``$A:\mathsf{type}$''.)
We describe a syntax for terms by annotating the rules of a deductive system with terms, so that for instance the multiplication of a semigroup would be
\[ \inferrule*[Right=$m$]{M:\cJ \\ N:\cJ}{M\cdot N :\cJ} \]
Here $M$ and $N$ are ``metavariables'' standing for terms, indicating that whatever terms we have representing two derivations of \cJ, we represent their combination by $m$ by juxtaposing them with an infix dot.
(We always assume that parentheses are added as necessary to ensure correct grouping.)
For purposes of this discussion, ``terms with variables from the context'' such as $x:A\types M:B$ can be regarded as merely a variant notation of something like $x.M : (A\types B)$.
Thus we still have a single thing (namely $x.M$) that represents the entire derivation, even though we generally apply the word ``term'' only to part of this thing (namely $M$).
Similarly, an equality judgment like $x:A \types M\equiv N:B$ is shorthand for $(x.M)\equiv (x.N) : (A\types B)$.
There is one actual difference here in that we generally consider terms of this form modulo ``$\alpha$-equivalence'', i.e.\ the consistent renaming of variables.
For now, let us assume that we know what that means; in \cref{sec:alpha} we will explain it precisely.
% TODO: meaning terms vs. derivation terms vs. proof terms
There is no unique way to assign terms to a deductive system; all that is necessary is to describe some kind of syntax from which a derivation tree can be algorithmically extracted.
When a human mathematician reads an expression such as $x\cdot (y\cdot z)$, they generally mentally organize it as a tree without really thinking about it: here the first $\cdot$, being the ``outer'' operation, is the root, with children $x$ and $y\cdot z$, and the latter decomposes further into another $\cdot$ node with children $y$ and $z$.
This ``internal syntax tree'' has exactly the same shape as the intended derivation tree.
An alternative reading where the second $\cdot$ is the root with children ``$x\cdot (y$'' and ``$z)$'' is ruled out by our intuitive understanding of the meaning of parentheses.
When a computer reads such an expression it likewise constructs an internal tree representation, but the programmer has to explicitly instruct it how to do so; this is called \textbf{parsing}.
If we are given a putative term claiming to represent a derivation of some judgment, then after parsing there is a further step of verifying that the ``parse tree'' indeed corresponds to a valid derivation tree.
This is called \textbf{type-checking}.
Technically it could be done at the same time as parsing, but both human and electronic mathematicians generally separate them.
Thus the parse tree is a sort of ``raw abstract syntax'' that knows how operations are grouped but not whether the operations actually mean anything yet.
We will not say anything more about parsing; we trust the human reader to do it unconciously and the programmer to have good algorithms for it.
Thus, in our formal description of terms, the mathematical objects we call ``terms'' will be representations of parse trees.
And as trees, they will be elements of some other free algebra --- but a simpler one than the one whose derivations we are using them to represent.
For instance, for the cut-ful type theory of categories under \cG, whose judgments are of the form $A\types B$ for $A,B\in\cG_0$ (and in particular there are $\cG_0\times \cG_0$ of them), the terms will be elements of a \emph{one-sorted} free algebra with a nullary operation $\id_A$ and a binary operation $\circ_A$ for each $A\in\cG_0$.
Thus this free algebra contains many ``ill-typed'' terms such as $g\circ_B \id_A$ where $g\in\cG(C,D)$; the goal of type-checking is to discard these undesirables.
(For technical reasons, rather than a single set of terms as here, in the general case we will allow each judgment of our intended theory to be assigned a different set of ``potential terms''; see below.)
% TODO: Actually a judgment can have different "modes" --- we could be checking the type as for a canonical judgment, but we could also be synthesizing it as for an atomic one, and the requisite annotations are different.
Now in practice, the input to type-checking is usually a parsed term \emph{together with} a putative type for that term, and so the term notations only need to contain enough information to reconstruct the derivation tree when supplemented with the latter.
For instance, we have noted that the cut-ful type theory for categories technically involves a different composition operation $\circ_{A,B,C}$ for each triple of objects, so that terms would technically have to be written as $h\circ_{A,C,D} (g\circ_{A,B,C} f)$.
However, if we are given a term whose outer operation is a composition and that claims to represent a derivation of a judgment $A\types D$, then the composition must be $\circ_{A,?,D}$.
Thus in general it suffices to indicate the object being composed over, as in $h\circ_C (g\circ_B f)$.
\begin{rmk}
In many cases we can omit further information because it can be inferred from context; for instance, if we know that $h:C\to D$ then a term of the form ``$h\circ (-)$'' can only mean ``$h\circ_C (-)$''.
Human mathematicians omit information informally and unsystematically, and we have done the same throughout the book.
The implementors of electronic mathematicians have elaborate and precise algorithms for ``inferring from context'' enabling the omission of information, but most of these are far beyond our scope.
% \footnote{One that is worth mentioning, however, is that a canonical/atomic calculus like that of \cref{sec:atomcan-catprod} can be type-checked in a ``bidirectional'' way. canonical terms $M\can A$ are type-checked as usual with both $M$ and $A$ being given, while atomic terms $M\atom A$ instead ``type-synthesize'': only $M$ is given and its type $A$ is deduced.
% Paradoxically, this sometimes enables the omission of more information.
% For instance, if when type-checking a function application $f(a)\can B$ the term $f$ can synthesize its type $A\to B$ (note that it is atomic since application is an elimination rule for a negative type), then we can extract the type $A$ at which we have to type-check the argument $a\can A$; thus the notation for function application doesn't need to notate the type $A$.}
\end{rmk}
With type-checking (and also ``proof search'') in mind, type theorists tend to read the rules of a deductive system ``bottom-up''.
That is, instead of thinking of a rule
\begin{mathpar}
\inferrule*{\cJ_1 \\ \cJ_2}{\cJ}
\end{mathpar}
as meaning ``if we have $\cJ_1$ and $\cJ_2$ we can deduce $\cJ$'', they instead think ``if we want to deduce $\cJ$, it suffices to have $\cJ_1$ and $\cJ_2$''.
This is the direction that a type-checking algorithm applies the rule: given a parsed term $M$ and a putative judgment $\cJ$, the rule tells us how to break down the job of checking that $M:\cJ$ into simpler type-checking tasks that can be done recursively.\footnote{However, some more advanced theories are type-checked in a ``bidirectional'' way, with some judgments being read upwards in this way and others being read downwards as ``type synthesis'', where only the term is given and the type is inferred by the algorithm.}
It is also the direction that the rule is often applied when \emph{searching} for a derivation of \cJ, by the same sort of recursive procedure.
With all of this in mind, we make the following formal definition.
% TODO: This definition only works for all-input moded judgments.
\begin{defn}\label{defn:terms}
Let $\sig$ be a signature; a \textbf{term system} for $\sig$ is a $\sig$-algebra \dT, whose elements are called (potentially ill-typed) \textbf{terms}, such that
\begin{enumerate}
\item For any judgment $c\in\sig_0$ and term $t\in \dT_c$, there is at most one rule $m:(d_1,\dots,d_n)\to c$ and terms $s_j\in \dT_{d_j}$ such that $t = \act m(s_1,\dots,s_n)$.\label{item:terms-uniq}
\item If we define a relation $s\preceeds t$ on $\bigsqcup_i \dT_i$ to hold just when $t = \act m(s_1,\dots,s_n)$ and $s=s_j$ for some $j$, then $\preceeds$ is well-founded: there are no infinite chains $t_1 \succ t_2 \succ t_3 \succ \cdots$.\label{item:term-wf}
\end{enumerate}
\end{defn}
Since a term system \dT is a $\sig$-algebra, there is a unique $\sig$-algebra morphism $W\sig \to \dT$.
This is the function that assigns to each derivation a unique representing term.
Axiom~\ref{item:terms-uniq} above ensures that a derivation is determined by its term:
\begin{lem}\label{thm:term-system}
If \dT is a term system, then the unique $\sig$-algebra morphism $W\sig \to \dT$ is injective.
\end{lem}
\begin{proof}
Let $x,y\in W\sig_i$ have the same image in $\dT_i$.
By axiom~\ref{item:terms-uniq}, and the fact that $W\sig \to \dT$ is a $\sig$-algebra morphism, we must have $x=\act m(x_1,\dots,x_n)$ and $y=\act m(y_1,\dots,y_n)$ for the same operation $m$ and each pair $x_j,y_j$ having the same image in $\dT$.
By structural induction, therefore, each $x_j=y_j$, and thus $x=y$.
\end{proof}
However, the converse of \cref{thm:term-system} does not hold.
Indeed, its conclusion does not even imply axiom~\ref{item:terms-uniq} (which is all that was used in its proof): the former is only about ``globally'' well-typed terms, while the latter is a ``local'' condition that says something even about ill-typed terms.
The reason for the extra strength of~\ref{item:terms-uniq}, and for condition~\ref{item:term-wf}, is to make type-checking a ``deterministic terminating recursive algorithm'', as follows.
Given a term $t$, we check whether it is of the form $\act m(s_1,\dots,s_n)$.
If so, then by~\ref{item:terms-uniq} $m$ and the $s_j$'s are uniquely determined, and we can recursively consider each $s_j$.
If the answer is ever no, then the term $t$ is ill-typed.
Otherwise, axiom~\ref{item:term-wf} ensures that the algorithm must terminate (by bottoming out at nullary rules), and we have now constructed a derivation (the tree of rules $m$) represented by the term $t$.
In the main text, we generally stated our ``terms are derivations'' lemmas in the simple form of ``if a term judgment is derivable, then it has a unique derivation.''
As stated this is only the conclusion of \cref{thm:term-system}, but in all cases our proofs had the simple inductive form that actually establishes all of \cref{defn:terms}.
Of course, for this to actually be an algorithm in the computer science sense, the test for whether $t=\act m(s_1,\dots,s_n)$ would have to be ``computable''.
Making that precise is far beyond our current scope, but it may be worth mentioning that it generally holds because \dT is constructed using an initial algebra for some other signature, and initial algebras are very computable (they are ``abstract datatypes'').
Such a construction of \dT generally also ensures axiom~\ref{item:term-wf} immediately.
Notationally, we regard the common ``annotation of rules'' as specifying a signature along with a term syntax for it.
For instance, when we annotate the composition rule in the cut-ful type theory of categories
\[ \inferrule{A\types B \\ B\types C}{A\types C} \]
by terms to get
\[ \inferrule{\phi:(A\types B) \\ \psi:(B\types C)}{\psi\circ_B \phi : (A\types C)} \]
we mean that if $m$ is this rule, then the corresponding operation $\act m$ on $\dT$ is given by the operation $(-\circ_B -)$.
Technically, this requires us to specify in advance the set (or sets) \dT of terms, so that the annotated rules are describing which \emph{previously existing operations} on \dT we are using to represent each rule.
However, since in most cases \dT is a free algebra for a different signature with an operation corresponding directly to each rule in $\sig$ (though not in a one-to-one manner), we can generally omit this preliminary step and assume that \dT is freely generated as necessary by the operations named in the annotations.
% The most notable example is dependent type theories, which among other things have terms that do not exactly represent derivations because of the ``coercion'' rule:
% \begin{mathpar}
% \inferrule*{M:A \\ A\equiv B}{M:B}
% \end{mathpar}
\section{Variable binding and $\alpha$-equivalence}
\label{sec:alpha}
Finally, we come to the vexing question of $\alpha$-equivalence.
We could wave our hands at it by claiming to use de Bruijn variables everywhere, but this would be a bit dishonest.
As is evident, we actually do use named variables all over the place, so it behooves us to say something about what they mean.
In this section we will describe a general way to construct ``terms with binders'' such as $\match$ and $\lambda$ and define a notion of $\alpha$-equivalence.
There are many ways to do this; our approach follows~\cite{gp:asib,gp:aswvb,pg:freshml} (see also~\cite{crole:alpha}).
Let \dA be a fixed infinite set (usually countable), whose elements we call \textbf{variables}.
Let $\sig$ be a signature, one-sorted for simplicity, together with injective functions $v,b:\dA \to \sig_1$ such that $\ay(v(x))=0$ and $\ay(b(x))=1$ for all $x\in\dA$.
What we have in mind is that the initial $\sig$-algebra will supply the set of terms in a term syntax for some other signature, with the operations of $\sig$ corresponding to the term notations for the rules in that other signature.
The inclusion $v$ simply says that variables can occur in terms, while the operation $b(x)$ is intended to ``bind'' the variable $x$ in its argument; usually $b(x)(M)$ is written $x.M$.
When term notations bind variables, their corresponding operations will put a specially named $\sig$-operation together with one or more uses of $b$.
For instance, when describing the terms in the unary type theory for categories with coproducts, there will be operations $\acase AB$ of arity 3, which we combine with two uses of $b$ to represent the terms annotating $\plusE$:
\[\acase AB(M,u.P,v.Q) = \acase AB(M,b(u)(P),b(v)(Q)).\]
As usual, let $W\sig$ be the initial $\sig$-algebra; and let $\nAut(\dA)$ be the group of automorphisms (permutations) of the set \dA.
We write the action of $\sigma\in\nAut(\dA)$ on $x\in \dA$ by $x^\sigma$.
Now we define, by recursion, an action of $\nAut(\dA)$ on $W\sig$ as follows:
\begin{align*}
\sigma \cdot \act{v(x)} &= \act{v(x^\sigma)}\\
\sigma \cdot \act{b(x)}(M) &= \act{b(x^\sigma)}(\sigma\cdot M)
\end{align*}
with $\sigma\cdot M$ defined recursively in the latter.
In all other cases, $\sigma \cdot (-)$ simply recurses into all subtrees.
It is easy to show that this is a group action.
Because all operations in $\sig$ have finite arity\footnote{If $\sig$ were allowed to contain infinitary operations, then to make this work, the cardinality of \dA would have to be of cofinality greater than any of their arities.} and all trees in $W\sig$ are well-founded, only finitely many variables can occur in any such tree (either through $v$ or $b$).
So, since \dA is infinite, for any $M\in W\sig$ there is some variable $z\in \dA$ that does not occur in $M$.
We call such a $z$ \textbf{fresh} (relative to $M$) and write $z\notin M$.
We now define $\alpha$-equivalence $\equiv$ on $W\sig$, by defining a new signature $\sig^\equiv$ similar to how we did it in \cref{sec:axioms}.
We include operations making $\equiv$ a congruence for all operations of $\sig$ except $b$.
In the case of $v$, this means we have ``reflexivity at variables'' $v(x)\equiv v(x)$.
We also include one further operation that in rule form looks like this:
\begin{equation}
\inferrule{z\notin M \\ z\notin N \\ z\neq x \\ z \neq y \\ (zx)\cdot M \equiv (zy)\cdot N}
{b(x)(M) \equiv b(y)(N)}\label{eq:alpha-gen}
\end{equation}
Here $(zx)$ and $(zy)$ denote the transposition permutations that swap $z$ with $x$ and $z$ with $y$, respectively.
Since $z$ does not occur in $M$ and $N$, the permutation actions $(zx)\cdot M$ and $(zy)\cdot N$ amount to replacing all occurences of $x$ in $M$ and $y$ in $N$ (even bound ones) by $z$.
The rule then says that if these two results are $\alpha$-equivalent, then so are the terms $x.M$ and $y.N$ with new bound variables.
For instance, $x.x$ and $y.y$ are $\alpha$-equivalent because $(zx)\cdot x = z$ and $(zy)\cdot y = z$.
We also have $x.(x.x) \equiv x.(y.y)$ because $(zx)\cdot (x.x) = (z.z)$ and $(zy)\cdot (y.y) = (z.z)$ as well, so the inner bound $x$ really does ``shadow'' the outer one, making the latter invisible even though it has the same name.
But neither of these is equivalent to $x.(y.x)$, since $(zx)\cdot (y.x) = (y.z)$.
Note also that if $M\in W\sig$, then $x.M$ is $\alpha$-equivalent to $y.((yx)\cdot M)$ for any variable $y$ not occurring in $M$, since if neither $y$ nor $z$ occur in $M$ then
\[(zy)\cdot (yx)\cdot M = (yx)\cdot (zx)\cdot M = (zx)\cdot M.\]
Thus, we can always replace a bound variable by any another fresh variable.
Unlike in \cref{sec:axioms}, we do not explicitly include operations making $\equiv$ an equivalence relation.
However, we can nevertheless prove that it is; this is itself a sort of cut-admissibility.
\begin{lem}\label{thm:alpha-adm}
$\alpha$-equivalence $\equiv$, as defined above, has the following properties:
\begin{enumerate}
\item Equivariance: if $M\equiv N$, then $\sigma\cdot M \equiv \sigma \cdot N$ for any $\sigma\in \nAut(\dA)$.\label{item:alpha-eqvadm}
\item Congruence for binding: if $M\equiv N$, then $x.M\equiv x.N$.\label{item:alpha-bindadm}
\item Rule~\eqref{eq:alpha-gen} is invertible: if $x.M\equiv y.N$, then $(zx)\cdot M \equiv (zy)\cdot N$ for some fresh $z$.\label{item:alpha-gen-inv}
\item Reflexivity: $M\equiv M$ for any $M\in W\sig$.\label{item:alpha-refl}
\item Symmetry: if $M\equiv N$ then $N\equiv M$.\label{item:alpha-symm}
\item Transitivity: if $M\equiv N$ and $N\equiv P$, then $M\equiv P$.\label{item:alpha-trans}
\item Bound variables can be altered freely: if $z\notin M$ then $x.M \equiv z.((zx)\cdot M)$.\label{item:alpha-rename}
\end{enumerate}
\end{lem}
\begin{proof}
Perhaps surprisingly, the tricky and important one is~\ref{item:alpha-eqvadm}.
Of course, the proof is by induction on the derivation of $M\equiv N$, and all the congruence rules are immediate, so it remains to deal with~\eqref{eq:alpha-gen}.
That is, suppose $x.M\equiv y.N$ is obtained from $(zx)\cdot M \equiv (zy)\cdot N$, and let $\sigma\in\nAut(\dA)$.
Now $\sigma\cdot(x.M) = x^\sigma.(\sigma\cdot M)$ and similarly for $N$, so to conclude $\sigma\cdot(x.M) \equiv \sigma\cdot(y.N)$ it will suffice to show $(wx^\sigma)\cdot \sigma\cdot M \equiv (wy^\sigma)\cdot\sigma\cdot N$ for some fresh $w$.
The obvious choice for $w$ is $z^\sigma$.
Then if we let $\tau = (z^\sigma y^\sigma)\sigma(zx)\in\nAut(\dS)$, we have
\begin{align*}
(z^\sigma x^\sigma)\cdot \sigma\cdot M
&= \tau \cdot (zx)\cdot M\\
&\equiv \tau \cdot (zy)\cdot N\\
&= (z^\sigma y^\sigma)\cdot\sigma\cdot N
\end{align*}
using the inductive hypothesis of equivariance for $(zx)\cdot M \equiv (zy)\cdot N$.
Now~\ref{item:alpha-bindadm} is immediate, since $M\equiv N$ implies $(zx)\cdot M \equiv (zx)\cdot N$, whence~\eqref{eq:alpha-gen} gives $x.M\equiv x.N$.
And~\ref{item:alpha-gen-inv} is clear since~\eqref{eq:alpha-gen} is the only rule that can produce an $\alpha$-equivalence between terms of the form $x.M$ and $y.N$ (since we did not include~\ref{item:alpha-bindadm} or~\ref{item:alpha-refl}--\ref{item:alpha-trans} as primitive).
Combining the primitive congruence rules with~\ref{item:alpha-bindadm} yields straightforward inductive proofs of~\ref{item:alpha-refl} and~\ref{item:alpha-symm}.
For~\ref{item:alpha-trans} we induct on both $M\equiv N$ and $N\equiv P$.
By inspection of the form of $N$, they must both be derived from the same rule.
If it is a primitive congruence rule, we just apply the inductive hypothesis to all the inputs and then congruence again.
The interesting case is~\eqref{eq:alpha-gen}, where we have $(ux)\cdot M \equiv (uy)\cdot N$ and also $(vy)\cdot N\equiv (vz)\cdot P$ for potentially different variables $u$ and $v$, with $u$ fresh for $M,N,x,y$ and $v$ fresh for $N,P,y,z$.
Since \dA is infinite there exists a variable $w$ that is fresh for all of $M,N,P,x,y,z$, and so we have
\[ (wx)\cdot M = (wu)\cdot (ux)\cdot M \equiv (wu)\cdot (uy)\cdot N = (wy)\cdot N\]
using~\ref{item:alpha-eqvadm}.
Similarly, $(wy)\cdot N \equiv (wz)\cdot P$, so we ought to be able to conclude by the inductive hypothesis that $(wx)\cdot M\equiv (wz)\cdot P$ and so $x.M\equiv z.P$ by~\eqref{eq:alpha-gen}.
However, this is not the usual structural inductive hypothesis, since the derivations of and $(wx)\cdot M \equiv (wy)\cdot N $ and $(wy)\cdot N \equiv (wz)\cdot P$ are produced by~\ref{item:alpha-eqvadm} and are not subtrees of our given derivations of $x.M\equiv y.N$ and $y.N\cdot z.P$.
Instead we have to do something like assign a natural number ``height'' to all derivations, observe that~\ref{item:alpha-eqvadm} preserves the height of derivations, and then induct on height.
Finally, for~\ref{item:alpha-rename} we choose a fresh $w$ and observe that $(wx)\cdot M = (wz)\cdot (zx)\cdot M$.
Thus, by reflexivity~\ref{item:alpha-refl} we have $(wx)\cdot M \equiv (wz)\cdot (zx)\cdot M$ and hence by~\eqref{eq:alpha-gen} $x.M \equiv z.((zx)\cdot M)$.
\end{proof}
The quotient $W\sig/\equiv$ of this equivalence relation is, of course, our set of ``terms modulo $\alpha$-equivalence of bound variables''.
Since $\equiv$ is a congruence for all the operations of $\sig$, these all descend to the quotient, including (by \cref{thm:alpha-adm}\ref{item:alpha-bindadm}) variable binding; we also denote this operation by $x.M$ where now $M\in W\sig/\equiv$.
Our goal is to use this quotient as the term syntax for another signature.
In practice we will write terms as elements of $W\sig$ itself, but we regard them formally as representing their equivalence class.
We also usually want to restrict to some subsets of terms that have the right number of variables bound to represent the context.
For instance, in unary type theories (\cref{chap:unary}) we have said that a term judgment such as $x:A\types M:B$ can be read as $x:M: (A\types B)$.
We really do want this $x$ to be a \emph{bound} variable in the formal sense of this section, since derivations to determine unique terms we have to quotient by renaming the variables in the context as well.
That is, we represent ``free'' variables as variables that are bound ``on the outside''.
Thus, we should take our set \dT of terms to be the subset of $W\sig/\equiv$ consisting of terms having a variable binding outermost.
Similarly, in a simple type theory (\cref{chap:simple}) the terms for $\Gamma\types B$ should have $n$ variable bindings outermost, where $n$ is the length of $\Gamma$ (this is why in \cref{sec:terms} we allowed different judgments to have different sets of potential terms).
We then need to define operations on these sets \dT that represent the rules of our desired signature.
These will generally be constructed from basic operations in $\sig$ combined with one or more variable bindings.
Let us consider $\case$ from \cref{sec:catcoprod} as a paradigmatic example.
Since the rule $\plusE$ has three premises, what we have to give is an operation $\dT\times \dT\times \dT \to \dT$, where $\dT$ is the set of $\alpha$-equivalence classes of terms of the form $x.M$.
We have presumably included ``$\case$'' as a 3-ary operation in our term signature $\sig$, but this does not take account yet of the binding structure.
The inputs to our desired operation will be (given the above restriction defining $\dT$) of the form $x.M$, $u.P$, and $v.Q$.
The basic 3-ary operation in $\sig$ could give $\case(x.M,u.P,v.Q)$, but of course we want ``$x.\case(M,u.P,v.Q)$'' instead.
To define this, we first choose representatives for the equivalence classes of $x.M$, $u.P$, and $v.Q$.
By \cref{thm:alpha-adm}\ref{item:alpha-rename} we can do this so that $x$ does not appear in $u.P$ or $v.Q$ (which have only finitely many variables each).
Now we can write $x.\case(M,u.P,v.Q)$; but for this to define an operation $\dT\times \dT\times \dT \to \dT$ we have to check that it is independent of the chosen representatives.
For $u.P$ and $v.Q$ this is easy since $\equiv$ is a congruence for all operations of $\sig$, including $\case$.
And if $x.M\equiv y.N$, then by \cref{thm:alpha-adm}\ref{item:alpha-gen-inv} we have $(zx)\cdot M \equiv (zy)\cdot N$ for some $z$, which we may also take to not appear in $u.P$ or $v.Q$.
Thus, using the congruence rules and transitivity, we have
\begin{align*}
x. \case(M,u.P,v.Q)
&\equiv z.\case((zx)\cdot M,u.P,v.Q)\\
&\equiv z.\case((zy)\cdot N,u.P,v.Q)\\
&\equiv y.\case(N,u.P,v.Q).
\end{align*}
The same principle applies to all other term systems using variable binding.
Sometimes we also need to poke down into all the terms to ensure that certain variables in their context are disjoint or equal.
For instance, the term operation representing $\timesI$ takes as given $x.M$ and $y.N$, but its output has only one shared variable.
Thus we have to first note $x.M \equiv z.((zx)\cdot M)$ and $y.N \equiv z.((zy)\cdot N)$ for some $z$ that is fresh for both, and then write $z.\pair{(zx)\cdot M}{(zy)\cdot N}$ for the pairing term.
Based on these examples, we trust that the reader could formulate precise definitions of all the terms used in this book as operations on $\alpha$-equivalence classes.
Of course, in any particular case it is still (technically) necessary to prove that what we get \emph{is} a term system in the sense of \cref{defn:terms}.
Since \dT is a subset of an initial algebra, and our operations are built using at least one operation of that algebra, \ref{defn:terms}\ref{item:term-wf} is straightforward.
Finally, the proof of \ref{defn:terms}\ref{item:terms-uniq} essentially means checking that we chose the operations of the term signature to contain enough information to reconstruct a derivation step-by-step.
This is a formal version of what in the main text we called \emph{type-checking is possible} or \emph{terms are derivations}.
Note that since all our ``terms'' are actually $\alpha$-equivalence classes, we never have to prove anything about $\alpha$-equivalence.
% [TODO: Ought we to say something about substitution at the level of untyped terms?]
% Local Variables:
% TeX-master: "catlog"
% End: