Notes on Model Theory

This summer, I decided to read something about Model Theory. Aware of my limited time, I decided to aim for a slim volume as a guide, so went for An Invitation to Model Theory by Jonathan Kirby.

Here I collect some notes I made while reading Kirby, completely biased to my own interests. Definitions and theorems largely follow the book, but have often been tweaked by me to make my notes more self-contained. I’ve also tried to ‘sloganise’ each of the originally-unnamed theorems in a way that helps me grasp their essence. Additional commentary around the definitions and theorems is mine, and I’ve added a couple of extra definitions to make things easier for me to read back in the future. Of course there is some risk that I’ve introduced errors or lack of clarity, in which case I’m sure the fault lies with me not with the book – please do reach out if you spot anything.

What this post will discuss

We will look at defining basic syntactical objects, terms and formulas, and their interpretations (all in first-order logic). We will look at ideas relating to whether mathematical structures can be distinguished via the truth of formal sentences, including embedding of one structure inside another. We will look at results guaranteeing the existence of structures (especially infinite structures) satisfying sets of such sentences. Turning this around, we’ll also look at some results on which structures can be axiomatised by such sets of sentences. We will consider the extent to which one can define a subset using first-order sentences and the relationship to quantifier elimination (see also my post including algorithms for quantifier elimination). Finally, we will look at the idea of ‘how uniquely’ mathematical structures can be described in this way.

What this post will not discuss

I have purposely not given proofs or examples. Kirby’s book of course has (most) proofs and is rich in examples, through which much of the material is developed. This means I have not discussed some really interesting results on, say, algebraically closed fields, on which Kirby spends considerable time. Those interested in how to apply some of these ideas would do well to consult the book. In addition, I have avoided the topic of types in this simple set of notes, which Kirby highlights as a “central concept in model theory”, despite the interesting results they enable later in the book, e.g. saturated models. The exception is that I included a simplified version of the Ryll-Nardzewski theorem at the end of this post (the proof given uses types) because this particular part of the theorem may be of use to me in the future and can be stated without recourse to types.

I hope to do a later blog post with some of the material using types in due course.


Languages and Structures

Definition (Language). A language L consists of a set of relation symbols, a set of function symbols, a set of constant symbols and, for each relation and function symbol, a positive number known as its arity. The set of all these symbols is denoted \text{Symb}(L).

Definition (Lstructure). An L-structure \mathcal{A} consists of a set A called its domain, together with interpretations of the symbols in L: for each relation symbol R of arity n, a subset R^\mathcal{A} of A^n; for each function symbol f of arity n, a function f^\mathcal{A} : A^n \to A; for each constant symbol c, a value c^\mathcal{A} \in A.

Definition (Embedding). Let \mathcal{A} and \mathcal{B} be L-structures. An embedding of \mathcal{A} into \mathcal{B} is an injection \pi: A \to B s.t.: for all relation symbols R of L and all a_1,\ldots,a_n \in A, (a_1,\ldots,a_n) \in R^\mathcal{A} iff \pi(a_1),\ldots,\pi(a_n) \in R^\mathcal{B}; for all function symbols f of L and all a_1,\ldots,a_n \in A, \pi(f^\mathcal{A}(a_1,\ldots,a_n)) = f^\mathcal{B}(\pi(a_1),\ldots,\pi(a_n)); for all constant symbols c of L, \pi(c^\mathcal{A}) = c^\mathcal{B}.

Definition (Isomorphism). An embedding \pi: \mathcal{A} \to \mathcal{B} is an isomorphism iff there is an embedding \sigma: \mathcal{B} \to \mathcal{A} s.t. \pi \circ \sigma is the identity on \mathcal{B} and \sigma \circ \pi is the identity on \mathcal{A}.

Definition (Automorphism). An automorphism is an isomorphism from a structure to itself.

Terms, Formulas and their Models

Definition (Lterm). The set of terms of the language L is defined recursively as follows: every variable is a term; every constant symbol of L is a term; if f is a function symbol of L of arity n and t_1, \ldots, t_n are terms of L, then f(t_1, \ldots,  t_n) is a term; only something built from the preceding clauses in finitely many steps is a term.

Definition (Lformula). The set of formulas of L is defined recursively as follows: if t_1 and t_2 are terms, then (t_1 = t_2) is a formula; if t_1, \ldots t_n are terms and R is a relation symbol of L of arity n, then R(t_1, \ldots t_n) is a formula; if \varphi and \psi are formulas, then \neg \varphi and \varphi \wedge \psi are formulas; if \varphi is a formula and x is a variable, then \exists x[\varphi] is a formula; only something built from the preceding four clauses in finitely many steps is a formula. We will write \varphi(x) for a formula with free variables x = (x_1, \ldots x_n) in some defined order.

Definition (Lsentence). An L-sentence is a L-formula with no free variables.

Definition (Atomic formula). Formulas of the form t_1 = t_2 for some L-terms t_1, t_2 or of the form R(t_1,\ldots,t_n) for some relation symbol R and terms t_1,\ldots,t_n are atomic formulas.

Definition (Interpretation of terms). Let \mathcal{A} be an L-structure, t be a term of L, and x = (x_1, \ldots, x_r) be a list of variables including all those appearing in t. The interpretation of t, written t^\mathcal{A} is a function from A^r to A defined by: if t is a constant symbol c then t^\mathcal{A} : x \mapsto c^\mathcal{A}; if t is x_i then t^\mathcal{A} : x \mapsto x_i; if t is f(t_1, \ldots, t_n) then t^\mathcal{A} : x \mapsto f^\mathcal{A}(t_1^\mathcal{A}(x), \ldots, t_n^\mathcal{A}(x)).

Definition (Interpretation of formulas). We define \mathcal{A} \vDash \phi(a) (read \vDash as ‘models’) recursively as: \mathcal{A} \vDash t_1(a) = t_2(a) iff t_1^\mathcal{A}(a) = t_2^\mathcal{A}(a); \mathcal{A} \vDash R(t_1(a), \ldots, t_n(a)) iff (t_1^\mathcal{A}(a), \ldots, t_n^\mathcal{A}(a)) \in R^\mathcal{A}; \mathcal{A} \vDash \neg \varphi(a) iff \mathcal{A} \nvDash \varphi(a); \mathcal{A} \vDash \varphi(a) \wedge \psi(a) iff \mathcal{A} \vDash \varphi(a) and \mathcal{A} \vDash \psi(a); \mathcal{A} \vDash \exists x[\varphi(x,a)] iff there is some b \in A s.t. \mathcal{A} \vDash \varphi(b,a).

Definition (Lsentence). An L-sentence is a L-formula with no free variables.

Definition (Models of sentences). For a sentence \varphi, if \mathcal{A} \vDash \phi we say \mathcal{A} is a model of \varphi. For a set of sentences \Sigma, we say that \mathcal{A} is a model of \Sigma (\mathcal{A} \vDash \Sigma) iff it is a model of all sentences in \Sigma.

Definition (Entailment). Let \Sigma be a set of L-sentences and \varphi be an L-sentence. \Sigma entails \varphi, written \Sigma \vdash \varphi if every model of \Sigma is also a model of \varphi.

Note that I am using the notation for entailment used by Kirby’s book, for consistency, but computer scientist readers of my blog will want to note that this symbol is commonly also used for (the more restricted, assuming sound) ‘proves’ relation for a formal deductive system.

Definition (Deductively closed). A set \Sigma of L-sentences is deductively closed iff, for every L-sentence \sigma, if \Sigma \vdash \sigma then \sigma \in \Sigma.

(Again note no formal deduction in play here.)

Definition (Satisfiable). A set of sentences \Sigma is satisfiable iff there exists some L-structure \mathcal{A} such that \mathcal{A} \vDash \Sigma.

Definition (Complete). A set \Sigma of L-sentences is complete if, for every L-sentence \varphi, either \Sigma \vdash \varphi or \Sigma \vdash \neg \varphi.

Definition (Theory). A theory is a satisfiable and deductively closed set of L-sentences.

Definition (Theory of a structure). The theory of an L-structure \mathcal{A}, denoted \text{Th}(\mathcal{A}) is the set of all L-sentences which are true in \mathcal{A}.

Elementary Equivalence

It’s clearly interesting to know whether structures can be distinguished by sentences:

Definition (Elementary equivalence). Two L-structures \mathcal{A} and \mathcal{B} are elementarily equivalent, denoted A \equiv B iff for every L-sentence \varphi, \mathcal{A} \vDash \varphi iff \mathcal{B} \vDash \varphi.

Compactness

Theorem (Compactness). Let \Sigma be a set of L-sentences. If every finite subset of \Sigma has a model then \Sigma has a model.

A huge amount of excitement flows from the compactness theorem, including many of the results that follow.

Proposition (Elementarily-equivalent non-isomorphic structures). Let \mathcal{A} be any infinite L-structure. Then there is another L-structure which is elementarily equivalent to \mathcal{A} but not isomorphic to it.

And so, via this route, we get wonderful non-standard models. As we will see in a moment, via the Löwenheim-Skolem theorems, there are actually many such structures.

Axiomatisable Classes

Definition (Axiomatisable class). If \Sigma is a set of L-sentences then denote by \text{Mod}(\Sigma) the class of all L-structures which are models of \Sigma. The class \text{Mod}(\Sigma) is axiomatised by \Sigma and is and axiomatisable class. The class is finitely axiomatised if it is axiomatised by a finite set of sentences.

By compactness, we can’t capture arbitrarily sized, but finite, models (e.g. the class of all finite groups):

Proposition (Large models for axiomatisable classes ensures infinite models). Let C be an axiomatisable class with arbitrarily large finite models (i.e. for every n \in \mathbb{N} there is a model \mathcal{A}_n \in C whose domain is finite and of size at least n). Then C also contains an infinite model.

If we can finitely axiomatise a class, then if we know we can axiomatise it by some (possibly infinite set of) sentences, then it suffices to choose an appropriate finite subset of those sentences:

Proposition (Finite subsets of sentences axiomatise). If C = \text{Mod}(\Sigma) and C is finitely axiomatisable, then it is axiomatised by a finite subset of \Sigma.

Cardinality of Languages

Definition (Language cardinality). We will write |L| for the cardinality of language L, which we define to be the cardinality of the set of all L-formulas containing only countably many variables.

Unsurprisingly, given this definition, the language cardinality is countable if the set of symbols is countable, otherwise it’s the same cardinality as the set of symbols:

Proposition (Language cardinality). |L| = \max( \aleph_0, |\text{Symb}(L)| ).

Substructures/Extensions and Elementary Substructures/Extensions

Definition (Substructure). Let \mathcal{A} and \mathcal{B} be L-structures, and suppose A \subseteq B. \mathcal{A} is a substructure of \mathcal{B} iff the inclusion of \mathcal{A} into \mathcal{B} is an embedding.

An example of a substructure would be the naturals as a substructure of the integers, with the language consisting of the < relation and 0; clearly the inclusion map is an embedding.

Lemma (Substructures preserve quantifier-free formulas). If \mathcal{A} \subseteq B, \varphi(x) is a quantifier-free formula and a \in A, then \mathcal{A} \vDash \varphi(a) iff \mathcal{B} \vDash \varphi(a).

Taking the same example, we can see why a stronger notion than substructure is required in order to preserve formulas with quantifiers. For example \exists x[ x < 0 ] clearly has a model in the integers but not in the naturals.

Definition (Elementary substructure/extension). \mathcal{A} is an elementary substructure of \mathcal{B} (and \mathcal{B} is an elementary extension of \mathcal{A}, written \mathcal{A} \preccurlyeq \mathcal{B}, iff A \subseteq B and, for each formula \varphi(x) and each a \in \mathcal{A}^n, \mathcal{A} \vDash \varphi(a) iff \mathcal{B} \vDash \varphi(a).

The Tarski-Vaught test is a test for whether a substructure is elementary. The essence of the test is to check whether we can always find an element in the potential substructure that could replace the equivalent element in the superstructure in a formula containing just one variable ranging over the superstructure domain:

Lemma (Tarski-Vaught test). Let \mathcal{A} \subseteq \mathcal{B} be an L-substructure and for every L-formula \varphi(x,y) and every a \in A^n and b \in B such that \mathcal{B} \vDash \varphi(a,b), there is d \in A such that \mathcal{B} \vDash \varphi(a,d). Then \mathcal{A} \preccurlyeq \mathcal{B}.

The Downward Löwenheim-Skolem theorem guarantees the existence of ‘small’ elementary substructures built from the base of an arbitrary subset of the domain of of the elementary extension:

Theorem (Downward Löwenheim-Skolem theorem). Let \mathcal{B} be an L-structure and S \subseteq B. Then there exists an elementary substructure A \preccurlyeq B such that S \subseteq A and |A| \leq \max(|S|,|L|).

While the Upward Löwenheim-Skolem theorem gives us a tower of models elementarily extending any infinite model:

Theorem (Upward Löwenheim-Skolem theorem). For any infinite L-structure \mathcal{A} and any cardinal \kappa \geq \max(|L|,|A|), there exists an L-structure \mathcal{B} of cardinality equal to \kappa such that \mathcal{A} \preccurlyeq \mathcal{B}.

Definable Sets

Definition (Definable set). In an L-structure \mathcal{A}, a subset S of A^n is said to be a definable set if there is an L-formula \varphi(x_1,\ldots,x_n) s.t. S = \{ a \in \mathcal{A}^n \; | \; \mathcal{A} \vDash \varphi(a) \}.

One way to show that a set is undefinable is to find an automorphism not preserving the set:

Theorem (Preservation of definability). If S \subseteq A^n is a definable set and \pi is an automorphism of \mathcal{A} then \pi(S) = S.

Notation (\text{Def}). We will write \text{Def}_n(\mathcal{A}) for the set of all definable subsets of A^n.

Quantifier Elimination

Quantifier elimination is a useful way to get a grip on which sets are definable.

Definition (Quantifier elimination for structures). An L-structure \mathcal{A} has quantifier elimination iff for every n \in \mathbb{N}^n, every definable subset of A^n is defined by a quantifier-free L-formula.

Definition (Quantifier elimination for theories). An L-theory T has quantifier elimination iff for every n \in \mathbb{N}^n, for every L-formula \varphi(x_1, \ldots, x_n), there is a quantifier-free L-formula \theta(x_1, \ldots, x_n) such that T \vdash \forall x [\varphi(x) \leftrightarrow \theta(x)].

These semantic and syntactic views line up:

Lemma (Structures and their theories are equivalent w.r.t. quantifier elimination). Let \mathcal{A} be an L-structure. Then \text{Th}(\mathcal{A}) has quantifier elimination iff \mathcal{A} has quantifier elimination.

Substructure Completeness

One way to show that a theory has quantifier elimination is via substructure completeness, a weaker notion than completeness.

Notation (\mathcal{A}^+). Given an L-structure \mathcal{A}, define a new language L_A by adding a new distinct constant symbol c_a for every a \in A. \mathcal{A}^+ is the L_A-structure obtained by interpreting each constant symbol c_a as a, with all other interpretations remaining as in \mathcal{A}.

Definition (Diagram). The diagram \text{Diag}(\mathcal{A}) of a L-structure \mathcal{A} is the set of all atomic L_A-sentences and negations of atomic L_A-sentences which are true in \mathcal{A}^+.

Definition (Substructure complete). A L-theory is substructure complete if, whenever \mathcal{A} is a substructure of a model of T, the theory T \cup \text{Diag}(\mathcal{A}) is a complete L_A-theory.

Proposition (Substructure completeness is equivalent to quantifier elimination). Let T be a L-theory. Then T is substructure complete iff T has quantifier elimination.

Principal Formulas

Notation. Given an L-structure \mathcal{A} and an L-formula \varphi with n free variables, we will write \varphi(\mathcal{A}) for \{ a \in A^n | \mathcal{A} \vDash \varphi(a) \}.

Definition (Principal formula). Let T be a complete L-theory. A principal formula w.r.t. T is a formula \psi(x) such that: (i) T \vdash \exists x [\psi(x)], (ii) for every L-formula \varphi(x) with the same tuple of free variables, either T \vdash \forall x[ \psi(x) \to \varphi(x) ] or T \vdash \forall x[ \psi(x) \to \neg \varphi(x)].

Another way to get a good handle on the definable sets of a L-structure is via their principal formulas, especially if there are only finitely many.

Proposition (Definable sets from principal formulas). Let \psi_1(x), \ldots, \psi_m(x) be principal formulas in n variables for the (complete) theory \text{Th}(\mathcal{A}), defining distinct subsets of A^n. Also let these formulas cover A^n, i.e. \cup_{i=1}^m{\psi_i(\mathcal{A})} = A^n. Then every definable set in \text{Def}_n(\mathcal{A}) is a union of some of the sets \psi_i(\mathcal{A}), and |\text{Def}_n(\mathcal{A})| = 2^m.

Categoricity

Bearing in mind the upward and downward Löwenheim-Skolem theorems, what’s the appropriate definition of a model that’s ‘effectively defined’ by a theory?

Definition (Categorical). A theory T is \kappa-categorical iff there is a model of T of cardinality \kappa and any two models of cardinality \kappa are isomorphic.

It turns out that categoricity can give us completeness:

Lemma (Łos-Vaught test). If an L-theory T is \kappa-categorical for some \kappa \geq |L| and T has no finite models, then T is complete.

How do we know whether a theory is categorical? We have a a special case for countably categorical (i.e. \aleph_0-categorical) theories. It turns out that countably categorical complete theories are exactly those whose models have only finitely many definable subsets for each fixed dimension:

Theorem (Ryll-Nardzewski theorem). Let T be a complete L-theory with an infinite model \mathcal{A} and with L countable. Then T is countably categorical iff for all n \in \mathbb{N}^+, \text{Def}_n(\mathcal{A}) is finite.

(Note to reader: the version of this theorem presented in Kirby instead links countable categoricity to the finiteness of Lindenbaum algebras, which I have avoided discussing in this blog post. But Lindenbaum algebras are isomorphic to algebras of definable sets for complete theories (see Kirby Lemma 20.3), so I have stated as above here. I have also eliminated elements of the full theorem presented, to make the presentation here self-contained.)


So how about the book? Would I recommend it to others?

From my perspective as an “informed outsider” to model theory, Kirby’s book was accessible. It’s brief – as the title says, it’s an invitation to model theory – and I think it does that job well. If you’re comfortable (or indeed enjoy) seeing material presented in a variety of ways, sometimes through example that generalises later in the book, sometimes through application, sometimes in its general form first, then you will probably enjoy the presentation – as I did. I felt the book falls a little short of a self-study guide, though, primarily because it doesn’t include worked answers to any of the exercises; this would be a welcome addition. This point is particularly relevant to Chapter 16, which (unlike other chapters in the book) takes the form of a ‘guided exercise’ to applying the ideas of the preceding chapters to the study of a particular structure.

Overall, I would definitely recommend to other outsiders wanting to catch a glimpse of what Model Theory is all about.