Book Review: Category Theory for the Sciences

I have been reading Spivak’s book ‘Category Theory for the Sciences’ on-and-off for the last 18 months during vacations as some of my ‘holiday reading.’ I’ve wanted to learn something about category theory for a while, mainly because I feel it’s something that looks fun and different enough to my day-to-day activity not to feel like work. I found Spivak’s book on one of my rather-too-regular bookshop trips a couple of years ago, and since it seemed like a more applied introduction, I thought I’d give it a try. This blog post combines my views on the strengths of this book from the perspective of a mathematically-minded engineer, together with a set of notes on the topic which might be useful for me – and maybe other readers – to refer back to in the future.

[Photo: Category Theory by the Pool, Summer 2019]

The book has a well-defined structure. The first few chapters cover topics that should be familiar to applied mathematicians and engineers: sets, monoids, graphs, orders, etc., but often presented in an unusual way. This sets us up for the middle section of the book which presents basic category theory, showing how the structures previously discussed can be viewed as special cases within a more general framework. The final chapters of the book discuss some topics like monads that can then be explored using the machinery of this framework.

Set

Chapter 3 covers the category \mathbf{Set} of sets.

One of the definitions here that’s less common in engineering is that of coproduct. The coproduct of sets X and Y, denoted X \sqcup Y, is defined as the disjoint union of X and Y. I first came across disjoint union as a concept when doing some earlier holiday reading on topology. We also here get the first experience of the author’s use of informal ‘slogans’ to capture the ideas present in more formal definitions and theorems: “any time behavior is determined by cases, there is a coproduct involved.” I really like this approach – it’s the kind of marginalia I usually write in books myself, but pre-digested by the author!
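To make the slogan concrete for myself, here is a small sketch in Haskell (my own illustration, not from the book): the coproduct of two types is Either, and “behaviour determined by cases” is exactly what the either eliminator expresses.

-- Either x y plays the role of the coproduct X ⊔ Y; `either` performs the case analysis.
describe :: Either Int String -> String
describe = either
  (\n -> "a number: " ++ show n)   -- behaviour for the X (Left) case
  (\s -> "a string: " ++ s)        -- behaviour for the Y (Right) case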

The notation \text{Hom}_{\mathbf{Set}}(X,Y) is used for the set of functions from X to Y, which we are encouraged to refer to as morphisms, all without really explaining why at this stage – it of course all becomes clear later in the book.

To give a flavour of the kind of exercises provided, we are asked to prove an isomorphism \text{Hom}_{\text{\bf Set}}(X \sqcup Y, A) \xrightarrow{\cong} \text{Hom}_{\text{\bf Set}}(X, A) \times \text{Hom}_{\text{\bf Set}}(Y, A). This reminds me of digital circuit construction: the former corresponds to a time-multiplexed circuit capable of producing outputs from A with inputs from either X or Y, together with an identifier for which (to feed a multiplexer select line); the latter corresponds to two circuits producing both the outputs rather than just the selected one.
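The two directions of this isomorphism are easy to write down in Haskell (again my own hedged sketch, with names of my choosing):

-- Hom(X ⊔ Y, A)  →  Hom(X, A) × Hom(Y, A)
split :: (Either x y -> a) -> (x -> a, y -> a)
split f = (f . Left, f . Right)

-- Hom(X, A) × Hom(Y, A)  →  Hom(X ⊔ Y, A)
merge :: (x -> a, y -> a) -> (Either x y -> a)
merge (g, h) = either g h

split and merge are mutually inverse, which is the content of the isomorphism: a single time-multiplexed circuit with a select line carries exactly the same information as the pair of circuits.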

The fibre product, X \times_Z Y, given functions f : X \to Z and g : Y \to Z is defined as the set X \times_Z Y = \{ (x, z, y) | f(x) = z = g(y)\}. Any set isomorphic to this fibre product is called a pullback of the diagram X \xrightarrow{f} Z \xleftarrow{g} Y.

The fibre sum, X \sqcup_W Y, given functions f : W \to X and g : W \to Y is the quotient of X \sqcup W \sqcup Y by the equivalence relation \sim generated by w \sim f(w) and w \sim g(w) for all w \in W. A pushout of X and Y over W is any set Z which is isomorphic to X \sqcup_W Y.

A good intuitive description in the text is to think of the fibre sum of X and Y as gluing the two sets together, with W as the glue. A nice example from one of the exercises is W = \mathbb{N}, X = \mathbb{Z}, Y a singleton set, f : W \to X defined by w \mapsto -(w+1), and g : W \to Y the unique map to the singleton. Then we’re left with X \sqcup_W Y \cong \{ -, 0, 1, \ldots \}, where - is an element corresponding to the collapsing of all negative elements in X to a single point.
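A hedged Haskell sketch of this particular pushout (the type and names are mine): every element of W is glued both to a negative integer and to the single point of Y, so all the negatives collapse to one point.

data Glued = Collapsed       -- the single point the negatives are glued to
           | NonNeg Integer  -- 0, 1, 2, ...
  deriving (Eq, Show)

-- The canonical map from X = ℤ into the pushout X ⊔_W Y.
toGlued :: Integer -> Glued
toGlued n
  | n < 0     = Collapsed
  | otherwise = NonNeg n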

The familiar notions of injective and surjective functions are recast as monomorphisms and epimorphisms, respectively. A function f : X \to Y is a monomorphism if for all sets A and pairs of functions g, g' : A \to X, if fg = fg' then g = g'. A function f : X \to Y is an epimorphism if for all sets B and pairs of functions h, h' : Y \to B, if hf = h'f then h = h'. By this time in the book, we get the hint that we’re learning this new language as it will generalise later beyond functions from sets to sets – the chapter ends with various slightly more structured concepts, like set-indexed functions, where the idea is always to define the object and the type of morphism that might make sense for that object.

Monoids, Groups, Graphs, Orders & Databases

In Chapter 4, the reader is introduced to several other structures. With the exception of the databases part, this was familiar material to me, but with slightly unfamiliar presentation.

We learn the classical definition of a monoid and group, and also discuss the free monoid generated by a set X as being the monoid consisting of the set \mbox{List}(X) of lists of elements of X, with the empty list as the identity and list concatenation as the operation. We discuss the idea of a monoid action on a set S, i.e. a function \circlearrowleft: M \times S \to S associated with the monoid M having the properties e \circlearrowleft s = s for any s \in S, where e is the identity, and m \circlearrowleft (n \circlearrowleft s) = (m \star n) \circlearrowleft s, where m and n are any elements of M, s is any element of S, and \star is the monoid operation. The discussion of monoids is nice and detailed, and I enjoyed the exercises, especially those leading up to one of the author’s slogans: “A finite state machine is an action of a free monoid on a finite set.”
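The slogan can be seen in a few lines of Haskell (my own sketch, not the book’s): a finite state machine is a transition function, and running it over an input word is an action of the free monoid List(X) on the state set.

data State = S0 | S1 deriving (Eq, Show)   -- a hypothetical two-state machine

step :: Bool -> State -> State             -- one input symbol acting on a state
step True  S0 = S1
step True  S1 = S0
step False s  = s

-- The action of the free monoid [Bool] on states: the empty word acts as the
-- identity, and the action of a concatenation is the composite of the actions.
act :: [Bool] -> State -> State
act word s = foldr step s word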

Graphs are dealt with in a slightly unusual way (for me) as quadruples G := (V, A, \mbox{src}, \mbox{tgt}) of a set of vertices V, a set of arrows A, and functions \mbox{src}, \mbox{tgt} mapping each arrow to its source and target vertex, respectively. This then corresponds to what I would call a multigraph, as parallel edges can exist, though it is called a graph in the book.

For each of the structures discussed, we finish by looking at their morphisms.

As an example, let {\mathcal M} := (M, e, \star) and {\mathcal M'} := (M', e', \star') be monoids (the tuple is of the set, identity element, and operation). Then a monoid homomorphism f: {\mathcal M} \to {\mathcal M'} is a function f: M \to M' preserving identity and operation, i.e. (i) f(e) = e' and (ii) f(m_1 \star m_2) = f(m_1) \star' f(m_2) for all m_1, m_2 \in M.
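A concrete instance I like (my own example, in Haskell): length is a monoid homomorphism from the free monoid of lists under concatenation to the natural numbers under addition.

-- f = length sends the identity [] to 0 and concatenation to addition:
--   length [] == 0   and   length (xs ++ ys) == length xs + length ys
lengthIsHom :: [Int] -> [Int] -> Bool
lengthIsHom xs ys = length (xs ++ ys) == length xs + length ys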

Similarly, for graphs, graph homomorphisms f: G \to G' for G := (V, A, \mbox{src}, \mbox{tgt}) and G' := (V', A', \mbox{src}', \mbox{tgt}') are defined as pairs of functions f_0: V \to V' and f_1: A \to A' preserving the endpoints of arrows, i.e. such that \mbox{src}' \circ f_1 = f_0 \circ \mbox{src} and \mbox{tgt}' \circ f_1 = f_0 \circ \mbox{tgt}.

Order morphisms are similarly defined, this time as those preserving order.

Thus we are introduced to several different (somewhat) familiar mathematical structures, and in each case we’re learning about how to describe morphisms on these structures that somehow preserve the structure.

Categories, Functors, and Natural Transformations

Things really heat up in Chapter 5, where we actually begin to cover the meat of category theory. This is exciting and kept me hooked by the pool. The unusual presentation of certain aspects in previous chapters (e.g. the graphs) all made sense during this chapter.

A category \mathcal{C} is defined as a collection of various things: a set of objects \text{Ob}(\mathcal{C}), a set of morphisms \text{Hom}_{\mathcal{C}}(x,y) for every pair of objects x and y, a specified identity morphism \text{id}_x for every object x, and compositions \circ: \text{Hom}_{\mathcal{C}}(y,z) \times \text{Hom}_{\mathcal{C}}(x,y) \to \text{Hom}_{\mathcal{C}}(x,z). To qualify as a category, such things must satisfy identity laws and also associativity of composition.
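This bundle of data lines up closely with the Category type class in Haskell (Control.Category in the base library packages the same identity and composition data); the identity and associativity laws are left as side conditions for the programmer, just as they are axioms in the definition above. A rough sketch:

import Prelude hiding (id, (.))

class Category cat where
  id  :: cat a a                       -- the identity morphism id_x
  (.) :: cat b c -> cat a b -> cat a c -- composition Hom(y,z) × Hom(x,y) → Hom(x,z)

-- Ordinary functions form a category (an instance like this also ships with base).
instance Category (->) where
  id x  = x
  g . f = \x -> g (f x)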

Examples of the category of Monoids, of Groups, of Sets, of Graphs, of Pre-Orders, etc., are explored. Isomorphisms between objects in a category are carefully defined.

Functors are introduced as transformations from one category to another, via a function transforming objects and a function transforming morphisms, such that identities and composition are preserved. Functors are then used to rigorously explore the connections between preorders, graphs, and categories, and I finally understood why the author had focused so much on preorders in the previous chapters!

Things started to take a surprising and exciting turn for me when the author shows, via the use of functors, that a monoid is a category with one object (the set of the monoid becomes the morphisms in the category), and that groups can therefore be understood as categories with one object for which each morphism is an isomorphism. This leads to a playful discussion, out of which pop groupoids, amongst other things. Similar gymnastics lead to preorders reconstituted as categories in which hom-sets have at most one element, and graphs reconstituted as functors from a certain two-object category to \mathbf{Set}. Propositional logic is re-examined from a categorical perspective, though I suspect this is just scratching the surface of the interplay between category theory and logic, and I’d love to read more about this. For me, this was a highlight – that by playing with these definitions and trying things out, all these wonderful familiar structures arise.

The final part of this chapter deals with natural transformations. Just as functors transform categories to categories, natural transformations transform functors to functors. Specifically, a natural transformation between functors F : {\mathcal C} \to {\mathcal D} and G : {\mathcal C} \to {\mathcal D} is defined as a morphism (in {\mathcal D}) \alpha_X for each object X in {\mathcal C}, such that the diagram below commutes for every morphism f : X \to Y in {\mathcal C}.

[Diagram: the naturality square, G(f) \circ \alpha_X = \alpha_Y \circ F(f)]
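A standard concrete example that helped me here (mine, in Haskell, not from the book): safeHead is a natural transformation from the list functor to the Maybe functor, and naturality says it doesn’t matter whether you map a function before or after taking the head.

safeHead :: [a] -> Maybe a       -- the component α_X for each type X
safeHead []      = Nothing
safeHead (x : _) = Just x

-- The naturality square:  fmap f . safeHead  ==  safeHead . map f
naturalityHolds :: Eq b => (a -> b) -> [a] -> Bool
naturalityHolds f xs = fmap f (safeHead xs) == safeHead (map f xs)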

I think this section on natural transformations is where the huge number of exercises in the book really came into its own; I doubt very much I would’ve grasped some of the subtleties of natural transformations without them.

We learn about two kinds of composition of natural transformations. The first combines two natural transformations on functors F,G,H : {\mathcal C} \to {\mathcal D}, say one \alpha: F \to G and one \beta: G \to H to obtain a natural transformation \beta \circ \alpha: F \to H. The other combines a natural transformation \gamma_1 between functors F_1, F_2 : {\mathcal C} \to {\mathcal D} with one \gamma_2 between functors G_1, G_2 : {\mathcal D} \to {\mathcal E}, to produce one, \gamma_2 \Diamond \gamma_1: G_1 \circ F_1 \to G_2 \circ F_2.

An equivalence {\mathcal C}' \simeq {\mathcal C} – weaker than isomorphism – is introduced on categories: a functor F : {\mathcal C} \to {\mathcal C}' such that there exists a functor F' : {\mathcal C}' \to {\mathcal C} and natural isomorphisms \text{id}_{\mathcal C} \xrightarrow{\cong} F' \circ F and \text{id}_{{\mathcal C}'} \xrightarrow{\cong} F \circ F'. This seemed odd to me — the lack of insistence on isomorphism for equivalence — but the examples all make sense. More importantly, it is shown that this requirement is sufficient for the on-morphisms part of the functor F to be bijective for every fixed pair of objects in {\mathcal C} (i.e. F is fully faithful), which feels right. I think my understanding could benefit from further interesting examples of usefully equivalent but non-isomorphic categories though – suggestions welcome!

Limits and Colimits

Chapter 6 is primarily given over to the definitions and discussion of limits and colimits, generalising the ideas in \mathbf{Set} to other categories. The chapter begins by looking again at product and coproduct. As an example, for a category {\mathcal C}, with objects X, Y \in \text{Ob}({\mathcal C}), a span is defined as a triple (Z, p, q) where Z is an object in {\mathcal C} and p : Z \to X and q : Z \to Y are morphisms in {\mathcal C}. A product X \times Y is then defined as a span on X and Y such that for any other span (Z, p, q) on X and Y, there exists a unique morphism t_{p,q} such that the following diagram commutes. Again, I rather like the author’s slogan for this: “none shall map to X and Y except through me!”

[Diagram: the universal property of the product X \times Y]
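In \mathbf{Set}, and in Haskell as a hedged sketch of my own, the product span is the pair type with its two projections, and the unique mediating morphism t_{p,q} is the pairing of p and q:

-- Given any span (Z, p, q), the unique map into the product:
mediate :: (z -> x) -> (z -> y) -> (z -> (x, y))
mediate p q = \z -> (p z, q z)

-- The projections then recover the original span:
--   fst . mediate p q == p   and   snd . mediate p q == q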

The chapter then further generalises these ideas to the ideas of limit and colimit. This section was a step up in abstraction and also one of the least “applied” parts of the book. It has certainly piqued my interest – what are limits and colimits in various structures I might use / encounter in my work? – but I found the examples here fairly limited and abstract compared to the copious ones in previous chapters. It’s in this chapter that universal properties begin to rear their head in a major way, and again I’m still not totally intuitively comfortable with this idea – I can crank the handle and answer the exercises, but I don’t feel like I’m yet at home with the idea of universal properties.

Categories at Work

In the final section of the book, we first study adjunctions. Given categories {\mathcal B} and {\mathcal A}, an adjunction is defined as a pair of functors L: {\mathcal B} \to {\mathcal A} and R: {\mathcal A} \to {\mathcal B} together with a natural isomorphism \alpha whose component for A \in \text{Ob}({\mathcal A}), B \in \text{Ob}({\mathcal B}) is \alpha_{B,A}: \text{Hom}_{\mathcal A}(L(B), A) \xrightarrow{\cong} \text{Hom}_{\mathcal B}(B, R(A)). Several preceding results and examples can then be re-expressed in this framework, and understood as examples of adjunctions. Adjunctions are also revisited later in the chapter to show a close link with monads.
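One adjunction I found helpful to write out explicitly (a Haskell sketch of the free monoid / forgetful adjunction, my own illustration rather than the book’s): monoid homomorphisms out of the free monoid List(X) correspond exactly to plain functions out of X.

-- α : Hom_Mon(List x, m) ≅ Hom_Set(x, U m), where U forgets the monoid structure.
toFunction :: ([x] -> m) -> (x -> m)
toFunction h = h . (: [])            -- precompose with the unit x ↦ [x]

fromFunction :: Monoid m => (x -> m) -> ([x] -> m)
fromFunction f = mconcat . map f     -- the unique monoid homomorphism extending f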

Categories of functors are discussed, leading up to a discussion of the Yoneda lemma. This is really interesting stuff, as it seems to cross huge boundaries of abstraction. Consider a category {\mathcal C}. Spivak first defines a functor Y_c := \text{Hom}_{\mathcal{C}}(c, -) : {\mathcal C} \to \mathbf{Set} sending each object d in {\mathcal C} to the set \text{Hom}_{\mathcal C}(c,d), and similarly for morphisms. This is then used in the following theorem:

Let {\mathcal C} be a category, c \in \text{Ob}(\mathcal{C}) be an object, and I: {\mathcal C} \to \mathbf{Set} be a set-valued functor. There is a natural bijection \text{Hom}_{\mathcal{C}-\mathbf{Set}}(Y_c, I) \xrightarrow{\cong} I(c).

This looks super weird: the right-hand side is just a plain old set and yet the left-hand side seems to be some tower of abstraction. I worked through a couple of examples I dreamt up, namely \mathcal{C} = \mathbf{1} (the category with one object) with I mapping that object to an arbitrary set, and \mathcal{C} = \mathbf{Set} with I = \text{id}_{\mathbf{Set}}, and convinced myself this works just fine for those two examples. Unfortunately, the general proof is not covered in the book – readers are referred to Mac Lane – and while some discussion is provided regarding links to databases, I felt like I need a proof to properly grasp what’s going on here under the hood.
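Writing the statement in Haskell helped me believe it, even without the proof (a hedged sketch of my own, for a covariant set-valued functor i): a natural transformation out of \text{Hom}_{\mathcal C}(c,-) is determined by where it sends \text{id}_c.

{-# LANGUAGE RankNTypes #-}

-- A natural transformation from Hom(c, -) to the functor i:
type YonedaNat c i = forall d. (c -> d) -> i d

toElement :: YonedaNat c i -> i c
toElement alpha = alpha id            -- evaluate the transformation at id_c

fromElement :: Functor i => i c -> YonedaNat c i
fromElement x = \f -> fmap f x        -- the inverse direction of the bijection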

Monads are then covered. I’ve briefly seen monads in a very practical context of functional programming, specifically for dealing with I/O in Haskell, but I’ve never looked at them from a theoretical perspective before.

A monad on \mathbf{Set} (monads are only defined in the book on \mathbf{Set}, for reasons not totally apparent to me) is defined as a triple (T, \eta, \mu) consisting of a functor T: \mathbf{Set} \to \mathbf{Set}, a natural transformation \eta: \text{id}_{\mathbf{Set}} \to T, and a natural transformation \mu: T \circ T \to T such that the following diagrams commute:

 

[Diagrams: the monad unit laws \mu \circ T\eta = \text{id}_T = \mu \circ \eta T, and the associativity law \mu \circ T\mu = \mu \circ \mu T]
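To connect this with the functional-programming view (a hedged Haskell sketch of my own; the book itself stays within \mathbf{Set}), here is the Maybe exception monad presented as a triple (T, \eta, \mu) rather than via return and >>=:

-- T is the Maybe functor; eta and mu are the two natural transformations.
eta :: a -> Maybe a                  -- η : id → T, componentwise
eta = Just

mu :: Maybe (Maybe a) -> Maybe a     -- μ : T ∘ T → T, componentwise
mu (Just (Just x)) = Just x
mu _               = Nothing

-- The usual bind is then recovered as:  m >>= f  =  mu (fmap f m)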

I found this discussion enlightening and relatively straight-forward. Firstly, exception monads are discussed, like Maybe in Haskell, and a nice example of wrapping sets and their functions into power sets and their functions, for which I again enjoyed the exercises. Kleisli categories of monads are defined, and it then becomes apparent why this is such a powerful formalism to think about concepts like program state and I/O.

The Kleisli category \mathbf{Kls}(\mathcal{T}) of a monad \mathcal{T} = (T, \eta, \mu) on \mathbf{Set} is defined as a category with objects \text{Ob}(\mathbf{Kls}(\mathcal{T})) = \text{Ob}(\mathbf{Set}) and morphisms \text{Hom}_{\mathbf{Kls}(\mathcal{T})}(X,Y) = \text{Hom}_{\mathbf{Set}}(X,T(Y)). The identity \text{id}_X: X \to X in \mathbf{Kls}(\mathcal{T}) is exactly \eta_X in \mathbf{Set}, and composition of morphisms f: X \to Y and g: Y \to Z in \mathbf{Kls}(\mathcal{T}) is given by \mu_Z \circ T(g) \circ f in \mathbf{Set}. You can intuitively see the ‘baggage’ of T being carried through in \mathbf{Set} yet hidden from view in \mathbf{Kls}.
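The composition rule is exactly Kleisli composition as a Haskell programmer would write it (a sketch using join for \mu; Control.Monad provides the same thing as the operators >=> and <=<):

import Control.Monad (join)

-- Composing f : X → T(Y) and g : Y → T(Z) in the Kleisli category,
-- computed in Set as μ_Z ∘ T(g) ∘ f:
kleisli :: Monad t => (y -> t z) -> (x -> t y) -> (x -> t z)
kleisli g f = join . fmap g . f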

Finally, the book ends with a (very) brief discussion of operads. I’ve not come across operads before, and from this brief discussion I’m not really sure what they ‘buy’ me beyond monoidal categories.

General

The exercises are copious and great, especially in the earlier chapters. They kept me engaged throughout and helped me to really understand what I was reading, rather than just think I had. In the later exercises there are many special cases that turn out to be more familiar things. This is both enlightening – “oh, an X is just a Y!” – and frustrating – “yeah, but I want to see Xs that aren’t Ys, otherwise what’s all this heavyweight general machinery for?” I guess this comes from trying to cover a lot of ground in one textbook.

I am struck by just how much mental gymnastics I feel I had to go through to see mathematical structures from the “outside in” as well as from the “inside out”, and how category theory helped me do that. This is a pleasing feeling.

The slogans are great and often amusing, e.g. when describing a functor from the category of Monoids to the category of Sets, “you can use a smartphone as a paperweight.”

The book ends rather abruptly after the brief discussion of operads, with no conclusion. I felt it would be nice to have a summary discussion, and perhaps point out the current directions the field of applied category theory is taking.

Overall, this book has been an enlightening read. It has helped me to see the commonality behind several disparate mathematical structures, and has encouraged me to look for such commonality and relationships in my own future work. Recommended!

Review: Verified Functional Programming in Agda

Agda is a dependently typed programming language. I’ve read about dependent types on and off for the past few years in magazine articles as well as discussed them a little with my collaborator Dan Ghica and others, but never really had the time to properly experiment until this Easter holiday. This book – by Aaron Stump – is interesting because it’s basic, practical, and jumps right into coding. I thought it would be good for someone like me who wants to play with the practice of dependent types and only delve into theoretical concerns as necessary or motivated to do so.

The book comes with its own Agda library, the Iowa Agda Library (IAL), a basic library of Booleans, lists, etc., simply written for the beginner to understand. It has a lot of exercises and code snippets based on that library which I found useful and relatively easy to follow.

The book assumes no functional programming expertise, and since I’ve done a (very) little bit of functional programming in Haskell in the distant past, this made the introductory material easy to follow and a useful revision aid.

Introductory Agda

The early parts of the book – Chapters 1 to 4 – are based on various data structures: Booleans, natural numbers, lists. Here the functional programming adds little to what I’ve seen before. The neat thing is the ability to use Agda as a theorem prover. This is what is described in the book as “external verification” – you write code and then you prove – separately – theorems about the code.

Firstly, let’s take a look at some example Agda code, taken from Chapter 4:


take : ∀{ℓ}{A : Set ℓ} → (n : ℕ) → 𝕃 A → 𝕃 A
take 0       _         = []
take (suc n) []        = []
take (suc n) (x :: xs) = x :: take n xs

There are a few interesting things here already. Firstly, unicode! It feels surprisingly liberating to write unicode characters in code. Secondly, the type declaration declares (via ∀) the function to be polymorphic – it operates on lists (𝕃 from IAL) of any type A (of any level ℓ within a hierarchy of type levels).

The real excitement comes when we start proving theorems. For example, given the function nthTail from IAL which returns all elements after the first n elements in a list (and the empty list if the list has fewer than n elements), we can write


ex46 : ∀{ℓ}{A : Set ℓ} (n : ℕ)(l : 𝕃 A) → (take n l) ++ (nthTail n l) ≡ l
ex46 0        _                           = refl
ex46 (suc n) []                           = refl
ex46 (suc n)  (x :: xs) rewrite ex46 n xs = refl

Here ex46 is just the name of my proof (it was Exercise 4.6 in the book). The type declaration for the function ex46 can be read as “I’ll produce a proof that concatenating (take n l) with (nthTail n l) just produces l for all natural numbers n and for all lists l of elements of any type A.” Here ≡, taken from IAL, is propositional equality, and the only property of it we’re using is reflexivity; in fact the only inhabitant of this type is refl. The first two lines of the proof work because the definitions of take and nthTail are such that when the first argument is zero or the second argument is empty, the left and right hand sides normalize to the same result, namely .l ≡ .l in the first case (.l being the name assigned to the second unnamed argument _) and [] ≡ [] in the second case. The third case explicitly uses induction: rewrite is a directive that applies the induction hypothesis ex46 n xs to the proof of ex46 (suc n) (x :: xs). Very pleasing!

Dependent Types Revealed

Chapter 5 is where we really start to see the power of dependent types. Up until this point, we’ve written functional programs and then proved properties about those programs. We have definitely used dependent types in the process, but they were somehow “invisible” to us as a user, and – at least for the exercises in the book – we could ignore the dependent-type based implementation without really affecting the usability of the approach. Chapter 5 shows how the proofs can be part of the programs themselves, using the type system to express the properties we wish to enforce. The simplest and clearest example given is a declaration of a vector datatype:


data 𝕍 {ℓ} (A : Set ℓ) : ℕ → Set ℓ where
  [] : 𝕍 A 0
  _::_ : {n : ℕ} (x : A) (xs : 𝕍 A n) → 𝕍 A (suc n)

Note here that a vector, unlike a list, has a specific length built into its type. As a result, you get various properties for free; for example, Agda’s type system verifies that the length of the concatenation of two vectors is the sum of the individual lengths from a straightforward definition of the concatenation operation: the code below type checks, and that’s all that’s necessary.


_++𝕍_ : ∀{ℓ}{A : Set ℓ}{n m : ℕ} → 𝕍 A n → 𝕍 A m → 𝕍 A (n + m)
[] ++𝕍 ys = ys
(x :: xs) ++𝕍 ys = x :: (xs ++𝕍 ys)

A more complex example of Braun trees is described in detail in the book. These are binary trees where the data at a node has a value less than or equal to that of its children, while the sizes of a node’s left and right subtrees are either equal or the left subtree is exactly one node larger than the right subtree. The author chooses to encode this latter property within the declaration of the Braun tree type, and then shows how to write code for tree insertion, deletion, etc.

I enjoyed this chapter, though it had me puzzling for quite some time over sigma types, a subtle and powerful idea that I think deserves a bit more discussion than the space given in the book; I ended up reading around this topic a bit before I felt comfortable. An example given considers a function that turns lists into vectors. Immediately we have a problem – what is the return type of this function? It should be (𝕍 A n) but n is not fixed! The solution presented is to return a type written in Agda as Σ ℕ (λ n → 𝕍 A n); an inhabitant of this type is a dependent pair consisting of a natural number n together with a vector of type 𝕍 A n – the type of the second component depends on the value of the first. This takes a little getting used to!
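For comparison, here is a rough Haskell analogue of that sigma type (my own sketch, using GADTs to mimic the dependency): the result packages a length together with a vector of exactly that length, with the length index hidden existentially.

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Roughly Σ ℕ (λ n → 𝕍 A n): a vector whose length index is hidden.
data SomeVec a where
  SomeVec :: Vec n a -> SomeVec a

fromList :: [a] -> SomeVec a
fromList []       = SomeVec VNil
fromList (x : xs) = case fromList xs of
  SomeVec v -> SomeVec (VCons x v)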

Type-Level Computation

Chapter 6 introduces the idea of proof by reflection, amongst other things. This looks super-powerful, and I’m motivated to dig further into this topic when I have the time. The author illustrates the concept by creating a list simplifier, which simplifies expressions on lists. By reflecting the operations on lists as corresponding constructors for list expressions, and then proving the soundness of the simplifier, he is able to use this soundness proof to, inter alia, prove properties about the operations operating on lists. The example is really interesting, though unlike in previous chapters of the book, it’s not really discussed in detail in Chapter 6, and you have to look carefully at the code accompanying the book to see how it all fits together.

The other primary example of Chapter 6 is a statically typed version of printf, the archetypal type-evasive function of my youth:

int printf( const char *restrict format, ... );

Here, the author presents an example credited to Augustsson, showing that dependent types can come to the rescue essentially by providing a type for C’s “…” which depends on the format string.

The integer example from this chapter is the closest to my own interests, showing how integers can be defined using naturals in an interesting way: an integer is a natural together with a sign which is unit (trivial) if the natural is zero but has Boolean type otherwise, i.e. the type of the sign is dependent on the value of the natural, leading to the following type declaration and definition of ℤ [taken from here].


ℤ-pos-t : ℕ → Set
ℤ-pos-t 0 = ⊤
ℤ-pos-t (suc _) = 𝔹

data ℤ : Set where
  mkℤ : (n : ℕ) → ℤ-pos-t n → ℤ

I had followed through the discussion in the book easily, but came totally unstuck when trying to implement a seemingly obvious exercise I set myself: prove that integer multiplication commutes using a pre-existing proof of the commutativity of multiplication over the naturals.

I had defined integer multiplication as:


_*ℤ_ : ℤ → ℤ → ℤ
_*ℤ_ (mkℤ 0 _)        _                = mkℤ 0 triv
_*ℤ_ (mkℤ (suc n) _)  (mkℤ 0 _)        = mkℤ 0 triv
_*ℤ_ (mkℤ (suc n) sn) (mkℤ (suc m) sm) = mkℤ ((suc n) * (suc m)) (sn xor sm)

where triv is the single inhabitant of the type ⊤. I had therefore naively assumed that one could simply write the proof as:

*ℤcommutes (mkℤ 0 _)        (mkℤ 0 _) = refl
*ℤcommutes (mkℤ 0 _) (mkℤ (suc bN) _) = refl
*ℤcommutes (mkℤ (suc aN) _) (mkℤ 0 _) = refl
*ℤcommutes (mkℤ (suc a) sa) (mkℤ (suc b) sb) rewrite xor-comm sa sb | (*comm (suc a) (suc b)) = refl

However, this fails – for the final pattern – for a very interesting reason that was, however, totally incomprehensible to me at first. Agda reports:

𝔹 !=< ℤ-pos-t w of type Set
when checking that the type
(b a w : ℕ) →
w ≡ suc (a + b * suc a) →
(sa sb : 𝔹) →
mkℤ w (sb xor sa) ≡ mkℤ (suc (a + b * suc a)) (sb xor sa)
of the generated with function is well-formed

“Huh? ‘Generated with function?'”

Thus began a small odyssey of Easter holiday evenings spent learning how rewrites are implemented as withs, and how withs in turn are implemented using generated auxiliary functions. I found that Norell and Chapman’s tutorial, together with the good quality documentation, provided me with the necessary additional background required to teach myself what was going on here and to produce a final, working proof:


*ℤcommutes : ∀ (a b : ℤ) → a *ℤ b ≡ b *ℤ a
*ℤcommutes (mkℤ 0 _)        (mkℤ 0 _)        = refl
*ℤcommutes (mkℤ 0 _)        (mkℤ (suc m) sm) = refl
*ℤcommutes (mkℤ (suc n) _)  (mkℤ 0 _)        = refl
*ℤcommutes (mkℤ (suc n) sn) (mkℤ (suc m) sm)
  with m + n * suc m | *comm (suc n) (suc m)
... | .(n + m * suc n) | refl rewrite xor-comm sn sm = refl

I have mixed feelings about this experience. On the one hand, I’ve learnt a lot about Agda specifically (and dependent types in general) by being forced to go through this process to understand what’s going on under the hood. On the other hand, this seems like a lot of hoops to jump through for this simple proof, and the book under review didn’t have all the answers I was looking for to solve the exercise. I don’t see the latter point as a big deal, but if you’re going to do your explorations on holiday, make sure you have a wifi connection available so you can look beyond the book for help!

Selected Topics

The remainder of the book is eclectic but interesting.

Chapters 7 and 8 combine to produce a larger example piece of code – a Huffman coder / decoder. It’s good to see the ideas put into practice in a larger case study, of course, though these chapters were less interesting to me personally.

Chapter 9 really recaptured my attention, though! The author uses Agda’s types to reason about termination. He explicitly sets up a general datatype representing what it means for a relation DAG to not have any infinite chain of successor nodes starting from a given named node, something he calls a terminating relation. He then proves in Agda that > over ℕ is terminating in this sense. The chapter then goes on to apply this to producing a termination proof for a recursively defined division function. I’m aware of this kind of reasoning, and have even played a bit with tools making use of these ideas like Byron Cook et al.‘s Terminator. So it’s interesting to see this in the context of the type system of Agda where in fact it becomes rather important since all functions must terminate in Agda – otherwise none of the proofs discussed above make sense!

The final chapter of the book, Chapter 10, formalises some ideas in a particular logic within Agda, using Agda to prove soundness and completeness of the logic.

Conclusion

The book ends rather abruptly with Chapter 10 on intuitionistic logic and Kripke semantics. There is no rounding off, summarising lessons learnt or directions for future research or indeed practical programming advice.

Nevertheless, I learnt a lot from reading this book – and a lot more from doing its exercises! It’s given me an initial understanding of one of the most recently talked-about developments in programming. I’m left keen to try to apply this to some real-world problems in my research! Well recommended holiday reading.

Book Review: Complexity and Real Computation

Over several holiday and train journeys, I’ve been slowly reading and digesting this book by Blum, Cucker, Shub, and Smale. It’s one of a number of books I’ve read that are definitely not in my research field – hence holiday reading – and yet touches it tangentially enough to get me excited. I’ve been interested in computation defined in terms of real numbers ever since my own PhD thesis back in 1998-2001, which looked at ways to take a specific class of computation defined on the reals and synthesise hardware circuits operating in very efficient fixed-point arithmetic. While my own work has often involved specifying computation on the reals, including recently exploiting the differences between real and finite precision computation for circuits, this book takes a fundamental look at computation over the reals themselves. In particular, it looks at what happens to notions of computability and complexity if you consider real addition, multiplication, division, and subtraction as basic operators in a computation model: what is computable, and what resulting complexity bounds can be proved? Ever since my first exposure to algebraic geometry, especially real algebraic geometry, I see it everywhere – and have even made some small forays [1,2] into harnessing its power for hardware design. It’s a delight to see computability and algebraic geometry coming together so neatly in this text.

Part I: Basic Development

The authors define a computational model that is very natural for analysis: a computation consists of a control-flow graph made up of input nodes, computation nodes (polynomial or rational maps), branch nodes (checking for equality or inequality, depending on whether the ring / field being computed over is ordered), and output nodes. For the remainder of this review, I will consider only computation over the reals, though some of the results are more general. Theorems naturally flow from this setting. My favourites: the halting set of a machine is a countable union of semialgebraic sets; there are “small” systems of polynomial inequalities, of low degree, describing the behaviour of a machine up to T time steps. As a by-product, the first practical uses of this theory are illustrated, e.g. a proof that the Mandelbrot set is not decidable over {\mathbb R}. The interplay between the uncountable (program state) and the countable (program counter state) leads to some intricate and enjoyable analysis.
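To fix the model in my head I sketched a toy interpreter in Haskell (entirely my own simplification: finite-dimensional state only, and Double standing in for the exact real numbers of the book’s machines):

-- A node either applies a map to the state, branches on a sign test, or halts.
data Node
  = Compute ([Double] -> [Double]) Int    -- apply a map, then go to the named node
  | Branch  ([Double] -> Double) Int Int  -- if the test value is >= 0, go to the
                                          -- first node, otherwise the second
  | Output                                -- halt and output the state

runMachine :: [Node] -> [Double] -> [Double]
runMachine machine = go 0
  where
    go i state = case machine !! i of
      Compute f next   -> go next (f state)
      Branch  g yes no -> if g state >= 0 then go yes state else go no state
      Output           -> state

-- Example: a tiny machine computing the absolute value of its single input.
absMachine :: [Node]
absMachine = [ Branch head 2 1         -- node 0: test the sign of the input
             , Compute (map negate) 2  -- node 1: negate
             , Output ]                -- node 2: halt
-- runMachine absMachine [-3.0]  ==  [3.0]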

Chapter 3 generalises classical results for universal Turing machines, typically expressed in terms of unbounded tapes with binary storage at each location, to the case where an element of an arbitrary ring is stored at each location (imagine a real number stored at each location). Additional shift operations are introduced to deal with moving along the tape. It is shown that operation over unbounded sequences allows for uniformity of algorithm across dimensionality but does not add additional computational power for fixed dimensionality. In finite time we can only explore a finite portion of the tape, so we still retain the semi-algebraic (or countable union of semi-algebraic) sets for the halting sets derived in Chapter 2.

Chapter 4 extends the usual complexity-theoretic classes P and EXP to computation over a ring. Computation over {\mathbb Z}_2 corresponds to the classical case. There are no great surprises here for a reader who has read a basic text in computational complexity together with Chapters 1-3 of this book. The authors round off the chapter by using this new machinery to cast results from Gödel and Tarski in an interesting perspective. For example, Gödel becomes “there exist definable undecidable sets over {\mathbb Z},” while Tarski becomes “every definable set over {\mathbb R} is decidable over {\mathbb R}.” Questions of definability are then put off until the very end of this book – and this review.

Chapter 5 shows how the classical NP, NP-complete and NP-hard definitions carry forward into computation over a general ring. Then, specialising the results to particular rings and particular costs of elementary operations (unit? dependent on the size of numbers?), the NP-completeness of various key problems is shown. In particular, it is shown that the classical 3-SAT NP-completeness result can be obtained by working over the ring {\mathbb Z}_2. The chapter ends with an interesting discussion of the P ?= NP question over various rings and with various costs. It’s interesting, for example, that the problem of determining whether there is a zero of a degree-4 polynomial is NP-complete over {\mathbb R} while the degree-3 case is in P; analogous to the situation with 3-SAT versus 2-SAT in the classical setting.

Chapter 6 moves on to look specifically at machines over {\mathbb Z} with bit cost, i.e. the bigger the number, the more it costs to operate on it. It is shown that these machines are polynomially equivalent to those operating over {\mathbb Z}_2 with unit cost, i.e. anything you can write in terms of algebraic operations over {\mathbb Z} you can also write under the classical Turing model of computation, and vice versa, with only polynomial blow up in execution steps. This is not too surprising, and ultimately derives from the standard possibility of writing an integer as its binary expansion. The chapter then takes a detour via Farkas’s Lemma to show that linear programming feasibility (LPF) is in NP, a rather intricate result I first saw in Schrijver’s encyclopaedic book when working with my former PhD student Qiang Liu on discrete linear algebra for hardware memory optimization. It is noted that LPF’s status depends strongly on the choice of computation model and ring: in the (equivalent) classic setting of integers with bit cost it is NP-complete, over rationals with bit cost it is in P, while its status over {\mathbb R} is apparently open.

Part II: Some Geometry of Numerical Algorithms

This part of the book turns to look at specific algorithms such as Newton’s method and a homotopy method for nonlinear optimization. In each case, complexity results are given in terms of the number of basic real arithmetic operations required. Some fun analysis and quite a clear exposition of homotopy methods, which I’ve worked with before in the context of convex optimization, especially when using barrier methods for linear and convex quadratic programming, something I looked at quite carefully when trying to design FPGA-based computer architectures for this purpose.

The chapter on Bézout’s Theorem assumed a rather more thorough grounding in differential topology than I have, and I wasn’t entirely sure how it sat with the rest of the work, as the constructive / algorithmic content of the theorems was less apparent to me.

The final chapters of this part of the book took me back to more familiar territory – first introduced to me by the outstandingly clear encyclopaedic textbook by Nick Higham – the discussion of condition numbers for linear equations and the relationship to the loss of precision induced by solving such systems (roughly, the relative error in the right hand side of Ax = b gets amplified by the condition number of the matrix A). What was interesting here is the authors’ probabilistic analysis – looking at the expected condition number if the matrix elements are iid Gaussian variables, for example. I have skimmed another book – also co-authored by Cucker – on this very topic, and I’m waiting to have the time to read it. Condition for polynomial problems is covered, leading to complexity results for logarithmic barrier interior point methods for linear equations over \mathbb{Q} (spoiler: it’s in P).

Part III: Complexity Classes over the Reals

After Part II, which was only interesting to me in parts, I started to get excited again by Part III. Lower bounds on the computational complexity of various problems are derived here based on some very interesting ideas. The basic scheme is to bound the number of different connected components in a (semi-)algebraic set in terms of the degree of the polynomials and the number of variables involved. The number of different connected components can then be used to bound the number of steps an (algebraic) machine takes to decide membership of that set.

In Chapter 17, the framework is extended to probabilistic machines, and the standard definition of BPP is extended to computation over an arbitrary ring, giving BPP_{R}. The chapter then goes on to show that probabilistic machines over \mathbb{R} can be simulated by deterministic machines over \mathbb{R} without blowing up execution time, i.e. that P_\mathbb{R} = BPP_\mathbb{R}, whereas this is unknown for the classical setting. The essence of the argument makes use of a special real number (shown, non-constructively, to exist), encoding all the different “coin flips” that a probabilistic program makes during its execution, which is then encoded as a machine constant in the simulating machine.

Parallelism – close to my heart – is covered in Chapter 18, through the introduction of complexity classes PL^k_\mathbb{R}, parallel polylogarithmic time, for sets that can be decided in time O(\log^k n) using a polynomial number of processors, and PAR_\mathbb{R}, parallel polynomial time, for sets that can be decided in polynomial time using an exponential number of processors (definitions following a SPMD computational model). Algorithms are derived for matrix inversion and product and a result is cited for parallel quantifier elimination over the reals.

Chapter 22 looks at the relative power of machines over \mathbb{Z}_2, i.e. the classical model, compared to real machines which take inputs restricted to be drawn from \{0,1\}. It’s fairly obvious that the latter setting can provide a lot of additional power, for example deciding membership of a set in \mathbb{Z}^\infty_2 can be done by encoding the characteristic function of the set as the digits of some real constant in the program. It is then shown that additive real machines (a restricted form of the machines discussed above, containing no multiplication or division) that additionally contain branching only on equality can solve exactly the same problems in polynomial time as in the classical model. This is unlike the case where branching can be over inequality, where some additional computational power is provided by this model.

Descriptive complexity, covered in Chapter 23, is interesting: the descriptive power of (a particular) first order logic extended with fixed point and maximisation operations, and of existential second order logic, are shown to correspond to a particular subclass of P_\mathbb{R} problems and to EXP_\mathbb{R}, respectively. In the latter case, I think this essentially says that known results in the classical case carry forward to computation over the reals despite the move away from a finite ring. I am not familiar enough with the computational complexity literature to comment on how the former result compares with the classical setting, and I didn’t feel that this point was very well described in the book.

Conclusion

Overall, I found this book a little eclectic, especially Part II, reflective of the fact that it has clearly been put together drawing from a variety of different primary sources spanning several decades. This should not be seen as a criticism. The choice of material was wonderful for me – as holiday reading at least! – tangentially touching so many different topics I’ve seen during my career and drawing me in. In places, the development was over an arbitrary ring, in places over \mathbb{C} and in places over \mathbb{R}, and although for some sections it was absolutely clear why, for others it was less clear. Overall, though, I would very much recommend anyone who likes to keep a foot in both the continuous and discrete worlds to take a look at this book.

Book Reviews: Popular Science

A friend recently reintroduced me to the genre of popular science – specifically, popular physics – something I had left behind in my mid teens. I remember voraciously reading the books of John Gribbin as a teenager, sitting in bed in my parents’ house thinking about black holes and quantum physics. Since then I’ve been more focused on – and interested in – my particular field.

However, following prompting, I’ve recently read two popular science books: Seven Brief Lessons on Physics by Carlo Rovelli, and A Beautiful Question by Frank Wilczek. Both are good books, though very different. The former is a brief, almost poetically written waltz through relativity, quantum physics, and more recent unification efforts – it can easily be read in a single afternoon. The second is a more weighty tome, a very personal view from a Nobel Prize winner of beauty and its embodiment in the symmetries present in nature. The chapters of the latter vary in terms of how much I enjoyed them, primarily because I felt some had too much information not to take a mathematical approach, yet this was not forthcoming. Meanwhile, the former was a joy to read because in its brevity it skimmed the surface and left me wanting to dig further, specifically into ideas of loop quantum gravity and into what modern neuroscience may or may not be able to tell us about the emergence of notions of “self”.

Book Review: Essential Topology

Topology is an area of mathematics with which I have no prior experience. I guess it was felt historically that engineers don’t really need topology, and this has filtered down the generations of study. Yet I’ve always found the ideas of topology intriguing, even if I have not deeply understood them. They seem to pop up in the most wide variety of places. While running a math circle for primary school kids, we ended up discussing Euler’s Polyhedron Formula, and I realised that if I wanted to explore these ideas more deeply, I would need a proper grounding in topology. From the wonderful lectures of Tadashi Tokieda which I watch with my son, to the more theoretical end of computer science I bump up against in my day-to-day research, topology seems to be everywhere. I found this book, Essential Topology by Martin D. Crossley, while browsing in a bookshop and decided to take it on holiday as holiday reading.

As presented, the book naturally falls into three parts: an introductory section, a section on basic topology, and a section on more advanced algebraic topology. Although short, the book is written as a sequence of a good number of chapters, 11 in total, which make the material more digestible. Moreover, I found the two “interludes” between sections of the book to be a great innovation – invaluable as a mechanism for orienting my reading, providing the appropriate informal guidance about the journey on which the text was taking me.

The introductory section begins by looking at the question of continuity from a rigorous perspective, bringing in both the epsilon-delta approach to continuity with which I am familiar, and an approach based on open sets with which I was not – but which is easily generalised to functions other than those from \mathbb{R} to \mathbb{R}. It then moves on to axiomatically define topological spaces, some standard topologies, and bases for topologies.

The second section begins to explore some of the properties that can be proved using the equipment set up in the previous chapters: connectedness, compactness, the Hausdorff property, homeomorphisms, disjoint unions, product and quotient spaces. I enjoyed this section greatly.

The third section then goes on to discuss various more advanced topics, looking at various topological invariants that have been well studied. Some highlights for me included:

Chapter 6 discusses the idea of homotopy as an equivalence between maps: two maps f,g: S \rightarrow T are homotopic iff there is a continuous function F: S \times [0,1] \rightarrow T allowing for a kind of continuous deformation of one function into the other, i.e. with F(s,0) = f(s) and F(s,1) = g(s). This is shown to be rather a broad equivalence, for example all continuous functions on \mathbb{R} are homotopic. However, working with other topological spaces, things get quite interesting. A big part of the chapter is given over to working with circles \mathbb{S}^1, where it is shown that there is a countable set of homotopy classes of functions from \mathbb{S}^1 to \mathbb{S}^1. This is very clearly described, and I seem to remember through the mists of time some of these issues cropping up informally in the context of my control theory courses as an undergraduate (or, for that matter, fathoming {\tt unwrap} in Matlab as an undergraduate!) The chapter ends with a proof of the fantastically named “Hairy ball theorem,” from which – amongst other things – it follows that at any given point in time, there’s a part of the world with zero wind speed!

Chapter 7 discusses the Euler characteristic as a topological invariant and introduces some interesting theorems. After introducing the idea of a ‘triangulable space’, it is stated that if two such spaces are homotopy equivalent then they have the same Euler number. More remarkable (to me!) is that when restricted to surfaces, it is sufficient for two such surfaces to have the same Euler number and the same orientability in order for them to be homotopy equivalent. Unfortunately, the proofs of both these results are omitted – apparently they are rather involved – and references are given. I certainly appreciated the results collected in this chapter, but I found the exercises quite hard compared to other chapters, possibly partly because of the proof issue, but also because I found it hard to visualise some aspects, e.g. the triangulation of a surface of genus 2, and since such surfaces had only been defined informally in the preceding chapters I could not (easily) fall back on a purely algebraic approach to the geometry. I did find it particularly interesting to see the Euler characteristic rigorously defined for a general class of spaces – the very definition that, via the primary math circle, had brought me to this book in the first place!

Chapter 8 discusses homotopy groups. The basic idea is that one can work with homotopy classes of maps from {\mathbb S}^n, the n-dimensional sphere, to a space X (actually pointed homotopy classes of pointed maps, but let’s keep things simple) and it turns out that these classes form a group structure: they have an identity element (constant maps), an addition operation, etc. I guess the purpose here is to bring out the topological structure of X, though the special role played by {\mathbb S}^n was not totally apparent to me – why is this particular space so fruitful to study? I wonder if any readers can comment on this point for me.

Chapter 9 provides an introduction to Simplicial Homology, the idea – roughly – that those combinations of simplices within a simplicial complex which ‘could’ form boundaries of other simplices but ‘do’ not, tell us something about the topology of the simplicial complex. Homology is then introduced in a different form (Singular Homology) in Chapter 10, and it is stated that the two approaches coincide for triangulable spaces.

In these latter chapters of the book, some theorems tend to be stated without proof. The author is always careful to refer to other textbooks for proof, and I guess this is a necessary aspect of trying to introduce a very large subject in a brief textbook, but for me this made them somewhat less appealing than those in the first two sections. Nevertheless, I now feel suitably armed to begin looking at the world through a more topological lens. I wonder what I will see. For now I’m off to eat a coffee cup.

Book Review: Out of the Labyrinth: Setting Mathematics Free

This book, by Kaplan and Kaplan, a husband and wife team, discusses the authors’ experience running “The Math Circle”. Given my own experience setting up and running a math circle with my wife, I was very interested in digging into this.

The introductory chapters make the authors’ perspective clear: mathematics is something for kids to enjoy and create independently, together, with guides but not with instructors. The following quote gets across their view on the difference between this approach and their perception of “school maths”:

Now math serves that purpose in many schools: your task is to try to follow rules that make sense, perhaps, to some higher beings; and in the end to accept your failure with humbled pride. As you limp off with your aching mind and bruised soul, you know that nothing in later life will ever be as difficult.

What a perverse fate for one of our kind’s greatest triumphs! Think how absurd it would be were music treated this way (for math and music are both excursions into sensuous structure): suffer through playing your scales, and when you’re an adult you’ll never have to listen to music again.

I find the authors’ perspective on mathematics education, and their anti-competitive emphasis, appealing. Later in the book, when discussing competition, Math Olympiads, etc., they note two standard arguments in favour of competition: that mathematics provides an outlet for adolescent competitive instinct and – more perniciously – that mathematics is not enjoyable, but since competition is enjoyable, competition is a way to gain a hearing for mathematics. Both perspectives are roundly rejected by the authors, and in any case are very far removed from the reality of mathematics research. I find the latter of the two perspectives arises sometimes in primary school education in England, and I find it equally distressing. There is a third argument, though, which is that some children who don’t naturally excel at competitive sports do excel in mathematics, and competitions provide a route for them to be winners. There appears to be a tension here which is not really explored in the book; my inclination would be that mathematics as competition diminishes mathematics, and that should competition be needed for self-esteem, one can always find some competitive strategy game where mathematical thought processes can be used to good effect. However, exogenous reward structures, I am told by my teacher friends, can sometimes be a prelude to endogenous rewards in less mature pupils. This is an area of psychology that interests me, and I’d be very keen to read any papers readers could suggest on the topic.

The first part of the book offers the reader a detailed (and sometimes repetitive) view of the authors’ understanding of what it means to do mathematics and to learn mathematics, peppered with very useful and interesting anecdotes from their math circle. The authors take the reader through the process of doing mathematics: analysing a problem, breaking it down, generalising, insight, and describe the process of mathematics hidden behind theorems on a page. They are insistent that the only requirement to be a mathematician is to be human, and that by developing analytical thinking skills, anyone can tackle mathematical problems, a mathematics for The Growth Mindset if you will. In the math circles run by the authors, children create and use their own definitions and theorems – you can see some examples of this from my math circle here, and from their math circles here.

I can’t say I share the authors’ view of the lack of beauty of common mathematical notation, explored in Chapter 5. As a child, I fell in love with the square root symbol, and later with the integral, as extremely elegant forms of notation – I can even remember practising them so they looked particularly beautiful. This is clearly not a view held by the authors! However, the main point they were making: that notation invented by the children, will be owned and understood by the children, is a point well made. One anecdote made me laugh out loud: a child who invented the symbol “w” to stand for the unknown in an equation because the letter ‘w’ looks like a person shrugging, as if to say “I don’t know!”

In Chapter 6, the authors begin to explore the very different ways that mathematics has been taught in schools: ‘learning stuff’ versus ‘doing stuff’, emphasis on theorem or emphasis on proof, math circles in the Soviet Union, competitive versus collaborative, etc. In England, in my view the Government has been slowly shifting the emphasis of primary school mathematics towards ‘learning stuff,’ which cuts against the approach taken by the authors. The recent announcement by the Government on times tables is a case in point. To quote the authors, “in math, the need to memorize testifies to not understanding.”

Chapter 7 is devoted to trying to understand how mathematicians think, with the idea that everyone should be exposed to this interesting thought process. An understanding of how mathematicians think (generally accepted to be quite different to the way they write) is a very interesting topic. Unfortunately, I found the language overblown here, for example:

Instead of evoking an “unconscious,” with its inaccessible turnings, this explanation calls up a layered consciousness, the old arena of thought made into a stable locale that the newer one surrounds with a relational, dynamic context – which in its turn will contract and be netted into higher-order relations.

I think this is simply arguing for mathematical epistemology as akin to – in programming terms – summarising functions by their pre and post conditions. I think. Though I can’t be sure what a “stable locale” or a “dynamic context” would be, what “contraction” means, or how “higher order relations” might differ from “first order” ones in this context. Despite the writing not being to my taste, interesting questions are still raised regarding the nature of mathematical thought and how the human mind makes deductive discoveries. This is often contrasted in the text to ‘mechanical’ approaches, without ever exploring the question of either artificial intelligence or automated theorem proving, which would seem to naturally arise in this context. But maybe I am just demonstrating the computing lens through which I tend to see the world.

The authors write best when discussing the functioning of their math circle, and their passion clearly comes across.

The authors provide, in Chapter 8, a fascinating discussion of the ages at which they have seen various forms of abstract mathematical reasoning emerge: generalisation of when one can move through a 5×5 grid, one step at a time, visiting each square only once, at age 5 but not 4; proof by induction at age 9 but not age 8. (The topics, again, are a far cry from England’s primary national curriculum). I have recently become interested in the question of child development in mathematics, especially with regard to number and the emergence of place value understanding, and I’d be very interested to follow up on whether there is a difference in this between the US, where the authors work, and the UK; what kind of spread is observed in both places; and how age-of-readiness for various abstractions correlates with other aspects of a child’s life experience.

Other very valuable information includes their experience on the ideal size of a math circle: 5 minimum, 10 maximum, as they expect children to end up taking on various roles “doubter, conjecturer, exemplifier, prover, and critic.” If I run a math circle again, I would like to try exploring a single topic in greater depth (the authors use ten one hour sessions) rather than a topic per session as I did last time, in order to let the children explore the mathematics at their own rate.

The final chapter of the book summarises some ideas for math circle style courses, broken down by age range. Those the authors believe can appeal to any age include Cantorian set theory and knots, while those they put off until 9-13 years old include complex numbers, solution of polynomials by radicals, and convexity – heady but exciting stuff for a nine year old!

I found this book to be a frustrating read. And yet it still provided me with inspiration and a desire to restart the math circle I was running last academic year. Whatever my beef with the way the authors present their ideas, their core love – allowing children to explore and create mathematics by themselves, in their own space and time – resonates with me deeply. It turns out that the authors run a Summer School for teachers to learn their approach, practising on kids of all ages. I think this must be some of the best maths CPD going.

Book Review: Surely You’re Joking, Mr Feynman

This Summer I had a long flight to Singapore. A few hours into long flights I tend to lose the will to read anything challenging, so I decided to return to a book I first read as a young teen, when it was lent to me by family friend Fred Harris: Surely You’re Joking, Mr Feynman. I remember at the time that this book was an eye opener. Here was a Nobel prize-winning physicist with a real lust for life coupled with an inherent enthusiasm for applying the scientific method to all aspects of life and a healthy lack of respect for rules and authority. As a teenager equally keen to apply the scientific method to anything and everything, and keen to work out my own ideas on authority and rules, yet somewhat lacking the personal confidence of Feynman, I found the book hugely enjoyable.

I still find the book hugely enjoyable. I was chuckling throughout – the practical jokes, the encounters with non-scientific “experts” and authority figures. However, I was also somewhat taken aback by Feynman’s attitude to women, which I found misogynistic in places, something that had totally passed me by as a teenager. Feynman’s attitudes were probably unremarkable for his time (Feynman was born in 1918) so I’m not attributing blame here, but I find it odd that I didn’t notice this in the 1980s. Googling for it now, I find commentary about this point is all over the Internet. It just goes to show how many subtleties pass children and teens by, something parents and educators would do well to remember.