Equivalent. But better.

Ever since primary (elementary) school, we’ve known that multiplying an integer by 10 is easy. No need for the long written calculations we churn through when doing multiplication by hand. Just stick an extra zero on the end, and we’re done. Multiplication is (relatively) hard, concatenation of digits is easy. And yet, in this case, they’re equivalent in terms of the operation they perform.

Similar equivalences abound in the design of digital hardware for arithmetic computation, and my PhD student Sam Coward (jointly supervised by Theo Drane from Intel) has been devising ways to automatically take advantage of such equivalences to make Intel hardware smaller and more efficient. He will be presenting our work on this topic at the main computer arithmetic conference, ARITH, next week. The conference will be online, and registration is free: https://arith2022.arithsymposium.org.

Let’s explore this example from our early school years a bit more. I’ll use Verilog notation \{\cdot,\cdot\} to denote a function taking two bit vectors and concatenating them together. Of course the ‘multiplication by 10 is easy’ becomes ‘multiplication by 2 is easy’ in binary. Putting this together, we can write 2*x \simeq \{x,0\}, meaning that multiplication by two is the same as concatenation with a zero. But what does ‘the same as’ actually mean here? Clearly they are not the same expression syntactically and one is cheap to compute whereas one is expensive. What we mean is that no matter which value of x I choose, the value computed on the left hand side is the same as the value computed on the right hand side. This is why I’ve chosen to write \simeq rather than =. \simeq clearly defines a relation on the set of expressions. This is a special kind of relation called a congruence: it’s an equivalence relation, i.e. it is symmetric, transitive, and reflexive, but it also ‘plays well’ with function application: if x \simeq y then it necessarily follows that f(x) \simeq f(y) for every function symbol f. Like any equivalence relation on a set, \simeq partitions the set into a set of equivalence classes: in our setting a class corresponds to expressions that can be freely interchanged without changing the functionality of our hardware, even if it changes the performance, area or energy consumption of the resulting design.

Our colleagues Willsey, Nandi, Wang, Flat, Tatlock and Panchekha recently published egg, a wonderful open source library for building and exploring data structures known as ‘e-graphs’, specifically designed to capture these relations on expressions. Sam, Theo and I have developed a set of ‘rewrites’ capturing some of the important intuition that Intel designers apply manually, and encoded these for use within egg. To give you a flavour of these rewrites, here’s the table from Sam’s paper; you can see the example we started with is hiding in there by the name ‘Mult by Two’. The subscripts are used to indicate how many digits we’re dealing with; not all these rules are true for arbitrarily-sized integers, and Sam has gone to some lengths to discover simple rules – listed here as ‘sufficient condition’ – for when they can be applied. This is really important in hardware, where we can use as few or as many bits as the job requires.

You can imagine that, when you have this many equivalences, they all interact and you can very quickly build up a very large set of possible equivalent expressions. e-graphs help us to compactly represent this large set.

Once our tool has spent enough time building such a representation of equivalences, we need to extract an efficient implementation as a hardware implementation. This is actually a hard problem itself, because common subexpressions change the hardware cost. For example if I’m calculating (x+1)*(x+1) then I wouldn’t bother to calculate x+1 twice. We describe in our paper how we address this problem via an optimisation formulation. Our tool solves this optimisation and produces synthesisable Verilog code for the resulting circuit.

So, does it generate good circuits? It certainly does! The graph below shows the possible circuit area and performance achievable before (blue) and after (orange) the application of our tool flow before standard logic synthesis tools. For this example, silicon area can be reduced by up to around 70% – a very significant saving.

Area/Delay tradeoff for a smoothing kernel before and after our work

I’ve really enjoyed working on this topic with Sam and Theo. Lots more exciting content to follow. In the meantime, please tune in to hear Sam talk about it next week.

Notes on Model Theory

This summer, I decided to read something about Model Theory. Aware of my limited time, I decided to aim for a slim volume as a guide, so went for An Invitation to Model Theory by Jonathan Kirby.

Here I collect some notes I made while reading Kirby, completely biased to my own interests. Definitions and theorems largely follow the book, but have often been tweaked by me to make my notes more self-contained. I’ve also tried to ‘sloganise’ each of the originally-unnamed theorems in a way that helps me grasp their essence. Additional commentary around the definitions and theorems is mine, and I’ve added a couple of extra definitions to make things easier for me to read back in the future. Of course there is some risk that I’ve introduced errors or lack of clarity, in which case I’m sure the fault lies with me not with the book – please do reach out if you spot anything.

What this post will discuss

We will look at defining basic syntactical objects, terms and formulas, and their interpretations (all in first-order logic). We will look at ideas relating to whether mathematical structures can be distinguished via the truth of formal sentences, including embedding of one structure inside another. We will look at results guaranteeing the existence of structures (especially infinite structures) satisfying sets of such sentences. Turning this around, we’ll also look at some results on which structures can be axiomatised by such sets of sentences. We will consider the extent to which one can define a subset using first-order sentences and the relationship to quantifier elimination (see also my post including algorithms for quantifier elimination). Finally, we will look at the idea of ‘how uniquely’ mathematical structures can be described in this way.

What this post will not discuss

I have purposely not given proofs or examples. Kirby’s book of course has (most) proofs and is rich in examples, through which much of the material is developed. This means I have not discussed some really interesting results on, say, algebraically closed fields, on which Kirby spends considerable time. Those interested in how to apply some of these ideas would do well to consult the book. In addition, I have avoided the topic of types in this simple set of notes, which Kirby highlights as a “central concept in model theory”, despite the interesting results they enable later in the book, e.g. saturated models. The exception is that I included a simplified version of the Ryll-Nardzewski theorem at the end of this post (the proof given uses types) because this particular part of the theorem may be of use to me in the future and can be stated without recourse to types.

I hope to do a later blog post with some of the material using types in due course.

Languages and Structures

Definition (Language). A language L consists of a set of relation symbols, a set of function symbols, a set of constant symbols and, for each relation and function symbol, a positive number known as its arity. The set of all these symbols is denoted \text{Symb}(L).

Definition (Lstructure). An L-structure \mathcal{A} consists of a set A called its domain, together with interpretations of the symbols in L: for each relation symbol R of arity n, a subset R^\mathcal{A} of A^n; for each function symbol f of arity n, a function f^\mathcal{A} : A^n \to A; for each constant symbol c, a value c^\mathcal{A} \in A.

Definition (Embedding). Let \mathcal{A} and \mathcal{B} be L-structures. An embedding of \mathcal{A} into \mathcal{B} is an injection \pi: A \to B s.t.: for all relation symbols R of L and all a_1,\ldots,a_n \in A, (a_1,\ldots,a_n) \in R^\mathcal{A} iff \pi(a_1),\ldots,\pi(a_n) \in R^\mathcal{B}; for all function symbols f of L and all a_1,\ldots,a_n \in A, \pi(f^\mathcal{A}(a_1,\ldots,a_n)) = f^\mathcal{B}(\pi(a_1),\ldots,\pi(a_n)); for all constant symbols c of L, \pi(c^\mathcal{A}) = c^\mathcal{B}.

Definition (Isomorphism). An embedding \pi: \mathcal{A} \to \mathcal{B} is an isomorphism iff there is an embedding \sigma: \mathcal{B} \to \mathcal{A} s.t. \pi \circ \sigma is the identity on \mathcal{B} and \sigma \circ \pi is the identity on \mathcal{A}.

Definition (Automorphism). An automorphism is an isomorphism from a structure to itself.

Terms, Formulas and their Models

Definition (Lterm). The set of terms of the language L is defined recursively as follows: every variable is a term; every constant symbol of L is a term; if f is a function symbol of L of arity n and t_1, \ldots, t_n are terms of L, then f(t_1, \ldots,  t_n) is a term; only something built from the preceding clauses in finitely many steps is a term.

Definition (Lformula). The set of formulas of L is defined recursively as follows: if t_1 and t_2 are terms, then (t_1 = t_2) is a formula; if t_1, \ldots t_n are terms and R is a relation symbol of L of arity n, then R(t_1, \ldots t_n) is a formula; if \varphi and \psi are formulas, then \neg \varphi and \varphi \wedge \psi are formulas; if \varphi is a formula and x is a variable, then \exists x[\varphi] is a formula; only something built from the preceding four clauses in finitely many steps is a formula. We will write \varphi(x) for a formula with free variables x = (x_1, \ldots x_n) in some defined order.

Definition (Lsentence). An L-sentence is a L-formula with no free variables.

Definition (Atomic formula). Formulas of the form t_1 = t_2 for some L-terms t_1, t_2 or of the form R(t_1,\ldots,t_n) for some relation symbol R and terms t_1,\ldots,t_n are atomic formulas.

Definition (Interpretation of terms). Let \mathcal{A} be an L-structure, t be a term of L, and x = (x_1, \ldots, x_r) be a list of variables including all those appearing in t. The interpretation of t, written t^\mathcal{A} is a function from A^r to A defined by: if t is a constant symbol c then t^\mathcal{A} : x \mapsto c^\mathcal{A}; if t is x_i then t^\mathcal{A} : x \mapsto x_i; if t is f(t_1, \ldots, t_n) then t^\mathcal{A} : x \mapsto f^\mathcal{A}(t_1^\mathcal{A}(x), \ldots, t_n^\mathcal{A}(x)).

Definition (Interpretation of formulas). We define \mathcal{A} \vDash \phi(a) (read \vDash as ‘models’) recursively as: \mathcal{A} \vDash t_1(a) = t_2(a) iff t_1^\mathcal{A}(a) = t_2^\mathcal{A}(a); \mathcal{A} \vDash R(t_1(a), \ldots, t_n(a)) iff (t_1^\mathcal{A}(a), \ldots, t_n^\mathcal{A}(a)) \in R^\mathcal{A}; \mathcal{A} \vDash \neg \varphi(a) iff \mathcal{A} \nvDash \varphi(a); \mathcal{A} \vDash \varphi(a) \wedge \psi(a) iff \mathcal{A} \vDash \varphi(a) and \mathcal{A} \vDash \psi(a); \mathcal{A} \vDash \exists x[\varphi(x,a)] iff there is some b \in A s.t. \mathcal{A} \vDash \varphi(b,a).

Definition (Lsentence). An L-sentence is a L-formula with no free variables.

Definition (Models of sentences). For a sentence \varphi, if \mathcal{A} \vDash \phi we say \mathcal{A} is a model of \varphi. For a set of sentences \Sigma, we say that \mathcal{A} is a model of \Sigma (\mathcal{A} \vDash \Sigma) iff it is a model of all sentences in \Sigma.

Definition (Entailment). Let \Sigma be a set of L-sentences and \varphi be an L-sentence. \Sigma entails \varphi, written \Sigma \vdash \varphi if every model of \Sigma is also a model of \varphi.

Note that I am using the notation for entailment used by Kirby’s book, for consistency, but computer scientist readers of my blog will want to note that this symbol is commonly also used for (the more restricted, assuming sound) ‘proves’ relation for a formal deductive system.

Definition (Deductively closed). A set \Sigma of L-sentences is deductively closed iff, for every L-sentence \sigma, if \Sigma \vdash \sigma then \sigma \in \Sigma.

(Again note no formal deduction in play here.)

Definition (Satisfiable). A set of sentences \Sigma is satisfiable iff there exists some L-structure \mathcal{A} such that \mathcal{A} \vDash \Sigma.

Definition (Complete). A set \Sigma of L-sentences is complete if, for every L-sentence \varphi, either \Sigma \vdash \varphi or \Sigma \vdash \neg \varphi.

Definition (Theory). A theory is a satisfiable and deductively closed set of L-sentences.

Definition (Theory of a structure). The theory of an L-structure \mathcal{A}, denoted \text{Th}(\mathcal{A}) is the set of all L-sentences which are true in \mathcal{A}.

Elementary Equivalence

It’s clearly interesting to know whether structures can be distinguished by sentences:

Definition (Elementary equivalence). Two L-structures \mathcal{A} and \mathcal{B} are elementarily equivalent, denoted A \equiv B iff for every L-sentence \varphi, \mathcal{A} \vDash \varphi iff \mathcal{B} \vDash \varphi.


Theorem (Compactness). Let \Sigma be a set of L-sentences. If every finite subset of \Sigma has a model then \Sigma has a model.

A huge amount of excitement flows from the compactness theorem, including many of the results that follow.

Proposition (Elementarily-equivalent non-isomorphic structures). Let \mathcal{A} be any infinite L-structure. Then there is another L-structure which is elementarily equivalent to \mathcal{A} but not isomorphic to it.

And so, via this route, we get wonderful non-standard models. As we will see in a moment, via the Löwenheim-Skolem theorems, there are actually many such structures.

Axiomatisable Classes

Definition (Axiomatisable class). If \Sigma is a set of L-sentences then denote by \text{Mod}(\Sigma) the class of all L-structures which are models of \Sigma. The class \text{Mod}(\Sigma) is axiomatised by \Sigma and is and axiomatisable class. The class is finitely axiomatised if it is axiomatised by a finite set of sentences.

By compactness, we can’t capture arbitrarily sized, but finite, models (e.g. the class of all finite groups):

Proposition (Large models for axiomatisable classes ensures infinite models). Let C be an axiomatisable class with arbitrarily large finite models (i.e. for every n \in \mathbb{N} there is a model \mathcal{A}_n \in C whose domain is finite and of size at least n). Then C also contains an infinite model.

If we can finitely axiomatise a class, then if we know we can axiomatise it by some (possibly infinite set of) sentences, then it suffices to choose an appropriate finite subset of those sentences:

Proposition (Finite subsets of sentences axiomatise). If C = \text{Mod}(\Sigma) and C is finitely axiomatisable, then it is axiomatised by a finite subset of \Sigma.

Cardinality of Languages

Definition (Language cardinality). We will write |L| for the cardinality of language L, which we define to be the cardinality of the set of all L-formulas containing only countably many variables.

Unsurprisingly, given this definition, the language cardinality is countable if the set of symbols is countable, otherwise it’s the same cardinality as the set of symbols:

Proposition (Language cardinality). |L| = \max( \aleph_0, |\text{Symb}(L)| ).

Substructures/Extensions and Elementary Substructures/Extensions

Definition (Substructure). Let \mathcal{A} and \mathcal{B} be L-structures, and suppose A \subseteq B. \mathcal{A} is a substructure of \mathcal{B} iff the inclusion of \mathcal{A} into \mathcal{B} is an embedding.

An example of a substructure would be the naturals as a substructure of the integers, with the language consisting of the < relation and 0; clearly the inclusion map is an embedding.

Lemma (Substructures preserve quantifier-free formulas). If \mathcal{A} \subseteq B, \varphi(x) is a quantifier-free formula and a \in A, then \mathcal{A} \vDash \varphi(a) iff \mathcal{B} \vDash \varphi(a).

Taking the same example, we can see why a stronger notion than substructure is required in order to preserve formulas with quantifiers. For example \exists x[ x < 0 ] clearly has a model in the integers but not in the naturals.

Definition (Elementary substructure/extension). \mathcal{A} is an elementary substructure of \mathcal{B} (and \mathcal{B} is an elementary extension of \mathcal{A}, written \mathcal{A} \preccurlyeq \mathcal{B}, iff A \subseteq B and, for each formula \varphi(x) and each a \in \mathcal{A}^n, \mathcal{A} \vDash \varphi(a) iff \mathcal{B} \vDash \varphi(a).

The Tarski-Vaught test is a test for whether a substructure is elementary. The essence of the test is to check whether we can always find an element in the potential substructure that could replace the equivalent element in the superstructure in a formula containing just one variable ranging over the superstructure domain:

Lemma (Tarski-Vaught test). Let \mathcal{A} \subseteq \mathcal{B} be an L-substructure and for every L-formula \varphi(x,y) and every a \in A^n and b \in B such that \mathcal{B} \vDash \varphi(a,b), there is d \in A such that \mathcal{B} \vDash \varphi(a,d). Then \mathcal{A} \preccurlyeq \mathcal{B}.

The Downward Löwenheim-Skolem theorem guarantees the existence of ‘small’ elementary substructures built from the base of an arbitrary subset of the domain of of the elementary extension:

Theorem (Downward Löwenheim-Skolem theorem). Let \mathcal{B} be an L-structure and S \subseteq B. Then there exists an elementary substructure A \preccurlyeq B such that S \subseteq A and |A| \leq \max(|S|,|L|).

While the Upward Löwenheim-Skolem theorem gives us a tower of models elementarily extending any infinite model:

Theorem (Upward Löwenheim-Skolem theorem). For any infinite L-structure \mathcal{A} and any cardinal \kappa \geq \max(|L|,|A|), there exists an L-structure \mathcal{B} of cardinality equal to \kappa such that \mathcal{A} \preccurlyeq \mathcal{B}.

Definable Sets

Definition (Definable set). In an L-structure \mathcal{A}, a subset S of A^n is said to be a definable set if there is an L-formula \varphi(x_1,\ldots,x_n) s.t. S = \{ a \in \mathcal{A}^n \; | \; \mathcal{A} \vDash \varphi(a) \}.

One way to show that a set is undefinable is to find an automorphism not preserving the set:

Theorem (Preservation of definability). If S \subseteq A^n is a definable set and \pi is an automorphism of \mathcal{A} then \pi(S) = S.

Notation (\text{Def}). We will write \text{Def}_n(\mathcal{A}) for the set of all definable subsets of A^n.

Quantifier Elimination

Quantifier elimination is a useful way to get a grip on which sets are definable.

Definition (Quantifier elimination for structures). An L-structure \mathcal{A} has quantifier elimination iff for every n \in \mathbb{N}^n, every definable subset of A^n is defined by a quantifier-free L-formula.

Definition (Quantifier elimination for theories). An L-theory T has quantifier elimination iff for every n \in \mathbb{N}^n, for every L-formula \varphi(x_1, \ldots, x_n), there is a quantifier-free L-formula \theta(x_1, \ldots, x_n) such that T \vdash \forall x [\varphi(x) \leftrightarrow \theta(x)].

These semantic and syntactic views line up:

Lemma (Structures and their theories are equivalent w.r.t. quantifier elimination). Let \mathcal{A} be an L-structure. Then \text{Th}(\mathcal{A}) has quantifier elimination iff \mathcal{A} has quantifier elimination.

Substructure Completeness

One way to show that a theory has quantifier elimination is via substructure completeness, a weaker notion than completeness.

Notation (\mathcal{A}^+). Given an L-structure \mathcal{A}, define a new language L_A by adding a new distinct constant symbol c_a for every a \in A. \mathcal{A}^+ is the L_A-structure obtained by interpreting each constant symbol c_a as a, with all other interpretations remaining as in \mathcal{A}.

Definition (Diagram). The diagram \text{Diag}(\mathcal{A}) of a L-structure \mathcal{A} is the set of all atomic L_A-sentences and negations of atomic L_A-sentences which are true in \mathcal{A}^+.

Definition (Substructure complete). A L-theory is substructure complete if, whenever \mathcal{A} is a substructure of a model of T, the theory T \cup \text{Diag}(\mathcal{A}) is a complete L_A-theory.

Proposition (Substructure completeness is equivalent to quantifier elimination). Let T be a L-theory. Then T is substructure complete iff T has quantifier elimination.

Principal Formulas

Notation. Given an L-structure \mathcal{A} and an L-formula \varphi with n free variables, we will write \varphi(\mathcal{A}) for \{ a \in A^n | \mathcal{A} \vDash \varphi(a) \}.

Definition (Principal formula). Let T be a complete L-theory. A principal formula w.r.t. T is a formula \psi(x) such that: (i) T \vdash \exists x [\psi(x)], (ii) for every L-formula \varphi(x) with the same tuple of free variables, either T \vdash \forall x[ \psi(x) \to \varphi(x) ] or T \vdash \forall x[ \psi(x) \to \neg \varphi(x)].

Another way to get a good handle on the definable sets of a L-structure is via their principal formulas, especially if there are only finitely many.

Proposition (Definable sets from principal formulas). Let \psi_1(x), \ldots, \psi_m(x) be principal formulas in n variables for the (complete) theory \text{Th}(\mathcal{A}), defining distinct subsets of A^n. Also let these formulas cover A^n, i.e. \cup_{i=1}^m{\psi_i(\mathcal{A})} = A^n. Then every definable set in \text{Def}_n(\mathcal{A}) is a union of some of the sets \psi_i(\mathcal{A}), and |\text{Def}_n(\mathcal{A})| = 2^m.


Bearing in mind the upward and downward Löwenheim-Skolem theorems, what’s the appropriate definition of a model that’s ‘effectively defined’ by a theory?

Definition (Categorical). A theory T is \kappa-categorical iff there is a model of T of cardinality \kappa and any two models of cardinality \kappa are isomorphic.

It turns out that categoricity can give us completeness:

Lemma (Łos-Vaught test). If an L-theory T is \kappa-categorical for some \kappa \geq |L| and T has no finite models, then T is complete.

How do we know whether a theory is categorical? We have a a special case for countably categorical (i.e. \aleph_0-categorical) theories. It turns out that countably categorical complete theories are exactly those whose models have only finitely many definable subsets for each fixed dimension:

Theorem (Ryll-Nardzewski theorem). Let T be a complete L-theory with an infinite model \mathcal{A} and with L countable. Then T is countably categorical iff for all n \in \mathbb{N}^+, \text{Def}_n(\mathcal{A}) is finite.

(Note to reader: the version of this theorem presented in Kirby instead links countable categoricity to the finiteness of Lindenbaum algebras, which I have avoided discussing in this blog post. But Lindenbaum algebras are isomorphic to algebras of definable sets for complete theories (see Kirby Lemma 20.3), so I have stated as above here. I have also eliminated elements of the full theorem presented, to make the presentation here self-contained.)

So how about the book? Would I recommend it to others?

From my perspective as an “informed outsider” to model theory, Kirby’s book was accessible. It’s brief – as the title says, it’s an invitation to model theory – and I think it does that job well. If you’re comfortable (or indeed enjoy) seeing material presented in a variety of ways, sometimes through example that generalises later in the book, sometimes through application, sometimes in its general form first, then you will probably enjoy the presentation – as I did. I felt the book falls a little short of a self-study guide, though, primarily because it doesn’t include worked answers to any of the exercises; this would be a welcome addition. This point is particularly relevant to Chapter 16, which (unlike other chapters in the book) takes the form of a ‘guided exercise’ to applying the ideas of the preceding chapters to the study of a particular structure.

Overall, I would definitely recommend to other outsiders wanting to catch a glimpse of what Model Theory is all about.

Schools Bill 2022

On the 11th May 2022 the Government published its Schools Bill, providing the legislation required to implement several of the ideas explained in their earlier White Paper which I analysed in a previous blog post. On the 12th May the Bill was followed up with a raft of explanatory policy statements. I have been going through all these documents, and summarise here points of key importance or that stood out for me.

It is worth noting at the outset that the policy statements go well beyond what’s in the Bill, as they set out plans for intended standards to be set in the future, with the Bill simply enforcing these standards as they arise. As such, the Bill provides for quite sweeping powers for the Secretary of State to amend these standards and duties over time without the requirement for additional legislation.

1. Academies

1.1. Intervention in Existing Academies

Part 1 of the Bill deals with changes to the landscape of Academies in England. I would characterise the scope here as two-fold: firstly, the aim is to move many existing powers from contractual funding agreements into primary legislation, and secondly to be able to apply these powers at the Trust level (‘proprietor’ in the Bill) rather than only at individual Academy level.

Currently, the DfE’s powers of intervention depend on the version of the Funding Agreement and Articles of Association used by a particular academy, which depends on when they converted to Academy status – a patchwork which is both inequitable and very hard to manage. The Bill would place all Academies under the same legislative framework, which is to be welcomed.

Notices to Improve, an existing mechanism (formerly known as ‘Financial Notices to Improve’) seem to be envisaged as the main approach for ensuring change at a Trust or Academy without using the ‘nuclear’ Termination option. It’s proposed that these notices can be issued under wider circumstances than currently, e.g. “[if the SoS is satisfied] that there are significant weaknesses in the proprietor’s governance procedures” Part 1 6(1)(b). I think on balance this is a good idea: it is much better for the DfE to be able to use appropriate levers to effect change, especially in cases of inadequate governance, rather than have to fall back on Termination as the only route.

Along the same lines, Part 1 Section 7 includes powers to appoint or require appointment of directors to an Academy either due to failure to act on a Notice to Improve, a serious breakdown in governance procedures, or for safeguarding reasons. This happens by directing the Trust to appoint a named person or, if the DfE prefers, a person with named skills and experience. The accompanying briefing paper discusses using this power to create Interim Trust Boards, not unlike existing Interim Executive Boards for maintained schools causing concern. It’s interesting that similar provisions existed at academy level in early versions of the model Articles of Association of academies, see e.g. Article 62 allowing the appointment of ‘Additional Governors’ for very similar reasons, and the consequent Article 64 requiring existing governors to stand down in these circumstances. These no longer exist in the current model Articles, and I have always thought this odd. I am pleased to see these intervention powers returning, and perhaps more importantly all schools placed on an equal footing in this regard.

1.2. Academisation

The main new power over LA maintained schools is the power for LAs to apply for an Academy Order via a new Section 3A of the Academies Act 2010. This appears as Part 1 29(2) of the Bill. This seems to be the way the Government is implementing its stated aim to allow LAs to establish MATs (see my analysis of the White Paper for more detail). In areas where the LA is aligned with Government plans, we can expect to see beginning of the end of maintained schools over the 2023/24 school year, I expect. However, there is no obligation for LAs to apply for academy orders, so I believe we will still see a mixed economy of MATs, maintained schools, and isolated SATs well into the future, but perhaps with more regional variation than currently.

1.3. Grammar Schools

Grammar schools and schools with religious character get significant emphasis in the Bill. At least for grammar schools, I read this as primarily a desire to reassure them that the only way they will lose their ability to select is through a parental ballot, not through one of the sweeping powers introduced in the rest of the Bill: essentially the Secretary of State appears to want to explicitly limit his own powers in this regard without bringing in further primary legislation. This makes sense in the context of the White Paper, as a way to bring more grammar schools into MATs.

1.4. Not in the Bill

There are a few things I noted previously as mentioned in the White Paper that the Bill gives no real insight into, such as:

  • the requirement to (re)broker schools into strong MATs after two consecutive Ofsted reports below Good
  • the setting up of an “arms length” curriculum body
  • mechanisms for local governance (though this is mentioned in the explanatory briefing)
  • MAT inspection
  • collaboration standards for MATs

There is also talk in the accompanying briefings of improvements to complaint handling by academies, another issue I have commented on before and where I agree that significant improvements need to be made.

I suspect the broad powers introduced by the Bill are likely to be used as an “Enabling Act” to introduce these various points via the school standards provisions outlined in Part 1 1(2) of the Bill. We are told that the Academy Trust Standards will come to parliament separately (see explanatory briefing), and I look forward to reading these.

2. The Rest of the Bill

The rest of the Bill deals with school funding, school attendance, independent educational institutions, and various miscellaneous items.

I was involved in determining the school funding formula for Essex LA for many years, and I don’t read this portion of the Bill as particularly radical. Long-term readers of my blog will know that I’m not very happy with the approach taken to the National Funding Formula (see, for example, my 2017 consultation response and the links to earlier posts going back some years therein). But the new content in this Bill is essentially simply about making the default funding position the “direct” (previously known as “hard”) National Funding Formula, with the Secretary of State authorising exceptions (presumably because they recognise that actually one size really doesn’t fit all!). I’m pleased to see a continued role for Schools Forums in the proposed legislation, including around de-delegation for maintained schools, in Part 2 (3)(a).

The school attendance provisions introduce a duty on LAs to maintain a register of children not in school and to exchange information between LAs when children on the register move, as well as various provisions to help clamp down on illegal schools.


From my perspective, the meat of the Bill is Part 1 on Academies, standards, and interventions. I am cautiously optimistic about the measures proposed to intervene in existing academies. In particular, I welcome the move away from high stakes “hands off” versus “terminate” interventions. Whether a particular intervention is best made at Trust or Academy level is worthy of considerable debate, and I think practice is likely to evolve here over time, but it appears to me that if the Government really wants MATs to take active responsibility for their Academies, as discussed in the White Paper, the balance will move toward Trust-level intervention and towards softer forms of intervention.

I am less comfortable with the proposed amendments to the Academies Act 2010 to allow LAs to issue Academy Orders. There appears to be no mechanism to provide for the LAs to decide which MATs academised schools will end up in following one of these orders. While there is no requirement for LAs to issue Academy Orders to all their maintained schools, I do wonder what will happen in those areas which keep significant maintained schools: how will the Government achieve their stated aim of full academisation here? Future legislation, or increasingly constraining LAs ability to adequately maintain their schools via funding or other levers?

Finally, in order to make the intervention measures work well in practice, good resourcing of Regional Director (currently Regional Schools Commissioner) offices will be absolutely essential. It therefore somewhat unfortunate that these papers arrived in the same week as a drive to cut civil service jobs.

Keeping the Pipelines Full

On the 16th May, my PhD student Jianyi Cheng (jointly advised with John Wickerson) will present his most recent paper “Dynamic C-Slow Pipelining for HLS” at FCCM 2022 in New York, the FPGA community’s first in-person conference since the pandemic hit.

Readers of this blog may remember that Jianyi has been working on high-level synthesis techniques that combine the best of dynamic scheduling with the best of static scheduling [FPGA 2020,FCCM 2021]. The general principle underlying his work is to make the most of what information we have at compile time to develop highly efficient custom architectures, while leaving what we don’t know at compile time to influence execution at run-time.

A very common design pattern in hardware acceleration is the idea of C-slow pipelining. Pipelining tends to be taught early in undergraduate programmes, but C-slow pipelining rarely gets mentioned. The idea arises in circuits with feedback loops. The basic approach to pipelining doesn’t really work in this setting: although we can throw multiple registers into the circuit, potentially improving clock frequency at the cost of latency, just like with feed-forward circuits, we can’t then overlap computation to achieve improved throughput, unlike the feed-forward case, because of the data dependency corresponding to the feedback loop.

C-slow pipelining essentially says “OK, but you can use the spare capacity induced by the pipeline registers to overlap computation of independent streams of data, if you happen to have them available.”

Our new paper introduces a dynamic HLS flow for C-slow pipelining. This is particularly valuable in the context of a globally dynamic environment but where certain components exhibit static control flow and can be efficiently pipelined, for example some deep but predictable computation that must be repeated many times but with the arrival times and sources for this computation may change dynamically at runtime, a perfect fit for our prior work.

Jianyi presents a way to leverage the Boogie language and tool flow from Microsoft Research to automatically prove sufficient conditions for C-slowing to be correct. He is then able to introduce a new hardware component within the Dynamatic HLS tool that allows the schedule to “run ahead” to implement certain bounded out-of-order executions corresponding to C-slowing at the circuit level.

At the cost of a small area overhead in the region of 10%, this combined software analysis and hardware transformation is able to reduce wall-clock execution time by more than half compared to the vanilla dynamic scheduling approach.

If you’ll be in NYC in mid-May, go along and hear Jianyi’s talk!

Nonlinearity is Your Friend

My former PhD student Erwei Wang and I recently teamed up with some collaborators at UCL: Dovydas Joksas, Nikolaos Barmpatsalos, Wing Ng, Tony Kenyon and Adnan Mehonic and our paper has just been published by Advanced Science (open access).

Our goal was to start to answer the question of how specific circuit and device features can be accounted for in the training of neural networks built from analogue memristive components. This is a step outside my comfort zone of digital computation, but naturally fits with the broader picture I’ve been pursuing under the auspices of the Center for Spatial Computational Learning on bringing circuit-level features into the neural network design process.

One of the really interesting aspects of deep neural networks is that the basic functional building blocks can be quite diverse and still result in excellent classification accuracy, both in theory and in practice. Typically these building blocks include linear operations and a type of nonlinear function known as an activation function; the latter being essential to the expressive power of ‘depth’ in deep neural networks. This linear / nonlinear split is something Erwei and I, together with our coauthors James Davis and Peter Cheung, challenged for FPGA-based design, where we showed that the nonlinear expressive power of Boolean lookup tables provides considerable advantages. Could we apply the a similar kind of reasoning to analogue computation with memristors?

Memristive computation for the linear part of the computation performed in neural networks has been proposed for some time. Computation essentially comes naturally, using Ohm’s law to perform scalar multiplication and Kirchhoff’s Current Law to perform addition, resulting in potentially energy-efficient analogue dot product computation in a physical structure known as a ‘crossbar array’. To get really high energy efficiency, though, devices should have high resistance. But high resistance brings nonlinearity in practice. So do we back away from high resistance devices so we can be more like our mathematical abstractions used in our training algorithms? We argue not. Instead, we argue that we should make our mathematical abstractions more like our devices! After all, we need nonlinearity in deep neural networks. Why not embrace the nonlinearity we have, rather than compromise energy efficiency to minimise it in linear components, only to reintroduce it later in activation functions?

MNIST classification error trading off against power consumption

I think our first experiments in this area are a great success. We have been able to not only capture a variety of behaviours traditionally considered ‘non-ideal’ and harness them for computation, but also show very significant energy efficiency savings as a result. You can see an example of this in the figure above (refer to the paper for more detail). In high power consumption regimes, you can see little impact of our alternative training flow (green & blue) compared to the standard approach (orange) but when you try to reduce power consumption, a very significant gap opens up between the two precisely because our approach is aware of the impact this has on devices, and the training process learns to adapt the network accordingly.

We’ve only scratched the surface of what’s possible – I’m looking forward to lots more to come! I’m also very pleased that Dovydas has open-sourced our training code and provided a script to reproduce the results in the paper: please do experiment with it.

Deciding Theories

Every once in a while I start modelling a problem using a logical formalism and need to remind myself of the decidability of various first-order theories and the complexity results for those that are decidable. Inevitably, I end up spending a long time looking back over literature I once read, with results sprinkled in different places.

This post summarises some of this information in one place for me or others to find in the future, with a skew towards those theories of interest for arithmetic. The information here is primarily taken from Bradley and Manna supplemented with various other sources linked. The usual definitions of \Omega and O are used.

TheoryDescriptionFull theoryQuantifier-free conjunctive fragmentQuantifier elimination algorithm
(see note 1)
EqualityUndecidableO(n \log n)
(Downey et al.)
Congruence closure algorithm
T_{\mathrm{PA}}Peano arithmeticUndecidableUndecidableN/A
Presburger arithmetic\Omega\left(2^{2^n}\right)
(Fischer and Rabin)
Cooper’s Algorithm
(see note 2)
(see note 3)
Real arithmetic (with multiplication)O\left(2^{2^{kn}}\right)O\left(2^{2^{kn}}\right)Collins’ Cylindrical Algebraic Decomposition
(see note 4)
Rational arithmetic (without multiplication), a.k.a. linear arithmetic.\Omega\left(2^n\right)
(Fischer and Rabin)
(Ferrante and Rackoff)
(a.k.a. linear programming)
Ferrante and Rackoff’s Algorithm
(see note 5)
Extensional ArraysUndecidableNP-complete
(Stump et al.)

Notes on the table:

  1. T_{\mathrm{E}} is as defined in Bradley and Manna, that is it has a signature consisting of = (equality) and all constant, function and predicate symbols, a.k.a. ‘equality with uninterpreted functions’. Reflexivity, symmetry, transitivity, function congruence and predicate congruence are axioms for =, but all other functions and predicates are uninterpreted (except w.r.t. these congruence and predicate axioms). Note that this is not the theory of pure equality, for which the full theory is decidable and admits a weak form of quantifier elimination (pure equality doesn’t have the functions or predicates, see Bradley and Manna Section 10.4 for the definition of weak quantifier elimination).
  2. Presburger arithmetic as described on Wikipedia does not admit quantifier elimination (counter-example: \exists x. 2x = y). However, adding an additional countable set of predicates capturing divisibility (one per divisor) together with an appropriate axiom leads to a version admitting quantifier elimination as per this table.
  3. T_{\mathbb{R}} is here taken to have a signature of \{0, 1, +, -, \times, =, \geq\} (with - unary) and axioms corresponding to those of a real closed field (theory of reals in SMT-LIB).
  4. T_{\mathbb{Q}} is here taken to have the signature \{0, 1, +, -, = \geq\} (again with - unary). Its axioms are those of an ordered torsion-free abelian group, together with an additional axiom schema asserting divisibility: \forall x. \exists y. x = ny for every positive integer n.
  5. Using the notation of Bradley and Manna, the theory of extensional arrays is designed to capture array data structures. It has the signature \{\cdot[\cdot], \cdot\langle\cdot\vartriangleleft\cdot\rangle, =\}, with the first two symbols denoting a binary and ternary function (respectively) for accessing and modifying array elements; arrays are immutable and so the ternary operator returns the modified array. ‘Extensional’ here denotes that there is an an axiom capturing that arrays are equal iff all their elements are equal in all places. (Theory of ArraysEx theory in SMT-LIB).

Further Comments

A decision procedure for the union of quantifier-free fragments can be obtained by combining the decision procedures for the individual fragments, via the Nelsen-Oppen method, under the following conditions:

  1. Their signatures only share equality.
  2. Their theories must (individually) be stably infinite, i.e. every T-satisifiable quantifier-free formula is satisfied by a T-interpretation with a domain of infinite cardinality.

If deciding each individual theory is in NP, then deciding the combination theory is also in NP.

While investigating the quantifier elimination column of the table above, I came across the Ultimate Eliminator tool which looks like great fun.

Please let me know if you spot any errors or significant omissions that may be of interest to readers of this blog.

Schools White Paper 2022

Today saw the launch of the Department for Education’s long-awaited schools white paper “Opportunity for all: strong schools with great teachers for your child”, alongside some accompanying documents which are well worth reading, especially “The Case for a Fully Trust-Led System”. This post collects some initial thoughts on the content published today. I will focus here on the specific future plans laid out, what they’re trying to achieve, and some elements of what’s missing. I will not engage deeply with areas I know little about such as teacher training, and my reading will of course be biased to my interests in school improvement.

A Fully Trust-Led System by 2030

The paper sets out the Government’s plan for all schools to be in multi-academy trusts within a single regulatory system by 2030 and makes clear that LAs will be allowed to establish MATs as part of this process (a sensible idea, if we’re going down the all-MAT route). I am slightly worried by the wording that LAs will be able to establish MATs “where too few strong trusts exist” – I don’t think special conditions should be placed on this ability. A small capacity fund of £86m has been put aside to help expand MATs – it is not clear from the paper whether LAs can bid for access to this fund too, or how this budget was arrived at.

The white paper calls for a clear role for every part of the school system (LAs, MATs, ESFA, DfE, etc.) but is rather unclear on what it sees these roles as being. I blogged about my own views on this last month (here and here), and Sam Freedman gives his view in an Institute for Government report also last month.

The paper begins to flesh out the government’s idea of what a good MAT looks like. Firstly, on size: “We know that trusts typically start to develop central capacity when they have more than 10 schools … We expect that most trusts will be on a trajectory to either serve a minimum of 7,500 pupils or run at least 10 schools.” They also provide a list of features of a strong trust, as previously discussed in the CST’s useful discussion paper setting out their view.

I welcome the moves to address the isolation of some schools, especially those stuck in a single academy trust model: “Many of our best schools operate alone, and not enough attention has been paid to harnessing the expertise already in the system”. I blogged about some problems with SATs in one of my February posts and last year the government published views of SATs who had joined MATs, citing improved governance, leadership, sense of direction, and ability to direct financial resources where they are needed.

So how will all schools end up in MATs? Clearly the government would like them to choose to join. For those that don’t and are graded inadequate by Ofsted, there is already the expectation in the RSC guidance that they will be (re-)brokered into strong MATs. It seems that these powers are likely to be strengthened: “We will shortly be consulting on moving schools that have received two consecutive below ‘Good’ judgements from Ofsted into strong trusts to tackle underperformance.”

Whatever our views on the relative merits of MATs, the government’s reports make “interesting” use of statistics: “If all children did as well as pupils in a trust performing at the 90th percentile, national performance at key stage 2 would be 14 percentage points higher and 19 percentage points higher for disadvantaged pupils”. Here, “higher” seems to refer to a baseline of the national average rather than of the 90th percentile of LA maintained schools (say) – quite a bizarre use of statistics.

Curriculum and Assessment

The government is proposing to create a new arms-length curriculum body, building on Oak National Academy. This could be of value, but could also see a de-professionalisation of teachers: the devil will be in the implementation. I for one would love to see university engagement with this body, and will try to engage enthusiastically if possible.

I’m disappointed that “we will maintain our current system of primary assessment and world-class GCSEs and A levels”. If there’s some things that the pandemic has brought into sharp relief, they certainly include some of the more unsuitable aspects of our exam system, its lack of robustness to disruption, its reliance on norm referencing and on a huge number of GCSE grades that are of limited reliability even in the best of times.

Much is made of a “parent pledge” that schools will provide evidence-based support “if your child falls behind in English or maths”. But behind in what sense? In attainment compared to year expectations (in which case how does this address the claim to support ‘the most able’) or in progress — but if so, compared to what benchmark? And how will this be identified in any uniform way across schools? This will, apparently, be via “robust assessment” — further guidance will be issued in partnership with Ofsted in due course.


The DfE hopes to increase to 90% the proportion of children achieving expected standard in reading, writing and mathematics at primary, and to increase the national average GCSE grade in English and maths to grade 5, both by 2030. These are consistent with “Mission 5” outlined in the previously-published Levelling-up the UK technical annex. As they note in that annex, it’s important to try to understand the level of ambition embedded in this objective. Unfortunately, despite the annex including a section “Is this mission ambitious, specific and achievable?” the section doesn’t actually provide an argument that it is at an appropriate level of ambition. The best I could find in the white paper is a citation to Slater et al. to argue that significant GCSE grade improvements can come from good teachers, but the same paper also says:

We now finally explore whether any of the few observable teacher characteristics that we have are correlated with estimated teaching effectiveness: gender, age, experience and education…. In fact, none of these variables play any statistically significant role in explaining teacher effectiveness, other than very low levels of experience showing a negative effect.

Helen Slater, Neil M. Davies, Simon Burgess, Do Teachers Matter? Measuring the Variation in Teacher Effectiveness in England

In other words, students gain significantly from good teachers, but spotting good teachers before the event — rather than as a backward-looking explanation — is hard to do. So this seems to support the idea that retention of good teachers and a focus on quality teacher CPD and mentoring is key.

What’s Missing

Quite a lot of information appears to be ‘to follow’. Some examples:

  • On governance. “So that trusts continue to be responsive to parents and local communities, all trusts should have local governance arrangements for their schools. We will discuss how to implement this with the sector.”
  • On MAT inspection. “We will launch a regulatory review in May 2022 looking at accountability and regulation – including how we will hold trusts to account through inspection in the future.”
  • On achieving full academisation. “We … will engage with the sector on how best to achieve a fully trust led system.”
  • On new collaboration expectations. “we will introduce a new collaborative standard – one of the new statutory academy trust standards – requiring that trusts work constructively with each other, their local authorities and the wider public and third sectors. We will engage with the sector, through the wider regulatory review, as we develop the detail.”

In addition, there are several other aspects that I had hoped would be better addressed within the white paper:

  • Teacher retention. There is a focus on mechanisms to attract new teachers, e.g. with the previously-announced £30,000 starting salary, but very little to retain existing good teachers. I believe this is short-sighted.
  • Teacher time. Funding for greater time ‘off timetable’ to engage in the all-important CPD, peer observation, lesson study, etc. has the potential to retain excellent staff as well as improve practice.
  • Capital. We know that many of our schools are in bad shape physically. Now we also know the huge impact that fresh air can have on the health of our staff and students alike, I would like to see a one-off significant cash injection to bring all our schools up to the high standard of ventilation we should expect for a healthy future for our school communities.

I will be following future developments with interest.

Pruning Circuits

On Tuesday, my former PhD student Erwei Wang (now at AMD) will present our recent paper “Logic Shrinkage: Learned FPGA Netlist Sparsity for Efficient Neural Network Inference” at the ACM International Symposium on FPGAs. This is joint work with our collaborator Mohamed Abdelfattah from Cornell Tech as well as James Davis, George-Ilias Stavrou and Peter Y.K. Cheung at Imperial College.

In 2019, I published a paper in Phil. Trans. Royal Soc. A, suggesting that it would be fruitful to explore the possibilities opened up when considering the graph of a Boolean circuit – known as a netlist – as a neural network topology. The same year, in a paper at FCCM 2019 (and then with a follow-up article in IEEE Transactions on Computers), Erwei, James, Peter and I showed how to put some of these ideas into practice by learning the content of the Boolean lookup tables that form the reconfigurable fabric of an FPGA, for a fixed circuit topology.

Our new paper takes these ideas significantly further and actually learns the topology itself, by pruning circuit elements. Those working on deep neural networks will be very familiar with the idea of pruning – removing certain components of a network to achieve a more efficient implementation. Our new paper shows how to apply these ideas to prune circuits made of lookup tables, leaving a simplified circuit capable of highly-efficient inference. Such pruning can consist of reducing the number of inputs of each Boolean LUT and, in the limit, removing the LUT completely from the netlist. We show that very significant savings are possible compared to binarising a neural network, pruning that network, and only then mapping it into reconfigurable logic – the standard approach to date.

We have open-sourced our flow, and our paper has been given a number of ACM reproducibility badges. Please try it out at https://github.com/awai54st/Logic-Shrinkage and let us know what you think. And, if you’re attending FPGA next week, reach out and say hello.

Better accuracy / silicon area tradeoffs through pruning circuits

Islands of Certainty

Readers of this blog may remember that my PhD student Jianyi Cheng (jointly supervised by John Wickerson) has been working on high-level synthesis, combining dynamic scheduling with static scheduling. His latest contribution, to be presented on 28th February at the ACM FPGA conference, is about finding islands of static control flow in a sea of dynamic behaviour.

Here’s the story so far:

So now, two years later, we are back at the same conference to present a method to do just that. We now have an automated flow to select parts of a program to statically schedule, resulting in a 4x reduction in area combined with a 13% boost in performance compared to a fully dynamic circuit, a result that is close to the best achievable — as shown by exhaustively enumerating different parts of the program to schedule statically.

The basic idea of the paper is to develop the concept of a static island — a part of a dataflow graph where making decisions on scheduling of operations once, at compile time, is likely to have minimal impact on performance (or may even improve it) while opening the door to static resource sharing. We can throw a boundary around these islands, synthesise them efficiently with commercial HLS tools (we use Xilinx Vitis HLS), and integrate the result into the overall dynamic circuit using our previous open-source compiler flow.

So what makes a good static island? Unsurprisingly, these islands should exhibit static control flow or control flow with balanced path timing, e.g. in a conditional statement the if and else branch should take the same time, and loops should have constant dependence distances (or none at all). Jianyi also shows that there is an advantage to having these islands consume their inputs at offset-times, e.g. for a two-input island we may wish the static scheduler to be aware that second input is available — on average — two cycles after the first. He shows precisely how to generate ‘wrapper’ circuits for these components, allowing them to communicate with a dynamically scheduled environment.

The overall design flow, shown below, is now fully automated – freeing the user from writing the pragmas we required two years ago.

MATs, SATs and Maintained Schools

This post forms the second in a short series of blog posts about the current governance structure of schools in England, ahead of the Government’s expected while paper this spring. The previous post was about the review of the Education and Skills Funding Agency. In this post I focus on the landscape of school structures: multi-academy trusts (MATs), single-academy trusts (SATs) and local authority (LA) maintained schools from the perspective of a governor / trustee.

I will draw heavily on Sam Freedman’s report this month released by the Institute for Government as well as the excellent discussion in the Confederation of School Trusts’ report “What is a Strong Trust?“. One does not have to completely agree with Freedman’s or the CST’s perspectives to find both reports clear, well-argued, and a very useful step towards a greater debate in the education sector.

Freedman notes what is obvious to anyone who has worked in or with the sector – that the current dual system of maintained schools and academies leads to significant duplication and inefficiency. He notes that the regulatory system for academies is ‘incoherent’, largely as a result of the ESFA/DfE split I dealt with in my last blog post. He mentions – quite rightly – that LAs’ statutory responsibility for SEND and certain safeguarding requirements further complicates the picture when they have no effective oversight or intervention powers over academy trusts.

Early Academisation Errors

My experience is that the rapid expansion of the academies programme after the Academies Act 2010 was mismanaged. We have been left with a patchwork of schools and a web of contractual agreements. Schools which converted early have often been left with legacy articles of association based on the early model articles which demonstrated little insight into how schools could evolve under poor governance (modern model articles are much improved, though not perfect). Regulators have been left with very limited powers to improve matters, and it should not have come as a surprise to government that – as Freedman states – “[By 2012/3] with things moving so fast it quickly became apparent that the Department for Education (DfE) was becoming overwhelmed and could not properly oversee that many schools.” The unfortunate reality is that many schools are still stuck with the legacy of poor decisions made during this period.

Three School Structures

There are currently three main models for schools in England: those which are part of a multi-academy trust (MAT), those which are a single-academy trust (SAT) and those maintained by local authorities. Often in these discussions the first two types of school are lumped together, and it becomes a discussion about academies versus non-academies, but I think these three situations deserve distinct analysis. In particular, oversight of an individual school’s performance in the maintained sector and in the MAT sector is, in my experience, stronger than in the SAT sector, which is the outlier in terms of sufficient oversight. I wonder how many SATs are maintaining an Ofsted grade above inadequate, and therefore not subject to intervention, but are nevertheless not performing at the level they might be if they had closer interaction with other schools.

It is clear to anyone who has worked with or for the education sector that schools benefit immensely from working together and supporting each other, and I agree with Leora Cruddas’s argument made at a governance meeting I attended last year that, in times of difficulty for a school, a compulsion to work with other schools is important. At the moment, this primarily comes through membership of a strong multi-academy trust, though I do not see why a strong, well-resourced and empowered Local Authority could not form an equally viable mechanism to drive school improvement.

Pragmatics: Control and Intervention

Unsurprisingly, Freedman’s paper seeks to advise Zahawi on an appropriate way forward to a more coherent fully academised education system, and without entering the discussion over whether academy trusts are the best structure to deliver school education in the first place, it is worth engaging with the recommendations he makes.

Freedman sees the future of LAs – as per the 2016 white paper – as ensuring that every child has a school place, ensuring the needs of vulnerable children are met, and acting as champions for all parents and families, and – quite reasonably – proposes greater powers for LAs to ensure they can actually fulfil these objectives.

There are some proposals made in the paper that I would fully support:

  • Setting a transparent framework of expectations for MATs and giving the regulator powers to intervene, close or merge MATs for either financial/compliance reasons or educational reasons, not only tied to Ofsted grades.
  • Ensuring that MATs take ownership of educational improvement and are not simply back-office consolidation bodies as is sometimes the case currently.
  • Giving local authorities the right of access to MAT data.
  • A single regulator for academies, ideally organised as an arm’s length body.

There are also some proposals that are less clear cut for me:

  • Giving LAs the power to direct a change in individual academy PANs and admissions policies. Let’s assume that we move to an “all-MAT” system with LA’s still retaining the statutory duty to ensure a place for every pupil. To ensure clear lines of accountability, it seems appropriate for these to be at MAT level not individual academy level: surely mechanisms can still be put in place for intervention at MAT level to ensure they play their part in the strategic place-planning of LAs, rather than micromanaging a MAT’s academies over the head of its trustee board?
  • Moving SEND and exclusions policy monitoring / appeals from ESFA to LAs. I agree that ESFA is an odd place for this to sit at the moment in terms of ensuring joined up working between the RSC offices, ESFA and the Local Authority. But moving this to LAs rather than to the DfE again seems to introduce dual lines of accountability for MATs; might it not be better for RSCs to be required to ensure MATs meet the LA’s planning needs?
  • Giving individual academies a mechanism to argue to ‘break free’ from a MAT, involving giving schools legal status independent from the MAT. I agree that there may be very good reasons for academies to want to move to another MAT if the MAT is not functioning well, however under an all-MAT system it seems that a more appropriate approach is to provide the regulator with powers to intervene at MAT level than to provide independent legal status to individual academies.

There is an important question of democratic control, which I believe is required to balance some of these suggestions. In particular: who gets to appoint trustees (and members) of an academy trust, and what geographical area is it reasonable for a MAT to cover? On the first point, in the early days of academisation, academies needed to have some staff trustees, some (elected) parent trustees and a trustee appointed by the Local Authority. The Secretary of State was empowered to change the composition under specific conditions laid out in the academy’s funding agreement / articles of association. Government views on this composition has changed over time, with parent trustees going out of and then back into fashion, while staff trustees are definitely out of fashion at the moment. Local Authorities no longer get to appoint trustees at all in recent model articles. The situation locally will vary from trust to trust, depending on when their articles were approved — differences that cannot be justified, in my view. I would suggest that trusts articles are updated and that the ability (though not the requirement) for local authorities and the DfE (via RSC offices) to appoint trustees is included in the new model. This would provide LAs and the DfE direct information on trusts, rather than having to rely on existing trust boards to provide accurate information, in addition to providing a powerful mechanism for spreading best practice across trusts.

There is a huge opportunity for development of the schools sector in England. I look forward to publication of the white paper!

Appendix: Minor Quibbles

It’s probably worth pointing out a couple of very minor inaccuracies in the Institute for Government report:

  • Financial Notices to Improve, mentioned in the report, no longer exist since September 2021, precisely in recognition of the broader ESFA role currently; they are now subsumed within the broader “Notice to Improve” title.
  • A few times in the report, the ‘Special Measures’ category is cited as giving the Regional Schools Commissioners power to re-broker academies. While there may be additional powers – depend on the trust’s funding agreement – under a Special Measures Ofsted category, it’s clear in the Schools Causing Concern guidance that RSCs have the power to re-broker any inadequate school, i.e. also those judged to have ‘Serious Weaknesses’.