Readers of this blog will know that I have been interested in how to bridge the worlds of Boolean logic and machine learning, ever since I published a position paper in 2019 arguing that this was the key to hardware-efficient ML.
Since then, I have been working on these ideas with several of my PhD students and collaborators; most recently, my PhD student Marta Andronic's work forms the leading edge of the rapidly growing area of LUT-based neural networks (see previous blog posts). Central to both Marta's PolyLUT and NeuraLUT work (and also to LogicNets from AMD/Xilinx) is the idea that one should train Boolean truth tables (which we call L-LUTs, for logical LUTs) which are then, for an FPGA implementation, mapped into the underlying soft logic (which we call P-LUTs, for physical LUTs).
Last summer, Marta and I had the pleasure of supervising a bright undergraduate student at Imperial, Olly Cassidy, who worked on adapting, to our setting of efficient LUT-based machine learning, ideas for compressing large lookup tables developed by my friend and colleague Kia Bazargan and his student Alireza Khataei at the University of Minnesota. Olly's paper describing his summer project has been accepted at FPGA 2025 – the first time I've had the pleasure of sending a second-year undergraduate student to a major international conference to present their work! In this blog post, I give a simple introduction to Olly's work and explain my view of one of its most interesting aspects, ahead of the conference.
A key question in the various LUT-based machine learning frameworks we have introduced is how to parameterise the space of functions implemented in the LUTs. Our first work in this area, LUTNet, with my former PhD student Erwei Wang (now with AMD), took a fully general approach: if you want to learn a K-input Boolean function, then learn all 2^K lines in that function's truth table. Since then, Marta and I have been exploring ways of parameterising that space to decouple the complexity of the function classes implemented from the number of inputs. This gave rise to PolyLUT (parameterised as polynomials) and NeuraLUT (parameterised as small neural networks). Once we have learnt a function, all these methods enumerate its inputs over the discrete space of quantised activations to produce the L-LUT. Olly's work introduces 'don't cares' into the picture: if a particular combination of inputs to the function is never, or rarely, seen in the training data, then the optimisation is allowed to treat the function's value as a don't care at that point.
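To make the enumeration step concrete, here is a minimal Python sketch of the idea (my own illustration, not code from NeuraLUT, PolyLUT, or Olly's framework): a learned sub-function with K quantised inputs of B bits each is evaluated at every possible input pattern to fill the L-LUT, and any pattern that never (or rarely) appears when the training set is pushed through the network is marked as a don't care. The names `build_llut`, `learned_fn`, `training_inputs` and `rare_threshold` are all hypothetical.

```python
# A minimal sketch (not the actual NeuraLUT/PolyLUT code) of turning a learned
# sub-function into an L-LUT with don't-care annotations.
# Assumptions: the learned function takes K quantised inputs of B bits each and
# returns a quantised output; `training_inputs` holds the input patterns observed
# at this sub-function when the training set is pushed through the network.
from itertools import product
from collections import Counter

def build_llut(learned_fn, K, B, training_inputs, rare_threshold=0):
    """Enumerate all (2**B)**K quantised input patterns of a learned
    sub-function, recording a don't care wherever the pattern is never
    (or only rarely) observed in the training data."""
    seen_counts = Counter(tuple(x) for x in training_inputs)
    levels = range(2 ** B)          # quantised activation levels per input
    llut = {}                       # pattern -> output value, or None for don't care
    for pattern in product(levels, repeat=K):
        if seen_counts[pattern] <= rare_threshold:
            llut[pattern] = None    # don't care: downstream optimisation may choose freely
        else:
            llut[pattern] = learned_fn(pattern)
    return llut
```

Note that the table grows as (2^B)^K entries, which is exactly why compressing the resulting L-LUTs before technology mapping is attractive.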
Olly picked up CompressedLUT from Khataei and Bazargan and investigated injecting don't care conditions into its decomposition process. The results are quite impressive: up to a 39% drop in the P-LUTs (area) required to implement the L-LUTs, with near-zero loss in the classification accuracy of the resulting neural network.
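For intuition about why don't cares help a decomposition-based compressor, the toy sketch below (a deliberate simplification of my own, not the actual CompressedLUT algorithm) splits a flattened L-LUT into fixed-size sub-tables and greedily merges any two sub-tables that agree everywhere both are specified. Each don't care acts as a wildcard, so more sub-tables become mergeable and fewer physical tables survive mapping. All function names here are hypothetical.

```python
# A toy illustration (my own simplification, not CompressedLUT itself) of why
# don't cares help: sub-tables that differ only at don't-care positions can be
# merged into a single shared table.
def compatible(sub_a, sub_b):
    """Two sub-tables are compatible if they agree wherever both are specified
    (None marks a don't care)."""
    return all(a is None or b is None or a == b for a, b in zip(sub_a, sub_b))

def merge(sub_a, sub_b):
    """Fill each don't care with the other sub-table's value where available."""
    return tuple(a if a is not None else b for a, b in zip(sub_a, sub_b))

def dedupe_subtables(flat_llut, sub_size):
    """Split a flattened L-LUT into fixed-size sub-tables and greedily merge
    compatible ones; fewer unique sub-tables means fewer P-LUTs after mapping."""
    subs = [tuple(flat_llut[i:i + sub_size]) for i in range(0, len(flat_llut), sub_size)]
    unique, index = [], []
    for s in subs:
        for j, u in enumerate(unique):
            if compatible(s, u):
                unique[j] = merge(u, s)   # wildcards let distinct sub-tables collapse
                index.append(j)
                break
        else:
            unique.append(s)
            index.append(len(unique) - 1)
    return unique, index                  # shared sub-tables plus a pointer table
```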
To my mind, one of the most interesting aspects of Olly's summer work is the observation that aggressively targeting FPGA area reduction through don't care conditions, without explicitly modelling the impact on accuracy, nevertheless has a negligible or even a positive effect on test accuracy. I read this as a demonstration (i) that the generalisation capability of the LUT-based network is built into the topology of the NeuraLUT network, and (ii) that, in line with Occam's razor, simple representations – in this case, simple circuits – generalise better.
Our group is very proud of Olly!
