Last month, the Royal Society's Phil Trans A published my paper Rethinking Arithmetic for Deep Neural Networks, part of a special issue put together by Jack Dongarra, Laura Grigori and Nick Higham. In this blog post I aim to give a brief introduction to the ideas in the paper; the paper itself is open access, so please read it for further detail. In addition, my slides from a talk on an early version of this work are available from Nick Higham's blog, and an MP3 recording of me talking through those slides has been made available by the Royal Society here.
The central theme of the paper is that hardware accelerator circuits for neural networks can themselves be treated as neural networks. Consider the two graphs below. One of them represents a simple deep neural network where each node performs an inner product followed by a ReLU operation. The other represents the result of (i) deciding to use 4-bit fixed-point arithmetic, and then (ii) synthesising the resulting network into a circuit, technology-mapped to 2-input Boolean gates.
Although there are obvious differences (in structure and in number of nodes), there is a core commonality: a computation described as a graph of parameterisable functional nodes.
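To make step (i) a little more concrete, here is a minimal sketch of my own (illustrative only, not code from the paper or its synthesis flow) of a single inner-product-plus-ReLU node evaluated in an assumed 4-bit signed fixed-point format with one fractional bit. Synthesising the quantised network into 2-input gates is what turns the first graph into the second.

```python
# Illustrative sketch: one neural-network node (inner product + ReLU)
# evaluated in an assumed 4-bit two's-complement fixed-point format
# with 1 fractional bit, i.e. representable values k * 0.5 for k in [-8, 7].
import numpy as np

FRAC_BITS = 1          # assumed format: 4 bits total, 1 fractional
SCALE = 2 ** FRAC_BITS
QMIN, QMAX = -8, 7     # 4-bit two's-complement integer range

def quantise(x):
    """Round to the nearest representable value and saturate."""
    q = np.clip(np.round(np.asarray(x, dtype=float) * SCALE), QMIN, QMAX)
    return q / SCALE

def node_fixed_point(weights, inputs):
    w = quantise(weights)
    x = quantise(inputs)
    acc = quantise(np.dot(w, x))   # re-quantise the accumulated sum
    return max(float(acc), 0.0)    # ReLU

print(node_fixed_point([0.5, -1.25, 2.0], [1.0, 0.75, -0.5]))
```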
So what can we gain from this perspective?
1. Binarized Neural Networks are universal. The paper proves that any network you want to compute can be computed by a binarized neural network with zero loss in accuracy. It is simply not the case that some problems inherently need high precision. As a corollary, though, you must be willing to move away from the original network topology if you want a guarantee of no accuracy loss. (A small concrete example of a binarized network appears after this list.)
2. Boolean topologies are tricky things. So if we are free to rethink topologies, what first principles should guide us? In the paper, I suggest looking to the theory of metric spaces for inspiration, as one step towards producing networks that generalise well. Topology, node functionality, and input/output encoding interact in subtle, interesting, and under-explored ways.
3. This viewpoint pays practical dividends. My PhD student Erwei Wang, our collaborators James Davis and Peter Cheung, and I have developed a neural network design flow called LUTNet, which uses Boolean lookup tables as the main computational node in a neural network, leading to very efficient FPGA implementations. (A sketch of such a lookup-table node follows the binarized-network example below.)
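As a flavour of the universality point, here is a minimal sketch (my own illustration, not taken from the paper) of a tiny binarized network, with weights in {-1, +1}, integer biases and sign activations, that computes XOR exactly under the usual {-1, +1} encoding of Boolean values:

```python
# Illustrative sketch: a binarized network (weights in {-1, +1},
# integer biases, sign activations) computing XOR exactly.
# Boolean false/true are encoded as -1/+1.
import numpy as np

def sign(x):
    return np.where(x >= 0, 1, -1)

def bnn_xor(x1, x2):
    x = np.array([x1, x2])
    # Hidden layer: first neuron acts as AND, second as OR.
    W_h = np.array([[1, 1],
                    [1, 1]])
    b_h = np.array([-1, 1])
    h = sign(W_h @ x + b_h)
    # Output neuron: OR AND (NOT AND), i.e. XOR.
    w_o = np.array([-1, 1])
    b_o = -1
    return int(sign(w_o @ h + b_o))

for a in (-1, 1):
    for b in (-1, 1):
        print(a, b, "->", bnn_xor(a, b))
```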
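And to give a flavour of the third point, the sketch below (again my own hypothetical illustration, not the LUTNet code) shows a K-input lookup-table node: rather than an inner product, the node is an arbitrary Boolean function of its K inputs, stored as a 2^K-entry truth table whose entries play the role of trainable parameters.

```python
# Illustrative sketch: a K-input Boolean lookup-table node.
# The 2**K table entries are the node's parameters, so the node can
# realise any Boolean function of its K inputs.
import numpy as np

class LUTNode:
    def __init__(self, k, table=None):
        self.k = k
        if table is None:
            # Start from a random Boolean function of K inputs.
            table = np.random.default_rng(0).integers(0, 2, size=2 ** k)
        self.table = np.asarray(table)

    def __call__(self, bits):
        # Interpret the K input bits as an index into the truth table.
        index = 0
        for b in bits:
            index = (index << 1) | int(b)
        return int(self.table[index])

node = LUTNode(k=4)
print(node([1, 0, 1, 1]))   # one of the 2**16 possible 4-input functions
```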
I’m hugely excited by where this work could go, as well as the breadth of the fields it draws on for inspiration. Please do get in touch if you would like to collaborate on any of the open questions in this paper, or any other topic inspired by this work.