I was delighted to be asked by Axel Jantsch to speak recently at the opening of the new Christian Doppler Laboratory (CDL) for Embedded Machine Learning. This is a new collaborative laboratory between TU Wien (led by Jantsch) and TU Graz (Bischof) as well as several strong industrial partners from Germany and Austria. Its scope and content is very closely aligned to the EPSRC Center for Spatial Computational Learning, which I launched last November, the latter bringing together Imperial and Southampton in the UK with Toronto in Canada, UCLA in the USA, and – just announced – Sydney in Australia. I was therefore delighted to bring fraternal greetings from our centre, and to begin discussions over how we could work together in the future to build a truly global collaborative research effort in this space.
In addition to hearing about the work plan and objectives for the new CDL, the meeting heard from three external academics (Christoph Lampert, Bernt Schiele and myself) on their recent relevant research. I spoke about the work I recently published in my article in the Philosophical Transactions of the Royal Society. I found both Christoph and Bernt’s talks inspiring – I briefly summarise the key points of interest to me, below.
Christoph Lampert from IST spoke on Embedded and Adaptive Models for Visual Scene Understanding. He explained that while others are working on more efficient hardware and more efficient ML models, his group is focusing on matching models to problems and making models adaptive to the environment in which they’re applied. With this in mind, he focused on two recent papers, one from WACV 2020 and one from ICCV 2019. The WACV paper is on object detection, proposing compute-efficient method that avoids the twin pitfalls of proposal-based systems like Faster R-CNN and grid-based methods like YOLO. The ICCV paper is on multi-exit architectures, where latency constraints can cause an early termination of deep neural network inference computation and we want to have a useful result still in these circumstances. The paper discusses how to train such networks using ideas from Hinton’s et al.’s “Knowledge Distillation”. This also reminded me of some work from our research group [1,2] featuring early exits or a focus on getting good results quickly in latency-constrained systems.
Bernt Schiele from MPI and Saarland University spoke on Bright and Dark Sides of Computer Vision and Machine Learning. He spoke about several very interesting topics, including scene context (e.g. ‘Not Using the Car to See the Sidewalk‘, CVPR 2019), Disentangling Adversarial Robustness and Generalization (CVPR 2019) and reverse engineering and stealing deep models (e.g. ‘Knock-off Nets’, CVPR 2019). All are very interesting and timely topics, but the work on disentangling adversarial robustness and generalisation was particularly interesting to me, since I’ve given the topic of generalisation some thought in the context of efficient DNN accelerator hardware. Schiele argued for a stronger distinction to be made in the research community between different classes of adversarial examples — he focused on the idea of “on-manifold adversarial examples” where an adversarial example needs to actually be a correct instance of the class, rather than an arbitrary perturbation of an image — the latter commonly used in the literature, which Schiele referred to as a ‘regular’ adversarial example. His talk showed how on-manifold examples could be studied. The main take-home messages were that ‘regular’ robustness and generalisation are not contradictory, but that ‘on-manifold’ adversarial examples can be found, and such robustness in this instance is generalisation.