In the foundational conversation kicking off the “Intelligence at the Edge” series, I sat down with Altaf Khan, Chief Executive Officer of Infxl, whose company develops hardware-agnostic, ultra-low-power, ultra-low-latency machine learning (ML) solutions for the edge. In plain terms, his team has found a way to mechanically draw a conclusion from a shed load of digital data a hell of a lot faster than it takes you to read this sentence. Let’s take a deeper dive into why edge processing is fast becoming a necessity, even for cloud-based solutions.
The case for edge processing
The key benefits of receiving computing capabilities through the cloud stem from accessibility – I needn’t be weighed down with large, fixed, on-site computing infrastructure to fulfil my data-processing needs when I can rent compute power from a cloud provider and process my data in their infrastructure. Whether I should choose cloud, hybrid-cloud, or on-premises computing is a separate strategic conversation, and the cost-benefit balance varies from use case to use case, but the benefits of cloud are clear and well known. It is important to note, however, that the cloud, contrary to its name, is not floating in the air around us. It is in fact a physical set of highly capable servers and data centres housed somewhere in the world, with advanced and wide-reaching network capabilities that allow users to tap into that processing power remotely, usually through subscription packages offered by cloud providers. So, despite the impressive scalability of data processing in the cloud, the vast majority (if not all) of cloud providers will struggle to process the vast volumes of complex data arriving from a growing number of connected devices that use, and will use, data analytics technologies like ML. Thus an inevitable case for ML inference at the edge emerges, which can best be explained through a classic “chain-of-command” analogy that Altaf and I used during our conversation.
An army general sitting behind the front lines cannot (and needn’t) receive and mentally process every single action taking place on the battlefield. While the foot soldiers in the field (at the edge) must process every bullet and explosion to carry out their function, all of this information must be processed, contextualised, and simplified (inference), and key insights extracted, before the relevant findings are relayed to the general (who, for the purpose of this analogy, is the cloud server). ML inference at the edge is the natural next step for processing the ever-expanding volume of data produced by edge devices – the cloud server that was once a foot soldier receiving huge volumes of raw data from the field must now be promoted up a rank to act more strategically on key pieces of information passed up from a rapidly growing number of edge devices. At the risk of trivialising the technology behind it, ML is quite simply the automation of converting raw data into knowledge and wisdom. As more data is produced by devices at the edge, it falls to the ML running on those devices to make them intelligent – that is, to ensure that what they pass up the chain of command is not raw data but actionable insights, so that the general can focus on strategy without being overwhelmed by the volume of incoming data.
To put this analogy into a real-world example …
Let’s consider a domestic cat sensor that opens a door flap only when it sees a cat (incidentally, the world of ML is obsessed with images of cats). Such a device needs a camera and a model that can recognise a cat (better yet, your own cat) as opposed to the neighbour’s dog. The inference is the act of deciding whether the image in front of the sensor is your cat or not; image recognition is one of the big applied domains of ML at the moment. In the cloud-inference model (that is, inference in the cloud), all the raw image pixel data would be transmitted to the cloud. The servers would then run the image through their trained, pre-loaded model of a cat and decide whether the figure in the image meets the criteria. If yes, the cloud servers would send a signal back to the cat door to open; if not, no signal would be necessary. That is an awful lot of pictures being sent to the cloud for what essentially needs to be an output of “open” or “stay shut”. In the edge-inference model, the decision is made on the device itself, and only the outcome – a tiny “cat came in” event rather than the raw images – is sent up to the cloud. This saves telecommunication and compute costs for the cloud company, and lets it focus on higher-level questions such as “if we know what time of day, and how many times, cats enter and leave their owners’ houses, what could we do with that information? Could we use it to expand our suite of services for the cat market?”.
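To make the contrast concrete, here is a minimal Python sketch of the edge-inference loop for the cat-flap example. It is not Infxl’s actual API – the camera, the classifier, and the cloud call are all hypothetical stubs – but it shows the key structural point: the raw pixels never leave the device, and only a tiny event is reported upstream.

```python
"""Minimal sketch (hypothetical names, not Infxl's API) of edge inference
for a cat flap: classify locally, send only the outcome to the cloud."""

import time
from typing import Iterable, List


def classify_frame_on_device(frame: List[float]) -> bool:
    """Stand-in for a tiny on-device classifier (MCU/DSP/FPGA).
    Here it just thresholds a fake 'cat score'; a real device would run
    a small trained image model over the pixels."""
    cat_score = sum(frame) / len(frame)
    return cat_score > 0.5


def open_flap() -> None:
    """Drive the actuator that unlocks the flap (stubbed with a print)."""
    print("flap: open")


def report_event_to_cloud(payload: dict) -> None:
    """With edge inference, only this tiny event (a few bytes) goes to the
    cloud -- never the raw image."""
    print(f"cloud <- {payload}")


def edge_loop(camera: Iterable[List[float]]) -> None:
    for frame in camera:                      # raw pixels stay local
        if classify_frame_on_device(frame):   # inference at the edge
            open_flap()
            report_event_to_cloud({"event": "cat_entered", "ts": time.time()})


if __name__ == "__main__":
    # Two fake frames: one "no cat", one "cat".
    fake_camera = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
    edge_loop(fake_camera)
```

In the cloud-inference model, the `if` branch would instead upload the whole frame and wait for the server’s verdict – the structure of the loop is the same, but the payload crossing the network is orders of magnitude larger.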
The vendor market enabling this functionality at the edge is heating up, and there are several factors to address in order to realise this new chain of command.
First and foremost, the edge devices need to be energy-efficient – everyone is talking about planetary sustainability, and it only makes sense to mass-distribute edge capabilities if doing so isn’t going to hasten the damage. Altaf and the team at Infxl have achieved some impressive results here. As he put it: “… in our work with Microchip’s PolarFire FPGA, we were trying to build an IoT system with an ML solution consisting of 130 neurons. That solution consumed 57 nano joules per inference. And if you think about what a nano joule is, you know these AAA alkaline batteries, the really tiny ones, the thin ones, they store 5000 joules. So, we were doing inference on 57 nano joules. That means that you know, if that’s the only thing that you’re doing, that battery would last until the end of time”. Infxl also designs its edge ML to be hardware-agnostic, meaning the inference engine runs on almost anything available. This comes from the fact that the engine is simple and compact, and works across a range of devices, whether that’s an MCU, a DSP, or an FPGA. This gives customers the flexibility to use, for example, an FPGA if they want to run Infxl’s ML model very fast, or a 20-year-old MCU if they want to run it very cheaply.
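To get a feel for the figures Altaf quotes above, here is a quick back-of-the-envelope check. It counts inference energy only and ignores everything else a real device spends power on (sensing, radio, idle draw, battery self-discharge), so it is an upper bound rather than a battery-life prediction.

```python
# Back-of-the-envelope check of the quoted figures (inference energy only).

battery_energy_j = 5000          # ~one AAA alkaline cell, as quoted
energy_per_inference_j = 57e-9   # 57 nanojoules per inference

inferences = battery_energy_j / energy_per_inference_j
print(f"{inferences:.2e} inferences per battery")      # ~8.8e10 inferences

# At one inference per second, nonstop:
years = inferences / (60 * 60 * 24 * 365)
print(f"~{years:,.0f} years of continuous operation")  # roughly 2,800 years
```

Roughly 88 billion inferences per battery – which is why, if inference were the only load, the battery would indeed outlast the device (and, practically speaking, the owner).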
The end goal for edge companies like Infxl is to supply edge processing capabilities for real-world use cases. You can find Infxl’s case studies below for