When people talk about the challenges facing developers of increasingly complex and capable generative models, they most often mention the enormous energy demands of the specialized servers on which these models usually run. Of course, models that can run locally on PCs and even on smartphones have been under active development recently. However, they are less capable than the cloud versions, which far surpass them in the number of parameters, and even in this reduced form they still consume more energy than classic (pre-generative-era) computer programs and mobile applications. The desire to equip every robot vacuum cleaner, lawn mower, or even smart Internet of Things sensor with some kind of AI leaves developers of such devices with a dilemma: either deliberately cut down the functionality of locally executed models to fit the capabilities of the von Neumann hardware available on such devices, or abandon the desire for autonomy in principle and rely on advanced networks such as 5G (and, in the not too distant future, 6G) to communicate with “truly smart” cloud AI.
And although the second path is the more predictable one, since it relies on already well-developed technologies, many experts consider it a dead end in terms of energy costs. According to the International Energy Agency, a typical request to a web search engine costs the data center that serves it approximately 0.3 Wh, whereas the same request addressed to ChatGPT costs 2.9 Wh (and there is no guarantee that the user will receive a truthful answer rather than yet another generative hallucination). The Google search engine alone receives about 9 billion requests every day worldwide, and if all of them were processed by smart bots, global electricity consumption would grow by about 10 TWh per year, roughly what 1.5 million Europeans consume on average over the same year. One can only imagine how much the demand for electricity will grow if cloud-based generative models start being regularly accessed by smart devices, which are already numerous around us and will only multiply every year. In addition, providing reliable and universally available wireless communications for such machine-to-machine connections is also likely to cost operators a pretty penny.
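The arithmetic behind these figures can be checked with a short sketch. The per-request energies, the daily request count and the 1.5 million figure are the ones quoted above; the 365-day year and all variable names are assumptions made for the illustration.

```python
# Figures quoted in the text (attributed there to the International Energy Agency).
SEARCH_WH = 0.3          # energy per classic web search request, Wh
CHATBOT_WH = 2.9         # energy per ChatGPT request, Wh
REQUESTS_PER_DAY = 9e9   # daily Google search requests
EUROPEANS = 1.5e6        # population used for the comparison
DAYS_PER_YEAR = 365      # assumption: a non-leap year


def yearly_twh(wh_per_request: float) -> float:
    """Yearly energy in TWh if every daily request costs wh_per_request."""
    return wh_per_request * REQUESTS_PER_DAY * DAYS_PER_YEAR / 1e12


# Total if every request were answered by a chatbot.
chatbot_total_twh = yearly_twh(CHATBOT_WH)
# Increase over what classic search already consumes.
extra_twh = yearly_twh(CHATBOT_WH - SEARCH_WH)
# Per-person yearly consumption implied by "10 TWh for 1.5 million Europeans".
implied_mwh_per_person = 10e12 / EUROPEANS / 1e6

print(f"chatbot total:  {chatbot_total_twh:.2f} TWh/year")
print(f"net increase:   {extra_twh:.2f} TWh/year")
print(f"implied demand: {implied_mwh_per_person:.2f} MWh per person per year")
```

Both the total and the net increase come out close to the "about 10 TWh" quoted in the text, so the estimate is consistent to within rounding.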
⇡#It will be easier to complicate
In previous articles in this series on neuromorphic systems, we have repeatedly pointed out the main difference between von Neumann computers and biological nervous tissue: the fundamental, physical separation of data storage from the processor. The bus between the processor and RAM can be made as wide as desired (although the more channels it has, the more expensive and the less reliable the design becomes), yet moving information from RAM to the CPU and back will still take time and, above all, a lot of energy: 2-3 decimal orders of magnitude more than a typical computational operation on the processor itself. This matters all the more given the volumes of data that modern generative models with hundreds of billions of parameters have to sift through. Matrix multiplication (to which, from a mathematical point of view, the work of an artificial neural network essentially reduces) is hardly rocket science, but a von Neumann computer is clearly not the optimal tool for it, especially when both factors are very large.
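A minimal sketch of why data movement dominates: the function below computes one dense layer as a matrix-vector product and counts multiply-accumulate operations and weight fetches. The assumption that every weight is fetched from RAM exactly once (no caching), the arbitrary energy unit and the names `dense_layer` and `movement_share` are all illustrative; only the 100x to 1000x ratio comes from the text.

```python
def dense_layer(weights, inputs):
    """Compute y = W x with explicit loops, counting MACs and weight reads."""
    macs = 0
    weight_reads = 0
    outputs = []
    for row in weights:
        acc = 0.0
        for w, x in zip(row, inputs):
            weight_reads += 1   # assumed: each weight comes from RAM once
            acc += w * x        # one multiply-accumulate on the processor
            macs += 1
        outputs.append(acc)
    return outputs, macs, weight_reads


def movement_share(macs, weight_reads, ratio):
    """Share of total energy spent on moving data, in arbitrary units.

    One MAC costs 1 unit; one memory read costs `ratio` units (assumed model).
    """
    compute = macs * 1.0
    movement = weight_reads * float(ratio)
    return movement / (movement + compute)


W = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
x = [1.0, 0.5, -1.0]
y, macs, reads = dense_layer(W, x)

for ratio in (100, 1000):   # the "2-3 decimal orders of magnitude" from the text
    print(ratio, round(movement_share(macs, reads, ratio), 4))
```

Under these assumptions, more than 99% of the energy goes into fetching weights rather than into arithmetic, regardless of the layer size.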
Biological nervous tissue, by contrast, stores and processes signals in the same hardware nodes, which prompted Carver Mead back in 1990 to introduce the term “neuromorphic”, i.e. “functionally similar to nervous tissue”, for computing systems that looked promising in this respect but did not yet exist at the time. In those same years the idea of “neuromorphic engineering” took shape: a field of interdisciplinary research at the intersection of data science, microelectronics, neurobiology and many other areas, whose main task is to build computing systems that, at the hardware level, either emulate the workings of living nervous tissue one to one or at least borrow in their design the optimal solutions suggested by nature. The spiking neural networks (SNNs) that we examined in this series are one such solution: designed to process sequences of signals, they reproduce the functionality of biological neurons fairly closely.
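To make the idea of a spiking neuron concrete, here is a minimal sketch of a leaky integrate-and-fire neuron, one common textbook simplification. It is not the model of any particular chip mentioned in this article, and the threshold, leak factor and function name are assumed values chosen for the illustration.

```python
def lif_neuron(inputs, threshold=1.0, leak=0.9, reset=0.0):
    """Leaky integrate-and-fire neuron: returns a 0/1 spike train.

    The membrane potential `v` is the neuron's local memory: it accumulates
    incoming signals, decays over time and is reset after each spike.
    """
    v = reset
    spikes = []
    for current in inputs:
        v = leak * v + current      # integrate the input, with leakage
        if v >= threshold:
            spikes.append(1)        # fire ...
            v = reset               # ... and return to the resting state
        else:
            spikes.append(0)
    return spikes


print(lif_neuron([0.4, 0.4, 0.4, 0.0, 0.0, 0.9, 0.3]))
# -> [0, 0, 1, 0, 0, 0, 1]
```

The output depends on the timing of the inputs, not just on their sum, and the state lives inside the neuron itself rather than in a separate memory.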
The ability of von Neumann systems to emulate fundamentally different computing architectures greatly helps the development of neuromorphics: digital models are a good way to work out the general principles of how new devices are organized and operate. However, to truly realize the advantages of SNNs, the appropriate hardware has to be built, starting, of course, with semiconductor systems, since their production technologies are currently the most mature. Neuromorphic processors such as Loihi (now in its second generation), ODIN, SpiNNaker, Xylo, Akida and many others, whose circuits (including local memory for storing the previous states of individual neurons, as well as signal delay circuits and weight adjustment circuits in artificial synapses) are formed by silicon micro- and nanostructures, already demonstrate outstanding results in the tasks for which they are best suited. True, for the most part such computers are currently available either as laboratory prototypes or, at best, as small-batch products with a limited scope of application. We gave the reasons for this at the end of the previous article, and (setting aside the purely practical difficulties of implementing neuromorphic systems in hardware) perhaps two of them are the main ones: a rather limited set of software, and the sheer complexity of the field itself, which sharply narrows the pool of specialists ready to work in it.
Still, the potential benefits of neuromorphic computing keep researchers busy. In addition to striking energy efficiency (especially against the backdrop of power-hungry server GPUs) and, in theory, extremely high performance thanks to combining the computing node and memory in a single physical unit, without a split into CPU and RAM joined by a relatively narrow bus, experts point to at least two more. The first of these additional benefits is the bright promise of parallel processing on asynchronous SNNs: in essence, a neuromorphic system can handle as many information transformation streams simultaneously as it has artificial neurons. The second is the extremely high adaptability (neurophysiology uses the term “plasticity”) of neuromorphic computers to the changing nature of the problems they solve. The self-learning capabilities of systems whose very operating principle is borrowed from biological nervous tissue are potentially so great that many researchers seriously doubt whether truly smart robots, those that respond adequately and in a timely manner to the changing realities of the surrounding world, can appear before it becomes possible to equip them with on-board neuromorphic analytical systems.
⇡#Hysteresis, phase shift
Although classical semiconductor technologies are generally suitable for building hardware neuromorphic systems, they have a number of serious limitations. In particular, silicon neuromorphic computers cannot be scaled down toward single-digit nominal nanometers as easily as x86 or RISC processors can, if only because an SNN artificial neuron must include at least a small number of memory cells. Placing memory on the same silicon substrate next to the processing unit is a trivial task; SRAM cache, which has been an integral component of central processors for several decades, is an example. However, as readers of our series of articles on the challenges of semiconductor production should remember, memory cells are much harder to miniaturize than logic circuits, so it is unlikely that the process node of semiconductor neuromorphic chips can be shrunk significantly from the current level of approximately “28 nm” at minimal cost. In other words, at the current stage of development semiconductor neuromorphics clearly outperforms platforms that are more exotic in terms of hardware implementation, simply thanks to the maturity of the production processes used to manufacture it. However, once these exotic platforms gain sufficient momentum, they have every chance of catching up with and overtaking silicon neuromorphic systems, simply because, without a sharp reduction in process node, silicon will find it hard to compete with other architectures in both energy efficiency and performance.
Closest to semiconductor implementations of neuromorphic computers are, perhaps, memristor-based ones. They are built not on classical transistors (whose gates let charge through or block it, “open” or “close”, under the influence of a control voltage) but on memristors (from memory + resistor), whose electrical conductivity is changed by the current passing through them. The behavior of a memristor is governed by its inherent hysteresis: the instantaneous response of the system to a stimulus (in this case, to the current passing through it) depends not only on the strength of the stimulus itself but also, nonlinearly, on the state of the system over some preceding time interval. One can say that the reaction shows up with some delay (the Greek word ὑστέρησις means precisely “delay”), but only up to a certain limit, after which saturation sets in and the reaction no longer grows in response to the stimulus. In this respect hysteresis differs fundamentally from inertia, which also manifests itself as a delayed response but is usually linear and not limited in the strength of the reaction (except perhaps by the physical strength of the system being stimulated).
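The two properties described here, dependence on history and saturation, can be illustrated with a toy model. This is a sketch under simple assumptions: the internal state `w` drifts in proportion to the charge that has passed through the device, and the resistance is a linear mix of an "on" and an "off" value. The constants `R_ON`, `R_OFF` and `K` are hypothetical and do not describe any real device.

```python
import math

R_ON, R_OFF = 100.0, 16000.0   # assumed resistances of the two extreme states, ohms
K = 2.0e4                      # assumed change of state per coulomb (hypothetical)


def simulate(amplitude=1.0, freq=1.0, steps=2000):
    """Drive the toy memristor with one sine period of voltage.

    Returns a list of (voltage, current, state) samples.
    """
    w = 0.1                            # internal state in [0, 1]; 1 = most conductive
    dt = 1.0 / (freq * steps)
    trace = []
    for n in range(steps):
        v = amplitude * math.sin(2 * math.pi * freq * n * dt)
        r = R_ON * w + R_OFF * (1.0 - w)
        i = v / r
        trace.append((v, i, w))
        # The passing current shifts the state; clamping models saturation.
        w = min(1.0, max(0.0, w + K * i * dt))
    return trace


trace = simulate()
v_rise, i_rise, _ = trace[250]   # voltage on its way up
v_fall, i_fall, _ = trace[750]   # same voltage on its way down
print(f"v = {v_rise:.3f} V: i = {i_rise:.2e} A rising, {i_fall:.2e} A falling")
```

At the same instantaneous voltage the current differs depending on what the device has already experienced, and with a larger amplitude (for example `simulate(amplitude=3.0)`) the state reaches its limit and stops responding to the stimulus.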
Memristors can thus serve as a basis for neuromorphic artificial neurons and synapses, although these passive electrical elements were initially studied for their applicability in computer memory. The point is that a memristor's conductivity changes quite quickly under the influence of passing current, so in the future such devices will be able to compete in speed not only with NAND memory but also with DRAM. Moreover, when the voltage is removed from its contacts, the memristor keeps its state unchanged, which makes it possible, at the next startup, to resume the system almost instantly from the exact point at which it was turned off, without wasting time on saving the contents of RAM cells to a swap file and then retrieving them from there.
As interest in neuromorphic computing grew, it became clear that physical processes with hysteresis, i.e. those that keep a certain value for a long time after an external influence and can then be induced, again externally, to return to the original state, are not so rare. For example, chalcogenides are chemical compounds whose interesting properties include a change in phase state (from polycrystalline to amorphous and back) that can be triggered, for instance, by heating, which in turn is produced by applying a voltage to a heat-dissipating electrode attached to the chalcogenide cell. Phase-change memory (PCM) is based on this principle. Incidentally, a chalcogenide is easy to heat with a laser, and in certain substances the transition from one phase state to another is accompanied by a change in optical properties, not just electrical ones. That is why the well-known (re)writable CDs and DVDs, which from a modern point of view belong almost on the same shelf as antediluvian phonographs and daguerreotypes, also rely on the phase transition mechanism in chalcogenide alloys.
PCM cells are characterized by extremely high endurance: experimental samples of various compositions showed the first signs of degradation only after 10^8, or even 10^12, rewrite cycles. In addition, their physical dimensions may well be measured in nanometers, which opens up enormous scope for miniaturizing computer memory and the neuromorphic systems built on it, especially if fine photolithographic processes are used to manufacture the buses connecting such cells. Write and read delays in PCM amount to tens of nanoseconds, so the latency of neuromorphic computers built on it turns out to be quite acceptable. Unfortunately, this extremely attractive technology also has a significant drawback: a shift in electrical resistance that accumulates over time because of structural changes during the regularly repeated cycles of switching the chalcogenide between phase states. Such a shift can be accounted for and controlled. However, if the resistance level of an individual PCM cell encodes a value in a matrix or vector (and the work of modern neural networks, as we have said more than once, essentially comes down to multiplying them), the shift introduces an error into the calculations, which has to be compensated, for example, in software on a classical von Neumann computer. That, of course, defeats the very purpose of switching to a neuromorphic hardware platform.
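A small sketch shows how such a shift propagates into a matrix-vector product. It assumes, purely for illustration, that every cell's conductance drops by the same relative amount (real drift varies from cell to cell and over time); the matrix values, the 5% shift and the function names are hypothetical.

```python
def matvec(matrix, vector):
    """Matrix-vector product; each matrix entry stands for one cell's conductance."""
    return [sum(g * x for g, x in zip(row, vector)) for row in matrix]


def drifted(matrix, shift):
    """Apply the same relative conductance loss to every cell (simplifying assumption)."""
    return [[g * (1.0 - shift) for g in row] for row in matrix]


SHIFT = 0.05                      # hypothetical 5% drift
G = [[0.2, 0.8],
     [0.5, 0.5]]                  # weights as programmed into the cells
x = [1.0, 2.0]                    # input vector

ideal = matvec(G, x)
actual = matvec(drifted(G, SHIFT), x)
# Digital compensation step, the kind of correction a conventional processor would do.
corrected = [y / (1.0 - SHIFT) for y in actual]

for name, values in (("ideal", ideal), ("actual", actual), ("corrected", corrected)):
    print(name, [round(v, 4) for v in values])
```

In this idealized case one division per output restores the result. With non-uniform drift the correction would need per-cell calibration data, which is where the extra digital work comes from.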
⇡#Wait for an answer
If in memristors the electrical conductivity changes under the influence of the current passing through the sample, and in chalcogenide cells the control effect (the transition from one phase state to another) is exerted by temperature, other systems exhibit similar hysteresis phenomena in quite different properties. Thus, in ferroelectrics an external electric field causes a phase transition between two crystalline states with different polarizations. Accordingly, such crystals can be used to record and store data: one polarization value encodes a logical “1”, the other a “0”. Computer memory based on this effect, FeRAM (from “ferroelectric”), was proposed back in the 1950s, along with FeCAP capacitors and even FeFET field-effect transistors built on the same principles. Since the state of a ferroelectric after “programming” by an external field does not change as long as the temperature remains suitable, a FeFET retains the state assigned to it indefinitely, until a new command signal arrives over the control bus. This is what makes it potentially possible to train neural networks on a ferroelectric basis and then use them, already trained, for applied tasks (inference).
It is important that the initial (spontaneous) polarization of a ferroelectric is not induced but arises naturally, through the displacement of positive and negative charges from their neutral positions inside the crystal within a certain temperature range. That is, materials of this kind can act both as memory cells and as basic elements of neuromorphic systems. A single ferroelectric crystal is a binary system (with two polarization states); however, if a polycrystalline sample is used, a specially selected external field can change its physical properties as a whole almost continuously and within fairly wide limits. The result is a kind of analog circuit capable of acting as a synapse (storing, for example, the weight value at one of the inputs of an artificial neuron). Particularly encouraging for the circuit design of future ferroelectric neuromorphic computers is the fact that both memory cells and logic circuits can be formed on the same hardware basis.
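How a collection of binary grains can add up to a nearly continuous weight can be sketched with a toy model. It assumes that each grain flips when the applied field exceeds its own switching threshold and that the thresholds are spread evenly across the sample. The grain count, field values and function names are hypothetical and only illustrate the principle.

```python
def make_sample(n_grains=100, e_min=1.0, e_max=2.0):
    """Build a sample of grains, each stored as [switching_threshold, polarization].

    All grains start at polarization -1; thresholds are evenly spread (assumed).
    """
    step = (e_max - e_min) / (n_grains - 1)
    return [[e_min + k * step, -1] for k in range(n_grains)]


def apply_field(sample, field):
    """Flip every grain whose threshold the field exceeds; the rest keep their state."""
    for grain in sample:
        if abs(field) >= grain[0]:
            grain[1] = 1 if field > 0 else -1


def net_polarization(sample):
    """Average polarization of the whole sample, from -1.0 to +1.0."""
    return sum(p for _, p in sample) / len(sample)


sample = make_sample()
apply_field(sample, 1.5)          # a moderate field flips only the "softer" grains
print(net_polarization(sample))   # an intermediate value between -1 and +1
apply_field(sample, 0.0)          # removing the field changes nothing
print(net_polarization(sample))   # the weight is retained
```

Each grain remains a two-state element, but the sample as a whole takes one of about a hundred levels here, which is what allows it to store a synaptic weight.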
Another option for using ferroelectrics involves confining a sample with such properties between a pair of electrodes to implement a probabilistic ferroelectric tunneling junction, FTJ. By influencing the thin (a few nanometers) polycrystalline ferroelectric film separating the electrodes, it is possible to create conditions for quantum tunneling of charge carriers through it. In this way, an FTJ memory cell is formed, writing to which is carried out by applying a voltage to the electrodes connected to the film (thereby creating conditions for tunneling), and reading is done by measuring the current generated by the tunneling effect.
Ferroelectrics as a promising hardware basis, first for a new type of memory and later for neuromorphic computing, have been the object of intense research attention since 2011, when outstanding ferroelectric properties were serendipitously discovered in thin films of hafnium dioxide (HfO2) doped with small amounts of silicon dioxide (SiO2). From the point of view of the evolution of computer technology, the less than fifteen years that have passed since then are a negligibly short time: like other non-semiconductor neuromorphic technologies, ferroelectric technology today suffers from many “childhood diseases”. And it will apparently go on suffering for at least another decade, although a certain closeness to well-established technologies (in particular, standard production photolithography equipment is used in fabricating ferroelectric films) gives hope for fairly rapid progress.
One way or another, hardware neuromorphics beyond the purely semiconductor kind (where synapses, for example, still have to be emulated with rather complex transistor circuits) remains a matter of tomorrow, if not the day after tomorrow. In addition to the technologies we have listed, even more exotic ones are being considered: using the chemical mechanism of a redox reaction to change the valence state and, accordingly, the electrical properties of media containing metal ions in order to create artificial synapses (valence change memory, VCM, a subtype of resistive random-access memory, RRAM), programmable electrochemical metallization (ECM), neuromorphic nanowire networks, and so on. There are many directions, and each of them comes with plenty of difficulties; we listed them at the end of the previous article in this series.
However, within the next 3-5 years, probably two or three at most will stand out from this host of nominally promising but not yet well-trodden paths, and then more tangible results can be expected from hardware neuromorphics. In the meantime, emulating generative (and other) AI models in the memory of von Neumann machines remains an expensive but undoubtedly achievable reality, while almost all attempts to give artificial neural networks a different hardware basis fail to produce an economically justified result. All that remains is to congratulate the world's energy producers: thanks to the steady demand for all kinds of artificial intelligence applications, they will clearly not be left without work, or without windfall profits, for a long time.