  • IBM team builds low-power analog AI processor


    Karlston




    Huge arrays of phase-change material perform in-memory processing.

    Large language models, the AI tech behind things like ChatGPT, are just what their name implies: big. They often have billions of individual computational nodes and huge numbers of connections among them. All of that means lots of trips back and forth to memory and a whole lot of power use to make that happen. And the problem is likely to get worse.

     

    One way to potentially avoid this is to mix memory and processing. Both IBM and Intel have made chips that equip individual neurons with all the memory they need to perform their functions. An alternative is to perform operations in memory, an approach that has been demonstrated with phase-change memory.

     

    Now, IBM has followed up on its earlier demonstration by building a phase-change chip that's much closer to a functional AI processor. In a paper released on Wednesday by Nature, the company shows that its hardware can perform speech recognition with reasonable accuracy and a much lower energy footprint.

    In phase

    Phase-change memory has been under development for a while. It offers the persistence of flash memory but with performance that's much closer to existing volatile RAM. It operates by heating a small patch of material and then controlling how quickly it cools. Cool it slowly, and the material forms an orderly crystal that conducts electricity reasonably well. Cool it quickly, and it forms a disordered mess that has much higher resistance. The difference between these two states can store a bit that will remain stored until enough voltage is applied to melt the material again.

     

    This behavior also turns out to be a great match for neural networks. In neural networks, each node receives an input and, based on its state, determines how much of that signal to forward to further nodes. Typically, this is viewed as representing the strength of the connections between individual neurons in the network. Thanks to the behavior of phase-change memory, that strength can also be represented by an individual bit of memory operating in an analog mode.

     

    When storing digital bits, the difference between the on and off states of phase-change memory is maximized to limit errors. But it's entirely possible to set the resistance of a bit to values anywhere in between its on and off states, allowing analog behavior. This smooth gradient of potential values can be used to represent the strength of connections between nodes—you can get the equivalent of a neural network node's behavior simply by passing current through a bit of phase-change memory.
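
    As a rough illustration of why an analog resistance can stand in for a connection strength, the sketch below applies Ohm's law (current = conductance × voltage) to a single cell. The conductance range and the linear weight mapping are assumptions made for the example, not values or methods taken from IBM's paper, and the function names are hypothetical.

```python
# Illustrative only: these conductance values are assumed, not from the paper.
G_OFF = 1e-7  # siemens, assumed disordered (high-resistance) state
G_ON = 1e-5   # siemens, assumed crystalline (low-resistance) state


def weight_to_conductance(weight: float) -> float:
    """Map a weight in [0, 1] linearly onto the assumed conductance range."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return G_OFF + weight * (G_ON - G_OFF)


def cell_current(voltage: float, weight: float) -> float:
    """Ohm's law, I = G * V: the cell scales its input by the stored weight."""
    return weight_to_conductance(weight) * voltage


if __name__ == "__main__":
    # The same input voltage yields more current through a stronger connection.
    for w in (0.0, 0.5, 1.0):
        print(w, cell_current(0.2, w))
```

    The multiplication happens in the physics of the cell itself, which is why no separate fetch of the weight from memory is needed.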

     

    As mentioned above, IBM has already shown this can work. The chip described today, however, is much closer to a functional processor, containing all the hardware needed to connect individual nodes. And it has done so at a scale much closer to that needed to handle large language models.

    The chip

    The core component of the new chip is what's called a tile, which is a crossbar array (think square grid) of individual phase-change bits 512 units wide by 2,048 units deep. Each chip contains 34 of these tiles, meaning about 35 million phase-change bits. The chip also has everything the bits need to communicate at high speed, even across different tiles, and can do so without the need for any analog-to-digital conversion. Traditional processing units on board, coupled with some static RAM, help control the flow of this communication and handle translation between the analog and digital portions of the chip.
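
    The tile figures above can be checked with a little arithmetic, and the crossbar idea itself can be sketched in a few lines. In the sketch below, each column's output current is the sum of the input voltages times the conductances along that column, which is a matrix-vector multiply. The function name and the choice of rows as inputs are assumptions for illustration; the article does not say which dimension carries the inputs.

```python
# Figures quoted in the article.
ROWS, COLS, TILES = 512, 2048, 34

cells_per_tile = ROWS * COLS              # 1,048,576 phase-change bits
cells_per_chip = cells_per_tile * TILES   # 35,651,584, i.e. "about 35 million"


def crossbar_output(conductances, voltages):
    """Column currents of an idealized crossbar: I_j = sum_i V_i * G[i][j].

    conductances: list of rows, one row per input line (assumed orientation).
    voltages: one input value per row.
    """
    if len(conductances) != len(voltages):
        raise ValueError("need exactly one voltage per row")
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


if __name__ == "__main__":
    print(cells_per_tile, cells_per_chip)
    # Tiny 2x2 example with made-up values.
    print(crossbar_output([[1, 2], [3, 4]], [1, 1]))
```

    A real tile would also have to cope with noise, drift, and wire resistance, all of which this idealized version ignores.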

     

    The system is also flexible, in that it allows the strength of any connection to be held by a variable number of bits. And communication between chips is possible, allowing larger problems to be split up and distributed across multiple chips. The largest work demonstrated here involved 140 million phase-change bits spread across five chips.
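
    To put the multi-chip figure in context, here is a back-of-the-envelope sketch using only the numbers in the article. It computes the minimum number of chips that a given count of phase-change bits could fit on, and how full five chips are at 140 million bits. The helper names are hypothetical, and the calculation ignores any constraints on how network layers map onto tiles, so it is a lower bound rather than a description of IBM's layout.

```python
import math

CELLS_PER_CHIP = 512 * 2048 * 34  # from the tile figures in the article


def devices_needed(num_weights: int, devices_per_weight: int) -> int:
    """Phase-change bits required if each weight uses a fixed number of bits."""
    return num_weights * devices_per_weight


def min_chips(num_devices: int) -> int:
    """Lower bound on chips, assuming devices could be packed perfectly."""
    return math.ceil(num_devices / CELLS_PER_CHIP)


def utilization(num_devices: int, num_chips: int) -> float:
    """Fraction of the available phase-change bits actually used."""
    return num_devices / (num_chips * CELLS_PER_CHIP)


if __name__ == "__main__":
    used = 140_000_000  # bits in the largest demonstration
    print(min_chips(used))        # perfect-packing lower bound
    print(utilization(used, 5))   # share of five chips' capacity in use
```

    Under these assumptions the five chips are a bit under 80 percent full, so the demonstration needed more chips than perfect packing would require.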

     

    To actually get this to work, the researchers started with an existing AI system and set the states of the phase-change bits to match. Once set, the analysis could be run repeatedly without the phase-change portion of the chip requiring any additional energy.

     

    The researchers used this hardware to demonstrate two speech recognition tasks. The simpler one involved identifying a small selection of keywords in speech, such as an automated call system might need to handle. The second was general speech recognition, albeit with a somewhat condensed vocabulary. In both of these cases, the hardware was capable of matching the performance of an equivalent AI system run on traditional processors.

     

    At its peak performance, the chip was able to perform 12.4 trillion operations per second for each watt of power used. That is many times more efficient than a traditional processor performing equivalent operations.
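
    Reading the efficiency figure as operations per second per watt, which is the same thing as operations per joule, it converts directly into an energy cost per operation. The sketch below does that conversion; the function names are hypothetical, and the workload size in the example is made up.

```python
def energy_per_op_joules(tera_ops_per_watt: float) -> float:
    """Energy per operation, given efficiency in trillions of ops per joule."""
    return 1.0 / (tera_ops_per_watt * 1e12)


def energy_for_ops(num_ops: float, tera_ops_per_watt: float) -> float:
    """Total energy in joules for num_ops operations at the given efficiency."""
    return num_ops * energy_per_op_joules(tera_ops_per_watt)


if __name__ == "__main__":
    per_op = energy_per_op_joules(12.4)   # peak figure quoted in the article
    print(per_op * 1e12, "picojoules per operation")
    # Hypothetical workload of one billion operations.
    print(energy_for_ops(1e9, 12.4), "joules")
```

    At the quoted peak, each operation costs well under a tenth of a picojoule, though sustained real-world figures would depend on how much of the work stays in the analog tiles.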

     

    It's critical to note that this is not a general-purpose AI processor. It only works with a specific type of neural network, and not every problem is a good match for that sort of neural network. The energy savings it promises are also predicated on the network staying static. Any problem that requires reconfiguring the connections among the nodes means resetting the state of the phase-change bits, and that requires significantly more energy.

     

    This also means that the chip isn't much use for training an AI. In fact, the training process used to develop the neural network executed on the chip had to be tailored to ensure that the results could be translated to the phase-change hardware.

     

    That said, when matched with the right sort of problem, the chip can potentially provide a significant cut in energy use. And there's room for it to get much better in that regard. The chip was made on a 14-nanometer process, which is well off the cutting edge, and the researchers suggest they haven't done anything to optimize the energy use of the portions of the processor dedicated to communications and digital/analog conversions.

     

    Nature, 2023. DOI: 10.1038/s41586-023-06337-5

     


