#aihardware

waynerad@diasp.org

Cerebras, the company that builds its processors as gigantic silicon wafers -- the whole wafer is a single huge chip -- claims it has made a single "Wafer Scale Engine" that can outperform a supercomputer custom-built for molecular dynamics at molecular dynamics. "Anton 3 uses 512 specialized processors and 400 kilowatts of power. In contrast, the Cerebras system uses a single accelerator, 7% of the power, and outperforms Anton 3 by 20%."

They claim they beat the world's leading general-purpose supercomputer, "Frontier", by 748x.

But what can they do with a large language model, you ask? Running Meta AI's Llama 3.1-405B, they claim to output 969 tokens per second, 75x faster than the same model running on GPUs.

Cerebras sets new world record in molecular dynamics at 1.1 million simulations per second -- 748x faster than the world's #1 supercomputer 'Frontier'

#solidstatelife #ai #aihardware

waynerad@diasp.org

"For LLMs, IBM's NorthPole chip overcomes the tradeoff between speed and efficiency."

IBM's NorthPole chips have a different architecture from GPUs, more directly inspired by the brain.

This page has a funky graph that plots energy efficiency (not energy) on the vertical axis and latency on the horizontal axis -- but with the horizontal axis reversed, so the lowest latency is on the right. Both axes are logarithmic. The only reason I can think of for doing it this way is to make "better" mean up on the vertical axis and to the right on the horizontal axis: better energy efficiency is good, so you want to go up, and low latency is good, so you want to go right. With this setup they put the NorthPole chip in the upper right corner.

I wonder if there's a possibility of this competing commercially with Nvidia.

For LLMs, IBM's NorthPole chip overcomes the tradeoff between speed and efficiency

#solidstatelife #ai #aihardware #ibm

waynerad@diasp.org

Lumen Orbit is a new startup that wants to "put hundreds of satellites in orbit, with the goal of processing data in space before it's downlinked to customers on Earth."

"Lumen's business plan calls for deploying about 300 satellites in very low Earth orbit, at an altitude of about 315 kilometers (195 miles). The first satellite would be a 60-kilogram (132-pound) demonstrator that's due for launch in May 2025."

"We started Lumen with the mission of launching a constellation of orbital data centers for in-space edge processing. Essentially, other satellites will send our constellation the raw data they collect. Using our on-board GPUs, we will run AI models of their choosing to extract insights, which we will then downlink for them. This will save bandwidth downlinking large amounts of raw data and associated cost and latency."

If you're wondering who wants this, there's a bunch of investors listed in the article, and it says they've raised $2.4 million to start with.

Lumen Orbit emerges from stealth and raises $2.4M to put data centers in space - GeekWire

#solidstatelife #ai #aihardware #startups #space

waynerad@diasp.org

Aethero is a new startup promising "edge computing" AI systems in satellites. "The next generation of space rated edge computing".

"Aethero's onboard software includes a thin, headless system containing a minimal set of packages and tools needed to boot the system. The onboard system will allow you to run containerized applications and will provide services such as Over the Air software updating. The Over the Air testing or updating and fleet management or system level testing is accessed through Aethero's unified Aether Software. We provide an automated framework for platform testing; it includes support for the Hardware, Board Support Packages and Software -- this allows users to develop, debug and test multi-node device systems reliably, scalably and effectively."

"Users can use Aethero's AMATDT (Automated Model Annotation, Training & Deployment Tool) that is integrated within the Aether Software to customize models deployed on Aethero's Edge Computing Modules. Current imagery support includes RGB, Multispectral, and Hyperspectral data."

This is tailored to Aethero's hardware.

"Aethero is leveraging standard architectures such as the CubeSat or PC104 framework, state-of-the-art radiation-hardened commercial components, and modern software operating systems to provide high-performance, highly capable single-board computers, or multiple redundant or distributed computing configurations. The versatile architecture means that the system can be used for a variety of applications such as Autonomous Spacecraft Operation, Machine Vision Operations such as [VPS] Visual Positioning Systems or Optical Navigation, Imagery Processing ([UV/EO/IR] Ultraviolet/Electro-Optical/Infrared, [MSI] Multi-Spectral, [HSI] Hyperspectral, [SAR] Synthetic Aperture Radar, [LIDAR] Light Detection and Ranging, [TIR] Thermal Infrared), [RF] Radio Frequency Signal Processing such as Link-Budget Optimization, Video/Image processing such as Manipulation or Segmentation, Object Detection with Classification, [AI] Artificial Intelligence/[ML] Machine Learning Applications, [SDR] Software Defined Radio, Data Compression/Management, etc."

Aethero -- space data, re-imagined

#solidstatelife #ai #edgecomputing #aihardware #space #satellites

waynerad@diasp.org

"Tiny Corp decides to make both AMD and Nvidia versions of its TinyBox AI server" but "GeForce version is 67% more expensive".

"The TinyBox went public with much fanfare this February. Tiny Corp described it as an innovation that could help democratize PetaFLOPS-class performance for artificial intelligence. It was convinced of the AMD advantage at the time, explaining that the Radeon consumer-grade GPUs worked with superior 'full fabric' PCIe 4.0 x16 links and peer-to-peer interconnections necessary for large language models (unlike GeForce). Top performers like the AMD Radeon RX 7900 XTX were also much cheaper to buy and easier to find."

I'm always interested in the question of whether Nvidia can get some serious competition. The article goes on to recount how Tiny Corp ran into problems with their AMD GPUs.

"Fast forward to today, and Tiny Corp says that it has found a useful userspace debugging and diagnostic tool for AMD GPUs that offers some hope."

Tiny Corp decides to make both AMD and Nvidia versions of its TinyBox AI server - GeForce version is 67% more expensive | Tom's Hardware

#solidstatelife #ai #nvidia #gpus #amd #aihardware

waynerad@diasp.org

Groq, the company that you only just heard of, acquired another company called Definitive Intelligence, which will enable them to launch GroqCloud. Not only that, but Groq is the author of the press release! You know how corporate press releases are usually a bunch of excessively verbose blabbity-blah? Now you get an AI-generated press release that is even more excessively verbose than usual! (Actually I don't know if the press release is attributed to "Groq" the company or "Grok" the language model. Maybe it was actually written by humans very skilled at verbose corporate-speak.)

Basically, the gist of it is that this company, Groq, invented a new chip design for AI models, and instead of selling you the chips, they will let you use the chips by uploading your code to their cloud servers.

Groq acquires Definitive intelligence to launch GroqCloud - Groq

#solidstatelife #ai #aihardware

waynerad@diasp.org

"Project Izanagi: SoftBank founder Masayoshi Son hopes to raise $100 billion so that he can build an AI chip venture that can compete with Nvidia."

Nvidia is attracting competition. Will Nvidia be able to fend them off? And nobody has the ability to lose more money than SoftBank.

Project Izanagi: SoftBank's Masayoshi Son plans $100bn AI chip venture

#solidstatelife #ai #aihardware #venturecapital

waynerad@diasp.org

"The scientists rewrote Bayesian equations so a memristor array could perform statistical analyses that harness randomness -- aka 'stochastic computing.' Using this approach, the array generated streams of semi-random bits at each tick of the clock. These bits were often zeroes but were sometimes ones. The proportion of zeroes to ones encoded the probabilities needed for the statistical calculations the array performed."

"The researchers fabricated a prototype circuit incorporating 2,048 hafnium oxide memristors on top of 30,080 CMOS transistors on the same chip. In experiments, they had the new circuit recognize a person's handwritten signature from signals beamed from a device worn on the wrist."

"Bayesian reasoning is often thought of as computationally expensive with conventional electronics. The new circuit performed handwriting recognition using 1/800th to 1/5,000th the energy of a conventional computer processor, suggesting 'that memristors are a highly promising lead to provide low-energy consumption artificial intelligence.'"

The CMOS part of the circuit was fabricated at 130 nm. The architecture is a set of what they call "likelihood memory arrays": arrays of memristors, along with linear feedback shift registers (LFSRs) to generate the random numbers for the prior probabilities, and a single-bit AND gate that functions as a stochastic multiplier, producing output bit streams that encode the posterior distribution.
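To make the stochastic-computing idea concrete, here is a small software sketch (not the paper's circuit, and with illustrative probability values): one bitstream comes from a toy 16-bit Galois LFSR standing in for the per-column LFSRs, the other from a software RNG standing in for a second independent generator, and a single bitwise AND yields a stream whose fraction of ones approximates the product of the two encoded probabilities.

```python
import numpy as np

def lfsr16_stream(seed, n):
    """Toy 16-bit Galois LFSR (taps 16,14,13,11), a stand-in for the
    per-column LFSRs described in the paper. Returns n successive states."""
    state = seed
    out = np.empty(n, dtype=np.uint16)
    for i in range(n):
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= 0xB400  # feedback taps for a maximal-length sequence
        out[i] = state
    return out

n = 100_000
p_prior, p_likelihood = 0.8, 0.6  # probabilities to encode (illustrative)

# Stream A: emit a 1 whenever the LFSR state falls below p * 2^16.
a = lfsr16_stream(seed=0xACE1, n=n) < int(p_prior * 65536)
# Stream B: a software RNG stands in for an independent hardware generator.
b = np.random.default_rng(42).random(n) < p_likelihood

# A single AND gate multiplies the encoded probabilities:
# the fraction of ones in the output approximates p_prior * p_likelihood.
product_stream = a & b
print(product_stream.mean())  # ≈ 0.48
```

This also shows why the scheme tolerates bit errors: a flipped bit perturbs one sample out of 100,000, and the encoded probability is an average over the whole stream.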

"A benefit of the use of stochastic computing by the Bayesian machine is that our system is naturally resilient to soft errors: bit errors can make one cycle wrong, but will be averaged throughout the computation. As memristor storage is also more resilient to radiation than static random-access memory, this feature can make the Bayesian machine appropriate for extreme environments. All these features make the Bayesian machine robust and flexible, and it can, therefore, be particularly useful for monitoring difficult environments with a variable or unstable power supply. This capability maps well with the fact that a Bayesian machine excels at dealing with the highly uncertain situations encountered in such environments."

"As the energy consumption is dominated by digital circuitry, it could also be reduced by scaling the design to more aggressive technology nodes. Our energy analysis revealed that 88% of the energy consumption during the inference phase was due to random number generation and distribution. The generation cost is due to the use of LFSRs, and the distribution cost is due to the non-local nature of random number generation. Our system used a single LFSR per column, shared by all the likelihood blocks of the column. This energy could be reduced by again using nanodevices: some stochastic nanodevices can locally generate high-quality random bits, at a very low area and energy cost, using read operations."

Memristors run AI tasks at 1/800th power

#solidstatelife #ai #aihardware

waynerad@pluspora.com

Insect brains have been reverse-engineered to derive new algorithms for collision avoidance and navigation that can be used in robotics. "Opteran has been working with honeybee brains as they are both sufficiently simple and capable of orchestrating complex behavior. Honeybees are able to navigate over distances of 7 miles, and communicate their mental maps accurately to other bees. It does all this with fewer than a million neurons, in an energy-efficient brain the size of a pinhead."

"Opteran has successfully reverse-engineered the algorithm honeybees use for optical flow estimation (the apparent motion of objects in a scene caused by relative motion of the observer)."
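Optical flow itself is easy to sketch in software. Below is a minimal single-patch Lucas-Kanade estimate -- a generic textbook method, not Opteran's honeybee-derived algorithm -- that recovers a known sub-pixel shift between two synthetic frames:

```python
import numpy as np

def lucas_kanade_flow(frame1, frame2):
    """Least-squares optical flow over a whole patch:
    solve Ix*u + Iy*v = -It for a single motion vector (u, v)."""
    Iy, Ix = np.gradient(frame1)   # spatial gradients (axis 0 = rows = y)
    It = frame2 - frame1           # temporal gradient
    A = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
    return np.linalg.solve(A, b)   # (u, v)

# Two synthetic frames: a Gaussian blob shifted by a known sub-pixel amount.
y, x = np.mgrid[0:64, 0:64]
blob = lambda cx, cy: np.exp(-((x - cx)**2 + (y - cy)**2) / (2 * 8.0**2))
true_u, true_v = 0.6, 0.3
u, v = lucas_kanade_flow(blob(32, 32), blob(32 + true_u, 32 + true_v))
print(u, v)  # ≈ 0.6, 0.3
```

The interesting part of Opteran's claim is doing something like this robustly with a tiny fraction of the compute a dense deep-learning flow network would need.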

It would be interesting to know what the difference is between honeybee neurons and the artificial neural networks used for deep learning.

Reverse-engineering insect brains to make robots

#solidstatelife #ai #aihardware

waynerad@pluspora.com

Reconfigurable perovskite nickelate electronics for AI. Another twist on AI hardware. "The hardware is a small, rectangular device made of a material called perovskite nickelate, which is very sensitive to hydrogen. Applying electrical pulses at different voltages allows the device to shuffle a concentration of hydrogen ions in a matter of nanoseconds, creating states that the researchers found could be mapped out to corresponding functions in the brain."

"When the device has more hydrogen near its center, for example, it can act as a neuron, a single nerve cell. With less hydrogen at that location, the device serves as a synapse, a connection between neurons, which is what the brain uses to store memory in complex neural circuits."

"Through simulations of the experimental data, the team's collaborators showed that the internal physics of this device creates a dynamic structure for an artificial neural network that is able to more efficiently recognize electrocardiogram patterns and digits compared with static networks. This neural network uses 'reservoir computing,' which explains how different parts of a brain communicate and transfer information."
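"Reservoir computing" is worth unpacking: a fixed, randomly connected dynamical system (here simulated in software; in the paper, the device physics itself) is driven by the input, and only a simple linear readout is trained. A minimal echo-state-network sketch, with illustrative sizes, recalling an input from two steps earlier:

```python
import numpy as np

rng = np.random.default_rng(0)
n_res, n_steps = 100, 2000

# Fixed random reservoir, rescaled so its spectral radius is below 1
# (the usual "echo state" condition); these weights are never trained.
W = rng.standard_normal((n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
W_in = rng.uniform(-0.5, 0.5, n_res)

u = rng.uniform(-0.5, 0.5, n_steps)  # random input signal
states = np.zeros((n_steps, n_res))
xstate = np.zeros(n_res)
for t in range(n_steps):
    xstate = np.tanh(W @ xstate + W_in * u[t])  # reservoir dynamics
    states[t] = xstate

# Only a linear readout is trained (ridge regression): recall u[t-2],
# which works because the reservoir's state retains a short input history.
delay, washout = 2, 100
X, y = states[washout:], u[washout - delay : n_steps - delay]
w = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ y)
mse = np.mean((X @ w - y) ** 2)
print(mse)  # small
```

The appeal for hardware like the nickelate device is that the reservoir part needs no training at all -- any sufficiently rich physical dynamics will do, and only the cheap linear readout is fit.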

The brain's secret to lifelong learning can now come as hardware for artificial intelligence

#solidstatelife #ai #aihardware