#gpus

waynerad@diasp.org

"We finally have the first benchmarks from MLCommons [...] that pit the AMD Instinct 'Antares' MI300X GPU against Nvidia's 'Hopper' H100 and H200 and the 'Blackwell' B200 GPUs."

"The results are good in that they show the MI300X is absolutely competitive with Nvidia's H100 GPU on one set of AI inference benchmarks, and based on our own estimates of GPU and total system costs can be competitive with Nvidia's H100 and H200 GPUs. But, the tests were only done for the Llama 2 model from Meta Platforms with 70 billion parameters."

It would be good to see Nvidia get some competition. To run the benchmark, they had to get the model working with ROCm, AMD's analog to Nvidia's CUDA.

The first AI benchmarks pitting AMD against Nvidia

#solidstatelife #ai #gpus #amd

waynerad@diasp.org

"Tiny Corp decides to make both AMD and Nvidia versions of its TinyBox AI server" but "GeForce version is 67% more expensive".

"The TinyBox went public with much fanfare this February. Tiny Corp described it as an innovation that could help democratize PetaFLOPS-class performance for artificial intelligence. It was convinced of the AMD advantage at the time, explaining that the Radeon consumer-grade GPUs worked with superior 'full fabric' PCIe 4.0 x16 links and peer-to-peer interconnections necessary for large language models (unlike GeForce). Top performers like the AMD Radeon RX 7900 XTX were also much cheaper to buy and easier to find."

I'm always interested in the question of whether Nvidia can get some serious competition. The article goes on to recount how Tiny Corp ran into problems with their AMD GPUs.

"Fast forward to today, and Tiny Corp says that it has found a useful userspace debugging and diagnostic tool for AMD GPUs that offers some hope."

Tiny Corp decides to make both AMD and Nvidia versions of its TinyBox AI server - GeForce version is 67% more expensive | Tom's Hardware

#solidstatelife #ai #nvidia #gpus #amd #aihardware

waynerad@diasp.org

So 3 days ago (while I was watching "Boeing" videos), Nvidia held its developers conference, where CEO Jensen Huang gave a 2-hour keynote address. The video, as I write this, has over 16 million views -- maybe you are one of them? Maybe there is no reason for me to talk about any of the items in the keynote address.

If you somehow haven't seen it and are wondering what's in it, here are some of the items that got my attention.

First, he starts off saying, "I hope you realize this is not a concert. You have arrived at a developers conference." It might not be a concert, but you have to admit, Jensen Huang is one heck of a showman. His salesmanship probably accounts for a lot of the success of the company.

Moving on to some of the substance of the presentation, Nvidia is working with Ansys, Synopsys, and Cadence Design Systems to use AI to design chips. Ansys develops physics simulation software and will be working with Nvidia to improve its simulation systems. Synopsys and Cadence are both electronic design automation (EDA) companies -- which is to say, they sell software that automates the chip design process. That software does such things as logic synthesis, where you give it high-level logic and it figures out what logic gates you need, and physical layout, where it figures out where to place transistors on the chip and where the wiring should go. (I don't think he mentioned that Synopsys is in the process of acquiring Ansys, so Ansys may soon disappear as a separate company.) Cadence and Nvidia are building Cadence's next supercomputer together, and Synopsys is integrating AI into its computational lithography system.

He says large language models (LLMs) are able to double in size every six months due to Nvidia chips.

He shows Nvidia's new GPU, called "Blackwell", which he says has 208 billion transistors. It's named after the mathematician David Blackwell. He says it is made of two "dies" that are somehow stuck together to form a single "chip", with 10 terabytes of data per second going back and forth between the two dies.

I remember in the mid-2000s when I first heard chips had crossed the 1-billion-transistor mark. I thought that was mindblowing. For 208 billion, I feel like I need a word more mindblowing than "mindblowing".

He says it works with the same infrastructure as their previous chip "Hopper". Can that be true? Surely it must make a lot more heat.

He talks a lot about robotics and Nvidia Omniverse and Isaac Sim.

Isaac Sim is a simulation world for training robots in simulation before they are unleashed in the real world.

Omniverse is all about "digital twins", but "digital twins" are not twins of people -- they're "twins" of factories. Basically the idea is to simulate your whole factory in a 3D simulator. The next step is to fill your real-life factory with cameras and link the factory and the simulation together, so the simulation always knows what's happening in the real factory. And link that to language models so you can ask questions in English about what's going on in your factory. Siemens, the giant German industrial equipment company, is working with Nvidia to automate factories.

It's part of Nvidia's mission to "digitize" everything. Not just factories but proteins and genes, etc. A digital twin of the Earth for predicting weather.

Nvidia is launching a new service called "NIM", which stands for "Nvidia Inference Microservices". The plan is to package up and run models like AlphaFold 2 for you, along with lots of other models for proteins, genes, and other medical and scientific applications. These packaged models are called "NIMs".

Nemo Retriever is a platform for building "digital human" chatbots that retrieve information about a topic, either serving the public or acting as "co-pilots" that help employees do their jobs using information inside the company.

He shows videos of robots learning in simulation in Isaac Sim, then has humanoid robots walk out on stage. Finally he brings out some Disney robots, called BDX robots, which are super cute.

"They're powered by Jetson. Little Jetson robotics computers inside. They learn to walk in Isaac Sim."

Throughout the whole presentation, he's making grandiose predictions for the future about how everything everywhere is going to "be robotic". Nvidia's decades of past success, not just Jensen Huang's showmanship, make such predictions credible.

GTC March 2024 Keynote with NVIDIA CEO Jensen Huang - NVIDIA

#solidstatelife #ai #gpus #nvidia #eda #omniverse #isaacsim #ansys #synopsys #cadence #siemens

waynerad@diasp.org

"Hero C Compiler is a C compiler that allows you to compile your C codebase (with limitations) to SPIR-V for the Vulkan graphics API. This means you can share struct's, enum's and functions between your CPU & GPU code. HCC targets the future of GPU programming so is designed around features such as bindless resources and scalar alignment. This makes it easier to interop with the GPU and focus on writing shader code without writing your own shader build system."

I thought this was a pretty interesting idea: a C compiler specifically designed to share data between CPUs and GPUs. I wouldn't have thought the C compiler would be the place to address this, but maybe it is.

Vulkan is a cross-platform GPU API designed as a modern successor to OpenGL, competing with Direct3D (on Windows) and Metal (on Apple devices). It's developed by the Khronos Group, the standards body that also maintains OpenGL, and grew out of AMD's Mantle API. The SPIR-V mentioned is the intermediate representation Vulkan uses for shader code -- it's what HCC compiles your C to.
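To make the struct-sharing idea concrete, here's a minimal sketch of what that workflow would look like. The struct, file layout, and function names are my own illustration, not HCC's actual API, and I've left out the compiler-specific shader entry-point annotations a real build would need. The point is just that one C type definition gets compiled by the ordinary C compiler for the CPU side and to SPIR-V for the GPU side, so the two never disagree about memory layout.

```c
/* shared.h -- one header included by BOTH builds: the normal C compile
 * for the CPU and the HCC-style compile to SPIR-V for the GPU. Because
 * both sides see the same struct definition, the bytes the CPU writes
 * into a Vulkan buffer are laid out exactly as the shader reads them. */
#include <stdint.h>

typedef struct SceneParams {
    float    time_seconds;
    float    exposure;
    uint32_t light_count;
    uint32_t _padding;     /* keep the struct a multiple of 16 bytes */
} SceneParams;

/* cpu_main.c -- ordinary host code (names here are illustrative). */
#include <string.h>

void upload_scene_params(void *mapped_vulkan_buffer, uint32_t lights)
{
    SceneParams params = {
        .time_seconds = 1.25f,
        .exposure     = 0.8f,
        .light_count  = lights,
    };
    /* Copy straight into a mapped Vulkan buffer -- no hand-maintained
     * mirror of the struct in GLSL/HLSL to keep in sync. */
    memcpy(mapped_vulkan_buffer, &params, sizeof params);
}

/* shader.c -- compiled to SPIR-V by an HCC-style compiler (real entry-point
 * annotations omitted). It consumes the very same SceneParams type. */
float tonemap(const SceneParams *params, float radiance)
{
    return 1.0f - (1.0f / (1.0f + radiance * params->exposure));
}
```

Compare that with the usual Vulkan workflow, where you write the same struct twice, once in C and once in your shading language, and hope the alignments match.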

Hero C Compiler

#solidstatelife #computerscience #programminglanguages #gpus #vulkan

waynerad@diasp.org

"ARM's Immortalis GPU is its first with hardware ray tracing for Android gaming." Well, I didn't even know ARM made GPUs, but apparently not only do they make GPUs, they're now making a ray-tracking GPU for mobile phones. I guess technically, ARM doesn't make GPUs, just like they don't make CPUs -- they license designs made by other companies. In the case of GPUs that would be companies such as MediaTek and Samsung.

"The challenge is that Ray Tracing techniques can use significant power, energy, and area across the mobile system-on-a-chip (SoC). However, Ray Tracing on Immortalis-G715 only uses 4 percent of the shader core area, while delivering more than 300 percent performance improvements through the hardware acceleration."

ARM's Immortalis GPU is its first with hardware ray tracing for Android gaming

#solidstatelife #gpus #raytracing #arm