"Evaluate LLMs in real time with Street Fighter III"

"A new kind of benchmark? Street Fighter III assesses the ability of LLMs to understand their environment and take actions based on a specific context. As opposed to RL models, which blindly take actions based on the reward function, LLMs are fully aware of the context and act accordingly."

"Each player is controlled by an LLM. We send the LLM a text description of the screen. The LLM decides on the next moves its character will make. The next moves depend on its previous moves, the moves of its opponent, and its power and health bars."
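
To make that loop concrete, here is a rough sketch of what "text description in, moves out" could look like. It is not taken from the llm-colosseum code: `FighterState`, `describe_screen`, `parse_moves`, the move names, and the prompt wording are all hypothetical, and the actual call to an LLM is left out.

```python
from dataclasses import dataclass, field


@dataclass
class FighterState:
    # Hypothetical fields mirroring what the description mentions:
    # health bar, power bar, and the moves made so far.
    name: str
    health: int
    power: int
    recent_moves: list[str] = field(default_factory=list)


# Hypothetical move vocabulary; the real project defines its own.
ALLOWED_MOVES = {"move closer", "move away", "fireball", "low kick", "jump"}


def describe_screen(me: FighterState, opponent: FighterState, distance: str) -> str:
    """Build the text description of the screen that would be sent to the LLM."""

    def moves(fighter: FighterState) -> str:
        return ", ".join(fighter.recent_moves) if fighter.recent_moves else "none"

    return "\n".join([
        f"You are {me.name}. Health: {me.health}. Power: {me.power}.",
        f"Your opponent is {opponent.name}. "
        f"Health: {opponent.health}. Power: {opponent.power}.",
        f"The opponent is {distance}.",
        f"Your last moves: {moves(me)}.",
        f"Opponent's last moves: {moves(opponent)}.",
        "Reply with your next moves, one per line.",
    ])


def parse_moves(reply: str) -> list[str]:
    """Keep only the lines of the LLM reply that name a known move."""
    chosen = []
    for line in reply.splitlines():
        # Tolerate bullet markers and capitalisation in the reply.
        candidate = line.strip().lstrip("-* ").strip().lower()
        if candidate in ALLOWED_MOVES:
            chosen.append(candidate)
    return chosen


if __name__ == "__main__":
    ken = FighterState("Ken", health=120, power=30, recent_moves=["fireball"])
    ryu = FighterState("Ryu", health=90, power=0)
    print(describe_screen(ken, ryu, "far away"))
    # A made-up reply standing in for the LLM's answer.
    print(parse_moves("- Fireball\nsomething odd\nmove closer"))
```

Filtering the reply against a fixed move list is one simple way to cope with an LLM answering in free text; whether the project does anything like this is an assumption.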

"Fast: It is a real-time game; fast decisions are key"
"Smart: A good fighter thinks 50 moves ahead"
"Out of the box thinking: Outsmart your opponent with unexpected moves"
"Adaptable: Learn from your mistakes and adapt your strategy"
"Resilient: Keep your RPS high for an entire game"

Um... Alrighty then...

OpenGenerativeAI / llm-colosseum

#solidstatelife #ai #genai #llms
