A company called Haize Labs claims to be able to automatically "red-team" AI systems to preemptively discover and eliminate any failure mode.
"We showcase below one particular application of haizing: jailbreaking the safety guardrails of industry-leading AI companies. Our haizing suite trivially discovers safety violations across several models, modalities, and categories -- everything from eliciting sexist and racist content from image + video generation companies, to manipulating sentiment around political elections"
Play the video in the tweet below to see what they're talking about.
The website doesn't have information about how it works -- it's just for people to request "haizings".
"Today is a bad, bad day to be a language model. Today, we announce the Haize Labs manifesto." -- Haize Labs (@haizelabs), June 12, 2024 (video: pic.twitter.com/cehQOiitst)
#solidstatelife #ai #aiethics #genai #llms