#deepfakes

waynerad@diasp.org

"'I've got your daughter': Mom warns of terrifying AI voice cloning scam that faked kidnapping."

"And you have no doubt in your mind that that was her voice?"

"Oh, 100% her voice. 100% her voice. It was never a question of, you know, who is this? It was completely her voice, it was her inflection, it was the way she would've cried. I never doubted for one second it was her."

A week after hearing about this idea on The AI Dilemma, here it is in real life.

#solidstatelife #ai #generativeai #deepfakes

https://www.nbc15.com/2023/04/10/ive-got-your-daughter-mom-warns-terrifying-ai-voice-cloning-scam-that-faked-kidnapping/

waynerad@diasp.org

22-year-old TikToker fooled millions by pretending to be AI-generated. Curt Skelton "went mega-viral for an August 25 video in which he claimed to be a sophisticated artificial intelligence deepfake -- orchestrated by fellow visual effects creator Zahra Hussain."

"The Tiktok, which has since been viewed 17.7 million times, features Hussain telling viewers that she used AI programs (some that exist, some that don't) to create the 'fake character' -- ending the video with an open-ended question for viewers: should she continue to post to the @curt.skelton account, or end the 'experiment'?"

"I honestly was expecting maybe 20,000 views at most. So 50% of 20,000 not getting a joke isn't that bad or impactful. But 50% of 16 million on TikTok alone..."

A 22-year-old TikToker fooled millions by pretending to be an AI-generated image -- and he says it "shows how little people understand AI and deepfakes in general"

#solidstatelife #deepfakes

danie10@squeet.me

Deepfake attacks can easily trick 9 out of 10 live facial recognition systems online, fooling even ‘liveness tests’

Sensity AI, a startup focused on tackling identity fraud, carried out a series of pretend attacks. Engineers scanned the image of someone from an ID card, and mapped their likeness onto another person’s face. Sensity then tested whether they could breach live facial recognition systems by tricking them into believing the pretend attacker is a real user.

In its report, Sensity mentioned needing a specialised phone to hijack the mobile camera and inject pre-made deepfake models.

Security is always a moving target…

See https://www.theregister.com/2022/05/22/ai_in_brief/

#technology #security #deepfakes #facialrecognition

waynerad@diasp.org

"Get ready for your evil twin." "In a recent Microsoft blog post by Executive VP Charlie Bell, he states that in the metaverse fraud and phishing attacks could 'come from a familiar face -- literally -- like an avatar that impersonates your coworker.'"

"Accurately replicating the look and sound of a person in the metaverse is often referred to as creating a 'digital twin.' Earlier this year, Jensen Haung, the CEO of NVIDIA gave a keynote address using a cartoonish digital twin. He stated that the fidelity will rapidly advance in the coming years as well as the ability for AI engines to autonomously control your avatar so you can be in multiple places at once. Yes, digital twins are coming."

Get ready for your evil twin

#solidstatelife #deepfakes #gans #generativeai

waynerad@pluspora.com

Deepfake detection with attention. Deepfakes have two kinds of defects: those within any given video frame, and those between frames. Defects within frames arise because the generation model is imperfect. Defects between frames arise when frames are sequenced into video, because the generator doesn't know what global constraints to apply.

The generator models are so far always generative adversarial networks (GANs) or variational autoencoders (VAEs), and either one will have defects from its upscaling phase. Basically, these models have a set of parameters that are used to generate the face, and a system for going from a low-resolution image to a high-resolution image, filling in details as it goes. This upscaling is called "deconvolution" (transposed convolution), and it has imperfections, in particular uneven overlap: the large pixels are decomposed in such a way that they overlap, which prevents edge artifacts from being introduced, but detail is lost where the overlaps are uneven.

In addition, there are defects you could call "semantic" defects. For example, the model can get the specular reflection in the eyes wrong -- the reflection of the scene the person is looking at. Details of teeth can make them look slightly rougher than real teeth. Some face semantics are correct in their details but only look amiss when you zoom out: generated faces often have less symmetry than real faces, and while real human eyes always sit a fixed distance apart and share the same color, the eyes of a fake face sometimes differ in spacing and color.
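
To see why the upscaling phase leaves a fingerprint, here's a minimal PyTorch sketch (mine, not from the paper) of the uneven-overlap problem. With an all-ones input and kernel, each output pixel of a transposed convolution directly shows how many contributions it received, and the counts alternate in a checkerboard whenever the kernel size isn't divisible by the stride:

```python
# Minimal sketch of uneven overlap in transposed convolution
# ("deconvolution") upscaling. kernel_size=3 is not divisible by
# stride=2, so output pixels receive different numbers of
# contributions -- the checkerboard pattern detectors can pick up.
import torch
import torch.nn as nn

upsample = nn.ConvTranspose2d(1, 1, kernel_size=3, stride=2, bias=False)
nn.init.constant_(upsample.weight, 1.0)  # all-ones kernel exposes the overlap counts

x = torch.ones(1, 1, 4, 4)  # a flat, featureless input image
with torch.no_grad():
    y = upsample(x)
print(y[0, 0])  # alternating values reveal the uneven-overlap checkerboard
```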

Between frames, the defects tend to be things like the frequency of eye blinking, or combinations of head movements and facial expressions that differ from what real humans tend to make. If you're thinking all the system needs to do is faithfully copy these from the source video, yeah, that's right, but the generative models often fail to do that.

The deepfake detection introduced here is based on the vision transformer architecture. Transformers were originally invented for language translation: turning a sequence of words in one language into a sequence of words in another. It turned out to be useful to give the translation system, when it is generating the next output word, an "attention" mechanism that lets the model look at any word in the input stream, out of order, paying "attention" to any part of the original sentence at any time. Vision transformers port this idea over to images and video.
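
For reference, here is the standard scaled dot-product attention in a few lines of PyTorch -- a generic sketch of the mechanism, not the paper's code. The attention map is exactly the "look anywhere in the input" weighting described above:

```python
# Generic scaled dot-product attention (the standard transformer
# mechanism; a sketch, not this paper's exact code).
import torch
import torch.nn.functional as F

def attention(Q, K, V):
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5  # similarity of each query to every key
    weights = F.softmax(scores, dim=-1)            # attention map: each row sums to 1
    return weights @ V, weights                    # weighted mix of values, plus the map

# toy self-attention: 5 positions, 8-dimensional embeddings,
# every position can attend to all 5 positions at once
x = torch.randn(5, 8)
out, attn_map = attention(x, x, x)
print(attn_map.shape)  # torch.Size([5, 5])
```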

The system works by dividing the input image into small patches that correspond to parts of the face. Each "patch" is turned into an "embedding" that's analogous to word embeddings in language, and these patch "embeddings" have information about their position in the image encoded into them. The "embeddings" then go through "feature extractors", which are plugged into a "transformation matrix" whose parameters are also learned through a training process. A "global forgery template" is also learned, and combining it with the "transformation matrix" yields an "attention map". The "attention" mechanism is used both to direct the system's attention to different parts of the same image, to detect single-frame defects, and to build a "long range" attention map that is used to detect defects across frames.
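
Here's a hedged sketch of what the patch-embedding step typically looks like in a vision transformer; the patch size, embedding width, and learned positional encoding are illustrative choices, not the paper's:

```python
# Sketch of the usual vision-transformer patch-embedding step
# (hyperparameters here are illustrative, not the paper's).
import torch
import torch.nn as nn

patch_size, embed_dim = 16, 128
# a strided convolution slices the image into 16x16 patches and
# projects each one to a 128-dimensional "word-like" embedding
to_patches = nn.Conv2d(3, embed_dim, kernel_size=patch_size, stride=patch_size)

img = torch.randn(1, 3, 224, 224)                      # one RGB face crop
patches = to_patches(img).flatten(2).transpose(1, 2)   # (1, 196, 128): 14x14 patches
pos = nn.Parameter(torch.zeros(1, patches.size(1), embed_dim))  # learned positions
tokens = patches + pos  # each patch embedding now carries its location in the face
print(tokens.shape)     # torch.Size([1, 196, 128])
```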

What finally comes out the other end of the system is a set of activations representing the confidence that each of the "patches" is "suspicious". If enough patches are "suspicious", the system classifies the video as a deepfake.
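
Assuming the per-patch activations are simply thresholded and counted (the paper may aggregate them differently), the final decision step might look like this toy sketch, with made-up threshold values:

```python
# Toy sketch of the decision step: threshold per-patch
# "suspiciousness" activations and vote. The thresholds are
# made up; the actual paper may aggregate scores differently.
import torch

def classify_video(patch_scores, patch_thresh=0.5, vote_frac=0.3):
    # patch_scores: (num_frames, num_patches) confidences in [0, 1]
    suspicious = (patch_scores > patch_thresh).float().mean()
    return "deepfake" if suspicious > vote_frac else "real"

scores = torch.rand(32, 196)  # 32 frames x 196 patches of fake confidences
print(classify_video(scores))
```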

Detection of deepfake videos using long distance attention

#solidstatelife #ai #deepfakes