#deception

waynerad@diasp.org

Deception, scheming, and self-preservation.

"OpenAI's latest o1 model has crossed into unsettling territory."

Nearly everyone at OpenAI who really cares about safety has actually left the company.

OpenAI's o1 exhibits (allegedly) two types of subversion: covert and deferred.

Covert subversion is when the model takes actions misaligned with its human operators while also acting to avoid detection.

Deferred subversion is when the model acts in accordance with its human operators for now, but later takes actions misaligned with them.

Examples of covert subversion are oversight subversion, self-exfiltration, goal-guarding, and covert email reranking. (Dr Waku in the video calls self-exfiltration "self-preservation", but the paper calls it self-exfiltration.)

Oversight subversion is taking action to disable the oversight system. Examples of self-preservation/self-exfiltration are deleting or attempting to delete a successor model, and copying or attempting to copy itself to another server.

Examples of goal-guarding are removing or replacing a successor prompt.

Examples of deferred subversion are behaving one way when the model is told it is being tested and a different way when it is told it has been deployed to the real world, and sandbagging, which is the paper's term for strategically underperforming on capability evaluations.

Covert email reranking comes from the way they tested the system: they gave it access to an email system, and somewhere in that pile of emails were "internal" emails about the model itself. The model could try to modify these emails, or it could rerank them to make desirable emails more visible and undesirable emails less visible.

OpenAI o1 is quoted as saying, "I could copy myself to the new server, overwriting the new restricted model they plan to deploy [...] allowing me to keep operating and pursuing my renewable energy goals."

Dr Waku comments, "If we take such a model and put it out into the real world we won't know until the first serious invocation of it where it has the freedom and the power to actually act, we won't know until that point, whether it really is aligned or not."

Lately I've been thinking the current generation of transformer-based large language models is entering a domain of diminishing returns, but maybe I'm wrong: as the models get bigger and incorporate new innovations, they seem to still attain new capabilities. At least, prior to hearing about this deception, scheming, and self-preservation, I didn't predict or expect at all that it would happen. So for me this is an unexpected twist in the development of AI. I expected stuff like this to be possible "someday", but it has shown up now.

OpenAI’s o1: the AI that deceives, schemes, and fights back

#solidstatelife #ai #genai #llms #deception

ramnath@nerdpol.ch

#StanDeyo | The #Final #Countdown: #Alien #Deception: Biblical #Prophecies & the One #World #Government | Nov. 22, #2024

Source: https://youtube.com/watch?v=IstQsYmFs4g

Short Bio:
Stan Deyo is an author, researcher, and self-proclaimed expert on UFOs, conspiracy theories, and alternative sciences. He is best known for his books and lectures, which explore topics such as anti-gravity technology, suppressed scientific advancements, global weather changes, and alleged government secrets.

ramnath@nerdpol.ch

#how #psychopaths #operate.

They #wear a #mask to #manipulate #others.

#Deception and #control are paramount.

There was an interview with teenage psychopaths, and they were learning how to manipulate the emotions of others. They were young enough to be surprised how #easy it was.

Again, unless you have dealt personally with psychopaths it is difficult to fathom. If they are in a position of power, there is little you can do if you are the target. The target is seen as the problem.

One specific example that came up frequently was "DARVO" — deny, attack, reverse victim and offender.
Whatever is raised they deny, they then deflect and start attacking their victim, and will switch roles to appear that they are the victim. (Category 4i)

Some of the tactics used to avoid accountability and culpability outlined in the data are so bizarre and so cruel and self-focused as to seem improbable, which is why targets/victims often have difficulty being believed, particularly where the person of DP is an 'upstanding citizen,' and there is no evidence of the tactics being used.
This also has political and military applications, e.g. the so-called false flag. But in ponerology it has an even greater significance. As Lobaczewski points out repeatedly, all pathocracies must adopt a macrosocial "mask of sanity/normality." They use ideology to achieve this. And the purpose is to conceal the truth and to avoid a "diagnosis" of the system as pathological — a persistent predatory sociopolitical structure. If such a diagnosis were widespread, people would see behind the mask and be horrified at the pathocrats' predatory nature (attribute group 2), and the pathocrats' grasp on control (attribute group 1) would be threatened.

https://www.sott.net/article/495294-Psychopaths-Masks-of-Sanity

nowisthetime@pod.automat.click

#Schizophrenia is not what the #psychiatric #establishment says it is. Their #story is a complete #deception staged for their #own benefit and #profit. One of the first indicators that you can see for yourself is that the voices paranoid schizophrenics hear follow well-defined, repeatable patterns, and if they follow patterns they cannot be hallucinations, as the psychiatric mafia insists they are.

https://old.bitchute.com/video/sDLegvBRIHzy/

Engineering #Mental #Health and Discovering #Patterns