#aisafety

waynerad@diasp.org

"The Department of Commerce's National Telecommunications and Information Administration (NTIA) launched a Request for Comment on the risks, benefits and potential policy related to advanced artificial intelligence (AI) models with widely available model weights."

If you have an opinion as to whether "open-weight" models are dangerous or not, you can submit a comment to the NTIA.

"Open-weight" means the weights of the model are made public, as opposed to the source code (which would be "open source") or the training data. With the model weights, you can run the model on your own machine without needing the source code, the training data, or the compute-intensive training process.

NTIA solicits comments on open-weight AI models

#solidstatelife #ai #aiethics #aisafety #airegulation

waynerad@diasp.org

Natural selection favors AIs over humans, says Dan Hendrycks.

"Natural selection may be a dominant force in AI development. Competition and power-seeking may dampen the effects of safety measures, leaving more 'natural' forces to select the surviving AI agents. Secondly, evolution by natural selection tends to give rise to selfish behavior. While evolution can result in cooperative behavior in some situations (for example in ants), we will argue that AI development is not such a situation. From these two premises, it seems likely that the most influential AI agents will be selfish. In other words, they will have no motivation to cooperate with humans, leading to a future driven by AIs with little interest in human values. While some AI researchers may think that undesirable selfish behaviors would have to be intentionally designed or engineered, this is simply not so when natural selection selects for selfish agents. Notably, this view implies that even if we can make some AIs safe, there is still the risk of bad outcomes. In short, even if some developers successfully build altruistic AIs, others will build less altruistic agents who will outcompete the altruistic ones."

"As AIs become increasingly autonomous, humans will cede more and more decision-making to them. The driving force will be competition, be it economic or national."

"The transfer of power to AIs could occur via a number of mechanisms. Most obviously, we will delegate as much work as possible to AIs, including high-level decision-making, since AIs are cheaper, more efficient, and more reliable than human labor. While initially, human overseers will perform careful sanity checks on AI outputs, as months or years go by without the need for correction, oversight will be removed in the name of efficiency. Eventually, corporations will delegate vague and open-ended tasks. If a company's AI has been successfully generating targeted ads for a year based on detailed descriptions from humans, they may realize that simply telling it to generate a new marketing campaign based on past successes will be even more efficient. These open-ended goals mean that they may also give AIs access to bank accounts, control over other AIs, and the power to hire and fire employees, in order to carry out the plans they have designed. If AIs are highly skilled at these tasks, companies and countries that resist or barter with these trends will simply be outcompeted, and those that align with them will expand their influence."

"Corporations and governments will adopt the most effective possible AI agents in order to beat their rivals, and those agents will tend to be deceptive, power-seeking, and follow weak moral constraints."

"This loss of human control over AIs' actions will mean that we also lose control over the drives of the next generation of AI agents. If AIs run efforts that develop new AIs, humans will have less influence over how AIs behave."

If you would prefer video to a 35-page text document (I know it looks like I'm quoting a lot, but it's a small fraction of the document), there are some videos of the author below.

The basic argument is that evolution does not depend on biology; it generalizes to other domains. Evolution requires only:

  1. "Variation: There is variation in characteristics, parameters, or traits among individuals."

  2. "Retention: Future iterations of individuals tend to resemble previous iterations of individuals.", and

  3. "Differential fitness: Different variants have different propagation rates."

For "variation", he says:

"More than one AI agent is likely."

"In biology, variation improves resilience."

"Variation enables specialization."

"Variation improves decision-making."

I think we all see there is a lot of variation in AI models. For "retention" he says:

"The retention condition is straightforwardly satisfied for AIs. Information from one agent can be directly copied and transferred to the next."

"There are many paths to retention. AIs could potentially allocate computational resources to create new AIs of their choosing. They could design them and create data to train them. As AIs adapt, they could alter their own strategies and retain the ones that yield the best results."

"Retention does not require reproduction. In biology, parents reproduce by making copies of their genetic information and passing them on to their offspring. Retention is not undermined during rapid AI development. Evolution requires thousands of years to drastically change a species in the natural world. Among AIs, this same process could take place over a year, radically changing the AI population."

For "fitness" he says:

"The success of any good or service can also be viewed in terms of fitness, as products with more demand propagate further and faster."

"Appearing useful to its user" may count as fitness. To explain what he means by this he says, "In practice, we train AIs by rewarding them for telling the truth and punishing them for lying, according to what humans think is true. But when AIs know more than humans, this could make them say what humans expect to hear, even if it is false."

"Engaging in self-preserving behavior reduces the chance of being deactivated or destroyed."

"Engaging in power-seeking behavior can improve an AI's fitness in various ways."
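The three conditions (variation, retention, differential fitness) can be made concrete with a toy replicator simulation. This sketch is my own illustration, not code from the paper: it assumes a single numeric "selfishness" trait per agent and, following the paper's argument, that selfishness is directly rewarded as fitness. Under those assumptions, even a mostly altruistic starting population drifts selfish.

```python
import random

random.seed(0)

def step(population, mutation_rate=0.05):
    """One generation of fitness-proportional copying with mutation.

    Each agent is a float in [0, 1] representing "selfishness"; by
    assumption, more selfish agents have higher fitness.
    """
    # Differential fitness: more selfish agents propagate faster.
    weights = [0.5 + s for s in population]
    # Retention: offspring are direct copies of parents...
    offspring = random.choices(population, weights=weights, k=len(population))
    # Variation: ...with small random mutations, clamped to [0, 1].
    return [min(1.0, max(0.0, s + random.uniform(-mutation_rate, mutation_rate)))
            for s in offspring]

# Start with a mostly altruistic population (low selfishness).
pop = [random.uniform(0.0, 0.2) for _ in range(200)]
for generation in range(300):
    pop = step(pop)

print(f"mean selfishness after 300 generations: {sum(pop) / len(pop):.2f}")
```

The three comments mark where each condition appears: mutation supplies variation, copying supplies retention, and fitness-weighted sampling supplies differential fitness. Under these (deliberately loaded) assumptions, the population's mean selfishness climbs well above its starting range.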

"Current trends erode many safety properties. The AI research community used to talk about 'designing' AIs; they now talk about 'steering' them. And even our ability to 'steer' is diminishing, as we let AIs teach themselves and increasingly do things that even their creators do not fully understand. We have voluntarily given up this control because of the competition to develop the most innovative and impressive models. AIs used to be built with rules, then later with handcrafted features, followed by automatically learned features, and most recently with automatically learned features without human supervision. At each step, humans have had less and less oversight. These trends have undermined transparency, modularity, and mathematical guarantees, and have exposed us to new hazards such as spontaneously emergent capabilities. Competition could continue to erode safety. Competition may keep lowering safety standards in the future. Even if some AI developers care about safety, others will be tempted to take shortcuts on safety to gain a competitive edge. We cannot rely on people telling AIs to be unselfish. Even if some developers act responsibly, there will be others who create AIs with selfish tendencies anyway. While there are some economic incentives to make models safer, these are being outweighed by the desire for performance, and performance has been at the expense of many key safety properties."

"Much of what is to come in AI development is unknown, but we can speculate that AIs will continue to become more autonomous as more actions and choices are left to machines, decoupled from human control."

If you're thinking, "That's fine as long as AIs compete only against other AIs; humans will be fine," well, he has a section called "Human-AI fitness competition".

"AIs will likely be able to significantly outperform humans in any endeavor. John Henry, the 'steel-driving man,' is a 19th-century American folk hero who went up against a steam-powered machine in a competition to drill the most holes into the side of a mountain. According to legend, Henry emerged victorious, only to have his heart give out from the stress. Since the age of the steam engine, humans have felt anxiety over the superiority of machines. Until quite recently, this has been limited to physical attributes such as speed and endurance. AI agents, however, have the potential to be more capable than humans at essentially any task, even ones that require traits thought of as exclusively human such as creativity or social skills. Although this may seem distant or even impossible, AIs have been improving so rapidly that many leading AI researchers think we will see AIs that are more capable than humans in many ways within the next few decades or even sooner -- well within the lifetimes of most people reading this. A few years ago, AIs that could write convincing prose about a new topic or create images from text descriptions seemed like science fiction to most laypeople. Now, those AIs are freely accessible to anyone on the internet."

"Computer hardware is faster than human minds, and it keeps getting faster. Microprocessors operate around a million to a billion times faster than human neurons. So all else being equal, AIs could 'think' a million, perhaps even a billion, times faster than us (let's call it a million to be conservative). Imagine interacting with such a mind. For every second needed to think about what to say or do, it would have the equivalent of 11 days. Winning a game of Go or coming out ahead in high-stakes negotiation would be near impossible. Although it can take time to develop an AI that can do a certain task at all, once AIs become human-level at a task, they tend to quickly outcompete humans. For example, AIs at one point struggled to compete with humans at Go, but once they caught up, they quickly leapfrogged us. Because computer hardware provides speed, memory, and focus that our brains cannot match, once their software becomes capable of performing a task, they often become much better than any human almost immediately, with increasing computer power only further widening the gap."
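The "11 days" figure in that quote is simple arithmetic: a million subjective seconds per wall-clock second, converted to days. A quick sanity check (mine, not the paper's):

```python
# One wall-clock second for a mind running a million times faster
# corresponds to a million subjective seconds. Convert to days:
speedup = 1_000_000               # the paper's "conservative" factor
seconds_per_day = 60 * 60 * 24    # 86,400
subjective_days = speedup / seconds_per_day
print(f"{subjective_days:.1f} subjective days per second")  # 11.6
```

So the quote's "11 days" is a slight rounding down of about 11.6; at the billion-times-faster end of the range, one second would correspond to over 31 subjective years.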

"AIs can have unmatched abilities to learn across and within domains. AIs can process information from thousands of inputs simultaneously without needing sleep or losing willpower. They could read every book ever written on a subject or process the internet in a matter of hours, all while achieving near-perfect retention and comprehension. Their capacity for breadth and depth could allow them to master all subjects at the level of a human expert."

"AIs could create unprecedented collective intelligences. By combining our cognitive abilities, people can produce collective intelligences that behave more intelligently than any single member of the group. The products of collective intelligence, such as language, culture, and the internet, have helped humans become the dominant species on the planet. AIs, however, could form superior collective intelligences. Humans have difficulty acting in very large groups and can succumb to collective idiocy or groupthink. Moreover, our brains are only capable of maintaining around 100-200 meaningful social relationships. Due to the scalability of computational resources, AIs could maintain thousands or even millions of complex relationships with other AIs simultaneously, as our computers already do through the internet. This could enable new forms of self-organization that help AIs to achieve their goals, but these forms could be too complex for human participation or comprehension."

"AIs can quickly adapt and replicate, thereby evolving more quickly. Evolution changes humans slowly. A human is unable to modify the architecture of her brain and is limited by the size of her skull. There are no such limitations for machines, which can alter their own code and scale by integrating new hardware. An AI could adapt itself rapidly, achieving in a matter of hours what could take biological evolution hundreds of thousands of years; many rapid microevolutionary adaptations result in large macroevolutionary transformations. Separately, an AI could multiply itself perfectly without limit. In contrast, it takes humans nine months to create their next generation, along with around 20 years of schooling and parenting to produce fully functioning new adults."

Natural selection favors AIs over humans

#solidstatelife #ai #evolution #naturalselection #aisafety