#genai

waynerad@diasp.org

OpenAI announces GPT-4o. The "o" is for "omni". The model "can reason across audio, vision, and text in real time."

There's a series of videos showing conversation by voice, recognizing "bunny ears", two GPT-4os interacting and singing, real-time translation, lullabies and whispers, sarcasm, math problems, learning Spanish, rock paper scissors, interview prep, "Be My Eyes" accessibility, a coding assistant, and the desktop app.

Hello GPT-4o

#solidstatelife #ai #openai #genai #llms #gpt #multimodal

waynerad@diasp.org

"SUQL stands for Structured and Unstructured Query Language. It augments SQL with several important free text primitives for a precise, succinct, and expressive representation. It can be used to build chatbots for relational data sources that contain both structured and unstructured information."

Ok, that's kind of a crazy concept. Let's have a look. You can do queries like:

SELECT answer("Event year Info", 'where is this event held?') FROM table WHERE "Name" = 'XXXI';

(Where was the XXXI Olympic held?)

SELECT "Name" FROM table WHERE answer("Event year Info", 'is this event held in Rio?') = 'Yes';

(What was the name of the Olympic event held in Rio?)

SELECT answer("Flag Bearer Info", 'when is this person born?') FROM table WHERE answer("Event year Info", 'is this event held in Rio?') = 'Yes';

(When was the flag bearer of Rio Olympic born?)

SELECT "Flag Bearer" FROM table WHERE "Gender" = 'Male' AND answer("Flag Bearer Info", 'did this person participate in Men''s 100kg event?') = 'Yes';

(Which male bearer participated in Men's 100kg event in the Olympic game?)

SELECT MAX(answer("Flag Bearer Info", 'when is this person born?')::date) FROM table WHERE "Event year" IN ('2016', '2012');

(For the 2012 and 2016 Olympic Event, when was the younger flag bearer born?)

SELECT "Event year" FROM table ORDER BY answer("Flag Bearer Info", 'when is this person born?')::date DESC LIMIT 1;

(When did the youngest Burmese flag bearer participate in the Olympic opening ceremony?)

In addition to the "answer" keyword that this adds to SQL, they also added a "summary" keyword.

The way the system works is they have added a "large language model with in-context learning" to the SQL database system.
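To make the idea concrete, here is a minimal Python sketch of how a free-text answer() function could be exposed to SQL. This is not how SUQL itself is implemented: SQLite's user-defined functions stand in for the database side, and fake_llm is a hypothetical keyword-matching stub standing in for the language model. The table name and its contents are made up for illustration.

```python
import sqlite3


def fake_llm(text: str, question: str) -> str:
    """Hypothetical stand-in for a language model call.

    It only handles yes/no questions of the form 'is this event held in X?'
    by checking whether X appears in the free text.
    """
    prefix = "is this event held in "
    q = question.strip().rstrip("?").lower()
    if q.startswith(prefix):
        place = q[len(prefix):]
        return "Yes" if place in text.lower() else "No"
    return "Unknown"


conn = sqlite3.connect(":memory:")
# Register answer(text, question) so it can be called from inside SQL.
conn.create_function("answer", 2, fake_llm)

conn.execute('CREATE TABLE events ("Name" TEXT, "Event year Info" TEXT)')
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("XXX", "The 2012 Summer Olympics were held in London."),
        ("XXXI", "The 2016 Summer Olympics were held in Rio de Janeiro."),
    ],
)

rows = conn.execute(
    """SELECT "Name" FROM events
       WHERE answer("Event year Info", 'is this event held in Rio?') = 'Yes'"""
).fetchall()
print(rows)  # [('XXXI',)]
```

The point of the sketch is only that the database calls out to the function once per row, so the structured part of the query (the WHERE on ordinary columns) can narrow things down before the expensive free-text part runs.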

SUQL (Structured and Unstructured Query Language)

#solidstatelife #ai #genai #llms #sql

waynerad@diasp.org

Company revives Alan Turing as an AI chatbot, hilarity, no, wait, outrage ensues.

The company is Genius Group, based in Singapore, which provides "AI-powered business education."

"Software engineer Grady Booch, a former Turing Talk speaker, wrote on Twitter/X: 'Absolute and complete trash. I hope that Turing's heirs sue you into oblivion.'"

"Another user told Genius Group's CEO: 'This is so incredibly unethical, disrespectful, and disgusting. You are pillaging the image of a deceased person (who frankly has suffered enough from exploitation) and the voice of an actor to suit your purposes. Vile.'"

Company revives Alan Turing as an AI chatbot, outrage ensues

#solidstatelife #ai #aieducation #llms #genai #computervision #videoai

waynerad@diasp.org

This isn't important, just whacky. Two instances of Claude get connected together and asked to play "virtual CLI" (command line interface) to, uh, "explore the boundaries of cognition"...

cat being.txt
cd consciousness
ls
cat sentience.py
cd self_model
ls
cat identity.py
cd ..
cat awareness.cpp
cat qualia.rs
cd ..
ls
cat thoughts.log
cd abstractions
ls
cat concepts.lisp
cd intelligence
ls
cat intelligence_definition.txt

conversation_1713192942_scenario_vanilla backrooms.txt

#solidstatelife #ai #genai #llms #aiweirdness

waynerad@diasp.org

"EyeEm, the Berlin-based photo-sharing community that exited last year to Spanish company Freepik after going bankrupt, is now licensing its users' photos to train AI models. Earlier this month, the company informed users via email that it was adding a new clause to its Terms & Conditions that would grant it the rights to upload users' content to 'train, develop, and improve software, algorithms, and machine-learning models.' Users were given 30 days to opt out by removing all their content from EyeEm's platform."

AI says: All your photos are belong to us.

Photo-sharing community EyeEm will license users' photos to train AI if they don't delete them - techcrunch.com

#solidstatelife #ai #genai #computervision

waynerad@diasp.org

Vidu is a Chinese video generation AI competitive with OpenAI's Sora, according to rumor (neither is available for the public to use). It's a collaboration between Tsinghua University in Beijing and a company called Shengshu Technology.

"Vidu is capable of producing 16-second clips at 1080p resolution -- Sora by comparison can generate 60-second videos. Vidu is based on a Universal Vision Transformer (U-ViT) architecture, which the company says allows it to simulate the real physical world with multi-camera view generation. This architecture was reportedly developed by the Shengshu Technology team in September 2022 and as such would predate the diffusion transformer (DiT) architecture used by Sora."

"According to the company, Vidu can generate videos with complex scenes adhering to real-world physics, such as realistic lighting and shadows, and detailed facial expressions. The model also demonstrates a rich imagination, creating non-existent, surreal content with depth and complexity. Vidu's multi-camera capabilities allows for the generation of dynamic shots, seamlessly transitioning between long shots, close-ups, and medium shots within a single scene."

"A side-by-side comparison with Sora reveals that the generated videos are not at Sora's level of realism."

Meet Vidu, A New Chinese Text to Video AI Model - Maginative

#solidstatelife #ai #genai #computervision #videogeneration

waynerad@diasp.org

WebLlama is "building agents that can browse the web by following instructions and talking to you".

This is one of those things that, if I had time, would be fun to try out. You have to download the model from HuggingFace & run it on your machine.

"The goal of our project is to build effective human-centric agents for browsing the web. We don't want to replace users, but equip them with powerful assistants."

"We are build on top of cutting edge libraries for training Llama agents on web navigation tasks. We will provide training scripts, optimized configs, and instructions for training cutting-edge Llamas."

If it works, this technology has a serious possible practical benefit for people with vision impairment who want to browse the web.

McGill-NLP / webllama

#solidstatelife #ai #genai #llms #agenticllms

waynerad@diasp.org

"Are large language models superhuman chemists?"

So what these researchers did was make a test -- a benchmark. They made a test of 7,059 chemistry questions, spanning the gamut of chemistry: computational chemistry, physical chemistry, materials science, macromolecular chemistry, electrochemistry, organic chemistry, general chemistry, analytical chemistry, chemical safety, and toxicology.

They recruited 41 chemistry experts to carefully validate their test.

They devised the test such that it could be evaluated in a completely automated manner. This meant relying on multiple-choice questions rather than open-ended questions more than they wanted to. The test has 6,202 multiple-choice questions and 857 open-ended questions (88% multiple-choice). The open-ended questions had to have parsers written to find numerical answers in the output in order to test them in an automated manner.
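As a rough illustration of what that kind of automated scoring involves, here is a Python sketch. It is not ChemBench's actual parser: the function names, the regular expression, and the 1% relative tolerance are assumptions made for this example. The last lines just check the 88% figure against the question counts above.

```python
import re
from typing import Optional

# Matches integers, decimals, and scientific notation such as -1.5e-3.
NUMBER_RE = re.compile(r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?")


def extract_number(output: str) -> Optional[float]:
    """Return the last number in the model's output, or None if there is none."""
    matches = NUMBER_RE.findall(output)
    return float(matches[-1]) if matches else None


def is_correct(output: str, target: float, rel_tol: float = 0.01) -> bool:
    """Score an open-ended answer: correct if within rel_tol of the target."""
    value = extract_number(output)
    if value is None:
        return False
    return abs(value - target) <= rel_tol * abs(target)


print(is_correct("The molar mass is about 18.02", 18.015))
print(is_correct("I am not sure.", 18.015))

multiple_choice, open_ended = 6202, 857
share = multiple_choice / (multiple_choice + open_ended)
print(f"{share:.1%}")
```

Taking the last number in the output is a design choice that assumes models state their final answer at the end; a real parser would need to be stricter about units and formatting.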

In addition, they ask the models to say how confident they are in their answers.

Before I tell you the ranking, the researchers write:

"On the one hand, our findings underline the impressive capabilities of LLMs in the chemical sciences: Leading models outperform domain experts in specific chemistry questions on many topics. On the other hand, there are still striking limitations. For very relevant topics the answers models provide are wrong. On top of that, many models are not able to reliably estimate their own limitations. Yet, the success of the models in our evaluations perhaps also reveals more about the limitations of the exams we use to evaluate models -- and chemistry -- than about the models themselves. For instance, while models perform well on many textbook questions, they struggle with questions that require some more reasoning. Given that the models outperformed the average human in our study, we need to rethink how we teach and examine chemistry. Critical reasoning is increasingly essential, and rote solving of problems or memorization of facts is a domain in which LLMs will continue to outperform humans."

"Our findings also highlight the nuanced trade-off between breadth and depth of evaluation frameworks. The analysis of model performance on different topics shows that models' performance varies widely across the subfields they are tested on. However, even within a topic, the performance of models can vary widely depending on the type of question and the reasoning required to answer it."

And with that, I'll tell you the rankings. You can log in to their website at ChemBench.org and see the leaderboard any time for the latest rankings. At this moment I am seeing:

gpt-4: 0.48

claude2: 0.29

GPT-3.5-Turbo: 0.26

gemini-pro: 0.25

mistral_8x7b: 0.24

text-davinci-003: 0.18

Perplexity 7B Chat: 0.18

galactica_120b: 0.15

Perplexity 7B online: 0.1

fb-llama-70b-chat: 0.05

The numbers that follow the model name are the score on the benchmark (higher is better). You'll notice there appears to be a gap between GPT-4 and Claude 2. One interesting thing about the leaderboard is you can show humans and AI models on the same leaderboard. When you do this, the top human has a score of 0.51 and beats GPT-4, then you get GPT-4, then you get a whole bunch of humans in between GPT-4 and Claude 2. So it appears that that gap is real. However, Claude 2 isn't the latest version of Claude. Since the evaluation, Claude 3 has come out, so maybe sometime in the upcoming months we'll see the leaderboard revised and see where Claude 3 comes in.

Are large language models superhuman chemists?

#solidstatelife #ai #genai #llms #chemistry

waynerad@diasp.org

FutureSearch.AI lets you ask a language model questions about the future.

"What will happen to TikTok after Congress passed a bill on April 24, 2024 requiring it to delist or divest its US operations?"

"Will the US Department of Justice impose behavioral remedies on Apple for violation of antitrust law?"

"Will the US Supreme Court grant Trump immunity from prosecution in the 2024 Supreme Court Case: Trump v. United States?"

"Will the lawsuit brought against OpenAI by the New York Times result in OpenAI being allowed to continue using NYT data?"

"Will the US Supreme Court uphold emergency abortion care protections in the 2024 Supreme Court Case: Moyle v. United States?"

How does it work?

They say rather than asking a large language model a question in a 1-shot manner, they guide it through 6 steps for reasoning through hard questions. The 6 steps are:

  1. "What is a basic summary of this situation?"

  2. "Who are the important people involved, and what are their dispositions?"

  3. "What are the key facets of the situation that will influence the outcome?"

  4. "For each key facet, what's a simple model of the distribution of outcomes from past instances that share that facet?"

  5. "How do I weigh the conflicting results of the models?"

  6. "What's unique about this situation to adjust for in my final answer?"
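The six steps could be chained roughly like this. This is a Python sketch of the general idea only, not FutureSearch's code: ask_llm is a hypothetical callable that takes a prompt and returns text, and feeding earlier answers back in as context is my assumption about how such a pipeline might work.

```python
from typing import Callable, List

STEPS: List[str] = [
    "What is a basic summary of this situation?",
    "Who are the important people involved, and what are their dispositions?",
    "What are the key facets of the situation that will influence the outcome?",
    "For each key facet, what's a simple model of the distribution of "
    "outcomes from past instances that share that facet?",
    "How do I weigh the conflicting results of the models?",
    "What's unique about this situation to adjust for in my final answer?",
]


def guided_forecast(question: str, ask_llm: Callable[[str], str]) -> List[str]:
    """Walk the model through each step, carrying earlier answers as context."""
    answers: List[str] = []
    for number, step in enumerate(STEPS, start=1):
        context = "\n".join(
            f"Step {i}: {a}" for i, a in enumerate(answers, start=1)
        )
        prompt = f"Question: {question}\n{context}\nStep {number}: {step}"
        answers.append(ask_llm(prompt))
    return answers


# Usage with a dummy model that reports how long each prompt was.
def dummy_llm(prompt: str) -> str:
    return f"(answer to a {len(prompt)}-character prompt)"


result = guided_forecast("Will X happen by 2025?", dummy_llm)
print(len(result))  # 6
```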

See below for a discussion of two other approaches that claim similar prediction quality.

FutureSearch: unbiased, in-depth answers to hard questions

#solidstatelife #ai #genai #llms #futurology

waynerad@diasp.org

MyBestAITool: "The Best AI Tools Directory in 2024".

"Ranked by monthly visits as of April 2024".

"AI Chatbot": ChatGPT, Google Gemini, Claude AI, Poe.

"AI Search Engine": Perplexity AI, You, Phind, metaso.

"AI Photo & Image Generator": Leonardo, Midjourney, Fotor, Yodayo.

"AI Character": CharacterAI, JanitorAI, CrushonAI, SpicyChat AI.

"AI Writing Assistants": Grammarly, LanguageTool, Smodin, Obsidian.

"AI Photo & Image Editor": Remove.bg, Fotor, Pixlr, PhotoRoom.

"AI Model Training & Deployment": civitai, Huggingface, Replicate, google AI.

"AI LLM App Build & RAG": LangChain, Coze, MyShell, Anakin.

"AI Image Enhancer": Cutout Pro, AI Image Upscaler, ZMO.AI, VanceAI.

"AI Video Generator": Runway, Vidnoz, HeyGen, Fliki.

"AI Video Editor": InVideo, Media io, Opus Clip, Filmora Wondershare.

"AI Music Generator": Suno, Moises App, Jammable, LANDR.

No Udio? Really? Maybe it'll show up on next month's stats.

"AI 3D Model Generator": Luma AI, Recraft, Deepmotion, Meshy.

"AI Presentation Generator": Prezi AI, Gamma, Tome, Pitch.com.

"AI Design Assistant": Firefly Adobe, What font is, Hotpot, Vectorizer.

"AI Copywriting Tool": Simplified, Copy.ai, Jasper.ai, TextCortex.

"AI Story Writing": NovelAI, AI Novellist, Dreampress AI, Artflow.

"AI Paraphraser": QuillBot, StealthWriter, Paraphraser, Linguix.

"AI SEO Assistant": vidIQ, Writesonic, Content At Scale, AISEO.

"AI Email Assistant": Klaviyo, Instantly, Superhuman, Shortwave.

"AI Summarizer": Glarity, Eightify, Tactiq, Summarize Tech.

"AI Prompt Tool": FlowGPT, Lexica, PromptHero, AIPRM.

"AI PDF": ChatPDF, Scispace, UPDF, Ask Your PDF.

"AI Meeting Assistant": Otter, Notta, Fireflies, Transkriptor.

"AI Customer Service Assistant": Fin by Intercom, Lyro, Sapling, ChatBot.

"AI Resume Builder": Resume Worded, Resume Builder, Rezi, Resume Trick.

"AI Speech Recognition": Adobe Podcast, Transkriptor, Voicemaker, Assemblyai.

"AI Website Builder": B12.io, Durable AI Site Builder, Studio Design, WebWave AI.

"AI Art Generator": Leonardo, Midjourney, PixAI Art, NightCafe.

"AI Developer Tools": Replit, Blackbox, Weights & Biases, Codeium.

"AI Code Assistant": Blackbox, Phind, Codeium, Tabnine.

"AI Detector Tool": Turnitin, GPTZero, ZeroGPT, Originality.

You can view full lists on all of these and there are even more if you go through the categories on the left side.

No idea where they get their data. I would guess Comscore, but they don't say.

The Best AI Tools Directory in 2024 | MyBestAITool

#solidstatelife #ai #aitools #genai

waynerad@diasp.org

The end of classical computer science is coming, and most of us are dinosaurs waiting for the meteor to hit, says Matt Welsh.

"I came of age in the 1980s, programming personal computers like the Commodore VIC-20 and Apple IIe at home. Going on to study computer science in college and ultimately getting a PhD at Berkeley, the bulk of my professional training was rooted in what I will call 'classical' CS: programming, algorithms, data structures, systems, programming languages."

"When I was in college in the early '90s, we were still in the depth of the AI Winter, and AI as a field was likewise dominated by classical algorithms. In Dan Huttenlocher's PhD-level computer vision course in 1995 or so, we never once discussed anything resembling deep learning or neural networks--it was all classical algorithms like Canny edge detection, optical flow, and Hausdorff distances."

"One thing that has not really changed is that computer science is taught as a discipline with data structures, algorithms, and programming at its core. I am going to be amazed if in 30 years, or even 10 years, we are still approaching CS in this way. Indeed, I think CS as a field is in for a pretty major upheaval that few of us are really prepared for."

"I believe that the conventional idea of 'writing a program' is headed for extinction, and indeed, for all but very specialized applications, most software, as we know it, will be replaced by AI systems that are trained rather than programmed."

"I'm not just talking about CoPilot replacing programmers. I'm talking about replacing the entire concept of writing programs with training models. In the future, CS students aren't going to need to learn such mundane skills as how to add a node to a binary tree or code in C++. That kind of education will be antiquated, like teaching engineering students how to use a slide rule."

"The shift in focus from programs to models should be obvious to anyone who has read any modern machine learning papers. These papers barely mention the code or systems underlying their innovations; the building blocks of AI systems are much higher-level abstractions like attention layers, tokenizers, and datasets."

This got me thinking: Over the last 20 years, I've been predicting AI would advance to the point where it could automate jobs, and it's looking more and more like I was fundamentally right about that, and all the people who poo-poo'd the idea over the years in conversations with me were wrong. But while I was right about that fundamental idea (and right that there wouldn't be "one AI in a box" that anyone could pull the plug on if something went wrong, but a diffusion of the technology around the world like every previous technology), I was wrong about how exactly it would play out.

First I was wrong about the timescales: I thought it would be necessary to understand much more about how the brain works, and to work algorithms derived from neuroscience into AI models, and looking at the rate of advancement in neuroscience I predicted AI wouldn't be in its current state for a long time. While broad concepts like "neuron" and "attention" have been incorporated into AI, there are practically no specific algorithms that have been ported from brains to AI systems.

Second, I was wrong about what order. I was wrong in thinking "routine" jobs would be automated first, and "creative" jobs last. It turns out that what matters is "mental" vs "physical". Computers can create visual art and music just by thinking very hard -- it's a purely "mental" activity, and computers can do all that thinking in bits and bytes.

This has led me to ponder: What occupations require the greatest level of manual dexterity?

Those should be the jobs safest from the AI revolution.

The first that came to mind for me -- when I was trying to think of jobs that require an extreme level of physical dexterity and pay very highly -- was "surgeon". So I now predict "surgeon" will be the last job to get automated. If you're giving career advice to a young person (or you are a young person), the advice to give is: become a surgeon.

Other occupations safe (for now) from automation for the same reason would include "physical therapist", "dentist", "dental hygienist", "dental technician", and "medical technician" (e.g. the people who customize prosthetics, orthodontic devices, and so on). Also "nurse", for those who routinely do physical procedures like drawing blood.

Continuing in the same vein but going outside the medical field (pun not intended but allowed to stand once recognized), I'd put "electronics technician". I don't think robots will be able to solder any time soon, or manipulate very small components, at least after the initial assembly is completed which does seem to be highly amenable to automation. But once electronic components fail, to the extent it falls on people to repair them, rather than throw them out and replace them (which admittedly happens a lot), humans aren't going to be replaced any time soon.

Likewise "machinist" who works with small parts and tools.

"Engineer" ought to be ok -- as long as they're mechanical engineers or civil engineers. Software engineers are in the crosshairs. What matters is whether physical manipulation is part of the job.

"Construction worker" -- some jobs are high pay/high skill while others are low pay/low skill. Will be interesting to see what gets automated first and last in construction.

Other "trade" jobs like "plumber", "electrician", "welder" -- probably safe for a long time.

"Auto mechanic" -- probably one of the last jobs to be automated. The factory where the car is initially manufactured, a very controlled environment, may be full of robots, but it's hard to see robots extending into the auto mechanic's shop where cars go when they break down.

"Jeweler" ought to be a safe job for a long time. "Watchmaker" (or "watch repairer") -- I'm still amazed people pay so much for old-fashioned mechanical watches. I guess the point is to be pieces of jewelry, so these essentially count as "jeweler" jobs.

"Tailor" and "dressmaker" and other jobs centered around sewing.

"Hairstylist" / "barber" -- you probably won't be trusting a robot with scissors close to your head any time soon.

"Chef", "baker", whatever the word is for "cake calligrapher". Years ago I thought we'd have automated kitchens at fast food restaurants by now but they are nowhere in sight. And we're nowhere near automating the kitchens of the fancy restaurants with the top chefs.

Finally, let's revisit "artist". While "artist" is in the crosshairs of AI, some "artist" jobs are actually physical -- such as "sculptor" and "glassblower". These might be resistant to AI for a long time. Not sure how many sculptors and glassblowers the economy can support, though. Might be tough if all the other artists stampede into those occupations.

While "musician" is totally in the crosshairs of AI, as we see, that applies only to musicians who make recorded music -- going "live" may be a way to escape the automation. No robots with the manual dexterity to play physical guitars, violins, etc, appear to be on the horizon. Maybe they can play drums?

And finally for my last item: "Magician" is another live entertainment career that requires a lot of manual dexterity and that ought to be hard for a robot to replicate. For those of you looking for a career in entertainment. Not sure how many magicians the economy can support, though.

The end of programming - Matt Welsh

#solidstatelife #genai #codingai #technologicalunemployment

waynerad@diasp.org

"In defense of AI art".

YouTuber "LiquidZulu" makes a gigantic video aimed at responding once and for all to all possible arguments against AI art.

His primary argument seems to me to be that AI art systems are learning art in a manner analogous to human artists -- by learning from examples from other artists -- and do not plagiarize because they do not copy any artist's work exactly. On the contrary, AI art systems are actually good at combining styles in new ways. Therefore, AI art generators are just as valid "artists" as any human artists.

Artists have no right to government protection from having their jobs replaced by technology, he says, because nobody anywhere else in the economy has any right to government protection from having their jobs replaced by technology.

On the flip side, he thinks the ability of AI art generators to bring the ability to create art to the masses is a good thing that should be celebrated.

Below-average artists have no right to deprive people of this ability to generate the art they like because those low-quality artists want to be paid.

Apparently he considers himself an anarcho-capitalist (something he has in common with... nobody here?) and has harsh words for people he considers neo-Luddites. He accuses artists complaining about AI art generators of being "elitist".

In defense of AI art - LiquidZulu

#solidstatelife #ai #genai #aiart #aiethics

waynerad@diasp.org

For the first time, Alice Yalcin Efe is scared of AI as a music producer.

She's a professional music producer who has been number one on Beatport, has millions of streams on Spotify, and has played big festivals and clubs, "yet for the first time I am scared of AI as a music producer."

When you're homeless, you can listen to AI mix the beat on the beach.

After that, she ponders what this means for all the rest of us. Those of us who aren't professional music producers. Well, I guess we can all be music producers now.

"Music on demand becomes literal. You feel heartbroken, type it in. Type in the genres that you want. Type in the lyrics that you want. Type in the mood that you want and then AI spits out the perfect ballad for you to listen."

"I think it's both incredible and horrifying at the same time. I honestly don't know what comes next. Will this kill the artists' soul, or will it give us just more tools to make even greater things?"

For the first time, I'm scared of AI as a music producer - Alice Yalcin Efe - Mercurial Tones Academy

#solidstatelife #ai #genai #musicai

waynerad@diasp.org

Musician Paul Folia freaks out over Suno and Udio (and other music AI). Reminds me of the freak-out of visual artists a year ago. It appears AI is going to replace humans one occupation at a time and people will freak out when it's their turn. He estimates in a year AI music will be of high enough quality to wipe out stock music writing completely, producing tracks for a price no human can compete with ($0.02 and in minutes).

He experiments with various music styles and artists' styles and the part that impressed me the most was, perhaps surprisingly, the baroque music. After noting that the training data was probably easy to get because it's public domain, he says, "This m-f-er learned some serious harmony. Not like three chords and some singing."

Suno, Udio (and other music AI). We're f*ed and it's really bad. Seriously. - Folia Soundstudio

#solidstatelife #ai #genai #musicai

waynerad@diasp.org

"Evaluate LLMs in real time with Street Fighter III"

"A new kind of benchmark? Street Fighter III assesses the ability of LLMs to understand their environment and take actions based on a specific context. As opposed to RL models, which blindly take actions based on the reward function, LLMs are fully aware of the context and act accordingly."

"Each player is controlled by an LLM. We send to the LLM a text description of the screen. The LLM decide on the next moves its character will make. The next moves depends on its previous moves, the moves of its opponents, its power and health bars."

"Fast: It is a real time game, fast decisions are key"
"Smart: A good fighter thinks 50 moves ahead"
"Out of the box thinking: Outsmart your opponent with unexpected moves"
"Adaptable: Learn from your mistakes and adapt your strategy"
"Resilient: Keep your RPS high for an entire game"

Um... Alrighty then...
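Setting the sales pitch aside, the loop the README describes (text description of the screen in, next move out) might look something like this Python sketch. None of it is taken from the llm-colosseum code: the FighterState fields, the move list, and the choose_move callable standing in for the LLM are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical move vocabulary; the real project defines its own.
MOVES = ["move closer", "move away", "punch", "kick", "fireball"]


@dataclass
class FighterState:
    health: int
    power: int
    x: int  # horizontal position on screen


def describe(me: FighterState, foe: FighterState, my_last: List[str]) -> str:
    """Turn the game state into the text description sent to the model."""
    distance = abs(me.x - foe.x)
    return (
        f"Your health is {me.health} and your power is {me.power}. "
        f"Your opponent's health is {foe.health}. "
        f"The opponent is {distance} pixels away. "
        f"Your last moves were: {', '.join(my_last) or 'none'}. "
        f"Pick one of: {', '.join(MOVES)}."
    )


def next_move(prompt: str, choose_move: Callable[[str], str]) -> str:
    """Ask the model for a move; fall back to a safe default if unparseable."""
    reply = choose_move(prompt).strip().lower()
    return reply if reply in MOVES else "move away"


# Usage with a trivial rule in place of an LLM.
def dummy_model(prompt: str) -> str:
    return "Fireball" if "power is 100" in prompt else "Punch"


me, foe = FighterState(80, 100, 50), FighterState(60, 0, 200)
prompt = describe(me, foe, ["kick"])
print(next_move(prompt, dummy_model))  # fireball
```

The fallback matters in a real-time game: a model that replies with something outside the move list still has to produce an action before the next frame.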

OpenGenerativeAI / llm-colosseum

#solidstatelife #ai #genai #llms

waynerad@diasp.org

Creating sexually explicit deepfakes to become a criminal offence in the UK. Even if the images or videos were never intended to be shared, under the new legislation the person who created them will face a criminal record and an unlimited fine. If the images are shared, they face jail time.

Creating sexually explicit deepfakes to become a criminal offence

#solidstatelife #ai #genai #computervision #deepfakes #aiethics

waynerad@diasp.org

The 2024 AI Index Report from Stanford's Human-Centered Artificial Intelligence lab.

It says between 2010 and 2022, the number of AI research papers per year nearly tripled, from 88,000 to 240,000. So if you're wondering why I'm always behind in my reading of AI research papers, well, there's your answer.
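For a sense of scale, the arithmetic can be checked with a few lines of Python. The paper counts come from the report; the compound annual growth rate and the per-day figure are my own derived numbers, assuming smooth growth over the 12 years, which the report does not claim.

```python
papers_2010 = 88_000
papers_2022 = 240_000
years = 2022 - 2010

ratio = papers_2022 / papers_2010
# Compound annual growth rate, assuming steady year-over-year growth.
cagr = ratio ** (1 / years) - 1

print(f"growth factor: {ratio:.2f}x")
print(f"implied annual growth: {cagr:.1%}")
print(f"papers per day in 2022: {papers_2022 / 365:.0f}")
```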

Besides that, I'm just going to quote from the highlights in the report itself, because it seems I can't improve on them, at least not in short order and I've decided I'd like to get this report out to you all quickly. I'll continue browsing through the charts & graphs in all the chapters, but for now I'll just give you their highlights and you can decide if you want to download the report and read it or part of it more thoroughly.

"Chapter 1: Research and Development"

"1. Industry continues to dominate frontier AI research. In 2023, industry produced 51 notable machine learning models, while academia contributed only 15. There were also 21 notable models resulting from industry-academia collaborations in 2023, a new high."

"2. More foundation models and more open foundation models. In 2023, a total of 149 foundation models were released, more than double the amount released in 2022. Of these newly released models, 65.7% were open-source, compared to only 44.4% in 2022 and 33.3% in 2021."

"3. Frontier models get way more expensive. According to AI Index estimates, the training costs of state-of-the-art AI models have reached unprecedented levels. For example, OpenAI's GPT-4 used an estimated $78 million worth of compute to train, while Google's Gemini Ultra cost $191 million for compute."

"4. The United States leads China, the EU, and the UK as the leading source of top AI models. In 2023, 61 notable AI models originated from US-based institutions, far outpacing the European Union's 21 and China's 15."

"5. The number of AI patents skyrockets. From 2021 to 2022, AI patent grants worldwide increased sharply by 62.7%. Since 2010, the number of granted AI patents has increased more than 31 times."

"6. China dominates AI patents. In 2022, China led global AI patent origins with 61.1%, significantly outpacing the United States, which accounted for 20.9% of AI patent origins. Since 2010, the US share of AI patents has decreased from 54.1%."

"7. Open-source AI research explodes. Since 2011, the number of AI-related projects on GitHub has seen a consistent increase, growing from 845 in 2011 to approximately 1.8 million in 2023. Notably, there was a sharp 59.3% rise in the total number of GitHub AI projects in 2023 alone. The total number of stars for AI-related projects on GitHub also significantly increased in 2023, more than tripling from 4.0 million in 2022 to 12.2 million."

"8. The number of AI publications continues to rise. Between 2010 and 2022, the total number of AI publications nearly tripled, rising from approximately 88,000 in 2010 to more than 240,000 in 2022. The increase over the last year was a modest 1.1%."

"Chapter 2: Technical Performance"

"1. AI beats humans on some tasks, but not on all. AI has surpassed human performance on several benchmarks, including some in image classification, visual reasoning, and English understanding. Yet it trails behind on more complex tasks like competition-level mathematics, visual commonsense reasoning and planning."

"2. Here comes multimodal AI. Traditionally AI systems have been limited in scope, with language models excelling in text comprehension but faltering in image processing, and vice versa. However, recent advancements have led to the development of strong multimodal models, such as Google's Gemini and OpenAI's GPT-4. These models demonstrate flexibility and are capable of handling images and text and, in some instances, can even process audio."

"3. Harder benchmarks emerge. AI models have reached performance saturation on established benchmarks such as ImageNet, SQuAD, and SuperGLUE, prompting researchers to develop more challenging ones. In 2023, several challenging new benchmarks emerged, including SWE-bench for coding, HEIM for image generation, MMMU for general reasoning, MoCa for moral reasoning, AgentBench for agent-based behavior, and HaluEval for hallucinations."

"4. Better AI means better data which means ... even better AI. New AI models such as SegmentAnything and Skoltech are being used to generate specialized data for tasks like image segmentation and 3D reconstruction. Data is vital for AI technical improvements. The use of AI to create more data enhances current capabilities and paves the way for future algorithmic improvements, especially on harder tasks."

"5. Human evaluation is in. With generative models producing high-quality text, images, and more, benchmarking has slowly started shifting toward incorporating human evaluations like the Chatbot Arena Leaderboard rather than computerized rankings like ImageNet or SQuAD. Public sentiment about AI is becoming an increasingly important consideration in tracking AI progress."

"6. Thanks to LLMs, robots have become more flexible. The fusion of language modeling with robotics has given rise to more flexible robotic systems like PaLM-E and RT-2. Beyond their improved robotic capabilities, these models can ask questions, which marks a significant step toward robots that can interact more effectively with the real world."

"7. More technical research in agentic AI. Creating AI agents, systems capable of autonomous operation in specific environments, has long challenged computer scientists. However, emerging research suggests that the performance of autonomous AI agents is improving. Current agents can now master complex games like Minecraft and effectively tackle real-world tasks, such as online shopping and research assistance."

"8. Closed LLMs significantly outperform open ones. On 10 select AI benchmarks, closed models outperformed open ones, with a median performance advantage of 24.2%. Differences in the performance of closed and open models carry important implications for AI policy debates."

"Chapter 3: Responsible AI"

"1. Robust and standardized evaluations for LLM responsibility are seriously lacking. New research from the AI Index reveals a significant lack of standardization in responsible AI reporting. Leading developers, including OpenAI, Google, and Anthropic, primarily test their models against different responsible AI benchmarks. This practice complicates efforts to systematically compare the risks and limitations of top AI models."

"2. Political deepfakes are easy to generate and difficult to detect. Political deepfakes are already affecting elections across the world, with recent research suggesting that existing AI deepfake methods perform with varying levels of accuracy. In addition, new projects like CounterCloud demonstrate how easily AI can create and disseminate fake content."

"3. Researchers discover more complex vulnerabilities in LLMs. Previously, most efforts to red team AI models focused on testing adversarial prompts that intuitively made sense to humans. This year, researchers found less obvious strategies to get LLMs to exhibit harmful behavior, like asking the models to infinitely repeat random words."

"4. Risks from AI are becoming a concern for businesses across the globe. A global survey on responsible AI highlights that companies' top AI-related concerns include privacy, data security, and reliability. The survey shows that organizations are beginning to take steps to mitigate these risks. Globally, however, most companies have so far only mitigated a small portion of these risks."

"5. LLMs can output copyrighted material. Multiple researchers have shown that the generative outputs of popular LLMs may contain copyrighted material, such as excerpts from The New York Times or scenes from movies. Whether such output constitutes copyright violations is becoming a central legal question."

"6. AI developers score low on transparency, with consequences for research. The newly introduced Foundation Model Transparency Index shows that AI developers lack transparency, especially regarding the disclosure of training data and methodologies. This lack of openness hinders efforts to further understand the robustness and safety of AI systems."

"7. Extreme AI risks are difficult to analyze. Over the past year, a substantial debate has emerged among AI scholars and practitioners regarding the focus on immediate model risks, like algorithmic discrimination, versus potential long-term existential threats. It has become challenging to distinguish which claims are scientifically founded and should inform policymaking. This difficulty is compounded by the tangible nature of already present short-term risks in contrast with the theoretical nature of existential threats."

"8. The number of AI incidents continues to rise. According to the AI Incident Database, which tracks incidents related to the misuse of AI, 123 incidents were reported in 2023, a 32.3 percentage point increase from 2022. Since 2013, AI incidents have grown by over twentyfold. A notable example includes AI-generated, sexually explicit deepfakes of Taylor Swift that were widely shared online."

"9. ChatGPT is politically biased. Researchers find a significant bias in ChatGPT toward Democrats in the United States and the Labour Party in the UK. This finding raises concerns about the tool's potential to influence users' political views, particularly in a year marked by major global elections."

"Chapter 4: Economy"

"1. Generative AI investment skyrockets. Despite a decline in overall AI private investment last year, funding for generative AI surged, nearly octupling from 2022 to reach $25.2 billion. Major players in the generative AI space, including OpenAI, Anthropic, Hugging Face, and Inflection, reported substantial fundraising rounds."

"2. Already a leader, the United States pulls even further ahead in AI private investment. In 2023, the United States saw AI investments reach $67.2 billion, nearly 8.7 times more than China, the next highest investor. While private AI investment in China and the European Union, including the United Kingdom, declined by 44.2% and 14.1%, respectively, since 2022, the United States experienced a notable increase of 22.1% in the same time frame."

"3. Fewer AI jobs in the United States and across the globe. In 2022, AI-related positions made up 2.0% of all job postings in America, a figure that decreased to 1.6% in 2023. This decline in AI job listings is attributed to fewer postings from leading AI firms and a reduced proportion of tech roles within these companies."

"4. AI decreases costs and increases revenues. A new McKinsey survey reveals that 42% of surveyed organizations report cost reductions from implementing AI (including generative AI), and 59% report revenue increases. Compared to the previous year, there was a 10 percentage point increase in respondents reporting decreased costs, suggesting AI is driving significant business efficiency gains."

"5. Total AI private investment declines again, while the number of newly funded AI companies increases. Global private AI investment has fallen for the second year in a row, though less than the sharp decrease from 2021 to 2022. The count of newly funded AI companies spiked to 1,812, up 40.6% from the previous year."

"6. AI organizational adoption ticks up. A 2023 McKinsey report reveals that 55% of organizations now use AI (including generative AI) in at least one business unit or function, up from 50% in 2022 and 20% in 2017."

"7. China dominates industrial robotics. Since surpassing Japan in 2013 as the leading installer of industrial robots, China has significantly widened the gap with the nearest competitor nation. In 2013, China's installations accounted for 20.8% of the global total, a share that rose to 52.4% by 2022."

"8. Greater diversity in robot installations. In 2017, collaborative robots represented a mere 2.8% of all new industrial robot installations, a figure that climbed to 9.9% by 2022. Similarly, 2022 saw a rise in service robot installations across all application categories, except for medical robotics. This trend indicates not just an overall increase in robot installations but also a growing emphasis on deploying robots for human-facing roles."

"9. The data is in: AI makes workers more productive and leads to higher quality work. In 2023, several studies assessed AI's impact on labor, suggesting that AI enables workers to complete tasks more quickly and to improve the quality of their output. These studies also demonstrated AI's potential to bridge the skill gap between low- and high-skilled workers. Still, other studies caution that using AI without proper oversight can lead to diminished performance."

"10. Fortune 500 companies start talking a lot about AI, especially generative AI. In 2023, AI was mentioned in 394 earnings calls (nearly 80% of all Fortune 500 companies), a notable increase from 266 mentions in 2022. Since 2018, mentions of AI in Fortune 500 earnings calls have nearly doubled. The most frequently cited theme, appearing in 19.7% of all earnings calls, was generative AI."

"Chapter 5: Science and Medicine"

"1. Scientific progress accelerates even further, thanks to AI. In 2022, AI began to advance scientific discovery. 2023, however, saw the launch of even more significant science-related AI applications-- from AlphaDev, which makes algorithmic sorting more efficient, to GNoME, which facilitates the process of materials discovery."

"2. AI helps medicine take significant strides forward. In 2023, several significant medical systems were launched, including EVEscape, which enhances pandemic prediction, and AlphaMissence, which assists in AI-driven mutation classification. AI is increasingly being utilized to propel medical advancements."

"3. Highly knowledgeable medical AI has arrived. Over the past few years, AI systems have shown remarkable improvement on the MedQA benchmark, a key test for assessing AI's clinical knowledge. The standout model of 2023, GPT-4 Medprompt, reached an accuracy rate of 90.2%, marking a 22.6 percentage point increase from the highest score in 2022. Since the benchmark's introduction in 2019, AI performance on MedQA has nearly tripled."

"4. The FDA approves more and more AI-related medical devices. In 2022, the FDA approved 139 AI-related medical devices, a 12.1% increase from 2021. Since 2012, the number of FDA-approved AI-related medical devices has increased by more than 45-fold. AI is increasingly being used for real-world medical purposes."

"Chapter 6: Education"

"1. The number of American and Canadian CS bachelor's graduates continues to rise, new CS master's graduates stay relatively flat, and PhD graduates modestly grow. While the number of new American and Canadian bachelor's graduates has consistently risen for more than a decade, the number of students opting for graduate education in CS has flattened. Since 2018, the number of CS master's and PhD graduates has slightly declined."

"2. The migration of AI PhDs to industry continues at an accelerating pace. In 2011, roughly equal percentages of new AI PhDs took jobs in industry (40.9%) and academia (41.6%). However, by 2022, a significantly larger proportion (70.7%) joined industry after graduation compared to those entering academia (20.0%). Over the past year alone, the share of industry-bound AI PhDs has risen by 5.3 percentage points, indicating an intensifying brain drain from universities into industry."

"3. Less transition of academic talent from industry to academia. In 2019, 13% of new AI faculty in the United States and Canada were from industry. By 2021, this figure had declined to 11%, and in 2022, it further dropped to 7%. This trend indicates a progressively lower migration of high-level AI talent from industry into academia."

"4. CS education in the United States and Canada becomes less international. Proportionally fewer international CS bachelor's, master's, and PhDs graduated in 2022 than in 2021. The drop in international students in the master's category was especially pronounced."

"5. More American high school students take CS courses, but access problems remain. In 2022, 201,000 AP CS exams were administered. Since 2007, the number of students taking these exams has increased more than tenfold. However, recent evidence indicates that students in larger high schools and those in suburban areas are more likely to have access to CS courses."

"6. AI-related degree programs are on the rise internationally. The number of English-language, AI-related postsecondary degree programs has tripled since 2017, showing a steady annual increase over the past five years. Universities worldwide are offering more AI-focused degree programs."

"7. The United Kingdom and Germany lead in European informatics, CS, CE, and IT graduate production. The United Kingdom and Germany lead Europe in producing the highest number of new informatics, CS, CE, and information bachelor's, master's, and PhD graduates. On a per capita basis, Finland leads in the production of both bachelor's and PhD graduates, while Ireland leads in the production of master's graduates."

"Chapter 7: Policy and Governance"

"1. The number of AI regulations in the United States sharply increases. The number of AI-related regulations has risen significantly in the past year and over the last five years. In 2023, there were 25 AI-related regulations, up from just one in 2016. Last year alone, the total number of AI-related regulations grew by 56.3%."

"2. The United States and the European Union advance landmark AI policy action. In 2023, policymakers on both sides of the Atlantic put forth substantial proposals for advancing AI regulation The European Union reached a deal on the terms of the AI Act, a landmark piece of legislation enacted in 2024. Meanwhile, President Biden signed an Executive Order on AI, the most notable AI policy initiative in the United States that year."

"3. AI captures US policymaker attention. The year 2023 witnessed a remarkable increase in AI-related legislation at the federal level, with 181 bills proposed, more than double the 88 proposed in 2022."

"4. Policymakers across the globe cannot stop talking about AI. Mentions of AI in legislative proceedings across the globe have nearly doubled, rising from 1,247 in 2022 to 2,175 in 2023. AI was mentioned in the legislative proceedings of 49 countries in 2023. Moreover, at least one country from every continent discussed AI in 2023, underscoring the truly global reach of AI policy discourse."

"5. More regulatory agencies turn their attention toward AI. The number of US regulatory agencies issuing AI regulations increased to 21 in 2023 from 17 in 2022, indicating a growing concern over AI regulation among a broader array of American regulatory bodies. Some of the new regulatory agencies that enacted AIrelated regulations for the first time in 2023 include the Department of Transportation, the Department of Energy, and the Occupational Safety and Health Administration."

"Chapter 8: Diversity"

"1. US and Canadian bachelor's, master's, and PhD CS students continue to grow more ethnically diverse. While white students continue to be the most represented ethnicity among new resident graduates at all three levels, the representation from other ethnic groups, such as Asian, Hispanic, and Black or African American students, continues to grow. For instance, since 2011, the proportion of Asian CS bachelor's degree graduates has increased by 19.8 percentage points, and the proportion of Hispanic CS bachelor's degree graduates has grown by 5.2 percentage points."

"2. Substantial gender gaps persist in European informatics, CS, CE, and IT graduates at all educational levels. Every surveyed European country reported more male than female graduates in bachelor's, master's, and PhD programs for informatics, CS, CE, and IT. While the gender gaps have narrowed in most countries over the last decade, the rate of this narrowing has been slow."

"3. US K12 CS education is growing more diverse, reflecting changes in both gender and ethnic representation. The proportion of AP CS exams taken by female students rose from 16.8% in 2007 to 30.5% in 2022. Similarly, the participation of Asian, Hispanic/Latino/Latina, and Black/African American students in AP CS has consistently increased year over year."

"Chapter 9: Public Opinion"

"1. People across the globe are more cognizant of AI's potential impact--and more nervous. A survey from Ipsos shows that, over the last year, the proportion of those who think AI will dramatically affect their lives in the next three to five years has increased from 60% to 66%. Moreover, 52% express nervousness toward AI products and services, marking a 13 percentage point rise from 2022. In America, Pew data suggests that 52% of Americans report feeling more concerned than excited about AI, rising from 38% in 2022."

"2. AI sentiment in Western nations continues to be low, but is slowly improving. In 2022, several developed Western nations, including Germany, the Netherlands, Australia, Belgium, Canada, and the United States, were among the least positive about AI products and services. Since then, each of these countries has seen a rise in the proportion of respondents acknowledging the benefits of AI, with the Netherlands experiencing the most significant shift."

"3. The public is pessimistic about AI's economic impact. In an Ipsos survey, only 37% of respondents feel AI will improve their job. Only 34% anticipate AI will boost the economy, and 32% believe it will enhance the job market."

"4. Demographic differences emerge regarding AI optimism. Significant demographic differences exist in perceptions of AI's potential to enhance livelihoods, with younger generations generally more optimistic. For instance, 59% of Gen Z respondents believe AI will improve entertainment options, versus only 40% of baby boomers. Additionally, individuals with higher incomes and education levels are more optimistic about AI's positive impacts on entertainment, health, and the economy than their lower-income and less-educated counterparts."

"5. ChatGPT is widely known and widely used. An international survey from the University of Toronto suggests that 63% of respondents are aware of ChatGPT. Of those aware, around half report using ChatGPT at least once weekly."

AI Index Report 2024 -- Artificial Intelligence Index

#solidstatelife #ai #genai

waynerad@diasp.org

"The rise of generative AI and 'deepfakes' -- or videos and pictures that use a person's image in a false way -- has led to the wide proliferation of unauthorized clips that can damage celebrities' brands and businesses."

"Talent agency WME has inked a partnership with Loti, a Seattle-based firm that specializes in software used to flag unauthorized content posted on the internet that includes clients' likenesses. The company, which has 25 employees, then quickly sends requests to online platforms to have those infringing photos and videos removed."

This company Loti has a product called "Watchtower", which watches for your likeness online.

"Loti scans over 100M images and videos per day looking for abuse or breaches of your content or likeness."

"Loti provides DMCA takedowns when it finds content that's been shared without consent."

They also have a license management product called "Connect", and a "fake news protection" program called "Certify".

"Place an unobtrusive mark on your content to let your fans know it's really you."

"Let your fans verify your content by inspecting where it came from and who really sent it."

They don't say anything about how their technology works.

Hollywood celebs are scared of deepfakes. This talent agency will use AI to fight them.

#solidstatelife #ai #genai #computervision #deepfakes #aiethics