#solidstatelife

waynerad@diasp.org

A paper monitor, but the hardware designs for this one are entirely open source.

"The Modos Paper Monitor is an open-hardware 13.3-inch, 1600 x 1200 monochrome or color e-ink monitor with a fast 60 Hz refresh rate, low latency, multiple image modes and dithering options, and flexible screen update control. It can be connected using HDMI and USB-C and works on Linux, macOS, and Windows."

There's a link to a document below that explains everything you ever wanted to know about paper monitors and e-ink.

Modos Paper Monitor pre-launch on Crowd Supply

#solidstatelife #papermonitors #eink

waynerad@diasp.org

Older Venezuelans have figured out they can escape (alleged) age discrimination and make money from "clickwork," earning pennies by labeling and annotating data to train AI systems. Wait, anybody anywhere is paid to do labeling for AI systems? I thought "self-supervised learning" and "synthetic data" had obsoleted that practice. No?

Amid economic collapse, older Venezuelans turn to gig work

#solidstatelife #ai #technologicalunemployment

waynerad@diasp.org

"Whatever the hell is going on with the folks at Emory University is simply bizarre. A group of students are suing the school after being suspended for a year over an AI program they built called "Eightball," which is designed to automagically review course study material within the school's software where professors place those study materials and develop flashcards, study materials for review, and the like. The only problem is that the school not only knew all about Eightball, it paid these same students $10,000 to make it."

"The school actually did much more than just fund Eightball's creation. It promoted the tool on its website. It announced how awesome the tool is on LinkedIn. Emails from faculty at Emory showered the creators of Eightball with all kinds of praise, including from the Associate Dean of the school. Everything was great, all of this was above-board, and it seemed that these Emory students were well on their way to doing something special, with the backing of the university."

"Then the school's IT and Honor Council got involved."

Emory University suspends students over AI study tool the school gave them $10k to build and promoted

#solidstatelife #ai #aiethics

waynerad@diasp.org

"BrainBridge exits stealth with ambitious goal to develop 'world's first head transplant system'."

When I first saw this, well, first I read "BrainBridge" as "Bainbridge", the private equity firm. But no, it's "*Brain*Bridge".

Then, when I first started watching the video, I thought it was satire, or maybe a scene from a movie.

It looks like they are serious about this (see link below). But having a pretty CGI video doesn't mean they're actually capable of doing it.

BrainBridge exits stealth with ambitious goal to develop "world's first head transplant system" - Longevity Technology

#solidstatelife #robotics #medicalrobotics #transplants #surgery

waynerad@diasp.org

"I stumbled upon LLM Kryptonite."

"For a bit over a year I've been studying and working with a range of large language models (LLMs). Most users see LLMs wired into web interfaces, creating chatbots like ChatGPT, Copilot, and Gemini. But many of these models can also be accessed through APIs under a pay-as-you-go usage model. With a bit of Python coding, it's easy enough to create custom apps with these APIs."

"I have a client who asked for my assistance building a tool to automate some of the most boring bits of his work as an intellectual property attorney."

"Some parts involve value judgements such as 'does this seem close to that?' where 'close' doesn't have a strict definition -- more of a vibe than a rule. That's the bit an AI-based classifier should be able to perform 'well enough'"

"I set to work on writing a prompt for that classifier, beginning with something very simple."

"Copilot Pro sits on top of OpenAI's best-in-class model, GPT-4. Typed the prompt in, and hit return."

"The chatbot started out fine -- for the first few words in its response. Then it descended into a babble-like madness."

"No problem with that, I have pretty much all the chatbots -- Gemini, Claude, ChatGPT+, LLamA 3, Meta AI, Mistral, Mixtral."

"I ran through every chatbot I could access and -- with the single exception of Anthropic's Claude 3 Sonnet -- I managed to break every single one of them."

He doesn't present the prompt, though.

I stumbled upon LLM Kryptonite and no one wants to fix it

#solidstatelife #ai #genai #llms #adversarialexamples

waynerad@diasp.org

"Full-body haptics via non-invasive brain stimulation."

"We propose & explore a novel concept in which a single on-body actuator renders haptics to multiple body parts -- even as distant as one's foot or one's hand -- by stimulating the user's brain. We implemented this by mechanically moving a coil across the user's scalp. As the coil sits on specific regions of the user's sensorimotor cortex it uses electromagnetic pulses to non-invasively & safely create haptic sensations, e.g., touch and/or forces. For instance, recoil of throwing a projectile, impact on the leg, force of stomping on a box, impact of a projectile on one's hand, or an explosion close to the jaw."

Hmm, somehow I don't think everyone's going to be doing this any time soon. Or maybe I'm wrong and this is the missing piece that will let VR take off? Let's continue.

"The key component in our hardware implementation is a robotic gantry that mechanically moves the TMS coil across key areas of the user's scalp. Our design is inspired by a traditional X-Y-Z gantry system commonly found in CNC machines, but with key modifications that allow it to: (1) most importantly, conform to the curvature of the scalp around the pitch axis (i.e., front/back), which is estimated to be 18 degrees for the sensorimotor cortex area, based on our measurement from a standard head shape dataset; (2) accommodate different heads, including different curvatures and sizes; (3) actuate with sufficient force to move a medical-grade TMS coil (~1 kg); (4) actuate with steps smaller than 8.5 mm, as determined by our study; and, finally, (5) provide a structure that can be either directly mounted to a VR headset or suspended from the ceiling."

"We feature three actuators, respectively, to move the coil in the X- (ear-to-ear translation), Y- (nose-to-back translation), and Z- (height away from the scalp) axes. Since the X-axis exhibits most curvature as the coil moves towards the ear, we actuate it via a servo motor with a built-in encoder, this offers a reliable, fairly compact, strong way to actuate the coil."

"To account for curvature, the Z-axis needs to be lifted as the coil traverses the head."

"We used a medically compliant magnetic stimulator (Magstim Super Rapid) with a butterfly coil (Magstim D70)."

"We stimulated the right hemisphere of the sensorimotor cortex (corresponding to the left side of the body) with three consecutive 320 microsecond TMS pulses separated by 50 ms, resulting in a stimulation of ~150 ms. While we opted to only stimulate the right side of the brain to avoid fatigue, the results will be generalizable to the left side."

"We identified two locations on the participant's scalp that yielded minimum stimulation intensities to elicit observable limb movement (i.e., motor threshold) for the hand and foot."

"For each location, the intensity was set to 10% below the hand's motor threshold. The amplitude of TMS stimulation was reported in percentage (100% is the stimulator's maximum). During a trial, the experimenter stimulated the target location. Afterward, the participant reported the strongest point and area of a perceived touch as well as a keyword (or if nothing was felt). Then, the experimenter increased the intensity by 5% while ensuring the participant's comfort & consent, and moved to the next trial. This process continued until the participant reported the same location and same quality of sensations for two consecutive trials, or the intensity reached the maximum (i.e., 100%)."

"After each study session, we organized the participants' responses regarding touch sensations based on where the strongest point of the sensation was. We also annotated each trial to indicate which of the following body parts moved: 'none', 'jaw', 'upper arm', 'forearm', 'hand' (i.e., the palm), 'fingers', 'upper leg' (i.e., the thigh), 'lower leg', and 'foot'."

"Results suggest that we were able to induce, by means of TMS, touch sensations (i.e., only tactile in isolation of any noticeable movements) in two unique locations: hand and foot, which were both experienced by 75% of the participants."

"Our results suggest that we were able to induce, by means of TMS, force-feedback sensations (i.e., noticeable involuntary movements) in six unique locations: jaw (75%), forearm (100%), hand (92%), fingers (83%), lower-leg (92%), and foot (92%), which were all experienced by >75% of the participants. In fact, most of the actuated limbs were observed in almost all participants (>90%) except for the jaw (75%) and fingers (83%). The next most promising candidate would be the upper leg (58%)."

Surely they stopped there and didn't try to do a full-blown VR experience, right? Oh yes they did.

"Participants wore our complete device and a VR headset (Meta Quest 2), as described in Implementation. Their hands and feet were tracked via four HTC VIVE 3.0 Trackers attached with Velcro-straps. Participants wore headphones (Apple Airpods Pro) to hear the VR experience."

"Participants embodied the avatar of a cyborg trying to escape a robotics factory that has malfunctioned. However, when they find the escape route blocked by malfunctioning robots that fire at them, the VR experience commands our haptic device to render tactile sensation on the affected area (e.g., the left hand in this case, but both hands and feet are possible). To advance, participants can counteract by charging up their plasma-hand and firing plasma-projectiles to deactivate the robots. When they open the palm of their hands in a firing gesture, the VR experience detects this gesture and prompts our haptic device to render force-feedback and tactile sensations on the firing hand (e.g., the right hand in the case, but both hands can fire plasma shots). After this, the user continues to counteract any robots that appear, which can fire shots against any of the user's VR limbs (i.e., the hands or feet). The user is shot in their right foot -- just before this happens, the VR prompts our haptic device to render tactile sensation on the right foot. After a while, the user's plasma-hand stops working, and they need to recharge the energy. They locate a crate on the floor and stomp it with their feet to release its charging energy. Just before the stomping releases the energy, the VR commands our haptic device to render force-feedback and tactile sensation on the left leg. The user keeps fighting until they eventually find a button that opens the exit door as they are about to press it, the VR requests that our haptic device render tactile sensation on the right hand. The user has escaped the factory."

Full-body haptics via non-invasive brain stimulation

#solidstatelife #vr #haptics

waynerad@diasp.org

Sam Altman: Genius master class strategist? Debate on Twitter.

#solidstatelife #ai #aiethics #openai

https://twitter.com/signulll/status/1790756395794518342

waynerad@diasp.org

"How AI personalization fuels groupthink and uniformity"

"Here are some ways how Slack will use their customer's data to 'make your life easier':"

"Autocomplete: Slack might make suggestions to complete search queries or other text"

"Emoji Suggestion: Slack might suggest emoji reactions to messages using the content and sentiment of the message, the historic usage of the emoji ..."

"Search Results: 'We identify the right results for a particular query based on historical search results and previous engagements (...)'"

"At first glance, these features seem harmless, even helpful. [...] However, beneath the surface lies a more troubling consequence: the potential for these features to stifle creativity and reinforce groupthink."

"Consider the autocomplete function. By suggesting common completions based on past data, Slack's AI could inadvertently discourage users from thinking outside the box."

How AI personalization fuels groupthink and uniformity

#solidstatelife #ai #genai #llms #slack #aiethics

waynerad@diasp.org

"So Salesforce just announced that they'll be training their Slack AI models on people's private messages, files, and other content. And they're going to do so by default, lest you send them a specially formatted email to feedback@slack.com."

"Presumably this is because some Salesforce executives got the great idea in a brainstorming sesh that the way to catch up to the big players in AI is by just ignoring privacy concerns all together. If you can't beat the likes of OpenAI in scanning the sum of public human knowledge, maybe you can beat them by scanning all the confidential conversations about new product strategies, lay-off plans that haven't been announced yet, or private financial projections for 2025?"

Paranoia and desperation in the AI gold rush

#solidstatelife #ai #aiethics

waynerad@diasp.org

CodeAid is an LLM-based coding assistant designed to be learner-centric rather than productivity-centric.

This reminds me of how people sometimes ask ChatGPT to play "tutor" instead of just giving direct answers to things.

"Instead of generating code, CodeAid generated an interactive pseudo-code. The pseudo-code allowed students to hover over each line to see a detailed explanation about that line."

"Not everything needs to be AI-generated. CodeAid uses Retrieval Augmented Generation (RAG) to display official and instructor-verified documentations of functions relevant to students' queries."

"CodeAid also generates several suggested follow-up questions for students to ask after each response."

"When using the Help Fix Code, CodeAid does not display the fixed code. Instead, it highlights incorrect parts of the students' code with suggested fixes."

"Instead of just displaying a high-level explanation of the entire code in a paragraph, CodeAid renders an interactive component in which students can hover over each line to understand the purpose and implementation of each line of the provided code."

Thematic analysis of surveys and interviews revealed four types of student queries to CodeAid:

"Asking Programming Questions" (36%): "Code and conceptual clarification queries" about the programming language, "function-specific queries" about specific functions, and "code execution probe queries."

"Debugging Code" (32%): "Buggy code resolution queries," "problem source identification queries," and "error message interpretation queries."

"Writing Code" (24%): "high-level coding guidance queries" ("how to" questions) and "direct code solution queries" (students copy the task description from their assignment), and

"Explaining Code" (6%): "like explaining the starter code provided in their assignments."

"Students appreciated CodeAid's 24/7 availability and being 'a private space to ask questions without being judged'."

"Students also liked CodeAid's contextual assistance which provided a faster way to access relevant knowledge, allowed students to phrase questions however they wanted, and produced responses that were relevant to their class."

"In terms of the directness of responses: some students indicated that they wanted CodeAid to produce less direct responses, like hints."

"In terms of trust some students trusted CodeAid while others found that 'it can lie to you and still sound confident.'"

"When asked students about reasons for not using CodeAid, they mentioned a lack of need, preference to use existing tools, wanting to solve problems by themselves, or a lack of trust over AI."

And if you're wondering how CodeAid compares with "ChatGPT-as-tutor":

"Comparing CodeAid with ChatGPT: even though using ChatGPT was prohibited, students reported using it slightly more than CodeAid. They preferred its easier interface, and larger context window to ask about longer code snippets. However, some students did not like ChatGPT since it did a lot of the work for them."

CodeAid: A classroom deployment of an LLM-based coding assistant

#solidstatelife #ai #genai #llms #codingllms

waynerad@diasp.org

"Google has indexed inaccurate infrastructure-as-code samples produced by Pulumi AI [...] and the rotten recipes are already appearing at the top of search results."

Pulumi AI is a chatbot that generates solutions to infrastructure-as-code problems.

"Software developers have found some of the resulting AI-authored documentation and code inaccurate or even non-functional."

But the worst part is that Google has indexed Pulumi AI's answers and is giving them to people in Google Search results.

It looks like "the curse of recursion" that I told you all about last June -- almost a year ago -- is becoming a reality in real life.

There's an old adage in the computer industry (dating all the way back to the 1950s, believe it or not): "Garbage in, garbage out."

Now, though, we can call it "the curse of recursion" or "model collapse".

Google Search results polluted by buggy AI-written code frustrate coders

#solidstatelife #ai #genai #llms #gigo

waynerad@diasp.org

Last year, DeepMind made "Barkour" robot dogs (Barkour == parkour + dogs bark, get it?) designed to put agility dogs out of a job. Recently, they open-sourced the hardware, including computer-aided design (CAD) and printed circuit board assembly (PCBA) designs, assembly instructions, and firmware and other low-level code.

As for software, they've released a 3D model of the robot you can use in the MuJoCo simulation system. However, they haven't released the models they created themselves with reinforcement learning. You're on your own to train those, using the simulator and the real robots.

The video below shows the Barkour robots from a year ago. The link to the hardware designs is also below.

Barkour: Benchmarking Animal-level Agility with Quadruped Robots - Atil Iscen

#solidstatelife #ai #robotics

waynerad@diasp.org

Jan Leike, OpenAI's head of AI Alignment -- brrrrrp! former head of AI Alignment -- has left OpenAI and joined Anthropic.

Commentary from David Shapiro on what's going on at OpenAI. He says Sam Altman's "web 2.0" worldview has reached its limit and is now resulting in brain drain from OpenAI. Ilya Sutskever, technical lead of the research team that created GPT-4, has also left OpenAI.

Sam Altman wrecks OpenAI - Jan Leike joins Anthropic - brain drain from OpenAI - David Shapiro

#solidstatelife #ai #aiethics #openai

waynerad@diasp.org

A little bit of information has emerged about the chaos at OpenAI last November, when Sam Altman was fired as CEO, then reinstated 4 days later. Helen Toner, one of the board members when all that happened, surfaced on a podcast called The TED AI Show. At the time, all anybody said was that the board had found Sam Altman was "not consistently candid in his communications" with the board of directors. "The board no longer has confidence in his ability to continue leading OpenAI."

Helen Toner said the board learned about ChatGPT only after it was launched -- on Twitter. The board was not informed ahead of time. :O Wow. She said Sam Altman owned OpenAI's Startup Fund, but he constantly claimed he had no financial interest in the company and was financially independent. She said he provided inaccurate information about what few formal safety processes the company had in place. She said Sam Altman told many more lies, but she can only mention these because they are already known to the public. (Not known to me, but I guess to people who are really paying attention.)

After that, the conversation continues on to various AI safety issues: the potential for AI to be misused for mass surveillance, deepfake scams, automated systems that make bad decisions people can't do anything about (e.g. people losing access to financial or medical systems because of some automated decision that affects them), and what she calls "the Wall-E future", where AI gives us what we want but not what is actually best for us.

There seems to be no way to link to the episode on the website, so the link just goes to the TED AI Show website. If you're clicking on this right away, it should be the newest episode, but if you're reading this some time later, you may need to scroll down to find "What really went down at OpenAI and the future of regulation w/ Helen Toner".

What really went down at OpenAI and the future of regulation w/ Helen Toner

#solidstatelife #ai #aiethics #openai

waynerad@diasp.org

Google's Project Starline is being commercialized through a partnership between Google and Hewlett-Packard. Project Starline is a system that replaces video calling with a 3D system that is supposed to make it feel like you're together in person. The system works using a combination of light field cameras and a light field display. Light field cameras capture the direction rays of light come from, not just the amount and color of light that regular cameras detect.

Bringing Project Starline out of the lab

#solidstatelife #lightfieldcameras

waynerad@diasp.org

I never heard of the British Post Office scandal (also known as the Horizon IT scandal) until @balduin@diasp.org mentioned it on here, even though it's been going on for a long time -- the first trials were in 1999. Some programmers made some buggy software, which resulted in 980 people being criminally prosecuted, 236 people going to prison, an unspecified number of bankruptcies, and 4 suicides.

In trying to find out what happened, I found that most of the news relates to the trials, but I kept wondering: what were the software bugs? Eventually, I came across these videos -- from Computerphile, this time featuring computer science professor Steven Murdoch, and from Continuous Delivery, aka Dave Farley's channel -- which give a cursory outline of what the software bugs might have been.

Basically, Fujitsu made an accounting system called Horizon. It was a large and complex distributed system, with software designed to perform operations at post offices anywhere even while offline, and to synchronize and reconcile everything with only intermittent connections to the main system.

However, they failed to properly maintain what's known as "ACID" compliance. "ACID" is an acronym from database theory that stands for "atomicity, consistency, isolation, and durability." "Atomicity" means a transaction either goes through entirely or not at all -- nothing in between. All the changes to all the relevant accounts have to happen "atomically" or not at all. "Consistency" means if you have data spread across multiple computers, any computer you ask must give the same answers to the same questions -- you can't have one machine saying the bank balance is one thing while another says it's something else. Who is right? Consistency is hard to maintain in distributed systems. "Isolation" means transactions must not inadvertently interfere with other transactions -- they must all be "isolated" from each other. "Durability" means when you do a transaction, the next day it hasn't disappeared -- once committed, every update to the database is "durable".
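
To make "atomicity" concrete, here's a minimal sketch in Python using SQLite -- my own illustration of the concept, not anything from Horizon's actual code:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('branch', 8000), ('central', 0)")
conn.commit()

def transfer(crash=False):
    # "with conn" opens a transaction: it commits on success and rolls
    # back on any exception -- all of the updates happen, or none do.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance - 8000 WHERE name = 'branch'")
        if crash:
            raise RuntimeError("simulated crash mid-transaction")
        conn.execute("UPDATE accounts SET balance = balance + 8000 WHERE name = 'central'")

try:
    transfer(crash=True)
except RuntimeError:
    pass

# The debit was rolled back along with everything else -- that's atomicity:
print(conn.execute("SELECT * FROM accounts ORDER BY name").fetchall())
# [('branch', 8000), ('central', 0)]
```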

So in what way were these principles violated? One example that seems to be mentioned often: if a user in a post office pushed a button and the system seemed not to respond, so they pushed the button again, the transaction would be recorded multiple times in the central system but only once on their local system. If you're familiar with the word "idempotent", the operation was idempotent in one place but not the other. (Idempotent means doing an operation multiple times produces the same result as doing it once -- an important principle in implementing reliable distributed systems.) So they could push a button for $8,000 -- er, this is the UK so it would be £8,000 -- four times, and the central office would think £8,000 was deposited 4 times, but the local post office system would show the £8,000 deposited only once. Corporate would call them up and demand the missing £24,000. But of course, they don't have it and can't pay it. So they get criminally charged.
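
The usual defense against this particular failure is to make the receiving side idempotent too, for example by attaching a unique transaction ID to every request so that retries are recognized and ignored. A minimal sketch of the idea (again, my own illustration, not Horizon's design):

```python
# transaction_id -> amount; the server remembers which IDs it has applied.
processed: dict[str, int] = {}

def record_deposit(transaction_id: str, amount: int) -> None:
    if transaction_id in processed:
        return  # retry of an already-applied transaction: do nothing
    processed[transaction_id] = amount

# The user mashes the button four times; each press retries the SAME
# request with the SAME transaction ID, so only one deposit is recorded.
for _ in range(4):
    record_deposit("txn-0001", 8000)

print(sum(processed.values()))  # 8000, not 32000
```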

Regarding the criminal prosecutions and such -- I wish I could express surprise about that, but I can't, because, for me, it's become a familiar story. I learned about this pattern from reading books about mistakes (which I sought out after the Boeing crashes, and which also covered the Chernobyl accident). The way "mistakes" play out in human social hierarchies is: blame is assigned to whoever is most "proximal" to the accident, and blame goes down social status hierarchies to whoever is on the bottom.

For example, in the Chernobyl accident, people knew about the flaws in the reactor design, but were unable to get that information to the plant operators, because that information could not propagate through the bureaucracy (social status hierarchy) of the Soviet Union. When the accident happened, the designers of the reactor were not blamed, nor were any of the people high up in the social status hierarchy who failed to propagate the relevant information. Who was blamed? The plant operators -- they were both most proximal to the accident itself and on the bottom of the social status hierarchy, unable to pass blame on to anybody else. Most of the plant operators died of cancer before they got their prison sentences, but it's worth noting that only the plant operators ever got prison sentences or any other punishment. The plant operators were actually not at fault -- at all -- because they weren't responsible for the flaws in the reactor design, and furthermore, given those flaws, they were not provided the proper information about what the flaws were and how to properly mitigate them.

In the case of the British Post Office scandal, all of the people who went to prison, went bankrupt, or committed suicide were subpostmasters. "Subpostmaster" is a term I had never encountered before -- apparently it's a British term for what we here in the US would call a "branch manager". As far as I've been able to tell, nobody in Post Office Limited management or at Fujitsu got convicted, got a prison sentence, went bankrupt, or committed suicide. Evidently they fought very hard, both in the legal system and in terms of PR, to protect themselves. The programmers who wrote the code also have not gotten criminal convictions or prison sentences, gone bankrupt, or committed suicide.

#solidstatelife #relationalmodel #databasetheory

https://www.youtube.com/watch?v=hBJm9ZYqL10

waynerad@diasp.org

Neural networks with only 1/100th the number of parameters? If Kolmogorov-Arnold Networks live up to their promise.

Rethinking the central assumptions of how layers work in neural networks. Neural networks are inspired by brain neurons, where outputs from one neuron shoot down axons, cross synapses, and enter the dendrites of the next neuron. This idea resulted in the simplification of thinking of neurons as "nodes" on a mathematical graph and the connections between them as edges. In neural networks, the "edges" perform linear transformations, and the "nodes" perform nonlinear transformations. (When there is an edge connecting every node in one layer to every node in the next, you have what's called a "fully connected" neural network -- also known as a "multi-layer perceptron" (MLP), a term which makes no sense -- well, it does if you understand the history of neural networks going back to the 1960s, but otherwise it doesn't. But it's used everywhere in the paper, so you ought to know "MLP" and "fully connected neural network" mean the same thing in case you decide to dive into the paper.)

Actually, when I learned neural networks, I learned that they alternate between "linear transformation" layers and "activation" layers. The reason for this is that if you did only "linear transformation" layers, you could combine them all into a single "linear transformation" layer. Mathematically, this is represented as a matrix operation. If you do a matrix operation, followed by another, you can combine the two into a third matrix operation that does the equivalent of the first two. Put another way, a neural network with only linear transformations can be smashed flat into a single-layer linear transformation. In order to be able to make deeper neural networks, and neural networks capable of learning nonlinear relationships between inputs and outputs, we have to stick non-linear layers in between. People initially used more complex activation functions, like tanh, but now most people use ReLU, which stands for "rectified linear unit", which is a fancy way of saying we chop off the negative numbers (the function returns 0 for x < 0), which is, believe it or not, enough non-linearity for the whole system to work.
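
Here's a quick numpy illustration of that "smashing flat" point (my own example, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))  # first linear layer
W2 = rng.normal(size=(2, 4))  # second linear layer
x = rng.normal(size=3)

# Two purely linear layers collapse into a single matrix:
assert np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x)

# A ReLU in between ("chop off the negative numbers") breaks the collapse,
# which is what gives the network genuine depth.
relu = lambda v: np.maximum(v, 0)
print(np.allclose(W2 @ relu(W1 @ x), (W2 @ W1) @ x))
# False (unless W1 @ x happens to be all non-negative)
```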

Here, though, the researchers start with the easier-to-visualize idea that "edges" represent linear transformations while "nodes" represent the non-linear "activation" functions. Even the term "activation" is inspired by the brain, where neurons only "spike" if they get sufficient input to "activate" them. Neural networks in computers, though, don't "spike" -- everything is done with continuous, floating-point numbers.

This conception of edges being linear and nodes being non-linear is important to keep in mind, because with Kolmogorov-Arnold Networks, aka KANs, we're going to flip those around.

The inspiration for this comes from the fact that there is a mathematical proof underlying neural networks, known as the universal approximation theorem. The universal approximation theorem proves that for any given mathematical function, it is possible for a neural network, mathematically modeled as alternating linear transformations and non-linear functions as described above, to approximate that mathematical function to within some error margin. It doesn't say how big a neural network you'll have to make, just that it's guaranteed to be possible if you make your neural network big enough.
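
In symbols, roughly (my paraphrase of the standard statement, not the paper's):

```latex
\forall \varepsilon > 0 \;\; \exists \text{ a network } N : \quad |f(x) - N(x)| < \varepsilon \;\; \text{for all } x \text{ in the domain}
```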

Well, two mathematicians, Vladimir Arnold and Andrey Kolmogorov, proved a theorem called the Kolmogorov-Arnold Representation theorem, and the idea here is that this alternative theorem can lead to similar results as the universal approximation theorem, but in a different way.

The Kolmogorov-Arnold Representation theorem proves that for any mathematical function that takes multiple parameters, it is possible to find two sets of functions that, when combined in the right way, will give you the same output, but each of those functions only takes a single input parameter. In other words, a function that takes, say, a 3-dimensional input can be "decomposed" into two sets of functions that each take only a single (1-dimensional) input. The way these are combined is, and I'll try to describe this simply and without going into gory detail (you can look it up if you want that): for each function in the second ("outer") set, you sum up the outputs of the functions in the first ("inner") set that pertain to it -- one per input dimension -- run that sum through the outer function, and then sum up the outputs of all the outer functions. Or to put it even more simply, you're doing a combination of "summation" and "composition", where "composition" means taking the output of one function and making it the input of another function (think "f(g(x))").
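
Written out (this is the standard statement of the theorem), any continuous function f of n variables on a bounded domain can be expressed as:

```latex
f(x_1, \ldots, x_n) = \sum_{q=0}^{2n} \Phi_q \left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
```

The inner functions phi_{q,p} are the first set (each one eats a single coordinate x_p), the outer functions Phi_q are the second set, and there are exactly 2n+1 of the outer ones.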

The Kolmogorov-Arnold Representation theorem doesn't tell you what these functions are, only that they are proven to exist. These researchers, though, since they are looking for a way to approximate to arbitrary accuracy, chose splines as the functions to use. If you're not familiar with splines, I have some links below. Splines are used in video games, computer-generated imagery in movies, computer-aided design (CAD) programs for manufactured objects, and lots of other places to generate smooth curves.

Now think of your mental image of a neural network as nodes and edges. The splines will be the functions along the edges. The nodes will simply sum up the outputs of the edges. So the "edges" do the splines and represent the "composition" part of the Kolmogorov-Arnold Representation, and the "nodes" represent the summations. And now you can see, we've reversed our linearity & non-linearity: the "edges" are now the "non-linear" part of the system, and the "nodes" are now the "linear" part. (While the original Kolmogorov-Arnold Representation prescribed exactly two sets of functions, in specific quantities, this generalization allows for any number of sets of functions, which correspond to "layers" in the neural network, and any number of functions in each set, which corresponds to the "size" (number of neurons) of each layer.)
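
Here's a toy sketch of that structure in Python, using simple piecewise-linear curves as a stand-in for the B-splines the paper actually uses -- so treat this as my illustration of the idea, not the authors' implementation:

```python
import numpy as np

class KANLayer:
    # Every edge (input i -> output j) carries its own learnable 1-D
    # function; every output node just sums its incoming edges.
    # Piecewise-linear curves stand in for the paper's B-splines.
    def __init__(self, n_in, n_out, grid_size=8, seed=0):
        rng = np.random.default_rng(seed)
        self.grid = np.linspace(-1.0, 1.0, grid_size)  # fixed x-coordinates
        # Learnable y-values for each edge's curve: (n_out, n_in, grid_size)
        self.values = rng.normal(0.0, 0.1, (n_out, n_in, grid_size))

    def __call__(self, x):  # x has shape (n_in,)
        n_out, n_in, _ = self.values.shape
        out = np.zeros(n_out)
        for j in range(n_out):
            for i in range(n_in):
                # Edge = non-linear 1-D function; node = plain summation.
                out[j] += np.interp(x[i], self.grid, self.values[j, i])
        return out

# A [2, 1, 1] stack, the same shape as the x/y example mentioned below:
net = [KANLayer(2, 1), KANLayer(1, 1, seed=1)]
h = np.array([0.3, -0.5])
for layer in net:
    h = layer(h)
print(h)  # a 1-element array: the network's output
```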

So that's the essence of the idea, glossing over a lot of mathematical details. In the paper they prove a theorem that this architecture can approximate any mathematical function to within a given arbitrary error margin, provided the mathematical function is continuously differentiable. So now we have our analogy with the universal approximation theorem for Kolmogorov-Arnold Networks. The "continuously differentiable" assumption is helpful, too, because it means we can learn all the parameters to the splines with backpropagation and gradient descent!

Kolmogorov-Arnold Networks have a couple more tricks up their sleeves, too. One is that the network can be made larger without throwing all the learned parameters away and starting over -- sort of. With regular neural networks, if you wanted to, say, double the size of the network, you have to make a new network twice as big and train it from scratch with your training dataset. With Kolmogorov-Arnold Networks, you can increase the precision of your splines by adding more parameters to them, without throwing away any of the parameters you've already learned. (You go from "coarse-grained spline" to "fine-grained spline".) But I say "sort of" because you can't make any arbitrary change, like changing the number of layers. But it's really cool that they have some ability to expand their learning capacity by adding parameters after the fact, something normal neural networks can't do at all. (They call expanding the capacity of the neural network in this manner "grid extension". It's called "grid extension" because historically the set of points defining a spline is called the "grid".)
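
Continuing the toy piecewise-linear sketch from above (again, my illustration -- the paper does this with B-splines), grid extension looks roughly like this:

```python
import numpy as np

# One edge's learned curve on a coarse 8-point grid:
coarse_grid = np.linspace(-1.0, 1.0, 8)
coarse_vals = np.random.default_rng(0).normal(0.0, 0.1, 8)  # "learned" so far

# Refine to 29 points (29 = 4 * 7 + 1, so every coarse knot is kept).
# Resampling then reproduces the learned curve exactly; nothing is lost.
fine_grid = np.linspace(-1.0, 1.0, 29)
fine_vals = np.interp(fine_grid, coarse_grid, coarse_vals)

# The edge now has 29 trainable parameters instead of 8: more capacity,
# with the old learned function as the starting point for further training.
```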

Another trick up the sleeve of Kolmogorov-Arnold Networks is the lack of "catastrophic forgetting". It turns out in regular neural networks, using the same "activation" function for all neurons in a layer causes them all to be causally linked, which means if the neurons in one part learn a task, they will forget it when neurons in another part learn a second task. Neural networks have to be trained on all the tasks they have to learn all at once. (Companies like OpenAI have learned to "ensemble" separately trained neural networks to deal with this problem.) In Kolmogorov-Arnold Networks, if neurons in one part learn one task, neurons in another part can learn a different task, and the two won't interfere with each other. So Kolmogorov-Arnold Networks don't have the "catastrophic forgetting" problem.

They came up with an "auto-discovery" system and found that the auto-discovered KANs were smaller than the human-constructed ones. Humans tend to construct networks that look like the math formulas, but the "auto-discovery" system could find shortcuts.

They make the claim that KANs are more "interpretable" than regular neural networks. Well, maybe for the smallest KANs. For example, one of the "toy" functions that they tested on was x/y (i.e. a 2-dimensional input where the output is just one number divided by the other). They peek "under the hood" and discover what the splines are approximating is exp(log(x) - log(y)). This kind of "interpretability", I assume, would diminish rapidly once the network starts to get big. This network was a [2, 1, 1] network, meaning 2 neurons for the input, a 1-neuron hidden layer, and a 1-neuron output (only 2 layers because the input isn't counted as a "layer"). Imagine those numbers going into the hundreds -- how are you going to interpret that?
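
By the way, the exp/log decomposition works because of the basic identity (for positive x and y):

```latex
\frac{x}{y} = \exp\left( \log x - \log y \right)
```

A sum of two single-input functions (log x and -log y) fed through one more single-input function (exp) reproduces the two-input division -- exactly the shape the Kolmogorov-Arnold representation promises.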

They tested the system's ability to learn a set of mathematical functions. These functions were: Jacobian elliptic functions, elliptic integrals, Bessel functions, associated Legendre functions, and spherical harmonics -- and I'm not going to say any more about them because I have no idea what any of those are. They also got 27 equations from Richard Feynman's textbooks. What they do with these is show they can get better results than regular neural networks with the same number of parameters, plus their "auto-discovery" system can find substantially more compact representations of the formulas. They would like to interpret these KANs and learn more about how these math formulas can be represented more compactly.

For the big test, they get into a branch of mathematics called knot theory. The idea is to take a knot as input and find its "signature". I don't know enough about knot theory to tell you what a knot's "signature" is. But I can tell you they compared their system with DeepMind's, and DeepMind had 78% accuracy with a 4-layer neural network with 300 neurons on each layer, while they had 81.6% accuracy with a [17, 1, 14] (2-layer) KAN. DeepMind's neural network had 30,000 parameters, while theirs had 200.
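
A quick sanity check on that parameter count (my arithmetic, not from the paper): a [17, 1, 14] KAN has 17 x 1 + 1 x 14 = 31 edges, so ~200 parameters works out to roughly six or seven spline parameters per edge.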

It remains to be seen whether this algorithm can be deployed at scale and reduce the number of parameters of neural networks in the wild by a factor of 100. But it is a fascinating development nonetheless. See below for a video arguing KANs are overhyped.

#solidstatelife #ai #sgd #backpropagation #mlps

https://arxiv.org/abs/2404.19756

waynerad@diasp.org

"I lost my job to AI this week."

The guy was a graphic artist doing visual design for electronic marketing campaigns. And in this case, apparently he was quite literally told he was being replaced by AI.

I suspect this is the ultimate fate of all of us. It's an open question how long it will take. Some days I think AI is going super fast and it'll be real soon, other days I think it'll be like when Elon Musk predicted in 2016 that Tesla would have "full self driving" by 2017. And in 2017, he predicted 2018, and in 2018 he predicted 2019... and so on. He was taking the current rate of change and extrapolating it out into the future, but in fact while the technology continued to improve, it entered a domain of diminishing returns, and while Tesla's Full Self Driving is by all accounts pretty good, nobody is ready to rip the steering wheel out of any car entirely, which is what "full self driving" is really supposed to mean. And I think that may happen with the current explosion in language, image, audio, and video models -- they may enter a domain of diminishing returns, and "artificial general intelligence" that surpasses humans may be farther away than people think.

I don't know. Right now either scenario seems plausible. The rate of change still feels fast. At the same time, people are running into the limitations of current models and getting annoyed by them.

See below for more thoughts on "the future of labor in an AI-driven economy" from Nikola Danaylov.

#solidstatelife #ai #technologicalunemployment

https://www.youtube.com/watch?v=U2vq9LUbDGs