#codingai

waynerad@diasp.org

"Together AI acquires CodeSandbox to launch first-of-its-kind code interpreter for generative AI."

This is a system that lets large language models write code in a virtual machine "sandbox" where they can actually run it -- executing the code and doing all the testing and debugging that a human would ordinarily do.

"CodeSandbox pioneered a unique development environment infrastructure used by more than 4.5 million developers every month. CodeSandbox enables developers to spin up virtual machine sandboxes for code execution, hibernate them, and resume nearly instantly -- offering unparalleled performance, security and scale."

Together AI acquires CodeSandbox to launch first-of-its-kind code interpreter for generative AI

#solidstatelife #ai #genai #llms #codingai

waynerad@diasp.org

"I've observed two distinct patterns in how teams are leveraging AI for development. Let's call them the "bootstrappers" and the "iterators." Both are helping engineers (and even non-technical users) reduce the gap from idea to execution (or minimum viable product (MVP))."

"The Bootstrappers: Zero to MVP: Start with a design or rough concept, use AI to generate a complete initial codebase, get a working prototype in hours or days instead of weeks, focus on rapid validation and iteration."

"The Iterators: daily development: Using AI for code completion and suggestions, leveraging AI for complex refactoring tasks, generating tests and documentation, using AI as a 'pair programmer' for problem-solving."

The "bootstrappers" use tools like Bolt, v0, and screenshot-to-code AI, while "iterators" use tools like Cursor, Cline, Copilot, and WindSurf.

But there is "hidden cost".

"When you watch a senior engineer work with AI tools like Cursor or Copilot, it looks like magic, absolutely amazing. But watch carefully, and you'll notice something crucial: They're not just accepting what the AI suggests. They're constantly: Refactoring the generated code into smaller, focused modules, adding edge case handling the AI missed, strengthening type definitions and interfaces, questioning architectural decisions, and adding comprehensive error handling."

"In other words, they're applying years of hard-won engineering wisdom to shape and constrain the AI's output."

The author speculates on two futures for software: one is "agentic AI", where AI gets better and better and teams of AI agents take on more and more of the work done by humans; the other is "software as craft", where humans make high-quality, polished software with empathy, experience, and a deep care for craft that can't be AI-generated.

The article used the term "P2 bugs" without explaining what that means. P2 means "priority 2". The idea is people focus all their attention on "priority 1" bugs, but fixing all the "priority 2" bugs is what makes software feel "polished" to the end user.

Commentary: My own experience is that AI is useful for certain use cases. If your situation fits those use cases, AI is magic. If it doesn't, AI is useless or of marginal utility. Because AI's usefulness depends on the situation, it doesn't provide the across-the-board 5x productivity improvement that employers expect today. My feeling is that the current generation of LLMs isn't good enough to fix this, but because of the employer expectation, I have to keep trying new AI tools in pursuit of the expected 5x improvement in productivity. (If you are able to achieve a 5x productivity improvement over 2 years ago on a large codebase, more than half a million lines of code, written in a crappy language, get in touch with me -- I want to know how you do it.)

The 70% problem: Hard truths about AI-assisted coding

#solidstatelife #ai #genai #llms #codingai

waynerad@diasp.org

AI won't fix the fundamental flaw of programming, says YouTuber "Philomatics".

His basic thesis is that the "fundamental flaw of programming" is that software is unreliable and people no longer even expect it to be reliable.

"Jonathan Blow did an informal experiment where he took a screenshot every time some piece of software had an obvious bug in it. He couldn't keep this up for more than a few days because there were just too many bugs happening all the time to keep track of."

"I think we've all gotten so used to this general flakiness of software that we don't even notice it anymore. Workarounds like turning it off and on again or 'force quitting' applications have become so ingrained in us that they're almost part of the normal operation of the software. Smartphones are even worse in this regard. I'm often hesitant to do things in the mobile browser, for example using a government website or uploading my r''esum''e to a job board, because things often just don't work on mobile.

He goes on to say the cause is that we stack software abstractions higher and higher, and, citing Joel Spolsky, that ultimately all non-trivial abstractions are leaky. (Joel Spolsky wrote an essay on this in 2002 called "The Law of Leaky Abstractions".)

AI is the next pile of abstractions we are going to throw on the stack. With compilers, it's possible in principle for people to look at and edit the binary output, but nobody does it; likewise, it's possible for people to read and edit the output of AI systems that produce code, but before long, nobody will do that either. AI code generators will become the next generation of compilers, letting people "write" code at a higher level of abstraction while leaving the details to the AI systems. It won't make software more reliable.

Is software that unreliable, though? I recently upgraded my mobile phone and various things that were broken on the old phone (2 OS versions older) magically started working just fine. Considering the millions of lines of code running every time I run an app or view a webpage, "obvious bugs" are actually few and far between.

AI Won't Fix the Fundamental Flaw of Programming - Philomatics

#solidstatelife #ai #llms #genai #codingai

waynerad@diasp.org

25% of code written at Google is produced by AI. The question, well, 2 questions: 1) is this code as good as before, when code was all human-written? And 2) is Google actually making any good products? If not, they're just making products that aren't any good 25% faster. But Google kills lots of products so maybe they can kill products 25% faster, too?

I guess Gemini is pretty good -- what's not so good is trying to ram it into everything. I don't know about you but I skip those "AI summaries" in search results -- I go directly to Gemini when I want an "AI summary". Google Search itself doesn't seem to be getting any better, although, I know people say it's getting worse, but to me it seems like it's usually good enough -- I still use it. I'm getting text messages from Gemini on my phone -- it wants to help me write messages. I don't want someone else, AI or no, telling me what to say, tho. Who wants this? If you like and use this stuff, comment below.

There's also the issue of the 25% itself. At my work there is tremendous pressure on programmers to vastly increase productivity by using AI tools. Subjectively, the expectation industrywide seems to be about 5x, although at my work nobody has said 5x specifically. And 25% of code being AI-generated works out to only about a 1.33x speedup (humans still write the other 75% at the old pace) -- nowhere near 5x.
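A quick back-of-the-envelope check on that (my arithmetic, assuming the AI-written 25% would otherwise have been written by humans at the same pace):

```python
ai_fraction = 0.25
# Humans still write the remaining 75% at their old pace, so total output
# relative to an all-human baseline is 1 / (1 - ai_fraction).
speedup = 1 / (1 - ai_fraction)
print(f"{speedup:.2f}x")  # 1.33x -- still nowhere near 5x
```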

Sundar Pichai said on the earnings call today that more than 25% of all new code at Google is now generated by AI.

#solidstatelife #ai #genai #llms #codingai #google

waynerad@diasp.org

OpenAI o1 isn't as good as an experienced professional programmer, but... "the set of tasks that o1 can do is impressive, and it's becoming more and more difficult to find easily demonstrated examples of things it can't do."

"There's a ton of things it can't do. But a lot of them are so complicated they don't really fit in a video."

"There are a small number of specific kinds of entry level developer jobs it could actually do as well, or maybe even better, than new hires."

Carl of "Internet of Bugs" recounts how he spent the last 3 weeks experimenting with the o1 model to try to find its shortcomings. /

"I've been saying for months now that AI couldn't do the work of a programmer, and that's been true, and to a large extent it still is. But in one common case, that's less true than it used to be, if it's still true at all."

"I've worked with a bunch of new hires that were fresh out with CS degrees from major colleges. Generally these new hires come out of school unfamiliar with the specific frameworks used on active projects. They have to be closely supervised for a while before they can work on their own. They have to be given self-contained pieces of code so they don't screw up something else and create regressions. A lot of them have never actually built anything that wasn't in response to a homework assignment.

"This o1 thing is more productive than most, if not all, of those fresh CS graduates I've worked with.

"Now, after a few months, the new grads get the hang of things, and from then on, for the most part, they become productive enough that I'd rather have them on a project than o1."

"When I have a choice, I never hire anyone who only has an academic and theoretical understanding of programming and has never actually built anything that faces a customer, even if they only built it for themselves. But in the tech industry, many companies specifically create entry-level positions for new grads."

"In my opinion, those positions where people can get hired with no practical experience, those positions were stupid to have before and they're completely irrelevant now. But as long as those kinds of positions still exist, and now that o1 exists, I can no longer honestly say that there aren't any jobs that an AI could do better than a human, at least as far as programming goes."

"o1 Still has a lot of limitations."

Some of the limitations he cited were writing tests and writing a SQL RDBMS in Zig.

ChatGPT-O1 Changes Programming as a Profession. I really hated saying that. - Internet of Bugs

#solidstatelife #ai #genai #llms #codingai #openai #technologicalunemployment

waynerad@diasp.org

"Today I read yet again someone suggesting that using ChatGPT to rewrite code from one programming language to another is a great idea. I disagree: a programming language is an opinionated way on how to better achieve a certain task and switching between world views without understanding how and why they do things the way they do is a recipe for inefficient code at best and weird bugs at worse."

"I decided to test my theory with Google's Gemini - I've seen students using it in their actual coding (probably because it's free) making it a fair choice. I asked the following:"

"Convert the following code from Python to Elixir:"

The code looks equivalent, but only for the normal case -- if the input is bad, the Python and Elixir code behave differently. I think this is a good example of how LLMs can produce translations that look correct but are subtly wrong.
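The article's actual code isn't reproduced here, but here's a hypothetical illustration of the failure mode (my example, not the article's): Python raises IndexError the moment you index past the end of a list, while the natural Elixir translation, Enum.at/2, returns nil for an out-of-range index -- and under Erlang's term ordering nil compares greater than any number, so a comparison against it doesn't even crash:

```python
def first_above(items, threshold):
    # Python: walking past the end of the list raises IndexError right
    # here, at the site of the real bug.
    #
    # A line-by-line Elixir translation would likely use Enum.at(items, i),
    # which returns nil past the end; nil <= threshold is simply false
    # (atoms sort after numbers), so the Elixir version quietly returns
    # nil and the bad value surfaces somewhere far away.
    i = 0
    while items[i] <= threshold:
        i += 1
    return items[i]

first_above([1, 2, 3], 10)  # IndexError in Python; silent nil in Elixir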

https://7c0h.com/blog/new/therac_25_and_llms.html

#solidstatelife #ai #genai #llms #codingai

waynerad@diasp.org

Using LLMs to reverse JavaScript minification. Project Humanify is a tool to automate this process.

"Minification is a process of reducing the size of a Javascript file in order to optimize for fast network transfer."

"Most minification is lossless; There's no data lost when true is converted to its minified alternative !0."

"Some data is lost during the minification, but that data may be trivial to recreate. A good example is whitespace."

"The most important information that's lost during the minification process is the loss of variable and function names. When you run a minifier, it completely replaces all possible variable and function names to save bytes."

"Until now, there has not been any good way to reverse this process; when you rename a variable from crossProduct to a, there's not much you can do to reverse that process."

How to codify the process of renaming a function:

"1. Read the function's body,"
"2. Describe what the function does,"
"3. Try to come up with a name that fits that description."

"For a classical computer program it would be very difficult to make the leap from 'multiply b with itself' to 'squaring a number'. Fortunately recent advances in LLMs have made this leap not only possible, but almost trivial."

"Essentially the step 2. is called 'rephrasing' (or 'translating' if you consider Javascript as its own natural language), and LLMs are known to be very good at that."

"Another task where LLMs really shine is summarization, which is pretty much what we're doing in step 3. The only specialization here is that the output needs to be short enough and formatted to the camel case."

Using LLMs to reverse JavaScript variable name minification

#solidstatelife #ai #genai #llms #codingai #javascript #minification

waynerad@diasp.org

Andy Jassy, now the CEO of Amazon, says using AI -- specifically Amazon Q -- applied to the task of upgrading "foundational" software dependencies can reduce 50 developer-days' worth of work to just a few hours, and has saved the company $260 million.

One of the most tedious (but critical) tasks for software development teams is updating foundational software.

#solidstatelife #ai #genai #llms #codingai #amazonq

waynerad@diasp.org

Home-cooked software and barefoot programmers. Maggie Appleton envisions a future where the "long tail" of "local" software -- software for very small numbers of people, which can't be economically served by tech companies -- will be produced by "barefoot programmers" using LLMs. The term comes from a Chinese term for "barefoot doctors" who serve remote areas.

Today's systems produce usable small pieces but no way to connect them together; she envisions systems that go beyond this, guiding non-programmers through the process of connecting all the pieces into successful small-scale software. She calls these "orchestration agents".

"Barefoot programmers could build software solutions that no industrial software company would ever build because there's not enough market value, and they don't understand the problem space well enough."

These barefoot programmers are "deeply embedded in their communities, so they understand the needs and problems of the people around them."

Home-cooked software and barefoot programmers: Maggie Appleton - Local-First Conf

#solidstatelife #ai #genai #llms #codingai

waynerad@diasp.org

Clio aims to be a Copilot for DevOps.

"Clio is an AI-powered copilot designed to help you with DevOps-related tasks using CLI programs. It leverages OpenAI's capabilities to provide intelligent assistance directly from your command line."

"Note: Clio is designed to safely perform actions. It won't do anything without your confirmation first."

Features: Kubernetes management, AWS integration, Azure integration, Google Cloud Platform integration, DigitalOcean integration, EKS management, and GitHub integration.

Clio - Your friendly and safe CLI Copilot

#solidstatelife #ai #genai #llms #codingai

waynerad@diasp.org

"How are engineers really using AI tools in 2024?"

"A total of 211 tech professionals took part in the survey." "Most respondents are individual contributors (62%). The remainder occupy various levels of engineering management."

"As many professionals are using both ChatGPT and GitHub Copilot as all other tools combined."

"GitHub Copilot Chat is mentioned quite a lot, mostly positively."

"Other tools earned honorable mentions as some devs' favorite tools: Claude, Gemini, Cursor, Codium, Perplexity and Phind, Aider, JetBrains AI, AWS CodeWhisperer, Rewatch."

The rest of the article is paywalled.

AI tooling for software engineers in 2024: Reality check (part 1)

#solidstatelife #ai #genai #llms #codingai

waynerad@diasp.org

At Google, the fraction of code created with AI assistance via code completion, defined as the number of accepted characters from AI-based suggestions divided by the sum of manually typed characters and accepted characters from AI-based suggestions, now exceeds 50%.
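Stated as code, the metric is just the following (a restatement of the definition above, with made-up numbers for illustration):

```python
def ai_assisted_fraction(accepted_ai_chars: int, typed_chars: int) -> float:
    """Google's definition: accepted characters from AI-based suggestions
    divided by all characters entered (typed + accepted)."""
    return accepted_ai_chars / (typed_chars + accepted_ai_chars)

# e.g. 600 accepted AI characters vs. 500 manually typed ones:
print(ai_assisted_fraction(600, 500))  # 0.545... -- i.e. just over 50%
```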

"We achieved the highest impact with UX that naturally blends into users' workflows."

"We observe that with AI-based suggestions, the code author increasingly becomes a reviewer, and it is important to find a balance between the cost of review and added value."

"Quick iterations with online A/B experiments are key, as offline metrics are often only rough proxies of user value. By surfacing our AI-based features on internal tooling, we benefit greatly from being able to easily launch and iterate, measure usage data, and ask users directly about their experience through UX research."

"High quality data from activities of Google engineers across software tools, including interactions with our features, is essential for our model quality."

"Human-computer interaction has moved towards natural language as a common modality, and we are seeing a shift towards using language as the interface to software engineering tasks as well as the gateway to informational needs for software developers, all integrated in IDEs."

"ML-based automation of larger-scale tasks -- from diagnosis of an issue to landing a fix -- has begun to show initial evidence of feasibility. These possibilities are driven by innovations in agents and tool use, which permit the building of systems that use one or more LLMs as a component to accomplish a larger task."

50% still seems like a lot. I wonder how much of that 50% has "code churn" -- has to be corrected again, even after being checked in? Maybe a lot of that 50% is actually correction code on previous LLM-generated code, lol.

Also, you would think that if Google engineers are now writing code 2x as fast, we ought to be seeing rapid innovation in Google products. I'm not holding my breath. To be fair, Google is trying to innovate with Gemini, "AI summaries", and various other AI products. But Google Search seems like it's been getting slowly worse for a long time (although I still use it and it's OK for most searches), and Google has a history of canceling a lot of products. I feel oddly doubtful this 2x productivity boost will make any visible difference to us users.

AI in software engineering at Google: Progress and the path ahead

#solidstatelife #ai #genai #llms #codingai

waynerad@diasp.org

The end of classical computer science is coming, and most of us are dinosaurs waiting for the meteor to hit, says Matt Welsh.

"I came of age in the 1980s, programming personal computers like the Commodore VIC-20 and Apple IIe at home. Going on to study computer science in college and ultimately getting a PhD at Berkeley, the bulk of my professional training was rooted in what I will call 'classical' CS: programming, algorithms, data structures, systems, programming languages."

"When I was in college in the early '90s, we were still in the depth of the AI Winter, and AI as a field was likewise dominated by classical algorithms. In Dan Huttenlocher's PhD-level computer vision course in 1995 or so, we never once discussed anything resembling deep learning or neural networks--it was all classical algorithms like Canny edge detection, optical flow, and Hausdorff distances."

"One thing that has not really changed is that computer science is taught as a discipline with data structures, algorithms, and programming at its core. I am going to be amazed if in 30 years, or even 10 years, we are still approaching CS in this way. Indeed, I think CS as a field is in for a pretty major upheaval that few of us are really prepared for."

"I believe that the conventional idea of 'writing a program' is headed for extinction, and indeed, for all but very specialized applications, most software, as we know it, will be replaced by AI systems that are trained rather than programmed."

"I'm not just talking about CoPilot replacing programmers. I'm talking about replacing the entire concept of writing programs with training models. In the future, CS students aren't going to need to learn such mundane skills as how to add a node to a binary tree or code in C++. That kind of education will be antiquated, like teaching engineering students how to use a slide rule."

"The shift in focus from programs to models should be obvious to anyone who has read any modern machine learning papers. These papers barely mention the code or systems underlying their innovations; the building blocks of AI systems are much higher-level abstractions like attention layers, tokenizers, and datasets."

This got me thinking: Over the last 20 years, I've been predicting AI would advance to the point where it could automate jobs, and it's looking more and more like I was fundamentally right about that, and all the people who pooh-poohed the idea over the years in conversations with me were wrong. But while I was right about that fundamental idea (and right that there wouldn't be "one AI in a box" that anyone could pull the plug on if something went wrong, but a diffusion of the technology around the world like every previous technology), I was wrong about how exactly it would play out.

First I was wrong about the timescales: I thought it would be necessary to understand much more about how the brain works, and to work algorithms derived from neuroscience into AI models, and looking at the rate of advancement in neuroscience I predicted AI wouldn't be in its current state for a long time. While broad concepts like "neuron" and "attention" have been incorporated into AI, there are practically no specific algorithms that have been ported from brains to AI systems.

Second, I was wrong about the order. I thought "routine" jobs would be automated first, and "creative" jobs last. It turns out that what matters is "mental" vs "physical". Computers can create visual art and music just by thinking very hard -- it's a purely "mental" activity, and computers can do all that thinking in bits and bytes.

This has led me to ponder: What occupations require the greatest level of manual dexterity?

Those should be the jobs safest from the AI revolution.

The first that came to mind for me -- when I was trying to think of jobs that require an extreme level of physical dexterity and pay very highly -- was "surgeon". So I now predict "surgeon" will be the last job to get automated. If you're giving career advice to a young person (or you are a young person), the advice to give is: become a surgeon.

Other occupations safe (for now) from automation for the same reason would include "physical therapist", "dentist", "dental hygienist", "dental technician", "medical technician" (e.g. the people who customize prosthetics, orthodontic devices, and so on), and "nurse" -- at least the nurses who routinely do physical procedures like drawing blood.

Continuing in the same vein but going outside the medical field (pun not intended but allowed to stand once recognized), I'd put "electronics technician". I don't think robots will be able to solder any time soon, or manipulate very small components -- at least not after the initial assembly is completed, which does seem to be highly amenable to automation. But once electronic components fail, to the extent it falls on people to repair them rather than throw them out and replace them (which admittedly happens a lot), humans aren't going to be replaced any time soon.

Likewise "machinist" who works with small parts and tools.

"Engineer" ought to be ok -- as long as they're mechanical engineers or civil engineers. Software engineers are in the crosshairs. What matters is whether physical manipulation is part of the job.

"Construction worker" -- some jobs are high pay/high skill while others are low pay/low skill. Will be interesting to see what gets automated first and last in construction.

Other "trade" jobs like "plumber", "electrician", "welder" -- probably safe for a long time.

"Auto mechanic" -- probably one of the last jobs to be automated. The factory where the car is initially manufacturered, a very controlled environment, may be full of robots, but it's hard to see robots extending into the auto mechanic's shop where cars go when they break down.

"Jewler" ought to be a safe job for a long time. "Watchmaker" (or "watch repairer") -- I'm still amazed people pay so much for old-fashioned mechanical watches. I guess the point is to be pieces of jewlry, so these essentially count as "jewler" jobs.

"Tailor" and "dressmaker" and other jobs centered around sewing.

"Hairstylist" / "barber" -- you probably won't be trusting a robot with scissors close to your head any time soon.

"Chef", "baker", whatever the word is for "cake calligrapher". Years ago I thought we'd have automated kitchens at fast food restaurants by now but they are no where in sight. And nowhere near automating the kitchens of the fancy restaurants with the top chefs.

Finally, let's revisit "artist". While "artist" is in the crosshairs of AI, some "artist" jobs are actually physical -- such as "sculptor" and "glassblower". These might be resistant to AI for a long time. Not sure how many sculptors and glassblowers the economy can support, though. Might be tough if all the other artists stampede into those occupations.

While "musician" is totally in the crosshairs of AI, as we see, that applies only to musicians who make recorded music -- going "live" may be a way to escape the automation. No robots with the manual dexterity to play physical guitars, violins, etc, appear to be on the horizon. Maybe they can play drums?

And finally for my last item: "Magician" is another live entertainment career that requires a lot of manual dexterity and that ought to be hard for a robot to replicate. For those of you looking for a career in entertainment. Not sure how many magicians the economy can support, though.

The end of programming - Matt Welsh

#solidstatelife #genai #codingai #technologicalunemployment

waynerad@diasp.org

Devon "the first AI software engineer"

You put it in the "driver's seat" and it does everything for you. Or at least that's the idea.

"Benchmark the performance of LLaMa".

Devin builds the whole project, uses the browser to pull up API documentation, runs into an unexpected error, adds a debugging print statement, uses the error in the logs to figure out how to fix the bug, then builds and deploys a website with full styling as a visualization.

See below for reactions.

Introducing Devin, the first AI software engineer - Cognition

#solidstatelife #ai #genai #llms #codingai #technologicalunemployment