#math

waynerad@diasp.org

ChatGPT is destroying Trefor Bazett's math exams.

"I just copy and pasted my exams from last semester -- this was a second year university level introductory linear algebra course -- into chat GPT and actually it got an A on my exams. But AI still makes a lot of pretty basic mistakes."

"What is the smallest integer whose square is between 15 and 30?"

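The question above is easy to check by brute force. A minimal sketch, assuming "between" means strictly between and that negative integers are allowed (the function name is made up for illustration):

```python
import math


def smallest_integer_with_square_between(low, high):
    """Return the smallest integer n with low < n*n < high, or None if there is none."""
    bound = math.isqrt(high) + 1  # no integer with |n| above this can qualify
    candidates = [n for n in range(-bound, bound + 1) if low < n * n < high]
    return min(candidates) if candidates else None


answer = smallest_integer_with_square_between(15, 30)
print(answer)
```

The search has to include negative integers: the qualifying values are -5, -4, 4 and 5, so the smallest is -5, while a search over positive integers only would report 4.
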
GPT-4o, Claude 3.5 Sonnet, and Google's Gemini all get nearly 100% on the GSM8K dataset (which is a fancy way of saying "Grade School Math, 8000 questions").

GSM-Hard is a dataset with the same word problems as GSM8K but with gigantic numbers -- so the LLM has to outsource the calculation to something like Wolfram|Alpha to be able to get the correct answers.

The MATH dataset has high school competition problems. LLMs can get these if they can be solved with "content knowledge", such as by having formulas memorized, but can fail if the reasoning required is made more complex. LLMs get about 70% on the whole dataset.

There are additional datasets with Mathematical Olympiad problems. LLMs score poorly on these, but their scores are increasing.

ChatGPT is destroying my math exams - Dr. Trefor Bazett

#solidstatelife #ai #genai #llms #mathllms #math

hankg@friendica.myportal.social

This is a fascinating look at early-prototype, bleeding-edge UX from Apple's first foray into tablet computing in 1991. It culminated in the Apple PenLite device, which had a limited release and was then cancelled. Most interesting to me was Alan Kay's suggestion to use Kalman filtering to make the pen motion feel smoother by denoising and predicting the motion. Turns out they used it on the first trackpads too. I never thought of using a Kalman filter for that sort of application. Maybe they still do. #apple #history #ComputerHistory #math
Apple PenLite: The iPad Before the iPad!
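
To make the Kalman idea concrete, here is a minimal sketch of a constant-velocity filter for a single pen coordinate. It is not Apple's implementation; the class name, the noise parameters and the simplified process noise (added to the velocity variance only) are all assumptions chosen for illustration.

```python
import random


class PenKalman1D:
    """Constant-velocity Kalman filter for one pen coordinate (hypothetical sketch)."""

    def __init__(self, dt=1.0, process_var=1e-3, measurement_var=1.0):
        self.dt = dt
        self.q = process_var      # assumed: how much the velocity may drift per step
        self.r = measurement_var  # assumed: variance of the digitizer noise
        self.pos, self.vel = 0.0, 0.0
        # Covariance entries; large starting values mean "initial state unknown".
        self.p00, self.p01, self.p10, self.p11 = 100.0, 0.0, 0.0, 100.0

    def step(self, z):
        """Consume one noisy position sample z and return the smoothed position."""
        dt = self.dt
        # Predict: x' = F x and P' = F P F^T + Q, with F = [[1, dt], [0, 1]].
        self.pos += dt * self.vel
        p00 = self.p00 + dt * (self.p10 + self.p01) + dt * dt * self.p11
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        # Update: blend in the measurement (H = [1, 0], only position is observed).
        innovation = z - self.pos
        s = p00 + self.r
        k0, k1 = p00 / s, p10 / s
        self.pos += k0 * innovation
        self.vel += k1 * innovation
        self.p00, self.p01 = (1 - k0) * p00, (1 - k0) * p01
        self.p10, self.p11 = p10 - k1 * p00, p11 - k1 * p01
        return self.pos

    def predict_ahead(self, steps):
        """Extrapolate the position a few samples into the future."""
        return self.pos + self.vel * steps * self.dt


# Usage: smooth a jittery stroke moving at 0.5 units per sample.
random.seed(0)
kf = PenKalman1D()
smoothed = [kf.step(0.5 * t + random.gauss(0.0, 1.0)) for t in range(200)]
```

The `step` call does the denoising, and `predict_ahead` covers the prediction part: drawing the ink slightly ahead of the last sample is one way to hide digitizer latency.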

diane_a@diasp.org

The ancient Greeks wanted to believe that the universe could be described in its entirety using only whole numbers and the ratios between them -- fractions, or what we now call rational numbers. But this aspiration was undermined when they considered a square with sides of length 1, only to find that the length of its diagonal couldn’t possibly be written as a fraction.

The first proof of this (there would be several) is commonly attributed to Pythagoras, a 6th-century BCE philosopher, even though none of his writings survive and little is known about him. Nevertheless, “it was the first crisis in what we call the foundations of mathematics,” said John Bell, a professor emeritus at Western University in London, Ontario.

That crisis would not be resolved for a long time. Though the ancient Greeks could establish what $\sqrt{2}$ was not, they didn’t have a language for explaining what it was.
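
For anyone who wants the argument itself, the standard textbook proof by contradiction can be sketched as follows. This is the modern algebraic version, not necessarily the form the Greeks used; $p$, $q$ and $k$ are just names introduced here.

```latex
Suppose $\sqrt{2} = \dfrac{p}{q}$ with positive integers $p, q$ and the fraction in lowest terms.
\begin{align*}
  p^2 &= 2q^2 && \text{(square both sides)} \\
  p   &= 2k   && \text{($p^2$ is even, so $p$ is even)} \\
  4k^2 &= 2q^2 \;\Longrightarrow\; q^2 = 2k^2 && \text{(so $q$ is even as well)}
\end{align*}
Both $p$ and $q$ are then divisible by $2$, contradicting the assumption of lowest terms.
Hence $\sqrt{2}$ cannot be written as a fraction.
```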

https://www.quantamagazine.org/how-the-square-root-of-2-became-a-number-20240621/
#math

bat_andrea@diasp.org

#Caturday #Math
catmath
Gwen Fisher @gwenbeads (Mastodon)

Physicists conjecture that for each cat, there is an anticat of the same size but opposite temperament. Some cats are shifted red and some are shifted blue. I’m not sure I got that all right. I was pretty sleepy during the lecture.

Cat and Anticat
Doodle No. 141

8โ€ square

Drawn with archival black pigment ink, highly lightfast (fade-resistant) watercolor pencils, and mica paint on Arches 300 GSM 100% cotton paper
#watercolor #mathart #physics #astronomy #rainbow