#mathematics

mkwadee@diasp.eu

My wife was working through finding the derivative of the #exponential #function #exp(x) from first principles. I was made aware that she hadn't actually seen why the number e=2.71828... is the #base of the function and that that's what you need to start with. In fact, that means one must actually start by finding the first differential of a general #logarithm and find #e from there. Once you've found the #FirstDerivative of #ln(x), the #derivative of the #ExponentialFunction is straightforward.
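
For anyone who prefers the argument in symbols rather than on the scanned pages, here is a minimal sketch of the route described above (my notation, not necessarily the one used on the pages):

```latex
% From first principles, for a general base a:
\[
\frac{d}{dx}\log_a x
  = \lim_{h\to 0}\frac{\log_a(x+h)-\log_a x}{h}
  = \lim_{h\to 0}\frac{1}{h}\,\log_a\!\left(1+\frac{h}{x}\right)
  = \frac{1}{x}\,\log_a\!\left(\lim_{n\to\infty}\left(1+\frac{1}{n}\right)^{n}\right)
  = \frac{1}{x}\,\log_a e,
\]
% using the substitution $n = x/h$. Defining $e$ as the base with
% $\log_a e = 1$ gives $\frac{d}{dx}\ln x = \frac{1}{x}$. Then, with
% $y = e^x$ so that $x = \ln y$,
\[
\frac{dx}{dy} = \frac{1}{y}
\quad\Longrightarrow\quad
\frac{dy}{dx} = y = e^{x}.
\]
```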

Anyway, here it is for anyone who might be interested, for educational purposes.
Page 1 of deriving the differential of the log function
Page 2 of deriving the differential of the log function
Page 3 of deriving the differential of the log function

#Calculus #Derivative #Mathematics #Differentiation #CCBYSA

waynerad@diasp.org

FrontierMath is a new benchmark of original, exceptionally challenging mathematics problems -- all of them new and previously unpublished, so they can't already be in large language models' (LLMs') training sets.

We don't have a good measurement of very advanced mathematics capabilities in AI models. The researchers note that current mathematics benchmarks for AI systems, like the MATH dataset and GSM8K, measure ability at the high-school and early-undergraduate level. The researchers are motivated by a desire to measure deep theoretical understanding, creative insight, and specialized expertise.

There's also the problem of "data contamination" -- "the inadvertent inclusion of benchmark problems in training data." This causes artificially inflated performance scores for LLMs, which masks the models' true reasoning (or lack of reasoning) capabilities.

"The benchmark spans the full spectrum of modern mathematics, from challenging competition-style problems to problems drawn directly from contemporary research, covering most branches of mathematics in the 2020 Mathematics Subject Classification."

I had a look at the 2020 Mathematics Subject Classification. It's a 224-page document that is just a big list of subject areas with number-and-letter codes assigned to them. For example "11N45" means "Asymptotic results on counting functions for algebraic and topological structures".

"Current state-of-the-art AI models are unable to solve more than 2% of the problems in FrontierMath, even with multiple attempts, highlighting a significant gap between human and AI capabilities in advanced mathematics."

"To understand expert perspectives on FrontierMath's difficulty and relevance, we interviewed several prominent mathematicians, including Fields Medalists Terence Tao, Timothy Gowers, and Richard Borcherds, and Internatinal Mathematics Olympiad coach Evan Chen. They unanimously characterized the problems as exceptionally challenging, requiring deep domain expertise and significant time investment to solve."

Unlike many International Mathematics Olympiad problems, the FrontierMath problems have a single numerical answer, which makes them possible to check in an automated manner -- no human hand-grading required. At the same time, they have worked to make the problems "guess-proof".

"Problems often have numerical answers that are large and nonobvious." "As a rule of thumb, we require that there should not be a greater than 1% chance of guessing the correct answer without doing most of the work that one would need to do to 'correctly' find the solution."

The numerical calculations don't need to be done inside the language model -- the models are given access to Python to perform mathematical calculations.

FrontierMath: A benchmark for evaluating advanced mathematical reasoning in AI

#solidstatelife #ai #genai #llms #mathematics

drnoam@diasp.org

I’m not crossing the #wordle picket line. But there are lots of alternatives!

nerdlegame 1022 6/6

🟩⬛⬛🟪🟪🟪⬛⬛
🟩🟪🟪⬛🟪⬛🟪⬛
🟩🟪🟪⬛⬛🟩🟩⬛
🟩⬛⬛🟪🟩🟩🟩⬛
🟩🟩⬛⬛🟩🟩🟩🟪
🟩🟩🟩🟩🟩🟩🟩🟩

#notWordle #puzzles #mathematics

psychmesu@diaspora.glasswings.com

https://mastodon.social/@gutenberg_org/113169348964233205 gutenberg_org@mastodon.social - American mathematician Dorothy Vaughan was born #OTD in 1910.

She was the first respected Black female manager at NASA, thus creating a long-lasting legacy for diversity in mathematics & science for the West Area Computers. As one of the first female coders in the field who knew how to code FORTRAN, she was able to instruct other Black women in the coding language & paved the way for a wave of female programmers to integrate their work into NASA’s systems.

https://en.wikipedia.org/wiki/Dorothy_Vaughan

#mathematics #womeninSTEM

waynerad@diasp.org

AlphaProof is a new reinforcement-learning based system for formal math reasoning from DeepMind. AlphaProof + AlphaGeometry 2, an improved version of DeepMind's geometry system, solved 4 out of 6 problems from this year's International Mathematical Olympiad (IMO), achieving the same level as a silver medalist.

"AlphaProof solved two algebra problems and one number theory problem by determining the answer and proving it was correct. This included the hardest problem in the competition, solved by only five contestants at this year's IMO. AlphaGeometry 2 proved the geometry problem, while the two combinatorics problems remained unsolved."

"AlphaProof is a system that trains itself to prove mathematical statements in the formal language Lean. It couples a pre-trained language model with the AlphaZero reinforcement learning algorithm, which previously taught itself how to master the games of chess, shogi and Go."

"Formal languages offer the critical advantage that proofs involving mathematical reasoning can be formally verified for correctness."

"When presented with a problem, AlphaProof generates solution candidates and then proves or disproves them by searching over possible proof steps in Lean. Each proof that was found and verified is used to reinforce AlphaProof's language model, enhancing its ability to solve subsequent, more challenging problems."

"We trained AlphaProof for the IMO by proving or disproving millions of problems, covering a wide range of difficulties and mathematical topic areas over a period of weeks leading up to the competition. The training loop was also applied during the contest, reinforcing proofs of self-generated variations of the contest problems until a full solution could be found."

The blog post seems to have revealed few details of how AlphaProof works. But it sounds like we're about to enter a new era of math proofs, where all kinds of theorems will be discovered and proved.

AI achieves silver-medal standard solving International Mathematical Olympiad problems

#solidstatelife #ai #genai #llms #reinforcementlearning #rl #mathematics #proofs

mkwadee@diasp.eu

Yesterday, I posted an image of the #LorenzAttractor showing the evolution of three trajectories (shown in red, green and blue) starting close together. Here, I've made it into a little animation to show how the paths initially stay close to each other but after about a quarter of the duration plotted, they #diverge from each other irrevocably (i.e. become uncorrelated) but remain part of the #ChaoticAttractor.
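
The original was made in #WxMaxima; for anyone who wants to reproduce the experiment numerically, here is a rough Python/SciPy equivalent (the parameter values are the classic ones; the tiny initial-condition offsets are my own choice, not necessarily those used in the animation):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Classic Lorenz system with the standard parameter values.
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0

def lorenz(t, state):
    x, y, z = state
    return [SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z]

t_span = (0.0, 40.0)
t_eval = np.linspace(*t_span, 8000)

# Three trajectories starting very close together (offsets are illustrative).
starts = [(1.0, 1.0, 1.0), (1.0 + 1e-6, 1.0, 1.0), (1.0, 1.0 + 1e-6, 1.0)]
trajectories = [
    solve_ivp(lorenz, t_span, s, t_eval=t_eval, rtol=1e-9, atol=1e-12).y
    for s in starts
]

# The separation between two trajectories grows roughly exponentially
# before saturating at the size of the attractor.
separation = np.linalg.norm(trajectories[0] - trajectories[1], axis=0)
print(separation[::1000])
```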

#DynamicalSystems #ChaoticAttractors #StrangeAttractors #NumericalSolutions #Mathematics #AppliedMathematics #CCBYSA #FreeSoftware #WxMaxima