#differentiation

mkwadee@diasp.eu

My wife was working through finding the derivative of the #exponential #function #exp(x) from first principles. I became aware that she hadn't actually seen why the number e = 2.71828... is the #base of the function, and that this is what you need to start with. In fact, that means one must actually start by finding the first differential of a general #logarithm and find #e from there. Once you've found the #FirstDerivative of #ln(x), the #derivative of the #ExponentialFunction is straightforward.

Anyway, here it is for anyone who might be interested, for educational purposes.
[Images: pages 1-3 of deriving the differential of the log function]
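
For anyone who wants the outline without the photos, here's a minimal sketch of the argument in my own notation (the pages above may arrange the steps differently):

```latex
\begin{align*}
\frac{d}{dx}\log_a x
  &= \lim_{h\to 0}\frac{\log_a(x+h)-\log_a x}{h}
   = \lim_{h\to 0}\log_a\left(1+\tfrac{h}{x}\right)^{1/h} \\
  % substitute n = x/h, so n \to \infty as h \to 0^+
  &= \lim_{n\to\infty}\log_a\left(1+\tfrac{1}{n}\right)^{n/x}
   = \frac{1}{x}\,\log_a\underbrace{\lim_{n\to\infty}\left(1+\tfrac{1}{n}\right)^{n}}_{=\,e\,\approx\,2.71828}
   = \frac{\log_a e}{x}.
\end{align*}
```

Choosing the base a = e makes log_a(e) = 1, so d/dx ln(x) = 1/x. Then for y = e^x we have x = ln(y), so dx/dy = 1/y, and hence dy/dx = y = e^x, which is why e is the natural base to start with.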

#Calculus #Derivative #Mathematics #Differentiation #CCBYSA

waynerad@diasp.org

The full text of Simone Scardapane's book Alice's Adventures in a Differentiable Wonderland is available online for free. It's not available in print because it's still being written; what's online is a draft, though Volume 1 looks pretty much done, at about 260 pages. It introduces mathematical fundamentals and then explains automatic differentiation. From there it applies the concept to convolutional layers, graph layers, and transformer models. A Volume 2 is planned covering fine-tuning, density estimation, generative modeling, mixture-of-experts, early exits, self-supervised learning, debugging, and other topics.
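
If you've never seen automatic differentiation in action, here's a minimal sketch in JAX (my own toy example, not taken from the book): `jax.grad` builds the gradient of a loss through a composition of differentiable blocks, which is the core mechanism the book builds everything on.

```python
import jax
import jax.numpy as jnp

# A differentiable model is a composition of differentiable blocks;
# here, a tiny one-layer model followed by a squared-error loss.
def loss(w, x, y):
    pred = jnp.tanh(x @ w)          # differentiable block
    return jnp.mean((pred - y) ** 2)

# jax.grad constructs the derivative of `loss` with respect to its
# first argument (w) via automatic differentiation -- no manual calculus.
grad_loss = jax.grad(loss)

key = jax.random.PRNGKey(0)
w = jax.random.normal(key, (3,))
x = jnp.ones((5, 3))
y = jnp.zeros((5,))

# One step of gradient descent, the optimization the book covers in Chapter 4.
w = w - 0.1 * grad_loss(w, x, y)
```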

"Looking at modern neural networks, their essential characteristic is being composed by differentiable blocks: for this reason, in this book I prefer the term differentiable models when feasible. Viewing neural networks as differentiable models leads directly to the wider topic of differentiable programming, an emerging discipline that blends computer science and optimization to study differentiable computer programs more broadly."

"As we travel through this land of differentiable models, we are also traveling through history: the basic concepts of numerical optimization of linear models by gradient descent (covered in Chapter 4) were known since at least the XIX century; so-called 'fully-connected networks' in the form we use later on can be dated back to the 1980s; convolutional models were known and used already at the end of the 90s. However, it took many decades to have sufficient data and power to realize how well they can perform given enough data and enough parameters."

"Gather round, friends: it's time for our beloved Alice's adventures in a differentiable wonderland!"

Alice's Adventures in a Differentiable Wonderland

#solidstatelife #aieducation #differentiation #neuralnetworks