AI can't cross this line on a graph and we don't know why.

The vertical axis of the graph shows the "error" the neural network is trying to minimize during training (also called the "loss").

The horizontal axis shows the amount of computing power thrown at the training process.

When the plot is switched to a log-log graph -- logarithmic on both axes -- a straight line emerges.
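
To see why the log-log part matters: a power law plots as a straight line on log-log axes. Here's a toy sketch of that (made-up constants, purely illustrative, not the actual measured curve):

```python
# Toy illustration (not the real measured law): a power-law relationship
# between training compute and loss appears as a straight line on log-log axes.
import numpy as np
import matplotlib.pyplot as plt

compute = np.logspace(0, 8, 100)   # arbitrary units of training compute
loss = 5.0 * compute ** -0.05      # made-up constants for illustration

plt.loglog(compute, loss)          # log scale on both axes -> straight line
plt.xlabel("Training compute (arbitrary units)")
plt.ylabel("Loss")
plt.title("A power law looks linear on a log-log plot")
plt.show()
```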

This is actually one of three observed neural network scaling laws. The other two look at model size and dataset size and show a similar pattern.
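
For reference, the way these three laws usually get written down (my paraphrase of the standard power-law formulation, not taken from the video) looks something like the following, where N is parameter count, D is dataset size, C is training compute, and the subscripted constants and exponents are fitted empirically:

```latex
% Sketch of the commonly quoted power-law forms (notation may differ from the video):
% N = model parameters, D = dataset size, C = training compute;
% N_c, D_c, C_c and the alpha exponents are empirically fitted constants.
\begin{align}
  L(N) &\approx \left(\frac{N_c}{N}\right)^{\alpha_N} \\
  L(D) &\approx \left(\frac{D_c}{D}\right)^{\alpha_D} \\
  L(C) &\approx \left(\frac{C_c}{C}\right)^{\alpha_C}
\end{align}
```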

Have we discovered some fundamental law of nature, like the ideal gas law in chemistry, or is this an artifact of the particular methods we are using now to train neural networks?

You might think someone knows the answer, but nobody does.

That didn't stop this YouTuber from making some good animations of the graphs and of various concepts in neural network training, such as cross-entropy. The video also introduces the interesting idea that language itself may have a certain inherent entropy.
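
As a rough sketch of that entropy idea (toy numbers of my own, not from the video): the cross-entropy a language model minimizes can never drop below the entropy of the true distribution, so if language really does have an inherent entropy, that's a floor the loss can't get under:

```python
# Minimal sketch of cross-entropy, the quantity language models minimize.
# Cross-entropy H(p, q) can never drop below the entropy H(p) of the true
# distribution -- one way to read "language has an inherent entropy":
# a floor the loss cannot cross no matter how good the model gets.
import numpy as np

def entropy(p):
    return -np.sum(p * np.log2(p))

def cross_entropy(p, q):
    return -np.sum(p * np.log2(q))

p = np.array([0.7, 0.2, 0.1])   # hypothetical "true" next-token distribution
q = np.array([0.6, 0.3, 0.1])   # the model's predicted distribution

print(entropy(p))           # ~1.157 bits: the irreducible floor
print(cross_entropy(p, q))  # ~1.195 bits: always >= entropy(p)
```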

The best theory as to why the scaling laws hold tries to explain them in terms of neural networks learning high-dimensional data manifolds.
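
If I'm remembering that line of work right, the argument ties the scaling exponent to the intrinsic dimension d of the manifold the data lives on, roughly:

```latex
% Hedged sketch of the manifold-dimension argument: loss falls as a power
% of model size N, with the exponent set by the intrinsic dimension d of
% the data manifold (so data on a lower-dimensional manifold scales faster).
L(N) \;\propto\; N^{-\alpha}, \qquad \alpha \approx \frac{4}{d}
```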

AI can't cross this line and we don't know why. - Welch Labs

#solidstatelife #ai #llms #genai #deeplearning #neuralnetworks #scalinglaws