"MatterGen: a generative model for inorganic materials design".

So the approach taken here to "generating" materials is to use a diffusion model, except change it so it generates materials instead of images. Or at least, one specific subclass of materials, which is inorganic crystals.

With a diffusion model that generates images, the basic idea is that you take the process of adding noise to an image and reverse it. In other words, you train a model to remove noise from an image. And it has to remove noise in the direction of the text prompt that you gave it.

Here, we represent a crystal as a combination of three things: the atoms, a "lattice" that describes the repeating cell the crystal is built from, and coordinates for where the atoms are in the lattice. Then we come up with rules for adding "noise" to each of these. The neural network that gets trained learns to reverse this noise-adding process. It learns a different way to reverse each of these three inputs: the atom types (which chemical element each atom is), the lattice, and the positions of the atoms relative to the lattice. The coordinate system for the atoms is not absolute 3D space (that is to say, Cartesian); it's relative to the lattice. The symmetry of the resulting crystal is classified by its space group, and these have names like C2/m (monoclinic), P4/mbm (tetragonal), R3m (trigonal), P1 (triclinic), Pm-3m (cubic), P63/mmc (hexagonal), Fm-3m (cubic), and so on (these are the 3-dimensional space groups).
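To make the "relative to the lattice" idea concrete, here's a tiny Python sketch of converting lattice-relative (fractional) coordinates into ordinary Cartesian ones. I'm assuming the lattice is stored as three vectors that span the repeating cell; the function name and the 4-angstrom cubic cell are made up for illustration, not taken from the paper.

```python
def frac_to_cart(frac, lattice):
    """Convert one atom's fractional coordinates to Cartesian.

    lattice: three lattice vectors (one per row), each a 3-tuple.
    frac: the atom's position as fractions of those three vectors.
    """
    return tuple(
        sum(frac[i] * lattice[i][k] for i in range(3))
        for k in range(3)
    )


# Hypothetical cubic cell, 4 angstroms on a side.
lattice = [(4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 4.0)]

# An atom at the center of the cell is (0.5, 0.5, 0.5) in lattice terms.
body_center = frac_to_cart((0.5, 0.5, 0.5), lattice)
```

The point is that (0.5, 0.5, 0.5) means "halfway along each lattice vector" no matter how big or skewed the cell is.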

Ok, so now you have the basic idea of the 3 types of diffusion the network uses: atom type diffusion, lattice diffusion, and coordinate diffusion relative to the lattice. The coordinate diffusion needs its variance adjusted for atomic density to work properly. The lattice diffusion has some complications related to its rotation, and its mean and variance limits, which I would explain to you if I understood them. But I don't, so we're going to skip that.
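For a rough feel of what "adding noise" to each of the three pieces could look like, here's a toy Python sketch. Everything in it is my own simplification: masking for atom types, plain Gaussian noise wrapped around the cell for coordinates, and plain Gaussian noise for the lattice are illustrative choices, and it ignores the density adjustment and the lattice complications mentioned above.

```python
import random

MASK = "MASK"  # hypothetical placeholder for a noised-out atom type


def noise_atom_types(elements, p_mask, rng):
    """Replace each element with MASK with probability p_mask."""
    return [MASK if rng.random() < p_mask else e for e in elements]


def noise_frac_coords(coords, sigma, rng):
    """Add Gaussian noise, then wrap back into the 0-to-1 range,
    because the cell repeats in every direction."""
    return [
        tuple((x + rng.gauss(0.0, sigma)) % 1.0 for x in xyz)
        for xyz in coords
    ]


def noise_lattice(lattice, sigma, rng):
    """Add Gaussian noise to every component of the three lattice vectors."""
    return [tuple(v + rng.gauss(0.0, sigma) for v in vec) for vec in lattice]


rng = random.Random(0)  # fixed seed so the toy example is repeatable
noisy_atoms = noise_atom_types(["Na", "Cl"], p_mask=0.5, rng=rng)
noisy_coords = noise_frac_coords([(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)], 0.1, rng)
noisy_lattice = noise_lattice(
    [(4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 4.0)], 0.1, rng
)
```

The trained network's job is then to run each of these three processes backwards.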

Ok, at this point you may be wondering: with an image diffusion model, it does reverse-diffusion in the direction of the text prompt you give it. But here, there's no text prompt. So what is there and what happens?

Instead of a text prompt, you specify what properties you want your material to have. And just as an image-based diffusion model has to be trained on a massive number of image-text pairs, this system has to be trained on a massive number of examples of materials-and-properties pairs. These come from 3 databases: the Materials Project database, the Alexandria database, and the Inorganic Crystal Structure Database (ICSD). These add up to 1,081,850 materials with up to 20 atoms.

This enables the system to "steer" the reverse diffusion towards the properties you say you desire. Generally you want a stable material, so the system starts with a stability check based on density functional theory (DFT) calculations. It filters out structures whose energy per atom, after relaxation via DFT, is above some threshold; anything below the threshold qualifies as "stable". It also checks that the bond lengths are reasonable, and that the charges balance, so you don't end up with an ionic crystal whose ionic charges don't cancel out.
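The charge-balance check is the easiest of these to picture in code. Here's a minimal Python sketch, assuming each element has exactly one oxidation state (real tools have to consider several per element); the function name and the little oxidation-state table are hypothetical.

```python
def is_charge_balanced(composition, oxidation_states):
    """Return True if the ionic charges in the formula sum to zero.

    composition: element symbol -> number of atoms in the formula.
    oxidation_states: element symbol -> assumed charge per atom.
    """
    total = sum(
        count * oxidation_states[element]
        for element, count in composition.items()
    )
    return total == 0


# Simplifying assumption: one fixed oxidation state per element.
oxidation_states = {"Na": 1, "Mg": 2, "Cl": -1, "O": -2}

nacl_ok = is_charge_balanced({"Na": 1, "Cl": 1}, oxidation_states)
mgcl_ok = is_charge_balanced({"Mg": 1, "Cl": 1}, oxidation_states)
```

NaCl passes because +1 and -1 cancel, while a 1:1 "MgCl" leaves a leftover +1 and gets rejected.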

Once these checks are passed, it comes down to what you ask for. You can put limits directly on the structure, such as restricting what types of symmetry you can get in your lattice.

More commonly, though, you'll ask for magnetic, electronic, or mechanical properties. Examples of these would be magnetic density, or a target band gap, which affects the material's conducting or semiconducting properties.

You can ask for a certain bulk modulus. This has to do with how hard a material is to compress. It's a measure of how much pressure it takes to produce a given fractional decrease in volume, so a higher bulk modulus means a stiffer material.
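As a formula, the bulk modulus is K = -V * dP/dV: the pressure change divided by the fractional volume change, with a minus sign because volume goes down as pressure goes up. Here's a small Python sketch using a finite-difference approximation; the numbers are made up for illustration.

```python
def bulk_modulus(volume, delta_volume, delta_pressure):
    """Approximate K = -V * dP/dV from a small finite change.

    The result has the same units as delta_pressure.
    """
    return -volume * delta_pressure / delta_volume


# Made-up numbers: volume drops by 1% when pressure rises by 1 GPa.
k = bulk_modulus(volume=100.0, delta_volume=-1.0, delta_pressure=1.0)
```

With those numbers K comes out to 100 GPa. A material that shrank by only 0.5% under the same pressure would have twice the bulk modulus.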

There's even something called the Herfindahl-Hirschman index, which you may have heard of from the world of economics and investing. Classically, it's a measure of market concentration: you square each company's market share and add them up, so an industry dominated by a few big companies scores high and a competitive one scores low. Here, it measures the "supply chain risk" of a material.
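The classic version of the index is just the sum of squared market shares, which is easy to sketch in Python. This is the textbook economics formula with made-up shares, not necessarily the exact way the paper applies it to materials.

```python
def hhi(shares):
    """Herfindahl-Hirschman index: sum of squared market shares.

    shares can be in any units (they are normalized here), and the
    result runs from near 0 (many small players) up to 1 (a monopoly).
    """
    total = sum(shares)
    return sum((s / total) ** 2 for s in shares)


monopoly = hhi([1])               # one producer controls everything
four_equal = hhi([1, 1, 1, 1])    # four equally sized producers
```

You'll also see the index quoted on a 0 to 10,000 scale, which is the same thing computed with percentages instead of fractions. For supply chain risk, the intuition is that something sourced from only one or two producers scores high.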


#solidstatelife #ai #chemistry