
Thursday, February 27, 2025

A Note about MACE

I came across a subtle fact that I didn't catch when I first read the MACE paper [1]: they use the total energy of the system, not the binding or cohesive energy. More specifically, the energy is not:

$$ \begin{equation} E_{\text{MACE}} = E_{\text{system}} - \sum_{i} N_i E_i \label{eq:binding_energy} \end{equation} $$

but rather just,

$$ E_{\text{MACE}} = E_{\text{system}}\quad , $$

where $i$ denotes the atom type, $N_i$ is the number of atoms of that type, and $E_i$ is the energy of an isolated atom of type $i$. Here, $E_{\text{system}}$ is the total ground-state energy of the system. The model expands this energy as a sum of site energies, $E_{\text{MACE}} = \sum_{i} \mathrm{E}_i(\mathbf{h}_i)$, where $\mathbf{h}_i$ represents the learned node-feature embedding for atom $i$ and contains all the body-order dependencies through the learned message construction, passing, and updating [2-3].

This realization came about because I was performing some equation-of-state calculations for certain systems using MACE and couldn't understand why the energy seemed excessively negative compared to other atomistic calculations and experimental values. It's important to note that this energy shift doesn't affect the equilibrium volume or bulk modulus from the fits, as those depend on the shape and derivatives of the curve, not its absolute level. Similarly, for MD this won't affect the dynamics: the forces are given by $\mathbf{F} = -\nabla E$, and a constant reference shift has zero gradient.
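To spell that out: for a fixed composition, the total and binding energies differ only by the constant $C = \sum_{i} N_i E_i$, which drops out of the pressure and bulk modulus obtained from an equation-of-state fit,

$$ \tilde{E}(V) = E(V) + C \quad \Longrightarrow \quad P = -\frac{\partial \tilde{E}}{\partial V} = -\frac{\partial E}{\partial V}\,, \qquad B = V\,\frac{\partial^2 \tilde{E}}{\partial V^2} = V\,\frac{\partial^2 E}{\partial V^2}\,. $$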

The only quantities affected are thermodynamic ones like enthalpies and free energies, which depend on the absolute energy of the system; any thermodynamic state changes you compute will be with respect to this reference energy. However, the interesting aspect of MACE is that the site energies are actually learned! This means an isolated atom in vacuum has a non-zero energy.

MACE therefore lets you calculate the energy of an isolated atom and subtract it to obtain the binding energy of the system, as in Eq. \eqref{eq:binding_energy}. Here is an example of the output for the diatomic curve of CO with MACE, without the energy shift:

C-O diatomic curve showing the total energy shift

If we calculate the site energies for $\textrm{C}$ and $\textrm{O}$ in this system, we get:

  • $\textrm{C}$ = -2.0562 eV
  • $\textrm{O}$ = -2.0121 eV

Subtracting the sum of these two site energies from the total energy at the equilibrium bond length gives a binding energy of approximately 10.8 eV, which is quite close to the experimental value of 11.11 eV per molecule.
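For anyone wanting to reproduce this kind of subtraction, here is a minimal sketch using the ASE interface to MACE. I use the mace_mp foundation-model calculator for convenience (swap in your own MACECalculator and model file as appropriate), so the numbers will differ from the ones above, which come from a different model.

# Sketch: recover a binding energy from MACE total energies by subtracting
# the model's own isolated-atom energies, as in the equation above.
from ase import Atoms
from mace.calculators import mace_mp  # swap for your own MACECalculator if needed

calc = mace_mp(model="medium", device="cpu")

# CO molecule near its experimental equilibrium bond length (~1.128 Angstrom)
co = Atoms("CO", positions=[[0.0, 0.0, 0.0], [0.0, 0.0, 1.128]])
co.calc = calc
e_system = co.get_potential_energy()

# Isolated atoms in vacuum: MACE assigns these non-zero (learned) energies
e_isolated = 0.0
for symbol in ("C", "O"):
    atom = Atoms(symbol)
    atom.calc = calc
    e_isolated += atom.get_potential_energy()

# E_system minus the sum of isolated-atom energies; its magnitude is the
# binding energy (roughly 11 eV for CO)
print("Binding energy:", e_system - e_isolated, "eV")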

Therefore, be sure to keep this in mind when analyzing MACE results. I still need to check whether other equivariant models adopt the same convention.


References

[1] I. Batatia, D.P. Kovács, G.N.C. Simm, C. Ortner, G. Csányi, MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields, (2022). https://doi.org/10.48550/arXiv.2206.07697.

[2] B. Sanchez-Lengeling, E. Reif, A. Pearce, A.B. Wiltschko, A Gentle Introduction to Graph Neural Networks, Distill (2021). https://doi.org/10.23915/distill.00033.

[3] D. Grattarola, A practical introduction to GNNs - Part 2, Daniele Grattarola (2021). https://danielegrattarola.github.io/posts/2021-03-12/gnn-lecture-part-2.html (accessed February 28, 2025).




Thursday, February 20, 2025

Well, Hello, Graph-PES!

I haven't had much to write about these past few weeks, but I did come across a new package by John Gardner at the University of Oxford called Graph-PES [1]. The primary scope of the package is training graph neural network potentials for property prediction and simulation. It pairs really nicely with the load-atoms package, which I have written about before.

Since I've been working with M3GNet and other GNN potentials, I thought it would be fun to try implementing M3GNet in Graph-PES in my own time. The template implementation is here, but it isn't validated or even tested yet, so there's still plenty left to do.

What does Graph-PES let you do?

The main idea of the package is to provide a clean and straightforward API for implementing any graph neural network architecture for property prediction and simulation. The key features are:

  • Experiment with new model architectures by inheriting from the GraphPESModel base class
  • Train existing model architectures like M3GNet and MACE
  • Use and fine-tune supported foundation (pre-trained) models
  • Configure distributed training with learning-rate scheduling and Weights & Biases logging
  • Run molecular dynamics simulations via LAMMPS or ASE using any GraphPESModel (a minimal ASE sketch follows this list)
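Since the ASE route just needs an ASE-compatible calculator, the MD driver code is plain ASE. Below is a minimal NVT sketch that uses ASE's built-in EMT calculator on a copper cell purely as a stand-in; in practice you'd swap in the calculator wrapping your trained GraphPESModel (the Graph-PES docs cover that step) and the rest stays the same.

# Minimal NVT molecular dynamics with ASE. EMT on copper is only a stand-in
# calculator; swap in the ASE calculator for your trained GraphPESModel
# (see the Graph-PES docs) and the driver code below is unchanged.
from ase import units
from ase.build import bulk
from ase.calculators.emt import EMT
from ase.md.langevin import Langevin
from ase.md.velocitydistribution import MaxwellBoltzmannDistribution

atoms = bulk("Cu", "fcc", a=3.6, cubic=True) * (3, 3, 3)  # 108-atom supercell
atoms.calc = EMT()

MaxwellBoltzmannDistribution(atoms, temperature_K=300)  # initialize velocities
dyn = Langevin(atoms, timestep=2.0 * units.fs, temperature_K=300, friction=0.002)
dyn.run(500)  # 1 ps of Langevin (NVT) dynamics

print("Final potential energy:", atoms.get_potential_energy(), "eV")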

What I find particularly appealing about Graph-PES is its focus on streamlining the development process. It's also nice that you can store and track your training runs through the WandB interface. Overall it's a really clean, easy-to-understand package for training GNN potentials. Probably the biggest challenge is implementing a model architecture whose primitives aren't already provided in the Graph-PES base classes.

One other really nice feature is the use of TorchScript to export models for efficient deployment. This makes using your model with LAMMPS and the corresponding Graph-PES pair style straightforward. You can find a nice end-to-end tutorial on how to do this here.

πŸŒ€ Spinning up Graph-PES

Quickstart

This is just a rewording of the already excellent quickstart docs.

The quickest way to get started with Graph-PES is to install it via pip and run a very minimal example:

pip install graph-pes
wget https://tinyurl.com/graph-pes-minimal-config -O config.yaml
graph-pes-train config.yaml

This will train the example here. If you want to go further, you have two options for training a model: use Graph-PES's Python API, or write a YAML config with predefined model architectures and use the Graph-PES CLI tools. I'll just show the YAML route here since it's more concise and easier to read. But first, we need to grab the training data.

Datasets

If you just want to train a model to get some hands-on experience, you can use the load-atoms package and pick one of the many datasets available. That's what I'll be doing here.

from load_atoms import load_dataset

structures = load_dataset("SiO2-GAP-22")
# Note: This is just for demonstration purposes.
train, val, test = structures.random_split([80, 10, 10])

train.write("train-sio2.xyz", format="extxyz")
val.write("val-sio2.xyz", format="extxyz")
test.write("test-sio2.xyz", format="extxyz")

With the training, validation, and test splits written out, we can now train a model. I'll train the equivariant1 MACE model [2] but use a very small architecture just to speed things up.

general:
    run_id: mace-sio2
    progress: logged

# train a small MACE model
model:
    many-body:
        +MACE:
            elements: [Si, O]
            cutoff: 4.5
            radial_expansion: Bessel
            n_radial: 6
            l_max: 3
            channels: 12
            layers: 1
            readout_width: 12

data:
    train:
        +file_dataset:
            path: train-sio2.xyz
            cutoff: 4.5
            shuffle: true
    valid:
        +file_dataset:
            path: val-sio2.xyz
            cutoff: 4.5
            shuffle: true
    test:
        +file_dataset:
            path: test-sio2.xyz
            cutoff: 4.5

loss:
    - +PerAtomEnergyLoss()
    - +PropertyLoss: { property: forces, metric: RMSE }
    
fitting:
    trainer_kwargs:
        max_epochs: 100
        accelerator: auto
        check_val_every_n_epoch: 5

    optimizer:
        +Optimizer:
            name: AdamW
            lr: 0.001

    scheduler:
        +LRScheduler:
            name: ReduceLROnPlateau
            factor: 0.5
            patience: 10

    loader_kwargs:
        batch_size: 4

wandb:
    project: graph-pes-sio2
    tags: [demos]

The many-body section is where the model architecture is defined; the rest are details about training. To run it, pass the YAML file to the graph-pes-train command as shown above. This will log the run to WandB and save model checkpoints, letting you monitor training progress, which is particularly useful if you're running on the cloud or a cluster. In the terminal you will see something like this:

[graph-pes INFO]: Preparing data
[graph-pes INFO]: Setting up datasets
[graph-pes INFO]: Pre-fitting the model on 80 samples
[graph-pes INFO]: 
Number of learnable params:
    many-body (MACE): 13,012

WandB

You'll need to create a WandB account and copy your API key. The first time you run a Graph-PES training, it will prompt you for the key and save it to your environment variables.

Below are some plots from training runs with different hyperparameters. As you'll see, in some cases the small batch size and reduced model size don't generalize well. Some models get close to energy mean absolute errors of about 50 meV/atom, which is pretty good, but the forces are terrible. We could improve this by giving the force term a larger weight in the loss function. In general, this dataset size and these model/training hyperparameters aren't sufficient to learn the potential well.

Dashboard plots for different metrics of training and validation

I have not tried taking a foundation/pre-trained model and fine-tuning it with Graph-PES, but this should be possible and is described here. Another way to inspect a model is to look at parity plots of predicted vs. true values. The result for the best model (mace-sio2-4) is shown below. The plots don't look too bad; however, we'd really need to zoom in to a finer scale to see the meV-level variations.

Parity plots energy per atom and force components for the test data.
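If you want to generate similar parity plots yourself, below is a rough sketch. Two assumptions to flag: the reference energies are read straight from the extxyz file (ASE attaches them as single-point results), and I use the mace_mp foundation-model calculator purely as a stand-in for brevity; in practice you'd use the ASE calculator for your own trained Graph-PES model and its checkpoint (see the docs).

# Sketch of an energy parity plot over the held-out test set.
import matplotlib.pyplot as plt
from ase.io import read
from mace.calculators import mace_mp  # stand-in; use your trained model's calculator

calc = mace_mp(model="small", device="cpu")
test_structures = read("test-sio2.xyz", index=":")

e_true, e_pred = [], []
for atoms in test_structures:
    n_atoms = len(atoms)
    e_true.append(atoms.get_potential_energy() / n_atoms)  # reference label from file
    atoms.calc = calc
    e_pred.append(atoms.get_potential_energy() / n_atoms)  # model prediction

lims = [min(e_true + e_pred), max(e_true + e_pred)]
plt.plot(lims, lims, "k--", lw=1)  # y = x guide line
plt.scatter(e_true, e_pred, s=10)
plt.xlabel("Reference energy (eV/atom)")
plt.ylabel("Predicted energy (eV/atom)")
plt.tight_layout()
plt.savefig("parity-energy.png", dpi=200)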

So that is Graph-PES in a nutshell. I'll be using this package to implement and test some new model architectures and see how they perform. I like the API and design a lot; it seems well thought out. I should note that the existing foundation models already come with pretty straightforward CLI tools for training and fine-tuning, so Graph-PES isn't about being better so much as being flexible, with a very nice design.

Footnotes


  1. There are two types of GNNs with respect to how they handle symmetry transformations of their inputs: invariant and equivariant. With invariant GNNs, the output remains the same regardless of the transformation applied to the input; for example, the distance between two points does not change when the points or the reference frame are translated or rotated. In contrast, equivariant GNNs produce outputs that transform predictably along with the input. An example is the force vector between atoms $i$ and $j$, which is unchanged under translation but rotates along with the reference frame under rotation. A quick numerical check of this is sketched below.
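To make the footnote concrete, here is a tiny NumPy check (not tied to any particular GNN): the distance between two points is invariant under a rotation of the coordinates, while a force-like vector rotates along with them.

import numpy as np

# Two points and a "force" vector pointing from one to the other
r_i = np.array([0.0, 0.0, 0.0])
r_j = np.array([1.0, 2.0, 3.0])
force = r_j - r_i

# A rotation of 90 degrees about the z-axis
theta = np.pi / 2
R = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])

# Invariant: the distance is unchanged by the rotation
print(np.linalg.norm(r_j - r_i), np.linalg.norm(R @ r_j - R @ r_i))

# Equivariant: the force vector rotates with the frame
print(R @ force, (R @ r_j) - (R @ r_i))  # identical vectors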

References

[1] J. Gardner, Graph-PES, (2025). https://github.com/jla-gardner/graph-pes (accessed February 20, 2025).

[2] I. Batatia, D.P. Kovács, G.N.C. Simm, C. Ortner, G. Csányi, MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields, (2022). https://doi.org/10.48550/arXiv.2206.07697.


