Predicting previously unknown crystal structures can be seen as the ultimate challenge for machine learning potentials.
Such a task requires the model to be robust and accurate, and to provide a good representation of the DFT potential energy surface.
Here we test a state-of-the-art model, M3GNet, on three cases of crystal structure prediction (CSP) investigations:
Our results highlight the importance of data when building models aimed at universality: a model is only as universal as its training data allows. It would also be useful to have uncertainty information available whenever a model makes a prediction, so that out-of-sample cases can be quickly identified. We have not explored additional training or fine-tuning with new DFT data, which could potentially allow the model to adjust quickly to the chemical space of interest while being more data-efficient than building a specialised potential from the ground up. Such strategies have been widely used in the domains of computer vision and natural language processing.
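As an illustration of how uncertainty information could be used to identify out-of-sample cases, the following minimal sketch flags candidate structures on which an ensemble of models disagrees. It is not part of this work: the ensemble approach, the function name `flag_out_of_sample`, the threshold, and all energy values are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def flag_out_of_sample(ensemble_energies, threshold):
    """Flag structures whose ensemble predictions disagree strongly.

    ensemble_energies: dict mapping a structure label to a list of
        energies per atom (eV/atom), one value per ensemble member.
    threshold: spread (eV/atom) above which a structure is flagged.
    Returns a dict: label -> (mean energy, spread, flagged).
    """
    report = {}
    for label, energies in ensemble_energies.items():
        # Population standard deviation as a simple disagreement measure.
        spread = pstdev(energies)
        report[label] = (mean(energies), spread, spread > threshold)
    return report


# Made-up predictions from a hypothetical four-member ensemble.
predictions = {
    "candidate_A": [-3.412, -3.409, -3.415, -3.411],  # members agree
    "candidate_B": [-2.870, -3.140, -2.650, -3.020],  # members disagree
}

report = flag_out_of_sample(predictions, threshold=0.05)
flagged = [label for label, (_, _, is_flagged) in report.items() if is_flagged]
print(flagged)
```

Structures flagged in this way would be natural candidates for verification with DFT, and possibly for inclusion in additional training data.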