Nanoparticles have already found their way into applications ranging from energy storage and conversion to quantum computing and therapeutics. But given the vast compositional and structural tunability nanochemistry enables, serial experimental approaches to identify new materials impose insurmountable limits on discovery.
Now, researchers at Northwestern University and the Toyota Research Institute have applied machine learning to guide the synthesis of new nanomaterials, removing a key barrier to materials discovery. The trained algorithm combed through a defined dataset to accurately predict new structures that could fuel processes in the clean energy, chemical and automotive industries.
The authors asked the model which mixtures of up to seven elements would make something that hasn't been made before. The machine predicted 19 possibilities, and after the team tested each experimentally, 18 of the predictions proved correct. The work was led by Chad Mirkin, a Northwestern nanotechnology expert and the paper's corresponding author. Mirkin is the George B. Rathmann Professor of Chemistry in the Weinberg College of Arts and Sciences; a professor of chemical and biological engineering, biomedical engineering, and materials science and engineering at the McCormick School of Engineering; and a professor of medicine at the Feinberg School of Medicine. He also is the founding director of the International Institute for Nanotechnology. The study, "Machine learning-accelerated design and synthesis of polyelemental heterostructures," was published in the journal Science Advances.
According to the authors, what makes this approach so important is access to unprecedentedly large, high-quality datasets, because machine learning models and AI algorithms can only be as good as the data used to train them. The data-generation tool, called a "Megalibrary," was invented by Mirkin and dramatically expands a researcher's field of vision. Each Megalibrary houses millions or even billions of nanostructures, each with a slightly distinct shape, structure and composition, all positionally encoded on a two-by-two-centimeter chip. Each chip contains more new inorganic materials than scientists have ever collected and categorized to date.
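As a loose illustration of what "positionally encoded" means, the toy sketch below maps a chip coordinate to a composition, so a structure's location alone identifies what it is made of. The grid dimensions, elements and mixing rule here are invented for illustration and are not from the paper.

```python
# Hypothetical sketch of positional encoding on a Megalibrary-style chip:
# each grid site maps deterministically to a composition. The toy rule below
# varies the Au fraction along rows and the Ag fraction along columns.
def composition_at(row, col, n_rows=1000, n_cols=1000):
    """Map a chip coordinate to a three-metal composition (toy rule)."""
    au = row / (n_rows - 1)
    ag = (1 - au) * col / (n_cols - 1)
    cu = 1 - au - ag  # remainder, so fractions always sum to 1
    return {"Au": au, "Ag": ag, "Cu": cu}

site = composition_at(250, 500)
assert abs(sum(site.values()) - 1.0) < 1e-9  # every site is a valid mixture
```

Because the mapping is deterministic, no per-particle labeling is needed; reading out a structure's position on the chip is enough to recover its composition.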
Mirkin's team developed the Megalibraries by using a technique (also invented by the research group) called polymer pen lithography, a massively parallel nanolithography tool that enables the site-specific deposition of hundreds of thousands of features each second. When mapping the human genome, scientists were tasked with identifying combinations of four bases. But the loosely synonymous "materials genome" includes nanoparticle combinations of any of the periodic table's 118 elements, as well as parameters of shape, size, phase, morphology, crystal structure and more. Building smaller subsets of nanoparticles in the form of Megalibraries will bring researchers closer to completing a full map of a materials genome.
Machine learning applications are ideally suited to tackle the complexity of defining and mining the materials genome, but are gated by the ability to create datasets to train algorithms in the space. The combination of Megalibraries with machine learning may finally eradicate that problem, leading to an understanding of what parameters drive certain materials properties.
Using Megalibraries as a source of high-quality, large-scale materials data for training artificial intelligence (AI) algorithms enables researchers to move away from the "keen chemical intuition" and serial experimentation that typically accompany the materials discovery process.
The research team compiled previously generated Megalibrary structural data consisting of nanoparticles with complex compositions, structures, sizes and morphologies. They used this data to train the model and asked it to predict compositions of four, five and six elements that would result in a certain structural feature. In 19 predictions, the machine learning model was correct 18 times, an approximately 95% accuracy rate. With little knowledge of chemistry or physics, using only the training data, the model was able to accurately predict complicated structures that have never existed on Earth.
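The workflow described above, training on labeled composition data and then predicting the structural outcome of unseen mixtures, can be caricatured with a toy nearest-neighbor classifier. Everything here (the elements, the labels and every data point) is invented for illustration; the paper's actual model, features and dataset are far richer.

```python
from math import dist

# Hypothetical training set: composition vectors (fractions of Au, Ag, Cu, Pd)
# paired with the structural outcome observed in a Megalibrary-style screen.
train = [
    ((0.70, 0.30, 0.00, 0.00), "single-phase"),
    ((0.60, 0.40, 0.00, 0.00), "single-phase"),
    ((0.80, 0.10, 0.10, 0.00), "single-phase"),
    ((0.25, 0.25, 0.25, 0.25), "phase-separated"),
    ((0.30, 0.30, 0.30, 0.10), "phase-separated"),
    ((0.20, 0.20, 0.30, 0.30), "phase-separated"),
]

def predict(composition, k=3):
    """Majority vote among the k nearest labeled compositions."""
    nearest = sorted(train, key=lambda pair: dist(pair[0], composition))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

# Screen unseen candidate mixtures for the target structural feature.
candidates = [(0.65, 0.25, 0.05, 0.05), (0.25, 0.25, 0.30, 0.20)]
hits = [c for c in candidates if predict(c) == "single-phase"]
# hits -> [(0.65, 0.25, 0.05, 0.05)]
```

The point of the sketch is the shape of the pipeline, not the model: labeled (composition, structure) pairs go in, and predictions for never-made compositions come out, which experiments then confirm or refute.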
Metal nanoparticles show promise for catalyzing industrially critical reactions such as hydrogen evolution, carbon dioxide (CO2) reduction and oxygen reduction and evolution. The model was trained on a large Northwestern-built dataset to look for multi-metallic nanoparticles with set parameters around phase, size, dimension and other structural features that change the properties and function of nanoparticles. The Megalibrary technology may also drive discoveries across many areas critical to the future, including plastic upcycling, solar cells, superconductors and qubits.
Before the advent of Megalibraries, machine learning tools were trained on incomplete datasets collected by different people at different times, limiting their predictive power and generalizability. Megalibraries allow machine learning tools to do what they do best—learn and get smarter over time. The new model will only get better at predicting correct materials as it is fed more high-quality data collected under controlled conditions.
The new approach has the potential to find catalysts critical to fueling processes in the clean energy, automotive and chemical industries. Identifying new green catalysts will enable the conversion of waste products and plentiful feedstocks into useful matter, hydrogen generation, carbon dioxide utilization and the development of fuel cells. Such catalysts could also replace expensive and rare materials like iridium, the metal used to generate green hydrogen and CO2-reduction products.
Carolin B. Wahl, Jordan H. Swisher, Joseph H. Montoya, Santosh K. Suram and Chad A. Mirkin. Machine learning–accelerated design and synthesis of polyelemental heterostructures. Science Advances, Vol. 7, Issue 52, 22 Dec 2021.