## Applications in Data and Materials Science (SPRING ME 582/CS 590)

### Course Overview

In this special topics course, AI principles are applied to a series of materials science example problems, each taught in a module by an expert in materials science and/or data science. Each module spans 2-3 weeks and demonstrates an array of data science/AI methods through distinct materials case studies that advance discovery or design principles. Example modules include: boosted decision trees for discovering structural patterns controlling bandgaps in metamaterials, machine learning to predict dynamic heterogeneities in materials, and AI-assisted design of high entropy materials for catalysis. Each module has a homework assignment that applies AI methods to the module topic. There is no final exam.

### Pre-requisites

Prior materials science course and prior AI/ML course; instructor permission.

### Course Topics

**Module 1: Boosted Decision Trees for the Discovery of Structural Patterns Controlling Bandgaps in Architected Metamaterials** (Cate Brinson, Cynthia Rudin)

In this module, we will explore a metamaterial case study using boosted decision trees to discover structural features responsible for band gaps in patterned metamaterials. Students will first learn the basic physics of mechanical metamaterials, including concepts such as negative Poisson's ratio and elastic wave band gaps. Students will then run an existing bandgap prediction code on all possible unit cell configurations of a given size to determine the bandgap frequencies and widths. The resulting structures, on the order of $10^{5}$, will then be analyzed using decision tree and random forest methods, enabling classification of the fundamental structure types that lead to a given bandgap range. Students will also explore standard forms of neural networks for computer vision and examine why they may not be effective in this kind of case. Students will also learn about state-of-the-art methods for dimension reduction, which can reveal patterns in high-dimensional data that are difficult to see in other ways, and they will learn to critically evaluate the results of such methods. A homework assignment will apply neural network models with and without features and demonstrate the change in predictability.
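The enumerate-then-split idea described above can be sketched on a toy problem. The example below is only an illustration under stated assumptions: the 3x3 binary unit cell, the labeling rule `toy_has_bandgap`, and the helper `best_stump` are hypothetical stand-ins, not the course's bandgap prediction code. It enumerates every configuration and finds the single pixel whose value best separates the labels, which is the basic split step that decision trees and boosted trees repeat.

```python
from itertools import product

N = 3  # toy unit cell: N x N binary pixels (1 = stiff material, 0 = soft)

def toy_has_bandgap(cell):
    """Hypothetical placeholder label, NOT a physical model:
    'gap' when the center pixel is stiff and at least two corners are soft."""
    center = cell[4]
    corners = [cell[0], cell[2], cell[6], cell[8]]
    return int(center == 1 and corners.count(0) >= 2)

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_stump(X, y):
    """Return (pixel index, weighted Gini) of the best single-pixel split."""
    best = None
    for j in range(len(X[0])):
        left = [yi for xi, yi in zip(X, y) if xi[j] == 0]
        right = [yi for xi, yi in zip(X, y) if xi[j] == 1]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if best is None or score < best[1]:
            best = (j, score)
    return best

# Enumerate all 2**9 = 512 configurations and label each one.
X = list(product([0, 1], repeat=N * N))
y = [toy_has_bandgap(cell) for cell in X]

feature, impurity = best_stump(X, y)
print(f"{len(X)} cells, most informative pixel: {feature}, weighted Gini: {impurity:.3f}")
```

Because the toy rule depends most strongly on the center pixel, the stump selects pixel index 4. A full tree would keep splitting each branch, and a boosted ensemble would fit further trees to the remaining errors.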

##### Learning Objectives

1. Students will be able to describe the fundamentals of mechanical metamaterials and the physical meaning of a mechanical band gap.

2. Students will be able to apply black box and interpretable machine learning models to identify and predict structural patterns related to band gaps.

3. Students will be able to visualize data using state-of-the-art dimension reduction techniques.

**Module 2: Materials Optimization** (Richard Sheridan, Cate Brinson)

In this module, students will practice optimizing materials. These materials will be represented in a geometric basis, which allows a flexible set of materials to be derived from combinations of basis elements. Students will then use a simulation model to determine the performance of these materials. Leveraging these tools, students will subsequently optimize the material’s performance using Bayesian optimization on an ML model of material performance. Bayesian optimization allows uncertainty estimates to guide the search: if a material is predicted to perform well but its predicted performance is highly uncertain, the Bayesian optimization routine may suggest evaluating it with the simulation model to determine its performance with more certainty, since this formulation could be close to the optimum. Over successive iterations, the routine approaches the optimal material. Running the simulation repeatedly is expensive, so students will come to understand the value of a careful exploration-exploitation tradeoff.
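The loop described above can be sketched in a few lines. This is a minimal illustration, not the course's implementation: the objective `simulate` is a hypothetical one-dimensional stand-in for the expensive simulation, the Gaussian-process surrogate uses a fixed RBF kernel with an assumed length scale, and the upper-confidence-bound rule (mean plus two standard deviations) is one common acquisition choice among several.

```python
import numpy as np

def simulate(x):
    """Hypothetical stand-in for the expensive simulation (not a real material model)."""
    return np.sin(3.0 * x) + 0.5 * x  # performance to maximize on [0, 2]

def rbf(a, b, length=0.3):
    """Squared-exponential kernel with unit prior variance (assumed length scale)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-6):
    """Posterior mean and standard deviation of a GP with constant mean."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_train, x_test)
    mean_y = y_train.mean()
    mu = mean_y + Ks.T @ np.linalg.solve(K, y_train - mean_y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.sqrt(np.clip(var, 0.0, None))

grid = np.linspace(0.0, 2.0, 201)   # candidate material parameters
x_obs = np.array([0.1, 1.9])        # two initial simulations
y_obs = simulate(x_obs)

for step in range(8):
    mu, sigma = gp_posterior(x_obs, y_obs, grid)
    ucb = mu + 2.0 * sigma          # high mean (exploit) or high uncertainty (explore)
    x_next = grid[np.argmax(ucb)]
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, simulate(x_next))

best_x = x_obs[np.argmax(y_obs)]
print(f"best parameter after {len(x_obs)} simulations: {best_x:.2f}, value {y_obs.max():.3f}")
```

Raising the multiplier on `sigma` makes the routine more explorative; lowering it makes it more exploitative, which is the tradeoff students are asked to reason about.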

##### Learning Objectives

1. Students will grasp the fundamentals of Bayesian optimization and the exploration-exploitation tradeoff as indicated by their ability to:

- accurately describe differences and similarities between BO and other optimization/response surface methods,
- give examples of what “decisions” BO algorithms would make given example data,
- given an example optimization problem, make and explain their own choice between explorative and exploitative decision policies.

2. Students will apply Bayesian optimization to (approximately) find a globally optimal material parameter set.

**Module 3: Machine Learning for Predicting Dynamical Heterogeneities in Materials** (Gaurav Arya, Jianfeng Lu)

In this module, students will investigate glassy liquids through a combination of molecular dynamics (MD) simulations and ML classification methods to identify and develop structural predictors for dynamic heterogeneities in glassy systems. Students will learn the theory behind MD simulations and glassy materials, get hands-on training in performing MD simulations of glassy liquids, and learn how to calculate structural and dynamical properties from simulation data. Students will also learn several ML classification methods, including support vector machines, logistic regression, and deep learning, and apply these methods to identify local structural characteristics that lead to dynamical heterogeneities in glassy liquids. We will also discuss how to apply ML techniques like clustering and wavelets to analyze local defects in MD or experimental images of polycrystalline materials.
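As a rough sketch of the classification step, the example below fits a logistic regression by gradient descent to predict whether a particle is "mobile" from two local structural descriptors. Everything here is an assumption made for illustration: the descriptors, the synthetic data, and the rule generating the labels are invented toy quantities, not output from an MD simulation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400

# Synthetic toy data (assumption): two standardized per-particle descriptors,
# e.g. a local density and a local order parameter.
X = rng.standard_normal((n, 2))
score = -2.0 * X[:, 0] + 1.0 * X[:, 1] + 0.3 * rng.standard_normal(n)
y = (score > 0).astype(float)  # 1 = "mobile", 0 = "immobile"

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Logistic regression trained by plain gradient descent on the log-loss.
w = np.zeros(2)
b = 0.0
lr = 0.5
for _ in range(500):
    p = sigmoid(X @ w + b)
    w -= lr * (X.T @ (p - y)) / n
    b -= lr * np.mean(p - y)

accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == (y == 1))
print(f"weights: {w}, training accuracy: {accuracy:.2f}")
```

The signs of the fitted weights indicate which descriptors are associated with mobility, which is the kind of structure-dynamics relationship the module aims to extract. In practice, accuracy would be reported on held-out particles rather than on the training set.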

##### Learning Objectives

1. Students will learn the fundamentals of molecular modeling and simulations and ML-based classification methods.

2. Students will be able to carry out MD simulations and compute material properties from simulation data.

3. Students will be able to apply ML classification techniques to build structure-function relationships in materials.

**Module 4: Probabilistic Learning in Mechanics of Materials** (Johann Guilleminot, Jianfeng Lu)

In this module, we will explore the use of probabilistic learning in the mechanics of materials. We will begin with a basic introduction to diffusion and stochastic simulations. Students will then learn about manifold learning, which is an efficient nonlinear dimension reduction technique enabling the analysis and representation of data exhibiting some geometrical structure. Students will then see how to combine this approach with Hamiltonian MCMC, with the aim of jointly sampling the inputs and outputs of a given materials model on an intrinsic manifold. Both theoretical and computational aspects will be reviewed, with an emphasis on key methodological components. Various applications will be presented, including design optimization, multiscale analysis, and uncertainty quantification.
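For orientation, a minimal sketch of Hamiltonian MCMC is given below. It assumes a toy target (a standard normal in two dimensions) and samples in ordinary Euclidean space; it does not implement the manifold-based probabilistic learning framework covered in the module. The function names, step size, and number of leapfrog steps are illustrative choices.

```python
import numpy as np

def neg_log_p(x):
    """Potential energy U(x) for a standard normal target (toy assumption)."""
    return 0.5 * np.dot(x, x)

def grad_neg_log_p(x):
    """Gradient of U(x) for the same toy target."""
    return x

def hmc_step(x, rng, step_size=0.2, n_leapfrog=10):
    """One HMC transition: sample momentum, integrate with leapfrog, accept or reject."""
    p = rng.standard_normal(x.shape)
    x_new, p_new = x.copy(), p.copy()
    p_new -= 0.5 * step_size * grad_neg_log_p(x_new)      # initial half step in momentum
    for i in range(n_leapfrog):
        x_new += step_size * p_new                        # full step in position
        if i < n_leapfrog - 1:
            p_new -= step_size * grad_neg_log_p(x_new)    # full step in momentum
    p_new -= 0.5 * step_size * grad_neg_log_p(x_new)      # final half step in momentum
    h_old = neg_log_p(x) + 0.5 * np.dot(p, p)
    h_new = neg_log_p(x_new) + 0.5 * np.dot(p_new, p_new)
    if rng.random() < np.exp(min(0.0, h_old - h_new)):    # Metropolis correction
        return x_new, True
    return x, False

rng = np.random.default_rng(0)
x = np.zeros(2)
samples, accepted = [], 0
for _ in range(2200):
    x, ok = hmc_step(x, rng)
    samples.append(x.copy())
    accepted += ok

samples = np.array(samples[200:])  # discard burn-in
print("mean:", samples.mean(axis=0), "variance:", samples.var(axis=0))
print("acceptance rate:", accepted / 2200)
```

The gradient of the log-density steers proposals along trajectories of nearly constant energy, so distant moves are accepted with high probability. That property is what makes the approach attractive for sampling on the intrinsic manifolds discussed in the module.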

##### Learning Objectives

1. Students will be able to describe fundamental concepts and methodological components in Hamiltonian-based sampling on manifolds.

2. Students will demonstrate the ability to apply the probabilistic learning framework to augment datasets.

**Module 5: Design of Experiments and Response Surface Methodology** (David Banks)

This module will introduce experimental design, including interactions and fractional factorial experiments. It will also cover the basics of response surface methodology, Nelder-Mead exploration, Plackett-Burman designs, and mixture designs.
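A fractional factorial design can be illustrated with a small sketch. The example below builds a two-level half-fraction for four factors using the generator D = ABC, then estimates main effects from a made-up response. The helper names and the response function `toy_response` are hypothetical and serve only to show the mechanics.

```python
from itertools import product

def half_fraction(n_base=3):
    """Two-level half-fraction: full factorial in n_base factors,
    with one extra factor set to the product of the others (D = ABC for n_base=3)."""
    runs = []
    for levels in product([-1, 1], repeat=n_base):
        generated = 1
        for v in levels:
            generated *= v
        runs.append(levels + (generated,))
    return runs

def main_effect(design, response, col):
    """Mean response at the high level minus mean response at the low level."""
    high = [r for run, r in zip(design, response) if run[col] == 1]
    low = [r for run, r in zip(design, response) if run[col] == -1]
    return sum(high) / len(high) - sum(low) / len(low)

def toy_response(run):
    """Hypothetical noise-free response: only factors A and B matter."""
    a, b, _, _ = run
    return 10 + 2 * a - 3 * b

design = half_fraction(3)                     # 8 runs instead of the full 16
response = [toy_response(run) for run in design]
effects = [main_effect(design, response, c) for c in range(4)]

for name, effect in zip("ABCD", effects):
    print(f"main effect of {name}: {effect:+.1f}")
```

The design halves the number of runs at the cost of aliasing: because D is defined as ABC, the main effect of D cannot be distinguished from the three-factor interaction ABC.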

##### Learning Objectives

1. Students will be able to design and analyze moderately sophisticated experiments with multiple factors and multiple factor levels.

2. Students will be able to find a maximum response in a multivariate surface using a small number of noisy measurements.

3. Students will recognize when a mixture design is appropriate.

**Module 6: Artificial Intelligence Assisted Design and Synthesis of High Entropy Materials for Electrochemical Catalysis** (Jie Liu, Stefano Curtarolo)

The key objective of the project is to demonstrate the use of high entropy materials (HEMs) as catalysts with optimized properties beyond what can be achieved with traditional catalysts. Owing to their unique nature, high entropy materials offer surface atomic structures that could not previously be obtained. HEMs are stable single-phase materials with more than 5 different metal elements that are stabilized by their high entropy. These materials represent a new direction in materials research and may solve some long-standing problems in applications like catalysis, owing to the high-temperature stability of a surface structure that contains multiple metal sites. By combining rapid synthesis and AI-based materials design, we can explore the use of HEMs as catalysts in electrochemistry. Our hypothesis is that the precise design of a new class of HEMs with optimized surface binding sites for specific chemical reactions can significantly enhance catalytic efficiency. The use of HEMs in catalysis can impact a broad class of chemical reactions.
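As a rough illustration of the entropy stabilization mentioned above, and assuming ideal mixing (a simplifying assumption not stated in the module description), the molar configurational entropy of a mixture with mole fractions $x_i$ is

$$
\Delta S_{\text{mix}} = -R \sum_{i=1}^{n} x_i \ln x_i .
$$

For an equimolar mixture, $x_i = 1/n$, so $\Delta S_{\text{mix}} = R \ln n$. With $n = 5$ elements this gives $R \ln 5 \approx 1.61\,R \approx 13.4\ \text{J mol}^{-1}\,\text{K}^{-1}$. Since the entropy enters the free energy as $-T\,\Delta S_{\text{mix}}$, its stabilizing contribution grows with temperature.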

##### Learning Objectives

1. Students will understand and be able to discuss the concept of catalysis.

2. Students will be able to describe the relationship between computational results and experimental performance of materials.