In recent years, the synthesis of monomer sequence-defined polymers has expanded into a broad spectrum of applications in the biomedical, chemical, and materials science fields. Characterizing and inversely designing these polymer systems requires a fundamental understanding not only at the level of individual monomers but also at the chain scale, including polymer configuration, self-assembly, and phase separation. However, access to this field remains limited: traditional design approaches, the complexity of chemical space, and the associated cost and time have prevented the underlying monomer sequence-structure-property relationships from being uncovered. Recent advances in molecular dynamics simulations and machine learning (ML) algorithms offer a way to overcome these bottlenecks in establishing structure-function correlations for polymer chains. In this review, we discuss how ML techniques integrated with coarse-grained molecular dynamics (CGMD) simulations are applied to current problems in polymer science at the chain level. In particular, we focus on case studies in three important topics (polymer configuration characterization, feed-forward property prediction, and inverse design) in which CGMD simulations generate the training datasets used to develop ML-based surrogate models for specific polymer systems and designs. This hybrid computational approach helps establish the relationship between monomer sequence and functional behavior, and guides the search for the best polymer chain candidates for inverse design in unexplored chemical space at reasonable computational cost and time.
Although limitations and challenges remain in this field, we conclude that this CGMD/ML integration is very promising, not only for bridging the monomeric and macroscopic characterization of polymer materials, but also for enabling tailored designs of sequence-specific polymers with superior properties for many practical applications.
Data-driven methods are receiving increasing attention as a way to accelerate materials design and discovery for organic light-emitting diodes (OLEDs). Machine learning (ML) has enabled high-throughput screening of materials properties to suggest new candidates for organic electronics. However, building reliable predictive ML models requires creating and managing a large volume of data that adequately addresses the complexity of the materials’ chemical space. In this regard, active learning (AL) has emerged as a powerful strategy for navigating the search space efficiently by prioritizing which unexplored data points to evaluate next. This approach provides a more systematic way to identify promising candidates while minimizing the number of computations required to explore an extensive materials library with diverse variables and parameters. In this paper, we apply an AL workflow that accounts for multiple optoelectronic parameters to identify candidate materials for hole-transport layers (HTLs) in OLEDs. The results pave the way for efficient screening of organic electronic materials with superior efficiencies before laborious simulations, synthesis, and device fabrication.
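The select-evaluate-update loop at the core of active learning can be illustrated with a minimal sketch. This is not the workflow used in the paper: the function `expensive_property`, the toy k-nearest-neighbor surrogate, the distance-based uncertainty proxy, and the parameter `kappa` are all hypothetical stand-ins chosen for illustration. A single scalar score is also assumed here, whereas the paper considers multiple optoelectronic parameters.

```python
import math
import random


def expensive_property(x):
    """Hypothetical stand-in for a costly simulation; higher is better."""
    return math.exp(-((x[0] - 0.7) ** 2 + (x[1] - 0.3) ** 2) / 0.05)


def surrogate(x, labeled, k=3):
    """Toy k-nearest-neighbor surrogate (an assumption, not the paper's model).

    Returns the mean label of the k nearest labeled points and the distance
    to the nearest labeled point, used as a crude uncertainty proxy.
    """
    nearest = sorted(labeled, key=lambda item: math.dist(x, item[0]))[:k]
    mean = sum(y for _, y in nearest) / len(nearest)
    return mean, math.dist(x, nearest[0][0])


def active_learning(pool, oracle, n_init=5, n_rounds=20, kappa=1.0, seed=0):
    """Label n_init random candidates, then up to n_rounds chosen by acquisition."""
    rng = random.Random(seed)
    unlabeled = list(pool)
    rng.shuffle(unlabeled)
    labeled = [(x, oracle(x)) for x in unlabeled[:n_init]]
    unlabeled = unlabeled[n_init:]

    def acquisition(x):
        # Favor candidates predicted to score well or lying far from known data.
        mean, uncertainty = surrogate(x, labeled)
        return mean + kappa * uncertainty

    for _ in range(n_rounds):
        if not unlabeled:
            break
        best = max(unlabeled, key=acquisition)
        unlabeled.remove(best)
        labeled.append((best, oracle(best)))  # the only expensive call per round
    return labeled


# Usage: a 21 x 21 grid of hypothetical two-descriptor candidates.
pool = [(i / 20, j / 20) for i in range(21) for j in range(21)]
labeled = active_learning(pool, expensive_property)
best_x, best_y = max(labeled, key=lambda item: item[1])
print(f"evaluated {len(labeled)} of {len(pool)} candidates; best score {best_y:.3f} at {best_x}")
```

The point of the sketch is the budget: the expensive evaluation is called once per selected candidate, so only a small fraction of the pool is ever computed. In practice the surrogate and the acquisition rule would be replaced by a model with calibrated uncertainty and by a criterion that combines several target properties.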
In recent years, generative machine learning approaches have attracted significant attention as a way to design novel molecular materials with minimal design bias, enabling more directed design within a specific materials property space. In parallel, data-driven approaches have emerged as a new tool to accelerate the development of novel organic electronic materials for organic light-emitting diode (OLED) applications. We demonstrate and validate a goal-directed generative machine learning framework, based on a recurrent neural network (RNN) combined with deep reinforcement learning, for the design of hole-transporting OLED materials. The large-scale molecular simulations in this work also demonstrate a rapid, cost-effective method for identifying new OLED materials, and the approach can be extended to many other verticals such as catalyst design, aerospace, life science, and petrochemicals.