Generative artificial intelligence in chemical engineering spans multiple scales

Decardi-Nelson, Benjamin; Alshehri, Abdulelah S.; You, Fengqi

doi:10.3389/fceng.2024.1458156

PERSPECTIVE article

Front. Chem. Eng., 29 August 2024

Sec. Computational Methods in Chemical Engineering

Volume 6 - 2024 | https://doi.org/10.3389/fceng.2024.1458156

This article is part of the Research Topic: Editors’ Showcase: Computational Methods in Chemical Engineering

Benjamin Decardi-Nelson^1,2

Abdulelah S. Alshehri^3,4

Fengqi You^1,2,3*

¹Systems Engineering, Cornell University, Ithaca, NY, United States
²Cornell University AI for Science Institute, Cornell University, Ithaca, NY, United States
³Robert Frederick Smith School of Chemical and Biomolecular Engineering, Cornell University, Ithaca, NY, United States
⁴Department of Chemical Engineering, College of Engineering, King Saud University, Riyadh, Saudi Arabia

Recent advances in generative artificial intelligence (GenAI), particularly large language models (LLMs), are profoundly impacting many fields. In chemical engineering, GenAI plays a pivotal role in the design, scale-up, and optimization of chemical and biochemical processes. The natural language understanding capabilities of LLMs enable the interpretation of complex chemical and biological data. Given the rapid developments of GenAI, this paper explores the extensive applications of GenAI in multiscale chemical engineering, spanning from quantum mechanics to macro-level optimization. At quantum and molecular levels, GenAI accelerates the discovery of novel products and enhances the understanding of fundamental phenomena. At larger scales, GenAI improves process design and operational efficiency, contributing to sustainable practices. We present several examples to demonstrate the role of GenAI, including its impact on nanomaterial hardness enhancement, novel catalyst generation, protein design, and the development of autonomous experimental platforms. This multiscale integration demonstrates the potential of GenAI to address complex challenges, drive innovation, and foster advancements in chemical engineering.

Introduction

Generative artificial intelligence (GenAI) has enabled several recent developments in various fields (Decardi-Nelson et al., 2024; Gangwal and Lavecchia, 2024; Preuss et al., 2024; Subramanian et al., 2024). A notable example is the few-shot learning capability of GenAI tools like ChatGPT, which can understand and interpret natural language (Wu et al., 2023). GenAI refers to artificial intelligence (AI) models that generate new data that resembles a given set of input data. Recently, large GenAI models with extensive parameters have gained significant attention for their ability to perform a wide range of tasks including natural language processing (NLP), image generation, and complex decision-making. These models include large language models (LLMs) (Zhao et al., 2023), large vision-language models (LVLMs) (Zhang et al., 2024), and large decision models (LDMs) (Zhang, 2023) (see Figure 1). Typically, these GenAI models are built using deep learning models, such as generative adversarial networks (GANs) (Goodfellow et al., 2020), autoencoders (Kingma and Welling, 2013), autoregressive (Vaswani et al., 2017), diffusion (Ho et al., 2020), and flow-based models (Chen et al., 2019). For instance, ChatGPT is an LLM powered by a Transformer model (Vaswani et al., 2017), which is an autoregressive model. The recent success of GenAI across multiple disciplines highlights the need to explore its potential in chemical engineering.
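To make the term "autoregressive" concrete, the following is a minimal sketch of token-by-token generation. It is not a Transformer: it assumes a toy bigram model in which each token depends only on the previous one, whereas a Transformer conditions on the whole preceding sequence. The corpus of SMILES-like strings and the names `fit_bigram` and `sample` are illustrative assumptions, not taken from any cited work.

```python
import random
from collections import defaultdict

def fit_bigram(sequences):
    """Count next-token frequencies; '^' and '$' mark sequence start and end."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        tokens = ["^"] + list(seq) + ["$"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def sample(counts, rng, max_len=20):
    """Generate one sequence token by token, each drawn given the previous token."""
    out, prev = [], "^"
    for _ in range(max_len):
        choices, weights = zip(*counts[prev].items())
        nxt = rng.choices(choices, weights=weights, k=1)[0]
        if nxt == "$":  # end-of-sequence token stops generation
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)

# Toy training strings (illustrative SMILES-like text, not a real dataset)
corpus = ["CCO", "CCN", "CCC", "CCCO"]
model = fit_bigram(corpus)
rng = random.Random(0)
samples = [sample(model, rng) for _ in range(5)]
print(samples)
```

Large language models follow the same generate-one-token-then-condition-on-it loop, but replace the count table with a learned neural network over a much longer context.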

Figure 1. GenAI in chemical engineering spans multiple scales. GenAI is reshaping chemical engineering by impacting multiple levels of design and operation, including quantum, molecular, process unit, plant, and enterprise-wide scales. At the quantum and molecular levels, GenAI enhances our understanding of fundamental chemical and biological phenomena and accelerates the discovery of novel products. At the process, plant, and enterprise scales, GenAI improves overall design and operational efficiency. These advancements collectively contribute to more efficient and sustainable chemical engineering practices. Large GenAI models such as LLMs, LVLMs, and LDMs, as well as their multimodal counterparts, are behind the recent successes of GenAI. Notable implementations of large GenAI models are provided.

In modern chemical engineering, which involves the design, scale-up, and optimization of chemical and biological processes, the impact of GenAI across multiple scales is equally significant. In this context, text-based representations of chemical and biological processes can be considered codified unstructured languages that describe domain knowledge, which parallels general NLP tasks. As discussed earlier, GenAI goes beyond NLP to encompass mechanisms that generate data in an adversarial manner (such as GANs) or that mimic diffusion and flows, each uniquely equipped to capture the underlying data patterns and generate novel instances. While GenAI in process design has been previously discussed (Schweidtmann, 2024), here we emphasize that the applications of GenAI in chemical engineering extend significantly beyond such confines and are poised to address a spectrum of multiscale chemical engineering problems, from quantum mechanics to macro-level optimization (see Figure 1) (Decardi-Nelson et al., 2024).

Generative AI in multiscale chemical engineering

In molecular and materials design, the integration of GenAI techniques is inspiring a multiscale design approach, from atomic-scale interactions to macroscopic phenomena (Alshehri and You, 2021). A notable example is the application of GenAI to tooth enamel design, which has demonstrated its effectiveness in enhancing nanomaterial hardness through non-destructive methods, facilitating bioinspired engineering solutions using a generative adversarial model (Goodfellow et al., 2020) with deep image regression (Lew et al., 2023). Another example, in catalysis, is the application of a generative variational autoencoder (Kingma and Welling, 2013), inspired by interatomic insights from density functional theory (DFT) data, to generate novel catalysts with optimized binding energies via latent space representation and deep learning-based regression (Schilter et al., 2023). These innovations span multiple areas of materials design, including drug discovery (Decardi-Nelson et al., 2024) and functional biomaterials (Gartner et al., 2024), among others (Alshehri and You, 2022). This multiscale integration, which uses GenAI to combine imaging techniques, quantum chemistry calculations, molecular dynamics simulations, and empirical data, offers improvements in molecular and materials design, bridging the gap to broader chemical product and process scales (Gartner et al., 2024).
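The idea of pairing a latent space with a property regressor can be sketched as a simple screening loop. The example below is a toy under stated assumptions, not the method of Schilter et al. (2023): the decoder and regressor are random linear maps standing in for trained networks, and the names `decode`, `predict_property`, and `screen`, the dimensions, and the target value are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, feature_dim = 4, 8

# Stand-ins for trained networks (random linear maps, purely illustrative)
W_dec = rng.normal(size=(latent_dim, feature_dim))  # hypothetical decoder weights
w_reg = rng.normal(size=latent_dim)                 # hypothetical regressor weights

def decode(z):
    """Map latent vectors to candidate feature vectors."""
    return z @ W_dec

def predict_property(z):
    """Predict a scalar property (e.g., a binding energy surrogate) from latents."""
    return z @ w_reg

def screen(n_samples, target, tol):
    """Sample latents from the N(0, I) prior and keep those predicted near the target."""
    z = rng.normal(size=(n_samples, latent_dim))
    pred = predict_property(z)
    keep = np.abs(pred - target) <= tol
    return decode(z[keep]), pred[keep]

candidates, predicted = screen(n_samples=1000, target=-0.5, tol=0.05)
print(candidates.shape, predicted.shape)
```

In practice the random sampling step is often replaced by gradient-based or Bayesian optimization in the latent space, and decoded candidates still require validation, for example by DFT.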

Another area of chemical engineering that GenAI is significantly impacting is protein design. The Chroma generative model (Ingraham et al., 2023) samples novel protein structures and steers the design process towards desired functionalities. Incorporating a diffusion-based framework (Ho et al., 2020), Chroma captures the complex statistical distributions of natural proteins, transforming them into simpler distributions through a series of infinitesimal, constraint-biased steps, enabling the design of novel protein structures that meet specific functional requirements (Ingraham et al., 2023). These developments extend across the biomolecular domain to novel enzymes and nucleic acids, promising advances for bioengineering and therapeutic innovations in medicine (Langer and Peppas, 2024), and more broadly in biomanufacturing.
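The way a diffusion model turns a complex distribution into a simpler one can be illustrated with the forward noising process of a denoising diffusion probabilistic model (Ho et al., 2020), in which a noisy sample at step t is drawn in closed form as x_t = sqrt(ᾱ_t) x_0 + sqrt(1 − ᾱ_t) ε. This sketch is a generic discrete-time version and is not Chroma's implementation; the linear noise schedule, the toy data, and the names `make_schedule` and `q_sample` are assumptions for illustration.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule; alpha_bar[t] is the cumulative signal retention."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bar = np.cumprod(1.0 - betas)
    return betas, alpha_bar

def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t given x_0 in one shot using the closed-form forward process."""
    eps = rng.normal(size=x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

rng = np.random.default_rng(0)
betas, alpha_bar = make_schedule()

# Toy "structured" data: points clustered around 4.0 (not protein coordinates)
x0 = rng.uniform(-1.0, 1.0, size=(5000, 3)) + 4.0

x_early, _ = q_sample(x0, 10, alpha_bar, rng)    # still close to the data
x_late, _ = q_sample(x0, 999, alpha_bar, rng)    # close to standard normal noise
print(x_early.mean(), x_late.mean(), x_late.std())
```

A generative model is then trained to reverse these steps, so that sampling starts from simple noise and ends at a structured sample; conditioning or constraints are applied during that reverse process.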

On a macroscale, the integration of GenAI into robotic experimentation platforms, exemplified by GPT-Lab, can potentially transform the planning and execution of chemical experiments (Qin et al., 2023). Structured as an Analysis-Retrieval-Mining-Feedback-Execution workflow, GPT-Lab employs GPT-4 (OpenAI, 2024) as the generative model to analyze and synthesize experimental parameters, integrating these with robotic platforms for the autonomous execution of chemical syntheses (Qin et al., 2023). By mining the literature for experimental parameters and validating outcomes through high-throughput synthesis, GenAI has brought us closer to achieving full-process autonomy in self-driving laboratories.
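The staged structure of such a workflow can be sketched as a simple pipeline in which each stage reads and extends a shared state. This is a hypothetical skeleton, not GPT-Lab's code: every function body is a placeholder, no language model or robot is called, and the names `ExperimentState` and `run_workflow` are assumptions introduced for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class ExperimentState:
    request: str                                   # natural-language goal from the user
    parameters: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

def analysis(state):
    # Placeholder: an LLM would turn the request into search keywords here.
    state.parameters["keywords"] = state.request.lower().split()
    return state

def retrieval(state):
    # Placeholder: fetch candidate papers matching the keywords.
    state.parameters["papers"] = [f"paper about {k}" for k in state.parameters["keywords"]]
    return state

def mining(state):
    # Placeholder: extract experimental parameters from the retrieved text.
    state.parameters["conditions"] = {"n_sources": len(state.parameters["papers"])}
    return state

def feedback(state):
    # Placeholder: a human or model reviews the mined conditions before execution.
    state.parameters["approved"] = bool(state.parameters["conditions"])
    return state

def execution(state):
    # Placeholder: hand approved conditions to a robotic platform.
    state.parameters["submitted"] = state.parameters["approved"]
    return state

STAGES = [analysis, retrieval, mining, feedback, execution]

def run_workflow(request):
    """Run the five stages in order, recording each completed stage in the log."""
    state = ExperimentState(request=request)
    for stage in STAGES:
        state = stage(state)
        state.log.append(stage.__name__)
    return state

result = run_workflow("humidity sensing colorimetric material")
print(result.log)
```

Keeping the stages as separate functions over one state object makes it straightforward to insert a review step or to swap the placeholder logic for a model call without changing the overall loop.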

Beyond GenAI’s roles in design and optimization, interpretable GenAI enhances our scientific understanding of phenomena within complex fluids and interfacial science, such as the nature of disorder in domain boundaries (Dan et al., 2023). However, this important aspect of chemical engineering was not discussed in the earlier perspective on GenAI in process design (Schweidtmann, 2024). Utilizing a diffusion-based approach (Ho et al., 2020), the hybrid generative model synthesizes domain boundary structures by employing a limited Markovian dataset to algorithmically predict and scale structural motifs from atomistic to mesoscopic levels, thus uncovering critical, previously unobserved configurations that enhance our understanding and design of functional materials (Dan et al., 2023).

Table 1 illustrates the diverse applications of GenAI across prominent chemical engineering disciplines. Although limited in number, these examples underscore the expansive potential and broad applicability of various generative techniques within the branched and complex landscape of chemical engineering.

Table 1. Example applications of GenAI techniques across chemical engineering disciplines.

Challenges and opportunities

Despite the promising potential of GenAI in chemical engineering across multiple scales, its use and implementation come with significant challenges and limitations. Successfully addressing these challenges will require international collaboration among all stakeholders, including researchers from relevant disciplines, industrial practitioners, and regulatory authorities.

One of the foremost issues is the quality and availability of data. GenAI models, such as LLMs, need vast amounts of high-quality, domain-specific data to train effectively (Whang et al., 2023). In chemical engineering, such data is often proprietary, sparse, or inconsistent (Chiang et al., 2017), complicating the development of robust GenAI models. This challenge presents an opportunity for the entire community to collaborate in establishing standard data representations and open data-sharing platforms, thus facilitating the development and application of chemical engineering-specific GenAI models.

Another major limitation is the interpretability of GenAI models. Many large GenAI models hallucinate (Rawte et al., 2023) and provide little to no insight into how they arrive at specific solutions (Ross et al., 2021). This lack of transparency can be a significant barrier to adoption in the safety-critical applications often encountered in chemical engineering. Therefore, there is a need to develop benchmarks and metrics tailored to the needs of chemical engineering, requiring input from regulators, researchers, and industry. Additionally, integrating well-established first-principles modeling in chemical engineering with GenAI can enhance the interpretability and trustworthiness of these models (Takeishi and Kalousis, 2021).
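One simple way first principles can support trust in generated outputs is a post hoc consistency check. The sketch below screens hypothetical generated stream data against a steady-state mass balance; it is a much simpler stand-in for the physics-integrated modeling of Takeishi and Kalousis (2021), and the function names, tolerance, and flow values are assumptions for illustration.

```python
import numpy as np

def mass_balance_residual(feed, outlets):
    """Relative steady-state mass balance error: |in - sum(out)| / in."""
    return np.abs(feed - outlets.sum(axis=1)) / feed

def filter_consistent(feed, outlets, tol=0.01):
    """Flag generated samples whose mass balance closes within the tolerance."""
    residual = mass_balance_residual(feed, outlets)
    return residual <= tol, residual

# Hypothetical generated samples: one feed and two outlet flow rates per row (kg/h)
feed = np.array([100.0, 100.0, 50.0])
outlets = np.array([[60.0, 40.0],
                    [70.0, 45.0],   # violates the balance by 15%
                    [20.0, 29.8]])

mask, residual = filter_consistent(feed, outlets)
print(mask, residual)
```

The same residual could instead be added as a penalty term during training, so that the model is discouraged from producing physically inconsistent samples in the first place.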

Lastly, the ethical and regulatory implications of deploying GenAI in chemical engineering cannot be overlooked. Issues such as data privacy, security, and ethical considerations surrounding the autonomous nature of GenAI systems need to be carefully addressed (Huang et al., 2024). Regulatory bodies, researchers, and industrial practitioners must collaborate to establish guidelines on data use, security, and ethical issues.

Outlook

These multiscale successes demonstrate the potential of GenAI in chemical engineering. This potential extends beyond individual examples, offering novel solutions to the complex, multiscale challenges at the forefront of research in the field (Torrente-Murciano et al., 2024). From molecular engineering to enterprise-wide supply chains (Grossmann, 2005), GenAI can enable the design and optimization of chemical and biological processes with high precision and efficiency. Particularly promising areas include foundation models that can be adapted to diverse chemical engineering tasks, multimodal systems that integrate heterogeneous data types (e.g., textual, visual, and experimental data), and language models that enhance data retrieval and knowledge extraction in chemical systems. Additionally, GenAI can facilitate advanced task learning and the development of autonomous experimental robotic systems, thereby accelerating the cycle of hypothesis generation, testing, and validation in chemical research and development. The integration of GenAI across the multiple scales and facets of chemical engineering holds the promise of significantly advancing the field, driving innovation, and fostering sustainable industrial practices.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Author contributions

BD-N: Formal Analysis, Investigation, Methodology, Validation, Visualization, Writing–original draft, Writing–review and editing. AA: Investigation, Writing–review and editing. FY: Conceptualization, Formal Analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Writing–review and editing.

Funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. BD-N. acknowledges the partial support from Schmidt Futures via an Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship to Cornell University.

Acknowledgments

BD-N. acknowledges the partial support from Schmidt Futures via an Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship to Cornell University.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The author(s) declared that they were an editorial board member of Frontiers, at the time of submission. This had no impact on the peer review process and the final decision.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

OpenAI, Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F. L., et al. (2023). GPT-4 technical report. arXiv [cs.CL]. Retrieved from: http://arxiv.org/abs/2303.08774.

Alshehri, A. S., and You, F. (2021). Paradigm shift: the promise of deep learning in molecular systems engineering and design. Front. Chem. Eng. 3, 700717. doi:10.3389/fceng.2021.700717