PERSPECTIVE article

Front. Chem. Eng., 29 August 2024
Sec. Computational Methods in Chemical Engineering
This article is part of the Research Topic Editors’ Showcase: Computational Methods in Chemical Engineering

Generative artificial intelligence in chemical engineering spans multiple scales

  • 1Systems Engineering, Cornell University, Ithaca, NY, United States
  • 2Cornell University AI for Science Institute, Cornell University, Ithaca, NY, United States
  • 3Robert Frederick Smith School of Chemical and Biomolecular Engineering, Cornell University, Ithaca, NY, United States
  • 4Department of Chemical Engineering, College of Engineering, King Saud University, Riyadh, Saudi Arabia

Recent advances in generative artificial intelligence (GenAI), particularly large language models (LLMs), are profoundly impacting many fields. In chemical engineering, GenAI plays a pivotal role in the design, scale-up, and optimization of chemical and biochemical processes. The natural language understanding capabilities of LLMs enable the interpretation of complex chemical and biological data. Given the rapid development of GenAI, this paper explores its extensive applications in multiscale chemical engineering, spanning from quantum mechanics to macro-level optimization. At quantum and molecular levels, GenAI accelerates the discovery of novel products and enhances the understanding of fundamental phenomena. At larger scales, GenAI improves process design and operational efficiency, contributing to sustainable practices. We present several examples to demonstrate the role of GenAI, including its impact on nanomaterial hardness enhancement, novel catalyst generation, protein design, and the development of autonomous experimental platforms. This multiscale integration demonstrates the potential of GenAI to address complex challenges, drive innovation, and foster advancements in chemical engineering.

Introduction

Generative artificial intelligence (GenAI) has enabled several recent developments in various fields (Decardi-Nelson et al., 2024; Gangwal and Lavecchia, 2024; Preuss et al., 2024; Subramanian et al., 2024). A notable example is the few-shot learning capability of GenAI tools like ChatGPT, which can understand and interpret natural language (Wu et al., 2023). GenAI refers to artificial intelligence (AI) models that generate new data resembling a given set of input data. Recently, large GenAI models with very many parameters have gained significant attention for their ability to perform a wide range of tasks, including natural language processing (NLP), image generation, and complex decision-making. These models include large language models (LLMs) (Zhao et al., 2023), large vision-language models (LVLMs) (Zhang et al., 2024), and large decision models (LDMs) (Zhang, 2023) (see Figure 1). Typically, these GenAI models are built using deep learning architectures such as generative adversarial networks (GANs) (Goodfellow et al., 2020), variational autoencoders (Kingma and Welling, 2013), autoregressive models (Vaswani et al., 2017), diffusion models (Ho et al., 2020), and flow-based models (Chen et al., 2019). For instance, ChatGPT is an LLM built on the Transformer architecture (Vaswani et al., 2017), which it uses to generate text autoregressively. The recent success of GenAI across multiple disciplines highlights the need to explore its potential in chemical engineering.
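For concreteness, the autoregressive formulation mentioned above can be written as the standard chain-rule factorization of a sequence probability. The symbols below are notation introduced here for illustration and are not taken from the cited works: x_t is the token at position t, T is the sequence length, and θ denotes the model parameters.

```latex
p_\theta(x_1, \dots, x_T) = \prod_{t=1}^{T} p_\theta\left(x_t \mid x_1, \dots, x_{t-1}\right)
```

Generation then proceeds by sampling one token at a time from each conditional distribution in turn.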

Figure 1. GenAI in chemical engineering spans multiple scales. GenAI is reshaping chemical engineering by impacting multiple levels of design and operation, including quantum, molecular, process unit, plant, and enterprise-wide scales. At the quantum and molecular levels, GenAI enhances our understanding of fundamental chemical and biological phenomena and accelerates the discovery of novel products. At the process, plant, and enterprise scales, GenAI improves overall design and operational efficiency. These advancements collectively contribute to more efficient and sustainable chemical engineering practices. Large GenAI models such as LLMs, LVLMs, and LDMs, as well as their multimodal counterparts, are behind the recent successes of GenAI. Notable implementations of large GenAI models are shown.

In modern chemical engineering, which involves the design, scale-up, and optimization of chemical and biological processes, the impact of GenAI across multiple scales is equally significant. In this context, text-based representations of chemical and biological processes can be considered codified unstructured languages for describing domain knowledge, which parallels general NLP tasks. As discussed earlier, GenAI goes beyond NLP to encompass mechanisms that generate data in an adversarial manner (such as GANs) or that mimic diffusion and flows, each uniquely equipped to capture the underlying data patterns and generate novel instances. While GenAI in process design has been previously discussed (Schweidtmann, 2024), here we emphasize that the applications of GenAI in chemical engineering extend significantly beyond such confines and are poised to address a spectrum of multiscale chemical engineering problems, from quantum mechanics to macro-level optimization (see Figure 1) (Decardi-Nelson et al., 2024).

Generative AI in multiscale chemical engineering

In molecular and materials design, the integration of GenAI techniques is inspiring a multiscale design approach, from atomic-scale interactions to macroscopic phenomena (Alshehri and You, 2021). A notable example is the application of GenAI to tooth enamel design, where a generative adversarial model (Goodfellow et al., 2020) combined with deep image regression has demonstrated its effectiveness in enhancing nanomaterial hardness through non-destructive methods, facilitating bioinspired engineering solutions (Lew et al., 2023). Another example, in catalysis, is the application of a generative variational autoencoder (Kingma and Welling, 2013), informed by interatomic insights from density functional theory (DFT) data, to generate novel catalysts with optimized binding energies via latent space representation and deep learning-based regression (Schilter et al., 2023). These innovations span multiple areas of materials design, including drug discovery (Decardi-Nelson et al., 2024) and functional biomaterials (Gartner et al., 2024), among others (Alshehri and You, 2022). This multiscale integration, which uses GenAI to combine imaging techniques, quantum chemistry calculations, molecular dynamics simulations, and empirical data, offers improvements in molecular and materials design, bridging the gap to broader chemical product and process scales (Gartner et al., 2024).
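The idea of searching a latent space under the guidance of a property regressor can be sketched in a few lines. This is a toy illustration only, not the method of Schilter et al. (2023): the latent dimension, the linear surrogate `predict_energy`, its weights `w` and offset `b`, and the target value are all hypothetical stand-ins for a trained model. In a real workflow the optimized latent vector would be passed through a trained decoder to obtain a candidate structure, a step omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 8  # hypothetical latent dimension

# Hypothetical stand-in for a trained regressor on the latent space:
# predicted binding energy = w . z + b (w is normalized to unit length).
w = rng.standard_normal(LATENT_DIM)
w /= np.linalg.norm(w)
b = -1.0

def predict_energy(z):
    """Toy linear surrogate mapping a latent vector to a property value."""
    return float(w @ z + b)

def search_latent(z_init, target, lr=0.1, steps=200):
    """Gradient descent on (predict_energy(z) - target)^2 in latent space."""
    z = z_init.copy()
    for _ in range(steps):
        error = predict_energy(z) - target
        z -= lr * 2.0 * error * w  # analytic gradient for the linear surrogate
    return z

z0 = rng.standard_normal(LATENT_DIM)  # starting point drawn from the prior N(0, I)
z_opt = search_latent(z0, target=-2.5)
print(f"initial prediction:   {predict_energy(z0):+.4f}")
print(f"optimized prediction: {predict_energy(z_opt):+.4f}")
```

With a nonlinear regressor the analytic gradient would be replaced by automatic differentiation or a gradient-free search, but the loop structure stays the same.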

Another area of chemical engineering that GenAI is significantly impacting is protein design. The Chroma generative model (Ingraham et al., 2023) samples novel protein structures and steers the design process toward desired functionalities. Built on a diffusion-based framework (Ho et al., 2020), Chroma captures the complex statistical distributions of natural proteins, transforming them into simpler distributions through a series of infinitesimal, constraint-biased steps, which enables the design of novel protein structures that meet specific functional requirements (Ingraham et al., 2023). These developments extend across the biomolecular domain to novel enzymes and nucleic acids, promising advances in bioengineering and therapeutic innovations in medicine (Langer and Peppas, 2024), and more broadly in biomanufacturing.
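As a rough illustration of the diffusion idea (not the Chroma implementation), the forward noising process of a denoising diffusion probabilistic model (Ho et al., 2020) can be sketched on toy data. The linear noise schedule, the number of steps, and the bimodal one-dimensional "data" distribution below are assumptions chosen for illustration. A generative model would be trained to reverse these steps, which is not shown.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule; returns the cumulative products alpha_bar_t."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I)."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

rng = np.random.default_rng(0)

# Toy "complex" data distribution: a bimodal mixture in one dimension.
x0 = np.concatenate([rng.normal(-2.0, 0.3, 5000), rng.normal(2.0, 0.3, 5000)])

alpha_bar = make_schedule()
for t in (0, 250, 999):
    xt = forward_diffuse(x0, t, alpha_bar, rng)
    print(f"t={t:4d}  mean={xt.mean():+.3f}  std={xt.std():.3f}")
```

By the last step the samples are statistically close to a standard normal distribution, which is the "simpler distribution" that sampling starts from when the process is run in reverse.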

On a macroscale, the integration of GenAI into robotic experimentation platforms, exemplified by GPT-Lab, can potentially transform the planning and execution of chemical experiments (Qin et al., 2023). Organized as an Analysis-Retrieval-Mining-Feedback-Execution workflow, GPT-Lab employs GPT-4 (OpenAI et al., 2023) as the generative model to analyze and synthesize experimental parameters, integrating these with robotic platforms for the autonomous execution of chemical syntheses (Qin et al., 2023). By mining literature for experimental parameters and validating outcomes through high-throughput synthesis, GenAI has brought us closer to achieving full-process autonomy in self-driving laboratories.

Beyond GenAI’s roles in design and optimization, interpretable GenAI enhances our scientific understanding of phenomena in complex fluids and interfacial science, such as the nature of disorder in domain boundaries (Dan et al., 2023). However, this important aspect of chemical engineering was not covered in the earlier discussion by Schweidtmann (2024). Utilizing a diffusion-based approach (Ho et al., 2020), the hybrid generative model synthesizes domain boundary structures by employing a limited Markovian dataset to algorithmically predict and scale structural motifs from atomistic to mesoscopic levels, thus uncovering critical, previously unobserved configurations that enhance our understanding and design of functional materials (Dan et al., 2023).

Table 1 illustrates the diverse applications of GenAI across prominent chemical engineering disciplines. Although the examples are limited in number, they underscore the broad applicability of various generative techniques within the branched and complex landscape of chemical engineering.

Table 1. Example applications of GenAI techniques across chemical engineering disciplines.

Challenges and opportunities

Despite the promising potential of GenAI in chemical engineering across multiple scales, its use and implementation come with significant challenges and limitations. Successfully addressing these challenges will require international collaboration among all stakeholders, including researchers from relevant disciplines, industrial practitioners, and regulatory authorities.

One of the foremost issues is the quality and availability of data. GenAI models, such as LLMs, need vast amounts of high-quality, domain-specific data to train effectively (Whang et al., 2023). In chemical engineering, such data is often proprietary, sparse, or inconsistent (Chiang et al., 2017), complicating the development of robust GenAI models. This challenge presents an opportunity for the entire community to collaborate in establishing standard data representations and open data-sharing platforms, thus facilitating the development and application of chemical engineering-specific GenAI models.

Another major limitation is the interpretability of GenAI models. Many large GenAI models hallucinate (Rawte et al., 2023) and provide little to no insight into how they arrive at specific solutions (Ross et al., 2021). This lack of transparency can be a significant barrier to adoption in the safety-critical applications often encountered in chemical engineering. Therefore, there is a need to develop benchmarks and metrics tailored to the needs of chemical engineering, requiring input from regulators, researchers, and industry. Additionally, integrating well-established first-principles modeling in chemical engineering with GenAI can enhance the interpretability and trustworthiness of these models (Takeishi and Kalousis, 2021).

Lastly, the ethical and regulatory implications of deploying GenAI in chemical engineering cannot be overlooked. Issues such as data privacy, security, and ethical considerations surrounding the autonomous nature of GenAI systems need to be carefully addressed (Huang et al., 2024). Regulatory bodies, researchers, and industrial practitioners must collaborate to establish guidelines on data use, security, and ethical issues.

Outlook

These multiscale successes demonstrate the potential of GenAI in chemical engineering. This potential extends beyond individual examples, offering novel solutions to the complex, multiscale challenges at the forefront of research in the field (Torrente-Murciano et al., 2024). From molecular engineering to enterprise-wide supply chains (Grossmann, 2005), GenAI can enable the design and optimization of chemical and biological processes with high precision and efficiency. Particularly promising areas include foundation models that can be adapted to diverse chemical engineering tasks, multimodal systems that integrate heterogeneous data types (e.g., textual, visual, and experimental data), and language models that enhance data retrieval and knowledge extraction in chemical systems. Additionally, GenAI can facilitate advanced task learning and the development of autonomous experimental robotic systems, thereby accelerating the cycle of hypothesis generation, testing, and validation in chemical research and development. The integration of GenAI across the multiple scales and facets of chemical engineering holds the promise of significantly advancing the field, driving innovation, and fostering sustainable industrial practices.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Author contributions

BD-N: Formal Analysis, Investigation, Methodology, Validation, Visualization, Writing–original draft, Writing–review and editing. AA: Investigation, Writing–review and editing. FY: Conceptualization, Formal Analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Writing–review and editing.

Funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. BD-N. acknowledges the partial support from Schmidt Futures via an Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship to Cornell University.

Acknowledgments

BD-N. acknowledges the partial support from Schmidt Futures via an Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship to Cornell University.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The author(s) declared that they were an editorial board member of Frontiers, at the time of submission. This had no impact on the peer review process and the final decision.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

OpenAI, Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F. L., et al. (2023). GPT-4 technical report. arXiv [cs.CL]. Retrieved from: http://arxiv.org/abs/2303.08774.

Alshehri, A. S., and You, F. (2021). Paradigm shift: the promise of deep learning in molecular systems engineering and design. Front. Chem. Eng. 3, 700717. doi:10.3389/fceng.2021.700717

Alshehri, A. S., and You, F. (2022). Deep learning to catalyze inverse molecular design. Chem. Eng. J. 444, 136669. doi:10.1016/j.cej.2022.136669

Chen, P., and Dorfman, K. D. (2023). Gaming self-consistent field theory: generative block polymer phase discovery. Proc. Natl. Acad. Sci. 120 (45), e2308698120. doi:10.1073/pnas.2308698120

Chen, R. T., Behrmann, J., Duvenaud, D. K., and Jacobsen, J.-H. (2019). Residual flows for invertible generative modeling. Adv. Neural Inf. Process. Syst. 32.

Chiang, L., Lu, B., and Castillo, I. (2017). Big data analytics in chemical engineering. Annu. Rev. Chem. Biomol. Eng. 8, 63–85. doi:10.1146/annurev-chembioeng-060816-101555

Dan, J., Waqar, M., Erofeev, I., Yao, K., Wang, J., Pennycook, S. J., et al. (2023). A multiscale generative model to understand disorder in domain boundaries. Sci. Adv. 9 (42), eadj0904. doi:10.1126/sciadv.adj0904

Decardi-Nelson, B., Alshehri, A. S., Ajagekar, A., and You, F. (2024). Generative AI and process systems engineering: the next frontier. Comput. and Chem. Eng. 187, 108723. doi:10.1016/j.compchemeng.2024.108723

Duan, C., Du, Y., Jia, H., and Kulik, H. J. (2023). Accurate transition state generation with an object-aware equivariant elementary reaction diffusion model. Nat. Comput. Sci. 3 (12), 1045–1055. doi:10.1038/s43588-023-00563-7

Gangwal, A., and Lavecchia, A. (2024). Unleashing the power of generative AI in drug discovery. Drug Discov. Today 29, 103992. doi:10.1016/j.drudis.2024.103992

Gartner, T. E., Ferguson, A. L., and Debenedetti, P. G. (2024). Data-driven molecular design and simulation in modern chemical engineering. Nat. Chem. Eng. 1 (1), 6–9. doi:10.1038/s44286-023-00010-4

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., et al. (2020). Generative adversarial networks. Commun. ACM 63 (11), 139–144. doi:10.1145/3422622

Grossmann, I. (2005). Enterprise-wide optimization: a new frontier in process systems engineering. AIChE J. 51 (7), 1846–1857. doi:10.1002/aic.10617

Ho, J., Jain, A., and Abbeel, P. (2020). Denoising diffusion probabilistic models. Adv. Neural Inf. Process. Syst. 33, 6840–6851.

Huang, K., Ponnapalli, J., Tantsura, J., and Shin, K. T. (2024). “Navigating the GenAI security landscape,” in Generative AI security: theories and practices. Editors K. Huang, Y. Wang, B. Goertzel, Y. Li, S. Wright, and J. Ponnapalli (Springer Nature Switzerland), 31–58.

Ingraham, J. B., Baranov, M., Costello, Z., Barber, K. W., Wang, W., Ismail, A., et al. (2023). Illuminating protein space with a programmable generative model. Nature 623 (7989), 1070–1078. doi:10.1038/s41586-023-06728-8

Kingma, D. P., and Welling, M. (2013). Auto-encoding variational Bayes. arXiv Prepr. arXiv:1312.6114.

Langer, R., and Peppas, N. A. (2024). A bright future in medicine for chemical engineering. Nat. Chem. Eng. 1 (1), 10–12. doi:10.1038/s44286-023-00016-y

Lew, A. J., Stifler, C. A., Cantamessa, A., Tits, A., Ruffoni, D., Gilbert, P. U., et al. (2023). Deep learning virtual indenter maps nanoscale hardness rapidly and non-destructively, revealing mechanism and enhancing bioinspired design. Matter 6 (6), 1975–1991. doi:10.1016/j.matt.2023.03.031

Liu, D.-F., Zhang, Y.-X., Dong, W.-Z., Feng, Q.-K., Zhong, S.-L., and Dang, Z.-M. (2023). High-temperature polymer dielectrics designed using an invertible molecular graph generative model. J. Chem. Inf. Model. 63 (24), 7669–7675. doi:10.1021/acs.jcim.3c01572

Luo, B., Liu, J., Deng, Z., Yuan, C., Yang, Q., Xiao, L., et al. (2023). AutoPCF: a novel automatic product carbon footprint estimation framework based on large language models. Proceedings of the AAAI Symposium Series 2 (1), 102–106. doi:10.1609/aaaiss.v2i1.27656

Preuss, N., Alshehri, A. S., and You, F. (2024). Large language models for life cycle assessments: opportunities, challenges, and risks. J. Clean. Prod. 466, 142824. doi:10.1016/j.jclepro.2024.142824

Qin, X., Song, M., Chen, Y., Ai, Z., and Jiang, J. (2023). GPT-Lab: next generation of optimal chemistry discovery by GPT driven robotic lab. arXiv Prepr. arXiv:2309.16721. doi:10.48550/arXiv.2309.16721

Rawte, V., Sheth, A., and Das, A. (2023). A survey of hallucination in large foundation models. arXiv Prepr. arXiv:2309.05922.

Ross, A., Chen, N., Hang, E. Z., Glassman, E. L., and Doshi-Velez, F. (2021). “Evaluating the interpretability of generative models by interactive reconstruction,” in Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan).

Schilter, O., Vaucher, A., Schwaller, P., and Laino, T. (2023). Designing catalysts with deep generative models and computational data. A case study for Suzuki cross coupling reactions. Digit. Discov. 2 (3), 728–735. doi:10.1039/D2DD00125J

Schweidtmann, A. M. (2024). Generative artificial intelligence in chemical engineering. Nat. Chem. Eng. 1 (3), 193. doi:10.1038/s44286-024-00041-5

Subramanian, A., Gao, W., Barzilay, R., Grossman, J. C., Jaakkola, T., Jegelka, S., et al. (2024). Closing the execution gap in generative AI for chemicals and materials: freeways or safeguards. An MIT Exploration of Generative AI.

Takeishi, N., and Kalousis, A. (2021). Physics-integrated variational autoencoders for robust and interpretable generative modeling.

Torrente-Murciano, L., Dunn, J. B., Christofides, P. D., Keasling, J. D., Glotzer, S. C., Lee, S. Y., et al. (2024). The forefront of chemical engineering research. Nat. Chem. Eng. 1 (1), 18–27. doi:10.1038/s44286-023-00017-x

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., et al. (2017). Attention is all you need. Adv. Neural Inf. Process. Syst. 30. Retrieved from: https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.

Vogel, G., Schulze Balhorn, L., and Schweidtmann, A. M. (2023). Learning from flowsheets: a generative transformer model for autocompletion of flowsheets. Comput. and Chem. Eng. 171, 108162. doi:10.1016/j.compchemeng.2023.108162

Wang, Y., and Yan, P. (2024). RegGAN: a virtual sample generative network for developing soft sensors with small data. ACS Omega 9 (5), 5954–5965. doi:10.1021/acsomega.3c09762

Wang, Z., Jeong, H., Gan, Y., Pereira, J.-M., Gu, Y., and Sauret, E. (2022). Pore-scale modeling of multiphase flow in porous media using a conditional generative adversarial network (cGAN). Phys. Fluids 34 (12), 123325. doi:10.1063/5.0133054

Whang, S. E., Roh, Y., Song, H., and Lee, J.-G. (2023). Data collection and quality challenges in deep learning: a data-centric AI perspective. VLDB J. 32 (4), 791–813. doi:10.1007/s00778-022-00775-9

Wu, T., He, S., Liu, J., Sun, S., Liu, K., Han, Q. L., et al. (2023). A brief overview of ChatGPT: the history, status quo and potential future development. IEEE/CAA J. Automatica Sinica 10 (5), 1122–1136. doi:10.1109/JAS.2023.123618

Yao, Z., Sánchez-Lengeling, B., Bobbitt, N. S., Bucior, B. J., Kumar, S. G. H., Collins, S. P., et al. (2021). Inverse design of nanoporous crystalline reticular materials with deep generative models. Nat. Mach. Intell. 3 (1), 76–86. doi:10.1038/s42256-020-00271-1

Zhang, J., Huang, J., Jin, S., and Lu, S. (2024). Vision-Language models for vision tasks: a survey. IEEE Trans. Pattern Analysis Mach. Intell. 46 (8), 5625–5644. doi:10.1109/TPAMI.2024.3369699

Zhang, W. (2023). “Large decision models,” in Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23. Editor E. Elkind, 7062–7067. doi:10.24963/ijcai.2023/808

Zhao, W. X., Zhou, K., Li, J., Tang, T., Wang, X., Hou, Y., et al. (2023). A survey of large language models. arXiv [cs.CL]. Retrieved from: http://arxiv.org/abs/2303.18223.

Keywords: artificial intelligence, AI, generative learning, quantum-chemical calculations, materials, process engineering

Citation: Decardi-Nelson B, Alshehri AS and You F (2024) Generative artificial intelligence in chemical engineering spans multiple scales. Front. Chem. Eng. 6:1458156. doi: 10.3389/fceng.2024.1458156

Received: 02 July 2024; Accepted: 12 August 2024;
Published: 29 August 2024.

Edited by:

José María Ponce-Ortega, Michoacana University of San Nicolás de Hidalgo, Mexico

Reviewed by:

Francisco Javier López-Flores, Michoacana University of San Nicolás de Hidalgo, Mexico
Fernando Israel Gómez-Castro, University of Guanajuato, Mexico
Rogelio Ochoa-Barragan, Michoacana University of San Nicolás de Hidalgo, Mexico

Copyright © 2024 Decardi-Nelson, Alshehri and You. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Fengqi You, fengqi.you@cornell.edu
