- 1Todhunter Scientifics, Minneapolis, MN, United States
- 2Alberta Machine Intelligence Institute, Edmonton, AB, Canada
- 3Department of Mathematics, University of Waterloo, Waterloo, ON, Canada
- 4Department of Biology, University of Waterloo, Waterloo, ON, Canada
- 5New Harvest, Sacramento, CA, United States
Cultured meat has the potential to provide a complementary meat industry with reduced environmental, ethical, and health impacts. However, major technological challenges remain which require time- and resource-intensive research and development efforts. Machine learning has the potential to accelerate cultured meat technology by streamlining experiments, predicting optimal results, and reducing experimentation time and resources. However, the use of machine learning in cultured meat is in its infancy. This review covers the work available to date on the use of machine learning in cultured meat and explores future possibilities. We address four major areas of cultured meat research and development: establishing cell lines, cell culture media design, microscopy and image analysis, and bioprocessing and food processing optimization. In addition, we have included a survey of datasets relevant to cultured meat research. This review aims to provide the foundation necessary for both cultured meat and machine learning scientists to identify research opportunities at the intersection between cultured meat and machine learning.
1 Introduction
Food production generates about a quarter of global greenhouse gas emissions and causes other negative impacts on the environment and human health (Crippa et al., 2021; Steinfeld et al., 2006). Animal products, including meat, seafood, eggs, and dairy, contribute more than 56% of food’s emissions, despite providing only 37% of protein and 18% of calorie intake (Poore and Nemecek, 2018). Meat production is a large sector, producing 328 Mt in 2020 and expected to expand to 374 Mt by 2030, based on estimates from the Organisation for Economic Co-operation and Development (OECD) and the Food and Agriculture Organization (FAO) of the United Nations (Michele, 2021). In light of projected growth in the global population and income, these estimates project that meat consumption will increase by 14%. To meet global meat demand and limit global warming to 1.5°C, there is a need for major changes in the production of meat (Clark et al., 2020; Ivanovich et al., 2023). Cultured meat (CM), known by many names including “cell-based” or “cultivated” meat, is an emerging technology that uses tissue engineering and biomanufacturing techniques to produce animal meat through cell culture rather than animal husbandry. Proponents of the technology herald its potential to provide an option for producing animal agriculture products with reduced environmental, ethical, and health impacts (Sinke et al., 2023; Tuomisto and Teixeira de Mattos, 2011). However, major technological challenges remain in bringing CM products to market and achieving their proposed benefits (Risner et al., 2021). Many challenges stem from the fact that the technologies for mammalian tissue culture come primarily from the medical field, where the scale is much lower and the market has weaker incentives to reduce the costs of production. However, if these technologies are to be applied to food, the challenges of scale and cost must be addressed.
Some specific improvements to CM can be, and have been, made using traditional experimental approaches. However, tackling some of the more complex research questions requires more advanced approaches and experimental conditions. In recent years, an increasing number of groups have been using methods enhanced by artificial intelligence (AI), and in particular its subset machine learning (ML), for such tasks. ML can streamline experiments, predict optimal results, and reduce experimentation time and resources. There are many opportunities for ML to accelerate research and development and reduce costs in CM. Some companies have indicated the use of ML in CM product development or CM-associated services (Business Wire, 2023; Ho, 2021; Leach, 2024; Marston, 2022; Penarredonda, 2017; Protein Report, 2022; Shieber and This, 2021; Southey, 2023), but very little of this progress in applying ML to CM has been shown or validated in the public domain. Academic and government-supported research in this space is emerging, including at the University of California at Davis, Virginia Tech, Tufts University, and The CentRe of Innovation for Sustainable banking and Production of cultivated Meats (CRISP Meats). However, few publications on the topic exist to date; these are summarized in Table 1 (Cosenza et al., 2023; Cosenza et al., 2022; Cosenza et al., 2021; Nikkhah et al., 2023; Ng and Tan, 2024). An increase in open public research on the use of ML to optimize and scale CM production would greatly accelerate the application of ML to the CM field.
In this review, we aim to provide the foundation necessary for researchers, from the CM or ML fields, to identify research opportunities at the intersection between CM and ML. We first provide a brief overview of both fields. Subsequent sections delve into both existing and potential ML applications for optimizing cell lines, formulating culture media, aiding cell culture microscopy and image analysis, and optimizing bioreactor and food processing parameters. Note that we have focused on CM challenges to which ML can be applied, and this should not be seen as a comprehensive review of all challenges in the CM or ML fields. As applications of ML in CM are limited, we discuss how ML methods that have been applied in other areas of bioinformatics can be adopted to solve tasks in CM. Finally, by combining existing literature and atlases, a compilation of animal biology datasets has been created for different CM-relevant species (Supplementary Table S1).
2 Background on the fields of cultured meat and machine learning
2.1 Cultured meat
CM aims to replicate the taste and texture of animal tissue within a manufacturing system using animal cells (Figure 1). First, cells from a species of interest are selected or engineered for desirable growth and differentiation traits in vitro. The selected cells are grown in a suitable medium, providing the nutrients and signaling cues that would normally be provided in the body. At early stages, such as cell selection, the cell culture may start at a small scale in plastic dishes or flasks. Cell culture is eventually scaled up to industrial-scale bioreactors, devices capable of controlling environmental temperature, pH, dissolved oxygen, and nutrient exchange at large volumes (Post et al., 2020). Once enough cells are grown, they are differentiated into mature cell types. At this stage, the cell culture may be formed into a tissue (i.e., structured product, such as a steak) or a cell slurry, which is later processed into a meat product (i.e., unstructured product, such as ground beef).
Figure 1. Cultured meat manufacturing process. Reproduced from Reiss et al. (2021) with permission.
This process leans heavily on medical tissue engineering, an area of research that has been studied for nearly 40 years and has been commercialized at a small scale for simple grafts of cell-laden scaffolds for skin and cartilage. However, complete tissues, such as skin with hair follicles and sebaceous glands or functional muscle, still face technological barriers (Beheshtizadeh et al., 2022). While tissue function is not critical for CM, using this technology for the production of food comes with unique constraints, especially the need for larger scale, lower costs, and materials that are both edible and palatable.
How closely CM products replicate the properties of conventional meat varies depending on the techniques used and will rely on further technological development to reach the goal of equivalent taste, texture, nutrition, and cost. A comparison of CM and conventional meat has been reviewed elsewhere (Fraeye et al., 2020; Broucke et al., 2023; Chriki et al., 2024).
Commercial and academic interest in CM has grown rapidly in the last decade. The number of companies working on CM grew from 1 to over 170 from 2011 to 2023, with over 3 billion dollars of investment (Battle et al., 2024). Similarly, academic interest in the field has grown dramatically, with 350+ papers published on CM in the last 2 years, more than all other years prior (with tracking) combined (Battle et al., 2024). In parallel, cost estimates for CM have come down significantly in the last decade: from the first CM demonstration in 2013 of a 140-gram burger at approximately €250,000 (Kupferschmidt, 2013), to recent claims of costs as low as $7.70/lb from industry developers (Poinski, 2021). However, these industry cost claims have yet to be proven publicly. Furthermore, CM production has yet to be shown on a scale close to that needed to offset even a fraction of current meat consumption (FAO, 2020). Current production capacity is not known, but estimates range from 1 to 10 kg/y, compared to the 3.2 × 10¹¹ kg/y produced by conventional meat (Humbird, 2020). Highly optimized industrial mammalian cell lines, such as Chinese hamster ovary (CHO), are still produced at a much lower scale and higher cost than needed for food production (Humbird, 2020). Given that consumers are unlikely to want to consume hamster ovary cells, significant technological challenges must be overcome for meat and seafood-relevant cells to be produced at a scale and cost needed to replace a meaningful portion of the conventional meat market.
2.2 Machine learning
An ML workflow involves a series of steps, as illustrated in Figure 2, which starts with preparation of the dataset. A dataset typically consists of datapoints, each one an observation with features that describe the datapoint. The type of data varies widely: numerical, time series, text, images, audio, video, sequential, graph, or any combination of these. This data undergoes preprocessing, where procedures such as imputing missing values and reducing data dimensionality are undertaken. Since many machine learning models require numerical data, a data transformation step converts the data to numerical values (Kabas et al., 2023; Kayakuş and Açıkgöz, 2022). Subsequently, an integral component of the ML methodology is dividing data into distinct subsets: training, validation, and testing. Some common approaches, such as K-fold cross-validation, leave-one-out, and holdout validation, can typically be used for most problems (Bishop, 2006). Prior to model training, it may be necessary to employ feature selection or extraction techniques on the training set to identify the most informative variables in the data, and these features must be used in both the test and validation sets to maintain the integrity of the evaluation process.
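The splitting step described above can be sketched in a few lines. The following is a minimal illustration of K-fold cross-validation using only the Python standard library; the helper name `k_fold_indices`, the fold-assignment scheme, and the fixed seed are assumptions made for this example and are not taken from any cited workflow.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for K-fold cross-validation.

    Hypothetical helper for illustration: indices are shuffled once and
    then partitioned into k folds whose sizes differ by at most one.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    folds = [indices[i::k] for i in range(k)]
    for i, val_idx in enumerate(folds):
        # All folds except fold i form the training set for this round.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield sorted(train_idx), sorted(val_idx)


# Example: 10 datapoints split into 5 folds.
splits = list(k_fold_indices(n_samples=10, k=5))
```

Each datapoint appears in exactly one validation fold. Any feature selection would be fitted on `train_idx` only and then applied unchanged to `val_idx`, which keeps information from the held-out fold out of training.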
Figure 2. Illustration of the typical steps of how machine learning can be applied to biological data.
The culmination of this preparatory work is a dataset suitable for training an ML model. Typically, experiments are conducted with different ML algorithms/architectures (such as random forests, k-means, and deep neural networks) to settle on the most performant overall model. However, sometimes the choice of ML model depends on the type of data. For example, convolutional neural networks (CNNs) are preferred for image data since they can extract spatial features from the image (Li et al., 2022). On the other hand, recurrent neural networks (RNNs) are typically used for sequential data since they can remember information in a long sequence through their gate mechanism (Lipton et al., 2015).
The training set is instrumental in building the model since the model adjusts its parameters to learn the distribution of the training set. The validation set is required for fine-tuning the model’s hyperparameters and ensuring that the model generalizes well beyond the training data, thereby avoiding overfitting. Finally, the test set provides a measure of the model’s predictive accuracy and overall performance in real-world scenarios. Some of the typical performance metrics for classification tasks are accuracy, precision, recall, and F1-score. Typical performance metrics for regression are the R² score, mean absolute error, and mean squared error (Alpaydin, 2020). However, many other performance metrics are used depending on the task and type of data. The purpose of performance metrics is to compare the performance of different models and to understand how a model is performing overall for the specific task. Figure 2 illustrates a deliberately simple ML process to keep the procedure easy to follow. Note that this example does not include procedures that involve reinforcement learning (discussed in Section 2.3) or other complex supervised and unsupervised learning scenarios.
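To make the metrics named above concrete, the sketch below computes them from their standard definitions in plain Python. The function names and the toy label and value lists are illustrative assumptions; in practice these metrics are usually taken from an ML library.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def regression_metrics(y_true, y_pred):
    """Mean absolute error, mean squared error, and R^2."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined when the true values have zero variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mae": mae, "mse": mse, "r2": r2}


# Toy labels and values, invented for this example.
clf = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
reg = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

For the toy inputs, precision, recall, and F1 all equal 2/3, and the regression example gives an R² of 0.9.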
2.3 Types of machine learning
ML methodologies can be broadly classified into three categories: supervised, unsupervised, and reinforcement learning (RL). In supervised learning, the model is trained on a labeled dataset, where each example is paired with an outcome or label that aligns with the objective of the specific task at hand. Consider the goal of predicting gene expression levels from DNA sequences; here, the data point would be the DNA sequence, while the associated expression level serves as the label. The model hones its predictive capability by learning the relationship between input features and their corresponding labels. When it comes to evaluation, the trained model is tasked with predicting labels for new, unseen data points. Key traditional supervised learning models include logistic/linear regression, k-nearest neighbors, and support vector machines. The supervised ML approach has made significant contributions across various domains in bioinformatics, enabling advancements in DNA segmentation, gene expression prediction, and protein structure prediction (Larrañaga et al., 2006).
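As a small illustration of supervised learning, the sketch below implements a k-nearest neighbors classifier, one of the traditional models named above, in plain Python. The two-dimensional feature vectors and the "low"/"high" expression labels are invented toy data, and `knn_predict` is a hypothetical helper name.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest
    labeled training points (Euclidean distance)."""
    by_distance = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Toy labeled dataset: each datapoint is a feature vector with a label.
train_x = [(0.1, 0.2), (0.0, 0.4), (0.3, 0.1),
           (0.9, 0.8), (1.0, 1.0), (0.8, 0.7)]
train_y = ["low", "low", "low", "high", "high", "high"]

# Evaluation on new, unseen datapoints.
pred_a = knn_predict(train_x, train_y, (0.2, 0.2))
pred_b = knn_predict(train_x, train_y, (0.9, 0.9))
```

The model never sees the query points during training; their labels are inferred purely from the labeled examples, which is the defining feature of the supervised setting.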
Unsupervised learning, in contrast, does not rely on labeled data. It is particularly useful when the goal is to unearth underlying patterns or structures within the data, independent of predefined outcomes. This makes unsupervised learning a potent tool for exploratory analysis, especially in scenarios where labeled data is scarce or when the structure of the data is not fully understood. Some of the key unsupervised learning approaches are k-means, hierarchical clustering, and density-based spatial clustering of applications with noise. Unsupervised learning has been successfully applied in grouping functionality-related genes, microarray analysis, and biological image segmentation (Larrañaga et al., 2006; Parasa et al., 2021).
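The clustering idea can be sketched with k-means, the first approach listed above. This is a minimal plain-Python version of the standard assign-then-update iteration; the function name, the seeded random initialization, the fixed iteration count, and the toy points are all assumptions for illustration.

```python
import math
import random


def k_means(points, k, n_iter=20, seed=0):
    """Cluster `points` (tuples of floats) into k groups.

    Returns (labels, centroids). No labels are used at any stage.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k datapoints as initial centers
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return labels, centroids


# Two visibly separated groups of unlabeled toy points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = k_means(points, k=2)
```

The cluster numbers themselves are arbitrary; only the grouping is meaningful, so the first three points share one label and the last three share the other.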
RL is a dynamic and adaptive approach well-suited for situations where a machine is required to make a series of decisions to achieve a desired goal or perform an optimization task, offering a framework for learning through interaction (Sutton and Barto, 2018). Within this paradigm, an agent (often an advanced ML model in deep RL) engages in a sequential decision-making process, each time interacting with a complex environment represented by a set of variables that define the current state of the environment. The agent executes actions, transitioning between states, and ultimately may reach a terminal state, signaling the conclusion of the decision sequence. The RL framework incorporates a system of rewards and penalties, with the agent receiving feedback in the form of rewards for beneficial actions or penalties for undesirable outcomes. The objective for the agent is to devise a strategy that maximizes cumulative rewards, thus steering toward the optimal actions to attain its final goal. RL has recently been applied to bioinformatics (Angermueller et al., 2020; Jumper et al., 2021; Neftci and Averbeck, 2019) and has gained considerable attention because of its success in areas such as sequence alignment (Jafari et al., 2019) and protein loop sampling (Barozet et al., 2020).
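The agent, state, action, and reward loop can be illustrated with tabular Q-learning on a toy problem. Everything in this sketch is an assumption made for the example: the five-state chain environment, the reward of 1 for reaching the last state, the learning rate and discount factor, and the uniformly random behavior policy (acceptable here because Q-learning is off-policy).

```python
import random

N_STATES = 5           # states 0..4; state 4 is the terminal (goal) state
ACTIONS = (-1, +1)     # action 0 moves left, action 1 moves right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor


def step(state, action):
    """Toy environment: move along the chain, reward 1.0 on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(n_episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(n_episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(len(ACTIONS))  # explore uniformly at random
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update toward reward + discounted best future value.
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy policy: the highest-valued action index in each non-terminal state.
greedy_policy = [
    max(range(len(ACTIONS)), key=lambda a: q_table[s][a])
    for s in range(N_STATES - 1)
]
```

After training, the greedy policy selects "move right" in every non-terminal state, which is the strategy that maximizes the cumulative discounted reward in this toy environment.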
2.4 Neural networks
Many recent advances in applying ML to biology are based on neural networks, a subset of ML methods that can be employed in all three ML categories: supervised, unsupervised, and RL (Menden et al., 2020). ML models with neural networks are often referred to as deep neural networks when the architecture contains multiple layers, a method more broadly known as deep learning. These layers are loosely inspired by the activity of brain neurons: each neuron (node) applies a mathematical function to the data it receives from the previous layer (LeCun et al., 2015; Schmidhuber, 2015). The data is first fed into an input layer, which connects to one or more hidden layers (used for computing), and finally to an output layer, designed to deliver the final prediction. The model learns by optimizing the function parameters of each node. Some popular neural network architectures are feed-forward networks, CNNs, RNNs, and transformers (LeCun et al., 2015; Schmidhuber, 2015; Vaswani et al., 2017). Deep learning models can capture complex biological processes and incorporate heterogeneous data through their different layers, which may be a necessity for many optimization and prediction tasks in CM (Li et al., 2020). In RL, deep learning models can also be employed as agents (Li, 2018).
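The input, hidden, and output layers described above can be traced in a minimal forward pass. This sketch uses plain Python with hand-picked, hypothetical weights (`W1`, `b1`, `W2`, `b2`); a trained model would learn these parameters from data rather than have them set by hand.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer: each output node is a weighted sum plus bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def forward(x, params):
    """Input layer -> one hidden layer (ReLU) -> single output node (sigmoid)."""
    hidden = relu(dense(x, params["W1"], params["b1"]))
    (logit,) = dense(hidden, params["W2"], params["b2"])
    return sigmoid(logit)


# Hypothetical parameters: 2 inputs, 2 hidden nodes, 1 output node.
params = {
    "W1": [[1.0, -1.0], [0.5, 0.5]],
    "b1": [0.0, -0.5],
    "W2": [[1.0, -2.0]],
    "b2": [0.0],
}
prediction = forward([2.0, 1.0], params)
```

Training would consist of adjusting the values in `params` to reduce a loss computed from `prediction`, which is what "optimizing the function parameters of each node" refers to.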
2.5 Generative AI
Generative AI is a subfield of unsupervised learning that typically aims to generate new data or samples based on the patterns learned from the training data. Variational autoencoders (VAEs) (Kingma and Welling, 2019; Singh and Ogunfunmi, 2022; Wei and Mahmood, 2021), which employ deep learning, are the first generation of generative AI models. VAEs typically employ two deep learning models: (i) an encoder that encodes the input into a latent space and (ii) a decoder that reconstructs the input by sampling from this latent space using variational inference techniques. The encoder and decoder are trained together to minimize the reconstruction loss of the input, alongside a regularization term that keeps the latent distribution close to a chosen prior. Figure 3 shows the general architecture of a VAE.
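For readers who prefer a formula, the VAE training objective is commonly written as the negative evidence lower bound shown below. The notation is the standard one from the VAE literature and is introduced here as an assumption, since the text above does not define symbols: x is an input, z a latent variable, q_φ the encoder, p_θ the decoder, and p(z) the prior over the latent space.

```latex
\mathcal{L}(\theta, \phi; x)
  = \underbrace{-\,\mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]}_{\text{reconstruction loss}}
  + \underbrace{D_{\mathrm{KL}}\big(q_\phi(z \mid x) \,\|\, p(z)\big)}_{\text{latent regularization}}
```

The first term rewards faithful reconstruction of the input, while the second keeps the encoder's latent distribution close to the prior, which is what allows new samples to be generated by decoding draws from p(z).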
The second generation of generative models is deep adversarial networks, more commonly known as generative adversarial networks (GANs) (Creswell et al., 2018; Zhang et al., 2017). These architectures also have two networks: a generator and a discriminator. Initially, the generator produces new data essentially at random, and the discriminator classifies whether a given sample is real or generated. Over successive iterations, the generator learns to produce better samples while the discriminator learns to distinguish real samples from generated ones. Eventually, the generator learns to generate realistic data samples that are difficult for the discriminator to distinguish from real data.
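The generator and discriminator competition described above is usually formalized as a two-player minimax game. The expression below is the standard formulation from the adversarial network literature, given here as an assumed notation: G is the generator, D the discriminator, p_data the distribution of real samples, and p_z the noise distribution fed to the generator.

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator is trained to increase V by scoring real samples near 1 and generated samples near 0, while the generator is trained to decrease V by producing samples that the discriminator scores as real.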
Recent advances in generative AI are mostly based on transformers (Vaswani et al., 2017), which have revolutionized various fields including natural language processing (NLP), computer vision, and bioinformatics (Ji et al., 2021; Zhang et al., 2024). Transformers have demonstrated exceptional capabilities in capturing long-range dependencies and modeling complex sequential data. In bioinformatics, methodologies draw heavily on NLP techniques due to the inherent similarities between biological sequences and natural language texts. By employing transformers, researchers are able to effectively model biological sequences such as DNA, RNA, and protein sequences, leading to significant advancements in tasks such as sequence generation, structure prediction, and drug discovery (Jumper et al., 2021; Ji et al., 2021; Zhang et al., 2024; Bran and Schwaller, 2023; Nguyen et al., 2024).
2.6 Graph neural networks
Graph neural networks (GNNs) are another area of ML that works with graph-structured data and biological networks, such as protein interaction networks, gene coexpression networks, and metabolic networks (Scarselli et al., 2009; Zhou et al., 2020). A biological network or biological graph is typically arranged in terms of nodes and edges, where the nodes are biological entities (e.g., genes, proteins, or metabolites) and the edges indicate how these entities relate to one another. GNNs can be used to model these complex networks and predict how perturbations will affect the whole network.
GNNs are based on the principle of message passing, where each node in the graph aggregates the embeddings (messages) of its neighbor nodes and updates its own embedding by passing the pooled messages through a neural network with learned weights. There are three main tasks in graph-structured data. The first is link prediction between two biological entities. Examples of link prediction are predicting the interactions between proteins or predicting interactions between genes/regulatory elements (Kumar et al., 2020; Li et al., 2022). The second task is node functionality prediction, for example, predicting an unknown function of a protein based on the physical interactions between proteins in the protein interaction network (Muzio et al., 2021). The third task is classifying subnetwork functionality. For example, Muzio et al. classified subnetwork function based on a molecule’s toxicity or solubility (Muzio et al., 2021). Figure 4 shows the different ML tasks on graph- or network-structured data. In addition, ML is also applied to obtain subnetwork embeddings, where the subnetwork is represented as a vector preserving the important information within the subnetwork in numeric form. This embedding can facilitate further analysis of the subnetwork and is used for various downstream tasks (Nelson et al., 2019).
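A single round of message passing can be written out explicitly. The sketch below is one simple variant chosen for illustration (mean pooling over each node's neighbors and itself, followed by a linear map and ReLU); the three-protein graph, the feature vectors, and the weight matrix are hypothetical, and real GNN layers differ in how they aggregate and normalize messages.

```python
def message_passing_step(features, edges, weight, bias):
    """One message-passing round on an undirected graph.

    features: dict mapping node -> list of floats (current embedding)
    edges:    iterable of (u, v) pairs
    weight, bias: parameters of the shared update layer (learned in practice)
    """
    neighbors = {node: [] for node in features}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    updated = {}
    for node, own in features.items():
        # Collect messages from neighbors, plus the node's own embedding.
        messages = [features[n] for n in neighbors[node]] + [own]
        # Pool the messages by averaging each feature dimension.
        pooled = [sum(vals) / len(messages) for vals in zip(*messages)]
        # Update: linear transform of the pooled message followed by ReLU.
        updated[node] = [
            max(0.0, sum(w * x for w, x in zip(row, pooled)) + b)
            for row, b in zip(weight, bias)
        ]
    return updated


# Toy protein interaction graph: A - B - C.
features = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
edges = [("A", "B"), ("B", "C")]
weight = [[1.0, 0.5], [-1.0, 1.0]]  # hypothetical, not learned
bias = [0.0, 0.0]

embeddings = message_passing_step(features, edges, weight, bias)
```

Stacking several such rounds lets information travel across multiple edges, so a node's embedding comes to reflect its wider neighborhood. These embeddings are what downstream link, node, or subnetwork predictors operate on.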
3 Cell lines
Meat consists of various cell types, predominantly muscle cells (approximately 90%), with fat and connective tissue cells accounting for the remaining 10% (Listrat et al., 2016). Additionally, there are some vascular, neural, and tissue-resident immune cells present in small amounts (Listrat et al., 2016; Ben-Arye and Levenberg, 2019; Reiss et al., 2021). To provide the taste, texture, and aroma expected in meat, most CM development has focused on muscle, fat, and connective tissue production. Cells will also likely contribute to the nutritional properties of CM products, and cell optimization may be used to tailor the nutritional profile of CM (Smith-Uchotsk and Wanjiru, 2023; Stout et al., 2020).
The cell types used to produce CM range from lineage-committed progenitor cells (e.g., muscle satellite cells, myoblasts, or preadipocytes) to stem cells able to differentiate into a broader set of cells (e.g., mesenchymal stem cells, embryonic stem cells, or induced pluripotent stem cells) (Ben-Arye and Levenberg, 2019). Developers may use primary cells, meaning cells that are isolated directly from an animal. However, cell lines, which are established cultures of cells that have been selected and optimized, are ideal because they are more consistent and better characterized, and because they reduce the use of animals in the supply chain. Furthermore, immortalized cell lines or pluripotent stem cells are of particular interest due to their ability to escape the typical limits on population doublings seen in most primary cells (i.e., the Hayflick limit) (Cong et al., 2002; Hayflick, 1965). A detailed review of cells used in the production of CM can be found elsewhere (Reiss et al., 2021).
Few CM-relevant cell lines are currently well characterized and commercially available, with most coming from model organisms used in biomedical research, such as mouse, rat, or zebrafish, and cell lines for many agricultural species still need to be developed (Soice and Johnston, 2021). Cell lines for marine species, especially invertebrates such as mollusks or crustaceans, are particularly underdeveloped, with only a few reported cell lines, not all of which come from food-relevant species (Béjar et al., 2002; Buonocore et al., 2006; Gignac et al., 2014; Goswami et al., 2023; Krishnan et al., 2023; Li et al., 2021; Parameswaran et al., 2007; Parton et al., 2007; Potter et al., 2020; Saad et al., 2023). For many species, there is a lack of basic knowledge of their physiology and the biochemistry required for in vitro culture or immortalization (Soice and Johnston, 2021; Rubio et al., 2019; Musgrove et al., 2024). The lack of known molecular and genetic markers, together with the scarcity of species-specific antibodies, makes identifying and curating cell lines difficult for under-studied species (Musgrove et al., 2024; Ravikumar et al., 2024). ML can help biologists analyze complex cellular data to assist with identifying ideal cell line populations and optimizing existing cell lines through gene perturbations.
3.1 Network analysis is a tool to model biological interactions
Optimization of cell lines is often challenging because it requires understanding the “state” of a cell or cell population (i.e., what the gene and protein networks are doing) and selecting or engineering for desired cell states. Measuring and predicting these states involves interpreting complex interactions between genes and proteins, identifying those that are important for specific qualities, and predicting how perturbations will affect the whole network. ML can be used to model these interactions through network analysis. Network analysis is used throughout sections 3 and 4, so a general overview is first presented here. In recent years, GNNs have been employed for biological network analysis. GNNs have been implemented to predict protein interactions (Jha et al., 2022; Yang et al., 2020), molecular interactions (Huang et al., 2020; Kang et al., 2022), and metabolite-disease associations (Sun et al., 2022), and to obtain subnetwork embeddings (Ciortan and Defrance, 2021) that can be used to identify the functions of a biological subnetwork. Generative deep learning strategies, such as VAEs and deep adversarial network-based models, have also been trained on biological networks by employing GNNs (Sun et al., 2022; Zhang et al., 2020). The most famous example to date is AlphaFold2, a deep learning GNN model that was employed on amino acid sequences to predict more than 200 million protein structures (Jumper et al., 2021). This is a significant advance in the field of structural biology which can be used in protein design, drug target prediction, cell type identification, and antibody development. AlphaFold2 can play a major role in designing new proteins and understanding the physical relationships between proteins (Evans et al., 2024).
Network analysis can employ multiple types of biological data within a multi-omics setting. Since each omics technique typically captures a specific biological process, integrating multi-omics data can provide a holistic overview of the biological process. In a multi-omics ML approach, different ML models are employed on different types of data (such as genomics, transcriptomics, proteomics, and metabolomics) to obtain numerical representations. These numerical representations are then combined together to obtain a more informative representation used to predict final outputs. For example, MOGONET employs a GNN on mRNA expression, DNA methylation, and miRNA expression data to predict disease information (Wang et al., 2021). Similar datasets can also be used for CM to predict cell line features. More applications of multi-omics ML methods are discussed in section 3.2.
Furthermore, network analysis can help determine aroma and flavor (Lee et al., 2023). Since meat aroma and flavor are largely controlled by metabolic pathways (Ramalingam et al., 2019), network analysis can be applied to biosynthetic pathways to enhance or add flavors to CM products. Network optimization has been used to design yeast that overexpress licorice glycoside (Huang et al., 2021), and although licorice is a flavor that few would want in a steak, one could imagine more savory corollaries. In other yeast experiments, transcriptomic network analysis has been used to improve acid resistance, and acid resistance is just as important to mammalian culture as to microbial culture (Li et al., 2021).
3.2 Machine learning can help to analyze complex omics data to identify and characterize new cell lines
“Omics” approaches, such as genomics, transcriptomics, proteomics, and metabolomics, are powerful tools for identifying and characterizing cell lines. In particular, RNA sequencing (RNA-seq) technologies are commonly used to quantify cellular gene expression to validate and optimize cell lines. However, analyzing RNA-seq or other omics data across many candidate cells is a complex and daunting analytical task. ML can help these analyses in multiple ways, including by grouping functionality-related cells using an unsupervised approach (Ciortan and Defrance, 2021), profiling gene expression using a supervised approach (Chen et al., 2016), and identifying different tissue types using unsupervised (Li et al., 2020; Lin et al., 2020) and semi-supervised approaches (Alvarez et al., 2020).
Using gene expression data to cluster cells based on cell type or behavior can help explain heterogeneity among cell populations and discover subpopulations with beneficial characteristics. When establishing cells for CM production, scientists may want to isolate only certain cell types with optimal attributes or remove undesirable cell types. For example, using single-cell RNA-seq (scRNA-seq), Messmer et al. found that an isolation from cattle muscle contained 11 distinct cell types (Messmer et al., 2023). This work led to a better understanding of the cells derived from a primary isolation, as well as cell surface markers suitable for the identification and separation of populations by flow cytometry. Additionally, in a recent preprint, Melzener et al. used RNA-seq to study subpopulations during muscle differentiation, characterizing cell fates with the aim of improving the efficiency of cell differentiation and maturation (Melzener et al., 2024). ML is also increasingly being explored for the modeling of cell trajectories via scRNA-seq (Qiu et al., 2022).
Unsupervised ML can help to map cellular heterogeneity by grouping functionality-related cells, identifying cell sub-populations, and performing dimensionality reduction (Li et al., 2020; Lin et al., 2020; Brendel et al., 2022). Typically, the input to the unsupervised ML model is gene expression data obtained from RNA-seq. In conventional ML frameworks, which are not based on neural networks, the outcome is generally an assigned cluster number for each cell or gene. For an in-depth exploration of how traditional ML techniques are applied in this context, we direct readers to the comprehensive review by Petegrosso et al. (2019). In contrast, unsupervised deep learning methods predominantly leverage autoencoders, which compress high-dimensional cellular data into a more manageable lower-dimensional space while retaining essential information (Li et al., 2020; Lin et al., 2020; Brendel et al., 2022). This lower dimensional representation of the encoder can be used to obtain clusters of cells (Li et al., 2020; Lin et al., 2020; Svensson et al., 2020; Tian et al., 2019; Wang and Gu, 2018) and can be further fine-tuned for gene expression profiling (Menden et al., 2020; Alharbi and Vakanski, 2023).
Recently, another family of autoencoders that uses GNNs has been employed to obtain the lower dimensional representation of transcriptomics data. These GNN-based autoencoders use the knowledge of biological networks, such as gene–gene relationships, cell interaction networks, protein interaction networks and biological pathways, along with gene expression data to obtain a more robust and informative representation of cells and genes (Ciortan and Defrance, 2021; Brendel et al., 2022; Gan et al., 2022; Rao et al., 2021; Shan et al., 2023; Wang et al., 2021; Wen et al., 2022). The GNN autoencoders are unique compared to traditional autoencoders as they encode information on biological interaction between entities along with structural information and biological properties.
Semi-supervised approaches have been employed in many tasks, such as learning responses due to gene perturbation, different functional score prediction due to gene knockouts, identifying different tissue types, and transcriptome analysis (Alvarez et al., 2020; Aromolaran et al., 2020; Chuai et al., 2018; He et al., 2020; Lotfollahi et al., 2021; Osorio et al., 2022; Tian et al., 2018). These models are adept at leveraging both labeled and unlabeled data, typically employing the unlabeled data to train an autoencoder and then using the labeled data to fine-tune the autoencoder toward the target outcomes (Alvarez et al., 2020; He et al., 2020; Bernstein et al., 2020). Fine-tuning is the process of adjusting pretrained model parameters for a specific task by utilizing a small labeled dataset. The semi-supervised approach is notably different from its unsupervised counterpart as it provides a degree of guided learning, which is crucial when the available labeled data is sparse but critical for the identification of specific labels. For instance, while the categorization of cells based on functional similarities may fall under unsupervised learning—grouping cells by inherent characteristics—the task of pinpointing cell doublets could benefit from a semi-supervised model that utilizes a limited set of known doublet samples to enhance its predictive accuracy.
In addition to transcriptomic data, further single-cell multimodal sequencing technologies have been developed that provide cell-specific information, such as chromatin accessibility (scATAC-seq) (Baek and Lee, 2020) and surface proteins (CITE-seq) (Stoeckius et al., 2017). These data types provide insight complementary to scRNA-seq, such as improving accuracy in modeling gene regulatory networks in the case of scATAC-seq (Huang et al., 2023; Kim et al., 2023). This understanding can aid in CM-related tasks, such as identifying transcriptomic/epigenetic markers predictive of high proliferation and differentiation potential given previously observed variability in primary cell culture performance (Melzener et al., 2022; Meßmer, 2023; Metzger et al., 2020).
In an ML context, autoencoders and GNN-based deep learning models are mostly applied to this multimodal data (Athaya et al., 2023). Some of these autoencoders employ individual encoders and decoders for each modality. We refer the readers to a review of multimodal single-cell models, and a review on general best practices for single cell analysis, to learn more about the topic (Athaya et al., 2023; Heumos et al., 2023).
3.3 Antibody design for characterization and isolation of novel species can be aided by machine learning
In cell line development, it is crucial both to characterize cells, commonly via omics data (as discussed above) or visual identification (discussed in section 5), and to isolate the cells of interest, typically using flow cytometry cell sorting. Visual identification and flow cytometry both require the use of antibodies with specific binding affinity to known cellular markers. Markers of muscle cell differentiation are well understood for most mammalian species (Yu et al., 2021). However, in under-explored species, such as fish or aquatic invertebrates, these proteins are not always shared with mammalian cells or, to the degree they are, have low sequence conservation (Musgrove et al., 2024; Liongue and Ward, 2007). For example, recent work with an Atlantic mackerel (Scomber scombrus) skeletal muscle cell line found that antibodies for the muscle satellite cell marker paired-box protein 7 (PAX7) were successful, while those for the early myogenic marker myoblast determination protein 1 (MYOD) were not (Saad et al., 2023). Surface markers, which are most commonly used during flow cytometry cell sorting, are particularly understudied in CM-relevant cell types, with most existing research focused on immunology (Liongue and Ward, 2007; Charoensawan et al., 2010).
Attempts at validating commonly used mammalian antibodies in fish have led to issues with cross-reactivity and specificity, suggesting that establishing cell lines for CM research will require the repurposing of existing antibodies, if not the development of fully novel antibodies (Antuofermo et al., 2023). The use of ML on omics data has been shown to aid antibody development, such as via improving species cross-reactivity, antibody co-optimization, and binding affinity (Bennett et al., 2024; Hie et al., 2024; Makowski et al., 2022). The autoencoder model totalVI, trained on joint scRNA-seq and CITE-seq data, is able to provide insight into many key variables that would aid antibody development: the identification of novel differentially expressed features to target in a given cell subpopulation, the improved prediction of false positive/negative surface proteins, the reduction of technical bias common in antibody-based measurements such as “background,” and the improvement of experimental design by helping determine optimal antibody titrations/sequencing depths for balancing cost and signal-to-noise ratio (Gayoso et al., 2021). For a full review of computational methods relevant to antibody development, readers are pointed to Kim et al. (2023).
3.4 Machine learning can map and enhance genetic traits to optimize cell lines
Gene editing can be used to enhance or alter cellular traits to generate cell lines optimized for CM production, such as accelerating growth, extending growth (i.e., immortalization), reducing input costs, or tailoring the flavor and nutrition. For example, manipulating cellular metabolism has a large potential for increasing the efficiency (and therefore reducing costs) of CM products (Risner et al., 2021; Humbird, 2021). As another example, Stout et al. engineered cells to overexpress FGF2, eliminating the need for FGF2 supplementation in media through autocrine signaling (Stout et al., 2024). Because many of the species used in CM are under-studied, their genome regulatory networks are not yet well understood, slowing efforts for gene editing. ML offers an opportunity to accelerate gene editing technology.
Genes typically have many regulatory regions, such as upstream and downstream regions, untranslated regions, promoters, and enhancers. These regulatory regions determine where, when, and how much a gene is expressed. Thus, to apply gene editing technology, identifying these regulatory regions is an essential task where ML models have been applied successfully (Ji et al., 2021; Danilevicz et al., 2022; Levy et al., 2022). The most relevant modern ML techniques demonstrated to be useful for these tasks are generative adversarial networks (Wang et al., 2020; Zrimec et al., 2022) and convolutional neural networks (Kotopka and Smolke, 2020). ATAC-sequencing is a very useful data modality for informing on this set of tasks (Yan et al., 2020).
More recently, researchers have adopted transformers and architectures similar to large language models, such as BERT, to segment regulatory regions and predict expression levels (Ji et al., 2021). These models are typically trained on a large number of DNA sequences or similar kinds of data in a supervised fashion, employing a masking strategy to generate a labeled dataset when labels are not available (Ji et al., 2021; Danilevicz et al., 2022). This entails subdividing a DNA sequence into numerous smaller fragments, typically employing a k-mer based approach, where a k-mer is a subsequence of length k within the DNA. Selected k-mers are then masked within the sequence. The objective of the transformer-based model is to predict those masked parts of the sequence while modeling the interactions among different fragments (Ji et al., 2021; Akiyama and Sakakibara, 2022). By predicting the masked fragments, the model learns the underlying structure of the DNA sequences. The resulting output is a set of numerical vectors, often referred to as embeddings, one for each k-mer, which encapsulate the sequence information. These vector representations can be further used to fine-tune the model for various downstream tasks, such as segment identification, sequence alignment, and gene expression prediction, where limited labeled data is available (Ji et al., 2021; Danilevicz et al., 2022; Levy et al., 2022).
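The k-mer masking step described above can be sketched in a few lines. This is a minimal illustration only: the function names, the 15% masking rate, and the `[MASK]` token are assumptions chosen for the example, not details taken from the cited models. Because overlapping k-mers share bases with their neighbors, practical implementations often mask contiguous runs of k-mers rather than isolated ones, which this sketch does not attempt.

```python
import random

def kmer_tokens(sequence, k=3):
    """Split a DNA sequence into overlapping k-mers (stride 1)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Hide a random subset of k-mers.

    Returns the masked token list (model input) and a dict mapping each
    masked position to its original k-mer (the label the model must predict).
    """
    rng = random.Random(seed)
    n_mask = max(1, round(mask_rate * len(tokens)))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    labels = {}
    for pos in positions:
        labels[pos] = tokens[pos]
        masked[pos] = mask_token
    return masked, labels

# Hypothetical toy sequence; real training data would be genome-scale.
tokens = kmer_tokens("ATGCGTACGTTAGC", k=3)
masked, labels = mask_tokens(tokens)
print(masked)
print(labels)
```

The labels come from the sequence itself, which is how a labeled dataset is obtained without any manual annotation.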
When labeled data is available, convolutional neural networks and transformers can be combined to predict outcomes. This approach employs convolutional neural networks on the small DNA fragments to obtain an initial representation. Following this, the processed fragments are inputted into transformer layers. The transformer architecture finds how each fragment interacts with other fragments and summarizes these interactions in a vector that represents the fragment. The vector representations of all fragments are finally used to predict different genomic tracks (Avsec et al., 2021). Since the vector representations of the DNA fragments derived from the final transformer layer encapsulate information that is both comprehensive and adaptable across various genomic tracks, these vectors can be effectively utilized for predicting a range of genomic tasks beyond those initially targeted.
Since these approaches can be used for gene expression prediction, they are particularly useful for predicting the expression of an edited DNA sequence. Gene editing involves adding, removing, or substituting a segment of DNA to alter the expression of a specific gene. Identification of replacement DNA segments is a complex task that can be addressed by employing an ML algorithm that can estimate the gene expression. Alternatively, using RL, the discussed ML approaches can be trained to generate gene-edited DNA sequences that maximize the expression by following the same training architecture of Large Language Models, such as ChatGPT (Ouyang et al., 2022). Initially, one of these models can be fine-tuned to generate a gene-edited DNA sequence from an input DNA sequence (supervised fine-tuned model). Then, the supervised fine-tuned model can be optimized for predicting gene expression, which will work as the reward function of the RL agent (reward model). Finally, the supervised fine-tuned model can be further optimized by employing RL optimization techniques, such as Proximal Policy Optimization (Schulman et al., 2017), where outputs of the reward model are used as rewards for generated sequences.
4 Media design
Cell culture requires media to keep cells alive, promote growth, and direct differentiation. Culture media contain the nutrients and signals that cells receive in the body and are typically composed of carbohydrates, amino acids, vitamins, minerals, buffers, proteins, peptides, fatty acids, lipids, and growth factors. In addition, media components may also contribute to the flavor of the final product (O’Neill et al., 2021; Simsa et al., 2019). A detailed discussion of how cells utilize these components has been provided previously (O’Neill et al., 2021).
Most existing cell culture media formulations are expensive, not designed for the species of interest for CM production, and include animal-derived ingredients (O’Neill et al., 2021). Media design for CM requires optimizations to address these limitations, which are often complex and resource-intensive research efforts. ML is well suited to accelerate research in this space.
4.1 Culture media design can be treated as a hyperparameter optimization problem
The objective of media design is to identify conditions that are optimized for various parameters, such as price, growth rate, stability, flavor, environmental impacts, and lot-to-lot reproducibility, sometimes optimizing for several of these objectives simultaneously (O’Neill et al., 2021; Hubalek et al., 2022). In addition, media may be optimized for specific cell responses, such as differentiation or spontaneous immortalization (Messmer et al., 2022; Pasitka et al., 2023). In certain cases, media may need to be developed for uncharted parameter spaces; for instance, cells derived from lamb, duck, and deer that were not previously grown for biomedical purposes.
Most culture media contain dozens of ingredients; the classic medium Ham’s F-12, for example, contains 47 (Ham, 1965). Certain ingredients, such as fetal bovine serum or B27 (Brewer et al., 1993), are themselves complex mixtures of additional ingredients. In the absence of ML, single specific improvements of large effect can still sometimes be made to culture media, such as the replacement of animal albumin with plant albumin (Stout et al., 2022). Big gains in the stability of stem cell media have been made with recombinant peptides (Kuo et al., 2020). Other groups have optimized media, such as bovine myoblast media, using straightforward factorial approaches, and design-of-experiments (DOE) methods for these approaches are quite mature (Franceschini and Macchietto, 2008; Kolkmann et al., 2022). However, a medium with 49 ingredients, each with five relevant concentrations, has a design space of over 10^34 potential recipes, exceeding the capacity of traditional factorial methods (i.e., Taguchi methods; Freddi and Salmon, 2019) or high-throughput screening.
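The size of the design space quoted above follows from simple counting: 49 ingredients at five concentrations each gives 5^49, or roughly 1.8 × 10^34, full-factorial recipes. The short calculation below makes this explicit; the screening throughput is a purely hypothetical figure included only to convey scale.

```python
import math

n_ingredients = 49   # ingredients in the example medium
levels = 5           # candidate concentrations per ingredient

# Full-factorial design: every ingredient varies independently.
recipes = levels ** n_ingredients
exponent = math.floor(math.log10(recipes))
print(f"{recipes:.2e} recipes (order of magnitude 10^{exponent})")

# Hypothetical throughput, assumed here purely for illustration.
hypothetical_recipes_per_day = 1_000_000
years = recipes / (hypothetical_recipes_per_day * 365)
print(f"about {years:.1e} years to screen exhaustively at that rate")
```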
Some attempts have been made to apply ML to the culture media design problem as a means of exploring the experimental search space more effectively. A recent preprint (Hashizume et al., 2022) used gradient-boosted trees to identify culture media that improve the growth characteristics of suspension-phase HeLa cells. Another recent paper trained a neural network to model the results of fractional factorial culture media experiments (Nikkhah et al., 2023). Response surface methodology, a classic DOE method, has been used in conjunction with a genetic algorithm for optimizing media properties such as cost, growth rate, and global warming potential for CM production in both zebrafish fibroblast (Nikkhah et al., 2023) and mouse myocyte (Cosenza et al., 2021) culture media. However, these methods, on their own, are not optimal for dealing with the complexity of full culture media recipe design. In addition, both studies use well-characterized cell lines whose applicability to food production is limited, and further work is needed to translate these methods to more food-relevant species.
Designing culture media is fundamentally a high-dimensional search problem. Bayesian optimization can effectively navigate problems of this nature, with the advantages of requiring only a small number of experiments and dealing with uncertainty in a robust manner. Cosenza et al. have used Bayesian optimization to optimize myocyte culture media for CM applications, first demonstrating its increased efficiency over a standard DOE approach (Cosenza et al., 2022), and then using it to optimize serum-free growth media for CM (Cosenza et al., 2023). However, these studies were performed using the C2C12 mouse myoblast cell line as a model system, and further research is required to translate this work to species more relevant to CM production. Outside of CM research, Bayesian optimization has been used to optimize spirulina culture media (Gamble et al., 2021) and keratinocyte differentiation media (Kanda et al., 2022). Google Vizier, which is Google’s hyperparameter tuning service, is a key enabler of Bayesian optimization, as it provides biologists access to this algorithm via API, bypassing the need to code from scratch (Golovin et al., 2024). Additionally, the BoTorch library for Python offers a powerful free toolkit to implement Bayesian optimization (Balandat et al., 2020).
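To make the idea concrete, the following is a minimal, dependency-free sketch of a Bayesian optimization loop for a single ingredient concentration. It is not the method of the cited studies and does not use Vizier or BoTorch: it assumes a hand-written Gaussian process with a squared-exponential kernel and an upper-confidence-bound acquisition rule, and `growth_assay` is a hypothetical stand-in for a wet-lab measurement. A real media problem would involve many ingredients and noisy readouts.

```python
import math

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two scalar concentrations."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    alpha = solve(K, ys)       # K^-1 y
    v = solve(K, k_star)       # K^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = max(rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, v)), 0.0)
    return mean, math.sqrt(var)

def growth_assay(conc):
    """Hypothetical assay response with a single optimum at conc = 0.6."""
    return math.exp(-((conc - 0.6) ** 2) / 0.02)

def bayes_opt(n_rounds=8, kappa=2.0):
    xs = [0.1, 0.9]                        # two initial "experiments"
    ys = [growth_assay(x) for x in xs]
    grid = [i / 100 for i in range(101)]   # candidate concentrations in [0, 1]
    for _ in range(n_rounds):
        def ucb(x):
            mean, std = gp_posterior(xs, ys, x)
            return mean + kappa * std      # favor high mean or high uncertainty
        x_next = max((x for x in grid if x not in xs), key=ucb)
        xs.append(x_next)
        ys.append(growth_assay(x_next))    # "run" the chosen experiment
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best], len(xs)

best_x, best_y, n_experiments = bayes_opt()
print(best_x, best_y, n_experiments)
```

Each round fits the surrogate model to all results so far and then selects the next experiment where the predicted response or its uncertainty is high, which is why few experiments are needed compared with a factorial design.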
Even given an optimal algorithm for optimization, obtaining and analyzing relevant data from cultured cells remains a challenge in media design. One potential solution is through the use of high-content imaging, which can be enhanced through the implementation of ML methods, as discussed below in section 5. Another approach involves the application of gene network analysis, which may be the most effective method for gaining detailed insights into the cellular response to a given culture medium.
Bayesian optimization is the leading hyperparameter exploration technique, but it stands among others. Genetic algorithms, which cluster parameters in a way that mimics biological chromosomes to enable recombination (Katoch et al., 2021), have been used to optimize cyanobacteria media (Havel et al., 2006). Simulated annealing, which is based on a metallurgical concept (Delahaye et al., 2019), has been used in conventional animal agriculture to optimize poultry feed (Wijayaningrum et al., 2017) and to produce a population balance model for CHO cells (Wijayaningrum et al., 2017). However, Bayesian optimization appears to be the most widely used among these techniques, probably because it can work on non-differentiable objective functions without the difficulty of partitioning parameter space into somewhat arbitrary and finicky “chromosomes” (Qualities, 2020).
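For comparison, simulated annealing can be sketched just as briefly. The example below is illustrative only: the three-component recipe, the quadratic `cost` function, the linear cooling schedule, and the step size are all assumptions made for the sketch, not settings from the cited work.

```python
import math
import random

def cost(recipe):
    """Hypothetical objective: squared distance from an arbitrary target recipe."""
    target = [0.2, 0.5, 0.8]
    return sum((r - t) ** 2 for r, t in zip(recipe, target))

def anneal(start, steps=2000, t0=0.1, step_size=0.05, seed=0):
    """Minimize cost() over recipes whose components are bounded to [0, 1]."""
    rng = random.Random(seed)
    current, current_cost = list(start), cost(start)
    best, best_cost = list(current), current_cost
    for step in range(steps):
        # Linear cooling; the small offset avoids division by zero at the end.
        temp = t0 * (1 - step / steps) + 1e-9
        candidate = [min(1.0, max(0.0, c + rng.gauss(0, step_size)))
                     for c in current]
        cand_cost = cost(candidate)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature falls.
        if (cand_cost < current_cost
                or rng.random() < math.exp((current_cost - cand_cost) / temp)):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
    return best, best_cost

start = [0.5, 0.5, 0.5]
best, best_cost = anneal(start)
print(best, best_cost)
```

Unlike Bayesian optimization, this approach builds no model of the response surface, so it typically needs many more evaluations, which is costly when each evaluation is a cell culture experiment.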
4.2 Machine learning can interpret how cells respond to media conditions
The heart of media design is attempting to alter the properties of cells by altering their environment. Assessing the properties of cells can be challenging. RNA-seq can provide insight into the inner state of cells by quantifying expressed genes (Messmer et al., 2022), but that information is often not readily interpretable because it is high-dimensional and the effect of any single gene is often esoteric or context-dependent. ML-guided network analysis is one solution: it can assess changes to the inner state of cells when performing media design or other optimizations, such as culture vessel geometry or atmosphere composition. Both gene-based network analysis and metabolite-based network analysis can be used for this application.
Gene network analysis has not yet been directly used in CM media research, but it has been used in parallel applications. Network analysis has been used to improve mammalian cell culture, in particular to identify feed supplements for CHO cells (Schinn et al., 2021). Network analysis has also been used to find gene networks important for feed efficiency or milk production in cattle, with the networks formed from phenotype–genotype analysis (Suchocki et al., 2016; Taussat et al., 2020).
To get more details about a gene network, gene network inference (GNI) methods can deduce the regulatory interactions between genes. The potential value of GNI to CM media design is that GNI can aid and simplify the interpretation of an RNA-seq data matrix; instead of looking at the differential expression of tens of thousands of genome features, the analyst can look at the differential activity of dozens of network modules. Among recent GNI methods are dynGENIE3 (random forests) (Huynh-Thu and Geurts, 2018), Scribe (Qiu et al., 2020), and TENET (Kim et al., 2021), all of which are benchmarks for the most recent papers (Liu et al., 2022). GNI has been used to optimize output in traditional agriculture, both soybean cultivation (Hale et al., 2022) and maize cultivation (Huang et al., 2018), and GNI could be used for analogous output maximization problems in CM.
In biotechnology, metabolic network analysis has, historically, most commonly been done with flux balance analysis to predict the inputs and outputs of metabolites in cells (Orth et al., 2010). On its own, flux balance analysis has been used for CM-relevant problems like modeling CHO metabolic states (Hagrot et al., 2017), evaluating metabolic hypotheses (Martínez-Monge et al., 2019), and developing genome-scale metabolic models (Széliová et al., 2020). However, flux balance analysis has also been combined with ML in hybrid systems for purposes such as correlating real-world data (Wu et al., 2024) or determining the most relevant model features (Vijayakumar et al., 2020). Given the use of flux balance analysis in CM and the demonstrated ability of ML to enhance it, this may be a useful application of ML to CM going forward.
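For readers unfamiliar with the method, flux balance analysis is conventionally posed as a linear program. In the standard formulation below, S is the stoichiometric matrix (metabolites by reactions), v is the vector of reaction fluxes, c selects the objective (for example, a biomass reaction), and the bounds encode nutrient uptake limits and reaction reversibility. The symbol names are the usual textbook choices rather than notation from this review.

```latex
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S\,v = 0, \\
& v_{\min} \le v \le v_{\max}
\end{aligned}
```

The constraint S v = 0 expresses the steady-state assumption that no internal metabolite accumulates. Media composition enters through the bounds on the uptake fluxes, which is what makes the framework relevant to media design.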
4.3 Optimizing and engineering proteins is much easier with machine learning
Another opportunity for optimization is in the ingredients themselves. Growth factors are currently estimated to contribute over 95% of the total production cost (Specht, 2020). Altering proteins for use in cell culture media is a promising means to optimize the stability, price, or other desirable properties of these ingredients.
Recombinant production, a commonly used animal-free strategy to produce proteins, is a major contributor to media costs (Venkatesan et al., 2022). Most proteins present in culture medium are highly specialized eukaryotic proteins, requiring post-translational modifications such as phosphorylation or glycosylation that require mammalian cell culture systems, rather than the far cheaper and more scalable bacterial production systems. Recent work developed an E. coli expression system to reduce the price of several relevant growth factors (Venkatesan et al., 2022). Similar efforts are necessary to reduce the cost of other media ingredients.
Probably the biggest challenges when producing proteins in a different host species are differential folding between species (e.g., different chaperones, different organism temperatures) and differential post-translational modification. There have been many efforts to use ML to predict post-translational glycosylation, with earlier efforts using random forests (Hamby and Hirst, 2008) and more recent efforts using multi-layer perceptrons (Pakhrin et al., 2021) or ensemble models (Ching-Hsuan et al., 2020). As for differential folding, transformer-based models have been used to predict thermal protein stability (Jung et al., 2023). A combination of transformers and GNNs has been used to predict protein subcellular localization (Dubourg-Felonneau et al., 2022). More recently, AlphaFold has been used to predict protein stability in a general sense (Pak et al., 2023), and ColabFold was developed to improve upon it (Mirdita et al., 2022).
Protein ingredients can also be engineered to optimize their stability in culture, reducing the total amount of protein needed for CM production (Goldenzweig and Fleishman, 2018). For example, FGF2 thermal stability was increased using point mutations in the protein (Dvorak et al., 2018). Similarly, Long R3 IGF-1 is a modified recombinant form of IGF that resists inactivation by IGF binding factors, leading to 200-fold potency and 3-fold stability relative to standard insulin (Voorhamme and Yandell, 2006). Similar strategies could be applied to other expensive media ingredients to reduce the necessary concentrations or additions. RFdiffusion (based on a generative diffusion model) (Watson et al., 2022) and ProteinMPNN (Dauparas et al., 2022) are tools designed to solve this problem. ML is also used in directed evolution experiments that iterate upon a given protein’s design (Hie et al., 2024; Saito et al., 2021).
For some proteins used in high concentrations, such as albumin, transferrin, and insulin, replacement with lower-cost alternatives, such as plant (Humbird, 2021; Okamoto et al., 2022) or microbial (Tuomisto and Teixeira de Mattos, 2011; Jeong et al., 2021) hydrolysates may be beneficial. For example, a protein similar to bovine insulin was found in cowpea (Vigna unguiculata) and could be isolated for use as an insulin replacement (Mj et al., 2015; Venâncio et al., 2003). Screening plants or microbes for sequence homology to proteins of interest could accelerate this work. The traditional, non-ML tool for homology searches is protein BLAST (Altschul et al., 1990), but BLAST is underpowered because proteins can have similar structures without having similar sequences. Services like Dali provide protein structure alignment but are constrained by the paucity of known protein structures and the computational complexity of three-dimensional comparisons (Holm and Laakso, 2016). ML has changed how to approach this problem with FoldSeek (van Kempen et al., 2023), which uses autoencoder-based compression to simplify three-dimensional protein structural comparisons, using AlphaFold to infer protein structures when no empirical data is available (Jumper et al., 2021).
5 Cell culture microscopy and image analysis
Microscopy is one of the foundational techniques of cell culture, providing information such as (i) the health of cells (e.g., whether they are mitotic, senescent, or apoptotic); (ii) the behavior of cells (e.g., whether they are invasive, contractile, or secretory); and (iii) the lineage of cells (e.g., whether they are stem cells, progenitor cells, or terminally differentiated cells). Many types of cell analysis rely on microscopy. For example, the fusion index, the percentage of nuclei located inside myotubes as identified by immunostaining, is the predominant test used to quantify myogenic differentiation (Ben-Arye and Levenberg, 2019). Inexpensive colorimetric staining has also been shown to provide an alternative measure of myotube differentiation (Veliça and Bunce, 2011). Additionally, brightfield imaging, without the use of dyes, has been used to measure cell contractility, which is a measure of muscle cell maturity (Furuhashi et al., 2021; Ribeiro et al., 2017).
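As a small worked example, the fusion index can be computed directly from nuclei counts once an upstream segmentation step has decided which nuclei lie inside myotubes. The counts below are hypothetical, and pooling counts across imaging fields before dividing is one reasonable convention among several. Some protocols also require a minimum number of nuclei per myotube, which would be applied before these counts are produced.

```python
def fusion_index(nuclei_in_myotubes, total_nuclei):
    """Percentage of nuclei that lie inside myotubes."""
    if total_nuclei <= 0:
        raise ValueError("total_nuclei must be positive")
    if not 0 <= nuclei_in_myotubes <= total_nuclei:
        raise ValueError("nuclei_in_myotubes must be between 0 and total_nuclei")
    return 100.0 * nuclei_in_myotubes / total_nuclei

def pooled_fusion_index(fields):
    """Pool (nuclei_in_myotubes, total_nuclei) counts across imaging fields."""
    inside = sum(f[0] for f in fields)
    total = sum(f[1] for f in fields)
    return fusion_index(inside, total)

# Hypothetical counts from three imaging fields of one well.
fields = [(42, 120), (30, 100), (55, 130)]
pooled = pooled_fusion_index(fields)
print(f"Fusion index: {pooled:.1f}%")
```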
The relevance of microscopy during large-scale production will be especially dependent on its ability to serve as a low-cost, high-throughput tool. However, microscopy has been historically constrained by the complexity of its analysis. Microscopy analysis is routinely done manually by researchers with well-trained eyes, and doing it automatically requires systems that can incorporate many nuanced features of the image data. Unfortunately, appropriate systems tend to be nascent, poor, or non-existent in biological research. Furthermore, the use of dyes to improve image quality is undesirable due to the cost and time constraints of CM production. For CM production, cell segmentation and classification are fundamental and indispensable: they are relevant to quality control, monitoring of cell culture health, and optimizing production. Utilizing ML approaches for automated cell segmentation and classification can reduce the time, expense, and errors involved in setting up and performing manual analysis of image data.
5.1 Automatic image analysis for cell segmentation using machine learning
Cell segmentation is the process of identifying and separating individual cells within an image. This is done in digital microscopy and histopathological imaging to study cell structure and function. The goal of cell segmentation is to pick out cells in an image, and segmentation is necessary to measure cell size, shape, and number. Segmentation also enables the tracking of individual cells over time, allowing researchers to assess changes in cell behavior and morphology, which can help in optimizing conditions for CM production. Furthermore, cell segmentation can also help in identifying contaminating debris or inappropriate cells from the culture, ensuring the quality and safety of the final product.
Recently, ML has been used to improve cell segmentation for various applications (Al-Kofahi et al., 2018; Durkee et al., 2021; Kumar et al., 2020; Pachitariu and Stringer, 2022). Manual identification of individual cells has poor reliability between and within human evaluators. Additionally, automating cell imaging tasks can free up researchers’ time for higher value tasks and reduce errors from fatigue and subjectivity (Verma et al., 2021). However, there are several challenges that hinder the effectiveness of ML-based cell segmentation. First, cells are highly variable in size, shape, and morphology. Second, segmentation may fail on images that are low-contrast, have uneven illumination, or are out-of-focus. Third, crowded and overlapping cells can make it difficult to distinguish individual cells using cell-segmentation algorithms. Fourth, models can overfit to imaging artifacts in the training data, such as non-uniform illumination and background noise (Zinchuk and Grossenbacher-Zinchuk, 2023). These challenges are not unique to CM but must be taken into account prior to applying ML for cell segmentation in any field.
In order to overcome the above challenges and to develop robust algorithms for detecting and segmenting cells, the computer vision community requires access to large, diverse, and well-curated datasets with comprehensive annotations. While some public datasets for nuclei and cell segmentation have been released in the past (Kumar et al., 2020; Verma et al., 2021; Caicedo et al., 2019; Greenwald et al., 2022; Kaimal et al., 2021; Naylor et al., 2019), additional datasets specific to CM are necessary to develop such robust algorithms for accurate cell segmentation in microscopic images.
A classic method for cell segmentation is the watershed algorithm combined with optimal thresholding, which has been used in applications such as segmenting lymphocytes into nuclear and cytoplasmic regions (Mohammed et al., 2013). There has been growing interest in deep neural network architectures inspired by fully convolutional networks for cell segmentation (Kumar et al., 2020; Naylor et al., 2019; Gómez-de-Mariscal et al., 2021; Long and Shelhamer, 2015; Wienert et al., 2012; Yu et al., 2016). These architectures employ encoder-decoder blocks to transfer features from multiple scales and levels for efficient cell segmentation on histopathology and microscopy images. The U-Net model, a variant of the fully convolutional network architecture, has shown particular promise for this task (Falk et al., 2019; Ronneberger et al., 2015). Unlike fully convolutional networks, U-Net incorporates skip connections that facilitate precise semantic segmentation by combining features from different resolutions, enhancing the model’s capability to capture intricate details. Similarly, U-Net++ employs advanced encoder-decoder structures and loss functions to enhance performance for cell segmentation (Zhou et al., 2018). Such deep learning architectures have outperformed pathologists at cell segmentation in various applications (Franklin et al., 2021; Hekler et al., 2019).
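The classical, non-learned pipeline can be illustrated with a toy example. The sketch below assumes Otsu's method as the thresholding rule and substitutes simple 4-connected component labeling for the watershed step, so it cannot split touching cells the way a watershed transform can. The tiny 8-bit image is hypothetical.

```python
def otsu_threshold(image, levels=256):
    """Grey level that maximizes between-class variance (Otsu's method)."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]              # pixels at or below t (background class)
        sum0 += t * hist[t]
        w1 = total - w0            # pixels above t (foreground class)
        if w0 == 0 or w1 == 0:
            continue
        mean0, mean1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t

def label_components(mask):
    """Label 4-connected foreground regions; returns (label image, count)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                stack = [(r, c)]
                while stack:       # flood fill from the seed pixel
                    y, x = stack.pop()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            stack.append((ny, nx))
    return labels, count

# Hypothetical 8-bit image: two bright "cells" on a dark background.
image = [
    [10, 12, 10, 10, 12, 10],
    [10, 200, 210, 10, 10, 10],
    [12, 205, 200, 10, 198, 12],
    [10, 10, 10, 10, 202, 10],
    [10, 12, 10, 10, 10, 10],
]
threshold = otsu_threshold(image)
mask = [[p > threshold for p in row] for row in image]
labels, n_cells = label_components(mask)
print(threshold, n_cells)
```

Pipelines of this kind are fast and need no training data, but they depend on clean intensity contrast, which is one reason learned models such as U-Net have largely displaced them for crowded or low-contrast images.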
5.2 Automatic image analysis for cell classification using machine learning
Cell classification is another important task in image analysis. While cell segmentation is the process of separating individual cells from the background and from each other in an image, cell classification is the process of assigning labels to the segmented cells based on their morphology, phenotype, or function (Dursun et al., 2023; Huynh et al., 2021). The primary goal of cell classification is to assign each segmented cell to a specific category or label, such as cell type, state, or condition.
Manual analysis of thousands of microscopy images for cell classification is a tedious and error-prone task. Thus, there is a need for ML algorithms for automatic cell classification. Recently, ML has been used for the classification of cells within microscopy images, including the identification of anemia or blood disorders based on the shape, size, and optical properties of red blood cells (Belashov et al., 2021), and for the analysis of cellular microenvironments to offer novel insights into biological mechanisms (Winfree, 2022). However, the effectiveness of ML-based cell classification for microscopy faces several challenges. One challenge arises from crowded or overlapping cells, which make it difficult to distinguish individual cells. Differences in cell maturity and variations in cell shape resulting from diverse cultivation and treatment methods can introduce complexity into the analysis, especially when distinguishing between different types of cultured cells. Complex textures, patterns, and shapes in multi-modal microscopy further complicate cell classification, requiring the integration of multi-stream models that consider low-level cues such as edges and gradients (Lou et al., 2023). The intricate cell structures found in tissue images introduce additional complexities such as heterogeneous cell populations and staining variations. These challenges necessitate the development of precise and efficient algorithms for cell classification.
In the domain of microscopy image analysis, many innovative architectures and methods have emerged for classifying cells. Among these, CNNs stand as a robust choice, with models like U-Net (Ronneberger et al., 2015) and Mask R-CNN (He et al., 2015) excelling in cell classification tasks. RNNs, particularly long short-term memory and gated recurrent units, demonstrate their prowess when handling sequential data, making them valuable for tracking cell dynamics (Ghojogh and Ghodsi, 2023). For tasks involving complex relationships between cells, GNNs, such as graph convolutional networks, prove invaluable (Chen et al., 2022). Traditional approaches like random forests and decision trees, which are based on hand-picked image features, are still relevant for classification because they are computationally fast, easy to train, and resist overfitting (Gurcan et al., 2009; Kumar et al., 2022). Transfer learning techniques harness pre-trained deep learning models like ResNet (He et al., 2015), while attention mechanisms and ensemble methods contribute to improved accuracy (Marzahl et al., 2019). With the ever-evolving landscape of microscopy, these versatile architectures continue to play pivotal roles in cell classification.
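As a minimal sketch of the feature-based route described above, the following pure-Python example computes two hand-picked shape features (circularity and elongation) from binary cell masks and assigns a label with a nearest-centroid rule. The rectangular toy masks, the labels "round" and "elongated", the helper `rect_mask`, and the nearest-centroid classifier are all illustrative assumptions; the classifier is a deliberately simple stand-in for the random forests cited in the text and is not taken from any of the cited studies.

```python
import math

def shape_features(mask):
    """Return (circularity, elongation) for one non-empty object in a binary mask (list of rows)."""
    rows, cols = len(mask), len(mask[0])
    area = perimeter = 0
    r_idx, c_idx = [], []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            area += 1
            r_idx.append(r)
            c_idx.append(c)
            # Each pixel edge facing background (or the image border) adds to the perimeter.
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if not (0 <= rr < rows and 0 <= cc < cols) or not mask[rr][cc]:
                    perimeter += 1
    height = max(r_idx) - min(r_idx) + 1
    width = max(c_idx) - min(c_idx) + 1
    circularity = 4 * math.pi * area / perimeter ** 2
    elongation = max(height, width) / min(height, width)
    return circularity, elongation

def rect_mask(h, w):
    """Hypothetical helper: an h-by-w foreground block with a one-pixel background border."""
    return [[1 if 0 < r <= h and 0 < c <= w else 0 for c in range(w + 2)]
            for r in range(h + 2)]

def nearest_centroid(training, query):
    """Assign the label whose mean feature vector is closest to the query."""
    best_label, best_dist = None, float("inf")
    for label, feats in training.items():
        centroid = [sum(col) / len(feats) for col in zip(*feats)]
        dist = sum((q - m) ** 2 for q, m in zip(query, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Toy training set built from synthetic masks (assumption, not real microscopy data).
training = {
    "round": [shape_features(rect_mask(3, 3)), shape_features(rect_mask(4, 4))],
    "elongated": [shape_features(rect_mask(1, 5)), shape_features(rect_mask(2, 6))],
}
print(nearest_centroid(training, shape_features(rect_mask(5, 5))))
print(nearest_centroid(training, shape_features(rect_mask(1, 7))))
```

In practice the masks would come from a segmentation step, and the feature vectors would typically be passed to a trained classifier such as a random forest rather than to a centroid rule.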
Following cell classification, cell phenotype analysis is pivotal in exploring cellular characteristics, encompassing physical and biochemical attributes such as size, shape, function, viability, proliferation, signaling, and morphological structure. This approach is particularly valuable in culture media optimization (Zhou et al., 2023). By scrutinizing the physical attributes and functional behavior of cultured cells, researchers can ensure the consistency and quality of cell populations, evaluate metabolic activity, and assess cellular functionality within the culture. Moreover, the examination of cell morphology offers insights into cellular health and enables fine-tuning of culture media formulations (Zhou et al., 2023; Grzesik and Warth, 2021). ML has been used to combine different forms of microscopy using transfer learning - such as between fluorescence and dye-free microscopy (Jang et al., 2021) or between traditional microscopy and mass spectrometry imaging (Race et al., 2021). Thus, in the burgeoning field of CM, phenotype analysis contributes to the refinement of CM production, guiding the selection of cell strains and culture conditions to achieve better quality and desired growth attributes. Overall, cell phenotype analysis plays a pivotal role in enhancing cell culture processes and product quality, making it a valuable tool for CM production.
6 Bioprocess and food processing optimization
Moving CM from lab bench scale to commercial scale requires efficient bioprocess design. This centers on the use of large bioreactors to provide a controlled environment for cell growth and differentiation that maximizes biomass yield and minimizes by-product formation. A variety of bioreactor types have been proposed for use in CM, which are reviewed elsewhere (Allan et al., 2019). One study estimated that producing 1 kg of protein from muscle cells would require stirred tank bioreactors on the order of 5,000 L (Stephens et al., 2018). This dwarfs research-scale mammalian cell culture and will require extensive optimization. In addition, after harvesting cells from the bioreactor, CM products will likely require food processing steps to create a final product. ML is well suited to increase the scale and efficiency of CM bioprocessing and food processing in a variety of ways.
6.1 Bioreactor homeostasis can be maintained with machine learning
As the culture scale is increased, automation of closed systems will become important to increase efficiency and reduce contamination or other failure events (Specht et al., 2018). Real-time quality assurance and process monitoring will allow for adjusting culture conditions to optimize yield, monitoring for potential contamination, and reducing human errors. Automation in processes such as media recycling will further optimize the process and bring costs down.
Historically, mammalian bioreactor operation has been governed by systems such as proportional-integral-derivative controllers (Synoground et al., 2021) or model predictive control (Sarna et al., 2023). These systems, although mainstays of control theory, are designed to work in deterministic environments that can be well-described by linear differential equations. Although these systems can struggle with the vagaries and unpredictability of biological systems, they have been successfully employed in bioreactors to maximize antibody production (Kiparissides et al., 2015), constrain overflow metabolism (Bogaerts et al., 2017), and maintain glucose homeostasis (Craven et al., 2014). This type of modeling has ceded ground to ML-based modeling in recent years, probably because these models are relatively fragile compared to ML: they depend on top-down mathematical models that must accurately describe reality.
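To make the classical approach concrete, the following is a minimal sketch of a discrete proportional-integral-derivative loop holding glucose at a setpoint. The first-order mass balance, the constant uptake rate, the gains, and the pump limit are all illustrative assumptions chosen for the example, not values from the cited studies.

```python
def simulate_pid(setpoint=2.0, g0=1.0, kp=2.0, ki=1.0, kd=0.1,
                 consumption=0.5, dt=0.1, steps=200, max_feed=2.0):
    """Hold a toy glucose concentration (g/L) at a setpoint with a PID-controlled feed."""
    glucose, integral = g0, 0.0
    prev_error = setpoint - g0          # avoids a derivative spike on the first step
    history = []
    for _ in range(steps):
        error = setpoint - glucose
        integral += error * dt
        derivative = (error - prev_error) / dt
        feed = kp * error + ki * integral + kd * derivative
        # A pump cannot run backwards or beyond its maximum rate.
        feed = min(max_feed, max(0.0, feed))
        # Toy mass balance (assumption): feed raises the concentration,
        # constant cell uptake lowers it.
        glucose += (feed - consumption) * dt
        prev_error = error
        history.append(glucose)
    return history

trajectory = simulate_pid()
print(f"final glucose: {trajectory[-1]:.3f} g/L")
```

The controller performs well here precisely because the toy process is linear and deterministic; the fragility discussed above appears when the real uptake rate drifts or depends nonlinearly on cell state.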
The most straightforward application of ML to bioreactors is to use ML-based models to control the bioreactor’s inputs. A wide gamut of ML models has been used to monitor and control bioreactors and industrial bioprocesses, especially supervised learning models like neural networks (Zavala-Ortiz et al., 2022), random forests (Vaitkus et al., 2020), and gradient boosting (Zhang et al., 2021). A step beyond using ML to fit process models is to use ML to optimize the control policies themselves, which is the domain of reinforcement learning, with examples in bioreactors ranging across deep Q-networks (Oh et al., 2022), policy gradients (Petsagkourakis et al., 2020), and probabilistic Bayesian optimization (Luna and Martínez, 2014). These policy-based learning methods do not necessarily require a prior understanding of the biochemistry of the bioreactor.
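The idea of learning a control policy without a biochemical model can be illustrated with tabular Q-learning, a much simpler relative of the deep Q-networks cited above. In this sketch the environment is a hypothetical five-level nutrient scale with three feed actions; the dynamics, reward, and hyperparameters are assumptions made for illustration only.

```python
import random

N_LEVELS, TARGET = 5, 2      # hypothetical discretized nutrient levels 0..4, target level 2
ACTIONS = (0, 1, 2)          # 0: feed off, 1: maintenance feed, 2: high feed

def step(level, action):
    """Toy dynamics (assumption): the level moves by (action - 1), clamped to the valid range."""
    nxt = min(N_LEVELS - 1, max(0, level + action - 1))
    reward = -abs(nxt - TARGET)      # penalize distance from the target level
    return nxt, reward

def train(episodes=2000, steps=20, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_LEVELS)]
    for _ in range(episodes):
        level = rng.randrange(N_LEVELS)
        for _ in range(steps):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[level][a])
            nxt, reward = step(level, action)
            # Q-learning update: bootstrap from the best value of the next state.
            q[level][action] += alpha * (reward + gamma * max(q[nxt]) - q[level][action])
            level = nxt
    return q

q_table = train()
# Greedy policy: the best action for each nutrient level.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_LEVELS)]
print(policy)
```

The agent only ever sees levels, actions, and rewards, so the learned policy (feed more when low, hold at the target, feed less when high) emerges without any explicit model of the underlying process.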
6.2 The unique challenges of structured products can be addressed by machine learning
As opposed to ground meat, structured products (e.g., a steak or fish filet) require organized three-dimensional tissues, along with bioreactor systems capable of supporting them, rather than cells grown in suspension: the difference between growing cells as a soup and growing cells as a steak. This type of engineered tissue has not been demonstrated at any scale beyond a tissue for a single patient, making this a major whitespace in the field (Specht et al., 2018).
Imposing organization upon cells is a classic, challenging problem from the field of tissue engineering, and solutions often involve ML. Random forests (Conev et al., 2020) and neural networks (Bone et al., 2020) have been used to predict optimal parameters for the extrusion printing of hydrogel scaffolds. Gradient boosting has been used to predict the self-assembly of dipeptide-based hydrogel scaffolds (Li et al., 2019). In theory, ML could be used to design the structures of tissue scaffolds, but this avenue appears to be unexplored at this time. Further review of the application of ML for bioprinting has been completed previously (Ng and Tan, 2024).
Tissue self-organization is a powerful principle for structured meat production. Tissue scaffolds or extrusion printing are currently used to create top-down structured tissues because of a lack of understanding of bottom-up self-organization of tissue in vitro. However, breakthroughs in bottom-up tissue self-organization would accelerate the scale-up of CM, eliminating the costs of scaffolds. In the context of CM, relevant types of self-organization are the alignment of fibers (especially myofibrils and collagen cables), the production of functional vasculature networks, maintaining ratios of meat-relevant cell types (myocytes, adipocytes, fibroblasts), and the structural determinants of texture and mouthfeel (Nishimura, 2010). Attempts have been made to model tissue self-organization using differential adhesion (Cerchiari et al., 2015), cellular Potts models (Libby et al., 2019), and agent-based modeling (Wang et al., 2020). Accurate models of tissue self-organization may provide design principles for making tissues with defined structure in CM.
A family of methods that may help with structured tissue construction in CM is spatial transcriptomics, which correlates gene expression in situ to physical coordinates within a tissue section (Tian et al., 2023). Spatial transcriptomics has been used for understanding specific aspects of tissue structure and cell organization that rely on spatial context, such as extracellular forces and gradients of signaling molecules (Heumos et al., 2023). GNNs have been successfully used with spatial transcriptomics data to model cellular communication (Fischer et al., 2023; Hu et al., 2021; Tanevski et al., 2022) and the deep learning model Tangram has also been used to resolve cell types and decrease imputation error from spatial data (Biancalani et al., 2021). Although spatial transcriptomics has not yet been used in the context of CM, it could prove useful in dissecting the complex tissue architectures involved in structured products.
Structured products will also require bioreactor systems capable of perfusion and harvesting of large intact tissues. The nature of a structured product bioreactor bears similarity to the fluidized and/or packed bed bioreactors that are used industrially in the wastewater and mineral extraction industries. Numerical fluid dynamical models are classically used to predict the behavior of these systems, but ML has, in recent years, been used for the more unpredictable aspects of these systems (Koerich et al., 2018; Ouyang et al., 2018). In particular, gradient boosting has been used both to predict bed expansion of fluidized bed bioreactors (Peng et al., 2022) and mass transfer in packed bed bioreactors (Guo et al., 2023). Similar methods may be useful for the particular challenge of CM structured product bioreactors.
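As a rough illustration of how gradient boosting fits such a relationship, the sketch below implements boosting for squared-error regression from scratch, using one-split decision stumps as weak learners. The data are synthetic: the "velocity" and "bed expansion" values follow an arbitrary smooth curve invented for this example and do not come from the cited studies, and the number of rounds and learning rate are likewise assumptions.

```python
def fit_stump(xs, residuals):
    """Find the threshold on x whose two-sided means best fit the residuals."""
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        thr = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, thr, lmean, rmean)
    return best[1:]

def fit_gbm(xs, ys, n_rounds=400, lr=0.3):
    """Gradient boosting: each stump is fit to the current residuals and added with shrinkage."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        thr, lmean, rmean = fit_stump(xs, residuals)
        stumps.append((thr, lmean, rmean))
        preds = [p + lr * (lmean if x <= thr else rmean) for x, p in zip(xs, preds)]
    return base, stumps, lr

def predict(model, x):
    base, stumps, lr = model
    return base + sum(lr * (l if x <= t else r) for t, l, r in stumps)

# Synthetic placeholder data (assumption): expansion grows smoothly with velocity.
velocities = [0.5 * i for i in range(1, 11)]
expansions = [1.0 + 0.3 * v ** 1.5 for v in velocities]

model = fit_gbm(velocities, expansions)
mean_y = sum(expansions) / len(expansions)
mse_baseline = sum((y - mean_y) ** 2 for y in expansions) / len(expansions)
mse_model = sum((y - predict(model, v)) ** 2
                for v, y in zip(velocities, expansions)) / len(expansions)
print(f"baseline MSE {mse_baseline:.4f}, boosted MSE {mse_model:.4f}")
```

This only measures training error on ten points; a real application would use measured operating data, several input variables, and a held-out set to check that the model generalizes.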
6.3 Real-time sensory prediction and control could be applied with reinforcement learning
Unlike tissue engineering for medical treatments, the sensory properties of CM, such as flavor and texture, are critical to its commercial success. These may be generated during the cell culture from flavor or texture components of cells, media, or scaffolds, or after harvest using food processing or additives. ML is playing an important role in enhancing the analysis of the flavors and textures of other food products through the analysis of diverse data types and these techniques are likely to play a role in CM development as well.
During product development, flavor can be measured in a variety of ways. Earlier ML models used data from gas chromatography–mass spectrometry, which is an analytical chemistry method used to separate and fingerprint substances from complex mixtures (Bi et al., 2020; Zhu et al., 2021). Later, researchers focused on electronic noses that mimic the olfactory capability of humans through different sensors. Since an electronic nose can be used to collect real-time sensor data during production, ML can also be applied in real-time for quality and flavor control (Gonzalez Viejo et al., 2021; Gonzalez Viejo et al., 2020; Tian et al., 2020). However, it should be taken into consideration that many compounds important for the aroma of meat are generated during cooking (Khan et al., 2015). The latest research efforts concentrate on utilizing the molecular structure and physicochemical attributes of flavor compounds to predict flavors, including taste or smell (Bouysset et al., 2020; Wiltschko, 2019; Lee et al., 2022; Tuwani et al., 2019; Wang et al., 2021). These characteristics are quantified into molecular descriptors, numerical representations that encapsulate the properties of the molecules involved. These descriptors then serve as inputs for ML models, which are trained to predict flavor profiles and odor characteristics with greater objectivity.
Since most of these data are tabular, a wide range of traditional ML approaches, such as support vector machines, random forests, k-nearest neighbor, and AdaBoost Tree, have been employed (Lee et al., 2022; Wang et al., 2021; Ji et al., 2023). Additionally, deep learning approaches, such as CNNs (Bi et al., 2020) and multilayer perceptrons (Zhu et al., 2021; Gonzalez Viejo et al., 2021; Tian et al., 2020), and unsupervised learning approaches, such as cluster analysis by using principal component analysis (Gonzalez Viejo et al., 2021; Tian et al., 2020), have also been applied to identify or predict flavors. These methodologies have collectively demonstrated that ML can significantly contribute to the enhancement of the sensory properties of a range of food products.
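The tabular workflow described above can be sketched with a small k-nearest neighbor classifier over standardized descriptors. The three descriptor columns and every number in the table are arbitrary placeholders invented for illustration, not measured molecular descriptors, and the "sweet" and "bitter" labels are attached to them only to show the mechanics.

```python
from collections import Counter
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each descriptor column; also return a function that scales new rows the same way."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]      # guard against constant columns

    def scale(row):
        return tuple((v - m) / s for v, m, s in zip(row, mus, sds))

    return [scale(r) for r in rows], scale

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training rows closest to the query (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Placeholder descriptor table (assumption): columns stand in for
# molecular weight, a lipophilicity score, and a hydrogen-bond donor count.
descriptors = [
    (180.0, -3.0, 5), (340.0, -3.5, 8), (150.0, -2.0, 4),
    (320.0, 3.0, 1), (190.0, 0.5, 0), (300.0, 2.5, 1),
]
labels = ["sweet", "sweet", "sweet", "bitter", "bitter", "bitter"]

train_x, scale = standardize(descriptors)
prediction = knn_predict(train_x, labels, scale((200.0, -2.5, 5)))
print(prediction)
```

Standardizing first matters because the columns have very different ranges; without it the largest-valued descriptor would dominate the distance.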
AI approaches, particularly RL, offer significant potential for enhancing efficiency in food processing operations (Petsagkourakis et al., 2020; Aljaafreh, 2017; Bi et al., 2020). Food-processing facilities are typically equipped with a variety of sensors, including those for temperature, pressure, moisture, and pH levels. These sensors play a crucial role in ensuring precise ingredient measurements, which are fundamental for achieving the desired taste and aroma profiles of food products. However, a challenge arises when new products are developed, as determining the optimal mixture of ingredients often involves extensive trial and error. An RL agent deployed in this context can adjust process settings in response to the various sensor readings, thereby helping to ensure that the food consistently meets the specific standards and requirements set by the food processor. This approach not only streamlines the development process but also enhances the precision and quality of the final product. RL can also be valuable for sensory prediction, in particular the texture prediction of finished CM products (Kircali Ata et al., 2023).
7 Discussion
In the last decade, the field of CM has made strides toward lower-cost and more efficient production processes but must progress significantly further to effectively rival traditional meat. ML offers great promise in improving every stage of CM production, from cell line development to the final product’s sensory characteristics. With the current surge in both public and private research and development for both the ML and CM fields, and the already successful integration of ML into numerous life sciences fields, the integration of ML into CM research is timely. Despite the numerous opportunities, there are only a handful of peer-reviewed, publicly accessible studies that describe the use of ML in CM production (Table 1), and an equally small group of researchers versed in both ML and CM. This review aims to bridge the gap between the ML and CM fields and create a starting point for scientists to better understand how to apply ML to CM research. To our knowledge, this is the first review to provide a comprehensive overview of the applications of ML to CM, covering the topics of cells, media, microscopy, bioprocess, and final product properties.
Since ML has been successfully employed in many other bioinformatics sectors, many of the existing methods can be adapted to CM. However, the key limitation is the availability of sufficient data. Creating ML models requires large training and validation datasets, and ML models are only as robust and reliable as the amount and quality of data used to develop them (Priestley et al., 2023). The scarcity of public data in CM complicates the development of models or even the assessment of potential model types. There are some publicly available dataset repositories, such as the Gene Expression Omnibus (GEO) for RNA-seq data,1 GenBank for sequenced genome data,2 and UniProt for protein sequences.3 However, compared to species used in medical studies, few CM-relevant datasets are reported and many lack adequate descriptions, data annotations, or samples and replicates to be considered for statistical analysis. As an example, a query for “stem cell” in the GEO DataSets generates 71,327 datasets for Homo sapiens (human) and only 61 for Bos taurus (cattle) (as of February 1, 2024). Efforts to generate properly annotated data and incentives for the sharing of data from ongoing experiments would greatly accelerate the application of ML to CM research. The Cultivated Meat Modeling Consortium4 offers a model for community-generated and shared data, while protecting intellectual property, to accelerate computational models.
A survey of datasets with potential relevance to CM research is included in Supplementary Table S1. The dataset survey encompasses a compilation of open-access biological datasets derived from CM-relevant species such as fish, crustacean, mollusks, cow, pig, and chicken. These curated datasets span a diverse array of sources including sequencing (RNA, ATAC, ChIP, single cell, and genome), mass spectrometry (proteomics, lipidomics, and metabolomics), and microarray experiments.
Transfer learning might be able to partially make up for the scarcity of CM-relevant data, by using models from data-rich biomedical species, such as humans and mice, to inform models for data-poor CM-relevant species. Cross-species graph-based transfer learning has previously been applied in a non-CM context on RNA-seq data for cell-type identification (Liu et al., 2023; Wang et al., 2024). Challenges to this approach include biological heterogeneity from differing sets of genes or differing functions for genes across species, which could potentially be mitigated through techniques such as universal cell embeddings (Park et al., 2024; Rosen et al., 2024). As a starting point, researchers may look into fine-tuning large scale models that have been successful in achieving improved performance on biological tasks relevant to CM with limited task-specific data, including gene network analysis and cell type annotation, such as Geneformer and scGPT (Cui et al., 2024; Theodoris et al., 2023).
Another challenge relates to the scale of existing studies. Most ML research related to CM has been limited to laboratory environments, which might not mirror the conditions of mass production. Solutions could include either verifying these lab models at a commercial level or using process simulations to adapt them for larger operations.
From the perspective of biologists, an important way to speed up ML work is to produce the biological models that are used to generate data for training ML models. For example, the production and dissemination of more high-quality cell lines from agriculturally relevant animals is needed to aid the generation of omics and microscopy data. Furthermore, biological scientists can make efforts to contribute to the body of existing data by publishing any quality data that results from their experiments, whether or not they are directly used in their own studies, including negative results. Publication of transcriptomic and epigenomic data has become more commonplace; however, complete microscopy image sets, in particular, are rarely published. There is also a need for more proteomic and metabolomic data, especially in understudied food-relevant species, to address numerous use cases including metabolic modeling, flavor profiling, and bioreactor scaling concerns (Nissa et al., 2022). Moreover, datasets should be properly annotated, starting with metadata describing the samples the data came from, how the data was generated, and any processing that was done to the data. Ideally, data would include all raw data, which ML scientists could use to regenerate the original dataset. Similarly, properly labeled data (labeling of individual data points) is important for supervised learning models, which remain the most popular and widely used (Larrañaga et al., 2006). Finally, the quality and consistency of datasets are also critical for their use in training ML models. Variations in sample handling and data collection can lead to large and varied systematic errors that make it challenging to usefully combine multiple datasets or for a model trained on one dataset to apply to another (Priestley et al., 2023).
Efforts to make ML more accessible to researchers, such as infrastructure, frameworks, benchmarks, and libraries, could help facilitate the application of ML to CM. Currently, a variety of ML models are freely available from online resources like Paperswithcode5 and Hugging Face.6 Additionally, libraries such as Python’s scikit-learn (Grisel et al., 2024) or frameworks such as PyTorch Lightning7 offer a good starting point for coding ML. Similarly, accessible web interfaces could make ML tools more accessible to biologists, following the examples of AlphaFold and Foldseek, which both have interfaces integrated into the widely used online protein database UniProt.
Overall, there is an enormous opportunity for CM researchers to incorporate ML techniques and for ML professionals to explore the CM field. The ML field has been moving at unprecedented speed over the past few years, and CM researchers could make gains simply by porting over what is, in effect, yesterday’s news in ML. This review aims to be an introductory resource for researchers eager to explore this cross-disciplinary method, which could help to establish CM as a viable and sustainable protein source in our diets.
Author contributions
MT: Conceptualization, Writing – original draft, Writing – review & editing. SJ: Conceptualization, Visualization, Writing – original draft, Writing – review & editing. RV: Conceptualization, Writing – original draft, Writing – review & editing. RS: Visualization, Writing – review & editing. KS: Visualization, Writing – review & editing. BD: Conceptualization, Project administration, Writing – original draft, Writing – review & editing.
Funding
The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. BD, MT, JS, and RV were funded by a grant from Schmidt Futures (SF ID 2022-10-17-8).
Acknowledgments
The authors would like to thank Zachary Consenza, Roy Nadler, Amin Nikkhah, Dave Staszak, Charlie Taylor, Sofia Giampaoli, Animesh Acharjee, Ana Velez Rueda, Jessica Carballido, and Leandro Sommese for reviewing early drafts of this manuscript and providing valuable insights. We thank Isha Datar and Cam Linke for their assistance in finding funding for the project. We would also like to thank Evan Rapoport for his facilitative leadership on this project.
Conflict of interest
Author MT was employed by company Todhunter Scientifics.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/frai.2024.1424012/full#supplementary-material
Abbreviations
AI, artificial intelligence; CHO, Chinese hamster ovary; CM, cultured meat; CNN, convolutional neural network; FAO, Food and Agriculture Organization; GAN, generative adversarial network; GNI, gene network inference; GNN, graph neural network; ML, machine learning; NLP, natural language processing; OECD, Organisation for Economic Co-operation and Development; RL, reinforcement learning; RNN, recurrent neural network; VAE, variational autoencoder.
Footnotes
1. ^https://www.ncbi.nlm.nih.gov/geo/
2. ^https://www.ncbi.nlm.nih.gov/genbank/
References
Akiyama, M., and Sakakibara, Y. (2022). Informative RNA base embedding for RNA structural alignment and clustering by deep representation learning. NAR Genomics Bioinform 4:lqac012. doi: 10.1093/nargab/lqac012
Alharbi, F., and Vakanski, A. (2023). Machine learning methods for cancer classification using gene expression data: a review. Bioengineering 10:173. doi: 10.3390/bioengineering10020173
Aljaafreh, A. (2017). Agitation and mixing processes automation using current sensing and reinforcement learning. J. Food Eng. 203, 53–57. doi: 10.1016/j.jfoodeng.2017.02.001
Al-Kofahi, Y., Zaltsman, A., Graves, R., Marshall, W., and Rusu, M. (2018). A deep learning-based algorithm for 2-D cell segmentation in microscopy images. BMC Bioinformatics 19:365. doi: 10.1186/s12859-018-2375-z
Allan, S. J., De Bank, P. A., and Ellis, M. J. (2019). Bioprocess design considerations for cultured meat production with a focus on the expansion bioreactor. Front. Sustain. Food Syst. 3:44. doi: 10.3389/fsufs.2019.00044
Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990). Basic local alignment search tool. J. Mol. Biol. 215, 403–410. doi: 10.1016/S0022-2836(05)80360-2
Alvarez, M., Rahmani, E., Jew, B., Garske, K. M., Miao, Z., Benhammou, J. N., et al. (2020). Enhancing droplet-based single-nucleus RNA-seq resolution using the semi-supervised machine learning classifier DIEM. Sci. Rep. 10:11019. doi: 10.1038/s41598-020-67513-5
Angermueller, C., Dohan, D., Belanger, D., Deshpande, R., Murphy, K., and Colwell, L. (2020). Model-based reinforcement learning for biological sequence design.
Antuofermo, E., Orioles, M., Murgia, C., Burrai, G. P., Penati, M., Gottardi, C., et al. (2023). Exploring immunohistochemistry in fish: assessment of antibody reactivity by Western immunoblotting. Animals 13:2934. doi: 10.3390/ani13182934
Aromolaran, O., Beder, T., Oswald, M., Oyelade, J., Adebiyi, E., and Koenig, R. (2020). Essential gene prediction in Drosophila melanogaster using machine learning approaches based on sequence and functional features. Comput. Struct. Biotechnol. J. 18, 612–621. doi: 10.1016/j.csbj.2020.02.022
Athaya, T., Ripan, R. C., Li, X., and Hu, H. (2023). Multimodal deep learning approaches for single-cell multi-omics data integration. Brief. Bioinform. 24:bbad313. doi: 10.1093/bib/bbad313
Avsec, Ž., Agarwal, V., Visentin, D., Ledsam, J. R., Grabska-Barwinska, A., Taylor, K. R., et al. (2021). Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203. doi: 10.1038/s41592-021-01252-x
Baek, S., and Lee, I. (2020). Single-cell ATAC sequencing analysis: from data preprocessing to hypothesis generation. Comput. Struct. Biotechnol. J. 18, 1429–1439. doi: 10.1016/j.csbj.2020.06.012
Balandat, M., Karrer, B., Jiang, D. R., Daulton, S., Letham, B., Wilson, A. G., et al. (2020). BoTorch: a framework for efficient Monte-Carlo Bayesian optimization. arXiv. Available at: http://arxiv.org/abs/1910.06403
Barozet, A., Molloy, K., Vaisset, M., Siméon, T., and Cortés, J. (2020). A reinforcement-learning-based approach to enhance exhaustive protein loop sampling. Bioinformatics 36, 1099–1106. doi: 10.1093/bioinformatics/btz684
Battle, M., Bomkamp, C., Carter, M., Colley Clarke, J., Fathman, L., Gertner, D., et al. (2024). 2023 state of the industry report: cultivated meat and seafood. Good Food Institute. Available at: https://gfi.org/resource/cultivated-meat-and-seafood-state-of-the-industry-report/
Beheshtizadeh, N., Gharibshahian, M., Pazhouhnia, Z., Rostami, M., Zangi, A. R., Maleki, R., et al. (2022). Commercialization and regulation of regenerative medicine products: promises, advances and challenges. Biomed. Pharmacother. 153:113431. doi: 10.1016/j.biopha.2022.113431
Béjar, J., Hong, Y., and Alvarez, M. C. (2002). An ES-like cell line from the marine fish Sparus aurata: characterization and chimaera production. Transgenic Res. 11, 279–289. doi: 10.1023/A:1015678416921
Belashov, A. V., Zhikhoreva, A. A., Belyaeva, T. N., Salova, A. V., Kornilova, E. S., Semenova, I. V., et al. (2021). Machine learning assisted classification of cell lines and cell states on quantitative phase images. Cells 10:2587. doi: 10.3390/cells10102587
Ben-Arye, T., and Levenberg, S. (2019). Tissue engineering for clean meat production. Front. Sustain. Food Syst. 3:46. doi: 10.3389/fsufs.2019.00046
Bennett, N. R., Watson, J. L., Ragotte, R. J., Borst, A. J., See, D. L., Weidle, C., et al. (2024). Atomically accurate de novo design of single-domain antibodies. bioRxiv. Available at: https://www.biorxiv.org/content/10.1101/2024.03.14.585103v1
Bernstein, N. J., Fong, N. L., Lam, I., Roy, M. A., Hendrickson, D. G., and Kelley, D. R. (2020). Solo: doublet identification in single-cell RNA-Seq via semi-supervised deep learning. Cell Syst. 11, 95–101.e5. doi: 10.1016/j.cels.2020.05.010
Bi, S., Zhang, B., Mu, L., Ding, X., and Wang, J. (2020). Optimization of tobacco drying process control based on reinforcement learning. Dry. Technol. 38, 1291–1299. doi: 10.1080/07373937.2019.1633662
Bi, K., Zhang, D., Qiu, T., and Huang, Y. (2020). GC-MS fingerprints profiling using machine learning models for food flavor prediction. Processes 8:23. doi: 10.3390/pr8010023
Biancalani, T., Scalia, G., Buffoni, L., Avasthi, R., Lu, Z., Sanger, A., et al. (2021). Deep learning and alignment of spatially resolved single-cell transcriptomes with Tangram. Nat. Methods 18, 1352–1362. doi: 10.1038/s41592-021-01264-7
Bishop, C. M. (2006). Pattern recognition and machine learning. Available at: https://link.springer.com/book/9780387310732
Bogaerts, P., Mhallem Gziri, K., and Richelle, A. (2017). From MFA to FBA: defining linear constraints accounting for overflow metabolism in a macroscopic FBA-based dynamical model of cell cultures in bioreactor. J. Process Control 60, 34–47. doi: 10.1016/j.jprocont.2017.06.018
Bone, J. M., Childs, C. M., Menon, A., Póczos, B., Feinberg, A. W., LeDuc, P. R., et al. (2020). Hierarchical machine learning for high-fidelity 3D printed biopolymers. ACS Biomater. Sci. Eng. 6, 7021–7031. doi: 10.1021/acsbiomaterials.0c00755
Bouysset, C., Belloir, C., Antonczak, S., Briand, L., and Fiorucci, S. (2020). Novel scaffold of natural compound eliciting sweet taste revealed by machine learning. Food Chem. 324:126864. doi: 10.1016/j.foodchem.2020.126864
Bran, A. M., and Schwaller, P. (2023). Transformers and large language models for chemistry and drug discovery. arXiv. Available at: http://arxiv.org/abs/2310.06083
Brendel, M., Su, C., Bai, Z., Zhang, H., Elemento, O., and Wang, F. (2022). Application of deep learning on single-cell RNA sequencing data analysis: a review. Genomics Proteomics Bioinformatics 20, 814–835. doi: 10.1016/j.gpb.2022.11.011
Brewer, G. J., Torricelli, J. R., Evege, E. K., and Price, P. J. (1993). Optimized survival of hippocampal neurons in B27-supplemented neurobasal™, a new serum-free medium combination. J. Neurosci. Res. 35, 567–576. doi: 10.1002/jnr.490350513
Broucke, K., Van Pamel, E., Van Coillie, E., Herman, L., and Van Royen, G. (2023). Cultured meat and challenges ahead: a review on nutritional, technofunctional and sensorial properties, safety and legislation. Meat Sci. 195:109006. doi: 10.1016/j.meatsci.2022.109006
Buonocore, F., Libertini, A., Prugnoli, D., Mazzini, M., and Scapigliati, G. (2006). Production and characterization of a continuous embryonic cell line from sea bass (Dicentrarchus labrax L.). Mar. Biotechnol. 8, 80–85. doi: 10.1007/s10126-005-5032-2
Business Wire. (2023). Triplebar and Umami Meats initiate technology collaboration to optimize cell lines for sustainable seafood. Available at: https://www.businesswire.com/news/home/20230302005187/en/Triplebar-and-Umami-Meats-Initiate-Technology-Collaboration-To-Optimize-Cell-Lines-for-Sustainable-Seafood (Accessed June 1, 2024).
Caicedo, J. C., Goodman, A., Karhohs, K. W., Cimini, B. A., Ackerman, J., Haghighi, M., et al. (2019). Nucleus segmentation across imaging experiments: the 2018 data science bowl. Nat. Methods 16, 1247–1253. doi: 10.1038/s41592-019-0612-7
Cerchiari, A. E., Garbe, J. C., Jee, N. Y., Todhunter, M. E., Broaders, K. E., Peehl, D. M., et al. (2015). A strategy for tissue self-organization that is robust to cellular heterogeneity and plasticity. Proc. Natl. Acad. Sci. U. S. A. 112, 2287–2292. doi: 10.1073/pnas.1410776112
Charoensawan, V., Adryan, B., Martin, S., Söllner, C., Thisse, B., Thisse, C., et al. (2010). The impact of gene expression regulation on evolution of extracellular signaling pathways. Mol. Cell. Proteomics 9, 2666–2677. doi: 10.1074/mcp.M110.003020
Chen, Y., Li, Y., Narayan, R., Subramanian, A., and Xie, X. (2016). Gene expression inference with deep learning. Bioinformatics 32, 1832–1839. doi: 10.1093/bioinformatics/btw074
Chen, R. J., Lu, M. Y., Wang, J., Williamson, D. F. K., Rodig, S. J., Lindeman, N. I., et al. (2022). Pathomic fusion: an integrated framework for fusing histopathology and genomic features for cancer diagnosis and prognosis. IEEE Trans. Med. Imaging 41, 757–770. doi: 10.1109/TMI.2020.3021387
Chen, Z. H., You, Z. H., Guo, Z. H., Yi, H. C., Luo, G. X., and Wang, Y. B. (2020). Prediction of drug–target interactions from multi-molecular network based on deep walk embedding model. Front. Bioeng. Biotechnol. 8:338. doi: 10.3389/fbioe.2020.00338
Ching-Hsuan, C., Chi-Chang, C., Shih-Huan, L., Chi-Wei, C., Zong-Han, C., and Yen-Wei, C. (2020). N-GlycoGo: predicting protein N-glycosylation sites on imbalanced data sets by using heterogeneous and comprehensive strategy. IEEE Access. 8, 165944–165950. doi: 10.1109/ACCESS.2020.3022629
Chriki, S., Ellies-Oury, M. P., and Hocquette, J. F. (2024). “Chapter 5 - what should the properties of cultivated meat be?” in Cellular agriculture. eds. E. D. G. Fraser, D. L. Kaplan, L. Newman, and R. Y. Yada (Academic Press), 65–75. Available at: https://www.sciencedirect.com/science/article/pii/B9780443187674000093
Chuai, G., Ma, H., Yan, J., Chen, M., Hong, N., Xue, D., et al. (2018). DeepCRISPR: optimized CRISPR guide RNA design by deep learning. Genome Biol. 19, 1–18.
Ciortan, M., and Defrance, M. (2021). GNN-based embedding for clustering scRNA-seq data. Bioinformatics 38, 1037–1044. doi: 10.1093/bioinformatics/btab787
Clark, M. A., Domingo, N. G. G., Colgan, K., Thakrar, S. K., Tilman, D., Lynch, J., et al. (2020). Global food system emissions could preclude achieving the 1.5° and 2°C climate change targets. Science 370, 705–708. doi: 10.1126/science.aba7357
Conev, A., Litsa, E. E., Perez, M. R., Diba, M., Mikos, A. G., and Kavraki, L. E. (2020). Machine learning-guided three-dimensional printing of tissue engineering scaffolds. Tissue Eng. A 26, 1359–1368. doi: 10.1089/ten.tea.2020.0191
Cong, Y. S., Wright, W. E., and Shay, J. W. (2002). Human telomerase and its regulation. Microbiol. Mol. Biol. Rev. 66, 407–425. doi: 10.1128/MMBR.66.3.407-425.2002
Cosenza, Z., Astudillo, R., Frazier, P. I., Baar, K., and Block, D. E. (2022). Multi-information source Bayesian optimization of culture media for cellular agriculture. Biotechnol. Bioeng. 119, 2447–2458. doi: 10.1002/bit.28132
Cosenza, Z., Block, D. E., and Baar, K. (2021). Optimization of muscle cell culture media using nonlinear design of experiments. Biotechnol. J. 16:2100228. doi: 10.1002/biot.202100228
Cosenza, Z., Block, D. E., Baar, K., and Chen, X. (2023). Multi-objective Bayesian algorithm automatically discovers low-cost high-growth serum-free media for cellular agriculture application. Eng. Life Sci. 23:e2300005. doi: 10.1002/elsc.202300005
Craven, S., Whelan, J., and Glennon, B. (2014). Glucose concentration control of a fed-batch mammalian cell bioprocess using a nonlinear model predictive controller. J. Process Control 24, 344–357. doi: 10.1016/j.jprocont.2014.02.007
Creswell, A., White, T., Dumoulin, V., Arulkumaran, K., Sengupta, B., and Bharath, A. A. (2018). Generative adversarial networks: an overview. IEEE Signal Process. Mag. 35, 53–65. doi: 10.1109/MSP.2017.2765202
Crippa, M., Solazzo, E., Guizzardi, D., Monforti-Ferrario, F., Tubiello, F. N., and Leip, A. (2021). Food systems are responsible for a third of global anthropogenic GHG emissions. Nat. Food. 2, 198–209. doi: 10.1038/s43016-021-00225-9
Cui, H., Wang, C., Maan, H., Pang, K., Luo, F., Duan, N., et al. (2024). scGPT: toward building a foundation model for single-cell multi-omics using generative AI. Nat. Methods 21, 1470–1480. doi: 10.1038/s41592-024-02201-0
Danilevicz, M. F., Gill, M., Tay Fernandez, C. G., Petereit, J., Upadhyaya, S. R., Batley, J., et al. (2022). DNABERT-based explainable lncRNA identification in plant genome assemblies. bioRxiv, 2022–2002.
Dauparas, J., Anishchenko, I., Bennett, N., Bai, H., Ragotte, R. J., Milles, L. F., et al. (2022). Robust deep learning–based protein sequence design using ProteinMPNN. Science 378, 49–56. doi: 10.1126/science.add2187
Delahaye, D., Chaimatanan, S., and Mongeau, M. (2019). “Simulated annealing: from basics to applications” in Handbook of metaheuristics. eds. M. Gendreau and J. Y. Potvin (Cham: Springer International Publishing), 1–35.
Dubourg-Felonneau, G., Abbasi, A., Akiva, E., and Lee, L. (2022). Improving protein subcellular localization prediction with structural prediction & graph neural networks. bioRxiv. doi: 10.1101/2022.11.29.518403v1
Durkee, M. S., Abraham, R., Clark, M. R., and Giger, M. L. (2021). Artificial intelligence and cellular segmentation in tissue microscopy images. Am. J. Pathol. 191, 1693–1701. doi: 10.1016/j.ajpath.2021.05.022
Dursun, G., Bijelić, D., Ayşit, N., Vatandaşlar, B. K., Radenović, L., Çapar, A., et al. (2023). Combined segmentation and classification-based approach to automated analysis of biomedical signals obtained from calcium imaging. PLoS One 18:e0281236. doi: 10.1371/journal.pone.0281236
Dvorak, P., Bednar, D., Vanacek, P., Balek, L., Eiselleova, L., Stepankova, V., et al. (2018). Computer-assisted engineering of hyperstable fibroblast growth factor 2. Biotechnol. Bioeng. 115, 850–862. doi: 10.1002/bit.26531
Evans, R., O’Neill, M., Pritzel, A., Antropova, N., Senior, A., Green, T., et al. (2024). Protein complex prediction with AlphaFold-Multimer. bioRxiv. doi: 10.1101/2021.10.04.463034v2
Falk, T., Mai, D., Bensch, R., Çiçek, Ö., Abdulkadir, A., Marrakchi, Y., et al. (2019). U-net: deep learning for cell counting, detection, and morphometry. Nat. Methods 16, 67–70. doi: 10.1038/s41592-018-0261-2
FAO (2020). World food and agriculture - statistical yearbook 2020. Rome: FAO. Available at: https://www.fao.org/documents/card/en/c/cb1329en
Fischer, D. S., Schaar, A. C., and Theis, F. J. (2023). Modeling intercellular communication in tissues using spatial graphs of cells. Nat. Biotechnol. 41, 332–336. doi: 10.1038/s41587-022-01467-z
Fraeye, I., Kratka, M., Vandenburgh, H., and Thorrez, L. (2020). Sensorial and nutritional aspects of cultured meat in comparison to traditional meat: much to be inferred. Front. Nutr. 7:35. doi: 10.3389/fnut.2020.00035
Franceschini, G., and Macchietto, S. (2008). Model-based design of experiments for parameter precision: state of the art. Chem. Eng. Sci. 63, 4846–4872. doi: 10.1016/j.ces.2007.11.034
Franklin, M. M., Schultz, F. A., Tafoya, M. A., Kerwin, A. A., Broehm, C. J., Fischer, E. G., et al. (2021). A deep learning convolutional neural network can differentiate between Helicobacter pylori gastritis and autoimmune gastritis with results comparable to gastrointestinal pathologists. Arch. Pathol. Lab. Med. 146, 117–122. doi: 10.5858/arpa.2020-0520-OA
Freddi, A., and Salmon, M. (2019). “Introduction to the Taguchi method” in Design principles and methodologies: from conceptualization to first prototyping with examples and case studies. eds. A. Freddi and M. Salmon (Cham: Springer International Publishing), 159–180.
Furuhashi, M., Morimoto, Y., Shima, A., Nakamura, F., Ishikawa, H., and Takeuchi, S. (2021). Formation of contractile 3D bovine muscle tissue for construction of millimetre-thick cultured steak. Npj Sci. Food. 5:6. doi: 10.1038/s41538-021-00090-7
Gamble, C., Bryant, D., Carrieri, D., Bixby, E., Dang, J., Marshall, J., et al. (2021). Machine learning optimization of photosynthetic microbe cultivation and recombinant protein production. bioRxiv. doi: 10.1101/2021.08.06.453272v1
Gan, Y., Huang, X., Zou, G., Zhou, S., and Guan, J. (2022). Deep structural clustering for single-cell RNA-seq data jointly through autoencoder and graph neural network. Brief. Bioinform. 23:bbac018. doi: 10.1093/bib/bbac018
Gayoso, A., Steier, Z., Lopez, R., Regier, J., Nazor, K. L., Streets, A., et al. (2021). Joint probabilistic modeling of single-cell multi-omic data with totalVI. Nat. Methods 18, 272–282. doi: 10.1038/s41592-020-01050-x
Ghojogh, B., and Ghodsi, A. (2023). Recurrent neural networks and long short-term memory networks: tutorial and survey. arXiv. Available at: http://arxiv.org/abs/2304.11461
Gignac, S. J., Vo, N. T. K., Mikhaeil, M. S., Alexander, J. A. N., MacLatchy, D. L., Schulte, P. M., et al. (2014). Derivation of a continuous myogenic cell culture from an embryo of common killifish, Fundulus heteroclitus. Comp. Biochem. Physiol. A Mol. Integr. Physiol. 175, 15–27. doi: 10.1016/j.cbpa.2014.05.002
Goldenzweig, A., and Fleishman, S. J. (2018). Principles of protein stability and their application in computational design. Annu. Rev. Biochem. 87, 105–129. doi: 10.1146/annurev-biochem-062917-012102
Golovin, D., Solnik, B., Moitra, S., Kochanski, G., Karro, J., and Sculley, D. (2024). “Google vizier: a service for black-box optimization” in Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Gómez-de-Mariscal, E., García-López-de-Haro, C., Ouyang, W., Donati, L., Lundberg, E., Unser, M., et al. (2021). DeepImageJ: a user-friendly environment to run deep learning models in ImageJ. Nat. Methods 18, 1192–1195. doi: 10.1038/s41592-021-01262-9
Gonzalez Viejo, C., Fuentes, S., Godbole, A., Widdicombe, B., and Unnithan, R. R. (2020). Development of a low-cost e-nose to assess aroma profiles: an artificial intelligence application to assess beer quality. Sens. Actuators B Chem. 308:127688. doi: 10.1016/j.snb.2020.127688
Gonzalez Viejo, C., Tongson, E., and Fuentes, S. (2021). Integrating a low-cost electronic nose and machine learning modelling to assess coffee aroma profile and intensity. Sensors 21:2016. doi: 10.3390/s21062016
Goswami, M., Pinto, N., Yashwanth, B. S., Sathiyanarayanan, A., and Ovissipour, R. (2023). Development of a cell line from skeletal trunk muscle of the fish Labeo rohita. Cytotechnology 75, 349–361. doi: 10.1007/s10616-023-00581-3
Greenwald, N. F., Miller, G., Moen, E., Kong, A., Kagel, A., Dougherty, T., et al. (2022). Whole-cell segmentation of tissue images with human-level performance using large-scale data annotation and deep learning. Nat. Biotechnol. 40, 555–565. doi: 10.1038/s41587-021-01094-0
Grisel, O., Mueller, A., Lars, Gramfort, A., Louppe, G., Fan, T. J., et al. (2024). scikit-learn/scikit-learn: Scikit-learn 1.4.0. Zenodo. Available at: https://zenodo.org/records/10532824
Grzesik, P., and Warth, S. C. (2021). One-time optimization of advanced T cell culture media using a machine learning pipeline. Front. Bioeng. Biotechnol. 9:614324. doi: 10.3389/fbioe.2021.614324
Guo, S., Ao, X., Ma, X., Cheng, S., Men, C., Harada, H., et al. (2023). Machine-learning-aided application of high-gravity technology to enhance ammonia recovery of fresh waste leachate. Water Res. 235:119891. doi: 10.1016/j.watres.2023.119891
Gurcan, M. N., Boucheron, L. E., Can, A., Madabhushi, A., Rajpoot, N. M., and Yener, B. (2009). Histopathological image analysis: a review. IEEE Rev. Biomed. Eng. 2, 147–171. doi: 10.1109/RBME.2009.2034865
Hagrot, E., Oddsdóttir, H. Æ., Hosta, J. G., Jacobsen, E. W., and Chotteau, V. (2017). Poly-pathway model, a novel approach to simulate multiple metabolic states by reaction network-based model – application to amino acid depletion in CHO cell culture. J. Biotechnol. 259, 235–247. doi: 10.1016/j.jbiotec.2017.05.026
Hale, B., Ratnayake, S., Flory, A., Wijeratne, R., Schmidt, C., Robertson, A. E., et al. (2022). Gene regulatory network inference in soybean upon infection by Phytophthora sojae. bioRxiv. doi: 10.1101/2022.10.19.512983v2
Ham, R. G. (1965). Clonal growth of mammalian cells in a chemically defined, synthetic medium. Proc. Natl. Acad. Sci. U.S.A. 53, 288–293. doi: 10.1073/pnas.53.2.288
Hamby, S. E., and Hirst, J. D. (2008). Prediction of glycosylation sites using random forests. BMC Bioinformatics. 9:500. doi: 10.1186/1471-2105-9-500
Hashizume, T., Ozawa, Y., and Ying, B. W. (2022). Employing active learning in the optimization of culture medium for mammalian cells. bioRxiv. doi: 10.1101/2022.12.24.521878v1
Havel, J., Link, H., Hofinger, M., Franco-Lara, E., and Weuster-Botz, D. (2006). Comparison of genetic algorithms for experimental multi-objective optimization on the example of medium design for cyanobacteria. Biotechnol. J. 1, 549–555. doi: 10.1002/biot.200500052
Hayflick, L. (1965). The limited in vitro lifetime of human diploid cell strains. Exp. Cell Res. 37, 614–636. doi: 10.1016/0014-4827(65)90211-9
He, K., Gkioxari, G., Dollar, P., and Girshick, R. (2017). Mask R-CNN. Available at: https://openaccess.thecvf.com/content_iccv_2017/html/He_Mask_R-CNN_ICCV_2017_paper.html
He, Y., Yuan, H., Wu, C., and Xie, Z. (2020). DISC: a highly scalable and accurate inference of gene expression and structure for single-cell transcriptomes using semi-supervised deep learning. Genome Biol. 21, 1–28. doi: 10.1186/s13059-020-02083-3
He, K., Zhang, X., Ren, S., and Sun, J. (2015). Deep residual learning for image recognition. arXiv. Available at: http://arxiv.org/abs/1512.03385
Hekler, A., Utikal, J. S., Enk, A. H., Solass, W., Schmitt, M., Klode, J., et al. (2019). Deep learning outperformed 11 pathologists in the classification of histopathological melanoma images. Eur. J. Cancer 118, 91–96. doi: 10.1016/j.ejca.2019.06.012
Heumos, L., Schaar, A. C., Lance, C., Litinetskaya, A., Drost, F., Zappia, L., et al. (2023). Best practices for single-cell analysis across modalities. Nat. Rev. Genet. 24, 550–572. doi: 10.1038/s41576-023-00586-w
Hie, B. L., Shanker, V. R., Xu, D., Bruun, T. U. J., Weidenbacher, P. A., Tang, S., et al. (2024). Efficient evolution of human antibodies from general protein language models. Nat. Biotechnol. 42, 275–283. doi: 10.1038/s41587-023-01763-2
Ho, S. (2021). This Cambridge startup’s ‘Renaissance Farm’ uses AI to make cell-based meat. Green Queen. Available at: https://www.greenqueen.com.hk/animal-alternative-technologies-renaissance-farm-cell-based-meat/
Holm, L., and Laakso, L. M. (2016). Dali server update. Nucleic Acids Res. 44, W351–W355. doi: 10.1093/nar/gkw357
Hong, J. K., Yeo, H. C., Lakshmanan, M., Han, S., Cha, H. M., Han, M., et al. (2020). In silico model-based characterization of metabolic response to harsh sparging stress in fed-batch CHO cell cultures. J. Biotechnol. 308, 10–20. doi: 10.1016/j.jbiotec.2019.11.011
Hu, J., Li, X., Coleman, K., Schroeder, A., Ma, N., Irwin, D. J., et al. (2021). SpaGCN: integrating gene expression, spatial location and histology to identify spatial domains and spatially variable genes by graph convolutional network. Nat. Methods 18, 1342–1351. doi: 10.1038/s41592-021-01255-8
Huang, Y., Jiang, D., Ren, G., Yin, Y., Sun, Y., Liu, T., et al. (2021). De novo production of Glycyrrhetic acid 3-O-mono-β-D-glucuronide in Saccharomyces cerevisiae. Front. Bioeng. Biotechnol. 9:709120. doi: 10.3389/fbioe.2021.709120
Huang, X., Song, C., Zhang, G., Li, Y., Zhao, Y., Zhang, Q., et al. (2023). scGRN: a comprehensive single-cell gene regulatory network platform of human and mouse. Nucleic Acids Res. 52, D293–D303. doi: 10.1093/nar/gkad885
Huang, K., Xiao, C., Glass, L. M., Zitnik, M., and Sun, J. (2020). SkipGNN: predicting molecular interactions with skip-graph networks. Sci. Rep. 10:21092. doi: 10.1038/s41598-020-77766-9
Huang, J., Zheng, J., Yuan, H., and McGinnis, K. (2018). Distinct tissue-specific transcriptional regulation revealed by gene regulatory networks in maize. BMC Plant Biol. 18:111. doi: 10.1186/s12870-018-1329-y
Hubalek, S., Post, M. J., and Moutsatsou, P. (2022). Towards resource-efficient and cost-efficient cultured meat. Curr. Opin. Food Sci. 47:100885. doi: 10.1016/j.cofs.2022.100885
Humbird, D. (2020). Scale-up economics for cultured meat: techno-economic analysis and due diligence. Engineering Archive. Available at: https://engrxiv.org/preprint/view/1438
Humbird, D. (2021). Scale-up economics for cultured meat. Biotechnol. Bioeng. 118, 3239–3250. doi: 10.1002/bit.27848
Huynh, H. T., Dat, V. V. T., and Anh, H. B. (2021). “White blood cell segmentation and classification using deep learning coupled with image processing technique” in Future data and security engineering. Big data, security and privacy, smart city and industry 4.0 applications. eds. T. K. Dang, J. Küng, T. M. Chung, and M. Takizawa (Singapore: Springer), 399–410.
Huynh-Thu, V. A., and Geurts, P. (2018). dynGENIE3: dynamical GENIE3 for the inference of gene networks from time series expression data. Sci. Rep. 8:3384. doi: 10.1038/s41598-018-21715-0
Ivanovich, C. C., Sun, T., Gordon, D. R., and Ocko, I. B. (2023). Future warming from global food consumption. Nat. Clim. Chang. 13, 297–302. doi: 10.1038/s41558-023-01605-8
Jafari, R., Javidi, M. M., and Kuchaki, R. M. (2019). Using deep reinforcement learning approach for solving the multiple sequence alignment problem. SN Appl. Sci. 1:592. doi: 10.1007/s42452-019-0611-4
Jang, J., Wang, C., Zhang, X., Choi, H. J., Pan, X., Lin, B., et al. (2021). A deep learning-based segmentation pipeline for profiling cellular morphodynamics using multiple types of live cell microscopy. Cell Rep. Methods 1:100105. doi: 10.1016/j.crmeth.2021.100105
Jeong, Y., Choi, W. Y., Park, A., Lee, Y. J., Lee, Y., Park, G. H., et al. (2021). Marine cyanobacterium Spirulina maxima as an alternate to the animal cell culture medium supplement. Sci. Rep. 11:4906. doi: 10.1038/s41598-021-84558-2
Jha, K., Saha, S., and Singh, H. (2022). Prediction of protein–protein interaction using graph neural networks. Sci. Rep. 12:8360. doi: 10.1038/s41598-022-12201-9
Ji, H., Pu, D., Yan, W., Zhang, Q., Zuo, M., and Zhang, Y. (2023). Recent advances and application of machine learning in food flavor prediction and regulation. Trends Food Sci. Technol. 138, 738–751. doi: 10.1016/j.tifs.2023.07.012
Ji, Y., Zhou, Z., Liu, H., and Davuluri, R. V. (2021). DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics 37, 2112–2120. doi: 10.1093/bioinformatics/btab083
Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., et al. (2021). Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589. doi: 10.1038/s41586-021-03819-2
Jung, F., Frey, K., Zimmer, D., and Mühlhaus, T. (2023). DeepSTABp: a deep learning approach for the prediction of thermal protein stability. Int. J. Mol. Sci. 24:7444. doi: 10.3390/ijms24087444
Kabas, O., Kayakus, M., Ünal, İ., and Moiceanu, G. (2023). Deformation energy estimation of cherry tomato based on some engineering parameters using machine-learning algorithms. Appl. Sci. 13:8906. doi: 10.3390/app13158906
Kaimal, J., Thul, P., Xu, H., Ouyang, W., and Lundberg, E. (2021). HPA cell image segmentation dataset. Zenodo. Available at: https://zenodo.org/record/4665863
Kanda, G. N., Tsuzuki, T., Terada, M., Sakai, N., Motozawa, N., Masuda, T., et al. (2022). Robotic search for optimal cell culture in regenerative medicine. eLife 11:e77007. doi: 10.7554/eLife.77007
Kang, C., Zhang, H., Liu, Z., Huang, S., and Yin, Y. (2022). LR-GNN: a graph neural network based on link representation for predicting molecular associations. Brief. Bioinform. 23:bbab513. doi: 10.1093/bib/bbab513
Katoch, S., Chauhan, S. S., and Kumar, V. (2021). A review on genetic algorithm: past, present, and future. Multimed. Tools Appl. 80, 8091–8126. doi: 10.1007/s11042-020-10139-6
Kayakuş, M., and Açıkgöz, F. Y. (2022). Classification of news texts by categories using machine learning methods. Alphanumeric J. 10, 155–166. doi: 10.17093/alphanumeric.1149753
Khan, M. I., Jo, C., and Tariq, M. R. (2015). Meat flavor precursors and factors influencing flavor precursors—a systematic review. Meat Sci. 110, 278–284. doi: 10.1016/j.meatsci.2015.08.002
Kim, J. T., Jakobsen, S., Natarajan, K. N., and Won, K. J. (2021). TENET: gene network reconstruction using transfer entropy reveals key regulatory factors from single cell transcriptomic data. Nucleic Acids Res. 49:e1. doi: 10.1093/nar/gkaa1014
Kim, J., McFee, M., Fang, Q., Abdin, O., and Kim, P. M. (2023). Computational and artificial intelligence-based methods for antibody development. Trends Pharmacol. Sci. 44, 175–189. doi: 10.1016/j.tips.2022.12.005
Kim, D., Tran, A., Kim, H. J., Lin, Y., Yang, J. Y. H., and Yang, P. (2023). Gene regulatory network reconstruction: harnessing the power of single-cell multi-omic data. Npj Syst. Biol. Appl. 9, 1–13.
Kingma, D. P., and Welling, M. (2019). An introduction to variational autoencoders. Found. Trends Mach. Learn. 12, 307–392.
Kiparissides, A., Pistikopoulos, E. N., and Mantalaris, A. (2015). On the model-based optimization of secreting mammalian cell (GS-NS0) cultures. Biotechnol. Bioeng. 112, 536–548. doi: 10.1002/bit.25457
Kircali Ata, S., Shi, J. K., Yao, X., Hua, X. Y., Haldar, S., Chiang, J. H., et al. (2023). Predicting the textural properties of plant-based meat analogs with machine learning. Foods 12:344. doi: 10.3390/foods12020344
Koerich, D. M., Lopes, G. C., and Rosa, L. M. (2018). Investigation of phases interactions and modification of drag models for liquid-solid fluidized bed tapered bioreactors. Powder Technol. 339, 90–101. doi: 10.1016/j.powtec.2018.07.102
Kolkmann, A. M., Van Essen, A., Post, M. J., and Moutsatsou, P. (2022). Development of a chemically defined medium for in vitro expansion of primary bovine satellite cells. Front. Bioeng. Biotechnol. 10:895289. doi: 10.3389/fbioe.2022.895289
Kotopka, B. J., and Smolke, C. D. (2020). Model-driven generation of artificial yeast promoters. Nat. Commun. 11:2113. doi: 10.1038/s41467-020-15977-4
Krishnan, S., Ulagesan, S., Cadangin, J., Lee, J. H., Nam, T. J., and Choi, Y. H. (2023). Establishment and characterization of continuous satellite muscle cells from olive flounder (Paralichthys olivaceus): isolation, culture conditions, and myogenic protein expression. Cells 12:2325. doi: 10.3390/cells12182325
Kumar, A., Singh, S. S., Singh, K., and Biswas, B. (2020). Link prediction techniques, applications, and performance: a survey. Physica A Stat. Mech. Appl. 553:124289. doi: 10.1016/j.physa.2020.124289
Kumar, N., Verma, R., Anand, D., Zhou, Y., Onder, O. F., Tsougenis, E., et al. (2020). A multi-organ nucleus segmentation challenge. IEEE Trans. Med. Imaging 39, 1380–1391. doi: 10.1109/TMI.2019.2947628
Kumar, N., Verma, R., Chen, C., Lu, C., Fu, P., Willis, J., et al. (2022). Computer-extracted features of nuclear morphology in hematoxylin and eosin images distinguish stage II and IV colon tumors. J. Pathol. 257, 17–28. doi: 10.1002/path.5864
Kuo, H. H., Gao, X., DeKeyser, J. M., Fetterman, K. A., Pinheiro, E. A., Weddle, C. J., et al. (2020). Negligible-cost and weekend-free chemically defined human iPSC culture. Stem Cell Rep. 14, 256–270. doi: 10.1016/j.stemcr.2019.12.007
Kupferschmidt, K. (2013). Lab burger adds sizzle to bid for research funds. Science 341, 602–603. doi: 10.1126/science.341.6146.602
Larrañaga, P., Calvo, B., Santana, R., Bielza, C., Galdiano, J., Inza, I., et al. (2006). Machine learning in bioinformatics. Brief. Bioinform. 7, 86–112. doi: 10.1093/bib/bbk007
Leach, C. (2024). Alt Atlas: artificial intelligence powering real meat of the future. Science Entrepreneur Club. Available at: https://www.science-entrepreneur.com/insights/artificial-intelligence-powering-real-meat-of-the-future
LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature 521, 436–444. doi: 10.1038/nature14539
Lee, B. K., Mayhew, E. J., Sanchez-Lengeling, B., Wei, J. N., Qian, W. W., Little, K. A., et al. (2023). A principal odor map unifies diverse tasks in olfactory perception. Science 381, 999–1006. doi: 10.1126/science.ade4401
Lee, J., Song, S. B., Chung, Y. K., Jang, J. H., and Huh, J. (2022). BoostSweet: learning molecular perceptual representations of sweeteners. Food Chem. 383:132435. doi: 10.1016/j.foodchem.2022.132435
Levy, B., Xu, Z., Zhao, L., Kremling, K., Altman, R., Wong, P., et al. (2022). FloraBERT: cross-species transfer learning with attention-based neural networks for gene expression prediction.
Li, Y. (2018). Deep reinforcement learning: an overview. arXiv. Available at: http://arxiv.org/abs/1701.07274
Li, N., Guo, L., and Guo, H. (2021). Establishment, characterization, and transfection potential of a new continuous fish cell line (CAM) derived from the muscle tissue of grass goldfish (Carassius auratus). In Vitro Cell. Dev. Biol. Anim. 57, 912–931. doi: 10.1007/s11626-021-00622-1
Li, F., Han, J., Cao, T., Lam, W., Fan, B., Tang, W., et al. (2019). Design of self-assembly dipeptide hydrogels and machine learning via their chemical features. Proc. Natl. Acad. Sci. U. S. A. 116, 11259–11264. doi: 10.1073/pnas.1903376116
Li, Z., Liu, F., Yang, W., Peng, S., and Zhou, J. (2022). A survey of convolutional neural networks: analysis, applications, and prospects. IEEE Trans. Neural Netw. Learn. Syst. 33, 6999–7019. doi: 10.1109/TNNLS.2021.3084827
Li, H., Sun, Y., Hong, H., Huang, X., Tao, H., Huang, Q., et al. (2022). Inferring transcription factor regulatory networks from single-cell ATAC-seq data based on graph neural networks. Nat. Mach. Intell. 4, 389–400. doi: 10.1038/s42256-022-00469-5
Li, H., Tian, S., Li, Y., Fang, Q., Tan, R., Pan, Y., et al. (2020). Modern deep learning in bioinformatics. J. Mol. Cell Biol. 12, 823–827. doi: 10.1093/jmcb/mjaa030
Li, X., Wang, K., Lyu, Y., Pan, H., Zhang, J., Stambolian, D., et al. (2020). Deep learning enables accurate clustering with batch effect removal in single-cell RNA-seq analysis. Nat. Commun. 11:2338. doi: 10.1038/s41467-020-15851-3
Li, B., Wang, L., Wu, Y. J., Xia, Z. Y., Yang, B. X., and Tang, Y. Q. (2021). Improving acetic acid and furfural resistance of xylose-fermenting Saccharomyces cerevisiae strains by regulating novel transcription factors revealed via comparative transcriptomic analysis. Appl. Environ. Microbiol. 87:e00158-21.
Libby, A. R. G., Briers, D., Haghighi, I., Joy, D. A., Conklin, B. R., Belta, C., et al. (2019). Automated design of pluripotent stem cell self-organization. Cell Syst. 9, 483–495.e10. doi: 10.1016/j.cels.2019.10.008
Lin, E., Mukherjee, S., and Kannan, S. (2020). A deep adversarial variational autoencoder model for dimensionality reduction in single-cell RNA sequencing analysis. BMC Bioinformatics 21:64. doi: 10.1186/s12859-020-3401-5
Liongue, C., and Ward, A. C. (2007). Evolution of class I cytokine receptors. BMC Evol. Biol. 7:120. doi: 10.1186/1471-2148-7-120
Lipton, Z. C., Berkowitz, J., and Elkan, C. (2015). A critical review of recurrent neural networks for sequence learning. arXiv. Available at: http://arxiv.org/abs/1506.00019
Listrat, A., Lebret, B., Louveau, I., Astruc, T., Bonnet, M., Lefaucheur, L., et al. (2016). How muscle structure and composition influence meat and flesh quality. Sci. World J. 2016:e3182746. doi: 10.1155/2016/3182746
Liu, Z., Gao, J., Li, T., Jing, Y., Xu, C., Zhu, Z., et al. (2022). A novel approach GRNTSTE to reconstruct gene regulatory interactions applied to a case study for rat pineal rhythm gene. Sci. Rep. 12:10227. doi: 10.1038/s41598-022-14903-6
Liu, X., Shen, Q., and Zhang, S. (2023). Cross-species cell-type assignment from single-cell RNA-seq data by a heterogeneous graph neural network. Genome Res. 33, 96–111. doi: 10.1101/gr.276868.122
Long, J., Shelhamer, E., and Darrell, T. (2015). Fully convolutional networks for semantic segmentation. arXiv. Available at: http://arxiv.org/abs/1411.4038
Lotfollahi, M., Susmelj, A. K., De Donno, C., Ji, Y., Ibarra, I. L., Wolf, F. A., et al. (2021). Learning interpretable cellular responses to complex perturbations in high-throughput screens. bioRxiv. Available at: https://www.biorxiv.org/content/early/2021/05/18/2021.04.14.439903
Lou, W., Yu, X., Liu, C., Wan, X., Li, G., Liu, S., et al. (2023). Multi-stream cell segmentation with low-level cues for multi-modality images. PMLR 212:1–10.
Luna, M., and Martínez, E. (2014). A Bayesian approach to run-to-run optimization of animal cell bioreactors using probabilistic tendency models. Ind. Eng. Chem. Res. 53, 17252–17266. doi: 10.1021/ie500453e
Makowski, E. K., Kinnunen, P. C., Huang, J., Wu, L., Smith, M. D., Wang, T., et al. (2022). Co-optimization of therapeutic antibody affinity and specificity using machine learning models that generalize to novel mutational space. Nat. Commun. 13:3788. doi: 10.1038/s41467-022-31457-3
Marston, J. (2022). Brief: Prolific Machines raises $42m from Breakthrough, Mayfield to slash cultivated meat costs. Available at: https://agfundernews.com/prolific-machines-raises-42m-to-slash-cultivated-meat-manufacturing-costs
Martínez-Monge, I., Albiol, J., Lecina, M., Liste-Calleja, L., Miret, J., Solà, C., et al. (2019). Metabolic flux balance analysis during lactate and glucose concomitant consumption in HEK293 cell cultures. Biotechnol. Bioeng. 116, 388–404. doi: 10.1002/bit.26858
Marzahl, C., Aubreville, M., Voigt, J., and Maier, A. (2019). “Classification of leukemic B-lymphoblast cells from blood smear microscopic images with an attention-based deep learning method and advanced augmentation techniques” in ISBI 2019 C-NMC challenge: classification in cancer cell imaging. eds. A. Gupta and R. Gupta (Singapore: Springer), 13–22.
Melzener, L., Ding, S., Hueber, R., Messmer, T., Zhou, G., Post, M. J., et al. (2022). Comparative analysis of cattle breeds as satellite cell donors for cultured beef. bioRxiv. doi: 10.1101/2022.01.14.476358v2
Melzener, L., Schaeken, L., Fros, M., Messmer, T., Raina, D., Kiessling, A., et al. (2024). Optimisation of cell fate determination for cultured muscle differentiation. bioRxiv. doi: 10.1101/2023.09.06.556523v2
Menden, K., Marouf, M., Oller, S., Dalmia, A., Magruder, D. S., Kloiber, K., et al. (2020). Deep learning–based cell composition analysis from tissue expression profiles. Sci. Adv. 6:eaba2619. doi: 10.1126/sciadv.aba2619
Meßmer, T. (2023). Decoding cultured meat production: the transcriptomic landscape of bovine satellite cells in proliferation and differentiation. Maastricht: Maastricht University.
Messmer, T., Dohmen, R. G. J., Schaeken, L., Melzener, L., Hueber, R., Godec, M., et al. (2023). Single-cell analysis of bovine muscle-derived cell types for cultured meat production. Front. Nutr. 10:1212196. doi: 10.3389/fnut.2023.1212196
Messmer, T., Klevernic, I., Furquim, C., Ovchinnikova, E., Dogan, A., Cruz, H., et al. (2022). A serum-free media formulation for cultured meat production supports bovine satellite cell differentiation in the absence of serum starvation. Nat. Food. 3, 74–85. doi: 10.1038/s43016-021-00419-1
Metzger, K., Tuchscherer, A., Palin, M. F., Ponsuksili, S., and Kalbe, C. (2020). Establishment and validation of cell pools using primary muscle cells derived from satellite cells of pig skeletal muscle. In Vitro Cell. Dev. Biol. Anim. 56, 193–199. doi: 10.1007/s11626-019-00428-2
Michele, P. (2021). OECD-FAO agricultural outlook 2021-2030. Available at: https://www.fao.org/documents/card/en?details=cb5332en
Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv. Available at: http://arxiv.org/abs/1301.3781
Mirdita, M., Schütze, K., Moriwaki, Y., Heo, L., Ovchinnikov, S., and Steinegger, M. (2022). ColabFold: making protein folding accessible to all. Nat. Methods 19, 679–682. doi: 10.1038/s41592-022-01488-1
Mj, B., Fo, U., and Cc, U. (2015). Influence of cowpea (Vigna unguiculata) peptides on insulin resistance. J. Nutr. Health Food Sci. 3, 1–3. doi: 10.15226/jnhfs.2015.00144
Mohammed, E. A., Mohamed, M. M. A., Naugler, C., and Far, B. H. (2013). Chronic lymphocytic leukemia cell segmentation from microscopic blood images using watershed algorithm and optimal thresholding. In: 2013 26th IEEE Canadian Conference on Electrical and Computer Engineering (CCECE). p. 1–5.
Musgrove, L., Russell, F. D., and Ventura, T. (2024). Considerations for cultivated crustacean meat: potential cell sources, potential differentiation and immortalization strategies, and lessons from crustacean and other animal models. Crit. Rev. Food Sci. Nutr., 1–25. doi: 10.1080/10408398.2024.2342480
Muzio, G., O’Bray, L., and Borgwardt, K. (2021). Biological network analysis with deep learning. Brief. Bioinform. 22, 1515–1530. doi: 10.1093/bib/bbaa257
Naylor, P., Laé, M., Reyal, F., and Walter, T. (2019). Segmentation of nuclei in histopathology images by deep regression of the distance map. IEEE Trans. Med. Imaging 38, 448–459. doi: 10.1109/TMI.2018.2865709
Neftci, E. O., and Averbeck, B. B. (2019). Reinforcement learning in artificial and biological systems. Nat. Mach. Intell. 1, 133–143. doi: 10.1038/s42256-019-0025-4
Nelson, W., Zitnik, M., Wang, B., Leskovec, J., Goldenberg, A., and Sharan, R. (2019). To embed or not: network embedding as a paradigm in computational biology. Front. Genet. 10:381. doi: 10.3389/fgene.2019.00381
Ng, W. L., and Tan, J. S. (2024). Application of machine learning in 3D bioprinting of cultivated meat. Int. J. AI Mater. Des. 1, 3–25. doi: 10.36922/ijamd.2279
Nguyen, E., Poli, M., Durrant, M. G., Thomas, A. W., Kang, B., Sullivan, J., et al. (2024). Sequence modeling and design from molecular to genome scale with Evo. bioRxiv. Available at: https://www.biorxiv.org/content/10.1101/2024.02.27.582234v1
Nikkhah, A., Rohani, A., Zarei, M., Kulkarni, A., Batarseh, F. A., Blackstone, N. T., et al. (2023). Toward sustainable culture media: using artificial intelligence to optimize reduced-serum formulations for cultivated meat. Sci. Total Environ. 894:164988. doi: 10.1016/j.scitotenv.2023.164988
Nishimura, T. (2010). The role of intramuscular connective tissue in meat texture. Anim. Sci. J. 81, 21–27. doi: 10.1111/j.1740-0929.2009.00696.x
Nissa, M. U., Reddy, P. J., Pinto, N., Sun, Z., Ghosh, B., Moritz, R. L., et al. (2022). The PeptideAtlas of a widely cultivated fish Labeo rohita: a resource for the aquaculture community. Sci. Data 9:171. doi: 10.1038/s41597-022-01259-9
O’Neill, E. N., Cosenza, Z. A., Baar, K., and Block, D. E. (2021). Considerations for the development of cost-effective cell culture media for cultivated meat production. Compr. Rev. Food Sci. Food Saf. 20, 686–709. doi: 10.1111/1541-4337.12678
Oh, T. H., Park, H. M., Kim, J. W., and Lee, J. M. (2022). Integration of reinforcement learning and model predictive control to optimize semi-batch bioreactor. AICHE J. 68:e17658. doi: 10.1002/aic.17658
Okamoto, Y., Haraguchi, Y., Yoshida, A., Takahashi, H., Yamanaka, K., Sawamura, N., et al. (2022). Proliferation and differentiation of primary bovine myoblasts using Chlorella vulgaris extract for sustainable production of cultured meat. Biotechnol. Prog. 38:e3239. doi: 10.1002/btpr.3239
Orth, J. D., Thiele, I., and Palsson, B. Ø. (2010). What is flux balance analysis? Nat. Biotechnol. 28, 245–248. doi: 10.1038/nbt.1614
Osorio, D., Zhong, Y., Li, G., Xu, Q., Yang, Y., Tian, Y., et al. (2022). scTenifoldKnk: an efficient virtual knockout tool for gene function predictions via single-cell gene regulatory network perturbation. Patterns 3:100434. doi: 10.1016/j.patter.2022.100434
Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., et al. (2022). Training language models to follow instructions with human feedback. Adv. Neural. Inf. Process. Syst. 35, 27730–27744.
Ouyang, Y., Zou, H. K., Gao, X. Y., Chu, G. W., Xiang, Y., and Chen, J. F. (2018). Computational fluid dynamics modeling of viscous liquid flow characteristics and end effect in rotating packed bed. Chem. Eng. Process. Process Intensif. 123, 185–194. doi: 10.1016/j.cep.2017.09.005
Pachitariu, M., and Stringer, C. (2022). Cellpose 2.0: how to train your own model. Nat. Methods 19, 1634–1641. doi: 10.1038/s41592-022-01663-4
Pak, M. A., Markhieva, K. A., Novikova, M. S., Petrov, D. S., Vorobyev, I. S., Maksimova, E. S., et al. (2023). Using AlphaFold to predict the impact of single mutations on protein stability and function. PLoS One 18:e0282689. doi: 10.1371/journal.pone.0282689
Pakhrin, S. C., Aoki-Kinoshita, K. F., Caragea, D., and Kc, D. B. (2021). DeepNGlyPred: a deep neural network-based approach for human N-linked glycosylation site prediction. Molecules 26:7314. doi: 10.3390/molecules26237314
Parameswaran, V., Shukla, R., Bhonde, R., and Hameed, A. S. S. (2007). Development of a pluripotent ES-like cell line from Asian Sea bass (Lates calcarifer)—an oviparous stem cell line mimicking viviparous ES cells. Mar. Biotechnol. 9, 766–775. doi: 10.1007/s10126-007-9028-y
Parasa, N. A., Namgiri, J. V., Mohanty, S. N., and Dash, J. K. (2021). Introduction to unsupervised learning in bioinformatics. Data Anal. Bioinforma Mach. Learn. Perspect., 35–49. doi: 10.1002/9781119785620.ch2
Park, Y., Muttray, N. P., and Hauschild, A. C. (2024). Species-agnostic transfer learning for cross-species transcriptomics data integration without gene orthology. Brief. Bioinform. 25:bbae004. doi: 10.1093/bib/bbae004
Parton, A., Forest, D., Kobayashi, H., Dowell, L., Bayne, C., and Barnes, D. (2007). Cell and molecular biology of SAE, a cell line from the spiny dogfish shark, Squalus acanthias. Comp. Biochem. Physiol. C Toxicol. Pharmacol. 145, 111–119. doi: 10.1016/j.cbpc.2006.07.003
Pasitka, L., Cohen, M., Ehrlich, A., Gildor, B., Reuveni, E., Ayyash, M., et al. (2023). Spontaneous immortalization of chicken fibroblasts generates stable, high-yield cell lines for serum-free production of cultured meat. Nat. Food. 4, 35–50. doi: 10.1038/s43016-022-00658-w
Penarredonda, J. L. (2017). Could AI help to create a meat-free world? Available at: https://www.bbc.com/future/article/20171214-could-ai-help-create-a-meat-free-world
Peng, J., Sun, W., Zhou, G., Xie, L., Han, H., and Xiao, Y. (2022). The accurate prediction and analysis of bed expansion characteristics in liquid–solid fluidized bed based on machine learning methods. Chem. Eng. Sci. 260:117841. doi: 10.1016/j.ces.2022.117841
Perozzi, B., Al-Rfou, R., and Skiena, S. (2014). “DeepWalk: online learning of social representations” in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (New York, NY: Association for Computing Machinery).
Petegrosso, R., Li, Z., and Kuang, R. (2019). Machine learning and statistical methods for clustering single-cell RNA-sequencing data. Brief. Bioinform. 21, 1209–1223. doi: 10.1093/bib/bbz063
Petsagkourakis, P., Sandoval, I. O., Bradford, E., Zhang, D., and del Rio-Chanona, E. A. (2020). Reinforcement learning for batch bioprocess optimization. Comput. Chem. Eng. 133:106649. doi: 10.1016/j.compchemeng.2019.106649
Poinski, M. (2021). Future Meat Technologies raises $347M and makes a $1.70 cell-based chicken breast. Food Dive. Available at: https://www.fooddive.com/news/future-meat-technologies-raises-347m-and-makes-a-170-cell-based-chicken/611712/
Poore, J., and Nemecek, T. (2018). Reducing food’s environmental impacts through producers and consumers. Science 360, 987–992. doi: 10.1126/science.aaq0216
Post, M. J., Levenberg, S., Kaplan, D. L., Genovese, N., Fu, J., Bryant, C. J., et al. (2020). Scientific, sustainability and regulatory challenges of cultured meat. Nat. Food. 1, 403–415. doi: 10.1038/s43016-020-0112-z
Potter, G., Smith, A. S. T., Vo, N. T. K., Muster, J., Weston, W., Bertero, A., et al. (2020). A more open approach is needed to develop cell-based fish technology: it starts with zebrafish. One Earth. 3, 54–64. doi: 10.1016/j.oneear.2020.06.005
Priestley, M., O’Donnell, F., and Simperl, E. (2023). A survey of data quality requirements that matter in ML development pipelines. J. Data Inf. Qual. 15, 1–11.
Protein Report. (2022). Hoxton Farms raises a $22m Series A to solve the big fat problem for meat alternatives. Available at: https://www.proteinreport.org/newswire/hoxton-farms-raises-22m-series-solve-big-fat-problem-meat-alternatives
Qiu, X., Rahimzamani, A., Wang, L., Ren, B., Mao, Q., Durham, T., et al. (2020). Inferring causal gene regulatory networks from coupled single-cell expression dynamics using scribe. Cell Syst. 10, 265–274.e11. doi: 10.1016/j.cels.2020.02.003
Qiu, X., Zhang, Y., Martin-Rufino, J. D., Weng, C., Hosseinzadeh, S., Yang, D., et al. (2022). Mapping transcriptomic vector fields of single cells. Cell 185, 690–711.e45. doi: 10.1016/j.cell.2021.12.045
Qualities, V. A. (2020). Challenges and future of genetic algorithms. Rochester, NY. Available at: https://papers.ssrn.com/abstract=3726035
Race, A. M., Sutton, D., Hamm, G., Maglennon, G., Morton, J. P., Strittmatter, N., et al. (2021). Deep learning-based annotation transfer between molecular imaging modalities: an automated workflow for multimodal data integration. Anal. Chem. 93, 3061–3071. doi: 10.1021/acs.analchem.0c02726
Ramalingam, V., Song, Z., and Hwang, I. (2019). The potential role of secondary metabolites in modulating the flavor and taste of the meat. Food Res. Int. 122, 174–182. doi: 10.1016/j.foodres.2019.04.007
Rao, J., Zhou, X., Lu, Y., Zhao, H., and Yang, Y. (2021). Imputing single-cell RNA-seq data by combining graph convolution and autoencoder neural networks. iScience 24:102393. doi: 10.1016/j.isci.2021.102393
Ravikumar, M., Powell, D., and Huling, R. (2024). Cultivated meat: research opportunities to advance cell line development. Trends Cell Biol. 34, 523–526. doi: 10.1016/j.tcb.2024.04.005
Ravindra, N., Sehanobish, A., Pappalardo, J. L., Hafler, D. A., and van Dijk, D. (2020). Disease state prediction from single-cell data using graph attention networks. In: Proceedings of the ACM Conference on Health, Inference, and Learning. New York, NY: Association for Computing Machinery. p. 121–130.
Reiss, J., Robertson, S., and Suzuki, M. (2021). Cell sources for cultivated meat: applications and considerations throughout the production workflow. Int. J. Mol. Sci. 22:7513. doi: 10.3390/ijms22147513
Ribeiro, A. J. S., Schwab, O., Mandegar, M. A., Ang, Y. S., Conklin, B. R., Srivastava, D., et al. (2017). Multi-imaging method to assay the contractile mechanical output of micropatterned human iPSC-derived cardiac myocytes. Circ. Res. 120, 1572–1583. doi: 10.1161/CIRCRESAHA.116.310363
Risner, D., Li, F., Fell, J. S., Pace, S. A., Siegel, J. B., Tagkopoulos, I., et al. (2021). Preliminary techno-economic assessment of animal cell-based meat. Foods 10:3. doi: 10.3390/foods10010003
Ronneberger, O., Fischer, P., and Brox, T. (2015). U-net: convolutional networks for biomedical image segmentation. arXiv. Available at: http://arxiv.org/abs/1505.04597
Rosen, Y., Brbić, M., Roohani, Y., Swanson, K., Li, Z., and Leskovec, J. (2024). Toward universal cell embeddings: integrating single-cell RNA-seq datasets across species with SATURN. Nat. Methods 21, 1492–1500. doi: 10.1038/s41592-024-02191-z
Rubio, N., Datar, I., Stachura, D., Kaplan, D., and Krueger, K. (2019). Cell-based fish: a novel approach to seafood production and an opportunity for cellular agriculture. Front. Sustain. Food Syst. 3:43. doi: 10.3389/fsufs.2019.00043
Saad, M. K., Yuen, J. S. K., Joyce, C. M., Li, X., Lim, T., Wolfson, T. L., et al. (2023). Continuous fish muscle cell line with capacity for myogenic and adipogenic-like phenotypes. Sci. Rep. 13:5098. doi: 10.1038/s41598-023-31822-2
Saito, Y., Oikawa, M., Sato, T., Nakazawa, H., Ito, T., Kameda, T., et al. (2021). Machine-learning-guided library design cycle for directed evolution of enzymes: the effects of training data composition on sequence space exploration. ACS Catal. 11, 14615–14624. doi: 10.1021/acscatal.1c03753
Sarna, S., Patel, N., Corbett, B., McCready, C., and Mhaskar, P. (2023). Process-aware data-driven modelling and model predictive control of bioreactor for the production of monoclonal antibodies. Can. J. Chem. Eng. 101, 2677–2692. doi: 10.1002/cjce.24752
Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M., and Monfardini, G. (2009). The graph neural network model. IEEE Trans. Neural Netw. 20, 61–80. doi: 10.1109/TNN.2008.2005605
Schinn, S. M., Morrison, C., Wei, W., Zhang, L., and Lewis, N. E. (2021). A genome-scale metabolic network model and machine learning predict amino acid concentrations in Chinese hamster ovary cell cultures. Biotechnol. Bioeng. 118, 2118–2123. doi: 10.1002/bit.27714
Schmidhuber, J. (2015). Deep learning in neural networks: an overview. Neural Netw. 61, 85–117. doi: 10.1016/j.neunet.2014.09.003
Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal policy optimization algorithms. arXiv. Available at: http://arxiv.org/abs/1707.06347
Shan, Y., Yang, J., Li, X., Zhong, X., and Chang, Y. (2023). GLAE: a graph-learnable auto-encoder for single-cell RNA-seq analysis. Inf. Sci. 621, 88–103. doi: 10.1016/j.ins.2022.11.049
Shieber, J. (2021). This Y Combinator startup is taking lab-grown meat upscale with elk, lamb and Wagyu beef cell lines. TechCrunch. Available at: https://techcrunch.com/2021/03/23/this-y-combinator-startup-is-taking-lab-grown-meat-upscale-with-elk-lamb-and-wagyu-beef-cell-lines/
Simsa, R., Yuen, J., Stout, A., Rubio, N., Fogelstrand, P., and Kaplan, D. L. (2019). Extracellular heme proteins influence bovine myosatellite cell proliferation and the color of cell-based meat. Foods 8:521. doi: 10.3390/foods8100521
Singh, A., and Ogunfunmi, T. (2022). An overview of variational autoencoders for source separation, finance, and bio-signal applications. Entropy 24:55. doi: 10.3390/e24010055
Sinke, P., Swartz, E., Sanctorum, H., van der Giesen, C., and Odegard, I. (2023). Ex-ante life cycle assessment of commercial-scale cultivated meat production in 2030. Int. J. Life Cycle Assess. 28, 234–254. doi: 10.1007/s11367-022-02128-8
Smith-Uchotsk, R., and Wanjiru, P. (2023). Identification of hazards in meat products manufactured from cultured animal cells: executive summary. Risk Assessment Unit, Science, Evidence and Research Division, Food Standards Agency. Available at: https://www.food.gov.uk/research/identification-of-hazards-in-meat-products-manufactured-from-cultured-animal-cells-executive-summary
Soice, E., and Johnston, J. (2021). Immortalizing cells for human consumption. Int. J. Mol. Sci. 22:11660. doi: 10.3390/ijms222111660
Southey, F. (2023). ‘World’s first’ growth media factory in the works for cultivated meat, milk. Available at: https://www.foodnavigator.com/Article/2023/01/16/World-s-first-growth-media-factory-in-the-works-for-cultivated-meat-milk
Specht, L. (2020). An analysis of culture medium costs and production volumes for cultivated meat. Good Food Institute. Available at: https://gfi.org/wp-content/uploads/2021/01/clean-meat-production-volume-and-medium-cost.pdf (Accessed June 1, 2024).
Specht, E. A., Welch, D. R., Rees Clayton, E. M., and Lagally, C. D. (2018). Opportunities for applying biomedical production and manufacturing methods to the development of the clean meat industry. Biochem. Eng. J. 132, 161–168. doi: 10.1016/j.bej.2018.01.015
Steinfeld, H., Gerber, P., Wassenaar, T., Castel, V., Rosales, M., and de Haan, C. (2006). Livestock’s long shadow. Available at: https://www.fao.org/3/a0701e/a0701e00.htm
Stephens, N., Di Silvio, L., Dunsford, I., Ellis, M., Glencross, A., and Sexton, A. (2018). Bringing cultured meat to market: technical, socio-political, and regulatory challenges in cellular agriculture. Trends Food Sci. Technol. 78, 155–166. doi: 10.1016/j.tifs.2018.04.010
Stoeckius, M., Hafemeister, C., Stephenson, W., Houck-Loomis, B., Chattopadhyay, P. K., Swerdlow, H., et al. (2017). Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods 14, 865–868. doi: 10.1038/nmeth.4380
Stout, A. J., Mirliani, A. B., Soule-Albridge, E. L., Cohen, J. M., and Kaplan, D. L. (2020). Engineering carotenoid production in mammalian cells for nutritionally enhanced cell-cultured foods. Metab. Eng. 62, 126–137. doi: 10.1016/j.ymben.2020.07.011
Stout, A. J., Rittenberg, M. L., Shub, M., Saad, M. K., Mirliani, A. B., and Kaplan, D. L. (2022). A beefy-R culture medium: replacing albumin with rapeseed protein isolates. bioRxiv. doi: 10.1101/2022.09.02.506409v1
Stout, A. J., Zhang, X., Letcher, S. M., Rittenberg, M. L., Shub, M., Chai, K. M., et al. (2024). Engineered autocrine signaling eliminates muscle cell FGF2 requirements for cultured meat production. Cell Rep. Sustain. 1:2023.04.17.537163. doi: 10.1101/2023.04.17.537163
Suchocki, T., Wojdak-Maksymiec, K., and Szyda, J. (2016). Using gene networks to identify genes and pathways involved in milk production traits in Polish Holstein dairy cattle. Czech J. Anim. Sci. 61, 526–538. doi: 10.17221/43/2015-CJAS
Sun, F., Sun, J., and Zhao, Q. (2022). A deep learning method for predicting metabolite–disease associations via graph neural network. Brief. Bioinform. 23:bbac266. doi: 10.1093/bib/bbac266
Sun, C., Xuan, P., Zhang, T., and Ye, Y. (2022). Graph convolutional autoencoder and generative adversarial network-based method for predicting drug-target interactions. IEEE/ACM Trans. Comput. Biol. Bioinform. 19, 455–464. doi: 10.1109/TCBB.2020.2999084
Svensson, V., Gayoso, A., Yosef, N., and Pachter, L. (2020). Interpretable factor models of single-cell RNA-seq via variational autoencoders. Bioinformatics 36, 3418–3421. doi: 10.1093/bioinformatics/btaa169
Synoground, B. F., McGraw, C. E., Elliott, K. S., Leuze, C., Roth, J. R., Harcum, S. W., et al. (2021). Transient ammonia stress on Chinese hamster ovary (CHO) cells yield alterations to alanine metabolism and IgG glycosylation profiles. Biotechnol. J. 16:2100098. doi: 10.1002/biot.202100098
Széliová, D., Ruckerbauer, D. E., Galleguillos, S. N., Petersen, L. B., Natter, K., Hanscho, M., et al. (2020). What CHO is made of: variations in the biomass composition of Chinese hamster ovary cell lines. Metab. Eng. 61, 288–300. doi: 10.1016/j.ymben.2020.06.002
Tanevski, J., Flores, R. O. R., Gabor, A., Schapiro, D., and Saez-Rodriguez, J. (2022). Explainable multiview framework for dissecting spatial relationships from highly multiplexed data. Genome Biol. 23:97. doi: 10.1186/s13059-022-02663-5
Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., and Mei, Q. (2015). “LINE: large-scale information network embedding” in Proceedings of the 24th International Conference on World Wide Web. Republic and Canton of Geneva, CHE: International World Wide Web Conferences Steering Committee.
Taussat, S., Boussaha, M., Ramayo-Caldas, Y., Martin, P., Venot, E., Cantalapiedra-Hijar, G., et al. (2020). Gene networks for three feed efficiency criteria reveal shared and specific biological processes. Genet. Sel. Evol. 52:67. doi: 10.1186/s12711-020-00585-z
Theodoris, C. V., Xiao, L., Chopra, A., Chaffin, M. D., Al Sayed, Z. R., Hill, M. C., et al. (2023). Transfer learning enables predictions in network biology. Nature 618, 616–624. doi: 10.1038/s41586-023-06139-9
Tian, L., Chen, F., and Macosko, E. Z. (2023). The expanding vistas of spatial transcriptomics. Nat. Biotechnol. 41, 773–782. doi: 10.1038/s41587-022-01448-2
Tian, H., Liu, H., He, Y., Chen, B., Xiao, L., Fei, Y., et al. (2020). Combined application of electronic nose analysis and back-propagation neural network and random forest models for assessing yogurt flavor acceptability. J. Food Meas. Charact. 14, 573–583. doi: 10.1007/s11694-019-00335-w
Tian, T., Wan, J., Song, Q., and Wei, Z. (2019). Clustering single-cell RNA-seq data with a model-based deep learning approach. Nat. Mach. Intell. 1, 191–198. doi: 10.1038/s42256-019-0037-0
Tian, D., Wenlock, S., Kabir, M., Tzotzos, G., Doig, A. J., and Hentges, K. E. (2018). Identifying mouse developmental essential genes using machine learning. Dis. Model. Mech. 11:dmm034546. doi: 10.1242/dmm.034546
Tuomisto, H. L., and Teixeira de Mattos, M. J. (2011). Environmental impacts of cultured meat production. Environ. Sci. Technol. 45, 6117–6123. doi: 10.1021/es200130u
Tuwani, R., Wadhwa, S., and Bagler, G. (2019). BitterSweet: building machine learning models for predicting the bitter and sweet taste of small molecules. Sci. Rep. 9:7155. doi: 10.1038/s41598-019-43664-y
Vaitkus, V., Brazauskas, K., and Repšytė, J. (2020). Soft-sensors based on black-box models for bioreactors monitoring and state estimation. In: Proceedings of the 2020 12th International Conference on Bioinformatics and Biomedical Technology. New York, NY: Association for Computing Machinery.
van Kempen, M., Kim, S. S., Tumescheit, C., Mirdita, M., Lee, J., Gilchrist, C. L. M., et al. (2023). Fast and accurate protein structure search with Foldseek. Nat. Biotechnol. 42, 243–246. doi: 10.1038/s41587-023-01773-0
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., et al. (2017). “Attention is all you need” in Advances in neural information processing systems (Curran Associates, Inc.). Available at: https://proceedings.neurips.cc/paper_files/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html
Veliça, P., and Bunce, C. M. (2011). A quick, simple and unbiased method to quantify C2C12 myogenic differentiation. Muscle Nerve 44, 366–370. doi: 10.1002/mus.22056
Venâncio, T. M., Oliveira, A. E. A., Silva, L. B., Machado, O. L. T., Fernandes, K. V. S., and Xavier-Filho, J. (2003). A protein with amino acid sequence homology to bovine insulin is present in the legume Vigna unguiculata (cowpea). Braz. J. Med. Biol. Res. 36, 1167–1173. doi: 10.1590/S0100-879X2003000900004
Venkatesan, M., Semper, C., Skrivergaard, S., Di Leo, R., Mesa, N., Rasmussen, M. K., et al. (2022). Recombinant production of growth factors for application in cell culture. iScience 25:105054. doi: 10.1016/j.isci.2022.105054
Verma, R., Kumar, N., Patil, A., Kurian, N. C., Rane, S., Graham, S., et al. (2021). MoNuSAC2020: a multi-organ nuclei segmentation and classification challenge. IEEE Trans. Med. Imaging 40, 3413–3423. doi: 10.1109/TMI.2021.3085712
Vijayakumar, S., Rahman, P. K. S. M., and Angione, C. (2020). A hybrid flux balance analysis and machine learning pipeline elucidates metabolic adaptation in cyanobacteria. iScience 23:101818. doi: 10.1016/j.isci.2020.101818
Vinodkumar, P. K., Ozcinar, C., and Anbarjafari, G. (2021). Prediction of sgRNA off-target activity in CRISPR/Cas9 gene editing using graph convolution network. Entropy 23:608. doi: 10.3390/e23050608
Voorhamme, D., and Yandell, C. A. (2006). LONG R3IGF-I as a more potent alternative to insulin in serum-free culture of HEK293 cells. Mol. Biotechnol. 34, 201–204. doi: 10.1385/MB:34:2:201
Wang, D., and Gu, J. (2018). VASC: dimension reduction and visualization of single-cell RNA-seq data by deep variational autoencoder. Genomics Proteomics Bioinformatics 16, 320–331. doi: 10.1016/j.gpb.2018.08.003
Wang, J., Ma, A., Chang, Y., Gong, J., Jiang, Y., Qi, R., et al. (2021). scGNN is a novel graph neural network framework for single-cell RNA-Seq analyses. Nat. Commun. 12:1882. doi: 10.1038/s41467-021-22197-x
Wang, T., Shao, W., Huang, Z., Tang, H., Zhang, J., Ding, Z., et al. (2021). MOGONET integrates multi-omics data using graph convolutional networks allowing patient classification and biomarker identification. Nat. Commun. 12:3445. doi: 10.1038/s41467-021-23774-w
Wang, Y. M., Sun, Y., Wang, B., Wu, Z., He, X. Y., and Zhao, Y. (2024). Transfer learning for clustering single-cell RNA-seq data crossing-species and batch, case on uterine fibroids. Brief. Bioinform. 25:bbad426. doi: 10.1093/bib/bbad426
Wang, M., Tsanas, A., Blin, G., and Robertson, D. (2020). Predicting pattern formation in embryonic stem cells using a minimalist, agent-based probabilistic model. Sci. Rep. 10:16209. doi: 10.1038/s41598-020-73228-4
Wang, Y., Wang, H., Wei, L., Li, S., Liu, L., and Wang, X. (2020). Synthetic promoter design in Escherichia coli based on a deep generative network. Nucleic Acids Res. 48, 6403–6412. doi: 10.1093/nar/gkaa325
Wang, Y. T., Yang, Z. X., Piao, Z. H., Xu, X. J., Yu, J. H., and Zhang, Y. H. (2021). Prediction of flavor and retention index for compounds in beer depending on molecular structure using a machine learning method. RSC Adv. 11, 36942–36950. doi: 10.1039/D1RA06551C
Wang, N., Zeng, M., Li, Y., Wu, F., and Li, M. (2021). Essential protein prediction based on node2vec and XGBoost. J. Comput. Biol. 28, 687–700. doi: 10.1089/cmb.2020.0543
Watson, J. L., Juergens, D., Bennett, N. R., Trippe, B. L., Yim, J., Eisenach, H. E., et al. (2022). Broadly applicable and accurate protein design by integrating structure prediction networks and diffusion generative models. bioRxiv. doi: 10.1101/2022.12.09.519842v2
Wei, R., and Mahmood, A. (2021). Recent advances in variational autoencoders with representation learning for biomedical informatics: a survey. IEEE Access. 9, 4939–4956. doi: 10.1109/ACCESS.2020.3048309
Wen, H., Ding, J., Jin, W., Wang, Y., Xie, Y., and Tang, J. (2022). Graph neural networks for multimodal single-cell data integration. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. New York, NY: Association for Computing Machinery. p. 4153–4163. Available at: https://doi.org/10.1145/3534678.3539213 (Accessed June 1, 2024).
Wienert, S., Heim, D., Saeger, K., Stenzinger, A., Beil, M., Hufnagl, P., et al. (2012). Detection and segmentation of cell nuclei in virtual microscopy images: a minimum-model approach. Sci. Rep. 2:503. doi: 10.1038/srep00503
Wijayaningrum, V. N., Mahmudy, W. F., and Natsir, M. H. (2017). Optimization of poultry feed composition using hybrid adaptive genetic algorithm and simulated annealing. J. Telecommun. Electron. Comput. Eng. 9, 183–187. Available at: https://jtec.utem.edu.my/jtec/article/view/2652
Wiltschko, A. B. (2019). Learning to smell: using deep learning to predict the olfactory properties of molecules. Available at: https://blog.research.google/2019/10/learning-to-smell-using-deep-learning.html
Winfree, S. (2022). User-accessible machine learning approaches for cell segmentation and analysis in tissue. Front. Physiol. 13:833333. doi: 10.3389/fphys.2022.833333
Wu, D., Xu, F., Xu, Y., Huang, M., Li, Z., and Chu, J. (2024). Towards a hybrid model-driven platform based on flux balance analysis and a machine learning pipeline for biosystem design. Synth. Syst. Biotechnol. 9, 33–42. doi: 10.1016/j.synbio.2023.12.004
Yan, F., Powell, D. R., Curtis, D. J., and Wong, N. C. (2020). From reads to insight: a hitchhiker’s guide to ATAC-seq data analysis. Genome Biol. 21:22. doi: 10.1186/s13059-020-1929-3
Yang, F., Fan, K., Song, D., and Lin, H. (2020). Graph-based prediction of protein-protein interactions with attributed signed graph embedding. BMC Bioinformatics 21:323. doi: 10.1186/s12859-020-03646-8
Yang, L., Li, L. P., and Yi, H. C. (2022). DeepWalk based method to predict lncRNA-miRNA associations via lncRNA-miRNA-disease-protein-drug graph. BMC Bioinformatics 22:621. doi: 10.1186/s12859-022-04579-0
Yu, D., Cai, Z., Li, D., Zhang, Y., He, M., Yang, Y., et al. (2021). Myogenic differentiation of stem cells for skeletal muscle regeneration. Stem Cells Int. 2021:8884283. doi: 10.1155/2021/8884283
Yu, K. H., Zhang, C., Berry, G. J., Altman, R. B., Ré, C., Rubin, D. L., et al. (2016). Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features. Nat. Commun. 7:12474. doi: 10.1038/ncomms12474
Yue, Y., Ye, C., Peng, P. Y., Zhai, H. X., Ahmad, I., Xia, C., et al. (2022). A deep learning framework for identifying essential proteins based on multiple biological information. BMC Bioinformatics 23:318. doi: 10.1186/s12859-022-04868-8
Zavala-Ortiz, D. A., Denner, A., Aguilar-Uscanga, M. G., Marc, A., Ebel, B., and Guedon, E. (2022). Comparison of partial least square, artificial neural network, and support vector regressions for real-time monitoring of CHO cell culture processes using in situ near-infrared spectroscopy. Biotechnol. Bioeng. 119, 535–549. doi: 10.1002/bit.27997
Zhang, J., Jiang, Z., Hu, X., and Song, B. (2020). A novel graph attention adversarial network for predicting disease-related associations. Methods 179, 81–88. doi: 10.1016/j.ymeth.2020.05.010
Zhang, W., Li, J., Liu, T., Leng, S., Yang, L., Peng, H., et al. (2021). Machine learning prediction and optimization of bio-oil production from hydrothermal liquefaction of algae. Bioresour. Technol. 342:126011. doi: 10.1016/j.biortech.2021.126011
Zhang, Y., Yang, L., Chen, J., Fredericksen, M., Hughes, D. P., and Chen, D. Z. (2017). “Deep adversarial networks for biomedical image segmentation utilizing unannotated images” in Medical image computing and computer assisted intervention − MICCAI 2017. eds. M. Descoteaux, L. Maier-Hein, A. Franz, P. Jannin, D. L. Collins, and S. Duchesne (Cham: Springer International Publishing), 408–416.
Zhang, D., Zhang, W., Zhao, Y., Zhang, J., He, B., Qin, C., et al. (2024). DNAGPT: a generalized pre-trained tool for multiple DNA sequence analysis tasks. bioRxiv. Available at: https://www.biorxiv.org/content/10.1101/2023.07.11.548628v3
Zhou, J., Cui, G., Hu, S., Zhang, Z., Yang, C., Liu, Z., et al. (2020). Graph neural networks: a review of methods and applications. AI Open 1, 57–81. doi: 10.1016/j.aiopen.2021.01.001
Zhou, T., Reji, R., Kairon, R. S., and Chiam, K. H. (2023). A review of algorithmic approaches for cell culture media optimization. Front. Bioeng. Biotechnol. 11:1195294. doi: 10.3389/fbioe.2023.1195294
Zhou, Z., Siddiquee, M. M. R., Tajbakhsh, N., and Liang, J. (2018). UNet++: a nested U-Net architecture for medical image segmentation. arXiv. Available at: http://arxiv.org/abs/1807.10165
Zhu, W., Benkwitz, F., and Kilmartin, P. A. (2021). Volatile-based prediction of Sauvignon blanc quality gradings with static headspace-gas chromatography-ion mobility spectrometry (SHS-GC-IMS) and interpretable machine learning techniques. J. Agric. Food Chem. 69, 3255–3265. doi: 10.1021/acs.jafc.0c07899
Zinchuk, V., and Grossenbacher-Zinchuk, O. (2023). Machine learning for analysis of microscopy images: a practical guide and latest trends. Curr. Protoc. 3:e819. doi: 10.1002/cpz1.819
Keywords: machine learning, artificial intelligence, cultured meat, cell culture, culture media design, microscopy, bioprocessing, food science
Citation: Todhunter ME, Jubair S, Verma R, Saqe R, Shen K and Duffy B (2024) Artificial intelligence and machine learning applications for cultured meat. Front. Artif. Intell. 7:1424012. doi: 10.3389/frai.2024.1424012
Edited by:
Zeynep Ünal, Ömer Halisdemir University, Türkiye
Reviewed by:
Yuanfang Ren, University of Florida, United States
Mehmet Kayakus, Akdeniz University, Türkiye
Copyright © 2024 Todhunter, Jubair, Verma, Saqe, Shen and Duffy. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Breanna Duffy, breanna@new-harvest.org
†These authors have contributed equally to this work
‡These authors share first authorship