
ORIGINAL RESEARCH article

Front. Sustain. Food Syst., 05 November 2021
Sec. Sustainable Food Processing

Intelligent Sensors for Sustainable Food and Drink Manufacturing

Nicholas J. Watson1*, Alexander L. Bowler1, Ahmed Rady1, Oliver J. Fisher1, Alessandro Simeone2,3, Josep Escrig4, Elliot Woolley5, Akinbode A. Adedeji6
  • 1Food, Water, Waste Research Group, Faculty of Engineering, University of Nottingham, Nottingham, United Kingdom
  • 2Intelligent Manufacturing Key Laboratory of Ministry of Education, Shantou University, Shantou, China
  • 3Department of Management and Production Engineering, Politecnico di Torino, Corso Duca degli Abruzzi 24, Turin, Italy
  • 4i2CAT Foundation, Barcelona, Spain
  • 5Wolfson School of Mechanical, Electrical and Manufacturing Engineering, Loughborough University, Loughborough, United Kingdom
  • 6Department of Biosystems and Agricultural Engineering, University of Kentucky, Lexington, KY, United States

Food and drink is the largest manufacturing sector worldwide and has significant environmental impact in terms of resource use, emissions, and waste. However, food and drink manufacturers are restricted in addressing these issues due to the tight profit margins they operate within. Advances in two industrial digital technologies, sensors and machine learning, present manufacturers with affordable methods to collect and analyse manufacturing data and enable enhanced, evidence-based decision making. These technologies will enable manufacturers to reduce their environmental impact by making processes more flexible and efficient in terms of how they manage their resources. In this article, a methodology is proposed that combines online sensors and machine learning to provide a unified framework for the development of intelligent sensors that address food and drink manufacturers' resource efficiency problems. The methodology is then applied to four food and drink manufacturing case studies to demonstrate its capabilities for a diverse range of applications within the sector. The case studies included the monitoring of mixing, cleaning and fermentation processes in addition to predicting key quality parameters of crops. For all case studies, the methodology was successfully applied and predictive models with accuracies ranging from 95 to 100% were achieved. The case studies also highlight challenges and considerations which still remain when applying the methodology, including efficient data acquisition and labelling, feature engineering, and model selection. This paper concludes by discussing the future work necessary around the topics of new online sensors, infrastructure, data acquisition and trust to enable the widespread adoption of intelligent sensors within the food and drink sector.

Introduction

Food and drink is the world's largest manufacturing sector with annual global sales of over £6 trillion (Department for Business Energy and Industrial Strategy, 2017). In the UK alone, the food and drink sector contributes over £28 billion to the economy and employs over 400,000 workers (Food and Drink Federation Statistics at a Glance, 2021). One of the biggest challenges facing the sector is how to produce nutritious, safe and affordable food whilst minimising the environmental impacts. It has been reported that global food and drink production and distribution consumes approximately 15% of fossil fuels and is responsible for 28% of greenhouse gas emissions (Department for Business Energy and Industrial Strategy, 2017). Manufacturing is experiencing the fourth industrial revolution (often labelled Industry 4.0 or digital manufacturing), which is the use of Industrial Digital Technologies (IDTs) such as robotics, sensors, and artificial intelligence within manufacturing environments. Key to the fourth industrial revolution is the enhanced collection and use of data to enable evidence-based decision making. Although IDTs have been shown to deliver productivity, efficiency and sustainability benefits in many manufacturing sectors, their adoption has been much slower in food and drink. This has often been attributed to the characteristics of the sector, which is extremely dynamic, producing high volumes of low-value products with limited resources to commit to process innovation. As data is the key component of digital manufacturing, there is a need for appropriate sensing technologies to collect this data.

Although many simple sensors exist, such as those for temperature and pressure measurements, there is a shortage of cost-effective, advanced technologies which can provide actionable information on the properties of material streams (e.g., feedstocks, products, and waste) and the manufacturing processes. Data mining and Machine Learning (ML) techniques may be used to analyse sensor measurements and generate actionable information. Machine learning methods develop models which learn from a training data set and are capable of fitting complex functions between input and output data. Machine learning models are highly suited to food and drink manufacturing environments because the sector manufactures high volumes of products and therefore generates high volumes of data available to develop models.

The focus of this article is a methodology for combining sensor measurements and ML to improve the environmental sustainability of food and drink manufacturing processes. The article will begin with a summary of sustainability challenges and relevant sensor and ML research. Following this, the methodology for combining sensor measurements and ML will be presented and then applied to four industrially relevant case studies. These case studies will highlight the benefits and key considerations of the methodology in addition to any challenges which still remain.

Sustainability

The rapid increase in the consumption of processed foods together with the production of complex, multicomponent food products (e.g., breaded chicken breasts) driven by ever-changing consumer demand (e.g., new dietary requirements, low effort meals, etc.) makes the food industry one of the most energy-intensive manufacturing sectors (Ladha-Sabur et al., 2019). Additionally, the crop and animal production part of food supply needs to tackle challenges in land use, resource consumption due to extended season growing, waste production, use of chemicals and transport emissions (Food and Agriculture Organization of the United Nations, 2017). Within the factory, practically every sustainability challenge revolves around two core issues: resource (material, energy, water, and time) inefficiency and inherent waste production. For example, lead times are often longer than order times so manufacturers regularly over-produce, which often leads to waste.

There are challenges around improving the energy efficiency of common processes such as refrigeration (Tsamos et al., 2017), drying (Sun et al., 2019), and frying (Su et al., 2018) whilst other processes such as washing and cleaning are hugely water-intensive (Simeone et al., 2018). Within food and drink manufacturing, examples of IDT use to improve resource efficiency include forecasting energy consumption to inform mitigation measures (Ribeiro et al., 2020), reducing energy consumption in drying (Sun et al., 2019), reducing the amount of product lost by automating product quality testing (García-Esteban et al., 2018) and reducing water consumption in agriculture via Industrial Internet of Things (IIoT) monitoring (Jha et al., 2019). Although time is rarely discussed in sustainable manufacturing discussions, it is a key aspect as reducing the time of a process reduces its resource demand and the associated overheads (e.g., lighting and heating a factory). Monitoring time can also enable more efficient use and scheduling of manufacturing assets.

The UK manufacturing sector is directly responsible for the production of about 1.5 million tonnes of food waste annually post-farm gate, of which an estimated 50% is wasted food and the remainder inedible parts (Parry et al., 2020). As waste is an inherent part of the raw food, avoidance options are not always available and, due to its low value, the waste is not normally managed in the most sustainable manner (Garcia-Garcia et al., 2019a). There are opportunities to take a systematic approach to industrial food waste management to reduce the proportions sent to landfill, as shown in Garcia-Garcia et al. (2017, 2019b), and research has shown potential for the valorisation of food waste to further recover economic and environmental value (Garcia-Garcia et al., 2019a). There are examples of utilising IDTs to track food waste during production, which has led to reduced levels of food waste (Jagtap and Rahimifard, 2019; Jagtap et al., 2019; Garre et al., 2020). One example saw reductions in food waste of 60% by capturing waste data during manufacturing in real-time and sharing it with all the stakeholders in a food supply chain (Jagtap and Rahimifard, 2019). Another example used ML to predict deviations in production, reducing uncertainties related to the amount of waste produced (Garre et al., 2020). These examples demonstrate that increased monitoring and modelling of food and drink production systems increases their sustainability.

Online Sensors

Online sensors are a cornerstone technology of digital manufacturing as they generate real-time data on manufacturing processes and material streams. There are several contradictory definitions of what is meant by “online” sensors. For this work, we define online sensors as sensors that directly measure the material or process, in real time, without the need for a bypass loop or sample removal for further analysis. A survey on the state of the food manufacturing sector identified that digital sensors and transmitters were the most likely hardware components to be purchased in 2019 (Laughman, 2019). The expected rise in the industrial deployment of sensors is driven by several factors: they are considerably cheaper than other IDTs (e.g., robots) and can often be retrofitted onto existing equipment, reducing disruption to existing manufacturing processes. Historically, the main sensors used in manufacturing processes monitor simple properties such as temperature, pressure, flow rate and fill level. Although these are essential for process monitoring and control, more advanced sensing technologies are required to provide detailed information on manufacturing processes and key material properties.

Sensors have been used to monitor resource consumption in single unit operations or entire food production systems (Ladha-Sabur et al., 2019). In addition, sensors have been used to measure and optimise the performance of unit operations that have a large carbon footprint (Pereira et al., 2016). A particular focus on the use of sensors within food and drink manufacturing is for monitoring the key quality parameters of products (e.g., Takacs et al., 2020). Although these measurements are primarily used for safety and quality control, this also impacts on the sustainability of the process as any product deemed to be of unacceptable quality is often sent to waste or reworked into another product, requiring the use of additional resources. Other sensors performing measurements, such as weight, have also been developed to directly monitor waste generated in food production processes (Jagtap and Rahimifard, 2019).

Many different types of sensing techniques exist; these are characterised by technical features including sensing modality, spatial measurement mode (point, line, area, or volume), resolution, accuracy, and speed of data acquisition and analysis. Other aspects which must be taken into account include the sensor's cost and ability to autonomously and non-invasively perform real-time measurements in production environments. Although many different sensing techniques exist, the most popular ones within food and drink manufacturing include visible imaging (Wu and Sun, 2013; Tomasevic et al., 2019), Near-Infrared (NIR) spectroscopy (Porep et al., 2015; McGrath et al., 2021), hyperspectral imaging (Huang et al., 2014; Saha and Manickavasagan, 2021), X-ray (Mathanker et al., 2013; de Medeiros et al., 2021), Ultrasonic (US) (Mathanker et al., 2013; Fariñas et al., 2021), microwave (Farina et al., 2019), and terahertz (Ok et al., 2014; Ren et al., 2019). The majority of previous work has focused on sensing the properties of the food materials, but sensors have also been deployed to monitor processes such as mixing (Bowler et al., 2020a) and the fouling of heat exchangers (Wallhäußer et al., 2012). The majority of previously reported work is laboratory-based, but advanced sensors are becoming more widely deployed within production environments, with the most popular being optical and X-ray methods.

The key for sensors to work effectively in industrial environments is not to focus on adapting high-precision lab-based analytical methods but to determine the key information required to make a manufacturing decision and identify the most suitable cost-effective sensing solution. For example, to determine if a piece of processing equipment is clean, a sensor should be deployed which can determine if fouling is present on its surface. Although more advanced sensing technologies could determine the composition and volume of fouling, these are not required to make the required manufacturing decision. Sensing techniques also benefit from another key IDT: the IIoT. The IIoT enables sensors to be connected to the internet, which reduces the cost and size of hardware required on-site and enables the sensors to benefit from the enormous computing resources available in the cloud.

For any sensor technology, there is a need for suitable methods to process the recorded sensor measurement and produce information about the material or process being monitored. For many sensors, first principle models can be utilised based on a sound scientific understanding of its mode of operation. However, for more complex sensing technologies and measurement environments, it is often difficult to develop first-principles models, as they need to account for many factors which affect the sensor's measurement. This is especially challenging when using sensors to monitor highly complex and variable biological materials in production environments which are extremely noisy with constantly changing environmental conditions (e.g., atmospheric light and temperature).

Machine Learning

An alternative to first-principles methods is Data-Driven Models (DDMs), a subset of empirical modelling that encompasses the fields of computational intelligence and ML (Solomatine et al., 2008). Computational intelligence comprises nature-inspired computational approaches to problem-solving (Saka et al., 2013), whereas ML focuses on the development of algorithms and models that can access data and use it to learn for themselves (Coley et al., 2018). It is this capability that makes ML suited to intelligent sensor development. It should be noted that the prediction performance of ML models is only as good as the data used to train the model, but performance can continuously be improved as more or better data becomes available (Goodfellow et al., 2016). Machine learning is experiencing more widespread use within manufacturing primarily due to the ever-increasing amount of data generated by IIoT devices and constant improvements in the computing power required to process these vast quantities of data. Machine learning models can be used for a variety of tasks, but the two most popular are classification and regression. Classification tasks are used to select a class of output (e.g., is a measured food of acceptable quality or not) whereas regression models output a numerical value (e.g., sucrose content in a potato). Machine learning methods are further categorised based on their learning approach, primarily either supervised or unsupervised. Supervised ML methods have a training dataset with input data for known outputs and can be used to address classification and regression problems. Unsupervised methods do not have known outputs and use techniques such as Principal Component Analysis (PCA) or k-means clustering to identify structures which may exist within the data.
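To make these task types concrete, the short sketch below contrasts a classification model, a regression model, and an unsupervised clustering of unlabelled measurements. It is a minimal illustration assuming scikit-learn and synthetic placeholder data, not data drawn from the case studies.

```python
# Minimal sketch, assuming scikit-learn; the feature matrix and targets are
# synthetic placeholders standing in for engineered sensor features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                 # e.g., five engineered sensor features

# Classification: is a measured product of acceptable quality or not?
y_class = (X[:, 0] + X[:, 1] > 0).astype(int)
clf = RandomForestClassifier(random_state=0).fit(X, y_class)

# Regression: predict a numerical value, e.g., sucrose content.
y_reg = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)
reg = RandomForestRegressor(random_state=0).fit(X, y_reg)

# Unsupervised: cluster unlabelled measurements to reveal structure.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
```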

The majority of previous research that utilises ML to analyse sensor data has used supervised methods, with the most popular including Artificial Neural Networks (ANN), Support Vector Machines (SVM), Decision Trees (DT), and K-Nearest Neighbours (KNN) (Zhou et al., 2019; Bowler et al., 2020b). These standard methods are often called base, weak or shallow learners. The most recent advances in ML are generally in the area of deep learning, which can overcome limitations of earlier shallow networks that prevented efficient training and abstraction of hierarchical representations of multi-dimensional training data (Shrestha and Mahmood, 2019). There are a variety of different deep learning methods, but often they include multi-layer neural networks and can automate the feature selection process using methods such as Convolutional Neural Networks (CNN) (Liu et al., 2018). Deep learning does come with drawbacks, however, such as high training time, overfitting and increased complexity (Shrestha and Mahmood, 2019). An alternative to deep learning is to improve base learners' predictive capabilities through ensemble methods that combine numerous base learners with techniques such as bagging or boosting to improve overall model prediction performance (Hadavandi et al., 2015).
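The difference between a single base learner and an ensemble built from it can be illustrated with a short sketch (scikit-learn, synthetic data; not taken from the case studies), comparing one shallow decision tree with bagged and boosted ensembles of the same tree.

```python
# Hedged sketch: a single shallow decision tree versus bagging and boosting
# ensembles of that tree, evaluated with 5-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

models = {
    "single tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "bagging": BaggingClassifier(DecisionTreeClassifier(max_depth=3), n_estimators=50, random_state=0),
    "boosting": AdaBoostClassifier(DecisionTreeClassifier(max_depth=3), n_estimators=50, random_state=0),
}
for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=5).mean())
```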

Machine learning methods have been successfully combined with sensor data for a variety of applications within the food and drink manufacturing sector. The majority of this work has focused on optical techniques. Vision camera systems have utilised ML models for applications such as fruit and vegetable sorting (Mahendran et al., 2015), defect detection (Liu et al., 2018), poultry inspection (Chao et al., 2008), quality assessment (Geronimo et al., 2019), adulteration detection in meat (Al-Sarayreh et al., 2018), quality inspection of baked products (Du et al., 2012), adulteration detection in spices (Oliveira et al., 2020), and curing of bacon (Philipsen and Moeslund, 2019). The majority of previous work has focused on combining sensors and ML to monitor the food materials being manufactured, and commercial solutions are now available for applications such as potato grading (B-Hive, 2020). In addition, vision systems utilising ML have been combined with other IDTs such as robots for applications such as autonomous fruit harvesting (Yu et al., 2019). Although the majority of previous work combining sensors and ML has focused on optical systems (imaging and spectroscopic), research has been performed using data from other sensor instruments. For example, X-ray measurements have been combined with ML for the internal inspection of fruit (Van De Looverbosch et al., 2020). Several review articles covering ML research within food and drink are available, to which the reader may refer (Liakos et al., 2018; Rehman et al., 2019; Zhou et al., 2019; Sharma et al., 2020).

Intelligent Sensor Methodology

Many different approaches have been developed to standardise the data modelling process, including CRISP-DM and the Analytics Solutions Unified Method for Data Mining/predictive analytics (ASUM-DM) from IBM (Angée et al., 2018). However, many of these methods are focussed on general data-driven modelling projects and are not specific to ML modelling of sensor data in food and drink production environments. As previously discussed, the transition to Industry 4.0 means that there is a growing variety of online sensors available to monitor production environments. Developing these sensors involves considerations around precision of measurements, cost, positioning, and deployment in industrial environments, which will impact the volume and granularity of data available to develop a ML model. While undertaking the projects in the reported case studies, the need for a unifying methodology between the two areas became apparent. Figure 1 presents a methodology for combining sensor measurements and ML to create intelligent sensors to address the specific challenge of efficient resource use within food and drink manufacturing. The methodology has been devised as a synthesis of the methodologies applied in the reported case studies and hence is grounded in the practical application of the problem. While this methodology has been developed for the food and drink industry, it may be adapted for other industries and applications.

FIGURE 1

Figure 1. Intelligent sensing process for resource efficiency. The steps in the green boxes are directly related to the resource efficiency problem, the orange boxes to the sensor aspects and the blue to the machine learning process. Solid arrows indicate the main direction of methodology, thick dashed lines indicate scenarios that require a return to a previous step and thin dashed arrows indicate stages that will undergo iterative improvement.

Resource Efficiency Problem

The intelligent sensor developer should first define the specific resource problem to be addressed, similar to the business need step in other data modelling methodologies (Azevedo and Santos, 2008). Specific resource challenges may include minimising the consumption of resources utilised during the process in addition to the emissions and waste generated. The problem should be well-defined, take into consideration the scope of influence of a company (e.g., if they alter the composition of their waste streams, what impact might this have on available treatments) and also ensure that any related economic and social implications are considered.

Determine Resource Key Performance Indicator (KPI)

Decide the metrics that will be used to monitor the resource and how they will be measured. Metrics may include the amount of resources utilised or waste generated.

Intelligent Sensor Requirement

The developer must specify the required output from the predictive model. This could be a value related to a key quality parameter of a product predicted through a regression model (e.g., moisture content) or to determine if the product contains damage or is of acceptable quality or not through a classification model. Alternatively, the purpose of the model could be to predict something related to a unit operation (process). This could include predicting whether the process had reached its end-point or not (classification) or the predicted time remaining (regression) until optimal end-point. It could also include identifying if a fault has occurred in the process (anomaly detection) to determine if an intervention is required.

Sensor Selection

Although obvious, the developer must ensure that when selecting a sensor, its sensing modality can record data sufficient to achieve the intelligent sensor requirement. For example, if the requirement is to determine the grade of a fruit or vegetable based on size, the sensor must produce information on size and an imaging system would be appropriate. If the requirement is to determine a property such as moisture content, then a sensor sensitive to changes in moisture, such as NIR or dielectric, should be used. A sensor may be required to provide predictions on the internal properties of food, so sensors capable of measuring internally, such as X-ray, US, and electrical methods, would be required. For certain applications, the choice of sensing technologies may be limited, whereas for others there may be many. The developer should select from the appropriate technologies based on technical specifications, such as accuracy, precision, and resolution, in addition to other factors such as size, cost and ease of installation and use. Food safety is an essential aspect of food and drink production, therefore any sensor should be easy to clean and not present any safety or contamination risks.

Sensor Data Collection and Labelling

At this stage, the developer must collect and label the sensor data required to train the ML models. Considerations need to be made in terms of the volume of sensor data and how representative it is of the system under investigation. Regarding volume, the developer must weigh the trade-off that always exists: ML models generally perform better with more data, but this comes with time costs associated with collecting and labelling the data. With ML it is important that the data set is appropriate for the modelling approach to be used. For example, when developing classification models, it is important to collect enough data for each output class to ensure effective performance. If only a small number of samples is expected in one class (e.g., rejections based on a rare quality defect), an anomaly detection model may be more suitable. If supervised ML methods are to be used, the recorded data requires a label to determine its class for classification models or value for regression models. There are different ways to label the data that the developer should consider. Often labelling is completed by humans, which can be extremely costly in terms of time and disruptions to production processes. Labelling of data is one of the primary challenges with utilising ML methods within production environments. If labelling of all collected data is not possible, semi-supervised or unsupervised methods should be explored in addition to domain adaptation and transfer learning. The latter two are methods that apply a model trained in one or more “source” domains to a different, but related, “target” domain (Pan et al., 2011). However, they rely on a trained model already existing.

Once the data is collected the developer needs to partition it into training, validation, and test sets. Training data is used to train the models and validation is used to tune the model hyperparameters and the model input variables. A hyperparameter is an adjustable algorithm parameter (e.g., number of layers and nodes in an ANN) that must be either manually or automatically tuned in order to obtain a model with the optimal performance (Zeng and Luo, 2017). The test data is finally used to evaluate the performance of the model with data that was not used for any of the training and validation processes. Test data provides an unbiased evaluation of the final model fit on data outside of the training data set.

Partitioning of the data may vary depending on the volume of data available and how representative the data is of the system being modelled; for example, Clement et al. (2020) split their dataset into 70% training, 15% validation and 15% test.
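As an illustration, a 70/15/15 partition of this kind can be produced with two successive splits; the sketch below assumes scikit-learn and placeholder arrays standing in for the collected sensor data.

```python
# Minimal sketch of a 70% training / 15% validation / 15% test partition.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))      # placeholder sensor features
y = rng.integers(0, 2, size=1000)    # placeholder labels

X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.30, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.50, random_state=0)
# X_train holds 70% of the data; X_val and X_test hold 15% each.
```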

Design Machine Learning Model(s)

In the first step of the modelling stage the developer must determine the most suitable ML algorithm(s) to use. A range of different ML algorithms exist, and it is often difficult to determine the most suitable one for a specific application. Often ML practitioners will assess a range of different algorithms to determine which results in the best performance based on the validation set. Once the algorithm has been selected, the developer must also determine model hyperparameters. Developers often initially set hyperparameters based on their own past experience, similar work available in the literature or initial prototype models.

Feature Engineering

The developer will be required to use domain knowledge to extract variables from raw recorded sensor signals and process them so that they are in a suitable modelling format (Kuhn and Johnson, 2020). Feature extraction methods tend to be unique to each different sensor and could be based on the physical interpretation of the recorded signal or an appropriate signal transformation. Feature engineering is not always necessary, as certain ML techniques, such as CNNs, automate this step. However, these techniques often require significantly larger volumes of data.

Feature Selection

Of the engineered features, the developer must determine if there are redundant or useless features which harm the learning process (Kuhn and Johnson, 2020). Feature selection is important as a high degree of dimensionality within input variables can cause overfitting in ML models. Overfitting is the generation of a model that corresponds too closely or exactly to the training dataset (and sometimes noise), which negatively impacts future predictions (Srivastava et al., 2014). The developer may use one or both of two categories of feature selection techniques. Firstly, supervised selection involves examining input variables in conjunction with a trained model where the effect of adding or removing variables can be assessed against model performance at predicting the target variable. The tuning of model input variables is incorporated into the model validation stage. The second approach, called unsupervised selection, performs statistical tests on the input variables (e.g., the correlation between variables) to determine which are similar or do not convey significant information. An example of this is to use PCA. This creates a projection of the data resulting in entirely new input features, or principal components.
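The two routes can be sketched as follows (scikit-learn, synthetic data; the feature counts are arbitrary): a supervised filter that scores features against the target variable, and an unsupervised PCA projection that produces new principal-component features.

```python
# Hedged sketch of supervised and unsupervised feature selection.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))                  # 20 engineered features
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)   # target depends on only a few of them

# Supervised selection: score each feature against the target and keep the best k.
X_selected = SelectKBest(f_classif, k=5).fit_transform(X, y)

# Unsupervised selection: project the features onto principal components.
X_pca = PCA(n_components=5).fit_transform(X)
```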

Train Models

At this stage, the developer will train the ML model. Initially, model parameters (such as node weights in an ANN) should be assigned values using methods such as random initialisation, random seeds or fixed values (e.g., all zeros). The training dataset is used to determine the values of these parameters that most accurately fit the data through an optimisation algorithm. Optimisation algorithms are used to find the model parameters that minimise the cost function (a measure of model prediction error). A range of different optimisation algorithms exists, and the one most appropriate for a specific ML algorithm is generally selected. For example, stochastic gradient descent and adaptive moment estimation are two common optimisation algorithms used when training neural networks (Kingma and Ba, 2014). Similar to the primary ML algorithm, the optimisation algorithms have several hyperparameters, including batch size and number of epochs, which will affect the success of the training stage (Kingma and Ba, 2014).
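A brief sketch of this training step is given below (scikit-learn, synthetic placeholder data): the network weights are randomly initialised and fitted with the Adam optimiser, whose batch size and epoch limit are exposed as hyperparameters.

```python
# Hedged sketch of the training stage for a small neural network regressor.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 8))          # placeholder sensor features
y_train = X_train @ rng.normal(size=8)       # placeholder target values

model = MLPRegressor(
    hidden_layer_sizes=(32, 16),   # architecture hyperparameters
    solver="adam",                 # optimisation algorithm (adaptive moment estimation)
    batch_size=32,                 # optimiser hyperparameter
    max_iter=500,                  # upper bound on the number of training epochs
    random_state=0,                # reproducible random weight initialisation
)
model.fit(X_train, y_train)        # minimises the squared-error cost on the training set
```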

Model Validation

The developer must next utilise validation techniques to tune the model features and hyperparameters and generate an initial assessment of the model's performance. Depending on whether a classification or regression model has been developed, different performance metrics will be used by the developer. For regression models the developer should use a combination of Coefficient of Determination (R2), Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE). Whereas, for classification models the developer may pick from classification accuracy and error, sensitivity, and specificity. It is important to use a range of assessment metrics, as relying on only one metric may give a false indication of the model's performance. K-fold cross-validation is recommended for tuning the model features and hyperparameters to avoid overfitting and selection bias (Krstajic et al., 2014). In k-fold cross-validation, the data is divided into k subsets and the model training and validation is repeated k times. Each time, one of the k subsets is used as the validation data and the remaining subsets to train the model. The average error across all k trials is computed. There are variations of k-fold cross validation aimed at further reducing the chance of overfitting and selection bias. These include leave one out, stratified, repeated, and nested cross-validation (Krstajic et al., 2014).
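The sketch below (scikit-learn, synthetic data) illustrates 5-fold cross-validation reporting R2, RMSE and MAPE together, rather than relying on a single metric; the model choice here is arbitrary.

```python
# Minimal sketch of k-fold cross-validation with multiple regression metrics.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_validate

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = X @ rng.normal(size=6) + 5.0 + rng.normal(scale=0.1, size=200)

scores = cross_validate(
    Ridge(), X, y,
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
    scoring=("r2", "neg_root_mean_squared_error", "neg_mean_absolute_percentage_error"),
)
print({name: values.mean() for name, values in scores.items() if name.startswith("test_")})
```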

Test Model

The developer should test the model on data not used in any part of the model training or hyperparameter tuning to indicate its performance on new data. Test data is only used once the model learning stage is complete to provide an unbiased evaluation of the model's ability to fit new data. Again, a range of assessment metrics should be used. If the model's performance on the test data does not meet the manufacturer's requirements, then the developer should return to an earlier stage in the process (e.g., sensor selection or data collection). If there is uncertainty as to what extent the training data is representative of the manufacturing system (e.g., it is unknown whether the temperature measurements extend to the maximum temperature range of the system), it is recommended to perform further evaluation of the model on additional unseen data (Fisher et al., 2020).

Verification of Resource Efficiency

At the final stage, the developer should apply the model to monitor the resource problem and deploy it to achieve more efficient resource utilisation. If the model does not deliver this, different steps of the process may need to be reapplied, for example, choosing alternative key quality parameters for the intelligent sensor to predict.

The remainder of this article will focus on four case studies which combine sensors and ML for a variety of applications within food and drink manufacturing. The case studies use optical and/or US sensors and will demonstrate how the intelligent sensing methodology can improve sustainability in the manufacturing process. In addition, the case studies highlight some of the challenges of using ML methods. A summary of the case studies with their key features is presented in Table 1.

TABLE 1

Table 1. Key features of the four intelligent sensing case studies.

Case Study 1: Monitoring Mixing

Introduction

Most, if not all, food manufacturing processes use material mixing at some stage. Mixing is not only used for combining materials, but also to suspend solids, provide aeration, promote mass and heat transfer, and modify material structure (Bowler et al., 2020a). Inefficient mixing can result in off-specification products (waste) and excessive energy consumption. Therefore, this case study focuses on developing an intelligent sensor which could be used to inform on the mixing process KPIs: energy consumption and wasted mixing material. Due to the prevalence of mixing within factories, the optimisation of this process provides significant potential for improving manufacturing sustainability.

To address the resource efficiency problem, the intelligent sensor requirement was to predict (A) when the materials were fully mixed (mixing endpoint) and (B) time remaining until mixing endpoint. Therefore, classification ML models were developed to classify whether a mixture was non-mixed or fully mixed, and regression ML models were built to predict the time remaining until mixing completion. In a factory, the intelligent sensor prediction of the time remaining would provide additional benefits of better scheduling of batch processes and therefore improved productivity. Furthermore, prevention of under-mixing would eliminate product rework or disposal, and the prevention of over-mixing would minimise excess energy use.

There are many sensor techniques available which can monitor mixing processes in a factory (e.g., electrical resistance tomography or NIR spectroscopy), but each has benefits and downsides which limit it to specific applications (Bowler et al., 2020a). An US sensor was selected for this work because it is in-line, meaning it directly measures the mixture with no manual sampling required and is therefore suitable for automatic control systems. Ultrasonic sensors are also low-cost, non-invasive, and capable of monitoring opaque systems. This case study demonstrates the benefits of using multiple sensors, within the intelligent sensor methodology, to monitor a mixing process and also investigates different feature engineering methods and ML algorithms.

Materials and Methods

The data to train the intelligent sensor was collected from a honey-water blending mixing system. Two magnetic transducers with a 1 cm2 active element surface area and 5 MHz resonance (M1057, Olympus) were externally mounted to the bottom of a 250 ml glass mixing vessel (Figures 2A,B). An overhead stirrer with a cross-blade impeller was used to stir the mixture. As honey is miscible in water, the sensors follow a change in component concentration at the measurement area as the mixture homogeneity increases. The transducers were attached to adhesive magnetic strips on the outside of the vessel with coupling gel applied between the sensor and strip. The transducers were used in pulse-echo mode to both transmit and receive the US signal. The sensing technique used in this work monitors the sound wave reflected from the vessel wall and mixture interface, which is dependent on the magnitude of the acoustic impedance mismatch between the neighbouring materials (McClements, 1995). Therefore, no transmission of the sound wave through the mixture is required. In industrial mixtures, there are typically many components present which create many heterogeneities for the sound wave to travel through. This causes the sound wave to undergo scattering, reflection, and attenuation during transmission. Combined with the large mixing vessel sizes in factories, this makes transmission-based techniques difficult to use without high-power, and therefore high-cost, transducers. A limitation of the non-transmission technique used in this work is that it only provides a local measurement of material properties. Therefore, two sensors were used to monitor the mixing process to compare the effect of sensor positioning. One sensor was attached in the centre of the vessel base, and another was mounted offset from the centre (Figure 2B).
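The sensing principle can be summarised with the normal-incidence pressure reflection coefficient, which depends on the acoustic impedances (density multiplied by sound speed) of the wall and the mixture; the sketch below uses indicative material values, not measurements from this study.

```python
# Hedged sketch: reflection coefficient at the vessel wall/mixture interface.
def reflection_coefficient(z_wall, z_mixture):
    """Normal-incidence pressure reflection coefficient, R = (Z2 - Z1) / (Z2 + Z1)."""
    return (z_mixture - z_wall) / (z_mixture + z_wall)

z_glass = 2500.0 * 5600.0        # indicative impedance of glass: density (kg/m^3) * speed (m/s)
z_water = 1000.0 * 1480.0        # indicative impedance of water
z_honey_rich = 1400.0 * 2000.0   # hypothetical value for a honey-rich layer at the wall

print(reflection_coefficient(z_glass, z_water))       # larger mismatch: stronger echo
print(reflection_coefficient(z_glass, z_honey_rich))  # smaller mismatch: weaker echo
```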

FIGURE 2

Figure 2. (A) A depiction of the sound wave reflecting from the vessel wall and mixture interface. (B) A diagram of the honey-water blending system. (C) An example of the ultrasonic waveform displaying the difference between a non-mixed honey-water system and a fully-mixed system.

Two different volumes of honey were used for the experiments: 20 and 30 ml. A constant volume of 200 ml of tap water was used throughout. The impeller speed was set to either 200 or 250 rpm. These four parameter permutations were repeated three times across 1 day whilst varying the laboratory thermostat set point to produce a temperature variation from 19.3 to 22.1°C. Therefore, 12 runs were completed in total. This allowed an investigation of the ML models' ability to generalise across process parameters. The labelled training data for ML model development was obtained by filming the mixing process with a video camera to determine the time at which the honey had fully dissolved.

A focus of this case study was to investigate the level of feature engineering required for acceptable prediction accuracy. Regarding the design of ML models, shallow ML algorithms (ANNs, SVMs, and Long Short-Term Memory neural networks (LSTMs)) were compared with CNNs that use representation learning. Shallow learning requires manual feature engineering and selection for model development, and therefore typically requires some specialist domain knowledge of ultrasound and/or the mixing system from the operator. In contrast, CNNs automatically extract features, requiring no operator input. The features compared for shallow ML model development were full-waveform features, such as the waveform energy; principal components, using the amplitude at each sample point in a waveform; and frequency components of the waveform after applying the Discrete Wavelet Transform (DWT). A flow diagram detailing the ML feature engineering process is presented in Figure 3. Each run was held back sequentially for testing, and a model was developed using the remaining runs as training data.
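The three feature families can be sketched as below (numpy, PyWavelets and scikit-learn, with a placeholder waveform array); the exact features used in the study are those described in the text and Figure 3.

```python
# Hedged sketch of full-waveform, principal-component and wavelet features.
import numpy as np
import pywt
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
waveforms = rng.normal(size=(100, 1024))      # placeholder ultrasonic waveforms (in time order)

# Full-waveform features
energy = np.sum(waveforms ** 2, axis=1)       # waveform energy, E
saa = np.sum(np.abs(waveforms), axis=1)       # sum of absolute amplitudes, SAA
gradient = np.gradient(energy)                # feature gradient across successive measurements, G

# Principal components of the raw amplitudes
pcs = PCA(n_components=5).fit_transform(waveforms)

# Discrete Wavelet Transform: energy of each decomposition level
dwt_energy = np.array(
    [[np.sum(c ** 2) for c in pywt.wavedec(w, "db4", level=4)] for w in waveforms]
)
```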

FIGURE 3

Figure 3. A flow diagram presenting the feature engineering methodology employed. CWT, Continuous Wavelet Transform; DWT, Discrete Wavelet Transform; E, Waveform energy; G, Feature gradient; PCs, Principal Components; SAA, Sum Absolute Amplitudes.

Results and Discussion

To classify whether the honey-water mixing was complete, the highest accuracy was 96.3% (Table 2). This was achieved using the central sensor and an LSTM with the waveform energy, Sum Absolute Amplitudes (SAA), and their gradients as features. Performing data fusion between both sensors produced no improvement in classification accuracy over the central sensor alone, sometimes producing lower classification accuracies due to overfitting. This is because the last position for the honey to dissolve was the centre of the mixing vessel base, where the central sensor was located. High classification accuracy was achieved by being able to use data from previous time-steps. This was achieved using LSTMs, which store representations of every previous time-step; ANNs using feature gradients as features; or time-domain CNNs which use stacking of 25 previous time-steps.

TABLE 2

Table 2. A selection of prediction accuracy results for the honey-water blending experiments.

An R2-value of 0.974 to predict the time remaining until mixing completion for the honey-water blending was achieved using both sensors with time-domain input CNNs. Using both sensors produced the highest prediction accuracies, owing to the non-central sensor having greater resolution near the beginning of the process, and the central sensor having a greater resolution at the end (Figure 4). Again, the ability to use previous time-steps as features was necessary for high prediction accuracy.

FIGURE 4

Figure 4. A comparison between regression accuracies for predicting the time remaining until mixing completion for the honey-water blending. The results displayed are from the non-central sensor, central sensor, and combined sensor outputs, all using a time-domain input CNN.

SVMs performed worst overall, most likely because of overfitting due to their convex optimisation problem leading to a global minimum. Global cost minimisation may lead to poor prediction ability when the test data process parameters lie outside of the bounds of the training data. As a k-fold testing procedure is used in this work, testing on data lying outside of the process parameter space used in training is unavoidable. In comparison, ANNs only converge to local minima, which may have aided their ability to generalise to test data outside the parameter space of training.

The use of sensors and ML to monitor mixing processes relies on the availability of a set of complete labelled data. In a factory, a reference measurement is often not available, and if one is available it is typically obtained via manual sampling and off-line analysis, providing only a small set of labelled data. Therefore, techniques which can develop reliable ML models with limited labelled training data must be investigated. Two methods to achieve this are transfer learning and semi-supervised learning. Transfer learning involves leveraging knowledge used on a source domain to aid prediction of a target domain. For example, training a ML model on a lab-scale mixing system with a reference measurement to obtain a complete set of labelled data, and then combining this knowledge with the unlabelled data on the full-scale mixing system. On the other hand, semi-supervised learning can use unsupervised ML methods on the labelled data in conjunction with unlabelled runs to extract features, and then utilise a self-training procedure.

The purpose of this case study was to develop an intelligent sensor to reduce resource consumption (energy) and waste (off-specification product) during the mixing process. The multi-sensor classification models all achieved prediction accuracies above 95% and the regression models achieved R2-values above 0.97 with acceptable errors. This demonstrates the potential of combining ultrasonic sensors with ML to monitor and optimise mixing processes and deliver environmental benefits.

Case Study 2: Monitoring Clean-in-Place

Introduction

Cleaning the internal surfaces of processing equipment is important to ensure equipment remains hygienic and operating under optimal conditions. Cleaning of processing equipment is generally performed by automated systems, called Clean-in-Place (CIP). Clean-in-place systems clean via a combination of mechanical force and high-temperature fluids (including cleaning chemicals) and feature several steps including initial rinse, detergent wash, post-detergent rinse, and sterilisation. The environmental costs of cleaning are primarily from the water and energy used [up to 30% of the energy use in dairy production (Eide et al., 2003) and 35% of water use in beer production (Pettigrew et al., 2015)]. Therefore, this case study aims to develop an intelligent sensor that would monitor the KPIs: water, energy, and chemical consumption in industrially deployed CIP systems.

Most CIP processes suffer from over-cleaning, as they are designed around worst-case cleaning scenarios. Improvements to CIP processes are possible by identifying when each stage is complete so that the next stage can begin immediately, thereby eliminating unnecessary cleaning and minimising the associated environmental impacts. To achieve this, sensor technologies are required to identify when the objective of each cleaning stage has been performed. This will result in improvements to the CIP system KPIs. The two stages which require the most monitoring are the pre-rinse and the detergent wash. Previous research has been undertaken to monitor industrial CIP using various sensor technologies [electrical (Chen et al., 2004), optical (Simeone et al., 2018), acoustic (Pereira et al., 2009), US (Escrig et al., 2019)]. These sensor methods vary in terms of their cost, complexity, and operating parameters such as speed of data acquisition and spatial area monitored. No single sensor method is suitable for monitoring all the different types of equipment used within food production so a range of different sensor and data analysis methods should be studied.

This case study demonstrates the benefits of utilising multiple sensor technologies (optical and US) to monitor cleaning processes and compares different ML algorithms required to interpret the sensor data.

Materials and Methods

The intelligent sensor for CS2 is required to predict (A) if fouling is present on a surface (classification) and (B) the time remaining until the surface is free from fouling (regression). As part of the sensor selection stage, two sensors were investigated, namely optical (ultraviolet) and US.

Optical Sensing

Optical sensor data was collected from a two-tank (CIP and process) system. Each tank had a 600 mm internal diameter and 315 mm height. The CIP tank contained the cleaning water and chemicals, and the process tank was fouled and used for the cleaning experiments. The fluids in the CIP tank were pumped through a spray ball (Tank S30 dynamic) located in the centre and at the top of the process tank. The bottom internal surface of the process tank was fouled with 150 g of melted white chocolate. This was allowed to cool and dry before the cleaning was performed using the fluids (water with 2% sodium hydroxide at 55°C). The experiments continued until all the fouling was removed. The cleaning experiments were repeated three times to increase statistical reliability. An 18 W 370 nm ultraviolet lamp and a digital camera (Nikon D330 DSLR and a 10–20 mm F4-5.6 EX DC HSM wide-angle zoom) were placed in bespoke openings of the process tank lid. Images (2,000 × 2,992 pixels) of the internal surface were recorded every 5 s during the cleaning process. The experimental rig setup is shown in Figures 5A,B. A bespoke image processing method including baseline subtraction, colour channel separation, and thresholding was developed to determine the surface area and volume of fouling (Simeone et al., 2018).
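A simplified version of such an image-processing chain is sketched below (OpenCV and numpy, with hypothetical file names); it approximates the baseline subtraction, channel separation and thresholding steps rather than reproducing the published pipeline.

```python
# Hedged sketch of estimating the fouled area from UV-illuminated images.
import cv2
import numpy as np

baseline = cv2.imread("clean_tank.png")            # image of the clean surface (hypothetical file)
frame = cv2.imread("cleaning_frame_0001.png")      # image captured during cleaning (hypothetical file)

diff = cv2.absdiff(frame, baseline)                # baseline subtraction
blue, green, red = cv2.split(diff)                 # colour channel separation
# threshold one channel (chosen here only for illustration) to segment the fouling
_, mask = cv2.threshold(blue, 40, 255, cv2.THRESH_BINARY)

fouled_pixels = int(np.count_nonzero(mask))        # per-image feature tracked over the cleaning process
```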

FIGURE 5

Figure 5. Experimental setups for optical (A) and ultrasonic measurements (C) and examples of the data recorded by the optical (B) and ultrasonic sensors (D).

When designing the ML model within the intelligent sensor, a Non-linear Autoregressive Network with Exogenous (NARX) neural network was developed to predict the cleaning time remaining from the volume of fouling calculated from the images. A NARX neural network is a non-linear autoregressive exogenous model used for prediction of time series data (Xie et al., 2009). Feature engineering was performed to extract the number of fouled pixels from each image. Readers are referred to the authors' previous work for the comprehensive feature engineering process used (Simeone et al., 2018). As only one feature was extracted, feature selection was not necessary. Different training datasets were developed by combining image processing results from the three cleaning experiments. Different network architectures were studied including 3, 6, 10, 15, and 20 hidden layer nodes. A Bayesian Regularisation algorithm (MacKay, 1992) was used for training and the predicted output was the cleaning time remaining. The model performance was evaluated using the RMSE between the predicted time and the actual time.
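The autoregressive idea behind the NARX model can be approximated with a simple sketch (numpy and scikit-learn, placeholder data): lagged values of the fouled-area signal are stacked as inputs to a feed-forward regressor that predicts the cleaning time remaining. This illustrates the concept only; it is not the authors' exact architecture or training procedure.

```python
# Hedged sketch of an autoregressive (NARX-style) prediction of time remaining.
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_lagged(signal, targets, n_lags=3):
    """Stack the previous n_lags values of the signal as input features."""
    X, y = [], []
    for t in range(n_lags, len(signal)):
        X.append(signal[t - n_lags:t])
        y.append(targets[t])
    return np.array(X), np.array(y)

# Placeholder series: fouled-pixel counts every 5 s and the true time remaining (s).
rng = np.random.default_rng(0)
fouled_pixels = np.linspace(1.0, 0.0, 200) ** 2 + rng.normal(scale=0.01, size=200)
time_remaining = np.linspace(1000.0, 0.0, 200)

X, y = make_lagged(fouled_pixels, time_remaining)
model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000, random_state=0).fit(X, y)
```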

Ultrasonic Sensing

Ultrasonic data was collected from three test sections: a rectangular rig with a 1.2 mm thick SS340 bottom plate (300 mm by 40 mm) and clear PMMA sides (40 mm height), and two circular pipes constructed of PMMA and SS316. The pipes had approximate dimensions of length 300 mm, internal diameter 20 mm and wall thickness 2 mm. Three materials (tomato paste, gravy, and concentrated malt) were used to foul the test sections. The materials were chosen because of their different compositions, which is known to affect surface adhesion and cleaning behaviour (Wilson, 2018). In each case, 15 g of fouling material was placed in the centre of the plate for the flat rig and 30 mm from the exit for the pipes. Cleaning was performed with water, at a fluid temperature of either 12 or 45°C and a flow velocity of 6 m/s, until all of the surface fouling was removed. Seven repeats were performed for all combinations of fouling rig, temperature and fouling material. The US and temperature data were recorded using the same equipment as CS1. For the circular pipes, different US transducers were used (2 MHz, Yushi) and these were glued to the bottom of the pipes. In both configurations, the US transducers were located on the external bottom surfaces at the location where the fouling was placed. A camera (Logitech® C270 3MP) was also used to record images of the cleaning processes (Figure 5C). An example of a recorded US waveform can be seen in Figure 5D. The camera was placed above the rig in the same location as the fouling for the flat rig and slightly above the exit of the pipe rigs. For all configurations, the camera location was adjusted to optimise the view of the fouling and cleaning process. Ultrasonic and temperature data were recorded every 4 s and camera images were recorded every 20 s. The camera images were used to label the recorded US data as either dirty (fouling present) or clean (no fouling present).

Supervised classification ML techniques were developed to predict whether the pipe section was dirty or clean from the recorded US measurements. The intelligent sensor ML model was developed by evaluating several ML algorithms: KNN, SVM, Random Forests (RF), and AdaBoost (using decision trees as base learners). The performance of the algorithms was assessed by comparing the predicted class (dirty or clean) to the actual condition. Separate models were developed for the flat rig, PMMA pipe and SS316 pipe. For these models, the majority of the experimental data was used in the training data set. However, for each fouling material and temperature combination, two experimental runs were excluded from the training dataset and used for testing the models. Input features were engineered from the US waveforms. Feature selection was performed using a k-best predictors method and the number of input features varied with experimental geometry and classification method.
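The model comparison can be sketched as follows (scikit-learn, placeholder waveform features): each classifier is preceded by k-best feature selection and assessed by cross-validated accuracy on the dirty/clean labels.

```python
# Hedged sketch of the dirty/clean classifier comparison with k-best selection.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 30))             # placeholder features engineered from US waveforms
y = (X[:, 0] - X[:, 4] > 0).astype(int)    # placeholder dirty (1) / clean (0) labels

models = {
    "KNN": KNeighborsClassifier(),
    "SVM": SVC(),
    "RF": RandomForestClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
}
for name, clf in models.items():
    pipeline = make_pipeline(SelectKBest(f_classif, k=10), clf)
    print(name, cross_val_score(pipeline, X, y, cv=5).mean())
```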

Results and Discussion

Optical Sensing

Figure 6 displays the predicted cleaning time remaining from the NARX neural network trained with the results of the image processing, alongside the actual cleaning time. The figure shows that the model gives an accurate prediction of the cleaning time remaining for diverse datasets. This is valuable to the manufacturer as it ensures cleaning operations are only performed while fouling remains, reducing the economic and environmental cost of the process and supporting more effective production scheduling. The results in Figure 6 indicate that most errors in the prediction occur at the beginning of the cleaning processes, but this error reduces as the cleaning continues. This is attributed to the initial delay (two images) and the limited number of images available early in the process. As the cleaning proceeds, the continuous addition of training samples to the NARX model and the decay of the delay effect drastically reduce the prediction error.

FIGURE 6

Figure 6. Predictions of cleaning time remaining from Nonlinear Autoregressive Network with Exogenous (NARX) neural network training on (A) dataset 1, (B) dataset 2, and (C) dataset 1+2.

Ultrasonic Sensing

Table 3 presents the average classification prediction accuracy for the two test runs at 45°C for all fouling materials. All ML methods had strong prediction performance (>97%) except for the SVM, which often predicted the test section was dirty when it was clean. It is not clear why the SVM produced worse predictions, and this lack of explainability is one of the challenges which often faces users of ML methods. The results show that all the ML models performed well except for the SVM in the flat rig, and there was no clear advantage of using the more complex ensemble methods RF and AdaBoost.

TABLE 3

Table 3. Comparison of classification prediction accuracy for different fouling materials, experimental geometry, material of construction, and ML method for test runs at 45°C.

These results are promising and indicate that the combination of US sensors and ML can be used to determine when surface fouling has been removed. However, the challenge remains to develop a representative dataset in industrial environments. In the laboratory work, it was possible to label the recorded US data using images as the experimental rigs had either clear sides or an open end. This would not be possible in industrial environments and alternative methods must be developed to label the data (e.g., semi-supervised or transfer learning). In addition, the current US system is a point measurement, so the presence of fouling can only be determined at one location. However, this can be overcome by deploying multiple sensors, which is possible due to the decreasing cost of hardware.

The optical and US intelligent sensors developed both serve to improve CIP system resource efficiency by predicting the CIP endpoint and cleaning time remaining to prevent over-cleaning. Furthermore, CS2 highlights that different sensors need to be deployed on different types of equipment to monitor the same process and that acceptable results can be achieved with a range of different ML algorithms.

Case Study 3: Fermentation

Introduction

Fermentation is a unit operation used in the production of numerous alcoholic beverages including beer, wine and cider. Beer fermentation generally takes several days, and monitoring can determine the optimal process endpoint or identify any problems during the process. The purpose of fermentation is to produce ethanol, so monitoring its concentration during the process is essential (Schöck and Becker, 2010). Traditionally, ethanol concentration (the wort Alcohol by Volume, ABV%) is monitored via offline measurements using density meters or refractometers. Offline techniques are not ideal, as they require sample preparation by human operators, waste material, and are often not real-time, increasing the risk of over- or under-fermenting.

Case study 3 aims to address the resource efficiency problem resulting from over- or under-fermenting, which can negatively affect downstream processes (e.g., canning), and to detect when a fermentation has encountered a problem and should be stopped, wasting the wort. An intelligent sensor developed to predict the ABV% of the wort in real-time would help to prevent over- or under-fermentation and identify any problems with the fermentation. This would provide key information related to resource utilisation and waste generation sustainability KPIs.

When considering sensor selection, multiple technologies have been developed with the capability of making ABV% measurements during fermentation via auto-sampling or bypass systems, including techniques using piezoelectric MEMS (Toledo et al., 2018), hybrid electronic tongues (Kutyła-Olesiuk et al., 2012), High-Performance Liquid Chromatography (HPLC) (Liu et al., 2001) and infrared imaging (Lachenmeier et al., 2010). Fully online methods have also been developed including NIR (Svendsen et al., 2016; Vann et al., 2017), FT-NIR (Veale et al., 2007) and dispersive Raman spectroscopy (Shaw et al., 1999). An US sensor was chosen for this CS due to its affordability compared to other options, as well as the additional benefits of US sensors discussed in CS1 (Bowler et al., 2021).

Ultrasonic measurements have been used to monitor the ABV% during beer fermentation (Becker et al., 2001; Resa et al., 2004, 2009; Hoche et al., 2016). The wort is a three-component liquid mixture (ethanol, water and sugar) with dissolved CO2 and CO2 bubbles, meaning at least two separate measurements are required to calculate the ABV%. Previous research using US sensors to monitor fermentation used a variety of different signal and data processing methods to solve this problem. The most popular methods use either US velocity measurements at different temperatures (Becker et al., 2001) or a combination of US velocity with other measurements such as density (Resa et al., 2004, 2009; Hoche et al., 2016). However, multiple measurements at a single point in time are not required when using ML, as time-series features can be incorporated into the models.

This case study develops an intelligent sensor using US measurements to monitor the ABV% during beer fermentation at lab and production scale. The case study demonstrates a different feature engineering strategy to the other case studies that utilised US measurements, and discusses some of the challenges associated with generating and labelling a suitable training dataset.

Materials and Methods

Fermentation monitoring was performed at two different scales to explore some of the topics around the intelligent sensor methodology. Lab-scale (~20 L) fermentations were performed using a Coopers Real Ale brew kit and tap water. Fermentation monitoring was also performed in a 2,000 L fermenter at the Totally Brewed brewery in Nottingham, UK. Twelve fermentations were monitored at lab scale and five at production scale. The production-scale fermentations included three batches of Slap in the Face and two batches of Guardians of the Forest. For both lab and production scale, the wort formulation and fermentation conditions were kept consistent between runs, although process variations were present due to variable atmospheric conditions.

In CS1 and CS2, the US system recorded a single reflected US wave from an interface of interest. In CS3, it was decided to also calculate the US velocity in the wort, so a probe was designed which would propagate a wave through a known distance of wort (Figure 7A). This probe design resulted in US waves reflected from two different interfaces being recorded (Figure 7B). The probe used a Sonatest 2 MHz immersion transducer and a PT1000 RTD sensor to measure temperature. For the brewery fermentations, a sample was removed every 2 h (except through the night) and the ABV% was calculated using a hydrometer. For the lab-scale fermentations, the ABV% was recorded using a Tilt hydrometer in addition to removing a sample every 2 h and performing measurements using a density meter.
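For illustration, the velocity calculation from these two reflections can be sketched as follows. This is a minimal Python example assuming the second reflection travels one extra round trip across a known wort gap between the two interfaces (so velocity ≈ 2·d/Δt); the function name, windowing approach and envelope peak-picking are illustrative rather than the exact implementation used here.

```python
import numpy as np
from scipy.signal import hilbert

def wort_velocity(waveform, fs, gap_m, win1, win2):
    """Estimate US velocity in the wort from the arrival-time difference of two reflections.

    waveform : 1-D array containing one recorded A-scan
    fs       : sampling frequency (Hz)
    gap_m    : known wort path between the two reflecting interfaces (m)
    win1/2   : (start, stop) sample indices bracketing the 1st and 2nd reflections
    """
    envelope = np.abs(hilbert(waveform))                         # signal envelope
    t1 = (win1[0] + np.argmax(envelope[win1[0]:win1[1]])) / fs   # arrival time of 1st reflection
    t2 = (win2[0] + np.argmax(envelope[win2[0]:win2[1]])) / fs   # arrival time of 2nd reflection
    return 2.0 * gap_m / (t2 - t1)                               # extra path is across the gap and back
```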

Figure 7. Ultrasonic (US) wave propagation and recorded signal. (A) US wave propagation path through the fermenter. (B) The recorded US signal with the 1st and 2nd reflections highlighted.

The data collection process was identical for lab and brewery fermentations. From the beginning of the fermentation, 10 s of data was recorded every 5 min. This 10 s of data included approximately 36 recorded US waveforms (Figure 7B) and 36 temperature recordings. For the lab-scale fermentations, the density was also recorded with the Tilt every 5 min. Data collection continued until after the fermentations were complete.

Supervised regression ML models were developed to predict the ABV% from the US and temperature measurements recorded during the fermentation. The ABV% values recorded from the Tilt and sample measurements provided the data labels for the ML models. Machine learning models were only developed for the lab fermentations, as insufficient repeated fermentations and data labelling were possible at the brewery. As 36 US waveforms were recorded in each 10-s time frame, it was also possible to engineer features calculated from the 10-s block of data. These included the mean and standard deviation of the amplitude of each reflection. The standard deviation of the second reflection amplitude was identified as a key feature, as this would be most affected by CO2 bubbles in the wort, which reflect or scatter the US waves. The full list of features used for the ML models was: US velocity, mean amplitude of the first reflection, mean amplitude of the second reflection, standard deviation of the amplitude of the second reflection, temperature, and time since yeast pitched.
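A minimal sketch of this feature engineering step is shown below, assuming each 10-s block is available as a list of waveforms plus temperature readings. It reuses the hypothetical wort_velocity helper from the earlier sketch, and the peak-amplitude windows are assumptions.

```python
import numpy as np

def block_features(waveforms, temps, fs, gap_m, win1, win2, hours_since_pitch):
    """Build one feature vector from a 10-s block of ~36 waveforms and temperature readings.

    Features: US velocity, mean amplitude of the 1st and 2nd reflections,
    standard deviation of the 2nd reflection amplitude, temperature,
    and time since yeast pitched.
    """
    amp1 = [w[win1[0]:win1[1]].max() for w in waveforms]          # 1st reflection amplitudes
    amp2 = [w[win2[0]:win2[1]].max() for w in waveforms]          # 2nd reflection amplitudes
    velocity = np.mean([wort_velocity(w, fs, gap_m, win1, win2) for w in waveforms])
    return np.array([velocity,
                     np.mean(amp1),
                     np.mean(amp2),
                     np.std(amp2),                                # sensitive to CO2 bubble scattering
                     np.mean(temps),
                     hours_since_pitch])
```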

Regarding the ML model design, linear regression and ANN algorithms were developed to predict the ABV% during fermentation from the US measurements. For the linear regression model, 11 randomly selected lab fermentations were used to train the model and the remaining one was used for testing. The linear regression model was solved directly using the linear least-squares method. A NARX neural network was developed with a single hidden layer. For this model, eight randomly selected fermentations were used for training, three for validation and one for testing. The values of the weights and biases were calculated using the Levenberg–Marquardt learning algorithm implemented via MATLAB functions. The maximum number of iterations was set to 1,000. To evaluate the linear regression and neural network models, the RMSE was calculated between the predicted and actual ABV% values. A second study was performed in which different neural network models were developed with the number of fermentations used for training varying from 1 to 10.
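The two model types can be sketched as follows. The NARX network in this work was implemented in MATLAB with Levenberg–Marquardt training; the snippet below is only a rough Python analogue in which the previous ABV% value is appended as an extra input and the network is trained with scikit-learn's default optimiser. The hidden-layer size and the use of measured (rather than fed-back) previous ABV% values are simplifying assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def fit_abv_models(X_train, y_train, X_test, y_test):
    """Fit the two CS3 model types and return their test RMSE values.

    X_* are feature matrices (e.g. built with block_features); y_* are measured ABV% labels.
    """
    # 1) Linear regression solved directly by linear least squares (bias column appended)
    A = np.hstack([X_train, np.ones((len(X_train), 1))])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    y_lin = np.hstack([X_test, np.ones((len(X_test), 1))]) @ coef

    # 2) Single-hidden-layer network; the previous ABV% is appended as an extra input
    #    to roughly emulate NARX feedback (a deployed model would feed back its own predictions)
    mlp = MLPRegressor(hidden_layer_sizes=(10,), max_iter=1000, random_state=0)
    mlp.fit(np.hstack([X_train[1:], y_train[:-1, None]]), y_train[1:])
    y_nn = mlp.predict(np.hstack([X_test[1:], y_test[:-1, None]]))

    rmse = lambda pred, actual: float(np.sqrt(np.mean((pred - actual) ** 2)))
    return rmse(y_lin, y_test), rmse(y_nn, y_test[1:])
```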

Results and Discussion

Figure 8 displays the mean amplitude of the first and second US wave reflections during fermentation at the brewery. The amplitude data of the first reflection is different for all five batches of beer, although the general trend is an increase in amplitude during fermentation. The amplitude data of the second reflection appears more variable up to day two or three of fermentation. This is due to the CO2 bubbles being produced, which interfere with the second reflection as it travels through the wort. It had been expected that the reflection amplitudes would be distinct between the two beers; however, no such difference was observed. This suggests that, if more batch data were obtained, it may be possible to train the sensor by combining both datasets.

Figure 8. Amplitude data collected from five batches of beer fermented at the brewery, three are Slap in the Face beer (SF) and two are the Guardians of the Forest (GF) beer. (A) The first ultrasonic wave reflection mean amplitude and (B) the second ultrasonic wave reflection mean amplitude.

Figure 9 presents the results from the ML models developed from the lab fermentation data. Both the linear regression and neural network models produce reasonable predictions of the ABV% during the fermentation. The neural network provides a slightly better prediction, although both models predict the final ABV% value to within 0.2%. Figure 9B shows the RMSE between the actual ABV% value and the value predicted by the neural network as a function of the number of fermentations used in the training dataset. As the number of batches increases, the models make better predictions and the RMSE reduces. In these experiments, a maximum of 10 batches was used and the RMSE had not reached a stable value, which suggests a further reduction in error would be possible if data from additional fermentations were utilised. In this case, the modeller (or engineer) needs to decide when to stop collecting data, as a trade-off exists: data from more fermentations improves the model predictions but also delays the time until the model can be used. A sensible approach would be to update the models as more data becomes available. However, this comes with an additional development cost from labelling the data and employing expertise to retrain the model. Another method would be to determine an acceptable error for the model and stop collecting data once that error was achieved.
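A sketch of this second study, retraining the network with 1 to 10 labelled fermentations and tracking the held-out RMSE, is given below; the data structures and network size are illustrative only.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def learning_curve_rmse(labelled_ferments, test_ferment, n_max=10):
    """Retrain with 1..n_max fermentations and report the test RMSE for each training-set size.

    labelled_ferments : list of (X, y) pairs, one per labelled lab fermentation
    test_ferment      : held-out (X_test, y_test) pair
    """
    X_test, y_test = test_ferment
    rmses = []
    for n in range(1, n_max + 1):
        X = np.vstack([f[0] for f in labelled_ferments[:n]])
        y = np.concatenate([f[1] for f in labelled_ferments[:n]])
        model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=1000, random_state=0).fit(X, y)
        rmses.append(float(np.sqrt(np.mean((model.predict(X_test) - y_test) ** 2))))
    return rmses  # stop adding batches once the error is deemed acceptable
```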

Figure 9. Machine learning results from the lab fermentation experiments. (A) Prediction of the Alcohol by Volume (ABV%) from an Artificial Neural Network (ANN) compared to a linear regression model and the actual measured value. (B) The Root Mean Squared Error (RMSE) of the predictions as a function of number of batches used in the training dataset.

This case study has developed an intelligent sensor to monitor the ABV% in real-time during beer fermentation. The measurements from the sensor will help to prevent over- or under-fermenting in breweries and the associated resource loss. Furthermore, it highlighted the challenges of developing a training dataset when a particular product is made infrequently, and the challenge of labelling that data. The case study has also demonstrated the benefits of using model input data from two different time frames (one from a fixed point in time and the second averaged over time) and the benefit of a larger training dataset on model performance.

Case Study 4: Vegetable Quality Assessment

Introduction

Potato Dry Matter (DM) directly affects the quality of fried potato products and, consequently, the profit that the farmer and food processor can achieve. The higher the DM ratio, the greater the output of fried or dehydrated products (Storey, 2007). Additionally, the DM content and distribution affect the bruise susceptibility during harvest, which in turn influences crop waste and the quality of cooked or processed products (Storey, 2007). The ideal DM range in potatoes is 14–37%, depending mainly on cultivar, preharvest conditions, and storage conditions (Burton, 1966).

The DM in potato tubers is usually determined by measuring the specific gravity, either using the weight in air and the weight in water or by using a hydrometer (Storey, 2007). Such methods are destructive, resulting in a loss of resources, as well as being time-consuming and unsuitable for online application. To resolve this problem, CS4 developed an intelligent sensor to predict the DM of tubers without wasting material. Furthermore, CS4 developed an intelligent sensor to classify the tubers based on size. Grading the tubers by size provides the market with high-quality tubers and reduces the cost of transport by eliminating tubers of unacceptable sizes. Predictions from these sensors enable monitoring of CS4's KPI: tuber wastage.

Two sensor technologies were selected: NIR spectroscopy and colour vision cameras. Near infrared spectroscopy is a technology where an estimation of the chemical content of the material is obtained through the interaction of light and the sample using reflectance, transmittance, or interactance configurations. NIR has already been applied at the commercial level (Rady and Guyer, 2015). Colour vision systems are already established in food and agricultural applications, and these systems are efficiently used to detect external defects, diseases, and shapes of fruits, vegetables, and other food products (Santos et al., 2012; Rady et al., 2021). Such systems generate quick, accurate, and objective features of the studied materials in a non-invasive and cost-effective manner (Zou and Zhao, 2015). Applying colour vision for size-based grading of potatoes is not a new concept. However, in developing countries, where postharvest losses might reach 30% (COMCEC, 2016), such systems are not available to all growers or packing houses. Colour-based systems are relatively cheap and work effectively at detecting external features. Both technologies were considered, as NIR systems are usually point-based and able to detect chemical composition, whereas colour imaging systems provide information in the spatial domain (Chen et al., 2002; Nicolaï et al., 2007).

Materials and Methods

The intelligent sensor for CS4 must predict (A) potato DM content (regression) and (B) class based on size (classification).

NIR Sensor

To generate a dataset to train the NIR sensor, potato samples were acquired from a farm in England, United Kingdom. The samples were of the Eurostar variety, and there were 91 tubers in total. Tested samples included whole tubers and sliced samples of 9 ± 1 mm thickness. Each sliced sample was obtained using a manual potato cutter, and the desired slice was the third cut from the stem side. To estimate the DM ratio of the tubers, the specific gravity was first estimated by hydrostatic weighing (Storey, 2007).

In this study, two different NIR diffuse reflectance sensors were used to measure the samples. The first sensor was the NIRONE S2.0 (Spectral Engines, Oulu, Finland), which acquires signals in the range of 1,550–1,950 nm. The second sensor was the NIRONE S2.5 (Spectral Engines, Oulu, Finland), which operates in the range of 2,000–2,450 nm. The spectrum of each sample was the average of three measured readings. Figure 10A shows the experimental set-up for these sensors.

Figure 10. Schematic diagram for measuring potato samples using (A) NIRONE sensors (S2.0 or S2.5) and (B) RGB camera.

As part of the data collection and labelling stage, the NIR spectrum obtained from each sample (Is) was first normalised using the intensity obtained from a white reflectance reference (Ir) and the background, or dark, intensity (Id), as follows:

Relative reflectance = (Is - Id) / (Ir - Id)

The spectrum of each sample was then pre-processed to reduce the noise originating from electronic sources and variations due to temperature change. The spectral pre-processing techniques applied in this CS included mean centring, smoothing using the 1st derivative, smoothing using the 2nd derivative, Savitzky–Golay smoothing, and Multiplicative Scattering Correction (MSC). To obtain a uniform distribution of the DM ratios, a logarithmic transformation was applied.
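A minimal sketch of the normalisation and two of the named pre-treatments is shown below; the Savitzky–Golay window length, polynomial order and the exact MSC formulation are assumptions rather than the settings used in this study.

```python
import numpy as np
from scipy.signal import savgol_filter

def relative_reflectance(Is, Ir, Id):
    """Normalise raw spectra against the white reference (Ir) and dark (Id) readings."""
    return (Is - Id) / (Ir - Id)

def sg_derivative(spectra, window=11, deriv=1):
    """Savitzky-Golay smoothing/derivative followed by mean centring across samples."""
    sg = savgol_filter(spectra, window_length=window, polyorder=2, deriv=deriv, axis=1)
    return sg - sg.mean(axis=0)

def msc(spectra):
    """Multiplicative Scatter Correction against the mean spectrum."""
    ref = spectra.mean(axis=0)
    corrected = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, 1)   # regress each spectrum on the reference
        corrected[i] = (s - intercept) / slope
    return corrected
```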

The ML model was designed using Partial Least Squares Regression (PLSR). Cross-validation (four-fold) was applied to the pre-processed data to obtain the optimal prediction model based on the values of the correlation coefficient (r), the Root Mean Square Error of Cross-Validation (RMSEcv), and the ratio between the standard deviation and the RMSEcv (RPD).
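The cross-validated PLSR evaluation could be sketched as follows, assuming the pre-processed spectra X and DM labels y are available as arrays; the number of latent variables is illustrative.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict

def plsr_cv_metrics(X, y, n_components=8, folds=4):
    """Four-fold cross-validated PLSR reporting the CS4 metrics: r, RMSEcv and RPD."""
    y_cv = cross_val_predict(PLSRegression(n_components=n_components), X, y, cv=folds).ravel()
    r = np.corrcoef(y, y_cv)[0, 1]                 # correlation coefficient
    rmsecv = np.sqrt(np.mean((y - y_cv) ** 2))     # cross-validation error
    rpd = np.std(y) / rmsecv                       # ratio of standard deviation to RMSEcv
    return r, rmsecv, rpd
```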

Classification of samples based on the DM ratio was also implemented. The threshold DM was chosen as the median value of the data (21.79%). Several classification algorithms were applied, including Linear Discriminant Analysis (LDA), KNN, and Partial Least Squares Discriminant Analysis (PLS-DA) (Rady et al., 2019). Cross-validation was applied to the pre-processed data for DM classification.
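The classification step can be sketched in the same style, thresholding the DM labels at their median and cross-validating two of the named classifiers; the KNN neighbour count is an assumption.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def dm_classification_accuracy(X, y_dm, folds=4):
    """Binary high/low dry-matter classification, thresholded at the median DM."""
    labels = (y_dm > np.median(y_dm)).astype(int)   # median threshold (21.79% in CS4)
    scores = {}
    for name, clf in [("LDA", LinearDiscriminantAnalysis()),
                      ("KNN", KNeighborsClassifier(n_neighbors=5))]:
        scores[name] = cross_val_score(clf, X, labels, cv=folds).mean()
    return scores
```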

Colour Vision

Data was collected to train the optical sensor by measuring tuber samples obtained from a farm in Alexandria, Egypt. Two potato cultivars were included in this dataset, Cara and Spunta. Cara tubers are more spherical and less elliptical than Spunta tubers. Samples were not cleaned, and only rotted and damaged tubers were discarded. For each cultivar, 200 tubers were imaged such that each subgroup represented one size group, small or large. The measurement system, as shown in Figure 10B, contains a colour (RGB) camera (Fuji FinePix S5700, FujiFilm, Minato-ku, Tokyo, Japan), an LED lamp (12 W), and a black wooden box. The light and the camera were placed 25 and 30 cm, respectively, vertically above the sample surface, and the light was inclined at 60° to the horizontal. Each sample was imaged twice, once on each side.

Feature engineering was first performed: each raw image was segmented to obtain the tuber shape only. Segmentation was based on the HSV colour space, using the hue coordinate with a threshold value of 0.20. After obtaining the segmented colour image, the features were extracted. The extracted features were morphological: area, perimeter, major and minor axis lengths, and eccentricity (Gonzalez and Woods, 2006). Regarding feature selection, all features were retained due to their limited number.
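A minimal sketch of the segmentation and morphological feature extraction, using scikit-image, is shown below; the direction of the hue threshold and the choice of the largest connected region as the tuber are assumptions.

```python
from skimage.color import rgb2hsv
from skimage.measure import label, regionprops

def tuber_features(rgb_image, hue_threshold=0.20):
    """Segment the tuber via a hue threshold and extract the morphological features."""
    hue = rgb2hsv(rgb_image)[..., 0]
    mask = hue < hue_threshold                                    # assumed threshold direction
    tuber = max(regionprops(label(mask)), key=lambda p: p.area)   # largest region taken as the tuber
    return {"area": tuber.area,
            "perimeter": tuber.perimeter,
            "major_axis": tuber.major_axis_length,
            "minor_axis": tuber.minor_axis_length,
            "eccentricity": tuber.eccentricity}
```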

The ML model used the LDA algorithm to classify the size of the samples. Samples in each cultivar were divided into groups based on visual observation, giving four classes in total (two cultivars and two size groups). 10-fold cross-validation was employed during the training and validation stages.
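Given the extracted features and the visually assigned cultivar/size labels, the cross-validated LDA step could be sketched as:

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import accuracy_score, confusion_matrix

def size_classification(X, labels):
    """10-fold cross-validated LDA over the four cultivar/size classes."""
    preds = cross_val_predict(LinearDiscriminantAnalysis(), X, labels, cv=10)
    return accuracy_score(labels, preds), confusion_matrix(labels, preds)
```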

Results and Discussion

Dry Matter Evaluation

Table 4 reports the regression results of DM for whole tubers and sliced samples. In the case of whole tubers, the optimal r (RPD) values were 93% (3.33) and 96% (4.48) for the NIRONE S2.0 and NIRONE S2.5, respectively. In the case of sliced samples, the r (RPD) values were 96% (2.58) for the NIRONE S2.0 and 98% (3.74) for the NIRONE S2.5. These results are comparable to those reported in previous studies using NIR, which achieved optimal r (RMSEcv) values of 92% (1.52%) (Dull et al., 1989), 88% (1.3%) (Scanlon et al., 1999), 85% (0.002) (Kang et al., 2003), 90% (0.004) (Chen et al., 2005), and 97% (0.91%) (Helgerud et al., 2012).

Table 4. Best regression and classification results of dry matter for whole tubers and sliced samples obtained from different utilised NIR sensors.

Table 4 also reports the performance of the optimal classification models based on the DM threshold stated earlier (21.79%). The classification accuracy values for whole tubers were 85% for the NIRONE S2.0 and 96% for the NIRONE S2.5, while the values for the sliced samples were 86 and 96% for the NIRONE S2.0 and NIRONE S2.5, respectively. These results were obtained using LDA and KNN classifiers along with smoothing using 1st or 2nd derivatives and MSC. Classification provides a rapid evaluation of the DM in potatoes and is suitable for assessing the use of the tubers without the need to accurately calculate the DM value. Generally, the sliced samples resulted in better prediction and classification performance than whole tubers, which is possibly due to the skin condition affecting the whole tuber results.

Size-Based Classification

Results for the classification models are reported in Table 5. The overall classification accuracy was 92.5% using LDA, whereas the individual accuracy values were 82, 92, 95, and 93% for small Cara, large Cara, small Spunta, and large Spunta, respectively.

Table 5. Classification results, confusion matrix and classification accuracy, for evaluating potato tuber size using a colour imaging system.

This case study has developed intelligent sensors to predict (A) potato DM and (B) class based on size. The predictions made by these sensors will mean poor-quality potatoes are detected earlier in the supply chain, reducing the resources lost processing them downstream. Furthermore, CS4 demonstrated that the combination of non-invasive sensors and ML methods can be used to monitor potato quality.

Summary

The manufacturing of food and drink has a significant impact on the planet in terms of the natural resources utilised and the waste and emissions generated. Industrial digital technologies have the capability to reduce this impact by making processes more intelligent and efficient. The food and drink manufacturing sector has been slow to adopt digital technologies, partly due to the lack of cost-effective sensing technologies capable of monitoring the key properties of materials and production processes in challenging industrial environments. A further challenge is converting sensor data into actionable information on the material or process being monitored. This article has presented an intelligent sensor methodology that utilises data-driven modelling approaches, such as ML, to make predictions from sensor measurements that monitor environmental sustainability KPIs within food and drink manufacturing processes. Furthermore, it was demonstrated that ML can be used with affordable optical and US sensors to deliver sustainability benefits for a variety of applications within the sector.

The intelligent sensor methodology was applied to four food and drink case studies. All case studies demonstrated how the methodology may be employed to solve industrial resource efficiency problems, as well as highlighting key challenges in developing ML models, including data acquisition and labelling. As the digital revolution continues, it is anticipated that intelligent sensors will play a greater role in the future of food and drink, increasing resource efficiency and reducing the carbon footprint of this essential sector. To speed up the implementation of intelligent sensors and the benefits they may provide, future work should address:

New online sensors: As ML enables the processing and analysis of new data types, more IIoT sensors will continue to be developed that measure new types of food and drink material and production data. The introduction of a greater number of sensors into food and drink production lines will require consideration of sensor fusion, which is the combination of different sensor data such that the resulting information has less uncertainty than would be possible if the sources were used individually. Furthermore, there is the need for an economic case that demonstrates the benefits of introducing new sensors to production lines. Additionally, with the increased number of sensors, thought should be given to how the paper's intelligent sensor methodology may be automated by applying the techniques developed in the field of automated ML. When developing new sensors, work must address issues of general relevance to all sensor devices, such as energy consumption, latency, security, reliability, and affordability, but from a food and drink sector perspective.

Infrastructure: Food and drink production systems are made up of multiple different stages and unit operations. It is essential to provide an integrated and interoperable data management environment that allows data from multiple sources to be securely accessed and used by industrial operators, while managing the safety of any operation influenced by such data.

Data acquisition: CS3 demonstrated that acquiring sufficient labelled data to train the ML models within the sensors is challenging in some food and drink production environments. Further work should address the potential of transfer learning or semi-supervised methods to overcome data shortages caused by production lines making a variety of food or drink products.

Trust: Core to intelligent sensors are the ML models that turn the sensor data into meaningful, actionable information. These models are typically referred to as black-box models, meaning it is not possible to know how they reached their output. This lack of transparency may be a barrier in deploying intelligent sensors, as factory operators and managers may not trust the sensor predictions without understanding how they were reached.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author Contributions

AB, JE, AR, AS, and OF: data collection and analysis. All authors: paper concept, paper writing, and paper reviewing.

Funding

This work was supported by the Innovate UK projects 103936 and 132205 and EPSRC projects EP/P001246/1, EP/S036113/1, and EP/R513283/1.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Al-Sarayreh, M., Reis, M. M., Yan, W. Q., and Klette, R. (2018). Detection of red-meat adulteration by deep spectral–spatial features in hyperspectral images. J. Imaging 4:63. doi: 10.3390/jimaging4050063

Angée, S., Lozano-Argel, S. I., Montoya-Munera, E. N., Ospina-Arango, J. D., and Tabares-Betancur, M. S. (2018). “Towards an improved ASUM-DM process methodology for cross-disciplinary multi-organization big data and analytics projects,” in Proceedings of the Communications in Computer and Information Science, (Chicago, US) 613–624.

Azevedo, A., and Santos, M. F. (2008). “KDD, semma and CRISP-DM: a parallel overview,” in Proceedings of the MCCSIS'08 - IADIS Multi Conference on Computer Science and Information Systems; Proceedings of Informatics 2008 and Data Mining 2008, (Amsterdam, The Netherlands) 182–185.

Becker, T., Mitzscherling, M., and Delgado, A. (2001). Ultrasonic velocity– a noninvasive method for the determination of density during beer fermentation. Eng. Life Sci. 1, 61–67. doi: 10.1002/1618-2863(200108)1:2<61::aid-elsc61>3.0.co;2-d

B-Hive (2020). B-Hive Develops UK's First Potato Protein Extraction Process. Available online at: https://www.b-hiveinnovations.co.uk/news/b-hive-develops-uks-first-potato-protein-extraction-process (accessed December 1, 2020).

Bowler, A., Escrig, J., Pound, M., and Watson, N. (2021). Predicting alcohol concentration during beer fermentation using ultrasonic measurements and machine learning. Fermentation 7:34. doi: 10.3390/fermentation7010034

Bowler, A. L., Bakalis, S., and Watson, N. J. (2020a). A review of in-line and on-line measurement techniques to monitor industrial mixing processes. Chem. Eng. Res. Des. 153, 463–495. doi: 10.1016/j.cherd.2019.10.045

Bowler, A. L., Bakalis, S., and Watson, N. J. (2020b). Monitoring mixing processes using ultrasonic sensors and machine learning. Sensors 20:1813. doi: 10.3390/s20071813

Burton, W.G., (1966). The potato. A survey of its history and of factors influencing its yield, nutritive value, quality and storage., (Edn 2 (revised)). Netherlands: H. Veenman en Zonen N.V.

Chao, K., Yang, C.-C., Kim, M. S., and Chan, D. E. (2008). High throughput spectral imaging system for wholesomeness inspection of chicken. Appl. Eng. Agric. 24, 475–485. doi: 10.13031/2013.25135

Chen, J. Y., Zhang, H., Miao, Y., and Matsunaga, R. (2005). NIR measurement of specific gravity of potato. Food Sci. Technol. Res. 11, 26–31. doi: 10.3136/fstr.11.26

Chen, X. D., Li, D. X. Y., Lin, S. X. Q., and Necati, O. (2004). On-line fouling/cleaning detection by measuring electric resistance — equipment development and application to milk fouling detection and chemical cleaning monitoring. Science 61, 181–189. doi: 10.1016/S0260-8774(03)00085-2

Chen, Y. R., Chao, K., and Kim, M. S. (2002). Machine vision technology for agricultural applications. Comput. Electron. Agric. 36, 173–191. doi: 10.1016/S0168-1699(02)00100-X

Clement, C. L., Kauwe, S. K., and Sparks, T. D. (2020). Benchmark AFLOW data sets for machine learning. Integr. Mater. Manuf. Innov. 9, 153–156. doi: 10.1007/s40192-020-00174-4

Coley, C. W., Green, W. H., and Jensen, K. F. (2018). Machine learning in computer-aided synthesis planning. Acc. Chem. Res. 51, 1281–1289. doi: 10.1021/acs.accounts.8b00087

COMCEC (2016). COMCEC Coordination Office Reducing Postharvest Losses in the OIC Member Countries COMCEC COORDINATION OFFICE September 2016 Standing Committee for Economic and Commercial Cooperation of the Organization of Islamic Cooperation (COMCEC); Ankara, Turkey, 2016. p.15-18.

de Medeiros, A. D., Bernardes, R. C., da Silva, L. J., de Freitas, B. A. L., dos Dias, D. C. F., and da Silva, C.B. (2021). Deep learning-based approach using X-ray images for classifying Crambe abyssinica seed quality. Ind. Crops Prod. 164:113378. doi: 10.1016/j.indcrop.2021.113378

Department for Business Energy and Industrial Strategy (2017). Made Smarter Review 2017. London.

Du, C. J., Cheng, Q., and Sun, D. W. (2012). Computer vision in the bakery industry. Comput. Vis. Technol. Food Beverage Ind. 422–450. doi: 10.1533/9780857095770.3.422

Dull, G. G., Birth, G. S., and Leffler, R. G. (1989). Use of near infrared analysis for the nondestructive measurement of dry matter in potatoes. Am. Potato J. 66, 215–225. doi: 10.1007/BF02853444

Eide, M. H., Homleid, J. P., and Mattsson, B. (2003). Life cycle assessment (LCA) of cleaning-in-place processes in dairies. LWT Food Sci. Technol. 36, 303–314. doi: 10.1016/S0023-6438(02)00211-6

Escrig, J., Woolley, E., Rangappa, S., Simeone, A., and Watson, N. J. (2019). Clean-in-place monitoring of different food fouling materials using ultrasonic measurements. Food Control 104, 358–366. doi: 10.1016/J.FOODCONT.2019.05.013

Farina, L., Scapaticci, R., Tobon Vasquez, J. A., Rivero, J., Litman, A., Crocco, L., et al. (2019). “Microwave imaging technology for in-line food contamination monitoring,” in Proceedings of the 2019 IEEE International Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting, APSURSI 2019, (Atlanta, US) 817–818.

Fariñas, L., Contreras, M., Sanchez-Jimenez, V., Benedito, J., and Garcia-Perez, J. V. (2021). Use of air-coupled ultrasound for the non-invasive characterization of the textural properties of pork burger patties. J. Food Eng. 297:110481. doi: 10.1016/j.jfoodeng.2021.110481

Fisher, O. J., Watson, N. J., Escrig, J. E., Witt, R., Porcu, L., Bacon, D., et al. (2020). Considerations, challenges and opportunities when developing data-driven models for process manufacturing systems. Comput. Chem. Eng. 2020:106881. doi: 10.1016/j.compchemeng.2020.106881

Food and Agriculture Organization of the United Nations (2017). The Future of Food and Agriculture: Trends and Challenges. Rome.

Food Drink Federation Statistics at a Glance (2021). Available online at: https://www.gdalabel.org.uk/statsataglance.html (accessed December 1, 2020).

García-Esteban, J. A., Curto, B., Moreno, V., González-Martín, I., Revilla, I., and Vivar-Quintana, A. (2018). “A digitalization strategy for quality control in food industry based on Artificial Intelligence techniques,” in Proceedings of the 2018 IEEE 16th International Conference on Industrial Informatics (INDIN), (Porto, Portugal) 221–226.

Garcia-Garcia, G., Stone, J., and Rahimifard, S. (2019a). Opportunities for waste valorisation in the food industry – A case study with four UK food manufacturers. J. Clean. Prod. 211, 1339–1356. doi: 10.1016/J.JCLEPRO.2018.11.269

Garcia-Garcia, G., Woolley, E., and Rahimifard, S. (2019b). Identification and analysis of attributes for industrial food waste management modelling. Sustainability 11:2445. doi: 10.3390/su11082445

Garcia-Garcia, G., Woolley, E., Rahimifard, S., Colwill, J., White, R., and Needham, L. (2017). A methodology for sustainable management of food waste. Waste Biomass Valorization 8, 2209–2227. doi: 10.1007/s12649-016-9720-0

Garre, A., Ruiz, M. C., and Hontoria, E. (2020). Application of machine learning to support production planning of a food industry in the context of waste generation under uncertainty. Oper. Res. Perspect. 7:100147. doi: 10.1016/j.orp.2020.100147

Geronimo, B. C., Mastelini, S. M., Carvalho, R. H., Barbon Júnior, S., Barbin, D. F., Shimokomaki, M., et al. (2019). Computer vision system and near-infrared spectroscopy for identification and classification of chicken with wooden breast, and physicochemical and technological characterization. Infrared Phys. Technol. 96, 303–310. doi: 10.1016/j.infrared.2018.11.036

Gonzalez, R. C., and Woods, R. E. (2006). Color image processing. In Digital Image Processing, 3rd Edn. (Pearson: New York, US) 2006. pp. 416-482

Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning; MIT Press

Hadavandi, E., Shahrabi, J., and Shamshirband, S. (2015). A novel Boosted-neural network ensemble for modeling multi-target regression problems. Eng. Appl. Artif. Intell. 45, 204–219. doi: 10.1016/J.ENGAPPAI.2015.06.022

Helgerud, T., Segtnan, V. H., Wold, J. P., Ballance, S., Knutsen, S. H., Rukke, E. O., et al. (2012). Near-infrared spectroscopy for rapid estimation of dry matter content in whole unpeeled potato tubers. J. Food Res. 1, 55–65. doi: 10.5539/jfr.v1n4p55

Hoche, S., Krause, D., Hussein, M. A., and Becker, T. (2016). Ultrasound-based, in-line monitoring of anaerobe yeast fermentation: model, sensor design and process application. Int. J. Food Sci. Technol. 51, 710–719. doi: 10.1111/ijfs.13027

Huang, H., Liu, L., and Ngadi, M. O. (2014). Recent developments in hyperspectral imaging for assessment of food quality and safety. Sensors 14, 7248–7276. doi: 10.3390/s140407248

Jagtap, S., Bhatt, C., Thik, J., and Rahimifard, S. (2019). Monitoring potato waste in food manufacturing using image processing and internet of things approach. Sustainability 11:3173. doi: 10.3390/su11113173

Jagtap, S., and Rahimifard, S. (2019). The digitisation of food manufacturing to reduce waste – Case study of a ready meal factory. Waste Manage. 87, 387–397. doi: 10.1016/j.wasman.2019.02.017

Jha, K., Doshi, A., Patel, P., and Shah, M. (2019). A comprehensive review on automation in agriculture using artificial intelligence. Artif. Intell. Agric. 2, 1–12. doi: 10.1016/j.aiia.2019.05.004

Kang, S., Lee, K.-J., Choi, W., Son, J.-R., Choi, D.-S., and Kim, G. (2003). “A near-infrared sensing technique for measuring the quality of potatoes,” in Proceedings of the 2003 ASAE Annual Meeting, (Las Vegas, US) 033137.

Kingma, D. P., and Ba, J. A. (2014). “A method for stochastic optimization,” in Proceedings of the ICLR 2015 (San Diego, CA), 1–15.

Krstajic, D., Buturovic, L. J., Leahy, D. E., and Thomas, S. (2014). Cross-validation pitfalls when selecting and assessing regression and classification models. J. Cheminform. 6, 1–15. doi: 10.1186/1758-2946-6-10

Kuhn, M., and Johnson, K. (2020). Feature Engineering and Selection: A Practical Approach for Predictive Models, 1st Edn. Milton, ON: CRC Press.

Kutyła-Olesiuk, A., Zaborowski, M., Prokaryn, P., and Ciosek, P. (2012). Monitoring of beer fermentation based on hybrid electronic tongue. Bioelectrochemistry 87, 104–113. doi: 10.1016/j.bioelechem.2012.01.003

Lachenmeier, D. W., Godelmann, R., Steiner, M., Ansay, B., Weigel, J., and Krieg, G. (2010). Rapid and mobile determination of alcoholic strength in wine, beer and spirits using a flow-through infrared sensor. Chem. Cent. J. 4, 5–18. doi: 10.1186/1752-153X-4-5

Ladha-Sabur, A., Bakalis, S., Fryer, P. J., and Lopez-Quiroga, E. (2019). Mapping energy consumption in food manufacturing. Trends Food Sci. Technol. 86, 270–280. doi: 10.1016/j.tifs.2019.02.034

Laughman, C. (2019) State of Food Manufacturing Survey. Available online: https://www.foodengineeringmag.com/articles/98519-state-of-food-manufacturing-survey (accessed on Nov 16, 2020).

Liakos, K. G., Busato, P., Moshou, D., Pearson, S., and Bochtis, D. (2018). Machine learning in agriculture: a review. Sensors 18, 1–29. doi: 10.3390/s18082674

Liu, Y. C., Wang, F. S., and Lee, W. C. (2001). On-line monitoring and controlling system for fermentation processes. Biochem. Eng. J. 7, 17–25. doi: 10.1016/S1369-703X(00)00100-5

Liu, Z., He, Y., Cen, H., and Lu, R. (2018). Deep feature representation with stacked sparse auto-encoder and convolutional neural network for hyperspectral imaging-based detection of cucumber defects. Trans. ASABE 61, 425–436. doi: 10.13031/trans.12214

MacKay, D. J. C. (1992). “Bayesian interpolation,” in Maximum Entropy and Bayesian Methods. 1992. 4, p.415-447. doi: 10.1162/neco.1992.4.3.415

Mahendran, R., Gc, J., and Alagusundaram, K. (2015). Application of computer vision technique on sorting and grading of fruits and vegetables. J. Food Process. Technol. S1, 1–7. doi: 10.4172/2157-7110.S1-001

Mathanker, S. K., Weckler, P. R., and Bowser, T. J. (2013). X-ray applications in food and agriculture: a review. Trans. ASABE 56, 1227–1239. doi: 10.13031/trans.56.9785

McClements, D. J. (1995). Advances in the application of ultrasound in food analysis and processing. Trends Food Sci. Technol. 6, 293–299. doi: 10.1016/S0924-2244(00)89139-6

McGrath, T. F., Haughey, S. A., Islam, M., and Elliott, C. T. (2021). The potential of handheld near infrared spectroscopy to detect food adulteration: results of a global, multi-instrument inter-laboratory study. Food Chem. 353:128718. doi: 10.1016/j.foodchem.2020.128718

Nicolaï, B. M., Beullens, K., Bobelyn, E., Peirs, A., Saeys, W., Theron, K. I., et al. (2007). Nondestructive measurement of fruit and vegetable quality by means of NIR spectroscopy: a review. Postharvest Biol. Technol. 46, 99–118. doi: 10.1016/j.postharvbio.2007.06.024

Ok, G., Park, K., Kim, H. J., Chun, H. S., and Choi, S.-W. (2014). High-speed terahertz imaging toward food quality inspection. Appl. Opt. 53, 1406–1412. doi: 10.1364/ao.53.001406

Oliveira, M. M., Cruz-Tirado, J. P., Roque, J. V., Teófilo, R. F., and Barbin, D. F. (2020). Portable near-infrared spectroscopy for rapid authentication of adulterated paprika powder. J. Food Compos. Anal. 87:103403. doi: 10.1016/j.jfca.2019.103403

Pan, S. J., Tsang, I. W., Kwok, J. T., and Yang, Q. (2011). Domain adaptation via transfer component analysis. IEEE Trans. Neural Netw. 22, 199–210. doi: 10.1109/TNN.2010.2091281

Parry, A., Harris, B., Fisher, K., and Forbes, H. (2020). UK Progress Against Courtauld 2025 Targets and UN Sustainable Development Goal 12.3. Banbury.

Pereira, A., Mendes, J., and Melo, L. F. (2009). Monitoring cleaning-in-place of shampoo films using nanovibration technology. Sensors Actuat. B Chem. 136, 376–382. doi: 10.1016/j.snb.2008.11.043

Pereira, L. A. M., Piza, L. V., Vicente, M., Arce, A. I. C., de Silva, A. C. S., Tech, A. R. B., et al. (2016). Construction of an experimental pilot-scale electric oven using wireless sensor instrumentation for baked food evaluation. Food Bioprod. Process. 100, 214–220. doi: 10.1016/j.fbp.2016.07.004

Pettigrew, L., Blomenhofer, V., Hubert, S., Groß, F., and Delgado, A. (2015). Optimisation of water usage in a brewery clean-in-place system using reference nets. J. Clean. Prod. 87, 583–593. doi: 10.1016/j.jclepro.2014.10.072

Philipsen, M. P., and Moeslund, T. B. (2019). “Intelligent injection curing of bacon,” in Proceedings of the Procedia Manufacturing, (Limerick, Ireland) 148–155.

Porep, J. U., Kammerer, D. R., and Carle, R. (2015). On-line application of near infrared (NIR) spectroscopy in food production. Trends Food Sci. Technol. 46, 211–230. doi: 10.1016/j.tifs.2015.10.002

Rady, A., Fischer, J., Reeves, S., Logan, B., and Watson, N. J. (2019). The effect of light intensity, sensor height, and spectral pre-processing methods when using NIR spectroscopy to identify different allergen-containing powdered foods. Sensors 20:230. doi: 10.3390/s20010230

Rady, A. M., and Guyer, D. E. (2015). Rapid and/or nondestructive quality evaluation methods for potatoes: a review. Comput. Electron. Agric. 117, 31–48. doi: 10.1016/j.compag.2015.07.002

Rady, A. M., Guyer, D. E., and Watson, N. J. (2021). Near-infrared spectroscopy and hyperspectral imaging for sugar content evaluation in potatoes over multiple growing seasons. Food Anal. Methods 14, 581–595. doi: 10.1007/s12161-020-01886-1

Rehman, T. U., Mahmud, M. S., Chang, Y. K., Jin, J., and Shin, J. (2019). Current and future applications of statistical machine learning algorithms for agricultural machine vision systems. Comput. Electron. Agric. 156, 585–605. doi: 10.1016/j.compag.2018.12.006

Ren, A., Zahid, A., Fan, D., Yang, X., Imran, M. A., Alomainy, A., et al. (2019). State-of-the-art in terahertz sensing for food and water security – A comprehensive review. Trends Food Sci. Technol. 85, 241–251. doi: 10.1016/j.tifs.2019.01.019

Resa, P., Elvira, L., and de Espinosa, F. (2004). Concentration control in alcoholic fermentation processes from ultrasonic velocity measurements. Food Res. Int. 37, 587–594. doi: 10.1016/j.foodres.2003.12.012

Resa, P., Elvira, L., Montero De Espinosa, F., González, R., and Barcenilla, J. (2009). On-line ultrasonic velocity monitoring of alcoholic fermentation kinetics. Bioprocess Biosyst. Eng. 32, 321–331. doi: 10.1007/s00449-008-0251-3

Ribeiro, A. M. N. C., do Carmo, P. R. X., Rodrigues, I. R., Sadok, D., Lynn, T., and Endo, P. T. (2020). Short-term firm-level energy-consumption forecasting for energy-intensive manufacturing: a comparison of machine learning and deep learning models. Algorithms 13:274. doi: 10.3390/a13110274

Saha, D., and Manickavasagan, A. (2021). Machine learning techniques for analysis of hyperspectral images to determine quality of food products: a review. Curr. Res. Food Sci. 4, 28–44. doi: 10.1016/j.crfs.2021.01.002

Saka, M. P., Doğan, E., and Aydogdu, I. (2013). Analysis of swarm intelligence–based algorithms for constrained optimization. Swarm Intell. Bioinspired Comput. 2013, 25–48. doi: 10.1016/B978-0-12-405163-8.00002-8

Santos, D., Nunes, L. C., De Carvalho, G. G. A., Gomes, M. D. S., De Souza, P. F., Leme, F. D. O., et al. (2012). Laser-induced breakdown spectroscopy for analysis of plant materials: a review. Spectrochim. Acta B Spectrosc. 71, 3–13. doi: 10.1016/j.sab.2012.05.005

Scanlon, M. G., Pritchard, M. K., and Adam, L. R. (1999). Quality evaluation of processing potatoes by near infrared reflectance. J. Sci. Food Agric. 79, 763–771. doi: 10.1002/(SICI)1097-0010(199904)79:5<763::AID-JSFA250>3.0.CO;2-O

Schöck, T., and Becker, T. (2010). Sensor array for the combined analysis of water-sugar-ethanol mixtures in yeast fermentations by ultrasound. Food Control 21, 362–369. doi: 10.1016/j.foodcont.2009.06.017

Sharma, R., Kamble, S. S., Gunasekaran, A., Kumar, V., and Kumar, A. (2020). A systematic literature review on machine learning applications for sustainable agriculture supply chain performance. Comput. Oper. Res. 119:104926. doi: 10.1016/j.cor.2020.104926

Shaw, A. D., Kaderbhai, N., Jones, A., Woodward, A. M., Goodacre, R., Rowland, J. J., et al. (1999). Noninvasive, on-line monitoring of the biotransformation by yeast of glucose to ethanol using dispersive Raman spectroscopy and chemometrics. Appl. Spectrosc. 53, 1419–1428. doi: 10.1366/0003702991945777

Shrestha, A., and Mahmood, A. (2019). Review of deep learning algorithms and architectures. IEEE Access 7, 53040–53065. doi: 10.1109/ACCESS.2019.2912200

Simeone, A., Deng, B., Watson, N., and Woolley, E. (2018). Enhanced clean-in-place monitoring using ultraviolet induced fluorescence and neural networks. Sensors 18, 1–21. doi: 10.3390/s18113742

Solomatine, D., See, L. M., and Abrahart, R. J. (2008). “Data-driven modelling: concepts, approaches and experiences,” in Practical Hydroinformatics. Water Science and Technology Library, eds R. J. Abrahart, L. M. See, and D. P. Solomatine (Berlin: Springer), 17–31.

Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. (2014). Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958.

Storey, M. (2007). “The harvested crop,” in eds D. Vreugdenhil, J. Bradshaw, C. Gebhardt, F. Govers, D. K. L. Mackerron, M. A. Taylor, and H. A. Ross (Amsterdam: Elsevier), 441–470.

Su, Y., Zhang, M., Adhikari, B., Mujumdar, A. S., and Zhang, W. (2018). Improving the energy efficiency and the quality of fried products using a novel vacuum frying assisted by combined ultrasound and microwave technology. Innov. Food Sci. Emerg. Technol. 50, 148–159. doi: 10.1016/j.ifset.2018.10.011

Sun, Q., Zhang, M., and Mujumdar, A. S. (2019). Recent developments of artificial intelligence in drying of fresh food: a review. Crit. Rev. Food Sci. Nutr. 59, 2258–2275. doi: 10.1080/10408398.2018.1446900

Svendsen, C., Cieplak, T., and Van Den Berg, F. W. J. (2016). Exploring process dynamics by near infrared spectroscopy in lactic fermentations. J. Near Infrared Spectrosc. 24, 443–451. doi: 10.1255/jnirs.1244

Takacs, R., Jovicic, V., Zbogar-Rasic, A., Geier, D., Delgado, A., and Becker, T. (2020). Evaluation of baking performance by means of mid-infrared imaging. Innov. Food Sci. Emerg. Technol. 61:102327. doi: 10.1016/J.IFSET.2020.102327

Toledo, J., Ruiz-Díez, V., Pfusterschmied, G., Schmid, U., and Sánchez-Rojas, J. L. (2018). Flow-through sensor based on piezoelectric MEMS resonator for the in-line monitoring of wine fermentation. Sensors Actuat. B Chem. 254, 291–298. doi: 10.1016/j.snb.2017.07.096

Tomasevic, I., Tomovic, V., Milovanovic, B., Lorenzo, J., Ðorđevi,ć, V., and Karabasil, N.. (2019). Comparison of a computer vision system vs. traditional colorimeter for color evaluation of meat products with various physical properties. Meat Sci. 148, 5–12. doi: 10.1016/j.meatsci.2018.09.015

Tsamos, K. M., Ge, Y. T., Santosa, I.D., Tassou, S. A., Bianchi, G., and Mylona, Z. (2017). Energy analysis of alternative CO2 refrigeration system configurations for retail food applications in moderate and warm climates. Energy Convers. Manag. 150, 822–829. doi: 10.1016/j.enconman.2017.03.020

Van De Looverbosch, T., Rahman Bhuiyan, M. H., Verboven, P., Dierick, M., Van Loo, D., De Beenbouwer, J., et al. (2020). Nondestructive internal quality inspection of pear fruit by X-ray CT using machine learning. Food Control 113:107170. doi: 10.1016/j.foodcont.2020.107170

Vann, L., Layfield, J. B., and Sheppard, J. D. (2017). The application of near-infrared spectroscopy in beer fermentation for online monitoring of critical process parameters and their integration into a novel feedforward control strategy. J. Inst. Brew. 123, 347–360. doi: 10.1002/jib.440

Veale, E. L., Irudayaraj, J., and Demirci, A. (2007). An on-line approach to monitor ethanol fermentation using FTIR spectroscopy. Biotechnol. Prog. 23, 494–500. doi: 10.1021/bp060306v

Wallhäußer, E., Hussein, M. A., and Becker, T. (2012). Detection methods of fouling in heat exchangers in the food industry. Food Control 27, 1–10. doi: 10.1016/j.foodcont.2012.02.033

Wilson, D. I. (2018). Fouling during food processing – progress in tackling this inconvenient truth. Curr. Opin. Food Sci. 23, 105–112. doi: 10.1016/j.cofs.2018.10.002

Wu, D., and Sun, D. W. (2013). Colour measurements by computer vision for food quality control - A review. Trends Food Sci. Technol. 29, 5–20. doi: 10.1016/j.tifs.2012.08.004

Xie, H., Tang, H., and Liao, Y. H. (2009). “Time series prediction based on narx neural networks: an advanced approach,” in Proceedings of the Proceedings of the 2009 International Conference on Machine Learning and Cybernetics, (Baoding, China) 1275–1279.

Yu, Y., Zhang, K., Yang, L., and Zhang, D. (2019). Fruit detection for strawberry harvesting robot in non-structural environment based on Mask-RCNN. Comput. Electron. Agric. 163:104846. doi: 10.1016/j.compag.2019.06.001

Zeng, X., and Luo, G. (2017). Progressive sampling-based Bayesian optimization for efficient and automatic machine learning model selection. Heal. Inf Sci. Syst. 5, 1–21. doi: 10.1007/s13755-017-0023-z

Zhou, L., Zhang, C., Liu, F., Qiu, Z., and He, Y. (2019). Application of deep learning in food: a review. Compr. Rev. Food Sci. Food Saf. 18, 1793–1811. doi: 10.1111/1541-4337.12492

Zou, X., and Zhao, J. (2015). Machine vision online measurements. Nondestructive Measurement in Food and Agro-Products. (Springer: Amsterdam, Netherlands) 2015. pp.11-56

Keywords: digital manufacturing, sensors, machine learning, food and drink manufacturing, intelligent manufacturing, industry 4.0, industrial digital technologies

Citation: Watson NJ, Bowler AL, Rady A, Fisher OJ, Simeone A, Escrig J, Woolley E and Adedeji AA (2021) Intelligent Sensors for Sustainable Food and Drink Manufacturing. Front. Sustain. Food Syst. 5:642786. doi: 10.3389/fsufs.2021.642786

Received: 16 December 2020; Accepted: 01 October 2021;
Published: 05 November 2021.

Edited by:

Jeremy Graham Frey, University of Southampton, United Kingdom

Reviewed by:

Marta Prado, International Iberian Nanotechnology Laboratory (INL), Portugal
Steve Brewer, University of Lincoln, United Kingdom

Copyright © 2021 Watson, Bowler, Rady, Fisher, Simeone, Escrig, Woolley and Adedeji. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Nicholas J. Watson, nicholas.watson@nottingham.ac.uk
