Deep learning and IoT enabled digital twin framework for monitoring open-pit coal mines

Yu, Rui; Yang, Xiuyu; Cheng, Kai

doi:10.3389/fenrg.2023.1265111

ORIGINAL RESEARCH article

Front. Energy Res., 23 October 2023

Sec. Sustainable Energy Systems

Volume 11 - 2023 | https://doi.org/10.3389/fenrg.2023.1265111

Rui Yu*

Xiuyu Yang

Kai Cheng

China Coal Huajin Group Co., Ltd., Yuncheng City, Shaanxi, China

Early detection of cracks enables timely mitigation and maintenance actions, ensuring the safety of personnel and equipment within the open-pit coal mine. Monitoring open-pit coal mines and cracks is essential for the safety of workers and for saving national assets. Digital twins (DTs) can be crucial in open-pit coal mine crack detection. DTs enable continuous real-time monitoring of the open-pit mine, including its structures and surrounding environment. Various sensors and internet-of-things devices can be deployed to collect data on factors such as ground movement and strain. Integrating this data into the DT makes it possible to identify and analyze anomalous behavior or changes that may indicate crack formation or propagation. Deep learning-based networks are a crucial factor in detecting open-pit coal mine cracks. In this work, we propose a deep learning-based, densely connected lightweight network incorporated into a DT-based framework for detecting cracks and making predictive maintenance decisions by combining historical data, real-time sensor data, and predictive models. The proposed DT-based framework provides insights into potential crack formation, allowing for proactive maintenance and mitigation measures. We compare the performance of our proposed network on evaluation measures including precision, recall, overall accuracy, mean average precision, F1-score, and kappa coefficient; our proposed lightweight multiscale feature fusion-based network outperformed all other state-of-the-art deep neural networks considered, including on mean average precision. We also compared the proposed network with U-Net and a recurrent neural network on model training and prediction time benchmarks, where it outperformed both models.

1 Introduction

Modern computing technologies, for instance, the Internet-of-Things (IoT) Sasikumar et al. (2023), high-speed networks Bing et al. (2023) Hassan et al. (2018), and cloud computing Xu et al. (2023), have ubiquitous real-world applications Hassan et al. (2021) Liu et al. (2023) Hassan et al. (2023). The massive amounts of data collected across various domains must be stored and processed, and they require specific monitoring and management techniques. The use of technology is significant for geographical information systems Hussain et al. (2023). Mineral resources constitute a crucial component of the material foundation of human society and a vital assurance for economic growth and national security. China is a large, mineral-rich nation, and certain regions contain deep-buried mineral deposits that are simple to exploit Church and Crawford (2020). For short-term gain, some illegal miners violate the rules governing the extraction of natural resources, and others even occupy land for private mining without permission. These actions have resulted in severe ecological harm and depletion of national resources, and it is difficult for the appropriate regulatory agencies to identify such illicit mining practices promptly. They also produce open-pit cracks, which can endanger workers and degrade the land. Therefore, it is crucial to assess the extent, usage, and damage of land in open-pit mining regions quickly and precisely in order to spot cracks in the coal mines and take prompt preventive measures Wang et al. (2020). To achieve effective crack detection in open-pit coal mines, advanced monitoring technologies such as remote sensing, geophysical surveys, ground-based radar, and drone-based inspections are often employed. These technologies allow for continuous monitoring of mine walls and slopes, enabling early detection of cracks and the implementation of preventive measures, ultimately ensuring the safety of personnel and the protection of assets and the environment.

Digital twins (DTs) are a valuable tool in the coal mining industry. A DT is a virtual representation of a physical asset or system, such as a mine, that captures real-time data and provides a digital replica of its operations, processes, and performance Hassan et al. (2022). It combines real-time data, simulations, and analytics to give a comprehensive understanding of the asset’s behavior and performance. DTs can therefore play a crucial role in open-pit coal mine crack detection. They enable continuous real-time monitoring of the open-pit mine, including its structures and surrounding environment. Various sensors and IoT devices can collect data on factors such as ground movement, vibrations, temperature, and strain Prauzek et al. (2023). Integrating this data into the DT makes it possible to identify and analyze any anomalous behavior or changes that may indicate crack formation or propagation. DTs also provide a platform for data-driven decision-making. By leveraging the insights generated from the digital twin, mine operators and engineers can make informed decisions about maintenance schedules, structural interventions, and resource allocation to address identified cracks. The DT serves as a decision-support tool by providing a comprehensive understanding of the mine’s condition and the potential consequences of different actions. DTs can further facilitate long-term monitoring and maintenance of the open-pit mine. By continuously updating the DT with real-time data, it becomes possible to track the evolution of cracks over time and assess the effectiveness of maintenance interventions. This helps in developing predictive maintenance strategies and optimizing the lifecycle management of the mine’s structures. In short, DTs provide a holistic approach to crack detection in open-pit coal mines by integrating real-time monitoring, predictive analytics, simulation, visualization, and decision support. By leveraging our proposed DT-based framework, mine operators can enhance safety, optimize maintenance efforts, and minimize the risks associated with crack formation and structural instability.

Open-pit coal mining is a type of surface mining that involves the extraction of coal from a large open pit or excavation Benndorf (2013). Various services are typically involved in operating and maintaining open-pit coal mines, for example, exploration and geological services Mao (2020), engineering and design Domingues et al. (2017), mining equipment and machinery, coal processing and preparation, and waste management and disposal services Yıldız (2020). These services are integral to the successful operation of open-pit coal mines, ensuring efficient extraction, processing, and responsible management of coal resources while prioritizing safety and environmental considerations Liu et al. (2022). Illegal mining is frequently rapid and aggressive to elude oversight, and such intense open-pit coal mining may seriously damage the surrounding ecosystem. Open-pit mining can drastically change the original landform. Many researchers have applied remote sensing images to the extraction of open-pit mine data in an effort to identify illicit mining activity more effectively and quickly. Most studies use an object-oriented strategy to gather data from open-pit mines Huo et al. (2021). Several other studies have shown that deep learning can process remote sensing images effectively. However, since these approaches run on a single computer, training deep learning models may take a long time, especially when a large amount of remote sensing data is needed. Detecting cracks precisely and in less time therefore remains an open problem. Deep learning-based lightweight networks are a suitable solution to this problem, and they can also be embedded in IoT devices to detect surface cracks in open-pit coal mines.

The process of open-pit coal mining involves excavating large areas of land to extract coal deposits. This process can result in ground movement and subsidence, which may cause cracks in the surrounding terrain. Subsidence can occur when coal removal destabilizes the overlying rock layers, leading to the sinking or settling of the ground. Open-pit coal mines often have steep slopes or high walls to access coal seams. These slopes can be unstable, manifesting as cracks or fractures along the slope faces Ning et al. (2020). Geological conditions, weathering, erosion, and mining activities can contribute to slope instability and potential cracking. Coal mining involves the extraction of coal seams from underground, which can induce fracturing or cracking in the surrounding rock strata. These fractures can extend from the underground workings to the surface, potentially affecting the stability and integrity of the open-pit mine Xiao et al. (2021). Therefore, timely monitoring and detecting these cracks is crucial to implementing preventive measures. Coal mining companies must implement comprehensive monitoring and management practices to identify and mitigate any potential issues related to ground movement, slope stability, and strata fracturing. These measures can help ensure the safety of personnel and infrastructure within and around the open-pit coal mine Lanciano and Salvini (2020).

For open-pit coal mines, the DT can utilize advanced analytics and machine learning algorithms to analyze the collected data and identify patterns or trends associated with crack development. By combining historical data, real-time sensor data, and predictive models, the DT can provide insights into potential crack formation, allowing for proactive maintenance and mitigation measures. DTs can simulate and visualize the behavior of the open-pit mine and its structures under various conditions. This includes simulating the impact of different loads, environmental factors, and mining activities on the stability of the mine. We have incorporated the crack detection algorithms into the simulation so that potential crack locations are identified and their propagation patterns are visualized. This helps in understanding the potential risks and assists in planning appropriate remediation actions.

In this work, we propose and develop a DT-based framework for crack detection and open-pit mine extraction in open-pit coal mines by leveraging a deep learning-based lightweight, densely connected network. We utilize Sentinel-2 imagery Madhuanand et al. (2021) of open-pit coal mines as the dataset for training and testing our proposed framework. To the best of our knowledge, this is the first DT-based framework for detecting cracks and taking preventive measures against future accidents in open-pit coal mines. We also compared our proposed lightweight multiscale feature fusion-based method with well-known models such as AlexNet Krizhevsky et al. (2012), VGG-16 Simonyan and Zisserman (2014), VGG-19 Simonyan and Zisserman (2014), GoogleNet Szegedy et al. (2015), ResNet-50 He et al. (2016), Single Shot Detector (SSD) Liu et al. (2016), DenseNet121 Huang et al. (2017), and MobileNetv2 Sandler et al. (2018). Our method outperformed these models in all comparisons. The contributions of this work can be summarized as follows.

• This research focuses on detecting cracks in open-pit coal mines early. By identifying cracks promptly, it contributes to ensuring the safety of personnel and equipment within these mines.

• The utilization of DT in this work allows for continuous, real-time monitoring of open-pit coal mines, including their structures and surrounding environment. This is a significant contribution, as it enhances safety measures through comprehensive data collection.

• Integrating various sensors and IoT devices for data collection is critical to this work. It contributes by enabling the monitoring of factors such as ground movement and strain, which are essential indicators of potential crack formation.

• The proposal of a deep learning-based densely connected lightweight network within the DT framework is a notable contribution. The comparison of the proposed network with state-of-the-art models on various evaluation measures demonstrates its superior performance.

In our proposed work, we enhance the safety and efficiency of open-pit coal mines by employing DTs, integrating sensor data, implementing deep learning techniques, and facilitating proactive maintenance decisions. The superior performance of the proposed network compared to existing models further underscores its importance in the field of crack detection and mine safety.

The organization of this work is as follows. Section 2 presents the literature review and compares our proposed work with other related works. The methodological description of our proposed work is available in Section 3. We explain the evaluation metrics used in this work in Section 4. The experimental results are presented in Section 5 while the concluding remarks and future research directions are available in Section 6.

2 Literature review

This section reviews the existing literature on deep learning applications for coal mines and the use of digital twins in IoT-related applications.

2.1 Deep learning and coal mining applications

Open-pit coal mines exhibit diverse geometries and comprise multiple elements, resulting in a complex composite target encompassing drainage fields, mining pits, side gangs, and other heterogeneous spatial features Marove et al. (2022). Traditional approaches to image feature extraction heavily rely on manually designed extractors, which often lack robustness and generalization, leading to suboptimal accuracy when applied to practical open-pit coal mine extraction. In recent years, Convolutional Neural Network (CNN) models have shown significant success in a range of image-related tasks, including target segmentation Zhang et al. (2021), localization detection Chen et al. (2022), and image classification Hassan et al. (2019). Furthermore, CNN models have demonstrated their applicability in diverse domains such as text classification Minaee et al. (2021) and speech recognition Wang et al. (2019a). We review several deep learning-based models in the context of open-pit coal mines in the following literature.

Remote sensing data from the Landsat series was used by Zhu et al. (2020) to create guidelines for extracting mining exploitation locations based on the distinctive land use features seen in open-pit mines. They used an object-oriented categorization strategy to extract land use data in open-pit mining regions and applied principal component analysis for dimension reduction of slices of imagery data. However, their approach studied only the ecological impacts without highlighting the severe threats to the mines. To find the best segmentation scale for locating and labeling typical features in mining sites, Wang et al. (2019b) used histogram comparison approaches. Their method was applied to Gaofen-1 and ZY-3 high-resolution satellite images. While their work introduces a deep learning approach for object detection in high-resolution remote sensing images, it falls short in providing sufficient methodological details and benchmarking and in addressing potential limitations. Using Landsat-8 satellite data and an object-oriented classification approach, Huangfu and Li (2020) categorized the Baotou Baiyun Ebo mining region. Results were compared to those obtained using supervised classification techniques, showing that the object-oriented strategy performed better. Huo et al. (2021) studied open-pit coal mining in Yuzhou City, Henan Province, using Gaofen-2 satellite data and support vector machine (SVM) techniques to extract land use data for mining regions. According to their results, their strategy is superior to other object-oriented approaches and fused K-nearest neighbor. Vorovencii (2021) utilized Landsat images to map surface mining and reclamation in mining regions, using several SVM classification techniques to examine satellite images for evidence of reclamation activity and surface mining. However, that work does not discuss the potential limitations or challenges of using traditional image processing methods for change detection in mining areas. It does not address the issues of noise, variability in lighting conditions, or complex spatial patterns that can affect the accuracy of change detection. Our proposed work has the potential to overcome some of these challenges by learning robust features directly from the data.

Demirel et al. (2011) examined high-resolution multidimensional satellite images taken from 2003 to 2008 of the Goynuk open-pit mine in Turkey. The SVM classification approach was used in their investigation to locate and quantify characteristics in the mining region. Another study employed Landsat-TM images with a spatial resolution of 30 m to acquire information about the mining area Yuan et al. (2013), using object-oriented supervised classification (ORSC) approaches to get accurate findings. Using a CNN, Chen et al. (2020) extracted and classified construction and usage data from high-resolution Gaofen-2 satellite images of the Jiangcang fifth open-pit mining in the Yuzhou City of China. The classification results were assessed and compared to the standard pixel-based maximum likelihood technique, demonstrating that the ORSC paired with SVM produced good accuracy and quality results. Another study used Sentinel-2A satellite images as a data source for extracting land use data for mining zones Shao et al. (2020). Their approach combined supervised classification with normalized index computation, successfully improving classification accuracy and permitting the extraction of diverse characteristics over a vast region.

ResCapsNet is a deep model proposed by Guan et al. (2022) that incorporates a band selection strategy based on clustering with capsule and residual networks. Using data from the Gaofen-5 satellite, they assessed the stability of the model. Using multimodal remote sensing information and including multi-scale kernel functions, Qian et al. (2021) presented a multi-stream CNN model they called 3M-CNN. Using Gaofen-2 high-resolution satellite images, they created a CNN-based object-oriented system for modeling open-pit mines. Using semi-supervised (SVM-STV) and supervised (E-ReCNN) techniques, Camalan et al. (2022) investigated multiclass and binary variations in MDD mining pools.

Naixun et al. (2019) extracted the portion of the development land used for mining, predominantly open-pit quarries, by integrating deep learning methods and object-oriented notions. Convolutional neural networks trained on Gaofen-2 images were utilized for this purpose. The challenge of low accuracy in CNN-based recognition of open-pit mining sites due to limited training data was addressed by Cheng et al. (2018) by implementing a transfer learning approach. To do this, the bottom-layer parameters of the CNN model were frozen while the top-layer parameters were fine-tuned. Experiments comparing different training methods identified the most successful strategy. In another work, Zhang et al. (2020) used dense connections to strengthen a fully convolutional neural network and fully automate open-pit mining area extraction in the Tongling region. Their extraction model was trained on data gathered from various remote sensing sources.

Table 1 presents a taxonomy of the related works on deep learning and coal mining applications discussed above. For each work, we highlight the article type, the machine learning and deep learning models used, the datasets analyzed, and the core application area.

TABLE 1. Taxonomy of works discussed in relation to deep learning and coal mining applications.

In the context of crack detection in open-pit coal mines using remote sensing data, there are inherent challenges associated with conventional identification techniques. These challenges include poor accuracy, limited generalization, inefficiency, prolonged training periods, and restricted automation. To overcome these difficulties, a paradigm shift is necessary. One potential solution lies in leveraging the untapped potential of cloud technology, which has been underutilized in the field of remote sensing. By harnessing the power of cloud computing, we can effectively handle large volumes of image data and accelerate the training process of models. The integration of cloud technology with remote sensing techniques offers the potential to address drawbacks commonly associated with traditional methods, such as extended execution and loading times for models.

2.2 Digital twins and IoT applications

This section reviews several state-of-the-art DT frameworks developed for IoT-based applications across several use cases. Glaessgen and Stargel (2012) proposed the most widely accepted description of a digital twin (DT): “it is an integrated multi-physics, multi-scale, probabilistic simulation of a complex product and uses the best available physical models, sensor updates to mirror the life of its corresponding twin.” As such, it serves as a link between the real and virtual worlds. Thanks to the IoT, massive amounts of data are gathered in real time from interconnected devices and distribution networks Allam et al. (2022). One study looked at the framework of DT-driven product design in production, paying special attention to the link between a product’s physical and digital versions Bertoni and Bertoni (2022). Another presented an Industry 4.0 wearable system in which human-generated data may facilitate the synchronization of cyber-physical systems Wang et al. (2022). Yet, further quantitative methodologies are needed to evaluate the hypotheses. A study by Yasin et al. (2021) suggested best practices for integrating DT into manufacturing infrastructure, particularly for small and medium-sized businesses, so that data acquisition and smooth operation are both assured. The primary function of today’s information technology has shifted from monitoring to establishing an all-encompassing data infrastructure. From sensing to networking to analytics, they covered all the obstacles on the path to the DT. Researchers have discussed digital twins’ concepts and potential applications in various domains Hassan et al. (2022). One viewpoint suggests that a digital twin is a real-time virtual representation of a physical object. However, the concept of digital twins remains largely at the conceptual stage, and a unified framework is needed to facilitate implementation.

DTs have been studied for their potential use in asset condition monitoring and health assessment as part of asset lifecycle management El Bazi et al. (2023). DTs might improve asset management throughout their existence. A set of guidelines and methods for developing a DT using the IoT in the petrochemical sector have been offered Sharma et al. (2022). The goal is improved production management via the use of a DT that is powered by real-time data. DTs, which center on consolidating real-world components and their digital representations, have been used as simulation models to aid decision-making and process automation. It has also been stated how a DT may be used to help with logistics dispatching in ports. The digital twin enables performance forecasting and assessment of dispatching rules by using data received through the IoT and analyzing it via cloud computing. More study and development of DTs is warranted since these studies show the wide range of sectors that may benefit from them.

Darvishi et al. (2020) proposed a data-driven approach for anomaly detection in sensors for digital twins. To obtain better sensor validation performance, the suggested design simultaneously uses both reliable and faulty sensors inside the system, as well as the temporal correlation of the data. Among the generated faults, the main focus is on weak faults, which are very difficult to identify and are often disregarded in the literature. The effects of several hyperparameters, such as the number of layers and nodes per layer, are evaluated for the situations under consideration. In another article, Darvishi et al. (2022) proposed a machine learning-based approach for sensor-fault detection for digital twins. We also make use of a deep learning-based architecture for crack detection of coal mines in our proposed work. Yang et al. (2022) suggested building the DT-empowered IoT model using federated learning optimization. In particular, to address the heterogeneity of industrial IoT, they created a DT and IoT-assisted deep reinforcement learning approach for the device selection process in federated learning, which picks industrial IoT devices with high utility values.

Table 2 highlights the taxonomy of works discussed in the literature in the context of IoT applications and digital twins. We also report the Technology Readiness Level (TRL) and find that most works are at the concept level. Some works are systematic literature reviews (SLRs), and we also indicate the application areas of the discussed works.

TABLE 2. Taxonomy of works discussed in relation to IoT and Digital Twin applications.

DT-based frameworks for monitoring critical infrastructure are much needed today. Since the current literature lacks such frameworks for monitoring open-pit coal mines for safety and predictive maintenance, our proposed work contributes to mine monitoring and also provides potential indicators for safety management.

3 Methodology

In this section, we briefly describe the proposed DT-based framework and proposed multiscale feature fusion-based deep learning model that we developed in order to provide effective and timely monitoring of cracks in the open-pit coal mines. In Section 3.1, we discuss the potential entities of a digital twin involved in our proposed DT-based monitoring framework for open-pit coal mines. In Section 3.2, we describe the proposed multiscale feature fusion-based lightweight deep learning model for detecting open-pit coal mines.

3.1 Digital twin platform for open-pit mining

A DT and IoT-enabled platform for monitoring open-pit mines is crucial. It encompasses the encapsulation and cooperation of assets in both the virtual and the real worlds. Figure 1 depicts our four-tiered implementation of the DT-based framework for the safety of open-pit coal mines. With the help of IoT services and devices, this architecture connects the real-world open-pit coal mines with the virtual framework. When workers are in various states and places, it can be challenging for managers to maintain command and coordinate their response. The notion of the DT is built into this framework’s design to provide immediate insight and transparency. Cyber entities are the electronic representations of physical assets, such as people, machines, and materials, and depend on data acquired in the field in real time via IoT services and devices. When a physical object’s position or condition changes, its digital counterpart is updated in real time. For example, an operator’s movement from point A to point B may be represented by a single, easily identifiable dot on the virtual location map. In addition, the operator’s changing health state might be visually represented by a change in the digital representation monitor. IoT devices generate large amounts of data, and IoT services aid in analyzing it. Deployed IoT gateways send information to the cloud, where it may be used to make decisions.
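The synchronization between a physical asset and its cyber entity can be illustrated with a minimal sketch. The class names, fields, and the rule of discarding out-of-order readings below are assumptions made for illustration only; the paper does not specify a data schema or an update protocol.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SensorReading:
    """One field observation forwarded by an IoT gateway (hypothetical schema)."""
    asset_id: str
    timestamp: float                      # seconds since epoch
    position: Tuple[float, float, float]  # x, y, z in mine coordinates
    status: str                           # e.g. "normal" or "alert"


@dataclass
class CyberEntity:
    """Digital counterpart of one physical asset (person, machine, material)."""
    asset_id: str
    position: Tuple[float, float, float] = (0.0, 0.0, 0.0)
    status: str = "unknown"
    last_update: float = float("-inf")
    history: List[SensorReading] = field(default_factory=list)

    def apply(self, reading: SensorReading) -> bool:
        """Mirror a reading; reject readings for other assets or stale ones."""
        if reading.asset_id != self.asset_id:
            return False
        if reading.timestamp <= self.last_update:
            return False  # assumed policy: ignore out-of-order messages
        self.position = reading.position
        self.status = reading.status
        self.last_update = reading.timestamp
        self.history.append(reading)
        return True


class DigitalTwin:
    """Registry that routes each reading to the matching cyber entity."""

    def __init__(self) -> None:
        self.entities: Dict[str, CyberEntity] = {}

    def ingest(self, reading: SensorReading) -> bool:
        entity = self.entities.setdefault(
            reading.asset_id, CyberEntity(reading.asset_id)
        )
        return entity.apply(reading)
```

In this sketch, an operator moving from point A to point B would appear as two successive `ingest` calls with different `position` values, and the stored `history` is what a virtual location map could draw from.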

FIGURE 1. Overview of our proposed digital twin-based framework for the safety of open-pit coal mines.

3.1.1 Physical world

The framework begins with this layer at its most fundamental level. Carrying out everyday activities in the physical world, which we can see and touch, requires assets such as human resources, machinery, and materials. Since this layer represents the geographical and temporal status of physical assets, two attributes are of utmost significance. The first is time, which covers both the past and the present moment in the product’s lifespan. The second is space, which represents the exact three-dimensional position of the physical assets. Both serve as critical criteria in measuring the safety of the coal mine environment.

3.1.2 IoT-based services

Proper IoT devices and services serve as the ladders that gather, create, analyze, and transport digital data from the real world to the virtual world. This is necessary to construct the virtual world through which various stakeholders can improve monitoring and management. Several IoT technologies have been used in this framework to create a portable IoT wireless sensor tag. A 1D/2D barcode is used to identify and pair the tag. The accelerometer is a key component in recognizing the motion pattern of the assets. Movement data are analyzed to determine the asset’s current state; the specifics of this endeavor will be covered in the next section. The wireless communication device transfers the detected data and acts as an emitter for signal strength collection, which is necessary for outdoor locations. A power source is also needed for all modules, devices, and services to work properly. The deep learning-based crack detection model is deployed within the IoT-based services. The primary reason for developing a lightweight model is so that it can be installed and deployed on edge devices.

The Raspberry Pi 4B development board is the foundation for the edge gateway for the IoT. Adding modules and sections corresponding to the desired responses makes it possible to tailor many functionalities. IoT edge gateways are responsible for collecting all of the data that has been detected by IoT wireless sensor tags using the wireless communication module. Since big and diverse physical assets create large-scale data in terms of volume and diversity, the centralized server could run into a significant amount of computing load. As a result, the cloud-side pressure is alleviated thanks to the distributed computation developed and included in the IoT edge gateway. Three important IoT services must be implemented for an edge IoT gateway to work properly. The registration services guarantee that the approved edge gateway is registered and connected to the process and the various situations. The time-window scanning, geo-location settings, and data destination are all guaranteed to be in their right configured states by the configuration services. The following features of the edge IoT gateway may be realized via commands and procedures conducted by execution services.
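The three gateway services described above (registration, configuration, execution) can be sketched as follows. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, the approved-gateway list stands in for whatever authorization the real system uses, and per-tag averaging is only one possible form of the edge-side computation that relieves the cloud.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class GatewayConfig:
    scan_window_s: float   # time-window scanning interval, in seconds
    geo_location: str      # where the gateway is installed
    data_destination: str  # where aggregated data is sent


class EdgeGateway:
    """Hypothetical edge gateway exposing the three services from the text."""

    def __init__(self, gateway_id: str, approved_ids: Set[str]) -> None:
        self.gateway_id = gateway_id
        self._approved_ids = approved_ids
        self.registered = False
        self.config: Optional[GatewayConfig] = None
        self._buffer: List[Tuple[str, float]] = []

    def register(self) -> bool:
        """Registration service: only approved gateways may join."""
        self.registered = self.gateway_id in self._approved_ids
        return self.registered

    def configure(self, config: GatewayConfig) -> None:
        """Configuration service: validate and store the gateway settings."""
        if not self.registered:
            raise PermissionError("gateway must be registered first")
        if config.scan_window_s <= 0:
            raise ValueError("scan window must be positive")
        self.config = config

    def collect(self, tag_id: str, value: float) -> None:
        """Buffer one reading received from a wireless sensor tag."""
        if self.config is None:
            raise RuntimeError("gateway is not configured")
        self._buffer.append((tag_id, value))

    def execute(self) -> Dict[str, float]:
        """Execution service: average readings per tag, then clear the buffer.

        Aggregating at the edge means one value per tag is forwarded
        instead of every raw sample.
        """
        grouped: Dict[str, List[float]] = {}
        for tag_id, value in self._buffer:
            grouped.setdefault(tag_id, []).append(value)
        self._buffer.clear()
        return {tag: sum(vals) / len(vals) for tag, vals in grouped.items()}
```

The ordering constraint (register, then configure, then collect and execute) mirrors the dependency between the services: a gateway that is not approved never reaches the point of forwarding data.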

3.1.3 Virtual world

The virtual world is recognized as a gateway in which many stakeholders may watch and keep an eye on events. Setting the specifications of the open-pit coal mine’s tracking technique permits the modification of monitoring sensitivity, which can be seen in the parameters’ values for things like the fainting detection limit and the location environmental noise value. Their current state has been presented in the virtual world to illustrate better the circumstances surrounding the assets and the steps that need to be taken. Using the proposed DTs framework, the components that exist in the physical world are mapped to the virtual world, and the locations of these components are virtualized so that they may be represented graphically. A synchronization mechanism has been built to communicate tracking data with the existing management systems to reduce the disruption caused by altering the daily activities that were originally in place. The proposed framework receives simultaneous updates to information such as the location name, the number of staff members, and the appropriate operation procedure ID.

3.1.4 Stakeholders

All stakeholders, including supervisors, operators, and management, may access the tracking data if the approved control is in place. Stakeholders may nevertheless hold conflicting views on the significance of the data and functions. Because operators are mainly perceived as the targets of monitoring, they only offer essential information about themselves. Managers may use various online services to monitor the present health and locations of their staff members. In the case of a life-threatening emergency, supervisors have the option to summon outside aid from trained professionals. Only the super manager, who is responsible for the effective safety monitoring of coal mines, is permitted to specify the criteria.

3.2 Proposed crack detection network based on deep learning

This section provides a detailed overview of the crack detection mechanism applied in our DT-based framework. We leverage a lightweight network to process the data on edge devices so that decisions can be made rapidly. Problem statement: in the crack detection problem of our presented work, X represents the input data, which includes various sensor readings and measurements collected from the open-pit coal mine and its surrounding environment. These measurements may include factors such as ground movement, strain, and other relevant parameters. X is a multi-dimensional dataset in which each data point is denoted X_i, i = 1, 2, …, N, where N is the total number of data points. Each X_i consists of a set of features or variables, X_i = (x₁, x₂, …, x_m), where m is the number of features in each data point. The crack detection problem is formalized as a binary classification task, where the goal is to determine from the input data X whether a crack is present.

For the output, we consider Y as the expected outputs, where Y_i is the binary label associated with each data point X_i. Y_i = 1 if a crack is present in the corresponding area represented by X_i. Y_i = 0 if no crack is present in the corresponding area represented by X_i.

We have formulated the problem statement as a mapping function F that takes the input data X and produces the corresponding binary labels Y:

\begin{aligned} F &: X \to Y \\ F (X_{i}) &= Y_{i}, \quad \text{where } i = 1, 2, \dots, N \end{aligned} (1)

The objective of the proposed deep learning-based approach is to train a lightweight network using historical data and real-time sensor data (X) to learn the mapping function F such that it accurately predicts the presence or absence of cracks (Y). The evaluation measures such as precision, recall, accuracy, mean average precision, F1-score, and kappa coefficient are used to assess the performance of this model in making these predictions.

3.2.1 Lightweight network

Lightweight networks are constantly being proposed as a solution to the growing need for efficient deep learning. There are currently three primary areas of investigation: the direct design of lightweight networks (such as GhostNet and MobileNet), the secondary engineering of large-scale networks (for example, knowledge distillation and model pruning), and neural architecture search.

In neural network architecture, depth-wise separable convolution (DSC) is often utilized to decrease the model size and lower the computation required while keeping a high feature extraction capacity. The DSC is made up of two distinct components: depth-wise convolution (DWC) and point-wise convolution (PWC). DWC performs the convolution separately for every channel to extract spatial features, and PWC allows the information to interact across channels via a 1 × 1 convolution kernel. The computation cost of DSC in Eq. 3 below is much lower than that of the generic convolution in Eq. 2.

k_{1} = q \times γ \times γ \times p = γ^{2} p q (2)

k_{2} = q \times γ \times γ + 1 \times 1 \times q \times p = (γ^{2} + p) q (3)

In these equations, the size of the convolutional kernel is represented by γ, the number of feature maps is shown by p, while the convolution kernels’ number is shown by q.
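As a rough illustration of the saving implied by Eqs 2, 3, the following minimal sketch evaluates both costs for one layer. The values of γ, p, and q are illustrative assumptions and are not taken from our network configuration.

```python
def standard_conv_cost(gamma: int, p: int, q: int) -> int:
    """Cost of a generic convolution, Eq. 2: k1 = gamma^2 * p * q."""
    return q * gamma * gamma * p


def depthwise_separable_cost(gamma: int, p: int, q: int) -> int:
    """Cost of a depth-wise separable convolution, Eq. 3: k2 = (gamma^2 + p) * q."""
    depthwise = q * gamma * gamma   # one gamma x gamma kernel per channel
    pointwise = 1 * 1 * q * p       # 1 x 1 convolution mixing the channels
    return depthwise + pointwise


# Illustrative values only (assumed, not from the paper).
gamma, p, q = 3, 64, 32
k1 = standard_conv_cost(gamma, p, q)        # 18432
k2 = depthwise_separable_cost(gamma, p, q)  # 2336
ratio = k2 / k1                             # equals 1/p + 1/gamma^2
print(k1, k2, round(ratio, 4))
```

Dividing Eq. 3 by Eq. 2 gives k₂/k₁ = 1/p + 1/γ², so for a 3 × 3 kernel the cost falls to a little over one ninth of the generic convolution once p is large.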

3.2.2 Densely connected network

We leverage a lightweight, densely connected network to improve open-pit coal mining operations (see Figures 2A, B). There are two main parts to the network architecture. First, a two-layer convolutional network performs the initial feature extraction. Second, we add a dense block of eight convolutional layers to the network in lieu of the original network architecture. Each layer takes as input the combined channels produced by all the layers before it. This design strategy decreases the number of parameters, encourages efficient feature propagation, adds regularization advantages, and prevents overfitting, which helps the network cope with smaller training datasets. In addition to the feature extraction network, we also provide a channel attention module. This module improves the network’s performance by increasing the attention paid to the data in each channel.

FIGURE 2

FIGURE 2. The illustration of deep neural network-based architecture, including channel attention and spatial attention-based mechanisms. (A) Deep densely connected network, (B) output of the attention-based network, (C) Squeeze and Excitation network for global feature extraction, and (D) final extraction of output features from the spatial attention block.

3.2.3 Attention mechanism

Crack detection in open-pit coal mines is an extremely fine object recognition job since there are only subtle visual variations between the many types of earth surfaces. Our proposed network takes cues from the human visual system’s physiological perception process and uses attention processes to improve feature representation and extraction.

The Squeeze and Excitation (SE) attention block, adapted from SENet (Hu et al., 2018), enhances global feature extraction by strengthening the interplay of information across channels. As demonstrated in Figure 2C, Squeeze condenses each input feature map into an attention scalar that represents its global relevance at the channel level. This is accomplished via global average pooling, as seen in Eq. 4. Excitation is the next phase, consisting of an adaptive recalibration carried out by two fully connected layers (see Eq. 5). This recalibration fits complicated channel correlations and restores the channel dimension to its original size. Finally, the attention values are used to reweight the output feature maps (see Eq. 6), giving more importance to the significant sections of the images.

A_{c} = F_{s o} (f_{m}) = \frac{1}{H t \times W d} \sum_{k = 1}^{H t} \sum_{l = 1}^{W d} f_{m} (k, l), f_{m} = v_{m} * Y = \sum_{s = 1}^{m^{'}} v_{m}^{s} * y^{s} (4)

In Eq. 4, $F_{s o}$ denotes the Squeeze operation, $f_{m} (k, l)$ the input feature map, Y the input, Ht and Wd the height and width of the input feature map, and * the convolution operation. $v_{m} = [v_{m}^{1}, v_{m}^{2}, \dots, v_{m}^{s}, \dots, v_{m}^{m'}]$, where each $v_{m}^{s}$ is a 2D spatial kernel representing a single channel of $v_{m}$ that acts on the corresponding channel of Y.

s = F_{e o} (o p, W d) = σ (g (o p, W d)) = σ (W d_{2} δ (W d_{1} o p)) (5)

In Eq. 5, $F_{e o}$ stands for the Excitation operation, op for the output of the Squeeze, δ for the ReLU activation function, and σ for the sigmoid function. $W d_{1} \in R^{\frac{M}{r} \times M}$ is the weight of the first fully connected operation, and $W d_{2} \in R^{M \times \frac{M}{r}}$ is the weight of the second fully connected operation, where M is the number of channels and r is the reduction ratio.

\tilde{Y_{m}} = F_{scale} (f_{m}, s_{m}) = s_{m} \cdot f_{m} (6)

Here, $\tilde{Y} = [\tilde{y_{1}}, \tilde{y_{2}}, \dots, \tilde{y_{m}}]$, and $F_{scale} (f_{m}, s_{m})$ denotes the channel-wise multiplication of the scalar s_m by the feature map f_m ∈ R^Ht×Wd.
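The three SE steps of Eqs 4–6 can be sketched in a few lines of plain Python. This is a toy illustration rather than our PyTorch implementation: the feature maps, the weight matrices `W1` and `W2`, and the reduction ratio r = 2 are made-up values, and biases are omitted.

```python
import math


def squeeze(feature_maps):
    """Eq. 4: global average pooling, giving one scalar per channel."""
    return [sum(sum(row) for row in fm) / (len(fm) * len(fm[0]))
            for fm in feature_maps]


def matvec(weights, vector):
    """Plain matrix-vector product for small nested lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def excite(op, w1, w2):
    """Eq. 5: s = sigmoid(W2 * relu(W1 * op))."""
    hidden = [max(0.0, h) for h in matvec(w1, op)]              # ReLU
    return [1.0 / (1.0 + math.exp(-z)) for z in matvec(w2, hidden)]


def scale(feature_maps, s):
    """Eq. 6: multiply every value of channel m by its scalar s_m."""
    return [[[s_m * v for v in row] for row in fm]
            for fm, s_m in zip(feature_maps, s)]


# Toy input: M = 2 channels of size 2 x 2, reduction ratio r = 2 (assumed).
f = [[[1.0, 3.0], [5.0, 7.0]],
     [[0.0, 0.0], [0.0, 0.0]]]
W1 = [[0.5, 0.5]]        # shape (M / r, M)
W2 = [[1.0], [-1.0]]     # shape (M, M / r)

op = squeeze(f)          # per-channel means
s = excite(op, W1, W2)   # per-channel attention weights in (0, 1)
y = scale(f, s)          # reweighted feature maps
print(op, s)
```

The shapes of `W1` and `W2` show the bottleneck: the first fully connected layer reduces M channels to M/r values, and the second restores M attention weights.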

The Convolutional Block Attention Module (CBAM) is a plug-and-play lightweight attention block that enhances information flow across channels and optimizes spatial context data. The key components of CBAM are the channel attention (CA) block, which is used to extract semantic data, and the spatial attention (SA) block, which is used to extract data on position. The CA block applies average pooling, max pooling, and stochastic pooling simultaneously to the input feature maps and then fuses the results to produce a more information-rich channel attention. As illustrated in Figure 2D, the SA block successively applies the three pooling steps to the feature maps to acquire the spatial attention, using a general convolution layer to recover the channel parameters. Additionally, prior research demonstrates that the CA and SA blocks may each be integrated with the network separately to enhance feature extraction performance.

3.2.4 Multi-scaled feature fusion

Although the backbone architecture of YOLOv8 (Jocher and Qiu, 2023) and the attention technique improve our network’s feature extraction capacity to some degree, the key barrier limiting its detection efficiency is still the interplay of image and semantic information across various layers. The shallow layers of the backbone network often respond more strongly to image characteristics (such as edge, color, and texture). In contrast, the deep layers respond more strongly to semantic features (such as object class). Vertical feature fusion through a feature pyramid network (FPN) (Lin et al., 2017) successfully handles this problem and makes the model more representative. It works by fusing the multi-scale feature maps gathered from several backbone layers. However, the unidirectional FPN has certain downsides, including the loss of feature data and an insufficient fusion of multi-scale features. As a result, our proposed network employs an attention-based path aggregation network to correct these deficiencies. Our network uses attention blocks to manage the feature maps obtained from the backbone layers and adopts a dual-direction strategy, namely Bottom-Up and Top-Down, for feature fusion, as shown in Figure 3. This improves the interaction between the deep-layer semantic information and the shallow-layer image information, and produces a more informative feature map, which may be used for further classification and localization.

FIGURE 3

FIGURE 3. Multi-scale feature fusion strategy.

Spatial pyramid-based horizontal feature fusion (Jie et al., 2021), most often realized by atrous spatial pyramid pooling (Wang et al., 2018) or spatial pyramid pooling, is also a recommended technique to improve the representation ability of the model. However, the aforementioned spatial pyramids are not optimal for lightweight ore sorting networks because of their large parameter counts and feature redundancy. To address these issues, our proposed network employs receptive field blocks (RFB-s), as seen in Figure 3, to provide horizontal feature fusion, in which the receptive field is widened with the help of dilated convolution at four distinct rates. The 3 × 1 and 1 × 3 convolutions in RFB-s enhance the model’s nonlinear representational capabilities while reducing the number of parameters and the computational complexity.
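The effect of dilation on the receptive field, and the saving from factorizing a 3 × 3 kernel, can be checked with simple arithmetic. In the sketch below the four dilation rates are hypothetical placeholders, since the rates used in our RFB-s are not listed here; the effective kernel size formula k + (k − 1)(d − 1) is the standard one for dilated convolution.

```python
def effective_kernel_size(k: int, dilation: int) -> int:
    """Span covered by a k-tap kernel whose taps are `dilation` pixels apart."""
    return k + (k - 1) * (dilation - 1)


# Hypothetical dilation rates, chosen only for illustration.
rates = (1, 2, 3, 5)
sizes = [effective_kernel_size(3, d) for d in rates]   # [3, 5, 7, 11]

# Weights per input/output channel pair:
full_3x3 = 3 * 3              # one 3 x 3 kernel
factorized = 3 * 1 + 1 * 3    # a 3 x 1 kernel followed by a 1 x 3 kernel
print(sizes, full_3x3, factorized)
```

A dilated 3 × 3 kernel always has nine weights regardless of its rate, so the receptive field grows without adding parameters, and the factorized pair uses six weights instead of nine while inserting an extra nonlinearity between the two convolutions.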

4 Performance evaluation

4.1 Dataset description

For the preparation of training images, freely accessible Sentinel-2 (L1C) data with a medium resolution is used. The resolution of Sentinel-2 varies between 10 m and 60 m across 13 spectral bands in the visible, near-infrared, and short-wave infrared spectrum. It has a 290 km field of view with a global coverage and a 5-day revisit period. It is a preferred option for building the database due to its high rate of revisits, open-source nature, and global scope. The geographic locations of 400 opencast coal mines are identified by hand to prepare the training dataset.

The dataset, which contains the two classes Coal Mines and No Coal Mines, is divided into a training dataset and a testing dataset. For each class, 80% of the image fragments are used for training, while 20% are used as a test set. The total number of training image fragments for Coal Mines is approximately 2,450, while the number for No Coal Mines is 2,100. There are 1,050 image validation patches for Coal Mines and 900 for No Coal Mines. During training, images from the training dataset are used as input for CNN learning, and the model’s performance is assessed by predicting the class of images from the testing dataset. The classification accuracy is the proportion of accurate predictions made by the model when applied to the test set.

In our study, we leverage freely accessible medium-resolution Sentinel-2 (L1C) satellite imagery for a classification task, which forms a crucial component of our broader crack detection methodology. Although the primary dataset is labeled with the classifications “Coal Mines” and “No Coal Mines,” it plays a pivotal role in our crack detection approach. Sentinel-2 data, known for its spectral diversity and high revisit frequency, offers valuable insights into the earth’s surface, and we utilize this data for image-based analysis. We manually identify the geographic locations of 400 opencast coal mines within the Sentinel-2 imagery. These locations serve as points of interest in our analysis. Afterward, we classify image fragments into two categories: “Coal Mines” and “No Coal Mines.” This classification is based on the imagery characteristics associated with these regions.

In the context of crack detection in coal mines, while our primary task is classifying coal mine areas, this is an integral step in the broader goal of crack detection within open-pit coal mines. The presence or absence of coal mines in specific areas can indicate geological features, structural changes, or ground disturbances, including cracks. We split the dataset into training and testing subsets for our classification model. During training, the proposed lightweight network learns to recognize features associated with coal mines. Importantly, this learned knowledge can be leveraged in our crack detection framework. Finally, the classification accuracy we measure reflects the ability of our model to discern between coal mine and non-coal mine areas. This accuracy serves as an initial indicator of our model’s capability to identify anomalous features, which may include cracks, in open-pit coal mines.

4.2 Loss function

A model is trained by minimizing a loss function that measures how well its predictions match the labels. One popular option is the cross-entropy loss function. In binary classification, where the model predicts results for just two options, it outputs the probabilities p and 1 − p for the two categories. For binary classification, the cross-entropy loss is as shown below:

L = \frac{1}{N} \sum_{j} - [x_{j} \cdot \log (p_{j}) + (1 - x_{j}) \cdot \log (1 - p_{j})] (7)

In this equation, N is the number of samples, x_j denotes the label of sample j (0 for the negative class and 1 for the positive class), and p_j is the predicted probability that sample j belongs to the positive class.
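A direct transcription of Eq. 7 is given below as a minimal sketch. The clipping constant `eps` is an assumption added here to avoid taking the logarithm of zero; it is not part of Eq. 7.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Eq. 7: mean of -[x * log(p) + (1 - x) * log(1 - p)] over all samples."""
    total = 0.0
    for x, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)   # assumed numerical safeguard
        total += -(x * math.log(p) + (1 - x) * math.log(1.0 - p))
    return total / len(labels)


# Confident, correct predictions give a small loss ...
low = binary_cross_entropy([1, 0], [0.9, 0.1])
# ... while confident, wrong predictions are penalized heavily.
high = binary_cross_entropy([1, 0], [0.1, 0.9])
print(low, high)
```

An uninformative prediction of p = 0.5 for every sample gives a loss of log 2, which is a convenient reference point when reading training curves.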

4.3 Evaluation metrics

Evaluation metrics are used to measure the accuracy of semantic segmentation models. Metrics such as Recall (R), Precision (P), F1-score, Overall Accuracy (OA), and Kappa Coefficient (KC) are widely used for this purpose. In the formulas for these measurements, “TP” signifies true positives, that is, pixels correctly recognized as open-pit coal mine. “FP” stands for false positives, or pixels that were wrongly detected as open-pit coal mine. “TN” stands for true negatives, which are pixels outside open-pit coal mines that were correctly rejected. Finally, “FN” signifies false negatives, which are open-pit coal mine pixels that were wrongly rejected.

4.3.1 Precision

Precision measures the accuracy of positive predictions. It is the ratio of true positives (TP) to the sum of true positives (TP) and false positives (FP). The precision calculation is shown below:

Precision = P = \frac{T P}{F P + T P} (8)

4.3.2 Recall

The Recall metric, which measures the ability of a model to identify positive instances correctly, is calculated by dividing the number of true positives by the sum of true positives and false negatives. The equation below represents the computation of Recall:

Recall = R = \frac{T P}{F N + T P} (9)

4.3.3 F1-Score

The F1-Score is a statistical measurement that determines how well a binary classification model performs. It is defined as the harmonic mean of precision and recall. The following formulation determines the F1-score:

F 1 - score = F 1 S = \frac{2 \times P \times R}{P + R} (10)

Here, P denotes the precision and R denotes the recall. The F1S is a balanced measurement of precision and recall that accounts for both false positives and false negatives.

4.3.4 Overall accuracy

The Overall Accuracy (OA) measure computes the proportion of properly categorized samples to the total number of samples. It gives an evaluation of a classification model’s overall accuracy. The following equation gives the method for calculating total accuracy:

Overall accuracy = O A = \frac{T N + T P}{F P + F N + T P + T N} (11)

4.3.5 Kappa coefficient

The Kappa Coefficient (KC) is a metric used to examine the consistency between a classification model’s predicted and actual classification results. It goes beyond what would be predicted by chance in measuring the agreement. The Kappa Coefficient is calculated using the following formula:

K = \frac{P_{o} - P_{e}}{1 - P_{e}} (12)

In this formula, P_o stands for the observed agreement, which is calculated as the ratio of correctly classified samples to the total number of samples. P_e is the expected agreement, that is, the probability of agreement by chance. The Kappa Coefficient evaluates the model’s effectiveness while considering the likelihood of agreement resulting from pure chance. If the number of true samples in each category is c₁, c₂, …, c_n, the number of predicted samples in each category is d₁, d₂, …, d_n, and the total number of samples is N, then P_e = (c₁ × d₁ + c₂ × d₂ + ⋯ + c_n × d_n) / N².
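Eqs 8–12 can all be computed from the four confusion-matrix counts. The sketch below assumes the binary case and the standard form of P_e, in which the sum of the products c_i × d_i is divided by the squared number of samples; the counts in the example are invented for illustration.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Precision, recall, F1-score, overall accuracy and kappa (Eqs 8-12)."""
    total = tp + fp + tn + fn
    precision = tp / (fp + tp)                          # Eq. 8
    recall = tp / (fn + tp)                             # Eq. 9
    f1 = 2 * precision * recall / (precision + recall)  # Eq. 10
    oa = (tn + tp) / total                              # Eq. 11, also P_o
    true_counts = (tp + fn, tn + fp)                    # c_1, c_2
    predicted_counts = (tp + fp, tn + fn)               # d_1, d_2
    p_e = sum(c * d for c, d in zip(true_counts, predicted_counts)) / total ** 2
    kappa = (oa - p_e) / (1 - p_e)                      # Eq. 12
    return {"precision": precision, "recall": recall, "f1": f1,
            "oa": oa, "kappa": kappa}


# Invented counts, for illustration only.
m = classification_metrics(tp=90, fp=10, tn=85, fn=15)
print(m)
```

With these counts the overall accuracy is 0.875 but the kappa coefficient is 0.75, because half of the agreement (P_e = 0.5) would be expected by chance on a balanced binary problem.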

4.3.6 Mean average precision

The metric mean average precision (mAP) is a commonly used evaluation metric in object detection tasks, including crack detection in images. To calculate mAP for open-pit coal mine crack detection, we used the following formulation with different mAP settings such as mAP@0.25, mAP@0.5, mAP@0.75, and mAP@1.

m A P = \frac{1}{N} \sum_{i = 1}^{N} {A P}_{i} (13)
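Eq. 13 averages N average precision (AP) values. As a minimal sketch, the code below assumes that each AP_i is computed from a list of detections ranked by confidence, using the non-interpolated definition (the mean of the precision values at each correct detection). The matching of detections to ground truth at a given threshold is assumed to have been done beforehand, and the hit lists are invented.

```python
def average_precision(ranked_hits, num_ground_truth):
    """Non-interpolated AP for detections sorted by descending confidence.

    ranked_hits[i] is True when the i-th ranked detection matches a
    ground-truth object (assumed to be decided before calling this function).
    """
    if num_ground_truth == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            hits += 1
            precision_sum += hits / rank   # precision at this recall point
    return precision_sum / num_ground_truth


def mean_average_precision(ap_values):
    """Eq. 13: arithmetic mean of the individual AP values."""
    return sum(ap_values) / len(ap_values)


# Invented examples: two evaluation groups.
ap_1 = average_precision([True, False, True], num_ground_truth=2)
ap_2 = average_precision([False, True], num_ground_truth=1)
m_ap = mean_average_precision([ap_1, ap_2])
print(ap_1, ap_2, m_ap)
```

Other AP conventions, such as interpolated precision, would give slightly different values, so the convention should be fixed before models are compared.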

5 Experimental results

PyTorch was used for all experiments in this research. Experiments were run on a Linux server equipped with a GeForce RTX 3090 and 24 gigabytes of RAM. To correct the imbalance between non-mining and mining pixels in each dataset, the mining pixels were taken as positive samples and an equal number of non-mining pixels were chosen as negative samples, so that the training data contained equal numbers of positive and negative samples. Additionally, the data were divided between the training set and the validation set in a ratio of 8:2. The experiments aimed to achieve the highest possible accuracy for the model and can be categorized into two main parts, each focusing on specific aspects or techniques. We also used the same parameters for all other state-of-the-art models to run a fair comparison with our proposed lightweight multiscale feature fusion-based network.
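The balancing and 8:2 split described above can be sketched as follows. The function name, the random undersampling strategy, the fixed seed, and the integer stand-ins for pixels are assumptions made for illustration; the exact sampling procedure may differ.

```python
import random


def balance_and_split(positives, negatives, train_ratio=0.8, seed=0):
    """Undersample to equal class sizes, then split into train/validation."""
    rng = random.Random(seed)                  # fixed seed for repeatability
    n = min(len(positives), len(negatives))
    sampled_pos = rng.sample(positives, n)
    sampled_neg = rng.sample(negatives, n)     # as many negatives as positives
    samples = [(x, 1) for x in sampled_pos] + [(x, 0) for x in sampled_neg]
    rng.shuffle(samples)
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]


# Integer ids stand in for mining and non-mining pixels (illustrative only).
mining = list(range(100))
non_mining = list(range(1000, 1400))
train, val = balance_and_split(mining, non_mining)
print(len(train), len(val))
```

Shuffling before the split keeps both classes represented in each subset, although the per-subset class ratio is only approximately equal unless a stratified split is used.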

The research focused on defining the optimum slice size for the input data and choosing the best optimizer for the model in the first portion of the trials. Several slice sizes were tested to see which generated the best accuracy and overall model performance results. Simultaneously, many optimizers were examined to see which one was most successful in reaching optimum model performance. We compared the performance of the optimum model with state-of-the-art models in the second stage of the tests. P, R, OA, F1S, and KC were the assessment criteria employed in this comparison. These metrics evaluated the model’s classification accuracy, consistency, and overall efficacy in separating crack samples for open-pit coal mines.

5.1 Slice experiments

The size of the network’s input slices plays a crucial role in the training outcomes, as it affects the extraction of image features, which vary depending on the image scale. Larger images tend to contain more textures, contextual information, and important features. However, there is a point beyond which increasing the size may not improve classification performance and may even decrease it. Moreover, larger image sizes require more computational resources. In our experimental study, we worked with Sentinel-2 images and shape files, which had dimensions of 3,001 × 2,205. These were divided into slices of the following sizes: 3 × 3, 6 × 6, 9 × 9, 18 × 18, and 36 × 36. Table 3 reports the slice experiment results of our proposed model for different satellite data images, including the metrics P, R, OA, F1S, and KC.
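Slicing can be sketched as a regular, non-overlapping grid of square tiles. The sketch below assumes that incomplete tiles at the image border are discarded; a real pipeline might instead pad or overlap them.

```python
def tile_origins(height: int, width: int, size: int):
    """Top-left corners of all complete, non-overlapping size x size tiles."""
    return [(r, c)
            for r in range(0, height - size + 1, size)
            for c in range(0, width - size + 1, size)]


def slice_image(image, size: int):
    """Cut a 2D image (list of rows) into size x size tiles."""
    height, width = len(image), len(image[0])
    return [[row[c:c + size] for row in image[r:r + size]]
            for r, c in tile_origins(height, width, size)]


# Number of complete 18 x 18 slices in a 3,001 x 2,205 image.
n_slices = len(tile_origins(3001, 2205, 18))

# Tiny 4 x 4 example cut into 2 x 2 tiles.
toy = [[r * 4 + c for c in range(4)] for r in range(4)]
tiles = slice_image(toy, 2)
print(n_slices, tiles[0])
```

Under this assumption a 3,001 × 2,205 image yields 166 × 122 = 20,252 slices of size 18 × 18, and the count roughly quadruples each time the slice side is halved.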

TABLE 3

TABLE 3. Experiments of the proposed model with different slice sizes.

For the satellite dataset in our experimental investigation, the slice sizes were doubled at each step (a 2:1 ratio), beginning at 9 × 9. Among the various slice sizes, 18 × 18 consistently produced the greatest KC. Overall, the KC rose at first and then fell as the slice size grew. The P, R, and F1S values were approximately 0.9 for all three slice sizes 36 × 36, 18 × 18, and 9 × 9, showing their usefulness in detecting open-pit coal mines’ cracks. The OA of all experiments was more than 0.9, indicating the success of open-pit coal mine crack detection and extraction. Only the 18 × 18 slice size exceeded 0.9 in terms of KC, and its R, P, F1S, and OA were consistently greater than those of the other slice sizes. As a result, the slice size of 18 × 18 was selected for further investigation and analysis.

5.2 Comparison of different optimizers and CNN-based models

Extensive testing was used to determine the ideal slice size and optimizer. The most efficient combinations must be found to properly extract cracks from open-pit coal mine satellite data. Various slice sizes and optimizers were tested on several dataset combinations as part of the assessment process. When the findings from all three combinations of the dataset were analyzed, it became clear that our proposed lightweight network consistently performed better than the other state-of-the-art models. Table 4 presents the findings of experiments with three different optimizers: Stochastic Gradient Descent (SGD), Adam, and Root Mean Square Propagation (RMSProp). We compared our model with RNN and U-Net (Ronneberger et al., 2015) in the experimental evaluation.

TABLE 4

TABLE 4. Experiments of different combinations of our network with optimizers and comparison with other networks.

Once the loss value is computed using the loss function, finding the optimal values for the model’s parameters is crucial. The model’s intrinsic parameters have a major impact on both the training process and the output. Therefore, the optimization strategy used to calculate and update the network parameters matters. This study tested the SGD, Adam, and RMSProp algorithms on three distinct model combinations. The evaluation metrics for the different optimizers are P, R, OA, F1S, and KC, presented in Table 4. These key performance indicators (KPIs) provide valuable insights into the effectiveness of the optimizers across the dataset. Upon comparing the results, we found that the model using the SGD optimizer outperforms the models using the other optimizers.
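To make the difference between the three optimizers concrete, the sketch below applies their textbook update rules to a one-parameter toy loss, (w − 3)². It is only an illustration of the update rules: the learning rates, decay constants, step counts, and the toy loss are assumptions and do not reflect the settings used in our experiments.

```python
import math


def grad(w, target=3.0):
    """Gradient of the toy loss (w - target)^2."""
    return 2.0 * (w - target)


def run_sgd(w, lr=0.1, steps=200):
    for _ in range(steps):
        w -= lr * grad(w)                       # plain gradient step
    return w


def run_rmsprop(w, lr=0.01, rho=0.9, eps=1e-8, steps=2000):
    v = 0.0                                     # running mean of squared gradients
    for _ in range(steps):
        g = grad(w)
        v = rho * v + (1.0 - rho) * g * g
        w -= lr * g / (math.sqrt(v) + eps)      # step scaled by gradient magnitude
    return w


def run_adam(w, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    m = v = 0.0                                 # first and second moment estimates
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)          # bias correction
        v_hat = v / (1.0 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


results = {"SGD": run_sgd(0.0), "RMSProp": run_rmsprop(0.0), "Adam": run_adam(0.0)}
print(results)
```

All three rules move w toward the minimum at 3. They differ in how the step size is scaled, which is why their relative performance on a real network has to be determined empirically, as in Table 4.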

A comparison is also made with traditional CNNs to establish how the proposed model performs relative to established architectures. The model was trained using a satellite imagery-based open-pit coal mine dataset, and its performance was evaluated using a variety of metrics, including P, R, OA, F1S, and KC. The proposed model outperformed well-known algorithms such as AlexNet (Krizhevsky et al., 2012), VGG-16 (Simonyan and Zisserman, 2014), VGG-19 (Simonyan and Zisserman, 2014), GoogleNet (Szegedy et al., 2015), ResNet-50 (He et al., 2016), Single Shot Detector (SSD) (Liu et al., 2016), DenseNet121 (Huang et al., 2017), and MobileNetv2 (Sandler et al., 2018). These findings highlight the significant accuracy improvement achieved by the proposed model compared to traditional approaches. Furthermore, the model’s streamlined structure enables faster processing, making it a compelling choice for practical applications in open-pit coal mine extraction. Table 5 presents the comparison of our model with other CNN-based networks.

TABLE 5

TABLE 5. Comparison of proposed network with other models.

The predicted results obtained from the model represent continuous probability values rather than a binary map of 0 or 1. A threshold must be set to convert these probability values into a binary map. This study used GIS software to apply different threshold values of 0.9, 0.8, and 0.7 for comparison purposes. The experimental analysis revealed that a threshold of 0.8 yielded better results. The model generally performed well in predicting most of the surface coal mines, with more accurate boundaries and strong continuity. However, some errors were observed in the prediction results. The results show instances in which some features with traits similar to open-pit mines were inadvertently included, or cases in which certain open-pit mines with less conspicuous characteristics were ignored. These inaccuracies highlight the difficulties involved in effectively differentiating open-pit mines from other objects seen in satellite images.
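The thresholding step can be illustrated with a small probability map. In the study this step was carried out in GIS software; the sketch below is only an equivalent illustration, the probability values are invented, and treating a value equal to the threshold as positive is an assumption.

```python
def binarize(prob_map, threshold=0.8):
    """Convert a map of probabilities into a binary map of 0s and 1s."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


# Invented probability values for a 2 x 2 patch.
probabilities = [[0.95, 0.75],
                 [0.80, 0.10]]

for t in (0.9, 0.8, 0.7):
    print(t, binarize(probabilities, t))
```

Lowering the threshold marks more pixels as positive, which raises recall at the cost of precision, so the chosen value reflects a trade-off between missed mines and false detections.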

The mAP is also calculated for different settings such as mAP@0.25, mAP@0.5, mAP@0.75, and mAP@1. We compared our proposed network with AlexNet, VGG-16, VGG-19, GoogleNet, ResNet-50, SSD, DenseNet121, and MobileNetv2, and it can be seen in Figure 4 that our network outperformed all other models in the mAP evaluation.

FIGURE 4

FIGURE 4. Comparison of our proposed network with other networks for mAP.

5.2.1 Qualitative evaluation

We also present the qualitative performance of our method compared with other CNN-based models such as DenseNet121 and ResNet50 in Figure 5. We show the crack detection results for the satellite imagery dataset of open-pit coal mines. The detected cracks are sent to our proposed DT-based framework for predictive maintenance. The engineers then take the necessary preventive measures for predictive maintenance of open-pit coal mines.

FIGURE 5

FIGURE 5. Comparison of our proposed network with other networks for open-pit cracks detection.

The size and quality of the dataset have a significant impact on how accurate the detection results are. Incorporating image pre-processing methods like de-blurring and expanding the training sample pool may both help to improve detection accuracy. By collecting more data from the images, we hope to improve the process of mining open-pit coal mines in further studies. This involves locating side gangs, dumps, and mining locations. We may learn more about the geographical features of mining activities and enhance our knowledge of their environmental effect by broadening the scope of information extraction.

5.2.2 Performance of proposed model in DT-based framework

The open-pit coal mine crack detection and extraction system was designed to operate in a distributed environment consisting of a central data center and two smaller IoT-based edge centers in the DT-based framework. The data center comprised multiple computers, and each edge center had a single computer with comparable specifications. The network connecting the data center and edge centers had a throughput of 1 GB/s. Virtual machines hosted on virtualized servers within the architecture accommodated the proposed network in the cloud application modules. It was assumed that computing resources were fairly shared among the different services, and each component of the proposed network utilized its designated resources. To evaluate the performance of our model in the Cloud and DT-based Cloud settings, we compared it to other models such as RNN and U-Net in terms of training and prediction time. It can be seen from Figure 6 that the training and prediction times for open-pit crack detection and extraction are lower in our proposed DT-based framework than in the cloud-only setting.

FIGURE 6

FIGURE 6. Comparison of model training and prediction times computational complexity in the Cloud and DT-based Cloud server settings.

Using an edge node in our trials led to significant gains in both model prediction speed and training speed. The time spent making predictions with the model was cut by an average of 23.6% and by as much as 28.1%. In addition, the time needed to train a model was reduced by between 12.8% and 16.4%. Note that image quality, file size, and available computing resources in the cloud all affect how long a prediction takes, so the prediction time may vary considerably depending on these parameters. We found that the cloud and DT-based cloud deployments had equal effects on the precision of the models.

6 Conclusion and future works

Digital twins provide a holistic approach to crack detection in open-pit coal mines by integrating real-time monitoring, predictive analytics, simulation, visualization, and decision support. By leveraging the power of digital twins, mine operators can enhance safety, optimize maintenance efforts, and minimize the risks associated with crack formation and structural instability. This work presents a DT-based framework for open-pit coal mine crack detection and extraction. We propose a lightweight, densely connected network for detecting cracks in the Sentinel-2 imagery dataset. Using the proposed DT-based framework, the engineers and workers in open-pit coal mines can immediately take preventive measures for their safety. The proposed framework is the first of its kind and can be used globally in open-pit coal mines.

By deploying diverse sensors and IoT devices, our DT-based framework excels in capturing essential data on factors like ground movement and strain. By harnessing this comprehensive dataset, our framework excels in detecting and analyzing anomalous behaviors or changes that might signal the formation or propagation of cracks. Our approach’s cornerstone lies in applying deep learning-based networks, specifically a densely connected lightweight network seamlessly integrated within the DT-based ecosystem. This fusion of historical data, real-time sensor information, and predictive models empowers our system to take proactive measures and make informed maintenance decisions.

Our extensive evaluation of the proposed network showcases its prowess, consistently outperforming state-of-the-art deep neural networks in precision, recall, overall accuracy, mean average precision, F1-score, and the kappa coefficient. Our lightweight multiscale feature fusion-based network achieves impressive scores of 0.969, 0.984, 0.946, 0.962, and 0.973, respectively, demonstrating its superiority in these crucial performance metrics. Additionally, it excels in mean average precision, surpassing all other competing models. Furthermore, our network demonstrates remarkable efficiency in terms of model training and prediction time benchmarks, outperforming cutting-edge models such as U-Net and recurrent neural networks.

In summary, our proposed framework, which combines deep learning, data-driven insights, and digital twins, represents a pioneering approach to open-pit coal mine crack detection and predictive maintenance. By harnessing these technologies, we aim not only to enhance safety and asset protection but also to lay the foundation for sustainable and efficient mining practices.

While our current work represents a significant step forward in open-pit coal mine safety and maintenance, the field of DTs and deep learning-driven crack detection continues to evolve. Future research may explore multi-sensor integration, investigating a wider array of sensors and IoT devices to capture a more comprehensive dataset that could include additional environmental and geospatial factors affecting crack formation. Future work can also extend the capabilities of DTs to provide real-time decision support for crack detection and immediate safety interventions, predictive maintenance scheduling, and resource allocation. In addition, future work should address data privacy and security when deploying IoT devices and collecting sensitive operational data within open-pit coal mines. Considerable untapped potential remains in the fusion of digital twins and deep learning, and pursuing these directions would further advance the state of the art in mine safety and operational efficiency.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Author contributions

RY: Conceptualization, Data curation, Formal Analysis, Funding acquisition, Investigation, Methodology, Project administration, Software, Supervision, Writing–original draft, Writing–review and editing. XY: Conceptualization, Investigation, Methodology, Supervision, Writing–original draft. KC: Conceptualization, Data curation, Software, Writing–original draft, Writing–review and editing.

Funding

The authors declare that no financial support was received for the research, authorship, and/or publication of this article.

Conflict of interest

Authors RY, XY, and KC were employed by China Coal Huajin Group Co., Ltd.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Allam, Z., Bibri, S. E., Jones, D. S., Chabaud, D., and Moreno, C. (2022). Unpacking the ‘15-minute city’ via 6G, IoT, and digital twins: towards a new narrative for increasing urban efficiency, resilience, and sustainability. Sensors 22, 1369. doi:10.3390/s22041369

Benndorf, J. (2013). Application of efficient methods of conditional simulation for optimising coal blending strategies in large continuous open pit mining operations. Int. J. Coal Geol. 112, 141–153. doi:10.1016/j.coal.2012.10.008
