- ¹School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, China
- ²State Key Laboratory of Mining Response and Disaster Prevention and Control in Deep Coal Mines, Anhui University of Science and Technology, Huainan, China
Accurate identification of coal and gangue is very important for realizing efficient separation of coal and gangue and clean utilization of coal. Therefore, a method for identifying coal and gangue by using multispectral spectral information and a convolutional neural network (CNN) model is proposed. First, 200 pieces of coal and 200 pieces of gangue in the Huainan mining area were collected as the experimental materials. The multispectral information of coal and gangue was collected, and the average value of each wavelength position was calculated to obtain the spectral information of the whole band. Then, based on the one-dimensional CNN (1D-CNN), namely, 1D-CNN-A and 1D-CNN-B, and with the help of stochastic gradient descent (SGD), Adam, Adamax, and Nadam optimizers, the rectified linear unit (ReLU) function and its improved function were used as the activation function to compare the identification ability of the identification models with different network structures and parameters. The best 1D-CNN model for identification of coal and gangue based on multispectral spectral information is obtained as follows: a network model containing three one-dimensional convolution units B, PReLU is used as the activation function, and Nadam is used as an optimizer to achieve the best identification effect. At this time, 40 coal samples in the test set can be accurately identified, and only one gangue sample in 40 gangue samples is wrongly predicted as coal. Finally, compared with the traditional recognition strategy (different combinations of principal component analysis and support vector machine), the excellent performance of this method is further proven. 
The results show that the combination of multispectral imaging and 1D-CNN can achieve accurate identification of coal and gangue without considering how to select appropriate preprocessing and feature extraction methods, which is of great significance in promoting the development of separation technology for coal and gangue.
1 Introduction
Coal is known as “the black gold.” Since the first industrial revolution, coal has played an important role in energy, chemical, and other fields (Dill, 2016; Cai et al., 2018), and it has been one of the main energy sources used in the world since the 18th century (Singh et al., 2015). According to the “BP World Energy Statistical Yearbook” (BP, 2021), released by BP on July 8, 2021, the world’s energy production and consumption patterns are undergoing profound changes. In 2020, coal consumption fell by 6.2 EJ, a decline of 4.2%. The United States (down 2.1 EJ) and India (down 1.1 EJ) showed the largest declines, while China and Malaysia were two notable exceptions, with coal consumption increasing by 0.5 EJ and 0.2 EJ, respectively. Nevertheless, coal still accounts for 27.2% of global energy consumption, which shows that it still occupies a pivotal position in the global energy consumption structure (Zhou et al., 2020).
Raw coal is coal as it comes from underground mining, without any processing. To realize clean and efficient utilization of coal resources, the gangue in the raw coal must be separated and removed (Yuan et al., 2019; Yuan, 2020). The separation of coal and gangue can be mainly divided into two categories: wet gangue separation and dry gangue separation. Dry gangue separation, represented by ray separation [X-ray (Zhang et al., 2021) and dual-energy gamma-ray (Yazdi and Esmaeilnia, 2003)], laser separation (Wang and Zhang, 2017), vibration separation (Wan et al., 2022), crushing separation (Yang et al., 2018), and visual separation (Bai et al., 2021), does not consume water resources, which benefits the environment, and it has become the mainstream separation method. With continuous development of image processing and artificial intelligence (AI), visual separation technology with image recognition as its core is considered to be a promising separation method for coal and gangue (Dou et al., 2019). However, traditional visual methods have certain limitations: the identification accuracy of coal and gangue depends on the imaging quality, and imaging is easily disturbed by external factors such as light and dust.
Spectral imaging technology (Bioucas-Dias et al., 2013), developed in the 1980s and involving optics, electronics, information science, and many other disciplines, is a new generation of optical nondestructive testing technology. As an important branch of spectral imaging, multispectral imaging (MSI) collects images in several different spectral regions and thus effectively avoids the narrow band range and the susceptibility to interference of traditional RGB images. Therefore, it is widely used in mineral engineering (Hu et al., 2022), agriculture (AlSuwaidi et al., 2018), food industry (Qin et al., 2013), biomedicine (Li et al., 2013), and other fields. In order to improve the availability and performance of mineral spectral data, Xie et al. (Xie et al., 2022) shared an integrated open mineral spectral library (also known as Rock Spectral Library, RockSL) and developed a software system for the management, analysis, and application of mineral and rock spectral data. Li et al. (Li et al., 2022) developed a set of coal gangue imaging systems based on visible and near-infrared hyperspectral imaging technology and used the feature selection method to simplify the classification model, providing a reference for construction of coal and coal gangue multispectral systems. Shao et al. (Shao et al., 2020) designed a 91-channel hyperspectral lidar with an acousto-optic tunable filter (AOTF) as the spectral device. After collecting the spectra of four coal/rock samples, they used naive Bayesian (NB), logistic regression (LR), and support vector machine (SVM) for classification and achieved excellent classification accuracy. To realize real-time monitoring of coal, He et al. (He et al., 2019) used multispectral remote sensing to collect real-time coal image data and realized high-precision classification with an extreme learning machine. It can be seen that spectral imaging technology has a wide range of successful applications in geology.
In recent years, deep learning methods represented by the deep convolutional neural network have developed rapidly. They have many applications in face recognition (Voulodimos et al., 2018), speech recognition (Peddinti et al., 2018), automatic driving (Xu et al., 2021), geoscience (Hu et al., 2019), and industrial detection (Munir et al., 2019), which has driven unprecedented development of AI algorithms. It should be emphasized that the deep convolutional neural network has also made considerable progress in spectral analysis. To solve the problem that hyperspectral feature extraction by three-dimensional CNN (3D-CNN) relies on complex models, Ghaderizadeh et al. (Ghaderizadeh et al., 2021) proposed a 3D fast learning block (deep separable convolution block and fast convolution block), then introduced a two-dimensional CNN (2D-CNN) to extract the spatial characteristics of the spectrum, and tested the approach on standard datasets (Salinas, Pavia University, and Indian Pines). In this way, the complexity of the model was effectively reduced while its accuracy was maintained. Considering the importance of automatic sorting of coal and gangue, Chen et al. (Chen et al., 2022) proposed a new idea to analyze the acoustic multi-channel auditory spectrum of hydraulic support based on the convolutional neural network. The recognition rate of this method for coal and gangue can reach 99.5%, and it has excellent anti-noise ability. On the basis of light detection and ranging (LiDAR) data, Maxwell et al. (Maxwell et al., 2020) used Mask R-CNN to extract valley filling surfaces, and the accuracy, recall, and F1 score were all higher than 0.85, which showed that the combination of Mask R-CNN and LiDAR had great potential in geomorphologic feature rendering. In order to solve the problem of difficult monitoring of ground subsidence in mining areas, Wang et al. (Maxwell et al., 2020) constructed a new phase unwrapping method based on the U-Net convolutional neural network, which addressed the problem that interference fringes in coal mining areas are interrupted or partially confused due to low coherence, making correct phase unwrapping difficult to obtain. Combining spectroscopy with a deep learning algorithm, Xiao et al. (Xiao et al., 2022) proposed a method for rapid identification of coal types in the field. The convolutional neural network was used to extract the two-dimensional spectral characteristics of coal, and the extreme learning machine was used as a classifier for feature recognition. Rapid and accurate identification of coal and gangue was realized. It is not difficult to find that convolutional neural network technology has a wide range of successful applications in earth science.
Considering the wide application of multispectral technology and the excellent performance of the CNN in spectral analysis, a recognition method for coal and gangue based on multispectral spectral information combined with 1D-CNN is proposed in this study. The main research objectives of this study are as follows: 1) to analyze the identification ability of multispectral spectral analysis combined with 1D-CNN for coal and gangue; 2) to compare the recognition effect of 1D-CNN models with different structures and select the best structure; 3) to reasonably set the optimizer, activation function, and network depth so as to obtain the best 1D-CNN identification model of coal and gangue; and 4) to further verify the superiority of the 1D-CNN model by comparing it with traditional modeling methods.
2 Materials and methods
2.1 Materials and samples
The Huainan–Huaibei coal mine is an important coal production base in China, and the Huainan coal mine is a typical representative of it. Therefore, we took coal and gangue samples from the Huainan mining area as the research objects. To make the experimental results more reliable and to avoid, as far as possible, interference from sample size and shape, we selected coal and gangue with a particle size of about 50 mm and a similar shape as experimental samples. On March 16, 2019, 200 pieces of coal and 200 pieces of gangue of similar sizes were collected in the Huainan mining area, giving a total of 400 experimental samples for subsequent multispectral data collection and analysis. Some coal and gangue samples are shown in Figure 1.
2.2 Multispectral system
To acquire the multispectral information of coal and gangue, a multispectral acquisition system was built. As shown in Figure 2, its core component is the acquisition unit of multispectral data; the system also includes a conveyor belt for raw coal transmission and a host computer for multispectral information acquisition. The acquisition unit of multispectral data is mainly composed of a multispectral system and a light source. The multispectral system includes filters, a focusing lens, and a spectral camera. The filter device consists of a 675-nm longpass filter (Edmund Optics, United States) and a 975-nm shortpass filter (Edmund Optics, United States), which limit the collected spectral range to between 675 nm and 975 nm. The fixed focal length lens is a VIS-NIR lens (Edmund Optics, United States), and the adjustable range of its focal length is 1.4 mm–16 mm. The spectral camera is an area array spectral camera (MQ022HG-IM-SM5X5-NIR, XIMEA GmbH, Germany) equipped with an advanced CMOS imager (CMV2000, Interuniversity Microelectronics Centre, Belgium), which can image 25 bands at the same time. The auto scan light is a halogen light source (LS-LHA, SUMITA, Japan), and its power is set to 150 W. When collecting the multispectral data of coal and gangue, the acquisition distance between the lens and the sample is about 32 cm, the focal length of the lens is set to 2.8 mm, and the acquisition angle is set to 90° (that is, the lens is perpendicular to the sample acquisition plane). The exposure time is set to 70.01 ms in the HSImager software installed on the computer, and the experimental data are also saved by HSImager.
The spectral camera is equipped with a 5 × 5 array sensor, and the multispectral acquisition system can collect the spectral information of 25 bands of coal and gangue in the range of 675–975 nm. These 25 wavelength positions are 682 nm, 697 nm, 722 nm, 736 nm, 748 nm, 762 nm, 773 nm, 786 nm, 798 nm, 811 nm, 829 nm, 841 nm, 851 nm, 863 nm, 872 nm, 882 nm, 891 nm, 900 nm, 914 nm, 924 nm, 932 nm, 939 nm, 946 nm, 954 nm, and 959 nm.
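The band-averaging step used later to build the spectral input can be sketched as follows. This is a minimal illustration, not the acquisition software's actual procedure: the function name `mean_spectrum`, the per-pixel list layout, and the toy intensities are assumptions made for the example; only the 25 wavelength positions come from the text.

```python
# The 25 wavelength positions (nm) listed above.
WAVELENGTHS_NM = [682, 697, 722, 736, 748, 762, 773, 786, 798, 811, 829, 841, 851,
                  863, 872, 882, 891, 900, 914, 924, 932, 939, 946, 954, 959]


def mean_spectrum(pixels):
    """Average each band over all pixels of one sample.

    `pixels` is a hypothetical list of per-pixel spectra, each holding
    25 intensities ordered like WAVELENGTHS_NM.
    """
    n_bands = len(WAVELENGTHS_NM)
    if not pixels:
        raise ValueError("at least one pixel is required")
    sums = [0.0] * n_bands
    for spectrum in pixels:
        if len(spectrum) != n_bands:
            raise ValueError("each pixel must carry 25 band values")
        for band, value in enumerate(spectrum):
            sums[band] += value
    return [total / len(pixels) for total in sums]


# Toy sample with two pixels (made-up intensities, for illustration only).
toy_pixels = [[40.0] * 25, [50.0] * 25]
toy_mean = mean_spectrum(toy_pixels)
```

Each sample is thereby reduced to a single 25-element vector, which is the input format assumed for the 1D-CNN in the rest of the paper.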
2.3 Analysis method
The multispectral CNN identification model of coal and gangue is constructed according to the flow chart shown in Figure 3. To match the data format of the input multispectral spectral information, a 1D-CNN identification model is used. First, the multispectral spectral information of coal and gangue is used as the input of the CNN to construct the 1D-CNN identification model. Then, the model performance is evaluated by the accuracy, loss, training time, and other indicators, and the model structure and hyperparameters are continuously optimized to obtain an optimal 1D-CNN model for identifying coal and gangue. In addition, to demonstrate the superiority of the proposed method, a traditional identification method is used to construct a comparison model, which further verifies the reliability and effectiveness of this method. Specifically, different combinations of principal component analysis (PCA) and SVM are used to achieve spectral classification.
2.3.1 1D-CNN
The data collected by the multispectral equipment selected in this study are simple: the extracted spectral information consists of the average spectral intensity at 25 wavelength positions, so the input dimension of the 1D-CNN model for identifying coal and gangue constructed in this study is 25. As an important branch of the CNN, 1D-CNN usually has two common model structures to implement spectral classification, as shown in Figure 4.
The basic composition of the two common 1D-CNN structures shown in Figure 4 is similar: both mainly include a 1D convolutional layer, a batch normalization layer, an activation layer, a 1D pooling layer, a fully connected layer, and a Softmax layer for classification. The network can be deepened by increasing the number of 1D convolution units (1D Conv Units). The difference between the two structures lies mainly in the composition of the 1D Conv Unit. In the structure shown in Figure 4A, the 1D Conv Unit A is mainly composed of a 1D convolutional layer, a batch normalization layer, an activation layer, and a 1D pooling layer. In the structure shown in Figure 4B, the 1D Conv Unit B is mainly composed of a 1D convolutional layer, an activation layer, and a 1D pooling layer. Here the batch normalization layer is placed before the first 1D Conv Unit, and the 1D Conv Unit B itself does not contain a batch normalization layer; that is, such a network needs only one batch normalization operation regardless of its depth.
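The difference between the two structures can be made concrete by listing the layer order for a given number of units. The sketch below is only a structural illustration: the function `layer_sequence` and the layer labels are hypothetical names chosen here, and no layer hyperparameters are implied.

```python
def layer_sequence(structure, n_units):
    """Return the layer order of a 1D-CNN-A or 1D-CNN-B network as labels.

    structure: "A" (batch normalization inside every unit) or
               "B" (a single batch normalization before the first unit).
    n_units:   number of stacked 1D Conv Units (at least 1).
    """
    if structure not in ("A", "B"):
        raise ValueError("structure must be 'A' or 'B'")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")

    layers = ["Input(25)"]
    if structure == "B":
        layers.append("BatchNorm")          # one normalization for the whole network
    for _ in range(n_units):
        layers.append("Conv1D")
        if structure == "A":
            layers.append("BatchNorm")      # normalization repeated in each unit
        layers += ["Activation", "Pool1D"]
    layers += ["FullyConnected", "Softmax"]
    return layers


net_a = layer_sequence("A", 3)
net_b = layer_sequence("B", 3)
```

With three units, structure A contains three batch normalization layers while structure B contains one, which is the distinction drawn in the text.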
2.3.2 Optimizer
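A backpropagation algorithm is usually used in the CNN to train the parameters in the network model, and the best CNN model is selected by training and learning the sample data. In order to quantitatively evaluate the prediction effect of a CNN model, an objective function is defined, which in its general form can be written as

$$
J(\theta) = \frac{1}{N}\sum_{i=1}^{N} L\big(y_i, f(x_i;\theta)\big) \tag{1}
$$

where $\theta$ denotes the trainable parameters of the network, $N$ is the number of training samples, $x_i$ and $y_i$ are the input spectrum and the label of the $i$-th sample, $f(x_i;\theta)$ is the network output, and $L(\cdot)$ is the loss function measuring the gap between prediction and label. The optimizer adjusts $\theta$ so as to minimize $J(\theta)$.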
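SGD (Robbins and Monro, 1951), namely, stochastic gradient descent, was proposed by Robbins and Monro in 1951 and has a history of more than 70 years. To update the model parameters, SGD calculates the gradient on a mini batch in each iteration and then updates the parameters, so it can effectively enhance the training speed. SGD, as a common optimization strategy, updates parameters according to Eqs. 2, 3, which in standard notation read

$$
g_t = \nabla_{\theta} J(\theta_t) \tag{2}
$$

$$
\theta_{t+1} = \theta_t - \eta\, g_t \tag{3}
$$

where $g_t$ is the gradient of the objective function evaluated on the mini batch at iteration $t$, $\theta_t$ denotes the model parameters at iteration $t$, and $\eta$ is the learning rate.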
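As a minimal sketch of the SGD update rule, the snippet below applies the plain gradient step to a toy quadratic objective. The objective, the function names, and the learning rate are assumptions for illustration; the toy uses the exact gradient rather than a mini-batch estimate.

```python
def sgd_step(theta, grad, lr=0.01):
    """One SGD update: theta <- theta - lr * grad (element-wise)."""
    return [p - lr * g for p, g in zip(theta, grad)]


def grad_quadratic(theta):
    """Gradient of the toy objective J(theta) = sum(theta_i ** 2)."""
    return [2.0 * p for p in theta]


# Minimize the toy objective from an arbitrary starting point.
theta = [1.0, -2.0]
for _ in range(100):
    theta = sgd_step(theta, grad_quadratic(theta), lr=0.1)
```

On this objective each step multiplies every parameter by 0.8, so the iterates shrink geometrically toward the minimum at zero.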
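The optimization idea of Adam (Kingma and Ba, 2014), namely, Adaptive Moment Estimation, is to dynamically adjust the learning rate of each parameter by means of the first-order moment estimation and second-order moment estimation of the gradient. The main advantage of Adam is that the learning rate of each iteration after bias correction is within a certain interval, so the whole parameter optimization process is relatively stable. In standard notation, the Adam optimization strategy updates model parameters according to the following formulas:

$$
m_t = \beta_1 m_{t-1} + (1-\beta_1)\, g_t \tag{4}
$$

$$
v_t = \beta_2 v_{t-1} + (1-\beta_2)\, g_t^2 \tag{5}
$$

$$
\hat{m}_t = \frac{m_t}{1-\beta_1^{t}} \tag{6}
$$

$$
\hat{v}_t = \frac{v_t}{1-\beta_2^{t}} \tag{7}
$$

$$
\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{\hat{v}_t}+\varepsilon}\, \hat{m}_t \tag{8}
$$

Here $g_t$ is the gradient at iteration $t$, $m_t$ and $v_t$ are the first-order and second-order moment estimates of the gradient, $\beta_1$ and $\beta_2$ are their exponential decay rates, $\hat{m}_t$ and $\hat{v}_t$ are the bias-corrected estimates, $\eta$ is the learning rate, and $\varepsilon$ is a small constant that prevents division by zero.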
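The Adam update can be sketched in a few lines of plain Python. This is an illustrative implementation of the standard algorithm from Kingma and Ba, not the Matlab optimizer used in the study; the function name `adam_step` and the default hyperparameters (the commonly quoted ones) are assumptions.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for lists of parameters; t is the 1-based step index."""
    new_theta, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(theta, grad, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g          # first-moment estimate
        v_i = beta2 * v_i + (1 - beta2) * g * g      # second-moment estimate
        m_hat = m_i / (1 - beta1 ** t)               # bias correction
        v_hat = v_i / (1 - beta2 ** t)
        new_theta.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v


# First step from theta = 1 with gradient 2 and zero-initialized moments.
theta, m, v = adam_step([1.0], [2.0], [0.0], [0.0], t=1)
```

After bias correction the first step has a magnitude of roughly `lr` whatever the gradient scale, which illustrates the bounded effective step size mentioned in the text.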
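Adamax, as a variant of Adam, replaces the second-order moment estimate with an estimate based on the infinity norm, which gives the learning rate a simpler upper bound. The specific formulas (Eqs. 9, 10) are

$$
u_t = \max\big(\beta_2 u_{t-1},\ |g_t|\big) \tag{9}
$$

$$
\theta_{t+1} = \theta_t - \frac{\eta}{u_t+\varepsilon}\, \hat{m}_t \tag{10}
$$

where $u_t$ is the exponentially weighted infinity norm of the gradients, $\hat{m}_t = m_t/(1-\beta_1^{t})$ is the bias-corrected first-order moment estimate of Adam, $\eta$ is the learning rate, and $\varepsilon$ is a small constant for numerical stability. Because $u_t$ is not biased toward zero, it needs no bias correction.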
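A matching sketch of the Adamax step is given below. It follows the standard infinity-norm variant of Adam; the function name, the default learning rate of 0.002, and the small `eps` safeguard in the denominator are assumptions of this example rather than settings taken from the study.

```python
def adamax_step(theta, grad, m, u, t, lr=0.002, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adamax update for lists of parameters; t is the 1-based step index."""
    new_theta, new_m, new_u = [], [], []
    for p, g, m_i, u_i in zip(theta, grad, m, u):
        m_i = beta1 * m_i + (1 - beta1) * g      # first-moment estimate, as in Adam
        u_i = max(beta2 * u_i, abs(g))           # infinity-norm based scale
        m_hat = m_i / (1 - beta1 ** t)           # bias correction of the first moment
        new_theta.append(p - lr * m_hat / (u_i + eps))
        new_m.append(m_i)
        new_u.append(u_i)
    return new_theta, new_m, new_u


# First step from theta = 1 with gradient 2 and zero-initialized state.
theta, m, u = adamax_step([1.0], [2.0], [0.0], [0.0], t=1)
```

Since the magnitude of the bias-corrected first moment never exceeds the running maximum of the gradient magnitudes, each step is bounded by approximately `lr`.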
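Nadam is also a variant of Adam and can be regarded as Adam with Nesterov momentum. With a constant momentum coefficient $\beta_1$ (the general formulation also allows $\beta_1$ to vary with $t$), the Nadam optimization strategy updates the model parameters according to the following formulas:

$$
m_t = \beta_1 m_{t-1} + (1-\beta_1)\, g_t, \qquad v_t = \beta_2 v_{t-1} + (1-\beta_2)\, g_t^2
$$

$$
\bar{m}_t = \frac{\beta_1\, m_t}{1-\beta_1^{t+1}} + \frac{(1-\beta_1)\, g_t}{1-\beta_1^{t}}, \qquad \hat{v}_t = \frac{v_t}{1-\beta_2^{t}}
$$

$$
\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{\hat{v}_t}+\varepsilon}\, \bar{m}_t
$$

Here $g_t$ is the gradient at iteration $t$, $m_t$ and $v_t$ are the first-order and second-order moment estimates, $\bar{m}_t$ is the Nesterov-style look-ahead combination of the momentum term and the current gradient, $\hat{v}_t$ is the bias-corrected second-order moment estimate, $\eta$ is the learning rate, and $\varepsilon$ is a small constant that prevents division by zero.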
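For comparison with Adam, a Nadam step with a constant momentum coefficient can be sketched as follows. The constant-coefficient simplification, the function name `nadam_step`, and the default hyperparameters are assumptions made for this illustration; library implementations may additionally apply a momentum schedule.

```python
import math


def nadam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Nadam update (constant beta1); t is the 1-based step index."""
    new_theta, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(theta, grad, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g
        v_i = beta2 * v_i + (1 - beta2) * g * g
        v_hat = v_i / (1 - beta2 ** t)
        # Nesterov look-ahead: mix next-step momentum with the current gradient.
        m_bar = (beta1 * m_i / (1 - beta1 ** (t + 1))
                 + (1 - beta1) * g / (1 - beta1 ** t))
        new_theta.append(p - lr * m_bar / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v


# First step from theta = 1 with gradient 2 and zero-initialized moments.
theta, m, v = nadam_step([1.0], [2.0], [0.0], [0.0], t=1)
```

On the first step the look-ahead term makes the update about 1.47 times larger than the corresponding Adam step, which shows how the Nesterov correction accelerates early progress.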
2.3.3 Activation function
The activation function plays a very important role in enabling the neural network model to learn complex, nonlinear functions. In a neuron, the weighted sum of the inputs is passed through a function, namely, the activation function, which is introduced to add nonlinear characteristics to the network model. If no activation function is used, the output of each layer is a linear mapping of its input, so no matter how deep the network is, its output is essentially a linear combination of the inputs; in this case, the network is essentially a perceptron. If an activation function is used, nonlinear factors are introduced, and the neural network can approximate any nonlinear function, that is, it can be applied to nonlinear models and systems and used to fit various curves. In the early stage, Sigmoid or tanh was used as the activation function in the neural network model. In recent years, with the deepening of the neural network model and the rapid development of the CNN, the ReLU function and its improved functions (such as the LeakyReLU function, PReLU function, ELU function, and ThresholdedReLU function) have been widely used in deep neural networks.
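Nair and Hinton (Nair and Hinton, 2010) proposed the rectified linear unit (ReLU), also known as the modified linear unit, at the 27th International Conference on Machine Learning (ICML 2010); it is used as the activation function of the CNN model. Compared with the traditional Sigmoid and tanh, ReLU gives better performance and faster training. The analytic expression of the ReLU function can be expressed as

$$
f(x) = \max(0, x)
$$

In order to solve the problem that some neurons may die in the ReLU function, a leakage value is introduced in the negative half of the ReLU function, namely, the leakage linear rectification function (LeakyReLU) (Zhang et al., 2017). The LeakyReLU function can be expressed as

$$
f(x) = \begin{cases} x, & x > 0 \\ \alpha x, & x \le 0 \end{cases}
$$

where $\alpha$ is a small fixed positive constant (the leakage value).

Exponential linear unit (ELU) (Clevert et al., 2015) is an improved function for the ReLU function, which attempts to make the output mean of the activation function close to zero so as to enhance the learning speed. At the same time, compared with the ReLU function, the function has a nonzero output for negative input (i.e., $x < 0$). In its standard form, the ELU function can be expressed as

$$
f(x) = \begin{cases} x, & x > 0 \\ \alpha\,(e^{x} - 1), & x \le 0 \end{cases}
$$

where $\alpha > 0$ controls the value to which the function saturates for large negative input.

At the 2015 IEEE International Conference on Computer Vision (ICCV), He et al. (He et al., 2015) proposed an improvement in the ReLU, the parametric rectified linear unit (PReLU), which can be considered a deformation form of the Leaky ReLU. The analytical expression of the PReLU function can be expressed as

$$
f(x_i) = \begin{cases} x_i, & x_i > 0 \\ a_i x_i, & x_i \le 0 \end{cases}
$$

where $a_i$ is the slope of the negative part for the $i$-th channel and is learned together with the other network parameters; fixing $a_i$ to a small constant gives LeakyReLU, and $a_i = 0$ gives ReLU.

The threshold modified ReLU (ThresholdedReLU) (Konda et al., 2014) can also be regarded as a deformation of ReLU. Its main idea is to introduce an activation threshold, which makes the activation function discontinuous. The analytical expression of the ThresholdedReLU function can be expressed as

$$
f(x) = \begin{cases} x, & x > \tau \\ 0, & x \le \tau \end{cases}
$$

where $\tau$ is the activation threshold.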
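The five activation functions compared in this study can be written as scalar functions in a few lines. The sketch below uses the standard definitions; the default values for `alpha` and the threshold `tau` are illustrative assumptions, and in PReLU the slope `a` is passed in explicitly because in a real network it is learned during training.

```python
import math


def relu(x):
    """ReLU: zero for non-positive input, identity otherwise."""
    return max(0.0, x)


def leaky_relu(x, alpha=0.01):
    """LeakyReLU: fixed small slope alpha on the negative side."""
    return x if x > 0 else alpha * x


def elu(x, alpha=1.0):
    """ELU: smooth exponential saturation toward -alpha for negative input."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def prelu(x, a):
    """PReLU: like LeakyReLU, but the slope a is a trainable parameter."""
    return x if x > 0 else a * x


def thresholded_relu(x, tau=1.0):
    """ThresholdedReLU: identity above the threshold tau, zero otherwise."""
    return x if x > tau else 0.0


sample_inputs = [-2.0, -0.5, 0.0, 0.5, 2.0]
relu_outputs = [relu(x) for x in sample_inputs]
```

All five agree for large positive input; they differ only in how they treat input at or below zero (or below the threshold), which is where their training behaviour diverges.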
2.4 Software tools
Matlab R2020a was used for plotting and for constructing the identification model. Graphs in this study were constructed with EDraw Max and Origin 2021. A desktop computer (64-bit Win10 operating system) was used as the data processing platform to complete the construction and analysis of the recognition models. The computer includes an i7-9700K CPU, an NVIDIA RTX2070 GPU, and 16 GB of RAM, which is sufficient for the model construction tasks of this article.
3 Results and discussion
3.1 Multispectral information analysis of coal and gangue
Multispectral spectral information is obtained by combining the multispectral data of all bands and calculating the average value at each wavelength position, which yields spectral information covering the whole band. According to the imaging information at 25 wavelengths in the range of 675–975 nm of multispectral data, the multispectral spectral information of coal and gangue was extracted. There were 200 samples of coal and 200 samples of gangue, so a total of 400 samples of multispectral spectral data were obtained. The multispectral spectral information data set of coal and gangue was constructed for training the CNN recognition model. With the 25 wavelength positions as the horizontal axis and spectral intensity as the vertical axis, the spectral information of coal and gangue was displayed in the same coordinate system (the ranges of the horizontal and vertical axes are the same) so that the two can be compared more clearly, as shown in Figure 5. Figure 5A shows the multispectral spectral information of all 200 coal samples. It can be seen that the spectral intensity of coal is mainly concentrated in 35–50 arbitrary units (a.u.), and the maximum spectral intensity is less than 60 a.u. Figure 5B shows the multispectral spectral information of all 200 gangue samples. It can be seen that the spectral intensity of gangue is mainly concentrated in 35–70 a.u., and the maximum spectral intensity is less than 80 a.u. By comparison, we find that the spectral intensity of gangue samples is generally higher than that of coal samples. This is because the color of coal is usually black, and the color of gangue is usually grayish brown. When the MSI system is used to obtain the multispectral data, the coal sample absorbs more energy and therefore reflects less, so its overall spectral intensity is lower than that of the gangue sample.
At the same time, the spectral intensities of coal and gangue overlap for some samples, so it is necessary to use chemometric tools or data processing methods to realize the accurate identification of coal and gangue.
3.2 1D-CNN-A network identification model
According to the model construction idea shown in Figure 4A, a suitable 1D-CNN-A shallow network model for the identification of coal and gangue is established by using only one 1D Conv Unit A and the multispectral spectral information of coal and gangue. The samples from the training set were used to construct the model under the SGD, Adam, Adamax, and Nadam optimizers, and the identification performance of the model was verified by the test set. Three experiments were repeated, and the accuracy, loss, and training time of the test set in each experiment were recorded in Table 1.
The first step was to compare the accuracy of the four optimizers. By observing the accuracy of the three experiments in the table, we found that the minimum accuracy of the test sample is 91.25%, and the maximum accuracy can reach 98.75%. Comparing the average recognition rates, when Adam and Adamax were used as the optimizers, the average recognition rate of the test set in the three experiments was the highest, reaching 96.67%. The standard deviation of the recognition rate of Adamax in the three experiments is only 0.72%, which indicates that using Adamax as the optimizer gives the 1D-CNN-A shallow network better stability than using Adam. When SGD was used as the optimizer, the average recognition rate under three experiments was the lowest, at 94.17%. The second step was to compare the losses of the four optimizers. The loss value of the 1D-CNN-A shallow network containing only one 1D Conv Unit A is less than 0.34 for all experiments. When Adamax was used as the optimizer, the average loss of the sample in the test set under three experiments was the smallest, at only 0.1096. When Nadam was used as the optimizer, the average loss of the test set in the three experiments was 0.2157. The third step was to compare the training time per epoch of the four optimizers. When the four optimizers are used to optimize the 1D-CNN-A network, the training time fluctuates between 12.00 and 16.50 ms/epoch; the training time was the shortest with the SGD optimizer (the average time is only 13.0046 ms/epoch) and the longest with the Nadam optimizer (the average time is 15.2519 ms/epoch).
In summary, when using the multispectral spectral information of coal and gangue combined with a 1D-CNN-A network to construct the identification model, under the condition of only containing a 1D Conv Unit A, using Adamax as the optimizer of the network can achieve the best identification effect, which can realize the average recognition rate maximization and the average loss minimization of the sample in the test set.
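The way a per-run recognition rate, its mean, and its standard deviation over three runs are obtained can be illustrated with a short calculation. The test-set size of 80 (40 coal and 40 gangue samples) is taken from the abstract; the three correct-prediction counts below are hypothetical values chosen for illustration and are NOT the entries of Table 1, and the use of the sample (n − 1) standard deviation is an assumption.

```python
import statistics

N_TEST = 80  # 40 coal + 40 gangue test samples, as stated in the abstract


def recognition_rate(n_correct, n_total=N_TEST):
    """Recognition rate in percent for one run."""
    return 100.0 * n_correct / n_total


# Hypothetical numbers of correctly identified test samples in three runs.
runs_correct = [77, 77, 78]
rates = [recognition_rate(c) for c in runs_correct]

mean_rate = statistics.mean(rates)
std_rate = statistics.stdev(rates)  # sample standard deviation (n - 1)
```

With 80 test samples each misclassification changes the rate by 1.25 percentage points, so 98.75% corresponds to a single error. The hypothetical counts above happen to give a mean near 96.67% and a standard deviation near 0.72%, but that agreement with the reported Adamax figures is only suggestive.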
3.3 1D-CNN-B network identification model
According to the model construction idea shown in Figure 4B, a suitable 1D-CNN-B shallow network model for the identification of coal and gangue is established by using only one 1D Conv Unit B and the multispectral spectral information of coal and gangue. The training set was used to construct the model under the SGD, Adam, Adamax, and Nadam optimizers, and the identification performance of the model was verified by the test set. Repeating three experiments, we show the average results in the form of a histogram and the standard deviation in the form of a one-way error bar. The relevant results are shown in Figure 6. Figure 6A shows the recognition rate of the test set of four different optimizers under three independent tests. It can be seen that the average recognition rate of the 1D-CNN-B shallow network containing only one 1D Conv Unit B is higher than 96.00%. When Adamax and Nadam were used as the optimizers, the average recognition rate of the test set under the three experiments was the highest, reaching 97.50%. When SGD was used as the optimizer, the average recognition rate of the test set under three experiments was the lowest, which is 97.08%. Figure 6B shows the loss of the test set of four different optimizers under three independent tests. It can be seen that the loss under all experiments of 1D-CNN-B shallow networks containing only one 1D Conv Unit B is less than 0.22. When Nadam was used as the optimizer, the average loss of the test set under three experiments was the smallest, at only 0.0363. Also, when SGD was used as the optimizer, the average loss of the sample in the test set under three experiments was the largest, which was 0.0967. To sum up, when the multispectral information of coal and gangue was combined with the 1D-CNN-B network to construct the identification model of coal and gangue, under the condition of only containing a 1D Conv Unit B, using Nadam as the optimizer of the network can achieve the best identification effect. 
At this time, the average recognition rate of the test set sample was the largest and the average loss was the smallest.
3.4 Parameter selection of the 1D-CNN identification model
Through the experimental results of Section 3.2 and Section 3.3, it can be seen that when using the 1D-CNN-A network to build the shallow identification model of coal and gangue, using Adamax as the optimizer can obtain better results, and when using the 1D-CNN-B network, using Nadam as the optimizer can obtain better results. The mean and variance of recognition rate, loss, and training time of the two networks under three independent experiments are recorded in Table 2. It can be seen that the two different network structures have high recognition rates for coal and gangue, which are higher than 95.00%, indicating that the two network structures are feasible and effective for identifying coal and gangue. More specifically, the 1D-CNN network using B structure (i.e., 1D-CNN-B) has a higher average recognition rate and lower average loss. In terms of training time, the training time of the 1D-CNN with two structures is similar, and the average training time of each epoch of the 1D-CNN-B network is about 1 ms longer than that of 1D-CNN-A. Considering that the 1D-CNN model constructed in this study is mainly used for identification of coal and gangue, the model structure with higher average recognition rate is preferred, that is, Nadam optimization of the 1D-CNN-B shallow network is more suitable.
Because different activation functions give the network model different performance, it is also crucial to select an appropriate activation function. For the 1D-CNN-B shallow network optimized with Nadam, the ReLU function and its improved functions (ELU function, LeakyReLU function, PReLU function, and ThresholdedReLU function) were selected as the activation functions. Three independent experiments were carried out to compare the accuracy and loss, and the error band graph (using the standard deviation) is shown in Figure 7.
FIGURE 7. Results of the 1D-CNN-B network under different activation functions. (A) Accuracy and (B) loss.
Figure 7A shows the sample recognition rate of five different activation functions in the test set under three independent tests. It can be found that the average recognition rate of the 1D-CNN-B shallow network is higher than 93.00%. When LeakyReLU or PReLU was used as the activation function, the average recognition rate of the test set under three experiments was the highest, reaching 97.92%. Also, when ThresholdedReLU was used as the activation function, the average recognition rate in the three experiments was the lowest (95.42%). Figure 7B shows the loss of the test set samples of five different activation functions under three independent experiments. It can be seen that the loss of the 1D-CNN-B shallow network under all experiments is less than 0.26. When PReLU was used as the activation function, the average loss of the sample in the test set under three experiments was the smallest (only 0.0559). Also, when ThresholdedReLU was used as the activation function, the average loss was the largest (0.1440). In summary, when the identification model of coal and gangue is constructed by using multispectral spectral information combined with a 1D-CNN-B shallow network (only containing one 1D Conv Unit B), under the condition of using Nadam as the optimizer, using PReLU as the activation function can achieve the best identification effect. At this time, the average recognition rate of the test set sample is the largest and the average loss is the smallest.
After selecting PReLU as the activation function, the number of 1D Conv Unit B blocks in Figure 4B was gradually increased, starting from the shallow 1D-CNN-B network, to deepen the network. The specific settings were as follows: the number of convolutional kernels in the 1D convolutional layer inside each 1D Conv Unit B remained the same, the size of the convolutional kernel was set to 3, the step size (stride) was set to 2, and the padding method was set to ‘same’ (that is, the convolution results at the boundary were retained so that the output size was consistent with the input size). The number of 1D Conv Unit B blocks increased as 1–2–3–4, and the number of convolutional kernels increased correspondingly as 16–32–64–128. Nadam was used as the optimizer when training and testing the 1D-CNN-B network. Three independent experiments were carried out, and the relevant indicators are shown in Figure 8.
FIGURE 8. Results under different numbers of one-dimensional convolution units. (A) Accuracy and loss. (B) Training time and number of trainable parameters.
Figure 8 shows that, as the network deepens, the accuracy does not increase steadily but first increases and then decreases. A possible reason is that when the network has few layers, adding layers allows more effective features to be extracted, which helps the classifier, so the recognition rate gradually increases; once the number of layers reaches a certain point, further deepening extracts still more features, but some of them may be useless and interfere with the classifier, so the recognition rate decreases to some extent. In particular, with three 1D Conv Unit B blocks, the average recognition rate is maximized and the loss is minimized. At the same time, increasing the number of 1D Conv Units increases both the training time and the number of trainable parameters to varying degrees: a deeper network has more parameters to train, so training takes longer. In summary, when the multispectral spectral information of coal and gangue is combined with the 1D-CNN-B network to construct the identification model, the best identification effect is achieved with a network containing three 1D Conv Unit B blocks, Nadam as the optimizer, and PReLU as the activation function. The network parameters of this model are shown in Table 3.
Table 3 shows the core composition and output size of the optimized 1D-CNN identification model of coal and gangue. The input size of the network is consistent with the dimension of the multispectral spectral information, which is 25 × 1 (i.e., the number of channels of the spectral data). After batch normalization, the size does not change because this operation mainly normalizes the input data. Next, the batch-normalized data are passed through three consecutive 1D Conv Units B: the output size of the first 1D Conv Unit B (containing 16 convolutional kernels) is 12 × 16, that of the second (containing 32 convolutional kernels) is 6 × 32, and that of the third (containing 64 convolutional kernels) is 3 × 64. Finally, the output of the previous layer is flattened into a 1D vector by the Flatten layer and fed to the first fully connected layer (using PReLU as the activation function, with the output size set to 10), whose output is then fed to the second fully connected layer (using Softmax as the activation function, with the output size set to 2 to match the number of categories), which outputs the identification result of coal and gangue.
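The output sizes described above can be traced with a few lines of Python. This is a sketch of the shape arithmetic only, not the authors' implementation: the function name `trace_output_shapes` is hypothetical, and it assumes that each 1D Conv Unit B halves the sequence length with floor division, which is consistent with the reported sizes 25, 12, 6, and 3. The flattened length (3 × 64 = 192) is derived here and is not stated in the text.

```python
def trace_output_shapes(input_length=25, kernels=(16, 32, 64)):
    """Return (layer name, output shape) pairs for the described network.

    Assumption: each 1D Conv Unit B halves the length (floor division),
    matching the reported sizes 25 -> 12 -> 6 -> 3.
    """
    shapes = [
        ("Input", (input_length, 1)),
        ("BatchNormalization", (input_length, 1)),  # size unchanged
    ]
    length = input_length
    for index, n_kernels in enumerate(kernels, start=1):
        length //= 2
        shapes.append((f"1D Conv Unit B #{index}", (length, n_kernels)))
    shapes.append(("Flatten", (length * kernels[-1],)))
    shapes.append(("Fully connected (PReLU)", (10,)))
    shapes.append(("Fully connected (Softmax)", (2,)))
    return shapes


for name, shape in trace_output_shapes():
    print(f"{name:28s} {shape}")
```

Passing `kernels=(16, 32, 64, 128)` to the same function traces the four-unit variant compared in Figure 8, under the same halving assumption.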
3.5 Performance of the optimal 1D-CNN recognition model of coal and gangue
The multispectral spectral information identification model of coal and gangue was established according to the basic structure of the 1D-CNN-B network selected in Section 3.4, and the accuracy and loss of the model on the training set and test set were recorded over 1,000 iterations, as shown in Figure 9.
As the number of iterations increases, the recognition accuracy on the training set and test set rises and eventually stabilizes (tending to 1), while the loss falls and finally stabilizes (tending to 0). When the 1D-CNN model is used for multispectral classification of coal and gangue, the accuracy and loss reach stable values after about 300 epochs. In other words, the proposed 1D-CNN model (specifically, the 1D-CNN-B network) is feasible and effective for classifying the multispectral spectral information of coal and gangue and thus for identifying coal and gangue.
Figure 10 displays the identification effect of the trained 1D-CNN identification model on the test set samples. It can be clearly seen that only one of the 40 gangue samples was wrongly predicted as coal, and 40 coal samples can be accurately identified. That is to say, only one of the 80 test samples is wrongly classified, indicating that the 1D-CNN model constructed in this study can realize the accurate identification of coal and gangue.
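The test-set result above can be summarized as a 2 × 2 confusion matrix. The short sketch below works through the arithmetic using only the counts given in the text (40 coal samples all correct, 1 of 40 gangue samples predicted as coal); the function name `confusion_summary` and the row/column layout are choices made for this example.

```python
def confusion_summary(matrix, labels):
    """Compute accuracy plus per-class recall and precision.

    matrix[i][j] = number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    summary = {"accuracy": correct / total}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])                  # all samples truly in class i
        col_total = sum(row[i] for row in matrix)   # all samples predicted as class i
        summary[f"recall_{label}"] = matrix[i][i] / row_total
        summary[f"precision_{label}"] = matrix[i][i] / col_total
    return summary


# Rows: true coal, true gangue. Columns: predicted coal, predicted gangue.
test_matrix = [[40, 0],
               [1, 39]]
result = confusion_summary(test_matrix, ["coal", "gangue"])
for key, value in result.items():
    print(f"{key}: {value:.4f}")
```

With these counts, 79 of 80 test samples are classified correctly, so the overall accuracy is 79/80 = 98.75%, and the single error lowers the gangue recall to 39/40 = 97.5%.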
3.6 Comparison with traditional recognition methods
For the analysis of spectral information, especially for classification problems, a strategy combining dimension reduction with a classifier is usually adopted. In this study, such traditional spectral identification methods are used for comparison. Specifically, different combinations of PCA and SVM are used to achieve spectral classification. First, the spectral data of coal and gangue were normalized to the interval [0,1]. Then, the normalized data were processed by PCA to extract the principal components, with the cumulative contribution rate set to 95%. Finally, the processed data were randomly divided into a training set and a test set. The training set was fed into three different SVM classifiers (with the radial basis function as the kernel function), namely, grid search SVM (GS-SVM), genetic algorithm SVM (GA-SVM), and particle swarm optimization SVM (PSO-SVM), to construct the identification models of coal and gangue, and the identification performance was verified on the test set samples. For a more complete comparison, the normalized data without PCA processing were also used as a control, and the identification results from three experiments are shown in Figure 11.
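The two preprocessing steps of this baseline, normalization to [0,1] and PCA with a 95% cumulative contribution rate, can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the data are synthetic stand-ins with the same shape as the study's data set (400 samples, 25 bands), and the SVM stage with its grid search, genetic algorithm, or particle swarm tuning is omitted.

```python
import numpy as np

def min_max_normalize(X):
    """Scale each spectral band (column) to the interval [0, 1]."""
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # guard against constant columns
    return (X - lo) / span

def pca_by_cumulative_variance(X, threshold=0.95):
    """Project X onto the fewest principal components whose cumulative
    explained-variance ratio reaches the threshold."""
    Xc = X - X.mean(axis=0)
    # Singular values give the variance along each principal direction.
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    cumulative = np.cumsum(s**2 / np.sum(s**2))
    n_components = min(int(np.searchsorted(cumulative, threshold)) + 1, len(s))
    scores = Xc @ vt[:n_components].T
    return scores, cumulative[:n_components]


# Synthetic stand-in data (NOT the study's spectra): 400 samples x 25 bands,
# generated from 3 latent factors plus a small amount of noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(400, 3))
mixing = rng.normal(size=(3, 25))
spectra = latent @ mixing + 0.01 * rng.normal(size=(400, 25))

normalized = min_max_normalize(spectra)
scores, cumulative = pca_by_cumulative_variance(normalized, threshold=0.95)
print(scores.shape, cumulative)
```

The control condition described above corresponds to passing `normalized` directly to the classifier instead of `scores`.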
Figure 11 shows that when the normalized spectral information of coal and gangue is fed directly into the different SVM identification models, the identification accuracy stays above 95.00%, and GS-SVM gives the highest average recognition rate (97.50%). When the spectral information is normalized, processed by PCA, and then fed into the different SVM identification models, the recognition accuracy stays above 85.00%, and GA-SVM gives the highest average recognition rate (92.50%). The average recognition rate with PCA is lower than that without PCA, which suggests that the spectral information loses some effective information after dimension reduction, leading to a decrease in recognition accuracy. However, whether or not the spectral information is processed by PCA, the recognition accuracy of the SVM classifiers is lower than that of the 1D-CNN model proposed in this study. Although some of these classification models can achieve high classification accuracy, they are less stable than the CNN. In addition, the traditional machine learning approach requires more complex and cumbersome data processing steps, and attention must be paid to how feature extraction and the classifier work together.
4 Conclusion
To address the urgent need for accurate identification of coal and gangue, a method that combines multispectral spectral information with a CNN model is proposed. More specifically, a 1D-CNN model for identifying coal and gangue from multispectral spectral information is constructed. The model comprehensively uses spectral information in multiple bands and relies on the 1D-CNN to automatically extract spectral features of coal and gangue, which eliminates the complex preprocessing and feature extraction steps of traditional spectral identification methods. A multispectral data acquisition system was built to obtain the multispectral information of 200 pieces of coal and 200 pieces of gangue from the Huainan mining area, and the average value at each wavelength position was calculated to obtain the spectral information of the whole band. Using the spectral information at 25 wavelength positions, the 1D-CNN-A and 1D-CNN-B networks were trained with the SGD, Adam, Adamax, and Nadam optimizers, and the ReLU function and its improved variants were used as activation functions, to compare the identification ability of models with different network structures and to optimize the network depth and parameters. The study found that, compared with the 1D-CNN-A network, the 1D-CNN-B network with Nadam as the optimizer achieves better results for the identification of coal and gangue. In particular, the best 1D-CNN recognition model contains three 1D Conv Units B and uses PReLU as the activation function. This model achieves the highest average recognition rate (98.75%) and the lowest average loss (0.0382).
In addition, to verify the reliability of the proposed recognition method for coal and gangue, we compared it with traditional recognition strategies and found that the 1D-CNN model had higher recognition accuracy than the traditional methods, without the need to select appropriate preprocessing and feature extraction methods. The results show that accurate identification of coal and gangue can be realized by combining the multispectral spectral information of coal and gangue with a 1D-CNN, which has reference value for promoting the development of automatic separation equipment for coal and gangue.
In future research, on the one hand, more samples of coal and gangue from different regions, such as the Shaanxi and Inner Mongolia mining areas, can be added to further enrich the multispectral database of coal and gangue. On the other hand, because the structure and hyperparameter design of the CNN model in this paper rely mainly on manual screening, introducing an intelligent optimization algorithm to automatically design the structure and hyperparameters of the CNN model will be the key research work of the next stage.
Data availability statement
The original contributions presented in the study are included in the article/Supplementary Materials; further inquiries can be directed to the corresponding author.
Author contributions
FH conceived the study. FH and MZ developed the method. FH and RD implemented the algorithms. RD and YL analyzed the data. FH supervised the study. FH and MZ wrote the manuscript. All authors read and approved the final manuscript content of the work.
Funding
The research is financially supported by the University-level Key Projects of Anhui University of Science and Technology (No. xjzd2020-06), Key Projects of Natural Science Research in Anhui Universities (Nos. KJ2021A0470 and KJ2021A0471), Talent Introduction Fund of Anhui University of Science and Technology (No. 13200404), Young Talent Project of Anhui University of Science and Technology (No. 2020023), National Key R&D Program of China (No. 2020YFB1314100), Energy Internet Joint Fund of Anhui Province (No. 2008085UD06), Anhui Science and Technology Major Project (No. 201903a07020013), Jiangxi Provincial Natural Science Foundation (No. 20202BAB212007), and Jiangxi Provincial Science and Technology Project of Education Department (Nos. GJJ210644 and GJJ200651).
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
AlSuwaidi, A., Grieve, B., and Yin, H. (2018). Combining spectral and texture features in hyperspectral image analysis for plant monitoring. Meas. Sci. Technol. 29, 104001. doi:10.1088/1361-6501/aad642
Bai, F., Fan, M., Yang, H., and Dong, L. (2021). Fast recognition using convolutional neural network for the coal particle density range based on images captured under multiple light sources. Int. J. Min. Sci. Technol. 31, 1053–1061. doi:10.1016/j.ijmst.2021.09.004
Bioucas-Dias, J. M., Plaza, A., Camps-Valls, G., Scheunders, P., Nasrabadi, N., and Chanussot, J. (2013). Hyperspectral remote sensing data analysis and future challenges. IEEE Geosci. Remote Sens. Mag. 1, 6–36. doi:10.1109/MGRS.2013.2244672
BP (2021). BP world energy statistics yearbook 2021 Edition. Available at: https://www.bp.com/content/dam/bp/country-sites/zh_cn/china/home/reports/statistical-review-of-world-energy/2021/BP_Stats_2021.pdf (accessed July 8, 2021).
Cai, Y., Tay, K., Zheng, Z., Yang, W., Wang, H., Zeng, G., et al. (2018). Modeling of ash formation and deposition processes in coal and biomass fired boilers: A comprehensive review. Appl. Energy 230, 1447–1544. doi:10.1016/j.apenergy.2018.08.084
Chen, X., Wang, S., Liu, H., Yang, J., Liu, S., and Wang, W. (2022). Coal gangue recognition using multichannel auditory spectrogram of hydraulic support sound in convolutional neural network. Meas. Sci. Technol. 33, 015107. doi:10.1088/1361-6501/ac3709
Clevert, D.-A., Unterthiner, T., and Hochreiter, S. (2015). Fast and accurate deep network learning by exponential linear units (ELUs). arXiv:1511.07289. Available at: https://ui.adsabs.harvard.edu/abs/2015arXiv151107289C.
Dill, H. G. (2016). Kaolin: Soil, rock and ore. Earth. Sci. Rev. 161, 16–129. doi:10.1016/j.earscirev.2016.07.003
Dou, D., Wu, W., Yang, J., and Zhang, Y. (2019). Classification of coal and gangue under multiple surface conditions via machine vision and relief-SVM. Powder Technol. 356, 1024–1028. doi:10.1016/j.powtec.2019.09.007
Ghaderizadeh, S., Abbasi-Moghadam, D., Sharifi, A., Zhao, N., and Tariq, A. (2021). Hyperspectral image classification using a hybrid 3D-2D convolutional neural networks. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 14, 7570–7588. doi:10.1109/JSTARS.2021.3099118
He, D., Le, B. T., Xiao, D., Mao, Y., Shan, F., and Ha, T. T. L. (2019). Coal mine area monitoring method by machine learning and multispectral remote sensing images. Infrared Phys. Technol. 103, 103070. doi:10.1016/j.infrared.2019.103070
He, K., Zhang, X., Ren, S., and Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. IEEE Int. Conf. Comput. Vis., 1026–1034. doi:10.1109/ICCV.2015.123
Hu, F., Zhou, M., Yan, P., Li, D., Lai, W., Bian, K., et al. (2019). Identification of mine water inrush using laser-induced fluorescence spectroscopy combined with one-dimensional convolutional neural network. RSC Adv. 9, 7673–7679. doi:10.1039/C9RA00805E
Hu, F., Zhou, M., Yan, P., Liang, Z., and Li, M. (2022). A Bayesian optimal convolutional neural network approach for classification of coal and gangue with multispectral imaging. Opt. Lasers Eng. 156, 107081. doi:10.1016/j.optlaseng.2022.107081
Kingma, D., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint. doi:10.48550/arXiv.1412.6980
Konda, K., Memisevic, R., and Krueger, D. (2014). Zero-bias autoencoders and the benefits of co-adapting features. doi:10.48550/arXiv.1402.3337
Li, L. J., Fan, S. X., Wang, X. W., Li, R., Wen, X., Wang, L. Y., et al. (2022). Classification method of coal and gangue based on hyperspectral imaging technology. Spectrosc. Spectr. Anal. 42, 1250–1256. doi:10.3964/j.issn.1000-0593(2022)04-1250-07
Li, Q., He, X., Wang, Y., Liu, H., Xu, D., and Guo, F. (2013). Review of spectral imaging technology in biomedical engineering: Achievements and challenges. J. Biomed. Opt. 18, 100901. doi:10.1117/1.JBO.18.10.100901
Maxwell, A. E., Pourmohammadi, P., and Poyner, J. D. (2020). Mapping the topographic features of mining-related valley fills using Mask R-CNN deep learning and digital elevation data. Remote Sens. (Basel). 12, 547. doi:10.3390/rs12030547
Munir, N., Kim, H.-J., Park, J., Song, S.-J., and Kang, S.-S. (2019). Convolutional neural network for ultrasonic weldment flaw classification in noisy conditions. Ultrasonics 94, 74–81. doi:10.1016/j.ultras.2018.12.001
Nair, V., and Hinton, G. (2010). Rectified linear units improve restricted Boltzmann machines. Proc. 27th Int. Conf. Mach. Learn. doi:10.5555/3104322.3104425
Peddinti, V., Wang, Y., Povey, D., and Khudanpur, S. (2018). Low latency acoustic modeling using temporal convolution and LSTMs. IEEE Signal Process. Lett. 25, 373–377. doi:10.1109/LSP.2017.2723507
Qin, J., Chao, K., Kim, M. S., Lu, R., and Burks, T. F. (2013). Hyperspectral and multispectral imaging for evaluating food safety and quality. J. Food Eng. 118, 157–171. doi:10.1016/j.jfoodeng.2013.04.001
Robbins, H., and Monro, S. (1951). A stochastic approximation method. Ann. Math. Stat. 22, 400–407. doi:10.1214/aoms/1177729586
Shao, H., Chen, Y., Yang, Z., Jiang, C., Li, W., Wu, H., et al. (2020). A 91-channel hyperspectral LiDAR for coal/rock classification. IEEE Geosci. Remote Sens. Lett. 17, 1052–1056. doi:10.1109/LGRS.2019.2937720
Singh, A. L., Singh, P. K., Kumar, A., and Singh, M. P. (2015). Sequestration of metals from coal using bacteria: Environmental implications on clean coal energy. Energy Sources Part A Recovery Util. Environ. Eff. 37, 1432–1439. doi:10.1080/15567036.2011.619631
Voulodimos, A., Doulamis, N., Doulamis, A., and Protopapadakis, E. (2018). Deep learning for computer vision: A brief review. Comput. Intell. Neurosci., 1–13. doi:10.1155/2018/7068349
Wan, L., Wang, J., Zeng, Q., Ma, D., Yu, X., and Meng, Z. (2022). Vibration response analysis of the tail beam of hydraulic support impacted by coal gangue particles with different shapes. ACS Omega 7, 3656–3670. doi:10.1021/acsomega.1c06279
Wang, W., and Zhang, C. (2017). Separating coal and gangue using three-dimensional laser scanning. Int. J. Min. Process. 169, 79–84. doi:10.1016/j.minpro.2017.10.010
Xiao, D., Le, T. T. G., Doan, T. T., and Le, B. T. (2022). Coal identification based on a deep network and reflectance spectroscopy. Spectrochimica Acta Part A Mol. Biomol. Spectrosc. 270, 120859. doi:10.1016/j.saa.2022.120859
Xie, B., Zhou, S., Wu, L., Mao, W., and Wang, W. (2022). RockSL: An integrated rock spectral library for better global shared services. Big Earth Data, 1–21. doi:10.1080/20964471.2021.2017111
Xu, X., Yu, T., Hu, X., Ng, W. W. Y., and Heng, P.-A. (2021). SALMNet: A structure-aware lane marking detection network. IEEE Trans. Intell. Transp. Syst. 22, 4986–4997. doi:10.1109/TITS.2020.2983077
Yang, D., Li, J., Zheng, K., Du, C., and Liu, S. (2018). Impact-crush separation characteristics of coal and gangue. Int. J. Coal Prep. Util. 38, 127–134. doi:10.1080/19392699.2016.1207634
Yazdi, M., and Esmaeilnia, S. (2003). Dual-energy gamma-ray technique for quantitative measurement of coal ash in the Shahroud mine, Iran. Int. J. Coal Geol. 55, 151–156. doi:10.1016/S0166-5162(03)00085-5
Yuan, L. (2020). Challenges and countermeasures for high quality development of China’s coal industry. China coal. 46, 6–12. doi:10.19880/j.cnki.ccm.2020.01.001
Yuan, L., O'Riordan, E. D., and Jacquier, J. C. (2019). Development of a first order derivative spectrophotometry method to rapidly quantify protein in the presence of chitosan and its application in protein encapsulation systems. Food Chem. 44, 1–6. doi:10.1016/j.foodchem.2019.02.121
Zhang, X., Zou, Y., and Shi, W. (2017). Dilated convolution neural network with LeakyReLU for environmental sound classification. Int. Conf. Digit. Signal Process. 1–5. doi:10.1109/ICDSP.2017.8096153
Zhang, Y., Zhu, H., Zhu, J., Ou, Z., Shen, T., Sun, J., et al. (2021). Experimental study on separation of lumpish coal and gangue using X-ray. Energy Sources Part A Recovery Util. Environ. Eff., 1–13. doi:10.1080/15567036.2021.1976325
Keywords: multispectral imaging, coal–gangue identification, spectral characteristics, one-dimensional convolutional neural network, activation function, optimizer
Citation: Hu F, Zhou M, Dai R and Liu Y (2022) Recognition method of coal and gangue based on multispectral spectral characteristics combined with one-dimensional convolutional neural network. Front. Earth Sci. 10:893485. doi: 10.3389/feart.2022.893485
Received: 11 March 2022; Accepted: 11 August 2022; Published: 06 September 2022.
Edited by:
Guanglong Sheng, Yangtze University, China
Reviewed by:
Hao Zhang, Henan Agricultural University, China
Dan Tao, East China Jiaotong University, China
Mohammad Yazdi, Shahid Beheshti University, Iran
Copyright © 2022 Hu, Zhou, Dai and Liu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Feng Hu, hufeng0106@163.com