Classification of crop disease-pest questions based on BERT-BiGRU-CapsNet with attention pooling

Zhang, Ting; Wang, Dengwu

doi:10.3389/fpls.2023.1300580

ORIGINAL RESEARCH article

Front. Plant Sci., 07 December 2023

Sec. Technical Advances in Plant Science

Volume 14 - 2023 | https://doi.org/10.3389/fpls.2023.1300580

This article is part of the Research Topic: Artificial Intelligence and Internet of Things for Smart Agriculture

Ting Zhang^*

Dengwu Wang

College of Computing, Xijing University, Xi’an, China

Crop disease-pest question classification is an essential part of an intelligent question answering system for pest knowledge. A crop disease-pest question classification method, namely, BERT-BiGRU-CapsNet with attention pooling (BBGCAP), is proposed on the basis of bidirectional encoder representations from transformers (BERT), bidirectional gated recurrent unit (BiGRU), capsule network (CapsNet), and attention pooling. In BBGCAP, the unstructured text data are vectorized using BERT, BiGRU is used to extract the deep features of the text, attention pooling is used to assign the corresponding weights to the extracted deep information, and CapsNet is used to route the weighted features to the appropriate question category. BBGCAP is a hybrid model that integrates the advantages of BERT, BiGRU, CapsNet, and attention pooling. The experimental results on the cucumber-pest question database show that the proposed method is superior to the methods based on traditional template matching, support vector machines (SVM), and convolutional neural network–long short-term memory (CNN-LSTM), and the precision, recall, and F1 are all above 90%. This method provides technical support for intelligent question answering systems of crop disease-pests.

1 Introduction

Crop pest-diseases are one of the important factors that seriously threaten crop yield and quality. Early and correct diagnosis and control of pest-diseases can effectively reduce the economic losses they cause (Wang et al., 2022; He et al., 2023). Disease-pests can be effectively prevented and controlled only by obtaining crop disease-pest information in time and taking suitable control measures. Diagnosing and identifying the types of pest-diseases is not an easy task for farmers, especially considering the variety of pest-diseases and the complex environment. The knowledge management of crop disease-pests can provide guidelines for the diagnosis and prevention of pest-diseases, which is a new way to obtain crop pest information in time in precision agriculture (Waheed et al., 2022). However, with the development of Internet-of-Things technology and the explosive growth of network data, the data related to crop disease-pests are highly dispersed, complex, and heterogeneous, which makes it difficult for farmers, plant-protection experts, and other personnel to quickly and accurately obtain the required information about disease-pests (Cherif, 2022).

It is a key issue to accurately extract useful knowledge such as pathogens, damage sites, and control agents from massive and complex crop pest-related data. Integrating crop disease-pest knowledge supports an approach to pest control that aims to maintain harmful insects at tolerable levels, keeping pest populations below the economic damage levels. Bidirectional encoder representations from transformers (BERT) is the first unsupervised, deeply bidirectional model for pre-training. It can learn surface features, phrase-level syntactic features, and semantic-level information from its lower layers to its higher layers, so that the word vector obtained by BERT not only implicitly contains context-level features but also effectively captures sentence-level features (Guo et al., 2021). In recent years, the amount of literature related to pest management has increased rapidly, and a large amount of valuable crop pest information is still hidden in unstructured social media, such as the Chinese Agricultural Technology Promotion Q&A community, which adds nearly 10,000 crop pest records every day. Therefore, an effective classification of question sentences is a key technical link in achieving intelligent Q&A for crop producers and managers. At present, many crop disease-pest knowledge question answering systems have been constructed by gradually integrating deep learning–related technologies into practical agricultural production, but there are still some problems (Miguel et al., 2021). (1) There are many types of crop diseases and pests, and relevant knowledge is highly fragmented. (2) Because open natural language datasets in the field of agricultural pests and diseases are few, the texts are short, the features are sparse, and the hidden semantic information is difficult to learn, it is still difficult to parse and classify pest and disease questions and to link attributes from questions to the system. Therefore, it is necessary to conduct integrated research on pest knowledge and on Q&A in the agricultural pest-disease field to explore a more efficient and accurate Q&A model.

Integrating BERT, bidirectional gated recurrent unit (BiGRU), capsule network (CapsNet), and attention pooling, a crop disease-pest question classification method, namely, BBGCAP, is proposed. This method has a simple structure, few training parameters, and a fast training speed, which can meet the response time requirements of the question answering system. The main contributions of this paper are summarized as follows:

(1) The unstructured text data are vectorized using a BERT text pre-training model based on the agricultural domain corpus, and the obtained word vector not only implicitly contains context-level features but also effectively captures sentence-level features.

(2) BiGRU is employed to extract the deep global features of the text, and CapsNet is used to extract the local features of text. CapsNet replaces scalar-output features of convolutional neural network (CNN) with vector-output capsules and pooling layer with dynamic routing algorithm.

(3) Attention pooling is adopted to assign the corresponding weights to the extracted deep information and retain the most significant information at the pooling stage. An intermediate sentence representation generated by BiGRU is used as a reference for local representations produced by the convolutional layer to obtain attention weights. The sentence representation is formed by combining local representations using obtained attention weights.

The rest of this paper is organized as follows. Section 2 simply introduces the related works. BERT-BiGRU-CapsNet with attention pooling, namely, BBGCAP, is described in detail in Section 3. Section 4 shows a preliminary validation analysis of the model in a simulated environment, and, finally, our conclusions and future work are put forward in Section 5.

2 Related works

Pests and diseases are two major factors affecting crop yield and quality. Correct detection, diagnosis, and prevention of various crop pest-diseases are the basis of pest-disease management. Farmers have traditionally relied on manual methods to judge and identify pests and diseases, which are time-consuming, expensive, and inaccurate. Traditional crop disease-pest information acquisition methods mainly use keyword-based search engines or shallow semantic analysis, but the returned results are a large number of related websites with vague and redundant answers (El-Ghany et al., 2020; Liu and Wang, 2020).

Deep learning has gained great advantages in crop pest management and has become the standard method for solving most of the technical challenges of crop pest detection, identification, and classification (Liu and Wang, 2021; Lu et al., 2021). Miao et al. (2022) summarized the applications of deep neural networks in pest detection in recent years into three categories, introduced the characteristics and research status of each network, and provided a direction for solving the current problem by describing the methods of multi-information fusion and dataset enhancement. Aiming at the problems of large computational resource and low precision in most CNNs, Zuo et al. (2020) proposed an attention-based lightweight residual network for plant disease recognition. It employs depthwise separable convolution instead of the conventional convolution on the basis of traditional residual neural network. The attention module is introduced to effectively prevent the overfitting problem of the network and enrich local feature learning. To achieve rapid recognition of the common pests in agriculture and forestry, Wang et al. (2020) proposed a pest image recognition method based on deep CNN and compared the performance of different models on Chlamydial Protease-Like Activity Factor (CPAF) dataset, which has 73,635 insect images, including 4,909 original images and 68,726 enhanced images. To enhance the learning ability of micro-lesion features, Chen et al. (2021) selected MobileNet-V2 pre-trained on ImageNet as the backbone network and added the attention mechanism to learn the importance of inter-channel relationship and spatial points for input features. Xin and Wang (2021) proposed a deep convolutional neural network and Google data (DCNN-G) model based on deep learning and fusion of Google data analysis and compared its accuracy with the conventional recognition model. 
Using CNNs to classify crop pest–disease image quality not only expands the application field of deep learning but also provides a new method for crop pest–disease image quality assessment.

Many deep learning–based entity recognition methods have been presented to identify crop diseases, pests, drug names, and other nouns related to disease-pests. Entity recognition is a basic part of agricultural knowledge graphs and question answering and can be implemented as a web application to provide the public with solutions for the prevention and control of crop pest-diseases (Nandhini and Ashokkumar, 2021; Thanammal Indu and Suja Priyadharsini, 2022). The named entities of crop pest-diseases commonly exhibit complex word formation, word combination, and entity embedding. In particular, in the field of Chinese crop pest-diseases, there are many problems such as multiple entity naming methods, fuzzy entity boundaries, inadequate feature extraction, and inconsistent entity boundary labeling. To address the above problems, Wang et al. (2022) combined discourse topic and attention mechanism, and proposed the attention-based SoftLexicon with term frequency–inverse document frequency (TF-IDF) for crop disease-pest entity recognition, designed a flow chart to explain the major principles and steps, and explained the model through visual methods. The recognition accuracy of Chinese agricultural pest-diseases was improved by dividing the word sets according to the position of the characters in the word, integrating the discourse theme features into the calculation of lexical information, and introducing the attention mechanism. Guo et al. (2021) used the fine-tuned BERT model to generate context-character–level embedded representations with specific knowledge, introduced adversarial training, and enhanced the generalization and robustness of recognizing rare entities. Miguel et al. (2021) proposed a first step toward a mature, semantically enhanced decision support system for integrated pest management, by collecting data from multiple heterogeneous sources to build a complete agricultural knowledge base and developing a system to help farmers make decisions about pest control. Guo et al. (2020) established an available corpus of agricultural disease-pests, which contains 11 categories and 34,952 samples, and proposed a Chinese named entity recognition model via joint multi-scale local context features and the self-attention mechanism.

In the crop disease-pest knowledge Q&A system, question classification is a key module that plays a decisive role in the efficiency of system retrieval. The system answers the questions that farmers encounter during agricultural production and planting, and its core is classifying question sentences to match user questions. BiGRU can better understand and deal with dependencies in a language and improve the comprehension of text sequences by considering both historical and future contextual information and using the context information of text to extract the global features of text (Islam et al., 2022). It uses the same parameters to process both forward and backward sequence data, effectively reducing the number of parameters in the model, reducing the risk of overfitting, and improving training and inference efficiency. CapsNet has inherently better generalization capabilities and could theoretically use a considerably smaller number of parameters and get better results (Tao et al., 2022). The attention mechanism can be introduced to assign different weights to BiGRU hidden states through mapping weighting and a learned parameter matrix, allocating sufficient attention to key information, highlighting the influence of important information, and reducing the loss of feature information, so as to improve the accuracy of the model (Meng et al., 2016). To handle the complex and diverse semantic information of user questions in agricultural question answering systems, a crop disease-pest question classification method (BBGCAP) is proposed based on BERT, BiGRU, CapsNet, and attention pooling, meeting the needs of users to quickly and accurately obtain the classification results of crop disease questions.

3 Classification of crop disease-pest questions

Aiming at the characteristics of small vocabulary, strong sparse features, large noise, and poor normalization in crop disease-pest Q&A questions, a crop disease-pest question classification method, namely, BBGCAP is proposed. Its basic architecture is shown in Figure 1.

FIGURE 1

Figure 1 Architecture of BBGCAP, consisting of input and output layers, BERT vector embedding layer, BiGRU feature extraction layer, attention pooling layer, and capsule network layer.

In the method, question feature vocabulary is extended, word vector is weighted according to the importance of the vocabulary, text features are extracted using BiGRU and CapsNet, and its structure and parameters are further optimized by cross-validation strategy. The main components of BBGCAP are introduced in detail as follows.

3.1 Question word segmentation

In a question answering system, each sentence is first segmented. Word segmentation adds boundary markers between words in Chinese sentences. There are many methods for word segmentation, including shortest path word segmentation, N-Gram word segmentation, recurrent neural network (RNN) word segmentation, and transformer word segmentation. There are also many word segmentation tools, such as Jieba, HanLP, and FoolNLTK. Most word segmentation tools, such as Institute of Computing Technology Chinese Lexical Analysis System (ICTCLAS) of the Chinese Academy of Sciences, Language Technology Platform (LTP) of Harbin Institute of Technology, and Jieba, have accuracy rates of more than 95%. This paper uses the Jieba word segmenter and adds an agricultural domain dictionary to it, so that domain vocabulary can be correctly segmented, with spaces between words as separators, as shown in Table 1.
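The effect of a domain dictionary on segmentation can be illustrated with a small sketch. The code below is not Jieba's algorithm; it is a toy forward maximum matching segmenter, and the vocabulary and the example question are hypothetical. It only shows why a term such as 霜霉病 (downy mildew) is split into single characters unless it is present in the dictionary.

```python
def forward_max_match(sentence, dictionary, max_len=4):
    """Greedy forward maximum matching: at each position take the longest
    dictionary word; fall back to a single character if nothing matches."""
    words, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words


# Hypothetical vocabularies, for illustration only.
general_dict = {"黄瓜", "怎么", "防治"}
domain_dict = general_dict | {"霜霉病"}  # add one agricultural domain term

question = "黄瓜霜霉病怎么防治"  # "How to control cucumber downy mildew"
without_domain = forward_max_match(question, general_dict)
with_domain = forward_max_match(question, domain_dict)

# Spaces between words mark the segmentation, as in Table 1.
print(" ".join(without_domain))
print(" ".join(with_domain))
```

In Jieba itself, domain terms are usually supplied through `jieba.load_userdict` or `jieba.add_word` rather than by replacing the segmentation algorithm.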

TABLE 1

Table 1 Crop pest question sentence segmentation.

3.2 BERT vector embedding

BERT is a pre-trained language model that produces bidirectional encoder representations from transformers. It generates deep bidirectional language representations, has a deeper understanding of context than Word2vec and unidirectional language models, and can extract more effective vector features from a corpus. After the segmented crop disease-pest question sentences are input into BERT, they are passed to the embedding layer, which includes token embedding, segment embedding, and position embedding, as shown in Figure 2.

FIGURE 2

Figure 2 BERT structure, where [CLS] and [SEP] are marked at the beginning and end of the sentence, respectively; Tok_i is the ith token, with some tokens randomly masked; E_i is the embedding vector of the ith token; and T_i is the feature vector obtained from the ith token after BERT processing.

Output the corresponding word vector for each word in the sentence through BERT, where the maximum length of the sentence is set to L, and the word vector dimension is V. Generate the word vector matrix X as follows:

\begin{array}{l} X = [X_{1}, X_{2}, X_{3}, \dots, X_{n}] \in \mathbb{R}^{L \times V} & (1) \end{array}

3.3 BiGRU feature extraction

LSTM and GRU are two variants of RNNs that use gating mechanisms to track the sequence state, where GRU is simpler than LSTM and superior to it when the input data are scarce or the risk of overfitting is high. GRU consists of reset gates and update gates, which selectively pass information through a “gate” structure, capturing long-distance dependencies and contextual information in the sequence, thereby addressing the problem of vanishing or exploding gradients in recurrent networks. Their structures are shown in Figures 3A, B.

FIGURE 3

Figure 3 Structure of LSTM, GRU, and BiGRU, where i, f, and o in (A) represent the input, forget, and output gates, respectively; C and $\tilde{C}$ represent the memory cell and the new memory cell content, respectively; r and z in (B) represent the reset and update gates, respectively; and h and $\tilde{h}$ are the activation and candidate activation, respectively. In (C), x_t is the input vector at time t, y_t is the output vector at time t, and h₁ and h₂ are the output of the hidden layer state and update state at time t, respectively.

For time t, its GRU state is calculated as follows (Islam et al., 2022; Ma et al., 2022):

\begin{array}{l} \begin{matrix} z_{t} = \sigma (W_{z} \cdot [h_{t - 1}, x_{t}] + b_{z}) \\ r_{t} = \sigma (W_{r} \cdot [h_{t - 1}, x_{t}] + b_{r}) \\ {\tilde{h}}_{t} = \tanh (W_{h} \cdot [r_{t} * h_{t - 1}, x_{t}] + b_{h}) \\ h_{t} = (1 - z_{t}) * h_{t - 1} + z_{t} * {\tilde{h}}_{t} \end{matrix} & (2) \end{array}

where $x_{t}$ is the input vector at time t; σ is the sigmoid activation function; $W_{z}$, $W_{r}$, and $W_{h}$ are the weights; $b_{z}$, $b_{r}$, and $b_{h}$ are the biases; $z_{t}$ and $r_{t}$ are the update gate and reset gate at time t, respectively; and $h_{t}$ and ${\tilde{h}}_{t}$ are the hidden state and the candidate (update) state at time t, respectively.
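As a minimal sketch of Equation 2, the following pure-Python function performs one GRU update. The parameter dictionary `p`, the helper names, and the toy sizes are assumptions made for illustration; a real model would use a deep learning framework and learned weights.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, v, b):
    """Return W @ v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def gru_step(x_t, h_prev, p):
    """One GRU update following Equation 2; p holds W_z, W_r, W_h, b_z, b_r, b_h."""
    concat = h_prev + x_t                                          # [h_{t-1}, x_t]
    z = [sigmoid(v) for v in affine(p["W_z"], concat, p["b_z"])]   # update gate
    r = [sigmoid(v) for v in affine(p["W_r"], concat, p["b_r"])]   # reset gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)] + x_t           # [r_t * h_{t-1}, x_t]
    h_cand = [math.tanh(v) for v in affine(p["W_h"], gated, p["b_h"])]
    # h_t = (1 - z_t) * h_{t-1} + z_t * h~_t
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, h_cand)]


# Toy sizes (hidden 2, input 3) with all-zero weights, for illustration only.
hidden, inp = 2, 3
zeros = [[0.0] * (hidden + inp) for _ in range(hidden)]
params = {"W_z": zeros, "W_r": zeros, "W_h": zeros,
          "b_z": [0.0] * hidden, "b_r": [0.0] * hidden, "b_h": [0.0] * hidden}
h1 = gru_step([1.0, 2.0, 3.0], [0.4, -0.2], params)
print(h1)
```

With all-zero weights both gates equal 0.5 and the candidate state is zero, so the new hidden state is half of the previous one, which makes the interpolation in the last line of Equation 2 easy to check by hand.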

GRU ignores future contextual information, whereas BiGRU trains one GRU forward and one GRU backward over the same training sequence and then linearly combines the outputs of the two models, so that each node in the sequence can rely on the complete contextual information. Therefore, for question classification tasks, BiGRU is often used to better understand user intentions. Its structure is shown in Figure 3C. For the ith segmented word, BERT produces the embedding E(word_i), and the BiGRU output at time t is calculated as follows:

\begin{array}{l} h_{i t} = [{\vec{h}}_{i t}, {\overset{\leftarrow}{h}}_{i t}] & (3) \end{array}

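where ${\vec{h}}_{i t} = \mathrm{GRU} (E (\mathrm{word}_{i}), {\vec{h}}_{i t - 1})$ and ${\overset{\leftarrow}{h}}_{i t} = \mathrm{GRU} (E (\mathrm{word}_{i}), {\overset{\leftarrow}{h}}_{i t + 1})$ are the outputs of the forward GRU and backward GRU at time t, respectively. The backward GRU reads the sequence in reverse, so its previous state is the one at time t + 1.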
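The concatenation in Equation 3 can be sketched independently of the cell type. In the illustration below, `toy_step` is a hypothetical stand-in for a GRU cell (a running sum), chosen so that the forward and backward states are easy to follow; the function names are assumptions, not part of the original method.

```python
def run_direction(step, inputs, h0):
    """Apply a recurrent step over inputs in order, returning every hidden state."""
    states, h = [], h0
    for x in inputs:
        h = step(x, h)
        states.append(h)
    return states


def bidirectional(step_fwd, step_bwd, inputs, h0):
    """Concatenate forward and backward hidden states position by position."""
    forward = run_direction(step_fwd, inputs, h0)
    # The backward pass reads the reversed sequence; reverse again to re-align.
    backward = run_direction(step_bwd, inputs[::-1], h0)[::-1]
    return [f + b for f, b in zip(forward, backward)]


def toy_step(x, h):
    """Hypothetical stand-in for a GRU cell: a running sum of 1-D inputs."""
    return [h[0] + x[0]]


outputs = bidirectional(toy_step, toy_step, [[1.0], [2.0], [3.0]], [0.0])
print(outputs)
```

Each output pairs what has been seen up to that position with what follows from that position onward, which is the sense in which every node has access to the full context.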

3.4 Attention pooling

Attention mechanism is often embedded in machine learning and natural language processing models, such as YoloV3, U-Net, BERT, GPT, and transformer. It aims to allow the model to focus on the most crucial feature information relevant to the current task, thereby reducing attention to other irrelevant or noisy information. It can automatically learn and calculate the contribution of the input data to the output data. Its structure is shown in Figure 4.

FIGURE 4

Figure 4 Structure of attention mechanism, where x_i is the ith input and α_i is the ith weight coefficient of the state h_i.

To highlight the importance of different words in question classification, an attention layer is added after BiGRU. Its input is the output vector h_it of the BiGRU in the previous layer, and the attention score a_it and the pooled representation S_it are calculated as follows:

\begin{array}{l} \begin{matrix} u_{i t} = \tanh (w_{w} h_{i t} + b_{w}) \\ a_{i t} = \exp (u_{i t}^{T} u_{w}) / \sum_{i = 1}^{T} \exp (u_{i t}^{T} u_{w}) \\ S_{i t} = \sum_{i = 1}^{T} a_{i t} h_{i t} \end{matrix} & (4) \end{array}

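where h_it is the output vector of the previous BiGRU layer, w_w is the weight coefficient, b_w is the bias coefficient, u_w is a context vector learned together with the other parameters, and S_it is the attention-weighted sum of the BiGRU outputs. The superscript T in $u_{i t}^{T}$ denotes transposition, whereas T in the summation limit is the number of BiGRU outputs being pooled.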
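A minimal sketch of Equation 4 follows, assuming the hidden states are given as plain Python lists and that `w_w`, `b_w`, and `u_w` are already available; the toy values are hypothetical and serve only to show that the weights sum to 1 and favor states that align with `u_w`.

```python
import math


def attention_pool(H, w_w, b_w, u_w):
    """H: list of T hidden vectors. Returns (weights a, pooled vector S) per Equation 4."""
    # u_t = tanh(w_w h_t + b_w)
    U = [[math.tanh(sum(w * h for w, h in zip(row, h_t)) + b)
          for row, b in zip(w_w, b_w)] for h_t in H]
    # Score each position against the context vector u_w, then apply softmax.
    scores = [sum(u * c for u, c in zip(u_t, u_w)) for u_t in U]
    top = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    a = [e / total for e in exps]
    # S = sum_t a_t * h_t
    S = [sum(a_t * h_t[d] for a_t, h_t in zip(a, H)) for d in range(len(H[0]))]
    return a, S


# Toy hidden states and parameters, for illustration only.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
a, S = attention_pool(H, identity, [0.0, 0.0], [1.0, 1.0])
print(a, S)
```

Unlike max or average pooling, the pooled vector here is a weighted combination whose weights depend on the content of each hidden state.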

3.5 Capsule network

CapsNet consists of three layers: a convolutional layer, a main capsule layer using vectorized capsules, and a convolutional capsule layer using the dynamic routing mechanism. Let the convolutional layer contain N₁ convolution kernels $W \in R^{K \times V}$; the feature y_i is obtained by convolving the word vectors as follows:

\begin{array}{l} y_{i} = f (W \otimes x_{i : i + K - 1} + b) & (5) \end{array}

where f is a ReLU activation function and b represents its bias.

Calculate the feature matrix Y as follows:

\begin{array}{l} Y = [Y_{1}, Y_{2}, Y_{3}, \dots, Y_{N_{1}}] \in R^{(L - K + 1) \times N_{1}} & (6) \end{array}
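The shapes in Equations 5 and 6 can be checked with a short sketch. It interprets ⊗ as element-wise multiplication followed by a sum over the K × V window, which is an assumption about the notation, and it uses hypothetical toy kernels in place of learned ones.

```python
def conv_features(X, kernels, biases):
    """X: L x V matrix; each kernel is K x V. Returns an (L - K + 1) x N1 feature matrix."""
    L, V, K = len(X), len(X[0]), len(kernels[0])
    Y = []
    for i in range(L - K + 1):                 # slide the window x_{i:i+K-1}
        row = []
        for W, b in zip(kernels, biases):
            s = sum(W[k][v] * X[i + k][v] for k in range(K) for v in range(V))
            row.append(max(0.0, s + b))        # f is ReLU
        Y.append(row)
    return Y


# Toy input with L = 5, V = 2 and N1 = 2 kernels of size K = 2 (illustrative values).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [2.0, 1.0]]
kernels = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
Y = conv_features(X, kernels, [0.0, 0.0])
print(Y)
```

The output has L − K + 1 = 4 rows and N₁ = 2 columns, matching the dimensions stated in Equation 6.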

The main capsule layer is different from CNNs. This layer integrates semantic features of the same position in sentences, saves them as vectorized capsules, and converts the feature matrix obtained in the previous step into a capsule matrix Z through N₂ m-dimensional transformation matrices $N_{2} \times 1 \times m$ :

\begin{array}{l} Z = [Z_{1}, Z_{2}, Z_{3}, \dots, Z_{N_{2}}] \in R^{(L - K + 1) \times N_{2} \times m} & (7) \end{array}

A linear transformation is applied to each group of K₁ rows of capsules in Z through N₃ transformation matrices of size m × m, and the prediction vector $\hat{z}_{i}$ is calculated as follows:

\begin{array}{l} \hat{z}_{i} = W_{3} z_{i} + b_{i} & (8) \end{array}

A weighted sum over the prediction vectors $\hat{z}_{i}$ yields u_j:

\begin{array}{l} u_{j} = g (\sum_{i} c_{i} \hat{z}_{i}) & (9) \end{array}

where g is a nonlinear activation function and c_i is the coupling coefficient updated during the dynamic routing process. It is obtained by applying the softmax function to the routing weight b_i of the connection between capsule $z_{i}$ in this layer and capsule u_j in the upper layer (this routing weight is distinct from the bias b_i in Equation 8). The update method for b_i is

\begin{array}{l} b_{i} \leftarrow {b^{'}}_{i} + u_{j} \cdot \hat{z}_{i} & (10) \end{array}

where ${b^{'}}_{i}$ is the weight obtained from the previous iteration, initialized to 0.
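The routing loop in Equations 9 and 10 can be sketched as follows. The sketch assumes that g is the squashing nonlinearity commonly used in capsule networks and indexes the routing weights by both the lower capsule i and the upper capsule j; both choices are assumptions, since the text does not spell them out, and the toy prediction vectors are hypothetical.

```python
import math


def squash(v):
    """Shrink short vectors toward 0 and long vectors toward unit length."""
    sq = sum(x * x for x in v)
    if sq == 0.0:
        return [0.0] * len(v)
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in v]


def softmax(vals):
    top = max(vals)
    exps = [math.exp(x - top) for x in vals]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(z_hat, iterations=3):
    """z_hat[i][j]: prediction vector from lower capsule i for upper capsule j."""
    n_in, n_out, dim = len(z_hat), len(z_hat[0]), len(z_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]        # routing weights, initialized to 0
    for _ in range(iterations):
        c = [softmax(row) for row in b]             # coupling coefficients
        u = [squash([sum(c[i][j] * z_hat[i][j][d] for i in range(n_in))
                     for d in range(dim)]) for j in range(n_out)]   # Equation 9
        for i in range(n_in):
            for j in range(n_out):
                # Equation 10: raise the weight when prediction and output agree.
                b[i][j] += sum(p * q for p, q in zip(u[j], z_hat[i][j]))
    return u, c


# Two lower capsules agree on upper capsule 0 and disagree on upper capsule 1.
z_hat = [[[1.0, 0.0], [0.0, 0.1]],
         [[1.0, 0.0], [0.0, -0.1]]]
u, c = dynamic_routing(z_hat)
print(u, c)
```

After a few iterations the coupling coefficients shift toward the upper capsule on which the lower capsules agree, which is the clustering-like behavior of routing by agreement.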

Calculate the capsule matrix U as follows:

\begin{array}{l} U = [U_{1}, U_{2}, \dots, U_{N_{3}}] \in R^{(L - K - K_{1} + 2) \times N_{3} \times m} & (11) \end{array}

Finally, softmax is adopted as the feature classifier. Softmax normalizes the output feature vector and maps it to the (0, 1) interval to obtain the probability values of the corresponding output features for each type of question, thereby classifying the question.

The gradient descent–based method is adopted to learn the parameters of BBGCAP. In each training step, for L input samples ⟨x_i, y_i⟩, the gradient of the model loss with respect to each parameter is calculated, and each parameter is then updated with learning rate λ:

\begin{array}{l} \begin{matrix} \mathrm{Loss} = \sum_{i = 1}^{L} - \log p (y_{i} | x_{i}) \\ \theta = \theta - \lambda \frac{\partial \mathrm{Loss}}{\partial \theta} \end{matrix} & (12) \end{array}

where θ denotes the trainable parameters of the model and λ is the learning rate.
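Equation 12 can be illustrated on a much smaller model than BBGCAP. The sketch below uses a linear softmax classifier as a stand-in (an assumption made purely to keep the example self-contained), computes the summed negative log-likelihood and its gradient, and applies one update with a hypothetical learning rate.

```python
import math


def softmax(vals):
    top = max(vals)
    exps = [math.exp(v - top) for v in vals]
    total = sum(exps)
    return [e / total for e in exps]


def nll_and_grad(W, batch):
    """Loss = sum_i -log p(y_i | x_i) for a linear softmax model, plus dLoss/dW."""
    loss = 0.0
    grad = [[0.0] * len(W[0]) for _ in W]
    for x, y in batch:
        p = softmax([sum(w * xi for w, xi in zip(row, x)) for row in W])
        loss -= math.log(p[y])
        for k in range(len(W)):
            err = p[k] - (1.0 if k == y else 0.0)   # gradient of the loss w.r.t. logit k
            for d in range(len(x)):
                grad[k][d] += err * x[d]
    return loss, grad


def sgd_step(W, grad, lr):
    """theta <- theta - lr * dLoss/dtheta."""
    return [[w - lr * g for w, g in zip(rw, rg)] for rw, rg in zip(W, grad)]


# Two toy samples with two classes; the weights start at zero.
batch = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
W = [[0.0, 0.0], [0.0, 0.0]]
loss_before, grad = nll_and_grad(W, batch)
W = sgd_step(W, grad, lr=0.1)
loss_after, _ = nll_and_grad(W, batch)
print(loss_before, loss_after)
```

The experiments in this paper use the Adam optimizer rather than this plain update; the sketch only shows the loss and the basic gradient step written in Equation 12.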

From the above analysis, a BBGCAP-based crop disease-pest knowledge question classification method is proposed. Its flowchart is shown in Figure 5.

FIGURE 5

Figure 5 The flowchart of the methodology.

The pseudocode of the algorithm is given as follows:

Input: a mini-batch of crop disease-pest question texts $T = \{T_{1}, \dots, T_{m}\}$.

Output: the label of each crop disease-pest question text T.

1. Obtain the word vector of each word in the sentence through BERT(T), where T is a question text: $X = [X_{1}, X_{2}, X_{3}, \dots, X_{n}] \in \mathbb{R}^{L \times V}$.

2. Extract features with BiGRU: $h_{i t} = [{\vec{h}}_{i t}, {\overset{\leftarrow}{h}}_{i t}]$, where ${\vec{h}}_{i t} = \mathrm{GRU} (E (\mathrm{word}_{i}), {\vec{h}}_{i t - 1})$ and ${\overset{\leftarrow}{h}}_{i t} = \mathrm{GRU} (E (\mathrm{word}_{i}), {\overset{\leftarrow}{h}}_{i t + 1})$ are the outputs of the forward GRU and backward GRU at time t, respectively.

3. Apply attention pooling: $u_{i t} = \tanh (w_{w} h_{i t} + b_{w})$, $a_{i t} = \exp (u_{i t}^{T} u_{w}) / \sum_{i = 1}^{T} \exp (u_{i t}^{T} u_{w})$, $S_{i t} = \sum_{i = 1}^{T} a_{i t} h_{i t}$, where h_it is the output vector of the previous BiGRU layer, w_w is the weight coefficient, and b_w is the bias coefficient.

4. Calculate the prediction vector $\hat{z}_{i} = W_{3} z_{i} + b_{i}$.

5. Compute the weighted sum over $\hat{z}_{i}$ to obtain u_j: $u_{j} = g (\sum_{i} c_{i} \hat{z}_{i})$.

6. Update the routing weight b_i: $b_{i} \leftarrow {b^{'}}_{i} + u_{j} \cdot \hat{z}_{i}$.

7. Calculate the capsule matrix U: $U = [U_{1}, U_{2}, \dots, U_{N_{3}}] \in R^{(L - K - K_{1} + 2) \times N_{3} \times m}$.

8. Label(T) = Softmax(U), which maps the outputs of multiple neurons to the interval (0, 1) so that they can be understood as probabilities for multi-class classification.

9. Output the label of T.

4 Experimental results and analysis

BBGCAP is verified on the constructed crop disease-pest question dataset and compared with three crop disease-pest knowledge question classification methods: Chinese agricultural disease-pests named entity recognition with multi-scale local context features and self-attention mechanism (MSLCFSA) (Guo et al., 2020), question classification method based on merge-convolutional neural networks-deep pyramid convolutional neural networks–long short-term memory (MCDPLSTM) (Yu et al., 2021), and text classification model based on CNN and BiGRU fusion attention mechanism (CNNBiGRUA) (Ma et al., 2022). The three comparison methods use Word2vec for word embedding, whereas BBGCAP employs BERT. The model parameters are initialized using the Xavier normal distribution. The experimental settings are as follows: the hidden state dimension of the GRU unit is 100; the output vector dimension of BiGRU is also 100; the number of CapsNet iterations is 10 (the default is 5); the embedding size is 128; the hidden size is 768; the batch size is 32; the initial learning rate is 0.001; the number of training iterations is 3,000; the hidden activation is ReLU; the attenuation rate is 0.1; and the hidden layer attenuation rate is 0.5. The remaining weights, biases, and other parameters are updated continuously during model optimization. BiGRU and CapsNet are implemented with the Keras, TensorFlow 1.7.0, and PyTorch frameworks, whereas the Direct Data Ingestion (DDI) extraction experiments are conducted with Ubuntu 18.04 LTS as the operating system, 32 GB of memory, an Intel Core i5-4200U CPU @ 2.30 GHz, a GeForce GTX 1080 Ti GPU, and Ubuntu 14.0. BERT, BiGRU, and CapsNet are optimized with the Adam optimizer. The performance is evaluated using precision, recall, and F1, which are calculated as follows:

\begin{array}{l} \mathrm{Precision} = \frac{TP}{TP + FP}, \quad \mathrm{Recall} = \frac{TP}{TP + FN}, \quad F1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} & (13) \end{array}

where TP (true positive) is the number of positive instances correctly classified as positive, FP (false positive) is the number of negative instances misclassified as positive, and FN (false negative) is the number of positive instances misclassified as negative.
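Equation 13 can be applied per question category by treating that category as the positive class. The following sketch does this for a few hypothetical labels; the category names and predictions are invented for illustration and are not taken from the dataset.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall, and F1 with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical gold labels and predictions for five questions.
y_true = ["disease", "disease", "pest", "pest", "pest"]
y_pred = ["disease", "pest", "pest", "pest", "disease"]

pest_scores = precision_recall_f1(y_true, y_pred, "pest")
disease_scores = precision_recall_f1(y_true, y_pred, "disease")
# Macro average: the unweighted mean of the per-class F1 values.
macro_f1 = (pest_scores[2] + disease_scores[2]) / 2
print(pest_scores, disease_scores, macro_f1)
```

Whether the reported scores are macro-averaged, micro-averaged, or weighted is not stated in the text, so the macro average here is only one possible choice.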

4.1 Dataset

Through the Scrapy crawler framework, Chinese text corpora of common crop disease-pests are collected from Baidu Encyclopedia, Interactive Encyclopedia, and Chinese Wikipedia, as well as from crop management websites such as “Expert Online System,” “Planting Q&A Network,” and “Nanjing Agricultural Commission,” including various questions and sentences from agricultural producers about common crop disease-pests. Crop disease-pest information processing methods are applied to correct the crawled corpus and remove duplicate and unclear data, and 3,000 clear question-answer pairs are selected as the dataset of frequently asked questions and corresponding answers. The questions are converted into narrative sentences to construct a dataset of common crop disease questions. Some questions and their classification are shown in Table 2.

TABLE 2

Table 2 Examples of crop pest questions.

4.2 Results

In the experiments, the word vector dimension is set to 128, the maximum length of the question is set to 100, and the output feature dimension of each GRU in the BiGRU layer is set to 128. Stacking mode is selected to connect the outputs of the forward and backward GRUs. The experiments are conducted using 10-fold cross-validation, i.e., 10 experiments are conducted. In each experiment, 10% of the questions in each category are randomly selected as the test dataset, and the remaining data are used as the training dataset, so the test and training datasets do not overlap. The average result over the 10 test datasets is used as the evaluation indicator of the classification performance. In data preprocessing, impurity questions are removed, and the Jieba word segmentation tool is used to segment user questions, with stop words, punctuation marks, and special characters removed. Then, the open-source BERT model provided by Google is used to vectorize the questions. CapsNet first extracts a set of features, clusters them, makes predictions from the clustering results, backpropagates according to the predictions, and updates the feature-extraction matrix, which can be understood as replacing or fine-tuning the set of features; clustering then continues, and the cycle repeats. Finally, softmax is used to classify the questions. The precision versus iteration curves of BBGCAP and BBLCAP are shown in Figure 6, where BBLCAP is a variant of BBGCAP with BiLSTM replacing BiGRU and everything else unchanged. From Figure 6, it is seen that BBGCAP and BBLCAP converge at 2,500 iterations and that the convergence performance of BBGCAP is better than that of BBLCAP. In the following experiments, the number of iterations is set to 3,000.
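The per-category 10% test split described above can be sketched as a stratified partition into 10 folds. The code below is one way to realize it under stated assumptions: the category names and counts are hypothetical, and the folds are built so that each question is tested exactly once, which the text does not explicitly require.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds, spreading each category evenly."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)                 # random selection within each category
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Hypothetical categories and counts, for illustration only.
labels = ["control"] * 50 + ["symptom"] * 30 + ["cause"] * 20
folds = stratified_folds(labels)

splits = []
for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    splits.append((train_idx, test_idx))     # train on 90%, test on 10%
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

The reported score would then be the mean of the metric over the 10 `(train_idx, test_idx)` pairs.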

FIGURE 6

Figure 6 The precision versus iteration.

The overall classification performances of BBGCAP and three comparative models on the test set are shown in Table 3.

TABLE 3

Table 3 Word segmentation classification of crop pest questions.

From Table 3, it can be seen that the proposed model is superior to the other three methods. The main reason is that BBGCAP fully utilizes the advantages of its components: BERT is superior to Word2vec, BiGRU is superior to BiLSTM, and attention pooling is better than the attention mechanism in MSLCFSA and CNNBiGRUA. MSLCFSA is superior to CNNBiGRUA and MCDPLSTM because it can obtain the multi-scale local context features and utilizes the attention mechanism to fully reflect the keyword features in questions, which makes question feature extraction more accurate and thereby improves classification accuracy. The hybrid model CNNBiGRUA is slightly better than MCDPLSTM because it utilizes the advantages of CNN, BiGRU, and the attention mechanism, where BiGRU fully uses the positional information before and after sentence segmentation.

To test the impact of the number of training samples on the question classification results, cross-validation experiments with different numbers of folds are carried out, and the results are shown in Table 4.

TABLE 4

Table 4 Results of BBGCAP with different fold cross-validation experiments.

From Table 4, it is seen that the number of training samples has a significant impact on the effectiveness of the model, mainly due to the variety of user questions and the lack of a fixed format, requiring a large number of samples for training. As the dataset increases, the classification results significantly increase.

To test the contributions of BERT, BiGRU, and attention pooling, we modify the structure of the proposed model BBGCAP and conduct 10-fold cross-validation experiments under otherwise unchanged experimental conditions. The results are given in Table 5.

Table 5 Results of ablation experiments.
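One way to organize such ablation experiments is to swap a single component at a time while keeping the rest of the model fixed. The sketch below only enumerates the variants and does not build any model. The `ModelConfig` class, the variant names, and the chosen substitutes (Word2vec, BiLSTM, and a plain attention mechanism, taken from the comparisons discussed for Table 3) are assumptions and may not match the rows of Table 5.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical description of the four BBGCAP components."""
    embedding: str = "BERT"
    encoder: str = "BiGRU"
    pooling: str = "attention_pooling"
    classifier: str = "CapsNet"


# Assumed substitutes, one per component under test.
SUBSTITUTES = {
    "embedding": "Word2vec",
    "encoder": "BiLSTM",
    "pooling": "attention",
}


def ablation_variants(base=ModelConfig()):
    """Return the full model plus one variant per swapped component."""
    variants = {"full": base}
    for component, substitute in SUBSTITUTES.items():
        name = f"{component}->{substitute}"
        variants[name] = replace(base, **{component: substitute})
    return variants
```

Each variant would then be trained and evaluated under the same cross-validation protocol, so that any change in the scores can be attributed to the swapped component.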

Tables 3–5 show that the effectiveness of crop disease-pest question classification depends not only on the choice of classification algorithm but also, to a significant degree, on the size of the training dataset. The experimental results in Tables 3–5 validate that BBGCAP is effective and feasible for a crop disease-pest intelligent question answering system, which is a real-time, practical system that requires high accuracy. Because question classification is a key step in such a system, the training data should contain as many questions as possible to improve the accuracy of the whole system.

5 Conclusion

Crop disease-pest question classification is an important and challenging problem in crop disease-pest question answering systems. A BERT-BiGRU-CapsNet model with attention pooling, namely, BBGCAP, is constructed for the crop disease-pest Q&A system. BBGCAP is a hybrid network model that integrates the advantages of BERT, BiGRU, CapsNet, and attention pooling. The experimental results demonstrate that BBGCAP outperforms the other methods. However, BBGCAP has a hierarchical structure and requires considerable optimization during training. Future work will optimize BBGCAP and address the complex and diverse semantic information in user questions about crop disease-pests.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author contributions

TZ: Methodology, Writing – original draft. DW: Resources, Validation, Writing – review & editing.

Funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work is supported by the National Natural Science Foundation of China (Nos. 62172338 and 62072378) and the General Special Research Plan Project of Shaanxi Provincial Department of Education (No. 22JK0596).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Chen, J., Zhang, D., Zeb, A., Nanehkaran, Y. A. (2021). Identification of rice plant diseases using lightweight attention networks. Expert Syst. Appl. 169, 114514. doi: 10.1016/j.eswa.2020.114514

Cherif, A. R. (2022). A systematic review of knowledge representation techniques in smart agriculture (Urban). Sustainability 2022, 14. doi: 10.3390/su142215249

El-Ghany, N. M. A., El-Aziz, S. E. A., Marei, S. S. (2020). A review: application of remote sensing as a promising strategy for insect pests and diseases management. Environ. Sci. Pollut. Res. 27 (27), 33503–33515. doi: 10.1007/s11356-020-09517-2

Guo, X., Hao, X., Tang, Z., Diao, L., Bai, Z., Lu, S., et al. (2021). ACE-ADP: adversarial contextual embeddings based named entity recognition for agricultural disease-pests. Agriculture 11, 1–10. doi: 10.3390/agriculture11100912

Guo, X., Zhou, H., Su, J., Hao, X., Li, L. (2020). Chinese agricultural disease-pests named entity recognition with multi-scale local context features and self-attention mechanism. Comput. Electron. Agric. 179 (5), 105830. doi: 10.1016/j.compag.2020.105830

He, J., Chen, K., Pan, X., Zhai, J., Lin, X. (2023). Advanced biosensing technologies for monitoring of agriculture disease-pests: A review. J. Semiconductors 44 (2), 23104. doi: 10.1088/1674-4926/44/2/023104

Islam, M. S., Islam, M. N., Hashim, N., Rashid, M., Bari, B. S., Farid, F. A. (2022). New hybrid deep learning approach using biGRU-biLSTM and multilayered dilated CNN to detect arrhythmia. IEEE Access 10, 58081–58096. doi: 10.1109/ACCESS.2022.3178710

Liu, J., Wang, X. (2020). Tomato disease-pests detection based on improved YoloV3 convolutional neural network. Front. Plant Sci. 11 (1), 898. doi: 10.3389/fpls.2020.00898

Liu, J., Wang, X. (2021). Plant disease-pests detection based on deep learning: a review. Plant Methods 17, 22. doi: 10.1186/s13007-021-00722-9

Lu, J., Tan, L., Jiang, H. (2021). Review on convolutional neural network (CNN) applied to plant leaf disease classification. Agriculture 11, 707. doi: 10.3390/agriculture11080707

Ma, Y., Chen, H., Wang, Q., Zheng, X. (2022). “Text classification model based on CNN and BiGRU fusion attention mechanism,” in 2nd International Conference on Computer, Communication, Control, Automation and Robotics. doi: 10.1051/itmconf/20224702040

Meng, J., Zhang, Y., Wang, N., Pratama, M. (2016). Attention pooling-based convolutional neural network for sentence modelling. Inf. Sci. 373, 388–403. doi: 10.1016/j.ins.2016.08.084

Miao, Z., Huang, G., Li, N., Sun, T., Wei, Y. (2022). A Review of Plant Disease and Insect Pest Detection based on Deep Learning (Singapore: Chinese Intelligent Systems Conference, Springer), 103–118. doi: 10.1007/978-981-19-6226-4_12

Miguel, R., Francisco, G., Rafael, V. (2021). Knowledge-based system for crop pests and diseases recognition. Electronics 10 (8), 905. doi: 10.3390/electronics10080905

Nandhini, S., Ashokkumar, K. (2021). Improved crossover based monarch butterfly optimization for tomato leaf disease classification using convolutional neural network. Multimedia Tools Appl. 4), 1–28. doi: 10.1007/s11042-021-10599-4

Tao, J., Zhang, X., Luo, X., Wang, Y., Song, C., Sun, Y. (2022). Adaptive capsule network. Comput. Vision Image Understanding 218, 103405. doi: 10.1016/j.cviu.2022.103405

Thanammal Indu, V., Suja Priyadharsini, S. (2022). Crossover-based wind-driven optimized convolutional neural network model for tomato leaf disease classification. J. Plant Dis. Prot. 129 (3), 559–578. doi: 10.1007/s41348-021-00528-w

Waheed, H., Zafar, N., Akram, W., Manzoor, A., Gani, A., Islam, S. U. (2022). Deep learning based disease, pest pattern and nutritional deficiency detection system for "Zingiberaceae" Crop. Agriculture 12 (6), 1–17. doi: 10.3390/agriculture12060742

Wang, C., Gao, J., Rao, H., Chen, A., He, J., Jiao, J. (2022). Named entity recognition (NER) for Chinese agricultural diseases and pests based on discourse topic and attention mechanism. Evolutionary Intell. 9 (1), 4–18. doi: 10.1007/s12065-022-00727-w

Wang, J., Li, Y., Feng, H., Ren, L., Du, X., Wu, J. (2020). Common pests image recognition based on deep convolutional neural network. Comput. Electron. Agric. 179 (1), 105834. doi: 10.1016/j.compag.2020.105834

Xin, M., Wang, Y. (2021). Image recognition of crop diseases and insect pests based on deep learning. Wireless Commun. Mobile Computing 10, 1–15. doi: 10.1155/2021/5511676

Yu, X., Gong, R., Chen, P. (2021). “Question classification method in disease question answering system based on MCDPLSTM,” in IEEE 21st International Conference on Software Quality, Reliability and Security Companion, Hainan, China. 381–387. doi: 10.1109/QRS-C55045.2021.00063

Zuo, Y., Liu, P., Tan, Y., Guo, Z., Tang, R. (2020). “An attention-based lightweight residual network for plant disease recognition,” in International Conference on Artificial Intelligence and Computer Engineering (ICAICE). 224–228. doi: 10.1109/ICAICE51518.2020.00049

Keywords: crop disease-pest question, bidirectional gated unit (BiGRU), capsule network (CapsNet), attention pooling, BERT-BiGRU-CapsNet with attention pooling (BBGCAP)

Citation: Zhang T and Wang D (2023) Classification of crop disease-pest questions based on BERT-BiGRU-CapsNet with attention pooling. Front. Plant Sci. 14:1300580. doi: 10.3389/fpls.2023.1300580

Received: 23 September 2023; Accepted: 22 November 2023;
Published: 07 December 2023.

Edited by:

Chuanlei Zhang, Tianjin University of Science and Technology, China

Reviewed by:

Weijun Cheng, Minzu University of China, China
Kashif Javed, National University of Sciences and Technology (NUST), Pakistan

Copyright © 2023 Zhang and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Ting Zhang, 1046935520@qq.com
