- 1 The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- 2 National Engineering Laboratory for Internet Medical Systems and Applications, Zhengzhou, China
- 3 Management Engineering School, Zhengzhou University, Zhengzhou, China
- 4 Henan Province Telemedicine Center of China, National Telemedicine Center of China, Zhengzhou, China
Background: The outbreak of novel coronavirus disease 2019 (COVID-19) has led large numbers of individuals to visit medical institutions for healthcare services. Public gatherings and close contact in clinics and emergency departments may increase exposure to COVID-19 and the risk of cross-infection.
Objectives: The purpose of this study was to develop and deploy an intelligent response system for COVID-19 voice consultation, to suggest response measures based on the actual information provided by users, and to screen suspected COVID-19 cases.
Methods: Based on the requirements analysis of business, user, and function, the physical architecture, system architecture, and core algorithms are designed and implemented. The system operation process is designed according to guidance documents of the National Health Commission and the actual experience of prevention, diagnosis and treatment of COVID-19. Both qualitative (system construction) and quantitative (system application) data from the real-world healthcare service of the system were retrospectively collected and analyzed.
Results: The system provides functions such as remote deployment and operation, fast adjustment of the operation procedure, and multi-dimensional statistical reporting. The machine-learning model used to develop the system outperformed the other models evaluated, with the lowest Character Error Rate (CER) of 8.13%. As of September 24, 2020, the system had received 12,264 incoming calls and provided a total of 11,788 COVID-19-related consultation services to the public. Approximately 85.2% of the users were from Henan Province, followed by Beijing (2.5%). Of all the incoming calls, China Mobile contributed the largest proportion (66%), while China Unicom and China Telecom accounted for 23% and 11%, respectively. User access peaked in the morning (08:00–10:00) and in the afternoon (14:00–16:00).
Conclusions: The intelligent response system has achieved appreciable practical implementation effects. Our findings reveal that the provision of inquiry services through an intelligent voice consultation system may play a role in optimizing the allocation of healthcare resources, improving the efficiency of medical services, saving medical expenses, and protecting vulnerable groups.
Introduction
In late December 2019, a cluster of pneumonia cases caused by a new severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) was first reported in Wuhan, Hubei Province, China (1, 2). The viral pneumonia is now officially known as novel coronavirus disease 2019 (COVID-19), and human-to-human transmission has been confirmed (1, 3, 4). As of February 14, 2021, COVID-19 had spread to almost all countries and territories around the world, causing over 108.2 million confirmed cases, including more than 2.3 million deaths (5). The COVID-19 pandemic has led large numbers of healthy, suspected, or asymptomatically infected individuals to visit medical institutions for diagnosis or treatment, resulting in a shortage of healthcare resources and crowding in clinics and emergency departments (6, 7). Such public gatherings may increase the infection risk for healthy people and medical staff. The COVID-19 outbreak has also caused a sharp increase in the demand for healthcare consultation services, far exceeding the capacity of medical institutions (8, 9). To address these issues, it is necessary to carry out education, publicity, teleconsultation, and intelligent voice inquiry to help people take suitable prevention and control actions, such as visiting a fever clinic, quarantine, or self-isolation and observation at home, according to their actual situations (10–13).
Given the situation of the COVID-19 pandemic, avoiding mass visits to medical institutions and enabling remote healthcare consultation and patient triage have become important means of allocating healthcare resources optimally, which can contribute indirectly to the control of the pandemic. To date, one important tool that has not yet been fully explored is the use of information and communications technologies (ICTs) to support social distancing and quarantine, optimize healthcare delivery, and reduce exposure and cross-infection among healthcare professionals and COVID-19 patients (14, 15). Intelligent conversational agents and virtual assistants, such as chatbots, wearable devices, voice assistants, and mobile phone applications, have shown their potential to serve as intermediaries in the fight against COVID-19 (14, 16). For example, during the COVID-19 pandemic, emerging voice assistants (e.g., Google Assistant, Apple Siri, and Amazon Alexa) have been adopted as an alternative healthcare delivery modality to mitigate the risk of COVID-19 spread and relieve the stress on the healthcare system (14, 15, 17).
By analyzing the voice messages of users, an artificial intelligence-based voice consultation system for COVID-19 can automatically identify the health consultation questions of users at a distance and then provide specific answers and response opinions (18–20). As one of the latest applications of ICTs, the design and application of intelligent response systems for COVID-19 voice consultation place specific requirements on the development and deployment of the underlying information systems (20, 21). First, intelligent speech recognition based on machine learning needs to be sufficiently accurate and able to keep learning and improving as new data are imported. Second, the data transmission quality, request-response speed, and load-carrying capacity of the information system should meet the actual needs of medical services. Third, the system should support access from mobile phones, computers, and other terminals. Lastly, the system should include a dedicated security module to protect the privacy of users and ensure information security.
Accordingly, this study developed and deployed an intelligent response system for COVID-19 voice consultation. The aims were to reduce the number of non-essential in-person hospital visits, lessen face-to-face contact among the healthy public, COVID-19 patients, and healthcare professionals, preserve already strained medical resources, increase the capacity of medical institutions to screen suspected cases and deliver healthcare information, and ultimately help reduce the spread of COVID-19. The system collects and analyzes user information through intelligent inquiry and interaction with users, suggests mitigation measures based on the actual situations of the users, and screens suspected COVID-19 cases. To our knowledge, this intelligent response system is one of the first tools developed and applied on a large scale for COVID-19 in China. The findings of the present study may help to avoid frequent visits of healthy people to fever clinics, improve the utilization efficiency of medical resources, and prevent and control COVID-19 infection among healthy people and medical practitioners.
Materials and Methods
Requirements Analysis
This study aims to design an intelligent response system for COVID-19 voice consultation that allows users to complete a self-assessment. When users access the system, it asks questions about their basic information. Using natural language processing (NLP) technology, it dynamically adjusts the follow-up questions according to the options selected by the user and then gives response opinions suited to the user's actual situation, so that suspected COVID-19 cases can be screened out quickly and given specific suggestions for further action. Because an online consultation is necessarily incomplete and the system is not connected to any healthcare system, a user identified as a suspected COVID-19 case is advised to go immediately to the nearest COVID-19 designated hospital for confirmation and treatment. To achieve these functions, and following the requirements analysis practice of software engineering, the intelligent response system needs to meet the following requirements.
1. Business requirements: Build an intelligent voice consultation system for COVID-19 through which users can complete a self-assessment. For the staff involved, contact with other people should be minimized during the development, deployment, and application stages of the system.
2. User requirements: Users only need to make a phone call to access the system. They can complete a self-assessment about COVID-19 without going to the hospital. Through real-time voice communication, individualized assessment results and suggested response measures are ultimately obtained.
3. Functional requirements:
a) Remote deployment and maintenance: Given the high infectiousness of COVID-19, non-contact should be the first requirement of this system, and remote operation should be achievable at all stages from deployment to application of the system.
b) Accurate speech understanding: Accurate and real-time speech understanding is a key to the successful application of this system. Minor mistakes in speech recognition and understanding may lead to severe errors.
c) The intelligence of the system: The epidemiological history and personal symptoms of each user are different, so using a fixed process in all consultations is not appropriate. The system should be able to adjust its content dynamically and intelligently based on the successive answers of the user.
d) Data statistics and analysis: Statistical analysis of data from system logs and routine operation can quickly identify characteristics of users and provide suggestions for improving the consultation framework, and it is also helpful for medical staff to conduct scientific research.
Physical Architecture
To meet the abovementioned requirements, the physical architecture of the system was designed as shown in Figure 1. It consists of three modules: the admin module, through which the administrator can adjust the operation procedure of the system; the interactive module, through which users access the system; and the core module, which, after a call is established, collects the information provided by the user, such as symptoms and epidemiological history.
Admin Module
System administrators can remotely access the system back end through the web and then design and adjust the consultation process and configure the relevant system parameters. To support more specialized medical services, the web server exposes various application programming interfaces (APIs) and can also connect to a hospital information system according to actual business needs.
Core Module
The core module contains three parts: an intelligent voice engine server, load balancing, and a storage server. The speech engine provides interactive services, such as speech recognition, synthesis, and understanding. The system dynamically balances services through load balancing, and the recording storage server stores the recordings of user calls.
Interactive Module
Users can directly dial the phone number to access the soft-switching system. The soft-switching system interacts with the core module through the Session Initiation Protocol and transmits the data of the conversation process to the recording storage server simultaneously.
System Architecture
The architecture of the system is shown in Figure 2, which includes four layers.
Basic Communication Layer
This layer adopts a soft-switching system to handle different types of incoming calls. It supports multiple pathways of access, such as telephone, email, short messaging service (SMS), and Web call.
Core Technology Layer
This layer is the core layer of the system. Key technologies, such as automatic speech recognition (ASR), prosody recognition, pragmatic analysis, and syntax analysis are employed, which can work together to achieve multiple rounds of dialogue with the system.
Background Business Layer
The background business layer does not provide services directly to users, but it is open to system administrators to support related business.
Basic Service Layer
This layer is in charge of functions, such as user terminal management, outgoing and incoming call tasks management, call center control, API management, and security module, which provides services directly to users and engineers of system operation.
Sampling Methods
Two kinds of sampling were used in the present study. The first was sampling of the voice signal. Before the model is trained, the sound wave must be sampled to convert the analog audio signal into a digital signal that a computer can process. This was done with the following parameters: sample rate 8 kHz, bit depth 16 bit, and bit rate 128 kbps. After the voice signal was sampled, Mel-scale frequency cepstral coefficients were used to extract features from the digital signal, which allows the characteristics of the human voice to be modeled efficiently while reducing the feature dimensions. The second was sampling of the training data set for speech recognition. One of the biggest challenges in putting a speech recognition system into practical use is the accent problem: there are many accents in China, and the accents of different dialect regions differ considerably (22). To deal with this problem, about 12,000 h of telephone voice conversations were collected by cluster sampling from a voice communication service provided by a company. The company is a leading voice service provider in China, with customers including banks, hospitals, universities, and insurance companies, which ensures that the corpus covers a variety of accents from different regions of China. The material contains both human-to-human and human-to-machine calls. Training and developing our models on this corpus can therefore improve the accuracy of speech recognition and enhance the generalizability of the intelligent response system for COVID-19 voice consultation.
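The sampling parameters quoted above can be checked with a short calculation. The sketch below assumes uncompressed single-channel (mono) PCM audio, which the text does not state explicitly but which is consistent with the reported figures; the function names and the 440 Hz test tone are illustrative only.

```python
import math

SAMPLE_RATE_HZ = 8_000   # sample rate reported in the text
BIT_DEPTH = 16           # bits per sample reported in the text
CHANNELS = 1             # assumption: mono telephone audio


def pcm_bit_rate(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Bit rate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bit_depth * channels


def sample_sine(freq_hz: float, duration_s: float) -> list[int]:
    """Sample and quantize a sine wave to signed 16-bit integers."""
    max_amplitude = 2 ** (BIT_DEPTH - 1) - 1          # 32767 for 16-bit audio
    n_samples = round(duration_s * SAMPLE_RATE_HZ)
    return [
        round(max_amplitude * math.sin(2 * math.pi * freq_hz * k / SAMPLE_RATE_HZ))
        for k in range(n_samples)
    ]


bit_rate_kbps = pcm_bit_rate(SAMPLE_RATE_HZ, BIT_DEPTH, CHANNELS) / 1000
samples = sample_sine(freq_hz=440.0, duration_s=0.01)
print(f"{bit_rate_kbps:.0f} kbps, {len(samples)} samples in 10 ms")
```

Under the mono assumption, 8,000 samples per second at 16 bits per sample gives 128,000 bits per second, which matches the 128 kbps stated in the text.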
Core Algorithms
The core algorithm of this system is based on multiple rounds of speech recognition. The implementation process of the algorithm is shown in Figure 3. Through speech recognition, prosody analysis, syntactic analysis, semantic analysis, pragmatic analysis, and logical judgment, multiple rounds of dialogue are realized.
Speech Recognition
Speech is converted to text by a speech recognition acoustic model based on long short-term memory (LSTM). Voice is a typical time-series signal, and a recurrent neural network (RNN) has strong temporal modeling capabilities, which makes it suitable for speech recognition. The LSTM model is a variant of the RNN that adds three gates: a forget gate, an input gate, and an output gate. Compared with a traditional RNN, an LSTM can process longer sequences of voice data through the combination of these three gates and achieve better speech recognition. LSTM has shown state-of-the-art performance on many speech recognition tasks (23–25). To improve performance further, we used a variant of LSTM known as Stacked maxout LSTMs (26). The maxout LSTM architecture is illustrated in Figure 4. In the present study, three maxout LSTM layers were stacked to build the Stacked maxout LSTMs.
Maxout units can summarize a group of spatially neighboring neurons in a lower layer, which gives them the property of local translation invariance. Specifically, the output h_t of the lower maxout LSTM layer is the input x_t of the upper maxout LSTM layer. These Stacked maxout LSTM networks can combine multiple levels of representation with flexible use of long-range context. The equations of the maxout LSTM layers are as follows:
For the equations, σ is the logistic sigmoid function, and i, f, o, a, and c are the input gate, forget gate, output gate, cell input activation, and cell state vectors, respectively, all of which are the same size as the hidden vector h. W_ci, W_cf, and W_co are diagonal weight matrices for peephole connections, and G is the group size in the maxout unit.
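As a rough illustration of how a maxout LSTM layer with peephole connections might be computed, the sketch below assumes the standard peephole LSTM update, with the cell input activation a replaced by a maxout over groups of G linear units. This is an assumed formulation built from the symbol definitions above, not the authors' implementation; the parameter names, toy sizes, and random initialization are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(input_size, hidden_size, group_size, seed=0):
    """Randomly initialized parameters for one layer (illustrative only)."""
    rng = np.random.default_rng(seed)
    p = {}
    for gate in ("i", "f", "o"):
        p["W_x" + gate] = rng.normal(scale=0.1, size=(hidden_size, input_size))
        p["W_h" + gate] = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
        # Diagonal peephole matrix, stored as a vector and applied elementwise.
        p["w_c" + gate] = rng.normal(scale=0.1, size=hidden_size)
        p["b_" + gate] = np.zeros(hidden_size)
    # The cell input has G linear units per hidden unit; maxout keeps the largest.
    p["W_xa"] = rng.normal(scale=0.1, size=(hidden_size * group_size, input_size))
    p["W_ha"] = rng.normal(scale=0.1, size=(hidden_size * group_size, hidden_size))
    p["b_a"] = np.zeros(hidden_size * group_size)
    return p


def maxout_lstm_step(x_t, h_prev, c_prev, p, group_size):
    """One time step of a single maxout LSTM layer (assumed formulation)."""
    i_t = sigmoid(p["W_xi"] @ x_t + p["W_hi"] @ h_prev + p["w_ci"] * c_prev + p["b_i"])
    f_t = sigmoid(p["W_xf"] @ x_t + p["W_hf"] @ h_prev + p["w_cf"] * c_prev + p["b_f"])
    z_t = p["W_xa"] @ x_t + p["W_ha"] @ h_prev + p["b_a"]
    a_t = z_t.reshape(-1, group_size).max(axis=1)   # maxout over each group of G units
    c_t = f_t * c_prev + i_t * a_t
    o_t = sigmoid(p["W_xo"] @ x_t + p["W_ho"] @ h_prev + p["w_co"] * c_t + p["b_o"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t


def stacked_forward(x_seq, layers, group_size):
    """Run a sequence through stacked layers; h_t of a lower layer feeds the next."""
    states = [(np.zeros(p["b_i"].shape[0]), np.zeros(p["b_i"].shape[0])) for p in layers]
    for x_t in x_seq:
        layer_input = x_t
        for idx, p in enumerate(layers):
            h_t, c_t = maxout_lstm_step(layer_input, *states[idx], p, group_size)
            states[idx] = (h_t, c_t)
            layer_input = h_t
    return states[-1][0]   # hidden vector of the top layer at the last time step
```

Three parameter sets passed as `layers` mirror the three stacked layers described in the text; the toy sizes here are far smaller than a production model.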
Prosody Recognition
Prosodic structure is divided based on speech and text information, through tasks such as accent classification of prosodic words, pitch analysis at the boundaries of prosodic phrases, and tone classification at the boundaries of intonation phrases.
Syntax Analysis
Based on text recognition, text phrase structure classification, short sentence type classification, and sentence category classification are performed.
Semantic Analysis
The information structure of phrases, the semantic inheritance relationship among dialogue rounds, and the topic of dialogue segments are analyzed.
Pragmatic Analysis
Speech act verbs are classified first. The response type of the current round of dialogue is then categorized, and the round is identified as either a trigger or a response. After this classification, the adjacency-pair category to which the round belongs is determined, and the corresponding answer type is chosen according to the source of the utterance, the sentence category, or the information structure.
Logical Analysis
Based on the results of the pragmatic analysis, combined with the current logical node, the next logical node is determined, and a new round of dialogue is delivered to the user through speech synthesis technology.
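The logical analysis step can be pictured as a walk over a graph of dialogue nodes, where the classified answer type of each round selects the next node. The sketch below is a minimal illustration under that assumption; the node names, prompts, answer labels, and transitions are all hypothetical and do not reproduce the deployed consultation flow or its guidance.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    prompt: str                                       # text that would be synthesized to speech
    transitions: dict = field(default_factory=dict)   # answer type -> next node name


# Hypothetical flow; the real procedure is defined by the system administrators.
FLOW = {
    "ask_fever": Node("Do you have a fever?",
                      {"yes": "ask_contact", "no": "ask_travel"}),
    "ask_contact": Node("Have you been in close contact with a confirmed case?",
                        {"yes": "opinion_a", "no": "ask_travel"}),
    "ask_travel": Node("Have you recently been to an outbreak area?",
                       {"yes": "opinion_b", "no": "opinion_c"}),
    "opinion_a": Node("(placeholder for guidance opinion A)"),
    "opinion_b": Node("(placeholder for guidance opinion B)"),
    "opinion_c": Node("(placeholder for guidance opinion C)"),
}


def next_node(current: str, answer_type: str) -> str:
    """Pick the next logical node; stay on the same node if the answer is unrecognized."""
    return FLOW[current].transitions.get(answer_type, current)


def run_dialogue(answer_types, start="ask_fever"):
    """Follow a sequence of classified answers and return the visited nodes."""
    path, node = [start], start
    for answer in answer_types:
        if not FLOW[node].transitions:    # terminal node: an opinion has been reached
            break
        node = next_node(node, answer)
        path.append(node)
    return path


print(run_dialogue(["yes", "yes"]))
```

Keeping the flow in a plain data structure is one way to make the procedure adjustable without code changes, which matches the remote adjustment requirement described earlier.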
System Operation Procedure Design
According to the actual experience of prevention, diagnosis, and treatment of COVID-19, and considering the guidance documents of the National Health Commission (27), this study designed the operation procedure of the intelligent response system (Figure 5). First, when the phone call is connected, the system briefly introduces itself and then collects information on the relevant symptoms of the user. Second, during the conversation, the system recognizes the keywords spoken by the user in real time and asks targeted follow-up questions based on the actual situation of the user. Finally, the system provides specific COVID-19 prevention and control measures depending on the epidemiological history of the user.
In this system, the epidemiological history includes four types: (1) people who have a history of travel to or contact with the outbreak epicenters; (2) people who have been in contact with emerging epidemic areas announced by the government; (3) people who have had close contact with suspected or confirmed COVID-19 patient(s); and (4) people with many individuals around them who have developed symptoms, such as fever, fatigue, cough, and sore throat. After the consultation is completed, the system automatically selects an appropriate opinion from the four predefined guidelines and recommends it to the user. The details of the guidance opinions are summarized in Table 1. In the remainder of this paper, both qualitative (system construction) and quantitative (system application) data from the real-world healthcare service of the system were collected and analyzed retrospectively.
Results
The system was developed and has been in operation since March 31, 2020. People in China can access the intelligent response system by dialing the telephone number 0371-96299.
System Functions
Remote Deployment and Operation
Given the necessary hardware and software environment, the system can be deployed remotely, connected to local telephone lines, and then used to provide voice consultation services. During operation, the status of the system can be monitored remotely, and the consultation process mentioned above can also be adjusted remotely, which avoids on-site operation and maintenance, reduces human contact, and decreases potential exposure to COVID-19.
Fast Operation Procedure Adjustment
The operation procedure of this system can be adjusted quickly and efficiently according to the actual needs, and nodes can be set up based on the original process. Functional modules, such as manual transfer and SMS distribution, can be added in if required, and the remote deployment of each procedure can be completed smoothly.
Multi-Dimensional Statistical Report Capability
The system supports statistical analysis of disconnection reasons, dialogue data, consultation time, and the geographical distribution of users. Based on this analysis, administrators can promptly identify and address existing or newly emerging problems of the system, and medical staff can better understand and tally COVID-19-related data and then take more effective countermeasures.
Effective Protection of User Privacy and Information Security
The security module is located in the basic service layer of the system and includes the following functions: (1) metadata management: viewing and modifying information such as the type, description, security level, and operation authority of each field in the data table; (2) account management: managing user accounts in a unified way so that the granularity of permission control is as fine as possible, setting a validity period for permissions, and automatically revoking permissions when they expire; and (3) log management: recording and auditing the logs of account management operations, permission approvals, and data access operations. Based on these functions, the security module helps the system achieve effective protection of user privacy and information security.
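The account and log management functions described above (field-level permissions with a validity period, automatic revocation on expiry, and an audit trail) could be sketched as follows. This is a minimal in-memory illustration under assumed names; `PermissionStore`, its methods, and the example account and field are hypothetical and not part of the deployed system.

```python
from datetime import datetime, timedelta


class PermissionStore:
    """Field-level permissions with expiry and an audit log (illustrative only)."""

    def __init__(self):
        self._grants = {}      # (account, field_name) -> expiry time
        self.audit_log = []    # (timestamp, event, account, field_name)

    def grant(self, account, field_name, valid_for, now):
        self._grants[(account, field_name)] = now + valid_for
        self.audit_log.append((now, "grant", account, field_name))

    def is_allowed(self, account, field_name, now):
        key = (account, field_name)
        expiry = self._grants.get(key)
        if expiry is not None and now >= expiry:
            # The validity period has passed: revoke the permission automatically.
            del self._grants[key]
            self.audit_log.append((now, "expire", account, field_name))
            expiry = None
        allowed = expiry is not None
        event = "access_granted" if allowed else "access_denied"
        self.audit_log.append((now, event, account, field_name))
        return allowed


store = PermissionStore()
t0 = datetime(2020, 4, 1, 9, 0)
store.grant("analyst_01", "caller_phone_number", timedelta(hours=8), now=t0)
print(store.is_allowed("analyst_01", "caller_phone_number", now=t0 + timedelta(hours=1)))
print(store.is_allowed("analyst_01", "caller_phone_number", now=t0 + timedelta(hours=9)))
```

Granting per field rather than per table keeps the permission granularity small, and logging every check gives the audit function a complete record of data access attempts.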
Performance Evaluation
To assess the performance of the proposed intelligent response system, 100 h of telephone voice conversations were extracted from the 12,000 h corpus as the test dataset, and the speech recognition performance of different models was compared. A Gaussian mixture model-hidden Markov model (GMM-HMM) was selected as the baseline and was trained with the Kaldi toolkit. The Stacked LSTMs model, in which three conventional LSTM layers were stacked and each layer had 750 LSTM cells, was chosen as the state-of-the-art model. In the Stacked maxout LSTMs, three maxout LSTM layers were stacked, and each layer had the same configuration as in the Stacked LSTMs. These three models were evaluated on the same dataset.
Character Error Rate (CER) was employed to evaluate the different models. CER is a typical performance metric for Chinese speech recognition systems. By comparing the character sequence predicted by ASR with the correct reference character sequence, CER can be computed as:
$$\mathrm{CER} = \frac{S + D + I}{N} \times 100\%$$
where S, D, and I are the numbers of substitutions, deletions, and insertions, respectively, and N is the number of characters in the reference.
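In practice, the total number of substitutions, deletions, and insertions is usually obtained as the Levenshtein (edit) distance between the reference and the recognized character sequences. The sketch below shows one way to compute CER on that basis; the function names and the example sentences are illustrative and are not taken from the study's test set.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Minimum number of substitutions, deletions, and insertions (Levenshtein)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + substitution_cost,   # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER as a fraction: edit distance divided by the reference length in characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One character of a seven-character reference is dropped by the recognizer.
cer = character_error_rate("我有发烧和咳嗽", "我有发烧咳嗽")
print(f"CER = {cer:.2%}")
```

Because insertions are counted in the numerator but not in the denominator, CER can exceed 100% when the hypothesis is much longer than the reference.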
According to the test results listed in Table 2, the Stacked LSTMs model performed much better than the GMM-HMM model. Replacing the input activation units in the Stacked LSTM networks with maxout units achieved a further 2.15% relative CER reduction. It should be noted that our test data were extracted from the production environment, the same source as the training data, rather than from standard open datasets; the CER values presented are therefore relatively higher than those in existing studies (28–30). However, according to practical experience, a CER <15% is considered acceptable.
Application Effect
Since the intelligent response system was launched, the average number of user visits per day has been 69 (Figure 6). As of September 24, 2020, the system had received a total of 12,264 incoming calls, among which 11,788 COVID-19-related voice consultation services were provided to the public.
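The daily average can be cross-checked against the launch date and call totals reported in this paper. The calculation below assumes that the average is taken over all incoming calls and over every calendar day from launch to the cutoff date, inclusive; the text does not state which count or day convention was used.

```python
from datetime import date

LAUNCH_DATE = date(2020, 3, 31)    # system put into operation
CUTOFF_DATE = date(2020, 9, 24)    # end of the reporting period
INCOMING_CALLS = 12_264

# Assumption: both the launch day and the cutoff day count as service days.
days_in_service = (CUTOFF_DATE - LAUNCH_DATE).days + 1
average_calls_per_day = INCOMING_CALLS / days_in_service

print(days_in_service, round(average_calls_per_day))
```

Under these assumptions the result rounds to the reported figure of 69 calls per day.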
The geographical distribution of the callers was analyzed (Figure 7). In total, 85.2% (10,054/11,788) of the users were from Henan Province, followed by Beijing, which accounted for about 2.5% (303/11,788). Within Henan Province, users from Zhengzhou, the provincial capital, made up the largest share, about 50% (5,027/10,054) of the provincial total (Figure 8).
The proportion of different mobile operators among users was also investigated. There are three major mobile operators in China: China Mobile, China Unicom, and China Telecom. Of all the incoming calls, China Mobile had the largest proportion, accounting for 66% (7,775/11,788), while China Unicom and China Telecom accounted for 23% (2,663/11,788) and 11% (1,350/11,788), respectively (Figure 9).
The distribution of the times at which users accessed the system for COVID-19-related voice consultation is shown in Figure 10. There were two peak periods, one in the morning (08:00–10:00) and one in the afternoon (14:00–16:00). Specifically, the morning peak was at 09:00, and the afternoon peak was at 15:00.
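The operator shares follow directly from the call counts given in the text. The short calculation below reproduces them; the dictionary layout and variable names are only illustrative.

```python
# Call counts per mobile operator, as reported in the text.
calls_by_operator = {
    "China Mobile": 7_775,
    "China Unicom": 2_663,
    "China Telecom": 1_350,
}

total_calls = sum(calls_by_operator.values())
share_percent = {
    operator: round(100 * count / total_calls)
    for operator, count in calls_by_operator.items()
}

for operator, share in share_percent.items():
    print(f"{operator}: {share}%")
```

The three counts sum to 11,788, the number of consultation services, so the shares are computed over that total.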
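A distribution of this kind can be derived from call logs by counting calls per hour of the day. The sketch below assumes that each log entry carries a call start timestamp; the sample timestamps are made up for illustration and are not data from the study.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(call_times):
    """Count calls by the hour of day in which they started."""
    return Counter(t.hour for t in call_times)


def peak_hour(counts, start_hour, end_hour):
    """Hour with the most calls in the half-open range [start_hour, end_hour)."""
    return max(range(start_hour, end_hour), key=lambda hour: counts.get(hour, 0))


# Hypothetical call start times for a single day.
sample_calls = [
    datetime(2020, 5, 11, 8, 10), datetime(2020, 5, 11, 9, 5),
    datetime(2020, 5, 11, 9, 30), datetime(2020, 5, 11, 9, 45),
    datetime(2020, 5, 11, 14, 50), datetime(2020, 5, 11, 15, 0),
    datetime(2020, 5, 11, 15, 20), datetime(2020, 5, 11, 20, 0),
]

counts = hourly_counts(sample_calls)
print(peak_hour(counts, 0, 12), peak_hour(counts, 12, 24))
```

Splitting the day at noon is a simple way to report separate morning and afternoon peaks, as the figure description does.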
Discussion
Based on ASR, text-to-speech, and NLP technologies, the present study developed and launched an intelligent response system for COVID-19 voice consultation. The functions, performance, and application effect of the system were then investigated. To the best of our knowledge, this is the first comprehensive account of an intelligent response system for COVID-19 voice consultation based on provincial-level real-world practice in China. The findings of the current study may provide a helpful reference for further regional, national, and even international actions against the COVID-19 pandemic.
Coronavirus disease 2019 is a highly infectious disease. Since the officially reported emergence of COVID-19 in Wuhan, China, the epidemic has spread rapidly, with cases arising across China and many other countries. As of February 14, 2021, in mainland China, 89,772 confirmed COVID-19 cases had been reported across 31 provinces and municipalities, with 4,636 fatalities. The COVID-19 pandemic has sharply increased the demand for medical services, which has exceeded the maximum supply of healthcare facilities. In China, the socio-economic development gap between regions is large, and medical resources are very unevenly distributed. For example, although 42.65% of the population of China lives in rural areas, ~80% of the medical resources of China are concentrated in urban areas, two-thirds of which are in megacities (31, 32). This geographical inequity of access to medical resources has resulted in relatively poor healthcare services in remote regions. Therefore, during the COVID-19 epidemic, the traditional medical services of Chinese healthcare facilities can hardly meet the needs of the public, especially in remote mountainous or rural areas (33).
High levels of human-to-human transmission, asymptomatic infection, and a long incubation period are the main reasons for the large-scale epidemic of COVID-19 (34, 35). If asymptomatic patients frequently infect others, the effectiveness of prevention and control measures in response to COVID-19 could be greatly complicated or delayed. Thus, measures such as tracing and quarantining close contacts as early as possible, isolating confirmed COVID-19 cases in a timely manner, and preventing healthy people from being exposed to those who are infected with SARS-CoV-2 but still in the asymptomatic incubation period have become effective means in the fight against COVID-19 (20, 35–37). However, during the COVID-19 epidemic, large numbers of people visit hospitals for authoritative diagnosis and treatment, and such public gatherings may increase the chance that healthy people and medical staff encounter asymptomatic COVID-19 cases, which is not conducive to effective prevention and control of COVID-19. Given that there are currently no definite antiviral therapies for COVID-19, it is crucial to harness global efforts to take mitigation measures and emergency actions at every stage of the epidemic to contain the disease (5, 38, 39). To keep the public calm and dispel unnecessary fears, healthcare facilities across the globe are expected to advise the public on how to avoid COVID-19 infection, for example, by advising persons experiencing symptoms of fever, dry cough, fatigue, runny nose, and shortness of breath to seek medical attention promptly (13, 36, 38, 39).
The intelligent response system proposed in the present study can analyze the consultation content of users and the responses of the system. According to the specific situation of each user, the system then dynamically provides targeted treatment opinions and action suggestions (Table 1). This may play a helpful role in screening suspected COVID-19 cases and guiding people to stay at home for self-isolation or go to a fever clinic for further diagnosis and treatment. The healthcare consultation and medical services guidance of the system can help to screen and triage healthy people, suspected COVID-19 cases, and patients infected with SARS-CoV-2, and advise them to take preventive and control measures, such as self-isolation at home for observation, quarantine at a fever clinic, or diagnosis and treatment at COVID-19 designated hospitals, respectively. The system is not limited by time or space and can therefore enable remote triage and crowd diversion, reduce public gatherings, optimize the utilization of medical resources, improve the efficiency and coverage of healthcare services, and protect the public and medical staff from the risk of cross-infection with COVID-19.
Recurrent neural networks are networks with loops in them, allowing information to persist. LSTM is a special kind of RNN that can learn long-term dependencies. LSTM was first introduced by Hochreiter and Schmidhuber in 1997 and was then further developed and popularized by many researchers (40, 41). LSTMs work well on a large variety of problems; for many tasks, their performance is better than that of standard RNNs, and almost all exciting results based on RNNs have been achieved with LSTMs (41–43). In the present study, Stacked maxout LSTMs were employed to develop the intelligent response system, and their performance was better than that of the other models evaluated (with the lowest CER of 8.13%). Based on machine learning or rule-oriented dialog, intelligent conversational agents and virtual assistants, such as voice assistants and chatbots, enable communication with users via natural language, which may involve multimodal interaction support (e.g., speech, text, and sound). Voice assistants typically deliver their services through a voice interface, which requires voice commands to interact and complete COVID-19-related tasks (e.g., Amazon Alexa), while chatbots, such as Woebot, primarily engage in multi-turn dialogues through text (15, 18, 44). Compared with a human-based system, an intelligent response system based on machine learning can be deployed quickly and at a low hardware cost during routine operation. Owing to their accessibility, availability, and scalability for naturalistic communication with users, intelligent voice assistants have become increasingly popular in the battle against the COVID-19 pandemic around the world (14, 15, 45).
In the current study, the application of the intelligent response system can achieve effective triage of outpatients, reduce public gatherings, and help control the spread of the COVID-19 epidemic. Moreover, with the help of the system, users are able to visit the nearest hospital or fever clinic according to their specific symptoms and receive appropriate diagnosis and treatment in time. These findings are consistent with other similar studies (14, 46, 47). However, compared with the voice assistants discussed in previous studies (e.g., Google Assistant, Apple Siri, and Amazon Alexa), the intelligent response system for COVID-19 voice consultation launched in the current study has several different characteristics (14, 15, 45). First, the system only needs to be connected to the local telephone line to perform intelligent voice query services, and the response process of a newly deployed system can be adjusted according to local conditions. This is time-saving, fast, and convenient for both service providers and end-users, which is valuable during the COVID-19 pandemic. Second, users do not need to download or install any client software or applications to use the COVID-19-related voice consultation; the system is therefore relatively user-friendly and requires no technological proficiency from users (45). Third, the deployment and application of the system are not limited by IT infrastructure, Internet access or speed, costs of hardware or software components, or the locations of patients and physicians. In addition, operating the system does not require training of healthcare professionals, nurses, or users, online assistance for patients, or alterations to integrate it with the current healthcare system, which can save manpower and reduce unnecessary contact or exposure.
Our findings showed that 85.2% of the users were from Henan Province, followed by Beijing (2.5%), and that user access to the system peaked once in the morning and once in the afternoon. By analyzing the time and geographical distribution of users' calls, the system can summarize the characteristics and habits of users, which can help to optimize the allocation of medical resources and improve both the quality of healthcare services and user satisfaction. For instance, 08:00–10:00 and 14:00–16:00 are the peak periods for user visits (Figure 10). During these periods, healthcare facilities can meet the urgent consultation needs of users by temporarily increasing the service capacity and efficiency of the intelligent response system.
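The kind of time-distribution analysis described above can be sketched as follows. This is a minimal illustration and not the system's actual reporting module: it assumes call records are available as ISO 8601 timestamp strings, and the function names, the sample timestamps, and the two-hour window width are hypothetical choices made for the example.

```python
from collections import Counter
from datetime import datetime


def calls_per_hour(timestamps):
    """Count incoming calls for each hour of the day (0-23)."""
    counts = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


def busiest_window(hourly, width=2):
    """Return (start_hour, total_calls) for the busiest run of `width` consecutive hours.

    Ties are resolved in favor of the earliest window.
    """
    best_start, best_total = 0, -1
    for start in range(24 - width + 1):
        total = sum(hourly[start:start + width])
        if total > best_total:
            best_start, best_total = start, total
    return best_start, best_total


if __name__ == "__main__":
    # Hypothetical call log; real records would come from the system's call data.
    sample_calls = [
        "2020-02-01T08:15:00",
        "2020-02-01T09:05:00",
        "2020-02-01T09:40:00",
        "2020-02-01T14:30:00",
        "2020-02-01T21:00:00",
    ]
    hourly = calls_per_hour(sample_calls)
    start, total = busiest_window(hourly)
    print(f"Busiest window: {start:02d}:00-{start + 2:02d}:00 ({total} calls)")
```

Running the same window search separately on the morning and afternoon halves of the day would recover two peaks of the kind reported above, which could then inform when to provision additional call-handling capacity.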
Strengths and Limitations
As one of the latest applications of artificial intelligence and ICTs in the battle against the COVID-19 epidemic, the intelligent response system for COVID-19 voice consultation proposed in the current study has both social and economic advantages. First, the system can triage and divert relevant healthcare needs of the public, thereby alleviating the existing shortage of medical resources. Second, the voice consultation services can reduce the number of anxious people visiting hospitals, avoid frequent public gatherings in medical institutions, and decrease the exposure and infection risk of healthy individuals and medical staff. Third, with the system, healthcare facilities can save medical resources, optimize their allocation, and improve the efficiency, capacity, and quality of their services. Fourth, the system provides COVID-19-related consultation services by telephone and voice, which is appropriate and helpful for those who cannot read or use smartphones, and thus increases the coverage, acceptability, and adherence of the intelligent voice services. In addition, by advising healthy people to stay at home for isolation and observation, the system not only reduces unnecessary consumption of medical resources and the cost of healthcare services but also helps users avoid the expenses of hospital visits and the related transportation and accommodation. The prevention and control of the COVID-19 pandemic are multifaceted, and the intelligent response system developed and applied in this study can play a helpful role in reducing public gatherings, preserving medical resources, increasing the service capacity of healthcare facilities, and eventually curbing the spread of COVID-19.
Some limitations of the intelligent response system in practical applications also need to be acknowledged. First, the intelligent voice service functions of the system are relatively simple. Based on the consultation and feedback information of users, it can only provide suggestions on COVID-19 prevention and control measures, such as personal precautionary practices, self-monitoring of body temperature at home, fever clinic visits, early detection and quarantine, and treatment at COVID-19-designated hospitals, and so may not meet the additional healthcare needs of users. Second, the system was designed and constructed mainly according to the COVID-19 epidemic situation in Henan Province. Thus, the functions and contents of the intelligent voice service may not be fully suitable for the prevention and control of COVID-19 in other regions of China. This is reflected in the fact that the healthcare inquiries received by the system were mainly concentrated in Henan Province. This pattern suggests that, to increase the coverage and capacity of the system, its services need to be further enriched and improved in the future, or adapted and adjusted where necessary according to the actual situations and policies of the places where it is used. Lastly, based on epidemiological characteristics and personal symptoms, the system can complete the preliminary screening of users and then offer relevant suggestions and guidance. However, the system only provides information for users' reference; its COVID-19 prevention and control recommendations do not constitute medical advice, nor can they replace the diagnosis and treatment of a hospital doctor. Further research and practice are urgently needed to address these limitations.
Conclusions
Given the general susceptibility, high prevalence, and wide distribution of COVID-19 across the world, substantial and varied public health intervention and control measures involving the social, economic, and healthcare sectors, especially response arrangements based on the application and analysis of real-world data, continue to be warranted. Based on NLP and modern ICTs, the present study designed and deployed an intelligent response system for COVID-19 prevention and control. By identifying and analyzing the voice information of users, the system realized functions such as user-oriented intelligent inquiry, screening of suspected COVID-19 cases, and targeted recommendations of response measures, and achieved appreciable effects in practical application. To further improve the efficiency and quality of the prevention, diagnosis, and treatment of COVID-19, future improvement and application of the system should take the actual medical service activities of clinicians into consideration, for instance by integrating different function models into the system to promote its versatility and increase its capacity in the battle against COVID-19. Overall, in the context of the unprecedented COVID-19 pandemic, the provision of inquiry services through the intelligent response system in this study plays a valuable role in optimizing the allocation of healthcare resources, improving the efficiency of medical services, saving medical expenses, containing the pandemic, and protecting vulnerable groups.
Data Availability Statement
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Author Contributions
JS, JG, JZ, and YZ conceptualized, designed, initiated the study, reviewed, and revised the manuscript. JS, JG, and YZ drafted the initial manuscript. MY, YL, XH, FC, and QM were involved in the development of methodology and discussion of the article structure. All authors have read and approved the final manuscript as submitted.
Funding
This work was supported by the National Key R&D Program of China (Grant No. 2017YFC0909901), the Natural Science Foundation of Henan Province of China (202300410409), and the Joint Construction Project of the Henan Province Medical Science and Technology Research Plan (2018) (Grant No. 2018020120). The funders played no role in the design, development, or interpretation of the present work.
Author Disclaimer
The views expressed in the article are those of the authors and do not necessarily reflect the position of the funding bodies.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher's Note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Abbreviations
COVID-19, novel coronavirus disease 2019; ICT, information and communications technology; ASR, automatic speech recognition; WHO, World Health Organization; NLP, natural language processing; API, application programming interface; SMS, short messaging service; LSTM, long short-term memory; RNN, recurrent neural network; CER, character error rate.
References
1. Lai CC, Shih TP, Ko WC, Tang HJ, Hsueh PR. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and coronavirus disease-2019 (COVID-19): The epidemic and the challenges. Int J Antimicrob Agents. (2020) 55:105924. doi: 10.1016/j.ijantimicag.2020.105924
2. Hui DS, Azhar EI, Madani TA, Ntoumi F, Kock R, Dar O, et al. The continuing 2019-nCoV epidemic threat of novel coronaviruses to global health - The latest 2019 novel coronavirus outbreak in Wuhan, China. Int J Infect Dis. (2020) 91:264–6. doi: 10.1016/j.ijid.2020.01.009
3. Wang D, Hu B, Hu C, Zhu F, Liu X, Zhang J, et al. Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in Wuhan, China. JAMA. (2020) 323:1061–9. doi: 10.1001/jama.2020.1585
4. Du Toit A. Outbreak of a novel coronavirus. Nat Rev Microbiol. (2020) 18:123. doi: 10.1038/s41579-020-0332-0
5. World Health Organization. The World Health Organization Coronavirus Disease 2019 (COVID-19) Situation Report. World Health Organization (2020). Available online at: https://www.who.int/emergencies/diseases/novel-coronavirus-2019 (accessed April 10, 2020).
6. Li Q, Guan X, Wu P, Wang X, Zhou L, Tong Y, et al. Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia. N Engl J Med. (2020) 382:1199–207. doi: 10.1056/NEJMoa2001316
7. Kucharski AJ, Russell TW, Diamond C, Liu Y, Edmunds J, Funk S, et al. Early dynamics of transmission and control of COVID-19: a mathematical modelling study. Lancet Infect Dis. (2020) 20:553–8. doi: 10.1016/S1473-3099(20)30144-4
8. Zhou P, Huang Z, Xiao Y, Huang X, Fan XG. Protecting Chinese healthcare workers while combating the 2019 novel coronavirus. Infect Control Hosp Epidemiol. (2020) 41:745–6. doi: 10.1017/ice.2020.60
9. Analysys Qianfan. Observation of the Internet Medical Industry Under the COVID-19 Epidemic (2020) [In Chinese]. Analysys Qianfan (2020). Available online at: http://qianfan.analysyschina.com/ (accessed October 23, 2021).
10. Novara G, Checcucci E, Crestani A, Abrate A, Esperto F, Pavan N, et al. Telehealth in urology: a systematic review of the literature. how much can telemedicine be useful during and after the COVID-19 pandemic? Eur Urol. (2020) 78:786–811. doi: 10.1016/j.eururo.2020.06.025
11. Eichberg DG, Basil GW, Di L, Shah AH, Luther EM, Lu VM, et al. Telemedicine in neurosurgery: lessons learned from a systematic review of the literature for the COVID-19 era and beyond. Neurosurgery. (2020) 88:E1–12. doi: 10.1093/neuros/nyaa306
12. Chang D, Xu H, Rebaza A, Sharma L, Dela Cruz CS. Protecting health-care workers from subclinical coronavirus infection. Lancet Respir Med. (2020) 8:e13. doi: 10.1016/S2213-2600(20)30066-7
13. Nkengasong J. China's response to a novel coronavirus stands in stark contrast to the 2002 SARS outbreak response. Nat Med. (2020) 26:310–1. doi: 10.1038/s41591-020-0771-1
14. Bokolo AJ. Exploring the adoption of telemedicine and virtual software for care of outpatients during and after COVID-19 pandemic. Ir J Med Sci. (2021) 190:1–10. doi: 10.1007/s11845-020-02299-z
15. Sezgin E, Huang Y, Ramtekkar U, Lin S. Readiness for voice assistants to support healthcare delivery during a health crisis and pandemic. NPJ Digit Med. (2020) 3:122. doi: 10.1038/s41746-020-00332-0
16. Chauhan V, Galwankar S, Arquilla B, Garg M, Somma SD, El-Menyar A, et al. Novel coronavirus (COVID-19): leveraging telemedicine to optimize care while minimizing exposures and viral transmission. J Emerg Trauma Shock. (2020) 13:20–4. doi: 10.4103/JETS.JETS_32_20
17. Keesara S, Jonas A, Schulman K. COVID-19 and health care's digital revolution. N Engl J Med. (2020) 382:e82. doi: 10.1056/NEJMp2005835
18. Ting DSW, Carin L, Dzau V, Wong TY. Digital technology and COVID-19. Nat Med. (2020) 26:459–61. doi: 10.1038/s41591-020-0824-5
19. Smith AC, Thomas E, Snoswell CL, Haydon H, Mehrotra A, Clemensen J, et al. Telehealth for global emergencies: implications for coronavirus disease 2019 (COVID-19). J Telemed Telecare. (2020) 26:309–13. doi: 10.1177/1357633X20916567
20. Fisk M, Livingstone A, Pit SW. Telehealth in the context of COVID-19: changing perspectives in Australia, the United Kingdom, and the United States. J Med Internet Res. (2020) 22:e19264. doi: 10.2196/19264
21. Boehm K, Ziewers S, Brandt MP, Sparwasser P, Haack M, Willems F, et al. Telemedicine online visits in urology during the COVID-19 pandemic-potential, risk factors, and patients' perspective. Eur Urol. (2020) 78:16–20. doi: 10.1016/j.eururo.2020.04.055
22. Ministry of Education of the People's Republic of China. Overview of Chinese Language and Characters in 2021 [In Chinese]. Ministry of Education of the People's Republic of China (2021). Available online at: http://www.moe.gov.cn/jyb_sjzl/wenzi/202108/t20210827_554992.html (accessed October 23, 2021).
23. Mairittha T, Mairittha N, Inoue S. Automatic labeled dialogue generation for nursing record systems. J Pers Med. (2020) 10:62. doi: 10.3390/jpm10030062
24. Chen J, Wang D. Long short-term memory for speaker generalization in supervised speech separation. J Acoust Soc Am. (2017) 141:4705. doi: 10.1121/1.4986931
25. Lisbeth RM, Cristian-Aarón R-E, José-Luis SC, Jair C, Giner AH. A general perspective of Big Data: applications, tools, challenges and trends. J Supercomput. (2016) 72:3073–113. doi: 10.1007/s11227-015-1501-1
26. Li X, Wu X. Improving long short-term memory networks using maxout units for large vocabulary speech recognition. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). South Brisbane, QLD: IEEE. (2015). p. 4600–4. doi: 10.1109/ICASSP.2015.7178842
27. The National Health Commission. Notice on Doing a Good Job in Internet Diagnosis and Treatment Consultation Services in the Prevention and Control of the COVID-19 Epidemic [In Chinese]. The National Health Commission (2020). Available online at: http://www.nhc.gov.cn/cms-search/xxgk/getManuscriptXxgk.htm?id=ec5e345814e744398c2adef17b657fb8 (accessed October 22, 2021).
28. Xiong W, Droppo J, Huang X, Seide F, Seltzer M, Stolcke A, et al. Toward human parity in conversational speech recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing. (2017) 25:2410–23. doi: 10.1109/TASLP.2017.2756440
29. Shan C, Zhang J, Wang Y, Xie L. Attention-based end-to-end speech recognition on voice search. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). (2018). p. 4764–8. doi: 10.1109/ICASSP.2018.8462492
30. Xiong W, Wu L, Alleva F, Droppo J, Huang X, Stolcke A. The Microsoft 2017 conversational speech recognition system. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). (2018). p. 5934–8. doi: 10.1109/ICASSP.2018.8461870
31. National Bureau of Statistics. The 2018 China Statistical Yearbook. China Statistics Press (2018). Available online at: www.stats.gov.cn/tjsj/ndsj/ (accessed June 16, 2021).
32. Zhai Y, Gao J, Chen B, Shi J, Wang L, He X, et al. Design and application of a telemedicine system jointly driven by videoconferencing and data exchange: practical experience from Henan Province, China. Telemed J E-Health. (2020) 26:89–100. doi: 10.1089/tmj.2018.0240
33. Hong Z, Li N, Li D, Li J, Li B, Xiong W, et al. Telemedicine during the COVID-19 pandemic: experiences from western China. J Med Internet Res. (2020) 22:e19577. doi: 10.2196/19577
34. Guan WJ, Ni ZY, Hu Y, Liang WH, Ou CQ, He JX, et al. Clinical characteristics of coronavirus disease 2019 in China. N Engl J Med. (2020) 382:1708–20. doi: 10.1056/NEJMoa2002032
35. Chan JF, Yuan S, Kok KH, To KK, Chu H, Yang J, et al. A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster. Lancet. (2020) 395:514–23. doi: 10.1016/S0140-6736(20)30154-9
36. Ung COL. Community pharmacist in public health emergencies: quick to action against the coronavirus 2019-nCoV outbreak. Res Social Adm Pharm. (2020) 16:583–6. doi: 10.1016/j.sapharm.2020.02.003
37. Chang D, Lin M, Wei L, Xie L, Zhu G, Dela Cruz CS, et al. Epidemiologic and clinical characteristics of novel coronavirus infections involving 13 patients outside Wuhan, China. JAMA. (2020) 323:1092–3. doi: 10.1001/jama.2020.1623
38. Kofi Ayittey F, Dzuvor C, Kormla Ayittey M, Bennita Chiwero N, Habib A. Updates on Wuhan 2019 novel coronavirus epidemic. J Med Virol. (2020) 92:403–7. doi: 10.1002/jmv.25695
39. Doraiswamy S, Abraham A, Mamtani R, Cheema S. Use of telehealth during the COVID-19 pandemic: scoping review. J Med Internet Res. (2020) 22:e24087. doi: 10.2196/24087
40. Yu Y, Si X, Hu C, Zhang J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. (2019) 31:1235–70. doi: 10.1162/neco_a_01199
41. Greff K, Srivastava RK, Koutnik J, Steunebrink BR, Schmidhuber J. LSTM: a search space odyssey. IEEE Trans Neural Netw Learn Syst. (2017) 28:2222–32. doi: 10.1109/TNNLS.2016.2582924
42. Pathan RK, Biswas M, Khandaker MU. Time series prediction of COVID-19 by mutation rate analysis using recurrent neural network-based LSTM model. Chaos Solitons Fractals. (2020) 138:110018. doi: 10.1016/j.chaos.2020.110018
43. Miotto R, Wang F, Wang S, Jiang X, Dudley JT. Deep learning for healthcare: review, opportunities and challenges. Brief Bioinform. (2018) 19:1236–46. doi: 10.1093/bib/bbx044
44. Hoy MB. Alexa, Siri, Cortana, and more: an introduction to voice assistants. Med Ref Serv Q. (2018) 37:81–8. doi: 10.1080/02763869.2018.1404391
45. Bokolo Anthony J. Use of telemedicine and virtual care for remote treatment in response to COVID-19 pandemic. J Med Syst. (2020) 44:132. doi: 10.1007/s10916-020-01596-5
46. Shokri T, Lighthall JG. Telemedicine in the era of the COVID-19 pandemic: implications in facial plastic surgery. Facial Plast Surg Aesthet Med. (2020) 22:155–6. doi: 10.1089/fpsam.2020.0163
Keywords: intelligent response system, voice consultation, construction, application, COVID-19, China
Citation: Shi J, Gao J, Zhai Y, Ye M, Lu Y, He X, Cui F, Ma Q and Zhao J (2021) Construction and Application of an Intelligent Response System for COVID-19 Voice Consultation in China: A Retrospective Study. Front. Med. 8:781781. doi: 10.3389/fmed.2021.781781
Received: 23 September 2021; Accepted: 26 October 2021;
Published: 23 November 2021.
Edited by:
Zisis Kozlakidis, International Agency for Research on Cancer (IARC), France
Reviewed by:
Anthony Bokolo Jr., Norwegian University of Science and Technology, Norway
Guanghua Zhai, Nanjing Medical University, China
Copyright © 2021 Shi, Gao, Zhai, Ye, Lu, He, Cui, Ma and Zhao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Jie Zhao, zhaojie@zzu.edu.cn
†These authors have contributed equally to this work