- ¹Department of Computer Science, Faculty of Computing and Information Technology (FCIT), King Abdulaziz University, Jeddah, Saudi Arabia
- ²Department of Information Systems, Faculty of Computing and Information Technology (FCIT), King Abdulaziz University, Jeddah, Saudi Arabia
- ³High Performance Computing Center, King Abdulaziz University, Jeddah, Saudi Arabia
Smart cities are a relatively recent phenomenon that has rapidly grown in the last decade due to several political, economic, environmental, and technological factors. Data-driven artificial intelligence is becoming so fundamentally ingrained in these developments that smart cities have been called artificially intelligent cities and autonomous cities. The COVID-19 pandemic has increased the physical isolation of people and consequently escalated the pace of human migration to digital and virtual spaces. This paper investigates the use of AI in urban governance as to how AI could help governments learn about urban governance parameters on various subject matters for the governments to develop better governance instruments. To this end, we develop a case study on online learning in Saudi Arabia. We discover ten urban governance parameters using unsupervised machine learning and Twitter data in Arabic. We group these ten governance parameters into four governance macro-parameters namely Strategies and Success Factors, Economic Sustainability, Accountability, and Challenges. The case study shows that the use of data-driven AI can help the government autonomously learn about public feedback and reactions on government matters, the success or failure of government programs, the challenges people are facing in adapting to the government measures, new economic, social, and other opportunities arising out of the situation, and more. The study shows that the use of AI does not have to necessarily replace humans in urban governance, rather governments can use AI, under human supervision, to monitor, learn and improve decision-making processes using continuous feedback from the public and other stakeholders. Challenges are part of life and we believe that the challenges humanity is facing during the COVID-19 pandemic will create new economic, social, and other opportunities nationally and internationally.
Introduction
Smart cities and societies are characterized by our desire for innovation and advancements, and our aim to achieve environmental, social, and economic sustainability. The “smartness” in these environments is realized through engaging with these environments at fine-grained levels, in both space and time, analyzing these environments, and making informed decisions under sustainability constraints, all in a timely manner (Alotaibi et al., 2020; Yigitcanlar et al., 2020a). We engage with these environments through various physical and virtual sensors such as the Internet of Things (IoT), smartphones, and social media. These devices produce “big data” that is characterized by four Vs, Volume, Velocity, Variety, and Veracity (Chen et al., 2014). The data is analyzed using mathematical and computational methods that provide artificial intelligence (AI) for the city brains (Cugurullo, 2020; Yigitcanlar et al., 2020a, 2021a). Various physical and virtual devices provide the actuation ability for the smart environments using edge and fog computing (Janbi et al., 2020; Khan et al., 2020; Wang et al., 2020).
A key to this “smartness” is data-driven artificial intelligence, which is becoming so fundamentally ingrained in these developments that smart cities have been called artificially intelligent cities (Yigitcanlar et al., 2020a,b), autonomous cities (Cugurullo, 2020), etc. Most applications, systems, and platforms we use today, non-stop, are powered by AI: Alexa, Cortana, Siri, Google Maps, and the list goes on. AI is helping us increasingly in everything we do, or, we may say, it is making decisions for us in many spheres of our life. It tells us what toothbrush to buy. It defines health and fitness for us, and how to achieve those health and fitness goals. It selectively brings information to us about our beliefs, culture, and values. It is our teacher and spiritual leader; it tells us what is good knowledge and what to believe in through YouTube, news websites, and other information media. AI helps us even in finding the “true love” of our life and selecting the “significant other” (Janbi et al., 2020).
The COVID-19 pandemic has shown us that a tiny virus can gravely affect our lives, societies, economies, and planet (Alam et al., 2021). As of 24th July 2021, nearly 200 million people have been infected by the SARS-CoV-2 virus causing over 4.14 million deaths worldwide (Johns Hopkins University, 2020). Governments have used physical distancing as a measure to prevent the spread of infection among people. This has increased the physical isolation of people and consequently furthered the pace of human migration to digital and virtual spaces. One of the most important public issues and economic sectors affected due to this physical distancing is the education sector. In many parts of the world, countries have made transitions from in-class, in-person, or face-to-face learning to distance or online teaching and learning (we use the term online learning from hereon).
Social media has emerged over the last decade as the key “engagement” platform between people and service providers, be it businesses or governments (Kemp, 2019). Twitter is among the most popular social media platforms because it provides microblogs, called tweets, which are brief messages used for sharing product and service information, news, events, government decisions, personal status, and more (Lin, 2019). With 330 million active users every month and an average of 5.79 tweets every second (Lin, 2019), Twitter is a lifeblood of urban life and could provide key information about public and other matters. With Saudi Arabia among the top countries in the world in terms of Twitter users, Twitter data provides a fertile and important source of information for research and practice in smart societies.
This paper investigates the use of AI in urban governance as to how AI could help governments learn about urban governance parameters on various subject matters including the public reactions, concerns, and preferences, for the governments to develop better governance policies, strategies, programs, and procedures. To this end—considering the significance of the education sector in general, and during the COVID-19 pandemic, in particular—we develop a case study on online learning in Saudi Arabia. Specifically, using Twitter data in the Arabic language, we discover 10 urban governance parameters for online learning in Saudi Arabia as seen during the COVID-19 pandemic using unsupervised machine learning. These governance parameters represent the most discussed topics by the Saudi society, including their needs and concerns, the government's measures to combat the pandemic-induced challenges, as well as their efforts to make the education process successful.
We group these ten governance parameters into four governance macro-parameters namely Strategies and Success Factors, Economic Sustainability, Accountability, and Challenges. The first macro-parameter touches upon strategies and success factors including developing social sustainability and enhancing educational performance through supporting families, students, teachers, and other stakeholders. The second macro-parameter reports on new economic opportunities enabled by individuals and SMEs in providing services to fill the service gaps created through the rapid transition of in-class learning to online learning. The third macro-parameter touches on governance accountability and provides evidence for the government to have successfully managed the pandemic-induced emergency for transitioning the whole national education delivery system from in-class to online learning. The fourth macro-parameter touches upon the challenges that the people faced during the online learning period and the government's strategies to manage those challenges.
The process of discovering these parameters involves the automatic detection of 20 most discussed topics using LDA-based topic modeling of Twitter data. Subsequently, we merge some of these 20 topics into topic clusters based on their relationships to others and call them urban governance parameters. The terms topic cluster, cluster, and governance parameter are used interchangeably in the paper. These ten governance parameters are then grouped into four governance macro-parameters based on their relationship to the macro themes.
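The two-level grouping described above (topics into governance parameters, parameters into macro-parameters) can be sketched as a pair of lookup tables. This is only an illustration: the topic ids, the parameter names, and the assignments below are hypothetical placeholders, not the groupings reported in this paper, and the function `roll_up` is an assumed helper rather than part of our tool.

```python
from collections import defaultdict

# Hypothetical assignments: LDA topic id -> governance parameter (topic cluster).
# These ids and names are placeholders, not the paper's results.
TOPIC_TO_PARAMETER = {
    0: "parameter_A",
    1: "parameter_A",
    2: "parameter_B",
    3: "parameter_C",
}

# Hypothetical assignments: governance parameter -> macro-parameter.
PARAMETER_TO_MACRO = {
    "parameter_A": "Challenges",
    "parameter_B": "Challenges",
    "parameter_C": "Accountability",
}


def roll_up(topic_tweet_counts):
    """Aggregate per-topic tweet counts to parameters and macro-parameters."""
    per_parameter = defaultdict(int)
    per_macro = defaultdict(int)
    for topic, count in topic_tweet_counts.items():
        parameter = TOPIC_TO_PARAMETER[topic]
        per_parameter[parameter] += count
        per_macro[PARAMETER_TO_MACRO[parameter]] += count
    return dict(per_parameter), dict(per_macro)
```

Keeping the two mappings explicit makes the manual merging decisions easy to inspect and revise under human supervision.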
We have developed a software tool from scratch for this work. The tool implements a complete machine learning pipeline including data collection, pre-processing, clustering, validation, and visualization components. The dataset contains 128,805 tweets related to online learning in the Arabic language collected for the period beginning October 1st to December 6th, 2020. The translation of the keywords and other Arabic content (sample tweets, etc.) given in this paper is deliberately made contextual to allow English readers to understand the contextual use of the keywords.
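As an illustration of what a pre-processing component for Arabic tweets typically involves, the following minimal sketch removes URLs and user mentions, strips diacritics and the tatweel character, normalizes alef variants, and tokenizes on whitespace. These particular steps, the regular expressions, and the function name `normalize` are assumptions chosen for illustration; they are not a description of the exact steps implemented in our tool.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
# Arabic diacritics (U+064B..U+0652) and the tatweel elongation mark (U+0640).
DIACRITICS_RE = re.compile("[\u064B-\u0652\u0640]")
# Alef with madda, hamza above, or hamza below.
ALEF_VARIANTS_RE = re.compile("[\u0622\u0623\u0625]")
# Anything that is neither an Arabic letter (U+0621..U+064A) nor whitespace.
NON_ARABIC_RE = re.compile("[^\u0621-\u064A\\s]")


def normalize(text):
    """Return a list of normalized Arabic tokens from a raw tweet (illustrative)."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = DIACRITICS_RE.sub("", text)
    text = ALEF_VARIANTS_RE.sub("\u0627", text)  # map alef variants to bare alef
    text = NON_ARABIC_RE.sub(" ", text)          # drops '#', digits, Latin text
    return text.split()
```

Note that hashtags keep their word content here because only the `#` symbol is removed; whether to keep or drop hashtag text is a design choice that depends on the analysis.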
Many works have been reported on modeling COVID-19 related issues using natural language processing (NLP) methods. Some of these works are related to government measures and public matters, such as a study on detecting symptoms of stress due to COVID-19 in the US (Li et al., 2020) and a study on detecting government measures and public concerns in Saudi Arabia (Alomari et al., 2021a). Both these works used Twitter data; the first analyzes tweets in English and the second in Arabic. Few works have used NLP with a focus on education during COVID-19. For example, Duong et al. have investigated the differences in responses to the COVID-19 pandemic between university students and the general public in the US (Duong et al., 2020a). Their focus and findings are very different from our work. They focused on comparing responses of students and the general public on seven topics about discussions on politics, national and international news, racism against Chinese related to COVID-19, physical distancing, and closures of colleges. In the Arabic language, we did not find any work that has focused on online learning. The only work remotely related to our paper is a sentiment analysis study of closures of seven types of public and private facilities to manage the pandemic in Saudi Arabia (Alhajji et al., 2020). One of these seven types of facilities was schools and universities. The focus of that paper was to find sentiment polarities, that is, whether people were happy or unhappy about the closure of these seven types of facilities. Further description of these works and other related works is provided in Section Literature Review. In short, the study developed in this paper differs from the existing works in several ways including the context, focus, time period, geography, data, language, design, and findings of the study.
Moreover, the particular perspective on the use of AI in urban governance that we have presented in this paper would be a trendsetter for many more works to come in the future and significantly impact research and practice in this field.
The case study in this paper shows that the use of AI does not have to necessarily replace humans in urban governance, rather governments can use AI, under human supervision, to monitor, learn and improve decision-making processes using continuous feedback from the public and other stakeholders.
The rest of the paper is organized as follows. Section Literature Review reviews the related works. Section Methodology and Design describes the methodology and design. Section Results and Findings explains the results. Section Discussion provides a discussion of the results including potential utilization of the research. We conclude in the final section and present directions for future work.
Literature Review
This section provides a literature review on the topics related to this research study. Section Twitter Data Analytics (General Literature) reviews general literature (not specific to COVID-19) on analytics of Twitter data in the Arabic language (because Arabic is the focus of this paper). In Section Twitter Data Analytics and COVID-19, we review the works on Twitter data analytics that focus on COVID-19 (because COVID-19 is a focus of our paper). Section Twitter Data Analytics in Teaching and Learning reviews the works on Twitter data analytics that focus on teaching and learning (because this is the focus of our paper). Section Twitter Data Analytics Using Topic Modeling reviews the literature related to topic modeling of Twitter data (because this is the method we have used in this paper). Section AI and Online Learning presents the studies related to the use of AI in online learning. Section Research Gap highlights the research gap.
Twitter Data Analytics (General Literature)
Sentiment analysis of text in Arabic is a research area that has recently gained increasing attention from the research community (Alotaibi et al., 2020; Alomari et al., 2021a). Sentiment analysis is the process of extracting phrases or words that indicate polarity toward a specific topic. It reveals whether a phrase is positive, negative, or neutral. The objective of sentiment analysis is to discover the polarity in people's reviews and comments. It is important because it yields valuable information about a particular idea, product, or topic (Oueslati et al., 2020). It is useful for organizations, institutions, and companies, which can use it to evaluate the services or products they provide (Duwairi and Qarqaz, 2016). It can also be used by governments to understand people's opinions about certain matters including government policies and procedures, or to detect events such as in transportation (Alomari et al., 2020a).
A number of works have reported Twitter data analytics using machine learning methods. Duwairi and Qarqaz (2016) developed a framework for detecting sentiments and the polarity of users' Arabic reviews on Twitter and Facebook. The researchers focused on preparing a suitable dataset for sentiment analysis by collecting and labeling the data. The resulting dataset consists of 2,591 tweets/comments. Crowdsourcing was employed for labeling and annotating the dataset. Furthermore, they investigated how various term weighting techniques affect model accuracy. They also investigated the accuracy of various classifiers, including support vector machines (SVM), k-nearest neighbors (KNN), and Naïve Bayes (NB), for textual analysis and for detecting the polarity of the comments. The authors conducted the study using the RapidMiner tool. The results indicate that NB and SVM performed better than KNN. The results also showed that the choice among the weighting schemes BM, TF, and TFIDF affects the classification results, and that a given classifier produces different results with different weighting schemes. For example, the best results of KNN were obtained when it was used with BM, while SVM provided its best results when used with the TFIDF scheme. For NB, the best results were achieved when it was used with TF. The best accuracy they achieved over all the algorithms was 69.97%, which was for the NB algorithm combined with TF.
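To make the difference between term weighting schemes concrete, the following minimal sketch computes TF and TF-IDF weights for tokenized documents. It assumes relative term frequency and an unsmoothed logarithmic inverse document frequency; tools such as RapidMiner may use different variants, so the numbers are illustrative only, and the function names are hypothetical.

```python
import math
from collections import Counter


def term_frequency(doc):
    """Relative frequency of each term within one tokenized document."""
    counts = Counter(doc)
    total = len(doc)
    return {term: count / total for term, count in counts.items()}


def inverse_document_frequency(docs):
    """Unsmoothed IDF: log(N / number of documents containing the term)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tf_idf(docs):
    """TF-IDF weight of every term in every document."""
    idf = inverse_document_frequency(docs)
    return [
        {term: tf * idf[term] for term, tf in term_frequency(doc).items()}
        for doc in docs
    ]
```

Under this definition, a term that occurs in every document receives a TF-IDF weight of zero, whereas plain TF keeps it; this is one reason the same classifier can behave differently under different weighting schemes.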
Altawaier and Tiun (2016) studied the performance of machine learning-based sentiment analysis of Twitter data in Arabic using Decision Trees (DT), NB, and SVM. They used Arabic stemming and simple features including TF-IDF (Term Frequency-Inverse Document Frequency). The experiments were conducted on a Modern Arabic Corpus dataset obtained from the UCI repository containing 2,000 annotated tweets (50% positive and 50% negative) on politics and arts. The experiments were carried out using the Weka machine learning tool. The classification performance was evaluated using three metrics: F-measure, recall, and precision. Their experiments suggested that DT (with an F-measure of 78%) provided better results than the other techniques. They therefore concluded that, for sentiment analysis of Arabic texts with two classes of opinions, DT performs better than SVM and NB. Baker et al. (2020) proposed an approach for detecting influenza from Arabic tweets in Arab countries using machine learning techniques. The dataset was prepared in several steps including collecting Arabic tweets related to influenza, labeling, and finally filtering. For analysis purposes, various classifiers were used including SVM, NB, KNN, and DT. The experimental results reveal that the highest accuracy obtained was 89.06%, for the Naïve Bayes classifier, followed by 86.43% for K-Nearest Neighbor.
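For readers unfamiliar with the three evaluation metrics mentioned above, the following minimal sketch computes precision, recall, and the F-measure (taken here to be F1, the harmonic mean of precision and recall) for a two-class problem. The label strings and the function name are illustrative assumptions.

```python
def precision_recall_f1(y_true, y_pred, positive="positive"):
    """Precision, recall, and F1 for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```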
Among the few works that have used deep learning for Arabic sentiment analysis is the study by Gwad et al. (2020) on classifying sentiment reviews in Arabic collected from Twitter using LSTM, a type of deep learning Recurrent Neural Network (RNN). The experimental results showed that LSTM provides higher accuracy and requires less computation and a shorter running time compared to traditional recognition techniques. They obtained 89.8% accuracy on average.
The literature on corpus-based sentiment analysis of Arabic tweets includes a method that combined stemming, TF-IDF, DMNB (Discriminative Multinomial Naïve Bayes), and a 4-gram tokenizer (Alsalman, 2020). The experiments were conducted on a Twitter corpus dataset with 2,000 Arabic tweets already labeled as negative or positive. WEKA machine learning software was used as the analysis tool. Their approach obtained 0.3% higher accuracy compared to other works.
Some works have used the SAP HANA in-memory platform for Twitter data analytics. In Alsulami and Mehmood (2018), the authors proposed a model for sentiment analysis of Arabic tweets. The research goal was to detect the opinions of users about the Ministry of Education services in Saudi Arabia. The study focused on finding what the users think about the new university system in Saudi Arabia. They used the in-memory platform SAP HANA for processing and analyzing Arabic Tweets. Alomari et al. (2020b) proposed an approach for classifying the feelings and emotions of car drivers using Arabic sentiment analysis. The proposed sentiment analysis mechanism was based on a lexicon approach. The study also provides analysis for Saudi dialect comments on Twitter about the conditions of road traffic. The dataset was collected for the Jeddah and Makkah cities, two large cities, during Ramadan. The data storage and analysis were performed using the SAP HANA platform. The obtained results were validated using data from news media.
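A lexicon-based approach of the kind mentioned above can be sketched as follows: each token is looked up in a polarity lexicon and the scores are summed. The tiny lexicon below is hypothetical and uses English words for readability; it is not the lexicon used in the cited study, and real lexicons for Saudi dialect are far larger and typically handle negation and intensifiers.

```python
# Hypothetical polarity lexicon (English placeholders for illustration only).
LEXICON = {"good": 1, "excellent": 2, "bad": -1, "terrible": -2}


def polarity(tokens, lexicon=LEXICON):
    """Classify a tokenized text by the sum of its tokens' lexicon scores."""
    score = sum(lexicon.get(token, 0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```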
A recent survey on sentiment analysis of Twitter data in Arabic can be found in Oueslati et al. (2020). It provides a review of sentiment analysis research focusing on approaches, resources, and open challenges. The authors divide the approaches for classifying the texts into corpus-based, lexicon-based, and hybrid approaches. The hybrid and corpus-based approaches have primarily used machine learning. NB and SVM are the most used algorithms for Arabic sentiment analysis. Deep learning approaches are scarcely explored. The authors state that the performance of sentiment analysis depends on the quality of data, and that the data available for Arabic sentiment analysis is of limited quality because Arabic dialectal content is difficult to process. For data in other languages, however, findings are promising despite the challenges.
Some other works related to the analysis of Twitter data in the Arabic language are reviewed in Alomari et al. (2020a) and Alotaibi et al. (2020).
Twitter Data Analytics and COVID-19
This section reviews the works on Twitter data analytics that focus on COVID-19. First, we discuss works that use Twitter posts in any language other than Arabic. Subsequently, in Section Twitter Data Analytics and COVID-19 (The Arabic Language), we review the works on Twitter data analytics that focus on COVID-19 in the Arabic language.
Twitter Data Analytics and COVID-19 (Any Language)
Recently, COVID-19 has become one of the hot research fields. Twitter is one of the widely available and most useful resources. With the huge increase in social media usage such as Twitter during the pandemic, numerous opportunities are provided for the research community. Different fields are utilizing the data obtained from Twitter including health, education, and policy. We provide in this subsection a review of the most notable works related to COVID-19 that use Twitter data.
A number of works have looked into thematic analysis of Twitter data. For example, Samuel et al. (2020a) investigated the sentiment associated with the COVID-19 pandemic from Twitter data. Coronavirus-related Twitter posts were analyzed using the sentiment analysis packages of the R statistical software and the NB and LR (Logistic Regression) algorithms. A comparison between the algorithms was conducted with varying lengths of tweets. The experimental results demonstrated that both algorithms have lower accuracy on longer tweets than on shorter tweets. NB and LR provided 91% and 74% accuracy, respectively, on shorter tweets. They extended their earlier work in Samuel et al. (2020b) and analyzed the sentiment of Twitter posts to find the dominant trends related to the debate on reopening the US economy during COVID-19. They implemented an approach for sentiment polarity that could handle data from different social media sources and perform analysis beyond COVID-19. It was designed based on public sentiment scenarios (PSS). The study employed the Twitter API and well-known R packages such as rTweet and Syuzhet for classifying the tweets. The results revealed an overall positive sentiment trend, with fear and sadness at lower levels than trust and anticipation.
Duong et al. (2020a,b) investigated the implications of COVID-19 on society by sentiment analysis of Twitter data. They utilized topic modeling methods to find the topical patterns on Twitter. Latent Dirichlet Allocation (LDA) was used for topical analysis of COVID-19 tweets. A RoBERTa model was used for topic-based sentiment analysis, with the transformers library by Hugging Face for training and evaluation, and the SemEval-2017 Task 4A dataset for evaluation. They found seven topics related to the concerns of college students and general society about the pandemic. These topics related to discussions on politics, national and international news, racism against Chinese related to COVID-19, physical distancing, and closures of colleges. Their results revealed that the response of college students to COVID-19 was more negative than that of the general population. Abdulaziz et al. (2021) provided a model for the analysis of COVID-19 tweets that extracts the most popular topics related to COVID-19 and then analyzes the sentiment of the extracted topics. LDA was utilized for finding the important topics and lexicon-based approaches were used for sentiment analysis.
Abd-Alrazaq et al. (2020) reported sentiment analysis of tweets to identify major topics related to coronavirus disease (COVID-19). The authors used Twitter's search API, a PostgreSQL database, and Python libraries including Tweepy. The analysis was performed using LDA and word frequencies (unigrams and bigrams). The interaction for each topic was obtained by extracting the average number of retweets, likes, and followers. As a result of the analysis, 12 topics were identified and grouped into four classes: the origin of the virus; its sources; its effect on people, countries, and the economy; and the ways to reduce and control the spread of the infection. The results also revealed that the overall sentiment was positive: it was positive for 10 topics and negative for the remaining 2 (deaths and racism).
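Counting unigrams and bigrams, as used in the study above alongside LDA, can be sketched in a few lines. The helper names below are hypothetical and the code is a generic illustration, not the cited authors' implementation.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-token windows of a tokenized text, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(docs, n):
    """Frequency of each n-gram over a collection of tokenized documents."""
    counts = Counter()
    for tokens in docs:
        counts.update(ngrams(tokens, n))
    return counts
```

Calling `ngram_counts(docs, 1)` gives unigram frequencies and `ngram_counts(docs, 2)` gives bigram frequencies; `Counter.most_common` can then be used to list the most frequent ones.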
Other studies that reported thematic analysis of COVID-19 related tweets include (Das and Dutta, 2020; Jimenez-Sotomayor et al., 2020). Some other COVID-19 related works on social media data analysis including Twitter were reviewed in Alomari et al. (2021a).
Twitter Data Analytics and COVID-19 (The Arabic Language)
The existing research studies related to COVID-19 based on the data analytics using social media in the Arabic language are limited. Most of the current research studies using social media in the Arabic language have used Twitter as a data source since it is one of the most active social media and highly rich with COVID-19 related data.
Several studies have focused on studying various topics related to COVID-19. Essam and Abdo (2021) investigated the reactions of Arab communities on Twitter to the COVID-19 pandemic. The authors studied the linguistic expressions used to reveal feelings about the pandemic using Twitter data. The objectives were to perform thematic analysis to find the dominant COVID-19 related topics, to explore the psychological effects of these topics, and to find how these implications and the causes of the pandemic are related. The analysis was lexicon-based, and it was conducted using corpus tools, the R language's stylo package, and LIWC. The results showed that 30.6% of the community discussions on Twitter were about news on the epidemic in general and the number of people who got infected, 6.8% were about the signs and symptoms of Coronavirus, and 6% were about the economy. The experimental results also show the causes of the pandemic from the perspective of Arab tweeters. Alsudias and Rayson (2020) reported an analysis of COVID-19 related Arabic tweets with a focus on rumor detection. The study aimed to provide analysis results for Arab World governments as well as public health organizations. Multiple machine learning algorithms, including the k-means algorithm, Support Vector Classification, NB, and LR, were utilized in the study to identify the topics of the discussions related to COVID-19 on Twitter, to perform rumor detection, and to predict the class of the source of tweets about COVID-19. They found that rumors were correctly identified with an accuracy of 84% by Logistic Regression, when used with a count vector, and by Support Vector Classification, when used with TF-IDF. The results also indicate that around 60% of the tweets that were detected as containing incorrect information were written in the language style of academics and health specialists. Alanazi et al. (2020) presented a study on Arabic tweets to recognize the most common symptoms associated with COVID-19 cases and the order of their appearance. The Twitter data used in the study was for the period March to May 2020. The experimental results demonstrate that 66% of the users experienced some symptoms and provided the sequential order of their appearance. The results showed that fever, headache, and anosmia were the top three symptoms experienced by patients.
Some researchers have also looked at government measures and public concerns related to COVID-19 using Twitter data in Arabic. Alhajji et al. (2020) presented a sentiment analysis study of closures of seven types of public and private facilities to contain the virus in Saudi Arabia. These seven types of facilities included the Grand Mosque, universities, schools, and shopping malls. They extracted tweets related to the sentiments over the closure of these seven types of public and private facilities from various hashtags. They used the Naïve Bayes machine learning model. Their results show overall positivity regarding the preventive measures in the Saudi community, except for the closure of shopping malls. Alomari et al. (2021a) presented a study reporting an analysis of Arabic Twitter data to detect concerns of the public and the pandemic measures of the government of Saudi Arabia during the COVID-19 pandemic. They collected a dataset of Arabic tweets from Saudi Arabia for the period February to June 2020. They developed a software tool consisting of the LDA topic modeling algorithm, visualization and spatio-temporal techniques, and other natural language processing (NLP) techniques. Various technologies were utilized such as Apache Spark, Spark ML, Spark SQL, Parquet, and MongoDB. The paper provides data analysis from the information-structural perspective, temporal perspective, and spatio-temporal perspective. The experimental results revealed 15 government pandemic measures and public concerns, which were grouped into six macro-concerns including daily livelihood, and economic, social, and environmental sustainability.
A review of other COVID-19 related studies that use Twitter data in the Arabic language is given in Alomari et al. (2021a).
Twitter Data Analytics in Teaching and Learning
We review here the works related to online learning where social media is used for machine learning-based data analytics. First, we review works that are not related to COVID-19 and subsequently, in Section Twitter Data Analytics in Teaching and Learning (COVID-19), we will review works on online learning during COVID-19.
Verma et al. (2016) provided a survey on analyzing students' learning by utilizing social media data. They stated that analyzing social media data that is related to students' learning experience could provide transparent results (such as emotions and opinions) which could be valuable for organizations and educational administration. According to their survey, Naive Bayes algorithms need less time for computation and a smaller amount of pre-defined data (Clark et al., 2008) and they provide the best performance with most of the feature extraction techniques. It also indicates that the Support Vector Machine algorithm has some limitations in terms of complexity, speed, and size (Go et al., 2009). Lande and Dalal (2016) provided an analysis for Twitter data under certain hashtags related to engineering students. The aim was to explore the issues that affect the learning process of students and help improve the educational system. Naïve Bayes Multi-label classifier was used in the study. The experimental results classified the tweets into six main categories including the heavy load of study, diversity issues, lack of social engagement, sleep problems, negative emotions, and others.
Other studies on utilizing social media applications such as Twitter and Facebook for improving the learning process include (Kechaou et al., 2011; Chen et al., 2014; Verma et al., 2016).
Twitter Data Analytics in Teaching and Learning (COVID-19)
Researchers have focused on various issues related to education during COVID-19. Some have focused on studying students and their attitudes while others have focused on teachers and other stakeholders. Adnan and Anwar (2020) developed a study to find the attitudes of higher education students about online education in Pakistan during the pandemic. They used an online survey for collecting data from 126 higher education students (two-thirds of them female). The results demonstrated that online learning cannot provide the required results in Pakistan due to various issues including technical and monetary issues of Internet access. The study also highlighted some challenges and obstacles related to distance learning such as the absence of face-to-face interaction and lack of classroom socialization. Bestiantono et al. (2020) presented an exploratory study to discover high school students' perspectives and viewpoints toward online learning in Indonesia during the pandemic. The data was collected through online surveys of 180 Indonesian secondary school students (90 female, 90 male). The results showed that online learning cannot provide the expected outcomes because a large number of students cannot get Internet access due to financial issues. The results also highlighted various issues related to online learning including the absence of eye contact with the instructor and of the usual classroom socialization.
Rasmitadila et al. (2020) presented a case study about the perceptions of online learning during COVID-19 from primary school teachers' perspectives. The data collected from surveys and interviews. They used inductive and thematic analytics. The results demonstrated four major themes, specifically, instructional strategies, teachers' motivation, challenges, and support. They found that the success of online learning was determined by the collaboration and support of teachers, schools, parents, the government, and the community. Harron and Liu (2020) studied Twitter posts of K-12 teachers in the English language during the pandemic of COVID-19. They analyzed the posts related to online learning. The results show that most of the teacher's tweets were on the actions of political leaders, transition to online learning, and sharing free advice and resources. Carpenter et al. (2020) provided an analysis study for education-related hashtags on Twitter. The study included 2.6 million tweets in English posted on 16 different hashtags during a 13-month period. They explored the trends among the 16 hashtags, the similarities, and the differences. They found significant differences in the traffic related to the hashtags.
Twitter Data Analytics Using Topic Modeling
We review in this section the studies related to Twitter data analytics using topic modeling with a focus on the Latent Dirichlet Allocation (LDA) topic modeling algorithm.
Zahra (2020) developed a topic modeling algorithm for focused analysis to obtain semantic relations on a targeted aspect from COVID-19 related Twitter data. Data collection was performed in two phases with the use of a list of frequently occurring stop words and geo-labels. Text pre-processing, as well as lemmatization, was performed using hand-crafted rules due to the lack of text processing resources for Levantine Arabic. The model learns topics from Twitter data expressed in various dialects of Levantine Arabic. The experimental results demonstrated the model's capability of capturing topics within the required scope. The results were compared to a baseline model as well as another targeted topic model that is designed to serve the same purpose. Korshunova et al. (2019) proposed a supervised learning model for classification or regression. The proposed model can also provide unsupervised learning by exploiting the structure in the data. It can be used with groups of images as well as with arbitrary text embeddings. Alomari et al. (2021a) used the LDA topic modeling algorithm for analyzing Arabic tweets related to COVID-19 with the aim of detecting public concerns and government measures.
Elaraby and Abdul-Mageed (2018) developed various classifiers to identify the dialect using a dataset of online comments in Arabic. The classifiers are based on neural networks including CNNs, LSTM, RNN, and gated recurrent unit (GRU). The results indicated that attention-based BiLSTMs achieved the best accuracy on dialect identification when used with a large dialect-specific model for word embeddings. Magatti et al. (2009) proposed an algorithm to provide automatic labeling for topics based on a hierarchy obtained through a tree. The best label is selected by utilizing a set of labeling rules. The labeling rules are designed to find the labels that best agree with both the topic and the hierarchy. The results indicated the effectiveness of the proposed algorithm in mapping the extracted topics to topic labels associated with a topic hierarchy. Onan et al. (2016) presented an empirical study on the performance of the Latent Dirichlet Allocation (LDA) algorithm with sentiment classification. They examined the performance of various classification algorithms when an LDA-based representation is used. In their analysis, SVM, KNN, and NB were used as the weak learners, while AdaBoost, Voting, Stacking, and Bagging were used as the ensemble learning methods. A review of other studies related to topic modeling using LDA can be found in Alomari et al. (2021a).
AI and Online Learning
Artificial intelligence technologies are important tools in the education industry to improve and reform educational systems. They help in understanding the stakeholders' needs and issues. For example, they help educators understand the details of students' needs, which in turn helps them make targeted adjustments to the content, cater to learners' individual learning styles and content needs, and teach students based on their aptitude (Hou et al., 2021). AI has accelerated distance education's modernization due to its rapid development (Gao et al., 2021). We review in this section the studies related to the use of AI in online learning. The works on social media analysis using AI for online learning are reviewed in Section Twitter Data Analytics in Teaching and Learning. Online learning is an active area of research. Utilizing artificial intelligence approaches in the learning domain has also gained researchers' attention (Gao et al., 2021).
Some works have looked into recommendation systems in online learning using machine and deep learning methods. For example, Ai et al. (2019) introduced a system for online learning that provides personalized recommendations to students on the specific exercises they should attempt. The system aims to improve the efficiency of the online learning experience. An online self-directed learning system (IPS) was used to analyze 5th grade students' interactions in the math curriculum. The proposed recommendation algorithm utilizes deep reinforcement learning techniques. The experimental findings indicate that the proposed system achieves better performance than existing policies in terms of optimizing the knowledge level of students. A recent review of recommendation systems in online learning and the utilization of AI technologies can be found in Khanal et al. (2019).
Other works have focused on understanding users' experiences about online learning. Ai and Laffey (2007) presented an experiment with pattern classification as a means of predicting the performance of students in the WebCT learning system. They aim to investigate whether data can be used to predict students' achievements in online learning, and examine how Web mining can be applied to online learning. The paper finds that Web mining is a useful approach for building knowledge about online learning and that it can improve learning performance in the long run. Mehmood et al. (2017) proposed a teaching and learning big data framework to improve lifelong learning in smart societies. They used eleven widely used datasets to evaluate various functions of the proposed framework. The ML techniques they used included DLANNs, Random Forests, and Naive Bayes classifier.
Research Gap
The literature review presented in this paper has clearly established the research gap and novelty of our work. The study developed in this paper differs from the existing works in several ways including the context, focus, time period, geography, data, language, design, and findings of the study. Moreover, the particular perspective on the use of AI in urban governance that we have presented in this paper would be a trendsetter for many more works to come in the future and significantly impact research and practice in this field.
Methodology and Design
The architecture of the proposed system is shown in Figure 1. It consists of five phases including data collection, pre-processing, governance parameters discovery, results validation, and visualization. The dataset has been collected using Twitter REST API.
Research Design and System Overview
We built the system using Twitter data analytics in the Arabic language. The architecture of the proposed system consists of five phases: data collection, pre-processing, parameters discovery, validation, and visualization. First, we collected the Arabic tweets related to online learning using the Twitter REST API and the Tweepy library. The tweets extracted from the Twitter API were in the JSON (JavaScript Object Notation) format. Subsequently, we converted and saved the collected JSON file to the XLSX format using a parser algorithm that we implemented in Python. In the pre-processing phase, the collected data are cleaned and preprocessed to make them ready for the analysis stage. For stop-word removal, we used the Natural Language Toolkit (NLTK) library with an additional list of dialectical Arabic stop-words. Then, we used the scikit-learn Python library to build the discovery module using Latent Dirichlet Allocation (LDA). Next, we provide visualizations of the discovered parameters using an inter-topic distance map and term frequency diagrams. We have used the Python pyLDAvis library (Sievert and Shirley, 2014; Mabey, 2015) for computing and plotting the map and the term frequency diagrams. The inter-topic distances and the scaling for the set of inter-topic distances are computed using the default options, Jensen-Shannon divergence and Principal Components, respectively. The term frequency diagrams show the corpus-wide and topic-specific frequencies represented by the widths of the blue and maroon bars, respectively. Finally, we validated the results using both internal and external validation. Internal validation was done by finding tweets that support the discovered urban governance parameters and findings.
For example, in Section Results and Findings, the New Economic Opportunities parameter was validated internally by many tweets, found by our tool, providing information about various educational services offered by individuals and small businesses for students and other stakeholders. For external validation, the parameters are validated by finding external online sources such as online newspapers and reports. For instance, the Evaluation governance parameter was validated by two different studies conducted by six international organizations including Harvard University, UNESCO, and IITE (see Section Results and Findings).
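The inter-topic distance measure named in this overview, Jensen-Shannon divergence, can be illustrated with a minimal sketch. It is not the internal code of pyLDAvis; the three topic-term distributions below are hypothetical, and base-2 logarithms are an assumption made so that the value falls between 0 and 1.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i == 0 add 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    # m is the pointwise average of the two distributions.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical topic-term distributions over a four-word vocabulary.
topic_a = [0.7, 0.1, 0.1, 0.1]
topic_b = [0.1, 0.1, 0.1, 0.7]
topic_c = [0.7, 0.1, 0.1, 0.1]

distance_ab = jensen_shannon_divergence(topic_a, topic_b)  # dissimilar topics
distance_ac = jensen_shannon_divergence(topic_a, topic_c)  # identical topics
```

Topics with identical term distributions have zero divergence, and the measure is symmetric, which is what makes it usable as a distance for plotting topics on a two-dimensional map.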
The Dataset
We collected the Arabic tweets related to online learning using the Twitter REST API and the Tweepy library, which is a Python library that provides easy access to the Twitter API (Roesslein, 2022b). The data was extracted during the period from October 1st to December 6th, 2020. The total number of collected tweets is around 128,805.
First, we downloaded tweets using a set of predefined parameters such as the “Arabic” language, the “extended” mode for extracting the entire text of the tweet without truncation, and a set of search hashtags related to online learning in Saudi Arabia. For example, we used the hashtags التعليم_حضوري# (in-class education or face-to-face education) and التعليم_عن_بعد# (online education). The hashtags used for our data collection are listed in Table 1. For simplicity, we refer to Twitter hashtags as H. The H1 hashtag was used to discuss in-class learning. The H2 and H3 hashtags were used for discussing online learning. The H4 and H5 hashtags were used by people who were requesting in-class learning and were not supportive of online learning.
The tweets extracted from the Twitter API were in the JSON (JavaScript Object Notation) format, which is the default Twitter API response format. Each tweet is retrieved with several attributes (Twitter, 2022) such as “id,” which is a unique identifier for each tweet, “created_at,” which represents the time of posting the tweet, “text,” which provides the actual tweet message, and other attributes related to the geographical location including “place,” “geo,” and “coordinates.” Furthermore, the “entities” attribute encapsulates a number of attributes, e.g., “media,” “links,” “hashtags,” and “user_mentions.” The user_mentions attribute refers to other tweeters mentioned in the tweet's text. An example of a tweet object is shown in Figure 2. The “full_text” field contains the complete untruncated text of the tweet [the explanation of the “full_text” field can be found in Roesslein (2022a)]. Note that the full_text attribute is returned instead of the text attribute because we used the extended mode. To preserve privacy, all personal information in the figure has been replaced with the letters “XYZ” or “ABC” for English text and with “مخفي” (which means “hidden”) for Arabic text. We also replaced the fields that contain integers with an equal number of “0” digits.
Managing tweets in the JSON format involves certain programming and computational challenges. Therefore, we created a parser algorithm in Python to iterate over all the tweets, extract the important attributes (e.g., tweet id, date, time, and text) from the JSON format, get clean text by invoking the pre-processing module (see Figure 1), and finally store the results in the XLSX format. Duplicate tweets were removed based on the tweet “id” using the pandas package. An example of the output of the JSON parser is shown in Table 2. The JSON parser algorithm is shown in Algorithm 1.
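As a minimal sketch of how the attributes described above might be read from a single tweet object, the following uses only the standard `json` module. The tweet object is a hypothetical, heavily trimmed placeholder, and the helper name `extract_fields` is an assumption introduced for illustration.

```python
import json

# A hypothetical, trimmed tweet object; the values are placeholders that
# mirror the anonymization described above.
raw = '''{
  "id": 0,
  "created_at": "Sun Oct 04 00:00:00 +0000 2020",
  "full_text": "XYZ",
  "entities": {"hashtags": [{"text": "ABC"}], "user_mentions": []}
}'''

def extract_fields(tweet):
    """Pull a few of the attributes discussed above out of one tweet dict."""
    # In extended mode the message is under "full_text"; fall back to "text".
    text = tweet.get("full_text", tweet.get("text", ""))
    entities = tweet.get("entities", {})
    hashtags = [tag["text"] for tag in entities.get("hashtags", [])]
    return {
        "id": tweet["id"],
        "created_at": tweet["created_at"],
        "text": text,
        "hashtags": hashtags,
    }

record = extract_fields(json.loads(raw))
```

The fallback from `full_text` to `text` keeps the helper usable for objects retrieved without the extended mode.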
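The parsing and de-duplication steps can be sketched as follows. This is not Algorithm 1 from the paper: it assumes one JSON object per line, skips the cleaning call, replaces the pandas de-duplication with a plain dictionary keyed on the tweet id, and writes an in-memory CSV table as a stand-in for the XLSX file. The sample lines and the name `parse_tweets` are hypothetical.

```python
import csv
import io
import json
from datetime import datetime

# Timestamp layout of the "created_at" attribute in the sample lines below.
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"

# Hypothetical input, one tweet per line; the second line repeats id 1.
json_lines = [
    '{"id": 1, "created_at": "Sun Oct 04 10:00:00 +0000 2020", "full_text": "ABC"}',
    '{"id": 1, "created_at": "Sun Oct 04 10:00:00 +0000 2020", "full_text": "ABC"}',
    '{"id": 2, "created_at": "Mon Oct 05 11:30:00 +0000 2020", "full_text": "XYZ"}',
]

def parse_tweets(lines):
    """Return one row per unique tweet id, keeping the first occurrence."""
    rows_by_id = {}
    for line in lines:
        tweet = json.loads(line)
        if tweet["id"] in rows_by_id:
            continue  # duplicate tweet id, skip it
        posted = datetime.strptime(tweet["created_at"], TWITTER_TIME_FORMAT)
        rows_by_id[tweet["id"]] = {
            "id": tweet["id"],
            "date": posted.date().isoformat(),
            "time": posted.time().isoformat(),
            "text": tweet.get("full_text", tweet.get("text", "")),
        }
    return list(rows_by_id.values())

rows = parse_tweets(json_lines)

# Store the rows as a table (CSV here, in place of the XLSX output).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "date", "time", "text"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```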
Data Pre-processing
Data pre-processing is a very important step for data analytics. It involves employing various techniques on the obtained data to clean data, remove noise, enhance the quality, and accordingly, improve the accuracy of data analysis. There are some libraries for pre-processing of textual data such as Natural Language Toolkit (NLTK) library. Data pre-processing includes various tasks including tokenization, removal of irrelevant words and characters, normalization (letter replacement), and stemming.
After converting the extracted tweets from the JSON format to the Excel format, we started pre-processing by removing irrelevant words and characters including hashtags, mentions, URLs, numbers, whitespaces, smiley faces, and emojis. Emoji faces used in tweets could hold some meaning, and some studies replace them with a suitable word describing the emotion behind them. However, in this paper, we removed them. Furthermore, we replaced the new line and the colon symbol (:) with whitespace for readability. Repeated characters were also removed for readability.
Moreover, we removed all punctuation and symbols such as mathematical notations (÷, ×, –, +, %), different types of brackets {}, [], (), colons and semi-colons (: ;), question marks (?), slashes, and symbols such as *, &, ∧, $, >>, <<, |, ~. Furthermore, we removed the non-Arabic letters and kept only the Arabic letters.
Normalization is another pre-processing task. PyArabic is one of the Python libraries that can help with the normalization task. It provides support for the Arabic language and offers various functions for handling letters and text, such as detecting characters and removing diacritics. It also involves the Tashaphyne library for the normalization task (Harron and Liu, 2020). We removed all forms of diacritics including Tashdid (ّ), Fatha (َ), Tanwin Fath (ً), Damma (ُ), Tanwin Damm (ٌ), Tanwin Kasr (ٍ), and Sukun (ْ). Furthermore, we normalized the letters in the words into a consistent form. For example, “Taa marbutah” (ة) was replaced with “haa maftohah” (ه), “Alif” with its three shapes (أ, إ, آ) was replaced with “bare Alif” (ا), and “Yaa” (ي) was replaced with “dotless Yaa” (ى).
For text mining, stop-words are not significant, and removing them reduces the volume of the feature set. The Natural Language Toolkit (NLTK) library was used with some modifications to suit our needs. For example, the stop-word list provided by NLTK is for Modern Standard Arabic (Alomari et al., 2021b), and accordingly a list of words in dialectical Arabic is needed. After manual observation of the tweets, we added a new list of stop words that are commonly used in dialectical Arabic, such as the following: “علي، الي، شى، وش، ليش، ايشلاكن، لكن، علا، مو، احنا، اللى، اللهم، والله، الله، اللي”. The original list was also extended with the list provided in Alrefaie and Bazine (2019), which contains 750 stop words.
Tokenization is important in the pre-processing phase. It aims to split the text into a sequence of words (tokens) separated by punctuation characters or whitespaces. The built-in Python string method split() was utilized for this task.
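The removal steps described above could be approximated with a few regular expressions, as in the sketch below. The patterns are assumptions rather than the authors' code: diacritics are kept at this stage so that words are not split before normalization, and only runs of three or more identical characters are collapsed so that legitimate doubled letters survive.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")
MENTION_OR_HASHTAG_PATTERN = re.compile(r"[@#]\S+")
# Keep Arabic letters and diacritics (U+0621 to U+0652) plus whitespace;
# everything else (Latin letters, digits, punctuation, emojis) is dropped.
NON_ARABIC_PATTERN = re.compile(r"[^\u0621-\u0652\s]")
# A run of three or more identical characters (threshold is an assumption).
REPEATED_CHAR_PATTERN = re.compile(r"(.)\1{2,}")

def clean_tweet(text):
    """Remove URLs, mentions, hashtags, and non-Arabic characters."""
    text = URL_PATTERN.sub(" ", text)
    text = MENTION_OR_HASHTAG_PATTERN.sub(" ", text)
    text = NON_ARABIC_PATTERN.sub(" ", text)
    text = REPEATED_CHAR_PATTERN.sub(r"\1", text)
    # Collapse any remaining whitespace, including new lines.
    return " ".join(text.split())

# Hypothetical tweet mixing a mention, elongated letters, digits, and a URL.
sample = "@XYZ التعليم عن بعد راااائع!!! 123 https://example.com #التعليم_عن_بعد"
cleaned = clean_tweet(sample)
```

URLs are removed first so that their fragments are not left behind once punctuation is stripped.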
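The diacritic removal and letter replacements listed above can be sketched with the standard library alone. This is an illustration and not the PyArabic API; the function name `normalize_arabic` is hypothetical, and the characters are written as Unicode escapes to avoid ambiguity between visually similar letters.

```python
import re

# Arabic diacritics (tashkeel), U+064B to U+0652.
DIACRITICS_PATTERN = re.compile(r"[\u064B-\u0652]")

# Letter replacements described above.
LETTER_MAP = str.maketrans({
    "\u0629": "\u0647",  # Taa marbutah -> Haa
    "\u0623": "\u0627",  # Alif with hamza above -> bare Alif
    "\u0625": "\u0627",  # Alif with hamza below -> bare Alif
    "\u0622": "\u0627",  # Alif with madda -> bare Alif
    "\u064A": "\u0649",  # Yaa -> dotless Yaa
})

def normalize_arabic(text):
    """Strip diacritics, then map letter variants to one consistent form."""
    text = DIACRITICS_PATTERN.sub("", text)
    return text.translate(LETTER_MAP)

# The word for "school" ends in Taa marbutah before normalization.
normalized = normalize_arabic("\u0645\u062F\u0631\u0633\u0629")
```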
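A minimal sketch of tokenization with `split()` followed by stop-word filtering is given below. The stop-word set contains only a few of the dialectal words quoted above; in practice it would be combined with the NLTK Arabic list and the extended list, which are not reproduced here. The sentence is a hypothetical example, and tokenizing before filtering is an implementation choice of this sketch.

```python
# A few dialectal stop words taken from the list quoted in the text.
DIALECT_STOP_WORDS = {"وش", "ليش", "مو", "احنا", "اللي"}

def tokenize(text):
    """Split text into tokens on runs of whitespace."""
    return text.split()

def remove_stop_words(tokens, stop_words=DIALECT_STOP_WORDS):
    """Drop every token that appears in the stop-word set."""
    return [token for token in tokens if token not in stop_words]

tokens = tokenize("ليش التعليم عن بعد مو حضوري")
filtered = remove_stop_words(tokens)
```

Using a set for the stop words keeps each membership check constant time, which matters when filtering more than a hundred thousand tweets.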
Governance Parameters Discovery
We report here the process of discovering urban governance parameters using topic modeling of Twitter data. Topic modeling is an AI approach that is frequently used for topic discovery and data analysis. It consists of a set of algorithmic methods that seek to identify structural patterns within a corpus of documents, producing clusters of word terms that identify the central themes of the documents (Mortenson and Vidgen, 2016). The Latent Dirichlet Allocation (LDA) algorithm is an unsupervised machine learning technique that is commonly used for topic modeling. It is a statistical approach that is used to find the most common topics, or clusters, in a collection of documents. LDA can be used to map a given set of documents (such as tweets in our case) to a set of topics or clusters, and each document in the set is associated with each topic with a certain probability.
We used the scikit-learn Python library to build the discovery module using Latent Dirichlet Allocation (LDA).
Modeling the distributions required determining the number of topics. We ran LDA multiple times to explore different numbers of clusters (e.g., 10, 15, 20, and other values), and we found that 20 topics provided the best results for identifying important urban governance parameters. After extracting the clusters, we manually assigned names to them based on the keywords, the tweets, and domain knowledge. We first looked at the key terms of each cluster and then gave it a name. When the key terms were not clear, we looked at the context of the tweets. This was an iterative process. The clusters' names are a reasonable representation of the LDA keywords and the tweets in the cluster. Then, we removed the clusters that were not relevant, and we merged some of these 20 topics into 10 topic clusters that we call urban governance parameters. The merging of topics was based on their relationships to other topics. We further grouped these ten governance parameters into four governance macro-parameters based on their relationship to the macro themes. The macro-parameters are higher-level (broader) parameters. This whole process created a set of 10 parameters and 4 macro-parameters.
For example, we found that the keywords and the tweets for Topics 5, 10, and 14 refer to the nature of exams during COVID-19 and the concerns about attending in-class exams due to the fear of COVID-19 infection. Some of the keywords are “Exams,” “Physical Attendance,” “Infection,” “Health,” and “Physical Distancing.” For this reason, we merged the three topics and named the merged cluster Exam Procedures. Since conducting exams during COVID-19, whether online or in-class, has many challenges, this parameter was listed under the fourth macro-parameter, Challenges.
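The merging of topics into parameters and macro-parameters can be represented as simple lookup tables, as sketched below. Only the three merges whose topic numbers are stated in the text are included, so the table is partial. Labeling a document by its most probable topic is an assumption made for illustration, not a step reported by the authors, and `label_document` is a hypothetical name.

```python
# Merges stated in the text; the remaining parameters are omitted here.
PARAMETER_TOPICS = {
    "Supporting Online-Learning": [1, 12],
    "New Economic Opportunities": [6, 15, 17],
    "Exam Procedures": [5, 10, 14],
}
PARAMETER_MACRO = {
    "Supporting Online-Learning": "Strategies and Success Factors",
    "New Economic Opportunities": "Economic Sustainability",
    "Exam Procedures": "Challenges",
}

# Invert the merge table so each topic number points to its parameter.
TOPIC_TO_PARAMETER = {
    topic: parameter
    for parameter, topics in PARAMETER_TOPICS.items()
    for topic in topics
}

def label_document(topic_probabilities):
    """Map a document's topic probabilities to (parameter, macro-parameter).

    topic_probabilities maps a 1-based topic number to its probability.
    Returns None when the dominant topic is not in this partial table.
    """
    dominant_topic = max(topic_probabilities, key=topic_probabilities.get)
    parameter = TOPIC_TO_PARAMETER.get(dominant_topic)
    if parameter is None:
        return None
    return parameter, PARAMETER_MACRO[parameter]

# Hypothetical topic probabilities for one tweet.
label = label_document({5: 0.6, 1: 0.3, 2: 0.1})
```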
Evaluation and Validation
For validating the discovered parameters obtained from the extracted topics using the LDA model, two methods were used, following the approach used by Alomari et al. (2021a). The first method provides external validation and uses digital media such as online newspapers (e.g., the Okaz and Al Madina newspapers) and other online reports and information to verify the identified parameters. The second method provides internal validation using the collected Twitter data including the tweets posted by the official accounts of schools, universities, or the Ministry of Education.
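Internal validation, that is, finding tweets that support a parameter, could be approximated by a keyword search over the collected tweets. The sketch below is an assumption about how such a search might look. The Arabic keywords are hypothetical renderings of “Guardian,” “Participate,” and “Part,” the threshold of two matches is arbitrary, and the matching is a plain substring test on raw text.

```python
def find_supporting_tweets(tweets, keywords, min_matches=2):
    """Return the tweets that contain at least min_matches of the keywords."""
    supporting = []
    for tweet in tweets:
        matches = sum(1 for keyword in keywords if keyword in tweet)
        if matches >= min_matches:
            supporting.append(tweet)
    return supporting

# Two short example texts; the first is a tweet quoted in the results.
tweets = [
    "عزيزي ولي الأمر، شارك معنا لأنك جزء من منظومة التعليم الإلكتروني.",
    "شكراً معلمي",
]
# Hypothetical Arabic keywords for the Supporting Families parameter.
keywords = ["ولي", "شارك", "جزء"]

supporting = find_supporting_tweets(tweets, keywords)
```

A manual reading of the returned tweets would still be needed, since substring matches alone do not establish that a tweet supports a finding.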
Results and Findings
Online learning currently is a key public issue due to the ongoing pandemic requiring physical isolation of people and in turn the migration of many aspects of our life to cyberspaces. Given the importance of this subject, we investigate in this section the use of AI in urban governance as to how AI could help governments learn about various urban governance parameters for online learning in Saudi Arabia. These governance parameters include government policies and strategies on the subject matter as well as the public reactions and concerns. Specifically, we report the process of discovering 10 urban governance parameters using LDA-based topic modeling of Twitter data. The process involved extracting 20 topics from the Twitter data using the LDA algorithm and then merging some of these topics into topic clusters that we call urban governance parameters. We will use the terms topic cluster, cluster, and governance parameter interchangeably. We further group these ten governance parameters into four governance macro-parameters. The methodology of detecting these topics using LDA has already been described in the previous section. The software is developed completely in the Python language.
Table 3 lists the urban governance parameters and related data. The four governance macro-parameters are listed in Column 1 of the table. These are Strategies and Success Factors, Economic Sustainability, Accountability, and Challenges. Column 2 lists the ten governance parameters that include Supporting Online-Learning, Supporting Families, and others. Note that each governance macro-parameter includes one or more governance parameters. For example, the fourth macro-parameter (Challenges) includes three governance parameters, namely Exam Procedures, School Timings, and Digital Services. Column 3 lists the topic numbers. As mentioned earlier, we extracted 20 topics from the Twitter data using LDA, and some of these topics that are related to each other were merged to form governance parameters. An example of this merging is the first governance parameter (Supporting Online-Learning), which is formed by merging Topic 1 and Topic 12 (see Column 3, Row 2, and Row 3). Column 4 gives the percentage of keywords for the topics in the table rows. Topic 1 is the biggest topic and contains 11% of the total number of keywords. Topic 12 contains 4.1% of the total number of keywords. Column 4 in the fourth row gives the total percentage of keywords for the first governance parameter (Supporting Online-Learning). Column 5 in the table lists 15 keywords (in total for all the topics in a cluster) for each governance parameter. Originally, we collected the top 30 keywords for each topic. The 15 keywords listed in the table were selected manually (using domain expertise) from the initial 30 keywords based on their significance to the topics.
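As a worked example of the Column 4 totals, the sketch below adds the two topic shares stated in the text. It assumes that the parameter-level figure is the simple sum of the shares of the merged topics; Table 3 itself is not reproduced here, so the result is illustrative only.

```python
# Shares of keywords per topic, in percent, as stated in the text.
TOPIC_SHARE = {1: 11.0, 12: 4.1}

def parameter_share(topic_numbers, topic_share=TOPIC_SHARE):
    """Sum the keyword shares of the topics merged into one parameter."""
    return round(sum(topic_share[topic] for topic in topic_numbers), 1)

# Supporting Online-Learning merges Topic 1 and Topic 12.
supporting_online_learning = parameter_share([1, 12])
```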
The keywords are listed in Arabic along with their English translation. The translation of these keywords and other Arabic content (sample tweets, etc.) is deliberately contextual to allow English readers to understand the contextual use of the keywords. A literal translation of certain keywords would not convey the contextual meaning of the keywords and may confuse or give a completely different meaning to the reader.
We now digress from explaining Table 3 and explain the topics using graphical data. Figure 3 plots the inter-topic distance of the 20 topics using multidimensional scaling. The bottom-left of the figure gives a key to the size of the topics. Note that Topic 1 is represented by the largest circle, implying that it is the largest topic in terms of the number of keywords (as mentioned earlier, it contains 11% of the total number of keywords; also see Table 3). Figure 4 plots the top 30 most relevant keywords (or key terms) for Topic 1. The keywords are ordered by decreasing frequency within Topic 1 (see the maroon bars). The blue bars give the overall term frequency for each keyword. The software (LDAvis tool) analyzes Arabic text and creates graphs for Arabic keywords. The keywords are shown in the Arabic language. The English translation for the Arabic keywords in the figure is provided in Table 4. Note that the overall frequency of the keyword meaning “Education” is very near to the total frequency, which shows that this keyword has significant weight. Furthermore, the keywords “الالكتروني” (Electronic) and “حضورى” (In-Class) have very similar overall term frequencies; however, “الالكتروني” has a higher frequency within Topic 1, which is about electronic learning (E-learning or online learning). Now that we have given a general introduction to the table and topic diagrams, we move on to discuss in detail each of the governance parameters along with supporting data from the collected tweets and external sources. During these discussions, we will also elaborate further on the topic diagrams related to the governance parameters.
Table 4. The keywords and their English translation for Figure 4.
The first urban governance parameter (see Table 3) is Supporting Online-Learning, and it belongs to the first macro-parameter, Strategies and Success Factors. As mentioned earlier, it is a merged parameter that includes Topic 1 and Topic 12. Figure 4 plots the top 30 most relevant keywords for Topic 1 (we do not include these diagrams for all 20 topics in order to avoid an excessive number of figures and to conform to the publisher's article guidelines). The selected keywords in Column 5 of Row 2 (Topic 1) and Row 3 (Topic 12) characterize the governance parameter. The parameter relates to the government's decisions, strategies, and efforts to continue online learning for all school and university levels in Saudi Arabia. Before the beginning of the academic year (in August 2020), the Ministry of Education announced that the schools and universities would be online for the first 7 weeks of the first semester (Saudi Ministry of Education, 2020f). During this time, people actively discussed the decision to migrate to online learning and whether it should be continued after the seventh week. Later on, due to the continuing COVID-19 pandemic, the government decided to continue online learning (note the keywords, e.g., “Continuation”) over the whole semester (Fall Semester, 2020–2021) (Saudi Ministry of Education, 2020d). This created mixed emotions among the people, though they applauded the government's decision to continue (the keyword “better”). The keywords “Safety” and “Health” show the government's reasoning for the decision and its care for citizens. The keywords “Technical,” “Training,” and “University” represent the types of educational institutions involved in these discussions. The tweets related to this cluster were posted by official accounts, news accounts, and people. For example, the official account of the Ministry of Education tweeted on October 8, 2020:
“وزارة التعليم تُعلن عن استمرار الدراسة عن بُعد لما تبقى من أسابيع الفصل الدراسي الأول، كما ستوضح آلية الاختبارات خلال الأيام المقبلة.”
“The Ministry of Education announces the continuation of online study for the rest of the first semester, …”
Many universities tweeted accordingly such as the following excerpt of a tweet by Umm Al-Qura University in Makkah city.
“ .....باستمرار العملية التعليمية عن بُعد لما تبقى من الفصل الدراسي الأول في التعليم العام والجامعي يعكس مدى ... اهتمام القيادة … على سلامة وصحة الطلاب والطالبات. #نعود_بحذر”
“… continue the distance learning process for the remainder of the first semester for general and university education … due to the safety and health of students. #return cautiously”
The second governance parameter is Supporting Families (see Row 3, Table 3) represented by key terms such as Guardian, Participate, Digital, Part, Madrasati Platform, Experience, and Upbringing. Family participation is very important for the success of online learning. Therefore, one of the strategies used by the Ministry of Education was to strengthen the family involvement in supporting their children for online learning. For this purpose, the ministry arranged some online activities and involved the families in these activities. These activities were organized on the online learning platform “Madrasati” (an Arabic word meaning “My School”). Many schools posted tweets related to this parameter. We found more than 500 tweets in our dataset similar to the following tweet that was posted by a school in Makkah.
“عزيزي ولي الأمر، شارك معنا لأنك جزء من منظومة التعليم الإلكتروني.”
“Dear Guardian, participate with us because you are part of the e-learning system.”
The Saudi Ministry of Education in August 2020 stressed the importance of parents and families supporting their children in transitioning to online learning to make the transition successful (Saudi Ministry of Education, 2020c). Recently, on 27th May 2021, the Minister of Education applauded the parents' involvement and support that enabled the educational process to successfully continue online (Saudi Ministry of Education, 2021a).
“Parents contributed to creating a partnership with the Ministry of Education, and supported the continuation of the educational process online for their children.”
The third governance parameter is Competitions and Incentives (Row 4, Table 3) represented by key terms such as Madrasati Platform, Class, School, Competition, Teacher, Strengthening, and Digital. Developing certain competition exercises among the students was one of the strategies that have been followed by the Ministry of Education to strengthen distance learning. It aimed to encourage students, promote hard work, and reduce laziness among them. Some competitions were on the level of the schools, others were on the level of cities. For example, the “Digital Madrasati Platform Competition” (مسابقة مدرستي الرقمية) was provided by the Saudi Ministry of Education in which every school had its own competition (Alatiq, 2021). It was aimed at highlighting the importance of e-learning and enhancing the participation of the citizens in educational activities. For example, the following tweet found in our dataset was posted on Oct 2, 2020, by a school in Arar city.
“بهدف تعزيز اهمية التعليم الالكتروني والتعليم عن بعد وإثراء المحتوى الرقمي تبدأ إدارة المدرسة في استقبال المشاركات من جميع الفئات المشاركة من معلمين وإداريين وطلاب وأولياء أمور في مسابقة منصة مدرستي”
“With the aim of highlighting the importance of distance learning and improving the digital content, the school administration starts receiving participations on the Madrasati platform from all stakeholders including teachers, administrators, students, and parents.”
Another tweet in our dataset, posted on October 11, 2020, is from a school in Taif city. It highlights the use of financial incentives.
“حرصا من إدارة المدرسة على حث الطالبات على دخول منصة مدرستي والاهتمام بالأنشطة والاثراءات واستثمار كل الموارد التي توفرها لضمان سير العملية التعليمية فقد رأت العودة مجددا لمسابقة نجوم رحاب بشروط جديدة وجوائز قيمة شهرية.”
“In order to encourage students to enter the Madrasati platform, … to ensure the progress of the educational process online, the school administration will continue the Stars Rehab competition with monthly valuable prizes.”
The fourth governance parameter is Nurturing Positive Behavior (see Row 5, Table 3) represented by key terms such as Program, Children, Sunday, Week, Behavior, Positive, and Guidance. Developing and strengthening good attitudes in students is essential for the success of online learning. This is because with online learning there are a lot of issues including the children's motivation and honesty when the teacher is not present in the physical space to supervise them. Therefore, one of the strategies applied by the Ministry of Education was to strengthen the students' positive behavior. For this objective, the ministry activated the “Schools Promoting Positive Behavior” program. The program was executed by the guidance and counseling departments of schools, and it was for 1 week starting on Sundays (a working day in Saudi Arabia) (Alyaum Newspaper, 2020; Alhadwari, 2021) and this is why “Week” and “Sunday” are the key terms in this cluster. The program, with all its activities and events, aimed to create an institutional environment that stimulates positive behavior to achieve psychological and social compatibility for the student by adopting specialized and attractive methods, emphasizing the strengthening of the relationship between the students, schools, the families, and the local communities. Many schools in the country posted tweets related to this cluster, for example, the following tweet is taken from our dataset that was posted on October 14, 2020, by a secondary school in Jazan city.
“ضمن فعاليات الأسبوع المكثف لبرنامج المدارس المعززة للسلوك الإيجابي نعرض مشاركة الطالبات ومنسوبات المدرسة عبر الحائط الالكتروني؛ بمتابعة المرشدة الطلابية”
“Within the activities of the intensive week of the Schools Promoting Positive Behavior Program, we showcase its results achieved through the participation of students and school staff through Madrasati Platform under the supervision of the student advisor.”
The fifth governance parameter is Commending Stakeholders (see Row 6, Table 3), represented by keywords such as Mission, Results, Honesty, Thanks, Experience, Challenges, Pandemic, Continuous, Difficulties, Successful, and Excellence. During the COVID-19 pandemic, the teachers' efforts became more evident because the students' families and other stakeholders realized how much effort the teachers had put into teaching the children despite the rapidly emerging challenges of the pandemic, including the migration to online learning. With the efforts of the teachers, online learning became successful. This cluster represents people's appreciation for teachers and other stakeholders. As a result, International Teachers' Day was celebrated more widely than before. The following tweet from our dataset is an example of the many tweets posted on October 5, 2020, by students, families, and management to thank the teachers for their efforts.
“شكراً معلمي ٢٠٢٠ تجاوزتم الصعوبات وتغلبتم على التحديات وقمتم بأجمل رسالة على وجه الأرض بكل أمانة وإتقان فكان التعليم عن بُعد ناجحاً بجهودكم فشكراً لكم”
“Thank you, teachers of 2020. You overcame the difficulties and the challenges and delivered the most beautiful message on the earth with honesty and excellence. Distance learning became successful with your efforts. So, thank you.”
The sixth governance parameter is New Economic Opportunities (Rows 7–9, Table 3), which is included in the second macro-parameter, Economic Sustainability. It is a merged parameter and includes Topics 6, 15, and 17. Figure 5 plots the top 30 most relevant keywords for Topic 6 (we present one topic diagram rather than one for each of the three merged topics to conform to the publisher's article guidelines). Topic 6 contains 5.8% of the total number of keywords. The Arabic keywords in the figure are translated into English in Table 5. Note that some of the keywords, e.g., Graphics, Logo, and Motion, are only significant within Topic 6 since their overall term frequency is equal to their term frequency within Topic 6. The selected keywords from the three topics listed in Column 5 of Rows 7–9 (Table 3) characterize the governance parameter and include Design, Graphics, Video, Logo, Motion, Project, Income, Private, Photos, Discount, Solution, Presentations, Assignments, Research, and Exams. Many businesses were severely affected by the COVID-19 related lockdowns, causing many people to lose their jobs or have their salaries reduced (Alomari et al., 2021a). This governance parameter relates to the new economic opportunities in online learning that people have developed in response to these adverse effects of the pandemic. The economic opportunities include various educational services for students and instructors. These services are provided by individuals and small businesses and include tuition and other services to help students with research papers, solving assignments and exams, designing logos and photos, and developing PowerPoint slides. The following tweet, posted on October 4, 2020, is an example of the many tweets related to this governance parameter.
“نقوم بعمل مجموعه من الخدمات للطلاب بحوث الجامعية حل الواجبات بشكل احترافي. عروض البوربوينت بروجكتات حل اسايمنتات”
“We provide a range of (tuition) services to help students with their undergraduate research, preparing PowerPoint presentations, projects, and assignments”
Table 5. The keywords and their English translation for Figure 5.
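The note on keywords such as Graphics, Logo, and Motion can be made concrete with a small calculation. The sketch below is illustrative only: the counts are made-up placeholders (not values from Figure 5), and `exclusivity` is a hypothetical helper name. It computes, for each term, the share of its overall frequency that falls within a single topic; a share of 1.0 corresponds to the case where the within-topic frequency equals the overall frequency.

```python
def exclusivity(topic_freq, overall_freq):
    """Return, per term, the fraction of its overall frequency that occurs in the topic."""
    shares = {}
    for term, in_topic in topic_freq.items():
        total = overall_freq[term]
        if total <= 0 or in_topic > total:
            raise ValueError(f"inconsistent counts for term {term!r}")
        shares[term] = in_topic / total
    return shares


# Hypothetical counts for illustration only; not taken from the paper's data.
overall = {"Graphics": 120, "Logo": 80, "Design": 400}
topic_6 = {"Graphics": 120, "Logo": 80, "Design": 250}

shares = exclusivity(topic_6, overall)
# Terms whose frequency lies entirely within the topic.
topic_specific = sorted(term for term, share in shares.items() if share == 1.0)
print(topic_specific)
```

With these placeholder counts, Graphics and Logo are flagged as topic-specific, while Design is shared with other topics.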
Some services were also made available to the other stakeholders such as families, teachers, and other management staff who lack technical and digital skills. These services aim to facilitate the migration to the online mode of learning. The following tweet highlights some of the services provided to teachers.
“كل مايخص خدمات (التعليم عن بعد) الإلكترونية نقدم لكم: -اختبارات إلكترونية -العاب تفاعلية -أوراق عمل -مونتاج فيديوهات وصور تعليمية… هذا والكثير عبر حسابنا في سناب…”
“We offer everything related to (online learning) electronic services e.g., electronic tests, interactive games, worksheets, montage of educational videos and photos… a lot more through our account in Snap…”
Below is an excerpt from a tweet related to the private services provided. “…student services: designing electronic questionnaires, statistical analysis SPSS, … research papers, translation, … writing articles, … convert the audio content to a Word file, formatting Word file …”
The seventh governance parameter is Evaluation, which is included in the third macro-parameter, Accountability, and is represented by the keywords from Topic 2 including Teacher, Student, Experience, Success, Thank, Process, Kingdom, Technology, Level, Skills, and Efforts. Figure 6 plots the top 30 most relevant keywords for Topic 2. It shows that Topic 2 includes 8.9% of the total number of keywords. The English translation of the Arabic keywords is given in Table 6. The governance parameter relates to the evaluation of the online learning programs in Saudi Arabia that, according to the National eLearning Center (NELC) in Saudi Arabia, was carried out by six international organizations in two different studies (Saudi National eLearning Center, 2020). The studies reported on the experience of public and higher education in the Kingdom during the pandemic. Their aim was to analyze the experiences of online learning during the pandemic and suggest initiatives for improvements. These studies benefitted from the participation of 342,000 students, parents, teachers, and other staff from schools and higher education institutions. The first study was conducted by the Online Learning Consortium (OLC), along with the International Society for Technology in Education (ISTE), Quality Matters (QM), UNESCO, the National Research Center for Distance Education and Technological Advancements (DETA) in the United States of America, and the UNESCO Institute for Information Technologies in Education (IITE). The second study was conducted by the Organization for Economic Co-operation and Development (OECD) together with Harvard University. OLC applauded NELC Saudi Arabia particularly for the diversity of options available to the students and stakeholders (free access to lectures through satellite TV and the Internet, exceptional support to equip teachers with the required skills, access to various digital platforms, etc.) and the speed of response in ensuring a successful transition to online learning. The tweets detected under this governance parameter included tweets reporting the outcome of these two evaluation studies and responses to the topic from people and various institutions. For example, King Abdulaziz University, Jeddah, posted the following tweet.
“تفخر جامعة_الملك_عبد العزيز بمشاركتها مع الجامعات السعودية في نجاح تجربة المملكة في التعليم عن بعد حيث أشادت الدراسات التي قامت بها منظمات وجهات عالمية بدور المملكة في التعليم الالكتروني من حيث سرعة الاستجابة، وتعدد الخيارات، والتحسين المستمر.”
“King Abdulaziz University is proud of its contribution to the successful experience of Saudi Arabia in transitioning to electronic learning, assessed by two studies conducted by six international organizations. The studies demonstrated strong aspects of the Saudi experience including fast response, diversity of options, and continuous improvements.”
Table 6. The keywords and their English translation for Figure 6.
We now discuss the governance parameters related to the fourth macro-parameter, Challenges. It includes three governance parameters: Exam Procedures, School Timings, and Digital Services. The eighth parameter, Exam Procedures, is a merged governance parameter created from three topics, Topics 5, 10, and 14. Figure 7 plots the top 30 most relevant keywords for Topic 5. The Arabic keywords in Figure 7 are translated into English in Table 7. Note that the keyword “الاختبارات” (Exams) is at the top of the list of key terms, which shows its importance in this parameter. The selected keywords from the three topics listed in Column 5 of Rows 9–11 characterize the governance parameter. The keywords include Exams, Physical Attendance, COVID-19, Ministry, Universities, Schools, Infection, Health, Demand, Risk, and Social Distancing. Exam procedures during COVID-19, specifically whether exams should be online or in-class, were a major concern for people and were vehemently discussed on Twitter by students, parents, instructors, and others. Some universities required students to physically attend the final exams, and people feared that this would increase the risk of infection during exams. While the public concern was valid, the universities' view on physical attendance in final exams also seems to be valid because of the inability of the current eLearning systems to detect cheating during exams and enforce honesty on the students' part. This challenge creates opportunities for the development of digital systems that provide better online learning and assessment capabilities.
Table 7. The keywords and their English translation for Figure 7.
The ninth governance parameter is School Timings, part of the fourth macro-parameter, Challenges, and is represented by the keywords from Topic 7 including Time, Platform, Studying, Teacher, Better, Excessive, Problem, Consider, Elementary, and Difficult. Online school timings in Saudi Arabia were scheduled based on the school levels. The intermediate and high school classes were scheduled in the mornings while the elementary school classes were scheduled in the afternoons. This became a problem for many families, particularly mothers with children in multiple school levels who had to supervise their kids throughout the day. Tweets such as the following were found in our dataset.
“..... التعليم عن بعد جميل للأسف وقت المنصه للمرحلة الابتدائية جدا متعب خاصه اذا كنتي معلمه وام يضيع اليوم وحنا بالمنصة وبعده عندك مسؤوليات اخرى ارهاق”
“…. distance education is good. However, the school timings for the elementary stage are very exhausting, especially if you are a teacher and a mother. The entire day is wasted while we are on the platform, and after that, you have other responsibilities. That is exhausting.”
While parents, and mothers in particular, had valid reasons to be exhausted, we believe that school timings were decided considering a number of factors. Scheduling the senior school students in the mornings allowed working parents to carry out their jobs in the morning because older kids required less supervision. This enabled working parents, or older kids, to supervise the younger kids during the afternoon classes. Dividing the school timings into mornings and afternoons also allowed the sharing of resources such as computers and tablets.
The tenth and last governance parameter is Digital Services (see Table 3), which reflects the challenges associated with the digital services that people encountered during the online learning sessions. People reported a lack of response or delays in accessing the online platforms due to poor internet services and the inability of the digital platforms to manage a large number of users (Alhebtali, 2020). These issues did exist; however, considering the emergent and rapidly evolving nature of the pandemic situation and the rapid improvements in the internet services and digital platforms, we believe the Saudi government managed the situation very well. This was evident from the two studies carried out by six international organizations that we mentioned earlier (Saudi National eLearning Center, 2020).
Finally, note that Topics 11, 13, 16, 18, and 20 were excluded from these discussions because they were related to Iraq, Jordan, and Kuwait, which were not the focus of this study.
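The topic-to-parameter structure described in this section can be summarized as a small lookup. The following is a minimal sketch rather than the tool used in the study: it encodes only the topic numbers stated explicitly in the text above (the topics behind the remaining parameters are listed in Table 3 and are omitted here rather than guessed), and the names `PARAMETER_TOPICS`, `PARAMETER_MACRO`, `EXCLUDED_TOPICS`, `invert`, and `describe_topic` are hypothetical.

```python
# Topic numbers stated in the text; parameters whose topics are only given in
# Table 3 are deliberately left out rather than guessed.
PARAMETER_TOPICS = {
    "New Economic Opportunities": [6, 15, 17],
    "Evaluation": [2],
    "Exam Procedures": [5, 10, 14],
    "School Timings": [7],
}

# Macro-parameter of each governance parameter listed above.
PARAMETER_MACRO = {
    "New Economic Opportunities": "Economic Sustainability",
    "Evaluation": "Accountability",
    "Exam Procedures": "Challenges",
    "School Timings": "Challenges",
}

# Topics excluded from the discussion because they relate to other countries.
EXCLUDED_TOPICS = {11, 13, 16, 18, 20}


def invert(parameter_topics):
    """Map each topic number to its parameter, rejecting double assignments."""
    topic_parameter = {}
    for parameter, topics in parameter_topics.items():
        for topic in topics:
            if topic in topic_parameter:
                raise ValueError(f"topic {topic} is assigned to two parameters")
            topic_parameter[topic] = parameter
    return topic_parameter


TOPIC_PARAMETER = invert(PARAMETER_TOPICS)


def describe_topic(topic):
    """Describe where a topic number sits in the parameter hierarchy."""
    if topic in EXCLUDED_TOPICS:
        return "excluded"
    if topic not in TOPIC_PARAMETER:
        return "not listed in this sketch"
    parameter = TOPIC_PARAMETER[topic]
    return f"{parameter} ({PARAMETER_MACRO[parameter]})"


print(describe_topic(15))
print(describe_topic(18))
```

Keeping the merge as explicit data makes it easy to check that no topic is assigned to two parameters and that excluded topics never appear under a parameter.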
Discussion
This paper investigates how AI could help governments learn about urban governance parameters on various subject matters so that they can develop better governance. Saudi Arabia has demonstrated a fine example of good governance during the pandemic in terms of both managing the pandemic and managing education. In managing the pandemic, the government used good practices from around the world including quarantine, social distancing, closure of public and private facilities to contain the virus, curfew, cleaning services, financial incentives for the public and private sectors to keep the economy afloat, and an effective return-to-normal strategy once the infection rate fell to a certain low level (Alomari et al., 2021a).
We studied the government's approach toward education governance during the pandemic by automatically detecting twenty topics related to urban governance of online learning using the LDA algorithm. We merged these topics into ten governance parameters and then structured these ten parameters into four governance macro-parameters: Strategies and Success Factors, Economic Sustainability, Accountability, and Challenges. We found that, regarding online learning, the government's stance was similar to its approach in managing the pandemic. The government was cautious not to open public facilities that could lead to an increase in the infection rate, and therefore schools and universities provided online learning services.
The first macro-parameter, Strategies and Success Factors, touches upon strategies and success factors including developing social sustainability and enhancing educational performance through supporting families, students, teachers, and other stakeholders. In August 2020, before the beginning of the academic year, the Ministry of Education announced that the schools and universities would be online for the first 7 weeks of the first semester and later decided to continue online learning over the whole Fall Semester of 2020–2021. The Ministry of Education used various strategies for supporting the transition to online learning. It provided free access to lectures through satellite TV and the Internet, exceptional support to equip teachers with the required skills, and access to various digital platforms. The ministry strengthened family involvement in supporting children's online learning by arranging online activities and involving the families in these activities through the online learning platform “Madrasati.” The ministry also developed, at the school and city levels, competitions among the students to strengthen distance learning, aiming to encourage students, promote hard work, and reduce laziness among them. The government also launched the “Schools Promoting Positive Behavior” program that aimed to nurture positive behavior such as honesty among students, create an institutional environment that stimulates positive behavior to achieve psychological and social compatibility for the student, and emphasize the strengthening of the relationship between the students, schools, families, and local communities. The teachers and other stakeholders were commended for their hard work to show them appreciation, create a positive and motivating environment for all, and enhance the overall performance of online learning.
The financial crisis throughout the world due to the COVID-19 pandemic is well-known. For example, in our earlier work, where we attempted to detect public concerns and government measures in Saudi Arabia during the COVID-19 pandemic, we found that many businesses were severely affected by the COVID-19 related lockdowns, causing many people to lose their jobs or have their salaries reduced (Alomari et al., 2021a). The second macro-parameter reports on new economic opportunities created by individuals and SMEs in providing services to fill the service gaps arising from the rapid transition from in-class learning to online learning. The economic opportunities include various educational services made available for students and other stakeholders such as teachers and families to facilitate their migration to the online mode of learning. These services are provided by individuals and small businesses and include tuition and other services to help students with research papers, solving assignments and exams, and designing logos and photos.
The third macro-parameter touches on governance accountability and provides evidence, through two studies, that the government successfully managed the pandemic-induced emergency of transitioning the whole national education delivery system from in-class to online learning. The studies involved 342,000 students and other stakeholders in education and were carried out by international organizations including OLC, ISTE, QM, UNESCO, DETA, IITE, OECD, and Harvard University (Saudi National eLearning Center, 2020). The aim of these studies was to analyze the experiences of online learning during the pandemic and suggest initiatives for improvements (see Section Results and Findings for details about this evaluation exercise). OLC applauded Saudi Arabia for the diversity of options available to the students and stakeholders, including free access to lectures through satellite TV and the Internet, and for the speed of response in ensuring a successful transition to online learning.
The fourth macro-parameter touches upon the challenges that people faced during the online learning period and the government's strategies to manage those challenges. Whether the school and university exams during the pandemic should be online or in-class was a major concern for people and was vehemently discussed on Twitter by students, parents, instructors, and others. While some citizens justifiably felt the exams should not be in-person for health and safety reasons, the universities' view on physical attendance in final exams, under strict physical distancing, also seems to be valid because of the inability of the current eLearning systems to detect cheating during exams and enforce honesty on the students' part. The school timings, where classes were scheduled in the mornings and afternoons based on school grades, were another issue for parents because they required parents to supervise their kids throughout the day. However, the school timings scheduled by the government had several advantages including sharing of computing and internet resources and allowing senior kids to attend classes in the mornings so that working parents could carry out their jobs. People also encountered challenges associated with the quality of digital services during the online learning sessions including a lack of response or delays in accessing the online platforms. These issues did exist initially; however, they were managed well by the government as the semesters progressed, as became evident from the results of the two studies carried out by international organizations that we mentioned earlier.
Compared with other works in the literature, our work makes significant contributions. The differences lie in the context, focus, time period, geography, data, language, design, findings, and the particular perspective on the use of AI in urban governance presented in this study. The existing studies on modeling COVID-19 related issues using NLP methods have focused on detecting symptoms of stress due to COVID-19 in the US (Li et al., 2020), detecting government measures and public concerns in Saudi Arabia (Alomari et al., 2021a), investigating the differences in responses to the pandemic between university students and the general public in the US (Duong et al., 2020a), and sentiment analysis of closures of seven types of public and private facilities to contain the pandemic in Saudi Arabia (Alhajji et al., 2020) (see Section Literature Review for a detailed literature review). Our work significantly complements these works as it holistically studies a specific government function (online learning during COVID-19) in greater detail, touching upon the success factors, accountability, challenges, social sustainability, and economic sustainability.
The data-driven approach proposed in this paper and the urban governance parameters for online learning during COVID-19 discovered in this work align with the smart cities and urban governance literature in multiple ways. First, the governance parameters and macro-parameters discovered for online learning in this work align with general urban governance policies, strategies, and methods. These include continuing education via support for online learning, supporting families to help them and their children adapt to the changing (learning) environments, developing competitions and incentives to achieve (learning) objectives, nurturing positive behavior, commending stakeholders, allowing or facilitating people to find new economic opportunities, evaluating government policies, strategies, and actions, developing new (exam) procedures to manage the changing environments, developing (school) timings and schedules that are convenient to the public, and providing digital services to help the public perform the desired actions. It is clear that while these parameters are specific to online learning, they broadly fall into the general policies and methods used in urban governance. Second, our work supports the premise that research and practice in smart cities and urban governance should be driven by data (Liu et al., 2017; Bibri, 2021; Yigitcanlar et al., 2021b) and confirms that digital media, including social network data, are important sources of data that could be used for smart urban governance (Barns, 2020; Ahmad et al., 2022; Alahmari et al., 2022; Yigitcanlar et al., 2022).
Governance involves the processes of making and implementing decisions (ESCAP, 2009). The study in this paper and the analysis of the findings have provided vital information about the urban governance decision-making and implementation processes concerning online learning during COVID-19 in Saudi Arabia. Evaluating the urban governance parameters discovered from our case study against widely used characteristics of good governance (ESCAP, 2009), we make the following observations. The Saudi government's response could be considered exceptional on several good governance metrics including participation, responsiveness, equity and inclusiveness, accountability, responsibility, transparency, and consensus-orientation. The government was efficient and effective in implementing measures that allowed a successful transition to online learning (Saudi National eLearning Center, 2020) without affecting the educational goals while keeping people safe through physical distancing and other pandemic measures (Alomari et al., 2021a) [for example, evidenced by the low number of infections and deaths in the country (Worldometer, 2022)]. The government provided all citizens with access to online learning through free TV lectures (Ministry of Education, 2022; Saudi Ministry of Education, 2022) and other financial incentives (Saudi Ministry of Education, 2020a) that support equity and social sustainability. Several initiatives and programs were devised by the government to nurture responsibility, honesty, and positive behavior among students (Madrastati, 2020; Saudi Ministry of Education, 2020e; Saudi Ministry of Education, 2021b). Also noteworthy were the government's efforts in creating appreciation and harmony among teachers, families, and other stakeholders (Saudi Ministry of Education, 2020b). We did not detect any issues raised by people related to a lack of equity in the government's provision of access to online educational resources. No issues were detected related to lack of food, security, or other basic needs of people. Relatedly, the government provided free COVID-19 treatment and vaccination for all citizens, allowing people to focus on education without worrying about food, healthcare, and vaccinations (Unified National Platform GOV.SA., 2022).
Finally, the case study presented in this paper shows that the use of data-driven AI can help governments learn about public feedback and reactions on government matters, the effectiveness of government policies, strategies, programs, and procedures, the challenges people are facing in adapting to the government measures, new economic, social, or other opportunities arising out of the situation, and so on. These parameters could help the Saudi government learn about the design and operations space of its various functions and state services and improve these through participatory governance incorporating feedback and consultation from the public and other stakeholders. For example, the parameters that touch upon economic sustainability can be used to improve the sustainability of the economy, such as by learning about new economic opportunities from the natural responses and behaviors of people and supporting these opportunities so that they develop into economic instruments, institutions, and outputs, whether specific to pandemic responses or also for normal times. Similarly, governments can learn about better social sustainability instruments from the case study. Case studies can be focused on specific sustainability or other matters and enhanced through focused data collection and deeper analytics on the matter under study to develop effective instruments. We did not focus on environmental sustainability in this paper but this could be done by adding filters for data collection. The developed methods can be applied to learn from specific countries, or on a global scale, such as by analyzing various governance parameters for countries that have shown success, failure, or poor performance in education, COVID-19, or other sectors to find the best practices and pitfalls. Good governance characteristics can be incorporated into the design of our proposed method, for example in the data collection process, to investigate those specific characteristics for a specific government service under study. They can also be included in the design such that a government's services are automatically compared and evaluated against established standards of good governance. The government could use the knowledge gained through this data-driven AI process to develop better urban governance instruments, and this whole process could be implemented as a perpetual loop for real-time urban governance with much finer levels of engagement with the public.
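As one illustration of the focused data collection mentioned above, a keyword filter could narrow a collected set of tweets to a single subject such as environmental sustainability. The sketch below is an example under stated assumptions, not the pipeline used in this study: the keyword list, the sample tweets, and the function name `filter_tweets` are all hypothetical, and a real filter for Arabic tweets would need Arabic keywords and text normalization.

```python
def filter_tweets(tweets, keywords):
    """Keep tweets containing at least one keyword (case-insensitive substring match)."""
    lowered = [keyword.casefold() for keyword in keywords]
    return [
        tweet for tweet in tweets
        if any(keyword in tweet.casefold() for keyword in lowered)
    ]


# Hypothetical keywords and tweets, for illustration only.
ENVIRONMENT_KEYWORDS = ["pollution", "emissions", "recycling"]
sample = [
    "Online classes reduced traffic emissions in our city",
    "The exam schedule was announced today",
    "Recycling old tablets for students who need them",
]

selected = filter_tweets(sample, ENVIRONMENT_KEYWORDS)
print(len(selected))
```

The filtered subset could then be passed to the same topic modeling steps, so that the discovered parameters relate to the chosen subject only.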
Conclusions and Future Work
Data-driven artificial intelligence is becoming so fundamentally ingrained in smart city developments that cities have been called artificially intelligent cities and autonomous cities. AI is helping us increasingly in everything we do and making decisions for us in many spheres of our life. The study in this paper demonstrated that the use of data-driven AI can help governments learn about public feedback on government matters, the effectiveness of government instruments, and the challenges and opportunities arising out of a situation. The method can be applied to learn best practices and pitfalls of government instruments nationally and internationally on any governance matter, to develop new or better instruments (such as for new economic opportunities) and to evaluate governance against international standards.
Moreover, we argue through the case study that the use of AI does not necessarily have to replace humans in urban governance and other decision-making tasks; rather, the use of AI, under human supervision, could allow governments to monitor and improve, in real-time and at a much finer level, the effectiveness of government policies, strategies, programs, and procedures in light of continuous feedback from the public and other stakeholders. Many of the functions in urban governance requiring human expertise could be automated to a large extent using data-driven AI, and the whole process of urban governance could become increasingly autonomous. This automation and autonomy will allow humans to contribute at higher cognitive levels in the urban governance life cycle using the information provided by AI. How much AI-driven autonomy urban governance has will ultimately be determined by us humans. As long as humans continue to believe in the superiority of the designer (i.e., we humans), we will manage AI to act autonomously but within bounds defined by us. Certainly, this will also depend on the political empowerment of the masses as opposed to the masses being controlled by a few. If AI does become autonomous and controls humans, it will be the fault of humans, the majority of whom feel little or no concern for themselves and have left their fate to a few among them without any restraints.
Our earlier work on NLP-based social media big data analytics has developed a number of tools and focused on methodological and computational aspects for various applications. We have looked at distributed machine learning to manage big data (Alomari et al., 2021a), automatic labeling (Alomari et al., 2021b), and an improved stemmer for Arabic NLP (Alomari et al., 2020a), in application areas including healthcare (Alotaibi et al., 2020), logistics (Suma et al., 2017, 2020), detecting general public concerns during COVID-19 (Alomari et al., 2021a), and the analysis of government services (Alsulami and Mehmood, 2018). We plan to extend the case study presented in this paper by improving its breadth and depth in terms of the machine learning methods to enhance its functionality and usability, data and computing scalability, real-time functionality, and the novelty and utilization of its findings in urban governance. The tool currently works for the Arabic language but the broader methods developed in this work are applicable to other languages.
The whole of humanity has faced great trials during this pandemic and this adversity is continuing. The people and governments have shown great resilience during the pandemic and we will come out of it stronger and united. Challenges are part of life and a cause for improvements and innovations. There is no life in this world without challenges and no one is perfect. We believe that these challenges (discovered and reported in this paper) will allow the development of new industries nationally and internationally, providing new opportunities for economic developments and reducing unemployment by creating new jobs. For example, the challenge of online learning and assessments detected in this study creates opportunities for the development of digital systems that provide better online learning and assessment capabilities. The shortcomings in the required quality of internet services and digital platforms in terms of dealing with the QoS (delay, bandwidth, response time, functionalities of the online learning platforms, etc.) also present technological and business development opportunities. Looking into challenges and finding opportunities to accelerate innovation will form another direction of our research.
Data Availability Statement
The datasets presented in this article are not readily available because data was obtained from Twitter. Restrictions apply to the availability of these data. Requests to access the datasets should be directed to https://twitter.com/.
Author Contributions
SA and RM conceived, developed, analyzed, and validated the study. SA developed the software and prepared the initial draft, which was reviewed and edited by RM. RM, IK, and EA provided supervision, funds, and resources, and contributed to the article editing. All authors contributed to the article and approved the submitted version.
Funding
This project was funded by the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, under grant number RG-6-611-40. The authors, therefore, acknowledge with thanks the DSR for their technical and financial support. The work carried out in this paper is supported by the HPC Center at the King Abdulaziz University (KAU).
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher's Note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Acknowledgments
The experiments reported in this paper were performed on the Aziz supercomputer at KAU.
References
Abd-Alrazaq, A., Alhuwail, D., Househ, M., Hamdi, M., and Shah, Z. (2020). Top concerns of tweeters during the COVID-19 pandemic: a surveillance study. J. Med. Internet Res. 22, e19016. doi: 10.2196/19016
Abdulaziz, M., Alotaibi, A., Alsolamy, M., and Alabbas, A. (2021). Topic based sentiment analysis for COVID-19 tweets. Int. J. Adv. Comp. Sci. Applic. 12, 626–636. doi: 10.14569/IJACSA.2021.0120172
Adnan, M., and Anwar, K. (2020). Online learning amid the COVID-19 pandemic: students' perspectives. J. Pedagog. Sociol. Psychol. 2, 45–51. doi: 10.33902/JPSP.2020261309
Ahmad, I., Alqurashi, F., Abozinadah, E., and Mehmood, R. (2022). Deep journalism and DeepJournal V1.0: a data-driven deep learning approach to discover parameters for transportation (as a case study). Preprint. doi: 10.20944/preprints202203.0245.v1
Ai, F., Chen, Y., Guo, Y., Zhao, Y., Wang, Z., Fu, G., et al. (2019). Concept-Aware Deep Knowledge Tracing and Exercise Recommendation in an Online Learning System. International Educational Data Mining Society. Available online at: https://eric.ed.gov/?id=ED599194
Ai, J., and Laffey, J. (2007). Web mining as a tool for understanding online learning. MERLOT J. Online Learn. Teach. 3, 160–169. Available online at: https://www.merlot.org/merlot/viewMaterial.htm?id=875448
Alahmari, N., Alswedani, S., Alzahrani, A., Katib, I., Albeshri, A., and Mehmood, R. (2022). Musawah: a data-driven AI approach and tool to co-create healthcare services with a case study on cancer disease in Saudi Arabia. Sustainability 14, 3313. doi: 10.3390/su14063313
Alam, F., Almaghthawi, A., Katib, I., Albeshri, A., and Mehmood, R. (2021). iResponse: an AI and IoT-enabled framework for autonomous COVID-19 pandemic management. Sustainability 13, 3797. doi: 10.3390/su13073797
Alanazi, E., Alashaikh, A., Alqurashi, S., and Alanazi, A. (2020). Identifying and ranking common COVID-19 symptoms from arabic twitter. J. Med. Internet Res. 22, e21329. doi: 10.2196/21329
Alatiq, M. (2021). Digital Madrasati Platform Competition, Saudi Ministry of Education. Available online at: https://edu.moe.gov.sa/Ola/MediaCenter/News/Pages/ (accessed July 10, 2021).
Alhadwari, N. (2021). Education Launches the Intensive Week of the Schools Promoting Positive Behavior Program, Saudi Ministry of Education. Available online at: https://edu.moe.gov.sa/Tabuk/MediaCenter/News/Pages/poseteve.aspx (accessed July 10, 2021).
Alhajji, M., Al Khalifah, A., Aljubran, M., and Alkhalifah, M. (2020). Sentiment analysis of tweets in Saudi Arabia regarding governmental preventive measures to contain COVID-19. Preprints. doi: 10.20944/preprints202004.0031.v1
Alhebtali, F. (2020). Online Education is Being Affected by Internet Quality. Available online at: https://cutt.ly/VjhnPDW (accessed June 10, 2021).
Alomari, E., Katib, I., Albeshri, A., and Mehmood, R. (2021a). COVID-19: Detecting government pandemic measures and public concerns from twitter arabic data using distributed machine learning. Int. J. Environ. Res. Public Health 18, 282. doi: 10.3390/ijerph18010282
Alomari, E., Katib, I., Albeshri, A., Yigitcanlar, T., and Mehmood, R. (2021b). Iktishaf+: a big data tool with automatic labeling for road traffic social sensing and event detection using distributed machine learning. Sensors 21, 2993. doi: 10.3390/s21092993
Alomari, E., Katib, I., and Mehmood, R. (2020a). Iktishaf: a big data road-traffic event detection tool using twitter and spark machine learning. Mobile Netw. Applic. 1–16. doi: 10.1007/s11036-020-01635-y
Alomari, E., Mehmood, R., and Katib, I. (2020b). “Sentiment analysis of arabic tweets for road traffic congestion and event detection,” in Smart Infrastructure and Applications: Foundations for Smarter Cities and Societies (Cham: Springer International Publishing Switzerland), 37–54. doi: 10.1007/978-3-030-13705-2_2
Alotaibi, S., Mehmood, R., Katib, I., Rana, O., and Albeshri, A. (2020). Sehaa: A big data analytics tool for healthcare symptoms and diseases detection using twitter, apache spark, and machine learning. Appl. Sci. 10, 1398. doi: 10.3390/app10041398
Alrefaie, M. T., and Bazine, T. (2019). Largest List of Arabic Stop Words on Github. Github. Available online at: https://github.com/mohataher/arabic-stop-words
Alsalman, H. (2020). “An improved approach for sentiment analysis of arabic tweets in twitter social media,” in ICCAIS 2020 - 3rd International Conference on Computer Applications and Information Security (Riyadh). doi: 10.1109/ICCAIS48893.2020.9096850
Alsudias, L., and Rayson, P. (2020). “COVID-19 and arabic twitter: how can arab world governments and public health organizations learn from social media?,” Association for Computational Linguistics, Proceeding. Available online at: https://aclanthology.org/2020.nlpcovid19-acl.16
Alsulami, M., and Mehmood, R. (2018). “Sentiment analysis model for arabic tweets to detect users' opinions about government services in Saudi Arabia: ministry of education as a case study,” in Al Yamamah Information and Communication Technology Forum (Riyadh: Al Yamamah Information and Communication Technology Forum), 1–8.
Altawaier, M. M., and Tiun, S. (2016). Comparison of machine learning approaches on Arabic twitter sentiment analysis. Int. J. Adv. Sci. Eng. Inform. Technol. 6, 1067–1073. doi: 10.18517/ijaseit.6.6.1456
Alyaum Newspaper (2020). The Intensive Week Promotes Positive Behavior in Schools. Alyaum Newspaper. Available online at: https://www.alyaum.com/articles/6282320/-السلوك-الإيجابي-في-المدارسالمملكة-اليوم/الأسبوع-المكثف-يعزز
Baker, Q. B., Shatnawi, F., Rawashdeh, S., Al-Smadi, M., and Jararweh, Y. (2020). Detecting epidemic diseases using sentiment analysis of arabic tweets. J. Univers. Comput. Sci. 26, 50–70. doi: 10.3897/jucs.2020.004
Barns, S. (2020). Re-engineering the city: platform ecosystems and the capture of urban big data. Front. Sust. Cities. 2, 32. doi: 10.3389/frsc.2020.00032
Bestiantono, D. S., Agustina, P. Z. R., and Cheng, T.-H. (2020). How students' perspectives about online learning amid the COVID-19 pandemic? Stud. Learn. Teach. 1, 133–139. doi: 10.46627/silet.v1i3.46
Bibri, S. E. (2021). Data-driven smart sustainable cities of the future: an evidence synthesis approach to a comprehensive state-of-the-art literature review. Sust. Fut. 3, 100047. doi: 10.1016/j.sftr.2021.100047
Carpenter, J., Tani, T., Morrison, S., and Keane, J. (2020). Exploring the landscape of educator professional activity on Twitter: an analysis of 16 education-related Twitter hashtags. Prof. Dev. Educ. 1–22. doi: 10.1080/19415257.2020.1752287
Chen, M., Mao, S., and Liu, Y. (2014). Big data: a survey. Mobile Netw. Applic. 19, 171–209. doi: 10.1007/s11036-013-0489-0
Chen, X., Vorvoreanu, M., and Madhavan, K. P. C. C. (2014). Mining social media data for understanding students' learning experiences. IEEE Trans. Learn. Technol. 7, 246–259. doi: 10.1109/TLT.2013.2296520
Clark, M., Sheppard, S., Atman, C., Fleming, L., Miller, R., Stevens, R., et al. (2008). “Academic pathways study: processes and realities,” in ASEE Annual Conference and Exposition, Conference Proceedings (Pittsburgh, PA).
Cugurullo, F. (2020). Urban artificial intelligence: from automation to autonomy in the smart city. Front. Sust. Cities 2, 38. doi: 10.3389/frsc.2020.00038
Das, S., and Dutta, A. (2020). Characterizing public emotions and sentiments in COVID-19 environment: a case study of India. J. Hum. Behav. Soc. Environ. 31, 154–167. doi: 10.1080/10911359.2020.1781015
Duong, V., Luo, J., Pham, P., Yang, T., and Wang, Y. (2020a). “The ivory tower lost: how college students respond differently than the general public to the COVID-19 pandemic,” in Proceedings of the 2020 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2020 (Los Alamitos, CA), 126–130. doi: 10.1109/ASONAM49781.2020.9381379
Duong, V., Pham, P., Yang, T., Wang, Y., and Luo, J. (2020b). The ivory tower lost: how college students respond differently than the general public to the COVID-19 pandemic. ArXiv.
Duwairi, R. M., and Qarqaz, I. (2016). A framework for arabic sentiment analysis using supervised classification. Int. J. Data Min. Model. Manage. 8, 369–381. doi: 10.1504/IJDMMM.2016.081247
Elaraby, M., and Abdul-Mageed, M. (2018). “Deep models for arabic dialect identification on benchmarked data,” in Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (Santa Fe, NM), 263. Available online at: https://aclweb.org/anthology/W18-3930
ESCAP (2009). What is Good Governance? ESCAP. Available online at: https://www.unescap.org/resources/what-good-governance
Essam, B. A., and Abdo, M. S. (2021). How do arab tweeters perceive the COVID-19 pandemic? J. Psycholing. Res. 50, 507–521. doi: 10.1007/s10936-020-09715-6
Gao, P., Li, J., and Liu, S. (2021). An introduction to key technology in artificial intelligence and big data driven e-learning and e-education. Mobile Netw. Applic. 26, 2123–2126. doi: 10.1007/s11036-021-01777-7
Go, A., Bhayani, R., and Huang, L. (2009). Twitter sentiment classification using distant supervision, pp. 1–6. Available online at: https://cs.stanford.edu/people/alecmgo/papers/TwitterDistantSupervision09.pdf (accessed April 30, 2022).
Gwad, W. H. G., Ismael, I. M. I., and Gültepe, Y. (2020). Twitter sentiment analysis classification in the arabic language using long short-term memory neural networks. Int. J. Eng. Adv. Technol. 9, 235–239. doi: 10.35940/ijeat.B4565.029320
Harron, J., and Liu, S. (2020). “Coronavirus and online learning: a case study of influential K-12 teacher voices on twitter,” in Proceedings of SITE Interactive Conference 2020, ed E. Langran (Association for the Advancement of Computing in Education), 719–724. Available online at: https://www.learntechlib.org/p/218229
Hou, C., Hua, L., Lin, Y., Zhang, J., Liu, G., and Xiao, Y. (2021). Application and exploration of artificial intelligence and edge computing in long-distance education on mobile network. Mobile Netw. Applic. 26, 2164–2175. doi: 10.1007/s11036-021-01773-x
Janbi, N., Katib, I., Albeshri, A., and Mehmood, R. (2020). Distributed artificial intelligence-as-a-service (DAIaaS) for smarter IoE and 6G environments. Sensors 20, 5796. doi: 10.3390/s20205796
Jimenez-Sotomayor, M. R., Gomez-Moreno, C., and Soto-Perez-de-Celis, E. (2020). Coronavirus, ageism, and twitter: an evaluation of tweets about older adults and COVID-19. J. Am. Geriatr. Soc. 68, 1661–1665. doi: 10.1111/jgs.16508
Johns Hopkins University (2020). Coronavirus COVID-19 Global Cases by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU), Corona Dashboard. Johns Hopkins University. Available online at: https://coronavirus.jhu.edu/map.html
Kechaou, Z., Ben Ammar, M., and Alimi, A. M. (2011). “Improving e-learning with sentiment analysis of users' opinions,” in 2011 IEEE Global Engineering Education Conference. EDUCON 2011 (Amman), 1032–1038. doi: 10.1109/EDUCON.2011.5773275
Kemp, S. (2019). Digital Trends 2019: Every Single Stat You Need to Know About the Internet, thenextweb.com. Available online at: https://thenextweb.com/news/digital-trends-2019-every-single-stat-you-need-to-know-about-the-internet (accessed March 1, 2022).
Khan, L. U., Yaqoob, I., Tran, N. H., Kazmi, S. A., Dang, T. N., and Hong, C. S. (2020). Edge computing enabled smart cities: a comprehensive survey. IEEE Internet Things J. 7, 10200–10232. doi: 10.1109/JIOT.2020.2987070
Khanal, S. S., Prasad, P. W. C., Alsadoon, A., and Maag, A. (2019). A systematic review: machine learning based recommendation systems for e-learning. Educ. Inform. Technol. 25, 2635–2664. doi: 10.1007/s10639-019-10063-9
Korshunova, I., Xiong, H., Fedoryszak, M., and Theis, L. (2019). Discriminative Topic Modeling with Logistic LDA. Red Hook, NY: Curran Associates Inc. p. 6770–6780. doi: 10.5555/3454287.3454895
Lande, P., and Dalal, V. (2016). Analyzing social media data to explore students' academic experiences. Int. J. Comp. Applic. 135, 13–16. doi: 10.5120/ijca2016908258
Li, D., Chaudhary, H., and Zhang, Z. (2020). Modeling spatiotemporal pattern of depressive symptoms caused by COVID-19 using social media data mining. Int. J. Environ. Res. Public Health 17, 1–23. doi: 10.3390/ijerph17144988
Lin, Y. (2019). 10 Twitter Statistics Every Marketer Should Know in 2020, Oberlo. Available online at: https://www.oberlo.com/blog/twitter-statistics (accessed March 1, 2022).
Liu, W., Cui, P., Nurminen, J. K., and Wang, J. (2017). Special issue on intelligent urban computing with big data. Mach. Vis. Applic. 28, 675–677. doi: 10.1007/s00138-017-0877-8
Mabey, B. (2015). pyLDAvis — pyLDAvis 2.1.2 Documentation. Available online at: https://pyldavis.readthedocs.io/en/latest/readme.html (accessed May 10, 2022).
Madrastati (2020). Madrasati Competition, Saudi Ministry of Education. Madrastati. Available online at: https://www.backtoschool.sa/education/competition
Magatti, D., Calegari, S., Ciucci, D., and Stella, F. (2009). “Automatic labeling of topics,” in ISDA 2009 - 9th International Conference on Intelligent Systems Design and Applications (Pisa), 1227–1232. doi: 10.1109/ISDA.2009.165
Mehmood, R., Alam, F., Albogami, N. N., Katib, I., Albeshri, A., and Altowaijri, S. M. (2017). UTiLearn: a personalised ubiquitous teaching and learning system for smart societies. IEEE Access 5, 2615–2635. doi: 10.1109/ACCESS.2017.2668840
Ministry of Education (2022). IEN Satellite TV Channels, Saudi Ministry of Education. Ministry of Education. Available online at: https://www.moe.gov.sa/en/mediacenter/Pages/ien.aspx
Mortenson, M. J., and Vidgen, R. (2016). A computational literature review of the technology acceptance model. Int. J. Inform. Manage. 36, 1248–1259. doi: 10.1016/j.ijinfomgt.2016.07.007
Onan, A., Korukoglu, S., and Bulut, H. (2016). LDA-based topic modelling in text sentiment classification: an empirical analysis. Int. J. Comp. Lingu. Applic. 7, 101–119. Available online at: https://www.ijcla.org/2016-1/IJCLA-2016-1-pp-101-119-preprint.pdf
Oueslati, O., Cambria, E., HajHmida, M. B., and Ounelli, H. (2020). A review of sentiment analysis research in Arabic language. Fut. Gener. Comp. Syst. 112, 408–430. doi: 10.1016/j.future.2020.05.034
Rasmitadila, R., Aliyyah, R. R., Rachmadtullah, R., Samsudin, A., Syaodih, E., Nurtanto, M., et al. (2020). The perceptions of primary school teachers of online learning during the covid-19 pandemic period: a case study in Indonesia. J. Ethnic Cult. Stud. 7, 90–109. doi: 10.29333/ejecs/388
Roesslein, J. (2022a). Extended Tweets- Tweepy 4.6.0 Documentation, Tweepy. Available online at: https://docs.tweepy.org/en/stable/extended_tweets.html (accessed September 20, 2020).
Roesslein, J. (2022b). Tweepy. Available online at: https://www.tweepy.org/ (accessed November 23, 2020).
Samuel, J., Ali, G. G., Rahman, M., Esawi, E., and Samuel, Y. (2020a). COVID-19 public sentiment insights and machine learning for tweets classification. Information 11, 1–22. doi: 10.3390/info11060314
Samuel, J., Rahman, M. M., Ali, G. M. N., Samuel, Y., Pelaez, A., Chong, P. H. J., et al. (2020b). Feeling positive about reopening? New normal scenarios from COVID-19 US reopen sentiment analytics. IEEE Access 8, 142173–142190. doi: 10.1109/ACCESS.2020.3013933
Saudi Ministry of Education (2020a). Madrasti Competition. Aseer. Available online at: https://www.asedu.gov.sa/sites/default/files/users/guides/210/pdf/الخطة التنفذية لمسابقة مدرستيالرقيمية_0
Saudi Ministry of Education (2020b). Promoting Positive Behavior in Distance Education. Saudi Ministry of Education. Available online at: https://edu.moe.gov.sa/Sharqia/MediaCenter/News/Pages/NMA-0033522.aspx
Saudi Ministry of Education (2020c). The Minister of Education .. Underscored the Importance of the Role the Families and Parents in Following up Their Children's Education, Twitter Blog. Saudi Ministry of Education. Available online at: https://twitter.com/tc_mohe/status/1295433116966936578
Saudi Ministry of Education (2020d). The Ministry of Education Announces the Continuation of Distance Learning for the Remaining Weeks of the First Semester, Twitter Blog. Saudi Ministry of Education. Available online at: https://twitter.com/moe_gov_sa/status/1314290973053247489?s=28
Saudi Ministry of Education (2020e). The Ministry of Education Strengthens Community Partnership on The World Mental Health Day to Reduce the Effects of COVID-19 on Male and Female Students, Twitter Blog. Saudi Ministry of Education. Available online at: https://twitter.com/moe_gov_sa/status/1314968715335696385?s=28
Saudi Ministry of Education (2020f). The Start of the New Academic Year 1442 AH Online for a Period of 7 Weeks, The Saudi Press Agency. Saudi Ministry of Education. Available online at: https://www.spa.gov.sa/2120893
Saudi Ministry of Education (2021a). The Minister of Education: The Development of Curricula, Study Plans and Three Semesters Represent the First Stage of Ministry of Education, Twitter Blog. Saudi Ministry of Education. Available online at: https://twitter.com/moe_gov_sa/status/1397655344713719809
Saudi Ministry of Education (2021b). The Minister of Education Inaugurates Programming Competition. Saudi Ministry of Education. Available online at: https://www.moe.gov.sa/ar/mediacenter/MOEnews/Pages/mp1442-456.aspx
Saudi Ministry of Education (2022). IEN Channels. Saudi Ministry of Education. Available online at: http://www.ientv.edu.sa/
Saudi National eLearning Center (2020). Two Studies on the Kingdom's Experience in E-learning, Twitter Blog. Saudi National eLearning Center. Available online at: https://twitter.com/NCEL_SA/status/1317417926874550272
Sievert, C., and Shirley, K. E. (2014). “LDAvis: a method for visualizing and interpreting topics,” in Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces (Baltimore, MD), 63–70. doi: 10.3115/v1/W14-3110
Suma, S., Mehmood, R., and Albeshri, A. (2020). “Automatic detection and validation of smart city events using hpc and apache spark platforms,” in Smart Infrastructure and Applications: Foundations for Smarter Cities and Societies (Cham: Springer), 55–78. doi: 10.1007/978-3-030-13705-2_3
Suma, S., Mehmood, R., Albugami, N., Katib, I., and Albeshri, A. (2017). Enabling next generation logistics and planning for smarter societies. Proc. Comp. Sci. 1122–1127. doi: 10.1016/j.procs.2017.05.440
Twitter (2022). Twitter Developer Platform. Twitter. Available online at: https://developer.twitter.com/en/docs/twitter-api/v1/data-dictionary/overview
Unified National Platform GOV.SA. (2022). Health Care in the Kingdom of Saudi Arabia. Unified National Platform GOV.SA. Available online at: https://www.my.gov.sa/wps/portal/snp/aboutksa/HealthCareInKSA
Verma, A., Agarwal, R., Bardia, S., and Shaikh, S. (2016). A survey on analysing students learning experiences by extracting data from social media (social forums). Int. J. Eng. Techn. 2, 75–80.
Wang, X., Han, Y., Leung, V. C., Niyato, D., Yan, X., and Chen, X. (2020). “Convergence of edge computing and deep learning: a comprehensive survey,” in IEEE Communications Surveys & Tutorials (Institute of Electrical and Electronics Engineers), 22. doi: 10.1109/COMST.2020.2970550
Worldometer (2022). Saudi Arabia COVID-19 Statistics. Worldometer. Available online at: https://www.worldometers.info/coronavirus/country/saudi-arabia/
Yigitcanlar, T., Butler, L., Windle, E., Desouza, K. C., Mehmood, R., and Corchado, J. M. (2020a). Can building “artificially intelligent cities” safeguard humanity from natural disasters, pandemics, and other catastrophes? An urban scholar's perspective. Sensors 20, 2988. doi: 10.3390/s20102988
Yigitcanlar, T., Corchado, J. M., Mehmood, R., Li, R. Y. M., Mossberger, K., and Desouza, K. (2021a). Responsible urban innovation with local government artificial intelligence (AI): a conceptual framework and research agenda. J. Open Innov. Technol. Market Compl. 7, 71. doi: 10.3390/joitmc7010071
Yigitcanlar, T., Kankanamge, N., Regona, M., Ruiz Maldonado, A., Rowan, B., Ryu, A., et al. (2020b). Artificial intelligence technologies and related urban planning and development concepts: how are they perceived and utilized in Australia? J. Open Innov. Technol. Market Compl. 6, 187. doi: 10.3390/joitmc6040187
Yigitcanlar, T., Mehmood, R., and Corchado, J. M. (2021b). Green artificial intelligence: towards an efficient, sustainable and equitable technology for smart cities and futures. Sustainability 13, 8952. doi: 10.3390/su13168952
Yigitcanlar, T., Regona, M., Kankanamge, N., Mehmood, R., D'Costa, J., Lindsay, S., et al. (2022). Detecting natural hazard-related disaster impacts with social media analytics: the case of Australian states and territories. Sustainability 14, 810. doi: 10.3390/su14020810
Zahra, S. (2020). Targeted Topic Modeling for Levantine Arabic. Uppsala University. Available online at: http://uu.diva-portal.org/smash/record.jsf?pid=diva2%3A1439483&dswid=-4641
Keywords: urban governance, online learning, machine learning, topic modeling, latent dirichlet allocation (LDA) algorithm, social media, natural language processing (NLP)
Citation: Alswedani S, Katib I, Abozinadah E and Mehmood R (2022) Discovering Urban Governance Parameters for Online Learning in Saudi Arabia During COVID-19 Using Topic Modeling of Twitter Data. Front. Sustain. Cities 4:751681. doi: 10.3389/frsc.2022.751681
Received: 01 August 2021; Accepted: 07 April 2022; Published: 28 June 2022.
Edited by:
Sarah Barns, Queensland University of Technology, Australia
Reviewed by:
Chandana Siriwardana, University of Moratuwa, Sri Lanka
Courtney Page-Tan, Embry–Riddle Aeronautical University, United States
Copyright © 2022 Alswedani, Katib, Abozinadah and Mehmood. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Rashid Mehmood