- ¹Department of Civil and Environmental Engineering and Utah Water Research Laboratory, Utah State University, Logan, UT, United States
- ²Department of Environment and Society, Utah State University, Logan, UT, United States
Hydroinformatics and water data science topics are increasingly common in university graduate settings through dedicated courses and programs as well as incorporation into traditional water science courses. The technical tools and techniques emphasized by hydroinformatics and water data science involve distinctive instructional styles, which may be facilitated by online formats and materials. In the broader hydrologic sciences, there has been a simultaneous push for instructors to develop, share, and reuse content and instructional modules, particularly as the COVID-19 pandemic necessitated a wide scale pivot to online instruction. The experiences of hydroinformatics and water data science instructors in the effectiveness of content formats, instructional tools and techniques, and key topics can inform educational practice not only for those subjects, but for water science generally. This paper reports the results of surveys and interviews with hydroinformatics and water data science instructors. We address the effectiveness of instructional tools, impacts of the pandemic on education, important hydroinformatics topics, and challenges and gaps in hydroinformatics education. Guided by lessons learned from the surveys and interviews and a review of existing online learning platforms, we developed four educational modules designed to address shared topics of interest and to demonstrate the effectiveness of available tools to help overcome identified challenges. The modules are community resources that can be incorporated into courses and modified to address specific class and institutional needs or different geographic locations. Our experience with module implementation can inform development of online educational resources, which will advance and enhance instruction for hydroinformatics and broader hydrologic sciences for which students increasingly need informatics experience and technical skills.
Introduction
In an increasingly data intensive world, researchers and practitioners in water sciences need to apply data-driven analyses to address emerging problems, to explore theories and models, and to leverage growing datasets and computational resources. Within hydrology and related fields in environmental and geosciences, observational data are increasing in scope, frequency, and duration, and computational technologies are essential to solving complex problems (Chen and Han, 2016). Without training, students are unprepared to work or conduct research centered around large and complex data, questions, and tools (Merwade and Ruddell, 2012). To meet this need, hydroinformatics and water data science have been growing as specific topics of instruction, both in university programs and in community education settings (e.g., Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) Virtual University and University of Washington WaterHackWeek) (Popescu et al., 2012; Burian et al., 2013; Wagener et al., 2021). In parallel, incorporation of technical tools in traditional water science courses is growing, though uptake has been uneven and lags behind what many see as needed (Habib et al., 2019; Lane et al., 2021). Hydroinformatics and water data science both combine computational tools and water-related data to achieve actionable knowledge. Although the fields are overlapping, there are subtle differences, and both terms are used throughout this paper.
Within the geosciences, there is increased focus on reusability and reproducibility of research data, code, and results, as well as educational materials (Ceola et al., 2015). Several online spaces have emerged as hubs for storing and sharing lectures, code, examples, and scripts developed by instructors in hydrology, water resources, and other geosciences (Habib et al., 2012, 2019; Lane et al., 2021). The widespread shift to online education resulting from the COVID-19 pandemic illustrated the value of online instructional materials and rapidly accelerated development and transition to online formats (Beason-Abmayr et al., 2021; Rapanta et al., 2021). Community educational resources, online platforms, and increased accessibility of digital tools offer an opportunity to more fully incorporate informatics tools and techniques for data-driven hydrologic applications into water science education.
This paper reports on the current state of hydroinformatics and water data science education in the United States based on available literature and qualitative interviews and surveys with instructors of relevant courses. Another objective of this work was development of online educational modules and evaluation of the implementation platform to share insights with other instructors. Study participants offered information about key topics and technologies, formats and methods of delivery, challenges and gaps, and impacts of COVID-19 on instruction. In addition to the results of the survey, we performed a functional review of online educational platforms based on participants' criteria. Their perspectives and our evaluation were used to inform the development of online learning modules that address some of the identified challenges and gaps while demonstrating existing tools. The modules are community resources that can be incorporated into any related course, workshop, or educational program. They are a step toward sharing educational resources for reuse not only by instructors that specialize in hydroinformatics, but to incorporate informatics skills and topics more broadly in water science courses. The lessons learned from platform feature evaluation and module implementation are valuable for instructors sharing content and for further platform development.
In the Background section, we present a literature review of hydroinformatics and water data science education, including best practices for sharing educational content and outstanding gaps. The Methods section outlines the procedures and literature-informed questions of the surveys/interviews and the methodology for development of educational modules. In the Results and Discussion, we present survey results and the key points that drove the design and implementation of learning modules. The Results and Discussion also covers a review of existing online platforms and module implementation successes and challenges. Finally, the Conclusion offers an outlook for the future of hydroinformatics and water data science instruction.
Background
Hydroinformatics and Water Data Science
In an early conceptualization, hydroinformatics was described as encompassing computational tools to transform water related data and information into useful and actionable knowledge (VanZuylen et al., 1994). Although hydroinformatics may be technical in nature, water issues are inherently social, and consideration of human factors for the presentation and dissemination of results and information is a key component (Vojinovic and Abbott, 2017; Makropoulos, 2019; Celicourt et al., 2021). More recently, the definition of hydroinformatics is broadening to encapsulate water science, data science, and computer science (Burian et al., 2013; Chen and Han, 2016; Vojinovic and Abbott, 2017; Makropoulos, 2019). The objective of data science is application of analytical methods and computational power with domain understanding to transform data to decisional knowledge (Gibert et al., 2018; McGovern and Allen, 2021). When applied to the water domain, this definition is very close to that of hydroinformatics, and for most practical purposes, it is difficult to draw boundaries between hydroinformatics and water data science.
Based on the increasing volume, variety, and availability of data sources and the advancement of software and hardware tools, there is opportunity and need for the application of data science to water, environmental, and geoscience domains (Burian et al., 2013; Gibert et al., 2018). Hydrologic science is shifting from collecting data to support existing conceptual models toward analyses based on models derived from observational data (Chen and Han, 2016). In this paper, we report on how current instructors of hydroinformatics and water data science define their fields and the topics and technologies that are growing in importance in these fields.
Hydroinformatics and Water Data Science Education
Without training in data intensive approaches with modern technological tools, students will be unprepared to solve emerging water problems (Merwade and Ruddell, 2012; Lane et al., 2021). Technology integration and data and model-driven curriculum are key components for advancing hydrology education (Ruddell and Wagener, 2015). Many have recommended educational pedagogies for hydrology that are “student-centered” or “problem-based,” which describe applications that deepen learning by connecting to real-world contexts (Wagener and McIntyre, 2007; Ruddell and Wagener, 2015; Habib et al., 2019; Maggioni et al., 2020). Students need to learn using real-world datasets, actual tools, and open-ended problems, also referred to as “ill-defined,” “authentic,” or “experiential” (Ngambeki et al., 2012; Burian et al., 2013; Maggioni et al., 2020; Lane et al., 2021).
Hydroinformatics was initially taught in the mid-1990s to enable engineers to apply information technology to complex water problems (Abbott et al., 1994). Specific programs have since developed, including courses for professionals (Popescu et al., 2012) and graduate students (Burian et al., 2013) and complete doctoral programs (Wagener et al., 2021). However, hydroinformatics courses remain limited. To gain informatics skills, students often rely on technology incorporated into traditional hydrology courses, pursue self-learning (e.g., online courses and tutorials), or enroll in computer-centric courses that do not address the focused set of topics with domain-specific applications covered by hydroinformatics.
Training in data science is typically separate from domain sciences; however, data science curricula cannot adequately address domain knowledge, so students are expected to rely on their own “substantive expertise” (Grus, 2015). Voices in industry and academia are calling for well-rounded and technology-literate water scientists (Chen and Han, 2016; McGovern and Allen, 2021), which may be achieved by packaging informatics and/or data science topics with real-world water science applications (Gibert et al., 2018; Wagener et al., 2021). In this paper, we use information gathered from instructors to understand how courses are being taught, what techniques are successful, and what would be useful going forward.
Sharing Educational Content
As technology and applications advance, books and even online content may become outdated quickly, and hydroinformatics and water data science instructors are challenged to keep up (Wagener et al., 2007; Makropoulos, 2019; Maggioni et al., 2020). Given shifts toward big data, open data sources, reproducible research, and data-driven analysis, many have called for advancement in content for teaching water science and methods for delivery of that content (Seibert et al., 2013; Habib et al., 2019). The COVID-19 pandemic caused many courses to be moved to virtual platforms, prompting evaluations of instructional formats and a call for additional online educational material (Maggioni et al., 2020).
Community platforms and resources can advance water science instruction by facilitating data-driven learning and offering common principles and approaches for teaching (Merwade and Ruddell, 2012; Popescu et al., 2012; Wagener et al., 2012; Makropoulos, 2019). Although water science modules have been shared and published online (e.g., Habib et al., 2012; Wagener et al., 2012; Merck et al., 2021; Gannon and McGuire, 2022), without integration within a common platform, modules are difficult to identify, access, and implement. In 2012, Merwade and Ruddell noted that an appropriate system was not yet in place, and there remains no single clearinghouse of educational resources in the field. More recently, Maggioni et al. (2020) and Lane et al. (2021) developed and published course content via HydroLearn (https://www.hydrolearn.org/). Lane et al. (2021) made the case that online educational materials should be supported by active learning, basic templates, adaptation, multiple content types, and pedagogical tools, which are emphasized in the HydroLearn platform. To these functional capabilities, we add that systems need to offer persistence as we were unable to access many of the online resources that were reported in the literature. They were either missing completely, lacking crucial metadata, or using outdated software or systems.
Our review of the literature identified key components, guidelines, and best practices for sharing educational content along with gaps and opportunities to improve. In this paper, we also consider key components of successful online modules as identified by hydroinformatics and water data science instructors, which we used as criteria to select an online educational platform. Based on these findings, we describe the development and implementation in an online system of four modules focused on hydroinformatics and water data science, which are available for instructors to adapt into courses and may serve as examples to the community.
Methods
Survey and Interview Methodology
We developed survey and interview questions that focused on the instructors' courses and their perspectives on the future of the field (Table 1). Participant responses were analyzed to identify common themes surrounding key research questions: (1) What is the current state of instruction in hydroinformatics and water data science, including the effectiveness of tools being used for in-person and online instruction?; (2) How has the COVID-19 global pandemic affected instruction?; (3) Which topics comprise hydroinformatics education and what topics are growing in importance?; (4) What are the major challenges in hydroinformatics instruction?; and (5) How can shared instructional resources be beneficial for instructors and students? Although this analysis was primarily qualitative, where commonalities emerged, we were able to tally responses and present quantitative results.
Potential participants were initially identified via investigator connections, review of relevant literature, and information on institutional and personal websites discovered by Internet searches. Target participants were selected based on their experience teaching hydroinformatics, water data science, or related subject matter at an institution of higher education. We used email to invite contacts to participate, and participants elected to respond to questions either via online survey or recorded interview. During each interview or survey, participants were asked to identify any additional instructors who might be a good fit for the project.
While the questions for surveys and interviews were the same, both approaches were used so that participants could choose their preferred mechanism to respond. We acknowledge that the different modes for data collection may have influenced the length or character of the responses, but we made this decision to maximize the potential for participation. We observed that content specificity did not differ greatly between surveys and interviews. The survey was composed using Qualtrics software and administered with links personalized for each participant. Interviews were conducted over Zoom, recorded, and subsequently transcribed. Each interview lasted approximately 45–60 min. Notes were taken during all interviews in case of issues with audio. A total of 18 instructors participated in interviews (n = 7) or responded via survey (n = 11). Herein, we refer to interview and survey participants as “participants” and do not differentiate by the mode in which they participated. Procedures were approved by the Utah State University Institutional Review Board for Human Subjects Research with participation limited to instructors within the United States.
Review of Educational Platforms and Modules
From participants and our own review, we identified several existing online platforms for sharing educational content. Using the survey and interview responses, we extracted characteristics that participants considered important in an online platform for depositing materials and used these to assess available options. We identified specific instances of educational materials from the hydroinformatics community that are available online for each of the considered platforms.
Module Development
We evaluated educational platforms based on the criteria identified in interview and survey results to determine the repository and format to use for depositing the educational modules developed as part of this work. At a minimum, we required that modules be implemented in an open access format. Our selection of a particular platform does not signify that it should be preferred for all instructors, courses, or learning situations, and we anticipate that instructors will adapt content to their preferred interface.
We used the suggestions from participants to inform the topics for the educational modules developed as part of this work. Given the breadth of suggested topics, our team could not develop modules to comprehensively cover all areas. This points to the need for community resources to take advantage of the varied teaching and research expertise of instructors. Rather than serve as a complete and unified set of educational content, the modules we developed act as a demonstration and a launching point for sharing content.
Our conceptual model of a learning module, independent of any specific technological implementation, consists of the following elements: (1) learning objectives, (2) narrative, (3) example code, and (4) technical assignment. The learning objectives guide the content that is presented through the other elements and may be presented separately from or as part of the narrative. The narrative covers the core concepts and topics and is communicated through various formats (e.g., slides, documents, and/or video). Example code may take the form of scripts, formatted markdown or text, or an interactive code notebook. Technical assignments consist of authentic, open-ended tasks based on real-world data that require students to implement code and write a descriptive summary. Authentic tasks are high cognitive-demand activities designed to reflect how knowledge is used in real life and to simulate the type of problems that a professional might tackle. Because authentic tasks have no single answer, they avoid concerns with publicly available solutions and achieve higher-level learning objectives. Each assignment includes a grading rubric to ensure that expectations and evaluation criteria are clearly defined and that activities are aligned with learning objectives, outcomes, and assessment, an approach referred to as constructive alignment (Biggs, 2014).
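As an illustration only, the four-element conceptual model can be sketched as a small data structure. The class and field names below (`LearningModule`, `TechnicalAssignment`, `missing_elements`) and the example file names are hypothetical and are not part of the published modules or of any platform; the sketch simply makes explicit which elements a module is expected to contain.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TechnicalAssignment:
    """Authentic, open-ended task based on real-world data (hypothetical structure)."""
    description: str
    dataset: str        # assumed pointer to the real-world data used
    rubric: List[str]   # grading criteria aligned with the learning objectives


@dataclass
class LearningModule:
    """The four conceptual elements of a module, independent of any platform."""
    title: str
    learning_objectives: List[str]
    narrative: List[str]       # e.g., slides, documents, and/or video
    example_code: List[str]    # e.g., scripts, markdown, or code notebooks
    assignment: TechnicalAssignment

    def missing_elements(self) -> List[str]:
        """Return the names of required elements that are still empty."""
        missing = []
        if not self.learning_objectives:
            missing.append("learning_objectives")
        if not self.narrative:
            missing.append("narrative")
        if not self.example_code:
            missing.append("example_code")
        if not self.assignment.rubric:
            missing.append("rubric")
        return missing


# Hypothetical module used only to exercise the structure.
demo = LearningModule(
    title="Example module",
    learning_objectives=["Retrieve and summarize a streamflow time series"],
    narrative=["slides.pdf"],
    example_code=[],  # example code not yet written
    assignment=TechnicalAssignment(
        description="Summarize one year of observations for a site of your choice",
        dataset="user-selected real-world dataset",
        rubric=["Code runs", "Summary interprets the results"],
    ),
)
print(demo.missing_elements())
```

A check of this kind could help a module author confirm that every element, including the rubric, is present before sharing the module.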
Results and Discussion
Survey and Interview Results
Each instructor's definition of the terms “hydroinformatics” or “water data science” was unique, but all centered on common themes of using computers and informatics tools to solve water problems, including data collection, storage, sharing, interpretation, analysis, synthesis, and modeling. One participant simply defined hydroinformatics as “data and water.” The following quote summarizes the motivation for teaching these subjects:
“We have…talented, quantitatively savvy people…engineers and geologists and hydrologists and scientists that live and breathe data analysis and are limited by the tools they use. And we also have increasing data volume and aging infrastructure, emerging pollutants, drought, climate change. There [are] so many challenges our field faces. So, the goal is to give people modern tools to deal with modern water data challenges.”
The interviews and surveys generated a rich body of results, which we distilled in view of our core research questions. The current state of instruction in hydroinformatics and water data science, including impacts related to the COVID-19 pandemic, is addressed in the subsection Courses, Platforms, and Modes of Delivery. The subsection Challenges and Benefits of Online Delivery focuses on the effectiveness of tools for online instruction. What comprises hydroinformatics education is covered in the subsection Content, Technology, and Topics. The subsection Challenges and Future Directions addresses the major challenges in hydroinformatics instruction. The Shared Resources subsection addresses interest, considerations, and potential benefits of shared instructional resources. In the following results, the number of participants (out of 18 total) that correspond to each response is reported parenthetically.
Courses, Platforms, and Modes of Delivery
The courses taught by participants include hydroinformatics and related courses with emphases on data science, research computing, and data and analysis tools (see Table 2). Most of the courses taught by participants are directed to university graduate students (14), though a few are undergraduate Introduction to Data Science classes (2), several courses are a mix of undergraduate and graduate students (4), and a few are designed for professionals (2). Most of the graduate classes permit some undergraduate enrollment, and several instructors noted that students at their institutions are exposed to some hydroinformatics topics in lower-level hydrology or geographic information system (GIS) classes.
Most of the courses are conducted in-person, although some had an online component even prior to COVID-19. In total, 12 out of 18 participants teach courses in person. Of these, most moved to an online format because of the COVID-19 pandemic. A few instructors (4) did not teach during this period due to buyout, sabbatical, or changing institutions. Multiple instructors (3) developed courses during the pandemic that would normally be held in-person. Of the courses offered fully online (6), one is a course for professionals, one was offered through an online community college, one was designed for a virtual university, and the remaining 3 are taught through universities.
Of those participants who moved from in-person to online because of COVID-19, most did not significantly change course structure but continued to use a format consisting of lectures with slides and coding demonstrations. Some instructors held synchronous classes over Zoom while others recorded lectures for asynchronous viewing. Maintaining course content while changing some modalities was a commonly reported adaptation to the global pandemic (Beason-Abmayr et al., 2021; Smith and Praphamontripong, 2021). Additional modifications to address challenges of online learning are described in Section Challenges and Benefits of Online Delivery.
Although hydrology and hydroinformatics have been identified as well-suited for online instruction (Merwade and Ruddell, 2012; Popescu et al., 2012; Wagener et al., 2012), even technologically savvy instructors with informatics-focused curriculum were generally returning to in-person formats even before the COVID-19 pandemic was over. The return to in-person instruction may be related to institutional expectations and instructors' preferences rather than ineffectiveness of tools and technologies (Rapanta et al., 2021). However, several instructors perceived benefits to online aspects and reported adjusting their teaching formats accordingly. A handful plan to shift modalities to alternate in-person and online classes or to a flipped format where lectures are recorded and viewed asynchronously while in-person class periods are work sessions. One participant was pleased with outcomes from online instruction and planned to continue with a purely online format. This is consistent with literature from other fields reporting that a flipped teaching format eased the transition between in-person and online education (Beason-Abmayr et al., 2021). Furthermore, the forced transition to online instruction can facilitate a deliberate integration of online and in-person instruction that is beneficial to active learning (Rapanta et al., 2021).
Instructors reported implementing a wide range and multiple layers of educational platforms to support instruction and handle course materials. Out of 18 participants, most (16) used a learning management system (e.g., Canvas, Blackboard, Brightspace, Sakai) for grading and assignment submission. For messaging with students, some used Canvas (or similar), though several instructors reported success in transitioning all course communication to Slack (2). For some, the learning management system was used to share files, while others stored and shared code and datasets with repositories in GitHub (6) and HydroShare (4), and a few reported using email or Google Drive. All these platforms were generally reported to be effective for both in person and online instruction, and several instructors planned to continue using Slack when returning to in-person instruction.
Most of the participants reported conducting live coding during lectures, whether synchronous or asynchronous, online or in-person. Some instructors switch between traditional teaching material (e.g., slides, videos) and live coding while others exclusively use coding interfaces for instruction. Many instructors (6) reported teaching with code notebooks (e.g., Jupyter) that can be launched from a web browser and include text and images as scaffolding to explain and support the code. Some instructors reported advantages to using GitHub and Jupyter notebooks:
“Jupyter notebooks enable us and our students to have a conversation with a problem and link to resources, like audio, video, images, visualizations and implement water resources projects step by step.”
“Jupyter notebooks work great for teaching either online or in person… They are especially nice for students working through in-class exercises. We…share screens while the instructor or students work through problems.”
“…copying [the assignment] to my private [GitHub repository] for grading and…deleting …the code that the students need to fill out but leaving the results…then committing those to the public repo [is]…a great tool…because [they] know what the answer should look like. … there's…self-training and…self-evaluation…by…working on their code until they get it to look like what it should.”
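The workflow described in the last quote (removing the code students must write while leaving the results visible) can be approximated in a few lines. The following is a minimal sketch, not the participant's actual procedure: it assumes the notebook follows the standard Jupyter JSON layout (a `cells` list with `cell_type`, `source`, `outputs`, and `metadata.tags`), and the tag name `student-code` and the function name `make_student_version` are hypothetical.

```python
import copy


def make_student_version(notebook: dict, tag: str = "student-code") -> dict:
    """Return a copy of a notebook dict with tagged code cells blanked.

    The source of each tagged code cell is replaced by a placeholder, while
    its outputs are left in place so students can see the expected result.
    """
    student = copy.deepcopy(notebook)  # never modify the instructor's copy
    for cell in student.get("cells", []):
        tags = cell.get("metadata", {}).get("tags", [])
        if cell.get("cell_type") == "code" and tag in tags:
            cell["source"] = ["# YOUR CODE HERE\n"]
    return student


# Tiny in-memory notebook used only for illustration.
example = {
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["Compute mean daily flow."]},
        {"cell_type": "code", "metadata": {"tags": ["student-code"]},
         "source": ["mean_flow = sum(flows) / len(flows)\n"],
         "outputs": [{"output_type": "stream", "name": "stdout",
                      "text": ["12.5\n"]}],
         "execution_count": 1},
    ],
    "metadata": {}, "nbformat": 4, "nbformat_minor": 5,
}

student = make_student_version(example)
```

In practice, the notebook would be read with `json.load` from the instructor's private copy and the result written with `json.dump` before committing it to the public repository.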
Challenges and Benefits of Online Delivery
The most reported challenges for online delivery were interpersonal and not unique to hydroinformatics or water data science. Instructors were concerned about meaningful engagement with students, lack of feedback and participation during lectures, and students struggling without the camaraderie and accountability of an in-person instructor and classmates. The paucity of in-person interaction and decreased student engagement have been reported as common concerns with the abrupt shift to online learning (Daniels et al., 2021; Godber and Atkins, 2021).
“…a lot of tactile things…are lost in a virtual format, and that can be very frustrating for students and instructors and really slow the course down.”
“You ask a question, and there's no feedback. You don't see anybody's faces. You don't hear any response. …you have to force those interactions and knowledge checks through some other mechanism.”
Instructors also reported difficulties with determining the best formats and technologies for rapidly pivoting to online instruction and the time-consuming nature of creating high quality online content. Reduced interaction and the time required for instructors to develop content are established drawbacks to online learning (Habib et al., 2019; Wagener et al., 2021), especially with the rapid shift that occurred in 2020 (Godber and Atkins, 2021; Rapanta et al., 2021).
A concern expressed by multiple instructors (6) specific to computer-based classes was the difficulty of troubleshooting and reviewing code and errors without being able to crowd around the screen, consistent with challenges reported by Gannon and McGuire (2022). Another issue for several instructors was getting hardware and sensors into the hands of students.
“…during the hands-on lab, I stop by each student and see if they're following and if they can finish that specific section of the code. …But in Zoom, it's relatively harder to see all the screens and then go back to each one…a classroom environment is often very engaging and more hands on for students. They can easily talk to the person next to them and get some help.”
“Live coding is challenging because students don't often have multiple screens, so typing code while watching the lecture requires some careful window manipulation.”
To address these challenges, instructors adjusted to hold more office hours and help sessions and increase communication opportunities, which was also important for Smith and Praphamontripong (2021) in transitioning a coding class online.
“I polled students [to ask] what's going on? What are the pain points? …they really enjoyed being able to watch stuff on their own time. So instead of doing a live lecture, I ended up doing recordings and then during the lecture times I [held] office hours. In fact, I started doing…office hours at…9pm, 10pm. It was crazy how busy they were.”
“We do a lot of office hours due to COVID so that we can connect, look at their screen…What's the problem with their code? I increased [office hours], but also, I schedule meetings with students if they have a [specific] problem…it's not really that engaging as in person, but still, we try to support the missing pieces…through some online meetings.”
Participants reported that communicating expectations for online classes and deliberately facilitating interaction helped ensure student engagement.
“We make it a point to tell students that being in an online class is no different than being face-to-face in terms of being engaged or not….This helps the students get to know each other and learn how to navigate online meetings, which is a great professional skill to develop. We are also more intentional in encouraging community in the online class; I have an “ice breaker” question related to data science each day, and many students submit their answers in the chat window.”
Despite the challenges of online delivery, instructors deemed several aspects of online instruction beneficial. Zoom was an effective technology for interactive remote instruction, and several participants preferred live coding via Zoom rather than in the classroom because students could more easily follow along and screenshare their own work. For some participants, Zoom breakout rooms facilitated group work. Benefits of live coding with screen sharing and of online breakout rooms have also been reported in the literature (Beason-Abmayr et al., 2021; Smith and Praphamontripong, 2021).
“If anything, the class may have gone more smoothly this way because everyone was sitting at a computer all the time so we could more easily screen share and debug and demonstrate across the instructor and student machines.”
“There are some elements of being online that work really well for this class. …The course is …flipped, so each professor prepares…videos for the students to watch in advance, and they also prepare a set of in-class exercises. During class, we split the students into breakout groups of 4-5 students each, and they work on the exercises. The professors and TA circulate through the rooms answering questions. At the end of the class period, we reconvene to discuss interesting problems or issues that arose while the students worked.”
Even with a return to in-person instruction, some are retaining approaches that were successful during the online period. These adjustments include non-traditional modalities for synchronous/asynchronous lecture and work sessions and increasing the use of tools and platforms such as Zoom, Slack, and Jupyter notebooks. This reflects the recommendations made by Rapanta et al. (2021) to retain effective aspects of online learning when blending with in-person modalities so that digital technologies support rather than hinder active learning.
Content, Technology, and Topics
All participants reported creating custom materials for their course and/or adapting content from other sources. A majority (13) created most of the instructional materials for their course. Only a handful (4) used any textbook: one hydroinformatics text, one modeling text, one statistics text, and one existing coding book whose examples the instructor converted to water resources. One reported challenge is that the field evolves rapidly: its technology and applications change faster than published textbooks can keep up with. Several instructors (4) borrowed, exchanged, or modified material from each other.
“I have created all of my own course materials. I do not use a text. Most materials were drawn directly from my own research and project experience or that of my close colleagues.”
“We have built up the course material from scratch…we were not aware of a…textbook that would teach the students at the level that we wanted and with the types of R programming that we wanted while illustrating with the water-related data that we wanted.”
Regarding the technologies emphasized, almost all instructors teach coding in Python (10) or R (6). In addition, instructors cover structured query language (SQL) (4), ArcGIS (3), Arduino (3), and web technologies (i.e., PHP, JavaScript, HTML, CSS) (3). In several cases, the course evolved from using MATLAB to R to Python so that students gain experience in a non-proprietary coding language that they can use in subsequent settings regardless of affiliation.
“I had a student who was just an outstanding computationalist. …got a great job…came back and she said…I really loved your class and I wish I still had…the ability to do those kinds of analyses, but our company won't pay for the MATLAB license…it was just heartbreaking because…think about what your company is missing out on you not being able to do that…I [determined] to…move this to Python or something that they're going to continue to have access to, regardless of where they work in the future.”
Although hydroinformatics is centered on tools, participants emphasized teaching students how to learn new informatics tools rather than focusing on specific technologies, a finding that echoes Burian et al. (2013). Several instructors noted that hydroinformatics technologies continue to advance, which makes it hard to settle on a set of tools for teaching a course and highlights the need to teach students how to recognize which tools to use in different scenarios.
“Students might never use those specific tools again, but have skills to learn new tools.”
“I do not expect that students leaving my class will be experts in any of these skills. However, they should have explored each of them and developed a level of proficiency that they know which of them will be the most useful in their research and future careers and which may be the most important for them to invest further time and effort into becoming more proficient.”
“I think we have reached a point where there are relatively good cyberinfrastructure components out there in the hydroinformatics domain and now one of the bigger problems is composability - e.g., how can students and researchers learn all of the available tools and then decide which tools to put together in composing a research, data analysis, data science, modeling, etc. workflow.”
Other instructors emphasize data and project management skills, which are agnostic to specific technologies or tools.
“My expectations for the informatics skills…are…more about…habits of mind and computational practices around…reproducibility and…sustainable code…making sure that their code is under version control, making sure that they're using things like Jupyter notebooks to provide…traceable and reproducible demonstrations of their workflows, more so than any kind of specific technique that they're using.”
A skill that participants repeatedly identified as important was effective troubleshooting, including understanding documentation and finding help through forums and other resources.
“We…encourage students to use the internet to help them work through problems and troubleshoot coding errors (e.g., Google, StackOverflow).”
Each instructor and each course has specific emphases. While there is variety in what is taught, the overlap of common subjects illustrates key topics and themes that currently make up hydroinformatics instruction (Figure 1). Most instructors (13) focus on scripting and coding basics (in Python, R, or MATLAB) with emphases on data formatting, manipulation, and wrangling (12) and data visualization and plotting (11). Data science (10), basic statistics (7), and machine learning topics (7) were commonly mentioned. About half of participants covered geospatial topics such as mapping (7) and spatial analysis (10), which some instructors view as essential while others exclude these topics as they are covered by other courses. Several participants (6) include instruction on workflows, reproducibility, and best practices for coding. Other topics mentioned by multiple instructors included databases, data models, and SQL; dataloggers and sensors; modeling; the data life cycle and metadata; Git; and web services and web mapping tools.
Because of the open-ended nature of the questions, these numbers should be interpreted generally; for example, more instructors may include content on metadata but did not explicitly mention it. Similarly, “modeling” is a broad term with various meanings and implementations. Despite these limitations, we can identify a few important takeaways. First, hydroinformatics is broadening its focus from modeling with custom tools and graphical user interfaces (GUIs) (as described in many of the papers we reviewed) to more strongly emphasize data management, visualization, and analysis using open-source scripting tools. These capabilities provide a broader path for addressing water-related challenges and questions.
“[The] basics of how to organize, use, and process data has not changed, but the technology to do that keeps changing. For example, we no longer use interface or GUI… The term workflow was not used earlier but is now used frequently. There is more use of internet-based tools and publicly available/open-source tools.”
“Things are becoming more standard; the tools keep getting better. We are now able to use mostly open-source mainstream languages and tools for our specialized environmental informatics work; 20 years ago we needed to build and use clunky, custom-purpose tools. This is much better now. It also means, however, that there is less need for ‘hydroinformatics’ specific tools and methods.”
Second, a primary objective for many of the instructors was to ensure that students are comfortable working in one scripting language and understand the basic concepts of functions, conditional statements, iteration, logical operations, data management, querying, and visualization. Any modeling being taught is within the context of open-source scripting environments. We observed that data science, statistics, and machine learning topics are generally being taught in the water data science courses while databases, sensors, and spatial analyses are being taught in strictly hydroinformatics classes. However, the crossover between these topics is growing, and the boundaries between hydroinformatics and water data science are fuzzy.
Third, several instructors emphasize communicating scientific data and results, and others focus on enabling students to translate the skills gained in the course into resume entries or a digital code portfolio.
“I'm big on science communication…that was the first time that they had ever really had someone be pedantic enough to talk about presentation of data, quality of graphs, quality of the writing.”
“I try to work with them to put it on their resume in a way they can explain it. …they're getting some really cool jobs…they wouldn't have gotten, as a result…So it basically opens up career trajectories that are not just typical civil and environmental consulting.”
“At the end of the class I'm hoping that they have…a GitHub repository that has…Jupyter notebooks that are their problem sets that they feel comfortable sharing on their LinkedIn profile or their CV that [is] a small e-portfolio of a demonstration of things [they] can do computationally.”
Challenges and Future Directions
There was little consensus on the identified challenges and future directions (Figure 2), which reflects our finding that instructors are developing their own content based on their own definition of the field, drawing from their own research and experience. Many participants identified machine learning, deep learning, and/or artificial intelligence as increasingly relevant, reflecting the growing use of these techniques in water science (Shen, 2018; McGovern and Allen, 2021; Nearing et al., 2021). Beyond covering those topics broadly, some instructors offered specific ideas, including better understanding why some techniques do or do not work for some datasets, addressing correlation in data, and using data-driven modeling with physics-informed machine learning. Sensors and hardware-related subjects were identified as important by many participants, including managing high-frequency data, low-power and ubiquitous sensing, and smart sensors with controls and feedback for real-time decision making. Participants also mentioned electronics, drones, and satellite data. Data management aspects included data quality, reproducible analyses, big data, database schemas and SQL, and collaborative version control (e.g., GitHub).
“So there's always going to be an importance in a baseline proficiency in working with tabular and spatial data within water resources data science. …as data volumes increase, then you need…database skills, so creating schemas, interacting with databases, whether that's Postgres on a cloud or [SQLite] on your local computer. …something [that will] hold really big volumes of data, and then interact with it in a structured query language.”
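The database skills this participant describes (creating a schema, loading observations, and querying them in SQL) can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The table names, column names, and values below are hypothetical and chosen only for illustration; they are not drawn from any course or dataset described in this paper.

```python
import sqlite3

# In-memory SQLite database; passing a file path instead would persist it locally.
conn = sqlite3.connect(":memory:")

# Hypothetical two-table schema: monitoring sites and their observations.
conn.executescript(
    """
    CREATE TABLE sites (
        site_id   INTEGER PRIMARY KEY,
        site_name TEXT NOT NULL
    );
    CREATE TABLE observations (
        obs_id   INTEGER PRIMARY KEY,
        site_id  INTEGER NOT NULL REFERENCES sites(site_id),
        obs_time TEXT NOT NULL,
        variable TEXT NOT NULL,
        value    REAL
    );
    """
)

# Made-up example rows (not real measurements).
conn.executemany(
    "INSERT INTO sites (site_id, site_name) VALUES (?, ?)",
    [(1, "Upstream gauge"), (2, "Downstream gauge")],
)
conn.executemany(
    "INSERT INTO observations (site_id, obs_time, variable, value) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, "2021-06-01T00:00", "discharge_cms", 2.0),
        (1, "2021-06-01T00:15", "discharge_cms", 4.0),
        (2, "2021-06-01T00:00", "discharge_cms", 5.0),
        (2, "2021-06-01T00:15", "discharge_cms", 7.0),
    ],
)

# Join the two tables and aggregate: mean value per site for one variable.
mean_by_site = conn.execute(
    """
    SELECT s.site_name, AVG(o.value) AS mean_value
    FROM observations AS o
    JOIN sites AS s ON s.site_id = o.site_id
    WHERE o.variable = ?
    GROUP BY s.site_name
    ORDER BY s.site_name
    """,
    ("discharge_cms",),
).fetchall()

for site_name, mean_value in mean_by_site:
    print(f"{site_name}: {mean_value:.2f}")

conn.close()
```

The same SQL would run largely unchanged against a server database such as Postgres; only the connection code would differ.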
One participant noted that web applications are overtaking desktop applications, further evidenced by several participants identifying cloud computing and technologies as an area of growing importance. For geospatial topics, emerging applications include open technology and platforms (e.g., Google Earth Engine) and open remote sensing products. Although visualization is covered in most of the courses, several participants noted that creative, interactive visualization tools and dashboards are increasingly important.
The range of responses regarding topics of growing importance demonstrates that these subjects are broad and varied, and that the tools, technologies, and topics continue to evolve, compelling instructors and courses to be agile. The challenge of defining and teaching a moving target was reiterated by several participants. Despite the long list of possible topics to cover in a course, one participant suggested that simplifying to cover fewer tools and models is preferable. Given the inflexibility of most engineering and science degree curricula and class structures, it is unlikely, outside of specifically focused degree programs, that additional hydroinformatics and water data science classes will proliferate in most university settings. However, it is feasible, and arguably preferable, that hydroinformatics and data science topics be better incorporated into other existing courses.
“Students have told me previous versions of this course was foundational for their PhD/MS and that it was ‘the most useful course I have ever taken’. They appreciated…the hidden curriculum (stats/R/programming) was brought to the forefront in my classes.”
“Students get very little, if any, exposure to hydroinformatics with their undergraduate degrees. I am in a Civil and Environmental Engineering department, and our undergraduate curriculum is so tight that students have very few options for tailoring their undergraduate degrees. Thus, many…show up in graduate school lacking the preparation for making advances in hydroinformatics.”
A major gap reported by participants is students' lack of baseline programming experience. Most of the courses expect some level of domain knowledge but do not require programming skill. However, getting students up to speed consumes precious time, and instructors would prefer programming/scripting at earlier levels (i.e., undergraduate). Participants reported difficulty in approaching advanced topics when students are learning to program for the first time, similar to Lane et al. (2021). Although computational skills are critical to water science and hydrology fields (Merwade and Ruddell, 2012), students are often expected to figure them out without explicit instruction (i.e., the “hidden curriculum”).
“Mainly I think hydroinformatics concepts could be introduced earlier or at all in undergraduate education. These things are so critical to the field that I think a solely analog hydrology course is a disservice to students.”
“If students don't come prepared with coding competency and conceptual fluency in computer science, they struggle to learn the applications to environmental fields.”
Shared Resources
Participants unanimously indicated moderate to high interest in sharing and exchanging teaching materials, and several reported already depositing educational content online. However, the materials are spread out in various formats over multiple platforms, and we were unable to locate some of the resources reported to be available. There is no single centralized platform, and implementations range from files uploaded to a personal website to a fully interactive online course. Reported interest and rate of uptake are uneven. One participant prepared and posted course content in a public repository with no knowledge of reuse, while another shared content in an interactive website and received feedback from multiple external users. Even so, the level of reuse is modest relative to what some participants consider necessary for high impact.
“You have to make it easy and provide a venue where a significant number of students or other faculty will pick up on content.”
Despite universal interest in sharing materials, some participants expressed hesitancy to rely on others' content, to personalize and adapt it to fit their class, and to invest the time to gain the expertise to present others' materials.
“I don't know that…I would have grabbed someone else's material and…taught…a course. There's a lot of value I found as an instructor in having to prepare all the material from scratch myself as a way of making sure I actually know what I'm talking about. …it is very nice to have other resources [as a] stencil of what a class might look like, and what good topics would be…I would probably still have to spend the time to develop…a copy of that myself so that I actually knew what I was doing.”
A barrier to exchanging materials is the difficulty of knowing what modules or case studies exist, so an ideal system would facilitate discovery. Other desirable qualities of a platform, as identified by participants, include complete descriptions/metadata, a navigable interface, straightforward functionality for adding content, and separate teacher/student access.
“Some website where it is easy to search and find modules. It should be easy to navigate and easy to add new contributions. It would be cool if you could see how other faculty members have put together modules to create their own course.”
For shared resources, instructors are interested in portable programming examples, particularly: (1) Jupyter notebooks consisting of code and supporting theory and instructions in markdown, and (2) GitHub repositories that can be cloned and adapted. Other suggestions included slide decks, videos, handouts, example assignments, HydroShare resources, and ArcGIS Online content. Participants wanted modular, self-contained exercises that can be modified and swapped into classes.
“Self-contained coding exercises that maybe on the first iteration can address a single problem, but then the instructor themselves can develop the sequence of problems that are the deeper dives after that. Something that can be easily plug and played into an existing curriculum or into an existing lecture, and then…would encourage ownership of the content.”
As with topics of increasing importance, topics of interest for shared resources varied (e.g., databases, interactive visualization, data-driven hydrologic models, and cloud computing). Regardless of topic, domain-specific datasets were consistently mentioned as a key need for shared resources.
“The biggest [need] is domain specific data that works for the kind of examples that we need to show…datasets that are large, complex, have hidden components in them that we're going to find, can be used to make a case for or against something…that can serve as good examples. And it's a slippery slope because either the dataset is too simple and it's silly. It's like 10 data points and we're drawing a line through it. Or it's…somebody's PhD dissertation and good luck getting that like into some sort of format where an undergrad can actually use it in the class.”
“Datasets that are ready to be used for illustration in class. These must have associated metadata that describes why the data was collected, what the researchers hoped to achieve with it, what each of the variables is, the sampling frequency, and what the data can be used to illustrate (i.e., clustering, visualization, regression, etc.).”
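One way to make the metadata requested in this quote concrete is a small structured record that travels with each teaching dataset. The sketch below is an illustration under stated assumptions: the class name, field names, and example values are hypothetical and do not represent a schema used by HydroShare, HydroLearn, or any other platform discussed here.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class TeachingDatasetMetadata:
    """Hypothetical metadata record for a classroom-ready dataset."""

    title: str
    purpose: str             # why the data were collected
    research_goal: str       # what the researchers hoped to achieve
    sampling_frequency: str  # e.g., "15 minutes"
    variables: Dict[str, str] = field(default_factory=dict)  # name -> description
    illustrates: List[str] = field(default_factory=list)     # e.g., "regression"

    def missing_fields(self) -> List[str]:
        """Return the names of fields that were left empty."""
        return [name for name, value in asdict(self).items() if not value]


# Made-up example record with one field deliberately left blank.
record = TeachingDatasetMetadata(
    title="Example stream temperature record (hypothetical)",
    purpose="Monitor seasonal water temperature at a single site",
    research_goal="",
    sampling_frequency="15 minutes",
    variables={"water_temp_c": "Water temperature in degrees Celsius"},
    illustrates=["visualization", "regression"],
)

print(json.dumps(asdict(record), indent=2))
print("Missing:", record.missing_fields())
```

A completeness check of this kind could be run before a dataset is shared, so that gaps in the description are caught by the contributor rather than by the instructor who tries to reuse it.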
Several participants recognized that licenses with clear conditions for reuse and citation would help instructors understand limitations and expectations for repurposing content.
“…one of the best ways to learn is to look through other people's well-documented code, so open-sourcing the code and data used for scientific research, and using FAIR data standards to improve documentation and usability, is very important.”
“I think a GitHub with data with notebooks…that has a clear Creative Commons license for both the data and the notebook. And so I know I can use it, change it without getting a nasty gram…from someone's legal department seven years later.”
Regarding ways to overcome barriers to exchanging resources, the most common response was that credit could motivate instructors to publish instructional material. This credit may take the form of counting toward tenure and promotion decisions, citations to document the contribution, or monetary payment (e.g., a grant related to platform or repository development).
“Support from universities for ‘teaching’ efforts beyond the…classroom, and consideration of these efforts and outcomes (e.g., pageviews/downloads) for hiring & tenure decisions.”
“Money - there's a lot I think we'd all do for a small amount of money. If you pay professors for their time, they will engage.”
Normalizing sharing teaching materials and developing a community around the exchange was another commonly repeated suggestion. Reciprocity was mentioned as crucial so that the exchange is mutually beneficial rather than a one-way offering.
“…if there are ways to, outside of the traditional incentive structure of writing research papers, to incentivize…technologically savvy researchers, postdocs, faculty to contribute lessons like this, then you'll see more participation… it has to be made important and valued by…the community somewhere.”
“[I would] go through the trouble of sharing…my resources if I knew that others were sharing theirs and that there could be an exchange from which I could benefit. All of my course materials have been online and openly available for a long time. Others have asked if they could use them, and I have always said yes. I've never had anyone offer to let me use modules they have developed, so the ‘exchange’ part of this would be important for me.”
Collaboration via feedback and edits on shared content was suggested, and multiple participants mentioned that workshops would be helpful to exchange ideas and build rapport.
“This course material is available to only 25 students per year. And seeing that it is used by many more…by different instructors and different institutes would be a nice…outcome of all these efforts. We really put a lot of effort for these materials to be created and used and refined throughout the years. …potentially giving feedback to these material and…seeing some updated versions of it by other instructors…a community level refinement of the course materials, and creating new versions and better, maybe more up to date versions of these slides will be…useful.”
“It would…motivate me if I knew that my contribution would be widely viewed and/or utilized. A workshop that drew educators/contributors together to share could be a helpful place to start.”
Building Educational Modules for the Future
Drawing on information about online educational platforms and on examples of hydroinformatics educational content gathered from study participants and our own search, we reviewed existing online platforms against participant-identified attributes and selected HydroLearn for module implementation (Section Online Educational Platforms and Materials). Section Online Module Development describes the modules developed in this work and how they address identified gaps. Section Online Module Implementation describes module implementation, including the mapping of module components to HydroLearn concepts and the benefits and challenges of implementing modules in online platforms such as HydroLearn.
Online Educational Platforms and Materials
There was no consensus among instructors on the preferred approach for sharing hydroinformatics educational material (Table 3). Some of these platforms are growing in popularity in the hydrologic science community but have not gained traction with the hydroinformatics instructors that we surveyed. The options include systems specifically designed for sharing and publishing educational content (HydroLearn, MyGeoHub, eddie, ECSTATIC), more generic repositories for data or code (HydroShare, GitHub), and customizable interfaces (personal websites, Canvas, or online courses). We reviewed these options with respect to characteristics extracted from the literature and our survey results (Table 4). Desirable characteristics include flexibility for hosting various types of materials, compatibility with open data practices, formal pedagogical structure, structured metadata, review and curation of content, and separate faculty and student access (Merwade and Ruddell, 2012; Popescu et al., 2012; Wagener et al., 2012; Makropoulos, 2019; Lane et al., 2021).
The major tradeoff between the identified platforms is between the level of control for creators and the structure that supports education-specific content. Whereas personal websites and custom online courses allow for a great deal of specialization, regular updating, and customizable interfaces, they do not include the searchability, structured metadata, curation, and educational support offered by several of the education-focused platforms. A particularly attractive feature for hydroinformatics and water data science instruction is the ability to launch and run code notebooks. Two of the platforms that we examined have Jupyter servers and can launch notebooks: MyGeoHub and HydroShare. Potential challenges with these platforms include scalability for use with classes of students, inclusion of data files that accompany code, and installing desired software packages. Although existing systems currently do not support all desired functionality, we anticipate those limitations will be overcome with future development.
In deciding which platform to use for the educational modules of this work, we considered the factors in Table 4 with a focus on reuse and collaboration. We deposited materials in HydroLearn as it facilitates export and adaptation of courses and includes metadata, citation, curation, and pedagogical structure. HydroLearn is a repository for instructional material related to hydrology and water resources. Developed on the edX learning management system, HydroLearn is designed to support collaboration around instructional content, reuse and adaptation of materials, and flexibility for implementation in organized courses or by self-paced learners. Although it is relatively new, several studies observed enhanced learning of concepts and technical skills by students using HydroLearn and its precursors (Habib et al., 2019; Lane et al., 2021; Merck et al., 2021). Although HydroLearn does not natively support launching and running notebooks, Lane et al. (2021) demonstrated linking notebooks via HydroShare.
Online Module Development
Based on the survey results, online educational materials are being used and modules have potential to address challenges in hydroinformatics and water data science education. However, there is substantial variety in topics and methods of instruction. While a unified curriculum and approach to the subject matter may be appealing, it does not match the reality of a rapidly changing field with dynamic courses and instructors. Instead, we sought to develop and publish example educational modules that focus on addressing gaps identified by participants and to illustrate an approach for additional online content creation and sharing.
The online modules were designed to address key challenges and gaps in hydroinformatics and water data science education reported by instructors. These gaps relate to: (1) content, (2) platform, and (3) organization. Regarding content, there is a lack of data-driven and problem-based learning that uses datasets from the water domain. Instructors requested notebooks for online coding examples, and there is a need for baseline levels of instruction in coding and scripting. To address the content gap, online educational content should include interactive code with water-related data and problems. Regarding platform, instructors currently use various platforms for hosting educational content, and participants repeated the need for a system to facilitate upload, discovery, and community involvement. The platform gap may be addressed by publishing and publicizing resources in a system that meets many of the criteria in Table 4. We add that active and ongoing support is essential to ensure that the resources are not siloed or lost. Finally, the organization gap can be addressed by ensuring that the content is designed and structured to be modular and adaptable to different instructors, courses, and modes of delivery.
For our online modules, we worked to follow these recommendations to address the needs of hydroinformatics and water data science education. The modules address four topics: (1) Programmatically accessing water data via web services, (2) The sensor data life cycle and sensor data quality control, (3) Relational databases and SQL querying, and (4) Machine learning for classification (Table 5). These topics were selected based on survey and interview results indicating the need for reproducible code and the growing importance of high frequency sensor data, data quality control, databases, big data, web technologies, and machine learning. In conceptualizing these modules, we drew from our own expertise and datasets generated or used as part of our research efforts. The datasets are available for reuse, or instructors could apply the examples to data from other locations.
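To give a sense of the kind of task covered by the sensor data quality control topic, the following is a minimal sketch of rule-based flagging for a sensor time series. It is not the code from the published modules: the function name, flag labels, thresholds, and example values are assumptions made for illustration, and it implements only two generic checks (a plausible-range check and a repeated-value, or "flatline", check).

```python
from typing import List, Optional


def flag_observations(
    values: List[Optional[float]],
    lower: float,
    upper: float,
    max_repeats: int = 3,
) -> List[str]:
    """Return one flag per value: 'missing', 'out_of_range', 'flatline', or 'ok'.

    A value is flagged 'flatline' once it extends a run of identical
    consecutive readings beyond max_repeats; earlier readings in the run
    are not flagged retroactively.
    """
    flags: List[str] = []
    run_length = 0
    previous: Optional[float] = None
    for value in values:
        if value is None:
            flags.append("missing")
            run_length, previous = 0, None  # a gap resets the repeat counter
            continue
        run_length = run_length + 1 if value == previous else 1
        previous = value
        if not (lower <= value <= upper):
            flags.append("out_of_range")
        elif run_length > max_repeats:
            flags.append("flatline")
        else:
            flags.append("ok")
    return flags


# Made-up water temperature readings in degrees Celsius (not real data).
water_temp_c = [12.1, 12.3, None, 45.0, 12.4, 12.4, 12.4, 12.4, 12.5]
flags = flag_observations(water_temp_c, lower=0.0, upper=35.0, max_repeats=3)

for value, flag in zip(water_temp_c, flags):
    print(value, flag)
```

Keeping flags in a separate list rather than overwriting values preserves the raw record, so that a later reviewer can see what was flagged and why.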
Table 5. Educational modules developed and deployed as part of this work with descriptions of essential components and datasets.
Online Module Implementation
HydroLearn facilitates a “Backward Design” approach wherein desired outcomes are first defined, then authentic tasks are crafted to meet outcomes, then instructional content is designed to present necessary information (Maggioni et al., 2020). Although development in our case did not proceed in this order, the essential elements in our module design methodology correspond to backward design concepts and specific HydroLearn components: (1) learning objectives map to desired outcomes, (2) narrative maps to instructional content, (3) example code maps to both instructional content and authentic tasks (i.e., learning activities in HydroLearn), and (4) technical assignment maps to authentic tasks (learning activities). Implementation of each of the components in HydroLearn is reported in the following subsections.
Structure and Organization
Each HydroLearn course contains “modules” or “sections”, which is the level to which we matched our modules. Although our modules stand alone, we included them under a single course umbrella (Hydroinformatics–USU 6110) to fit the HydroLearn schema. Modules consist of “subsections” composed of “units.” The subsections are only titles, whereas content is contained as components (e.g., text, discussions, problems, HTML code, videos) within units. In HydroLearn, users can choose between placing many components within fewer units, which makes interaction with content more vertical (i.e., scrolling on a single page), and using many units, which makes interaction with content more horizontal (i.e., navigating from unit to unit). While this provides flexibility in presenting content, we found that navigation between subsections and the different levels of each module was not always clear.
Figure 3 illustrates the organization of a module implemented in HydroLearn. While this is an intuitive structure, it imposes hierarchical levels that may be overly strict for some users. For example, we found “subsection” to be an unnecessary level for some modules and would have preferred to directly use “units” under the module level, or to have had control over the hierarchical levels. Granularity and organization are persistent questions for many repositories, regardless of content type (Horsburgh et al., 2016), and developers of many data repositories have chosen to leave organization and structure up to the user (e.g., FigShare, HydroShare, Zenodo). Although there are benefits to imposed structure, there is no single prescriptive pattern, and users may prefer different organizational levels. We identified degree of control as the main distinction between platforms, and giving users more control over organization and structure may improve the appeal and uptake of HydroLearn (and similar platforms). Despite these limitations, we were able to fit our module content to the HydroLearn structure.
Figure 3. Module implementation in HydroLearn. The numbered steps indicate the order of workflow and the location of essential module elements: (1) the course landing page contains metadata and links to a course outline, (2) learning objectives in the module introduction, (3) the narrative consists of text, links, images, tables, and code snippets, (4) code examples are interactive notebooks in the CUAHSI JupyterHub linked from HydroLearn, and (5) the technical assignment and associated rubric are a separate module component.
Learning Objectives
Learning objectives are the desired outcomes of instruction and are ideally action-oriented, specific, and measurable. As a major part of its pedagogical emphasis (Lane et al., 2021), HydroLearn facilitates the creation of learning objectives, which can be entered manually or developed using a wizard according to an established structure (Maggioni et al., 2020). Although our learning objectives were defined prior to using HydroLearn, the wizard helped improve their specificity and robustness. HydroLearn functionality can directly connect module learning objectives to other module components (e.g., rubrics).
Narrative
For each module, the narrative was created in slides with text and images, then content was transferred to HydroLearn. Because study participants reported commonly using slides for lectures, the modules include linked slide deck files. Overall, we were successful in translating our content to HydroLearn components. Although it was somewhat tedious to adapt text to HTML and to transfer images from slides to HydroLearn, we found it straightforward to edit content, to duplicate and modify components, to reorder units, and to publish changes. Building the course from the foundation of a HydroLearn template offered helpful organization and instructions.
Example Code
Each module contains 3–6 example scripts, each of which illustrates a task or piece of functionality (Table 5). There may be redundancy as examples build on each other, and instructors may choose to use fewer examples than provided. Code examples are shared in Jupyter notebooks as part of HydroShare resources that can be opened and run via the CUAHSI JupyterHub Server. We opted to use the CUAHSI JupyterHub because: (1) common Python packages are pre-installed and additional packages can be installed by request, which together cover the dependencies of our examples, and (2) data files can be called by code, which is essential for our modules. If data files are necessary to examples, they accompany the code notebooks in the HydroShare resources.
HydroShare resources containing notebooks and data can be linked and opened in a separate browser window or embedded as iFrames in HydroLearn units (Lane et al., 2021). We used links that directly launch the CUAHSI JupyterHub (Figure 3). From the link in HydroLearn, a user is prompted to sign into HydroShare and choose a coding environment and then is taken to their server directory where the notebooks are ready to be launched. This simplifies deployment of example code as learners do not have to install software or match a particular coding environment to view, execute, or manipulate code.
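The pattern described above, in which a notebook calls a data file stored alongside it, can be sketched as follows. This is a minimal illustration rather than code from the modules: the file name, column names, and values are hypothetical, and the sample file is written to a temporary folder only so that the sketch runs on its own.

```python
import csv
import tempfile
from pathlib import Path
from statistics import mean

# Hypothetical sample rows standing in for a data file that would
# accompany a notebook in a shared resource.
SAMPLE_ROWS = [
    ("datetime", "discharge_cms"),
    ("2021-06-01T00:00", "1.20"),
    ("2021-06-01T12:00", "1.40"),
    ("2021-06-02T00:00", "1.10"),
]


def write_sample(folder: Path) -> Path:
    """Write the sample rows to a CSV file and return its path."""
    path = folder / "discharge.csv"
    with path.open("w", newline="") as f:
        csv.writer(f).writerows(SAMPLE_ROWS)
    return path


def daily_means(path: Path) -> dict[str, float]:
    """Read the CSV file and average the observations for each day."""
    by_day: dict[str, list[float]] = {}
    with path.open(newline="") as f:
        for row in csv.DictReader(f):
            day = row["datetime"][:10]  # keep only the YYYY-MM-DD part
            by_day.setdefault(day, []).append(float(row["discharge_cms"]))
    return {day: mean(values) for day, values in by_day.items()}


with tempfile.TemporaryDirectory() as tmp:
    data_path = write_sample(Path(tmp))
    result = daily_means(data_path)

print(result)
```

In a hosted notebook environment, only the reading step would be needed because the data file is already present next to the notebook.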
Technical Assignment
The technical assignments were conceptualized to meet recommendations in educational literature for open-ended, ill-defined, problem-based learning. For each assignment, students are expected to synthesize the narrative and code examples and apply the data and analysis tools to real-world applications. Each assignment requires coding and a written summary report to communicate and defend the results and conclusions. Within each module in HydroLearn, the assignment is a unit with components that specify the assigned tasks and expected deliverables. Assignments are accompanied by a customized rubric that sets expectations for students and facilitates objective grading for instructors. We adapted rubrics developed by a team of hydroinformatics instructors to each assignment (Burian et al., 2013). As an alternative approach to assessment, HydroLearn offers rubric templates that tie levels of student performance to each learning objective (Lane et al., 2021).
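One way to picture a rubric that links performance levels to learning objectives is as a small data structure with a weighted score. The sketch below is an assumption for illustration only: the criteria, weights, level names, and credit fractions are hypothetical and are not taken from the rubrics used in the modules or from HydroLearn templates.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One rubric row tied to a learning objective (all values hypothetical)."""
    objective: str
    weight: float
    levels: dict[str, float]  # performance level -> fraction of credit earned


def score(rubric: list[Criterion], ratings: dict[str, str]) -> float:
    """Return a percentage score from the level assigned to each objective."""
    total_weight = sum(c.weight for c in rubric)
    earned = sum(c.weight * c.levels[ratings[c.objective]] for c in rubric)
    return 100 * earned / total_weight


# Hypothetical levels shared by both criteria.
LEVELS = {"exemplary": 1.0, "proficient": 0.75, "developing": 0.5}

rubric = [
    Criterion("Write reproducible analysis code", weight=60, levels=LEVELS),
    Criterion("Communicate results in a summary report", weight=40, levels=LEVELS),
]

ratings = {
    "Write reproducible analysis code": "exemplary",
    "Communicate results in a summary report": "proficient",
}

print(score(rubric, ratings))
```

Keeping the objective text inside each criterion makes the connection between assessment and learning objectives explicit, which is the property the rubric templates are described as providing.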
Platform Challenges and Opportunities
Our experience with HydroLearn shows that it contains functionality that addresses each of the needs for online sharing and content organization that we identified in surveys and interviews with study participants. We also experienced challenges that present opportunities for continued advancement of educational platforms. We acknowledge that others who use HydroLearn may have varied experiences, and while it is beyond the scope of this effort, there is opportunity to gain further insight by soliciting feedback from users of HydroLearn and/or other platforms. In this section, we describe our experience using HydroLearn with respect to identified criteria, and each of the following paragraphs corresponds to a category in Table 4. While these outcomes may be specific to HydroLearn, we anticipate that other platforms face similar challenges and may require further development to support online educational resources.
Discoverability refers to locating content using keyword searches from Internet browsers and search functionality within a platform. After creating a course on HydroLearn, it appeared in the results of basic Internet searches. Within HydroLearn, we were able to search for the course and within the course. The platform could enhance discoverability by including keywords as part of the metadata for each course or module and filtering courses on keywords.
Metadata are displayed on the course landing page. The course template suggests metadata elements, which we used (e.g., target audience, tools needed, suggested citation), but elements are optional. HydroLearn could better standardize metadata by requiring certain elements and by automatically generating elements where possible. Creating metadata requires editing HTML code, and HydroLearn could improve usability through webforms or markdown.
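The suggestions above about keywords and required metadata can be sketched in a few lines. This is an illustration under stated assumptions, not a description of HydroLearn internals: the record layout, the choice of required elements (borrowed from the examples named in the text), and the course entries are all hypothetical.

```python
# Hypothetical set of required elements, using the examples named in the text.
REQUIRED_ELEMENTS = ("target_audience", "tools_needed", "suggested_citation")


def missing_elements(record: dict) -> list[str]:
    """Return the required metadata elements that are absent or empty."""
    return [key for key in REQUIRED_ELEMENTS if not record.get(key)]


def filter_by_keyword(courses: list[dict], keyword: str) -> list[dict]:
    """Return courses whose keyword list contains the keyword (case-insensitive)."""
    wanted = keyword.strip().lower()
    return [
        course
        for course in courses
        if wanted in {k.lower() for k in course.get("keywords", [])}
    ]


# Hypothetical course records for demonstration.
courses = [
    {
        "title": "Course A",
        "target_audience": "Graduate students",
        "tools_needed": "Python",
        "suggested_citation": "Author (Year). Course A.",
        "keywords": ["Hydroinformatics", "Python"],
    },
    {
        "title": "Course B",
        "target_audience": "Graduate students",
        "keywords": ["remote sensing"],
    },
]

for course in courses:
    print(course["title"], "missing:", missing_elements(course))

print([c["title"] for c in filter_by_keyword(courses, "python")])
```

A check like `missing_elements` could run when a course is published, while `filter_by_keyword` shows how keywords stored as metadata would support filtering in a course catalog.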
Navigability of HydroLearn courses is dictated by the hierarchical structure described in the Structure and Organization Section. Even with a logical organization for content, moving between sections and knowing how to proceed through the module sequentially can be challenging for beginners. This may be improved by adding text to the icons in the navigation bar and by displaying a course outline and navigation in a persistent sidebar.
In Table 4, content refers to the types of files that are supported by the platform. We were able to use HydroLearn to share text, images, and interactive websites and to link files for download. Videos, equations, code snippets, and other HTML components are also supported. Supporting either a JupyterHub for launching notebooks or more directly integrating with the CUAHSI JupyterHub would strengthen the platform's ability to support code files.
Separate access for students and instructors is supported by HydroLearn. Course creators can elect to restrict access of certain content to course staff. Other instructors can access restricted content by exporting the course or by contacting course creators, though that may be unreliable. Although we used open-ended assignments, some require specific coding tasks. In these cases, we created scripts or notebooks as a solution key to the assignment, and we were able to use this functionality to restrict access without separating the solution from course materials.
Licenses can be specified by creators at the course level. HydroLearn supports Creative Commons licenses (e.g., Attribution, Noncommercial, No Derivatives, Share Alike), and related icons and messaging are displayed on course subsection pages. Licensing could be made clearer if displayed prominently on the course landing page.
Scalability refers to the ability for multiple users (e.g., classes of students) to use the materials or program. We have not yet tested HydroLearn in the context of multiple simultaneous users, but we are not aware of any limitations. It is built on an established online learning platform (edX), which offers robustness. There may be scaling issues with many users running notebooks on the CUAHSI JupyterHub, for which Lane et al. (2021) observed student frustration related to losing server connection and authentication.
Reusability of educational materials is an intent of HydroLearn, and modules are expected to be designed with consideration for uptake by other instructors. While the modules described here have not yet been reused, we found it straightforward to export and customize a HydroLearn course, and Lane et al. (2021) report that adaptation of a HydroLearn course by instructors at other institutions was straightforward. Reusability is facilitated by licenses and citations, and the course metadata template includes “Adapted From” to acknowledge source material. HydroLearn courses have been used for both online and in-person instruction and can be designed to be student-paced or to follow an imposed schedule, making them compatible with the mix of modalities reported by study participants.
Citations are a recommended (but optional) metadata element for HydroLearn courses. Creators can structure the citation as desired, and it is displayed on the course landing page. There is opportunity for the platform to standardize by automatically generating a citation for each course or module, as is done for data and code resources in HydroShare (Horsburgh et al., 2016).
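Automatic citation generation of the kind suggested above amounts to formatting structured metadata into a fixed pattern. The following sketch assumes a hypothetical function and field names, and follows the style of this article's reference list; it does not reflect how HydroShare or HydroLearn implement citations.

```python
def join_authors(authors: list[str]) -> str:
    """Join author names in the style 'A, B, and C'."""
    if len(authors) == 1:
        return authors[0]
    return ", ".join(authors[:-1]) + ", and " + authors[-1]


def format_citation(authors: list[str], year: int, title: str,
                    platform: str, url: str) -> str:
    """Build a citation string from course metadata (hypothetical fields)."""
    return (
        f"{join_authors(authors)} ({year}). {title}. {platform}. "
        f"Available online at: {url}"
    )


# Metadata taken from the HydroLearn course entry in the reference list.
citation = format_citation(
    authors=["Jones, A. S.", "Horsburgh, J. S.", "Bastidas Pacheco, C. J."],
    year=2022,
    title="Hydroinformatics and water data science",
    platform="HydroLearn",
    url="https://edx.hydrolearn.org/courses/course-v1:USU+CEE6110+2022/about",
)
print(citation)
```

Generating the string from metadata fields, rather than asking creators to type it, would keep citations consistent across courses and update them automatically when metadata change.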
Curation of courses is not required in HydroLearn, and instructors may deposit and share content without review. However, most of the modules currently available on HydroLearn were developed through intensive summer hackathons including substantive instruction on pedagogical best practices and feedback from the HydroLearn team (Maggioni et al., 2020; Gallagher et al., in prep). As a result, much of the educational content shared on HydroLearn meets their criteria for high quality modules. Even so, there is no long-term system in place for module review and curation by the project team. As our modules were developed outside of the formal hackathons, we requested the feedback of a HydroLearn team member who was able to review and offer helpful suggestions. The approach of offering but not requiring curation balances increased overhead with fostering high quality content. Also, compensating fellows increases their motivation to deposit high quality material, as noted by study participants.
Educational support refers to assistance with teaching pedagogy and tasks, and is provided by HydroLearn through multiple features. HydroLearn emphasizes learning objectives throughout course development and includes functionality for various problem types to assess student learning (e.g., multiple choice questions, open responses, advanced mathematical expressions). Following templates and recommendations, capitalizing on features, and taking advantage of review by HydroLearn staff offer an approach that will result in a robust pedagogy. Although we did not tap into all these capabilities in developing modules, this is a major benefit of HydroLearn.
Collaboration is facilitated in HydroLearn through the inclusion of multiple instructors who share editing abilities and co-authorship on a course. HydroLearn also has the ability to give feedback through comments. It was uncomplicated to add instructors to our course and for all authors to edit materials; however, we did not experiment with feedback.
Outlook for the Future of Hydroinformatics and Water Data Science Instruction
In light of the transition to online courses precipitated by the COVID-19 pandemic as well as the growing prevalence of material online, instructors may need to consider how to best bring value to their course offerings. As expressed by one interview participant:
“…the incentive, the value proposition of the classroom is fundamentally altered after COVID. …No matter how good somebody is at explaining something, there's always somebody better on the internet. …what really is the role of the instructor…and modern classroom? … Obviously in person, it's made easier by the fact that [students are] there. But then the question is, is it you or is it the fact that they can be around each other? …online [content] is growing and dismissing it [is naïve].”
Several participants indicated that the merit of an organized course for students is interaction with an instructor curating content and facilitating learning. Despite the possibility of learning from purely online materials, a knowledgeable and engaged instructor still has much to offer. This echoes Rapanta et al. (2021) in identifying a teacher's role to organize and curate the learning process and recommending that instructors increase technology expertise to adapt to changing educational environments.
“…engagement, pre and post class discussions, office hours, a tailored curriculum to the class. …my class changes every semester based on…what I'm perceiving in lecture and what I'm hearing in office hours.”
“We're in an era where it's not necessarily the content that's most valuable to the students, it's me facilitating their use of the content. And so, I think that the content should be shared as broadly as possible.”
Access to educational material that is current, flexible, and reusable can help instructors adapt to the rapidly evolving field. The modules presented in this work are a first step and an invitation to the community to continue development and sharing of content online. In this way, instructors can address the gaps we identified related to content, platform, and organization of community materials. As instructors consult the list of topics of growing importance in the field and consider which of their materials and datasets may be most useful as community resources, we envision that they will deposit modules that include relevant water-related datasets and accessible code examples with ideas for problem-based learning.
This work illustrated that materials deposited in HydroLearn are modular and adaptable, and as HydroLearn advances and usage increases, it may address the platform gap related to limited community and siloed resources. This vision depends not only on sharing content, but also on uptake by other instructors implementing, reviewing, and engaging with shared material. As articulated by study participants, reciprocity, credit, and feedback will all motivate sharing and reuse of content, which will help advance instruction in hydroinformatics and water data science. Further implementation of online educational modules may help corroborate our experience in meeting identified criteria and may point to additional challenges or gaps.
Conclusion
We interviewed and surveyed instructors who teach hydroinformatics and water data science at collegiate and professional levels to assess the current state of practice regarding topics, teaching tools, shifts to online instruction related to COVID-19, and the potential for shared online resources. Results indicated a mix of online and in-person modalities. Although nearly all courses moved online because of COVID-19, there was a strong preference for in-person learning, and most were returning to in-person teaching. However, instructors are retaining some virtual aspects that facilitated instruction, particularly related to live coding. Student feedback and interaction were lacking in purely online modalities, leading to the conclusion that even successful online resources and tools require deliberate interpersonal components.
Instructors generally customized teaching materials to meet the demands of a rapidly developing field. Results show variety in topics currently taught and topics of growing importance, with consensus around emphasizing reproducible code development in open-source languages and competence regarding learning and selecting informatics tools. Live coding for online and in-person settings was facilitated by the growing use of online code notebooks. A key finding was a common need for technical skill development earlier in students' college experience.
We found high interest in shared online educational content, although a lack of recognition, reciprocity, community, and credit were deterrents to sharing. Although participants currently use multiple layers of miscellaneous educational platforms, there was an expressed need for common community resources. Participants reported gaps and challenges to hydroinformatics instruction related to content (water-related datasets, online notebooks, and data-driven problems), platform (community-based, facilitates discovery), and organization (modular, adaptable).
The educational modules we developed attempt to address these challenges, center around subjects of growing importance in the field, and were developed and deposited in HydroLearn, a platform for water-related educational modules. We found that HydroLearn was successful in meeting participants' criteria for a community content platform. HydroLearn has robust functionality for educational tools and pedagogy, and its scaffolding supports content sharing (i.e., metadata, citation, discoverability, collaboration, reusability). The major drawbacks were related to an imposed hierarchical structure, and improvements could be made regarding minimum metadata requirements. These modules are a step toward developing a rich set of online resources and an active community of instructors to meet the advancements in hydroinformatics and water data science.
In conclusion, shared online resources hold promise for overcoming challenges in hydroinformatics and water data science education. As instructors are already accustomed to tailoring content for their courses, adapting online modules with a water emphasis is accessible. Current and flexible resources would help instructors keep pace with the rapid development of technology and topics in the field and maintain the value of their course and teaching for students.
Data Availability Statement
The materials generated by and reported by this work are publicly available. The survey responses and interview transcripts are available via HydroShare (Jones et al., 2022c). The educational modules are published via HydroLearn (Jones et al., 2022a) along with code and associated datasets in HydroShare (Jones et al., 2022b).
Ethics Statement
The studies involving human participants were reviewed and approved by the Utah State University Institutional Review Board. The participants provided their written informed consent to participate in this study.
Author Contributions
AJ, JH, and BL conceptualized the presentation of survey and interview results with associated educational modules. AJ formulated the survey and interview design with support from JH and CF. AJ facilitated all surveys and interviews and analyzed the responses. AJ, JH, and CB created the educational modules and published them with support from BL. AJ wrote the manuscript with consultation and contributions from JH, CF, BL, and CB. All authors contributed to the article and approved the submitted version.
Funding
This research was primarily funded by the United States National Science Foundation under Grant Number 1931297. Additional support for the educational/training modules was provided by the FAIR Cyber Training Fellowship program at Purdue University corresponding to National Science Foundation Grant Number 1829764. HydroLearn is supported by National Science Foundation Grant Number 1726965. Additional funding support was provided by the Utah Water Research Laboratory at Utah State University.
Author Disclaimer
Any opinions, findings, and conclusions or recommendations expressed are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher's Note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Acknowledgments
We gratefully acknowledge the input and expertise of the instructors who were participants in the surveys and interviews reported in this paper and the support we received from the HydroLearn team in setting up and sharing our educational modules.
References
Abbott, M. B., Minns, A. W., and Van Nievelt, W. (1994). Education and training in hydroinformatics. J. Hydraul. Res. 32, 203–214. doi: 10.1080/00221689409498812
Bader, N. E., Meixner, T., Gibson, C. A., O'Reilly, C., and Castendyk, D. N. (2015). Stream Discharge Module (Project EDDIE). QUBES Educ. Res. doi: 10.25334/V96B-NM56
Bandaragoda, C., and Wen, T. (2020). Data Science in Earth and Environmental Sciences. HydroLearn. Available online at: https://edx.hydrolearn.org/courses/course-v1:SyracuseUniversity+EAR601+2020_Fall/about
Beason-Abmayr, B., Caprette, D. R., and Gopalan, C. (2021). Flipped teaching eased the transition from face-to-face teaching to online instruction during the COVID-19 pandemic. Adv. Physiol. Educ. 45, 384–389. doi: 10.1152/advan.00248.2020
Burian, S. J., Horsburgh, J. S., Rosenberg, D. E., Ames, D. P., Hunter, L. G., and Strong, C. (2013). Using interactive video conferencing for multi-institution, team-teaching. ASEE Annu. Conf. Expo. Conf. Proc. doi: 10.18260/1-2--22706
Celicourt, P., Rousseau, A. N., Gumiere, S. J., and Camporese, M. (2021). Editorial: hydro-informatics for sustainable water management in agrosystems. Front. Water 3, 1–3. doi: 10.3389/frwa.2021.758634
Ceola, S., Arheimer, B., Baratti, E., Blöschl, G., Capell, R., Castellarin, A., et al. (2015). Virtual laboratories: new opportunities for collaborative water science. Hydrol. Earth Syst. Sci. 19, 2101–2117. doi: 10.5194/hess-19-2101-2015
Chen, Y., and Han, D. (2016). Big data and hydroinformatics. J. Hydroinformatics 18, 599–614. doi: 10.2166/hydro.2016.180
Daniels, L. M., Goegan, L. D., and Parker, P. C. (2021). The impact of COVID-19 triggered changes to instruction and assessment on university students' self-reported motivation, engagement and perceptions. Soc. Psychol. Educ. 24, 299–318. doi: 10.1007/s11218-021-09612-3
Flores, A. (2021). Open and Reproducible Computing. GitHub. Available online at: https://github.com/LejoFlores/Open-And-Reproducible-Research-Computing
Gannon, J. (2021). Hydroinformatics at VT. GitHub. Available online at: https://vt-hydroinformatics.github.io/
Gannon, J. P., and McGuire, K. J. (2022). An interactive web application helps students explore water balance concepts. Front. Educ. 7, 1–7. doi: 10.3389/feduc.2022.873196
Garousi-Nejad, I., and Lane, B. A. (2021). Hydrologic Statistics and Data Analysis (M1). HydroShare. Available online at: https://www.hydroshare.org/resource/bd0b38fc5d1e4d5c895dc484ceeb2c2a/
Gibert, K., Horsburgh, J. S., Athanasiadis, I. N., and Holmes, G. (2018). Environmental data science. Environ. Model. Softw. 106, 4–12. doi: 10.1016/j.envsoft.2018.04.005
Godber, K. A., and Atkins, D. R. (2021). COVID-19 impacts on teaching and learning: a collaborative autoethnography by two higher education lecturers. Front. Educ. 6, 1–14. doi: 10.3389/feduc.2021.647524
Gorelick, D., and Characklis, G. (2019). Introductory R for water resources-fall 2019-Univeristy of North Carolina at Chapel Hill. ECSTATIC. Available online at: https://digitalcommons.usu.edu/ecstatic_all/86/
Habib, E., Deshotel, M., Guolin, L. A. I., and Miller, R. (2019). Student perceptions of an active learning module to enhance data and modeling skills in undergraduate water resources engineering education. Int. J. Eng. Educ. 35, 1353–1365.
Habib, E., Ma, Y., Williams, D., Sharif, H. O., and Hossain, F. (2012). HydroViz: design and evaluation of a web-based tool for improving hydrology education. Hydrol. Earth Syst. Sci. 16, 3767–3781. doi: 10.5194/hess-16-3767-2012
Hamilton, A. (2021). Python for Environmental Research. MyGeoHub. Available online at: https://mygeohub.org/courses/Environmentalresearch/overview
Horsburgh, J. S. (2019). Fa19. CEE 6110-001. Utah State Univ. Canvas. Available online at: https://usu.instructure.com/courses/545625/pages/hydroinformatics
Horsburgh, J. S., Morsy, M. M., Castronova, A. M., Goodall, J. L., Gan, T., Yi, H., et al. (2016). HydroShare: sharing diverse environmental data types and models as social objects with application to the hydrology domain. J. Am. Water Resour. Assoc. 52, 873–889. doi: 10.1111/1752-1688.12363
Horsburgh, J. S., Tarboton, D. G., Maidment, D. R., and Zaslavsky, I. (2008). A relational model for environmental and water resources data. Water Resour. Res. 44, 1–12. doi: 10.1029/2007WR006392
Jones, A. S., Horsburgh, J. S., and Bastidas Pacheco, C. J. (2022a). Hydroinformatics and water data science. HydroLearn. Available online at: https://edx.hydrolearn.org/courses/course-v1:USU+CEE6110+2022/about
Jones, A. S., Horsburgh, J. S., and Bastidas Pacheco, C. J. (2022b). Hydroinformatics instruction modules example code. HydroShare. Available online at: http://www.hydroshare.org/resource/761d75df3eee4037b4ff656a02256d67
Jones, A. S., Horsburgh, J. S., and Flint, C. G. (2022c). Hydroinformatics and water data science instructor interviews and surveys, HydroShare. doi: 10.4211/hs.15b1a61f47724a6e8deb100789353df2
Kerkez, B. (2019). CEE575 Sensors, Data, and Intelligent Systems. Univ. Michigan. Available online at: http://www-personal.umich.edu/~bkerkez/courses/cee575/
Lane, B., Garousi-Nejad, I., Gallagher, M. A., Tarboton, D. G., and Habib, E. (2021). An open web-based module developed to advance data-driven hydrologic process learning. Hydrol. Process. 35, e14273. doi: 10.1002/hyp.14273
Maggioni, V., Girotto, M., Habib, E., and Gallagher, M. A. (2020). Building an online learning module for satellite remote sensing applications in hydrologic science. Remote Sens. 12, 1–16. doi: 10.3390/rs12183009
Makropoulos, C. (2019). Urban Hydroinformatics: Past, Present and Future. Water. 11, 1–17. doi: 10.3390/w11101959
McGovern, A., and Allen, J. (2021). Training the Next Generation of Physical Data Scientists. Eos. 102, 1–9. doi: 10.1029/2021EO210536
Merck, M. F., Gallagher, M. A., Habib, E., and Tarboton, D. (2021). Engineering students' perceptions of mathematical modeling in a learning module centered on a hydrologic design case study. Int. J. Res. Undergrad. Math. Educ. 7, 351–377. doi: 10.1007/s40753-020-00131-8
Merwade, V., and Ruddell, B. L. (2012). Moving university hydrology education forward with community-based geoinformatics, data and modeling resources. Hydrol. Earth Syst. Sci. 16, 2393–2404. doi: 10.5194/hess-16-2393-2012
Nearing, G. S., Kratzert, F., Sampson, A. K., Pelissier, C. S., Klotz, D., Frame, J. M., et al. (2021). What role does hydrological science play in the age of machine learning? Water Resour. Res. 57, e2020WR028091. doi: 10.1029/2020WR028091
Ngambeki, I., Thompson, S. E., Troch, P. A., Sivapalan, M., and Evangelou, D. (2012). Engaging the students of today and preparing the catchment hydrologists of tomorrow: student-centered approaches in hydrology education. Hydrol. Earth Syst. Sci. Discuss. 9, 707–740. doi: 10.5194/hessd-9-707-2012
Peek, R., and Pauloo, R. (2021). R for Water Resources Data Science. Available online at: https://www.r4wrds.com/
Popescu, I., Jonoski, A., and Bhattacharya, B. (2012). Experiences from online and classroom education in hydroinformatics. Hydrol. Earth Syst. Sci. 16, 3935–3944. doi: 10.5194/hess-16-3935-2012
Rapanta, C., Botturi, L., Goodyear, P., Guàrdia, L., and Koole, M. (2021). Balancing technology, pedagogy and the new normal: post-pandemic challenges for higher education. Postdigital Sci. Educ. 3, 715–742. doi: 10.1007/s42438-021-00249-1
Ruddell, B. L., and Wagener, T. (2015). Grand challenges for hydrology education in the 21st century. J. Hydrol. Eng. 20, 1–8. doi: 10.1061/(ASCE)HE.1943-5584.0000956
Seibert, J., Uhlenbrook, S., and Wagener, T. (2013). Preface hydrology education in a changing world. Hydrol. Earth Syst. Sci. 17, 1393–1399. doi: 10.5194/hess-17-1393-2013
Shen, C. (2018). Deep learning: a next-generation big-data approach for hydrology. Eos. 99, 1–4. doi: 10.1029/2018EO095649
Smith, C., and Praphamontripong, U. (2021). Analysis of the transition to a virtual learning semester in a college software testing course, EASEAI 2021-Proceedings of the 3rd International Workshop on Education through Advanced Software Engineering and Artificial Intelligence, co-located with ESEC/FSE 2021. Association for Computing Machinery.
VanZuylen, H. J., Dee, D. P., Mynett, A. E., Rodenhuis, G. S., Moll, J., Ogink, H. J. M., et al. (1994). Hydroinformatics at delft hydraulics. J. Hydraul. Res. 32, 83–136. doi: 10.1080/00221689409498806
Vojinovic, Z., and Abbott, M. B. (2017). Twenty-five years of hydroinformatics. Water. 9, 1–11. doi: 10.3390/w9010059
Wagener, T., Kelleher, C., Weiler, M., McGlynn, B., Gooseff, M., Marshall, L., et al. (2012). It takes a community to raise a hydrologist: the modular curriculum for hydrologic advancement (MOCHA). Hydrol. Earth Syst. Sci. 16, 3405–3418. doi: 10.5194/hess-16-3405-2012
Wagener, T., and McIntyre, N. (2007). Tools for teaching hydrological and environmental modeling. Comput. Educ. J. 17, 16–26.
Wagener, T., Savic, D., Butler, D., Ahmadian, R., Arnot, T., Dawes, J., et al. (2021). Hydroinformatics education-the Water Informatics in Science and Engineering (WISE) centre for doctoral training. Hydrol. Earth Syst. Sci. 25, 2721–2738. doi: 10.5194/hess-25-2721-2021
Wagener, T., Weiler, M., McGlynn, B. L., Gooseff, M., Meixner, T., Marshall, L., et al. (2007). Taking the pulse of hydrology education. Hydrol. Process. 21, 1789–1792. doi: 10.1002/hyp.6766
Ward, A. S., Herzog, S., Bales, J., Barnes, R., Ross, M., Jefferson, A., et al. (2021). Educational Resources for Hydrology and Water Resources. HydroShare. Available online at: http://www.hydroshare.org/resource/148b1ce4e308427ebf58379d48a17b91
Keywords: hydroinformatics, water data science, collaborative instruction, graduate education, online education, community resources, educational module
Citation: Jones AS, Horsburgh JS, Bastidas Pacheco CJ, Flint CG and Lane BA (2022) Advancing Hydroinformatics and Water Data Science Instruction: Community Perspectives and Online Learning Resources. Front. Water 4:901393. doi: 10.3389/frwa.2022.901393
Received: 21 March 2022; Accepted: 26 May 2022; Published: 29 June 2022.
Edited by:
Bridget Mulvey, Kent State University, United States
Reviewed by:
Chris Lowry, University at Buffalo, United States
Mohamed Abdelmoghny Hamouda, United Arab Emirates University, United Arab Emirates
Copyright © 2022 Jones, Horsburgh, Bastidas Pacheco, Flint and Lane. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Amber Spackman Jones, amber.jones@usu.edu