About this Research Topic
The advent of easily and remotely accessible HPC clusters in the past few decades has resulted in the creation of a large number of workflow systems, many with unclear differentiating characteristics. Even when sensible differences exist, it is unclear how relevant they are to addressing the core workflow problem. For example, many workflow systems designed in the early 2000s exposed their functionality through XML, whereas the modern trend eschews XML for JSON or YAML, with little conceptual advance in the core characteristics of the systems. At the same time, many workflow systems, lacking a common infrastructure, end up independently implementing components that solve similar problems, such as interfacing with batch schedulers or efficiently sequencing small concurrent operations.
A comprehensive strategy would approach this issue both from the top (a deeper understanding of what workflows are) and from the bottom (systematically constructing an infrastructure for workflows). The former is likely to be a long and winding process whose progress is as much social as it is technical. In this topic, we propose to address the latter.
Workflow systems have emerged as a compelling tool in the management and execution of large-scale experiments on HPC systems. However, their complexity hinders the development of exhaustive solutions, leading to the proliferation of systems that only partially cover the problem spectrum. This makes it difficult for experiments to adopt a particular workflow solution. Addressing the complexity of workflow systems requires the careful and systematic application of software engineering principles in their design and implementation.
We invite contributions that address various aspects of software engineering as applied to workflow systems, including, but not limited to:
- Modular and reusable components and libraries (e.g., libraries to interface with HPC systems).
- High-level concurrency abstractions and their implementations.
- Support systems, such as testing and performance tools.
Keywords: Workflows, infrastructure, HPC, HPC clusters, large-scale systems
Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.