Scientific studies in many science and engineering disciplines, such as atmospheric science and health, are becoming a major source of big data, including simulation and observational data. At the same time, more and more scientific studies use big data analytics (including machine/deep learning) techniques to reach findings that were inaccessible before. For good execution efficiency, most big data analytics are conducted in a distributed environment rather than on a local computer. High-performance computing (HPC) and cloud computing are the two dominant distributed environments for conducting big data analytics.
In recognition of this new paradigm and the opportunities of conducting big data analytics on HPC and/or cloud computing environments, we call for submissions addressing the overarching goal of enabling data-driven scientific discovery at scale. This includes use cases of successful large-scale data analysis in various domains, technology innovation in big data wrangling, hybrid load balancing across distributed nodes on HPC and/or cloud infrastructure, performance optimization of large-scale data analysis through hardware and network configuration, cross-discipline data insights, and so on.
Specific areas of interest include:
• Scalable data processing algorithms for scientific data
• Scalable machine/deep learning algorithms for scientific data analytics
• Automated scientific data analytics pipelines and workflows
• Deployment and evaluation of scalable analytics tools on HPC and cloud
• Comparison and benchmarking of scalable applications on HPC and cloud
• Reproducible big data analytics on HPC and cloud
• Big scientific data analytics as services
• Large-scale batch data analytical applications in science and engineering
• High-speed stream data analytical applications in science and engineering
• GPU acceleration and optimization for HPC/AI applications
• Distributed AI applications
• Data analytics on edge devices
• Benchmarking and performance of data science at scale
• High-performance data services with a high-speed network