AUTHOR=Sabbagh Ramin , Cai Zicheng , Stothert Alec , Djurdjanovic Dragan TITLE=Physically Inspired Data Compression and Management for Industrial Data Analytics JOURNAL=Frontiers in Computer Science VOLUME=2 YEAR=2020 URL=https://www.frontiersin.org/journals/computer-science/articles/10.3389/fcomp.2020.00041 DOI=10.3389/fcomp.2020.00041 ISSN=2624-9898 ABSTRACT=
The huge and ever-growing volume of industrial data poses an enormous challenge: how should this data be handled, stored, and analyzed? In this paper, we describe a novel method that automatically parses a signal into a set of exhaustive and mutually exclusive segments and extracts physically interpretable signatures that characterize those segments. The resulting numerical signatures can approximate a wide range of signals to within an arbitrary accuracy, which effectively turns the signal parsing and signature extraction procedure into a signal compression process. This compression converts raw signals into physically plausible and interpretable features that can then be mined directly to extract useful information via anomaly detection and characterization, quality prediction, or process control. In addition, distance-based unsupervised clustering is used to organize the compressed data into a tree-structured database, enabling rapid searches through the data and consequently facilitating efficient data mining. Application of these methods to multiple large datasets of sensor readings collected from several advanced manufacturing plants showed the feasibility of physics-inspired compression of industrial data, as well as tremendous gains in search speed when the compressed data was organized into a distance-based, tree-structured database.
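To make the idea of parsing a signal into exhaustive, mutually exclusive segments and compressing each into a few interpretable numbers more concrete, the following is a minimal sketch in Python. It is not the authors' algorithm: the steady/transient labelling rule, the tolerance `tol`, the signature fields, and the function names `parse_segments`, `extract_signature`, and `reconstruct` are all illustrative assumptions, and the distance-based tree-structured database is not covered.

```python
from statistics import mean


def parse_segments(signal, tol=0.05):
    """Split a signal into contiguous 'steady' and 'transient' runs.

    Hypothetical rule: a sample whose change from the previous sample
    exceeds `tol` is labelled transient. The returned (start, end, label)
    triples use half-open index ranges and cover every sample exactly once.
    """
    if not signal:
        return []
    # Label each sample by the size of the change leading into it.
    labels = ["steady"]
    for prev, cur in zip(signal, signal[1:]):
        labels.append("transient" if abs(cur - prev) > tol else "steady")
    # Merge consecutive samples that share a label into one segment.
    segments = []
    start = 0
    for i in range(1, len(signal) + 1):
        if i == len(signal) or labels[i] != labels[start]:
            segments.append((start, i, labels[start]))
            start = i
    return segments


def extract_signature(signal, start, end, label):
    """Summarize one segment with a handful of interpretable numbers."""
    chunk = signal[start:end]
    return {
        "start": start,
        "length": end - start,
        "label": label,
        "first": chunk[0],
        "last": chunk[-1],
        "mean": mean(chunk),
    }


def reconstruct(signatures):
    """Approximate the original signal from the signatures alone."""
    out = []
    for s in signatures:
        n = s["length"]
        if s["label"] == "steady" or n == 1:
            # Steady segments are approximated by their mean level.
            level = s["mean"] if s["label"] == "steady" else s["last"]
            out.extend([level] * n)
        else:
            # Transient segments are approximated by a straight ramp.
            step = (s["last"] - s["first"]) / (n - 1)
            out.extend(s["first"] + k * step for k in range(n))
    return out


# Usage: a flat level, a ramp, then a second flat level.
signal = [0.0] * 5 + [1.0, 2.0, 3.0] + [3.0] * 4
segments = parse_segments(signal)
signatures = [extract_signature(signal, *seg) for seg in segments]
approximation = reconstruct(signatures)
```

In this sketch each segment is stored as six values regardless of its length, so the saving grows with segment duration. The approximation error depends on how well a constant level or a straight ramp describes each segment, which is where a richer, physically motivated set of signatures would matter.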