A relaxed support vector data description algorithm based fault detection in distribution systems

Chu, Fei; Lu, Zhenlin; Jin, Shuowei; Liu, Xin; Yu, Ziyang

doi:10.3389/fenrg.2022.973794

ORIGINAL RESEARCH article

Front. Energy Res., 22 July 2022

Sec. Smart Grids

Volume 10 - 2022 | https://doi.org/10.3389/fenrg.2022.973794

This article is part of the Research Topic: Recent Advances of Edge Computing for Smart Grid

Fei Chu¹,²

Zhenlin Lu²

Shuowei Jin³*

Xin Liu³

Ziyang Yu³

¹Nanjing University of Aeronautics and Astronautics, Nanjing, China
²Beijing Microelectronics Technology Institute, Beijing, China
³Northeastern University, Shenyang, China

Fault detection in the distribution network is essential for reliable and secure power distribution. In this paper, a flexible dual-threshold SVDD fault warning algorithm with fault samples is proposed to deal with problems concerning complex network topology, accessible data, and missing fault data in the power grid. To handle the complicated network topology and the wide variety of signal types, we combine wavelet packet energy features with Spearman rank correlation to extract electrical signal features, achieving accurate feature extraction across multiple signal types. To address the untimely judgement and low accuracy of the original SVDD, a relaxed SVDD fault warning algorithm with fault samples is proposed. We turn the original SVDD boundary into a double-layer boundary that divides the hypersphere space into three regions, which increases the sensitivity to fault samples and lessens the risk of missed detection. In addition, an adaptive update strategy is developed, which reduces the computational effort of the model and makes it more applicable to distribution networks. Finally, the method is applied to numerical examples and fault detection experiments, and the experimental results verify its effectiveness and superiority.

1 Introduction

The distribution network, the last link of the power system, is directly responsible for meeting users' requirements for the stability, safety, quality, and economy of electric energy. Reliable and rapid fault warning is therefore essential for the safe operation of the distribution network. The fault warning algorithms and methods introduced so far can be categorized as follows.

• Analysis based on expertise and prior knowledge

• Analysis based on artificial neural network

• Analysis based on data-driven methods

Expert system-based fault information analysis is applicable to some cases where the information obtained is incomplete (Yi and Etemadi, 2017; Wang X. et al., 2018; Peng et al., 2018). However, in recent years, the introduction of large amounts of distributed generation and loads has caused the complexity of power systems, in both size and coupling, to grow exponentially. The coupling between these subsystems and components is still being researched and has not yet been clarified, which makes it difficult for this method to keep up with the growing network complexity and may reduce its applicability. Moreover, this method takes a long time to build its database and has little tolerance for errors, which further limits the use of expert systems for fault warning.

Fault diagnosis methods based on artificial neural networks imitate the way the human nervous system transmits and processes information (Gopakumar et al., 2015; Yansong et al., 2018; Fei et al., 2019; Guomin et al., 2019; Saizhao et al., 2019; Han et al., 2022). To address the difficult threshold selection, low accuracy, and long detection time of existing fault detection methods, the literature (Saizhao et al., 2019) proposes an artificial neural network-based rapid diagnosis method for production line electricity. A new method of fault detection and fault location is proposed in the literature (Gopakumar et al., 2015), which achieves a better classification of fault types and a faster location of faults. Although artificial neural networks process equipment power faults quickly and have good fault redundancy, they require a large amount of supporting data. In practice, the power consumption information of production equipment is limited and fault data are often missing, which severely limits their application.

Instead of exploring the faults or fault models, the data-driven analysis approach analyzes the characteristics of the received signals and correlates them with fault states. Many algorithms have been proposed, such as principal component analysis, model analysis, etc. (Feng and Zuo, 2013; Gritli et al., 2013; Joksimovi et al., 2013; Clemente-Alarcon et al., 2014; Hong and Dhupia, 2014; Lu et al., 2017). Some studies have explored data-driven methods for power system fault detection. The most commonly adopted schemes are machine learning-based classifiers, e.g., decision tree-based classifiers (Chouder and Silvestre, 2010) and Support Vector Machine (SVM)-based classifiers (Zheng et al., 2018). Among them, Support Vector Data Description (SVDD) is widely used in the field of fault warning. SVDD can be established with few samples and with normal samples only, which has made it a hot spot in fault diagnosis tasks such as real-time monitoring and fault warning. The existing problems of conventional SVDD and the ways to improve them are as follows. 1) For faulty samples, many current online SVDD algorithms either discard them (Lei, 2009) or treat them as normal samples (Davy et al., 2006) and continue to use them, which can lead to considerable data waste and computational errors. The literature (Tax and Duin, 2004) points out that fault samples are valuable, and that adding a few fault samples to the training can effectively improve the diagnosis accuracy of the SVDD model. 2) In applications, the criterion that distinguishes whether the power system is faulty is fuzzy. Specifically, when the SVDD is trained, faulty samples near the boundary are likely to fall inside it and become new support vectors, and, similarly, normal samples inside the model may fall outside it and be treated as faulty samples. Therefore, the similarity of these misclassified samples near the boundary introduces a certain detection error in applications (Guo et al., 2009). The literature (Lei, 2009) proposes a dual threshold to distinguish misclassified samples from support vectors. The literature (Mu and Nandi, 2009) proposes a ν-SVDD algorithm to overcome the effect of outliers and noise appearing on the SVDD boundary.

However, a comprehensive analysis of all available signal types, that is, accurate feature extraction, is a prerequisite and a crucial step for implementing fault warning. To address this challenge, the literature (Yilun et al., 2020) uses a point estimation method to calculate the currents in the power system and extracts the electrical signal features by deep learning methods. Although feature recognition is improved, the deep learning approach leads to a large diagnostic model and a long diagnostic time. In the literature (Lu et al., 2021), single switching quantity information or electrical quantity information is used to judge the faults of the power grid, and it is demonstrated that multiple signal features can better improve the accuracy of grid fault detection. The literature (Liu et al., 2019) proposes a multi-source log comprehensive feature extraction method based on the restricted Boltzmann machine (RBM), in which the RBM fully exploits the valid information in the grid signal to obtain more accurate signal features. To address the low diagnostic accuracy and difficult feature extraction of microgrids, the literature (Xin et al., 2021) proposes a microgrid fault diagnosis method that combines wavelet feature extraction and deep learning; the final diagnostic accuracy was improved, but the effect of feature reproduction for the grid signal was not obvious. Although the literature (Xiang et al., 2015) proposes a combined feature fusion approach for online fault diagnosis, it is limited to fusing only two features and therefore does not apply to scenarios with many signal classes and complex feature coupling.

Only a single type of signal is used for feature extraction in the research methods and applications mentioned above. However, in power distribution, many variables of the network are measured and recorded, each containing some fault characteristics. This means that fault characteristics are irregularly distributed across multiple signals, and this complex nonlinear relationship is difficult to analyze. A comprehensive analysis of all useful collected signals using novel data correlation techniques is therefore required.

Thus, this paper aims to provide a new fault warning scheme for power systems. The method can be divided into two parts, i.e., wavelet packet energy entropy feature extraction with Spearman correlation analysis, and an optimized flexible dual-threshold SVDD classifier. Wavelet packet decomposition is used to generate key feature combinations that lay the foundation for classifier construction; the flexible dual-threshold SVDD algorithm introduces the idea of relaxed boundaries to increase the sensitivity to faulty samples.

The primary contributions of this paper can be summarized as follows.

• A novel feature extraction method, namely wavelet packet-Spearman rank correlation feature extraction, is proposed to analyze and fully exploit the correlation between different signal types, which is especially suitable for power distribution with complex network structures and coupling between signals;

• The flexible dual-threshold SVDD algorithm is proposed to form a variable double-layer discriminant boundary by introducing a relaxed boundary strategy, which effectively increases the discrimination degree of the model for fault samples and greatly reduces diagnostic false alarms;

• An online adaptive update strategy is proposed, which greatly reduces the computational effort of the model and makes it widely applicable to fault warning in systems whose signals are complex and strongly coupled.

The rest of this paper is organized as follows. In Section 2, wavelet packets, Spearman correlation, and other related theories are introduced. In Section 3, the WPDSR-SVDD framework and the technical details of the proposed method are given. Section 4 presents the comparative experiments and experimental results of the proposed method and other methods. Section 5 concludes the paper.

2 Feature extraction related work

Feature extraction is the basis for achieving fault warning. Whether the key features are extracted accurately directly affects the performance and correctness of the fault warning model. However, the topology of the power network is complex, and the voltage and current signals collected by PMUs are difficult to process, inconsistent in magnitude, and strongly correlated, which calls for a comprehensive new feature extraction method. Therefore, the wavelet packet energy entropy method is used to solve the problem of non-uniform magnitudes and is combined with Spearman correlation analysis to fully exploit the effective information between different signal types.

2.1 Wavelet packet decomposition

In actual applications, many signals reflect the characteristics of the equipment, such as voltage, current, power, power factor, and other electrical signals, and these signals have non-uniform magnitudes and are correlated with each other. To extract uniform, comprehensive, and accurate features, more comprehensive and suitable methods are needed. Wavelet packet analysis offers a more flexible partition of the time-frequency plane of the signal and also analyzes the high-frequency part well. After wavelet packet decomposition, extracting the energy entropy also unifies the scale of the signals. Therefore, this paper decomposes the different types of signals using wavelet packet decomposition (WPD) and then obtains a uniform magnitude by extracting the energy entropy. WPD is described as follows (Rafiee et al., 2009).

X_{i + 1}^{2 n} = X_{i}^{n} * P (- 2 m) (1)

X_{i + 1}^{2 n + 1} = X_{i}^{n} * Q (- 2 m) (2)

Where $*$ denotes the convolution operator, $X_{i}^{n}$ denotes the wavelet packet coefficients, and $P (m)$ and $Q (m)$ represent the conjugate filters. $i$ and $n$ represent the decomposition layer and the node index, respectively. Assuming that the original signal is decomposed to the maximum depth $I$, the frequency band concentrated at the node $(i, n)$ is as follows.

\begin{array}{l} F \in [n \times F_{s} / 2^{i + 1}, (n + 1) \times F_{s} / 2^{i + 1}] \\ 1 \leq i \leq I, 0 \leq n \leq 2^{I} - 1 \end{array} (3)

Where $F_{s}$ indicates the sampling frequency of the original signal.
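As a minimal sketch of Eqs 1 to 3, one level of the decomposition and the band of a node can be written in Python as follows. The Haar filter pair is assumed here as a stand-in for the conjugate filters $P$ and $Q$, which the text does not specify, and the function names `wpd_split`, `wpd`, and `node_band` are illustrative only.

```python
import math

# Assumption: Haar filters stand in for the conjugate pair P and Q.
P = (1 / math.sqrt(2), 1 / math.sqrt(2))    # low-pass filter
Q = (1 / math.sqrt(2), -1 / math.sqrt(2))   # high-pass filter


def wpd_split(x):
    """Split one node into two children: filter, then keep every second sample."""
    if len(x) % 2:
        raise ValueError("signal length must be even")
    low = [P[0] * x[k] + P[1] * x[k + 1] for k in range(0, len(x), 2)]
    high = [Q[0] * x[k] + Q[1] * x[k + 1] for k in range(0, len(x), 2)]
    return low, high


def wpd(x, depth):
    """Return the 2**depth terminal nodes after `depth` levels of splitting.

    Nodes are returned in natural (filter) order, which in general
    differs from strict frequency order.
    """
    nodes = [list(x)]
    for _ in range(depth):
        nodes = [child for node in nodes for child in wpd_split(node)]
    return nodes


def node_band(i, n, fs):
    """Frequency band of node (i, n) according to Eq. 3."""
    width = fs / 2 ** (i + 1)
    return n * width, (n + 1) * width


# Made-up test signal: a 60 Hz sine sampled at 6 kHz.
signal = [math.sin(2 * math.pi * 60 * t / 6000) for t in range(64)]
leaves = wpd(signal, 3)
print(len(leaves), node_band(3, 0, 6000))
```

With an orthogonal filter pair such as Haar, the summed squared coefficients of the terminal nodes equal the energy of the input signal, which is the property the energy features in Section 2.3 rely on.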

2.2 Spearman correlation

Although the wavelet packet decomposition method above can effectively extract the features of a single class of signals and unify the magnitudes by extracting the energy entropy, this is not enough. As mentioned earlier, many types of signals reflect the state of the equipment, which means that the fault features are irregularly distributed across multiple signals, and these signals jointly affect the state of the system. Moreover, extracting features from a single signal type alone loses the correlation information, which affects the accuracy of the SVDD model. Therefore, methods that fully exploit the correlation between different signal types are also needed, and this paper combines Spearman rank correlation with wavelet packet energy to analyze and extract accurate features.

Spearman correlation analysis is used to assess the correlation between two variables. It requires the observations of the two variables to be paired ranks, or ranks obtained by transforming observations of continuous variables, and it does not depend on the overall distribution of the two variables or on the sample size. For every two vectors $X_{n \times 1}$ and $Y_{n \times 1}$, the Spearman rank correlation can be calculated as

ρ = \frac{cov (x, y)}{σ_{x} σ_{y}} = \frac{E [(x - μ_{x}) (y - μ_{y})]}{σ_{x} σ_{y}} = \frac{\sum_{i = 1}^{n} x_{i} y_{i} - n \bar{x} \bar{y}}{\sqrt{(\sum_{i = 1}^{n} x_{i}^{2} - n {\bar{x}}^{2}) (\sum_{i = 1}^{n} y_{i}^{2} - n {\bar{y}}^{2})}} (4)

Where $x$ and $y$ are the rank vectors of the raw vectors $X$ and $Y$, and $μ$ and $σ$ represent the mean and standard deviation, respectively. Since the ranks are the consecutive positive integers from 1 to $n$ (assuming no ties), $\bar{x}$, $\bar{y}$, $\sum_{i = 1}^{n} x_{i}^{2}$, $\sum_{i = 1}^{n} y_{i}^{2}$, and $\sum_{i = 1}^{n} x_{i} y_{i}$ can be expressed as follows (Myers et al., 1995).

\bar{x} = \bar{y} = \frac{1}{n} (1 + 2 + ... + n) = \frac{n + 1}{2} (5)

\sum_{i = 1}^{n} x_{i}^{2} = \sum_{i = 1}^{n} y_{i}^{2} = 1^{2} + 2^{2} + ... + n^{2} = \frac{n (n + 1) (2 n + 1)}{6} (6)

\sum_{i = 1}^{n} x_{i} y_{i} = \frac{1}{2} \sum_{i = 1}^{n} [x_{i}^{2} + y_{i}^{2} - {(x_{i} - y_{i})}^{2}] = \frac{n (n + 1) (2 n + 1)}{6} - \frac{1}{2} \sum_{i = 1}^{n} d_{i}^{2} (7)

Where, $d_{i} = x_{i} - y_{i}$ , thus Eq. 4 can also be written as follows.

ρ = 1 - \frac{6 \sum d_{i}^{2}}{n (n^{2} - 1)} (8)
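The rank-difference formula of Eq. 8 can be checked numerically against the Pearson correlation computed directly on the ranks. The sketch below assumes that there are no tied values; the helper names and the sample data are made up for illustration.

```python
def ranks(values):
    """Ranks 1..n of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0] * len(values)
    for rank, k in enumerate(order, start=1):
        result[k] = rank
    return result


def spearman_rho(x_raw, y_raw):
    """Spearman rank correlation via the rank-difference formula (Eq. 8)."""
    x, y = ranks(x_raw), ranks(y_raw)
    n = len(x)
    d2 = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def pearson_of_ranks(x_raw, y_raw):
    """Pearson correlation of the rank vectors, used as a cross-check."""
    x, y = ranks(x_raw), ranks(y_raw)
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum(a * b for a, b in zip(x, y)) - n * mx * my
    var_x = sum(a * a for a in x) - n * mx ** 2
    var_y = sum(b * b for b in y) - n * my ** 2
    return num / (var_x * var_y) ** 0.5


# Made-up paired observations of two signals.
voltage = [10.0, 20.0, 30.0, 40.0, 50.0]
current = [1.0, 3.0, 2.0, 5.0, 4.0]
print(spearman_rho(voltage, current), pearson_of_ranks(voltage, current))
```

For these values the squared rank differences sum to 4, so both routes give $ρ = 1 - 24 / 120 = 0.8$.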

2.3 Feature extraction

In summary, wavelet packet analysis can fully decompose the collected signal and conserves energy. Thus, it can not only unify the signal magnitudes but also indicate how much feature information each node contains by comparing the node energies after decomposition: the higher the energy of a node, the more feature information it contains and the more obvious the features. In addition, applying Spearman rank correlation to the wavelet packet decomposition results makes it possible to analyze the correlation between different signal types, so that energy features with more accurate and fuller information can be extracted. The specific implementation starts from Parseval's identity:

\int_{- \infty}^{\infty} {| f (x) |}^{2} d x = \sum {| d_{k}^{j + 1,2 n} |}^{2} (9)

From the equation, it can be seen that the square of the wavelet packet coefficients has the dimension of energy. An appropriate wavelet basis function is selected for the J-layer wavelet packet decomposition, which yields M frequency bands. The total energy is

E_{total} = \sum_{j = 1}^{2^{n - 1}} {‖ A_{j} ‖}^{2} + \sum_{j = 1}^{2^{n - 1}} {‖ D_{j} ‖}^{2} (10)

Where $A_{j}$ denotes the decomposed low-frequency component coefficients, and $D_{j}$ represents the decomposed high-frequency component coefficients. The relative wavelet packet energy of each wavelet node can be expressed as follows.

ρ_{j} = \frac{E_{j}}{E_{total}} = \frac{{‖ A_{j} ‖}^{2}}{\sum_{j = 1}^{2^{n - 1}} {‖ A_{j} ‖}^{2} + \sum_{j = 1}^{2^{n - 1}} {‖ D_{j} ‖}^{2}} (11)
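The relative energy of Eq. 11 can be illustrated with a short Python sketch. As a simplifying assumption, all terminal nodes are treated uniformly instead of being separated into low-frequency and high-frequency coefficient sets, and the coefficient values are made up.

```python
def node_energy(coeffs):
    """Energy of one node: squared norm of its coefficients."""
    return sum(c * c for c in coeffs)


def relative_energies(nodes):
    """Share of each node in the total energy of all nodes."""
    energies = [node_energy(node) for node in nodes]
    total = sum(energies)
    if total == 0:
        raise ValueError("all coefficients are zero")
    return [e / total for e in energies]


# Made-up coefficients for four terminal nodes of one signal.
nodes = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
shares = relative_energies(nodes)
print(shares)
```

Because every share is divided by the same total, the shares are dimensionless and sum to one, which is what allows signals with different physical units to be compared on a common scale.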

To extract more accurate and richer features, the signals are first decomposed by wavelet packets and the energy share is extracted by Eq. 11. Eq. 8 is then applied to the extracted energy features for correlation analysis to find the exact combined energy features. The result is shown in Eq. 12.

ρ_{WPD} = 1 - \frac{6 \sum ρ_{j}^{2}}{n (n^{2} - 1)} (12)

The energy occupation ratio calculated by wavelet packet decomposition is directly used as the feature set, and the results are shown in Table 1.

TABLE 1. Energy contribution.

As shown in Table 1, each signal contributes 20% to the feature set, which fails to reflect the influence of the relationships between the signals on the energy features. Since the energy share calculated after wavelet packet decomposition characterizes only one signal type, the correlation between the signals is explored to fully extract the signal features. Spearman correlation analysis is introduced for this purpose: the correlation between the signals is calculated according to Eq. 12 and is reflected in the change of the contribution of each signal to the feature set, which is shown in Table 2.

TABLE 2. Sr energy contribution.

As shown in Table 2, Spearman correlation analysis can characterize the correlation between the different signals, i.e., the change in the characteristic contribution of each signal. Compared with Table 1, the contributions of current and power increase.

3 Proposed fault diagnosis method

In this paper, a novel flexible dual-threshold SVDD fault warning algorithm with fault samples is proposed, which adds fault samples to the SVDD training, proposes a relaxed boundary, and introduces an offset factor. The trained model uses online power network signals for fault warning. The overall flow of fault warning is shown in Figure 1.

FIGURE 1. Overall flow chart of the fault warning.

3.1 SVDD-related studies

In production, the power system is almost always in a steady state, with few or no fault states. Therefore, the data collected by PMUs are mostly normal samples, and fault samples are seriously lacking. This limits the application of other methods; for example, expert systems, artificial neural networks, and similar models require a large number of fault samples for training.

Compared with other related methods, SVDD is uniquely suited to the lack of fault samples. The essence of SVDD is to map the feature samples to a high-dimensional space and find an optimal description boundary that contains as many of the feature samples as possible. Constructing the optimal description boundary amounts to solving a quadratic optimization problem: $[α_{1}, α_{2}, ..., α_{n}]$ is the optimal solution to the problem only if all samples $x_{i}$ satisfy the following Karush-Kuhn-Tucker (KKT) conditions.

\begin{cases} α_{i} = 0 \Rightarrow d_{i}^{2} \leq R^{2} \\ 0 < α_{i} < C \Rightarrow d_{i}^{2} = R^{2} \\ α_{i} = C \Rightarrow d_{i}^{2} \geq R^{2} \end{cases} (13)

Where $d_{i}^{2}$ is the square of the distance from the sample $x_{i}$ to the center of the sphere. Among these samples, those that satisfy $α_{i} = 0$ are located inside the hypersphere (or on its boundary), those that satisfy $0 < α_{i} < C$ are located exactly on the boundary and are the support vectors, and those that satisfy $α_{i} = C$ are located outside the hypersphere (or on its boundary). Many scholars have made efforts to improve the performance of the SVDD model. To improve the robustness and accuracy of SVDD, an SVDD algorithm with fault samples (NSVDD) has been proposed in the literature, which adds a small number of fault samples to the training of the basic SVDD model. The rest of the solution is consistent with the standard SVDD algorithm. This method improves the performance of the SVDD model to some extent.
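To make the quantities in Eq. 13 concrete, the sketch below evaluates the squared kernel-space distance of the standard SVDD formulation, $d^{2} (z) = K (z, z) - 2 \sum_{i} α_{i} K (x_{i}, z) + \sum_{i, j} α_{i} α_{j} K (x_{i}, x_{j})$, and reads off the position of a training sample from its multiplier. The Gaussian kernel is an assumption, since this section does not name the kernel, and the multipliers are made-up numbers, not the solution of an actual quadratic program.

```python
import math


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel; the kernel choice is an assumption of this sketch."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2 * sigma ** 2))


def distance_sq(z, samples, alpha, sigma):
    """Squared distance from z to the sphere center sum_i alpha_i * phi(x_i)."""
    k_zz = gaussian_kernel(z, z, sigma)
    cross = sum(a_i * gaussian_kernel(x_i, z, sigma)
                for a_i, x_i in zip(alpha, samples))
    center = sum(a_i * a_j * gaussian_kernel(x_i, x_j, sigma)
                 for a_i, x_i in zip(alpha, samples)
                 for a_j, x_j in zip(alpha, samples))
    return k_zz - 2 * cross + center


def kkt_position(alpha_i, c, tol=1e-9):
    """Position of a training sample implied by Eq. 13."""
    if alpha_i <= tol:
        return "inside"
    if alpha_i >= c - tol:
        return "outside"
    return "boundary"  # support vector


# Made-up training samples and multipliers (they sum to one).
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
alpha = [0.5, 0.25, 0.25]
print(distance_sq((0.2, 0.2), samples, alpha, sigma=0.3))
print([kkt_position(a, c=0.5) for a in alpha])
```

The radius $R$ is then the distance of any support vector with $0 < α_{i} < C$ from the center, and a test sample is compared against it.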

The key factor affecting the accuracy of the model is the shape of the optimal description boundary, which is determined by the samples located on the hypersphere boundary, known as support vectors. Therefore, accurately judging whether a sample is a support vector is crucial. In power distribution, the aging of power equipment leads to deviations between the collected data and the original equipment data. As a result, when the model encounters samples that are located outside the boundary but close to it, the traditional model regards them as faulty samples, which is not always true and exposes the model to false alarms. Similarly, samples located inside the hypersphere boundary but close to it may well be faulty, which leads to missed detections.

3.2 Flexible double-threshold SVDD

As mentioned in the previous section, the conventional SVDD directly compares the sample distance with the hypersphere radius, which produces serious false alarms and missed detections. To better separate the samples and improve the discriminative accuracy of the hypersphere boundary for the samples on both sides, this paper introduces the idea of a relaxed boundary and proposes a ball boundary offset discriminant criterion. First, the ball boundary offset factor is defined as

η = (r - R) / r (14)

Where $r$ represents the distance from the sample to the center of the sphere, and $R$ represents the radius of the hypersphere. Two thresholds $λ_{+}$ and $λ_{-}$ are then set, where $λ_{+} \geq 0$ and $λ_{-} \geq 0$. Thus, two new decision surfaces are obtained based on the original decision surface, and the hypersphere space is divided into three regions, i.e., A, B, and C, as shown in Figure 2. The samples in regions A, B, and C satisfy $η \leq - λ_{-}$, $- λ_{-} < η \leq λ_{+}$, and $η > λ_{+}$, respectively.

FIGURE 2. Flexible double-threshold SVDD.
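The offset factor of Eq. 14 and the three-region rule can be sketched in a few lines of Python. The radius and threshold values in the example are made up, and treating a sample exactly at the center ($r = 0$) as lying infinitely deep inside the sphere is an assumption of this sketch, since Eq. 14 is undefined there.

```python
def offset_factor(r, radius):
    """Ball boundary offset factor of Eq. 14: eta = (r - R) / r."""
    if r < 0:
        raise ValueError("distance must be non-negative")
    if r == 0:
        return float("-inf")  # assumption: the center is as far inside as possible
    return (r - radius) / r


def region(eta, lam_plus, lam_minus):
    """Region A, B, or C according to the two thresholds."""
    if eta <= -lam_minus:
        return "A"  # accepted as normal
    if eta <= lam_plus:
        return "B"  # relaxed band around the original boundary
    return "C"      # rejected as faulty


# Made-up values: radius R = 1.0, thresholds 0.2 and 0.1.
for r in (0.5, 1.0, 2.0):
    eta = offset_factor(r, 1.0)
    print(r, eta, region(eta, lam_plus=0.2, lam_minus=0.1))
```

Setting both thresholds to zero leaves only the single point $η = 0$ in region B, which recovers the single boundary of the conventional SVDD.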

The thresholds $λ_{+}$ and $λ_{-}$ in the relaxed discriminant criterion play a role similar to that of the penalty factor C in the SVDD model. When the traditional SVDD model is trained, C reflects the degree of tolerance of the model to erroneous points, weakening the influence of normal samples far outside the boundary (away from the sphere center) and of faulty samples close to the sphere center. After the training is completed, the thresholds $λ_{+}$ and $λ_{-}$ adjust the SVDD boundary so that the vast majority of samples contaminated by working conditions and noise enter region B, which balances the misclassification ratio of the SVDD boundary. $λ_{+}$ controls the proportion of normal samples entering region C (rejecting normal samples, often called “false alarms”); this proportion is denoted $μ_{+}$. $λ_{-}$ controls the proportion of faulty samples entering region A (accepting faulty samples, often called “misses”); this proportion is denoted $μ_{-}$.

In production practice, a missed detection is more serious than a false alarm, because it seriously endangers the safety of equipment operation and brings greater losses. Therefore, the upper limit of $μ_{-}$ should be very small, or even taken directly as 0, to increase the sensitivity to faulty samples and reduce or avoid the risk caused by missed detections. The upper limit of $μ_{+}$ should not be too large and depends on the sample size. The threshold values $λ_{+}$ and $λ_{-}$ are determined as in Figure 3.

FIGURE 3. Process for determining the threshold values $λ_{+}$ and $λ_{-}$.
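The procedure of Figure 3 is not reproduced here. As one possible illustration of how the limits on $μ_{+}$ and $μ_{-}$ constrain the thresholds, the sketch below searches a grid for the smallest $λ_{+}$ and $λ_{-}$ that keep the false-alarm and miss proportions within their limits on labeled offset factors. The grid search, the choice of the smallest feasible value, the function names, and the data are all assumptions.

```python
def fraction(values, predicate):
    """Fraction of values for which predicate holds."""
    return sum(1 for v in values if predicate(v)) / len(values)


def pick_lambda_plus(normal_etas, mu_plus_max, step=0.01, max_steps=1000):
    """Smallest grid value keeping normal samples in region C within the limit."""
    for k in range(max_steps + 1):
        lam = k * step
        if fraction(normal_etas, lambda e: e > lam) <= mu_plus_max:
            return lam
    raise ValueError("no threshold on the grid satisfies the limit")


def pick_lambda_minus(fault_etas, mu_minus_max, step=0.01, max_steps=1000):
    """Smallest grid value keeping faulty samples in region A within the limit."""
    for k in range(max_steps + 1):
        lam = k * step
        if fraction(fault_etas, lambda e: e <= -lam) <= mu_minus_max:
            return lam
    raise ValueError("no threshold on the grid satisfies the limit")


# Made-up offset factors of labeled normal and faulty samples.
normal_etas = [-0.5, -0.3, -0.2, 0.053, 0.127]
fault_etas = [0.4, 0.3, -0.153]
print(pick_lambda_plus(normal_etas, mu_plus_max=0.25))
print(pick_lambda_minus(fault_etas, mu_minus_max=0.0))
```

Taking the limit on $μ_{-}$ as 0 forces $λ_{-}$ to grow until no labeled faulty sample remains in region A, which reflects the priority given to avoiding missed detections.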

3.3 Adaptive update policy

The basic idea of model updating is to continuously add newly collected sample data while eliminating the same number of original samples, forming a new sample set for retraining the SVDD model. This update idea avoids the frequent updates and heavy computation of the SVDD model, improves the efficiency of the algorithm, and makes it more suitable for real-time online fault warning in power distribution. Samples entering region B cannot be directly considered normal or faulty.

The ideal situation for fault warning is that, for the test data, normal samples are located in region A and faulty samples are located in region C. However, because of the complex working conditions of the power system network equipment, a large amount of noise is likely to be mixed into the signal, and a number of samples will enter region B. The method adopted in this paper is to count and record the proportion of the samples to be detected that enter region B, and to set an alarm threshold $θ$. If the proportion is higher than the threshold, a potential danger is considered to exist even though no fault has yet developed, so that an early warning signal can be given and the unnecessary losses caused by untimely maintenance can be reduced. Suppose that the model is trained at moment $t - 1$ and the support vectors obtained are $\{S V_{1, t - 1}, ..., S V_{l, t - 1}\}$. Detection is then performed, and the training samples and detection model are updated according to the following rules.

1) If $η_{x (t)} > λ_{+}$ and $S V_{1, t - 1} > 0$, there is no need to update the training samples or the SVDD model, and the sample is directly considered a faulty sample at that moment.

2) If $η_{x (t)} > λ_{+}$ and $S V_{1, t - 1} = 0$, same as (1).

3) If $η_{x (t)} \leq - λ_{-}$ and $S V_{1, t - 1} = 0$, update the training samples $\{x_{1, t - 1}, ..., x_{l, t - 1}\} = \{x_{2, t - 1}, ..., x_{l, t - 1}, x (t)\}$ and update the support vectors $\{S V_{1, t - 1}, ..., S V_{l, t - 1}\} = \{S V_{2, t - 1}, ..., S V_{l, t - 1}, S V_{1, t - 1}\}$. The sample at that moment is judged as normal and can participate in the next training.

4) If $η_{x (t)} \leq - λ_{-}$ and $S V_{1, t - 1} > 0$, update the training samples $\{x_{1, t - 1}, ..., x_{l, t - 1}\} = \{x_{2, t - 1}, ..., x_{l, t - 1}, x (t)\}$; the support vectors remain unchanged. The sample at that moment is judged as normal but does not participate in the next training.

5) If $- λ_{-} < η_{x (t)} \leq λ_{+}$ and $S V_{1, t - 1} > 0$, same as (4).

6) If $- λ_{-} < η_{x (t)} \leq λ_{+}$ and $S V_{1, t - 1} = 0$, same as (4).

In summary, when the test sample is located in region C, i.e., $η_{x (t)} > λ_{+}$, the system is judged to be working abnormally and a warning signal is given in time. When both $x_{1, t - 1}$ and $x (t)$ are located in region A, $x_{1, t - 1}$ is removed from the sample set, $x (t)$ is added to the training sample set, and its corresponding support vector is assigned a value of 0. When the test sample is located in region B, both the model and the threshold need to be updated, and the percentage of samples entering region B over a period of time is counted and recorded. When this percentage reaches the alarm threshold $θ$, a warning is issued in time, and if the warnings are continuous, the system is judged to be in a fault state.
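The sliding-window part of the update rules and the region B statistic can be sketched as follows. This is a simplified illustration, not the full procedure: retraining and the support-vector bookkeeping of rules 3 to 6 are omitted, the length of the history used for the region B proportion is an assumed parameter, and all names and numbers are hypothetical.

```python
from collections import deque


def classify(eta, lam_plus, lam_minus):
    """Region of a test sample from its offset factor."""
    if eta > lam_plus:
        return "C"
    if eta <= -lam_minus:
        return "A"
    return "B"


def process_sample(x_t, eta, window, b_history, lam_plus, lam_minus, theta):
    """Handle one test sample and return (region, early_warning)."""
    reg = classify(eta, lam_plus, lam_minus)
    if reg != "C":
        # Regions A and B: the oldest training sample leaves and x(t) enters.
        # `window` is a deque with maxlen, so append drops the oldest item.
        window.append(x_t)
    # Region C: the training set stays unchanged and the sample is faulty.
    b_history.append(reg == "B")
    b_ratio = sum(b_history) / len(b_history)
    return reg, b_ratio > theta


# Made-up stream of (sample id, offset factor) pairs.
window = deque([1, 2, 3], maxlen=3)
b_history = deque(maxlen=4)
stream = [(9, 0.5), (4, -0.5), (5, 0.0), (6, 0.1), (7, 0.05)]
for x_t, eta in stream:
    print(process_sample(x_t, eta, window, b_history, 0.2, 0.1, 0.6), list(window))
```

Keeping the training set at a fixed length is what bounds the cost of each retraining step, and the region B proportion provides the early warning before any sample reaches region C.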

4 Experiments

4.1 Simulation model and parameterization

Wide area measurement systems (WAMS) using PMUs have been widely deployed worldwide in recent years. WAMS can measure and transmit multiple signals according to the GPS-synchronized clock. PMUs use GPS signals for synchronized voltage and current measurements and provide information such as frequency, phase, and amplitude. These signals are collected and transmitted to the master station at each sampling moment, so the data can be grouped and stored synchronously. The standard IEEE 14-bus power system is built in PSCAD/EMTDC to verify the effectiveness and superiority of the proposed method. The network structure and configuration of the simulation model are shown in Figure 4. The frequency of this standard model is 60 Hz, and the sampling frequency is set to 6 kHz, which is achievable because the PMUs can sample ten thousand points per second. This sampling frequency is far more than twice the system frequency, so the Nyquist sampling criterion is satisfied. In addition, the generator power and the load power in this experiment are also shown in Figure 4.

FIGURE 4. Simulation model built in PSCAD/EMTDC.

4.2 Wavelet packet-Spearman feature verification

The signals collected through PMUs are electrical signals such as voltage, current, active power, reactive power, and power factor, which have different scales and therefore need to be standardized. Wavelet packet energy decomposition is applied to unify them into energy features. The energy share of the five different signals after wavelet packet decomposition is shown in Figure 5.

FIGURE 5. Energy share of the five signals after decomposition.

According to the energy conservation property of wavelet decomposition described above, the larger the energy share of a wavelet packet decomposition band, the more obvious the characteristics it reflects. As shown in Figure 5, the wavelet packet decomposition yields eight nodes, and for all five signal types the first node has the highest energy, so the energy value of the first node is used as the combined feature. By extracting this maximum energy value and calculating the energy share of each signal, combined features of uniform magnitude are obtained. If the signals did not interfere with each other and had independent features, the combined energy features would be assigned the same weights; however, under operating conditions, there is complex coupling between the power system signals, so Spearman rank correlation is introduced to calculate the coupling between the features and recalculate the weights to obtain a more accurate energy share. A comparison of the energy characteristics after adding the Spearman rank correlation is shown in Figure 6.

FIGURE 6. Comparison of energy characteristics.

As shown in Figure 6, the proportions of the energy characteristics change significantly after Spearman’s rank correlation is added: the signals go from playing approximately equal roles, i.e., being treated as independent of each other, to the current playing the dominant role and forming a mutual “constraint” relationship with the other signals. This also confirms that, in industrial production power systems, the complexity of the environment leads to strong coupling between signals, so using the energy features directly as the feature set is not sufficient and degrades the diagnostic accuracy of the early warning model. To investigate the effectiveness of the proposed method, the original energy features and the energy features with Spearman rank correlation added are each input into the SVDD model. The diagnostic accuracy and the diagnostic elapsed time of the two methods are shown in Table 3.

TABLE 3. Feature comparison.

As Table 3 shows, when the coupling between features is not considered, the diagnostic accuracy is significantly lower than that of the feature combination weighted by Spearman’s rank correlation. Besides, the diagnostic elapsed time is almost unchanged, which further verifies that the improved feature extraction method proposed in this paper raises the diagnostic accuracy without sacrificing diagnostic speed.

4.3 Verification of WPDSR-SVDD

To fully validate the performance of the proposed method, three levels of experiments are conducted, all using the combined energy features extracted above as the feature dataset. Fault warning is performed on the signals acquired by PMUs using the original SVDD, SVDD with fault samples, and WPDSR-SVDD, respectively. The numbers of normal and faulty samples are set as $n_{+} = 90$ and $n_{-} = 10$, the range of the penalty factor C for the two types of samples is set as C = [0.2–0.9], and the hyper-parameter is set to $σ = 0.3$. Through cross-validation, the upper limits of $μ_{+}$ and $μ_{-}$ are set as 0.04 and 0.001, the alarm cap is set as 0.6, and a warning signal is given when there are three consecutive alarms. The combined energy features are used as the feature sample set to train the SVDD model, and the test set is then input for fault warning. The results are shown in Figure 7, where the vertical coordinate is the state category label: 1 represents normal, -1 represents the presence of an abnormality, and 0 represents a warning. Figure 7 (1) to (3) correspond to the original SVDD algorithm, SVDD with fault samples, and the WPDSR-SVDD algorithm proposed in this paper, respectively.
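The labelling convention and the three-consecutive-alarm rule can be illustrated with a short sketch. It assumes that a trained model supplies each test sample's distance to the hypersphere centre together with an inner and an outer radius, and that any sample outside the inner boundary counts as an alarm. The function names, the radii, and the distances below are hypothetical and are not the values used in the experiments.

```python
def region_label(distance, r_inner, r_outer):
    """Map a distance to the sphere centre onto the three state labels.

    1  -> inside the inner boundary (normal)
    0  -> between the two boundaries (warning)
    -1 -> outside the outer boundary (abnormal)
    """
    if distance <= r_inner:
        return 1
    if distance <= r_outer:
        return 0
    return -1


def warning_indices(labels, consecutive=3):
    """Indices at which `consecutive` non-normal labels in a row have been seen.

    Assumption: labels 0 and -1 both count as alarms. The rule fires once
    per run of alarms, at the sample that completes the run.
    """
    hits = []
    run = 0
    for i, label in enumerate(labels):
        run = run + 1 if label != 1 else 0
        if run == consecutive:
            hits.append(i)
    return hits


# Hypothetical distances and radii, chosen only to exercise all three regions.
distances = [0.4, 0.9, 1.05, 1.1, 1.3, 0.5]
labels = [region_label(d, r_inner=1.0, r_outer=1.2) for d in distances]
print(labels)
print(warning_indices(labels))
```

Requiring several alarms in a row before signalling is a simple way to suppress isolated outliers, at the cost of a delay of a few samples.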

FIGURE 7. Comparison of three algorithms.

Comparing Figure 7 (1) to (3), it is obvious that the original SVDD suffers from serious false alarms and missed detections. Although the SVDD algorithm with fault samples improves the diagnostic accuracy to a certain extent, it still cannot eliminate false alarms and missed detections, so its diagnostic accuracy remains low. The method proposed in this paper, however, effectively warns of the faults that appear in the power system and gives early warning signals in the pre-fault stage. This shows that the WPDSR-SVDD algorithm improves the early warning accuracy while effectively avoiding false alarms and missed detections in the diagnosis process. The diagnostic accuracy and modeling time of the three algorithms are shown in Table 4, from which it is clear that the proposed method is highly effective.
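For readers who wish to reproduce this kind of comparison, the quantities discussed above can be computed from label sequences as in the following sketch. It assumes the usual definitions (false alarm rate over normal samples, missed detection rate over faulty samples) and counts a predicted warning label 0 as a flagged sample; the function name and the example labels are illustrative and do not come from the experiments.

```python
def detection_rates(true_labels, predicted_labels):
    """Accuracy, false alarm rate and missed detection rate.

    true_labels use 1 (normal) and -1 (fault). A prediction of 0 (warning)
    or -1 is treated as "flagged"; only a prediction of 1 is "not flagged".
    Both classes must be present in true_labels.
    """
    normal = [p for t, p in zip(true_labels, predicted_labels) if t == 1]
    faulty = [p for t, p in zip(true_labels, predicted_labels) if t == -1]
    false_alarms = sum(1 for p in normal if p != 1)
    missed = sum(1 for p in faulty if p == 1)
    correct = (len(normal) - false_alarms) + (len(faulty) - missed)
    return {
        "accuracy": correct / len(true_labels),
        "false_alarm_rate": false_alarms / len(normal),
        "missed_detection_rate": missed / len(faulty),
    }


# Invented labels: four normal samples followed by two faulty ones.
truth = [1, 1, 1, 1, -1, -1]
predicted = [1, 1, 0, 1, -1, 1]
print(detection_rates(truth, predicted))
```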

TABLE 4. Diagnosis efficiency.

5 Conclusion

In this paper, a new generalized resilient dual-threshold fault warning method with fault samples named WPDSR-SVDD is proposed for complex system networks, especially distribution power networks. The use of fault samples, the division of real-time samples, and the optimization and updating of the training model are successfully applied to practical production. The experimental results show that WPDSR-SVDD can provide accurate and fast fault warnings even in the case of insufficient data, and the main conclusions are as follows.

1) Energy feature extraction is introduced to unify the magnitudes of different kinds of signals to form combined features, and Spearman rank correlation is introduced to solve the problem of mutual coupling between signals, which finally improves the diagnostic accuracy.

2) The relaxed boundary criterion is proposed for the traditional SVDD, and the offset factor is introduced to change the original SVDD from an exact judgment to a fuzzy judgment, which in turn improves the robustness of the diagnosis system.

3) A model capable of online adaptive updating is proposed, which reduces the risk of false alarms and missed detections while meeting the requirements of real-time online fault detection.

Moreover, some aspects of the proposed approach still need improvement. For example, since its modeling is an unsupervised learning process, online updating of the model is extremely challenging. In addition, the proposed method can only be used for early fault warning in the distribution system; it cannot analyze the fault type, because the model is trained with only normal data. Addressing these limitations may be the direction of further research.

Data availability statement

The raw data supporting the conclusion of this article will be made available by the authors, without undue reservation.

Author contributions

SJ has published a total of three journal articles: “A 9 ps Time-to-Digital Converter Based on Multiple Sampling in 0.18 μm CMOS”, “Theoretical analysis for the influence of the core radius on long period fiber grating sensors”, and “A Hysteresis Comparator for Level-Crossing ADC”. All authors contributed to the article and approved the submitted version.

Funding

This work was partially supported by the Fundamental Research Funds for the Central Universities (N2104018).

Acknowledgments

I would like to extend my sincere gratitude to my supervisor, SJ, for his instructive advice and useful suggestions on my thesis. I am deeply grateful for his help in the completion of this thesis.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Chouder, A., and Silvestre, S. (2010). Automatic supervision and fault detection of PV systems based on power losses analysis. Energy Convers. Manag. 51, 1929–1937. doi:10.1016/j.enconman.2010.02.025

CrossRef Full Text | Google Scholar

Clemente-Alarcon, V., Antonino-David, J. A., Riera-Guasp, M., and Vlcek, M. (2014). Induction motor diagnosis by advanced notch fir filters and the Wigner-Ville distribution. IEEE Trans. Industrial Electron. 61 (8), 4217–4227.

Google Scholar

Davy, M., Desobry, F., Gretton, A., and Doncarli, C. (2006). An online support vector machine for abnormal events detection. Signal Process. 86 (8), 2009–2025. doi:10.1016/j.sigpro.2005.09.027

CrossRef Full Text | Google Scholar

Fei, X., Jianping, Y., Xiangli, D., Kang, Y., Congcong, W., and Haiyun, Y. 2019. “Power grid fault diagnosis method based on remote signaling data fault coding technology and DHNN correction [J]”. Power Syst. Prot. control (21). doi:10.19783/j.cnki.pspc.181497

CrossRef Full Text | Google Scholar

Feng, Z., and Zuo, M. J. (2013). Fault diagnosis of planetary gearboxes via torsional vibration signal analysis. Mech. Syst. Signal Process. 36 (2), 401–421. doi:10.1016/j.ymssp.2012.11.004

CrossRef Full Text | Google Scholar

Gopakumar, P., Reddy, M. J. B., and Mohanta, D. K. Adaptive fault identification and classification methodology for smart power grids using synchronous phasor angle measurements. IET Generation, Transm. distribution, 2015 9(2), 133–145. doi:10.1049/iet-gtd.2014.0024

CrossRef Full Text | Google Scholar

Gritli, Y., Zarri, L., Rossi, C., Filippetti, F., Capolino, G., Casadei, D., et al. (2013). Advanced diagnosis of electrical faults in wound-rotor induction machines. IEEE Trans. Ind. Electron. 60 (9), 4012–4024. doi:10.1109/tie.2012.2236992

CrossRef Full Text | Google Scholar

Guo, S. M., Chen, L. C., and Tsai, J. S. H. (2009). A boundary method for outlier detection based on support vector domain description. Pattern Recognit. 42, 77–83. doi:10.1016/j.patcog.2008.07.003

CrossRef Full Text | Google Scholar

Guomin, L., Yingjie, T., Changyuan, Y., Yinglin, L., and Jinghan, H. (2019). Deep learning-based fault location of DC distribution networks. J. Eng. (Stevenage). 16 (3), 3301–3305. doi:10.1049/joe.2018.8902

CrossRef Full Text | Google Scholar

Han, W., Haifeng, Z., Dongqiang, Z., Zhonghua, L., Xingjie, Z., and Tianmin, G. 2022. Fault diagnosis of rectifier circuit based on WPD‑PSO algorithm [J]. Journal of Jimei University (Natural Science Edition) 27 (3), 253–259. doi:10.19715/j.jmuzr.2022.03.08

CrossRef Full Text | Google Scholar

Hong, L., and Dhupia, J. S. (2014). A time domain approach to diagnose gearbox fault based on measured vibration signals. J. Sound Vib. 333 (7), 2164–2180. doi:10.1016/j.jsv.2013.11.033

CrossRef Full Text | Google Scholar

Joksimovi, G. M., Rieger, J., Wolbank, T. M., Peri, N., and Vaak, M. (2013). Statorcurrent spectrum signature of healthy cage rotor induction machines. IEEE Trans. Ind. Electron. 60 (9), 4025–4033. doi:10.1109/tie.2012.2236995

CrossRef Full Text | Google Scholar

Lei, H., Online fault detection algorithm based on double-threshold OCSVM and its application. J. Mech. Eng., 2009, 45(3): 169–173. doi:10.3901/JME.2009.03.169

CrossRef Full Text | Google Scholar

Liu, D., Yu, H., Wang, W., Zhang, H., Zhao, X., Zhao, Y., et al. (2019). “Multi-source log comprehensive feature extraction method based on restricted Boltzmann machine in power information system,” in 2019 IEEE 11th International Conference on Communication Software and Networks (ICCSN), Chongqing, China, , 503–508. doi:10.1109/ICCSN.2019.8905373

CrossRef Full Text | Google Scholar

Lu, J., Zhao, R., Li, B., Li, H., and Tan, H. (2021). “Intelligent fault diagnosis method of power grid based on multi-source feature fusion,” in 2021 IEEE 5th Conference on Energy Internet and Energy System Integration (EI2), Taiyuan, China, 1794–1797. doi:10.1109/EI252483.2021.9713464

CrossRef Full Text | Google Scholar

Lu, S., He, Q., Yuan, T., and Kong, F. (2017). Online fault diagnosis of motor bearing via stochastic-resonance-based adaptive filter in an embedded system. IEEE Trans. Syst. Man. Cybern. Syst. 47 (7), 1111–1122. doi:10.1109/tsmc.2016.2531692

CrossRef Full Text | Google Scholar

Mu, T. T., and Nandi, A. K. (2009). Multiclass classification based on extended support vector data description. IEEE Trans. Syst. Man. Cybern. B 39 (5), 1206–1216. doi:10.1109/tsmcb.2009.2013962

PubMed Abstract | CrossRef Full Text | Google Scholar

Myers, J. L., Well, A. D., and Lorch, R. F. (1995). Research design and statistical analysis. 2nd ed. New York: Lawrence Erlbaum.

Google Scholar

Peng, H., Wang, J., Ming, J., Shi, P., Prez Jimenez, M. J., Yu, W., et al. (2018). Fault diagnosis of power systems using intuitionistic fuzzy spiking neural p systems. IEEE Trans. Smart Grid 9 (5), 4777–4784. doi:10.1109/TSG.2017.2670602

CrossRef Full Text | Google Scholar

Rafiee, A. H. J., Tse, P., and Sadeghi, M. (2009). A novel technique for selecting mother wavelet function using an intelligent fault diagnosis system. Expert Syst. Appl. 36 (3), 4862–4875. doi:10.1016/j.eswa.2008.05.052

CrossRef Full Text | Google Scholar

Saizhao, Y., Wang, X., Junyu, Z., Hong, R., Shukai, X., Runhong, H., et al. 2019. “Fault detection method of overhead flexible DC power network based on artificial neural network” [J]. Chin. J. Electr. Eng. (15). doi:10.13334/j.0258-8013.pcsee.181340

CrossRef Full Text | Google Scholar

Tax, D. M. J., and Duin, R. P. W. (2004). Support vector data description. Mach. Learn. 54, 45–66. doi:10.1023/b:mach.0000008084.60811.49

CrossRef Full Text | Google Scholar

Wang, X., McArthur, S. D. J., Strachan, S. M., Kirkwood, J. D., and Paisley, B. (2018). A data analytic approach to automatic fault diagnosis and prognosis for distribution automation. IEEE Trans. Smart Grid 9 (6), 6265–6273. doi:10.1109/tsg.2017.2707107

CrossRef Full Text | Google Scholar

Xiang, D., and Cen, J. 2015. A rolling bearing fault diagnosis method based on EMD entropy feature fusion [J]. J. Aerodyn. 30 (05), 1149–1155. doi:10.13224/j.cnki.jasp.2015.05.016

CrossRef Full Text | Google Scholar

Xin, Y., Congyun, X., and Ping, X. 2021. “Microgrid fault diagnosis and classification method based on wavelet feature extraction and deep learning” [J]. Smart power 49 (12), 17–24. doi:10.3969/j.issn.1673-7598.2021.12.004

CrossRef Full Text | Google Scholar

Yansong, W., Xueying, Z., and Jingbo, Y. 2018. “Fault location and fault tolerance algorithm for distribution network” [J]. Power autom. Equip. 38 (4), 9–15. doi:10.16081/j.issn.1006-6047.2018.04.002

CrossRef Full Text | Google Scholar

Yi, Z., and Etemadi, A. H. (2017). Fault detection for photovoltaic systems based on multi-resolution signal decomposition and fuzzy inference systems. IEEE Trans. Smart Grid 8 (3), 1274–1283. doi:10.1109/TSG.2016.2587244

CrossRef Full Text | Google Scholar

Yilun, Z., Xinjian, C., Qiang, G., Daojian, H., Zhouhong, W., and Wei, L. 2020. “A power flow feature extraction method based on deep reinforcement learning” [J]. Power grid clean energy 36 (3), 7–12. doi:10.3969/j.issn.1674-3814.2020.03.002

CrossRef Full Text | Google Scholar

Zheng, X., Geng, X., Xie, L., Duan, D., Yang, L., and Cui, S. (2018). “A SVM-based setting of protection relays in distribution systems,” in Proc. IEEE Texas Power Energy Conf. (TPEC), College Station, TX, USA, 1–6. doi:10.1109/TPEC.2018.8312071

CrossRef Full Text | Google Scholar

Keywords: fault warning, SVDD, Spearman rank correlation, relaxation boundary, energy feature extraction

Citation: Chu F, Lu Z, Jin S, Liu X and Yu Z (2022) A relaxed support vector data description algorithm based fault detection in distribution systems. Front. Energy Res. 10:973794. doi: 10.3389/fenrg.2022.973794

Received: 20 June 2022; Accepted: 04 July 2022;
Published: 22 July 2022.

Edited by:

Peng Zeng, Shenyang Institute of Automation (CAS), China

Reviewed by:

Hua Chunsheng, Liaoning University, China
Ting Li, Beijing Institute of Graphic Communication, China

Copyright © 2022 Chu, Lu, Jin, Liu and Yu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Shuowei Jin
