Skip to main content

METHODS article

Front. Energy Res., 11 October 2024
Sec. Smart Grids
This article is part of the Research Topic Data-Driven Approaches for Efficient Smart Grid Systems View all 14 articles

Bad data identification method considering the on-load tap changer model

Shiyao Hu
Shiyao Hu1*Chunyan RongChunyan Rong1Mengnan ZhangMengnan Zhang2Linjie ChaiLinjie Chai1Yuxuan MaYuxuan Ma2Tianlei ZhangTianlei Zhang2
  • 1Economic and Technology Research Institute of State Grid Hebei Electric Power Co., Ltd., Shijiazhuang, Hebei, China
  • 2School of Electrical and Control Engineering, North China University of Technology, Beijing, China

With the connection and integration of renewable energy, the on-load tap-changer (OLTC) has become an important device for regulating voltage in distribution networks. However, due to non-smooth and non-linear characteristics of OLTC, traditional bad data identification and state estimation methods for transmission network are limited when applied to the distribution network. Therefore, the nonlinearity and droop control constraints of the OLTC model are considered in this paper. At the same time, the quadratic penalty function is introduced to realize the fast normalization of the tap position. It proposes a two-stage bad data identification method based on mixed-integer linear programming. In the first stage, suspicious measurements are identified using projection statistics and maximum normalized residual methods for preprocessing the measurement data. In the second stage, a linearization approach utilizing hyhrid data-physical driven is applied to handle nonlinear constraints, leading to the development of a bad data identification model based on mixed-integer linear programming. Finally, the proposed methodology is validated using a modified IEEE-33 node test feeder example, demonstrating the accuracy and efficiency of bad data identification.

1 Introduction

In the context of distribution network state estimation, the integrity of the estimation process can be compromised by the presence of erroneous data. Such data may arise from a variety of sources, including heterogeneous measurement instruments, sensor failures, and communication disruptions. These inaccuracies have the potential to severely impact the reliability and accuracy of the state estimation outcomes (Chen et al., 2021). The same time, With the continuous integration of renewable energy sources, OLTC is increasingly being utilized in power grid applications every year. It plays an important role in ensuring the safe and reliable operation of distribution networks and grid dispatching (Ju and Huang, 2023).The relevant parameters of the on-load tap-changer device exhibit nonlinearity and discreteness in mathematical models, and traditional continuous variable processing methods may reduce the accuracy of state estimation.

Traditional methods for bad data identification primarily encompass residual search methods (Handschin et al., 1975; Lin and Abur, 2018; Zhao and Mill, 2018), zero residual methods (Zhuang and Balasubramanian, 1987), and estimation-based methods (Huang and Lin, 2004). These approaches are susceptible to errors such as misjudgment and missed detection. Contemporary techniques for bad data detection and identification predominantly include optimization-based and intelligent algorithm-based approaches. Among these, optimization-based methods have demonstrated a significant capacity for accurate identification of erroneous measurement data. A literature (Irving, 2008) proposes a robust state estimation model based on mixed integer nonlinear programming (MINLP). This method has high accuracy in identifying bad data, but lacks precision in estimation. Additionally, the model is nonconvex, nonlinear, and introduces a large number of 0/1 integer variables, making it difficult to solve. When the scale of the system increases, the solution efficiency de-creases. The Schweppe-type generalized M-estimator with Huber psi-function (SHGM) is currently a method with good robustness (Mili et al., 1996). Introduces coefficients that reflect the lever-age properties, which can suppress the weight of bad data in leverage points and weaken the effect of bad data residuals in the objective function. However, it cannot completely filter out the impact of bad data. Existing methods need further improvement in terms of computational efficiency, accuracy, and ability to handle bad data.

For state estimation problems involving discrete variables, modelling and solving based on MINLP methods may have low convergence and computational efficiency. In order to avoid solving MINLP problems, in traditional state estimation, OLTC tap positions are usually treated as continuous variables (Shiroi and Hosseinie, 2008), which may lead to biases in the estimated results due to mismatch with the actual operating characteristics of OLTC. The literature (Korres et al., 2004; Teixeira et al., 1992; Handschin and Kliokys, 1995) presents, in addition to the traditional treatment of continuous variables, using rounding or sensitivity analysis to round the tap positions to integers. These integer values are then taken as known parameters and used in the state estimation problem to solve state variables such as voltage magnitude and phase angle. The literature (Maalouf et al., 2013) develops a mixed integer quadratic programming model with discrete variables, which can effectively address the tap position rounding issue with high accuracy, although the approach is more complex. Furthermore, the literature (Nanchian et al., 2017) applies a hybrid particle swarm optimization method to solve MINLP problems that involve discrete tap positions, but the algorithm has a long computation time and low efficiency.

To address the challenges of nonlinearity and low efficiency in bad data identification for OLTC discrete variables, this paper introduces a two-stage bad data identification method utilizing a positive curvature quadratic penalty function to facilitate rapid adjustment of OLTC tap positions. This method enhances solution efficiency while maintaining the accuracy of identification. The main contributions are as follows:

1) In order to improve the accuracy of bad data identification, an optimization-based method for bad data identification is proposed in this paper. This method can accurately and effectively identify the presence of bad data in the measurement data, demonstrating a good identification accuracy.

2) To cope with the issue of low efficiency in solving traditional MINLP models due to a large number of discrete and nonlinear constraints, this paper presents a two-stage bad data identification model based on mixed integer linear programming (MILP). In the first stage, all measurements are distinguished using projection statistics and maximum normalized residual methods, generating a reduced set of suspicious measurements. In the second stage, a linearization model based on hyhrid data-physical driven is proposed to linearize nonlinear constraints, leading to a MILP-based bad data identification method that significantly improves the accuracy and efficiency of bad data identification.

The organizational structure of this paper is as follows: Section 2 introduces the method of identifying leverage points and suspicious measurement bad data based on projection statistics and maximum normalized residual method, and obtains the reduction of suspicious measurement set; Section 3 introduces the traditional MINLP bad data identification model, and proposes a hyhrid data-physical driven linearization model considering OLTC constraints, and constructs a bad data identification model based on MILP; Section 4 illustrates the experimental at IEEE-33 node test feeder; Section 5 provides the conclusion.

2 Stage 1: reduction of suspicious measurement set

In this paper, the reduced suspicious measurement set mainly consists of two parts. The first part involves the identification of leverage points through the use of projection statistics, while the second part involves the identification of suspicious measurements in non-leverage points using the maximum normalized residual method. The detailed explanation of the two algorithm processes is provided below.

2.1 Identification of leveraged measurements using projection statistics

This subsection describes the mathematical model for identifying leverage points based on projection statistics. Firstly, the connotation of leverage points and their impact on state estimation are introduced. Currently, there are two types of definitions for leverage points. As shown in Figure 1, the first type is based on regression model analysis, while the second type is based on the analysis of diagonal elements of the hat matrix. The difference between the two methods is as follows: The first type constructs a factor space composed of the measurement Jacobian matrix and the measurement vector, obtaining the distribution of row vectors in each group of measurement Jacobian matrix and the measurement vector in the factor space, and identifies outliers as leverage points; the second type is based on residual sensitivity analysis, namely, determining whether the measurement residual increases significantly when there is a large measurement error in the system, and identifying measurements where the measurement error cannot be positively fed back to the measurement residual as leverage points.

Figure 1
www.frontiersin.org

Figure 1. Dy11 on-load tap changer equivalent model.

The first method based on regression model analysis is introduced as follows. The first-order Taylor expansion is performed at the initial value point x0 and the approximate expression of the measurement error is obtained as follows:

Δzi=HiΔx+ei(1)

Where, Δzi is the error between the true measurement value and the measurement vector; Hi represents the ith row element of the Jacobian matrix H; Δx represents the error between the current estimated value of the state variable and the initial value.

The above Equation 1 is called the regression analysis model in statistics. Δzi is the output of the regression analysis model, Δx are the regression variables, and H is the coefficient matrix of the regression analysis. Δzi,Hi represents a point in the factor space and also indicates the relationship between the measurement vector and the true value, as well as the state variables. The regression factor is defined as, in the m×n dimensional Jacobi matrix, there is a total of m Hi,and the elements of each Hi correspond to an n-dimensional space coordinate, then they are all located in this n-dimensional space. These coordinates are called the corresponding measured factors, and the n-dimensional space is the factor space of the regression analysis.

Due to the system network parameters, measurement errors, and other reasons, abnormal values may appear in the above factor space, that is, the data is quite different from other data, and it is shown as outliers in the two-dimensional space. When outliers appear in the Y-space ' of the factor space, that is, in the ordinate axis ' Δzi ' direction, they are bad data in the conventional sense of state estimation. When the outliers only appear in the ' X-space ' of the factor space, that is, in the ' hi ' direction of the abscissa axis, the corresponding measurement is leverage measurement; when the outliers occur in ' X-space ' and ' Y-space ' at the same time, the corresponding measurement is the leverage measurement bad data. In the state estimation of power system, the leverage measurement is determined by the network topology, line parameters, measurement position and the meter error of the measurement instrument. Once the system network parameters are determined, whether the leverage measurement will become the inherent attribute of a certain measurement will not change with the change of the state variables and measurement values of the system.

To identify outliers in the “X-space,” that is, to identify anomalies in the row vectors of the Jacobian matrix compared to other row vectors, this can be achieved by calculating distance measures between the individual row vectors. The Mahalanobis distance and other similar distance measures based on this can be used to calculate the distances between the row vectors. The criterion of such methods is to compare the distance between the row vectors with a set threshold. Measurements greater than this threshold are considered leverage measurements. The threshold setting criterion is to designate measurements that are far from most other measurements as leverage measurements. However, in cases where there are multiple leverage measurements in a system, problems may arise due to mutual masking between the leverage measurements, causing this type of method to potentially struggle to accurately identify systems with multiple leverage measurements.

The second method based on residual sensitivity analysis is introduced as follows. The estimation of the measurement error based on the least square method is defined as Δz^, and the matrix expression is as follows.

Δz^=HHTR1H1HTR1ΔzW=HHTR1H1HTR1(2)

Where, W is defined as a hat matrix.

When Equations 1, 2, the measurement residual is defined as the difference between the measured value and the estimated measurement vector, which can be equivalent to the error between the estimated measurement error value Δz^ and the measurement error value.

r=ΔzΔz^=IWΔz=SΔz(3)

In the equation, I is the identity matrix with diagonal elements equal to 1 and off-diagonal elements equal to 0; S defines the residual sensitivity matrix, with its matrix expression shown as shown in Equation 4:

S=IHHTR1H1HTR1(4)

The hat matrix W has the property of idempotence, where the matrix elements satisfy as shown in Equation 5:

Wii=Wii2+ijWij2(5)

From the above equation, Based on Equation 3, we know that when the diagonal elements of are close to 0. If the measurement error is large and the diagonal elements of the multiplication of the sensitivity matrix remains very small, indicating that the measurement error cannot be reflected in the residual of the measurement, then define this type of measurement as a leverage measurement. Furthermore, the leverage measurement can also be determined by comparing with its expected value to assess its relative magnitude.

The expected value is calculated as shown in Equation 6:

W¯=EKii=1mi=1mKii=2n1m(6)

If WiiW¯,it is determined that the measurement corresponding to the diagonal element is determined to be a leverage measurement. On the basis of empirical experience, it is generally considered that when Wii2W¯, the measurement is a leverage measurement.

To avoid the difficult identification of leverage measurements that may arise from the two aforementioned methods, this paper adopts a method based on projection statistical values for the identification of leverage measurement. This method circumvents the calculation problem of the covariance matrix and directly utilizes the projections of the row vectors of the Jacobian matrix onto the relevant subspace to identify leverage measurements. First, calculate the projection statistical values for all measurements, with the calculation formula as follows:

PSi=maxkHiHkTSk(7)

Where PSi represents the statistical projection value for measurement i, Hi represents the i-th element of the Jacobian matrix H; HkT represents the transpose of the k-th element of the Jacobian matrix H; Sk represents the covariance of Hi and HkT.

The calculation formula for Sk is given by the following equation:

Sk=1.1926lomedilomedjiHiHkT+HjHkT(8)

Where, lomed is defined as m+1/2, x represents the value of the integer part of n. For example,: m=6, m+1/2=3.

After calculating the projection statistical values corresponding to each measurement based on Equations 7, 8, compare them numerically with the projection statistical cutoff values. Under a Gaussian distribution, the projection statistics typically follow a Chi-square distribution, satisfying Equation 9:

bi=χv,0.9752(9)

Where, bi is the calculated cutoff value.

When comparing the calculated projection statistical values with the cut off values, you can determine whether the measurement is a leverage measurement based on the comparison results.

Di=PSi/bi>1(10)

If the calculated projection statistical value D for measurement i satisfies Equation 10, then measurement i is determined to be a leverage measurement. The calculation of projection statistical values only involves simple algebraic operations and does not require matrix inversion. Therefore, even in cases where the Jacobian matrix is very sparse, accurate identification of leverage measurements can be achieved in a system with multiple leverage measurements.

2.2 Identification of suspicious bad data using maximum normalized residual method

As the calculation of residuals is approximate to the error values, it may not be able to detect bad data. By using standardized residuals, a more accurate method for identifying bad data can be obtained. The normalized residual value for measurement i can be obtained by dividing the residual value by the corresponding diagonal element in the residual covariance matrix.

riN=riΩii=riRiiSii(11)

In Equation 11: riN represents the normalized residual value; R represents the measurement error covariance matrix; S represents the sensitivity matrix; Ω represents the residual covariance matrix; r represents the residual value.

Ωii=RiihiTi1im(12)

In Equation 12: Ω is the residual covariance matrix, T=G1HT, G is the gain matrix, The calculation formula of G is G=HTR1H , Rii=σi2 represent the measurement variance matrix of the error i.

When solving the weighted least squares state estimation, the residual value can be calculated. The calculation formula is as shown in Equation 13:

r=zhx(13)

For non-leverage measurement, the normalized residual vector rN obeys the standard normal distribution, As Equation 14:

riNN0,1(14)

Therefore, the maximum element in rN is compared with the statistical threshold to determine the existence of suspicious measurement bad data. The threshold can be selected according to the required detection sensitivity level.

3 Stage 2: bad data identification based on mixed integer linear programming

In order to improve the solving efficiency of the traditional MINLP model, the hybrid data-physical-driven linearization model is first applied to linearize the non-linear constraints in the state estimation of on-load tap changer in transformers. Subsequently, an MILP-based bad data identification model is constructed. The following provides a detailed description of the linearization process and the construction of the optimization model.

3.1 Bad data identification based on MINLP

The traditional MINLP-based bad data identification model is represented as follows:

mini=1mbi(15)
s.t.cx=0(16)
hixzim3σiMbi(17)
hixzim3σi+Mbi(18)

In Equations 1518: cx represents equality constraints, including balanced node phase angle constraints, injection equality constraints, and load tap transformer droop control constraints., and hix represents nonlinear measurement equations, including node voltage amplitude measurement equations, branch current amplitude square measurement equations, branch active power measurement equations, and branch reactive power measurement equations. zim represents measurement values, M is a large positive number, taken as 10,000 in this paper. bi is a binary variable. When bi equals 0, it indicates that measurement i is not bad data, and the corresponding measurement equation inequality constraint is effective. When bi equals 1, it can be determined that the measurement corresponding to i is bad data, and the corresponding measurement equation inequality constraint is invalid.

The bad data identification method based on mixed integer can overcome the problem that the residual method is difficult to identify the bad data of the multi-leverage measurement system, and has high identification accuracy. However, due to the serious non-convex nonlinearity of the model, the requirement for the solver is high, and the computational efficiency of the model solution is low.

3.2 The hybrid data-physical-driven measurement equation linearization model

3.2.1 OLTC model considering non-smooth control characteristics

OLTC includes two parts, “transformer” and “tap changer.” Unlike traditional transformers, an OLTC sets the tap position as an unknown variable during modelling. By controlling the amplitude of the secondary side voltage to meet a certain dead band range, the OLTC adjusts the tap position to maintain the voltage level at the load center within a certain error range, thereby enhancing the power quality for the user. This paper proposes a droop control model for the OLTC, where the amplitude of the secondary side voltage and the tap turns ratio of the OLTC align with the droop control curve.

Define the column name of the nodal association matrix to represent the node ia, ib, ic, ja, jb, jc,the line name corresponds to the branch connected to ia,the branch connected to ja,the branch connected to ib,the branch connected to jb,the branch connected to ic,the branch connected to jc,where ij represent node; abc represented by three phases of each node. The nodal correlation matrix is as shown in Equation 19:

nodeC=branch111111111(19)

Where represents the association between a branch and a node, with its direction flowing out of the node; and represents the association between a branch and a node, with its direction flowing into the node.

Taking phase A as an example, according to the law of energy conservation on the primary and secondary sides of the transformer, the voltage and current on the winding satisfy the following relationship:

I˙iaI˙ja=1taU˙iaU˙jaI˙ja/ya=ta(20)

Here, I˙ia and I˙ja represent the branch currents flowing through the branches connected by ia and ja, respectively. ya=1/Ra+jXa represents the equivalent internal impedance of phase A of the transformer, ta represents the turns ratio of phase A.

The relationship between branch current and node voltage is expressed as shown in Equation 21:

I˙ia=1taI˙ja=1taU˙jaU˙iataya=yata2U˙iayataU˙jaI˙ja=U˙jaU˙iataya=yataU˙ia+yaU˙ja(21)

Equation 20 is expressed as a matrix as shown in Equation 22:

I˙iaI˙ja=yata2yatayatayaU˙iaU˙ja(22)

Similarly, the relationship between the branch current and node voltage for phases B and C is derived, and the relationship between the three-phase branch current and voltage is ultimately obtained as follows:

I˙=YprU˙Ypr=yata2yatayatayaybtb2ybtbybtbybyctc2yctcyctcyc(23)

In Equation 23: I˙=I˙iaI˙jaI˙ibI˙jbI˙icI˙jcT represents the branch current matrix of the OLTC, U˙=U˙iaU˙jaU˙ibU˙jbU˙icU˙jcT represents the node voltage matrix of the OLTC; Ypr is defined as the original admittance matrix; tb, tc respectively represent the B and C phase transformation turns ratios.

By combining the original admittance matrix with the node-branch incidence matrix, the node admittance matrix is obtained from the following equation:

Ybus=CTYprC

The branch power connected to each node is calculated using the node injection power expression as shown in Equations 24 and 25:

PiφReg=Uiφβ=A,B,CUjβGijφβRegcosθijφβ+BijφβRegsinθijφβ(24)
QiφReg=Uiφj=A,B,CUjβBijφβRegcosθijφβ+GijφβRegsinθijφβ(25)

GijφβReg and BijφβReg represent the real and imaginary parts of the corresponding elements in the node admittance matrix Ybus of the on-load tap changer, GijφβReg represents the mutual conductance between node iφ and node jβ, BijφβReg represents the mutual susceptance between node iφ and node jβ; The row name and column name of Ybus are expressed as: node ia, ib, ic, ja, jb, jc.

The following describes the modelling of the regulator part in Figure 2.

Figure 2
www.frontiersin.org

Figure 2. Detailed schematic diagram of OLTC voltage regulator part.

The tap position of OLTC are regulated through a control circuit. When voltage control, direct control cannot be carried out on the high-voltage circuit. Therefore, voltage and current transformers are used to construct a simulated circuit - the control circuit. The control circuit is a proportional model of the actual line. For example, if the actual line transformer has a secondary side rated at 2.4 kV, the control circuit operates at a voltage level of 120 V. By monitoring the voltage of the control circuit, the voltage at the load center of the actual line can be determined. If the voltage is below the normal level, indicating that the voltage at the loading center is below the position of the normal level, the tap changer is adjusted to raise the voltage at the loading center. For three-phase supply voltages below 10 kV, the allowable deviation is 7% of the rated voltage. For single-phase supply voltages of 220 V, the allowable deviation is +7% and −10% of the rated voltage. Rkj,eq and Xkj,eq are the proportional impedances of the actual line in the control circuit and are typically known quantities.

For the control circuit, the current rating of the primary winding of the current transformer is set as CTP, and the current rating of the secondary winding is set as CTS (usually taken as 5). The voltage transformer turns ratio is set as NPT. First, according to Ohm’s law, the equivalent impedance of the three-phase line is calculated as shown in Equation 26:

Rline+jXline=U˙kU˙jI˙kj(26)

It should be noted that the equivalent impedance of the three-phase line is not the actual line impedance; it is the turns ratio of the actual measured voltage on the secondary side to the load center voltage difference to the load side current under the rated turns ratio.

The equivalent impedance of the line compensator is calculated from the equivalent impedance of the three-phase line as Rkj,eqXkj,eq using the formula as shown in Equation 27:

Req,kj+jXeq,kj=Rline+jXline·CTPNPT·CTS(27)

The current in the compensator branch is obtained from the actual line current I˙kj and the current transformer I˙kj,eq.

I˙kj,eq=I˙kj·CTSCTP(28)

In Equation 28: I˙kj=I˙kjaI˙kjbI˙kjcT represents the three-phase current in the line.

The voltage difference U˙drop of the compensator branch is obtained as shown in Equation 29:

U˙drop=Rkj,eq+jXkj,eq·I˙kj,eq(29)

Finally, the control voltage of the compensator branch is as shown in Equation 30:

U˙r=U˙3NPTU˙drop(30)

Applying the droop control to the control voltage and tap changer turns ratio, taking phase A as an example, the principle of droop control for the OLTC is introduced. Define Δta, Δtb, Δtc as the adjustment amounts of the tap changer turns ratio for phases A, B, and C relative to their respective initial turns ratios, and add to the objective function as shown in Equation 31:

min Δta2+Δtb2+Δtc2(31)

The physical meaning of the above equation is that when the control voltage is within the dead zone, the OLTC tap position remains unchanged.

The ΔtaUra droop control function of the variation in the variable turns ratio variation and the control voltage is shown in Equation 31:

Δta=ΔtmaxΔt0UminUra<Ulkdr1UraUdblUlUraUdbl0Udbl<Ura<Udbhkdr2UraUdbhUdbhUraUhtminΔt0Uh<UraUmax(32)

The relationship between Δta and the control voltage Ura satisfies the following curve in Figure 3:

Figure 3
www.frontiersin.org

Figure 3. Voltage regulation control curve showing the variation of the turns ratio with control voltage droop.

Where, Δtmax and Δtmin respectively represent the upper and lower limits of the variation in turns ratio Δt; Δt0 represents the change in tap position corresponding to the initial turns ratio; Udbh and Udbl respectively represent the upper and lower bounds of the voltage dead zone; Uh and Ul respectively represent the upper and lower bounds of the voltage droop control curve; Umax and Umin respectively represent the upper and lower bounds of the bus voltage magnitude; kdr1 and kdr2 respectively represent the droop control coefficients, and kdr1<0, kdr2<0 . When the voltage Ura is within the dead zone range, the OLTC does not actuate, and the variation in turns ratio is 0.

The relationship between the droop control coefficient and the variation in the turns ratio and the voltage is as shown in Equation 33:

kdr1=ΔtmaxUlUdblkdr2=ΔtminUdbhUh(33)

From Figure 3, it can be seen that the characteristic curve represents a piecewise function. When the system operates near the turning point, the derivative is discontinuous, the algorithm search direction is uncertain, and it is difficult to smoothly switch operating curves, so the piecewise droop control function has strong non-smooth characteristics and is difficult to solve. Usually, mixed integer nonlinear programming is used to describe piecewise function control characteristics, but this method has low computational efficiency. In this paper, a fitting function is used to approximate the piecewise function, and the fitted droop control function is shown in Equation 34:

Δta=Δtmax+kdr1αln1+eαUraUlln1+eαUraUdbl+kdr2αln1+eαUraUdbhln1+eαUraUh(34)

Where, α is the fitting coefficient, set α=500 in this paper.

Approximately fitting the piecewise droop control function can transform it into a smooth function that is continuously differentiable, as shown in Figure 4. The smoothing function can avoid sudden changes in derivative order during algorithm iteration calculations, thus improving the convergence of the algorithm.

Figure 4
www.frontiersin.org

Figure 4. Derivative comparison diagram before and after fitting.

The variation of the ratio of the voltage regulator and the tap variable satisfies the relationship is shown in Equation 34:

Δta=xda·Δd(35)

In the equation, xda is the phase A tap-changer variable; Δd is the turns ratio change corresponding to a 1-tap adjustment, which is typically taken as Δd=0.00625 p.u. in distribution networks.

In this paper, the discrete tap-changer variable is first treated as a continuous variable for state estimation calculations to obtain a continuous optimal solution for state estimation. Subsequently, a positive curvature quadratic penalty function is introduced in the objective function, and the state estimation is continued using the continuous optimal solution as the initial value to obtain an integer solution for the tap-changer setting.

Taking phase A as an example, the positive curvature quadratic penalty function is shown in Figure 5. xd0xd1xd2 are any three continuous discrete component values. Define the neighborhood Rxda as shown in Equation 36:

Rxda=xda|xd112Δdxdaxd1+12Δd(36)

Figure 5
www.frontiersin.org

Figure 5. Quadratic penalty function model.

Where, xd1 is the neighborhood center, which is the closest discrete grading value determined according to the continuous optimal solution.

In the optimization process, when the value of xda is in the neighborhood of the above definition, the penalty function εxda is introduced:

εxda=12μdxdaxd12(37)

In the equation, μd is the penalty factor, a known quantity; the penalty function will force xda within the neighborhood to approach the neighborhood center. From Equation 37, we can obtain the first and second derivatives of the quadratic penalty function within the neighborhood.

εxdaxd=μdxdaxd12εxdaxd2=μd(38)

From Equation 38, it can be observed that for the Newton method, the introduction of a penalty function in the objective function will result in the incremental inclusion of the first and second derivative terms in the Jacobian matrix and the Hessian matrix during the optimization iteration. Through this procedure, not only can rapid convergence of OLTC tap positions be achieved, but also the positive definiteness of the iteration matrix can be strengthened, thereby improving algorithm convergence. From a perspective in the field of power systems, this approach enables efficient adjustment of OLTC tap positions and enhances algorithm convergence by modifying derivative terms in iterative matrices.

3.2.2 Hybrid data-physical-driven linearization

Nonlinear constraints in the model of OLTC mainly fall into two categories: the first category is branch power constraints; the second category is droop control nonlinear constraints. Due to the introduction of the unknown variable t of the voltage regulator in the branch power constraint and the logarithmic function in the droop control nonlinear constraints, the aforementioned physical linearization methods are no longer applicable. In this paper, it is proposed to utilize a first-order Taylor expansion for physical linearization, followed by using Partial Least Squares Regression (PLSR) to calculate compensation errors, resulting in the development of a hybrid data-physical-driven Linearization model.

Firstly, introduce the principle of linearization of power constraints. For ease of description, define the relationship between the branch power of the transformer and the variables as shown in Equation 39:

PiφReg=fiφPXRegQiφReg=fiφQXRegXReg=θiABCUiABCθjABCUjABCtABCT(39)

Where XReg represents the variables of the on-load tap-changing transformer, composed of the phase angle and voltage magnitude of the primary and secondary sides, and the three-phase turns ratio; fiφP and fiφQ respectively represent the functional relationships between the active and reactive power of phase φ branch and the variables of the on-load tap-changing transformer.

The first-order Taylor expansion is performed at the initial point XReg,0 of the variable, as shown in Equation 40:

fiφPXRegfiφPXReg,0+fiφPXReg,0·XRegXReg,0+ΔPiφRegfiφQXRegfiφQXReg,0+fiφQXReg,0·XRegXReg,0+ΔQiφReg(40)

In the equations, fiφP and fiφQ represent the partial derivatives of the active and reactive power functions of the phase φ branch with respect to each variable in XReg; ΔPiφReg and ΔQiφReg respectively represent the compensating errors of the active and reactive power of the phase φ branch, and the compensating errors are also calculated using the PLSR method.

Next, the linearization principle of the droop control function of the OLTC is introduced. Similar to the linearization principle of branch power, the functional relationship between the turns ratio change and the control voltage is defined in Equation 41:

Δt=fΔtUr(41)

Where, fΔt represents the functional relationship between the variable ratio variation and the secondary side control voltage.

The first-order Taylor series expansion is performed at the initial value point Ur,0 of the variable, as shown in Equation 42:

fΔtUrfΔtUr,0+fΔtUr,0·UrUr,0+Δtr(42)

In the formula, fΔt represents the derivative of the variable turns ratio variation function to the variable Ur ; Δtr represents the compensation error of the variable turns ratio variation.

Then, the PLSR algorithm is used for data-driven error compensation, the specific principle and mathematical expressions are as follows:

The compensation error Δy of active power is defined in Equation 43:

Δy=ξy+ηΔy=ΔPPVφmΔQPVφmTy=PloadPVφQloadPVφT(43)

Among them, ξ is the coefficient matrix and η is the constant matrix, which are obtained by PLSR fitting is the independent variable matrix of load composition; PloadPVφ and QloadPVφ represent the vectors composed of active and reactive loads of the grid-connected node φ phase of the system.

The independent variable set and the dependent variable set can be obtained from the results of the power flow calculation. After obtaining the set of independent variables R and the dependent variable set Z, it is standardized.

R*=RR¯SR1Z*=ZZ¯SZ1(44)

In Equation 44: R*, Z* represent the standardized independent variables and dependent variable sets; R¯ and SR respectively represent the mean and standard deviation of the independent variable set R.

For the standardized independent variable and dependent variable sets, the least squares regression analysis method is used to calculate the fitting coefficients, with the knowledge in the field of power systems:

C=PLSR*,Z*(45)
η=Z¯R¯SRCSZ(46)

In Equations 45 and 46: C represents the regression coefficient matrix obtained by PLSR. ξ and η represent the coefficient matrix and the constant matrix in the linear regression equation, respectively. Thus, the compensation error Δy can be obtained. Finally, the correction equation of the measurement is shown in Equation 47:

fPVφmpXPVfPVφmpXPV0+fPVφmpXPV0×XPVXPV0+ΔPPVφmfPVφmQXPVfPVφmQXPV0+fPVφmQXPV0×XPVXPV0+ΔQPVφm(47)

3.3 Bad data identification model based on MILP

For the suspicious measurement set, the MILP bad data identification model is constructed based on the hyhrid data-physical driven linearization method as shown in Equations 4851:

mini=1mbi(48)
s.t.cLx=0(49)
hiLxzim3σiMbi(50)
hiLxzim3σi+Mbi(51)

Where, cLx represents the linearized equality constraint, and hiLx represents the linearized measurement equation of the line.

When zi is a normal measurement, that is, it does not belong to the suspicious measurement set, Formulas 50, 51 in the model should be rewritten as shown in Equations 52 and 53:

hiLxzim3σi(52)
hiLxzim3σi(53)

The MILP bad data identification method is not affected by the leverage point, and can quickly and accurately identify the bad data in the leverage measurement.

4 Case study

Due to the need for further improvement in the accuracy and computational efficiency of the current bad data identification method in distribution networks, this paper proposes a bad data identification method based on MILP model to simultaneously address the issues of model accuracy and computational efficiency. In this chapter, a typical low-voltage 42-node distribution network case study is used to validate the effectiveness of the proposed bad data identification model. The relevant case studies are conducted on the MATLAB software platform, utilizing an Intel(R) Core(TM) i5-10210U CPU 1.60 GHz processor. The simulation calculations are carried out using per unit values, with a base voltage of 13.8 kV and a base power of 100 kVA for the case study system. The state estimation calculations are initialized in a flat start manner, with measurement values subject to Gaussian distribution errors with mean of 0 and variance of σ2 added to the true values. The standard deviations for voltage measurements, branch power measurements, and node injection power measurements are set as σi=0.004, σBran =0.008, and σNode =0.01.

For ease of analysis, this paper selects two mathematical indicators, the Root Mean Square Error (RMSE) and the Maximum Absolute Error (MAE). In this paper, RMSE represents the square root of the ratio of the sum of the squares of errors between the estimated values and the true values to the data dimension; MAE is generally used to measure the range of absolute errors, i.e., the maximum absolute error between the estimated values and the true values. The mathematical expressions for these two indicators are as follows:

RMSE=1ninxix^i2(54)
MAE=maxxix^ii=1,2,3(55)

In the equation, x^i represents the true value of the state variable xi.

To verify the model, the modified IEEE-33 node test feeder is used to test the linearization and bad data identification model proposed in this paper.

The schematic diagram of the modified IEEE-33 node test feeder is shown in Figure 6. The feeder consists of 33 nodes and 32 branches. The power flow results of the system are used as the normal measurements for the entire system, including 402 measurements, including 6 voltage magnitude measurements, 198 branch power measurements, and 198 node injection power measurements.

Figure 6
www.frontiersin.org

Figure 6. A modified IEEE-33 node test feeder.

4.1 Linearization accuracy comparison

Firstly, test the hybrid data-physical-driven linearization model proposed in this paper. Based on the IEEE 33-node parameters, training and test datasets can be generated from the AC power flow model. The case study sets training and testing data as simulated data randomly generated within the range of 95%–105% of the actual load after removing outliers, with a total of 100 sets of training samples. The measurement linearization accuracy obtained from three hybrid data-physical-driven methods, namely, Taylor expansion linearization, PLSR regression, Least squares regression (LSR) (Shao et al., 2023), and Bayesian linear regression (BLR) (Liu et al., 2019), are compared. To facilitate analysis, the difference between the linearized power flow results and the true values is represented by two mathematical indicators, RMSE and MAE, as shown in Equations 54, 55 respectively. The comparison results in Table 1.

Table 1
www.frontiersin.org

Table 1. Comparison of the accuracy between linearization methods.

As shown in Table 1, the hybrid data-physical-driven Linearization method proposed in this paper has a higher advantage in terms of linearization accuracy. Compared to physical-driven linearization, This method can improve the accuracy by 108 of magnitude. Additionally, compared to data-driven error compensation methods such as LSR and BLR, the PLSR method used in this paper has the highest accuracy and is closer to the results of nonlinear constraints.

4.2 OLTC tap position analysis

The OLTC tap position is verified by introducing a positive curvature quadratic penalty function into the tap position alignment model proposed in this paper. The tap position is treated as a continuous variable for state estimation calculations, obtaining a continuous optimal state estimate. The positive curvature quadratic penalty function is introduced into the objective function, using the continuous optimal solution as the initial value to continue the state estimation and obtain an integer solution for the tap position. Two different tap position processing models for on-load tap changing are set up to verify the effectiveness of the proposed model.

Model 1:. Traditional direct treatment of the tap position as a continuous variable.

Model 2:. Tap position rounding method based on positive curvature quadratic penalty function for OLTC.

The accuracy of voltage magnitude estimation for the two models is shown in Table 2.

The results indicate that for the OLTC model, the method of rounding tap positions with a positive curvature quadratic penalty function can lead to state estimation results with higher accuracy.

Table 2
www.frontiersin.org

Table 2. Voltage magnitude estimation results under different OLTC tap position handling methods.

4.3 Analysis of bad data identification results

4.3.1 Comparative analysis of traditional statistical methods for bad data identification

According to the two-stage bad data identification model proposed in this paper, 10 bad data are set as shown in Table 3. At the same time, the false alarm rate is set to Pe=0.0025. By querying the standard normal distribution table, the normal range of normalized residuals can be obtained, and the detection threshold τ is set to 3 and the threshold D for the projection statistics is set to 1.

Table 3
www.frontiersin.org

Table 3. Detail information of the bad data and the corresponding projection statistics and normalized residual values.

According to Table 4, among the 10 bad data, the projection statistical method effectively identified only 8 pcs bad data, while all other measurements with projection statistics values greater than 1 resulted in false alarms. Similarly, the maximum normalized residual method also failed to accurately identify the bad data. Therefore, both traditional statistical methods for bad data identification have certain limitations.

Table 4
www.frontiersin.org

Table 4. Comparison of results of different identification methods.

4.3.2 Comparative analysis of single-stage and two-stage bad data identification methods

Due to the potential for misjudgments and missed detections by traditional statistical methods when multiple measurement bad data points are present in the system, these methods may not accurately identify measurement bad data. In contrast, the MILP bad data identification model used in the second stage can accurately identify all measurement bad data in the set identified by traditional methods when misjudgments or missed detections occur in the first stage. Therefore, in the first stage of this paper, all 265 lever measurements in the system were placed into a suspicious measurement set. The maximum normalized residual method was then used to identify the remaining 137 measurements, thereby validating the accuracy of the proposed bad data identification model and its efficiency compared to the single-stage model. Four sets of tests were conducted to compare the bad data identification method proposed in this paper with the traditional MINLP-based bad data identification method:

Test 1: Compression of the suspicious measurement set using the MILP-based two-stage bad data identification method proposed in this paper;

Test 2: Compression of the suspicious measurement set using the traditional MINLP-based two-stage bad data identification method;

Test 3: No compression of the suspicious measurement set, directly using the MILP-based single-stage bad data identification method;

Test 4: No compression of the suspicious measurement set, using the traditional MINLP-based single-stage bad data identification method.

Models were built for both methods in the GAMS optimization software, different solvers were called for calculation, and the results were compared and analyzed.

According to Table 5, for the modified IEEE-33 node test feeder case chosen in this paper, the two-stage model reduces the number of 0/1 integer variables and improves identification accuracy. However, for traditional MINLP-based bad data identification models, feasible solutions could not be obtained with most solvers, with correct results only achievable using the BONMIN solver. In contrast, the MILP-based bad data identification model achieved accurate identification results with most solvers. In terms of identification efficiency, even with the two-stage model, the MINLP identification model’s solving time reached the computational limit of 1,000 s, and the solver exited abnormally, indicating limited applicability. For the MILP-based bad data identification method, using the two-stage model, the SCIP solver improved solving efficiency by approximately 23 times, and the CPLEX solver achieved the highest efficiency, requiring only 0.546 s.

Table 5
www.frontiersin.org

Table 5. Comparison of test results of different bad data identification methods.

5 Conclusion

To cope with the issue that existing traditional identification methods do not consider the nonlinear and discrete characteristics of on-load tap changers, making it difficult to achieve accurate and efficient identification of bad data., This paper proposes a MILP-based two stage bad data identification method. Detailed control characteristics of on-load tap changers are modelled, and a positive curvature quadratic penalty function is introduced to achieve fast tap normalization. In the first stage, leveraging projection statistics and maximum normalization residue methods effectively identifies leverage points and suspicious bad data, reduces the set of suspicious measurements. In the second stage, by linearizing nonlinear constraints and solving bad data identification model based on the MILP, the efficiency of the solution is greatly enhanced. The proposed model can achieve efficient and accurate identification of bad data, while ensuring optimal solutions by introducing penalty functions into the objective function for effective tap normalization.

Data availability statement

The original contributions presented in the study are included in the article, further inquiries can be directed to the corresponding author.

Author contributions

SH: Conceptualization, Funding acquisition, Methodology, Writing–original draft. CR: Formal Analysis, Investigation, Writing–original draft. MZ: Project administration, Resources, Writing–review and editing. LC: Software, Supervision, Writing–review and editing. YM: Validation, Visualization, Writing–review and editing. TZ: Validation, Visualization, Writing–review and editing.

Funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This research was funded by State Grid Hebei Electric Power Co., Ltd. science and technology projects, grant number kj2023-076.

Conflict of interest

Authors SH, CR, and LC were employed by Economic and Technology Research Institute of State Grid Hebei Electric Power Co., Ltd.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The authors declare that this study received funding from State Grid Hebei Electric Power Co.,Ltd. The funder had the following involvement in the study: the study design, collection, analysis, interpretation of data, and the writing of this article.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Apland, J., and Sun, B. (2019). A Multi-Period, multiple objective, mixed integer programming, GAMS model for transit system planning

Google Scholar

Baradar, M., and Hesamzadeh, M. R. (2014). A stochastic SOCP optimal power flow with wind power uncertainty. 1–5. doi:10.1109/PESGM.2014.6939790

CrossRef Full Text | Google Scholar

Chen, Y. (2021). Power system state estimation. Beijing, China: Science Press.

Google Scholar

Ghildyal, V., and Sahinidis, N. V. (2001). “Solving global optimization problems with baron,” in From Local to Global Optimization. Nonconvex Optimization and Its Applications. Editors A. Migdalas, P. M. Pardalos, and P. Värbrand (Boston, MA: Springer. doi:10.1007/978-1-4757-5284-7_10

CrossRef Full Text | Google Scholar

Gupta, O. K., and Ravindran, A. (1985). Branch and bound experiments in convex nonlinear integer programming. Manage. Sci. 31 (12), 1533–1546. doi:10.1287/mnsc.31.12.1533

CrossRef Full Text | Google Scholar

Handschin, E., and Kliokys, E. (1995). Transformer tap position estimation and bad data detection using dynamic signal modelling. IEEE T. Power Syst. 10 (2), 810–817. doi:10.1109/59.387921

CrossRef Full Text | Google Scholar

Handschin, E., Schweppe, F. C., Kohlas, J., and Fiechter, A. (1975). Bad data analysis for power system state estimation. IEEE Trans. Power Apparatus Syst. 94 (2), 329–337. doi:10.1109/T-PAS.1975.31858

CrossRef Full Text | Google Scholar

Huang, S. J., and Lin, J. M. (2004). Enhancement of anomalous data mining in power system predicting-aided state estimation. IEEE T. Power Syst. 19 (1), 610–619. doi:10.1109/TPWRS.2003.818726

CrossRef Full Text | Google Scholar

Irving, M. R. (2008). Robust state estimation using mixed integer programming. IEEE T. Power Syst. 23 (3), 1519–1520. doi:10.1109/TPWRS.2008.926721

CrossRef Full Text | Google Scholar

Ju, Y., and Huang, Y. (2023). State estimation for an AC/DC hybrid power system adapted to non-smooth characteristics. Power Syst. Prot. Control 51, 141–150. doi:10.19783/j.cnki.pspc.220368

CrossRef Full Text | Google Scholar

Kia, R., Shahnazari-Shahrezaei, P., and Zabihi, S. (2016). Solving a multi-objective mathematical model for a multi-skilled project scheduling problem by CPLEX solver. IEEE, 1220–1224. doi:10.1109/IEEM.2016.7798072

CrossRef Full Text | Google Scholar

Korres, G. N., Katsikas, P. J., and Contaxis, G. C. (2004). Transformer tap setting observability in state estimation. IEEE T. Power Syst. 19 (2), 699–706. doi:10.1109/TPWRS.2003.821629

CrossRef Full Text | Google Scholar

Lin, Y., and Abur, A. (2018). A highly efficient bad data identification approach for very large scale power systems. IEEE T. Power Syst. 33 (6), 5979–5989. doi:10.1109/TPWRS.2018.2826980

CrossRef Full Text | Google Scholar

Liu, Y., Zhang, N., Wang, Y., Yang, J., and Kang, C. (2019). Data-Driven power flow linearization: a regression approach. IEEE T. Smart Grid 10 (3), 2569–2580. doi:10.1109/TSG.2018.2805169

CrossRef Full Text | Google Scholar

Maalouf, H., Jabr, R. A., and Awad, M. (2013). Mixed-integer quadratic programming based rounding technique for power system state estimation with discrete and continuous variables. Electr. Pow. Compo. Sys. 41 (5), 555–567. doi:10.1080/15325008.2012.755233

CrossRef Full Text | Google Scholar

Mili, L., Cheniae, M. G., Vichare, N. S., and Rousseeuw, P. J. (1996). Robust state estimation based on projection statistics [of power systems]. IEEE T. Power Syst. 11, 1118–1127. doi:10.1109/59.496203

CrossRef Full Text | Google Scholar

Nanchian, S., Majumdar, A., and Pal, B. (2017). Three-Phase state estimation using hybrid particle swarm optimization. IEEE T. Smart Grid. 8 (3), 1035–1045. doi:10.1109/TSG.2015.2428172

CrossRef Full Text | Google Scholar

Shao, Z., Zhai, Q., and Guan, X. (2023). Physical-Model-Aided Data-Driven linear power flow model: an approach to address missing training data. IEEE T. Power Syst. 38 (3), 2970–2973. doi:10.1109/TPWRS.2023.3256120

CrossRef Full Text | Google Scholar

Shiroi, M., and Hosseinie, S. H. (2008). “Observability and estimation of transformer tap setting with minimal PMU placement,” in IEEE power and energy society general meeting - conversion and delivery of electrical energy in the 21st century. Paper presented at the 2008. doi:10.1109/PES.2008.4596382

CrossRef Full Text | Google Scholar

Teixeira, P. A., Brammer, S. R., Rutz, W. L., Merritt, W. C., and Salmonsen, J. L. (1992). State estimation of voltage and phase-shift transformer tap settings. IEEE T. Power Syst. 7 (3), 1386–1393. doi:10.1109/59.207358

CrossRef Full Text | Google Scholar

Vigerske, S., and Gleixner, A. (2017). SCIP: global optimization of mixed-integer nonlinear programs in a branch-and-cut framework. Optim. Methods Softw. 33 (3), 563–593. doi:10.1080/10556788.2017.1335312

CrossRef Full Text | Google Scholar

Zhao, J., and Mili, L. (2018). Vulnerability of the largest normalized residual statistical test to leverage points. IEEE T. Power Syst. 33 (4), 4643–4646. doi:10.1109/TPWRS.2018.2831453

CrossRef Full Text | Google Scholar

Zhuang, F., and Balasubramanian, R. (1987). Bad data processing in power system state estimation by direct data deletion and hypothesis tests. IEEE Power Eng. Rev. PER-7 (5), 40. doi:10.1109/MPER.1987.5527244

CrossRef Full Text | Google Scholar

Keywords: bad data identification, on-load-tap-changer, hyhrid data-physical driven, mixed integer linear programming, linearization

Citation: Hu S, Rong C, Zhang M, Chai L, Ma Y and Zhang T (2024) Bad data identification method considering the on-load tap changer model. Front. Energy Res. 12:1478834. doi: 10.3389/fenrg.2024.1478834

Received: 11 August 2024; Accepted: 26 September 2024;
Published: 11 October 2024.

Edited by:

Jinran Wu, Australian Catholic University, Australia

Reviewed by:

Yongqiang Kang, Lanzhou Jiaotong University, China
Zhesen Cui, Changzhi University, China

Copyright © 2024 Hu, Rong, Zhang, Chai, Ma and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Shiyao Hu, ODQxMzQxOTcwQHFxLmNvbQ==

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.