An efficient user demand response framework based on load sensing in smart grid

Jiang, Wenqian; Lin, Xiaoming; Yang, Zhou; Tang, Jianlin; Zhang, Kun; Zhou, Mi; Xiao, Yong

doi:10.3389/fenrg.2023.1141374

ORIGINAL RESEARCH article

Front. Energy Res., 17 May 2023

Sec. Smart Grids

Volume 11 - 2023 | https://doi.org/10.3389/fenrg.2023.1141374

Wenqian Jiang¹

Xiaoming Lin^2,3*

Zhou Yang¹

Jianlin Tang^2,3

Kun Zhang¹

Mi Zhou^2,3

Yong Xiao^2,3

¹Metrology Center of Guangxi Power Grid Co., Ltd, Nanning, China
²Electric Power Research Institute, China Southern Power Grid Company Limited, Guangzhou, China
³Key Laboratory of Intelligent Measurement and Advanced Measurement Enterprises of Guangdong Power Grid, Guangzhou, China

Residential electricity demand is increasing. Demand-side response in the smart grid aims to let power users plan their own consumption reasonably through price incentives, so as to address the unreasonable structure and low utilization rate of power energy. It is important to mine the rules of user response behaviors and design a reasonable incentive mechanism to maximize the enthusiasm of all participants. Traditional demand response ensures the stability of the power system through macro-control of the grid load, which cannot meet the personalized requirements of power users. Existing incentive mechanisms also do not comprehensively consider the profits of grid companies, low-voltage users, aggregators and other parties. In this paper, we propose a user demand response framework based on load awareness. Firstly, we devise a user demand response behaviour model based on a long short-term memory (LSTM) network. Secondly, we propose a demand response incentive scheme based on electric power scores. We also construct a deviation optimization integration adjustment model based on game theory to achieve the balance of profits among grid, aggregators and low-voltage users. Extensive experimental results show the effectiveness of our proposed framework.

1 Introduction

Power energy is an important guarantee for achieving sustainable economic development and improving the quality of people’s lives. At present, the increasing demand for residential electricity, coupled with the irrational structure and low utilization rate of existing power energy, has deepened the contradiction between the power load system and the distributed low-voltage grid users. In demand-side response, during the peak or valley period of residential power consumption, users can reasonably plan their own power consumption by means of price incentives and active response to the imbalance of regional power demand, thereby achieving peak shaving and valley filling to ensure the smooth operation of the grid. The implementation of a demand response mechanism can optimize power utilization from the demand side of power resource allocation, effectively relieve tight local supply and demand, and provide new regulation means for the economic, safe and stable operation of the power system. Currently, the main economic incentive is to determine the amount of subsidies according to the load data of demand response.

With the increasing types of power transactions and the increasing transaction frequency, the traditional power market transactions face more difficulties and challenges. With the development of new power integrated energy, especially the popularization of home photovoltaic new energy technology, the potential of low-voltage user demand response is enhanced. However, residential low-voltage users have their own particularity when participating in demand response. Specifically, the consumer’s electricity consumption behaviour will be affected by many factors such as season, weather temperature, power consumption period, and real-time price. It is necessary to build a user demand response model to reveal the degree of response of user power consumption to relevant change factors, and simulate the demand response behavior of users. On the other hand, it is necessary to design a reasonable incentive mechanism, so that the overall benefits and the individual benefits of all participants can be balanced, so as to maximize the enthusiasm of all participants.

Traditional demand response ensures the stability of the power system through macro-control of the grid load. The grid system sends incentive signals to users, such as compensation and power price changes, so that users reduce their load and change their original electricity use habits. However, macro-control cannot meet the individual requirements of power users. As more users participate in demand response, user responses differ and exhibit complex features such as high dimensionality, nonlinearity, and non-convexity. This makes model-driven interaction modelling and optimization-based pricing strategies no longer applicable. Existing incentive mechanisms for power users are mainly based on the electricity price, such as peak prices, real-time prices and electricity price rebates for responding during peak hours, combined with reasonable subsidies and incentives. A common incentive method is the point incentive, in which scores are distributed for users’ daily behaviours and can be exchanged for subsidies. This traditional method does not comprehensively consider the interests of power grid companies, low-voltage users, aggregators and other parties.

In this paper, we propose a user demand response framework based on load sensing to analyze the demand response behaviour of low-voltage users, and achieve effective incentives for grid companies, low-voltage users, aggregators and other agents. Specifically, we build a user participation demand response behaviour model to find the correlation characteristics between the user’s electricity consumption and the user’s participation in demand response. We construct a theoretical model with an expression layer, a user consumption prediction layer, a user demand response prediction layer and a multi-task learning layer.

Next, we design the integration rules of a differential scoring mechanism under three objectives: maximizing the scores obtained by users, maximizing the benefits of grid companies, and maximizing the benefits of aggregators. We devise market trading rules for low-voltage demand response, which cover five stages of the trading process: the demand response invitation stage, the bidding stage, the orderly power consumption stage, the photovoltaic new energy power sale stage, and the release and distribution of subsidy information stage.

Finally, we construct a deviation optimization score adjusting model, which comprehensively considers the real-time price of grid companies, the unit price of load dispatching of aggregators, and the demand response of low-voltage users. The score adjusting model simulates the implementation effect when the grid, aggregators, and low-voltage users participate in demand response. According to the deviation of implementation effect, the incentive mechanism of power integration is optimized and adjusted. We adopt the non-cooperative static game to describe the three parties’ game. The maximum of the objective function can be achieved according to the benefit function of each party, that is, to maximize the benefits of grid companies, aggregators and low-voltage users. Then, according to the obtained Nash equilibrium solution, the deviation of the goal in the score incentive mechanism is optimized and adjusted.

Our main contributions are as follows.

• We propose a user demand response framework based on load awareness. This framework realizes the demand response behaviour analysis of low-voltage users, and constructs an incentive mechanism based on grid companies, low-voltage users, aggregators and other parties.

• We propose a low-voltage user participation demand response behaviour model, which learns user behaviour via LSTM (long short-term memory) network.

• We devise the demand response incentive scheme based on electricity scoring. A deviation optimization integration adjustment model based on game theory is presented to simulate the implementation effect of grid, aggregator and low-voltage users.

• We conduct extensive experiments. The experimental results show the effectiveness of the proposed method.

The rest of this paper is structured as follows. We summarize the related work in Section 2. We present the user participation demand response behaviour theory model in Section 3. We present the demand response incentive scheme in Section 4. We discuss the incentive adjustment model in Section 5. The experimental design and results are presented in Section 6. Finally, Section 7 presents the conclusion.

2 Related work

It is well known that demand response plays an important role in balancing supply and demand in the power sector. Wijaya et al. (2014) and Shi et al. (2020) demonstrated how user engagement changes based on actual incentives received. Aalami et al. (2008) considered time-of-use and emergency demand response program. Zheng et al. (2020) proposed an incentive-based integrated demand response model. Wang et al. (2020) proposed a forecasting model to help aggregators predict the aggregate demand response capacity available in the future market. Baboli et al. (2012) developed an improved demand response model which considers the customer’s behaviour. Muratori and Rizzoni (2016) provided an accurate estimate of the actual quantity of controllable resources. From the perspective of a grid operator, Yu et al. (2019) established a resource trading framework. Khajavi et al. (2011) and Palensky and Dietrich (2011) analyzed the incentive-based programs in smart grid and the various types of demand side management. Yang et al. (2018) provided different consumers with a list of price plans to motivate them to participate in demand response. Wang et al. (2020) constructed the automatic demand response architecture, which provides the possibility of demand response’s real-time application.

The relationships between entities in a dataset usually have multiple properties, and many methods using subspace weighted clustering and long short-term memory networks have been proposed to analyze these attributes. Jia and Cheung (2018) proposed an attribute-weighted clustering model based on the concept of object-cluster similarity. Jing et al. (2007) proposed a new k-means algorithm that can cluster high-dimensional objects in sub-spaces. Boongoen et al. (2011) studied subspace clustering, and proposed a filter approach applicable to different types of clustering. Chen et al. (2019) proposed a two-level subspace weighting clustering algorithm for customer transaction data. Yin et al. (2018) proposed a clustering method, which learns an adaptive graph affinity matrix and thereby obviates the pre-computed graph regularizer. Tang et al. (2019) learned a joint affinity graph for multi-view subspace clustering. Dong et al. (2014) proposed a method to cluster the vertices by efficiently merging the information. Some studies used LSTM (long short-term memory) and RNN networks for prediction (Xiaoyun et al., 2016; Liu et al., 2017; Narayan and Hipel, 2017; Agrawal et al., 2018).

In addition, some methods apply game theory to the smart grid. Nguyen et al. (2013) proposed a strategy to solve the conflict of interest to achieve the overall optimal performance of the power supply system. Sanjab and Saad (2016) and Kamyab et al. (2016) proposed a distributed learning algorithm to find the equilibrium and proved its convergence to the game solution. Belhaiza and Baroudi (2015) proposed a new non-cooperative game theory model, in which the Nash equilibrium condition is used for demand management in the smart grid. Farraj et al. (2016) used an iterative game theory formulation to describe the interaction of all parties in the power system and its impact on system stability. Ni and Paul (2019) proposed a new solution for the dynamic game between parties. Demand response algorithms based on real-time price are proposed in (Tushar et al., 2014; Mondal et al., 2015; Yu and Hong, 2016). Nguyen et al. (2015) proposed a distributed demand-side management algorithm, which provides optimality for energy suppliers and users. La et al. (2016) established a dynamic pricing model based on differential game theory. Fadlullah et al. (2014) proposed a game-theoretic energy scheduling method by modeling the interaction between power companies and consumers.

Recently, Fu and Zhou (2022) proposed a preprocessing approach for the simulation of power systems. A transfer function model is proposed to evaluate the characteristics of the droop control inverter. Some research focuses on new energy generation such as photovoltaic power generation. Specifically, Fu and Zhou (2022) addressed the problem of the collaboration between photovoltaic load and energy systems, in order to coordinate energy such as electrical energy and thermal energy. An energy meteorological model is constructed. Machine learning techniques, such as Markov chains, nearest neighbor theory and probabilistic inequality theory, are used for optimal planning for capacitors in (Fu et al., 2020). Statistical machine learning is utilized for power flow planning in (Fu, 2022). The statistical machine learning techniques include linear regression, probability distribution, and the center point method, which simulate uncertain weather, power generation, and other factors.

3 User participation demand response behavior theory model in load regulation

By analyzing users’ historical electricity consumption behaviour, we can obtain their overall electricity consumption habits and characteristics, and thus formulate electricity planning schemes that are more consistent with users’ behaviour habits as well as relevant incentive policies. In daily life, many external factors affect the power consumption of users, such as season, weather temperature, power consumption period, and real-time price. Motivated by the multi-task learning model in Xu et al. (2021), we divide the timeline into different time periods to describe the impact of users’ historical electricity consumption behaviour on the future, and enhance the representation. For external features, we combine the context information of electricity consumption, and use LSTM to learn. Finally, the two outputs are combined, and more accurate results can be obtained through loss function calculation. Therefore, the whole user participation demand response behavior theoretical model is divided into four layers: the expression layer, the user consumption prediction layer, the user demand response prediction layer, and the multi-task learning layer.

1) Expression Layer.

This layer is composed of external characteristics and user electricity consumption information. External features are the top n features extracted above. The selected features are coded into vectors in $R^{1 \times A}$ to reduce dimensions, and these external feature vectors are then connected to form a low-dimensional matrix E_τ. The user’s electricity consumption directly affects whether the user participates in demand response. If the explicit features are used directly for prediction, they cannot reflect the real-world situation, so we use a hidden layer to enhance the context information that represents the electricity consumption at a certain time τ. The electricity consumption of station i is expressed as Cons_i,τ. In order to estimate the response quantity of user participation in demand response, the external characteristics and the context information related to user consumption are connected, that is, $U_{i, τ} = [\mathrm{Cons}_{i, τ}, E_{τ}]$ .

2) User Consumption Forecast Layer.

This layer uses the user consumption amount to represent the consumption situation, and extracts the user consumption amount on the timeline. The time is divided into timeslots of length t (for example, t = 5 (minutes)). We use A_u,t to represent the average amount consumed by the user u in the timeslot t. We hope to use this value to predict the user consumption amount $A_{u, t_{τ}}$ in the future timeslot $t_{τ}$. Considering the periodicity of users’ electricity consumption, we divide the historical data into three categories: within 1 hour, within 6 days, and within 4 weeks. The time intervals can be expressed as $[t_{τ} - \frac{60}{t}, t_{τ} - 1], [t_{τ} - \frac{60 * 24}{t} \times 6, t_{τ} - \frac{60 * 24}{t}], [t_{τ} - \frac{60 * 24 * 6}{t} \times 4, t_{τ} - \frac{60 * 24 * 6}{t}]$ , and the characteristics are expressed as follows:

I_{u, t_{τ}} = [A_{u, t_{τ} - 1}, A_{u, t_{τ} - 2}, \dots, A_{u, t_{τ} - \frac{60}{t}}] (1)

J_{u, t_{τ}} = [A_{u, t_{τ} - \frac{60 * 24}{t}}, A_{u, t_{τ} - \frac{60 * 24}{t} \times 2}, \dots, A_{u, t_{τ} - \frac{60 * 24}{t} \times 6}] (2)

K_{u, t_{τ}} = [A_{u, t_{τ} - \frac{60 * 24 * 6}{t}}, A_{u, t_{τ} - \frac{60 * 24 * 6}{t} \times 2}, \dots, A_{u, t_{τ} - \frac{60 * 24 * 6}{t} \times 4}] (3)
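As an illustration of Eqs 1–3, the following minimal sketch builds the three history windows from a consumption series indexed by timeslot. The function names `history_offsets` and `build_features` are hypothetical, the slot length is assumed to divide 60 evenly, and the third window uses the period 60·24·6/t exactly as written in Eq. 3.

```python
def history_offsets(slot_minutes):
    """Slot offsets (counted back from t_tau) used by Eqs. (1)-(3)."""
    per_hour = 60 // slot_minutes
    per_day = 60 * 24 // slot_minutes
    per_period = 60 * 24 * 6 // slot_minutes  # period as written in Eq. (3)
    recent = list(range(1, per_hour + 1))           # I: every slot in the last hour
    daily = [per_day * k for k in range(1, 7)]      # J: same slot, 1..6 days back
    periodic = [per_period * k for k in range(1, 5)]  # K: same slot, 1..4 periods back
    return recent, daily, periodic


def build_features(series, t_tau, slot_minutes=5):
    """series[s] plays the role of A_{u,s}; returns (I, J, K) for target slot t_tau."""
    recent, daily, periodic = history_offsets(slot_minutes)
    if t_tau - periodic[-1] < 0:
        raise ValueError("not enough history for the longest window")

    def pick(offsets):
        return [series[t_tau - o] for o in offsets]

    return pick(recent), pick(daily), pick(periodic)
```

With t = 5 minutes, the windows contain 12, 6 and 4 values respectively, so the model input per user and target slot stays small even when the raw history is long.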

Convolution operations are performed on the three types of features as follows.

T_{u, I}^{t_{τ}} = f (W_{Conv}^{(I)} * I_{u, t_{τ}} + b_{I}) (4)

T_{u, J}^{t_{τ}} = f (W_{Conv}^{(J)} * J_{u, t_{τ}} + b_{J}) (5)

T_{u, K}^{t_{τ}} = f (W_{Conv}^{(K)} * K_{u, t_{τ}} + b_{K}) (6)

where * represents the convolution operation, and $W_{Conv}^{(I)}$, $W_{Conv}^{(J)}$, and $W_{Conv}^{(K)}$ represent weight coefficients. b_I, b_J, b_K are bias terms, and f () is the activation function. Finally, the three outputs are connected into the vector $F_{u, t_{τ}} = [T_{u, I}^{t_{τ}}, T_{u, J}^{t_{τ}}, T_{u, K}^{t_{τ}}]$ , which can be seen as an enhanced representation of user consumption Cons_i,τ that better reflects the situation in the real scene.
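Eqs 4–6 and the concatenation into $F_{u, t_{τ}}$ can be sketched as follows. This is a minimal illustration under stated assumptions: the activation f is taken to be ReLU, the convolution is a one-dimensional "valid" convolution in the cross-correlation form common in deep-learning libraries, and the kernel sizes and the helper names `conv1d` and `enhanced_representation` are hypothetical, since the text does not fix them.

```python
def relu(x):
    """Assumed activation f(); the paper does not name the function."""
    return x if x > 0.0 else 0.0


def conv1d(values, kernel, bias, activation=relu):
    """Valid 1-D convolution (cross-correlation form) followed by the activation."""
    width = len(kernel)
    return [
        activation(sum(kernel[j] * values[i + j] for j in range(width)) + bias)
        for i in range(len(values) - width + 1)
    ]


def enhanced_representation(I, J, K, params):
    """Concatenate the three convolved windows into F_{u, t_tau}.

    params maps "I", "J", "K" to a (kernel, bias) pair, i.e. W_Conv and b.
    """
    out = []
    for name, window in (("I", I), ("J", J), ("K", K)):
        kernel, bias = params[name]
        out.extend(conv1d(window, kernel, bias))
    return out
```

Each window has its own kernel and bias, so the recent, daily and longer-period patterns are filtered separately before they are joined.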

3) User Demand Response Prediction Layer.

The purpose of the user demand response prediction layer is to predict user participation in demand response with a sequence model over spatio-temporal information. An LSTM network is used to retain useful information. In the LSTM network, each feature vector has a memory unit c_x, an input gate i_x and a forget gate f_x that control the network to remember useful new content and forget useless old content, and an output gate o_x, where x represents the xth feature vector. For each feature vector, there is an output h_x of the LSTM unit: h_x = o_x◦tanh (c_x). The output gate uses a sigmoid function to determine the output: o_x = σ(W_oU_x,τ + V_oh_x−1+ D_oc_x), where D_o is a diagonal matrix. The memory unit c_x is obtained from the weighted sum of new content $\tilde{c_{x}}$ and old content c_x−1: $c_{x} = f_{x} ◦ c_{x - 1} + i_{x} ◦ \tilde{c_{x}}$ . The new content $\tilde{c_{x}}$ is obtained from the weighted sum of the series information U_x,τ and the output h_x−1 of the previous LSTM unit:

\tilde{c_{x}} = \tanh (W_{c} U_{x, τ} + V_{c} h_{x - 1}) (7)

Both W_c and V_c are weight matrices. The input gate and forget gate are calculated from the information U_x,τ (formed by connecting external features and the context related to user consumption), the output h_x−1 of the previous LSTM unit, and the previous memory unit c_x−1:

i_{x} = σ (W_{y} U_{x, τ} + V_{y} h_{x - 1} + D_{y} c_{x - 1}) (8)

f_{x} = σ (W_{f} U_{x, τ} + V_{f} h_{x - 1} + D_{f} c_{x - 1}) (9)

where W_y, W_f, V_y, V_f are weight matrices, and D_y, D_f are diagonal matrices. Using the sequence of U_x,τ to train the LSTM network, we can get the hidden unit sequence H = h₁, h₂, … , h_n with time dependence. After this unit sequence is input into the MLP, the corresponding expected user participation demand response $\hat{r_{i}}$ can be obtained.
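A single step of the LSTM unit described by Eqs 7–9, together with the memory update and the output gate given in the text, can be sketched in plain Python as below. The dictionary layout of the parameters and the helper names are hypothetical; the diagonal matrices D_y, D_f and D_o are stored as vectors and applied element-wise, which is equivalent to multiplying by a diagonal matrix.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def vadd(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(vals) for vals in zip(*vectors)]


def hadamard(a, b):
    """Element-wise product; also applies a diagonal matrix stored as a vector."""
    return [x * y for x, y in zip(a, b)]


def lstm_step(U, h_prev, c_prev, p):
    """One LSTM step: returns (h_x, c_x) from U_{x,tau}, h_{x-1}, c_{x-1}."""

    def gate(W, V, D, c):
        pre = vadd(matvec(p[W], U), matvec(p[V], h_prev), hadamard(p[D], c))
        return [sigmoid(z) for z in pre]

    i = gate("W_y", "V_y", "D_y", c_prev)                  # Eq. (8)
    f = gate("W_f", "V_f", "D_f", c_prev)                  # Eq. (9)
    pre_c = vadd(matvec(p["W_c"], U), matvec(p["V_c"], h_prev))
    c_tilde = [math.tanh(z) for z in pre_c]                # Eq. (7)
    c = vadd(hadamard(f, c_prev), hadamard(i, c_tilde))    # memory update
    o = gate("W_o", "V_o", "D_o", c)                       # output gate sees the new c_x
    h = hadamard(o, [math.tanh(x) for x in c])             # h_x = o_x ∘ tanh(c_x)
    return h, c
```

Applying `lstm_step` along the sequence of U_x,τ, feeding each returned pair into the next call, yields the hidden sequence H = h₁, …, h_n used in the remainder of this layer.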

In order to further capture the global information, we employ the self-attention mechanism and multi-task parallel computing, which can take into account the correlation between any two stations. The trainable parameter matrices $W_{Q}^{T}, W_{K}^{T}, W_{V}^{T}$ and their respective biases are used to generate the corresponding Q, K, V:

Q = W_{Q}^{T} [h_{1}, h_{2}, \dots, h_{n}] + b_{Q} (10)

K = W_{K}^{T} [h_{1}, h_{2}, \dots, h_{n}] + b_{K} (11)

V = W_{V}^{T} [h_{1}, h_{2}, \dots, h_{n}] + b_{V} (12)

Each element in the Q sequence is matched against each element in K. Specifically, the dot product between them is computed and passed through the softmax function to obtain the weights. V can be regarded as the information learned from the sequence. Therefore, this process can be expressed as follows.

\mathrm{Attention} (Q, K, V) = \mathrm{softmax} (\frac{Q K^{T}}{\sqrt{d_{k}}}) (13)

h^{*} = \mathrm{Attention} (Q, K, V) V (14)

where d_k represents the dimension of K, and h* represents the comprehensive representation of sequence H. h* will be input into the MLP to obtain the final predicted user demand response $\hat{R}$ .
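The self-attention step of Eqs 10–14 can be sketched as follows. This is an illustrative reading rather than the authors' implementation: each hidden state h_x is stored as a row, the projection W^T h + b is applied row by row, and the softmax is taken over each row of the score matrix. The function names are hypothetical.

```python
import math


def project(H, W, b):
    """Apply W^T h + b to every hidden state h (rows of H); W has shape d_h x d_out."""
    return [
        [sum(h[k] * W[k][j] for k in range(len(h))) + b[j] for j in range(len(b))]
        for h in H
    ]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(H, W_Q, b_Q, W_K, b_K, W_V, b_V):
    """Return (attention weights, h*) following Eqs. (10)-(14)."""
    Q = project(H, W_Q, b_Q)   # Eq. (10)
    K = project(H, W_K, b_K)   # Eq. (11)
    V = project(H, W_V, b_V)   # Eq. (12)
    d_k = len(K[0])
    weights = [
        softmax([sum(q * k for q, k in zip(q_row, k_row)) / math.sqrt(d_k) for k_row in K])
        for q_row in Q
    ]                          # Eq. (13)
    h_star = [
        [sum(w * v[j] for w, v in zip(w_row, V)) for j in range(len(V[0]))]
        for w_row in weights
    ]                          # Eq. (14)
    return weights, h_star
```

Dividing by the square root of d_k keeps the scores in a range where the softmax does not saturate as the key dimension grows.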

4) Multi-Task Learning Layer.

The purpose of this layer is to calculate more accurate parameters by combining the outputs of the user consumption prediction layer and the user demand response prediction layer. Using the predicted user consumption amount $\hat{a_{i}}$ , the expected user participation demand response $\hat{r_{i}}$ , and the final predicted user demand response $\hat{R}$ calculated previously, we use the mean absolute percentage error (MAPE) as a normalized measure of the error. The three loss functions are: $L_{a} = \frac{1}{n} \sum_{i = 1}^{n} |\frac{a_{i} - \hat{a_{i}}}{a_{i}}|$ , $L_{r} = \frac{1}{n} \sum_{i = 1}^{n} |\frac{r_{i} - \hat{r_{i}}}{r_{i}}|$ , and $L_{R} = |\frac{R - \hat{R}}{R}|$ , where a_i is the actual consumption amount of users in station i, r_i is the actual demand response of users in station i, and R is the actual demand response of the entire end user group. The final loss function is computed from the above three loss functions:

L = \frac{1}{2 λ_{1}^{2}} L_{a} + \frac{1}{2 λ_{2}^{2}} L_{r} + \frac{1}{2 λ_{3}^{2}} L_{R} + \log (λ_{1} λ_{2} λ_{3}) . (15)
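The combined loss of Eq. 15 can be computed as in the short sketch below. The function names are hypothetical, and the weights λ₁, λ₂, λ₃ are passed in as plain positive numbers here, whereas in training they would typically be learned together with the network parameters.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error; actual values must be non-zero."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def multitask_loss(a, a_hat, r, r_hat, R, R_hat, lambdas):
    """Eq. (15): weighted sum of L_a, L_r, L_R plus the log regularization term."""
    l1, l2, l3 = lambdas
    L_a = mape(a, a_hat)
    L_r = mape(r, r_hat)
    L_R = abs((R - R_hat) / R)
    return (
        L_a / (2 * l1 ** 2)
        + L_r / (2 * l2 ** 2)
        + L_R / (2 * l3 ** 2)
        + math.log(l1 * l2 * l3)
    )
```

A larger λ down-weights the corresponding task loss, while the logarithmic term penalizes making all λ values arbitrarily large.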

4 Demand response incentive scheme based on electric score

For low-voltage users participating in demand response transactions, if they perform faithfully, they can accumulate credit value and reward scores. If users default, they will be punished. In this section, we develop a differential scoring model. The goal is to maximize the credit obtained by users and the income of grid companies and aggregators.

4.1 Score rules

1) Rewarding scores for user registration and real name authentication: After users download the low-voltage interactive response APP, register and authenticate with their real names, the system will reward users with scores, such as 20 scores.

2) The user submits the demand response information and gets bonus scores. After receiving the demand response invitation issued by the grid company, the user will submit relevant information according to the situation, and the system will issue reward scores to the user, such as 5 scores.

3) Rewarding scores for faithful performance of users. The aggregator compares the data read from the intelligent interactive terminal with the data submitted by the user. If the user performs faithfully, the system will give the user a performance award, such as 10 scores.

4) Deducting scores for user’s failure to perform. If the user breaches the contract, the user’s partial scores will be deducted, such as 15 scores.

5) Differentiated bonus scores are calculated according to the user’s credit value. The system sets a basic reward score, such as 12 scores, and then rewards according to the credit value. Users with high credit value (for example, above 900) are rewarded, each time they participate in a transaction and perform the contract, with the original bonus scores plus p times the original scores (for example, p = 0.2). Users with moderate credit value (500–900) are rewarded, each time they participate in a transaction and perform the contract, with the original scores plus q times the original scores (for example, q = 0.05). Users with low credit value (below 500) are rewarded normally.

6) User task credits: for a user’s time-limited task, for example, if he/she participates in 6 demand response transactions within half a month and performs faithfully, he/she can obtain 3 scores.

7) Rewards for blockchain block out: the block reward of the corresponding alliance chain of each station area is planned and managed by the grid company in a unified way and distributed to users according to their enthusiasm for participation.

8) Rewards for users to participate in orderly power consumption: after receiving the invitation, the user submits the transaction information, chooses to discharge the electric vehicle at the peak load and charge it at the low load, and the system will give the user a point reward, such as 18 scores.

9) Rewards for photovoltaic new energy sales: photovoltaic power generation users will voluntarily use the electricity and sell the surplus electricity to the grid. The grid company will give a point reward, such as 1 point for 1 kWh.

10) Bonus scores for users binding new smart home devices: users who bind a new smart home device through the APP will get bonus scores, such as 5 scores.

Users can spend their scores before the scores expire. Specifically, the uses of scores include:

1) Exchange electricity subsidy price: users can convert scores into an electricity subsidy in proportion, for example, 500 scores can be converted into an electricity subsidy of 1 kWh, but the subsidy cannot be withdrawn as cash.

2) Exchange souvenirs and prizes: for example, 2,000 scores can be exchanged for souvenirs, and 1,000 scores can be exchanged for vouchers of cooperative merchants, etc.

3) Restore credit value: for users with low credit value (for example, the credit value is lower than 200), credit value can be restored with scores.

4.2 Maximizing the user’s scores

We divide the sources of user scores into six parts. Let A be the sum of the scores obtained from users’ daily behaviour, including the reward scores regist_i,u obtained by user registration and real name authentication, the reward scores submit_i,u obtained by users submitting demand response information, the user task scores task_i,u, and the reward scores bind_i,u for users binding new smart home devices, where subscript i is the station area number and u is the user code. A can be expressed as follows.

A = \sum_{i, u} \mathrm{regist}_{i, u} + \sum_{i, u} \mathrm{submit}_{i, u} + \sum_{i, u} \mathrm{task}_{i, u} + \sum_{i, u} \mathrm{bind}_{i, u} (16)

Let B be the sum of the scores of each user’s participation in demand response performance, including the score case_t of the predicted demand response reward. The performance score is differential. Users get different scores at different implementation times, which are differentiated by the time factor τ_t. We use H_x to denote time, D_x to denote date, and M_x to denote month, and q, w, r to denote their respective weight coefficients, which can be adjusted according to season, temperature and other factors. Therefore, the time coefficient τ_t is calculated as:

τ_{t} = q H_{x} + w D_{x} + r M_{x} . (17)

We use X to denote the predicted user demand response, and θ to denote the reward basis for each kWh of electricity, so the reward scores obtained by users are calculated as follows:

\mathrm{case}_{t} = τ_{t} X θ . (18)

Therefore, the total score B of all users’ participation in demand response performance can be expressed as:

B = \sum_{t} \mathrm{case}_{t} . (19)
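Eqs 17–19 amount to a weighted time coefficient multiplied by the predicted response and the per-kWh reward basis. The sketch below illustrates this; how H_x, D_x and M_x are encoded numerically is not specified in the text, so raw hour, day and month numbers are assumed here, and all function names and example values are hypothetical.

```python
def time_coefficient(hour, day, month, q, w, r):
    """Eq. (17): tau_t = q*H_x + w*D_x + r*M_x (raw numeric encoding assumed)."""
    return q * hour + w * day + r * month


def case_score(tau_t, response_kwh, theta):
    """Eq. (18): case_t = tau_t * X * theta."""
    return tau_t * response_kwh * theta


def total_performance_score(events, q, w, r, theta):
    """Eq. (19): B is the sum of case_t over all response events.

    events is a list of (hour, day, month, predicted_response_kwh) tuples.
    """
    return sum(
        case_score(time_coefficient(h, d, m, q, w, r), x, theta)
        for (h, d, m, x) in events
    )
```

For example, with q = 0.1, w = 0, r = 0.05 and θ = 10, a 2 kWh response at 18:00 in July gives τ_t = 2.15 and a reward of 43 scores.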

Next, let C be the sum of the differentiated scores awarded according to user credit value. The user’s credit value is divided into three grades, expressed by the grade factor δ_e, where subscript e is the user’s credit value, and the base score of the credit value award is ρ. δ_e is calculated as follows.

δ_{e} = \begin{cases} δ_{1}, & e \geq 900 \\ δ_{2}, & 500 \leq e < 900 \\ 1, & e < 500 \end{cases} (20)

The differentiation scores obtained by users are:

\mathrm{credit}_{i, u} = δ_{e} ρ . (21)

Therefore, the sum of differential scores C awarded according to the user’s credit value can be expressed as:

C = \sum_{i, u} \mathrm{credit}_{i, u} . (22)
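The grade factor of Eq. 20 and the resulting score of Eq. 21 can be illustrated as below. The default values δ₁ = 1.2 and δ₂ = 1.05 are an assumption inferred from the examples p = 0.2 and q = 0.05 in score rule 5 (original scores plus p or q times the original scores), and ρ = 12 follows the example base score there; the thresholds follow Eq. 20. The function names are hypothetical.

```python
def grade_factor(credit_value, delta_1=1.2, delta_2=1.05):
    """Eq. (20): piecewise grade factor delta_e for credit value e."""
    if credit_value >= 900:
        return delta_1
    if credit_value >= 500:
        return delta_2
    return 1.0


def credit_score(credit_value, rho=12.0):
    """Eq. (21): credit_{i,u} = delta_e * rho."""
    return grade_factor(credit_value) * rho


def total_credit_score(credit_values, rho=12.0):
    """Eq. (22): C is the sum of credit_{i,u} over all users."""
    return sum(credit_score(e, rho) for e in credit_values)
```

Under these assumed values, a user with credit value 950 receives about 14.4 scores per performed transaction, a user with 700 about 12.6, and a user with 300 the base 12.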

The block reward scores for blockchain are uniformly distributed and managed by the grid company. The scores are sorted from high to low according to the response of each station area and the response of users in the station area. The ranking coefficient of the station area is expressed as α_i, where i is the number of the station area. We use β_u to denote the ranking coefficient of users in the station area, where u is the number of the user in the station area. If the base score is set as η, the user will get block_i,u bonus scores for block out as follows.

\mathrm{block}_{i, u} = α_{i} β_{u} η (23)

We use D to denote the total reward scores of all users’ blockchain blocks, and D can be obtained as follows.

D = \sum_{i, u} \mathrm{block}_{i, u} (24)

We use E to denote the sum of the rewards for users to participate in the orderly use of electricity. order_i,u represents the scores obtained by each user to participate in the orderly use of electricity.

E = \sum_{i, u} \mathrm{order}_{i, u} (25)

We use F to denote the total scores of energy sales incentives, and photovoltaic_i,u to denote the bonus scores obtained by each PV new energy user by selling excess electricity:

F = \sum_{i, u} \mathrm{photovoltaic}_{i, u} (26)

Therefore, the model for maximizing user scores is:

\begin{align} \mathrm{Maximize}\ S^{users} &= A + B + C + D + E + F \\ &= \sum_{i, u} \mathrm{regist}_{i, u} + \sum_{i, u} \mathrm{submit}_{i, u} \\ &\quad + \sum_{i, u} \mathrm{task}_{i, u} + \sum_{i, u} \mathrm{bind}_{i, u} + \sum_{t} \mathrm{case}_{t} \\ &\quad + \sum_{i, u} \mathrm{credit}_{i, u} + \sum_{i, u} \mathrm{block}_{i, u} \\ &\quad + \sum_{i, u} \mathrm{order}_{i, u} + \sum_{i, u} \mathrm{photovoltaic}_{i, u} \end{align} (27)

4.3 Maximizing profit of grid companies

We use S^CSG to denote the revenue of the grid company. It refers to the total revenue after the end of the demand response load. Let a, b and c be the weights of demand response, orderly power consumption and photovoltaic new energy power generation in the whole low-voltage interaction response. There are three sources of demand response transaction load. We use X_i to denote the total demand response load of the ith station area. We use Y_i,u to denote the load of the uth user in the ith station area who sells the load stored by the electric vehicle to other users to participate in demand response in the orderly power consumption stage. We use Z_i,u to denote the excess load sold to the grid company by the user u of the ith station area in the photovoltaic power generation stage. Let p_t be the real-time electricity selling price of the grid company in period t. Let p^w be the dispatching load unit price of the aggregator, and let p^c be the generation cost of the grid company. Therefore, the benefit maximization model of grid companies can be expressed as:

\mathrm{Maximize}\ S^{CSG} = (\sum_{i, u, t} a X_{i} + b Y_{i, u} + c Z_{i, u}) (p_{t} - p^{w} - p^{c}) . (28)

4.4 Maximizing aggregator’s profit

The aggregator participates in the process of dispatching demand response load, and the grid company pays the dispatching fee, which is represented by S^Agent. The three expressions of the demand response load dispatched by the aggregator are denoted as X_i, Y_i,u, Z_i,u, Here, a, b and c are also used to represent the weight of demand response, orderly electricity use and photovoltaic new energy power generation in the whole low-voltage interaction response. Let p^w be the unit price of the dispatching load charged by the aggregator for participating in the dispatching load. Therefore, the profit maximization model of aggregators can be expressed as:

\text{Maximize } S^{Agent} = \sum_{i, u} p^{w} (a X_{i} + b Y_{i, u} + c Z_{i, u}) . (29)
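Eqs 28 and 29 share the same weighted load term, so the two objectives can be evaluated side by side. The following Python fragment is a minimal sketch for a single period t, not the authors' implementation. It reads the sums literally, so every (i, u) pair contributes a·X_i. The function names, the nested-list data layout and all numeric values are illustrative assumptions.

```python
def weighted_load(X, Y, Z, a, b, c):
    """Sum a*X[i] + b*Y[i][u] + c*Z[i][u] over all (i, u) pairs.

    X[i] is the total demand response load of substation area i;
    Y[i][u] and Z[i][u] are the per-user loads (assumed layout).
    """
    total = 0.0
    for i, x_i in enumerate(X):
        for y_iu, z_iu in zip(Y[i], Z[i]):
            total += a * x_i + b * y_iu + c * z_iu
    return total


def grid_revenue(load, p_t, p_w, p_c):
    """Eq. 28 for one period: weighted load times the grid's unit margin."""
    return load * (p_t - p_w - p_c)


def aggregator_profit(load, p_w):
    """Eq. 29: dispatching unit price times the weighted load."""
    return p_w * load


# Hypothetical data: two substation areas with two users and one user.
X = [10.0, 20.0]
Y = [[1.0, 2.0], [3.0]]
Z = [[0.5, 0.5], [1.0]]
a, b, c = 0.5, 0.25, 0.25

load = weighted_load(X, Y, Z, a, b, c)
s_csg = grid_revenue(load, p_t=1.0, p_w=0.125, p_c=0.5)
s_agent = aggregator_profit(load, p_w=0.125)
print(load, s_csg, s_agent)
```

With these made-up inputs, raising p^w moves revenue from the grid company to the aggregator one for one, which is the tension the game model in Section 5 is built around.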

The demand response incentive model based on electric power scores is feasible in practice. The power grid company installs a dedicated electricity meter in each user's home. By using the application (such as a mobile app) paired with the meter, a user becomes a client of the demand response system, that is, a light node in the blockchain system. Score settlement is computed by full nodes with large computing power in the blockchain system and then synchronized to the users' light nodes. Once an application implementing the business logic described in this paper is in place, power regulation and score settlement on the blockchain system become simple and fast for power grid companies, aggregators and low-voltage users.

5 Incentive adjustment model of scores based on game theory

We use game theory to simulate the effect of multiple agents participating in demand response, and we adjust the score incentive mechanism according to the deviation from the demand response target. The game involving the grid company, aggregators and low-voltage users is a non-cooperative static game: the three parties independently choose their strategies according to their own optimization goals. These goals are, respectively, maximum revenue for the grid company, maximum profit from dispatching demand response load for the aggregator, and maximum scores for low-voltage users. Given the goals of the three parties, we solve for the Nash equilibrium point and adjust the score incentive mechanism accordingly.

5.1 Game model

The game model of the grid company, aggregator and low-voltage user is as follows:

Participants: the grid company (CSG), the aggregator (Agent), and all users (Users) participating in the low-voltage response.

Strategy: the game decision variables of the grid company, the aggregator and the low-voltage users are, respectively, the real-time unit price of load in the demand response period, the dispatch unit price charged by the aggregator for dispatching demand response load, and the demand response quantity submitted by low-voltage users.

Profit:

S^{CSG} = (\sum_{i, u, t} a X_{i} + b Y_{i, u} + c Z_{i, u}) (p_{t} - p^{w} - p^{c}) (30)

S^{Agent} = \sum_{i, u} p^{w} (a X_{i} + b Y_{i, u} + c Z_{i, u}) (31)

S^{users} = A + \sum_{t} \tau_{t} \cdot X \cdot \theta + C + D + E + F (32)

S^CSG denotes the total income of the grid company after the demand response load has been settled. a, b and c are the weights of demand response, orderly power consumption and photovoltaic new energy power generation in the whole low-voltage interaction response. X_i refers to the total demand response load of the ith substation area. Y_i,u refers to the load that user u of the ith substation area, in the orderly power consumption stage, sells from electric vehicle storage to other users participating in demand response. Z_i,u refers to the excess load sold to the grid company by user u of the ith substation area in the photovoltaic power generation stage. p_t refers to the real-time electricity selling price of the grid company in period t. p^c refers to the generation cost of the grid company. Eq. 31 states that the optimization objective of the aggregator is to obtain the maximum benefit from dispatching demand response load, where p^w refers to the unit price that the aggregator charges for dispatching load. Eq. 32 states that the optimization goal of low-voltage users is to maximize their scores. A, C, D, E, and F refer to the fixed scores obtained by users under different circumstances. $\sum_{t} \tau_{t} \cdot X \cdot \theta$ is the total score obtained by all users over the periods t, where \tau_t is the time coefficient, X is the total demand response load of all users, and \theta is the base reward per unit of responded load.
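To illustrate how a Nash equilibrium of such a non-cooperative static game can be found, the sketch below enumerates every strategy profile and keeps those from which no single party gains by deviating alone. It assumes each party has only two strategies, increase or decrease its decision variable. The payoff table holds hypothetical numbers chosen for illustration and is not a result from this paper. The names `PAYOFF` and `pure_nash_equilibria` are likewise illustrative.

```python
from itertools import product

PLAYERS = ("CSG", "Agent", "Users")
STRATEGIES = ("increase", "decrease")

# Hypothetical payoffs (S^CSG, S^Agent, S^users) for each strategy profile
# (grid price, aggregator price, user load). Illustrative numbers only.
PAYOFF = {
    ("increase", "increase", "increase"): (5, 4, 6),
    ("increase", "increase", "decrease"): (3, 2, 4),
    ("increase", "decrease", "increase"): (7, 2, 5),
    ("increase", "decrease", "decrease"): (4, 1, 3),
    ("decrease", "increase", "increase"): (4, 5, 7),
    ("decrease", "increase", "decrease"): (2, 3, 5),
    ("decrease", "decrease", "increase"): (5, 3, 8),
    ("decrease", "decrease", "decrease"): (3, 2, 4),
}


def pure_nash_equilibria(payoff):
    """Return all profiles where no player profits from a unilateral deviation."""
    equilibria = []
    for profile in product(STRATEGIES, repeat=len(PLAYERS)):
        stable = True
        for k in range(len(PLAYERS)):
            for alt in STRATEGIES:
                if alt == profile[k]:
                    continue
                # Player k switches strategy; the other two keep theirs.
                deviated = profile[:k] + (alt,) + profile[k + 1:]
                if payoff[deviated][k] > payoff[profile][k]:
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria


print(pure_nash_equilibria(PAYOFF))
# → [('increase', 'increase', 'increase')]
```

Exhaustive enumeration is adequate here because three parties with two strategies each yield only eight profiles. With continuous prices and loads, a best-response iteration or an optimization solver would be needed instead.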

5.2 Tripartite game

5.2.1 Grid companies increase unit price of real-time load

1) The aggregator chooses to increase the unit price of the dispatching load, and the user chooses to increase the demand response load.

The total electric power scores obtained by the user, the income obtained by the grid company, and the income obtained by the aggregators are as follows.

S_{1}^{users} = \sum_{t} \tau_{t} \cdot \Delta X \cdot \theta + S^{users} (33)

\begin{align} S_{1}^{CSG} &= (\sum_{i, u, t} a \cdot \Delta X_{i} + b \cdot \Delta Y_{i, u} + c \cdot \Delta Z_{i, u}) \\ &\quad \times (\Delta p_{t} - \Delta p^{w} - p^{c}) + S^{CSG} \end{align} (34)

S_{1}^{Agent} = \sum_{i, u} \Delta p^{w} (a \cdot \Delta X_{i} + b \cdot \Delta Y_{i, u} + c \cdot \Delta Z_{i, u}) + S^{Agent} (35)

2) The aggregator chooses to increase the unit price of the dispatching load, and the user chooses to reduce the demand response load.

The total electric power scores obtained by the user, the income obtained by the grid company, and the income obtained by the aggregators are as follows.

S_{2}^{users} = \sum_{t} \tau_{t} \cdot (- \Delta X) \cdot \theta + S^{users} (36)

\begin{align} S_{2}^{CSG} &= [\sum_{i, u, t} a (- \Delta X_{i}) + b (- \Delta Y_{i, u}) + c (- \Delta Z_{i, u})] \\ &\quad \times (\Delta p_{t} - \Delta p^{w} - p^{c}) + S^{CSG} \end{align} (37)

\begin{align} S_{2}^{Agent} &= \sum_{i, u} \Delta p^{w} [a (- \Delta X_{i}) + b (- \Delta Y_{i, u}) + c (- \Delta Z_{i, u})] \\ &\quad + S^{Agent} \end{align} (38)

3) The aggregator chooses to reduce the unit price of the dispatching load, and the user chooses to increase the demand response load.