Multi-Agent Schedule Optimization Method for Regional Energy Internet Considering the Improved Tiered Reward and Punishment Carbon Trading Model

Li, Tianxiang; Xiao, Qian; Jia, Hongjie; Mu, Yunfei; Wang, Xinying; Lu, Wenbiao; Pu, Tianjiao

doi:10.3389/fenrg.2022.916996

ORIGINAL RESEARCH article

Front. Energy Res., 24 May 2022

Sec. Smart Grids

Volume 10 - 2022 | https://doi.org/10.3389/fenrg.2022.916996

This article is part of the Research Topic: Advances in Distributed Energy Resources Aggregation for the Low Carbon Future

Qian Xiao¹*

¹Key Laboratory of Smart Grid of Ministry of Education, Tianjin University, Tianjin, China
²China Electric Power Research Institute, Beijing, China

A regional energy internet (REI) contains a large number of market agents whose interests and objectives differ from one another. Consequently, it is challenging for conventional schedule optimization methods to stimulate each agent to participate in energy conservation and emissions reduction. This paper proposes a multi-agent schedule optimization method for REI considering the improved tiered reward and punishment carbon trading model. Firstly, the energy flow constraints and device constraints of REI are established. Secondly, to tighten restrictions on carbon emissions, the relative carbon emission is used as the criterion to establish the improved tiered reward and punishment carbon trading model. Next, to analyze the real multi-agent game situation in the market, different agents are classified, and the objective functions are defined based on their revenue. Finally, a two-layer algorithm is used to solve the above multi-agent model. Simulation results verify that the proposed method can effectively reduce carbon emissions and significantly enhance the revenue of the region.

1 Introduction

Low-carbon and environmentally friendly energy production is the foundation of sustainable development in the world. Therefore, reducing the carbon emissions of energy systems has gradually become a key research topic. At present, with the gradual integration of energy systems, the research in this area mainly focuses on the following two aspects. The first is establishing a detailed carbon trading system to limit carbon emissions (Zhang et al., 2020), and the second is establishing an energy internet optimization system to improve energy efficiency (Yu et al., 2016).

Many scholars have made outstanding contributions to the energy internet carbon trading model. Huang et al. (2021) summarized the research status and application prospects of low-carbon technology, and refined the carbon emission reduction and removal technologies on the energy supply side and the consumption side, respectively. Li et al. (2021) established a multi-layer key index system of source-net-network-load, and set a carbon index membership function suitable for central cities on the premise that the sum of squared deviations between the subjective and objective weights was minimized. Yuan et al. (2022) introduced carbon capture power plants, which improved the peak regulation capability and economic benefits of cogeneration units while reducing carbon emissions. Cui et al. (2021) introduced a comprehensive and flexible operation mode of carbon capture power plants on the source side, while considering demand response on the load side. This mode explored the dispatching advantages of the complementary low-carbon characteristics of the two means, achieving a high degree of wind power consumption. Li and Niu (2021) summarized the current technical characteristics of the power system and argued that the expansion of renewable energy, the early withdrawal of coal power, the application of carbon capture technology, and the guarantee of transformation investment need to be handled in the future energy transition. Cui et al. (2022) proposed a multi-time scale source-load dispatch method for power systems with wind power considering the low-carbon characteristics of carbon capture power plants, which takes advantage of the dispatch of adjustable source-load resources to achieve the goal of low-carbon economic scheduling of power systems. However, the above literature did not put strong restrictions on enterprises with ultra-high carbon emissions, which is not conducive to significant control of carbon emissions.

Optimizing energy internet operation and reducing energy consumption is one of the current research priorities. Zhang et al. (2016) fully considered the characteristics of renewable energy and of user demand response, and proposed a renewable energy day dispatch method to improve the revenue of the energy system. Based on the current energy internet operation model, Zhang et al. (2015) proposed a new energy internet optimization model that considered electricity, heat, and gas. In view of the distribution characteristics of the energy internet, Xiao et al. (2022) proposed a side-cloud collaborative architecture. Under this architecture, optimal system scheduling was realized through multi-server layering, which enabled rapid scheduling of the energy internet. Bahrami et al. (2015) and Zhang et al. (2019) extended the demand response of the traditional power system to the heat and gas systems and proposed an optimal dispatching method that considered multi-dimensional demand response. Mohammadian et al. (2021) proposed a data-driven classifier for extreme outage prediction based on Bayes decision theory, which can guarantee the optimization effect and significantly improve the optimization rate of the energy system. Kamruzzaman et al. (2021) used deep reinforcement learning to improve the elasticity of the power system and thereby laid the foundations for improved energy efficiency and a low-carbon economy. In the microgrid scenario, Zeng et al. (2019) proposed a grid optimization and energy management method based on a deep neural network. Mohsenian-Rad et al. (2010) proposed autonomous demand-side management based on game-theoretic energy consumption scheduling, which provided direction for the future low-carbon transformation of smart grids. Peng et al. (2021), Wang et al. (2020), and Chen et al. (2019) used edge computing to realize the hierarchical optimization of the energy internet. However, the above literature did not adequately consider the game relationship between different agents of the energy internet. General overall optimization alone can hardly motivate the agents of the energy internet to participate in energy conservation and emissions reduction.

To reduce REI’s carbon emissions and enhance the different agents’ revenue, this paper proposes a multi-agent schedule optimization method for REI considering the improved tiered reward and punishment carbon trading model.

Compared with other works, the main contributions of the paper are summarized as follows.

1) This paper establishes a tiered reward and punishment carbon trading model for REI, which includes reward zone, ladder punishment zone, and index punishment zone. The model can reward enterprises that actively participate in reducing carbon emissions, and gradually raises the price of carbon to limit those enterprises that exceed carbon emissions standards.

2) This paper establishes a multi-agent schedule optimization method considering carbon trading for REI, which considers the interest demands of different agents in REI. This method divides all kinds of agents into supply agents, service agents and user agents. Then it sets up the objective function according to its actual revenue, which can stimulate the vitality of agents to participate in emissions reduction and market competition.

The rest of this paper is organized as follows. Section 2 establishes the REI mathematical model. The proposed tiered reward and punishment carbon trading model is detailed in Section 3, and the proposed multi-agent schedule optimization method considering carbon trading is detailed in Section 4. In Section 5, this paper uses two sets of contrasting scenes to support the advantages of the proposed model and method. Finally, some conclusions are given in Section 6.

2 Regional Energy Internet Model and Main Work of This Paper

2.1 Regional Energy Internet Model

The REI model is shown in Figure 1.

FIGURE 1. Regional energy internet.

There are three external energy suppliers, namely the electric supply company, the heat supply company and the gas supply company. In the REI, there are supply agents, service agents and user agents. Supply agents have gas generators, thermal boilers and gas holders to supply this region. Service agents have wind turbine generators, photovoltaic generators, power to gas generators, combined heat and power generators and gas boilers, and use these devices to optimize regional operation. User agents have three kinds of load: electric load, heat load and gas load.

2.1.1 Energy Flow Constraints

1) Electric flow constraints can be expressed as follows.

p_{e, t}^{UB} = p_{e, t}^{ESC} + p_{e, t}^{SS} + p_{e, t}^{VS} (1)

p_{e, t}^{VS} = p_{t}^{WTG} + p_{t}^{PG} + p_{e, t}^{CHP} - p_{e, t}^{PTG} (2)

Where $p_{e, t}^{UB}$ is the electric power of load after demand response at the time $t$, $p_{e, t}^{ESC}$ is electric power purchased from the electric supply company at the time $t$, $p_{e, t}^{SS}$ is electric power purchased from supply agents at the time $t$, $p_{e, t}^{VS}$ is electric power offered by service agents at the time $t$, $p_{t}^{WTG}$ is electric power offered by wind turbine generators at the time $t$, $p_{t}^{PG}$ is electric power offered by photovoltaic generators at the time $t$, $p_{e, t}^{CHP}$ is electric power offered by combined heat and power generators at the time $t$, and $p_{e, t}^{PTG}$ is electric power consumed by power to gas generators at the time $t$.

2) Heat flow constraints can be expressed as follows.

p_{h, t}^{UB} = p_{h, t}^{HSC} + p_{h, t}^{SS} + p_{h, t}^{VS} (3)

p_{h, t}^{VS} = p_{h, t}^{CHP} + p_{h, t}^{GB} (4)

Where $p_{h, t}^{UB}$ is the heat power of load after demand response at the time $t$, $p_{h, t}^{HSC}$ is heat power purchased from the heat supply company at the time $t$, $p_{h, t}^{SS}$ is heat power purchased from supply agents at the time $t$, $p_{h, t}^{VS}$ is heat power offered by service agents at the time $t$, $p_{h, t}^{CHP}$ is heat power offered by combined heat and power generators at the time $t$, and $p_{h, t}^{GB}$ is heat power offered by the gas boiler at the time $t$.

3) Gas flow constraints can be expressed as follows.

p_{s, t}^{UB} = p_{s, t}^{PSC} + p_{s, t}^{SS} + p_{s, t}^{VS} (5)

p_{s, t}^{VS} = p_{s, t}^{PTG} - p_{s, t}^{CHP} - p_{s, t}^{GB} (6)

Where $p_{s, t}^{UB}$ is the gas power of load after demand response at the time $t$, $p_{s, t}^{PSC}$ is gas power purchased from the gas supply company at the time $t$, $p_{s, t}^{SS}$ is gas power purchased from supply agents at the time $t$, $p_{s, t}^{VS}$ is gas power offered by service agents at the time $t$, $p_{s, t}^{PTG}$ is gas power offered by the power to gas generators at the time $t$, $p_{s, t}^{CHP}$ is gas power consumed by combined heat and power generators at the time $t$, and $p_{s, t}^{GB}$ is gas power consumed by the gas boiler at the time $t$.

2.1.2 Device Constraints

1) Power to gas generators constraints can be expressed as follows.

p_{s, t}^{PTG} = η^{PTG} p_{e, t}^{PTG} (7)

0 \leq p_{e, t}^{PTG} \leq p_{\max}^{PTG} (8)

Where $η^{PTG}$ is the efficiency of the power to the gas generators, $p_{\max}^{PTG}$ is the maximum input power of the power to gas generators.

2) Combined heat and power generators constraints can be expressed as follows.

p_{e, t}^{CHP} = η_{e}^{CHP} p_{s, t}^{CHP} (9)

p_{h, t}^{CHP} = η_{h}^{CHP} p_{s, t}^{CHP} (10)

0 \leq p_{s, t}^{CHP} \leq p_{\max}^{CHP} (11)

Where $η_{e}^{CHP}$ is the electric efficiency of the combined heat and power generators, $η_{h}^{CHP}$ is the heat efficiency of the combined heat and power generators, $p_{\max}^{CHP}$ is the maximum input power of the combined heat and power generators.

3) Gas boiler constraints can be expressed as follows.

p_{h, t}^{GB} = η^{GB} p_{s, t}^{GB} (12)

0 \leq p_{s, t}^{GB} \leq p_{\max}^{GB} (13)

Where $η^{GB}$ is the efficiency of the gas boiler, $p_{\max}^{GB}$ is the maximum input power of the gas boiler.

4) Wind turbine generators constraints can be expressed as follows.

0 \leq p_{t}^{WTG} \leq p_{fore, t}^{WTG} (14)

Where $p_{fore, t}^{WTG}$ is current wind turbine generators forecast power at the time $t$ .

5) Photovoltaic generators constraints can be expressed as follows.

0 \leq p_{t}^{PG} \leq p_{fore, t}^{PG} (15)

Where $p_{fore, t}^{PG}$ is current photovoltaic generators forecast power at the time $t$ .
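The flow and device constraints above can be combined into a small numerical check. The following Python sketch evaluates the service agents' net electric, heat and gas outputs for one time step from Eqs. 2, 4, 6, 7, 9, 10 and 12, and rejects operating points that violate the bounds in Eqs. 8, 11, 13, 14 and 15. The function name, the `DeviceParams` class and all parameter values are illustrative assumptions; they are not the values of Table 3.

```python
from dataclasses import dataclass


@dataclass
class DeviceParams:
    """Illustrative device parameters (assumed values, not the paper's Table 3)."""
    eta_ptg: float = 0.6      # power to gas efficiency, Eq. (7)
    eta_chp_e: float = 0.35   # CHP electric efficiency, Eq. (9)
    eta_chp_h: float = 0.45   # CHP heat efficiency, Eq. (10)
    eta_gb: float = 0.9       # gas boiler efficiency, Eq. (12)
    p_max_ptg: float = 100.0  # maximum input power of power to gas, Eq. (8)
    p_max_chp: float = 200.0  # maximum input power of CHP, Eq. (11)
    p_max_gb: float = 150.0   # maximum input power of gas boiler, Eq. (13)


def service_outputs(p_wtg, p_pg, p_e_ptg, p_s_chp, p_s_gb,
                    wtg_fore, pg_fore, prm=None):
    """Return (p_e_vs, p_h_vs, p_s_vs) of the service agents for one time step."""
    prm = prm or DeviceParams()
    # Each decision variable must lie between zero and its upper bound.
    bounds = {
        "p_e_ptg": (p_e_ptg, prm.p_max_ptg),  # Eq. (8)
        "p_s_chp": (p_s_chp, prm.p_max_chp),  # Eq. (11)
        "p_s_gb": (p_s_gb, prm.p_max_gb),     # Eq. (13)
        "p_wtg": (p_wtg, wtg_fore),           # Eq. (14)
        "p_pg": (p_pg, pg_fore),              # Eq. (15)
    }
    for name, (value, upper) in bounds.items():
        if not 0 <= value <= upper:
            raise ValueError(f"{name}={value} violates 0 <= {name} <= {upper}")

    p_s_ptg = prm.eta_ptg * p_e_ptg    # gas produced by power to gas, Eq. (7)
    p_e_chp = prm.eta_chp_e * p_s_chp  # electricity from CHP, Eq. (9)
    p_h_chp = prm.eta_chp_h * p_s_chp  # heat from CHP, Eq. (10)
    p_h_gb = prm.eta_gb * p_s_gb       # heat from gas boiler, Eq. (12)

    p_e_vs = p_wtg + p_pg + p_e_chp - p_e_ptg  # Eq. (2)
    p_h_vs = p_h_chp + p_h_gb                  # Eq. (4)
    p_s_vs = p_s_ptg - p_s_chp - p_s_gb        # Eq. (6)
    return p_e_vs, p_h_vs, p_s_vs
```

A negative `p_s_vs` simply means that, at this operating point, the service agents consume more gas in the combined heat and power generators and gas boiler than the power to gas generators produce.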

2.2 Main Work of This Paper

To solve the problems of high carbon emissions and insufficient enthusiasm of all agents in the REI, this paper proposes the improved tiered reward and punishment carbon trading model and the multi-agent schedule optimization method considering carbon trading. As shown in Figure 2, the carbon trading model rewards and punishes the corresponding enterprises, which achieves the goal of reducing carbon emissions. The multi-agent game method accounts for the initiative of different agents participating in energy conservation and emissions reduction, which achieves the goal of reducing carbon emissions and enhancing revenue. The two parts are connected as follows. The carbon trading model helps the multi-agent method control carbon emissions, and the multi-agent game method helps the carbon trading model mobilize the enthusiasm of different agents to participate in energy conservation and emissions reduction.

FIGURE 2. The main work of this paper and the effects.

3 The Tiered Reward and Punishment Carbon Trading Model

The traditional carbon trading model often applies a single uniform price to both rewards and punishments, which is not conducive to mobilizing the enthusiasm of all kinds of agents in the market to participate in energy conservation and emissions reduction. The traditional carbon trading model is shown in Figure 3.

FIGURE 3. Traditional carbon trading model.

This traditional model allows some enterprises to profit through arbitrage. For example, if the revenue an enterprise earns by emitting one unit of carbon dioxide exceeds the benchmark price of carbon emissions per unit k, this model does nothing to limit the enterprise’s carbon emissions.

Therefore, to stimulate the enthusiasm of stakeholders in the energy market to participate in energy conservation and emissions reduction, the improved tiered reward and punishment carbon trading model is proposed in this paper, which is shown in Figure 4.

E = E_{r} - E_{g} (16)

In Figure 4, the horizontal axis E is the relative carbon emission for the entire simulation period, which can be calculated by formula (16), and the vertical axis P is the carbon price per unit given by the government. In formula (16), E_r is the actual carbon emissions of the enterprise, and E_g is the free carbon emissions of the enterprise. The reward and punishment carbon trading model includes a reward zone, a ladder punishment zone and an index punishment zone. The reward zone rewards enterprises with lower carbon emissions on a tiered basis. In the same way, the ladder punishment zone punishes enterprises with higher carbon emissions at different levels. The index punishment zone limits enterprises with extremely high carbon emissions.

FIGURE 4. The proposed tiered reward and punishment carbon trading model.

Compared with the traditional carbon trading model, the improved tiered reward and punishment carbon trading model has the following advantages:

1) It can reward or punish enterprises with different carbon emissions at different levels, which can mobilize the enthusiasm of enterprises to participate in energy conservation and emissions reduction.

2) It can eliminate the possibility of some high-carbon emissions enterprises profiting from it.

3.1 Reward Zone

When an enterprise’s relative carbon emission E is less than zero, its actual carbon emissions E_r are below its free carbon emissions E_g. In this case, the enterprise should be rewarded, and the reward amount is determined by its relative carbon emission E. This paper divides the reward zone into three levels. The carbon price per unit P can be calculated as follows.

P = \begin{cases} - k α_{1}, & - v \leq E < 0 \\ - k α_{2}, & - 2 v \leq E < - v \\ - k α_{3}, & - 3 v \leq E < - 2 v \end{cases} (17)

Where k is the benchmark price of carbon emissions per unit, $α_{k}$ (k =1, 2, 3) is the carbon emissions incentive factor, v is the carbon emissions classification unit.

The cost of carbon emissions $C^{CO_{2}}$ can be calculated as follows.

C^{CO_{2}} = \begin{cases} - k [α_{1} (- E)], & - v \leq E < 0 \\ - k [α_{1} v + α_{2} (- E - v)], & - 2 v \leq E < - v \\ - k [α_{1} v + α_{2} v + α_{3} (- E - 2 v)], & - 3 v \leq E < - 2 v \end{cases} (18)

In the reward zone, both the carbon price per unit given by the government and the cost of carbon emissions are negative. That means this enterprise is rewarded for reducing carbon emissions.

3.2 Ladder Punishment Zone

When an enterprise’s relative carbon emission E is more than zero, it means this enterprise should be punished for its carbon emissions. If the enterprise’s carbon emissions are less than the set standard 3v in the meantime, its carbon price will fall into the ladder punishment zone. This paper divides the ladder punishment zone into three levels. The carbon price per unit P can be calculated as follows.

P = \begin{cases} k β_{1}, & 0 \leq E < v \\ k β_{2}, & v \leq E < 2 v \\ k β_{3}, & 2 v \leq E < 3 v \end{cases} (19)

Where $β_{k}$ (k = 1, 2, 3) is the carbon emissions punishment factor.

The cost of carbon emissions $C^{CO_{2}}$ can be calculated as follows.

C^{CO_{2}} = \begin{cases} k β_{1} E, & 0 \leq E < v \\ k [β_{1} v + β_{2} (E - v)], & v \leq E < 2 v \\ k [β_{1} v + β_{2} v + β_{3} (E - 2 v)], & 2 v \leq E < 3 v \end{cases} (20)

In the ladder punishment zone, the carbon price per unit presents a ladder distribution. That means this enterprise is punished for carbon emissions, and the price of the punishment varies with the amount of carbon emitted.

3.3 Index Punishment Zone

When an enterprise’s relative carbon emission E is more than 3v, it means that the enterprise’s carbon emissions seriously exceeded the standard. In this case, the punishment must be increased to ensure the environmental protection of the energy system. Therefore, this paper sets up an index punishment zone. In this zone, the carbon price per unit P can be calculated as follows.

P = k β_{3} e^{(E - 3 v)} (21)

The cost of carbon emissions $C^{CO_{2}}$ can be calculated as follows, where $x$ is the integration variable running over the index punishment zone from $3 v$ to $E$.

C^{CO_{2}} = k [β_{1} v + β_{2} v + β_{3} v + \int_{3 v}^{E} β_{3} e^{(x - 3 v)} dx] (22)

In the index punishment zone, the carbon price per unit presents exponential growth.
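The three zones can be combined into one piecewise function. The following Python sketch evaluates the unit price of Eqs. 17, 19 and 21 and the cumulative cost of Eqs. 18, 20 and 22. It assumes that the integral in Eq. 22 accumulates the price of Eq. 21 over the index punishment zone from 3v to E, so that it evaluates to β3(e^(E − 3v) − 1). The function names and the numbers in the usage note are illustrative assumptions, not the parameters of the paper.

```python
import math


def carbon_price(E, k, alpha, beta, v):
    """Unit carbon price P for relative emission E, per Eqs. (17), (19), (21)."""
    if E < -3 * v:
        raise ValueError("E below -3v is outside the zones defined in Eq. (17)")
    if E < 0:                      # reward zone, negative price
        if E >= -v:
            return -k * alpha[0]
        if E >= -2 * v:
            return -k * alpha[1]
        return -k * alpha[2]
    if E < v:                      # ladder punishment zone
        return k * beta[0]
    if E < 2 * v:
        return k * beta[1]
    if E < 3 * v:
        return k * beta[2]
    return k * beta[2] * math.exp(E - 3 * v)  # index punishment zone


def carbon_cost(E, k, alpha, beta, v):
    """Carbon emission cost, per Eqs. (18), (20), (22); negative means reward."""
    if E < -3 * v:
        raise ValueError("E below -3v is outside the zones defined in Eq. (18)")
    if E < 0:
        r = -E  # emissions saved relative to the free quota
        reward = (alpha[0] * min(r, v)
                  + alpha[1] * min(max(r - v, 0.0), v)
                  + alpha[2] * max(r - 2 * v, 0.0))
        return -k * reward
    ladder = (beta[0] * min(E, v)
              + beta[1] * min(max(E - v, 0.0), v)
              + beta[2] * min(max(E - 2 * v, 0.0), v))
    # Assumed closed form of the integral in Eq. (22), taken from 3v to E.
    index = beta[2] * (math.exp(E - 3 * v) - 1.0) if E > 3 * v else 0.0
    return k * (ladder + index)
```

For example, with the assumed values k = 10, α = β = (1, 1.5, 2) and v = 2, an enterprise with E = 5 falls in the third ladder level, while E = 7 falls in the index punishment zone, where each additional unit of emissions is priced exponentially higher.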

4 The Multi-Agent Schedule Optimization Method Considering Carbon Trading

There are many different types of agents in REI, and the general optimization of REI operation is not conducive to mobilizing the enthusiasm of all agents to participate in energy conservation and emissions reduction. According to the characteristics of different agents in REI, various agents are divided into supply agents, service agents and user agents. Then the objective function is set according to their actual interests, and an optimization scheduling method considering multi-agent carbon trading is proposed. The interests of each agent are affected by the policies of other agents. The game relationship of the three types of agents is shown in Figure 5.

FIGURE 5. Multi-agent game relationship diagram.

4.1 Supply Agents

Supply agents refer to all market agents who profit by producing and selling energy. Their main feature is that they have energy production facilities. Their revenue is mainly influenced by the number of purchases by lower-level buyers and the cost of energy production. When such supply agents are optimized, their energy comprehensive revenue can be maximized by adjusting their price to service agents and their energy supply project. Supply agents’ objective function $I^{S}$ can be calculated as follows.

I^{S} = \max (I^{SSale} - C^{SS} - C^{SC} - C^{C O_{2}}) (23)

Where $I^{SSale}$ is the supply agents’ revenue of energy sales, $C^{SS}$ is the supply agents’ cost of energy production, $C^{SC}$ is the comfort cost of supply agents. They can be calculated as follows.

I^{SSale} = \sum_{t = 1}^{T} δ_{t}^{SS} p_{t}^{SS} Δ t (24)

Where $t$ is the current simulation time, $T$ is the total simulation time, $δ_{t}^{SS}$ is the energy sale price of supply agents at the time $t$, $p_{t}^{SS}$ is the power sold by supply agents at the time $t$, and $Δ t$ is the length of one simulation time step.

C^{SS} = \sum_{t = 1}^{T} [\sum_{m \in M} (γ_{m}^{SS} p_{m, t}^{SS} Δ t + ε_{m}^{SS} p_{m, t}^{SS} Δ t)] (25)

Where $m$ is the current device number, and $M$ is the set of all supply agents’ devices. $γ_{m}^{SS}$ is the service cost per unit of device $m$, which usually includes equipment maintenance, sewage treatment and so on. $ε_{m}^{SS}$ is the production cost per unit of device $m$. $p_{m, t}^{SS}$ is the power of device $m$ at the time $t$.

C^{SC} = \sum_{t = 1}^{T} k_{1} [{(\frac{δ_{t}^{SS}}{δ})}^{k_{2}} - 1] (26)

Where $k_{1}$ is the product coefficient of comfort for supply agents, $k_{2}$ is the index coefficient of comfort for supply agents, $δ$ is the average market energy price.

Supply agents’ revenue is affected by various market factors. Supply agents can change their price to service agents $δ_{t}^{SS}$ and supply project $p_{m, t}^{SS}$ to enhance their revenue.
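As a rough illustration of how Eqs. 23 to 26 fit together, the following Python sketch evaluates the supply agents' revenue for a given price and dispatch plan. The carbon cost is passed in as a precomputed number, and the function name, argument layout and sample values are assumptions made for illustration only.

```python
def supply_revenue(price, power, device_power, gamma, eps,
                   k1, k2, delta_avg, carbon_cost, dt=1.0):
    """Quantity maximized in Eq. (23) for one candidate plan.

    price[t]           -- sale price of supply agents at time t
    power[t]           -- power sold by supply agents at time t
    device_power[t][m] -- power of device m at time t
    gamma[m], eps[m]   -- unit service cost and unit production cost of device m
    """
    # Eq. (24): revenue of energy sales.
    sale = sum(d * p * dt for d, p in zip(price, power))
    # Eq. (25): service cost plus production cost over all devices and times.
    production = sum((gamma[m] + eps[m]) * p_mt * dt
                     for row in device_power
                     for m, p_mt in enumerate(row))
    # Eq. (26): comfort cost, zero when the price equals the market average.
    comfort = sum(k1 * ((d / delta_avg) ** k2 - 1.0) for d in price)
    return sale - production - comfort - carbon_cost
```

In a game iteration, this value would be recomputed for each candidate pair of price and supply plan while the other agents' decisions are held fixed.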

4.2 Service Agents

Service agents refer to all market agents who profit from energy conversion and dispatching. Their main feature is that they have power to gas, gas boiler and other energy conversion equipment. Their revenue is influenced by market conditions, equipment performance and other agents’ policies. When service agents are optimized, their energy comprehensive revenue can be maximized by adjusting their energy purchasing power, energy selling price and conversion strategy. Service agents’ objective function $I^{V}$ can be calculated as follows.

I^{V} = \max (I^{VSale} - C^{VS} - C^{VC} - C^{VB} - C^{C O_{2}}) (27)

Where $I^{VSale}$ is the service agents’ revenue of energy sales, $C^{VS}$ is the service agents’ cost of equipment maintenance, $C^{VC}$ is the service agents’ comfort cost, $C^{VB}$ is the service agents’ cost for purchasing energy. They can be calculated as follows.

I^{VSale} = \sum_{t = 1}^{T} δ_{t}^{VS} p_{t}^{VS} Δ t (28)

Where $δ_{t}^{VS}$ is energy sale price from service agents at the time t, $p_{t}^{VS}$ is the amount of energy sold power from service agents at the time $t$ .

C^{VS} = \sum_{t = 1}^{T} (\sum_{u \in U} γ_{u}^{VS} p_{u, t}^{VS} Δ t) (29)

Where $u$ is the current device number, and $U$ is the set of all service agents’ devices. $γ_{u}^{VS}$ is the service cost per unit of device $u$, which usually includes equipment maintenance, energy efficiency conversion and so on. $p_{u, t}^{VS}$ is the power of device $u$ at the time $t$.

C^{VC} = \sum_{t = 1}^{T} k_{3} [{(\frac{δ_{t}^{VS}}{δ})}^{k_{4}} - 1] (30)

Where $k_{3}$ is the product coefficient of comfort for service agents, $k_{4}$ is the index coefficient of comfort for service agents, $δ$ is the average market energy price.

C^{VB} = \sum_{t = 1}^{T} δ_{t}^{SS} p_{t}^{VB} Δ t (31)

Where $p_{t}^{VB}$ is the amount of energy buying power from supply agents at the time $t$ .

Service agents can change their price to user agents $δ_{t}^{VS}$ , service project $p_{u, t}^{VS}$ and the amount of energy buying power from supply agents $p_{t}^{VB}$ to enhance their revenue.

4.3 User Agents

User agents refer to all market agents who benefit in other ways. They usually act as a user of energy rather than participating in energy production and transmission activities. In the process of energy optimization, the minimum energy purchase cost of the user agents is considered. User agents’ objective function $I^{US}$ can be calculated as follows.

I^{US} = \min (C^{UB} + C^{UC} + C^{UCT}) (32)

Where $C^{UB}$ is the user agents’ cost of buying energy, $C^{UC}$ is the user agents’ comfort cost, $C^{UC T}$ is the carbon limits cost of user agents. They can be calculated as follows.

C^{UB} = \sum_{t = 1}^{T} δ_{t}^{VS} p_{t}^{UB} Δ t (33)

Where $p_{t}^{UB}$ is the amount of energy buying power from service agents at the time $t$ . In this paper the user agents can only purchase energy from the service agents, so $p_{t}^{UB}$ is the same as the actual load after the demand response.

C^{UC} = \sum_{t = 1}^{T} [\frac{y}{2 L_{t}} {(p_{t}^{UB})}^{2} - y p_{t}^{UB} + \frac{y}{2} L_{t}] (34)

Where $y$ is the comfort coefficient of user agents, $L_{t}$ is the initial load before the demand response.
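The structure of Eq. 34 becomes clearer after completing the square. The short derivation below uses only the symbols already defined and shows that the comfort cost is a quadratic penalty on the deviation of the purchased power from the initial load.

```latex
\frac{y}{2 L_{t}} {(p_{t}^{UB})}^{2} - y p_{t}^{UB} + \frac{y}{2} L_{t}
  = \frac{y}{2 L_{t}} \left[ {(p_{t}^{UB})}^{2} - 2 L_{t} p_{t}^{UB} + L_{t}^{2} \right]
  = \frac{y}{2 L_{t}} {(p_{t}^{UB} - L_{t})}^{2}
```

The term is therefore zero when $p_{t}^{UB} = L_{t}$, that is, when no load is curtailed, and it grows quadratically as the demand response moves the load away from its initial value.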

C^{UCT} = \sum_{t = 1}^{T} k_{5} {(C^{C O_{2}})}^{k_{6}} (35)

Where $k_{5}$ is the product coefficient of carbon limits for user agents, $k_{6}$ is the index coefficient of carbon limits for user agents. Although carbon emissions are not directly emitted by users, the energy consumption of users has a great influence on the carbon emissions of the park. Therefore, this paper uses $C^{UCT}$ to limit user agents’ energy consumption.

User agents can change the amount of power they buy from service agents, $p_{t}^{UB}$, to reduce their cost.

4.4 Optimization Calculation Method

A multi-agent game is different from multi-objective optimization. It does not simply pursue the maximum comprehensive revenue. Instead, it seeks a stable operating point at which no agent can enhance its revenue by changing its own action alone. This feature can be expressed as follows.

i \in O, a_{j}^{i} \in A^{i}, s \in S (36)

Where $i$ is the current agent, $O$ is the set of agents in the REI, $a_{j}^{i}$ is agent $i$'s action $j$, $A^{i}$ is the set of agent $i$'s actions, $s$ is the current state, and $S$ is the set of states in the REI.

At state $s$, if the following conditions are satisfied

\forall i \in O, \forall a_{j}^{i} \in A^{i} (37)

I_{i}^{a_{*}^{i}} \geq I_{i}^{a_{j}^{i}} (38)

then $(a_{*}^{1}, a_{*}^{2}, \dots, a_{*}^{i}, \dots)$ is the equilibrium solution at state $s$. Where $I_{i}^{a_{*}^{i}}$ is agent $i$'s revenue under action $a_{*}^{i}$, and $I_{i}^{a_{j}^{i}}$ is agent $i$'s revenue under action $a_{j}^{i}$.
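For a finite set of actions, the condition in Eqs. 37 and 38 can be checked by brute force. The following Python sketch tests whether any agent can raise its revenue by changing only its own action while the others keep theirs, which is the usual reading of the condition. The toy two-agent price game and its payoff numbers are hypothetical and serve only to exercise the check; they are not taken from the simulation in this paper.

```python
from itertools import product


def is_equilibrium(profile, actions, revenue):
    """True if no agent gains by unilaterally deviating from `profile`."""
    for i, own_actions in enumerate(actions):
        current = revenue(i, profile)
        for a in own_actions:
            # Replace only agent i's action; the other agents keep theirs.
            deviation = profile[:i] + (a,) + profile[i + 1:]
            if revenue(i, deviation) > current:
                return False
    return True


def find_equilibria(actions, revenue):
    """Enumerate all joint actions and keep those that satisfy the condition."""
    return [p for p in product(*actions) if is_equilibrium(p, actions, revenue)]


# Hypothetical two-agent price game: each agent sets a "high" or "low" price.
PAYOFF = {
    ("high", "high"): (3, 3),
    ("high", "low"): (0, 4),
    ("low", "high"): (4, 0),
    ("low", "low"): (1, 1),
}


def toy_revenue(i, profile):
    return PAYOFF[profile][i]


equilibria = find_equilibria([("high", "low"), ("high", "low")], toy_revenue)
```

Enumeration is only practical for small discrete action sets. With continuous prices and device powers, an iterative scheme in which the agents update their decisions in turn is the more common choice.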

Considering that the multi-agent game process of REI is complex, this paper divides the whole operation optimization process into the following two parts: the upper price game and the bottom device game. The algorithm can be expressed as follows.

5 Simulation Result

To verify the superiority of the proposed method, this paper establishes an REI for simulation which includes supply, service and user agents. Supply agents possess electric generators, gas holders and thermal boilers. Service agents possess wind turbine generators, photovoltaic generators, combined heat and power generators, power to gas generators and gas boilers. User agents include electric, gas and heat load. The energy flow model and device model are shown in Section 2. This paper sets the simulation time as 24 h, and the time interval as 1 h. To ensure the rationality of the simulation results, the price needs to be limited. Supply agents’ price limit is shown in Table 1, and service agents’ price limit is shown in Table 2. At the same time, the efficiency, unit service cost, and maximum output power constraints of the various devices should be set according to the situation of the region. Device parameters are shown in Table 3. Parameters in the carbon trading model are shown in Table 4.

TABLE 1. Supply agents’ price limit.

TABLE 2. Service agents’ price limit.

TABLE 3. Device parameters.

TABLE 4. Carbon trading scenes.

The renewable energy forecast value and initial load value are shown in Figure 6. This paper requires that the output power of renewable energy in the REI cannot exceed its predicted value. At the same time, the demand response in this paper is in the form of an interruptible load, so the actual user load value cannot exceed the initial value.

FIGURE 6. Renewable energy and load data.

To show that both the proposed improved tiered reward and punishment carbon trading model and the proposed multi-agent game method are beneficial to the REI, this paper sets up simulation scenes following the controlled-variable principle. In Section 5.1, all three kinds of energy games are considered, and only the carbon trading model is changed; in Section 5.2, the proposed tiered reward and punishment carbon trading model is considered, and only the energy types included in the game are changed. The other parameter settings of the two sets of scenes are the same.

5.1 Carbon Trading Scenes Analysis

To verify the advantages of the proposed tiered reward and punishment carbon trading model, this paper sets up three scenes which are shown in Table 4 for comparison.

In Table 4, there are 3 scenes to simulate. In scene 1, the carbon trading model is not considered, which means the revenue of the whole region will not be affected by carbon emissions; in scene 2, the traditional carbon trading model in Figure 3 is considered; in scene 3, the improved tiered reward and punishment carbon trading model in Figure 4 is considered.

The simulation results are shown in Table 5.

TABLE 5. Carbon trading results.

The comparison of carbon emissions is shown in Figure 7.

FIGURE 7. Comparison of carbon emissions.

In Table 5 and Figure 7, the results indicate that, from scene 1 to scene 3, carbon emissions significantly decrease by about 2.2%, which illustrates that the method proposed in this paper is useful for controlling carbon emissions. However, the cost of carbon emissions rises, which decreases the comprehensive revenue. This trade-off is expected: controlling carbon emissions reduces revenue.

5.2 Game Scenes Analysis

To verify the advantages of the proposed multi-agent game method, this paper sets up four scenes, shown in Table 6, for simulation. In scene 1, supply agents, service agents and user agents focus only on immediate revenue and do not consider the impact of other market players; in scene 2, each agent considers only the heat game; in scene 3, the heat and gas games are considered; in scene 4, the electric, heat and gas games are considered.

TABLE 6. Game scenes.

The simulation results are shown in Table 7.

TABLE 7. Game results.

The comprehensive revenue and carbon emissions are shown in Figure 8.

FIGURE 8. Comparison results.

The results in Table 7 and Figure 8 indicate that the more types of energy games are considered, the higher the energy sales revenue. This is because, when the energy game is not considered, supply agents and service agents consider only immediate revenue. In that case, these agents pursue inflated prices without analyzing market conditions, which instead lowers the region’s energy sales revenue. When fewer types of energy games are considered, carbon emissions rise, because users do not consider the market game and blindly reduce their demand response, which increases total energy consumption and therefore carbon emissions. Compared with scene 1, carbon emissions in scene 4 decrease by about 5.1%.

Higher carbon emissions raise the cost of carbon emissions; together with the lower energy sales revenue, this causes the comprehensive revenue to decline.
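The percentage reductions quoted between scenes are relative changes with respect to the baseline scene. A minimal sketch of that calculation is given below; the function name `relative_reduction` is hypothetical, and the numbers in the usage note are placeholders, not the values from Table 7.

```python
def relative_reduction(baseline, value):
    """Percentage by which `value` is lower than `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - value) / baseline * 100.0
```

For instance, with a placeholder baseline of 1000.0 units and a value of 949.0 units, `relative_reduction(1000.0, 949.0)` gives a reduction of about 5.1%.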

5.3 Game and Device Results Analysis

Multi-agent game results are shown in Figure 9.

FIGURE 9. Multi-agent results.

In Figure 9, the changing user loads cause the prices set by supply agents and service agents to vary with time. The heat price is consistently high and often reaches its upper limit, because the heat demand in this paper is large: for service agents, the revenue gained by raising the price exceeds the loss in the comfort function. In contrast, natural gas demand in this REI is low, so reducing the price to increase the comfort function brings more revenue.

Device optimization results are shown in Figure 10.

FIGURE 10. Device optimization results.

The results in Figure 10 indicate that the output power of the WTG and PV generators is almost the same as predicted, because renewable energy is given priority in this paper. CHP generators are used when renewable energy cannot meet system requirements and there is simultaneous demand for heat and electricity. For service agents, it is cheaper to supply heat through GB generators than to buy it from supply agents. By contrast, P2G generators are not used because they are inefficient. Demand response is present for all three energy sources at all times and is affected by load type, energy price, and renewable energy power. When the initial load is high, user agents face higher energy costs, so they increase their demand response. The gas load is low at all times, which is why the gas demand response is low.
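The dispatch pattern described above can be read as a simple priority rule. The Python sketch below is only a greedy, single-period illustration of that reading, not the two-layer optimization used in this paper. The function `dispatch_one_period`, its parameters (including the heat-to-power ratio `chp_heat_ratio`), and the rule that CHP output is capped by the heat load are assumptions; P2G and demand response are omitted.

```python
from dataclasses import dataclass


@dataclass
class Dispatch:
    renewable: float     # electricity served by WTG and PV
    chp_power: float     # electricity served by CHP
    chp_heat: float      # heat co-produced by CHP
    gb_heat: float       # heat served by the gas boiler (GB)
    bought_power: float  # electricity still to be bought from supply agents
    bought_heat: float   # heat still to be bought from supply agents


def dispatch_one_period(elec_load, heat_load, renewable_avail,
                        chp_power_cap, chp_heat_ratio, gb_heat_cap):
    """Greedy single-period dispatch (illustrative assumption only)."""
    # 1) Renewable output is used first.
    renewable = min(elec_load, renewable_avail)
    residual_elec = elec_load - renewable

    # 2) CHP runs only if electricity is still short AND heat is needed.
    chp_power = 0.0
    chp_heat = 0.0
    if residual_elec > 0 and heat_load > 0:
        # Cap CHP so that its co-produced heat never exceeds the heat load.
        chp_power = min(residual_elec, chp_power_cap,
                        heat_load / chp_heat_ratio)
        chp_heat = chp_power * chp_heat_ratio

    # 3) The gas boiler covers remaining heat before any heat is bought.
    residual_heat = heat_load - chp_heat
    gb_heat = min(residual_heat, gb_heat_cap)

    return Dispatch(renewable, chp_power, chp_heat, gb_heat,
                    residual_elec - chp_power, residual_heat - gb_heat)
```

With placeholder inputs of a 100-unit electric load, an 80-unit heat load, 60 units of renewable output, a 30-unit CHP limit, a heat-to-power ratio of 1.5, and a 40-unit GB limit, the rule uses all 60 units of renewable power, runs CHP at 30 units (co-producing 45 units of heat), covers the remaining 35 units of heat with the GB, and buys only 10 units of electricity.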

6 Conclusion

To reduce REI’s carbon emissions and enhance the different agents’ revenue, this paper proposes a multi-agent schedule optimization method considering the improved tiered reward and punishment carbon trading model. The advantages of this method are as follows.

1) The proposed improved tiered reward and punishment carbon trading model can reward or punish enterprises at different levels to reduce carbon emissions. With the multi-agent game included and only the carbon trading model varied in the simulation, the results show that, compared with the traditional model, the proposed model reduces carbon emissions in the REI by about 1.3%.

2) The proposed multi-agent schedule optimization method can stimulate each agent to participate in energy conservation and emissions reduction, which reduces carbon emissions and enhances revenue. With the improved tiered reward and punishment carbon trading model included and only the energy types in the game varied in the simulation, the results show that, compared with the non-game method, this method reduces carbon emissions by about 5.1% and significantly enhances the revenue of the REI.

Nevertheless, the differences in carbon emissions among devices are not considered in this paper. This will be the focus of future work.

Data Availability Statement

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

Author Contributions

TL and QX constructed the model and method of this paper. HJ and YM collected data. XW, WL, and TP worked together to design scenes.

Funding

The paper is supported by the National Natural Science Foundation of China (U2066213) from the Chinese Government. The author QX is the Subject Investigator of this project and the author TP is the Project Investigator of this project.

Conflict of Interest

Authors XW and TP were employed by the company China Electric Power Research Institute.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Bahrami, S., and Sheikhi, A. (2015). From Demand Response in Smart Grid toward Integrated Demand Response in Smart Energy Hub. IEEE Trans. Smart Grid 7 (2), 1. doi:10.1109/TSG.2015.2464374

Chen, S., Wen, H., Wu, J., Lei, W., Hou, W., Liu, W., et al. (2019). Internet of Things Based Smart Grids Supported by Intelligent Edge Computing. IEEE Access 7, 74089–74102. doi:10.1109/ACCESS.2019.2920488

Cui, Y., Deng, G., and Zhao, Y. (2021). Economic Dispatch of Power System with Wind Power Considering the Complementarity of Low-Carbon Characteristics of Source Side and Load Side. Proceeding CSEE 41 (14), 4799–4815. doi:10.13334/j.0258-8013.pcsee.202533

Cui, Y., Deng, G., and Zhao, Y. (2022). Multi-time Scale Source-Load Dispatch Method of Power System with Wind Power Considering Low-Carbon Characteristics of Carbon Capture Power Plant. Proceeding CSEE. doi:10.13334/j.0258-8013.pcsee.210697

Huang, Y., Ding, T., and Li, Y. (2021). Decarbonization Technologies and Inspirations for the Development of Novel Power Systems in the Context of Carbon Neutrality. Proceeding CSEE 41, 28–51. doi:10.13334/j.0258-8013.pcsee.211016

Kamruzzaman, M., Duan, J., Shi, D., and Benidris, M. (2021). A Deep Reinforcement Learning-Based Multi-Agent Framework to Enhance Power System Resilience Using Shunt Resources. IEEE Trans. Power Syst. 36 (6), 5525–5536. doi:10.1109/TPWRS.2021.3078446

Li, X., and Niu, S. (2021). Study on Multi-Layer Evaluation System of Source-Grid-Load under Carbon-Neutral Goal. Proceeding CSEE 41, 178–184. doi:10.13334/j.0258-8013.pcsee.211576

Li, Z., Chen, S., and Dong, W. (2021). Low Carbon Transition Pathway of Power Sector under Carbon Emission Constraints. Proceeding CSEE 41 (12), 3987–4000. doi:10.13334/j.0258-8013.pcsee.210671

Mohammadian, M., Aminifar, F., Amjady, N., and Shahidehpour, M. (2021). Data-driven Classifier for Extreme Outage Prediction Based on Bayes Decision Theory. IEEE Trans. Power Syst. 36 (6), 4906–4914. doi:10.1109/TPWRS.2021.3086031

Mohsenian-Rad, A.-H., Wong, V. W. S., Jatskevich, J., Schober, R., and Leon-Garcia, A. (2010). Autonomous Demand-Side Management Based on Game-Theoretic Energy Consumption Scheduling for the Future Smart Grid. IEEE Trans. Smart Grid 1 (6), 320–331. doi:10.1109/TSG.2010.2089069

Peng, N., Liang, R., Wang, G., Sun, P., Chen, C., and Hou, T. (2021). Edge Computing-Based Fault Location in Distribution Networks by Using Asynchronous Transient Amplitudes at Limited Nodes. IEEE Trans. Smart Grid 12 (1), 574–588. doi:10.1109/TSG.2020.3009005

Wang, S., Wang, X., and Wu, W. (2020). Cloud Computing and Local Chip-Based Dynamic Economic Dispatch for Microgrids. IEEE Trans. Smart Grid 11 (5), 3774–3784. doi:10.1109/TSG.2020.2983556

Xiao, Q., Li, T., and Jia, H. (2022). Research on Edge Cloud Collaboration Architecture and Optimization Strategy for Regional Energy Internet. Proceeding CSEE. doi:10.13334/j.0258-8013.pcsee.212931

Yu, M., and Hong, S. H. (2016). Supply-demand Balancing for Power Management in Smart Grid: A Stackelberg Game Approach. Appl. Energy 164 (15), 702–710. doi:10.1016/j.apenergy.2015.12.039

Yuan, G., Liu, H., and Yu, J. (2022). Combined Heat and Power Optimal Dispatching in Virtual Power Plant with Carbon Capture Cogeneration Unit. Proceeding CSEE. doi:10.2107/TM.20211104.2019.005

Zeng, P., Li, H., He, H., and Li, S. (2019). Dynamic Energy Management of a Microgrid Using Approximate Dynamic Programming and Deep Recurrent Neural Network Learning. IEEE Trans. Smart Grid 10 (4), 4435–4445. doi:10.1109/TSG.2018.2859821

Zhang, W., and Xu, Y. (2019). Distributed Optimal Control for Multiple Microgrids in a Distribution Network. IEEE Trans. Smart Grid 10 (4), 3765–3779. doi:10.1109/TSG.2018.2834921

Zhang, X., Liu, X., and Zhong, J. (2020). Integrated Energy System Planning Considering a Reward and Punishment Ladder-type Carbon Trading and Electric-Thermal Transfer Load Uncertainty. Proceeding CSEE 40 (9), 6132–6141. doi:10.13334/j.0258-8013.pcsee.191302

Zhang, X., Shahidehpour, M., Alabdulwahab, A., and Abusorrah, A. (2016). Hourly Electricity Demand Response in the Stochastic Day-Ahead Scheduling of Coordinated Electricity and Natural Gas Networks. IEEE Trans. Power Syst. 31 (1), 592–601. doi:10.1109/TPWRS.2015.2390632

Zhang, X., Shahidehpour, M., Alabdulwahab, A., and Abusorrah, A. (2015). Optimal Expansion Planning of Energy Hub with Multiple Energy Infrastructures. IEEE Trans. Smart Grid 6 (5), 2302–2311. doi:10.1109/TSG.2015.2390640

Keywords: carbon trading, multi-agent game, regional energy internet, schedule optimization, reward and punishment mechanism

Citation: Li T, Xiao Q, Jia H, Mu Y, Wang X, Lu W and Pu T (2022) Multi-Agent Schedule Optimization Method for Regional Energy Internet Considering the Improved Tiered Reward and Punishment Carbon Trading Model. Front. Energy Res. 10:916996. doi: 10.3389/fenrg.2022.916996

Received: 10 April 2022; Accepted: 25 April 2022;
Published: 24 May 2022.

Edited by:

Qinran Hu, Southeast University, China

Reviewed by:

Sheng Chen, Hohai University, China
Jian Chen, Shandong University, China

Copyright © 2022 Li, Xiao, Jia, Mu, Wang, Lu and Pu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Qian Xiao, xiaoqian@tju.edu.cn
