
ORIGINAL RESEARCH article

Front. Phys., 18 July 2024

Sec. Statistical and Computational Physics

Volume 12 - 2024 | https://doi.org/10.3389/fphy.2024.1402593

This article is part of the Research Topic: Non-equilibrium Steady-state Statistical Processes with Confined Active Matter

Introducing a new approach for modeling stock market prices using the combination of jump-drift processes

Ali Asghar Movahed and Houshyar Noshad*
  • Department of Physics and Energy Engineering, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran

Stock price data are sampled at discrete times (e.g., hourly, daily, or weekly). When data are sampled at discrete times, they appear as a sequence of discontinuous jump events, even if they have been sampled from a continuous process. Moreover, distinguishing between discontinuities due to finite sampling of a continuous stochastic process and real jump discontinuities in the sample path is often a challenging task. Such considerations led us to the question: Can discrete data (e.g., stock prices) be modeled using only jump-drift processes, regardless of whether the sampled time series originally belongs to the class of continuous processes or discontinuous processes? To answer this question, we built a stochastic dynamical equation of the general form $dy(t)=\bar{\mu}\,dt+\sum_{i=1}^{N}\xi_i\,dJ_i(t)$, which includes a deterministic drift term ($\bar{\mu}\,dt$) and a combination of stochastic terms with jumpy behavior ($\xi_i\,dJ_i(t)$), and used it to model the log-price time series $y(t)$. In this article, we first introduce this equation in its simplest form, including a drift term and a single stochastic term, and show that such a jump-drift equation is capable of reconstructing stock prices in Black-Scholes diffusion markets. Afterwards, we extend the equation by considering two jump processes, and show that such a drift-jump-jump equation enables us to reconstruct stock prices in jump-diffusion markets more accurately than the classic jump-diffusion model. To demonstrate the practical applications of the proposed method, we analyze real-world data, including the daily stock prices of two different shares and gold price data with two different time horizons (hourly and weekly). Our analysis supports the practical applicability of the methodology. It should be noted that the presented approach is extensible and can also be used in non-financial research fields.

1 Introduction

The stock price is known as a highly volatile variable in a stock market. Price fluctuations, which occur randomly and frequently and sometimes include sudden jumps, increase investment risk and cause concern for investors and company owners who want to increase their capital. Therefore, researchers are motivated to study the fluctuating behavior of the market to find a way to model prices (or improve existing models) and to advise investors looking for the best investments [1–5]. So far, significant progress has been made in this field, the most important of which is stock price modeling via continuous stochastic processes and discontinuous jump processes. The “arithmetic Brownian motion” model was the first mathematical model of stock prices, presented by Louis Bachelier in [6]. In his proposed model, Bachelier assumed that the discount rate is zero and that the stochastic differential equation (SDE) governing the stock price is as follows:

$$dS(t)=\sigma\,dW(t) \tag{1}$$

where $S(t)$ is the spot stock price at time $t$, $\sigma$ is the diffusion coefficient (known as volatility), and $\{W(t),\,t\ge 0\}$ is a scalar Wiener process (a standard Brownian motion). Integration of Eq. 1 over $(t,\,t+\Delta t)$ yields the stochastic solution of the equation:

$$\Delta S(t)=\sigma\,\Delta W(t)$$

where $\Delta S(t)=S(t+\Delta t)-S(t)$ is the change in the price during a time lag $\Delta t$, and $\Delta W(t)=W(t+\Delta t)-W(t)$ is the increment of the Wiener process, which is computed as $\Delta W(t)=\eta\sqrt{\Delta t}$, where $\eta$ is a random variable that follows a normal (Gaussian) distribution with zero mean and unit variance, i.e., $\eta\sim N(0,1)$. Therefore, the following can be written:

$$S(t+\Delta t)=S(t)+\sigma\eta\sqrt{\Delta t} \tag{2}$$

The main shortcoming of Bachelier’s model is that it assumes that the future value of the asset follows a normal distribution. Under this assumption, Eq. 2 yields a negative stock price with positive probability, which is not possible in reality. In [7], Osborne demonstrated that the future value of the stock should follow a log-normal distribution, while the log-return of the stock follows a normal distribution. Shortly afterwards, the Bachelier model was modified by Samuelson in [8], where he introduced the “geometric Brownian motion” model (also known as the Black-Scholes model). In this model, it is assumed that the price of the risky stock evolves according to the following SDE:

$$dS(t)=\mu S(t)\,dt+\sigma S(t)\,dW(t) \tag{3}$$

where $\mu$ and $\sigma$ are the drift and diffusion coefficients, and again $\{W(t),\,t\ge 0\}$ is a scalar Wiener process. The field of mathematical finance has gained significant attention since Black and Scholes published their work in [9, 10]. They contributed to the world of finance via the introduction of Itô calculus to financial mathematics, and also the Black-Scholes formula. By choosing $y(t)=\ln[S(t)]$ and applying Itô’s lemma [11, 12], Eq. 3 becomes:

$$dy(t)=\left(\mu-\frac{\sigma^2}{2}\right)dt+\sigma\,dW(t) \tag{4}$$
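The step from Eq. 3 to Eq. 4 can be written out explicitly. The following is a standard application of Itô’s lemma to $y=\ln S$, using the Itô rule $(dW)^2=dt$ and dropping terms of higher order than $dt$; it is included here only as a worked reminder.

```latex
\begin{aligned}
dy &= \frac{\partial y}{\partial S}\,dS
      + \frac{1}{2}\,\frac{\partial^2 y}{\partial S^2}\,(dS)^2 \\
   &= \frac{1}{S}\left(\mu S\,dt + \sigma S\,dW\right)
      - \frac{1}{2S^2}\,\sigma^2 S^2\,dt \\
   &= \left(\mu - \frac{\sigma^2}{2}\right)dt + \sigma\,dW
\end{aligned}
```

The correction $-\sigma^2/2$ comes entirely from the second-derivative term, which is why the drift of the log-price differs from the drift $\mu$ of the price itself.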

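Integration of Eq. 4 over $(t,\,t+\Delta t)$ gives us:

$$\Delta y(t)=\left(\mu-\frac{\sigma^2}{2}\right)\Delta t+\sigma\,\Delta W(t)$$

where $\Delta y(t)=y(t+\Delta t)-y(t)=\ln\left[\frac{S(t+\Delta t)}{S(t)}\right]$ represents the logarithmic increment of stock price data (known as the log-return), $\Delta t$ is the length of the time interval between two consecutive trading periods, and $\Delta W(t)=\eta\sqrt{\Delta t}$ with $\eta\sim N(0,1)$. Therefore, the following can be written:

$$\Delta y(t)=\left(\mu-\frac{\sigma^2}{2}\right)\Delta t+\sigma\eta\sqrt{\Delta t} \tag{5}$$

In turn, the stock price can be determined from Eq. 5 as:

$$S(t+\Delta t)=S(t)\,\exp\left[\left(\mu-\frac{\sigma^2}{2}\right)\Delta t+\sigma\eta\sqrt{\Delta t}\right] \tag{6}$$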

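Eq. 6 enables one to simulate the possible stock price trajectories with time step $\Delta t$ through the Black-Scholes model. For this purpose, one must first find the parameters $\mu$ and $\sigma^2$ from historical log-returns data based on the following relations:

$$\begin{aligned}
M_1 &= \langle \Delta y(t)\rangle = \left(\mu-\frac{\sigma^2}{2}\right)\Delta t \\
M_2 &= \left\langle \left(\Delta y(t)-\langle \Delta y(t)\rangle\right)^2 \right\rangle = \sigma^2\Delta t
\end{aligned} \tag{7}$$

where $\langle\cdot\rangle$ denotes averaging over the data, so that $M_1$ and $M_2$ in Eq. 7 are the mean and variance of the historical log-returns data, respectively. Having $M_1$ and $M_2$, first $\sigma^2$ is obtained:

$$\sigma^2=\frac{1}{\Delta t}M_2$$

Once $\sigma^2$ is identified, the parameter $\mu$ is obtained from the first moment $M_1$.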
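The simulation and estimation steps of Eqs 5–7 can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the authors' code; the function names and the sample size are assumptions made for the example.

```python
import math
import random


def simulate_bs_log_returns(mu, sigma, dt, n, seed=0):
    """Draw n log-returns from Eq. 5: (mu - sigma^2/2) dt + sigma * eta * sqrt(dt)."""
    rng = random.Random(seed)
    drift = (mu - 0.5 * sigma**2) * dt
    scale = sigma * math.sqrt(dt)
    return [drift + scale * rng.gauss(0.0, 1.0) for _ in range(n)]


def estimate_bs_parameters(log_returns, dt):
    """Invert Eq. 7: sigma^2 = M2 / dt, then mu = M1 / dt + sigma^2 / 2."""
    n = len(log_returns)
    m1 = sum(log_returns) / n
    m2 = sum((r - m1) ** 2 for r in log_returns) / n
    sigma2 = m2 / dt
    mu = m1 / dt + 0.5 * sigma2
    return mu, sigma2


def prices_from_log_returns(s0, log_returns):
    """Apply Eq. 6 step by step: S(t + dt) = S(t) * exp(log-return)."""
    prices = [s0]
    for r in log_returns:
        prices.append(prices[-1] * math.exp(r))
    return prices


# Illustrative run (sample size chosen for speed, not taken from the article).
returns = simulate_bs_log_returns(mu=1.5, sigma=1.0, dt=0.004, n=100_000, seed=1)
mu_hat, sigma2_hat = estimate_bs_parameters(returns, dt=0.004)
print(f"mu ~ {mu_hat:.3f}, sigma^2 ~ {sigma2_hat:.4f}")
```

Note that the drift estimate is much noisier than the variance estimate, because its standard error scales with $\sigma/\sqrt{N\Delta t}$ rather than with $1/\sqrt{N}$.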

The main disadvantage of the Black-Scholes model is its constant-volatility assumption: it is widely believed and empirically confirmed that the volatility of stock prices is not constant but varies over time [13–15]. This shortcoming and the unsatisfactory performance of the Black-Scholes model led researchers to look for better alternatives and to improve the classic Black-Scholes model in two directions:

1- Adding a term with jumpy behavior to the Black-Scholes equation to allow for random jumps in the stock price process (jump-diffusion models, e.g., the Merton model [16]).

2- Considering stochastic volatility for the stock price (e.g., the Heston model [17] or the GARCH model [18]).

Here we only focus on the first option and describe the jump-diffusion model. Merton in [16] presented one of the first models in which jump processes were used in financial modeling. To take into account price discontinuities, Merton added a Poisson jump process to the log-price while preserving the independence and stationarity of log-returns. A jump-diffusion equation is generally written as:

$$dy(t)=\bar{\mu}\,dt+\sigma\,dW(t)+\xi\,dJ(t) \tag{8}$$

where $\bar{\mu}$ and $\sigma$ are the drift and diffusion coefficients, $W(t)$ is a Wiener process, and $J(t)$ is a Poisson jump process with rate $\lambda$ and distributed size $\xi$, which Merton assumed follows a Gaussian distribution with zero mean and variance $\sigma_\xi^2$, i.e., $\xi\sim N(0,\sigma_\xi^2)$. It was also assumed that the Poisson process, the jump size $\xi$ and the Wiener process in Eq. 8 are three independent processes.

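Integration of Eq. 8 over $(t,\,t+\Delta t)$ leads to:

$$\Delta y(t)=\bar{\mu}\Delta t+\sigma\,\Delta W(t)+\xi\,\Delta J(t)$$

Here $\Delta J(t)=J(t+\Delta t)-J(t)$ follows a Poisson distribution with mean $\lambda\Delta t$, and $\Delta W(t)=\eta\sqrt{\Delta t}$ with $\eta\sim N(0,1)$. Therefore, the following can be written:

$$\Delta y(t)=\bar{\mu}\Delta t+\sigma\eta\sqrt{\Delta t}+\xi\,\Delta J(t) \tag{9}$$

In turn, the stock price can be determined from Eq. 9 as:

$$S(t+\Delta t)=S(t)\,\exp\left(\bar{\mu}\Delta t+\sigma\eta\sqrt{\Delta t}+\xi\,\Delta J(t)\right) \tag{10}$$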

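Eq. 10 enables one to simulate the possible stock price trajectories with time step $\Delta t$ via the jump-diffusion model. For this purpose, one must find the parameters $\bar{\mu}$, $\sigma^2$, $\sigma_\xi^2$ and $\lambda\Delta t$ from the historical log-returns data based on the following relations:

$$\begin{aligned}
M_1 &= \langle \Delta y(t)\rangle = \bar{\mu}\Delta t \\
M_2 &= \left\langle (\Delta y(t)-\bar{\mu}\Delta t)^2 \right\rangle = \sigma^2\Delta t+\sigma_\xi^2\,\lambda\Delta t \\
M_4 &= \left\langle (\Delta y(t)-\bar{\mu}\Delta t)^4 \right\rangle = 3\sigma_\xi^4\,\lambda\Delta t \\
M_6 &= \left\langle (\Delta y(t)-\bar{\mu}\Delta t)^6 \right\rangle = 15\sigma_\xi^6\,\lambda\Delta t
\end{aligned} \tag{11}$$

where $M_1$, $M_2$, $M_4$ and $M_6$ are the statistical moments of the historical log-returns data. Having these moments, first the jump characteristics $\sigma_\xi^2$ and $\lambda\Delta t$ are obtained from Eq. 11:

$$\sigma_\xi^2=\frac{M_6}{5M_4},\qquad \lambda\Delta t=\frac{M_4}{3\sigma_\xi^4}$$

Once $\sigma_\xi^2$ and $\lambda\Delta t$ are identified, the parameter $\sigma^2$ is obtained from the second moment $M_2$ and the parameter $\bar{\mu}$ is obtained from the first moment $M_1$.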
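The inversion of Eq. 11 is purely algebraic and can be sketched as follows. The function names and the returned dictionary keys are assumptions of this example; with noisy data the recovered $\sigma^2$ can come out negative, which the sketch does not guard against.

```python
def central_moments(log_returns, orders=(2, 4, 6)):
    """Return the mean M1 and the central moments M_k for the requested orders."""
    n = len(log_returns)
    m1 = sum(log_returns) / n
    return m1, {k: sum((r - m1) ** k for r in log_returns) / n for k in orders}


def estimate_jump_diffusion(m1, m2, m4, m6, dt):
    """Invert the moment relations of Eq. 11 for the jump-diffusion model."""
    sigma_xi2 = m6 / (5.0 * m4)              # jump amplitude from M6 / (5 M4)
    lam_dt = m4 / (3.0 * sigma_xi2**2)       # jump probability per step
    sigma2 = (m2 - sigma_xi2 * lam_dt) / dt  # diffusion part left over in M2
    mu_bar = m1 / dt                         # drift from the mean
    return {"mu_bar": mu_bar, "sigma2": sigma2,
            "sigma_xi2": sigma_xi2, "lambda_dt": lam_dt}


# Usage with moments built directly from Eq. 11 (illustrative values).
dt = 1e-4
example = estimate_jump_diffusion(
    m1=1.0 * dt,
    m2=4.0 * dt + 0.1 * 0.3,
    m4=3 * 0.1**2 * 0.3,
    m6=15 * 0.1**3 * 0.3,
    dt=dt,
)
print(example)
```

For real data, `central_moments` would supply the inputs; the sixth moment is sensitive to outliers, so long time series are needed for stable estimates.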

The main shortcoming of the jump-diffusion model is that the jumps reconstructed by the model have larger amplitudes than the jumps in the actual data. Let us demonstrate how this problem occurs. Suppose we want to model the daily prices of a stock via the jump-diffusion model. As mentioned, first we need to determine the parameters $\bar{\mu}$, $\sigma^2$, $\sigma_\xi^2$ and $\lambda\Delta t$ from the historical log-returns data. Since in the Poisson jump process the probability of more than one jump occurring in a small time interval $\Delta t$ is negligible (it is of order $(\lambda\Delta t)^2$), $\Delta J$ in Eq. 9 takes only the values 1 (one jump) or 0 (no jump) with probabilities $\lambda\Delta t$ and $1-\lambda\Delta t$, respectively [19]. Given these probabilities, the data points can be reconstructed by one of the following sub-equations:

If $\Delta J=0$, meaning that no jump occurs during that $\Delta t$, then the data point is reconstructed by:

$$\Delta y(t)=\bar{\mu}\Delta t+\sigma\eta\sqrt{\Delta t} \tag{12}$$

If $\Delta J=1$, meaning that a jump occurs during that $\Delta t$, then the data point is reconstructed by:

$$\Delta y(t)=\bar{\mu}\Delta t+\sigma\eta\sqrt{\Delta t}+\xi \tag{13}$$

As can be seen from Eqs 12, 13, the diffusion term $\sigma\eta\sqrt{\Delta t}$ appears in both equations and is involved in the reconstruction of all data points, even jumpy data points. Since the random variables $\sigma\eta\sqrt{\Delta t}$ and $\xi$ are two independent zero-mean normally distributed variables with variances $\sigma^2\Delta t$ and $\sigma_\xi^2$, respectively, their sum in Eq. 13 is also a normally distributed variable, i.e., $(\sigma\eta\sqrt{\Delta t}+\xi)\sim N(0,\sigma_\xi^2+\sigma^2\Delta t)$. The variance of this distribution, $\sigma^2\Delta t+\sigma_\xi^2$, represents the amplitude of the reconstructed jumps, which is larger than the amplitude of the jumps in the historical data ($\sigma_\xi^2$) that was originally obtained. Obviously, if $\sigma^2\Delta t\ll\sigma_\xi^2$, so that $\sigma^2\Delta t$ can be neglected compared to $\sigma_\xi^2$, then the data reconstructed by the jump-diffusion model will be similar to the original data in the statistical sense; otherwise the model will fail. This shortcoming led us to modify the jump-diffusion equation in such a way that, if necessary, we can discard the contribution of the diffusion term in Eq. 9 so that it does not interfere with the reconstruction of the jumps. For this purpose, we replace the diffusion term in Eq. 9 by a term with jumpy behavior. This idea is supported by the fact that when data are sampled at discrete times, they appear as a sequence of discontinuous jump events, even if they have been sampled from a continuous diffusion process [20]. This is precisely why distinguishing between discontinuities due to discrete sampling of a continuous process and real discontinuities in a jump-diffusion process is itself a challenging task [21]. Based on the above considerations, we modify Eq. 9 by considering two jump processes with different distributed sizes $\xi_1$ and $\xi_2$ and different rates $\lambda_1\Delta t$ and $\lambda_2\Delta t$ as follows:

$$\Delta y(t)=\bar{\mu}\Delta t+\xi_1\,\Delta J_1(t)+\xi_2\,\Delta J_2(t)$$

where $\xi_1\Delta J_1(t)$ has replaced the diffusion term in Eq. 9, and $\xi_2\Delta J_2(t)$ has the same role as $\xi\Delta J$. Each of $\Delta J_1(t)$ and $\Delta J_2(t)$ takes the values 1 and 0, but to avoid their simultaneous occurrence, we stipulate that if $\Delta J_1(t)=1$, then $\Delta J_2(t)=0$, and vice versa. Applying this condition causes each data point to be reconstructed by only one of the jump events. The procedure is as follows:

If $\Delta J_1(t)=1$ and $\Delta J_2(t)=0$, then the data point is reconstructed by:

$$\Delta y(t)=\bar{\mu}\Delta t+\xi_1$$

If $\Delta J_1(t)=0$ and $\Delta J_2(t)=1$, then the data point is reconstructed by:

$$\Delta y(t)=\bar{\mu}\Delta t+\xi_2$$

With this modification, the shortcoming of the jump-diffusion model can be resolved. In this model, we assume that $\xi_1$ and $\xi_2$ are two zero-mean Gaussian random variables with variances $\sigma_{\xi_1}^2$ and $\sigma_{\xi_2}^2$, i.e., $\xi_1\sim N(0,\sigma_{\xi_1}^2)$ and $\xi_2\sim N(0,\sigma_{\xi_2}^2)$. These two random variables produce fluctuations that are additively superimposed on the trajectory generated by the deterministic dynamics. In the following, we will describe the model in detail and demonstrate that all the unknown parameters of this model can be derived directly from the historical stock price.
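The amplitude inflation attributed to Eq. 13 is easy to check numerically. The sketch below samples the random part of a jump step in the jump-diffusion model and measures its variance; the function name and the parameter values are illustrative assumptions, chosen so that the diffusion variance is not negligible next to the jump variance.

```python
import math
import random


def jump_step_variance(sigma, dt, sigma_xi2, n, seed=0):
    """Sample variance of sigma*eta*sqrt(dt) + xi, the random part of Eq. 13."""
    rng = random.Random(seed)
    s_diff = sigma * math.sqrt(dt)   # standard deviation of the diffusion term
    s_jump = math.sqrt(sigma_xi2)    # standard deviation of the jump size xi
    draws = [s_diff * rng.gauss(0.0, 1.0) + s_jump * rng.gauss(0.0, 1.0)
             for _ in range(n)]
    mean = sum(draws) / n
    return sum((d - mean) ** 2 for d in draws) / n


# sigma^2 * dt = 0.0004 and sigma_xi^2 = 0.001, so the sum should be near 0.0014.
v = jump_step_variance(sigma=2.0, dt=1e-4, sigma_xi2=1e-3, n=100_000, seed=3)
print(f"variance of a reconstructed jump step ~ {v:.5f} (jump amplitude alone: 0.00100)")
```

In the two-jump formulation, a jump step contains only $\xi_2$, so its variance stays at the estimated jump amplitude.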

2 Model description

In [22] we introduced the following general dynamical stochastic equation, which includes a deterministic drift term ($\bar{\mu}\,dt$) and a combination of stochastic terms with jumpy behavior ($\xi_i\,dJ_i(t)$):

$$dy(t)=\bar{\mu}\,dt+\sum_{i=1}^{N}\xi_i\,dJ_i(t) \tag{14}$$

where $\bar{\mu}\,dt$ indicates the deterministic part of the process and $J_1(t), J_2(t), \dots$ are Poisson jump processes. The jumps have rates $\lambda_1,\lambda_2,\dots$ and sizes $\xi_1,\xi_2,\dots$, which we assume have zero-mean Gaussian distributions with variances $\sigma_{\xi_1}^2,\sigma_{\xi_2}^2,\dots$, respectively. In this article, we intend to use this equation specifically to simulate asset prices. For this purpose, we first start with the simplest form of Eq. 14, which includes a drift term and only one jump process. We will demonstrate that such a jump-drift equation is able to describe the discrete-time evolution of price time series in Black-Scholes markets. Since real markets are usually jump-diffusion markets, in the second step we extend the model by considering two jump processes with different rates ($\lambda_1$, $\lambda_2$) and different distributed sizes ($\xi_1$, $\xi_2$) and use it to model prices in actual markets. In each stage, we will demonstrate that all unknown parameters involved in the model can be derived non-parametrically from the historical price data. It should be noted that, due to the small number of data points in the price time series or the lack of diversity in the distributed sizes of fluctuations, we will model prices by considering only two jump processes. However, depending on the number of available data points and the variety of amplitudes, one can extend the proposed model.

2.1 Jump-drift modeling

In the first step, we consider Eq. 14 in its simplest form, including a drift term and a stochastic term with jumpy behavior, and show that it can be used to reconstruct price data of diffusion markets (e.g., Black-Scholes markets). The general form of a jump-drift equation is as follows:

$$dy(t)=\bar{\mu}\,dt+\xi\,dJ(t) \tag{15}$$

where $y(t)=\ln(S(t))$ is the log-price, $\bar{\mu}\,dt$ denotes the deterministic drift part of the dynamics, and $J(t)$ is a Poisson jump process characterized by the rate $\lambda$ and the size $\xi$. We assume that $\xi$ is a random variable with a zero-mean Gaussian distribution, i.e., $\xi\sim N(0,\sigma_\xi^2)$. We also assume that the Poisson-distributed jumps $dJ(t)$ and the jump size $\xi$ are two independent processes.

Integration of Eq. 15 over $(t,\,t+\Delta t)$ gives us:

$$\Delta y(t)=\bar{\mu}\Delta t+\xi\,\Delta J(t) \tag{16}$$

where $\Delta y(t)=y(t+\Delta t)-y(t)=\ln\left[\frac{S(t+\Delta t)}{S(t)}\right]$ is the log-return, $\Delta t$ is the length of the time interval between two consecutive points, and $\Delta J(t)=J(t+\Delta t)-J(t)$ follows a Poisson distribution with mean $\lambda\Delta t$.

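In turn, the stock price can be determined as:

$$S(t+\Delta t)=S(t)\,\exp\left(\bar{\mu}\Delta t+\xi\,\Delta J(t)\right)$$

To reconstruct price data with the above relation, we must find three parameters: $\bar{\mu}$, $\sigma_\xi^2$ and $\lambda\Delta t$. We now show that all these parameters can be estimated directly from the log-return time series $\Delta y(t)$. For this purpose, we derive the statistical moments of $\Delta y(t)$ from Eq. 16 (note that $\Delta J(t)$ and $\xi$ are two independent processes):

$$\begin{aligned}
M_1 &= \langle \Delta y(t)\rangle = \langle \bar{\mu}\Delta t\rangle+\langle \xi\rangle\langle \Delta J(t)\rangle \\
M_2 &= \left\langle (\Delta y(t)-\bar{\mu}\Delta t)^2 \right\rangle = \langle \xi^2\rangle\left\langle (\Delta J(t))^2\right\rangle \\
M_4 &= \left\langle (\Delta y(t)-\bar{\mu}\Delta t)^4 \right\rangle = \langle \xi^4\rangle\left\langle (\Delta J(t))^4\right\rangle
\end{aligned}$$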

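where $\langle\cdot\rangle$ denotes averaging over the data, so that $M_1$ is the mean of the log-returns, and $M_2$ and $M_4$ are the second and fourth statistical moments of the log-returns about the mean. For small $\Delta t$, all of the statistical moments of the jumps are proportional to $\lambda\Delta t$, i.e., $\langle (\Delta J(t))^m\rangle=\lambda\Delta t$ [19, 20]. In addition, for a zero-mean Gaussian random variable $\xi$ with variance $\sigma_\xi^2$, all of the even-order statistical moments are given by $\langle \xi^{2l}\rangle=\frac{(2l)!}{2^l\,l!}\langle \xi^2\rangle^l$. The above relations therefore become (note that $\langle\xi\rangle=0$ and $\langle\xi^2\rangle=\sigma_\xi^2$):

$$\begin{aligned}
M_1 &= \bar{\mu}\Delta t \\
M_2 &= \sigma_\xi^2\,\lambda\Delta t \\
M_4 &= 3\sigma_\xi^4\,\lambda\Delta t
\end{aligned} \tag{17}$$

According to the first relation in Eq. 17, the mean of the log-returns ($M_1$) gives us the drift parameter $\bar{\mu}$, and the second- and fourth-order moments ($M_2$, $M_4$) identify the jump characteristics, namely:

$$\begin{aligned}
\bar{\mu} &= \frac{1}{\Delta t}M_1 \\
\sigma_\xi^2 &= \frac{M_4}{3M_2} \\
\lambda\Delta t &= \frac{M_2}{\sigma_\xi^2}
\end{aligned} \tag{18}$$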
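Eq. 18 translates directly into code. The following sketch computes the three jump-drift parameters from a list of log-returns; the function name is an assumption of this example, and the Gaussian test data are generated only to show that $\lambda\Delta t$ comes out close to 1 for diffusive data.

```python
import random


def estimate_jump_drift(log_returns, dt):
    """Estimate (mu_bar, sigma_xi^2, lambda*dt) from log-returns via Eq. 18."""
    n = len(log_returns)
    m1 = sum(log_returns) / n
    m2 = sum((r - m1) ** 2 for r in log_returns) / n
    m4 = sum((r - m1) ** 4 for r in log_returns) / n
    mu_bar = m1 / dt
    sigma_xi2 = m4 / (3.0 * m2)   # jump amplitude
    lam_dt = m2 / sigma_xi2       # jump probability per time step
    return mu_bar, sigma_xi2, lam_dt


# Purely Gaussian log-returns (no jumps): lambda*dt should be close to 1.
rng = random.Random(7)
gaussian_returns = [rng.gauss(0.0, 0.05) for _ in range(100_000)]
print(estimate_jump_drift(gaussian_returns, dt=0.004))
```

Because the estimate of $\lambda\Delta t$ is a ratio of sample moments, it can slightly exceed 1 on finite data; any code that later uses it as a probability has to handle that case.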

We claim that the proposed jump-drift dynamics enable us to model diffusion processes such as the Black-Scholes process. We will check the validity of this claim by reconstructing a Black-Scholes process via the new dynamics using the parameters determined from Eq. 18. Before that, let us provide the following two criteria for evaluating the reconstructed process:

1) We know from Wick’s theorem that for the time series of the Black-Scholes process, the statistical moments of the data satisfy the relation $\frac{M_4}{3M_2^2}\approx 1$, which follows from the fact that the short-time propagator of the Black-Scholes dynamics is a Gaussian distribution. Therefore, if the proposed jump-drift dynamics are capable of reconstructing a time series that is statistically similar to the original Black-Scholes time series, then the statistical moments of the reconstructed data should satisfy Wick’s relation, i.e., $\left(\frac{M_4}{3M_2^2}\right)_{\mathrm{rec}}\approx 1$.

2) In continuation of the previous point, we find the ratio $\frac{M_4}{3M_2^2}$ from relations (17):

$$\frac{M_4}{3M_2^2}=\frac{3\sigma_\xi^4\,\lambda\Delta t}{3\left(\sigma_\xi^2\,\lambda\Delta t\right)^2}=\frac{1}{\lambda\Delta t}$$

By comparing this relation with Wick’s relation, i.e., $\frac{M_4}{3M_2^2}\approx 1$, we expect that $\lambda\Delta t\approx 1$. On the other hand, if $\lambda\Delta t=1$, then the second moment in Eq. 17 becomes:

$$M_2=\sigma_\xi^2$$

whereas the second moment in the original Black-Scholes process is $M_2=\sigma^2\Delta t$ (Eq. 7). Therefore, it can be concluded that if the new model works correctly, the estimate of the jump amplitude ($\sigma_\xi^2$) should be equal to the variance of the original data ($\sigma^2\Delta t$), namely,

$$\sigma_\xi^2=\sigma^2\Delta t$$

In the following, we reconstruct a Black-Scholes process with known drift and volatility parameters via the jump-drift equation, and then evaluate the reconstructed data.

Example 1. First, we generate a synthetic time series $\Delta y(t)$ with $10^6$ data points via Black-Scholes dynamics (Eq. 5), using the preset parameters $\mu=1.5$ and $\sigma=1$ with $\Delta t=0.004$. In Figure 1, we show the trajectory of 1,500 of the $10^6$ generated data points so that the fluctuations can be clearly seen (blue graph). By obtaining the statistical moments $M_n$ for $n=1,2,4$ from the generated data and substituting them in relations (18), we determine the parameters required for the new model. The results are as follows:


Figure 1. Upper panel: A sample path of synthetic log-returns generated via Black-Scholes dynamics using the preset parameters $\mu=1.5$, $\sigma=1$ and $\Delta t=0.004$. Lower panel: A sample path of log-returns reconstructed via jump-drift dynamics.

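Statistical moments determined from generated data:

$$M_1=0.004,\quad M_2=0.004,\quad M_4=4.8121\times 10^{-5},\quad \frac{M_4}{3M_2^2}=1.002$$

Required parameters for new modeling:

$$\begin{aligned}
\bar{\mu} &= \frac{1}{\Delta t}M_1=1 \\
\sigma_\xi^2 &= \frac{M_4}{3M_2}=0.00401 \quad (\text{in agreement with } \sigma^2\Delta t=0.004) \\
\lambda\Delta t &= \frac{M_2}{\sigma_\xi^2}=0.9971
\end{aligned}$$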

In the second step, we reconstruct a time series $\Delta y(t)$ via the jump-drift equation (Eq. 16) with $10^6$ data points. For comparison with the original data, a sample path including 1,500 reconstructed data points is shown in Figure 1 (red graph). Finally, to ensure that the two time series (generated and reconstructed) are statistically equivalent, we obtain the statistical moments of the reconstructed data and check that $\left(\frac{M_4}{3M_2^2}\right)_{\mathrm{rec}}\approx 1$ holds. The results are as follows:

Statistical moments of reconstructed data:

$$M_1=0.004,\quad M_2=0.004,\quad M_4=4.8172\times 10^{-5},\quad \left(\frac{M_4}{3M_2^2}\right)_{\mathrm{rec}}=0.99851$$

As can be seen, the reconstructed data are statistically similar to original data with high accuracy, and there is a very good agreement between these results and the theory.
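The procedure of Example 1 can be sketched end to end as follows. This is an illustrative reimplementation rather than the authors' code: it uses a smaller sample ($10^5$ points instead of $10^6$) to keep the run short, treats $\Delta J$ as a 0/1 variable, and clips the estimated $\lambda\Delta t$ into $[0,1]$ before using it as a probability; these choices and all function names are assumptions.

```python
import math
import random


def moments_1_2_4(xs):
    """Return the mean M1 and the central moments M2 and M4 of a sample."""
    n = len(xs)
    m1 = sum(xs) / n
    m2 = sum((x - m1) ** 2 for x in xs) / n
    m4 = sum((x - m1) ** 4 for x in xs) / n
    return m1, m2, m4


def reconstruct_jump_drift(mu_bar, sigma_xi2, lam_dt, dt, n, rng):
    """Draw n log-returns from Eq. 16 with at most one jump per time step."""
    p_jump = min(max(lam_dt, 0.0), 1.0)  # assumption: clip the estimate into [0, 1]
    scale = math.sqrt(sigma_xi2)
    out = []
    for _ in range(n):
        jump = 1 if rng.random() < p_jump else 0
        out.append(mu_bar * dt + jump * rng.gauss(0.0, scale))
    return out


def run_example(mu=1.5, sigma=1.0, dt=0.004, n=100_000, seed=42):
    rng = random.Random(seed)
    # Step 1: synthetic Black-Scholes log-returns (Eq. 5).
    drift, scale = (mu - 0.5 * sigma**2) * dt, sigma * math.sqrt(dt)
    original = [drift + scale * rng.gauss(0.0, 1.0) for _ in range(n)]
    # Step 2: jump-drift parameters from Eq. 18.
    m1, m2, m4 = moments_1_2_4(original)
    mu_bar, sigma_xi2 = m1 / dt, m4 / (3.0 * m2)
    lam_dt = m2 / sigma_xi2
    # Step 3: reconstruct via Eq. 16 and compare the ratios M4 / (3 M2^2).
    rebuilt = reconstruct_jump_drift(mu_bar, sigma_xi2, lam_dt, dt, n, rng)
    _, r2, r4 = moments_1_2_4(rebuilt)
    return {"mu_bar": mu_bar, "sigma_xi2": sigma_xi2, "lambda_dt": lam_dt,
            "wick_original": m4 / (3.0 * m2**2),
            "wick_rebuilt": r4 / (3.0 * r2**2)}


result = run_example()
print(result)
```

With these settings both ratios should come out close to 1 and `sigma_xi2` close to $\sigma^2\Delta t=0.004$, up to sampling noise.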

2.2 Jump-jump-drift modeling

In the previous section we modeled a continuous diffusion process through the jump-drift equation. Since real markets are usually jump-diffusion markets, generalizing the jump-drift model to a jump-jump-drift model improves the characterization of real market dynamics beyond a continuous process. The general form of a jump-jump-drift equation is as follows:

$$dy(t)=\bar{\mu}\,dt+\xi_1\,dJ_1(t)+\xi_2\,dJ_2(t) \tag{19}$$

where $\bar{\mu}\,dt$ indicates the deterministic part of the process and $J_1(t)$ and $J_2(t)$ are Poisson jump processes. The jumps have rates $\lambda_1$ and $\lambda_2$, and sizes $\xi_1$ and $\xi_2$, which we assume have zero-mean Gaussian distributions, i.e., $\xi_1\sim N(0,\sigma_{\xi_1}^2)$ and $\xi_2\sim N(0,\sigma_{\xi_2}^2)$. We call $\sigma_{\xi_1}^2$ and $\sigma_{\xi_2}^2$ the jump amplitudes.

Integration of Eq. 19 over $(t,\,t+\Delta t)$ gives us:

$$\Delta y(t)=\bar{\mu}\Delta t+\xi_1\,\Delta J_1(t)+\xi_2\,\Delta J_2(t) \tag{20}$$

Furthermore, the stock price can be determined from Eq. 20 as:

$$S(t+\Delta t)=S(t)\,\exp\left(\bar{\mu}\Delta t+\xi_1\,\Delta J_1(t)+\xi_2\,\Delta J_2(t)\right) \tag{21}$$

In modeling the stock price via Eq. 21, we also assume that the two jumps do not occur simultaneously, which means that in the time interval $(t,\,t+\Delta t]$, if, for example, $\Delta J_1(t)$ occurs and takes the value 1, then $\Delta J_2(t)$ does not occur and its value is 0, and vice versa. Let $\lambda_1\Delta t$ and $\lambda_2\Delta t$ be the probabilities of occurrence of $\Delta J_1(t)$ and $\Delta J_2(t)$ in a small time step $\Delta t$. If we assume that exactly one of the jumps ($\Delta J_1(t)$ or $\Delta J_2(t)$) occurs in each time step, then we can write:

$$\lambda_1\Delta t+\lambda_2\Delta t=1 \tag{22}$$

According to this condition, we can discard one of the jump events at each time step, and reconstruct the corresponding data point by another jump event.
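Under the condition of Eq. 22, each log-return is drawn from exactly one of the two Gaussian jump sizes. A minimal sketch of this reconstruction step (Eq. 20) is given below; the function and argument names are assumptions of this example.

```python
import math
import random


def reconstruct_jump_jump_drift(mu_bar, sigma_xi1_2, sigma_xi2_2, lam2_dt, dt, n, seed=0):
    """Draw n log-returns from Eq. 20 with exactly one jump per step (Eq. 22)."""
    rng = random.Random(seed)
    s1 = math.sqrt(sigma_xi1_2)
    s2 = math.sqrt(sigma_xi2_2)
    out = []
    for _ in range(n):
        # With probability lam2_dt the second jump fires; otherwise the first does.
        scale = s2 if rng.random() < lam2_dt else s1
        out.append(mu_bar * dt + rng.gauss(0.0, scale))
    return out


# Illustrative parameters: small-amplitude jumps 70% of the time, large ones 30%.
sample = reconstruct_jump_jump_drift(
    mu_bar=1.0, sigma_xi1_2=4e-4, sigma_xi2_2=0.1, lam2_dt=0.3, dt=1e-4, n=10, seed=5
)
print(sample)
```

Written this way, the model is a two-component zero-mean Gaussian mixture shifted by the drift $\bar{\mu}\Delta t$, with mixture weights $\lambda_1\Delta t$ and $\lambda_2\Delta t$.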

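To model the stock prices via Eq. 20, we must find the five unknown parameters $\bar{\mu}$, $\lambda_1\Delta t$, $\lambda_2\Delta t$, $\sigma_{\xi_1}^2$ and $\sigma_{\xi_2}^2$. We now show that all these parameters can be estimated directly from the log-returns time series $\Delta y(t)$. For this purpose, we derive the statistical moments of $\Delta y(t)$ from Eq. 20 (note that $\xi_1$ and $\xi_2$ are two Gaussian random variables independent of the jumps, and $\Delta J_1(t)$ and $\Delta J_2(t)$ do not occur simultaneously):

$$\begin{aligned}
M_1 &= \langle \Delta y(t)\rangle = \langle \bar{\mu}\Delta t\rangle+\langle \xi_1\rangle\langle \Delta J_1(t)\rangle+\langle \xi_2\rangle\langle \Delta J_2(t)\rangle \\
M_2 &= \left\langle (\Delta y(t)-\bar{\mu}\Delta t)^2 \right\rangle = \langle \xi_1^2\rangle\left\langle (\Delta J_1(t))^2\right\rangle+\langle \xi_2^2\rangle\left\langle (\Delta J_2(t))^2\right\rangle \\
M_4 &= \left\langle (\Delta y(t)-\bar{\mu}\Delta t)^4 \right\rangle = \langle \xi_1^4\rangle\left\langle (\Delta J_1(t))^4\right\rangle+\langle \xi_2^4\rangle\left\langle (\Delta J_2(t))^4\right\rangle \\
M_6 &= \left\langle (\Delta y(t)-\bar{\mu}\Delta t)^6 \right\rangle = \langle \xi_1^6\rangle\left\langle (\Delta J_1(t))^6\right\rangle+\langle \xi_2^6\rangle\left\langle (\Delta J_2(t))^6\right\rangle
\end{aligned}$$

By using the relations $\langle (\Delta J_1(t))^m\rangle=\lambda_1\Delta t$ and $\langle (\Delta J_2(t))^m\rangle=\lambda_2\Delta t$ for the statistical moments of the jump processes, and the relations $\langle \xi_1^{2l}\rangle=\frac{(2l)!}{2^l\,l!}\langle \xi_1^2\rangle^l$ and $\langle \xi_2^{2l}\rangle=\frac{(2l)!}{2^l\,l!}\langle \xi_2^2\rangle^l$ for the even-order statistical moments of the zero-mean Gaussian random variables $\xi_1$ and $\xi_2$ with variances $\sigma_{\xi_1}^2$ and $\sigma_{\xi_2}^2$, we will have (note that $\langle\xi_1\rangle=\langle\xi_2\rangle=0$, $\langle\xi_1^2\rangle=\sigma_{\xi_1}^2$, and $\langle\xi_2^2\rangle=\sigma_{\xi_2}^2$):

$$\begin{aligned}
M_1 &= \bar{\mu}\Delta t \\
M_2 &= \sigma_{\xi_1}^2\,\lambda_1\Delta t+\sigma_{\xi_2}^2\,\lambda_2\Delta t \\
M_4 &= 3\sigma_{\xi_1}^4\,\lambda_1\Delta t+3\sigma_{\xi_2}^4\,\lambda_2\Delta t \\
M_6 &= 15\sigma_{\xi_1}^6\,\lambda_1\Delta t+15\sigma_{\xi_2}^6\,\lambda_2\Delta t
\end{aligned}$$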

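To find the five unknowns $\bar{\mu}$, $\lambda_1\Delta t$, $\lambda_2\Delta t$, $\sigma_{\xi_1}^2$ and $\sigma_{\xi_2}^2$, we need to add one more equation to the above relations. For this purpose, we use Eq. 22 in the form $\lambda_1\Delta t=1-\lambda_2\Delta t$ to reduce the number of unknowns, so we will have:

$$\begin{aligned}
M_1 &= \bar{\mu}\Delta t \\
M_2 &= \sigma_{\xi_1}^2+\left(\sigma_{\xi_2}^2-\sigma_{\xi_1}^2\right)\lambda_2\Delta t \\
M_4 &= 3\sigma_{\xi_1}^4+3\left(\sigma_{\xi_2}^4-\sigma_{\xi_1}^4\right)\lambda_2\Delta t \\
M_6 &= 15\sigma_{\xi_1}^6+15\left(\sigma_{\xi_2}^6-\sigma_{\xi_1}^6\right)\lambda_2\Delta t
\end{aligned} \tag{23}$$

Having the statistical moments $M_1$, $M_2$, $M_4$ and $M_6$ from the log-return time series and solving the above system of equations numerically, the four unknown parameters $\bar{\mu}$, $\lambda_2\Delta t$, $\sigma_{\xi_1}^2$ and $\sigma_{\xi_2}^2$ are determined. Once $\lambda_2\Delta t$ is identified, $\lambda_1\Delta t$ is obtained from Eq. 22.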
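The text solves Eq. 23 numerically. As an alternative sketch (not the procedure used in the article), the same system also admits a closed-form solution: writing $a=\sigma_{\xi_1}^2$, $b=\sigma_{\xi_2}^2$ and $p=\lambda_2\Delta t$, the quantities $M_2$, $M_4/3$ and $M_6/15$ are the first three moments of a two-point distribution on $\{a,b\}$ with weights $(1-p,\,p)$, so $a$ and $b$ are the roots of a quadratic. The function name and the error handling are assumptions of this example.

```python
import math


def solve_jump_jump_drift(m1, m2, m4, m6, dt):
    """Closed-form solution of Eq. 23 (alternative to a numerical solver)."""
    e1, e2, e3 = m2, m4 / 3.0, m6 / 15.0  # E[X], E[X^2], E[X^3] for X in {a, b}
    denom = e2 - e1 * e1                  # variance of X; zero for pure Gaussian data
    if denom <= 0.0:
        raise ValueError("moments are compatible with a single Gaussian component")
    s = (e3 - e1 * e2) / denom            # s = a + b
    q = s * e1 - e2                       # q = a * b
    disc = s * s - 4.0 * q
    if disc <= 0.0:
        raise ValueError("no pair of distinct real amplitudes fits these moments")
    root = math.sqrt(disc)
    a = 0.5 * (s - root)                  # smaller amplitude, sigma_xi1^2
    b = 0.5 * (s + root)                  # larger amplitude, sigma_xi2^2
    lam2_dt = (e1 - a) / (b - a)
    return {"mu_bar": m1 / dt, "sigma_xi1_2": a, "sigma_xi2_2": b,
            "lambda1_dt": 1.0 - lam2_dt, "lambda2_dt": lam2_dt}


# Usage with moments built from Eq. 23 for known parameters (illustrative values).
a0, b0, p0, dt0 = 4e-4, 0.1, 0.3, 1e-4
params = solve_jump_jump_drift(
    m1=1.0 * dt0,
    m2=(1 - p0) * a0 + p0 * b0,
    m4=3 * ((1 - p0) * a0**2 + p0 * b0**2),
    m6=15 * ((1 - p0) * a0**3 + p0 * b0**3),
    dt=dt0,
)
print(params)
```

With sample moments instead of exact ones, the solution can leave the admissible region (a negative amplitude or a probability outside $[0,1]$); the sketch does not check for this, and for nearly Gaussian data the denominator is close to zero, so the estimates of the second component become ill-conditioned.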

We claim that the proposed dynamics enable us to model time series with jump discontinuities more accurately than the classic jump-diffusion dynamics. We will check the validity of this claim by reconstructing a jump-diffusion process via the jump-jump-drift equation. Before that, let us support this claim by showing that the new relations in Eq. 23 are generalizations of the old jump-diffusion relations in Eq. 11. For this purpose, we consider the case in which $\sigma_{\xi_1}^2\ll\sigma_{\xi_2}^2$, so that $\sigma_{\xi_1}^2$ can be ignored compared to $\sigma_{\xi_2}^2$, and at the same time $\sigma_{\xi_1}^2$ is so small that $\sigma_{\xi_1}^4\approx\sigma_{\xi_1}^6\approx 0$. Under these special conditions, relations (23) can be written as follows:

$$\begin{aligned}
M_1 &= \bar{\mu}\Delta t \\
M_2 &= \sigma_{\xi_1}^2+\sigma_{\xi_2}^2\,\lambda_2\Delta t \\
M_4 &= 3\sigma_{\xi_2}^4\,\lambda_2\Delta t \\
M_6 &= 15\sigma_{\xi_2}^6\,\lambda_2\Delta t
\end{aligned}$$

As can be seen, these relations have the same form as the relations of the jump-diffusion model (Eq. 11): $\sigma_{\xi_1}^2$ has replaced $\sigma^2\Delta t$ and identifies the diffusion part, and $\sigma_{\xi_2}^2\lambda_2\Delta t$ has the same role as $\sigma_\xi^2\lambda\Delta t$. This means that under these special conditions ($\sigma_{\xi_1}^2\ll\sigma_{\xi_2}^2$ and $\sigma_{\xi_1}^4\approx\sigma_{\xi_1}^6\approx 0$), the new model works like the jump-diffusion model and the parameters obtained from the data are the same in both models. If the data fluctuations are such that these conditions are not satisfied, however, the proposed model is expected to lead to more accurate estimates than the jump-diffusion model. By analyzing stock prices, we found that although the release of exciting news in the market causes sudden jumps in log-returns, the amplitude of these jumps is not much larger than the amplitude of the fluctuations on normal days. Therefore, the new model appears to perform better for modeling and forecasting prices.

In the following, to demonstrate the reliability of the new model, we test it on synthetic data. Furthermore, to ensure the effectiveness of the proposed approach in different conditions, we test the model with different synthetic data.

Example 2. First, we test the model with the data generated through the Black-Scholes process in Example 1. By obtaining the statistical moments $M_n$ for $n=1,2,4,6$ from the generated data and substituting them in relations (23), we determine the parameters required for the new model via the numerical solution of the resulting system of equations. Since the data generated in Example 1 are diffusive, and we have already modeled them through the jump-drift equation, we expect the occurrence rate of one of the jumps to be zero when we model the same data through the jump-jump-drift equation. The following results confirm this expectation:

$$\begin{aligned}
\bar{\mu} &= 1 \\
\sigma_{\xi_1}^2 &= 0.004 \\
\lambda_1\Delta t &= 0.99991 \\
\sigma_{\xi_2}^2 &= 0.0007 \\
\lambda_2\Delta t &= 0.00010
\end{aligned}$$

The value $\lambda_2\Delta t\approx 0$ shows that when the time series belongs to the class of continuous diffusion processes (e.g., the Black-Scholes process), the jump-jump-drift dynamics model it by using only one jump process and effectively omitting the second jump process. In the next step, we test the model on two synthetic log-return time series generated via the jump-diffusion equation (Eq. 9) with preset parameters. Each time series contains $3\times 10^6$ data points, which are generated by considering $\bar{\mu}=5$ and $\sigma=2$ with a sampling interval $\Delta t=0.0001$, so that $\sigma^2\Delta t=0.0004$. The jumps in both time series have the same jump rate $\lambda\Delta t=0.3$ (jump rate per data point), but the amplitudes of the jumps are $\sigma_\xi^2=0.1$ and $\sigma_\xi^2=0.001$, respectively. We deliberately choose these jump amplitudes with different orders of magnitude to observe the effect of their amplitude on retrieving the coefficients. Note that in the first case $\frac{\sigma_\xi^2}{\sigma^2\Delta t}=250$ and in the second case $\frac{\sigma_\xi^2}{\sigma^2\Delta t}=2.5$. That is, in the first case the variance of the diffusion part ($\sigma^2\Delta t$) is negligible compared to the amplitude of the jumps ($\sigma_\xi^2$), and, as mentioned earlier, we expect both models to show almost the same results, whereas in the second case we expect the estimates of the new model to be more accurate than those of the jump-diffusion model.

By obtaining the statistical moments $M_n$ for $n=1,2,4,6$ from the generated data and substituting them in relations (11) and (23), we determine the parameters of the two models. The following results are estimated from the numerical solution of the corresponding systems of equations:

Case 1. Preset parameters:

$$\bar{\mu}=1,\quad \sigma^2\Delta t=0.0004,\quad \sigma_\xi^2=0.1,\quad \lambda\Delta t=0.3$$

Estimated parameters via jump-diffusion model:

$$\bar{\mu}=1,\quad \sigma^2\Delta t=0.00031,\quad \sigma_\xi^2=0.1005,\quad \lambda\Delta t=0.299$$

Estimated parameters via jump-jump-drift model:

$$\bar{\mu}=1,\quad \sigma_{\xi_1}^2=0.00045,\quad \sigma_{\xi_2}^2=0.1005,\quad \lambda_2\Delta t=0.299,\quad \lambda_1\Delta t=0.701$$

Case 2. Preset parameters:

$$\bar{\mu}=1,\quad \sigma^2\Delta t=0.0004,\quad \sigma_\xi^2=0.001,\quad \lambda\Delta t=0.3$$

Estimated parameters via jump-diffusion model:

$$\bar{\mu}=1,\quad \sigma^2\Delta t=0.00013,\quad \sigma_\xi^2=0.0012,\quad \lambda\Delta t=0.5$$

Estimated parameters via jump-jump-drift model:

$$\bar{\mu}=1,\quad \sigma_{\xi_1}^2=0.00040,\quad \sigma_{\xi_2}^2=0.0014,\quad \lambda_2\Delta t=0.302,\quad \lambda_1\Delta t=0.698$$

The above results show that in the first case both models lead to almost the same results, but in the second case the proposed model leads to more accurate results (note that in the new model, $\sigma_{\xi_1}^2$ is an estimate of the variance of the diffusive data, i.e., $\sigma_{\xi_1}^2=\sigma^2\Delta t$, and $\sigma_{\xi_2}^2$ is an estimate of the variance of the jumpy data, i.e., $\sigma_{\xi_2}^2=\sigma_\xi^2$).

3 Data and methodology

Our dataset comprises the daily closing prices of the Apple and IBM stocks, as well as gold prices with two different time horizons (weekly and hourly). For the Apple and IBM stocks, the historical data used are daily closing prices from 1 June 2020 to 1 June 2023, obtained from Yahoo Finance. For gold, the historical data used are weekly gold prices from 5 January 2004 to 3 January 2022, as well as hourly gold prices from 11 March 2022 to 11 November 2022, obtained from the Dukascopy historical data source.

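For each of the collected datasets, we obtain the log-returns time series $\Delta y(t)$ by:

$$\Delta y(t)=\ln\left[\frac{S(t+\Delta t)}{S(t)}\right],\qquad t=1,2,\dots$$

where $S(t)$ and $S(t+\Delta t)$ are consecutive prices in the price time series. Afterwards, we determine the statistical moments of the log-returns data as follows:

$$\begin{aligned}
M_1 &= \overline{\Delta y}=\frac{1}{N}\sum_{t=1}^{N}\Delta y(t) \\
M_n &= \frac{1}{N}\sum_{t=1}^{N}\left(\Delta y(t)-\overline{\Delta y}\right)^n,\qquad \text{for } n=2,4,6
\end{aligned}$$

where $N$ is the number of log-returns data points. By determining these statistical moments and substituting them in relations (23), we identify the required parameters of the model, i.e., $\bar{\mu}$, $\sigma_{\xi_1}^2$, $\sigma_{\xi_2}^2$, $\lambda_1\Delta t$ and $\lambda_2\Delta t$. Using these parameters, we reconstruct the log-returns data by the following equation:

$$\Delta y(t)=\bar{\mu}\Delta t+\xi_1\,\Delta J_1(t)+\xi_2\,\Delta J_2(t)$$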
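The first two steps of this methodology, computing log-returns from a price series and then the sample moments $M_1$, $M_2$, $M_4$, $M_6$, can be sketched as follows. The function names and the short price list are illustrative assumptions, not data from the article.

```python
import math


def log_returns(prices):
    """Log-returns ln[S(t + dt) / S(t)] of consecutive prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def sample_moments(returns, orders=(2, 4, 6)):
    """Mean M1 and central moments M_n of the log-returns, keyed by order."""
    n = len(returns)
    m1 = sum(returns) / n
    moments = {1: m1}
    for k in orders:
        moments[k] = sum((r - m1) ** k for r in returns) / n
    return moments


# Hypothetical closing prices, used only to show the call pattern.
closing_prices = [100.0, 101.5, 99.8, 100.4, 102.1, 101.0]
moments = sample_moments(log_returns(closing_prices))
print(moments)
```

The resulting dictionary supplies the left-hand sides of relations (23); with only a few hundred data points the sixth moment in particular is dominated by the largest returns.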

In addition, we use the following equation to forecast prices for several time steps after the chosen historical period:

$$S(t+1)=S(t)\,\exp\left(\bar{\mu}\Delta t+\xi_1\,\Delta J_1(t)+\xi_2\,\Delta J_2(t)\right)$$
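A Monte Carlo forecast based on this recursion can be sketched as below. The function name and defaults are assumptions of this example, and the drift, amplitudes and starting price in the usage lines are placeholders rather than the fitted values of the article; only $\lambda_2\Delta t=0.1752$ is taken from the Apple estimate quoted in the text.

```python
import math
import random


def forecast_paths(s0, mu_bar, sigma_xi1_2, sigma_xi2_2, lam2_dt, dt,
                   steps=30, n_paths=1000, seed=0):
    """Simulate price paths with S(t+1) = S(t) * exp(mu_bar*dt + xi_1*dJ_1 + xi_2*dJ_2)."""
    rng = random.Random(seed)
    s1 = math.sqrt(sigma_xi1_2)
    s2 = math.sqrt(sigma_xi2_2)
    paths = []
    for _ in range(n_paths):
        price = s0
        path = []
        for _ in range(steps):
            # Exactly one of the two jumps fires in each step (Eq. 22).
            scale = s2 if rng.random() < lam2_dt else s1
            price *= math.exp(mu_bar * dt + rng.gauss(0.0, scale))
            path.append(price)
        paths.append(path)
    return paths


# Placeholder inputs for a 30-day daily forecast.
paths = forecast_paths(s0=180.0, mu_bar=0.2, sigma_xi1_2=1.5e-4, sigma_xi2_2=6.5e-4,
                       lam2_dt=0.1752, dt=1.0 / 252, steps=30, n_paths=1000, seed=11)
final_prices = sorted(p[-1] for p in paths)
print("median 30-day price:", final_prices[len(final_prices) // 2])
```

Each simulated path can then be compared with the realized prices over the same horizon.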

To determine the accuracy of the forecasts, we use the “Mean Absolute Percentage Error” (MAPE), calculated as follows:

$$\mathrm{MAPE}=\frac{1}{N}\sum_{t=1}^{N}\frac{|F(t)-S(t)|}{S(t)}$$

where $F(t)$ is the forecasted price at time $t$, $S(t)$ is the actual stock price at time $t$, and $N$ is the number of predicted data points. We use MAPE values to evaluate our forecasting method. A scale for judging model accuracy based on the MAPE criterion was presented by Lawrence et al. [23], and is shown in Table 1.


Table 1. A scale of judgment of forecast accuracy.
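The MAPE defined above can be computed with a short helper. The function name is an assumption of this example; it returns a fraction, which is multiplied by 100 to express the error in percent as in the results that follow.

```python
def mape(forecast, actual):
    """Mean absolute percentage error, returned as a fraction (0.05 means 5%)."""
    if len(forecast) != len(actual) or not actual:
        raise ValueError("forecast and actual must be non-empty and of equal length")
    return sum(abs(f - s) / s for f, s in zip(forecast, actual)) / len(actual)


# Two forecasts, each off by 10% of the actual price.
print(f"{100 * mape([110.0, 90.0], [100.0, 100.0]):.1f}%")
```

Applied to every simulated trajectory, this gives the distribution of MAPE values from which the smallest, largest and average errors are read off.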

3.1 Research output and discussion

In the following, considering the elements described in the methodology, we first model Apple and IBM stocks and predict their prices for a period of 30 days. The historical data used are daily closing stock prices from 1 June 2020 to 1 June 2023. For daily prices, the trading period is $\Delta t=\frac{1}{252}$ years (based on a year with an average of 252 stock trading days). The parameters obtained from the model are presented in Table 2. Based on this table, for both stocks, $\sigma_{\xi_2}^2$ is less than one order of magnitude larger than $\sigma_{\xi_1}^2$: for Apple stock we have $\frac{\sigma_{\xi_2}^2}{\sigma_{\xi_1}^2}=4.34$, and for IBM stock this ratio is $\frac{\sigma_{\xi_2}^2}{\sigma_{\xi_1}^2}=4.2$. As mentioned earlier, in this situation the new model has better performance. Furthermore, for Apple stock, the jump rates are $\lambda_1\Delta t=0.8248$ and $\lambda_2\Delta t=0.1752$, which means that, in the data reconstruction stage, 82.48% of the data points are reconstructed by using a Gaussian random variable with the smaller variance $\sigma_{\xi_1}^2$, and 17.52% of the data points are reconstructed by using a Gaussian random variable with the larger variance $\sigma_{\xi_2}^2$. For IBM stock, these rates are $\lambda_1\Delta t=0.6790$ and $\lambda_2\Delta t=0.3210$, respectively.


Table 2. Values of the drift, jump amplitudes, and jump rates obtained from historical daily prices of Apple and IBM stocks using the jump-jump-drift model.

To reconstruct log-return data through the proposed model, we use the equation $\Delta y(t)=\bar{\mu}\Delta t+\xi_1\,\Delta J_1(t)+\xi_2\,\Delta J_2(t)$ and reconstruct a time series for $\Delta y(t)$ that is statistically similar to the original one. Figures 2, 3 show the actual and reconstructed log-returns of Apple and IBM stocks, respectively.


Figure 2. Upper panel: Time plot of actual daily log-returns of Apple stock from 1 June 2020 to 1 June 2023 (756 data points). Lower panel: Time plot of reconstructed daily log-returns of Apple stock using the proposed jump-jump-drift model.


Figure 3. Upper panel: Time plot of actual daily log-returns of IBM stock from 1 June 2020 to 1 June 2023 (756 data points). Lower panel: Time plot of reconstructed daily log-returns of IBM stock using the proposed jump-jump-drift model.

To predict stock prices, we use the parameters estimated from historical data. The forecast period is the 30 days following the selected historical period. The predictions are simulated with 1,000 realizations of the trajectory. Each trajectory is realized using the equation $S(t+1)=S(t)\,\exp\left(\bar{\mu}\Delta t+\xi_1\,\Delta J_1(t)+\xi_2\,\Delta J_2(t)\right)$ with 30 iterations. Figures 4, 5 show the daily forecasts of Apple and IBM stock prices, respectively. In order to compare the predictions with the actual prices, the graph of realized prices over the same 30 days is also shown in each figure (cyan graph). As can be seen, for both stocks, the actual prices lie within the range of the predicted trajectories. In addition, the data analysis shows that all 1,000 predicted trajectories of the Apple stock price have MAPE values less than 20% (with the smallest MAPE = 1.4%, the largest MAPE = 19.8% and the average MAPE = 5.84%). Meanwhile, the corresponding values obtained through the jump-diffusion model are as follows:

Figure 4. Graphical representation of the predicted paths of the daily price of Apple stock using jump-jump-drift modeling. The time period of all predictions is 30 days and their starting point is 1 June 2023. The cyan graph is the actual price path realized over the same 30 days, and the colored graphs are the 1,000 possible paths predicted by the model.

Figure 5. Graphical representation of the predicted paths of the daily price of IBM stock using jump-jump-drift modeling. The time period of all predictions is 30 days and their starting point is 1 June 2023. The cyan graph is the actual price path realized over the same 30 days, and the colored graphs are the 1,000 possible paths predicted by the model.

The smallest MAPE = 1.51%, the largest MAPE = 24.3%, and the average MAPE = 6.32%. These results show that the jump-jump-drift model performs better than the jump-diffusion model, and if the forecast horizon becomes longer (e.g., in annual forecasts), the difference between the forecasts of the two models becomes more visible.

The results of the IBM stock price predictions are even more striking than those for Apple stock. Analysis of the IBM simulation outputs shows that all 1,000 predicted trajectories have MAPE values below 15% (smallest MAPE = 1.22%, largest MAPE = 14.02%, average MAPE = 4.62%), indicating good accuracy of the model predictions. The corresponding values obtained through the jump-diffusion model are as follows:

The smallest MAPE = 1.62%, the largest MAPE = 20.7%, and the average MAPE = 5.12%.
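The forecasting and scoring procedure described above can be sketched as follows. This is a minimal illustration rather than the code used for the reported numbers: it assumes the same discretization as the price-update equation (one draw of the exponent per step, with the drift contribution written as `mu * dt`), zero-mean Gaussian jump amplitudes, per-step jump probabilities `lam * dt`, and the usual definition of MAPE as the mean of $|A_k-F_k|/|A_k|$ in percent. The function names, parameter values, and the stand-in "actual" path are hypothetical.

```python
import numpy as np

def forecast_paths(s0, n_paths, n_steps, dt, mu, lam, sigma, rng):
    """Simulate price paths S_{k+1} = S_k * exp(mu*dt + jump_k)."""
    p_jump = np.asarray(lam, dtype=float) * dt
    # Category 0 = no jump, category i = a jump of process i (no overlap).
    probs = np.concatenate(([1.0 - p_jump.sum()], p_jump))
    which = rng.choice(len(probs), size=(n_paths, n_steps), p=probs)
    amp_sd = np.concatenate(([0.0], np.asarray(sigma, dtype=float)))[which]
    log_ret = mu * dt + rng.normal(size=(n_paths, n_steps)) * amp_sd
    return s0 * np.exp(np.cumsum(log_ret, axis=1))

def mape(actual, forecast):
    """Mean absolute percentage error, in percent, along the last axis."""
    actual = np.asarray(actual, dtype=float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual), axis=-1)

# Hypothetical usage: 1,000 trajectories over a 30-step horizon.
rng = np.random.default_rng(42)
params = dict(dt=1 / 252, mu=0.2, lam=(200.0, 40.0), sigma=(0.01, 0.03))  # placeholders
paths = forecast_paths(180.0, 1000, 30, rng=rng, **params)
# Stand-in for the realized prices; in practice this would be market data.
actual = forecast_paths(180.0, 1, 30, rng=rng, **params)[0]
scores = mape(actual, paths)
print(f"min={scores.min():.2f}%  max={scores.max():.2f}%  mean={scores.mean():.2f}%")
```

Reporting the minimum, maximum, and mean of `scores` mirrors the three summary statistics quoted for each asset.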

Finally, to assess the effectiveness of the proposed approach over different time horizons, we simulate gold prices at two sampling intervals. The historical data are weekly gold prices from 5 January 2004 to 3 January 2022 and hourly gold prices from 11 March 2022 to 11 November 2022. For weekly prices, the trading period is $t=\frac{1}{52}$ years (based on a year with an average of 52 trading weeks), while for hourly prices, the trading period is $t=\frac{1}{5916}$ years (based on 2022 with 5,916 trading hours). The parameters obtained from the model are presented in Table 3. Based on this table, in both cases $\sigma^2_{\xi_2}$ exceeds $\sigma^2_{\xi_1}$ by at most about one order of magnitude: the ratio $\sigma^2_{\xi_2}/\sigma^2_{\xi_1}$ is 4.6 for weekly data and 11.7 for hourly data.

Table 3. Values of the drift, jump amplitudes, and jump rates obtained from historical gold prices (weekly and hourly) using jump-jump-drift modeling.

To reconstruct the log-return data with the proposed model, we use the equation $y(t)=\bar{\mu}t+\xi_1J_1(t)+\xi_2J_2(t)$ and generate a time series for $y(t)$ that is statistically similar to the original one. Figures 6, 7 show the actual and reconstructed log-returns of weekly and hourly gold prices, respectively.

Figure 6. Upper panel: Time plot of actual weekly log-returns of gold prices from 5 January 2004 to 3 January 2022 (940 data points). Lower panel: Time plot of reconstructed weekly log-returns of gold prices using the proposed jump-jump-drift model.

Figure 7. Upper panel: Time plot of actual hourly log-returns of gold prices from 11 March 2022 to 11 November 2022 (3,999 data points). Lower panel: Time plot of reconstructed hourly log-returns of gold prices using the proposed jump-jump-drift model.

To predict gold prices, we use the parameters estimated from historical data. The forecast period is 30 weeks for the weekly price and 300 h for the hourly price, in each case immediately following the historical period. The predictions are simulated with 1,000 realizations of the trajectory, each generated by iterating the equation $S(t+1)=S(t)\,e^{\bar{\mu}t+\xi_1J_1(t)+\xi_2J_2(t)}$, with 30 iterations for the weekly gold price and 300 iterations for the hourly gold price. Figures 8, 9 show the weekly and hourly gold price forecasts, respectively. As can be seen, in both cases the actual prices lie within the range of the trajectories predicted by the model. Furthermore, the data analysis shows that all 1,000 predicted trajectories of the weekly gold price have MAPE values below 30% (smallest MAPE = 2.1%, largest MAPE = 29.43%, average MAPE = 7.57%), which are acceptable forecasts. The corresponding values obtained by the jump-diffusion model are as follows:

Figure 8. Graphical representation of the predicted paths of the weekly price of gold using jump-jump-drift modeling. The time period of all predictions is 30 weeks and their starting point is 3 January 2022. The cyan graph is the actual price path realized over the same 30 weeks, and the colored graphs are the 1,000 possible paths predicted by the model.

Figure 9. Graphical representation of the predicted paths of the hourly price of gold using jump-jump-drift modeling. The time period of all predictions is 300 h and their starting point is 11 November 2022. The cyan graph is the actual price path realized over the same 300 h, and the colored graphs are the 1,000 possible paths predicted by the model.

The smallest MAPE = 2.3%, the largest MAPE = 35.7%, and the average MAPE = 7.85%.

Analysis of the hourly gold prices shows that all 1,000 predicted paths have MAPE values below 10% (smallest MAPE = 0.51%, largest MAPE = 7.17%, average MAPE = 1.82%), indicating very high accuracy of the model predictions for hourly time horizons. The corresponding values obtained by the jump-diffusion model are as follows:

The smallest MAPE = 0.8%, the largest MAPE = 10.3%, and the average MAPE = 2.15%.

4 Conclusion

We discussed that when data are sampled at discrete times (e.g., stock prices), they appear as a sequence of discontinuous jump events, even if they have been sampled from a continuous process. This observation motivated a new modeling approach in which random variations in the sample path of a measured time series are attributed to jump events, even if the time series belongs to the class of diffusion processes. On this basis, we introduced a new stochastic dynamical equation comprising a deterministic drift term and a combination of several stochastic terms with jumpy behaviors. The general form of this equation is as follows:

$$dy(t)=\bar{\mu}\,dt+\sum_{i=1}^{N}\xi_i\,dJ_i(t)$$

In this modeling we also assumed that jump events do not occur simultaneously, so that the jumps do not overlap. We started with the simplest form of the equation, comprising a deterministic drift term and a single jump process as the stochastic component, and argued that it can describe the discrete-time evolution of a diffusion process, e.g., the Black-Scholes process. Afterwards, we extended the equation by considering two jump processes with differently distributed sizes, and used it to model assets such as stock prices and gold prices over different time horizons. We also demonstrated that in all cases considered the proposed model performs better than the old jump-diffusion model. It should be noted that, because of the small number of available price data and the limited diversity of jump amplitudes, in this article we modeled the price data with only two jump processes. However, depending on the number of data points and the variation in the amplitudes of fluctuations, more stochastic terms can be retained in the equation to increase the accuracy of the modeling. On the other hand, each additional term adds unknowns to the system of equations that must be solved, at the cost of a longer runtime.
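For concreteness, one possible discrete-time reading of the general equation is sketched below. The step size $\Delta t$, the jump rates $\lambda_i$, and the assumption of zero-mean, mutually independent amplitudes with variances $\sigma^2_{\xi_i}$ are introduced for this illustration only and are not stated in this section.

$$\Delta y_k=\bar{\mu}\,\Delta t+\sum_{i=1}^{N}\xi_{i,k}\,\Delta J_{i,k},\qquad \Delta J_{i,k}\in\{0,1\},\qquad \sum_{i=1}^{N}\Delta J_{i,k}\le 1$$

$$P(\Delta J_{i,k}=1)\approx\lambda_i\,\Delta t,\qquad E[\Delta y_k]=\bar{\mu}\,\Delta t,\qquad \mathrm{Var}[\Delta y_k]\approx\Delta t\sum_{i=1}^{N}\lambda_i\,\sigma^2_{\xi_i}$$

The constraint on the sum of the $\Delta J_{i,k}$ expresses the no-overlap assumption. Under this reading, each added jump term contributes two further unknowns, $\lambda_i$ and $\sigma^2_{\xi_i}$, which is consistent with the growth in computational cost noted above.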

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

Author contributions

AM: Conceptualization, Data curation, Formal Analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing–original draft, Writing–review and editing. HN: Conceptualization, Data curation, Formal Analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing–original draft, Writing–review and editing.

Funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Reddy K, Clinton V. Simulating stock prices using geometric Brownian motion: evidence from Australian companies. Australas Account Business Finance J (2016) 10(3):23–47. doi:10.14453/aabfj.v10i3.3

CrossRef Full Text | Google Scholar

2. Synowiec D. Jump-diffusion models with constant parameters for financial log-return processes. Comput Math Appl (2008) 56(8):2120–7. doi:10.1016/j.camwa.2008.02.051

CrossRef Full Text | Google Scholar

3. Azizah M, Irawan M, Putri E. Comparison of stock price prediction using geometric Brownian motion and multilayer perceptron. In: AIP conference proceedings: Depok, Indonesia. AIP Publishing (2020).

CrossRef Full Text | Google Scholar

4. Mota PP, Esquível ML. Model selection for stock prices data. J Appl Stat (2016) 43(16):2977–87. doi:10.1080/02664763.2016.1155205

CrossRef Full Text | Google Scholar

5. Benninga S. Financial modeling, fourth edition By Simon Benninga Hardcover. The MIT press (2014).

Google Scholar

6. Bachelier L. Théorie de la spéculation. In: Annales scientifiques de l’École normale supérieure (1900).

CrossRef Full Text | Google Scholar

7. Osborne MF. Brownian motion in the stock market. Operations Res (1959) 7(2):145–73. doi:10.1287/opre.7.2.145

CrossRef Full Text | Google Scholar

8. Samuelson PA. Economic theory and mathematics--an appraisal. Am Econ Rev (1952) 42(2):56–66.

Google Scholar

9. Black F, Scholes M. The pricing of options and corporate liabilities. J Polit economy (1973) 81(3):637–54. doi:10.1086/260062

CrossRef Full Text | Google Scholar

10. Black F, Karasinski P. Bond and option pricing when short rates are lognormal. Financial Analysts J (1991) 47(4):52–9. doi:10.2469/faj.v47.n4.52

CrossRef Full Text | Google Scholar

11. Risken H. Springer series in synergetics (1996).

Google Scholar

12. Bouchaud J-P, Cont R. A Langevin approach to stock market fluctuations and crashes. Eur Phys J B-Condensed Matter Complex Syst (1998) 6:543–50. doi:10.1007/s100510050582

CrossRef Full Text | Google Scholar

13. Hull J, White A. The pricing of options on assets with stochastic volatilities. J Finance (1987) 42(2):281–300. doi:10.1111/j.1540-6261.1987.tb02568.x

CrossRef Full Text | Google Scholar

14. Mercurio D, Spokoiny V. Estimation of time dependent volatility via local change point analysis (2005).

Google Scholar

15. Goldentayer L, Klebaner F, Liptser RS. Tracking volatility. Probl Inf Transm (2005) 41:212–29. doi:10.1007/s11122-005-0026-2

CrossRef Full Text | Google Scholar

16. Merton RC. Option pricing when underlying stock returns are discontinuous. J financial Econ (1976) 3(1-2):125–44. doi:10.1016/0304-405x(76)90022-2

CrossRef Full Text | Google Scholar

17. Heston SL. A closed-form solution for options with stochastic volatility with applications to bond and currency options. Rev financial Stud (1993) 6(2):327–43. doi:10.1093/rfs/6.2.327

CrossRef Full Text | Google Scholar

18. Nelson DB. ARCH models as diffusion approximations. J Econom (1990) 45(1-2):7–38. doi:10.1016/0304-4076(90)90092-8

CrossRef Full Text | Google Scholar

19. Anvari M, Tabar MRR, Peinke J, Lehnertz K. Disentangling the stochastic behavior of complex time series. Scientific Rep (2016) 6(1):35435. doi:10.1038/srep35435

PubMed Abstract | CrossRef Full Text | Google Scholar

20. Tabar R. Analysis and data-based reconstruction of complex nonlinear dynamical systems, 730. Springer (2019).

Google Scholar

21. Lehnertz K, Zabawa L, Tabar MRR. Characterizing abrupt transitions in stochastic dynamics. New J Phys (2018) 20(11):113043. doi:10.1088/1367-2630/aaf0d7

CrossRef Full Text | Google Scholar

22. Movahed AA, Noshad H. Introducing a new approach for modeling a given time series based on attributing any random variation to a jump event: jump-jump modeling. Scientific Rep (2024) 14(1):1234. doi:10.1038/s41598-024-51863-5

CrossRef Full Text | Google Scholar

23. Excel FOFU. Fundamentals of forecasting using excel (2019).

Google Scholar

Keywords: stock prices modeling, stochastic dynamical equation, Black-Scholes model, Poisson jump process, jump-diffusion model, jump-drift process

Citation: Movahed AA and Noshad H (2024) Introducing a new approach for modeling stock market prices using the combination of jump-drift processes. Front. Phys. 12:1402593. doi: 10.3389/fphy.2024.1402593

Received: 17 March 2024; Accepted: 18 June 2024;
Published: 18 July 2024.

Edited by:

Shuvojit Paul, Indian Institute of Science Education and Research Kolkata, India

Reviewed by:

N. Narinder, Technical University Dresden, Germany
Prasanta Panigrahi, Indian Institute of Science Education and Research Kolkata, India

Copyright © 2024 Movahed and Noshad. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Houshyar Noshad, hnoshad@aut.ac.ir
