ORIGINAL RESEARCH article

Front. Phys., 18 July 2024
Sec. Statistical and Computational Physics
This article is part of the Research Topic Non-equilibrium Steady-state Statistical Processes with Confined Active Matter

Introducing a new approach for modeling stock market prices using the combination of jump-drift processes

Ali Asghar Movahed and Houshyar Noshad*
  • Department of Physics and Energy Engineering, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran

Stock price data are sampled at discrete times (e.g., hourly, daily, weekly). When data are sampled at discrete times, they appear as a sequence of discontinuous jump events, even if they have been sampled from a continuous process. Moreover, distinguishing between discontinuities due to finite sampling of a continuous stochastic process and real jump discontinuities in the sample path is often a challenging task. These considerations led us to the question: Can discrete data (e.g., stock prices) be modeled using only jump-drift processes, regardless of whether the sampled time series originally belongs to the class of continuous processes or discontinuous processes? To answer this question, we built a stochastic dynamical equation of the general form $dy_t = \bar{\mu}\,dt + \sum_{i=1}^{N} \xi_i\,dJ_{i,t}$, which includes a deterministic drift term ($\bar{\mu}\,dt$) and a combination of stochastic terms with jumpy behavior ($\xi_i\,dJ_{i,t}$), and used it to model the log-price time series $y_t$. In this article, we first introduce this equation in its simplest form, including a drift term and a single stochastic term, and show that such a jump-drift equation is capable of reconstructing stock prices in Black-Scholes diffusion markets. Afterwards, we extend the equation by considering two jump processes, and show that such a drift-jump-jump equation enables us to reconstruct stock prices in jump-diffusion markets more accurately than the classic jump-diffusion model. To demonstrate the practical applications of the proposed method, we analyze real-world data, including the daily stock prices of two different shares and gold price data with two different time horizons (hourly and weekly). Our analysis supports the practical applicability of the methodology. The presented approach is extensible and can also be used in non-financial research fields.

1 Introduction

The stock price is known as a highly volatile variable in a stock market. Price fluctuations, which occur randomly and frequently and sometimes include sudden jumps, increase investment risk and cause concern for investors and company owners who want to increase their capital. Researchers are therefore motivated to study the fluctuating behavior of the market in order to model prices (or improve existing models) and to advise investors looking for the best investments [1–5]. So far, significant progress has been made in this field, the most important of which is stock price modeling via continuous stochastic processes and discontinuous jump processes. The “arithmetic Brownian motion” model was the first mathematical model of stock prices, presented by Louis Bachelier in [6]. In his proposed model, Bachelier assumed that the discount rate is zero and that the stochastic differential equation (SDE) governing the stock price is as follows:

$$dS_t = \sigma\,dW_t \tag{1}$$

where $S_t$ is the spot stock price at time $t$, $\sigma$ is the diffusion coefficient (known as the volatility), and $\{W_t, t \ge 0\}$ is a scalar Wiener process (a standard Brownian motion). Integration of Eq. 1 over $(t, t+\Delta t)$ yields the stochastic solution of the equation:

$$\Delta S_t = \sigma\,\Delta W_t$$

where $\Delta S_t = S_{t+\Delta t} - S_t$ is the change in the price during a time lag $\Delta t$, and $\Delta W_t = W_{t+\Delta t} - W_t$ is the increment of the Wiener process, which is computed as $\Delta W_t = \eta\sqrt{\Delta t}$, where $\eta$ is a random variable that follows a normal (Gaussian) distribution with zero mean and unit variance, i.e., $\eta \sim N(0,1)$. Therefore, the following can be written:

$$S_{t+\Delta t} = S_t + \sigma\eta\sqrt{\Delta t} \tag{2}$$

The main shortcoming of Bachelier’s model is that it assumes that the future value of the asset follows a normal distribution. Under this assumption, Eq. 2 can lead to a negative stock price with positive probability, which is not possible in reality. In [7], Osborne demonstrated that the future value of the stock should follow a log-normal distribution, while the log-return of the stock follows a normal distribution. Shortly afterwards, the Bachelier model was modified by Samuelson in [8], where he introduced the “geometric Brownian motion” model (also known as the Black-Scholes model). In this model, it is assumed that the price of the risky stock evolves according to the following SDE:

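$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t \tag{3}$$

where $\mu$ and $\sigma$ are the drift and diffusion coefficients, and again $\{W_t, t \ge 0\}$ is a scalar Wiener process. The field of mathematical finance has gained significant attention since Black and Scholes published their work in [9, 10]. They contributed to the world of finance via the introduction of Itô calculus to financial mathematics, and also the Black-Scholes formula. By choosing $y_t = \ln S_t$ and applying Itô’s lemma [11, 12], Eq. 3 becomes:

$$dy_t = \left(\mu - \frac{\sigma^2}{2}\right)dt + \sigma\,dW_t \tag{4}$$

Integration of Eq. 4 over $(t, t+\Delta t)$ gives us:

$$\Delta y_t = \left(\mu - \frac{\sigma^2}{2}\right)\Delta t + \sigma\,\Delta W_t$$

where $\Delta y_t = y_{t+\Delta t} - y_t = \ln\frac{S_{t+\Delta t}}{S_t}$ represents the logarithmic increment of the stock price data (known as the log-return), $\Delta t$ is the length of the time interval between two consecutive trading periods, and $\Delta W_t = \eta\sqrt{\Delta t}$ with $\eta \sim N(0,1)$. Therefore, the following can be written:

$$\Delta y_t = \left(\mu - \frac{\sigma^2}{2}\right)\Delta t + \sigma\eta\sqrt{\Delta t} \tag{5}$$

In turn, the stock price can be determined from Eq. 5 as:

$$S_{t+\Delta t} = S_t \exp\left[\left(\mu - \frac{\sigma^2}{2}\right)\Delta t + \sigma\eta\sqrt{\Delta t}\right] \tag{6}$$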

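Eq. 6 enables one to simulate possible stock price trajectories with time step $\Delta t$ through the Black-Scholes model. For this purpose, one must first find the parameters $\mu$ and $\sigma^2$ from historical log-returns data based on the following relations:

$$\begin{aligned}
M_1 &= \langle \Delta y_t \rangle = \left(\mu - \frac{\sigma^2}{2}\right)\Delta t \\
M_2 &= \left\langle \left(\Delta y_t - \langle \Delta y_t \rangle\right)^2 \right\rangle = \sigma^2 \Delta t
\end{aligned} \tag{7}$$

where $\langle\cdot\rangle$ denotes averaging over the data, so that $M_1$ and $M_2$ in Eq. 7 are the mean and variance of the historical log-returns data, respectively. Given $M_1$ and $M_2$, $\sigma^2$ is obtained first:

$$\sigma^2 = \frac{1}{\Delta t} M_2$$

Once $\sigma^2$ is identified, the parameter $\mu$ is obtained from the first moment $M_1$.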
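The moment-based calibration in Eq. 7 can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names, the random seed, and the parameter values ($\mu = 1.5$, $\sigma = 1$, $\Delta t = 0.004$) are assumptions chosen for demonstration.

```python
import numpy as np


def simulate_bs_log_returns(mu, sigma, dt, n, rng):
    """Draw n log-returns from Eq. 5: (mu - sigma^2/2) dt + sigma * eta * sqrt(dt)."""
    eta = rng.standard_normal(n)
    return (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * eta


def estimate_bs_parameters(dy, dt):
    """Invert Eq. 7: sigma^2 from the variance M2, then mu from the mean M1."""
    m1 = dy.mean()
    m2 = np.mean((dy - m1) ** 2)
    sigma2 = m2 / dt
    mu = m1 / dt + 0.5 * sigma2
    return mu, sigma2


rng = np.random.default_rng(0)  # fixed seed, illustrative only
dy = simulate_bs_log_returns(mu=1.5, sigma=1.0, dt=0.004, n=1_000_000, rng=rng)
mu_hat, sigma2_hat = estimate_bs_parameters(dy, dt=0.004)
print(f"estimated mu = {mu_hat:.3f}, estimated sigma^2 = {sigma2_hat:.4f}")
```

With a long series the estimates should land close to the preset values; the drift estimate is the noisier of the two because it is obtained from the small mean $M_1$ divided by $\Delta t$.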

The main disadvantage of the Black-Scholes model is its constant-volatility assumption, whereas it is widely believed and empirically confirmed that the volatility of stock prices is not constant but varies over time [13–15]. This shortcoming, together with the unsatisfactory performance of the Black-Scholes model, led researchers to look for better alternatives and to improve the classic Black-Scholes model in two directions:

1- Adding a term with jumpy behavior to the Black-Scholes equation to allow for random jumps in the stock price process (jump-diffusion model e.g., Merton model [16])

2- Considering stochastic volatility for the stock price (e.g., Heston model [17] or GARCH model [18]).

Here we only focus on the first option and describe the jump-diffusion model. Merton in [16] presented one of the first models in which jump processes were used in financial modeling. To take into account price discontinuities, Merton added a Poisson jump process to the log-price while preserving the independence and stationarity of log-returns. A jump-diffusion equation is generally written as:

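$$dy_t = \bar{\mu}\,dt + \sigma\,dW_t + \xi\,dJ_t \tag{8}$$

where $\bar{\mu}$ and $\sigma$ are the drift and diffusion coefficients, $W_t$ is a Wiener process, and $J_t$ is a Poisson jump process with rate $\lambda$ and random size $\xi$, which Merton assumed follows a Gaussian distribution with zero mean and variance $\sigma_\xi^2$, i.e., $\xi \sim N(0, \sigma_\xi^2)$. It was also assumed that the Poisson process, the jump size $\xi$ and the Wiener process in Eq. 8 are three independent processes.

Integration of Eq. 8 over $(t, t+\Delta t)$ leads to:

$$\Delta y_t = \bar{\mu}\,\Delta t + \sigma\,\Delta W_t + \xi\,\Delta J_t$$

Here $\Delta J_t = J_{t+\Delta t} - J_t$ follows a Poisson distribution with mean $\lambda\Delta t$, and $\Delta W_t = \eta\sqrt{\Delta t}$ with $\eta \sim N(0,1)$. Therefore, the following can be written:

$$\Delta y_t = \bar{\mu}\,\Delta t + \sigma\eta\sqrt{\Delta t} + \xi\,\Delta J_t \tag{9}$$

In turn, the stock price can be determined from Eq. 9 as:

$$S_{t+\Delta t} = S_t \exp\left[\bar{\mu}\,\Delta t + \sigma\eta\sqrt{\Delta t} + \xi\,\Delta J_t\right] \tag{10}$$

Eq. 10 enables one to simulate possible stock price trajectories with time step $\Delta t$ via the jump-diffusion model. For this purpose, one must find the parameters $\bar{\mu}$, $\sigma^2$, $\sigma_\xi^2$ and $\lambda\Delta t$ from the historical log-returns data based on the following relations:

$$\begin{aligned}
M_1 &= \langle \Delta y_t \rangle = \bar{\mu}\,\Delta t \\
M_2 &= \left\langle (\Delta y_t - \bar{\mu}\,\Delta t)^2 \right\rangle = \sigma^2\Delta t + \sigma_\xi^2\,\lambda\Delta t \\
M_4 &= \left\langle (\Delta y_t - \bar{\mu}\,\Delta t)^4 \right\rangle = 3\sigma_\xi^4\,\lambda\Delta t \\
M_6 &= \left\langle (\Delta y_t - \bar{\mu}\,\Delta t)^6 \right\rangle = 15\sigma_\xi^6\,\lambda\Delta t
\end{aligned} \tag{11}$$

where $M_1$, $M_2$, $M_4$ and $M_6$ are the statistical moments of the historical log-returns data. Given these moments, the jump characteristics $\sigma_\xi^2$ and $\lambda\Delta t$ are obtained first from Eq. 11:

$$\begin{aligned}
\sigma_\xi^2 &= \frac{M_6}{5M_4} \\
\lambda\Delta t &= \frac{M_4}{3\sigma_\xi^4}
\end{aligned}$$

Once $\sigma_\xi^2$ and $\lambda\Delta t$ are identified, the parameter $\sigma^2$ is obtained from the second moment $M_2$ and the parameter $\bar{\mu}$ is obtained from the first moment $M_1$.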
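As a rough illustration of this calibration, the following sketch simulates log-returns from Eq. 9 and inverts the moment relations of Eq. 11. It assumes that $\Delta J_t$ can be drawn as a Bernoulli variable with probability $\lambda\Delta t$ (at most one jump per step); the function names, the seed and the parameter values are hypothetical choices, not taken from the authors' code.

```python
import numpy as np


def simulate_jump_diffusion_returns(mu_bar, sigma2_dt, sigma_xi2, lam_dt, dt, n, rng):
    """Eq. 9 with at most one jump per step: Delta J is Bernoulli(lam_dt)."""
    diffusion = np.sqrt(sigma2_dt) * rng.standard_normal(n)
    jump_sizes = np.sqrt(sigma_xi2) * rng.standard_normal(n)
    jump_occurs = rng.random(n) < lam_dt
    return mu_bar * dt + diffusion + jump_sizes * jump_occurs


def estimate_jump_diffusion(dy, dt):
    """Invert Eq. 11 using the sample moments M1, M2, M4 and M6."""
    m1 = dy.mean()
    centered = dy - m1
    m2, m4, m6 = (np.mean(centered**k) for k in (2, 4, 6))
    sigma_xi2 = m6 / (5.0 * m4)
    lam_dt = m4 / (3.0 * sigma_xi2**2)
    sigma2_dt = m2 - sigma_xi2 * lam_dt  # small difference of two larger numbers
    return {
        "mu_bar": m1 / dt,
        "sigma2_dt": sigma2_dt,
        "sigma_xi2": sigma_xi2,
        "lam_dt": lam_dt,
    }


rng = np.random.default_rng(0)  # fixed seed, illustrative only
dy = simulate_jump_diffusion_returns(
    mu_bar=1.0, sigma2_dt=0.0004, sigma_xi2=0.1, lam_dt=0.3, dt=1e-4,
    n=1_000_000, rng=rng,
)
estimates = estimate_jump_diffusion(dy, dt=1e-4)
print(estimates)
```

In this setting the jump variance and jump probability are typically recovered well, whereas $\sigma^2\Delta t$ is computed as a small difference of two much larger quantities and is therefore sensitive to sampling noise.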

The main shortcoming of the jump-diffusion model is that the jumps reconstructed by the model have larger amplitudes than the jumps in the actual data. Let us demonstrate how this problem occurs. Suppose we want to model the daily prices of a stock via the jump-diffusion model. As mentioned, we first need to determine the parameters $\bar{\mu}$, $\sigma^2$, $\sigma_\xi^2$ and $\lambda\Delta t$ from the historical log-returns data. Since, in the Poisson jump process, the probability of more than one jump occurring in any small time interval $\Delta t$ is negligible, $\Delta J_t$ in Eq. 9 takes only the values 1 (one jump) or 0 (no jump), with probabilities $\lambda\Delta t$ and $1-\lambda\Delta t$, respectively [19]. Given these probabilities, each data point can be reconstructed by one of the following sub-equations:

If $\Delta J_t=0$, meaning that no jump occurs during that $\Delta t$, then the data point is reconstructed by:

$$\Delta y_t = \bar{\mu}\,\Delta t + \sigma\eta\sqrt{\Delta t} \tag{12}$$

If $\Delta J_t=1$, meaning that a jump occurs during that $\Delta t$, then the data point is reconstructed by:

$$\Delta y_t = \bar{\mu}\,\Delta t + \sigma\eta\sqrt{\Delta t} + \xi \tag{13}$$

As can be seen from Eqs 12, 13, the diffusion term $\sigma\eta\sqrt{\Delta t}$ appears in both equations and is involved in the reconstruction of all data points, even jumpy data points. Since the random variables $\sigma\eta\sqrt{\Delta t}$ and $\xi$ are two independent, zero-mean, normally distributed variables with variances $\sigma^2\Delta t$ and $\sigma_\xi^2$, respectively, their sum in Eq. 13 is also normally distributed, i.e., $\sigma\eta\sqrt{\Delta t}+\xi \sim N(0, \sigma_\xi^2+\sigma^2\Delta t)$. The variance of this distribution ($\sigma^2\Delta t+\sigma_\xi^2$) represents the amplitude of the reconstructed jumps, which is larger than the amplitude of the jumps in the historical data ($\sigma_\xi^2$) that was originally obtained. Obviously, if $\sigma^2\Delta t \ll \sigma_\xi^2$, so that $\sigma^2\Delta t$ can be neglected compared to $\sigma_\xi^2$, then the data reconstructed by the jump-diffusion model will be statistically similar to the original data; otherwise the model will fail. This shortcoming led us to modify the jump-diffusion equation in such a way that, if necessary, we can discard the contribution of the diffusion term in Eq. 9 so that it does not interfere with the reconstruction of the jumps. For this purpose, we replace the diffusion term in Eq. 9 by a term with jumpy behavior. This idea is supported by the fact that when data are sampled at discrete times, they appear as a sequence of discontinuous jump events, even if they have been sampled from a continuous diffusion process [20]. This is precisely why distinguishing between discontinuities due to discrete sampling of a continuous process and real discontinuities in a jump-diffusion process is itself a challenging task [21]. Based on the above considerations, we modify Eq. 9 by considering two jump processes with different random sizes $\xi_1$ and $\xi_2$ and different rates $\lambda_1\Delta t$ and $\lambda_2\Delta t$ as follows:

$$\Delta y_t = \bar{\mu}\,\Delta t + \xi_1\,\Delta J_{1,t} + \xi_2\,\Delta J_{2,t}$$

where $\xi_1\,\Delta J_{1,t}$ has replaced the diffusion term in Eq. 9, and $\xi_2\,\Delta J_{2,t}$ plays the same role as $\xi\,\Delta J_t$. Each of $\Delta J_{1,t}$ and $\Delta J_{2,t}$ takes the value 1 or 0, but to avoid their simultaneous occurrence, we stipulate that if $\Delta J_{1,t}=1$, then $\Delta J_{2,t}=0$, and vice versa. Applying this condition causes each data point to be reconstructed by only one of the jump events. The procedure is as follows:

If $\Delta J_{1,t}=1$ and $\Delta J_{2,t}=0$, then the data point is reconstructed by:

$$\Delta y_t = \bar{\mu}\,\Delta t + \xi_1$$

If $\Delta J_{1,t}=0$ and $\Delta J_{2,t}=1$, then the data point is reconstructed by:

$$\Delta y_t = \bar{\mu}\,\Delta t + \xi_2$$

With this modification, the shortcoming of the jump-diffusion model can be resolved. In this model, we assume that $\xi_1$ and $\xi_2$ are two zero-mean Gaussian random variables with variances $\sigma_{\xi_1}^2$ and $\sigma_{\xi_2}^2$, i.e., $\xi_1 \sim N(0, \sigma_{\xi_1}^2)$ and $\xi_2 \sim N(0, \sigma_{\xi_2}^2)$. These two random variables produce fluctuations that are additively superimposed on the trajectory generated by the deterministic dynamics. In the following, we describe the model in detail and demonstrate that all the unknown parameters of the model can be derived directly from the historical stock prices.

2 Model description

In [22] we introduced the following general stochastic dynamical equation, which includes a deterministic drift term ($\bar{\mu}\,dt$) and a combination of stochastic terms with jumpy behavior ($\xi_i\,dJ_{i,t}$):

$$dy_t = \bar{\mu}\,dt + \sum_{i=1}^{N} \xi_i\,dJ_{i,t} \tag{14}$$

where $\bar{\mu}\,dt$ indicates the deterministic part of the process and $J_{1,t}, J_{2,t}, \ldots$ are Poisson jump processes. The jumps have rates $\lambda_1, \lambda_2, \ldots$ and sizes $\xi_1, \xi_2, \ldots$, which we assume have zero-mean Gaussian distributions with variances $\sigma_{\xi_1}^2, \sigma_{\xi_2}^2, \ldots$, respectively. In this article, we use this equation specifically to simulate asset prices. For this purpose, we start with the simplest form of Eq. 14, which includes a drift term and a single jump process. We will demonstrate that such a jump-drift equation is able to describe the discrete-time evolution of price time series in Black-Scholes markets. Since real markets are usually jump-diffusion markets, in the second section we extend the modeling by considering two jump processes with different rates ($\lambda_1$, $\lambda_2$) and different random sizes ($\xi_1$, $\xi_2$), and use it to model prices in actual markets. At each stage, we will demonstrate that all unknown parameters involved in the model can be derived non-parametrically from the historical price data. It should be noted that, owing to the small number of data points in the price time series or the lack of diversity in the sizes of the fluctuations, we model prices by considering only two jump processes. However, depending on the number of available data points and the variety of amplitudes, one can extend the proposed model.

2.1 Jump-drift modeling

In the first step, we consider Eq. 14 in its simplest form, including a drift term and a single stochastic term with jumpy behavior, and show that it can be used to reconstruct the price data of diffusion markets (e.g., Black-Scholes markets). The general form of a jump-drift equation is as follows:

$$dy_t = \bar{\mu}\,dt + \xi\,dJ_t \tag{15}$$

where $y_t = \ln S_t$ is the log-price, $\bar{\mu}\,dt$ denotes the deterministic drift part of the dynamics, and $J_t$ is a Poisson jump process characterized by the rate $\lambda$ and the size $\xi$. We assume that $\xi$ is a random variable with a zero-mean Gaussian distribution, i.e., $\xi \sim N(0, \sigma_\xi^2)$. We also assume that the Poisson-distributed jumps $dJ_t$ and the jump size $\xi$ are independent.

Integration of Eq. 15 over $(t, t+\Delta t)$ gives us:

$$\Delta y_t = \bar{\mu}\,\Delta t + \xi\,\Delta J_t \tag{16}$$

where $\Delta y_t = y_{t+\Delta t} - y_t = \ln\frac{S_{t+\Delta t}}{S_t}$ is the log-return, $\Delta t$ is the length of the time interval between two consecutive points, and $\Delta J_t = J_{t+\Delta t} - J_t$ follows a Poisson distribution with mean $\lambda\Delta t$.

In turn, the stock price can be determined as:

$$S_{t+\Delta t} = S_t \exp\left[\bar{\mu}\,\Delta t + \xi\,\Delta J_t\right]$$

To reconstruct price data with the above relation, we must find the three parameters $\bar{\mu}$, $\sigma_\xi^2$ and $\lambda\Delta t$. We now show that all these parameters can be estimated directly from the log-return time series $\Delta y_t$. For this purpose, we derive the statistical moments of $\Delta y_t$ from Eq. 16 (note that $\Delta J_t$ and $\xi$ are independent):

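$$\begin{aligned}
M_1 &= \langle \Delta y_t \rangle = \langle \bar{\mu}\,\Delta t \rangle + \langle \xi \rangle \langle \Delta J_t \rangle \\
M_2 &= \left\langle (\Delta y_t - \bar{\mu}\,\Delta t)^2 \right\rangle = \langle \xi^2 \rangle \left\langle (\Delta J_t)^2 \right\rangle \\
M_4 &= \left\langle (\Delta y_t - \bar{\mu}\,\Delta t)^4 \right\rangle = \langle \xi^4 \rangle \left\langle (\Delta J_t)^4 \right\rangle
\end{aligned}$$

where $\langle\cdot\rangle$ denotes averaging over the data, so that $M_1$ is the mean of the log-returns, and $M_2$ and $M_4$ are the second and fourth statistical moments of the log-returns about the mean. For small $\Delta t$, all statistical moments of the jumps are proportional to $\lambda\Delta t$, i.e., $\langle (\Delta J_t)^m \rangle = \lambda\Delta t$ [19, 20]. In addition, for a zero-mean Gaussian random variable $\xi$ with variance $\sigma_\xi^2$, all even-order statistical moments are given by $\langle \xi^{2l} \rangle = \frac{(2l)!}{2^l\,l!} \langle \xi^2 \rangle^l$. The above relations therefore become (note that $\langle \xi \rangle = 0$ and $\langle \xi^2 \rangle = \sigma_\xi^2$):

$$\begin{aligned}
M_1 &= \bar{\mu}\,\Delta t \\
M_2 &= \sigma_\xi^2\,\lambda\Delta t \\
M_4 &= 3\sigma_\xi^4\,\lambda\Delta t
\end{aligned} \tag{17}$$

According to the first relation in Eq. 17, the mean of the log-returns ($M_1$) gives us the drift parameter $\bar{\mu}$, and the second- and fourth-order moments ($M_2$, $M_4$) identify the jump characteristics, namely:

$$\begin{aligned}
\bar{\mu} &= \frac{1}{\Delta t} M_1 \\
\sigma_\xi^2 &= \frac{M_4}{3M_2} \\
\lambda\Delta t &= \frac{M_2}{\sigma_\xi^2}
\end{aligned} \tag{18}$$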

We claim that the proposed jump-drift dynamics enable us to model diffusion processes such as the Black-Scholes process. We will check the validity of this claim by reconstructing a Black-Scholes process via the new dynamics, using the parameters determined from Eq. 18. Before that, let us provide the following two criteria for evaluating the reconstructed process:

1) We know from Wick’s theorem that, for the time series of the Black-Scholes process, the statistical moments of the data satisfy the relation $\frac{M_4}{3M_2^2} \approx 1$, which follows from the fact that the short-time propagator of the Black-Scholes dynamics is a Gaussian distribution. Therefore, if the proposed jump-drift dynamics are capable of reconstructing a time series that is statistically similar to the original Black-Scholes time series, then the statistical moments of the reconstructed data should satisfy Wick’s relation, i.e., $\left(\frac{M_4}{3M_2^2}\right)_{\mathrm{rec}} \approx 1$.

2) In continuation of the previous point, we find the ratio $\frac{M_4}{3M_2^2}$ from relations (17):

$$\frac{M_4}{3M_2^2} = \frac{3\sigma_\xi^4\,\lambda\Delta t}{3\left(\sigma_\xi^2\,\lambda\Delta t\right)^2} = \frac{1}{\lambda\Delta t}$$

By comparing this relation with Wick’s relation, i.e., $\frac{M_4}{3M_2^2} \approx 1$, we expect that $\lambda\Delta t = 1$. On the other hand, if $\lambda\Delta t = 1$, then the second moment in Eq. 17 becomes:

$$M_2 = \sigma_\xi^2$$

whereas the second moment of the original Black-Scholes process is $M_2 = \sigma^2\Delta t$ (Eq. 7). Therefore, it can be concluded that if the new model works correctly, the estimate of the jump amplitude ($\sigma_\xi^2$) should be equal to the variance of the original data ($\sigma^2\Delta t$), namely,

$$\sigma_\xi^2 = \sigma^2\Delta t$$

In the following, we reconstruct a Black-Scholes process with known drift and volatility parameters via the jump-drift equation, and then evaluate the reconstructed data.

Example 1. First, we generate a synthetic time series $\Delta y_t$ with $10^6$ data points via the Black-Scholes dynamics (Eq. 5), using the preset parameters $\mu=1.5$ and $\sigma=1$ with $\Delta t=0.004$. In Figure 1, we show the trajectory of 1,500 of the $10^6$ generated data points so that the fluctuations can be clearly seen (blue graph). By obtaining the statistical moments $M_n$ for $n=1,2,4$ from the generated data and substituting them into relations (18), we determine the parameters required for the new model. The results are as follows:

Figure 1. Upper panel: A sample path of synthetic log-returns generated via Black-Scholes dynamics using the preset parameters $\mu=1.5$, $\sigma=1$ and $\Delta t=0.004$. Lower panel: A sample path of log-returns reconstructed via jump-drift dynamics.

Statistical moments determined from the generated data:

$$M_1 = 0.004,\quad M_2 = 0.004,\quad M_4 = 4.8121 \times 10^{-5},\quad \frac{M_4}{3M_2^2} = 1.002$$

Required parameters for the new model:

$$\begin{aligned}
\bar{\mu} &= \frac{1}{\Delta t} M_1 = 1 \\
\sigma_\xi^2 &= \frac{M_4}{3M_2} = 0.00401 \quad (\text{in agreement with } \sigma^2\Delta t = 0.004) \\
\lambda\Delta t &= \frac{M_2}{\sigma_\xi^2} = 0.9971
\end{aligned}$$

In the second step, we reconstruct a time series $\Delta y_t$ with $10^6$ data points via the jump-drift equation (Eq. 16). For comparison with the original data, a sample path of 1,500 reconstructed data points is shown in Figure 1 (red graph). Finally, to ensure that the two time series (generated and reconstructed) are statistically equivalent, we obtain the statistical moments of the reconstructed data and check that $\left(\frac{M_4}{3M_2^2}\right)_{\mathrm{rec}} \approx 1$ holds. The results are as follows:

Statistical moments of the reconstructed data:

$$M_1 = 0.004,\quad M_2 = 0.004,\quad M_4 = 4.8172 \times 10^{-5},\quad \left(\frac{M_4}{3M_2^2}\right)_{\mathrm{rec}} = 0.99851$$

As can be seen, the reconstructed data are statistically similar to original data with high accuracy, and there is a very good agreement between these results and the theory.
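The workflow of Example 1 can be sketched as follows. This is an illustrative reconstruction under stated assumptions rather than the authors' script: the seed is arbitrary, $\Delta J_t$ is drawn as a Bernoulli variable with probability $\lambda\Delta t$, and the estimated $\lambda\Delta t$ is clipped to 1 because sampling noise can push it slightly above 1.

```python
import numpy as np

rng = np.random.default_rng(1)  # illustrative seed
mu, sigma, dt, n = 1.5, 1.0, 0.004, 1_000_000

# Step 1: synthetic Black-Scholes log-returns (Eq. 5).
dy = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n)


def central_moments(x):
    """Return the mean and the 2nd and 4th moments about the mean."""
    m1 = x.mean()
    centered = x - m1
    return m1, np.mean(centered**2), np.mean(centered**4)


# Step 2: jump-drift parameters from Eq. 18.
m1, m2, m4 = central_moments(dy)
mu_bar = m1 / dt
sigma_xi2 = m4 / (3.0 * m2)
lam_dt = m2 / sigma_xi2

# Step 3: reconstruct via Eq. 16 with Delta J ~ Bernoulli(lam_dt).
# Assumption: lam_dt is clipped to 1 so that it remains a valid probability.
jump_occurs = rng.random(n) < min(lam_dt, 1.0)
dy_rec = mu_bar * dt + np.sqrt(sigma_xi2) * rng.standard_normal(n) * jump_occurs

# Step 4: Wick ratio M4 / (3 M2^2) for both series; values near 1 indicate Gaussianity.
_, m2_rec, m4_rec = central_moments(dy_rec)
wick_orig = m4 / (3.0 * m2**2)
wick_rec = m4_rec / (3.0 * m2_rec**2)
print(f"mu_bar={mu_bar:.3f}, sigma_xi2={sigma_xi2:.5f}, lam_dt={lam_dt:.4f}")
print(f"Wick ratio: original={wick_orig:.4f}, reconstructed={wick_rec:.4f}")
```

Because $\lambda\Delta t \approx 1$ for Gaussian data, almost every reconstructed point receives a Gaussian increment with variance $\sigma_\xi^2 \approx \sigma^2\Delta t$, which is why the two Wick ratios should both come out near 1.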

2.2 Jump-jump-drift modeling

In the previous section we modeled a continuous diffusion process through the jump-drift equation. Since real markets are usually jump-diffusion markets, generalizing the jump-drift model to a jump-jump-drift model improves the characterization of real market dynamics beyond a continuous process. The general form of a jump-jump-drift equation is as follows:

$$dy_t = \bar{\mu}\,dt + \xi_1\,dJ_{1,t} + \xi_2\,dJ_{2,t} \tag{19}$$

where $\bar{\mu}\,dt$ indicates the deterministic part of the process and $J_{1,t}$ and $J_{2,t}$ are Poisson jump processes. The jumps have rates $\lambda_1$ and $\lambda_2$, and sizes $\xi_1$ and $\xi_2$, which we assume have zero-mean Gaussian distributions, i.e., $\xi_1 \sim N(0, \sigma_{\xi_1}^2)$ and $\xi_2 \sim N(0, \sigma_{\xi_2}^2)$. We call $\sigma_{\xi_1}^2$ and $\sigma_{\xi_2}^2$ the jump amplitudes.

Integration of Eq. 19 over $(t, t+\Delta t)$ gives us:

$$\Delta y_t = \bar{\mu}\,\Delta t + \xi_1\,\Delta J_{1,t} + \xi_2\,\Delta J_{2,t} \tag{20}$$

Furthermore, the stock price can be determined from Eq. 20 as:

$$S_{t+\Delta t} = S_t \exp\left[\bar{\mu}\,\Delta t + \xi_1\,\Delta J_{1,t} + \xi_2\,\Delta J_{2,t}\right] \tag{21}$$

In modeling the stock price via Eq. 21, we also assume that the two jumps do not occur simultaneously, which means that in the time interval $[t, t+\Delta t]$, if, for example, $\Delta J_{1,t}$ occurs and takes the value 1, then $\Delta J_{2,t}$ does not occur and its value is 0, and vice versa. Let $\lambda_1\Delta t$ and $\lambda_2\Delta t$ be the probabilities of occurrence of $\Delta J_{1,t}$ and $\Delta J_{2,t}$ in a small time step $\Delta t$. If we assume that exactly one of the jumps ($\Delta J_{1,t}$ or $\Delta J_{2,t}$) occurs in each time step, then we can write:

$$\lambda_1\Delta t + \lambda_2\Delta t = 1 \tag{22}$$

According to this condition, we can discard one of the jump events at each time step and reconstruct the corresponding data point by the other jump event.

To model stock prices via Eq. 20, we must find the five unknown parameters $\bar{\mu}$, $\lambda_1\Delta t$, $\lambda_2\Delta t$, $\sigma_{\xi_1}^2$ and $\sigma_{\xi_2}^2$. We now show that all these parameters can be estimated directly from the log-returns time series $\Delta y_t$. For this purpose, we derive the statistical moments of $\Delta y_t$ from Eq. 20 (note that $\xi_1$ and $\xi_2$ are two Gaussian random variables independent of the jumps, and $\Delta J_{1,t}$ and $\Delta J_{2,t}$ do not occur simultaneously):

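$$\begin{aligned}
M_1 &= \langle \Delta y_t \rangle = \langle \bar{\mu}\,\Delta t \rangle + \langle \xi_1 \rangle \langle \Delta J_{1,t} \rangle + \langle \xi_2 \rangle \langle \Delta J_{2,t} \rangle \\
M_2 &= \left\langle (\Delta y_t - \bar{\mu}\,\Delta t)^2 \right\rangle = \langle \xi_1^2 \rangle \left\langle (\Delta J_{1,t})^2 \right\rangle + \langle \xi_2^2 \rangle \left\langle (\Delta J_{2,t})^2 \right\rangle \\
M_4 &= \left\langle (\Delta y_t - \bar{\mu}\,\Delta t)^4 \right\rangle = \langle \xi_1^4 \rangle \left\langle (\Delta J_{1,t})^4 \right\rangle + \langle \xi_2^4 \rangle \left\langle (\Delta J_{2,t})^4 \right\rangle \\
M_6 &= \left\langle (\Delta y_t - \bar{\mu}\,\Delta t)^6 \right\rangle = \langle \xi_1^6 \rangle \left\langle (\Delta J_{1,t})^6 \right\rangle + \langle \xi_2^6 \rangle \left\langle (\Delta J_{2,t})^6 \right\rangle
\end{aligned}$$

By using the relations $\langle (\Delta J_{1,t})^m \rangle = \lambda_1\Delta t$ and $\langle (\Delta J_{2,t})^m \rangle = \lambda_2\Delta t$ for the statistical moments of the jump processes, and the relations $\langle \xi_1^{2l} \rangle = \frac{(2l)!}{2^l\,l!} \langle \xi_1^2 \rangle^l$ and $\langle \xi_2^{2l} \rangle = \frac{(2l)!}{2^l\,l!} \langle \xi_2^2 \rangle^l$ for the even-order statistical moments of the zero-mean Gaussian random variables $\xi_1$ and $\xi_2$ with variances $\sigma_{\xi_1}^2$ and $\sigma_{\xi_2}^2$, we obtain (note that $\langle \xi_1 \rangle = \langle \xi_2 \rangle = 0$, $\langle \xi_1^2 \rangle = \sigma_{\xi_1}^2$, and $\langle \xi_2^2 \rangle = \sigma_{\xi_2}^2$):

$$\begin{aligned}
M_1 &= \bar{\mu}\,\Delta t \\
M_2 &= \sigma_{\xi_1}^2\,\lambda_1\Delta t + \sigma_{\xi_2}^2\,\lambda_2\Delta t \\
M_4 &= 3\sigma_{\xi_1}^4\,\lambda_1\Delta t + 3\sigma_{\xi_2}^4\,\lambda_2\Delta t \\
M_6 &= 15\sigma_{\xi_1}^6\,\lambda_1\Delta t + 15\sigma_{\xi_2}^6\,\lambda_2\Delta t
\end{aligned}$$

To find the five unknowns $\bar{\mu}$, $\lambda_1\Delta t$, $\lambda_2\Delta t$, $\sigma_{\xi_1}^2$ and $\sigma_{\xi_2}^2$, we need to add one more equation to the above relations. For this purpose, we use Eq. 22 in the form $\lambda_1\Delta t = 1 - \lambda_2\Delta t$ to reduce the number of unknowns, so that:

$$\begin{aligned}
M_1 &= \bar{\mu}\,\Delta t \\
M_2 &= \sigma_{\xi_1}^2 + \left(\sigma_{\xi_2}^2 - \sigma_{\xi_1}^2\right)\lambda_2\Delta t \\
M_4 &= 3\sigma_{\xi_1}^4 + 3\left(\sigma_{\xi_2}^4 - \sigma_{\xi_1}^4\right)\lambda_2\Delta t \\
M_6 &= 15\sigma_{\xi_1}^6 + 15\left(\sigma_{\xi_2}^6 - \sigma_{\xi_1}^6\right)\lambda_2\Delta t
\end{aligned} \tag{23}$$

Given the statistical moments $M_1$, $M_2$, $M_4$ and $M_6$ of the log-return time series, solving the above system of equations numerically determines the four unknown parameters $\bar{\mu}$, $\lambda_2\Delta t$, $\sigma_{\xi_1}^2$ and $\sigma_{\xi_2}^2$. Once $\lambda_2\Delta t$ is identified, $\lambda_1\Delta t$ is obtained from Eq. 22.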
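The text solves Eq. 23 numerically. As an alternative sketch (an assumption of this example, not the authors' procedure), note that with $u_1 = M_2$, $u_2 = M_4/3$ and $u_3 = M_6/15$ the system states that a two-point distribution on the variances $\{\sigma_{\xi_1}^2, \sigma_{\xi_2}^2\}$ with weights $\{1-\lambda_2\Delta t, \lambda_2\Delta t\}$ has first three moments $u_1, u_2, u_3$. Such a two-point moment problem has a closed-form solution: the two variances are the roots of a quadratic. The function and key names below are hypothetical.

```python
import math


def two_jump_moments(sigma_xi1_sq, sigma_xi2_sq, lam2_dt):
    """Forward map of Eq. 23: returns (M2, M4, M6) for given parameters."""
    w1, w2 = 1.0 - lam2_dt, lam2_dt
    m2 = w1 * sigma_xi1_sq + w2 * sigma_xi2_sq
    m4 = 3.0 * (w1 * sigma_xi1_sq**2 + w2 * sigma_xi2_sq**2)
    m6 = 15.0 * (w1 * sigma_xi1_sq**3 + w2 * sigma_xi2_sq**3)
    return m2, m4, m6


def solve_two_jump_moments(m1, m2, m4, m6, dt):
    """Closed-form inversion of Eq. 23 (two-point moment matching)."""
    u1, u2, u3 = m2, m4 / 3.0, m6 / 15.0
    denom = u2 - u1**2
    if denom <= 0.0:
        raise ValueError("moments are compatible with a single Gaussian component")
    # The variances are the roots of x^2 - c1*x + c0 = 0.
    c1 = (u3 - u1 * u2) / denom       # sum of the two variances
    c0 = (u1 * u3 - u2**2) / denom    # product of the two variances
    disc = c1**2 - 4.0 * c0
    if disc <= 0.0:
        raise ValueError("no pair of distinct real variances matches these moments")
    root = math.sqrt(disc)
    s1, s2 = 0.5 * (c1 - root), 0.5 * (c1 + root)
    lam2_dt = (u1 - s1) / (s2 - s1)
    return {
        "mu_bar": m1 / dt,
        "sigma_xi1_sq": s1,
        "sigma_xi2_sq": s2,
        "lam1_dt": 1.0 - lam2_dt,
        "lam2_dt": lam2_dt,
    }


# Round trip with illustrative values: build moments, then recover the parameters.
m2, m4, m6 = two_jump_moments(0.0004, 0.0014, 0.3)
recovered = solve_two_jump_moments(m1=1e-4, m2=m2, m4=m4, m6=m6, dt=1e-4)
print(recovered)
```

When the data are close to Gaussian, $M_4/3 - M_2^2$ approaches zero and the inversion becomes ill-conditioned, so the sketch raises an error rather than returning unstable values. With sample moments, the recovered quantities should also be checked for positivity and for $0 \le \lambda_2\Delta t \le 1$.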

We claim that the proposed dynamics enable us to model time series with jump discontinuities more accurately than the classic jump-diffusion dynamics. We will check the validity of this claim by reconstructing a jump-diffusion process via the jump-jump-drift equation. Before that, let us support this claim by showing that the new relations in Eq. 23 are generalizations of the jump-diffusion relations in Eq. 11. For this purpose, we consider the case in which $\sigma_{\xi_1}^2 \ll \sigma_{\xi_2}^2$, so that $\sigma_{\xi_1}^2$ can be ignored compared to $\sigma_{\xi_2}^2$, and at the same time $\sigma_{\xi_1}^2$ is so small that $\sigma_{\xi_1}^4 \approx \sigma_{\xi_1}^6 \approx 0$. Under these special conditions, relations (23) can be written as follows:

$$\begin{aligned}
M_1 &= \bar{\mu}\,\Delta t \\
M_2 &= \sigma_{\xi_1}^2 + \sigma_{\xi_2}^2\,\lambda_2\Delta t \\
M_4 &= 3\sigma_{\xi_2}^4\,\lambda_2\Delta t \\
M_6 &= 15\sigma_{\xi_2}^6\,\lambda_2\Delta t
\end{aligned}$$

As can be seen, these relations have the same form as the relations of the jump-diffusion model (Eq. 11): $\sigma_{\xi_1}^2$ has replaced $\sigma^2\Delta t$ and identifies the diffusion part, and $\sigma_{\xi_2}^2\,\lambda_2\Delta t$ plays the same role as $\sigma_\xi^2\,\lambda\Delta t$. This means that under these special conditions ($\sigma_{\xi_1}^2 \ll \sigma_{\xi_2}^2$ and $\sigma_{\xi_1}^4 \approx \sigma_{\xi_1}^6 \approx 0$), the new model works like the jump-diffusion model and the parameters obtained from the data are the same in both models. If the data fluctuations do not satisfy these conditions, however, the proposed model is expected to yield more accurate estimates than the jump-diffusion model. By analyzing stock prices, we found that although the release of exciting news in the market causes sudden jumps in log-returns, the amplitude of these jumps is not much larger than the amplitude of the fluctuations on normal days. Therefore, the new model appears to be better suited to modeling and forecasting prices.

In the following, to demonstrate the reliability of the new model, we test it on synthetic data. Furthermore, to ensure the effectiveness of the proposed approach in different conditions, we test the model with different synthetic data.

Example 2. First, we test the model with the data generated through the Black-Scholes process in Example 1. By obtaining the statistical moments $M_n$ for $n=1,2,4,6$ from the generated data and substituting them into relations (23), we determine the parameters required for the new model via the numerical solution of the resulting system of equations. Since the data generated in Example 1 are diffusive, and we have already modeled them through the jump-drift equation, we expect the occurrence rate of one of the jumps to be zero when we model the same data through the jump-jump-drift equation. The following results confirm this expectation:

$$\begin{aligned}
\bar{\mu} &= 1 \\
\sigma_{\xi_1}^2 &= 0.004 \\
\lambda_1\Delta t &= 0.99991 \\
\sigma_{\xi_2}^2 &= 0.0007 \\
\lambda_2\Delta t &= 0.00010
\end{aligned}$$

The value $\lambda_2\Delta t \approx 0$ shows that when the time series belongs to the class of continuous diffusion processes (e.g., the Black-Scholes process), the jump-jump-drift dynamics model it using only one jump process and effectively omit the second. In the next step, we test the model on two synthetic log-return time series generated via the jump-diffusion equation (Eq. 9) with preset parameters. Each time series contains $3\times 10^6$ data points, generated with $\bar{\mu}=5$ and $\sigma=2$ and a sampling interval $\Delta t=0.0001$, so that $\sigma^2\Delta t=0.0004$. The jumps in both time series have the same jump rate $\lambda\Delta t=0.3$ (jump rate per data point), but the jump amplitudes are $\sigma_\xi^2=0.1$ and $\sigma_\xi^2=0.001$, respectively. We deliberately chose jump amplitudes of different orders of magnitude to observe the effect of the amplitude on the retrieval of the coefficients. Note that in the first case $\frac{\sigma_\xi^2}{\sigma^2\Delta t}=250$ and in the second case $\frac{\sigma_\xi^2}{\sigma^2\Delta t}=2.5$. That is, in the first case the variance of the diffusion part ($\sigma^2\Delta t$) is negligible compared to the amplitude of the jumps ($\sigma_\xi^2$), and, as mentioned earlier, we expect both models to show almost the same results. In the second case, we expect the estimates of the new model to be more accurate than those of the jump-diffusion model.

By obtaining the statistical moments $M_n$ for $n=1,2,4,6$ from the generated data and substituting them into relations (11) and (23), we determine the parameters of the two models. The following results are estimated from the numerical solution of the corresponding systems of equations:

Case 1. Preset parameters:

$$\bar{\mu}=1,\quad \sigma^2\Delta t=0.0004,\quad \sigma_\xi^2=0.1,\quad \lambda\Delta t=0.3$$

Estimated parameters via the jump-diffusion model:

$$\bar{\mu}=1,\quad \sigma^2\Delta t=0.00031,\quad \sigma_\xi^2=0.1005,\quad \lambda\Delta t=0.299$$

Estimated parameters via the jump-jump-drift model:

$$\bar{\mu}=1,\quad \sigma_{\xi_1}^2=0.00045,\quad \sigma_{\xi_2}^2=0.1005,\quad \lambda_2\Delta t=0.299,\quad \lambda_1\Delta t=0.701$$

Case 2. Preset parameters:

$$\bar{\mu}=1,\quad \sigma^2\Delta t=0.0004,\quad \sigma_\xi^2=0.001,\quad \lambda\Delta t=0.3$$

Estimated parameters via the jump-diffusion model:

$$\bar{\mu}=1,\quad \sigma^2\Delta t=0.00013,\quad \sigma_\xi^2=0.0012,\quad \lambda\Delta t=0.5$$

Estimated parameters via the jump-jump-drift model:

$$\bar{\mu}=1,\quad \sigma_{\xi_1}^2=0.00040,\quad \sigma_{\xi_2}^2=0.0014,\quad \lambda_2\Delta t=0.302,\quad \lambda_1\Delta t=0.698$$

The above results show that in the first case both models lead to almost the same results, but in the second case the proposed model leads to more accurate results (note that in the new model, $\sigma_{\xi_1}^2$ is an estimate of the variance of the diffusive data, i.e., $\sigma_{\xi_1}^2=\sigma^2\Delta t$, and $\sigma_{\xi_2}^2$ is an estimate of the variance of the jumpy data, i.e., $\sigma_{\xi_2}^2=\sigma_\xi^2$).
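To see why the jump-diffusion estimates degrade in the second case, one can feed exact (population) moments into the inversion of Eq. 11 instead of sample moments. The sketch below assumes that $\Delta J_t$ is Bernoulli with probability $\lambda\Delta t$, so that the centered log-returns of Eq. 9 form a two-component Gaussian mixture with variances $\sigma^2\Delta t$ (no jump) and $\sigma^2\Delta t + \sigma_\xi^2$ (jump). The function names are illustrative.

```python
def exact_mixture_moments(sigma2_dt, sigma_xi2, lam_dt):
    """Exact central moments M2, M4, M6 of Eq. 9 with Bernoulli jumps."""
    v0 = sigma2_dt              # variance of a step without a jump
    v1 = sigma2_dt + sigma_xi2  # variance of a step with a jump
    m2 = (1 - lam_dt) * v0 + lam_dt * v1
    m4 = 3 * ((1 - lam_dt) * v0**2 + lam_dt * v1**2)
    m6 = 15 * ((1 - lam_dt) * v0**3 + lam_dt * v1**3)
    return m2, m4, m6


def invert_jump_diffusion(m2, m4, m6):
    """Inversion of Eq. 11; returns (sigma^2 dt, sigma_xi^2, lambda dt)."""
    sigma_xi2 = m6 / (5 * m4)
    lam_dt = m4 / (3 * sigma_xi2**2)
    sigma2_dt = m2 - sigma_xi2 * lam_dt
    return sigma2_dt, sigma_xi2, lam_dt


case1 = invert_jump_diffusion(*exact_mixture_moments(0.0004, 0.1, 0.3))
case2 = invert_jump_diffusion(*exact_mixture_moments(0.0004, 0.001, 0.3))
print("Case 1 (sigma^2 dt, sigma_xi^2, lambda dt):", case1)
print("Case 2 (sigma^2 dt, sigma_xi^2, lambda dt):", case2)
```

Under these assumptions, the second case yields roughly $\sigma_\xi^2 \approx 0.00124$, $\lambda\Delta t \approx 0.455$ and $\sigma^2\Delta t \approx 0.000135$ even with perfect moments, broadly in line with the jump-diffusion estimates reported above. The deviation therefore stems from the diffusion contributions that Eq. 11 neglects in $M_4$ and $M_6$, not from sampling noise.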

3 Data and methodology

Our dataset comprises the daily closing prices of Apple and IBM stocks, as well as gold prices with two different time horizons (weekly and hourly). For Apple and IBM stocks, the historical data are daily closing prices from 1 June 2020 to 1 June 2023, obtained from Yahoo Finance. For gold, the historical data are weekly gold prices from 5 January 2004 to 3 January 2022, as well as hourly gold prices from 11 March 2022 to 11 November 2022, obtained from the Dukascopy historical data source.

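For each of the collected datasets, we obtain the log-returns time series $\Delta y_t$ by:

$$\Delta y_t = \ln\frac{S_{t+1}}{S_t},\quad t=1,2,\ldots$$

where $S_t$ and $S_{t+1}$ are consecutive prices in the price time series. Afterwards, we determine the statistical moments of the log-returns data as follows:

$$\begin{aligned}
M_1 &= \overline{\Delta y} = \frac{1}{N}\sum_{t=1}^{N} \Delta y_t \\
M_n &= \frac{1}{N}\sum_{t=1}^{N} \left(\Delta y_t - \overline{\Delta y}\right)^n,\quad \text{for } n=2,4,6
\end{aligned}$$

where $N$ is the number of log-returns data points. By determining these statistical moments and substituting them into relations (23), we identify the required parameters of the model, i.e., $\bar{\mu}$, $\sigma_{\xi_1}^2$, $\sigma_{\xi_2}^2$, $\lambda_1\Delta t$ and $\lambda_2\Delta t$. Using these parameters, we reconstruct the log-returns data by the following equation:

$$\Delta y_t = \bar{\mu}\,\Delta t + \xi_1\,\Delta J_{1,t} + \xi_2\,\Delta J_{2,t}$$

In addition, we use the following equation to forecast prices for several time steps after the chosen historical period:

$$S_{t+1} = S_t \exp\left[\bar{\mu}\,\Delta t + \xi_1\,\Delta J_{1,t} + \xi_2\,\Delta J_{2,t}\right]$$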
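The data preparation and forecasting steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the price list, the jump variances and the jump probability are hypothetical placeholders (they are not the values of Table 2), and at each step exactly one of the two jumps is drawn, with probabilities $\lambda_1\Delta t$ and $1-\lambda_1\Delta t$.

```python
import numpy as np


def log_returns(prices):
    """Log-returns ln(S_{t+1} / S_t) of a price series."""
    prices = np.asarray(prices, dtype=float)
    return np.log(prices[1:] / prices[:-1])


def central_moments(dy, orders=(2, 4, 6)):
    """Mean M1 and the moments M_n about the mean for the requested orders."""
    m1 = dy.mean()
    return m1, {n: np.mean((dy - m1) ** n) for n in orders}


def simulate_price_paths(s0, mu_bar_dt, sigma_xi1_sq, sigma_xi2_sq, lam1_dt,
                         horizon, n_paths, rng):
    """Forecast paths from the jump-jump-drift price equation.

    At every step one jump is selected: jump 1 with probability lam1_dt,
    otherwise jump 2, and a zero-mean Gaussian size is drawn accordingly.
    """
    use_first = rng.random((n_paths, horizon)) < lam1_dt
    std = np.where(use_first, np.sqrt(sigma_xi1_sq), np.sqrt(sigma_xi2_sq))
    dy = mu_bar_dt + std * rng.standard_normal((n_paths, horizon))
    return s0 * np.exp(np.cumsum(dy, axis=1))


rng = np.random.default_rng(42)                        # illustrative seed
prices = np.array([100.0, 101.0, 99.5, 102.0, 101.2])  # toy data, not real quotes
dy = log_returns(prices)
m1, moments = central_moments(dy)

paths = simulate_price_paths(
    s0=prices[-1], mu_bar_dt=m1,
    sigma_xi1_sq=1e-4, sigma_xi2_sq=4e-4, lam1_dt=0.8,  # hypothetical parameters
    horizon=30, n_paths=1000, rng=rng,
)
print(paths.shape)
```

In practice the hypothetical variances and probability would be replaced by the solution of relations (23) for the chosen asset, and `mu_bar_dt` by the sample mean $M_1 = \bar{\mu}\,\Delta t$ of its log-returns.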

To determine the accuracy of the forecasts, we use the “Mean Absolute Percentage Error” (MAPE), calculated as follows:

$$\mathrm{MAPE} = \frac{1}{N}\sum_{t=1}^{N} \frac{|F_t - S_t|}{S_t}$$

where $F_t$ is the forecasted price at time $t$, $S_t$ is the actual stock price at time $t$, and $N$ is the number of predicted data points. We use MAPE values to evaluate our forecasting method. A scale for judging model accuracy based on the MAPE criterion was presented by Lawrence et al. [23], and is shown in Table 1.

Table 1. A scale of judgment of forecast accuracy.
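The MAPE criterion defined above is straightforward to evaluate for a whole ensemble of forecast paths at once. The sketch below is illustrative: the function name and the toy numbers are assumptions, and the result is returned as a fraction (multiply by 100 to express it as a percentage).

```python
import numpy as np


def mape(forecast, actual):
    """Mean absolute percentage error along the last axis, as a fraction.

    `forecast` may hold one path (shape: horizon) or many paths
    (shape: n_paths x horizon); `actual` has shape: horizon.
    """
    forecast = np.asarray(forecast, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return np.mean(np.abs(forecast - actual) / actual, axis=-1)


actual = [100.0, 200.0]                 # toy realized prices
forecasts = [[110.0, 180.0],            # toy forecast path 1
             [100.0, 200.0]]            # toy forecast path 2 (perfect forecast)
per_path = mape(forecasts, actual)
print("smallest:", per_path.min(), "largest:", per_path.max(), "average:", per_path.mean())
```

Applied to an array of simulated trajectories, the smallest, largest and average per-path values correspond to the summary statistics used to compare models.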

3.1 Research output and discussion

In the following, based on the elements described in the methodology, we first model Apple and IBM stocks and predict their prices for a period of 30 days. The historical data used are daily closing stock prices from 1 June 2020 to 1 June 2023. For daily prices, the trading period is $\Delta t=\frac{1}{252}$ years (based on a year with an average of 252 stock trading days). The parameters obtained from the model are presented in Table 2. Based on this table, for both stocks $\sigma_{\xi_2}^2$ is not more than one order of magnitude larger than $\sigma_{\xi_1}^2$: for Apple stock we have $\frac{\sigma_{\xi_2}^2}{\sigma_{\xi_1}^2}=4.34$, and for IBM stock this ratio is $\frac{\sigma_{\xi_2}^2}{\sigma_{\xi_1}^2}=4.2$. As mentioned earlier, in this situation the new model performs better. Furthermore, for Apple stock the jump rates are $\lambda_1\Delta t=0.8248$ and $\lambda_2\Delta t=0.1752$, which means that, in the data reconstruction stage, 82.48% of the data points are reconstructed using a Gaussian random variable with the smaller variance $\sigma_{\xi_1}^2$, and 17.52% of the data points are reconstructed using a Gaussian random variable with the larger variance $\sigma_{\xi_2}^2$. For IBM stock, these rates are $\lambda_1\Delta t=0.6790$ and $\lambda_2\Delta t=0.3210$, respectively.

Table 2. Values of the drift, jump amplitudes, and jump rates obtained from historical daily prices of Apple and IBM stocks using the jump-jump-drift model.

To reconstruct log-return data through the proposed model, we use the equation $\Delta y_t = \bar{\mu}\,\Delta t + \xi_1\,\Delta J_{1,t} + \xi_2\,\Delta J_{2,t}$ and reconstruct a time series for $\Delta y_t$ that is statistically similar to the original one. Figures 2, 3 show the actual and reconstructed log-returns of Apple and IBM stocks, respectively.

Figure 2. Upper panel: Time plot of actual daily log-returns of Apple stock from 1 June 2020 to 1 June 2023 (756 data points). Lower panel: Time plot of reconstructed daily log-returns of Apple stock using the proposed jump-jump-drift model.

Figure 3. Upper panel: Time plot of actual daily log-returns of IBM stock from 1 June 2020 to 1 June 2023 (756 data points). Lower panel: Time plot of reconstructed daily log-returns of IBM stock using the proposed jump-jump-drift model.

To predict stock prices, we use the parameters estimated from historical data. The forecast period is 30 days, covering the days immediately after the selected historical period. The predictions are simulated with 1,000 realizations of the trajectory. Each trajectory is realized using the equation $S_{t+1} = S_t \exp\left[\bar{\mu}\,\Delta t + \xi_1\,\Delta J_{1,t} + \xi_2\,\Delta J_{2,t}\right]$ with 30 iterations. Figures 4, 5 show the daily forecasts of Apple and IBM stock prices, respectively. To compare the predictions with the actual prices, the graph of realized prices over the same 30 days is also shown in each figure (cyan graph). As can be seen, for both stocks the actual prices lie within the predicted trajectories. In addition, the data analysis shows that all 1,000 predicted trajectories of the Apple stock price have MAPE values below 20% (with the smallest MAPE = 1.4%, the largest MAPE = 19.8% and the average MAPE = 5.84%). Meanwhile, the corresponding values obtained through the jump-diffusion model are as follows:

Figure 4
www.frontiersin.org

Figure 4. Graphical representation of the predicted paths of the daily price of Apple stock using jump-jump-drift modeling. The time period of all predictions is 30 days and their starting point is 1 June 2023. The cyan graph is the actual price path realized over the same 30 days, and the colored graphs are the 1,000 possible paths predicted by the model.

Figure 5
www.frontiersin.org

Figure 5. Graphical representation of the predicted paths of the daily price of IBM stock using jump-jump-drift modeling. The time period of all predictions is 30 days and their starting point is 1 June 2023. The cyan graph is the actual price path realized over the same 30 days, and the colored graphs are the 1,000 possible paths predicted by the model.

The smallest MAPE = 1.51%, the largest MAPE = 24.3%, and the average MAPE = 6.32%. These results show that the jump-jump-drift model has a better performance than the jump-diffusion model, and if the time period of the forecasts becomes larger (e.g., in annual forecasts), the difference between the forecasts of the two models becomes more visible.
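
The forecasting procedure (many simulated trajectories, each scored by its MAPE against the realized prices) can be sketched in Python as follows. This is only an illustration under stated assumptions: one zero-mean Gaussian jump per step, the standard definition of the mean absolute percentage error, a flat hypothetical "actual" price path, and placeholder values for the drift and for `sigma1`. The names `simulate_paths` and `mape` are not from the article.

```python
import math
import random


def simulate_paths(s0, steps, n_paths, mu_bar, dt, p1, sigma1, sigma2, seed=0):
    """Iterate S_{t+1} = S_t * exp(mu_bar*dt + jump) for several trajectories."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        price, path = s0, []
        for _ in range(steps):
            # One of the two zero-mean Gaussian jumps fires at each step
            # (an assumption of this sketch).
            sigma = sigma1 if rng.random() < p1 else sigma2
            price *= math.exp(mu_bar * dt + rng.gauss(0.0, sigma))
            path.append(price)
        paths.append(path)
    return paths


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (standard definition)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical inputs: a flat "actual" path and placeholder parameters.
actual = [100.0] * 30
paths = simulate_paths(
    s0=100.0, steps=30, n_paths=1000, mu_bar=0.2, dt=1 / 252,
    p1=0.8248, sigma1=0.01, sigma2=0.01 * math.sqrt(4.34),
)
scores = [mape(actual, path) for path in paths]
print(f"min {min(scores):.2f}%  mean {sum(scores) / len(scores):.2f}%  "
      f"max {max(scores):.2f}%")
```

With real data, `actual` would be replaced by the realized prices over the forecast window, and the smallest, largest, and average entries of `scores` would correspond to the three MAPE figures reported for each asset.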

The results of the IBM stock price predictions are even more striking than those for Apple stock. Analysis of the IBM simulation outputs shows that all 1,000 predicted trajectories have MAPE values below 15% (smallest MAPE = 1.22%, largest MAPE = 14.02%, average MAPE = 4.62%), indicating good accuracy of the model predictions. The corresponding values obtained with the jump-diffusion model are: smallest MAPE = 1.62%, largest MAPE = 20.7%, and average MAPE = 5.12%.

Finally, to assess the effectiveness of the proposed approach for different time horizons, we simulate gold prices at two different time horizons. The historical data are weekly gold prices from 5 January 2004 to 3 January 2022 and hourly gold prices from 11 March 2022 to 11 November 2022. For weekly prices, the trading period is $t = 1/52$ years (based on a year with an average of 52 trading weeks), while for hourly prices the trading period is $t = 1/5916$ years (based on 2022 with 5,916 trading hours). The parameters obtained from the model are presented in Table 3. According to this table, in both cases $\sigma_{\xi_2}^2$ is at most about one order of magnitude larger than $\sigma_{\xi_1}^2$: for weekly data $\sigma_{\xi_2}^2/\sigma_{\xi_1}^2 = 4.6$, and for hourly data $\sigma_{\xi_2}^2/\sigma_{\xi_1}^2 = 11.7$.

Table 3. Values of the drift, jump amplitudes, and jump rates obtained from historical gold prices (weekly and hourly) using jump-jump-drift modeling.

To reconstruct log-return data with the proposed model, we use the equation $y(t) = \bar{\mu}t + \xi_1 J_1(t) + \xi_2 J_2(t)$ and generate a time series for $y(t)$ that is statistically similar to the original one. Figures 6, 7 show the actual and reconstructed log-returns of weekly and hourly prices of gold, respectively.

Figure 6. Upper panel: Time plot of actual weekly log-returns of gold prices from 5 January 2004 to 3 January 2022 (940 data points). Lower panel: Time plot of reconstructed weekly log-returns of gold prices using the proposed jump-jump-drift model.

Figure 7. Upper panel: Time plot of actual hourly log-returns of gold prices from 11 March 2022 to 11 November 2022 (3,999 data points). Lower panel: Time plot of reconstructed hourly log-returns of gold prices using the proposed jump-jump-drift model.

To predict gold prices, we use the parameters estimated from historical data. The forecast period is 30 weeks for weekly prices and 300 h for hourly prices, in each case immediately following the historical period. The predictions are simulated with 1,000 realizations of the trajectory, each generated by iterating the equation $S_{t+1} = S_t\,e^{\bar{\mu}t + \xi_1 J_1(t) + \xi_2 J_2(t)}$ 30 times for the weekly gold price and 300 times for the hourly gold price. Figures 8, 9 show the weekly and hourly gold price forecasts, respectively. As can be seen, in both cases the actual prices lie within the trajectories predicted by the model. Furthermore, the data analysis shows that all 1,000 predicted trajectories of the weekly gold price have MAPE values below 30% (smallest MAPE = 2.1%, largest MAPE = 29.43%, average MAPE = 7.57%), which are acceptable forecasts. The corresponding values obtained with the jump-diffusion model are: smallest MAPE = 2.3%, largest MAPE = 35.7%, and average MAPE = 7.85%.

Figure 8. Graphical representation of the predicted paths of the weekly price of gold using jump-jump-drift modeling. The time period of all predictions is 30 weeks and their starting point is 3 January 2022. The cyan graph is the actual price path realized over the same 30 weeks, and the colored graphs are the 1,000 possible paths predicted by the model.

Figure 9. Graphical representation of the predicted paths of the hourly price of gold using jump-jump-drift modeling. The time period of all predictions is 300 h and their starting point is 11 November 2022. The cyan graph is the actual price path realized over the same 300 h, and the colored graphs are the 1,000 possible paths predicted by the model.

Analysis of hourly gold prices shows that all 1,000 predicted paths have MAPE values below 10% (smallest MAPE = 0.51%, largest MAPE = 7.17%, average MAPE = 1.82%), indicating very high accuracy of the model predictions for hourly time horizons. The corresponding values obtained with the jump-diffusion model are: smallest MAPE = 0.8%, largest MAPE = 10.3%, and average MAPE = 2.15%.

4 Conclusion

We noted that when data (e.g., stock prices) are sampled at discrete times, they appear as a sequence of discontinuous jump events, even if they have been sampled from a continuous process. This observation motivated a new modeling approach in which random variations in the sample path of a measured time series are attributed to jump events, even if the time series belongs to the class of diffusion processes. On this basis, we introduced a new stochastic dynamical equation comprising a deterministic drift term and a combination of several stochastic terms with jumpy behavior. The general form of this equation is as follows:

$$dy(t) = \bar{\mu}\,dt + \sum_{i=1}^{N} \xi_i\,dJ_i(t)$$

In this modeling we also assumed that the jump events do not occur simultaneously, so that the jumps do not overlap. We started with the simplest form of the equation, consisting of a deterministic drift term and a single jump process as the stochastic component, and argued that it can describe the discrete-time evolution of a diffusion process, e.g., the Black-Scholes process. Afterwards, we extended the equation by considering two jump processes with differently distributed sizes and used it to model assets such as stock prices and gold prices at different time horizons. We also demonstrated that in all cases the proposed model performs better than the old jump-diffusion model. It should be noted that, owing to the small number of available price data and the lack of diversity in the jump amplitudes, in this article we modeled price data using only two jump processes. However, depending on the number of data points and the variation in the amplitudes of the fluctuations, more stochastic terms can be kept in the equation to increase the accuracy of the modeling. On the other hand, the more terms the equation contains, the more unknowns the associated system of equations has, and the price is a longer runtime.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

Author contributions

AM: Conceptualization, Data curation, Formal Analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing–original draft, Writing–review and editing. HN: Conceptualization, Data curation, Formal Analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing–original draft, Writing–review and editing.

Funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Reddy K, Clinton V. Simulating stock prices using geometric Brownian motion: evidence from Australian companies. Australas Account Business Finance J (2016) 10(3):23–47. doi:10.14453/aabfj.v10i3.3

2. Synowiec D. Jump-diffusion models with constant parameters for financial log-return processes. Comput Math Appl (2008) 56(8):2120–7. doi:10.1016/j.camwa.2008.02.051

3. Azizah M, Irawan M, Putri E. Comparison of stock price prediction using geometric Brownian motion and multilayer perceptron. In: AIP conference proceedings: Depok, Indonesia. AIP Publishing (2020).

4. Mota PP, Esquível ML. Model selection for stock prices data. J Appl Stat (2016) 43(16):2977–87. doi:10.1080/02664763.2016.1155205

5. Benninga S. Financial modeling, fourth edition By Simon Benninga Hardcover. The MIT press (2014).

6. Bachelier L. Théorie de la spéculation. In: Annales scientifiques de l’École normale supérieure (1900).

7. Osborne MF. Brownian motion in the stock market. Operations Res (1959) 7(2):145–73. doi:10.1287/opre.7.2.145

8. Samuelson PA. Economic theory and mathematics--an appraisal. Am Econ Rev (1952) 42(2):56–66.

9. Black F, Scholes M. The pricing of options and corporate liabilities. J Polit economy (1973) 81(3):637–54. doi:10.1086/260062

10. Black F, Karasinski P. Bond and option pricing when short rates are lognormal. Financial Analysts J (1991) 47(4):52–9. doi:10.2469/faj.v47.n4.52

11. Risken H. Springer series in synergetics (1996).

12. Bouchaud J-P, Cont R. A Langevin approach to stock market fluctuations and crashes. Eur Phys J B-Condensed Matter Complex Syst (1998) 6:543–50. doi:10.1007/s100510050582

13. Hull J, White A. The pricing of options on assets with stochastic volatilities. J Finance (1987) 42(2):281–300. doi:10.1111/j.1540-6261.1987.tb02568.x

14. Mercurio D, Spokoiny V. Estimation of time dependent volatility via local change point analysis (2005).

15. Goldentayer L, Klebaner F, Liptser RS. Tracking volatility. Probl Inf Transm (2005) 41:212–29. doi:10.1007/s11122-005-0026-2

16. Merton RC. Option pricing when underlying stock returns are discontinuous. J financial Econ (1976) 3(1-2):125–44. doi:10.1016/0304-405x(76)90022-2

17. Heston SL. A closed-form solution for options with stochastic volatility with applications to bond and currency options. Rev financial Stud (1993) 6(2):327–43. doi:10.1093/rfs/6.2.327

18. Nelson DB. ARCH models as diffusion approximations. J Econom (1990) 45(1-2):7–38. doi:10.1016/0304-4076(90)90092-8

19. Anvari M, Tabar MRR, Peinke J, Lehnertz K. Disentangling the stochastic behavior of complex time series. Scientific Rep (2016) 6(1):35435. doi:10.1038/srep35435

20. Tabar R. Analysis and data-based reconstruction of complex nonlinear dynamical systems, 730. Springer (2019).

21. Lehnertz K, Zabawa L, Tabar MRR. Characterizing abrupt transitions in stochastic dynamics. New J Phys (2018) 20(11):113043. doi:10.1088/1367-2630/aaf0d7

22. Movahed AA, Noshad H. Introducing a new approach for modeling a given time series based on attributing any random variation to a jump event: jump-jump modeling. Scientific Rep (2024) 14(1):1234. doi:10.1038/s41598-024-51863-5

23. Excel FOFU. Fundamentals of forecasting using excel (2019).

Keywords: stock prices modeling, stochastic dynamical equation, Black-Scholes model, Poisson jump process, jump-diffusion model, jump-drift process

Citation: Movahed AA and Noshad H (2024) Introducing a new approach for modeling stock market prices using the combination of jump-drift processes. Front. Phys. 12:1402593. doi: 10.3389/fphy.2024.1402593

Received: 17 March 2024; Accepted: 18 June 2024;
Published: 18 July 2024.

Edited by:

Shuvojit Paul, Indian Institute of Science Education and Research Kolkata, India

Reviewed by:

N. Narinder, Technical University Dresden, Germany
Prasanta Panigrahi, Indian Institute of Science Education and Research Kolkata, India

Copyright © 2024 Movahed and Noshad. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Houshyar Noshad, hnoshad@aut.ac.ir
