
Luận án tiến sĩ: Models of selected problems in mathematical finance and numerical methods for stochastic differential equations


DOCUMENT INFORMATION

Basic information

Title: Models of Selected Problems in Mathematical Finance and Numerical Methods for Stochastic Differential Equations
Author: Timothy L. Seaman
Advisor: Harbir Lamba, Associate Professor
Institution: George Mason University
Program: Computational Sciences and Informatics
Document type: Dissertation
Year: 2006
City: Fairfax, Virginia
Pages: 133
File size: 12.9 MB

Structure

  • Introduction
    • 1.1 Financial Markets
      • 1.1.1 A Historical Perspective
      • 1.1.2 The Efficient Market Hypothesis
      • 1.1.3 The Modeling Philosophy
    • 1.2 Stochastic Differential Equations
    • 2.1 Agent-Based Simulations of Financial Markets
    • 2.2 Numerical Solution of Stochastic Differential Equations
    • 3.1 A Threshold Model of Investor Behavior
      • 3.1.1 Determination of the Asset Price
      • 3.1.2 Determination of an Investor’s Position
      • 3.1.3 Simulations
    • 3.2 Stochastic Differential Equations
      • 3.2.1 Solving SDEs
      • 3.2.2 The Balanced Milstein Method (Implicit)
      • 3.2.3 Adaptivity Using Dual Error Controls
      • 3.2.4 Additional Stochastic Error Controls
      • 4.1.1 Calibration of the Model
      • 4.1.2 Simulations without Herding and a = 0
      • 4.1.3 Simulations with Herding but a = 0
      • 4.1.4 The Full Model and Replication of the Stylized Facts
      • 4.1.5 Introducing Price Asymmetry into the Model

Content

The first problem concerns the relaxation of the Efficient Market Hypothesis and the development of agent-based models that can help explain the consistent, but poorly understood, non-Gaussian behavior observed in real financial markets.

Introduction

1.1 Financial Markets

The economic records of nations, industries and even individual companies are full of irregular cycles, bubbles and crashes. The tulip mania of seventeenth-century Netherlands and the South Sea Bubble of the eighteenth century are well-known examples of booming, speculative markets that eventually crashed [95]. By the same token, the stock market crashes of 1929, 1987 and 2000 sent economic reverberations to every corner of society, and each crash was preceded by a booming financial market.

To this day, none of these crashes are thoroughly understood. Consequently, we are still at risk, that is to say, our economy is still susceptible to severe market crashes.

The study of the economy, or ‘political economy’ as it is sometimes known, goes back only a few hundred years, to the time of Adam Smith. Much of the work of the classical economists (Smith, Ricardo, Malthus and Marx) focused on growth and change, such as the problems caused by economic fluctuations and unemployment [80]. Most of these eminent figures worked in Britain, and the fact that Great Britain was a leader in the Industrial Revolution is hardly coincidental. Adam Smith published his work towards the end of the same century, while Ricardo published in the first half of the nineteenth century. The famed German Karl Marx spent the latter part of his life in Great Britain, where he wrote Das Kapital. Unfortunately, Marx’s economic theories have been widely ignored by the general reader, because they were championed so dogmatically by the failed communist regimes of the twentieth century. These classical economists are known today for their theories about economics. Their theoretical models were surely based on empirical evidence, but how well were these theories and models tested?

Around 1870, more quantitative analysis began to enter the economic picture. Among others, Léon Walras, a trained physicist in Lausanne, Switzerland, introduced mathematical systems of analysis to the study of economics. The contributions Walras made had a great impact on economic theory. Nowadays, typical undergraduate courses in microeconomics begin with demand curves and supply curves and build to “Walrasian General Equilibrium” [67, 80]. Despite the usefulness of Walrasian General Equilibrium in understanding the general theory of markets, some economists object to the fact that money plays no role in the Walrasian microeconomic explanation of markets. Further, this lack creates a dichotomy in the transition to macroeconomics [67]. Other economists complain that, although the Walrasian mathematical approach applies more rigorous arguments to much economic reasoning, this approach can be carried too far [80]. Economics is not a physical science; there are no rigid, universal laws of human behavior. The Walrasian view, that of economics as a smoothly running machine that can be explained by mathematical arguments, tends to separate economics from its original, people-oriented foundations.

In the same spirit of using mathematics to explain economic phenomena, Louis Bachelier in 1900 wrote his Ph.D. thesis, Théorie de la Speculation, on the theory of financial markets [3]. He proposed that the price of a stock moved in a random fashion and, consequently, that this seemingly erratic motion of the stock market price was equivalent to a random walk. If these assertions were true, then changes in stock price would be distributed according to the well-known bell-shaped, or Gaussian, curve. Bachelier’s assertion of unpredictability also lent credence to the fact that it is very difficult for even the best investors to beat the market in the long run.

Bachelier, and later others, argued that in a liquid market, such as stock or foreign exchange, any significant correlation in returns would be quickly noticed and acted upon to generate a profit. Even small opportunities for arbitrage would not linger long, but instead would be rapidly turned into profit. Ultimately, the liquidity and efficiency of the market effectively cancel meaningful correlations in the market. This leads one to conclude that the lack of arbitrage opportunity is in fact an integral feature of the market. In time, this fundamental concept was expanded and came to be known as the ‘Efficient Market Hypothesis.’

For over 50 years, Bachelier’s Gaussian distribution of the random changes in stock prices stood the test of time, drawing few detractors. In 1963, however, Mandelbrot published contradictory evidence that gained wide attention [66]. His findings refuted the notion that price changes in an efficient, liquid market followed a Gaussian distribution. Mandelbrot’s empirical evidence showed too many small price changes and too many large price changes (outliers) for the distribution of these changes to be Gaussian. Instead, Mandelbrot proposed that the price changes followed a stable Paretian distribution [66]. In 1965, Fama, in a large study of the distribution of the first differences of the logarithms of stock prices [29], reported finding leptokurtosis in the empirical distributions, a departure from normality which supported the findings of Mandelbrot. Further, Fama’s results supported the stable Paretian distribution alternative to the Gaussian.

An immediate benefit of having the correct distribution of asset price returns is the correct pricing of options and warrants. From Bachelier [3] to Samuelson [88] to Black and Scholes [6], being able to determine an option price has depended on the underlying distribution of asset price returns. However, the nature of the distribution of asset price returns in speculative markets remains a matter of debate. Ultimately, the nature of the distribution should be discernible from the market. Therefore, a realistic model of a financial market should provide the answer.

Only for about 100 years have researchers been trying to model financial markets. Much attention has been paid to the modeling of asset returns and volatilities in stock, commodities and foreign exchange markets, which has helped us better understand the fundamental characteristics of these markets. The study of financial markets can be accomplished in several ways. One approach is to ignore some of the complexities found in these markets. Many researchers start from a set of standard assumptions, often called the efficient market hypothesis (EMH), to make the modeling problem more tractable. Generally, there are three widely accepted versions of the EMH: strong, semi-strong and weak, distinguished by the amount of information available to the typical market participant [31]. The distinguishing feature is the quantity and nature of the available information, respectively:

  • strong - all information, even insider information;
  • semi-strong - publicly available information only;
  • weak - only information on past behavior of the price of the asset.

The precise manner in which buyers and sellers are matched and prices agreed upon is not addressed. In addition, there are implicit assumptions regarding the rationality of the market participants and their ability to process the information. The most significant of these is the concept of rational expectations whereby, even if every agent is not perfectly correct in her interpretation of new information, the average expectation is correct. The fact that agents have differing expectations provides a rationale for trading to occur.

Although the EMH is valuable in that some market-related problems can be solved with closed-form mathematical solutions, it has become evident that some of the assumptions need to be relaxed in order to model more realistic market behavior. Under the EMH, models of markets tend not to replicate many real-world, non-Gaussian phenomena such as volatility clustering, excess kurtosis and fat tails.

To summarize and clarify what any realistic financial market model should be able to do, Cont [23] lists a set of “stylized empirical facts” determined from the statistical analysis of price variations in financial markets. A realistic model of a financial market should strive to replicate the following:

1. Absence of autocorrelations in price returns.

2. Heavy tails of the distribution of price returns (power-law or Pareto-like).

3. Volatility clustering of price returns.

4. Gain/loss price asymmetry: price decreases are larger but fewer, whereas increases are smaller yet more numerous.

5. Aggregational Gaussianity: as the timescale increases over which returns are calculated, their distribution converges to a Gaussian.

7. Conditional heavy tails of returns even after correcting for volatility clustering.

8. Slow decay of autocorrelation of absolute price returns.

9. The leverage effect: returns are negatively correlated with most measures of volatility.

10. Trading volume/volatility correlation with returns.

Further, a realistic financial asset market model ought to generate changes in asset returns that are comparable to the changes in asset prices and returns found in real asset markets. Such changes should show neither unrealistic order nor unrealistic randomness in the price or log-price changes.
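As a rough illustration of how a model's output (or a real return series) might be checked against a few of these facts, the sketch below computes excess kurtosis and the autocorrelation of raw and absolute returns. This is my own illustration, not code from the dissertation, and the Gaussian `returns` series in the example is only a placeholder.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of a 1-D series at a given lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

def stylized_fact_diagnostics(returns, lags=(1, 5, 20)):
    """Report excess kurtosis and autocorrelations of raw and absolute returns."""
    r = np.asarray(returns, dtype=float)
    excess_kurtosis = ((r - r.mean()) ** 4).mean() / r.var() ** 2 - 3.0   # fact 2: heavy tails
    report = {"excess_kurtosis": excess_kurtosis}
    for lag in lags:
        report[f"acf_returns_lag{lag}"] = autocorr(r, lag)                # fact 1: should be ~0
        report[f"acf_abs_returns_lag{lag}"] = autocorr(np.abs(r), lag)    # facts 3, 8: slow decay
    return report

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    returns = 0.01 * rng.standard_normal(5000)   # placeholder series; real data would go here
    print(stylized_fact_diagnostics(returns))
```

For a Gaussian placeholder series all of these diagnostics are close to zero; empirical return series, and the model output studied later, are expected to deviate from zero in the directions listed above.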

I take this opportunity to note that only about 230 years have passed since Adam Smith published his Wealth of Nations, and it was a mere 130 years ago that Walras first applied mathematical analysis to the study of economics. In addition, it has been only about 50 years since computers became available to the economics researcher. Finally, just in the last 20 years or so have powerful personal computers become ubiquitous. These powerful desktop machines, along with their statistical and data manipulation software, have enabled researchers to evaluate more financial data and to subject it to more complex tests than has been the case during the first 200 years of the study of economics.

With all this in mind, how does one approach the modeling of a financial market?

1.2 Stochastic Differential Equations

Stochastic differential equations (SDEs) are defined by a vector field that includes a random (Brownian) forcing. Such equations have become extremely important mathematical models of phenomena in almost every scientific discipline, and their efficient and reliable numerical solution is greatly desired.

High-order numerical methods for solving ordinary differential equations (ODEs) are very common. For instance, there are many variations of multistep methods and Runge-Kutta methods, and they can be generated with arbitrarily high orders. In addition, there are well-understood, simple procedures for developing related adaptive timestepping schemes when timesteps are allowed to vary. These are often chosen so as to control some estimate of the local error committed on each timestep. The efficiency gains provided by such adaptive schemes are extremely impressive and almost entirely independent of the order of the underlying method.

However, when it comes to SDEs, the situation is much less straightforward. Stochastic differential equations are inherently more complicated than ODEs, primarily due to their stochastic component. This has much to do with the greater complexity of the Itô-Taylor and Stratonovich-Taylor expansions from which many SDE fixed-timestep integration schemes are developed. Furthermore, the methods themselves, as the order increases, rapidly become very complicated, and thus computationally costly to implement. Finally, the efficiency gains induced by adaptivity are not as impressive as in the ODE case, and they quickly fall away as the order of the underlying method, and the associated computational/bookkeeping overheads, increase. For these reasons, efficient adaptive timestepping algorithms based upon low-order (e.g., order 1) methods may well be the method of choice for many classes of problems and accuracy requirements.

Some of the techniques used to solve ODEs using variable step sizes can be carried over directly to the solution of SDEs, albeit with some modifications. However, the efficiency gains that have been achieved to date are far more modest. This research is based on an adaptive scheme introduced in [51], where the local error control consists of two distinct and separate error estimates applied to the standard order-1 Milstein method. Roughly speaking, one approximates the deterministic error and the other the error in the solution of the stochastic integral. A fundamental advantage of such an approach is that the algorithm can be designed to operate differently in deterministic-dominated and diffusion-dominated regimes: as the magnitude of the diffusion coefficient is reduced, the algorithm behaves more and more like a standard ODE adaptive solver. A related point concerns the user-defined tolerance τ that determines the accuracy requirements. As greater accuracy is required, and τ is reduced towards 0, the fine-structure stochastic details of the solution path become more significant and the drift component less relevant to the error control. Thus, there is an important but complex relationship between the magnitudes of the drift and diffusion coefficients, the desired accuracy level and the efficient operation of the algorithm. These considerations do not exist in the ODE case, and so it should be no surprise that the principles underlying ODE adaptivity, when naively translated to the SDE framework, fail to replicate the efficiency gains.
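The general shape of such a dual error control can be sketched as follows. This is my own illustration of the idea only: the two error estimates used here (drift step-halving and the size of the Milstein correction term) are placeholders, not the estimates actually defined in [51].

```python
import numpy as np

def milstein_step(x, h, dW, f, g, dg):
    """One explicit Milstein step for a scalar SDE dX = f(X) dt + g(X) dW."""
    return x + f(x) * h + g(x) * dW + 0.5 * dg(x) * g(x) * (dW ** 2 - h)

def dual_error_step(x, h, dW, f, g, dg, tol):
    """Accept the step only if both a drift (deterministic) error estimate and a
    diffusion error estimate fall below tol; otherwise reject so that the caller
    can retry with a smaller h.  The estimates below are illustrative placeholders."""
    # Drift estimate: one full drift step versus two half drift steps.
    full = x + f(x) * h
    mid = x + f(x) * (h / 2)
    half = mid + f(mid) * (h / 2)
    drift_err = abs(full - half)
    # Diffusion estimate: magnitude of the first stochastic correction term.
    diff_err = abs(0.5 * dg(x) * g(x) * (dW ** 2 - h))
    if drift_err <= tol and diff_err <= tol:
        return milstein_step(x, h, dW, f, g, dg), True    # accept
    return x, False                                       # reject
```

The point of the sketch is only the control structure: two independent estimates, either of which can force a rejection, with the diffusion estimate naturally vanishing as the diffusion coefficient goes to zero.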

Subsequent work [54] that forms part of this dissertation has also shown that such a dual error-control approach has desirable mean-square stability properties for a class of linear multiplicative-noise test problems. This ability of adaptive techniques to induce numerical stability also occurs in ODEs and provides a further motivation for work in this area.

A novel feature of the algorithm in [51] is that it allows a candidate step to be rejected after the next Brownian increment has been computed, but before any further coefficient evaluations are performed. This raises another efficiency issue that does not exist for ODEs, namely the relative cost of performing a function evaluation on the drift or diffusion terms (and their derivatives) versus the cost of generating a sample from the Brownian motion.

In this dissertation, we first define a class of SDE algorithms, based upon the work in [51], and show that they preserve mean-square stability for a class of linear test problems with multiplicative noise. I then consider how to improve upon the scheme in [51]. There are three particular issues that were initially to form part of this dissertation. Firstly, the dual error controls fail to cope adequately on SDEs with additive noise, since the diffusion error estimate breaks down. The algorithm still converges, but it does so by only controlling the drift error. Thus, it appears that extra, or different, error controls may be valuable, provided that they do not decrease the efficiency too much. Secondly, the timestep selection mechanism used in [51] is complicated and ad hoc and, although it works, there is a great deal of potential for improvement. Thirdly, the algorithm originally employed only the explicit Milstein method to advance the solution, but it should work with any order-1 method. Thus, it was of interest to see whether the algorithm could be used with implicit variations of the Milstein method to solve stiff SDEs.

The first two areas of investigation, additional error controls and timestep selection strategies, are very hard to disentangle. It is impossible to determine good timestep selection criteria unless one knows what the error controls are, and difficult to judge the effectiveness of different error controls unless one has a way of choosing the timesteps. For this reason it was decided to sidestep the timestep selection issue by allowing the algorithms tested to increment the Brownian motion in extremely small timesteps and choose the largest one compatible with the error controls. This does not constitute a viable algorithm, since the number of evaluations of the Brownian motion is very large, but it allows us to decide whether, in principle, extra error controls have the potential to improve the efficiency of the algorithm.

To recap, I shall implement algorithms using both explicit and implicit Milstein-type schemes under differing error controls and examine their accuracy and robustness. It is to be hoped that, together with the development of efficient timestep selection strategies, this work will result in ever more efficient and reliable software for the numerical solution of SDEs. Finally, it must be noted that all of the above work only applies to SDEs forced by a single scalar stochastic process. The extension of these adaptive strategies to more general SDEs will not be considered here.

2.1 Agent-Based Simulations of Financial Markets

Only relatively recently, with the advent of low-cost, widely available and powerful computers, has much serious work been done in building non-Gaussian models of financial markets. In this chapter, I present a selected chronology of pertinent research leading to the development of the models introduced and studied here.

In 1900, Bachelier wrote his Ph.D. thesis, Théorie de la Speculation, about the theory of financial markets [3]. An essential part of his thesis was the modeling of the distribution of price changes of a financial asset. These he assumed to be Gaussian, using the theory of Brownian motion five years before Einstein’s classic 1905 paper [88]. It should be noted that this assumption allows the price of stocks to become negative, which is unacceptable since stocks possess limited liability. However, replacing this assumption by geometric Brownian motion (i.e., Brownian motion of the log price) corrects this problem.

Almost 60 years later, in 1959, M. F. M. Osborne took an approach based on statistical mechanics [81] and considered the prices of common stock to be the result of an ensemble of decisions in statistical equilibrium. Osborne pointed out that the logarithm of the relative price change of a stock had a steady-state distribution the same as that of a particle in Brownian motion, and some regard him as the first

In 1960 Mandelbrot published a theoretical paper [65] wherein he introduced the Pareto-Lévy law and applied it to the theory of income distribution. As a second purpose of his paper, Mandelbrot wanted “to draw the economist’s attention to the great potential importance of ‘stable non-Gaussian’ probability distributions.” In his 1963 paper [66], Mandelbrot asserted that Bachelier’s random-walk model of stock and commodity prices did not account for much of the abundant data that had been accumulated by empirical economists since 1900. In particular, the distributions were too “peaked,” that is, they contained too many small price changes as compared to a bell-shaped curve, and they also contained too many large price changes, or “fat tails.” The fat tails resulted in measurements of the kurtosis (the fourth mean-centered moment) greatly in excess of the Gaussian value 3, a phenomenon known as leptokurtosis.

Mandelbrot proposed a radically new approach to the price variation problem. He used the logarithm of the price and replaced the Gaussian distribution with the “stable Paretian” distribution. Using changes in the price of cotton, Mandelbrot showed that his Paretian law model of the variation of speculative prices agreed closely with empirical data.

In his 1963 paper [30], Fama concluded that the stable Paretian hypothesis promoted by Mandelbrot was probably correct, but needed more testing. By 1965 [29], Fama provided more evidence and, in fact, substantiated Mandelbrot’s findings in full. Leptokurtosis was indeed an indisputable element of almost all empirical distributions of price changes. Further, Fama supported the use of the stable Paretian distributions to more accurately represent the large observations. He concluded that “… the daily changes in log price of stocks of large mature companies follow stable Paretian distributions with characteristic exponents close to 2, but nevertheless less than 2.”

In 1970, Fama published a review of the theory and empirical work of efficient capital markets [31]. In an efficient market, prices “fairly reflect” available information. In this paper, Fama delineated subsets of information used in the adjustment of security prices to define the strong, semi-strong and weak forms of the efficient market hypothesis. Fama found no empirical evidence to reject the “fair game” efficient markets model. He found no important evidence against the weak or the semi-strong form of the hypothesis, and only limited evidence against the hypothesis in the strong form. (In his 1991 paper [32], though, Fama clarified that market efficiency per se is not testable. In fact, “it must be tested jointly with some model of equilibrium, an asset-pricing model.”)

In 1972, Black and Scholes [6] assumed that geometric price returns of underlying stocks were in fact Gaussian, and they derived a theoretical valuation formula for the pricing of options on the stock. The assumption of log normality of prices was just one of several which they made in order to work with the more tractable “ideal conditions” in the market for the stock and for the option. The empirical tests which the authors conducted on their valuation formula revealed that, in the real market, those who purchased options paid prices consistently higher than the prices predicted by their formula. The difference in price was greater for options on low-risk stocks than for options on high-risk stocks. Black and Scholes mention that the magnitude of transaction costs might explain some of the difference between option prices obtained by their formula and the prices buyers paid in the market. Although their achievement in deriving the acclaimed option pricing formula is commendable, one of their assumptions appears to be wrong. As McCauley points out [75], their error lay in the assumption that the empirical distribution of price returns is Gaussian. However, the empirical distribution of price returns has fat tails, and is consequently a poor fit to the Gaussian distribution.

In 1996, Pagan surveyed the financial econometric work of the prior decade. Some of the relevant conclusions he arrived at are now summarized. Firstly, the squares of financial returns are correlated and there is a slow decline in the autocorrelation coefficients, indicating that the correlation is persistent. He also noted the “leverage effect,” that is, the volatility of asset returns depends on the algebraic sign of the returns (cross correlation between the squares of the returns and the lagged values of the returns). Although the magnitude is small, it is persistent and the values slowly die away. In his review, Pagan reported evidence that the second moments of financial returns existed, but scant evidence that the fourth moments existed. Further, Pagan reported that the densities of stock returns and exchange rates tended to have fatter tails than the normal distribution and that they contained marked peaks compared to the normal. He concluded that financial time-series had too many small and large returns to have come from a normal density.

By the mid 1990s, when Pagan published his survey, many financial econometricians were aware of the stylized facts that he listed. These stylized facts pointed to the non-normality of returns of financial assets. This in turn ruled out certain models as possibilities when analyzing financial time-series. They also questioned the input to specific financial models which depend on the existence of certain higher-order moments of the probability density function.

To better understand the dynamics of speculative markets, researchers have attempted to build microscopic models of the activities of individual traders in the market, and then examine the resulting macroscopic statistics. In 1999, Maslov summarized what he considered to be the best models of market dynamics at that time [71]. Maslov’s criteria for good models included “simplicity,” i.e., those with small numbers of assumptions and/or parameters. He listed four main models:

  • the agent-based model by Bak, Paczuski and Shubik [4];
  • the Cont-Bouchaud model [21];
  • the Caldarelli, Marsili and Zhang model [15]; and
  • the Minority Game model by Challet and Zhang [18].

I now briefly describe each of these models, along with some others.

In the first listed model, Bak, Paczuski and Shubik (BPS) produced a model that is especially straightforward, and the simplicity of their model is to be admired. It consists of two types of agents, “noise” traders and “rational” or “fundamentalist” traders. The trading habits of the noise traders may depend on the volatility of the market as well as the imitation of other traders. The rational traders, on the other hand, optimize their own utility functions, which are based on expected dividends of the company and on the trader’s risk aversion. There is only one type of stock in this simulated market and each agent may own only one share. Exogenous price changes due to interest rates, money supply, wars and natural disasters are ignored. Any technicalities associated with the trading process are ignored and the price p may only vary between zero and p_max. Lastly, they let the size of the dividend (which the rational traders use in their utility functions) be a random variable.

BPS ran their simulation under various settings, thus giving several variations of their basic model. The more realistic simulations were achieved with a mixture of noise traders and fundamental value traders. With 2% of the traders rational and the rest noise traders, but with prices arbitrarily confined, price bubbles and fat tails occurred. But when the mixture was set to 20% rational traders, only small price deviations occurred.

Not only is the BPS model easy to understand, but it is also easy to modify in order to gain a better understanding of market dynamics. This model, though, has drawbacks. Prices seem to be artificially confined by the rational traders when they use their utility functions. These utility functions themselves seem not to be realistic: there are many possible rational trading strategies, but this model uses only one of them. The noise traders, too, seem to be artificially constricted. Some real noise traders work at short time intervals, while others work at longer time intervals. Furthermore, in real markets, sometimes a rational trader will in fact trade on noise.

2.2 Numerical Solution of Stochastic Differential Equations

A thorough background to SDEs and their numerical solution can be found in Kloeden and Platen [50]. Although the solution of SDEs is more complicated than the solution of ODEs, many of the current techniques for adaptive SDE schemes are based upon the much older ODE literature. Before we begin reviewing the literature for specific adaptive schemes for the solution of SDEs, it will be beneficial to highlight some of the techniques for an adaptive ODE solver.

A typical approach to solving a deterministic differential equation numerically with variable timestepping and local error control is to create the following three components:

1. The underlying numerical integration scheme used to compute the solution. Such a scheme would be capable of computing a fixed-timestep solution even if adaptive timestepping were not being implemented.

2. A means to approximate the local error after each step in the computation, and to decide, based on a preset, user-defined tolerance τ, whether to accept or reject the step. If the step is rejected, a smaller timestep is selected and the step is re-attempted (see next item).

3. A mechanism by which the size of the next step is determined, based on whether the previous step was accepted or rejected as well as the size of the local error.

The basic numerical algorithm could be, for instance, an Adams-Bashforth method, a Taylor series method or a Runge-Kutta method. However, the algorithm is usually selected so that the local error is more easily estimated and, when step sizes are changed, the new step size is more easily implemented.
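As a concrete, if simplified, illustration of how the three components fit together, the sketch below (my own, not taken from the dissertation) uses a classical RK4 step, a step-doubling error estimate, and a crude step-size rule; a refined rule of the kind given in Equation (2.1) below would normally replace the last component.

```python
import numpy as np

def rk4_step(f, t, x, h):
    """Classical fourth-order Runge-Kutta step for x' = f(t, x)."""
    k1 = f(t, x)
    k2 = f(t + h / 2, x + h * k1 / 2)
    k3 = f(t + h / 2, x + h * k2 / 2)
    k4 = f(t + h, x + h * k3)
    return x + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def adaptive_solve(f, t0, x0, t_end, h=0.1, tol=1e-6):
    """Adaptive integration built from the three components listed above:
    (1) an underlying scheme (RK4), (2) a local error estimate obtained by
    step doubling, (3) a simple step-size rule (grow on accept, halve on reject)."""
    t, x = t0, np.asarray(x0, dtype=float)
    while t < t_end:
        h = min(h, t_end - t)
        one_big = rk4_step(f, t, x, h)
        two_small = rk4_step(f, t + h / 2, rk4_step(f, t, x, h / 2), h / 2)
        local_err = np.max(np.abs(one_big - two_small))   # component (2)
        if local_err <= tol:                              # accept the step
            t, x = t + h, two_small
            h *= 1.5                                      # component (3)
        else:                                             # reject and retry
            h *= 0.5
    return x

if __name__ == "__main__":
    print(adaptive_solve(lambda t, x: -x, 0.0, [1.0], 2.0))   # approximately exp(-2)
```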

There are several ways to estimate the local error. The Milne device uses two multistep methods of the same order but with different local error constants. An estimate of the local error is obtained as a known constant times the difference between the two approximate solutions. Unfortunately, this method requires about twice the computation, compared to advancing only a single multistep method, as we proceed with the integration. See [44] for details.

Local error may also be approximated by use of the technique known as Richardson extrapolation [38]. Here, only one scheme is used to compute an approximate solution, but the computation is carried out twice over the same subinterval using different-sized steps. To see how this may be used to approximate the local error, assume that in solving x' = f(t, x) with a given initial value (t_0, x_0), the scheme has just computed the approximation x_{2h} over an interval of length 2h. Then, a second computation approximates the true solution at t_0 + 2h by computing two approximations successively over intervals of length h. Now, if the order of the integration method is p, then the one-step local error of the method may be written

    e_h = x(t_0 + h) - x_h = C h^{p+1} + O(h^{p+2}).

The two-step error is composed of two parts: the error carried over from the first step, which is (1 + h ∂f/∂x + O(h^2)) e_h, as well as the local error generated in the second step (but with the differentials evaluated at x_h = x(t_0 + h) + O(h^{p+1})). Thus, writing x_{h,h} for the two-step approximation,

    x(t_0 + 2h) - x_{h,h} = 2 C h^{p+1} + O(h^{p+2}).

Now, the error in one big step of length H = 2h is given by

    x(t_0 + 2h) - x_{2h} = C (2h)^{p+1} + O(h^{p+2}).

If we neglect the terms of order O(h^{p+2}), we can eliminate the unknown constant C and obtain

    x(t_0 + 2h) - x_{h,h} ≈ (x_{h,h} - x_{2h}) / (2^p - 1).

This also provides an improved approximation of order p + 1 to x(t_0 + 2h):

    x̂ = x_{h,h} + (x_{h,h} - x_{2h}) / (2^p - 1).

Although this method of local error estimation is relatively simple, it does require extra function evaluations.
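A quick numerical check of these formulas (my own illustration) uses the explicit Euler method, for which p = 1, on x' = x over an interval of length 2h:

```python
import numpy as np

def euler(f, t, x, h, steps):
    """Advance x' = f(t, x) with the explicit Euler method."""
    for _ in range(steps):
        x = x + h * f(t, x)
        t += h
    return x

f = lambda t, x: x          # x' = x, x(0) = 1, exact solution exp(t)
p, h = 1, 0.05
x_big = euler(f, 0.0, 1.0, 2 * h, 1)    # one step of size 2h
x_two = euler(f, 0.0, 1.0, h, 2)        # two steps of size h
exact = np.exp(2 * h)

est_err = (x_two - x_big) / (2 ** p - 1)   # estimate of exact - x_two
improved = x_two + est_err                 # extrapolated value of order p + 1

print("true error of two-step result :", exact - x_two)
print("Richardson error estimate     :", est_err)
print("error of extrapolated result  :", exact - improved)
```

The estimate agrees with the true error to leading order, and the extrapolated value is roughly an order of magnitude more accurate, as expected for an order-(p + 1) result.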

There is a very elegant alternative error control approach that requires no extra function evaluations in order to generate an error estimate. Two Runge-Kutta algorithms of order p and p + 1 are defined with one algorithm embedded inside the other (i.e., the stages of the higher-order method are also used by the lower-order method). The difference between the two solutions over any timestep is used as an estimate of the error committed by the lower-order method. A Runge-Kutta embedding algorithm may only require an extra p + 2 multiplications in order to compute a local error estimate as well as advance the integration. See Hairer et al. [38] for details.

The local error estimate is used both as a criterion for acceptance/rejection of the current step and as input to the step-size selection routine used to determine the size of the next step. First, to determine whether or not we accept the current step, we compute two approximations to the true value, say x_1 and x̂_1 (see above). Let x̂_1 be the less precise estimate. Then x_1 - x̂_1 is an estimate of the error at this step for the less precise result. We accept this result if |x_1 - x̂_1| ≤ TOL, where TOL is a predefined measure of error tolerance. Typically,

    TOL = TOL_abs + max(|x_0|, |x_1|) · TOL_rel,

i.e., a combination of absolute error and relative error is used (if only one is needed, then the other is set to zero).

We also use the tolerance to set the step size for the next step. Following Hairer et al. [38], we let err = |x_1 - x̂_1| / TOL, a comparison of how well the estimated error stands up against the allowable error at this step. If err < 1, then we incurred less error than allowed at this step (an accepted step). If err > 1, we incurred more error than allowed (a rejected step). Either way, we use the value of err further to determine the size of the next step of the integration.

Now, the error of our integration scheme is such that err ≈ C h^{p+1}, where p is the order of the numerical method. Further, at any step, the optimal step size h_opt will produce err ≈ 1, since |x_1 - x̂_1| will be close to TOL for such a step. Thus, 1 ≈ C h_opt^{p+1}. If we solve this expression for C and substitute it into the expression for err with the current step size h, we find that the optimal step size for the next step is given by

    h_opt = h · (1/err)^{1/(p+1)}.

In essence, the rule we use to determine the optimal step size for the next step is to multiply our current step size h by a factor which depends on the current error.

In practice, though, a few modifications are made to this bare-bones optimal step-size rule. To avoid step-size acceptance/rejection oscillation and to increase the probability that the next step is accepted, the above is multiplied by a “safety factor” facsafe of 0.8 or 0.9. Another common practice is to prevent the step size h from increasing too fast; thus, it is restricted to a maximum facmax of only 1.5 to 5 times its current value. Typically, a minimum factor facmin is placed on it as well. We summarize by writing the computation for the size of the next step as

    h_new = h · min( facmax, max( facmin, facsafe · (1/err)^{1/(p+1)} ) ).    (2.1)

Some implementations will set facmax = 1 for the next forward step after a step rejection.
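Written as a function, the rule (2.1) is simply the following; the numeric values of the safety and limiting factors are typical choices rather than values prescribed by the text.

```python
def next_step_size(h, err, p, facsafe=0.9, facmin=0.2, facmax=5.0):
    """Step-size update of Equation (2.1): scale h by a safety-factored power of
    1/err, limited to lie between facmin and facmax times the current step."""
    return h * min(facmax, max(facmin, facsafe * (1.0 / err) ** (1.0 / (p + 1))))

# Example: an order-4 method whose last step incurred twice the allowed error.
h_new = next_step_size(h=0.1, err=2.0, p=4)   # shrinks the step slightly
print(h_new)
```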

As pointed out in [38], although err is an estimate for the lower-order method, it is more advantageous to advance the integration by using the approximation of the higher-order method. Of course, when we do this, we disassociate err from the approximate error of the step. (We assume that the true error of the step, i.e., the error associated with the higher-order method, is even smaller.) This technique of advancing the integration by using the higher-order method is referred to as local extrapolation.

Having reviewed some important considerations of ODE solvers that are applicable to SDE solvers, we now turn our attention to some of the adaptive timestepping methods found in the recent literature. In 1997, Gaines and Lyons [34] implemented a variable step size SDE solver. They generated the Brownian path dynamically, so that the same Brownian path could be used in subsequent solutions (for instance, using a smaller tolerance). To do this, they represented their Brownian paths as binary trees. The trees could be saved in computer files and later read into memory if needed. The use of Brownian (binary) trees, moreover, simplified their method of step-size selection as well as their proof of convergence of the method. The authors chose to control the variances of the one-step errors rather than the errors themselves. This involved an initial, rough solution of the SDE using a small enough fixed step size; then, using this solution, they solved a linear error SDE backward in time to get the appropriate maximum error per step for the original SDE. Further, they assumed that the error per step was random and that at each step the error was independent of the error at the previous step. But this is not likely to be the case if the SDE being solved is in a deterministic-dominated regime (i.e., the error produced by the deterministic part of the SDE is greater than that produced by the stochastic part). Unfortunately, in this case, the Gaines and Lyons algorithm might be ignoring much readily available and useful, if not crucial, information.
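The key primitive behind such a Brownian tree is the Brownian bridge: once the values of W at the two endpoints of an interval are fixed, the value at the midpoint can be sampled later, conditionally, and stored for reuse at a finer tolerance. A minimal sketch of that primitive (my own illustration, not the Gaines-Lyons code):

```python
import numpy as np

rng = np.random.default_rng(5)

def bridge_midpoint(t_left, w_left, t_right, w_right):
    """Sample W at the midpoint of [t_left, t_right] conditional on the endpoint
    values: the conditional law is Gaussian with mean (w_left + w_right)/2 and
    variance (t_right - t_left)/4."""
    mean = 0.5 * (w_left + w_right)
    std = 0.5 * np.sqrt(t_right - t_left)
    return mean + std * rng.standard_normal()

# Refine a path on [0, 1]: once the endpoint is fixed, interior values can be
# generated (and stored in a binary tree) in any order and reused later.
w0, w1 = 0.0, np.sqrt(1.0) * rng.standard_normal()
w_half = bridge_midpoint(0.0, w0, 1.0, w1)
w_quarter = bridge_midpoint(0.0, w0, 0.5, w_half)
print(w_half, w_quarter)
```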

3.1 A Threshold Model of Investor Behavior

At the irrational end of the spectrum, traders sometimes buy stocks based upon what they hear on financial news shows or from their friends, read about in Internet chat rooms, or just on sheer impulse [56, 91]. There are also reasons not to trade, such as transaction costs or the psychological reluctance to admit that a previous trade was a bad one and unwind it. The challenge is to provide a model that can incorporate many of these rationales in a simple and logically consistent manner.

The model is evolved over discrete timesteps of length h that correspond to one trading day in these simulations. This introduces a fundamental timescale into the model, as we cannot explicitly model any agent that acts faster than this. We thus directly simulate M “slow” investors, all of equal weight in the market, and each can be long (+1) or short (−1) the market. This is of course a major oversimplification, but one that is widely used by market modelers (see [1, 4, 20, 21, 60, 62, 64, 69] for similar examples).

I shall first introduce a pricing formula that determines how the price depends upon both exogenous information and the current market state. Then I will describe the modeling of the M individual agents in detail.

3.1.1 Determination of the Asset Price

In real speculative markets, there are two main kinds of information that influence the price of an asset. The first type is exogenous information generated outside of the market about the underlying companies whose assets are being traded. For instance, outside information might be generated by business analysts, newsworthy events, forces of nature or political events. Regarding its influence on the market, we assume that this type of information is correct, uncorrelated, globally available and that it instantaneously affects the market price. This is consistent with the strongest form of the EMH, and it should be noted that we are not abandoning the EMH in its entirety.

We model the effect that external information has on the relative price of the asset over the next time interval by

    p(n + 1) = p(n) exp( √h ΔW(n) − h/2 ),    (3.1)

where ΔW(n), a standard Gaussian random variable, represents the effect of all new, uncorrelated information that is globally available over the time period of length h.

In fact, the price defined by (3.1) is the solution of the Itô SDE for geometric price changes with zero drift and unit volatility commonly used in financial applications [79]:

    dp(t) = p(t) dW(t),    p(0) = p_0 ≠ 0,    (3.2)

the solution to which is

    p(t) = p_0 exp( −t/2 + W(t) ).    (3.3)

We interpret time to be measured in units in which the variance of the driving Wiener process over one unit of time is unity. The asset price generated using (3.1) we call the “fundamental price” p_F(n). Finally, the absence of a drift term corresponds to a price process that is drift-adjusted relative to the risk-free interest rate.

The second type of information that affects the market price is information generated by the market itself, such as price increases, price volatility or market sentiment (defined below). Thus, we now bring into the model the state (long or short) of the i-th investor. Denote the position of the i-th investor at the end of the n-th time interval by s_i(n) = ±1, and the sense or sentiment of the market by the average of the states of all of the M investors:

    σ(n) = (1/M) Σ_{i=1}^{M} s_i(n).    (3.4)

The change in market sentiment is Δσ(n) = σ(n) − σ(n − 1), and this modifies the EMH price formula (3.1) as follows:

    p(n + 1) = p(n) exp( √h ΔW(n) − h/2 + κ Δσ(n) ),    (3.5)

where the parameter κ quantifies the effect that a single market participant has on the relative price of the asset. We note that this choice represents market impact due to internally generated information as a linear function of the change of market sentiment. Although the exact nature of the functional relationship is uncertain, a linear function represents a good first approximation.

I now turn to the problem of incorporating the effect of fast noise traders (day traders) into the pricing formula. We assume that day traders pay particularly close attention to market internals and try to ride market trends. When the price of the asset is going up, they buy and then liquidate a short time later. This puts added pressure on the price, to increase it, since demand has increased. Likewise, when the price is going down, day traders short the asset in hopes of buying it later at a lesser price when positions are closed. This, too, places pressure on the price of the asset, to decrease it, since supply has increased. Both activities produce wider price swings, thus increasing the price volatility.

There would be no need to model this effect if it were uniform in time, since it could be incorporated into the data stream ΔW. However, we posit that in highly polarized markets (times of extreme market sentiment) the activity, and possibly the number, of such traders increases. We therefore include the influence of day traders in the market model by incorporating a volatility factor 1 + a|σ(n)| into the relative price, where the constant a determines the effect that surplus day-trading activity has on price volatility. The price is thus finally determined by

    p(n + 1) = p(n) exp( (√h ΔW(n) − h/2)(1 + a|σ(n)|) + κ Δσ(n) ).    (3.6)

Note that when σ(n) ≈ 0, there is no additional day-trader influence on the relative price. On the other hand, in a bullish or bearish market, the influence of external information, and hence the volatility, is increased. This is consistent with Brown’s finding that volatility is strongly related to investor sentiment [7].
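A direct transcription of the reconstructed update (3.6) might look as follows; the parameter names `kappa` and `a` and the random number generator are assumptions of this sketch rather than code from the dissertation.

```python
import numpy as np

rng = np.random.default_rng(1)

def price_update(p, sigma, d_sigma, h, a, kappa):
    """One application of Equation (3.6): the exogenous information term, scaled
    by the day-trader volatility factor, plus the market-impact term kappa * d_sigma."""
    dW = rng.standard_normal()                     # new external information
    log_ret = (np.sqrt(h) * dW - h / 2) * (1 + a * abs(sigma)) + kappa * d_sigma
    return p * np.exp(log_ret)
```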

Both σ(n) and Δσ(n) are determined by the individual investors’ positions in the market, to which we now turn our attention.

3.1.2 Determination of an Investor’s Position

In order to close the model we must now specify how the states of the individual agents are determined, i.e., how the i-th agent decides when to switch. This of course is where the individual agent’s bounded rationality and psychological considerations enter the model.

Many heterogeneous agent-based models, including those of Lux and his co-workers, have incorporated the concept of “herding” or “contagion,” which encourages investors holding a minority position to switch and join the majority. The presence of herding in a model appears to lead directly to the formation of fat tails and boom-bust dynamics, but it is often difficult or impossible to verify a direct causal relationship or to quantify it. In particular, two questions are of interest. Firstly, is herding responsible for the power-law decay of the price-return tails and, secondly, which of the other stylized facts (especially volatility clustering) can be attributed to it? This model allows us to address both these questions.

The herding phenomenon is often thought of, especially by classically trained economists, as a purely irrational, emotional response, but this may not always be the case. Many professional investors or institutions would lose their jobs or investment capital if they significantly underperformed the market for even a few quarters in a row. In such circumstances herding is a perfectly rational response, triggered by internal market structures that themselves violate the EMH.

To include herding effects in the model we imbue each agent with a fixed threshold value C_i > 0 which determines the agent’s tolerance to being in the minority. At time n, her herding tendency (or ‘cowardice’) level is denoted by c_i(n). Whenever s_i(n)σ(n) < 0, this level is incremented via c_i(n + 1) = c_i(n) + h|σ(n)|, i.e., increased by an amount proportional to the length of the time interval and the severity of the inconsistency. If the investor is in the majority, then her cowardice level remains unchanged from one timestep to the next and c_i(n + 1) = c_i(n). As soon as the investor’s level of cowardice c_i(n) exceeds C_i, the investor switches market position and her cowardice level is reset to zero.
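Putting the switching rule into code, a vectorized sketch of one timestep of the agent update might look like this (my own illustration; the threshold range and parameter values are placeholders, not calibrated values from the dissertation):

```python
import numpy as np

def herding_step(s, c, C, sigma, h):
    """Update the cowardice levels c_i and states s_i of M agents for one
    timestep, following the threshold rule described above."""
    minority = s * sigma < 0                        # agents on the minority side
    c = np.where(minority, c + h * abs(sigma), c)   # accumulate discomfort
    switch = c > C                                  # thresholds exceeded
    s = np.where(switch, -s, s)                     # switch market position
    c = np.where(switch, 0.0, c)                    # reset cowardice to zero
    return s, c

# Example with M = 1000 agents and placeholder thresholds.
rng = np.random.default_rng(2)
M, h = 1000, 1.0 / 250
s = rng.choice([-1.0, 1.0], size=M)
c = np.zeros(M)
C = rng.uniform(0.01, 0.1, size=M)                  # heterogeneous thresholds (assumed range)
sigma = s.mean()
s, c = herding_step(s, c, C, sigma, h)
```

Together with the price update sketched after (3.6), iterating this step and recomputing σ(n) closes a minimal version of the model loop.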

3.2 Stochastic Differential Equations

We are primarily interested in numerical methods of order 1 for determining discrete-time, strong (path-wise) approximate solutions to autonomous stochastic differential equations of the form

    dX(t) = f(X(t)) dt + g(X(t)) dW(t),    X(0) = X_0,    t ∈ [0, T],    X(t) ∈ R^d.    (3.10)

The increment ΔW(t) = W(t + Δt) − W(t) is a Gaussian random variable N(0, Δt) corresponding to the m-dimensional Wiener process W(t) (Brownian motion). We assume that the functions f and g are twice differentiable and that they satisfy appropriate Lipschitz conditions, moment bounds and growth conditions for a solution to exist. We say that a discrete-time approximate solution Y(t) with maximum timestep size h converges strongly to X at time T provided

    lim_{h→0} E( |X(T) − Y(T)| ) = 0.    (3.11)

The solution of the SDE (3.10) is given by

    X(t) = X_0 + ∫_0^t f(X(s)) ds + ∫_0^t g(X(s)) dW(s),    (3.12)

where the stochastic integral is calculated as the mean-square limit of an approximating sum. Alternatively, we may view Equation (3.12) as the integral form of Equation (3.10).

Unlike the usual Riemann integration for functions of bounded variation, the value of a stochastic integral depends on the location, within each subinterval, of the point at which the integrand is evaluated. In particular, consider a regular partition of the interval [0, T] into N equal-sized subintervals using grid points t_i (i = 0, 1, 2, …, N). Further, let τ_i = (1 − λ) t_{i−1} + λ t_i for λ ∈ [0, 1]. The stochastic integral in (3.12) is defined as the mean-square limit as N → ∞ of the approximating sum

    Σ_{i=1}^{N} g(X(τ_i)) ( W(t_i) − W(t_{i−1}) ).

The systematically selected values of τ_i produced by setting λ in turn to 0, 1/2 and 1 produce, in general, different random variables in the limit. I list the integrals and the SDE each solves.

  • λ = 0 (the Itô integral). This choice guarantees, under mild conditions, that the Itô integral will be a martingale and is required in many financial applications, since it corresponds to non-anticipatory stochastic processes. This integral is used to solve Equation (3.10).

  • λ = 1/2 (the Stratonovich integral). When we set λ = 1/2 we obtain the well-known Stratonovich integral, usually written using the circle notation

        ∫ g(X(s)) ∘ dW(s)

    to distinguish it from the Itô integral. The Stratonovich integral and the Itô integral are related by the SDE each solves. If we assume that Itô integration solves Equation (3.10), then Stratonovich integration solves the related SDE:

        dX(t) = ( f(X(t)) − ½ g′(X(t)) g(X(t)) ) dt + g(X(t)) ∘ dW(t).    (3.13)

    We shall define f̃(X(t)) = f(X(t)) − ½ g′(X(t)) g(X(t)). One of the main advantages of working with Stratonovich integration and its corresponding SDEs is that Stratonovich calculus more closely follows the simple transformation rules of traditional calculus, resulting in simpler Taylor-series expansions of the exact and numerical solutions.

  • λ = 1 (the backward Itô integral). When we set λ = 1, we obtain the less well-known backward Itô integral [49]. To distinguish it from the Itô and the Stratonovich integrals above, we write it using the solid circle notation, as

        ∫ g(X(s)) • dW(s).

Just as the Stratonovich integral and the Itô integral are related by the SDE each solves, backward Itô integration solves the SDE:

    dX(t) = ( f(X(t)) − g′(X(t)) g(X(t)) ) dt + g(X(t)) • dW(t).    (3.14)

I single out backward Itô integration because of the role it plays in establishing reliable implicit numerical integration algorithms.
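The drift corrections in (3.13) and (3.14) can be computed mechanically for a given pair (f, g). The following sympy sketch (my own illustration) does so for geometric Brownian motion with drift mu and volatility sigma.

```python
import sympy as sp

x, mu, sigma = sp.symbols("x mu sigma", positive=True)

f = mu * x            # Ito drift of geometric Brownian motion
g = sigma * x         # diffusion coefficient

g_prime = sp.diff(g, x)
f_stratonovich = sp.simplify(f - sp.Rational(1, 2) * g_prime * g)   # drift in Eq. (3.13)
f_backward_ito = sp.simplify(f - g_prime * g)                       # drift in Eq. (3.14)

print(f_stratonovich)   # x*(mu - sigma**2/2)
print(f_backward_ito)   # x*(mu - sigma**2)
```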

Most SDEs are not solvable in closed form, unlike, for example, the simple geometric Brownian motion SDE (2.2) associated with the efficient market hypothesis and financial markets. For those equations which are not, we use numerical methods to obtain approximate solutions. The simplest such integration scheme is known as the Euler-Maruyama method:

    X_{n+1} = X_n + f(X_n) h + g(X_n) ΔW_n.    (3.15)

Although simple to use, this scheme in general suffers from a low order of strong convergence, namely 1/2. See [50] for further details.

To obtain a higher-order integration scheme without incurring much more computational cost, we add the term ½ g′(X_n) g(X_n) (ΔW_n² − h) to the Euler-Maruyama method to obtain the well-known Milstein method,

    X_{n+1} = X_n + f(X_n) h + g(X_n) ΔW_n + ½ g′(X_n) g(X_n) (ΔW_n² − h),    (3.16)

which is of strong order 1. Written in Stratonovich form this becomes

    X_{n+1} = X_n + h f̃(X_n) + ΔW_n g(X_n) + ½ g′(X_n) g(X_n) ΔW_n².    (3.17)

Both the Euler-Maruyama and Milstein methods are examples of Taylor methods, since they correspond to truncations of the Taylor-series expansion of the exact solution. Integration schemes of higher order are known [9, 13, 50], but their wide use is not evident in the literature. Their complexity of implementation and the pronounced increases in computational cost, even for very modest increases in order, often do not weigh favorably when compared to the use of the Milstein scheme (even if a smaller step size h must be used).
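To illustrate the difference in strong order between (3.15) and (3.16), the following sketch (my own illustration) integrates geometric Brownian motion, for which the exact solution along each Brownian path is known, and compares the mean path-wise errors of the two schemes at a fixed step size.

```python
import numpy as np

def euler_maruyama(x, h, dW, mu, sigma):                 # Equation (3.15)
    return x + mu * x * h + sigma * x * dW

def milstein(x, h, dW, mu, sigma):                       # Equation (3.16); g'(x) g(x) = sigma^2 x
    return x + mu * x * h + sigma * x * dW + 0.5 * sigma ** 2 * x * (dW ** 2 - h)

def integrate(step, x0, mu, sigma, h, dW):
    """Apply a one-step method along a fixed sequence of Brownian increments dW."""
    x = x0
    for inc in dW:
        x = step(x, h, inc, mu, sigma)
    return x

rng = np.random.default_rng(3)
x0, mu, sigma, T, N, paths = 1.0, 0.05, 0.4, 1.0, 2 ** 8, 2000
h = T / N
err_em, err_mil = [], []
for _ in range(paths):
    dW = np.sqrt(h) * rng.standard_normal(N)
    exact = x0 * np.exp((mu - 0.5 * sigma ** 2) * T + sigma * dW.sum())
    err_em.append(abs(integrate(euler_maruyama, x0, mu, sigma, h, dW) - exact))
    err_mil.append(abs(integrate(milstein, x0, mu, sigma, h, dW) - exact))

print("mean strong error, Euler-Maruyama:", np.mean(err_em))
print("mean strong error, Milstein      :", np.mean(err_mil))
```

The Milstein error is markedly smaller at the same step size, consistent with strong orders 1/2 and 1 respectively.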

Both of the methods described above are explicit, since each uses only the value known at X_n to calculate the next value X_{n+1}. When the unknown quantity X_{n+1} appears on both sides of the equation of the integration scheme, the scheme is known as an implicit method. Of course, the equation must be solved for the unknown X_{n+1}. Even if the term cannot be isolated algebraically, it may be solved for numerically. Although implicit schemes may be slightly more complex than their explicit counterparts, such methods often exhibit improved stability properties. Unfortunately, not all explicit schemes used to solve SDEs can be easily modified to produce an implicit scheme that is more stable.

A system of linear ordinary differential equations is considered ‘stiff’ when the eigenvalues of its coefficient matrix are negative and differ greatly in magnitude. When solved numerically, such stiff systems typically require implicit integration schemes due to the greater stability of such methods. Similar phenomena occur in the solution of SDEs and have led to the development of implicit methods to solve SDEs.

The explicit Euler-Maruyama method (3.15) can be turned formally into an implicit scheme simply by evaluating the right-hand side of the integration equation at X_{n+1} instead of X_n, as follows:

    X_{n+1} = X_n + f(X_{n+1}) h + g(X_{n+1}) ΔW_n.    (3.18)

In general, however, this scheme is not suitable for generating approximations because of the use of X_{n+1} in the stochastic term. For example, see [50] (p. 336) for an analysis of the method applied to the one-dimensional homogeneous linear Itô SDE

    dX(t) = f(X(t)) dt + g(X(t)) dW(t),    (3.19)

where unbounded random variables can occur.

An alternative to the above ‘fully’ implicit method is to introduce implicitness into only the deterministic term of the iteration scheme. Continuing with the case of the Euler-Maruyama method (3.15), this gives us the following scheme:

    X_{n+1} = X_n + f(X_{n+1}) h + g(X_n) ΔW_n.    (3.20)

As in Tian and Burrage [98], we will refer to these variations on the Euler-Maruyama method as

  • explicit: X_{n+1} = X_n + f(X_n) h + g(X_n) ΔW_n,
  • semi-implicit: X_{n+1} = X_n + f(X_{n+1}) h + g(X_n) ΔW_n,
  • implicit: X_{n+1} = X_n + f(X_{n+1}) h + g(X_{n+1}) ΔW_n.

The explicit and semi-implicit Euler-Maruyama methods converge to the exact solution of the Itô form of an SDE, but the implicit version involves a backward Itô integral. As pointed out in [98], because of the use of the backward Itô integral in the integration scheme, we expect this method to converge to a different solution. Rümelin [86] shows that a similar predictor-corrector method related to the implicit method converges to the solution of the backward Itô SDE

    dX(t) = f(X(t)) dt + g(X(t)) • dW(t).
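For the semi-implicit scheme (3.20), each step requires solving an equation for X_{n+1}; since the unknown enters only through the drift, a simple fixed-point iteration often suffices. The sketch below is my own illustration (a Newton solve would typically be preferred for strongly stiff drifts), and the example coefficients are placeholders.

```python
import numpy as np

def semi_implicit_em_step(x, h, dW, f, g, iters=20):
    """One step of the drift-implicit Euler-Maruyama scheme (3.20),
    X_{n+1} = X_n + f(X_{n+1}) h + g(X_n) dW, solved by fixed-point iteration."""
    rhs = x + g(x) * dW        # the explicit (known) part of the update
    y = x                      # initial guess: the previous value
    for _ in range(iters):
        y = rhs + h * f(y)     # converges when h * |f'(y)| < 1
    return y

# Example: a stiff linear drift f(x) = -50 x with small multiplicative noise.
rng = np.random.default_rng(4)
f = lambda x: -50.0 * x
g = lambda x: 0.1 * x
x, h = 1.0, 0.01
for n in range(100):
    x = semi_implicit_em_step(x, h, np.sqrt(h) * rng.standard_normal(), f, g)
print(x)
```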

Tian and Burrage [98] use this as a starting point to develop what they call an
