V.V. Dombrovskii
In this work we propose a novel methodology for the optimal dynamic allocation of a portfolio of risky financial assets under hard constraints on trading volumes. Our approach is direct in that it uses the observed historical data directly to construct an adaptive algorithm for online portfolio selection. The portfolio optimization problem is stated as a dynamic problem of tracking a financial benchmark, and we solve it using the model predictive control (MPC) methodology. The main features of our approach are (a) the ability to adapt to non-stationary market environments by dynamically incorporating new information into the decision process; (b) no stochastic assumptions about the stock prices are needed; and (c) flexibility in dealing with portfolio constraints. We also present numerical modeling results, based on futures traded on the Russian Stock Exchange FORTS, that give evidence of the capacity and effectiveness of the proposed approach.
Keywords: investment portfolio, non-stationary financial market, adaptive optimization, model predictive control.
Investment portfolio (IP) management is an area of both theoretical interest and practical importance. The basis of the classical theory of optimal portfolio allocation is the single-period mean-variance approach suggested by Markowitz [1] and the Merton dynamic IP model in continuous time [2]. At present there exists a variety of models and approaches to the IP optimization problem, but most of them are complications and extensions of the Markowitz and Merton approaches to various versions of stochastic models of the prices of risky and risk-free securities and of utility functions. A review of the main trends in the modern theory of stochastic control in finance is given in [3].
Most existing methods need a statistical model of the asset returns (prices). To implement portfolios based on these models in practice, one needs to estimate the model parameters (typically, the means and covariances of asset returns). The portfolio optimization is divided into two steps: 1) observed historical data are first used to compute estimates of the parameters; 2) a suitable optimization problem is then solved using the estimated quantities in place of the true ones. Each of these steps involves restrictive assumptions on the return process, such as the i.i.d. (independent, identically distributed) and stationarity hypotheses. Moreover, one needs an ergodic property of returns to ensure that the time average of a quantity converges to its expectation. However,
market statistics show that return processes are non-stationary and non-ergodic [4], so the result of the optimization is sensitive to estimation errors. Because of estimation error, portfolios that rely on sample estimates typically perform poorly out of sample. Moreover, most of the results presented in the literature are limited to cases without explicit constraints on trading volumes, although it is well known that realistic investment models must include such constraints.
In this work, we take a different route to dynamic portfolio optimization under constraints. The method developed in this paper does not treat estimation and optimization separately. Our route is direct in that it does not rely on a two-stage (estimation/optimization) approach, and no stochastic assumptions are made about the stock prices. Therefore, unlike related models in the literature, no statistical characteristics of the stock prices are needed and no statistical estimation techniques are used to compute the parameters of the portfolio model. Instead, the parameters are treated as adjustable variables and obtained directly from the observed historical data so as to optimize the objective function. We use the methodology of model predictive control (also known as receding horizon control) to design a feedback portfolio optimization strategy [5].
MPC has proved to be an appropriate and effective technique for solving dynamic control problems subject to input and state/output constraints. It has begun to be used with success in financial applications such as portfolio optimization and dynamic hedging; some recent works on this subject are [6]-[10]. In all these papers the authors assume serially independent returns and consider an explicit model of the price process of the risky assets (e.g. geometric Brownian motion). The problem of MPC for discrete-time systems with dependent random parameters and its application to IP optimization is considered in [11, 12].
The main contribution of this paper is a framework for computing dynamic trading strategies that are subject to hard constraints on trading volumes and adaptive to input data. Adaptive algorithms adjust to the underlying data by dynamically incorporating new information into the decision process, so they are naturally more suitable for non-stationary environments such as those in finance. The method developed in this work is based on the idea of moving horizon prediction, that is, predicting the portfolio state from a moving, fixed-size window of data. When a new measurement becomes available, the oldest measurement is discarded and the new one is added. The motivation comes from algorithmic trading, which demands fast, recursive updates of portfolio allocations as new data arrive.
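The fixed-size moving window described above can be illustrated with a short sketch. This is only an illustration using Python's standard `collections.deque`; the function name `make_window` and the sample returns are hypothetical and not taken from the paper.

```python
from collections import deque

def make_window(size):
    """Fixed-size data window: appending to a full window discards the oldest item."""
    return deque(maxlen=size)

# Hypothetical stream of observed returns, oldest first.
window = make_window(3)
for observed_return in [0.01, -0.02, 0.005, 0.03]:
    window.append(observed_return)

# After four observations the window keeps only the three most recent ones.
latest = list(window)
```

With this structure each update costs constant time, which fits the requirement of fast recursive updates as new data arrive.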
We present numerical modeling results, based on futures traded on the Russian Stock Exchange FORTS (Futures & Options on RTS), that give evidence of the capacity and effectiveness of the proposed approach. Numerical examples based on real market data indicate that the approach is sound and computationally efficient.
1. Portfolio optimization problem
Consider an investment portfolio consisting of n risky assets and one risk-free asset (e.g. a bank account or a government bond). Let $u_i(k)$ $(i = 0, 1, 2, \dots, n)$ denote the amount of money invested in the ith asset at time k; $u_0(k) \ge 0$ is the amount invested in the risk-free asset. Then the wealth process V(k) satisfies:
$$V(k) = \sum_{i=1}^{n} u_i(k) + u_0(k). \qquad (1)$$
Notice that if $u_i(k) < 0$ $(i = 1, 2, \dots, n)$, then we take a short position in the amount $|u_i(k)|$.
Let $\eta_i(k+1)$ denote the return of the ith risky asset over the period [k, k+1]. It is a stochastic value, unobservable at time k, defined as
$$\eta_i(k+1) = \frac{P_i(k+1) - P_i(k)}{P_i(k)},$$
where $P_i(k)$ denotes the market value of the ith risky asset at time k.
Considering self-financing strategies (self-financing means that we do not allow wealth to be added to or extracted from the portfolio), the wealth process V(k) at time k+1 is given by:
$$V(k+1) = \sum_{i=1}^{n} [1 + \eta_i(k+1)]\, u_i(k) + [1 + r]\, u_0(k), \qquad (2)$$
where r is the interest rate of the risk-free asset and $u_0(k) = V(k) - \sum_{i=1}^{n} u_i(k)$.
Using (1) we can rewrite (2) as follows (see [11]):
$$V(k+1) = [1 + r]\, V(k) + (\eta(k+1) - e_n r)\, u(k), \qquad (3)$$
where $u(k) = [u_1(k), \dots, u_n(k)]^T$ is the vector of control inputs, $\eta(k) = [\eta_1(k)\ \eta_2(k)\ \dots\ \eta_n(k)]$ is the vector of risky asset returns, and $e_n$ is the n-dimensional vector with unit elements.
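The equivalence of (2) and (3) can be checked numerically. The following is a minimal sketch with made-up numbers; the function names `wealth_step_assets` and `wealth_step_compact` are hypothetical labels for the two forms of the wealth update.

```python
def wealth_step_assets(u, u0, eta_next, r):
    """Eq. (2): grown risky positions plus the grown risk-free position."""
    risky = sum((1.0 + e) * ui for e, ui in zip(eta_next, u))
    return risky + (1.0 + r) * u0

def wealth_step_compact(V, u, eta_next, r):
    """Eq. (3): V(k+1) = (1 + r) V(k) + (eta(k+1) - e_n r) u(k)."""
    excess = sum((e - r) * ui for e, ui in zip(eta_next, u))
    return (1.0 + r) * V + excess

# Illustrative values only: three risky assets, one of them held short.
V = 100.0
u = [30.0, -10.0, 50.0]
u0 = V - sum(u)                  # self-financing remainder held in the risk-free asset
eta_next = [0.02, -0.01, 0.005]  # returns realized over [k, k+1]
r = 0.001

v_from_2 = wealth_step_assets(u, u0, eta_next, r)
v_from_3 = wealth_step_compact(V, u, eta_next, r)
```

Both calls give the same next-period wealth (100.98 for these numbers), since (3) is obtained from (2) by substituting the expression for the risk-free position.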
We impose the following constraints on the control actions [11]:
$$u_i^{\min}(k) \le u_i(k) \le u_i^{\max}(k), \quad (i = 1, \dots, n), \qquad (4)$$
$$u_0^{\min}(k) \le V(k) - \sum_{i=1}^{n} u_i(k) \le u_0^{\max}(k). \qquad (5)$$
If $u_i^{\min}(k) < 0$ $(i = 1, 2, \dots, n)$, the amount of short selling is restricted by $|u_i^{\min}(k)|$; if short selling is prohibited, then $u_i^{\min}(k) \ge 0$ $(i = 1, 2, \dots, n)$. The amounts of long positions are restricted by $u_i^{\max}(k)$ $(i = 1, 2, \dots, n)$; $u_0^{\max}(k) \ge 0$ defines the amount we can invest in the risk-free asset; $u_0^{\min}(k) \le 0$ determines the maximum volume of a loan at the risk-free rate. Note that in practice the values $u_i^{\min}(k)$, $u_i^{\max}(k)$ $(i = 0, 1, \dots, n)$ often depend on the total wealth of the portfolio, so that we can write $u_i^{\min}(k) = \gamma_i' V(k)$, $u_i^{\max}(k) = \gamma_i'' V(k)$, where $\gamma_i'$, $\gamma_i''$ are constant parameters.
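As a small illustration of constraints (4), (5) with wealth-proportional bounds, the sketch below checks whether a candidate allocation is admissible. The function names and all numerical values (including the proportionality coefficients) are hypothetical and chosen only for the example.

```python
def proportional_bounds(V, gamma_lo, gamma_hi):
    """Bounds of the form u_min = gamma' * V, u_max = gamma'' * V."""
    return [g * V for g in gamma_lo], [g * V for g in gamma_hi]

def is_admissible(u, V, u_min, u_max, u0_min, u0_max):
    """Check (4) for each risky position and (5) for the risk-free remainder."""
    u0 = V - sum(u)
    risky_ok = all(lo <= ui <= hi for lo, ui, hi in zip(u_min, u, u_max))
    return risky_ok and (u0_min <= u0 <= u0_max)

# Hypothetical setting: two risky assets, shorting up to 0.5 V, long up to 2 V,
# borrowing up to 1 V and lending up to 3 V in the risk-free asset.
V = 100.0
u_min, u_max = proportional_bounds(V, [-0.5, -0.5], [2.0, 2.0])
u0_min, u0_max = -1.0 * V, 3.0 * V

ok = is_admissible([150.0, 40.0], V, u_min, u_max, u0_min, u0_max)         # loan of 90
too_levered = is_admissible([150.0, 60.0], V, u_min, u_max, u0_min, u0_max)  # loan of 110
```

The second allocation satisfies (4) but violates (5), because financing it would require borrowing more than the assumed limit.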
Our objective is to control the investment portfolio, via dynamic asset allocation among the n stocks and the bond, so that it tracks as closely as possible the deterministic benchmark
$$V^0(k+1) = [1 + \mu_0]\, V^0(k), \qquad (6)$$
where $\mu_0$ is a given parameter representing the growth factor, and the initial state is $V^0(0) = V(0)$.
Let the only source of information at instant k be the history of security returns and the current value of V(k). Notice that the variable $V^0(k)$ is known for all time instants k = 1, 2, ... and may be considered as a pre-chosen parameter. Thereby, the optimal portfolio depends on the current value of the portfolio and on the history of security returns, and it changes as new information arrives. This type of reasoning is standard in control theory under uncertainty and is the basis of feedback control laws.
We use the MPC methodology to define the optimal portfolio control strategy. The main concept of MPC is to solve an open-loop constrained optimization problem with a receding horizon at each time instant and to implement only the first control action of the optimizing solution. This leads to the following optimization problem:
$$J(k+m/k) = E\Big\{ \sum_{i=1}^{m} \big( [V(k+i/k) - V^0(k+i)]^2 + u^T(k+i-1) R(k,i-1) u(k+i-1) \big) \,\Big|\, V(k), \eta(k), \eta(k-1), \dots, \eta(k-N) \Big\} \to \min_{u(k), \dots, u(k+m-1)}, \qquad (7)$$
where m is the prediction horizon, $u(k+i) = [u_1(k+i), \dots, u_n(k+i)]^T$ is the predictive control vector, $R(k,i) > 0$ is a positive definite symmetric matrix of control cost coefficients, $V(k+i/k)$ $(i = 1, \dots, m)$ are the predicted values of the portfolio, and N is the depth of history taken into account. The performance criterion (7) is composed of a quadratic part representing the squared error between the portfolio value and the benchmark, and a quadratic penalty on the control actions. So the deviation of our portfolio is minimized against an ideal benchmark portfolio that is riskless and has positive deterministic returns at each time step.
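To make the structure of criterion (7) concrete, the sketch below generates the benchmark path (6) and evaluates the sum inside the expectation for one given predicted trajectory. It assumes a constant diagonal R (as in the numerical section); the function names and the numbers are hypothetical, with an unrealistically large growth rate chosen only so that the arithmetic is exact.

```python
def benchmark_path(V0_start, mu0, m):
    """Benchmark values V0(k+1), ..., V0(k+m) from V0(k+1) = (1 + mu0) V0(k)."""
    path, v = [], V0_start
    for _ in range(m):
        v = (1.0 + mu0) * v
        path.append(v)
    return path

def tracking_cost(V_pred, V_bench, controls, R_diag):
    """Sum over the horizon of squared tracking error plus u^T R u (R diagonal)."""
    total = 0.0
    for v, b, u in zip(V_pred, V_bench, controls):
        total += (v - b) ** 2 + sum(rd * x * x for rd, x in zip(R_diag, u))
    return total

# Illustrative numbers: horizon m = 2, one risky asset, R = diag(0.1).
bench = benchmark_path(1.0, 0.5, 2)                  # [1.5, 2.25]
cost = tracking_cost([1.0, 2.25], bench, [[1.0], [2.0]], [0.1])
```

For these values the cost is 0.25 (tracking error at the first step) plus 0.1 and 0.4 (control penalties), that is 0.75.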
2. Model predictive control strategies design
Criterion (7) can be transformed into the equivalent form
$$J(k+m/k) = E\Big\{ \sum_{i=1}^{m} \big( V^2(k+i/k) - 2V(k+i/k) V^0(k+i) + u^T(k+i-1) R(k,i-1) u(k+i-1) \big) \,\Big|\, V(k), \eta(k), \dots, \eta(k-N) \Big\}, \qquad (8)$$
where we eliminated the term that is independent of the control variables. Define the predicted value of the portfolio by the equation
$$V(k+i/k) = A^i V(k) + A^{i-1} B[\theta(k)] u(k) + A^{i-2} B[\theta(k)] u(k+1) + \dots + B[\theta(k)] u(k+i-1), \quad (i = 1, \dots, m), \qquad (9)$$
where
$$A = (1 + r), \quad B[\theta(k)] = [\theta(k) - e_n r],$$
$$\theta(k) = \alpha_1 \eta(k) + \alpha_2 \eta(k-1) + \dots + \alpha_N \eta(k-N+1),$$
and $\alpha_1, \alpha_2, \dots, \alpha_N$ are some pre-chosen parameters. These parameters are determined so as to achieve the best results when the model (9) is used for decision making. It must be emphasized that no assumptions are made about these parameters. In particular, the sum of the parameters is not required to be less than or equal to unity (as is required, for example, of the parameters of AR models). Moreover, linearity in the parameters is assumed only for the sake of simplicity. We emphasize that equation (3) determines the evolution of the realized portfolio, whereas equation (9) defines the predicted value of the portfolio. So, unlike related models in the literature, we do not forecast future returns but predict the future value of the portfolio. The model calibration method is described below in Section 3. If at time instant k we observe the actual value of the portfolio and the realization of the returns over a stream of N historical data points, then the observed return sequence becomes deterministic, and (9) returns a deterministic vector $V(k+i/k)$ $(i = 1, \dots, m)$ (before the returns are observed, however, this vector remains uncertain and random). So we optimize the future inputs as deterministic variables (i.e., variables determined by V(k) and $\eta(k), \eta(k-1), \eta(k-2), \dots, \eta(k-N)$). We can re-express (8) as follows
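$$J(k+m/k) = W^T(k+1) W(k+1) - \Delta_1(k+1) W(k+1) + U^T(k) \Delta(k) U(k), \qquad (10)$$
$$W(k+1) = \Psi V(k) + \Phi[\theta(k)] U(k), \qquad (11)$$
where
$$W(k+1) = \begin{bmatrix} V(k+1/k) \\ V(k+2/k) \\ \vdots \\ V(k+m/k) \end{bmatrix}, \quad U(k) = \begin{bmatrix} u(k) \\ u(k+1) \\ \vdots \\ u(k+m-1) \end{bmatrix},$$
$$\Psi = \begin{bmatrix} A \\ A^2 \\ \vdots \\ A^m \end{bmatrix}, \quad \Phi[\theta(k)] = \begin{bmatrix} B[\theta(k)] & 0_{1 \times n} & \dots & 0_{1 \times n} \\ A B[\theta(k)] & B[\theta(k)] & \dots & 0_{1 \times n} \\ \vdots & \vdots & \ddots & \vdots \\ A^{m-1} B[\theta(k)] & A^{m-2} B[\theta(k)] & \dots & B[\theta(k)] \end{bmatrix},$$
$$\Delta_1(k+1) = 2\, [\, V^0(k+1) \ \ V^0(k+2) \ \ \dots \ \ V^0(k+m) \,],$$
$$\Delta(k) = \begin{bmatrix} R(k,0) & 0_{n \times n} & \dots & 0_{n \times n} \\ 0_{n \times n} & R(k,1) & \dots & 0_{n \times n} \\ \vdots & \vdots & \ddots & \vdots \\ 0_{n \times n} & 0_{n \times n} & \dots & R(k,m-1) \end{bmatrix}.$$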
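The prediction model (9) and its stacked form (11) can be cross-checked numerically. The sketch below is an illustration under stated assumptions, not the author's implementation: it uses plain Python lists, the function names (`theta`, `predict_recursive`, `build_psi_phi`, `predict_batch`) are hypothetical, and the data are made up. The recursion $V(k+i/k) = A V(k+i-1/k) + B[\theta(k)] u(k+i-1)$ used here unrolls to exactly the sum in (9).

```python
def theta(history, alphas):
    """theta(k) = alpha_1*eta(k) + ... + alpha_N*eta(k-N+1); history[-1] is eta(k)."""
    n = len(history[-1])
    th = [0.0] * n
    for j, a in enumerate(alphas):      # j = 0 pairs alpha_1 with eta(k)
        eta = history[-1 - j]
        for c in range(n):
            th[c] += a * eta[c]
    return th

def predict_recursive(V, controls, B, A):
    """Step-by-step form of (9)."""
    out = []
    for u in controls:
        V = A * V + sum(b * x for b, x in zip(B, u))
        out.append(V)
    return out

def build_psi_phi(B, A, m):
    """Psi = [A, ..., A^m]^T; Phi has block (i, j) = A^(i-j) B for j <= i, zero otherwise."""
    n = len(B)
    psi = [A ** (i + 1) for i in range(m)]
    phi = [[0.0] * (m * n) for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            for c in range(n):
                phi[i][j * n + c] = A ** (i - j) * B[c]
    return psi, phi

def predict_batch(V, U, psi, phi):
    """Stacked form (11): W = Psi V + Phi U."""
    return [p * V + sum(a * x for a, x in zip(row, U)) for p, row in zip(psi, phi)]

# Illustrative data: two assets, N = 2, m = 2.
r = 0.01
A = 1.0 + r
history = [[0.02, 0.0], [0.04, 0.0]]          # eta(k-1), eta(k)
B = [t - r for t in theta(history, [0.5, 0.5])]  # approximately [0.02, -0.01]
controls = [[1.0, 0.0], [0.0, 2.0]]
U = [x for u in controls for x in u]
psi, phi = build_psi_phi(B, A, 2)
W_rec = predict_recursive(1.0, controls, B, A)
W_batch = predict_batch(1.0, U, psi, phi)
```

Both routes give the predicted values 1.03 and 1.0203 for this example, which is the agreement that (11) asserts.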
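Using (11) we can write (10) as follows
$$J(k+m/k) = V^2(k) \Psi^T \Psi - \Delta_1(k+1) \Psi V(k) + [2V(k) \Psi^T - \Delta_1(k+1)]\, \Phi[\theta(k)] U(k) + U^T(k) \big[ \Phi^T[\theta(k)] \Phi[\theta(k)] + \Delta(k) \big] U(k). \qquad (12)$$
Denote the following matrices:
$$H(k) = \Phi^T[\theta(k)] \Phi[\theta(k)] + \Delta(k), \quad G(k) = \Psi^T \Phi[\theta(k)], \quad F(k) = \Delta_1(k+1) \Phi[\theta(k)].$$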
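Since the terms that do not contain U(k) do not affect the minimizer, the problem of minimizing the criterion (12) subject to (4), (5) is equivalent to the quadratic programming problem with criterion
$$Y(k+m/k) = [2V(k) G(k) - F(k)]\, U(k) + U^T(k) H(k) U(k)$$
subject to the constraints (element-wise inequalities)
$$U_{\min}(k) \le S(k) U(k) \le U_{\max}(k),$$
where
$$U_{\min}(k) = [\, u_{\min}^T(k), 0_{1 \times (n+1)}, \dots, 0_{1 \times (n+1)} \,]^T, \quad U_{\max}(k) = [\, u_{\max}^T(k), 0_{1 \times (n+1)}, \dots, 0_{1 \times (n+1)} \,]^T,$$
$$u_{\min}(k) = [\, u_1^{\min}(k), \dots, u_n^{\min}(k), u_0^{\min}(k) - V(k) \,]^T, \quad u_{\max}(k) = [\, u_1^{\max}(k), \dots, u_n^{\max}(k), u_0^{\max}(k) - V(k) \,]^T,$$
and S(k) is the block matrix of the form
$$S(k) = \mathrm{diag}\big( \bar{S}(k), 0_{(n+1) \times n}, \dots, 0_{(n+1) \times n} \big), \quad \bar{S}(k) = \begin{bmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \\ -1 & -1 & \dots & -1 \end{bmatrix}.$$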
The MPC policy with receding horizon m for each instant k is defined by the equation
$$u(k) = [\, I_n \ \ 0_n \ \ \dots \ \ 0_n \,]\, U(k),$$
where $I_n$ is the n-dimensional identity matrix, $0_n$ is the n-dimensional zero matrix, and U(k) is the solution of the quadratic programming problem stated above. This gives the desired control law.
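For intuition, the quadratic program can be solved in closed form in the simplest special case of one risky asset and a one-step horizon (n = 1, m = 1). In that case $\Psi = A$, $\Phi = b = \theta(k) - r$, $H = b^2 + R$, $G = Ab$, $F = 2V^0(k+1)\,b$, and the constrained minimizer of a convex scalar quadratic is the unconstrained minimizer clipped to the admissible interval. The sketch below covers only this special case; the general problem needs a QP solver, and the function name and numbers are hypothetical.

```python
def one_step_mpc_scalar(V, V0_next, theta_k, r, R, u_min, u_max, u0_min, u0_max):
    """Scalar case n = 1, m = 1 of the QP: minimize (2 V G - F) u + H u^2 under (4), (5)."""
    A = 1.0 + r
    b = theta_k - r
    H = b * b + R                                 # H = Phi^T Phi + Delta
    lin = 2.0 * V * A * b - 2.0 * V0_next * b     # 2 V G - F
    u_free = -lin / (2.0 * H)                     # unconstrained minimizer
    # (4): u_min <= u <= u_max;  (5): u0_min - V <= -u <= u0_max - V
    lo = max(u_min, V - u0_max)
    hi = min(u_max, V - u0_min)
    if lo > hi:
        raise ValueError("constraints (4) and (5) are incompatible")
    return min(max(u_free, lo), hi)

# Illustrative numbers: V = 1, benchmark target 1.003, R = 1e-4, r = 0.
u_inside = one_step_mpc_scalar(1.0, 1.003, 0.01, 0.0, 1e-4, -1.0, 4.0, -4.0, 4.0)
# A much smaller control penalty pushes the minimizer beyond the upper bound.
u_clipped = one_step_mpc_scalar(1.0, 1.003, 0.0005, 0.0, 1e-8, -1.0, 4.0, -4.0, 4.0)
```

In the first call the minimizer lies inside the admissible interval (about 0.15); in the second the unconstrained minimizer would exceed the long-position limit, so the control is set to the bound 4.0.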
3. Numerical examples
This section tests the proposed approach. We want to assess the performance of our model under real market conditions by computing the portfolio wealth over a long period of time. To this end, we conduct a backtest on real security returns. We consider an investor who has to allocate wealth among five risky assets and one risk-free asset. The portfolio is updated once every trading day. We used five futures traded on the Russian Stock Exchange FORTS (Futures & Options on RTS): RTS, Gazprom, LUKOIL, Sberbank, GOLD. We tested the results on actual daily closing prices over the period from July 20, 2007 to May 22, 2013. The risk-free asset is a bank account with risk-free rate r = 0 per annum. We set the tracking target to a return of 0.3% per day ($\mu_0 = 0.003$). For our portfolio, we assumed an initial wealth of $V(0) = V^0(0) = 1$. The weight coefficients are set as $R = \mathrm{diag}(10^{-4}, \dots, 10^{-4})$. We impose hard constraints on the tracking portfolio problem with parameters $\gamma_i' = 4/5$, $\gamma_i'' = 4$ $(i = 1, \dots, 5)$, $\gamma_0' = 4$. For the online finite-horizon MPC problems we used a prediction horizon of m = 10 and solved them numerically in MATLAB using the quadprog.m function.
First we need to tune the model parameters. Lacking an analytical solution, we tune the parameters and the data window length N on an initial training period so as to minimize the objective function. The method recalculates the MPC trading strategies with different parameter sets, each of which is passed as input to the MPC algorithm. We compute the portfolio wealth over the training period for each parameter set and select the one that gives the best tracking results. To reduce the number of tuning parameters we simplify the model. Let
$$\alpha_1 = \alpha_2 = \dots = \alpha_{N_1} = \alpha^{(1)}; \quad \alpha_{N_1+1} = \dots = \alpha_N = \alpha^{(2)}; \quad N_1 + N_2 = N.$$
Then
$$\theta(k) = \alpha^{(1)} \sum_{i=1}^{N_1} \eta(k-i+1) + \alpha^{(2)} \sum_{i=1}^{N_2} \eta(k-N_1-i+1).$$
Thus the simplified model includes only two parameters $\alpha^{(1)}$, $\alpha^{(2)}$ and two data window lengths $N_1$, $N_2$. The numbers of observations in the training and test datasets are 200 and 1200, respectively. The following parameter values were selected: $\alpha^{(1)} = 0.7$, $\alpha^{(2)} = 0.3$, $N_1 = 15$, $N_2 = 10$. These parameters are assumed to be stationary over the investment horizon and equal to the initial empirical values obtained from past data. The results are summarized in three figures. Figure 1 plots the portfolio value (bold line) and the benchmark (dotted line). Figure 2 shows the amounts invested in the RTS futures. Figure 3 illustrates the evolution of daily RTS futures returns.
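The simplified two-window predictor $\theta(k)$ used in this section can be sketched as follows. This is an illustrative helper, not the code used for the reported experiments; the function name `theta_two_window` and the sample data are hypothetical, and the history is assumed to be stored oldest first.

```python
def theta_two_window(history, a1, a2, N1, N2):
    """theta(k): weight a1 on the N1 most recent return vectors and a2 on the N2 before them.

    history[-1] is eta(k); every element is a list with one return per asset.
    """
    if N1 < 1 or N2 < 0 or len(history) < N1 + N2:
        raise ValueError("need N1 >= 1, N2 >= 0 and at least N1 + N2 observations")
    n = len(history[-1])
    recent = history[-N1:]                 # eta(k), ..., eta(k-N1+1)
    older = history[-(N1 + N2):-N1]        # eta(k-N1), ..., eta(k-N+1)
    return [a1 * sum(e[c] for e in recent) + a2 * sum(e[c] for e in older)
            for c in range(n)]

# Illustrative single-asset history with N1 = 2, N2 = 3 and the weights 0.7 / 0.3.
history = [[0.01], [0.02], [0.03], [0.04], [0.05]]
th = theta_two_window(history, 0.7, 0.3, 2, 3)
```

Here the two most recent returns sum to 0.09 and the three older ones to 0.06, so the predictor equals 0.7 * 0.09 + 0.3 * 0.06 = 0.081.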
Fig. 1. Performance of benchmark tracking ($V$ - bold line, $V^0$ - dotted line); horizontal axis: k
Fig. 2. Invested amount in RTS futures (u against k)
We find that the proposed approach performs reasonably on actual data. The portfolio value effectively tracks the benchmark and respects the constraints. It is important to note that even in this example, where a simple, unsophisticated approach is used to tune the parameters, the tracking performance is good. The obvious appeal of our approach is its simplicity and the fact that it is not tied to a special class of forecasting schemes.
Fig. 3. Daily returns of RTS Stock Exchange Index futures
Conclusion
In this paper we studied a discrete-time portfolio selection problem subject to constraints on trading volumes. The optimal open-loop adaptive portfolio control strategy is derived using the MPC methodology. We also presented numerical modeling results, based on futures traded on the Russian Stock Exchange FORTS, that give evidence of the capacity and effectiveness of the proposed approach.
The main features of our approach are (a) the ability to adapt to non-stationary market environments by dynamically incorporating new information into the decision process; (b) no stochastic assumptions are needed about the stock prices; (c) it is not oriented to a special class of forecasting schemes, and (d) the flexibility of dealing with portfolio constraints.
1. Markowitz H.M. Portfolio selection // J. Finance. 1952. V. 7. No. 1. P. 77-91.
2. Merton R.C. Continuous-time finance. Cambridge: Blackwell, 1990.
3. Runggaldier W.J. On stochastic control in finance // Mathematical Systems Theory in Biology, Communication, Computation and Finance / D. Gilliam and J. Rosenthal, Eds. New York: Springer, 2002.
4. Cont R. Empirical properties of asset returns: stylized facts and statistical issues // Quantitative Finance. 2001. V. 1. P. 223-236.
5. Rawlings J. Tutorial: model predictive control technology // Proc. Amer. Control Conf. San Diego. California. June 1999. P. 662-676.
6. Dombrovskii V.V., Dombrovskii D.V., and Lyashenko E.A. Predictive control of random-parameter systems with multiplicative noise. Application to investment portfolio optimization // Automation and Remote Control. 2005. V. 66. No. 4. P. 583-595.
7. Dombrovskii V.V., Obedko T.Yu. Predictive control of systems with Markovian jumps under constraints and its application to the investment portfolio optimization //Automation and Remote Control. 2011. V. 72. No. 5. P. 989-1003.
8. Herzog F., Dondi G., Geering H.P. Stochastic model predictive control and portfolio optimization // Int. J. Theoretical and Applied Finance. 2007. V. 10. No. 2. P. 203-233.
9. Primbs J.A., Sung C.H. A stochastic receding horizon control approach to constrained index tracking // Asia-Pacific Finan Markets. 2008. V. 15. P. 3-24.
10. Bemporad A., Puglia T., Gabriellini T. A stochastic model predictive control approach to dynamic option hedging with transaction costs // Proc. American Control Conference, San Francisco, CA, USA, June 29 - July 01, 2011. P. 3862-3867.
11. Dombrovskii V.V., Dombrovskii D.V., Lyashenko E.A. Model predictive control of systems with random dependent parameters under constraints and its application to the investment portfolio optimization // Automation and Remote Control. 2006. V. 67. No. 12. P. 1927-1939.
12. Dombrovskii V.V., Obedko T.Yu. Portfolio optimization in the financial market with serially dependent returns under constraints // Tomsk State University Journal of Control and Computer Science. 2012. No. 2 (19). P. 5-13.
Dombrovskii Vladimir V.
Tomsk State University
E-mail: 12 2013 .