doi:10.1016/S0378-4371(00)00381-2
Copyright © 2000 Elsevier Science B.V. All rights reserved.
Intraday patterns and local predictability of high-frequency financial time series
Humboldt–University Berlin, Institute of Physics, 10115 Berlin, Germany
Received 8 May 2000; revised 14 June 2000.
Available online 4 December 2000.
Abstract
We investigate the structure of high-frequency financial time series, taking the DAX future as an example, with respect to the existence of local order on a time horizon of a few minutes. We show that there may be special local situations in which local order exists and the predictability is considerably higher than average. We first discretize the time series and investigate the continuation frequency of definite words of length n. Besides higher-order Shannon entropies and conditional entropies (dynamic entropies), which yield mean values of the uncertainty/predictability, we study the local values of the uncertainty/predictability and the distribution of these quantities. The significance of the local order is assessed by means of surrogate sequences with the same short memory as the original data.
Author Keywords: Local predictability; Entropy; Symbol dynamics
PACS classification codes: 05.45.Tp; 02.50.Ey; 65.50.+m
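The procedure outlined in the abstract (discretize the price series, count words of length n, compute block and conditional entropies) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the three-symbol alphabet follows the captions' 0/1/2 coding, but the threshold value, the function names, and the normalization of entropies to the logarithm of the alphabet size are assumptions made here.

```python
import math
from collections import Counter


def discretize(prices, threshold):
    """Map successive price changes to symbols: 0 (down), 1 (flat), 2 (up).

    `threshold` is a hypothetical cut-off; the page does not state the value used.
    """
    symbols = []
    for prev, cur in zip(prices, prices[1:]):
        change = cur - prev
        if change < -threshold:
            symbols.append(0)
        elif change > threshold:
            symbols.append(2)
        else:
            symbols.append(1)
    return symbols


def block_entropy(symbols, n, alphabet_size=3):
    """Shannon entropy H_n of words of length n, in units of log(alphabet_size)."""
    if n == 0:
        return 0.0
    # Count every overlapping word of length n in the symbol sequence.
    words = Counter(tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1))
    total = sum(words.values())
    return -sum((c / total) * math.log(c / total, alphabet_size)
                for c in words.values())


def conditional_entropy(symbols, n, alphabet_size=3):
    """Mean uncertainty h_n = H_{n+1} - H_n of the symbol following a word of length n."""
    return (block_entropy(symbols, n + 1, alphabet_size)
            - block_entropy(symbols, n, alphabet_size))
```

With this normalization h_n lies between 0 (the next symbol is fully determined) and 1 (no information). For short series the estimates at larger n become unreliable, because most words of length n + 1 occur only a few times.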
Fig. 1. DAX future (upper curve) and local uncertainty h5 of the prediction of the sixth symbol based on the five preceding symbols (lower curve) for a trading day with large fluctuations. The grey value of the lower curve codes the level of significance calculated from surrogates with a memory of one. Dark represents a large deviation from the noise level (good significance). There is no trivial connection between the price evolution (upper curve) and the predictability (lower curve).
Fig. 2. Conditional entropy (uncertainty) as a function of word length n. The uncertainties for the original data (solid curve) are always smaller than those calculated from the surrogate sequences (dashed curve) of the same length. The surrogates have a memory of one, i.e., for infinite surrogate sequences the uncertainties would be constant for n ≥ 1 (dotted curve). Beyond n=5 (grey region) the calculation of the conditional entropy is not reliable due to large statistical errors (finite length effects) [10, 11, 12].
Fig. 3. Local uncertainty distribution of the surrogate sequence for the word 11111.
Fig. 4. Out-of-sample performance (solid line) as a function of the minimal desired in-sample prediction significance K. The total number of out-of-sample predictions (dashed line) with an in-sample significance larger than K decreases nearly exponentially with K.
Table 1. Conditional probability p(2)(A2|A1) of the discretized price changes

Table 2. Words with the smallest uncertainty hn (highest predictability rn=1−hn) have a good significance K. On average, the significance K decreases with the word length n due to finite length effects.

Table 3. Empirically observed relative frequencies of a larger downturn (0), a roughly constant market (1) and a larger upswing (2) over the next 2 trading minutes, for a variety of histories of the preceding five symbols (10 min). Each history is summarized by its word, with the absolute frequency in parentheses. These are the most predictable events from Table 2.
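Continuation frequencies of the kind tabulated here, together with a per-word local uncertainty, could be computed along the following lines. This is a sketch under stated assumptions: the local uncertainty is taken to be the Shannon entropy of the empirical next-symbol distribution, normalized to the logarithm of the alphabet size so that r = 1 − h lies in [0, 1]. The function names and the `min_count` filter are hypothetical, and the significance test against surrogates is not reproduced.

```python
import math
from collections import Counter, defaultdict


def continuation_counts(symbols, n):
    """For each word of length n, count how often each symbol follows it."""
    counts = defaultdict(Counter)
    for i in range(len(symbols) - n):
        word = tuple(symbols[i:i + n])
        counts[word][symbols[i + n]] += 1
    return counts


def local_uncertainty(next_counts, alphabet_size=3):
    """Entropy of the next-symbol distribution after one word (assumed definition)."""
    total = sum(next_counts.values())
    return -sum((c / total) * math.log(c / total, alphabet_size)
                for c in next_counts.values() if c > 0)


def most_predictable(symbols, n, min_count=1, alphabet_size=3):
    """Rows of (word, absolute frequency, relative frequencies, h), lowest h first."""
    rows = []
    for word, nxt in continuation_counts(symbols, n).items():
        total = sum(nxt.values())
        if total >= min_count:  # hypothetical filter against rarely seen words
            freqs = [nxt.get(s, 0) / total for s in range(alphabet_size)]
            rows.append((word, total, freqs, local_uncertainty(nxt, alphabet_size)))
    return sorted(rows, key=lambda row: row[3])
```

Calling `most_predictable(symbols, 5)` on a discretized series would list five-symbol histories with their absolute frequency and the relative frequencies of the symbols 0, 1 and 2 that followed them. A low h from a word seen only a handful of times is not evidence of order, which is why the paper pairs these values with a significance K.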
