1. Introduction
Video content is evolving into more complex forms of media, with a variety of combinations of spatial resolution, dynamic range, color range, codecs, containers and frame rates. Among others, it is well known that high-frame-rate (HFR) video content is important for high-quality consumption and especially relevant for further video investigation, since transmission bandwidth and storage are affected by new frame rate formats [1]. Generating new video and immersive media comes with many challenges for traditional infrastructures and sharing possibilities, particularly for higher resolutions, including temporal resolution. HFR contributes to an increase in perceived quality, but in most practical applications, video still rarely exceeds 60 frames per second (fps) [1,2].
Low frame rates (LFR) or standard frame rates (SFR) have become an obvious limitation, especially for sports, fast-action genres (as in cinema and gaming) and immersive virtual reality (VR) and augmented reality (AR) content [2]. This is also recognized in Recommendation BT.2020, the ultra-high-definition (UHD) television standard, where frame frequencies of up to 100 and 120 fps are adopted for further exploitation [
3]. Moreover, in the Advanced Television Systems Committee (ATSC) 3.0 ecosystem, high or very high frame rates are expected, as well as video services developed by interconnecting 5G communication networks and ATSC 3.0 broadcasting. Currently, in these cases, high-efficiency video coding (HEVC or H.265) is adopted to deal with novel video technology formats like HFR [
4,
5,
6]. Progressive formats are accompanied by picture rates that allow for SFR and HFR (e.g., 120 fps) recovery and temporal filtering [
5,
6]. Media over Internet Protocol (IP) provides a high level of flexibility, and new broadcast systems are already using IP infrastructures [
7]. Such infrastructures enable HFR distribution, but handling video in this manner also leads to significant changes in bandwidth requirements. Besides future formats and services, compression technologies are of crucial interest [
6]. The perceptual quality improvement resulting from HFR is recognized in industrial and academic communities [
8]. HFR is preferable in applications that aim to enhance a smooth end-user experience and to produce various effects [
9,
10]. It is not easy to select and adopt frame rates, since frame frequency changes produce distortions. In typical workflows, the frame rate is decided before acquisition. This leads to the general suggestion that the frame rate be kept as high as possible during the production phase, with end-user deliverables adapted to the final required frame rates [
10]. An increasing number of modern content creators capture and share their activities using HFR on social networks and IP sharing platforms [
11,
12,
13]. Working with fast-forward video and similar content that does not consist of temporally consecutive frames is especially challenging, since the quality depends jointly on frame rate and compression [
14]. Video reproductions in HFR have been reported and analyzed in [
13,
14,
15,
16,
17,
18,
19,
20,
21,
22,
23]. Video conversion, on the other hand, usually means frame rate upscaling or upconversion, often referred to as frame interpolation [
24,
25]. One should keep in mind that such conversion is also accompanied by a compression format [
26,
27,
28,
29,
30,
31,
32].
There has been extensive analysis of video tracing by long-range dependency (LRD) and multifractals [
33,
34,
35,
36,
37,
38]. One of the most popular tracing tools is the Fast Forward Moving Picture Experts Group (Fast Forward MPEG or ffmpeg) solution [
33], while self-similarity is considered one of the most powerful properties [
34], where LRD and multifractals have been used in many applications related to different types of sequences [
35], statistical modeling and analysis of video traffic [
36,
37] and queuing performance [
38]. Video traces obtained after frame parsing have been examined for two main purposes: traffic modeling [
39,
40,
41,
42,
43,
44,
45,
46,
47,
48,
49,
50,
51,
52,
53,
54,
55] or characterization of compressed video focusing on a specific standard [
56,
57,
58,
59]. For traffic modeling, studies are related to specific protocols [
39], queuing [
40], variable bitrates [
41], specific prediction models [
41,
42,
43,
44,
45,
46,
47,
48,
49,
50,
51], buffering [
52], dynamic bandwidth allocation [
53], attacks [
54] and various content [
55]. On the other hand, characterization of compressed video using a specific standard and video traces has been considered for MPEG-4 Advanced Simple Profile (ASP) [
56], MPEG-4 version 2 and H.263 in [
57] and advanced video coding (AVC or H.264) [
58] and its extensions in [
59]. So far, to the author’s knowledge, LRD- and multifractal-based studies oriented towards the characterization of compressed video focusing on a specific standard and different compression factors have only been examined up to MPEG-4/AVC content [
56,
57,
58,
59], while HEVC must deal with modern HFR video content. Since each standard affects tracing differently, it is crucial to understand the LRD and multifractal behavior of HEVC HFR compressed content while keeping in mind different compression qualities. Moreover, studies related to the detection of frame rate upconversion, i.e., temporal resolution recovery (TRR), are presented in [
60,
61,
62,
63]. In recent works, MPEG-4 traces have been investigated [60]. These are mostly examined for original LFR content: frame rates of 24 fps upconverted to 30 fps [
61], frame rates of 15 fps [
62] and frame rates of 15 to 30 fps [
63]. Previous studies have shown that it is possible to perform TRR detection [
60,
61,
62,
63] using MPEG-4/AVC SFR video content/traces.
The purpose of this work is to analyze video traces corresponding to HFR HEVC compressed content using LRD and multifractals and to tackle the issue of TRR detection by examining the effects found in TRR. Namely, this paper addresses HEVC (or H.265) frame size traces extracted from HFR video content, considered from an LRD and multifractal point of view. As a consequence, the analysis of HFR goes hand in hand with compression, observed using constant rate factors. Here, the focus is on UHD HEVC video traces collected from data corresponding to HFR with frequencies up to 120 fps, where tests have been performed using publicly available reference HFR video content. Temporal resolution recovery has also been examined. The contributions of this paper are as follows:
- HFR HEVC frame size traces show specific behavior in LRD- and multifractal-based analysis, where differences before and after temporal resolution recovery (TRR) exist.
- The experimental results are obtained for HEVC compressed HFR video frame size traces for the first time in the multifractal domain, which may contribute to the recognition of possible changes like TRR.
- Having in mind the obtained results and spectra behavior, a novel method is proposed for TRR detection regardless of the compression level expressed through constant rate factors.
- The proposed TRR detection model based on a weighted k-nearest neighbors (weighted kNN or WkNN) classifier shows high detection accuracy in the performed experimental analysis using a relatively low number of features.
This paper is organized as follows. After the introduction, in
Section 2, a brief description of HFR, video quality and coding is given. In
Section 3, additional details on works related to multifractal analysis of compressed video content are presented. Video frame size traces and data gathering are explained in
Section 4. Applied methods for LRD and multifractal spectrum calculation are described in
Section 5. HFR content is characterized using LRD and multifractal properties before and after TRR. Moreover, a novel model for TRR detection is proposed based on HFR video multifractal analysis, the WkNN classifier and a relatively low number of multifractal features. The experimental results on 4k 120 fps HFR content are shown in
Section 6, where a high accuracy percentage is obtained for different content and compression rate factors. Finally, conclusions are given in
Section 7.
2. HFR Processing and Challenges
It is well known that frame rate impacts the quality of experience, relating to how realistic the consumed content appears or which style one desires to obtain, such as motion blur, slow motion or fast forward. HFR video is expected to approach realism when much action is happening, as in sports, busy movie scenes and gaming, but also in live content aiming for a realistic experience with crisp information. One of the benefits is increased realism, where video seems more immersive, making the viewer’s experience more lifelike.
Frame rate is generally described as the number of frames per second (fps), which is illustrated in
Figure 1, where HFR means that temporal resolution is increased and more images are captured in a given amount of time. HFR can be described as video content captured or displayed at a frame rate of 60 fps or higher. This is in contrast to the SFR/LFR that is typically used for television [
5,
6]. Besides the realistic experience, which is tied to the perception of motion, HFR reduces motion blur when an object is moving and enables its clearer representation, described as smooth motion. Even so, differences in frame rates between acquisition and reproduction may produce uneven pacing and sometimes longer frames. Inconsistent frame times, known as judder, and low frame rates producing so-called stutter effects are only some of the issues [
12]. Video quality estimation may produce different results due to varying video behavior when frame rate transforms are made according to specific quality mode selections. Wearable and lightweight cameras, like action cameras, are popular in the consumer industry, meaning that both professional and nonprofessional content contains a variety of distortions [
13].
Temporal resolution changes through objective measurements are still mainly analyzed using standard full-reference approaches [
17]. Quality assessments are usually made by mean squared error, peak signal-to-noise ratio, structural similarity index, video multimethod assessment fusion and similar metrics [
18,
19,
20]. In [
12], frame rate differences have been considered by using video multimethod assessment fusion (VMAF) and entropy differences. Video collections like Youtube-UGC [
21] or Konstanz KoNViD-1k [
22] are made for research using different quality scores, compression results and distortion diversity, and can be used for purposes like constructing general no-reference models. Still, only a few studies have specifically considered HFR with publicly available video sets acquired at frame rates equal to or above 60 fps, like Waterloo HFR [
16], LIVE-YT-HFR [
12] and BVI-HVR [
1]. Primarily, for HFR experimental analysis, the video tracing and compression analysis Ultra Video Group (UVG) dataset can be used, since it consists of 120 fps sequences of even higher spatial resolution, containing RAW video content up to 4k 120 fps [
23]. This is the reason why this dataset is chosen here. The general suggestion in production is to keep the frame rate as high as possible, where the choice of video frame rate may be intentionally HFR [
10].
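As a simple illustration of the full-reference metrics mentioned above, the peak signal-to-noise ratio (PSNR) between a reference and a distorted frame can be computed as follows (a minimal NumPy sketch; the toy frames and 8-bit peak value are assumptions for illustration, not data from the experiments):

```python
import numpy as np

def psnr(reference: np.ndarray, distorted: np.ndarray, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio between two frames (8-bit by default)."""
    mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(max_value ** 2 / mse)

# Hypothetical 8-bit grayscale frames
ref = np.zeros((4, 4), dtype=np.uint8)
dist = ref.copy()
dist[0, 0] = 16  # introduce a single-pixel error
print(round(psnr(ref, dist), 2))  # → 36.09
```

Metrics such as SSIM and VMAF follow the same full-reference pattern but model structural and perceptual aspects rather than pixel-wise error alone.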
One should keep in mind that even if an acquired video is HFR, this can be a significant barrier for many systems and devices. Devices with limited processing power may require downsampling, like frame dropping. HFR may even be experienced as unnatural and hard for the human visual system to follow, leading to frame rates being decreased. Moreover, interoperability between components of a system may force lower frame rates for HFR processing tasks. The frame rate can be downscaled, leading to a significant decrease in storage or streaming costs. On the other hand, it is well known that decreasing the frame rate can also result in a choppy video experience. This means that video conversion can also be followed by frame rate upscaling, usually referred to as frame interpolation, where temporal resolution is increased by adding frames between the known frames [
24,
25]. This represents temporal resolution recovery or TRR.
Any conversion is difficult and comes with many challenges, especially in the temporal domain. HFR leads to larger video file sizes and bandwidth challenges. Since frame rate affects storage and the capacity of the telecommunication channel, HFR quality is accompanied by compression. This inevitably introduces possible unwanted artifacts and undesirable components in the resulting motion picture, where coding and compression solutions make it possible to decrease the video size while keeping the video quality high. Appropriate insight into such content is therefore needed.
HFR assessment goes hand in hand with video compression. In a nutshell, a variety of coding standards and codecs is needed to achieve specific tasks. MPEG is dedicated to efficient coding and compression algorithms, where the MPEG-x and H.26x standards have been popular over the years [
26,
27,
28,
29,
30,
31]. Algorithms are becoming more complex, and advancements are being made to deal with new video technologies. In general, coding steps include block-oriented intra- and interprediction, transformation and quantization, filtering and entropy coding. MPEG-2 became popular over the years in practical applications like broadcasting, while MPEG-4 continues to be a leading choice for streaming implementations. H.264, or the AVC standard (MPEG-4 Part 10), was introduced in 2003 by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) and the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) [
26,
27,
It is a video compression standard based on block-oriented, motion-compensated coding, supported by a wide range of devices and systems, and is still one of the most widely accepted standards, known for high compression efficiency and high-definition television implementation.
AVC was followed by HEVC, introduced in 2013, also known as H.265 or MPEG-H Part 2 [
29,
30]. Briefly speaking, as a next-generation standard, HEVC compresses video to approximately half the size produced by AVC. It supports streaming and broadcasting at higher resolutions, and HEVC is widely used in action cameras and smartphones for HFR purposes. Also, there are many other available solutions for IP delivery, like VP9 and its successor AV1 [
31,
32,
33]. Nevertheless, it should not be neglected that AVC is still in force for various implementations, but when it comes to HFR, a transition to standards like HEVC is expected [
26]. The effects of HFR HEVC compressed video content have not been considered to a large extent, although they are relevant to practical implementations. For example, recently, in [
32], HFR was analyzed from a perceptual quality point of view for HEVC and VP9 using full high-definition (HD) video sequences and five constant quality factor values, showing better performance of HEVC at higher rates according to standard metrics.
Frame size trace sequences for AVC HD and 4k/UHD HEVC video can be seen in
Figure 2 for a sequence taken from the publicly available UVG dataset described in [
23]. Comparison after frame frequency alignment in
Figure 2 shows different trace sequence behavior. Still, self-similarity-based analysis related to HFR HEVC has not been performed so far to the best of the author’s knowledge.
There are a lot of challenges related to HFR processing that need to be further investigated, such as effects due to compression and codec settings, effects due to differences in frame rates, no-reference HFR content characterization, HFR reproduction, editing and hardware utilization. Here, only some of these issues are tackled. Effects due to compression are considered in this paper for HEVC standard and TRR, but other standards like VP9 and content modifications can also be taken into account. No-reference characterization and quality estimation is of general interest for HFR processing. A relatively large amount of different raw HFR video content is needed in the research community. Also, HFR reproduction, decoding and editing require additional resource utilization compared with SFR due to the high number of frames per second, where possible effects of available acceleration approaches need to be researched further.
3. Self-Similarity and Multifractal Analysis of Compressed Video Content
Video can be manipulated in many ways, affecting and controlling the overall quality. The most frequent choices are setting a constant quality factor or buffer size, or using constant, constrained or variable bit rates [
33,
34,
35,
36,
37,
38,
39,
40,
41,
42,
43,
44,
45,
46]. A large number of physical systems and nonstationary signals tend to show similar behavior at different scales, known as having self-similarity properties [
34,
35,
36]. These properties have been analyzed using fractal and multifractal theory for compressed video content and video tracing.
In order to ensure a desirable quality of service, self-similarity is investigated for constant and variable bit rates in [
37]. Self-similar patterns are explored for high-speed network traffic in [
38]. In early works, long-range dependency or LRD in video traffic represented by traces has been mostly quantified by a single-parameter Hurst exponent [
37,
38]. LRD means that traces exhibit correlations over a range of time scales. Besides standard statistical video traffic metrics, like the mean and variance of video traces, additional self-similarity properties have been investigated under different conditions, where LRD is only one feature of fractal-like behavior. In [
39], Transmission Control Protocol (TCP) traffic, collected as the number of bytes arriving per time unit, is shown to be multifractal and is analyzed using spectra, enabling valuable statistical estimation. Multifractals are applied to the behavior of a queuing system in [
40]. Tail distributions in a multifractal sense while measuring variable bit rate are compared in [
41].
Generally, there are two main directions in the analysis of video traces. The first one is traffic modeling. Self-similarity has been widely recognized, and multifractal-based traffic modeling has been found suitable for video tracing [
39,
40,
41,
42,
43,
44,
45]. Different self-similarity models are tested for network traffic analysis and prediction: the fractional Brownian motion traffic model [
41], wavelets [
42,
43], multiplicative approaches [
44,
45], autoregressive models [
46,
47,
48,
49] and Markov chains [
50,
51]. Experimental multifractal analysis is applied for dimensioning, buffer capacity interpretation and statistical multiplexing of video streams [
52], as well as for dynamic bandwidth allocation [
53]. Multifractal spectra have been compared during normal operation and during attacks in a communication network [
54], while differentiation between spectra is used to show the consistency of LRD [
55].
The second direction in investigating self-similarity properties of video traces is oriented towards the characterization of compressed video, having in mind specific standards. Most of the research uses MPEG-4 traces for testing, MPEG-4 being one of the most practically valuable standards, as in [
56,
57]. MPEG-4 Advanced Simple Profile (ASP)-based encoded traffic is tested for estimating queuing performance [
56]. In [
57], MPEG-4 version 2 and H.263 video traces are compared using accompanying parsers in order to extract ten sequences, so-called frame size traces, which are found statistically valuable for performance testing. This work has been continued on newer encoders like H.264/MPEG-4 AVC [
58,
59]. H.264 video compressed traces are analyzed using multifractal and fractal approaches [
58]. Trace analysis with extended encoding standards is performed in [
59]. The whole encoding or transcoding process takes time and, due to differing settings, it is hard to compare previously generated traces with new ones [
56,
58]. The frame size trace sequence differs for each standard.
4. HFR HEVC Video Traces and Temporal Recovery Data
Temporal recovery or frame upconversion detection has been examined in [
60,
61,
62,
63] using MPEG-4/AVC traces. In [
62], motion-compensated frame rate upconversion is proposed together with the possibility of its detection via an optical flow algorithm, where the original frame rates were 15 fps. Frame rate conversion detection is also analyzed in [
63], having in mind interpolation schemes like the common nearest-neighbor interpolation. An automatic approach using machine learning is proposed for four original frame rates from 15 to 30 fps and conversions up to 30 fps. Multifractality may be useful in recovery detection, having in mind video tracing and multifractal analysis [
56,
57,
58,
59,
64,
65,
66]. For example, machine learning and multifractal features are applied for an intrusion detection system in an unmanned aerial system in [
65]. Moreover, the Legendre multifractal spectrum is applied in [
66] for animation frame analysis and its differentiation from real and partially animated content, especially due to self-similarity properties also found in video traffic analysis.
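To make the Legendre multifractal spectrum concrete, the following sketch estimates the partition-function scaling exponents τ(q) of a positive frame size trace and derives the (α, f(α)) spectrum via the Legendre transform. The box sizes and moment orders are illustrative assumptions, not the settings of the cited works:

```python
import numpy as np

def legendre_spectrum(trace, qs, box_sizes):
    """Estimate the Legendre multifractal spectrum (alpha, f(alpha)) of a
    positive 1D series via partition-function scaling Z(q, eps) ~ eps^tau(q)."""
    mu = np.asarray(trace, dtype=float)
    mu = mu / mu.sum()                      # normalized measure
    tau = np.empty(len(qs))
    log_eps = np.log([s / len(mu) for s in box_sizes])
    for i, q in enumerate(qs):
        log_z = []
        for s in box_sizes:
            n = len(mu) // s
            boxes = mu[:n * s].reshape(n, s).sum(axis=1)
            boxes = boxes[boxes > 0]        # skip empty boxes
            log_z.append(np.log(np.sum(boxes ** q)))
        tau[i] = np.polyfit(log_eps, log_z, 1)[0]   # slope of log Z vs. log eps
    alpha = np.gradient(tau, qs)            # alpha(q) = d tau / d q
    f = qs * alpha - tau                    # Legendre transform
    return alpha, f

# Monofractal sanity check: a uniform trace collapses to the point (1, 1)
alpha, f = legendre_spectrum(np.ones(1024), qs=np.linspace(-5, 5, 21),
                             box_sizes=[4, 8, 16, 32, 64])
print(alpha.round(2).min(), alpha.round(2).max())  # → 1.0 1.0
```

A genuinely multifractal trace would instead produce a spread of α values and a concave f(α) curve, whose width is one of the descriptors used in such analyses.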
In this paper, the focus is on multifractal analysis of HFR frame size traces of HEVC compressed video sequences. LRD and self-similarity effects are considered for compressed video characterization, with special attention to their application in video change/modification detection. Here, HFR video traces are collected similarly to the procedure explained in the previous section, where frame size sequences are extracted using an accompanying parser [
57]. HEVC compressed video represents the input for the parser, which is applied to obtain the XML trace file needed for statistical analysis, as shown in
Figure 3.
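Once the XML trace file is produced, the frame size sequence can be read out with a few lines of standard-library Python. The element and attribute names below are hypothetical, since the exact schema depends on the parser used:

```python
import xml.etree.ElementTree as ET

def frame_sizes_from_xml(xml_text):
    """Extract a frame size trace (bytes per frame, in order) from an XML
    trace file. The <frame size="..."> schema here is a hypothetical example."""
    root = ET.fromstring(xml_text)
    return [int(frame.get("size")) for frame in root.iter("frame")]

# Hypothetical minimal trace file
xml_text = """
<trace codec="hevc" fps="120">
  <frame number="0" type="I" size="84211"/>
  <frame number="1" type="B" size="1290"/>
  <frame number="2" type="B" size="1475"/>
</trace>"""
print(frame_sizes_from_xml(xml_text))  # → [84211, 1290, 1475]
```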
Video trace sequences are generated using ffmpeg v5.1.2 for different content. If audio exists within a file, it is removed. The constant rate factor, denoted as crf (here, two-pass crf), is selected as the model for controlling the output. The crf option is available for popular codecs and maintains the output quality level through its rate control method, which is applied in practical implementations. Lower crf values in compressed data correspond to higher video quality. Six crf values are used, ranging here from 20 to 40. The supported preset option, which trades off speed and codec complexity, is set to the default, meaning medium, for the video trace collection, and no additional tuning is applied. The trace/data collection and the experimental analysis are carried out on an Intel(R) Core(TM) i7-10750H Central Processing Unit (CPU) at 2.60 GHz with 16 GB of Random-Access Memory (RAM) on a Windows 10 Pro 64-bit operating system, without specific graphical acceleration.
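The encoding setup described above can be reproduced with command lines following the pattern below. This is a sketch: the sequence file name, the YUV geometry flags and the six evenly spaced crf values are assumptions (the text states only the range 20 to 40), while `-c:v libx265`, `-crf`, `-preset` and `-an` are standard ffmpeg options:

```python
def hevc_encode_cmd(src_yuv, crf, width=3840, height=2160, fps=120):
    """Build an ffmpeg command encoding a raw YUV source with libx265 at a
    given constant rate factor; audio is dropped with -an."""
    return (f"ffmpeg -f rawvideo -pix_fmt yuv420p -s {width}x{height} -r {fps} "
            f"-i {src_yuv} -c:v libx265 -crf {crf} -preset medium -an "
            f"{src_yuv.rsplit('.', 1)[0]}_crf{crf}.mp4")

# Hypothetical six crf levels covering the stated 20-40 range
commands = [hevc_encode_cmd("Jockey_3840x2160_120fps.yuv", crf)
            for crf in (20, 24, 28, 32, 36, 40)]
print(commands[0])
```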
Since it is of interest to investigate the behavior of video traces, the testing circumstances are selected to be as simple as possible. In order to perform the analysis, the experimental procedure employs LRD and multifractal methods for the estimation of HFR. The most common HFR video change is temporal resolution recovery, named here TRR. The scenario of typical TRR found in practice includes temporal filtering, followed by temporal resolution matching. Significant savings in memory and channel capacity can be made, possibly temporarily, by decreasing the frame frequency in the temporal domain; this is called temporal filtering [
5,
6]. Frame frequency alignment leads to HFR TRR. It is generated by the increase in frame number after a loss of original data, where specific temporal upsampling is ignored, as in [
67], to avoid the choice between different methods and the addition of undesirable artifacts. Here, it is assumed that the self-similarity properties of video traces may be observed in the TRR scenario. TRR is valuable, since practical implementations often need savings and further comparisons in the HFR domain.
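The TRR scenario above can be sketched as temporal filtering (frame decimation) followed by frame frequency alignment via frame repetition. This is a simplified illustration of the two steps, not the exact recovery chain of any particular tool:

```python
import numpy as np

def temporal_filter(frames, factor=2):
    """Temporal filtering: keep every factor-th frame (e.g., 120 fps -> 60 fps)."""
    return frames[::factor]

def frequency_align(frames, factor=2):
    """Frame frequency alignment: repeat each frame to restore the original
    frame count (nearest-neighbor style recovery, no interpolation)."""
    return np.repeat(frames, factor, axis=0)

original = np.arange(8)                 # stand-in for 8 consecutive frames
recovered = frequency_align(temporal_filter(original))
print(recovered.tolist())               # → [0, 0, 2, 2, 4, 4, 6, 6]
```

The recovered sequence matches the original frame rate but has lost half of the original data, which is exactly the kind of loss expected to leave a signature in the frame size traces.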
This HFR TRR after temporal filtering can be considered common in practice where a matching frame rate is needed, as in the original HFR video case. In this paper, the focus is on differentiating these TRR sequences from original ones. Additionally, it is possible to decrease the frame rate to match the original one, expecting traces similar to the original ones, but it is evident that in the HFR 120 fps case, this is still not common in practice. The losses in HFR video recovery may produce specific tracing behavior that may contribute to possible detection of such changes. Besides temporal resolution changes, the selected compression quality is expressed here through crf. For the experimental analysis, reference and publicly available HFR video sequences are selected. Additional tests are made using an action camera.
The basis of the experiments is the video trace collection made according to the UVG dataset [
23]. The benchmark was widened in 2020 with additional sequences, where of particular interest here are the 120 fps source files, available in YUV format at 4k/UHD or 2160p spatial resolution. The source files representing HFR YUV 8-bit video sequences used for the analysis are listed in
Table 1.
Each source video file contributes original and TRR video trace sequences corresponding to different crf values. The applied methods are related to LRD, or more precisely, Hurst index evaluation, as well as multifractal spectrum calculation for further comparison and testing.
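As an illustration of Hurst index evaluation, a minimal rescaled-range (R/S) estimator over a frame size trace can be sketched as follows. The window sizes are illustrative; production estimators use more careful fitting and bias corrections:

```python
import numpy as np

def hurst_rs(series, window_sizes):
    """Estimate the Hurst exponent via rescaled-range (R/S) analysis:
    fit log(R/S) against log(window size); the slope is the estimate."""
    x = np.asarray(series, dtype=float)
    log_n, log_rs = [], []
    for n in window_sizes:
        rs_vals = []
        for start in range(0, len(x) - n + 1, n):
            w = x[start:start + n]
            dev = np.cumsum(w - w.mean())          # cumulative deviation
            r = dev.max() - dev.min()              # range of the deviation
            s = w.std()                            # standard deviation
            if s > 0:
                rs_vals.append(r / s)
        log_n.append(np.log(n))
        log_rs.append(np.log(np.mean(rs_vals)))
    return np.polyfit(log_n, log_rs, 1)[0]

rng = np.random.default_rng(0)
white = rng.standard_normal(4096)                  # uncorrelated series: H near 0.5
print(round(hurst_rs(white, [16, 32, 64, 128, 256]), 2))
```

An LRD trace would yield an estimate clearly above 0.5, which is the behavior reported for frame size traces in the related literature.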
7. Conclusions
This paper presents experimental results obtained for HEVC compressed HFR video frame size traces for the first time in the multifractal domain. The analysis shows that HFR trace sequences manifest long-range dependence and multifractal behavior. In the comparison between temporally recovered UHD 120 fps HFR data and the corresponding nonrecovered or original HFR data, lower Hurst indices are obtained, as well as often wider multifractal spectra. By analyzing the spectra of sequences compressed with different crf values, it is found that it is possible to differentiate TRR signals from original ones. The proposed WkNN approach, using the Mahalanobis distance measure, was able to detect recovered video data. Also, the feature vector is of low length, and the features are extracted without a reference. The input can be a TRR sample, which can be detected without prior assumptions about constant rate factors and without direct comparison between the modified and original sequence. The proposed detection approach achieved above 98% accuracy in cross-validation.
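A distance-weighted kNN decision of the kind summarized above can be sketched as follows (pure NumPy; the two-dimensional toy features, class labels and k = 3 are illustrative stand-ins, not the actual multifractal features or settings of the proposed model):

```python
import numpy as np

def wknn_predict(X_train, y_train, x, k=3):
    """Weighted kNN: vote with inverse Mahalanobis distances of the
    k nearest training samples."""
    vi = np.linalg.inv(np.cov(X_train, rowvar=False))      # inverse covariance
    diff = X_train - x
    d = np.sqrt(np.einsum("ij,jk,ik->i", diff, vi, diff))  # Mahalanobis distances
    nearest = np.argsort(d)[:k]
    weights = 1.0 / (d[nearest] + 1e-12)                   # inverse-distance weights
    classes = np.unique(y_train)
    scores = [weights[y_train[nearest] == c].sum() for c in classes]
    return classes[np.argmax(scores)]

# Toy features: class 0 (e.g., original) vs. class 1 (e.g., TRR)
X = np.array([[0.9, 0.4], [0.8, 0.5], [0.85, 0.45],
              [0.6, 0.9], [0.55, 0.95], [0.65, 0.85]])
y = np.array([0, 0, 0, 1, 1, 1])
print(wknn_predict(X, y, np.array([0.62, 0.9])))  # → 1
```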
Overall, the differences between TRR and original HFR video are not easy to notice, although Hurst indices such as those calculated using R/S statistics reveal these differences. Multifractal spectra and their characteristics are also indicative in differentiating between the two groups consisting of various content and motion. Through examination of the reference UVG dataset included in the video trace analysis, this research shows that multifractal descriptors and the trained model may be adequate for detection. The model enabled high-accuracy results regardless of compression rate or content. However, further development of the proposed model should be oriented towards other distortion possibilities in the HFR domain.
Integrity and authentication issues may arise, and the approach may be useful in TRR or frame rate upconversion detection. Namely, there are many challenges associated with HFR, and HFR needs to be properly addressed in order to truly realize its potential. The results obtained in this work can be considered valuable for future research. It can be concluded that HFR represents a significant advancement in the field of video technologies, and tracing analysis is important for dealing with the specific behavior that HFR brings.