HeadSpin Documentation

AI-based Quality of Experience Monitoring


Performance monitoring is about delivering the best possible user experience. The HeadSpin AI provides insight into the experience issues that end users actually see, and it does so without requiring a reference for comparison.

User-Perceived Performance

Visual Load Time

The HeadSpin AI automatically detects and surfaces high priority user-perceived performance issues.

Page Content Time Series

The HeadSpin Page Content metric illustrates how the information displayed on the screen changes throughout a session. The score ranges from 0, a blank screen, to 1, where grayscale values are uniformly distributed across the intensity spectrum.
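The endpoints described above (0 for a blank screen, 1 for a uniform spread of grayscale values) are consistent with a normalized histogram entropy. The sketch below illustrates that idea only; it is an assumption, not the HeadSpin implementation, and the function name and the 256-level default are hypothetical.

```python
import math
from collections import Counter


def page_content_score(gray_pixels, levels=256):
    """Normalized Shannon entropy of grayscale intensities, in [0, 1].

    Illustrative assumption only: 0 when every pixel has the same
    intensity, 1 when all `levels` intensities occur equally often.
    """
    pixels = list(gray_pixels)
    if not pixels:
        return 0.0
    total = len(pixels)
    counts = Counter(pixels)
    # Entropy in bits: sum of p * log2(1 / p) over observed intensities.
    entropy = sum((c / total) * math.log2(total / c) for c in counts.values())
    # Divide by the maximum possible entropy for this number of levels.
    return entropy / math.log2(levels)


blank_screen = [0] * 1000           # single intensity everywhere
uniform_spread = list(range(256))   # every intensity exactly once
print(page_content_score(blank_screen), page_content_score(uniform_spread))
```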

Low Page Content Issue Card

The Low Page Content issue card highlights regions of a test where there is little or no content displayed on the screen for more than a second. This often occurs during slow page loads.

Loading Animation Issue Card

The Loading Animation issue card flags regions of a test where the HeadSpin AI detects animations indicative of slow page load times or video buffering events longer than 1 second. These animations are highly diverse and can appear in a wide variety of sizes, styles, and durations.

Page Load Time Metric

A common key performance indicator is the visual app launch time or time to load a page during a page transition. HeadSpin provides an easy solution to measure this metric using our visual page load time API. See the Session API docs for details.

User-Perceived Video Quality


Blockiness measures the intensity of macroblocking due to compression artifacts. Block artifacts are a result of block transform coding. The transform, typically a discrete cosine transform, is applied to a block of pixels to achieve lossy compression by quantizing the transform coefficients of each block. This transform is applied to each block independently, leading to different quantization of adjacent blocks and discontinuities at block boundaries. Blockiness values range from 0 to 1, where 0 represents no macroblocking and 1 represents perfectly regular global macroblocking.
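To make the idea of discontinuities at block boundaries concrete, the sketch below compares the average intensity jump across vertical block edges with the average jump inside blocks. It is a simplified illustration under stated assumptions (a fixed 8-pixel block grid aligned to the left edge, horizontal differences only, a hypothetical scoring formula), not the HeadSpin algorithm.

```python
def blockiness_score(gray, block=8):
    """Boundary-versus-interior gradient comparison, in [0, 1].

    gray: 2D list of rows of grayscale intensities.
    Assumption: blocks are `block` pixels wide and aligned to column 0.
    """
    boundary_sum = boundary_n = 0
    interior_sum = interior_n = 0
    for row in gray:
        for x in range(1, len(row)):
            diff = abs(row[x] - row[x - 1])
            if x % block == 0:      # this step crosses a block edge
                boundary_sum += diff
                boundary_n += 1
            else:                   # this step stays inside a block
                interior_sum += diff
                interior_n += 1
    if boundary_n == 0 or interior_n == 0:
        return 0.0
    b = boundary_sum / boundary_n
    i = interior_sum / interior_n
    if b + i == 0:
        return 0.0                  # flat image: no edges at all
    # 0 when edges are no stronger at boundaries than elsewhere,
    # 1 when all intensity change happens exactly at block boundaries.
    return max(0.0, (b - i) / (b + i))


blocky_row = [0] * 8 + [100] * 8 + [0] * 8 + [100] * 8
smooth_row = list(range(32))
print(blockiness_score([blocky_row] * 4), blockiness_score([smooth_row] * 4))
```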


Blurriness measures the intensity of blur effects. Blurring may be caused by optical effects such as depth-of-field focus or introduced through video encoding. Videos with low resolution or bit rate may appear blurry due to loss of fine detail. Blurriness values range between 0 and 1 where 0 represents no apparent blurring and 1 indicates that applying further blurring to the video does not result in further information loss.


Contrast is a measurement of the difference in brightness between objects in the same field of view. Low contrast, low brightness video content is hard for users to see and may be indicative of underexposure, mismatches between film editing and playback conditions, or loss of essential detail during compression. Perception of contrast depends on both local and global differences in brightness. Our contrast metric estimates the perceptual contrast by evaluating contrast at multiple scales. Contrast values range from 0 to 1 where 0 contrast represents a flat color with no distinguishable content and 1 indicates stark contrast changes at both a local and global level.


Brightness measures the perceived light intensity of an image. Excessively bright or dark content may be indicative of upstream issues in video recording or editing. Our brightness metric is derived from the L* channel of the CIELAB color space. Brightness values range from 0 to 1 where 0 indicates the darkest black and 1 indicates the brightest white.
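As an illustration of deriving a 0 to 1 brightness value from CIELAB L*, the sketch below converts 8-bit sRGB pixels to L* using the standard sRGB transfer function and the CIE lightness formula, then rescales from the 0 to 100 range of L*. Averaging per-pixel L* over the frame and the function names are assumptions; the source does not say how HeadSpin aggregates the channel.

```python
def srgb_to_linear(c8):
    """Undo the sRGB transfer function for one 8-bit channel value."""
    c = c8 / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def cielab_lightness(r, g, b):
    """CIE L* in [0, 100] for an 8-bit sRGB color (D65 white point)."""
    # Relative luminance Y from linear RGB (Rec. 709 / sRGB primaries).
    y = (0.2126 * srgb_to_linear(r)
         + 0.7152 * srgb_to_linear(g)
         + 0.0722 * srgb_to_linear(b))
    delta = 6 / 29
    f = y ** (1 / 3) if y > delta ** 3 else y / (3 * delta ** 2) + 4 / 29
    return 116 * f - 16


def brightness_score(pixels):
    """Mean L* over (r, g, b) pixels, rescaled to [0, 1] (assumed aggregation)."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    mean_l = sum(cielab_lightness(*p) for p in pixels) / len(pixels)
    return min(1.0, max(0.0, mean_l / 100.0))


print(brightness_score([(0, 0, 0)]), brightness_score([(255, 255, 255)]))
```

Because L* is perceptually scaled, a mid-gray sRGB value of about 119 lands near 0.5 rather than the 0.47 that a linear rescaling of the 8-bit value would give.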


Colorfulness is a measure of perceptual color saturation. The colorfulness of an object depends on its spectral reflectance and the strength of illumination. Colorfulness values range from 0 to 1 where 0 represents grayscale content (no color) and 1 represents content fully composed of the most saturated pure colors.

Downsampling Index

The Downsampling Index is a measure of the pixel-level information content of an image. Adaptive video streaming often leverages downsampling to reduce buffering at the cost of lower video quality. The Downsampling Index metric reflects the amount of information lost when an image is downsampled and then upsampled back to the original size. The values range from 0 to 1. A value of 1 indicates that no pixel-level information was lost after the transformation. This is characteristic of content that has a low effective resolution, or highly compressible content such as blank screens (see Page Content and the Low Page Content issue). A value of 0 is indicative of content that uses individual pixels to convey spatial information, which suggests that the content is displayed at the highest possible resolution given the device pixel size.
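The downsample-then-upsample comparison described above can be sketched as follows. This is a minimal illustration, assuming box-average downsampling, nearest-neighbor upsampling, and a mean-absolute-error normalization chosen so that a per-pixel checkerboard scores 0; none of these choices is confirmed as the HeadSpin method.

```python
def downsampling_index(gray, factor=2, max_level=255):
    """Share of pixel-level information that survives a round trip, in [0, 1].

    gray: 2D list of grayscale rows whose dimensions are divisible by `factor`.
    """
    height, width = len(gray), len(gray[0])
    total_error = 0.0
    for by in range(0, height, factor):
        for bx in range(0, width, factor):
            block = [gray[y][x]
                     for y in range(by, by + factor)
                     for x in range(bx, bx + factor)]
            # Downsample: the block collapses to its mean.
            # Upsample: every pixel in the block is restored as that mean.
            mean = sum(block) / len(block)
            total_error += sum(abs(v - mean) for v in block)
    mean_abs_error = total_error / (height * width)
    # The largest possible mean absolute error is half the intensity range.
    return 1.0 - mean_abs_error / (max_level / 2)


blank = [[40] * 4 for _ in range(4)]
checkerboard = [[255 * ((x + y) % 2) for x in range(4)] for y in range(4)]
print(downsampling_index(blank), downsampling_index(checkerboard))
```

Content that was already upscaled by the same factor (each source pixel repeated in a 2x2 block) also scores 1 here, which matches the description of low effective resolution.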

Reference-free Subjective Video Quality Mean Opinion Score (MOS)

The HeadSpin MOS is a patent-pending measure of the holistic subjective quality of video as it would be rated by a jury of users. This metric is generated by an AI algorithm trained on our proprietary data set of videos annotated by real users. To generate these labels, our labelers were shown a video on a mobile device and asked to provide one of the following Likert labels for each video: Very Poor, Poor, Fair, Good, and Excellent. We predict the HeadSpin MOS on a scale of 1 (Very Poor) to 5 (Excellent) for videos using both spatial and temporal features that are extracted from the video using a convolutional neural network. See our documentation on the HeadSpin Video Quality MOS for more details.
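For context on the 1 to 5 scale, a mean opinion score is conventionally the average of the numeric ratings given by a panel. The sketch below shows that labeling-side arithmetic only; the intermediate label values (Poor = 2, Fair = 3, Good = 4) are assumed from the usual Likert ordering, and the code says nothing about how the HeadSpin model predicts the score.

```python
# Assumed numeric mapping; the source fixes only 1 (Very Poor) and 5 (Excellent).
LIKERT_SCORES = {
    "Very Poor": 1,
    "Poor": 2,
    "Fair": 3,
    "Good": 4,
    "Excellent": 5,
}


def mean_opinion_score(labels):
    """Average the numeric Likert ratings that labelers gave one video."""
    scores = [LIKERT_SCORES[label] for label in labels]
    if not scores:
        raise ValueError("at least one rating is required")
    return sum(scores) / len(scores)


ratings = ["Good", "Fair", "Excellent", "Good"]
print(mean_opinion_score(ratings))  # (4 + 3 + 5 + 4) / 4 = 4.0
```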

Reference-based Video Multi-Method Assessment Fusion (VMAF)

The VMAF is a perceptual video quality assessment algorithm developed by Netflix. It is designed to reflect the viewer's perception of streaming video quality for Netflix video streams. This algorithm uses human-provided quality labels on reference and distorted video pairs. These labels are used to train a machine learning algorithm that then predicts the perceptual quality of a distorted video given a reference video. This algorithm is only available for videos that have a source video as a reference for comparison. The HeadSpin platform implements the open-source framework for VMAF as documented on the project page: https://github.com/Netflix/vmaf

Comparison with Reference-Free Standards

ITU-T Rec. P.1203.1 (01/2019) specifies a set of parametric algorithms for monitoring the quality of video streaming over TCP. Several modes of quality estimation are available, depending on the metadata available for the video content. The required metadata includes, for example, the video codec, bitrate, rebuffering events, and frame rate. The model produces quality scores for 1-second windows as well as an integral quality score for the duration of the content.

This approach offers a preliminary model for video quality, based on a pool of databases available to the ITU working group. It applies only if the instrumentation needed to capture the metadata for the desired mode of operation is in place.

At HeadSpin, we have curated the largest subjective quality score data set of its kind from videos captured in real-world playback conditions on our global device cloud. Video content in mobile apps is too diverse for implementation-specific instrumentation to keep up with the demand for monitoring high-quality streaming content. Instead, our algorithms leverage perceptual features to provide reference-free metrics and identify quality issues throughout a video.

Poor Video Quality Issue

The Poor Video Quality issue card is surfaced when regions of the Video Quality MOS time series fall below the "Fair" video threshold. Various factors such as blockiness, blurriness, or other artifacts that may be difficult to define or describe are combined nonlinearly to contribute to the Poor Video Quality issue. Our neural network approach enables us to capture this information and map it to our best estimate of what an end-user would perceive as the quality of the content. Because the HeadSpin Mean Opinion Score is reference-free and automatic, any content, streaming or not, can have an estimated HeadSpin MOS.

Freezing and Stuttering

Frame Rate Time Series

The Frame Rate time series measures the frames per second observed on the device.

Low Frame Rate Issue Card

The Low Frame Rate issue card highlights regions of a test where the device frame rate dropped below 24 fps for more than 1 second. For gaming apps in particular, this issue surfaces regions where the end user experiences slow loading or where stuttering may be observed during gameplay. By default, this issue card is emitted only for regions of a test where the device was in the landscape orientation.
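The rule above (below 24 fps for more than 1 second) amounts to finding sustained runs under a threshold in a time series. The sketch below shows one way to do that; the sample format and function name are hypothetical, and the landscape-orientation condition is not modeled.

```python
def low_frame_rate_regions(samples, min_fps=24.0, min_duration=1.0):
    """Find (start, end) spans where fps stays below `min_fps`.

    samples: list of (timestamp_seconds, fps) pairs sorted by time.
    A span starts at the first low sample and ends at the first sample
    that recovers (or at the last low sample if the series ends first).
    Only spans longer than `min_duration` seconds are returned.
    """
    regions = []
    start = None
    last_low = None
    for timestamp, fps in samples:
        if fps < min_fps:
            if start is None:
                start = timestamp
            last_low = timestamp
        else:
            if start is not None and timestamp - start > min_duration:
                regions.append((start, timestamp))
            start = None
    # Handle a low run that is still open when the data ends.
    if start is not None and last_low - start > min_duration:
        regions.append((start, last_low))
    return regions


samples = [(0.0, 60), (0.5, 20), (1.0, 18), (1.5, 15),
           (2.0, 60), (2.5, 10), (3.0, 60)]
print(low_frame_rate_regions(samples))  # [(0.5, 2.0)]
```

The brief dip at 2.5 seconds is ignored because it lasts only half a second.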

Screen Change Time Series

The Screen Change time series tracks the amount of visual change between video frames. This metric is calculated as the mean of the pairwise differences in intensity between pixels in adjacent frames. The Screen Change metric ranges from 0 (no change) to 1 (the whole screen changed).
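The calculation described above can be sketched as a mean absolute intensity difference between two frames, divided by the maximum intensity so the result falls between 0 and 1. The use of absolute differences, 8-bit grayscale input, and the function name are assumptions made for illustration.

```python
def screen_change(prev_frame, next_frame, max_level=255):
    """Mean absolute per-pixel intensity difference, scaled to [0, 1].

    Both frames are 2D lists of grayscale intensities with the same shape.
    """
    total = 0
    count = 0
    for prev_row, next_row in zip(prev_frame, next_frame):
        for a, b in zip(prev_row, next_row):
            total += abs(a - b)
            count += 1
    return total / (count * max_level) if count else 0.0


black = [[0, 0], [0, 0]]
white = [[255, 255], [255, 255]]
top_half_white = [[255, 255], [0, 0]]
print(screen_change(black, black),           # 0.0, nothing changed
      screen_change(black, top_half_white),  # 0.5, half the screen flipped
      screen_change(black, white))           # 1.0, every pixel flipped
```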

Large changes, such as scene changes in a full-screen video or page changes in a mobile app, correspond to large-magnitude events on the Screen Change time series. Smaller events often include within-scene rendering events for full-screen videos or games, as well as small animations, carousel slideshow elements, or partial-screen auto-playing video in mobile apps. When the metric is zero, no changes have been rendered on the screen.

Screen Freezing Issue Card

The Screen Freezing issue card highlights regions of a test where the video visually appears to have frozen. The regions impacted by frozen video are highlighted above the Screen Change time series on the Waterfall UI. By default, this analysis runs only when the screen is in the landscape orientation. It can also be run manually over a specified session or regions of a session using the screen freezing analysis API. See the Session API docs for details.