TeCoNeRV: Leveraging Temporal Coherence for Compressible Neural Representations for Videos

University of Maryland, College Park

TeCoNeRV enables temporally coherent weight updates for implicit video compression. As video content evolves over time (left), TeCoNeRV produces smooth clip-to-clip weight residuals (right), in contrast to the baseline (NeRV-Enc [1]), whose residual magnitudes fluctuate significantly. This temporal coherence enables more efficient compression while preserving visual quality. TeCoNeRV is the first hypernetwork-based implicit video compression method that scales to 720p and 1080p while maintaining fast encoding speeds.

Key Advantages

Memory Reduction

Patch-tubelet decomposition decouples memory from resolution, enabling training on standard GPUs.

Faster Encoding

1.5–3× faster encoding than the baseline NeRV-Enc, with no per-video optimization required.

Lower Bitrate

Temporal coherence regularization produces highly compressible weight residuals, with nearly 40% lower bitrates.

Resolution Independence

Train at 480p; run inference at 720p or 1080p. No high-resolution training data required.

Abstract

Implicit Neural Representations (INRs) have recently demonstrated impressive performance for video compression. However, since a separate INR must be overfit for each video, scaling to high-resolution videos while maintaining encoding efficiency remains a significant challenge. Hypernetwork-based approaches predict INR weights (hyponetworks) for unseen videos at high speeds, but with low quality, large compressed size, and prohibitive memory needs at higher resolutions. We address these fundamental limitations through three key contributions: (1) an approach that decomposes the weight prediction task spatially and temporally, by breaking short video segments into patch tubelets, to reduce the pretraining memory overhead by 20×; (2) a residual-based storage scheme that captures only differences between consecutive segment representations, significantly reducing bitstream size; and (3) a temporal coherence regularization framework that encourages changes in the weight space to be correlated with video content. Our proposed method, TeCoNeRV, achieves substantial improvements of 2.47 dB and 5.35 dB PSNR over the baseline at 480p and 720p on UVG, with 36% lower bitrates and 1.5–3× faster encoding speeds. With our low memory usage, we are the first hypernetwork-based approach to demonstrate results at 480p, 720p and 1080p on UVG, HEVC, and MCL-JCV.
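To illustrate the patch-tubelet idea from contribution (1), here is a minimal NumPy sketch. The tubelet sizes, function name, and segment dimensions are hypothetical choices for illustration, not the paper's actual configuration; the point is that the hypernetwork only ever sees fixed-size tubelets, so memory does not grow with frame resolution.

```python
import numpy as np

def to_patch_tubelets(segment, t=4, p=16):
    """Split a video segment (T, H, W, C) into non-overlapping
    spatio-temporal tubelets of shape (t, p, p, C).
    t and p are hypothetical sizes chosen for this sketch."""
    T, H, W, C = segment.shape
    assert T % t == 0 and H % p == 0 and W % p == 0
    # Carve the segment into a grid of (T//t, H//p, W//p) tubelets.
    x = segment.reshape(T // t, t, H // p, p, W // p, p, C)
    # Reorder so each tubelet's (t, p, p, C) pixels are contiguous.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    return x.reshape(-1, t, p, p, C)

# Example: an 8-frame segment at 480x832 (a 16-divisible 480p-like size)
segment = np.zeros((8, 480, 832, 3), dtype=np.float32)
tubelets = to_patch_tubelets(segment)
print(tubelets.shape)  # (3120, 4, 16, 16, 3)
```

Because the per-tubelet input shape is fixed, the same pretrained hypernetwork can in principle be applied to higher-resolution frames by processing more tubelets, which is consistent with the resolution-independence property described above.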

Method

TeCoNeRV method overview

Above: Hypernetworks from prior work predict weights for entire video frames at once, resulting in large base parameters and bitstream size. Below: Our approach, TeCoNeRV, with (a) patch-tubelets that decouple spatial resolution from memory requirements, (b) residual encoding that stores only weight differences across time steps, and (c) temporal coherence finetuning that regularizes weight differences. Together, these components achieve better compression efficiency and superior reconstruction quality.
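Components (b) and (c) above can be sketched in a few lines of NumPy. This is a hedged illustration, not the authors' implementation: the exact form of the temporal coherence regularizer is an assumption (here, penalizing weight-residual norms that are out of proportion with content change), and all function names are invented for this sketch.

```python
import numpy as np

def encode_residuals(weights):
    """(b) Residual encoding: store the first segment's weight vector as a
    base, then only the differences between consecutive segments."""
    base = weights[0]
    residuals = [w_next - w_prev for w_prev, w_next in zip(weights, weights[1:])]
    return base, residuals

def decode_residuals(base, residuals):
    """Reconstruct every segment's weights by accumulating residuals."""
    out, w = [base], base
    for r in residuals:
        w = w + r
        out.append(w)
    return out

def temporal_coherence_loss(weights, contents, eps=1e-8):
    """(c) A plausible stand-in regularizer: encourage the weight change
    ||w_i - w_{i-1}|| to track the content change ||x_i - x_{i-1}||,
    so smooth video yields small, highly compressible residuals."""
    loss = 0.0
    for i in range(1, len(weights)):
        dw = np.linalg.norm(weights[i] - weights[i - 1])
        dx = np.linalg.norm(contents[i] - contents[i - 1])
        loss += (dw / (dx + eps) - 1.0) ** 2
    return loss / (len(weights) - 1)
```

Under this reading, the regularizer is what makes the residuals in (b) small and smooth across segments, which in turn is what a downstream entropy coder can exploit for the lower bitrates reported in the results.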

Results

Main quantitative results

Main quantitative results (PSNR/SSIM, bitrate, and encoding/decoding speed). Our primary hypernetwork baseline is NeRV-Enc (from FastNeRV [1]). Since NeRV-Enc's memory usage grows quickly with resolution and it runs out of memory (OOM) at 1080p, we report comparisons against NeRV [2] and HiNeRV [3] at 1080p.

Rate–distortion curves at 480p

Rate–distortion at 480p. PSNR vs. bitrate (bpp) on UVG (left) and Kinetics-400 (right), comparing TeCoNeRV to NeRV-Enc.

480p qualitative comparison: NeRV-Enc vs TeCoNeRV

Visual comparison of reconstruction quality at 480p (TeCoNeRV vs. NeRV-Enc, on UVG).

Video reconstructions of TeCoNeRV at higher resolutions

NeRV-Enc is omitted at higher resolutions due to degraded quality at 720p and out-of-memory at 1080p.

720p • Johnny (HEVC)

1080p • Beauty (UVG)

References

BibTeX