PAVEN: A Perceptual Algorithm for Versatile video Encoding using Neural networks

by Martinraj Nadar | Sunday, Jul 13, 2025

Abstract:

This work introduces the Perceptual Algorithm for Versatile video Encoding using Neural Networks (PAVEN), a subjective video coding algorithm designed to reduce the bit rate of videos encoded with the Versatile Video Coding (VVC) standard without compromising subjective video quality. The algorithm uses a deep learning model that the authors trained to account for the characteristics of video signals. The trained model outperforms others in the literature by more accurately identifying the areas of each frame where viewers are most likely to focus their attention. The output of the model is then post-processed to merge all disjoint areas and align the result with the Coding Tree Unit (CTU) size in VVC, allowing greater compression in less important areas. The results show an average bit rate reduction of 7% at the same subjective video quality, validated through viewer interviews using the Mean Opinion Score (MOS) metric.
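
To make the post-processing step in the abstract more concrete, here is a minimal Python sketch of how a saliency map might be merged into a single region and snapped to the CTU grid. It is not the authors' implementation. The abstract does not say how disjoint areas are merged, so this sketch assumes a simple bounding box around all salient pixels. The function names, the 0.5 threshold, the 128-sample CTU size, and the background QP offset of 4 are all illustrative assumptions.

```python
# Illustrative sketch only; not the method from the paper.
CTU_SIZE = 128  # assumed CTU size in luma samples (VVC allows up to 128x128)


def ctu_roi_mask(saliency, threshold=0.5, ctu=CTU_SIZE):
    """Return a per-CTU grid of booleans: True where a CTU overlaps the merged ROI.

    saliency: list of rows, one float in [0, 1] per pixel (hypothetical model output).
    Disjoint salient areas are merged by taking one bounding box (an assumption).
    """
    height, width = len(saliency), len(saliency[0])
    # Rows and columns that contain at least one salient pixel.
    rows = [y for y, row in enumerate(saliency) if any(v >= threshold for v in row)]
    cols = [x for x in range(width) if any(saliency[y][x] >= threshold for y in rows)]

    n_rows = -(-height // ctu)  # ceiling division: partial CTUs at the edges count
    n_cols = -(-width // ctu)
    mask = [[False] * n_cols for _ in range(n_rows)]
    if not rows:
        return mask  # nothing salient: every CTU is background

    # Snap the pixel bounding box outward to whole CTUs.
    top, bottom = rows[0] // ctu, rows[-1] // ctu
    left, right = cols[0] // ctu, cols[-1] // ctu
    for r in range(top, bottom + 1):
        for c in range(left, right + 1):
            mask[r][c] = True
    return mask


def qp_offsets(mask, background_offset=4):
    """Map the CTU mask to QP offsets: 0 inside the ROI, coarser quantization outside."""
    return [[0 if salient else background_offset for salient in row] for row in mask]
```

For a 512x384 frame with two small salient spots, one near the top-left corner and one around pixel (200, 140), this sketch marks the four CTUs spanning both spots as the region of interest and assigns the positive offset to the remaining CTUs. How such offsets are passed to a VVC encoder depends on the encoder and is outside the scope of this sketch.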

Authors:

Fernández-Lagos, P., Ríos, B., Kalva, H., Cebrián-Márquez, G., Vigueras, G., & Diaz-Honrubia, A. J.

Conference / Journal:

Engineering Applications of Artificial Intelligence, Volume 159, Part B, 8 November 2025