Now You See Me, Now You Don't: A Unified Framework for Expression Consistent Anonymization in Talking Head Videos

14 January 2026

Anil Egin

Andrea Tangherloni

Antitza Dantcheva

PICV


Main:8 Pages

7 Figures

Bibliography:2 Pages

5 Tables

Abstract

Face video anonymization aims to preserve privacy while still allowing videos to be analyzed in downstream computer vision tasks such as expression recognition, people tracking, and action recognition. We propose a novel unified framework, referred to as Anon-NET, that de-identifies facial videos while preserving the age, gender, race, pose, and expression of the original video. Specifically, we inpaint faces with a diffusion-based generative model guided by high-level attribute recognition and motion-aware expression transfer. We then animate the de-identified faces through video-driven animation, which takes the de-identified face and the original video as input. Extensive experiments on the VoxCeleb2, CelebV-HQ, and HDTF datasets, which include diverse facial dynamics, demonstrate the effectiveness of Anon-NET in obfuscating identity while retaining visual realism and temporal consistency. The code of Anon-NET will be publicly released.
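
The two-stage data flow described in the abstract (de-identify a face, then animate it with the original video) can be sketched roughly as follows. This is only an illustrative skeleton, not the authors' implementation: the names `Attributes`, `deidentify_face`, `animate`, and `anonymize_video` are hypothetical, both stages are trivial placeholders rather than diffusion or animation models, and the choice of the first frame as the face to de-identify is an assumption that the abstract does not state.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Toy grayscale frame: rows of pixel intensities in [0, 1].
Frame = List[List[float]]


@dataclass
class Attributes:
    """High-level attributes the abstract lists as preserved (placeholder values)."""
    age: str
    gender: str
    race: str


def deidentify_face(frame: Frame, attrs: Attributes) -> Frame:
    """Stage 1 placeholder for attribute-guided diffusion inpainting.

    Hypothetical stand-in: it only inverts intensities so that the output
    differs from the input. `attrs` is accepted to show where attribute
    guidance would enter, but it is not used here.
    """
    return [[1.0 - pixel for pixel in row] for row in frame]


def animate(anon_face: Frame, driving_video: Sequence[Frame]) -> List[Frame]:
    """Stage 2 placeholder for video-driven animation.

    Hypothetical stand-in: it emits one copy of the de-identified face per
    driving frame. A real model would transfer pose and expression from
    each driving frame instead of copying.
    """
    return [[row[:] for row in anon_face] for _ in driving_video]


def anonymize_video(video: Sequence[Frame], attrs: Attributes) -> List[Frame]:
    """Chain the two stages: de-identify one face, then animate it."""
    if not video:
        return []
    # Assumption: the first frame supplies the face to de-identify.
    anon_face = deidentify_face(video[0], attrs)
    # The original video acts as the driving signal, as in the abstract.
    return animate(anon_face, video)
```

The point of the sketch is the interface: the second stage receives both the de-identified face and the original video, so the output has one frame per input frame while identity-bearing pixels come only from the first stage.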
