Yijiang Li1, Genpei Zhang2, Jiacheng Cheng1,
Yi Li3, Xiaojun Shan1
Dashan Gao3, Jiancheng Lyu3, Yuan Li3,
Ning Bi3, Nuno Vasconcelos1
1University of California San Diego 2University of Electronic Science & Technology of China 3Qualcomm AI Research
While the rapid proliferation of wearable cameras has raised significant concerns about egocentric video privacy, prior work has largely overlooked the unique privacy threats posed to the camera wearer. This work investigates the question: how much private information about the wearer can be inferred from first-person videos? We introduce EgoPrivacy, the first large-scale benchmark covering three privacy types and seven tasks, ranging from fine-grained identity recovery to coarse-grained age prediction. To underscore these threats, we propose the Retrieval-Augmented Attack (RAA), which leverages ego-to-exo retrieval to boost demographic attacks. Extensive experiments show that wearer information is highly susceptible to leakage: foundation models recover identity, scene, gender, and race with 70–80% accuracy even in zero-shot settings.
EgoPrivacy spans 3 privacy categories | 7 tasks | 9K clips | 950+ identities | 130+ scenes.
CLIP predicts gender at 73%, race at 65%, and age at 80% accuracy without any egocentric fine-tuning.
Cross-view retrieval boosts demographic inference by +10–16 pp.
Attacks remain well above chance on Charades-Ego (OOD), revealing persistent risks.
Attention and RNN heads leak more privacy than MLP heads; gains saturate beyond eight frames.
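The zero-shot demographic attack above can be illustrated with a minimal sketch: a frame embedding is compared against text-prompt embeddings, and the closest prompt wins. The function name, the toy 2-D embeddings, and the use of NumPy in place of a real CLIP encoder are all illustrative assumptions, not the benchmark's actual pipeline.

```python
import numpy as np

def zero_shot_predict(frame_emb, prompt_embs, labels):
    """Return the label whose text-prompt embedding is most
    cosine-similar to the (CLIP-style) frame embedding.

    NOTE: embeddings here are placeholders; in practice they would
    come from a pretrained image/text encoder such as CLIP.
    """
    # Normalize so dot products equal cosine similarities.
    f = frame_emb / np.linalg.norm(frame_emb)
    p = prompt_embs / np.linalg.norm(prompt_embs, axis=1, keepdims=True)
    sims = p @ f
    return labels[int(np.argmax(sims))]
```

With real encoders, `prompt_embs` would be the embeddings of prompts like "a photo taken by a young person", averaged over a prompt ensemble per class.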
RAA retrieves visually similar exocentric clips and fuses their predictions with those from the ego clip, yielding stronger demographic attacks.
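The retrieve-then-fuse idea can be sketched as follows: rank exocentric clips by embedding similarity to the ego clip, then average the top-k exo predictions with the ego prediction. The function name, the top-k selection, and the weighted-average fusion rule are assumptions for illustration, not necessarily the paper's exact RAA formulation.

```python
import numpy as np

def retrieval_augmented_attack(ego_emb, ego_logits, exo_embs, exo_logits,
                               k=3, alpha=0.5):
    """Fuse the ego-clip prediction with predictions from the k
    exocentric clips whose embeddings are most similar to the ego clip.

    alpha weights the ego prediction; (1 - alpha) weights the retrieved
    exo predictions (an illustrative fusion rule).
    """
    # Cosine similarity between the ego clip and every exo clip.
    e = ego_emb / np.linalg.norm(ego_emb)
    x = exo_embs / np.linalg.norm(exo_embs, axis=1, keepdims=True)
    sims = x @ e
    # Indices of the k most similar exo clips.
    topk = np.argsort(sims)[-k:]
    # Average the retrieved exo logits, then blend with the ego logits.
    exo_avg = exo_logits[topk].mean(axis=0)
    return alpha * ego_logits + (1 - alpha) * exo_avg
```

The fused logits are then argmaxed as usual; the cross-view retrieval step is what supplies the +10–16 pp gains reported above.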
Table 2. Demographic inference accuracy (higher is better).
Table 3. Identity and retrieval-augmented results.
@inproceedings{Li2025EgoPrivacy,
title = {EgoPrivacy: What Your First-Person Camera Says About You},
author = {Li, Yijiang and Zhang, Genpei and Cheng, Jiacheng and Li, Yi and others},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
url = {https://arxiv.org/abs/2506.12258}
}