Human-in-the-loop Adaptation
in Group Activity Feature Learning
for Team Sports Video Retrieval

1 Toyota Technological Institute, Japan; 2 University of Hyogo, Japan
Computer Vision and Image Understanding
Special issue on "Computer Vision for Sports and Winter Sports"

Abstract

This paper proposes human-in-the-loop adaptation for Group Activity Feature Learning (GAFL) without group activity annotations. The adaptation is employed in a group-activity video retrieval framework to improve its retrieval performance. Unlike prior work that classifies videos into pre-defined group activity classes in a supervised manner, our method first pre-trains the group activity feature (GAF) space in a self-supervised manner based on the similarity of group activities. An interactive fine-tuning process then updates the GAF space so that a user can better retrieve videos similar to the query videos that the user provides. During this fine-tuning, our data-efficient video selection process selects several videos from a video database and presents them to the user, who manually labels each one as positive or negative. These labeled videos are used to fine-tune the GAF space through contrastive learning, so that the positive videos move closer to the query videos and the negative videos move farther away from them. Comprehensive experiments on two team sports datasets validate that our method significantly improves retrieval performance. Ablation studies further demonstrate that several components of our human-in-the-loop adaptation contribute to this improvement. Code: https://github.com/chihina/GAFL-FINE-CVIU.
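
To make the fine-tuning objective concrete, the following is a minimal sketch of a contrastive loss over user-labeled videos. It is an illustration under stated assumptions, not the implementation in the repository above: the InfoNCE-style form, the cosine similarity, the `temperature` value, and the names `cosine` and `contrastive_loss` are all hypothetical choices made here. Feature vectors are plain Python lists standing in for points in the GAF space.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(query, positives, negatives, temperature=0.1):
    """InfoNCE-style loss (an assumed form, not necessarily the paper's loss).

    The value is small when the user-labeled positives are more similar to
    the query than the user-labeled negatives are.
    """
    pos = [math.exp(cosine(query, p) / temperature) for p in positives]
    neg = [math.exp(cosine(query, n) / temperature) for n in negatives]
    denom = sum(pos) + sum(neg)
    # Average negative log-likelihood over the positives.
    return -sum(math.log(p / denom) for p in pos) / len(pos)

# Toy 2-D features: one query, one video near it, one video far from it.
query = [1.0, 0.0]
near = [[0.9, 0.1]]
far = [[0.0, 1.0]]

loss_separated = contrastive_loss(query, positives=near, negatives=far)
loss_confused = contrastive_loss(query, positives=far, negatives=near)
print(loss_separated < loss_confused)
```

In this toy setting the loss is lower when the positive video already lies near the query and the negative lies far from it. Minimizing such a loss therefore pulls positives toward the query and pushes negatives away, which is the behavior the abstract describes.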

BibTeX

@article{NAKATANI2026104577,
        title = {Human-in-the-loop adaptation in group activity feature learning for team sports video retrieval},
        journal = {Computer Vision and Image Understanding},
        volume = {263},
        pages = {104577},
        year = {2026},
        author = {Chihiro Nakatani and Hiroaki Kawashima and Norimichi Ukita},
}