AgenticTagger: Structured Item Representation for Recommendation with LLM Agents

5 February 2026

Zhouhang Xie

Bo Peng

Zhankui He

Ziqi Chen

Alice Han

Isabella Ye

Benjamin Coleman

Noveen Sachdeva

Fernando Pereira

Julian McAuley

Wang-Cheng Kang

Derek Zhiyuan Cheng

Beidou Wang

Randolph Brown

Main: 10 pages

5 figures

Bibliography: 4 pages

11 tables

Appendix: 1 page

Abstract

High-quality representations are a core requirement for effective recommendation. In this work, we study LLM-based descriptor generation, i.e., frameworks that generate keyphrase-like natural-language item representations while placing minimal constraints on downstream applications. We propose AgenticTagger, a framework that queries LLMs to represent items as sequences of text descriptors. Open-ended generation, however, provides little control over the generation space and yields high-cardinality, low-performance descriptors that make downstream modeling challenging. To address this, AgenticTagger features two core stages: (1) a vocabulary-building stage, in which a set of hierarchical, low-cardinality, and high-quality descriptors is identified, and (2) a vocabulary-assignment stage, in which LLMs assign in-vocabulary descriptors to items. To ground the vocabulary in the item corpus of interest effectively and efficiently, we design a multi-agent reflection mechanism in which an architect LLM iteratively refines the vocabulary, guided by parallelized feedback from annotator LLMs that validate the vocabulary against item data. Experiments on public and private data show that AgenticTagger brings consistent improvements across diverse recommendation scenarios, including generative and term-based retrieval, ranking, and controllability-oriented, critique-based recommendation.
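
The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name in it (`build_vocabulary`, `assign_descriptors`, the `toy_*` functions) is hypothetical, the LLM calls are replaced by plain callables so the code runs without any model, descriptors are flat strings rather than a hierarchy, and the stopping rule (stop when no annotator reports a problem) is an assumption made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical signatures; in a real system each callable would wrap an LLM call.
Architect = Callable[[List[str], List[str]], List[str]]  # (vocab, feedback) -> vocab
Annotator = Callable[[List[str], str], str]              # (vocab, item) -> feedback or ""
Assigner = Callable[[List[str], str], List[str]]         # (vocab, item) -> descriptors


def build_vocabulary(items: List[str], architect: Architect,
                     annotator: Annotator, max_rounds: int = 5) -> List[str]:
    """Stage 1 (sketch): the architect refines the vocabulary from feedback."""
    vocab = architect([], [])  # initial proposal, no feedback yet
    for _ in range(max_rounds):
        # Annotators validate the current vocabulary against items in parallel.
        with ThreadPoolExecutor() as pool:
            feedback = list(pool.map(lambda item: annotator(vocab, item), items))
        feedback = [f for f in feedback if f]  # keep only non-empty feedback
        if not feedback:
            break  # assumed stopping rule: no annotator reported a gap
        vocab = architect(vocab, feedback)
    return vocab


def assign_descriptors(items: List[str], vocab: List[str],
                       assigner: Assigner) -> Dict[str, List[str]]:
    """Stage 2 (sketch): keep only descriptors that are in the vocabulary."""
    allowed = set(vocab)
    return {item: [d for d in assigner(vocab, item) if d in allowed]
            for item in items}


# Toy stand-ins for the LLM agents (assumptions, for demonstration only).
def toy_architect(vocab: List[str], feedback: List[str]) -> List[str]:
    # Merge every requested term into the vocabulary.
    return sorted(set(vocab) | set(feedback))


def toy_annotator(vocab: List[str], item: str) -> str:
    # Report the first word of the item that the vocabulary does not cover.
    missing = [word for word in item.split() if word not in vocab]
    return missing[0] if missing else ""


def toy_assigner(vocab: List[str], item: str) -> List[str]:
    # Deliberately emits one out-of-vocabulary term to show the filter.
    return item.split() + ["unlisted"]


if __name__ == "__main__":
    catalog = ["red cotton shirt", "blue cotton jeans"]
    vocabulary = build_vocabulary(catalog, toy_architect, toy_annotator)
    print(vocabulary)
    print(assign_descriptors(catalog, vocabulary, toy_assigner))
```

The in-vocabulary filter in `assign_descriptors` is what keeps cardinality bounded in this sketch: whatever the assigner proposes, only terms already in the vocabulary survive.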
