
Model Generalization on Text Attribute Graphs: Principles with Large Language Models

Main: 8 pages
Appendix: 8 pages
Bibliography: 6 pages
9 figures
14 tables
Abstract

Large language models (LLMs) have recently been introduced to graph learning, aiming to extend their zero-shot generalization success to tasks where labeled graph data is scarce. Among these applications, inference over text-attributed graphs (TAGs) presents unique challenges: existing methods struggle with LLMs' limited context length when processing large node neighborhoods and with the misalignment between node embeddings and the LLM token space. To address these issues, we establish two key principles for ensuring generalization and derive the framework LLM-BP accordingly: (1) unifying the attribute space with task-adaptive embeddings, where we leverage LLM-based encoders and task-aware prompting to enhance the generalization of text attribute embeddings; (2) developing a generalizable graph information aggregation mechanism, for which we adopt belief propagation with LLM-estimated parameters that adapt across graphs. Evaluations on 11 real-world TAG benchmarks demonstrate that LLM-BP significantly outperforms existing approaches, achieving an 8.10% improvement with task-conditional embeddings and an additional 1.71% gain from adaptive aggregation.
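To make the aggregation principle concrete, the sketch below shows plain sum-product belief propagation on a pairwise model over an undirected graph. It is a minimal illustration, not the paper's implementation: the per-node priors stand in for class scores obtained from task-adaptive text embeddings, and the `coupling` edge potential stands in for the graph-level parameter that LLM-BP estimates with an LLM (a diagonal-heavy matrix corresponds to a homophilous graph). All names and inputs here are hypothetical.

```python
import numpy as np

def belief_propagation(adj, priors, coupling, num_iters=10):
    """Sum-product belief propagation on an undirected graph.

    adj      : symmetric neighbor lists; adj[i] lists the nodes adjacent to i
    priors   : (n, c) array of per-node class priors (e.g. scores derived from
               text-attribute embeddings; hypothetical input for this sketch)
    coupling : (c, c) edge potential; larger diagonal mass encodes homophily
    """
    n, c = priors.shape
    # messages[(i, j)] = message from node i to neighbor j, initialized uniform
    messages = {(i, j): np.full(c, 1.0 / c) for i in range(n) for j in adj[i]}

    for _ in range(num_iters):
        new_messages = {}
        for i in range(n):
            for j in adj[i]:
                # combine the prior with incoming messages from all neighbors except j
                belief = priors[i].copy()
                for k in adj[i]:
                    if k != j:
                        belief *= messages[(k, i)]
                # pass the belief through the edge potential and normalize
                msg = coupling.T @ belief
                new_messages[(i, j)] = msg / msg.sum()
        messages = new_messages

    # final beliefs: prior times all incoming messages, normalized per node
    beliefs = priors.copy()
    for i in range(n):
        for k in adj[i]:
            beliefs[i] *= messages[(k, i)]
    beliefs /= beliefs.sum(axis=1, keepdims=True)
    return beliefs

# Toy usage: a 3-node path graph with a homophilous edge potential
adj = [[1], [0, 2], [1]]
priors = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]])
coupling = np.array([[0.8, 0.2], [0.2, 0.8]])
print(belief_propagation(adj, priors, coupling))
```

In this toy run the uncertain middle node is pulled toward whichever label its neighbors support more strongly, which is the effect the edge potential controls; making that potential adaptive per graph is the role the abstract assigns to LLM estimation.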

@article{wang2025_2502.11836,
  title={Model Generalization on Text Attribute Graphs: Principles with Large Language Models},
  author={Haoyu Wang and Shikun Liu and Rongzhe Wei and Pan Li},
  journal={arXiv preprint arXiv:2502.11836},
  year={2025}
}