Understanding 6G through Language Models: A Case Study on LLM-aided Structured Entity Extraction in Telecom Domain

20 May 2025

Main: 5 Pages

4 Figures

Bibliography: 1 Page

Abstract

Knowledge understanding is a foundational component of envisioned 6G networks, supporting network intelligence and AI-native network architectures. In this paradigm, information extraction plays a pivotal role in transforming fragmented telecom knowledge into well-structured formats, helping diverse AI models better understand network terminology. This work proposes a novel language model-based information extraction technique that extracts structured entities from telecom text. The proposed telecom structured entity extraction (TeleSEE) technique applies a token-efficient representation method to predict entity types and attribute keys, aiming to reduce the number of output tokens and improve prediction accuracy. TeleSEE also uses a hierarchical parallel decoding method, which improves on the standard encoder-decoder architecture by integrating additional prompting and decoding strategies into entity extraction tasks. To better evaluate the proposed technique in the telecom domain, we further construct a dataset named 6GTech, comprising 2390 sentences and 23747 words from more than 100 6G-related technical publications. Experiments show that TeleSEE achieves higher accuracy than the baseline techniques and processes samples 5 to 9 times faster.
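The abstract does not spell out the token-efficient representation, so the following is only an illustrative sketch of the general idea, not the TeleSEE implementation. It assumes that each entity type and attribute key maps to a single reserved token; the schema (`TYPE_TOKENS`, `KEY_TOKENS`), the example entity, and the crude `toy_tokenize` function are all hypothetical and stand in for a real schema and subword tokenizer.

```python
import re

# Hypothetical schema: one reserved token per entity type and attribute key.
TYPE_TOKENS = {"technology": "<ent_0>", "metric": "<ent_1>"}
KEY_TOKENS = {"name": "<key_0>", "frequency_band": "<key_1>"}


def toy_tokenize(text):
    """Crude stand-in for a subword tokenizer: reserved tokens stay whole,
    everything else splits into word and punctuation pieces."""
    return re.findall(r"<[a-z]+_\d+>|\w+|[^\w\s]", text)


def verbose_encode(entity):
    """Spell out the entity type and every attribute key as plain text."""
    parts = [f'"type": "{entity["type"]}"']
    parts += [f'"{k}": "{v}"' for k, v in entity["attributes"].items()]
    return "{" + ", ".join(parts) + "}"


def compact_encode(entity):
    """Replace the entity type and each attribute key with its reserved token."""
    out = [TYPE_TOKENS[entity["type"]]]
    for key, value in entity["attributes"].items():
        out.append(f"{KEY_TOKENS[key]} {value}")
    return " ".join(out)


# Made-up example entity, for illustration only.
entity = {
    "type": "technology",
    "attributes": {
        "name": "terahertz communication",
        "frequency_band": "0.1-10 THz",
    },
}

verbose_len = len(toy_tokenize(verbose_encode(entity)))
compact_len = len(toy_tokenize(compact_encode(entity)))
print(f"verbose: {verbose_len} tokens, compact: {compact_len} tokens")
```

Under these assumptions the attribute values cost the same in both encodings, so any saving comes from the type names, key names, and structural punctuation that the reserved tokens replace.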


@article{yuan2025_2505.14906,
  title={Understanding 6G through Language Models: A Case Study on LLM-aided Structured Entity Extraction in Telecom Domain},
  author={Ye Yuan and Haolun Wu and Hao Zhou and Xue Liu and Hao Chen and Yan Xin and Jianzhong Zhang},
  journal={arXiv preprint arXiv:2505.14906},
  year={2025}
}
