arXiv:2409.12380

Bundle Fragments into a Whole: Mining More Complete Clusters via Submodular Selection of Interesting webpages for Web Topic Detection

19 September 2024
Junbiao Pang
Anjing Hu
Qingming Huang
Abstract

Organizing interesting webpages into hot topics is one of the key steps in understanding trends in multimodal web data. A state-of-the-art solution first organizes webpages into a large volume of multi-granularity topic candidates; hot topics are then identified by estimating their interestingness. However, these topic candidates contain a large number of fragments of hot topics, owing to both inefficient feature representations and unsupervised topic generation. This paper proposes a bundling-refining approach to mine more complete hot topics from fragments. Concretely, the bundling step organizes the fragment topics into coarse topics; the refining step then applies a submodular-based method to refine the coarse topics in a scalable way. The proposed unconventional method is simple yet powerful: by leveraging submodular optimization, our approach outperforms traditional ranking methods, which involve careful design and complex steps. Extensive experiments demonstrate that the proposed approach surpasses the state-of-the-art method (i.e., latent Poisson deconvolution, Pang et al. (2016)) by 20% and 10% in accuracy on two public data sets, respectively.
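To make the refining idea concrete, the following is a minimal sketch of greedy submodular selection over the webpages of one coarse topic. The abstract does not state the paper's objective function, so everything here is an assumption for illustration: the facility-location objective, the function names `facility_location` and `greedy_select`, the `budget` parameter, and the toy similarity matrix are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def facility_location(sim: Sequence[Sequence[float]], selected: List[int]) -> float:
    """Assumed objective: f(S) = sum_i max_{j in S} sim[i][j].

    For non-negative similarities this function is monotone submodular.
    """
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in sim)


def greedy_select(sim: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Greedily pick up to `budget` webpages, each time adding the page
    with the largest marginal gain in the objective.

    Ties are broken in favour of the lowest index.
    """
    selected: List[int] = []
    current = 0.0
    candidates = set(range(len(sim)))
    while candidates and len(selected) < budget:
        best_j, best_gain = None, 0.0
        for j in sorted(candidates):
            gain = facility_location(sim, selected + [j]) - current
            if gain > best_gain:
                best_j, best_gain = j, gain
        if best_j is None:  # no candidate improves the objective
            break
        selected.append(best_j)
        candidates.remove(best_j)
        current += best_gain
    return selected


# Toy pairwise similarities between five webpages (hypothetical values):
# pages 0-1 form one group, pages 2-4 another, with page 3 central to it.
sim = [
    [1.0, 0.8, 0.1, 0.0, 0.0],
    [0.8, 1.0, 0.1, 0.0, 0.0],
    [0.1, 0.1, 1.0, 0.7, 0.3],
    [0.0, 0.0, 0.7, 1.0, 0.7],
    [0.0, 0.0, 0.3, 0.7, 1.0],
]
chosen = greedy_select(sim, budget=2)
```

For a monotone submodular objective under a cardinality constraint, this greedy rule carries the classical (1 - 1/e) approximation guarantee, which is one reason such selection can stay simple and scale. In the toy data the first pick is the central page of the larger group, and the second pick comes from the other group rather than a near-duplicate of the first.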
