BERTopic: Neural topic modeling with a class-based TF-IDF procedure

11 March 2022

Papers citing "BERTopic: Neural topic modeling with a class-based TF-IDF procedure"

50 / 423 papers shown

Title
GLARE: Guided LexRank for Advanced Retrieval in Legal Analysis Fabio Gregório Rafaela Castro Kele Belloze Rui Pedro Lopes Eduardo Bezerra AILaw ELM 24 0 0 10 Sep 2024
Mapping News Narratives Using LLMs and Narrative-Structured Text Embeddings Jan Elfes 38 1 0 10 Sep 2024
Advancing Topic Segmentation of Broadcasted Speech with Multilingual Semantic Embeddings Sakshi Deo Shukla Pavel Denisov Tuğtekin Turan 54 0 0 10 Sep 2024
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation Ilya Gusev LLMAG 136 3 0 10 Sep 2024
MessIRve: A Large-Scale Spanish Information Retrieval Dataset Francisco Valentini Viviana Cotik D. Furman Ivan Bercovich Edgar Altszyler Juan Manuel Pérez 88 2 0 09 Sep 2024
FairHome: A Fair Housing and Fair Lending Dataset Anusha Bagalkotkar Aveek Karmakar Gabriel Arnson Ondrej Linda FaML AI4TS AILaw 59 0 0 09 Sep 2024
The Kubernetes Security Landscape: AI-Driven Insights from Developer Discussions J. Alexander Curtis Nasir U. Eisty 29 2 0 06 Sep 2024
Self-supervised Topic Taxonomy Discovery in the Box Embedding Space Yuyin Lu Hegang Chen Pengbo Mao Yanghui Rao H. Xie Fu Lee Wang Qing Li BDL 90 1 0 27 Aug 2024
Multi-Faceted Question Complexity Estimation Targeting Topic Domain-Specificity Sujay R Suki Perumal Yash Nagraj Anushka Ghei Srinivas K S 75 0 0 23 Aug 2024
Characterizing Online Toxicity During the 2022 Mpox Outbreak: A Computational Analysis of Topical and Network Dynamics Lizhou Fan Lingyao Li Libby Hemphill 403 1 0 21 Aug 2024
GLIMMER: Incorporating Graph and Lexical Features in Unsupervised Multi-Document Summarization Ran Liu Ming Liu Min Yu Jianguo Jiang Gang Li Dan Zhang Jingyuan Li Xiang Meng Weiqing Huang 56 0 0 19 Aug 2024
PhysBERT: A Text Embedding Model for Physics Scientific Literature Thorsten Hellert Joao Montenegro Andrea Pollastro PINN AI4CE 73 4 0 18 Aug 2024
GeneticPrism: Multifaceted Visualization of Scientific Impact Evolutions Ye Sun Zipeng Liu Yuankai Luo Lei Xia Lei Shi 41 0 0 14 Aug 2024
The Adaptive Strategies of Anti-Kremlin Digital Dissent in Telegram during the Russian Invasion of Ukraine Apaar Bawa Ugur Kursuncu Dilshod Achilov V. Shalin 39 1 0 13 Aug 2024
Iterative Improvement of an Additively Regularized Topic Model Alex Gorbulev V. Alekseev Konstantin Vorontsov 114 0 0 11 Aug 2024
Topic Modeling with Fine-tuning LLMs and Bag of Sentences Johannes Schneider 87 0 0 06 Aug 2024
OneLove beyond the field -- A few-shot pipeline for topic and sentiment analysis during the FIFA World Cup in Qatar Christoph Rauchegger Sonja Mei Wang Pieter Delobelle 62 0 0 05 Aug 2024
DebateQA: Evaluating Question Answering on Debatable Knowledge Rongwu Xu Xuan Qi Zehan Qi Wei Xu Zhijiang Guo ELM 93 7 0 02 Aug 2024
DisTrack: a new Tool for Semi-automatic Misinformation Tracking in Online Social Networks Francesco Di Salvo Álvaro Huertas-García Sebastian Doerrich Javier Huertas-Tato Christian Ledig 131 1 0 01 Aug 2024
Automatic Generation of Behavioral Test Cases For Natural Language Processing Using Clustering and Prompting Ying Li Rahul Singh Tarun Joshi Agus Sudjianto 52 1 0 31 Jul 2024
Industrial-Grade Smart Troubleshooting through Causal Technical Language Processing: a Proof of Concept Alexandre Trilla Ossee Yiboe Nenad Mijatovic Jordi Vitrià 88 1 0 30 Jul 2024
An Iterative Approach to Topic Modelling Albert Wong Florence Wing Yau Cheng Ashley Keung Yamileth Hercules Mary Alexandra Garcia Yew-Wei Lim Lien Pham 105 0 0 25 Jul 2024
A Large-Scale Sensitivity Analysis on Latent Embeddings and Dimensionality Reductions for Text Spatializations Daniel Atzberger Tim Cech Willy Scheibel Jürgen Döllner M. Behrisch Tobias Schreck 114 4 0 25 Jul 2024
Mapping the Technological Future: A Topic, Sentiment, and Emotion Analysis in Social Media Discourse A. Landowska Maciej Skorski Krzysztof Rajda 114 0 0 20 Jul 2024
AudienceView: AI-Assisted Interpretation of Audience Feedback in Journalism William Brannon Doug Beeferman Hang Jiang Andrew Heyward Deb Roy 33 1 0 17 Jul 2024
Bridging Dictionary: AI-Generated Dictionary of Partisan Language Use Hang Jiang Doug Beeferman William Brannon Andrew Heyward Deb Roy 67 0 0 12 Jul 2024
Open-world Multi-label Text Classification with Extremely Weak Supervision Xintong Li Jinya Jiang Ria Dharmani Jayanth Srinivasa Gaowen Liu Jingbo Shang VLM 78 2 0 08 Jul 2024
How Similar Are Elected Politicians and Their Constituents? Quantitative Evidence From Online Social Networks Waleed Iqbal Gareth Tyson Ignacio Castro 76 0 0 03 Jul 2024
Generative Monoculture in Large Language Models Fan Wu Emily Black Varun Chandrasekaran SyDa 69 5 0 02 Jul 2024
Interactive Topic Models with Optimal Transport Garima Dhanania Sheshera Mysore Chau Minh Pham Mohit Iyyer Hamed Zamani Andrew McCallum OT 75 1 0 28 Jun 2024
SK-VQA: Synthetic Knowledge Generation at Scale for Training Context-Augmented Multimodal LLMs Xin Su Man Luo Kris W Pan Tien Pei Chou Vasudev Lal Phillip Howard 116 4 0 28 Jun 2024
Guardrails for avoiding harmful medical product recommendations and off-label promotion in generative AI models Daniel Lopez-Martinez MedIm 95 1 0 24 Jun 2024
Towards Region-aware Bias Evaluation Metrics Angana Borah Aparna Garimella Rada Mihalcea 56 1 0 23 Jun 2024
SynDARin: Synthesising Datasets for Automated Reasoning in Low-Resource Languages G. Ghazaryan Erik Arakelyan Pasquale Minervini Isabelle Augenstein SyDa 74 0 0 20 Jun 2024
Seeing Through AI's Lens: Enhancing Human Skepticism Towards LLM-Generated Fake News Navid Ayoobi Sadat Shahriar Arjun Mukherjee DeLMO 83 1 0 20 Jun 2024
Mining United Nations General Assembly Debates Mateusz Grzyb Mateusz Krzyzinski Bartłomiej Sobieski Mikołaj Spytek Bartosz Pieliński Daniel Dan Anna Wróblewska 53 0 0 19 Jun 2024
PromptDSI: Prompt-based Rehearsal-free Continual Learning for Document Retrieval Tuan-Luc Huynh Thuy-Trang Vu Weiqing Wang Yinwei Wei T. Le D. Gašević Yuan-Fang Li Thanh-Toan Do VLM CLL 117 1 0 18 Jun 2024
COMMUNITY-CROSS-INSTRUCT: Unsupervised Instruction Generation for Aligning Large Language Models to Online Communities Zihao He Rebecca Dorn Siyi Guo Minh Duc Hoang Chu Kristina Lerman 94 8 0 17 Jun 2024
LLM Reading Tea Leaves: Automatically Evaluating Topic Models with Large Language Models Xiaohao Yang He Zhao Dinh Q. Phung Wray Buntine Lan Du ALM ELM 176 2 0 13 Jun 2024
mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus Matthieu Futeral A. Zebaze Pedro Ortiz Suarez Julien Abadji Rémi Lacroix Cordelia Schmid Rachel Bawden Benoît Sagot 171 3 0 13 Jun 2024
HelpSteer2: Open-source dataset for training top-performing reward models Zhilin Wang Yi Dong Olivier Delalleau Jiaqi Zeng Gerald Shen Daniel Egert Jimmy J. Zhang Makesh Narsimhan Sreedhar Oleksii Kuchaiev AI4TS 127 109 0 12 Jun 2024
Unraveling Code-Mixing Patterns in Migration Discourse: Automated Detection and Analysis of Online Conversations on Reddit Fedor Vitiugin Sunok Lee Henna Paakki Anastasiia Chizhikova Nitin Sawhney 64 1 0 12 Jun 2024
Blowfish: Topological and statistical signatures for quantifying ambiguity in semantic search T. R. Barillot Alex De Castro 70 0 0 12 Jun 2024
Tailoring Generative AI Chatbots for Multiethnic Communities in Disaster Preparedness Communication: Extending the CASA Paradigm Xinyan Zhao Yuan Sun Wenlin Liu Chau-Wai Wong 45 3 0 12 Jun 2024
MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme Intervention Prince Jha Raghav Jain Konika Mandal Aman Chadha Sriparna Saha P. Bhattacharyya 77 8 0 08 Jun 2024
Creating an AI Observer: Generative Semantic Workspaces Pavan Holur Shreyas Rajesh David Chong V. Roychowdhury 45 0 0 07 Jun 2024
MODABS: Multi-Objective Learning for Dynamic Aspect-Based Summarization Xiaobo Guo Soroush Vosoughi 76 0 0 05 Jun 2024
Towards Transparency: Exploring LLM Trainings Datasets through Visual Topic Modeling and Semantic Frame Charles de Dampierre Andrei Mogoutov Nicolas Baumard 98 2 0 03 Jun 2024
Comprehensive Evaluation of Large Language Models for Topic Modeling T. Doi Masaru Isonuma Hitomi Yanaka ELM 70 1 0 02 Jun 2024
Enhancing Antibiotic Stewardship using a Natural Language Approach for Better Feature Representation Simon A. Lee Trevor Brokowski Jeffrey N. Chiang 74 8 0 30 May 2024