2504.15296
Cited By
Scalability Optimization in Cloud-Based AI Inference Services: Strategies for Real-Time Load Balancing and Automated Scaling
16 April 2025
Yihong Jin
Ze Yang
Papers citing "Scalability Optimization in Cloud-Based AI Inference Services: Strategies for Real-Time Load Balancing and Automated Scaling"
4 / 4 papers shown
DeepRAG: Integrating Hierarchical Reasoning and Process Supervision for Biomedical Multi-Hop QA
Yuelyu Ji
Hang Zhang
Shiven Verma
Hui Ji
Chun Li
Yushui Han
YanShan Wang
LRM
26
0
0
31 May 2025
Curriculum Guided Reinforcement Learning for Efficient Multi Hop Retrieval Augmented Generation
Yuelyu Ji
Rui Meng
Zhuochun Li
Daqing He
183
1
0
23 May 2025
Cross-Cloud Data Privacy Protection: Optimizing Collaborative Mechanisms of AI Systems by Integrating Federated Learning and LLMs
Huaiying Luo
Cheng Ji
FedML
62
5
0
19 May 2025
Cloud-Based AI Systems: Leveraging Large Language Models for Intelligent Fault Detection and Autonomous Self-Healing
Cheng Ji
Huaiying Luo
76
7
0
16 May 2025