Academic Home

白卓新 Zhuoxin Bai

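🎓 Researcher working on the optimization of large-model inference systems

🧠 Currently focused on LLM inference acceleration, KV Cache optimization, Agent Memory, and system design for long-context, high-throughput inference.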

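My work centers on large language model inference systems: inference acceleration, KV Cache management, memory optimization for long-context workloads, and the design of Agent Memory mechanisms. This homepage uses a single-page layout that brings my profile, research interests, live charts, and publication list into one view. A GitHub Actions workflow periodically syncs the publication data from ORCID.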

📍 Chongqing University, Chongqing, China 🌱 Happy to talk about LLM Systems, inference optimization, agent memory, and open-source implementations.

✉️ Email 🧾 ORCID 💻 GitHub 📝 Blog

8 papers

3 active years

2026 latest publication year

5 ORCID keywords

✨ Research Profile

A one-page overview of what I work on.

🔬 Research · LLM Inference、KV Cache、Agent Memory ⚙️ Engineering · Systems、Storage、Serving 🏷️ LLM 🏷️ KV Cache 🏷️ GPU 🏷️ Storage 🏷️ GNN

Current focus

LLM inference acceleration, KV Cache optimization, Agent Memory

Current affiliation

Chongqing University · student

Publication data source

Public ORCID record · synced periodically via GitHub Actions

Last synced

2026-05-18 07:04 UTC

📊 Publication Trends

Chart colors switch automatically with the system light or dark mode.

🧾 Publications

This list is generated by a GitHub Actions workflow that periodically syncs from ORCID.

HitKV: Activation Frequency Knows Which Tokens Are Important

Proceedings of the AAAI Conference on Artificial Intelligence · 2026-03

Conference Paper

🔗 DOI 📄 Link 📥 Source: Zhuoxin Bai

Latency Optimization in Hybrid Memory System for GNNs

IEEE Transactions on Computers · 2026-03

Journal Article

🔗 DOI 📄 Link 📥 Source: Crossref

Cocache: An Accurate and Low-Overhead Dynamic Caching Method for GNNs

Lecture Notes in Computer Science · 2026

Conference Paper

🔗 DOI 📄 Link 📥 Source: Zhuoxin Bai

DualSpar: A Dual-Granularity Memory Framework with Adaptive Sparsity for Efficient LLM Inference

2025 IEEE 43rd International Conference on Computer Design (ICCD) · 2025-11

Conference Paper

🔗 DOI 📄 Link 📥 Source: Zhuoxin Bai

LAShards: Low-Overhead and Self-Adaptive MRC Construction for Non-Stack Algorithms

IEEE Transactions on Computers · 2025-10

Journal Article

🔗 DOI 📄 Link 📥 Source: Crossref

GNNBoost: Accelerating sampling-based GNN training on large scale graph by optimizing data preparation

Journal of Systems Architecture · 2025-10

Journal Article

🔗 DOI 📄 Link 📥 Source: Crossref

RobTrack: A Robust 3D Multi-object Tracking Method for Edge Devices

Lecture Notes in Computer Science · 2025

Conference Paper

🔗 DOI 📄 Link 📥 Source: Zhuoxin Bai

BGS: Accelerate GNN training on multiple GPUs

Journal of Systems Architecture · 2024-08

Journal Article

🔗 DOI 📄 Link 📥 Source: Crossref