ASPLOS 2025 Paper Classification Summary

Conference Overview

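ASPLOS 2025 includes a total of 72 papers, bringing together researchers from academia and industry and presenting the latest advances in systems research. As a major international conference in computer systems, ASPLOS continues to drive cross-cutting innovation across architecture, operating systems, and software systems. Technical trends at this year's conference span the integration of AI and systems, optimization of distributed and data center systems, innovation in storage and file systems, operating system kernel design, resource scheduling and management, the evolution of database systems, and privacy and security enhancements. Research hotspots center on AI-driven system optimization, efficient data center architectures, low-latency storage, secure isolation mechanisms, and cross-layer resource management, reflecting how systems research is deepening as systems become more intelligent and more distributed.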

Data generated: August 31, 2025

Total papers: 72
Technical areas: 8
Participating countries: 13
Major institutions: 10

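🔬 Technical Area Classification

[Charts not reproduced: technical area distribution; technical area statistics]

🌍 Author Origin Analysis

[Charts not reproduced: country distribution; major institutions]

🔍 Keyword Analysis

[Chart not reproduced: technical keyword popularity]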

Paper List

AI + Systems (18 papers)

Accelerating Retrieval-Augmented Generation

Authors: Derrick Quinn (Cornell University), Mohammad Nouri (Cornell University), Neel Patel (Cornell University), John Salihu (University of Kansas), Alireza Salemi (UMass Amherst), Sukhan Lee (Samsung Electronics), Hamed Zamani (UMass Amherst), Mohammad Alian (Cornell University)

AnA: An Attentive Autonomous Driving System

Authors: Wonkyo Choe (University of Virginia), Rongxiang Wang (University of Virginia), Felix Xiaozhu Lin (University of Virginia)

Cinnamon: A Framework for Scale-Out Encrypted AI

Authors: Siddharth Jayashankar (Carnegie Mellon University), Edward Chen (Carnegie Mellon University), Tom Tang (Carnegie Mellon University), Wenting Zheng (Carnegie Mellon University), Dimitrios Skarlatos (Carnegie Mellon University)

Concerto: Automatic Communication Optimization and Scheduling for Large-Scale Deep Learning

Authors: Shenggan Cheng (National University of Singapore), Shengjie Lin (Georgia Institute of Technology), Lansong Diao (Alibaba Group), Hao Wu (George Mason University), Siyu Wang (Alibaba Group), Chang Si (Alibaba Group), Ziming Liu (National University of Singapore), Xuanlei Zhao (National University of Singapore), Jiangsu Du (Sun Yat-sen University), Wei Lin (Alibaba Group), Yang You (National University of Singapore)

Early Termination for Hyperdimensional Computing Using Inferential Statistics

Authors: Pu (Luke) Yi (Stanford University), Yifan Yang (Stanford University), Chae Young Lee (Stanford University), Sara Achour (Stanford University)

Fast On-device LLM Inference with NPUs

Authors: Daliang Xu (Key Lab of HCST (PKU), MOE; SCS, Peking University), Hao Zhang (Beijing University of Posts and Telecommunications), Liming Yang (Key Lab of HCST (PKU), MOE; SCS, Peking University), Ruiqi Liu (Key Lab of HCST (PKU), MOE; SCS, Peking University), Gang Huang (Key Lab of HCST (PKU), MOE; SCS, Peking University & National Key Laboratory of Data Space Technology and System), Mengwei Xu (Beijing University of Posts and Telecommunications), Xuanzhe Liu (Key Lab of HCST (PKU), MOE; SCS, Peking University)

FRUGAL: Efficient and Economic Embedding Model Training with Commodity GPUs

Authors: Minhui Xie (Tsinghua University & Renmin University of China), Shaoxun Zeng (Tsinghua University), Hao Guo (Tsinghua University), Shiwei Gao (Tsinghua University), Youyou Lu (Tsinghua University)

FSMoE: A Flexible and Scalable Training System for Sparse Mixture-of-Experts Models

Authors: Xinglin Pan (The Hong Kong University of Science and Technology (Guangzhou)), Wenxiang Lin (Harbin Institute of Technology, Shenzhen), Lin Zhang (Hong Kong University of Science and Technology), Shaohuai Shi (Harbin Institute of Technology, Shenzhen), Zhenheng Tang (The Hong Kong University of Science and Technology), Rui Wang (The Hong Kong University of Science and Technology (Guangzhou)), Bo Li (Hong Kong University of Science and Technology), Xiaowen Chu (The Hong Kong University of Science and Technology (Guangzhou) & Hong Kong University of Science and Technology)

GRAPHPIPE: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism

Authors: Byungsoo Jeon (NVIDIA), Mengdi Wu (Carnegie Mellon University), Shiyi Cao (UC Berkeley), Sunghyun Kim (Massachusetts Institute of Technology), Sunghyun Park (NVIDIA), Neeraj Aggarwal (Carnegie Mellon University), Colin Unger (Stanford University), Daiyaan Arfeen (Carnegie Mellon University), Peiyuan Liao (Carnegie Mellon University), Xupeng Miao (Carnegie Mellon University), Mohammad Alizadeh (Massachusetts Institute of Technology), Gregory R. Ganger (Carnegie Mellon University), Tianqi Chen (Carnegie Mellon University), Zhihao Jia (Carnegie Mellon University)

METASAPIENS: Real-Time Neural Rendering with Efficiency-Aware Pruning and Accelerated Foveated Rendering

Authors: Weikai Lin (University of Rochester), Yu Feng (Shanghai Jiao Tong University), Yuhao Zhu (University of Rochester)

MOE-LIGHTNING: High-Throughput MoE Inference on Memory-constrained GPUs

Authors: Shiyi Cao (UC Berkeley), Shu Liu (UC Berkeley), Tyler Griggs (UC Berkeley), Peter Schafhalter (UC Berkeley), Xiaoxuan Liu (UC Berkeley), Ying Sheng (Stanford University), Joseph E. Gonzalez (UC Berkeley), Matei Zaharia (UC Berkeley), Ion Stoica (UC Berkeley)

MVQ: Towards Efficient DNN Compression and Acceleration with Masked Vector Quantization

Authors: Shuaiting Li (Zhejiang University), Chengxuan Wang (Zhejiang University), Juncan Deng (Zhejiang University), Zeyu Wang (Zhejiang University), Zewen Ye (Zhejiang University), Zongsheng Wang (Zhejiang University), Haibin Shen (Zhejiang University), Kejie Huang (Zhejiang University)

Nazar: Monitoring and Adapting ML Models on Mobile Devices

Authors: Wei Hao (Columbia University), Zixi Wang (Columbia University), Lauren Hong (Columbia University), Lingxiao Li (Columbia University), Nader Karayanni (Columbia University), AnMei Dasbach-Prisk (University of California San Diego), Chengzhi Mao (Columbia University), Junfeng Yang (Columbia University), Asaf Cidon (Columbia University)

PartIR: Composing SPMD Partitioning Strategies for Machine Learning

Authors: Sami Alabed (Google DeepMind), Daniel Belov (Google DeepMind), Bart Chrzaszcz (Google DeepMind), Juliana Franco (Google DeepMind), Dominik Grewe (Google DeepMind), Dougal Maclaurin (Google DeepMind), James Molloy (Google DeepMind), Tom Natan (Google DeepMind), Tamara Norman (Google DeepMind), Xiaoyue Pan (Google DeepMind), Adam Paszke (Google DeepMind), Norman A. Rink (Google DeepMind), Michael Schaarschmidt (Isomorphic Labs), Timur Sitdikov (Google DeepMind), Agnieszka Swietlik (Google DeepMind), Dimitrios Vytiniotis (Google DeepMind), Joel Wee (Google DeepMind)

PipeLLM: Fast and Confidential Large Language Model Services with Speculative Pipelined Encryption

Authors: Yifan Tan (Institute of Parallel and Distributed Systems, SEIEE, Shanghai Jiao Tong University), Cheng Tan (Northeastern University), Zeyu Mi (Institute of Parallel and Distributed Systems, SEIEE, Shanghai Jiao Tong University), Haibo Chen (Institute of Parallel and Distributed Systems, SEIEE, Shanghai Jiao Tong University)

SmoothE: Differentiable E-Graph Extraction

Authors: Yaohui Cai (Cornell University), Kaixin Yang (Cornell University), Chenhui Deng (Cornell University), Cunxi Yu (University of Maryland, College Park), Zhiru Zhang (Cornell University)

SuperNoVA: Algorithm-Hardware Co-Design for Resource-Aware SLAM

Authors: Seah Kim (University of California, Berkeley), Roger Hsiao (University of California, Berkeley), Borivoje Nikolić (University of California, Berkeley), James Demmel (University of California, Berkeley), Yakun Sophia Shao (University of California, Berkeley)

vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention

Authors: Ramya Prabhu (Microsoft Research), Ajay Nayak (Indian Institute of Science), Jayashree Mohan (Microsoft Research), Ramachandran Ramjee (Microsoft Research), Ashish Panwar (Microsoft Research)

Distributed Systems and Data Centers (6 papers)

Composing Distributed Computations Through Task and Kernel Fusion

Authors: Rohan Yadav (Stanford University), Shiv Sundram (Stanford University), Wonchan Lee (NVIDIA), Michael Garland (NVIDIA), Michael Bauer (NVIDIA), Alex Aiken (Stanford University), Fredrik Kjolstad (Stanford University)

Cooperative Graceful Degradation in Containerized Clouds

Authors: Kapil Agrawal (University of California, Irvine), Sangeetha Abdu Jyothi (University of California, Irvine and VMware Research)

Copper and Wire: Bridging Expressiveness and Performance for Service Mesh Policies

Authors: Divyanshu Saxena (The University of Texas at Austin), William Zhang (The University of Texas at Austin), Shankara Pailoor (The University of Texas at Austin), Isil Dillig (The University of Texas at Austin), Aditya Akella (The University of Texas at Austin)

FleetIO: Managing Multi-Tenant Cloud Storage with Multi-Agent Reinforcement Learning

Authors: Jinghan Sun (UIUC), Benjamin Reidys (UIUC), Daixuan Li (UIUC), Jichuan Chang (Google), Marc Snir (UIUC), Jian Huang (UIUC)

Performance Prediction of On-NIC Network Functions with Multi-Resource Contention and Traffic Awareness

Authors: Shaofeng Wu (The Chinese University of Hong Kong), Qiang Su (The Chinese University of Hong Kong), Zhixiong Niu (Microsoft Research), Hong Xu (The Chinese University of Hong Kong)

PULSE: Accelerating Distributed Pointer-Traversals on Disaggregated Memory

Authors: Yupeng Tang (Yale University), Seung-seob Lee (Yale University), Abhishek Bhattacharjee (Yale University), Anurag Khandelwal (Yale University)

File and Storage Systems (3 papers)

AnyKey: A Key-Value SSD for All Workload Types

Authors: Chanyoung Park (Hanyang University), Jungho Lee (Hanyang University), Chun-Yi Liu (Micron Technology Inc.), Kyungtae Kang (Hanyang University), Mahmut Taylan Kandemir (Pennsylvania State University), Wonil Choi (Hanyang University)

ByteFS: System Support for (CXL-based) Memory-Semantic Solid-State Drives

Authors: Shaobo Li (University of Illinois Urbana-Champaign), Yirui (Eric) Zhou (University of Illinois Urbana-Champaign), Hao Ren (University of Illinois Urbana-Champaign), Jian Huang (University of Illinois Urbana-Champaign)

ZRAID: Leveraging Zone Random Write Area (ZRWA) for Alleviating Partial Parity Tax in ZNS RAID

Authors: Minwook Kim (Seoul National University), Seongyeop Jeong (Seoul National University), Jin-Soo Kim (Seoul National University)

Kernel and Operating Systems (4 papers)

Debugger Toolchain Validation via Cross-Level Debugging

Authors: Yibiao Yang (State Key Laboratory for Novel Software Technology, Nanjing University), Maolin Sun (State Key Laboratory for Novel Software Technology, Nanjing University), Jiangchang Wu (State Key Laboratory for Novel Software Technology, Nanjing University), Qingyang Li (State Key Laboratory for Novel Software Technology, Nanjing University), Yuming Zhou (State Key Laboratory for Novel Software Technology, Nanjing University)

Rethinking Java Performance Analysis

Authors: Stephen M. Blackburn (Google & Australian National University), Zixian Cai (Australian National University), Rui Chen (Unaffiliated-Independent), Xi Yang (IOP Systems), John Zhang (Canva), John Zigman (The University of Sydney)

RTL Verification for Secure Speculation Using Contract Shadow Logic

Authors: Qinhan Tan (Princeton University), Yuheng Yang (Massachusetts Institute of Technology), Thomas Bourgeat (École Polytechnique Fédérale de Lausanne), Sharad Malik (Princeton University), Mengjia Yan (Massachusetts Institute of Technology)

Segue & ColorGuard: Optimizing SFI Performance and Scalability on Modern Architectures

Authors: Shravan Narayan (UT Austin), Tal Garfinkel (UC San Diego), Evan Johnson (UC San Diego), Zachary Yedidia (Stanford University), Yingchen Wang (UC Berkeley), Andrew Brown (Intel), Anjo Vahldiek-Oberwagner (Intel Labs), Michael LeMay (Intel Labs), Wenyong Huang (Intel), Xin Wang (Intel), Mingqiu Sun (Intel), Dean Tullsen (UC San Diego), Deian Stefan (UC San Diego)

Scheduling and Resource Management (7 papers)

Automatic Tracing in Task-Based Runtime Systems

Authors: Rohan Yadav (Stanford University), Michael Bauer (NVIDIA), David Broman (KTH Royal Institute of Technology), Michael Garland (NVIDIA), Alex Aiken (Stanford University), Fredrik Kjolstad (Stanford University)

Coach: Exploiting Temporal Patterns for All-Resource Oversubscription in Cloud Platforms

Authors: Benjamin Reidys (University of Illinois Urbana-Champaign), Pantea Zardoshti (Microsoft), Íñigo Goiri (Microsoft), Celine Irvene (Microsoft), Daniel S. Berger (Microsoft & University of Washington), Haoran Ma (University of California-Los Angeles), Kapil Arya (Microsoft), Eli Cortez (Microsoft), Taylor Stark (Microsoft), Eugene Bak (Microsoft), Mehmet Iyigun (Microsoft), Stanko Novaković (Google), Lisa Hsu (Meta), Karel Trueba (Microsoft), Abhisek Pan (Microsoft), Chetan Bansal (Microsoft), Saravan Rajmohan (Microsoft), Jian Huang (University of Illinois Urbana-Champaign), Ricardo Bianchini (Microsoft)

Dilu: Enabling GPU Resourcing-on-Demand for Serverless DL Serving via Introspective Elasticity

Authors: Cunchi Lv (ICT, CAS & UCAS), Xiao Shi (ICT, CAS & Nanjing Institute of InforSuperBahn), Zhengyu Lei (ICT, CAS & UCAS), Jinyue Huang (ICT, CAS & UCAS), Wenting Tan (ICT, CAS), Xiaohui Zheng (ICT, CAS), Xiaofang Zhao (ICT, CAS & IICT, Suzhou, CAS)

Forecasting GPU Performance for Deep Learning Training and Inference

Authors: Seonho Lee (Georgia Institute of Technology), Amar Phanishayee (Meta), Divya Mahajan (Georgia Institute of Technology)

Tally: Non-Intrusive Performance Isolation for Concurrent Deep Learning Workloads

Authors: Wei Zhao (Stanford University & CentML), Anand Jayarajan (University of Toronto & Vector Institute & CentML), Gennady Pekhimenko (University of Toronto & Vector Institute & CentML)

TELA: A Temporal Load-Aware Cloud Virtual Disk Placement Scheme

Authors: Difan Tan (Huazhong University of Science and Technology), Jiawei Li (Huazhong University of Science and Technology), Hua Wang (Huazhong University of Science and Technology), Xiaoxiao Li (Huazhong University of Science and Technology), Wenbo Liu (Huazhong University of Science and Technology), Zijin Qin (Huazhong University of Science and Technology), Ke Zhou (Huazhong University of Science and Technology), Ming Xie (Tencent Inc.), Mengling Tao (Tencent Inc.)

Using Analytical Performance/Power Model and Fine-Grained DVFS to Enhance AI Accelerator Energy Efficiency

Authors: Zibo Wang (State Key Laboratory for Novel Software Technology, Nanjing University), Yijia Zhang (Peng Cheng Laboratory), Fuchun Wei (Huawei Technologies Co., Ltd), Bingqiang Wang (Peng Cheng Laboratory), Yanlin Liu (Huawei Technologies Co., Ltd), Zhiheng Hu (Huawei Technologies Co., Ltd), Jingyi Zhang (Huawei Technologies Co., Ltd), Xiaoxin Xu (Huawei Technologies Co., Ltd), Jian He (Huawei Technologies Co., Ltd), Xiaoliang Wang (State Key Laboratory for Novel Software Technology, Nanjing University), Wanchun Dou (State Key Laboratory for Novel Software Technology, Nanjing University), Guihai Chen (State Key Laboratory for Novel Software Technology, Nanjing University), Chen Tian (State Key Laboratory for Novel Software Technology, Nanjing University)

Database Systems (1 paper)

Fusion: An Analytics Object Store Optimized for Query Pushdown

Authors: Jianan Lu (Princeton University), Ashwini Raina (Princeton University), Asaf Cidon (Columbia University), Michael J. Freedman (Princeton University)

Privacy and Security (6 papers)

CLOSUREX: Compiler Support for Correct Persistent Fuzzing

Authors: Rishi Ranjan (Virginia Tech), Ian Paterson (Virginia Tech), Matthew Hicks (Virginia Tech)

HALO: Loop-aware Bootstrapping Management for Fully Homomorphic Encryption

Authors: Seonyoung Cheon (Yonsei University), Yongwoo Lee (Yonsei University), Hoyun Youm (Yonsei University), Dongkwan Kim (Yonsei University), Sungwoo Yun (Yonsei University), Kunmo Jeong (Yonsei University), Dongyoon Lee (Stony Brook University), Hanjun Kim (Yonsei University)

Marionette: A RowHammer Attack via Row Coupling

Authors: Seungmin Baek (Seoul National University), Minbok Wi (Seoul National University), Seonyong Park (Seoul National University), Hwayong Nam (Seoul National University), Michael Jaemin Kim (Seoul National University), Nam Sung Kim (University of Illinois), Jung Ho Ahn (Seoul National University)

MOAT: Securely Mitigating Rowhammer with Per-Row Activation Counters

Authors: Moinuddin Qureshi (Georgia Institute of Technology), Salman Qazi (Google)

PCcheck: Persistent Concurrent Checkpointing for ML

Authors: Foteini Strati (ETH Zurich), Michal Friedman (ETH Zurich), Ana Klimovic (ETH Zurich)

Robustness Verification for Checking Crash Consistency of Non-volatile Memory

Authors: Zhilei Han (School of Software, Tsinghua University), Fei He (School of Software, Tsinghua University)

Other (27 papers)

ARC: Warp-level Adaptive Atomic Reduction in GPUs to Accelerate Differentiable Rendering

Authors: Sankeerth Durvasula (Vector Institute, University of Toronto), Adrian Zhao (Vector Institute, University of Toronto), Fan Chen (University of Toronto), Ruofan Liang (Vector Institute, University of Toronto), Pawan Kumar Sanjaya (Vector Institute, University of Toronto), Yushi Guan (Vector Institute, University of Toronto), Christina Giannoula (Vector Institute, University of Toronto), Nandita Vijaykumar (Vector Institute, University of Toronto)

BatchZK: A Fully Pipelined GPU-Accelerated System for Batch Generation of Zero-Knowledge Proofs

Authors: Tao Lu (Zhejiang University & National University of Singapore), Yuxun Chen (Zhejiang University), Zonghui Wang (Zhejiang University), Xiaohang Wang (Zhejiang University), Wenzhi Chen (Zhejiang University), Jiaheng Zhang (National University of Singapore)

CRUSH: A Credit-Based Approach for Functional Unit Sharing in Dynamically Scheduled HLS

Authors: Jiahui Xu (ETH Zurich), Lana Josipović (ETH Zurich)

DarwinGame: Playing Tournaments for Tuning Applications in Noisy Cloud Environments

Authors: Rohan Basu Roy (University of Utah), Vijay Gadepally (Massachusetts Institute of Technology), Devesh Tiwari (Northeastern University)

Design and Operation of Shared Machine Learning Clusters on Campus

Authors: Kaiqiang Xu (Hong Kong University of Science and Technology), Decang Sun (Hong Kong University of Science and Technology), Hao Wang (Hong Kong University of Science and Technology), Zhenghang Ren (Hong Kong University of Science and Technology), Xinchen Wan (Hong Kong University of Science and Technology), Xudong Liao (Hong Kong University of Science and Technology), Zilong Wang (Hong Kong University of Science and Technology), Junxue Zhang (Hong Kong University of Science and Technology), Kai Chen (Hong Kong University of Science and Technology)

Earth+: On-Board Satellite Imagery Compression Leveraging Historical Earth Observations

Authors: Kuntai Du (University of Chicago), Yihua Cheng (University of Chicago), Peder Olsen (Microsoft Research), Shadi Noghabi (Microsoft Research), Junchen Jiang (University of Chicago)

EDM: An Ultra-Low Latency Ethernet Fabric for Memory Disaggregation

Authors: Weigao Su (Purdue University), Vishal Shrivastav (Purdue University)

Efficient Lossless Compression of Scientific Floating-Point Data on CPUs and GPUs

Authors: Noushin Azami (Department of Computer Science, Texas State University), Alex Fallin (Department of Computer Science, Texas State University), Martin Burtscher (Department of Computer Science, Texas State University)

Enhancing CGRA Efficiency Through Aligned Compute and Communication Provisioning

Authors: Zhaoying Li (National University of Singapore), Pranav Dangi (National University of Singapore), Chenyang Yin (Peking University), Thilini Kaushalya Bandara (National University of Singapore), Rohan Juneja (National University of Singapore), Cheng Tan (Google), Zhenyu Bai (National University of Singapore), Tulika Mitra (National University of Singapore)

Exo 2: Growing a Scheduling Language

Authors: Yuka Ikarashi (MIT CSAIL), Kevin Qian (MIT CSAIL), Samir Droubi (MIT CSAIL), Alex Reinking (Adobe), Gilbert Louis Bernstein (University of Washington), Jonathan Ragan-Kelley (MIT CSAIL)

Faster Chaitin-like Register Allocation via Grammatical Decompositions of Control-Flow Graphs

Authors: Xuran Cai (Hong Kong University of Science and Technology), Amir Kafshdar Goharshady (University of Oxford), S. Hitarth (Hong Kong University of Science and Technology), Chun Kit Lam (Hong Kong University of Science and Technology)

Mint: Cost-Efficient Tracing with All Requests Collection via Commonality and Variability Analysis

Authors: Haiyu Huang (Sun Yat-sen University), Cheng Chen (Alibaba Group), Kunyi Chen (Alibaba Group), Pengfei Chen (Sun Yat-sen University), Guangba Yu (Sun Yat-sen University), Zilong He (Sun Yat-sen University), Yilun Wang (Sun Yat-sen University), Huxing Zhang (Alibaba Group), Qi Zhou (Alibaba Group)

QECC-Synth: A Layout Synthesizer for Quantum Error Correction Codes on Sparse Architectures

Authors: Keyi Yin (University of California, San Diego), Hezi Zhang (University of California, San Diego), Xiang Fang (University of California, San Diego), Yunong Shi (AWS Quantum Technologies), Travis S. Humble (Oak Ridge National Laboratory), Ang Li (Pacific Northwest National Laboratory), Yufei Ding (University of California, San Diego)

RANGE-BLOCKS: A Synchronization Facility for Domain-Specific Architectures

Authors: Anagha Molakalmur Anil Kumar (Simon Fraser University), Aditya Prasanna (Simon Fraser University), Arrvindh Shriraman (Simon Fraser University)

RASSM: Residue-based Acceleration of Single Sparse Matrix Computation via Adaptive Tiling

Authors: Anirudh Jain (Georgia Institute of Technology), Pulkit Gupta (Georgia Institute of Technology), Thomas M. Conte (Georgia Institute of Technology)

RESBM: Region-based Scale and Minimal-Level Bootstrapping Management for FHE via Min-Cut

Authors: Yan Liu (Ant Group), Jianxin Lai (Ant Group), Long Li (Ant Group), Tianxiang Sui (Ant Group), Linjie Xiao (Ant Group), Peng Yuan (Ant Group), Xiaojing Zhang (Ant Group), Qing Zhu (Ant Group), Wenguang Chen (Tsinghua University & Ant Group), Jingling Xue (UNSW, Ant Group)

Selectively Uniform Concurrency Testing

Authors: Huan Zhao (National University of Singapore), Dylan Wolff (National University of Singapore), Umang Mathur (National University of Singapore), Abhik Roychoudhury (National University of Singapore)

Accelerating Number Theoretic Transform with Multi-GPU Systems for Efficient Zero Knowledge Proof

Authors: Zhuoran Ji (School of Cyber Science and Technology, Shandong University), Jianyu Zhao (School of Cyber Science and Technology, Shandong University), Peimin Gao (School of Cyber Science and Technology, Shandong University), Xiangkai Yin (School of Cyber Science and Technology, Shandong University), Lei Ju (Quan Cheng Laboratory)

D-VSync: Decoupled Rendering and Displaying for Smartphone Graphics

Authors: Yuanpei Wu (IPADS, Shanghai Jiao Tong University & Engineering Research Center for Domain-specific Operating Systems, Ministry of Education), Dong Du (IPADS, Shanghai Jiao Tong University & Engineering Research Center for Domain-specific Operating Systems, Ministry of Education), Chao Xu (Fields Lab, Huawei Central Software Institute), Yubin Xia (IPADS, Shanghai Jiao Tong University & Engineering Research Center for Domain-specific Operating Systems, Ministry of Education), Ming Fu (Fields Lab, Huawei Central Software Institute), Binyu Zang (IPADS, Shanghai Jiao Tong University & Engineering Research Center for Domain-specific Operating Systems, Ministry of Education), Haibo Chen (IPADS, Shanghai Jiao Tong University & Key Laboratory of System Software (Chinese Academy of Science))

MEDUSA: Accelerating Serverless LLM Inference with Materialization

Authors: Shaoxun Zeng (Tsinghua University), Minhui Xie (Tsinghua University), Shiwei Gao (Tsinghua University), Youmin Chen (Tsinghua University), Youyou Lu (Tsinghua University)

Optimizing Datalog for the GPU

Authors: Yihao Sun (Syracuse University), Ahmedur Rahman Shovon (University of Illinois, Chicago), Thomas Gilray (Washington State University), Sidharth Kumar (University of Illinois, Chicago), Kristopher Micinski (Syracuse University)

Optimizing Quantum Circuits, Fast and Slow

Authors: Amanda Xu (University of Wisconsin-Madison), Abtin Molavi (University of Wisconsin-Madison), Swamit Tannu (University of Wisconsin-Madison), Aws Albarghouthi (University of Wisconsin-Madison)

Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow

Authors: Yixuan Mei (Carnegie Mellon University), Yonghao Zhuang (Carnegie Mellon University), Xupeng Miao (Carnegie Mellon University), Juncheng Yang (Carnegie Mellon University), Zhihao Jia (Carnegie Mellon University), Rashmi Vinayak (Carnegie Mellon University)

H-HOUDINI: Scalable Invariant Learning

Authors: Sushant Dinesh (University of California, Berkeley), Yongye Zhu (University of California, Berkeley), Christopher W. Fletcher (University of California, Berkeley)

Instruction-Aware Cooperative TLB and Cache Replacement Policies

Authors: Dimitrios Chasapis (Barcelona Supercomputing Center (BSC)), Georgios Vavouliotis (Unaffiliated), Daniel A. Jiménez (Texas A&M University), Marc Casas (Barcelona Supercomputing Center (BSC) & Universitat Politècnica de Catalunya (UPC))

Target-Aware Implementation of Real Expressions

Authors: Brett Saiki (University of Washington), Jackson Brough (University of Utah), Jonas Regehr (University of Utah), Jesús Ponce (University of Utah), Varun Pradeep (University of Washington), Aditya Akhileshwaran (University of Washington), Zachary Tatlock (University of Washington), Pavel Panchekha (University of Utah)

UniZK: Accelerating Zero-Knowledge Proof with Unified Hardware and Flexible Kernel Mapping

Authors: Cheng Wang (Xi’an Jiaotong University & Institute for Interdisciplinary Information Core Technology), Mingyu Gao (Tsinghua University & Shanghai Qi Zhi Institute)