李权熹 (Quanxi Li)

PhD Candidate at ICT, CAS. Focused on developing hard-core systems and compilers for emerging hardware.

About Me

I am a PhD Candidate at the Institute of Computing Technology, Chinese Academy of Sciences (ICT, CAS) co-advised by Dr. Xiaobing Feng (冯晓兵) and Dr. Chenxi Wang (王晨曦).

My research focuses on the development of hard-core systems tailored for emerging hardware platforms, such as resource-disaggregated datacenters.

I am particularly interested in enhancing the performance of applications on these systems through the design and implementation of advanced programming models and compiler optimizations.

Research Interests


Runtime for Disaggregated Memory

As an emerging datacenter architecture, resource disaggregation reorganizes each kind of datacenter hardware into dedicated resource servers to improve resource utilization and fault tolerance and to simplify hardware adoption. These servers are connected by advanced network fabrics such as InfiniBand and Intel Fabrics. As a result, a cloud application running on a resource-disaggregated cluster can obtain compute and memory resources from different servers.

I built a memory-disaggregated framework, Beehive, which improves remote access throughput by exploiting the asynchrony within each thread. Beehive automatically converts applications into asynchronous execution code, which helps hide the microsecond-scale latency of remote memory access. It does so while keeping CPU overhead low and improving data locality, leading to more efficient application performance.
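To illustrate the general idea of exploiting asynchrony within a single thread, here is a minimal Python sketch. It is not Beehive's interface or implementation (Python's `asyncio` is used only for brevity); `RemoteMemory`, `sum_blocking_style`, `sum_overlapped`, and the 1 ms latency are hypothetical stand-ins. The sketch contrasts issuing one remote read at a time with keeping several reads in flight from the same thread.

```python
import asyncio


class RemoteMemory:
    """Hypothetical stand-in for a remote memory server (an assumption, not a real API)."""

    def __init__(self, data, latency_s=0.001):
        self.data = data
        self.latency_s = latency_s      # simulated fabric round-trip time
        self.in_flight = 0              # reads currently outstanding
        self.peak_in_flight = 0         # highest number of overlapping reads seen

    async def read(self, key):
        self.in_flight += 1
        self.peak_in_flight = max(self.peak_in_flight, self.in_flight)
        await asyncio.sleep(self.latency_s)   # the thread is free to do other work here
        self.in_flight -= 1
        return self.data[key]


async def sum_blocking_style(mem, keys):
    """One read at a time: each access waits for the previous one to finish."""
    total = 0
    for key in keys:
        total += await mem.read(key)
    return total


async def sum_overlapped(mem, keys):
    """Issue all reads first, then wait: their latencies overlap within one thread."""
    values = await asyncio.gather(*(mem.read(key) for key in keys))
    return sum(values)


def run_demo():
    data = {i: i * i for i in range(8)}
    keys = list(data)
    seq_mem, ovl_mem = RemoteMemory(data), RemoteMemory(data)
    seq_total = asyncio.run(sum_blocking_style(seq_mem, keys))
    ovl_total = asyncio.run(sum_overlapped(ovl_mem, keys))
    return seq_total, seq_mem.peak_in_flight, ovl_total, ovl_mem.peak_in_flight
```

Both versions compute the same sum, but the overlapped version keeps all eight reads outstanding at once, whereas the blocking-style version never has more than one in flight.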

Publications

Beehive: A Scalable Disaggregated Memory Runtime Exploiting Asynchrony of Multithreaded Programs
Quanxi Li, Hong Huang, Ying Liu, Yanwen Xia, Jie Zhang, Mosong Zhou, Xiaobing Feng, Huimin Cui, Quan Chen, Yizhou Shan, Chenxi Wang*
The 22nd USENIX Symposium on Networked Systems Design and Implementation (NSDI 2025)

Developing memory-disaggregated applications atop the emerging I/O fabrics is drawing more attention from industry and academia due to its ability to break the memory capacity wall and improve resource utilization. However, the microsecond (μs)-scale I/O fabrics raise tension between programming productivity and performance. The multithreaded synchronous programming model is popular in developing memory-disaggregated applications due to its intuitive program logic. However, our key insight is that although thread switching can effectively mitigate the μs-scale latency, it leads to poor data locality and non-trivial scheduling overhead, leaving significant opportunities to improve the performance further.

This paper proposes a memory-disaggregated framework, Beehive, which improves the remote access throughput by exploiting the asynchrony within each thread. Beehive contains three components: the programming interfaces, the Rust compiler, and the runtime system. To improve programming usability, Beehive allows programmers to develop applications in the conventional multithreaded synchronous model and automatically transforms the code, via the compiler, into asynchronous code based on pararoutines (a newly proposed computation and scheduling unit). We evaluated Beehive with eight workloads, including data analytics, graph processing frameworks, machine learning frameworks, key-value stores, web services, etc. As a result, Beehive outperforms the state-of-the-art memory-disaggregated frameworks, i.e., Hermit and AIFM, by 3.05× and 1.58× on average, respectively.

Orthrus: Efficient and Timely Detection of Silent User Data Corruption in the Cloud with Resource-Adaptive Computation Validation
Chenxiao Liu, Zhenting Zhu, Quanxi Li, Yanwen Xia, Yifan Qiao, Xiangyun Deng, Youyou Lu, Tao Xie, Huimin Cui, Zidong Du, Harry Xu, Chenxi Wang
The 31st ACM Symposium on Operating Systems Principles (SOSP 2025)

Even with substantial endeavors to test and validate processors, computational errors may still arise post-installation. One particular category of CPU errors transpires discreetly, without crashing applications or triggering hardware warnings. These elusive errors pose a significant threat by undermining user data, and their detection is challenging. This paper introduces Orthrus, a solution for the timely detection of silent user-data corruption caused by post-installation CPU errors. Orthrus safeguards user data in cloud applications by providing simple annotations and compiler support for users to identify data operators and validating these operators asynchronously across cores while maintaining an ultra-low overhead (2-6%), making it practical for production deployment. Our evaluation, using carefully injected errors, demonstrates that Orthrus can detect 87% of data corruptions with just a single core dedicated to validation, increasing to 91% and 96% when two and four cores are used.
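The validation idea described above can be illustrated with a minimal Python sketch. This is not Orthrus's API or implementation: `run_with_validation`, `add_interest`, and the `inject_error` hook are hypothetical names, and a single-worker thread pool stands in for a dedicated validation core. The application continues with the primary result while the operator is re-executed asynchronously and the two results are compared.

```python
from concurrent.futures import ThreadPoolExecutor


def add_interest(balance, percent):
    """Hypothetical data operator: a pure function over user data."""
    return balance + balance * percent // 100


def run_with_validation(pool, operator, args, inject_error=None):
    """Run `operator`, then re-execute it asynchronously and compare results.

    `inject_error` is a testing hook (an assumption of this sketch) that
    corrupts the primary result to mimic a silent CPU error.
    """
    result = operator(*args)
    if inject_error is not None:
        result = inject_error(result)

    # The caller proceeds with `result` right away; the check runs on the
    # validator worker and resolves to True when the re-execution agrees.
    check = pool.submit(lambda: operator(*args) == result)
    return result, check


def run_demo():
    # One worker plays the role of a single core dedicated to validation.
    with ThreadPoolExecutor(max_workers=1) as pool:
        ok_result, ok_check = run_with_validation(pool, add_interest, (100, 3))
        bad_result, bad_check = run_with_validation(
            pool, add_interest, (100, 3), inject_error=lambda v: v ^ 1
        )
        return ok_result, ok_check.result(), bad_result, bad_check.result()
```

In this sketch the clean run validates successfully, while the run with a flipped low bit is flagged as a mismatch once the asynchronous check completes.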

Codar: A Contextual Duration-Aware Qubit Mapping for Various NISQ Devices
Haowei Deng, Yu Zhang*, Quanxi Li
The 57th ACM/IEEE Design Automation Conference (DAC 2020)

Optimizing quantum programs against decoherence: Delaying qubits into quantum superposition
Yu Zhang*, Haowei Deng, Quanxi Li, Haoze Song and Leihai Nie
The 2019 International Symposium on Theoretical Aspects of Software Engineering (TASE 2019)

Education

2016-2020

Bachelor's Degree

University of Science and Technology of China (USTC)

Major in Physics

Minor in Computer Science and Technology

Contact

liquanxi20z@ict.ac.cn