About Me
I am a PhD candidate at the Institute of Computing Technology, Chinese Academy of Sciences (ICT, CAS), co-advised by Dr. Xiaobing Feng (冯晓兵) and Dr. Chenxi Wang (王晨曦).
My research focuses on building systems software tailored for emerging hardware platforms, such as resource-disaggregated datacenters.
I am particularly interested in improving application performance on these systems through the design and implementation of programming models and compiler optimizations.
Research Interests
Runtime for Disaggregated Memory
As an emerging datacenter architecture, resource disaggregation reorganizes each kind of datacenter hardware into dedicated resource servers to improve resource utilization and fault tolerance and to simplify hardware adoption. These servers are connected by advanced network fabrics such as InfiniBand and Intel Fabrics. As a result, a cloud application running on a resource-disaggregated cluster can obtain compute and memory resources from different servers.
I built Beehive, a memory-disaggregated framework that improves remote access throughput by exploiting the asynchrony within each thread. Beehive automatically converts applications into asynchronous code, which mitigates the microsecond-scale latency of remote memory access while keeping CPU overhead low and improving data locality.
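To make the idea of exploiting asynchrony within a single thread concrete, here is a minimal sketch in Rust. It is a toy simulation, not Beehive's code or its pararoutine design: the `SumTask` state machine, the tick-based cost model (`LATENCY`, `ISSUE_COST`), and both schedulers are hypothetical names and assumptions introduced only for illustration. It compares blocking on every remote read with switching to another task in the same thread while a read is in flight.

```rust
/// Assumed cost model (illustrative, not measured): a remote read completes
/// LATENCY ticks after it is issued, and issuing it costs ISSUE_COST ticks of CPU.
const LATENCY: u64 = 10;
const ISSUE_COST: u64 = 1;

/// Position of a task in its hand-written (hypothetical) state machine.
#[derive(Clone, Copy)]
enum State {
    Issue,                     // ready to issue the next remote read
    Waiting { ready_at: u64 }, // a read is in flight until `ready_at`
    Done,
}

/// A task that sums `reads` remote values; every value is modeled as 1.
struct SumTask {
    remaining: u32,
    sum: u64,
    state: State,
}

impl SumTask {
    fn new(reads: u32) -> Self {
        let state = if reads == 0 { State::Done } else { State::Issue };
        SumTask { remaining: reads, sum: 0, state }
    }
}

/// Baseline: stall on every read and run the tasks back to back.
fn run_blocking(tasks: &mut [SumTask]) -> u64 {
    let mut now: u64 = 0;
    for t in tasks.iter_mut() {
        while t.remaining > 0 {
            now += ISSUE_COST + LATENCY; // issue, then wait for the reply
            t.sum += 1;
            t.remaining -= 1;
        }
        t.state = State::Done;
    }
    now
}

/// Interleaved: while one task's read is in flight, run another task
/// in the same thread instead of stalling.
fn run_interleaved(tasks: &mut [SumTask]) -> u64 {
    let mut now: u64 = 0;
    loop {
        let mut progressed = false;
        let mut next_ready: Option<u64> = None;
        for t in tasks.iter_mut() {
            match t.state {
                State::Issue => {
                    now += ISSUE_COST;
                    t.state = State::Waiting { ready_at: now + LATENCY };
                    progressed = true;
                }
                State::Waiting { ready_at } if ready_at <= now => {
                    // The reply has arrived: consume it and move on.
                    t.sum += 1;
                    t.remaining -= 1;
                    t.state = if t.remaining == 0 { State::Done } else { State::Issue };
                    progressed = true;
                }
                State::Waiting { ready_at } => {
                    next_ready = Some(next_ready.map_or(ready_at, |r| r.min(ready_at)));
                }
                State::Done => {}
            }
        }
        if !progressed {
            match next_ready {
                Some(at) => now = at, // nothing runnable: idle until the earliest reply
                None => return now,   // every task is done
            }
        }
    }
}

/// Sum of the results across all tasks.
fn total(tasks: &[SumTask]) -> u64 {
    tasks.iter().map(|t| t.sum).sum()
}

fn main() {
    let mut a: Vec<SumTask> = (0..4).map(|_| SumTask::new(8)).collect();
    let mut b: Vec<SumTask> = (0..4).map(|_| SumTask::new(8)).collect();
    let blocking = run_blocking(&mut a);
    let interleaved = run_interleaved(&mut b);
    println!("blocking: {} ticks (sum {})", blocking, total(&a));
    println!("interleaved: {} ticks (sum {})", interleaved, total(&b));
}
```

In this toy model the interleaved schedule finishes in fewer ticks because the waits of different tasks overlap. A real runtime also has to account for switching cost and data locality, which the sketch ignores.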
Publications
Developing memory-disaggregated applications atop emerging I/O fabrics is drawing increasing attention from industry and academia because it can break the memory capacity wall and improve resource utilization. However, microsecond (μs)-scale I/O fabrics create tension between programming productivity and performance. The multithreaded synchronous programming model is popular for developing memory-disaggregated applications because of its intuitive program logic. Our key insight is that although thread switching can effectively mitigate μs-scale latency, it leads to poor data locality and non-trivial scheduling overhead, leaving significant room for further performance improvement.
This paper proposes Beehive, a memory-disaggregated framework that improves remote access throughput by exploiting the asynchrony within each thread. Beehive contains three components: the programming interfaces, the Rust compiler, and the runtime system. To improve usability, Beehive lets programmers develop applications in the conventional multithreaded synchronous model, and its compiler automatically transforms the code into asynchronous code based on pararoutines, a newly proposed computation and scheduling unit. We evaluated Beehive with eight workloads, including data analytics, graph processing frameworks, machine learning frameworks, key-value stores, and web services. Beehive outperforms the state-of-the-art memory-disaggregated frameworks Hermit and AIFM by 3.05× and 1.58× on average, respectively.
Education
Ph.D. Candidate
Institute of Computing Technology, Chinese Academy of Sciences (ICT, CAS)
Co-advised by Dr. Xiaobing Feng (冯晓兵) and Dr. Chenxi Wang (王晨曦)
Major in Computer Science and Technology
Bachelor's Degree
University of Science and Technology of China (USTC)
Major in Physics
Minor in Computer Science and Technology