I work on making AI/LLM inference systems faster and more efficient.
Stream2LLM: overlap context streaming with LLM prefill for up to 11x faster TTFT.
MLSys '26 · [Project Page]
Lotus: profile ML data preprocessing pipelines to find GPU utilization bottlenecks.
IISWC '24 · HotInfra '24 · [Project Page]
MS CS @ Georgia Tech, graduating May 2026.
Previously, I spent four years in the PhD program at Georgia Tech, where I was advised by Prof. Ada Gavrilovska and Prof. Kexin Rong.
BS @ Penn State.
rr@gatech.edu · GitHub · X · LinkedIn