I work on making AI/LLM inference systems faster and more efficient.
Stream2LLM — overlap context streaming with LLM prefill for up to 11x faster TTFT.
MLSys '26 · [Project Page]
Lotus — profile ML data preprocessing pipelines to find GPU utilization bottlenecks.
IISWC '24 · HotInfra '24 · [Project Page]
MS CS @ Georgia Tech, graduating May 2026.
Previously, I spent four years in the PhD program at Georgia Tech, where I was advised by Prof. Ada Gavrilovska and Prof. Kexin Rong.
BS @ Penn State.
rr@gatech.edu · GitHub · X · LinkedIn