Host Network, Network Infrastructure
•Founded Eve, an LLM-powered AIOps platform for Uber’s core infrastructure, reducing incident diagnosis time from ~60 to ~7 minutes by automating triage and root cause analysis across distributed systems
•Designed a hierarchical multi-agent system (LangGraph) with controlled tool access and dual-mode execution, combining runbook automation for high-frequency alerts (~40% of incidents) with free-form reasoning across the infrastructure stack: host, data-center network, and internet edge
•Built a dynamic telemetry query and hypothesis engine that generates and evaluates hypotheses over time-series data using correlation models and statistical analysis to identify causal relationships across services and infrastructure layers
•Founded NetOS, Uber’s infrastructure observability and operations platform spanning multi-cloud and on-prem environments; led development from scratch using TypeScript, React, and GraphQL, driving adoption across engineering orgs and standardizing telemetry and self-service debugging