Experience
2024 — Now
San Francisco Bay Area
Worked on Model APIs, Baseten's pay-per-token inference service and one of the company's top initiatives. Designed and implemented data models, particularly around rate-limiting and billing. Delivered an intuitive model playground experience on the frontend.
Took Baseten's Chains product (interconnected model microservices, i.e. compound AI) to general availability (GA). Ensured that all underlying resources for a Chain are deployed atomically. Implemented GraphQL data loaders to optimize the performance of deeply nested Chain queries. Polished the Chains UI.
Partnered with NVIDIA to integrate Baseten as an inference provider for their NIM product. Turned Baseten into an OAuth provider so that NVIDIA can sign users up through Baseten and collect NIM usage data. Shipped the entire integration in 3 days.
Won 3rd place at a Baseten hackathon by building a phone-calling agent with a team. Integrated Twilio and WebSockets with a real-time control loop that uses open-source VAD, STT, and LLM models, all deployed through Baseten's Chains product.
Led the frontend development for Baseten's training product. Built the UI for training logs, metrics, and billing.
2021 — Now
San Francisco Bay Area
Single-handedly created a computational knowledge base that enables users to easily centralize and organize their knowledge through low-code automations, such as AI integrations.
Engineered a secure code evaluation framework that allows users to quickly build custom features into the app.
Developed an intuitive and versatile rich text editor using Slate.js, complemented by a highly robust and efficient auto-saving mechanism.
Implemented a live search feature powered by Typesense and a custom-built distributed search indexer with a shared task queue in Redis.
Deployed the app as a set of microservices using Google Cloud Run, backed by databases on VMs for cost-effective data storage.
Authored the entire documentation website.
2021 — 2024
San Francisco Bay Area
Accelerated Udacity’s course creation throughput by 10x by building the Self-Service Pipeline, which allows the Content team to create and deploy Docker images for cloud-based coding environments without relying on Engineering.
Modernized Udacity’s Workspaces infrastructure from GCE VMs to Kubernetes, reducing cloud computing costs by 50% and enabling the Content team to deliver cutting-edge AI educational content.
Created a distributed task processor using Redis to handle thousands of live student Workspaces.
Minimized student wait time by shipping various Kubernetes optimizations such as proactive autoscaling and image pulls.
Migrated numerous VM images to Docker images and built several key Docker images for Udacity’s content, including GPU-capable images for machine learning.
2020
San Francisco Bay Area
Delivered intelligently generated fill-in-the-blanks questions to users by deploying a real-time machine learning service using Docker and Kubernetes.
Allowed developers to integrate containerized tasks into their data pipelines by defining and implementing best practices for Airflow.
Significantly increased SEO traffic by scaling the sitemap generation ETL and growing Quizlet’s sitemaps from 60M URLs to 230M URLs.
2018 — 2020
San Francisco Bay Area
Led the re-architecture of Quizlet’s flagship study mode using Kotlin Multiplatform, thus enabling various teams to build new study experiences.
Set a new standard of unit testing by authoring a mocking library and educating other engineers on how to use it. Led a team of engineers to open-source it (https://github.com/quizlet/hammock).
Facilitated a smooth transition into the high-traffic “back to school” season by implementing Memcached optimizations that reduced queries to major databases by 10-30%.
Education
University of Waterloo