I work on advanced load-balancing optimizations for Google's AI/ML model serving and all of Google's traffic. Read more about my latest contributions:
- https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-gke-inference-gateway
- https://docs.cloud.google.com/load-balancing/docs/service-lb-policy
2022 — Now
Sunnyvale, CA
I develop primarily in C++, Python, and Go.
Designed and implemented load balancing for ML serving workloads using GPU, TPU, and KV-cache utilization signals. This achieved:
* 96% lower P90 Time-to-First-Token latency.
* 60% lower P90 Normalized-Time-Per-Output-Token latency.
* 32% higher overall throughput for prefix-heavy workloads.
Read more: https://cloud.google.com/kubernetes-engine/docs/concepts/about-gke-inference-gateway
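The utilization-aware routing described above can be sketched as picking the backend with the lowest weighted utilization score. This is a minimal illustration only; the backend names, weights, and scoring formula are assumptions, not the production design:

```python
from dataclasses import dataclass

@dataclass
class Backend:
    name: str
    gpu_util: float       # reported GPU utilization, 0.0-1.0
    kv_cache_util: float  # reported KV-cache occupancy, 0.0-1.0

def pick_backend(backends, gpu_weight=0.5, kv_weight=0.5):
    """Choose the backend with the lowest weighted utilization score."""
    return min(backends,
               key=lambda b: gpu_weight * b.gpu_util + kv_weight * b.kv_cache_util)

# Hypothetical replicas: replica-b is far less loaded on both dimensions.
backends = [
    Backend("replica-a", gpu_util=0.9, kv_cache_util=0.4),
    Backend("replica-b", gpu_util=0.3, kv_cache_util=0.2),
]
best = pick_backend(backends)
```

In practice the weights would come from user configuration, which is what lets customers balance along "any utilization dimension" as described below.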
Designed and led the implementation of a multi-threaded, sharded, distributed load-balancing client-server system, collaborating with partner teams in Canada and Poland. This achieved:
* Cloud customers can now balance service traffic along any utilization dimension.
* Vastly simpler configuration for custom load balancing behavior.
Read more: https://cloud.google.com/kubernetes-engine/docs/how-to/expose-custom-metrics
10x speedup for the network design process and a 15x reduction in memory use for network topology graph data structures.
Wrote graph traversal algorithms to efficiently traverse Google's network topology and detect single points of failure.
Improved automation to re-map ~25% of all edge network customers to greatly improve customer cost center allocation.
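Single points of failure in a network topology correspond to the graph's articulation points, which the classic Tarjan DFS algorithm finds in linear time. The sketch below uses a toy topology with hypothetical node names; it illustrates the standard algorithm, not Google's actual traversal code:

```python
def articulation_points(graph):
    """Return nodes whose removal disconnects the graph (Tarjan's algorithm)."""
    disc, low, points = {}, {}, set()
    timer = [0]

    def dfs(u, parent):
        disc[u] = low[u] = timer[0]
        timer[0] += 1
        children = 0
        for v in graph[u]:
            if v == parent:
                continue
            if v in disc:  # back edge: reachable ancestor
                low[u] = min(low[u], disc[v])
            else:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # Non-root u is an articulation point if no descendant of v
                # can reach an ancestor of u without going through u.
                if parent is not None and low[v] >= disc[u]:
                    points.add(u)
        if parent is None and children > 1:  # root with >1 DFS subtree
            points.add(u)

    for node in graph:
        if node not in disc:
            dfs(node, None)
    return points

# Toy topology: A-B-C triangle with D hanging off C; C is the single point of failure.
topology = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
spofs = articulation_points(topology)
```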
2021 — 2021
Sunnyvale, CA
I worked on the Cloud Replication Services team and developed primarily in C/C++ on low-level kernel processes. My key contributions include:
3x speedup for OS compile-update-reboot time for ONTAP virtual machines
Automated ONTAP cloud cluster setup configuration for dynamic load testing
2.7x speedup for WAFL scheduler client I/O latency with minimal slowdown (0.8x) to WAFL scheduler replication operations latency
Further improved these to a 3.5x speedup for client I/O and a 1.2x speedup for replication operations using online random-forest server-load prediction
2020 — 2020
As part of a student-led start-up tackling contact-tracing challenges in countries where most people do not use smartphones, I worked closely with my teammates to build SMS and Android contact-tracing applications. I was responsible for implementing intuitive user interfaces, efficient privacy-preserving contact-tracing database systems to facilitate safe social distancing, and fast algorithms to find points of potential infection between people.
Key contributions:
Implemented Android and SMS (for non-smartphone users) contact tracing application for COVID-19 in Senegal
Designed NoSQL, PostgreSQL databases along with AWS Lambda RESTful API for TCN protocol contact tracing
Directed and implemented all permissions, roles within ZODAJ for Amazon AWS databases
Created overlap algorithm for n people in a store that runs in O(n log n + k), where k is the number of overlaps
Wrote graph algorithms and discrete-time Markov chain SIR epidemic models for tracking infections
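The overlap detection above can be sketched as an event-sorted sweep line: sorting arrival/departure events costs O(n log n), and emitting the k overlapping pairs costs O(k). The names and intervals below are illustrative, not from the real system:

```python
def find_overlaps(visits):
    """Given (person, start, end) visit intervals, return pairs of people
    whose intervals overlap, using a sweep line over sorted events."""
    events = []
    for person, start, end in visits:
        events.append((start, 0, person))  # kind 0 = arrival
        events.append((end, 1, person))    # kind 1 = departure
    # Arrivals sort before departures at the same timestamp, so a
    # boundary touch counts as contact (conservative for tracing).
    events.sort()
    active, overlaps = set(), []
    for _, kind, person in events:
        if kind == 0:
            for other in active:  # everyone currently present overlaps
                overlaps.append(tuple(sorted((person, other))))
            active.add(person)
        else:
            active.discard(person)
    return overlaps

contacts = find_overlaps([("alice", 1, 5), ("bob", 3, 7), ("carol", 8, 9)])
```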
Singapore
As part of the Open Source Desktop team, I maintained and extended a customized distribution of Ubuntu that interfaces with Bosch's network security layers and ships pre-made configurations, letting Bosch's Linux developers set up and use Ubuntu more efficiently and safely within Bosch's developer ecosystem. I also implemented build, testing, and deployment improvements for our product pipeline, and was responsible for both front-end and back-end implementation projects.
Key contributions:
Created efficient searchable user interface for developers to install their chosen ISO file via network boot (HTA, Jinja, JavaScript) from internal servers
Enabled Jenkins server to run two ISO builds in parallel (1 executor, 2 slaves) for testing via dynamic port allocation
Enabled automatic mounting of QEMU disk on Jenkins server to extract logs on initial startup post-installation for quicker debugging on failed build pipelines (Bash)
Created package to automate unit tests on newly installed ISO files on physical and virtual machines for standard functionality such as web and email access and having the latest security/proxy configurations (Bash, Python)
Modified Jenkins pipeline build steps for quicker deployment testing by allowing the ISO installation on the RAM disk to be bypassed, reducing testing time from 20 minutes to 1 minute
Began migration of current inventory servers to new OCS inventory servers via Ansible for seamless configuration
2017 — 2018
As part of the Analytics team, I was responsible for automating day-to-day processes in order to improve workflow of Eyeota's analytics department.
Key contributions:
Automated data cleaning and used Levenshtein distance and regular expressions on text data from programmatic buy-side platforms to classify advertiser and buyer data, enabling the business analytics team to track audience data sales
Extracted advertiser behavioral trends such as target demographics via web-scraping and from buyer data
Automated update process of daily, weekly, and monthly revenue generated from audience data sales
Automated formatting of .csv files of audience data sales across programmatic platforms such as Google DoubleClick Bid Manager, Adform, and AppNexus, for use in further data processing
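The Levenshtein-plus-regex classification above can be sketched as a normalize-then-edit-distance pipeline. The advertiser names, suffix list, and distance threshold here are illustrative assumptions, not Eyeota's actual rules:

```python
import re

def levenshtein(a, b):
    """Edit distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,        # deletion
                           cur[j - 1] + 1,     # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def normalize(name):
    """Lowercase, strip punctuation, and drop common corporate suffixes."""
    name = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return re.sub(r"\b(inc|ltd|llc|corp)\b", "", name).strip()

def classify(raw_name, known_advertisers, max_distance=2):
    """Map a raw buy-side platform name to the closest known advertiser."""
    target = normalize(raw_name)
    best = min(known_advertisers, key=lambda k: levenshtein(target, normalize(k)))
    return best if levenshtein(target, normalize(best)) <= max_distance else None
```

Normalizing first keeps the edit-distance threshold small and stable, since most variance in platform exports comes from punctuation and legal suffixes rather than the core name.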
Education
Carnegie Mellon University
Bachelor of Science - BS
Hwa Chong Institution