I work on advanced load-balancing optimizations for Google's AI/ML model serving and all of Google's traffic. Read more about my latest contributions:
- https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-gke-inference-gateway
- https://docs.cloud.google.com/load-balancing/docs/service-lb-policy
2022 — Now
Sunnyvale, CA
I develop primarily in C++, Python, and Go.
Designed and implemented load balancing for ML serving workloads using GPU, TPU, and KV-cache utilization signals. This achieved:
* 96% lower P90 Time-to-First-Token latency.
* 60% lower P90 Normalized-Time-Per-Output-Token latency.
* 32% higher overall throughput for prefix-heavy workloads.
Read more: https://cloud.google.com/kubernetes-engine/docs/concepts/about-gke-inference-gateway
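The utilization-aware routing described above can be sketched as picking the backend with the lowest weighted utilization score. This is a minimal illustration only; the backend names, weights, and scoring formula are assumptions, not the production design:

```python
from dataclasses import dataclass

@dataclass
class Backend:
    name: str
    gpu_util: float       # reported GPU utilization, 0.0-1.0
    kv_cache_util: float  # reported KV-cache occupancy, 0.0-1.0

def pick_backend(backends, gpu_weight=0.5, kv_weight=0.5):
    """Choose the backend with the lowest weighted utilization score."""
    return min(backends,
               key=lambda b: gpu_weight * b.gpu_util + kv_weight * b.kv_cache_util)

# Hypothetical replicas: replica-b is far less loaded on both dimensions.
backends = [
    Backend("replica-a", gpu_util=0.9, kv_cache_util=0.4),
    Backend("replica-b", gpu_util=0.3, kv_cache_util=0.2),
]
best = pick_backend(backends)
```

In practice the weights would come from user configuration, which is what lets customers balance along "any utilization dimension" as described below.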
Designed and led the implementation of a multi-threaded, sharded, distributed load-balancing client-server system, collaborating with partner teams in Canada and Poland. This achieved:
* Cloud customers can now balance service traffic along any utilization dimension.
* Vastly simpler configuration for custom load balancing behavior.
Read more: https://cloud.google.com/kubernetes-engine/docs/how-to/expose-custom-metrics
10x speedup for the network design process and a 15x reduction in memory use for network topology graph data structures.
Wrote graph traversal algorithms to efficiently traverse Google's network topology and detect single points of failure.
Improved automation to re-map ~25% of all edge network customers to greatly improve customer cost center allocation.
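Single points of failure in a network topology correspond to the graph's articulation points, which the classic Tarjan DFS algorithm finds in linear time. The sketch below uses a toy topology with hypothetical node names; it illustrates the standard algorithm, not Google's actual traversal code:

```python
def articulation_points(graph):
    """Return nodes whose removal disconnects the graph (Tarjan's algorithm)."""
    disc, low, points = {}, {}, set()
    timer = [0]

    def dfs(u, parent):
        disc[u] = low[u] = timer[0]
        timer[0] += 1
        children = 0
        for v in graph[u]:
            if v == parent:
                continue
            if v in disc:  # back edge: reachable ancestor
                low[u] = min(low[u], disc[v])
            else:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # Non-root u is an articulation point if no descendant of v
                # can reach an ancestor of u without going through u.
                if parent is not None and low[v] >= disc[u]:
                    points.add(u)
        if parent is None and children > 1:  # root with >1 DFS subtree
            points.add(u)

    for node in graph:
        if node not in disc:
            dfs(node, None)
    return points

# Toy topology: A-B-C triangle with D hanging off C; C is the single point of failure.
topology = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
spofs = articulation_points(topology)
```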
2021 — 2021
Sunnyvale, CA
I worked on the Cloud Replication Services team and developed primarily in C/C++ on low-level kernel processes. My key contributions include:
3x speedup for OS compile-update-reboot time for ONTAP virtual machines
Automated ONTAP cloud cluster setup configuration for dynamic load testing
2.7x speedup for WAFL scheduler client I/O latency with minimal slowdown (0.8x) to WAFL scheduler replication operations latency
Further improved these to a 3.5x speedup for client I/O and a 1.2x speedup for replication operations using online random-forest server-load prediction
2020 — 2020
As part of a student-led start-up tackling contact-tracing challenges in countries where most people do not use smartphones, I worked closely with my teammates to build SMS and Android contact-tracing applications. I was responsible for implementing intuitive user interfaces, efficient privacy-preserving contact-tracing database systems to facilitate safe social distancing, and fast algorithms to find points of potential infection between people.
Key contributions:
Implemented Android and SMS (for non-smartphone users) contact tracing application for COVID-19 in Senegal
Designed NoSQL, PostgreSQL databases along with AWS Lambda RESTful API for TCN protocol contact tracing
Directed and implemented all permissions, roles within ZODAJ for Amazon AWS databases
Created overlap algorithm for n people in a store that runs in O(n log n + k), where k is the number of overlaps
Wrote graph algorithms and discrete-time Markov chain SIR epidemic models for tracking infections
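The overlap detection above can be sketched as an event-sorted sweep line: sorting arrival/departure events costs O(n log n), and emitting the k overlapping pairs costs O(k). The names and intervals below are illustrative, not from the real system:

```python
def find_overlaps(visits):
    """Given (person, start, end) visit intervals, return pairs of people
    whose intervals overlap, using a sweep line over sorted events."""
    events = []
    for person, start, end in visits:
        events.append((start, 0, person))  # kind 0 = arrival
        events.append((end, 1, person))    # kind 1 = departure
    # Arrivals sort before departures at the same timestamp, so a
    # boundary touch counts as contact (conservative for tracing).
    events.sort()
    active, overlaps = set(), []
    for _, kind, person in events:
        if kind == 0:
            for other in active:  # everyone currently present overlaps
                overlaps.append(tuple(sorted((person, other))))
            active.add(person)
        else:
            active.discard(person)
    return overlaps

contacts = find_overlaps([("alice", 1, 5), ("bob", 3, 7), ("carol", 8, 9)])
```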
Singapore
As part of the Open Source Desktop team, I maintained and extended a customized distribution of Ubuntu that interfaces with Bosch's network security layers and ships pre-made configurations, letting Bosch's Linux developers set up and use Ubuntu more efficiently and safely within Bosch's developer ecosystem. I also implemented build, testing, and deployment improvements for our product pipeline, and was responsible for both front-end and back-end implementation projects.
Key contributions:
Created efficient searchable user interface for developers to install their chosen ISO file via network boot (HTA, Jinja, JavaScript) from internal servers
Enabled Jenkins server to run two ISO builds in parallel (1 executor, 2 slaves) for testing via dynamic port allocation
Enabled automatic mounting of QEMU disk on Jenkins server to extract logs on initial startup post-installation for quicker debugging on failed build pipelines (Bash)
Created package to automate unit tests on newly installed ISO files on physical and virtual machines for standard functionality such as web and email access and having the latest security/proxy configurations (Bash, Python)
Modified Jenkins pipeline build steps for quicker deployment testing by allowing the ISO installation on the RAM disk to be bypassed, reducing testing time from 20 minutes to 1 minute
Began migration of current inventory servers to new OCS inventory servers via Ansible for seamless configuration
2017 — 2018
As part of the Analytics team, I was responsible for automating day-to-day processes in order to improve workflow of Eyeota's analytics department.
Key contributions:
Automated data cleaning and used Levenshtein distance and regular expressions on text data from programmatic buy-side platforms to classify advertiser and buyer data, enabling the business analytics team to track audience data sales
Extracted advertiser behavioral trends such as target demographics via web-scraping and from buyer data
Automated update process of daily, weekly, and monthly revenue generated from audience data sales
Automated formatting of .csv files of audience data sales across programmatic platforms such as Google DoubleClick Bid Manager, Adform, and AppNexus, for use in further data processing
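The Levenshtein-plus-regex classification above can be sketched as a normalize-then-edit-distance pipeline. The advertiser names, suffix list, and distance threshold here are illustrative assumptions, not Eyeota's actual rules:

```python
import re

def levenshtein(a, b):
    """Edit distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,        # deletion
                           cur[j - 1] + 1,     # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def normalize(name):
    """Lowercase, strip punctuation, and drop common corporate suffixes."""
    name = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return re.sub(r"\b(inc|ltd|llc|corp)\b", "", name).strip()

def classify(raw_name, known_advertisers, max_distance=2):
    """Map a raw buy-side platform name to the closest known advertiser."""
    target = normalize(raw_name)
    best = min(known_advertisers, key=lambda k: levenshtein(target, normalize(k)))
    return best if levenshtein(target, normalize(best)) <= max_distance else None
```

Normalizing first keeps the edit-distance threshold small and stable, since most variance in platform exports comes from punctuation and legal suffixes rather than the core name.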
Education
Carnegie Mellon University
Bachelor of Science - BS
Hwa Chong Institution