# Nicklaus Choo

> AI/ML Server Load Balancing Optimizations @ Google (GSLB) | CS @ CMU ’22

Location: San Francisco Bay Area, United States

Profile: https://flows.cv/nicklaus

I work on advanced load balancing optimizations for Google's AI/ML model serving and all of Google's traffic. Read more about my latest contributions at:

- https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-gke-inference-gateway
- https://docs.cloud.google.com/load-balancing/docs/service-lb-policy

## Work Experience

### Software Engineer @ Google

Jan 2022 – Present | Sunnyvale, CA

I develop primarily in C++, Python, and Golang.

- Designed and implemented load balancing for ML/server workloads using GPU/TPU/KV-cache utilization. This achieved:
  * 96% faster Time-to-First-Token P90 latency.
  * 60% faster Normalized-Time-Per-Output-Token P90 latency.
  * 32% faster overall throughput for prefix-heavy workloads.

  Read more: https://cloud.google.com/kubernetes-engine/docs/concepts/about-gke-inference-gateway
- Designed and led the implementation of a multi-threaded, concurrent, sharded, distributed load balancing client-server system while working with partner teams in Canada and Poland. This achieved:
  * Cloud customers can balance service traffic along any utilization dimension.
  * Vastly simpler configuration for custom load balancing behavior.

  Read more: https://cloud.google.com/kubernetes-engine/docs/how-to/expose-custom-metrics
- 10x speedup for network design process with 15x reduction in memory resources for network topology graph data structures.
- Wrote graph traversal algorithms to efficiently traverse Google's network topology and detect single points of failure.
- Improved automation to re-map ~25% of all edge network customers to greatly improve customer cost center allocation.
### Software Engineer Intern @ NetApp

Jan 2021 – Jan 2021 | Sunnyvale, California, United States

I worked on the Cloud Replications Services team and developed primarily in C/C++ on low-level kernel processes. My key contributions include:

- 3x speedup for OS compile-update-reboot time for ONTAP virtual machines
- Automated ONTAP cloud cluster setup configuration for dynamic load testing
- 2.7x speedup for WAFL scheduler client I/O latency with minimal slowdown (0.8x) to WAFL scheduler replication operations latency
- Further optimized to a 3.5x speedup for client I/O and a 1.2x speedup for replication operations with online random forest server load prediction

### Software Engineer Intern @ ZODAJ

Jan 2020 – Jan 2020

As part of a student-led start-up tackling contact-tracing challenges in countries where the majority of people do not use smartphones, I worked closely with my teammates to build an SMS contact-tracing and Android contact-tracing application. I was responsible for implementing intuitive user interfaces, efficient and privacy-preserving contact-tracing database systems to facilitate safe social distancing, and fast algorithms to find points of potential infection between people.
Key contributions:

- Implemented Android and SMS (for non-smartphone users) contact tracing application for COVID-19 in Senegal
- Designed NoSQL and PostgreSQL databases along with an AWS Lambda RESTful API for TCN protocol contact tracing
- Directed and implemented all permissions and roles within ZODAJ for Amazon AWS databases
- Created overlap algorithm for n people in a store that runs in O(max(k, n log n)), where k is the number of overlaps
- Wrote graph algorithms and discrete-time Markov chain SIR epidemic modeling for tracking infections

### Linux System Engineer Intern (DevOps) @ Bosch ASEAN

Jan 2019 – Jan 2019 | Singapore

As part of the Open Source Desktop team, I maintained existing features and developed new ones for a customized distribution of Ubuntu. The distribution interfaces with Bosch's network security layers and offers pre-made configurations so that Bosch's Linux developers can set up and use Ubuntu more efficiently and safely within Bosch's developer ecosystem. I also implemented build, testing, and deployment improvements for our product pipeline. I was responsible for both front-end and back-end implementation projects.
Key contributions:

- Created efficient searchable user interface for developers to install their chosen ISO file via network boot (HTA, Jinja, JavaScript) from internal servers
- Enabled Jenkins server to run two ISO builds in parallel (1 executor, 2 slaves) for testing via dynamic port allocation
- Enabled automatic mounting of QEMU disk on Jenkins server to extract logs on initial startup post-installation for quicker debugging of failed build pipelines (Bash)
- Created package to automate unit tests on newly installed ISO files on physical and virtual machines for standard functionality such as web and email access and having the latest security/proxy configurations (Bash, Python)
- Modified Jenkins pipeline build steps for quicker deployment testing (by making it possible to bypass ISO installation on RAM disk, which reduces testing time from 20 mins to 1 min)
- Began migration of current inventory servers to new OCS inventory servers via Ansible for seamless configuration

### Data Analytics Intern @ Eyeota

Jan 2017 – Jan 2018

As part of the Analytics team, I was responsible for automating day-to-day processes to improve the workflow of Eyeota's analytics department.
Key contributions:

- Automated data cleaning and used Levenshtein distance and regex on text data from programmatic buy-side platforms to classify advertiser and buyer data for the business analytics team to track audience data sales
- Extracted advertiser behavioral trends such as target demographics via web-scraping and from buyer data
- Automated update process of daily, weekly, and monthly revenue generated from audience data sales
- Automated formatting of .csv files of audience data sales across different programmatic platforms such as Google DoubleClick Bid Manager, Adform, and AppNexus, for use in further data processing

### Platoon Sergeant @ Singapore Armed Forces (SAF)

Jan 2016 – Jan 2017 | Singapore

Served as Platoon Sergeant at Transportation Node, Nee Soon Camp, as part of mandatory military service. I used MS Excel formulae and Visual Basic subroutines to build logistics automation systems from scratch, which include:

- Automated vehicle accounting (available, under repairs, loaned out, on missions). Over 20 vehicle types and 400 individual units accounted for daily. Improved count time from 45 mins to 5 mins.
- Intuitive monthly soldier activity forecast: training, driving, on prolonged missions, on leave.
- Daily attendance system to automatically categorize soldier status based on ‘remarks’ column.
- Personal spreadsheets for soldiers to track and calculate driven mileage on different vehicle classes to speed up administrative processes when they complete their mandatory military service.

### Machine Learning Research Intern @ Singapore Institute of Manufacturing Technology

Jan 2014 – Jan 2014 | Singapore

Developed a MATLAB system using k-means clustering, partial least squares regression, extreme learning machine, and multiple linear regression to predict ink cartridge lifespans for research and development purposes, removing the need for multiple rounds of destructive batch-testing.
## Education

### Bachelor of Science - BS in Computer Science

Carnegie Mellon University

### Singapore Cambridge GCE A-Levels in Physics, Chemistry, Mathematics, Knowledge & Inquiry

Hwa Chong Institution

## Contact & Social

- LinkedIn: https://linkedin.com/in/nicklaus-choo

---

Source: https://flows.cv/nicklaus

JSON Resume: https://flows.cv/nicklaus/resume.json

Last updated: 2026-03-29