Developed the cloud platform and infrastructure - NVIDIA GPU Network (NGN) - for NVIDIA's cloud gaming service, GeForce NOW (GFN)
Implemented a GitOps approach alongside the existing CI/CD pipeline for bringing up and maintaining cluster state using GitLab, Quay.io, Jenkins, Flux, and Kustomize.
Developed distributed system logging for Kubernetes, automating logging and security auditing with third-party technology stacks such as Kafka, Elasticsearch, Splunk, and AWS enterprise resources. Implemented a Lumberjack protocol output plugin for Fluent Bit for digesting, constructing, and delivering container and system logs.
Built a Kubernetes scheduler extender that bin-packs VMs by image (guest OS), GPU type, and other resources, reducing the licensing cost per new VM request. The extender also provides high availability across multi-layered fault domains and supports batch scheduling of VMs.