Passionate about Machine Learning applications in Media & Technology.
Experience
2022 — Now
• Hybrid Model Registry
Integrating model registries across cloud environments to enable a seamless hybrid cloud experience, with smooth model management and deployment.
• Model Registry Public-Facing API
Led the design and implementation of Cloudera's AI Registry public-facing API, a critical foundation for MLOps in the serving platform. This registry integrates with MLflow and allows importing external models, including Hugging Face and Nvidia NIM models, to streamline model management and deployment.
• Business User Experience for Cloudera Machine Learning
Delivered a seamless, intuitive interface for business users, improving accessibility and driving adoption of Cloudera Machine Learning solutions.
• Conducted Proof of Concept for Feature Store Integration
Led a Proof of Concept (PoC) for integrating Feast, an open-source feature store, with Cloudera Machine Learning. As part of this effort, evaluated various databases for online and offline stores, including PhoenixDB, HBase, Impala, and Iceberg, ensuring optimal performance and scalability for feature engineering workflows.
2020 — Now
San Francisco Bay Area
Lifecycle management server for Cloudera Data Science Workbench (CDSW) running on public cloud (Cloudera Data Platform).
• Provision cloud resources (e.g., EFS) and install CDSW in the customer's cloud environment
• Suspend/Resume CDSW with cloud autoscaler
• Upgrade Kubernetes for CDSW
POC for Istio service mesh integration with CDSW
• Automated Istio / Istio CNI installation
• ~90% mutual TLS coverage
2018 — 2020
San Francisco Bay Area
Cloudera Data Science Workbench (CDSW) - A web application composed of microservices written in Go / Node.js running in a Kubernetes cluster.
Worked on general backend development for on-prem CDSW as well as the integration work for cloud CDSW based on AWS tech stack.
• Hardened the Virtual File System's gRPC file streaming operations, fixed various resource leaks and a symlink vulnerability, and set up the gRPC server/client tests from scratch.
• Unified the Go package management system across ~20 components with Go modules. Reduced the build time to one third and improved network stability by using an internal proxy registry.
• Integrated CDSW SSO with internal identity provider (SAML).
• Implemented a feature flag to enable or disable file upload and download.
Tech stack: GoLang, Docker, Kubernetes, AWS, gRPC, Helm, PostgreSQL, Node.js
2017 — 2017
Palo Alto, California
Built an Analyzer backend service for VM instance recommendation as part of the proof-of-concept development. (https://github.com/Hyperpilotio/analyzer)
• Deployed containers to cluster and benchmarked system/service level metrics under workloads.
• Implemented the recommendation search strategy and visualized the behavior of the ML algorithm.
• Maintained Analyzer service API that communicates with front-end and load test service.
Tech stack: Django, MongoDB, Docker, Kubernetes, Intel-snap, Kafka, SciPy
2014 — 2015
Taipei City, Taiwan
Developed a musical onset detection algorithm and improved the performance by 7%. The work was published in IEEE Signal Processing Letters, 2015. (https://github.com/cheyuanl/OnsetDetectorCLR)
• Researched audio signal reconstruction error using constrained (sparsity/non-negativity) least squares.
Education
Carnegie Mellon University
Master's degree
National Chiao Tung University