# Johnny Shang

> Staff Capacity Engineer at Credit Karma

Location: Oakland, California, United States

Profile: https://flows.cv/johnnyshang

## Major Valuable Projects in Career

At Credit Karma, I implemented a Capacity Data Pipeline that collects a wide range of data to improve capacity planning efficiency and accuracy, saving substantial engineering effort. I also completed a POC (Proof of Concept) that leverages Argo Rollouts in K8s for automated stress testing in the prod env and helped the dev team with the functional design, which reduced the stress testing LOE (Level of Effort) from 2 hours to just 1 minute.

At Box, I investigated the Intel CPU Spectre/Meltdown vulnerabilities in depth, scheduled canary testing in the prod env for different application workloads, and then came up with multiple mitigation approaches, which eventually saved Box tens of millions of dollars. To fulfill the company-level cost-saving OKR and accommodate the on-prem to public cloud migration, I identified and reclaimed hundreds of zombie servers across different SKU types. I also advised different functional teams on making the best use of existing capacity. Together, these efforts avoided nearly all new server purchases in 2020, with a target of zero purchases in 2021.

At PayPal, after the split from eBay, I rebuilt the whole capacity team with my manager and expanded the team's coverage from front-end capacity planning to the IaaS platform, middle tier, database tier, and backend storage.

At eBay, I built a dedicated capacity data warehouse from scratch and developed Frontend & Database capacity self-service tools to streamline capacity cost/impact estimation.
## Work Experience

### Staff Capacity Engineer @ Credit Karma

Jan 2021 – Present

1) Capacity Forecasting: Use domain knowledge and machine learning models to forecast annual capacity demands for various GCP services and projects, including GCE, GKE, BQ, GCS, etc., and then align with FP&A for budget planning.
2) Peak Capacity Sizing: Analyze and estimate seasonal peak capacity demand for each service to prevent capacity issues.
3) Cost Efficiency: Identify opportunities for bin packing and rightsizing in GKE to achieve cost savings without impacting service SLA.
4) Site Issues: Follow up on site issues and performance troubleshooting to avoid unnecessary capacity additions.
5) Performance Optimization: Diagnose and optimize performance issues on services based on Java/Node.js/TypeScript.
6) Stress Testing: Automate stress testing via Argo Rollouts in the prod env without affecting the user experience.
7) Capacity Recommendation Engine: Train and leverage machine learning models to implement a capacity recommendation engine that streamlines capacity rightsizing for cost efficiency.
8) Capacity ETL Pipeline: Implement a capacity ETL pipeline to collect the data that capacity and performance planning depend on from multiple sources at appropriate granularity, including K8s performance data, service traffic data, metadata, and more.
9) Capacity Knowledge Bot: Implement a capacity knowledge bot using an LLM on top of the data collected by the ETL pipeline to simplify and improve communication efficiency across teams.

### Staff Capacity Engineer @ Box

Jan 2017 – Jan 2021 | Redwood City, CA

1) Define the DR model and capacity utilization model for site high availability and scalability.
2) Long-term and short-term capacity planning and forecasting for the entire infrastructure.
3) Work with the Supply Chain team to ensure infrastructure purchases are properly planned to meet site growth demand and timeline.
4) Analyze site health periodically and proactively to avoid abnormal, non-linear capacity additions.
5) Site performance issue analysis and troubleshooting to improve capacity utilization efficiency.
6) Capacity rightsizing for all services to prevent server misuse and save cost.
7) Build and operate capacity automation and analytics on multi-terabyte data sets of performance data for the entire infrastructure to ensure efficient infrastructure scaling in public and private clouds.
8) Capacity As A Service framework design and capacity data warehouse buildup to improve capacity planning efficiency.
9) Build a data center migration model on top of the performance data and metadata in the data warehouse to improve capacity utilization efficiency and minimize new server purchases.
10) Server SKU benchmark/stress testing for different application workloads, with performance data analysis and comparison.

### MTS2 Performance Capacity Planning & Analyst @ PayPal

Jan 2015 – Jan 2017 | San Jose, CA

1) Capacity sizing estimation for all site components, including Oracle, SAN, NAS, Front End, middle tier, etc., e.g., estimate how many resources are needed for a new project/feature or for a given increase in traffic.
2) Long-term capacity planning and forecasting: proactively estimate the extra resources required to meet each year's seasonal peak.
3) Performance dashboard development, including Oracle, SAN, NAS, Front End, middle tier, etc.
4) Capacity Self-Service tool development to automate and streamline capacity cost/impact estimation.
5) Correlation analysis between business and system performance metrics for more accurate forecasting.
6) Capacity workflow enhancement to better serve our customers and reduce manual effort.

### MTS1 Performance Capacity Planning & Analyst @ eBay Inc

Jan 2007 – Jan 2014 | Shanghai, China

1) Capacity sizing estimation for all site components, including Oracle, SAN, NAS, Front End, middle tier, etc., e.g., estimate how many resources are needed for a new project/feature or for a given increase in traffic.
2) Long-term capacity planning and forecasting: proactively estimate the extra resources required to meet each year's seasonal peak.
3) Performance dashboard development, including Oracle, SAN, NAS, Front End, middle tier, etc.
4) Capacity Self-Service tool development to automate and streamline capacity cost/impact estimation.
5) Correlation analysis between business and system performance metrics for more accurate forecasting.
6) Capacity workflow enhancement to better serve our customers and reduce manual effort.

### Oracle DBA @ eBaoTech Corporation

Jan 2006 – Jan 2007 | Shanghai, China

1) Oracle database management, troubleshooting, performance tuning, and SQL review.
2) Oracle Application Server (OAS) management and deployment.
3) Deployment and maintenance of testing, staging, UAT, and production environments.
4) Technical support to customers, including Oracle database monitoring, heavy SQL tuning, package deployment, and weekly database health check reports.
5) Project work, including table structure design and PL/SQL and shell script development.

### Oracle DBA @ Hua Analytical Technology

Jan 2004 – Jan 2006 | Shanghai, China

1) Oracle database management, troubleshooting, performance tuning, and SQL review.
2) Maintenance of www.minicrm.com (ASP platform) and www.ez51.com.
3) Re-implement data models generated by SAS in SQL and optimize data processing.
4) System performance data collection and analysis.

## Contact & Social

- LinkedIn: https://linkedin.com/in/johnny-shang-5123aa26

---

Source: https://flows.cv/johnnyshang

JSON Resume: https://flows.cv/johnnyshang/resume.json

Last updated: 2026-04-12