# Derrick Kondo

> Staff AI Engineer, Tech Lead

Location: San Francisco Bay Area, United States
Profile: https://flows.cv/derrickkondo

Tech lead applying GenAI in production to solve business problems and developing GenAI infrastructure to enable others to do the same.

10+ years of experience with a strong bias for swift execution and impactful results.

CS PhD @ UCSD. CS Bachelor’s @ Stanford.

Specialties: LLMs and GenAI agents, ML model training and serving, backend platforms, and data processing

LLM: LLMOps, Azure ChatGPT, GCP Vertex AI / Gemini, LiteLLM, vLLM, LangChain, LangSmith, LangServe, GenAI Agents in LangGraph

ML frameworks: TensorFlow, PyTorch, TorchServe, vLLM, scikit-learn

ML platforms: Azure, GCP Vertex AI, Metaflow, Ray, MLflow, Seldon

ML libraries: Pandas, NumPy

GPU optimization: TensorRT compiler

Orchestration: Kubernetes, Docker

Cloud: AWS, GCP, Azure

Data management: BigQuery, Postgres, MySQL, GCP Cloud SQL

Languages: Python, Java, C

## Work Experience
### Staff Software Engineer, Machine Learning Platform @ Etsy
Jan 2021 – Present
Productionizing GenAI at scale:

● Helping build the company-wide strategy and platform for LLMs and GenAI. "Founding engineer" of the GenAI enablement team.

● Partnering with product marketing and seller teams to initiate applications of GenAI. Productionized the first agentic application in LangGraph.

● Drove engineering from prototype to production for a GenAI application that improves Search Engine Optimization through LLM-based image caption generation across hundreds of millions of listing images. Increased sales volume (GMS) by $25 million.

● Built a no-code tool that enables product experts to prototype and evaluate GenAI use cases.

Cost-efficient and Easy-to-use Machine Learning Platform:

● Tech lead for the team focused on ML model serving and inference.

● Developing a scalable and cost-efficient ML model serving platform that supports hundreds of models at hundreds of thousands of requests per second.

● Led the product for automatically recommending ML model serving configurations. Saved over $23.2 million GMS and reduced ML model deployment time from ~2 weeks to ~2 hours.

### Machine Learning Technical Lead @ 23andMe
Jan 2019 – Jan 2021
Scalable and Reliable Machine Learning Infrastructure:

● Helped lead the creation of a new business unit and machine learning platform for predicting human health.  Collaborated with cross-org data scientists and data engineers to solve cross-functional problems and define processes.

● Led a team of 5 junior and senior engineers.

● Helped design, implement, and operate production ML infrastructure to train, deploy, and serve ML models.

● Implemented scalable end-to-end machine learning pipelines and workflows using Metaflow on AWS.

● Received Engineering Excellence Award in 2020.

### Principal Software Engineer @ WorkSpan
Jan 2018 – Jan 2019
Backend platform engineering:

● Technical lead for a team of 6 senior and junior engineers, planning, designing, and implementing the core backend platform.

● Helped design and build the backend platform on Google Cloud Platform: data pipelines with Google Dataflow/Beam, web services with REST APIs backed by a NoSQL datastore, Google Search API (like Elasticsearch), asynchronous messaging (pub-sub), and data caching.

● Backend engineer #2 at the start-up; helped build the core backend platform and web services used daily by Global 500 companies such as Intel, SAP, and Lenovo.

### Member Of Technical Staff @ WorkSpan
Jan 2015 – Jan 2018

### Senior Software Engineer @ Teradata Aster
Jan 2012 – Jan 2015 | San Francisco Bay Area
Distributed batch data processing:

● Improved SQL Map-Reduce batch job execution in a parallel database (similar to Spark) designed for analytics.

● Helped accelerate Map-Reduce batch jobs by up to 24x using collaborative planning between a SQL query optimizer and user-defined table operators. Contributed job execution and API implementation with collaborative planning.

● Patent US20160239544A1 awarded for collaborative planning for accelerating analytic queries.

● Received Outstanding Corporate Award.

### Tenured Research Scientist @ INRIA
Jan 2007 – Jan 2012
Machine learning, data analytics, statistical analysis:

● Designed machine learning methods to predict host availability in a 10,000+ node distributed system. Improved accuracy of failure prediction by 50%.

● Built and led a team focused on statistical and predictive models of distributed systems.

● Led and helped construct the Failure Trace Archive (http://fta.inria.fr), a public repository of failure traces from distributed systems, along with analytical tools. Conducted ETL of log traces for systems with more than 10,000 hosts.

Performance modeling of distributed and parallel systems:

● Developed performance models of batch applications from machine data of large parallel and shared distributed systems. Enabled prediction of application performance for high-throughput applications.

### Postdoctoral Researcher @ INRIA
Jan 2005 – Jan 2007


## Education
### Bachelor of Science (B.S.) in Computer Science
Stanford University

### Doctor of Philosophy (PhD) in Computer Science
UC San Diego


## Contact & Social
- LinkedIn: https://linkedin.com/in/derrick-kondo

---
Source: https://flows.cv/derrickkondo
JSON Resume: https://flows.cv/derrickkondo/resume.json
Last updated: 2026-04-12