# Derrick Kondo

> Staff AI Engineer, Tech Lead

Location: San Francisco Bay Area, United States

Profile: https://flows.cv/derrickkondo

Tech lead applying GenAI in production to solve business problems and developing GenAI infrastructure to enable others to do the same. 10+ years of experience with a strong bias for swift execution and impactful results. CS PhD @ UCSD. CS Bachelor’s @ Stanford.

Specialties: LLMs and GenAI [Agents], ML model training and serving, backend platforms and data processing

- LLM: LLMOps, Azure ChatGPT, GCP Vertex AI / Gemini, LiteLLM, vLLM, LangChain, LangSmith, LangServe, GenAI Agents in LangGraph
- ML frameworks: TensorFlow, PyTorch, TorchServe, vLLM, scikit-learn
- ML platforms: Azure, GCP Vertex AI, Metaflow, Ray, MLflow, Seldon
- ML libraries: Pandas, NumPy
- GPU optimization: TensorRT compiler
- Orchestration: Kubernetes, Docker
- Cloud: AWS, GCP, Azure
- Data management: BigQuery, Postgres, MySQL, GCP CloudSQL
- Languages: Python, Java, C

## Work Experience

### Staff Software Engineer, Machine Learning Platform @ Etsy

Jan 2021 – Present

Productionizing GenAI at scale:

● Helping build company-wide strategy and platform for LLMs and GenAI. "Founding engineer" of GenAI enablement team.
● Partnering with product marketing and seller teams to initiate applications of GenAI. Productionized first agentic application in LangGraph.
● Drove engineering from prototype to production applying GenAI to improve Search Engine Optimization, conducting LLM-based image caption generation across hundreds of millions of listing images. Increased sales volume (GMS) by $25 million.
● Built a no-code tool that enables product experts to prototype and evaluate GenAI use cases.

Cost-efficient and Easy-to-use Machine Learning Platform:

● Tech lead for team focused on ML model serving and inference.
● Developing scalable and cost-efficient ML model serving platform, supporting hundreds of models at hundreds of thousands of requests per second.
● Led product for automatically recommending configurations for ML model serving. Saved over $23.2 million GMS and reduced ML model deployment time from ~2 weeks to ~2 hours.

### Machine Learning Technical Lead @ 23andMe

Jan 2019 – Jan 2021

Scalable and Reliable Machine Learning Infrastructure:

● Helped lead the creation of a new business unit and machine learning platform for predicting human health. Collaborated with cross-org data scientists and data engineers to solve cross-functional problems and define processes.
● Led a team of 5 junior and senior engineers.
● Helped design, implement, and operate production ML infrastructure to train, deploy, and serve ML models.
● Implemented scalable end-to-end machine learning pipelines and workflows using Metaflow over AWS.
● Received Engineering Excellence Award in 2020.

### Principal Software Engineer @ WorkSpan

Jan 2018 – Jan 2019

Backend platform engineering:

● Technical lead for a team of 6 senior/junior engineers, planning, designing, and implementing the core backend platform.
● Helped design and build the backend platform on Google Cloud Platform: data pipelines with Google Dataflow/Beam, web services with REST APIs backed by a NoSQL datastore, Google Search API (like Elasticsearch), asynchronous messaging (pub-sub), and data caching.
● Backend engineer #2 at the start-up; helped build the core backend platform and web services used daily by Global 500 companies such as Intel, SAP, and Lenovo.

### Member Of Technical Staff @ WorkSpan

Jan 2015 – Jan 2018

### Senior Software Engineer @ Teradata Aster

Jan 2012 – Jan 2015 | San Francisco Bay Area

Distributed batch data processing:

● Improved SQL Map-Reduce batch job execution in a parallel database (similar to Spark) designed for analytics.
● Helped accelerate Map-Reduce batch jobs by up to 24x using collaborative planning between a SQL query optimizer and user-defined table operators. Contributed job execution and API implementation with collaborative planning.
● Patent US20160239544A1 awarded for collaborative planning for accelerating analytic queries.
● Received Outstanding Corporate Award.

### Tenured Research Scientist @ INRIA

Jan 2007 – Jan 2012

Machine learning, data analytics, statistical analysis:

● Designed machine learning methods to predict host availability in a 10,000+ node distributed system. Improved accuracy of failure prediction by 50%.
● Built and led team focused on statistical and predictive models of distributed systems.
● Led and helped construct the Failure Trace Archive (http://fta.inria.fr), a public repository of failure traces of distributed systems, and analytical tools. Conducted ETL of log traces for systems with as many as 10,000+ hosts.

Performance modeling of distributed and parallel systems:

● Developed performance models of batch applications from machine data of large parallel and shared distributed systems. Enabled prediction of application performance for high-throughput applications.

### Postdoctoral Researcher @ INRIA

Jan 2005 – Jan 2007

## Education

### Bachelor of Science (B.S.) in Computer Science

Stanford University

### Doctor of Philosophy (PhD) in Computer Science

UC San Diego

## Contact & Social

- LinkedIn: https://linkedin.com/in/derrick-kondo

---

Source: https://flows.cv/derrickkondo

JSON Resume: https://flows.cv/derrickkondo/resume.json

Last updated: 2026-04-12