Versatile machine learning expert with 10+ years of experience creating AI applications that solve complex real-world problems.

Studio Xolo, Senior Software Engineer

2019 — Now

Remote

Building GrocerBird.com, a Django (Python) and JavaScript app that uses Vision Language Models to turn grocery receipts into actionable insights for optimizing grocery spending.

Built Hoodkit, an aggregated database and API based on 47 NYC real estate data feeds to accelerate investment property research, using Apache Airflow and SQL for data ingestion and normalization, AWS RDS (PostgreSQL) for storage, FastAPI for the REST API backend service, and JSON Web Tokens for secure integration with the frontend service.

Created RoughCut, an AI-assisted documentary editing app written in Python and JavaScript (React) that uses automatic transcripts and video search, built on Google Cloud API and a custom search engine, to make editing 10 times faster.

Prototyped a real-time object detection model for inspection and surveillance on NVIDIA Jetson-equipped drones using PyTorch and the NVIDIA DeepStream SDK in C++.

Refactored and modularized a Multi-Agent Reinforcement Learning research codebase to increase experiment iteration speed, and introduced GitHub pull request and code review best practices to a team of academic researchers.

Mentoring machine learning leaders and engineers.

Turing Labs Inc., Lead Machine Learning Engineer

2020 — 2024

Remote

As the first ML Engineer, designed and built a probabilistic AI system from scratch to help CPG R&D leaders reduce testing, decrease formulation cost, optimize product performance, and find new ingredients, delivering millions of dollars in business impact.

As a tech lead manager, recruited and led a 5-person ML Engineering team including research scientists, machine learning engineers, and software engineers in a remote-first culture across four time zones with a focus on diversity, inclusion, psychological safety, and best practices. Led ML development strategy and roadmap, and cross-functional collaboration with Web, Product and Customer Success teams.

Built a REST API backend service using Python, FastAPI and PostgreSQL to enable automated model training, model deployment and inference, as well as Bayesian Optimization, reducing time to experiment by 90%.

Designed and implemented AI system architecture and infrastructure with Docker and AWS (ECS, RDS, S3) to work with an existing Web application in AWS. Prepared cloud and application architecture diagrams for due diligence.

Leveraged Large Language Models via OpenAI API to build a quantitative Knowledge Graph of formulation science to solve the cold start problem.

Prototyped a new AI feature using OpenAI API and Streamlit, leveraging Large Language Models for expressing complex optimization objectives and constraints.

Parallelized continuous integration tests using pytest to reduce build time by 50%.

Phiar, Co-Founder (YC S18, acquired by Google)

2018 — 2019

San Francisco Bay Area

Built and led a 6-person agile R&D team of software engineers and 3D perception experts. Led technical strategy and development, and technical due diligence.

Researched, implemented, and deployed efficient Deep Neural Networks achieving real-time semantic segmentation on mobile devices using PyTorch, ONNX, Swift, and Apple's Metal framework.

Created a Deep Neural Network training pipeline in PyTorch optimized for research iteration speed, experimented with cutting-edge algorithms, and leveraged Git branches for reproducible experiments.

Co-authored 2 patents for augmented reality navigation systems integrating computer vision and advanced driver assistance functionalities.

Accepted into the leading startup accelerator Y Combinator (S18). Acquired by Google.

Shutterstock, Machine Learning Engineer

2015 — 2018

New York, New York, United States

Implemented a large-scale (100M+) next-generation visual search engine consisting of a custom Vector Database for image embeddings, a retrieval algorithm written in C++ with gRPC, and a REST API written in Python. Containerized it using Docker, set up continuous integration on GitHub, and deployed to AWS ECS.

Built an experimental search ranker using a machine learning model, including a data pipeline written in Apache Spark (Scala) and SQL (Hive on Hadoop) for gathering large amounts of historical data for training, and a REST API service written in Python.

Prototyped an image auto-tagging algorithm, implemented a web demo using React and Flask, and worked with an iOS app developer on mobile app integration to present a joint demo at a company hackathon. Led the production implementation of the image auto-tagging feature, created the backend API design doc, and worked with Mobile and Web product managers and developers on integration. This feature gained press coverage and increased mobile content submission by 60%.

Prototyped a visual clustering algorithm to improve the diversity of image search results and make it easier to find the right content, which won a company hackathon.

Prototyped the back end of a visual search Chrome plugin for migrating customers' saved assets from competing marketplaces, won a company hackathon with it, and patented the algorithm.

Trained and evaluated Deep Learning models for image understanding tasks using PyTorch and TensorFlow.

SOCURE, Data Scientist

2014 — 2014

New York, New York, United States

As an early employee, researched and developed a novel real-time face similarity scoring algorithm written in Java and Scala, achieving 0.83 AUC and improving the identity fraud detection rate by 10%.

Built an internal data annotation web app using JavaScript and Node.js for evaluating the performance of the face similarity scoring algorithm.

Contributed an algorithm bug fix to the open source OpenCV project on GitHub.

Education

2012 — 2014

Columbia University

Master's degree

2018 — 2018

Y Combinator

2008 — 2012

Guangdong University of Foreign Studies

Bachelor of Science (BS)
