Notable Projects:
FashionGen
• Built a multi-model AI pipeline (segmentation models for bounding boxes -> parallelized BLIP image captioning -> Stable Diffusion finetuning)
• Built an interface allowing users to generate novel outfits
• Indexed CLIP embeddings of fashion images for text and visual semantic search of fashion items
• Ran multi-GPU training on AWS EC2
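The semantic-search bullet above can be sketched as a cosine-similarity lookup over an index of precomputed CLIP embeddings. This is a minimal illustration, not the project's code: the random vectors stand in for real CLIP image/text embeddings, and the 512-dim size is an assumption.

```python
import numpy as np

def normalize(v):
    # CLIP retrieval uses cosine similarity, so store unit vectors
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
index = normalize(rng.standard_normal((1000, 512)))  # stand-in image embeddings
query = normalize(rng.standard_normal(512))          # stand-in text/image query

scores = index @ query             # dot product of unit vectors = cosine similarity
top_k = np.argsort(-scores)[:5]    # indices of the 5 closest items
```

Because both text and image queries land in the same CLIP embedding space, the same index serves both search modes.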
Virtual Clothes Try-On (Dreambooth)
• Finetuned Stable Diffusion on 10-20 images of individual clothing articles, letting users apply a real outfit to a picture of themselves
• Built an automated pipeline to scrape images from a product page, process them, and launch distributed training
• Used textual-inversion and prompt engineering to optimize results
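The scrape -> process -> train pipeline above can be sketched as a job builder. Everything here is hypothetical scaffolding: the function names, the `sks` rare-token convention (standard in Dreambooth-style finetuning), and the stubbed scraper are illustrations, not the actual code.

```python
from dataclasses import dataclass, field

@dataclass
class TrainJob:
    instance_prompt: str                       # e.g. "a photo of sks jacket"
    image_paths: list = field(default_factory=list)
    num_gpus: int = 4                          # assumed distributed-training setting

def scrape_product_images(product_url: str) -> list:
    """Placeholder: fetch the 10-20 product photos from a product page."""
    return [f"{product_url}/img_{i}.jpg" for i in range(12)]

def build_job(product_url: str, token: str = "sks") -> TrainJob:
    paths = scrape_product_images(product_url)
    # Dreambooth binds a rare token to the specific clothing article,
    # so prompts like "person wearing sks jacket" recall the finetuned item
    return TrainJob(instance_prompt=f"a photo of {token} jacket",
                    image_paths=paths)

job = build_job("https://example.com/jacket")
```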
Latent Face Editor
• Built a full web UI for face editing
• Implemented adversarial gradient-masked loss to allow localized edits
• Wrote custom backpropagation hooks for the PyTorch autograd engine
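The gradient-masking idea behind localized edits can be shown with a minimal PyTorch autograd hook. This is a toy sketch, not the editor's actual loss: the hook zeroes gradients outside an edit mask, so optimizer updates leave the protected region of the latent untouched.

```python
import torch

latent = torch.zeros(8, requires_grad=True)
mask = torch.zeros(8)
mask[:4] = 1.0  # only the first half of the latent is editable

# register_hook runs during backprop and rewrites the incoming gradient;
# multiplying by the mask blocks updates to the masked-out region
latent.register_hook(lambda grad: grad * mask)

loss = (latent - 1.0).pow(2).sum()
loss.backward()
# latent.grad is nonzero only where mask == 1
```

In a real latent face editor the mask would cover the facial region being edited (e.g. the mouth), keeping identity and background stable.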
Color Diffusion
• World's first diffusion model to tackle image colorization
• Gained hands-on experience training conditional diffusion models from scratch
• Explored model pruning, hyperparameter search, and hardware-specific optimization
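The core training step for a colorization diffusion model can be sketched as the standard forward-noising process applied to the color channels. The linear beta schedule and Lab-space framing here are assumptions for illustration; the model would learn to predict the noise given the noisy ab channels plus the grayscale L channel as conditioning.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)       # assumed linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)      # cumulative signal retention

def q_sample(x0, t, rng):
    """Forward process: x_t = sqrt(ab_t) * x0 + sqrt(1 - ab_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

rng = np.random.default_rng(0)
ab_channels = rng.standard_normal((64, 64, 2))  # stand-in ab color channels (Lab space)
x_t = q_sample(ab_channels, t=500, rng=rng)     # noisy colors the model learns to denoise
```

Conditioning on the L (lightness) channel means the model only has to generate plausible color, not structure.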
VQGAN-CLIP reimplementation
• Simple implementation focused on readability and ease-of-use: 3 lines of code to generate or edit an image
• Designed to be clean, modular, and extensible
• Featured as an official research project in the Hugging Face Transformers repo
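The "3 lines of code" claim above might look something like the following. The class and method names are hypothetical, not the published API; the stub stands in for the real latent-optimization loop (VQGAN latents optimized against CLIP text-image similarity).

```python
class VQGANCLIP:
    """Illustrative wrapper; real version loads VQGAN + CLIP weights."""
    def __init__(self, iterations: int = 200):
        self.iterations = iterations

    def generate(self, prompt: str) -> dict:
        # Real version: iteratively optimize VQGAN latents to maximize
        # CLIP similarity between the decoded image and the text prompt
        return {"prompt": prompt, "steps": self.iterations}

# The three-line usage the README advertises:
model = VQGANCLIP()
image = model.generate("a red wool coat")
```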