San Francisco, California, United States
• Implement and evaluate the performance of 6+ word embedding models (Flag Embedding, E5 models) on a semantic similarity task in an ad search system
• Use PyTorch, Sentence Transformers, and Lambda’s cloud GPUs to fine-tune word embedding models, reducing training time by 60%
• Develop and optimize data pipelines that preprocess textual data for fine-tuning
• Apply exploratory data analysis, statistical analysis, and visualization to understand the textual training data