Greater Toronto Area, Canada
• Data Science and Data Analytics Certificate Program, UC Santa Cruz Silicon Valley Extension
• Python for Data Analysis: Medical cost case study, Airport case study
• Techniques: data collection, data cleaning, data analysis, and data visualization
• Machine Learning
• Housing price prediction using linear regression with degree-k polynomial features
• Feature selection using the chi-squared test and PCA
• Anomaly Detection in Network Data using XGBoost, GANs, and Autoencoders
• Fine-tuning of LLMs using LoRA and the Hugging Face Supervised Fine-tuning (SFT) Trainer
• Fine-tuning of embeddings using LlamaIndex + Retrieval-Augmented Generation (RAG)
• GANs for Data Synthesis
• Enhancement of CGAN, DCGAN, WGAN, pix2pix, and CycleGAN for data synthesis
• Training of the enhanced models
• Big data system design
• Designed a big data system for Twitter’s front page loading and geolocation-based emergency detection.
• Conducted business requirements analysis, functional and non-functional design, data schema design, and system workflow design.
• Knowledge of Kafka, Spark, Hadoop, and MySQL.
• Deep Learning and Artificial Intelligence
• Perception: CNN, DNN, and transfer learning
• NLP: RNN, LSTM, BERT, and word embeddings
• Car Navigation System
• Conducted quantitative analysis of INS/GNSS availability for self-driving vehicles in urban environments.
• Evaluated the performance of high-precision GNSS receivers for self-driving vehicles.