I’m a Senior AI/ML Engineer with 11+ years of experience building real-world AI and data solutions across healthcare, finance, insurance, and enterprise environments. My work focuses on solving complex business problems using machine learning, data science, and modern Generative AI approaches.
Experience
2023 — Now
Westlake Village, California, United States
• Built GenAI pipelines (GPT-4, Llama, Hugging Face) to extract underwriting, claims, and exposure data from complex insurance documents, reducing manual review time by ~70%.
• Designed RAG & Agentic AI systems (Pinecone, Weaviate, LangChain) that ground LLM responses in carrier-approved policy, endorsement, and claims data, cutting hallucinations by 60%+.
• Integrated LLM-generated features into XGBoost & LightGBM models to improve fraud detection, claim severity prediction, and underwriting risk scoring.
• Deployed FastAPI-based AI platforms with PII/PHI redaction, guardrails, and evaluation metrics, enabling secure real-time policy Q&A, clause checks, and document summarization.
2022 — 2023
Stamford, CT
• Built TensorFlow & PyTorch AI pipelines on DOCSIS, RF, and CMTS telemetry to detect network faults and signal degradation, improving fault-recognition accuracy by 26% across high-density service groups.
• Applied BERT & DistilBERT NLP to modem logs, field tickets, and CMTS alerts, enabling automated root-cause detection and faster NOC triage from unstructured operational data.
• Developed autoencoders, VAEs, LSTMs, and attention models to predict node health and detect ingress noise, RF leakage, and plant instability before customer impact.
• Deployed real-time AI scoring APIs with SHAP explainability and AWS EMR retraining, enabling proactive, transparent, and scalable network-health monitoring.
2020 — 2022
New York, United States
• Built Python, PySpark & scikit-learn ML pipelines to score transactions and detect fraud across card, ACH, and digital channels, improving fraud signal throughput by 28%.
• Engineered behavioral, geo-velocity, merchant, and time-window features that significantly improved fraud and credit-risk model accuracy and stability.
• Trained and optimized XGBoost, LightGBM, and ensemble models to reduce false negatives and strengthen high-risk transaction detection under strict risk-governance standards.
• Delivered real-time scoring & model monitoring (Spark Streaming, MLflow, SHAP, drift detection) enabling fast alerts, explainable decisions, and audit-ready fraud operations.
2017 — 2020
San Jose, California, United States
• Built RAG-based semantic search & NLP pipelines to surface insights from public health records, clinical guidelines, and case notes, improving information retrieval speed and accuracy for analysts and care teams.
• Developed ML models for population health, SDOH analysis, risk stratification, and member segmentation, enabling data-driven planning for county healthcare programs.
• Engineered PySpark & Spark SQL pipelines to process large-scale eligibility, encounter, and utilization data, supporting scalable public-sector healthcare analytics.
• Automated reporting and dashboards using Python & Tableau, reducing recurring analytics turnaround time by 30%+ while ensuring responsible and explainable AI.
2016 — 2017
Atlanta, Georgia, United States
• Built credit risk & default prediction models using Python, SQL, and scikit-learn on bureau tradelines, utilization, and delinquency data, improving model accuracy by 18% across Equifax’s risk-scoring systems.
• Engineered Spark, PySpark & Hive feature pipelines to create stable, interpretable credit behavior variables for fraud, risk, and creditworthiness modeling.
• Leveraged AWS EMR, S3, Hadoop & Spark to process high-volume bureau data, enabling scalable training, validation, and batch scoring for enterprise credit models.
• Implemented model validation, drift monitoring (PSI/KS), and compliance-ready deployments, ensuring reliable and audit-ready credit decisioning for financial institutions.