# Vinay Ramrupe

> AI/ML Engineer | LLM & Generative AI Specialist | Azure OpenAI, Databricks (DBRX) | RAG, Fine-Tuning, Scalable Inference | 5+ YOE

Location: San Francisco Bay Area, United States

Profile: https://flows.cv/vinayramrupe

- AI/ML Engineer with 5+ years of experience building enterprise and open-source large language model solutions across Azure OpenAI and Databricks Lakehouse platforms in India and the USA.
- Strong hands-on experience in LLM development, including data preprocessing, prompt engineering, fine-tuning, distributed training, and scalable inference using Python, PyTorch, Spark, and MLflow.
- Proven background in delivering Copilot-style AI applications and production-ready LLM systems with a focus on performance optimization, cost efficiency, security, and responsible AI practices.
- Experienced in collaborating with research, product, and platform teams to translate complex business and engineering requirements into scalable AI solutions used in real-world enterprise environments.

## Work Experience

### AI/ML Engineer @ Databricks

Jan 2024 – Present | CA, USA

- Contributing to the development and optimization of DBRX, Databricks' open-source large language model, focusing on scalable training and inference workflows on distributed compute.
- Worked on data curation and preprocessing pipelines using Apache Spark and Delta Lake to prepare large-scale, high-quality training datasets for LLM development.
- Implemented model training and evaluation workflows using PyTorch and Databricks ML Runtime to support efficient experimentation and reproducibility.
- Collaborated with research and platform teams to benchmark DBRX performance across reasoning, summarization, and code generation tasks.
- Optimized distributed inference pipelines for DBRX on Databricks by improving batching strategies and GPU utilization, resulting in ~22% higher throughput under production-scale workloads.
- Integrated DBRX with the Databricks Lakehouse architecture to enable enterprise-ready deployment, governance, and monitoring for production LLM workloads.
- Built automated evaluation and experiment tracking workflows using MLflow and Python, reducing manual validation effort by ~30% and improving model iteration speed across releases.
- Supported fine-tuning and instruction-tuning efforts to adapt DBRX for enterprise use cases, including analytics assistance and internal Copilot-style applications.
- Implemented resource monitoring and cost observability dashboards that helped identify inefficient GPU usage patterns, contributing to a ~15% improvement in infrastructure cost efficiency.
- Collaborated with cross-functional teams to align open-source model development with enterprise security, compliance, and responsible AI guidelines.

### AI/ML Engineer @ Microsoft

Jan 2020 – Jan 2023 | India

- Designed and deployed Azure OpenAI-based enterprise solutions by integrating GPT models with internal Microsoft services, enabling intelligent document processing and conversational workflows.
- Assisted in building Copilot-style AI features using Azure OpenAI, Azure Functions, and REST APIs to automate enterprise knowledge retrieval and task recommendations for internal business teams.
- Developed data preprocessing and prompt engineering pipelines in Python to improve response relevance and consistency for LLM-powered applications.
- Collaborated with senior engineers to integrate Azure Cognitive Search with OpenAI models, enabling semantic search and contextual question answering over enterprise documents.
- Implemented secure API-based access using Azure Active Directory and role-based access control to ensure compliance with enterprise security and data governance standards.
- Supported fine-tuning experiments and prompt optimization techniques that improved model response accuracy by ~18% for domain-specific enterprise use cases.
- Built logging and monitoring workflows using Azure Monitor and Application Insights to track model latency, usage patterns, and failure scenarios in production environments.
- Worked closely with product managers and solution architects to translate business requirements into scalable AI workflows using Azure ML and OpenAI endpoints.
- Optimized inference pipelines and token usage strategies, helping reduce operational API costs by 12% while maintaining response quality.
- Documented architecture designs, model limitations, and responsible AI considerations to support internal reviews and enterprise client readiness.

## Education

### Master's Degree in Information Technology

Cleveland State University

## Contact & Social

- LinkedIn: https://linkedin.com/in/vinay-ramrupe-6b34b228b
- Portfolio: https://www.linkedin.com/in/vinay-ramrupe-6b34b228b

---

Source: https://flows.cv/vinayramrupe

JSON Resume: https://flows.cv/vinayramrupe/resume.json

Last updated: 2026-04-16