# Vinay Ramrupe

> AI/ML Engineer | LLM & Generative AI Specialist | Azure OpenAI, Databricks (DBRX) | RAG, Fine-Tuning, Scalable Inference | 5+ YOE

Location: San Francisco Bay Area, United States
Profile: https://flows.cv/vinayramrupe

• AI/ML Engineer with 5+ years of experience building enterprise and open-source large language model solutions on Azure OpenAI and the Databricks Lakehouse platform, in India and the USA.
• Strong hands-on experience in LLM development, including data preprocessing, prompt engineering, fine-tuning, distributed training, and scalable inference using Python, PyTorch, Spark, and MLflow.
• Proven background in delivering Copilot-style AI applications and production-ready LLM systems, with a focus on performance optimization, cost efficiency, security, and responsible AI practices.
• Experienced in collaborating with research, product, and platform teams to translate complex business and engineering requirements into scalable AI solutions used in real-world enterprise environments.

## Work Experience
### AI/ML Engineer @ Databricks
Jan 2024 – Present | CA, USA
• Contributing to the development and optimization of DBRX, Databricks' open-source large language
model, focusing on scalable training and inference workflows on distributed compute.
• Worked on data curation and preprocessing pipelines using Apache Spark and Delta Lake to
prepare large-scale, high-quality training datasets for LLM development.
• Implemented model training and evaluation workflows using PyTorch and Databricks ML Runtime to
support efficient experimentation and reproducibility.
• Collaborated with research and platform teams to benchmark DBRX performance across reasoning,
summarization, and code generation tasks.
• Optimized distributed inference pipelines for DBRX on Databricks by improving batching strategies
and GPU utilization, resulting in ~22% higher throughput under production-scale workloads.
• Integrated DBRX with the Databricks Lakehouse architecture to enable enterprise-ready deployment,
governance, and monitoring for production LLM workloads.
• Built automated evaluation and experiment tracking workflows using MLflow and Python, reducing
manual validation effort by ~30% and improving model iteration speed across releases.
• Supported fine-tuning and instruction-tuning efforts to adapt DBRX for enterprise use cases,
including analytics assistance and internal Copilot-style applications.
• Implemented resource monitoring and cost observability dashboards that helped identify inefficient
GPU usage patterns, contributing to a ~15% improvement in infrastructure cost efficiency.
• Collaborated with cross-functional teams to align open-source model development with enterprise
security, compliance, and responsible AI guidelines.

### AI/ML Engineer @ Microsoft
Jan 2020 – Jan 2023 | India
• Worked on designing and deploying Azure OpenAI-based enterprise solutions by integrating GPT
models with internal Microsoft services, enabling intelligent document processing and
conversational workflows.
• Assisted in building Copilot-style AI features using Azure OpenAI, Azure Functions, and REST APIs
to automate enterprise knowledge retrieval and task recommendations for internal business teams.
• Developed data preprocessing and prompt engineering pipelines in Python to improve response
relevance and consistency for LLM-powered applications.
• Collaborated with senior engineers to integrate Azure Cognitive Search with OpenAI models,
enabling semantic search and contextual question answering over enterprise documents.
• Implemented secure API-based access using Azure Active Directory and role-based access control
to ensure compliance with enterprise security and data governance standards.
• Supported fine-tuning experiments and prompt optimization techniques that improved model
response accuracy by ~18% for domain-specific enterprise use cases.
• Built logging and monitoring workflows using Azure Monitor and Application Insights to track model
latency, usage patterns, and failure scenarios in production environments.
• Worked closely with product managers and solution architects to translate business requirements
into scalable AI workflows using Azure ML and OpenAI endpoints.
• Optimized inference pipelines and token usage strategies, helping reduce operational API costs by
12% while maintaining response quality.
• Documented architecture designs, model limitations, and responsible AI considerations to support
internal reviews and enterprise client readiness.


## Education
### Master's Degree in Information Technology
Cleveland State University


## Contact & Social
- LinkedIn: https://linkedin.com/in/vinay-ramrupe-6b34b228b

---
Source: https://flows.cv/vinayramrupe
JSON Resume: https://flows.cv/vinayramrupe/resume.json
Last updated: 2026-04-16