# Sai Nikhil Varada

> AI/ML Engineer | Generative AI (GenAI), LLMs, RAG, Agentic AI | LangChain, LlamaIndex, LangGraph | LLaMA Fine-Tuning (LoRA/QLoRA) | MLOps, Kubernetes, MLflow | AWS, Azure, GCP

Location: Atlanta Metropolitan Area, United States
Profile: https://flows.cv/sainikhilvarada

AI/ML Engineer with 3+ years of experience designing and deploying scalable Machine Learning, Deep Learning, and Generative AI (GenAI) systems across enterprise environments in the U.S. and India. Currently at Oracle, I build LLM-powered applications, RAG pipelines, and agentic AI systems that drive measurable business impact, including 60% faster knowledge retrieval, 4x analyst productivity gains, and 250K+ daily low-latency inference requests served at 99.9% uptime. I take solutions from research and prototyping to production-scale deployment, with a strong focus on performance, reliability, and cost optimization.
 
My expertise spans LLMs, RAG, and Agentic AI, with hands-on experience in LangChain, LlamaIndex, LangGraph, and Hugging Face Transformers, along with LLM fine-tuning (LLaMA, LoRA, QLoRA, RLHF) and prompt engineering. I bring strong capabilities in MLOps and cloud-native AI systems, including MLflow, Kubeflow, Docker, Kubernetes, CI/CD, and model monitoring, deployed across AWS (SageMaker), Azure OpenAI, and GCP Vertex AI. I am also experienced in data engineering (Apache Spark, Kafka, Airflow, Snowflake, PostgreSQL) and vector search (Pinecone, FAISS, pgvector), with a focus on building high-performance, low-latency inference systems using CUDA and optimization techniques.
 
I hold a Master’s degree in Computer Science from The George Washington University. I previously worked at Cognizant, where I delivered NLP, computer vision, and predictive ML solutions that processed 1M+ documents, achieved up to 93% classification accuracy, and reduced manual inspection costs by 40%. I am passionate about building production-ready AI systems and advancing Generative AI, LLM applications, and autonomous agent systems. Open to opportunities as an AI/ML Engineer, Generative AI Engineer, LLM Engineer, or Applied Scientist in the U.S.

## Work Experience
### AI/ML Engineer @ Oracle
Jan 2025 – Present | United States
• Architected a scalable GenAI enterprise knowledge platform leveraging LLMs, RAG pipelines, and LangChain on Oracle Cloud Infrastructure, reducing knowledge retrieval time by 60% for 8,000+ global users.
• Led development of a multi-agent AI orchestration system using LangGraph and LlamaIndex, enabling automated SQL generation and intelligent report synthesis, resulting in 4x improvement in analyst productivity.
• Fine-tuned LLaMA 3 models using LoRA and QLoRA on 50K+ domain-specific documents, achieving 91% answer relevance, and deployed via FastAPI and AWS Lambda with sub-120ms end-to-end latency.
• Optimized LLM inference pipelines using NVIDIA CUDA, PyTorch quantization, and Azure Kubernetes Service (AKS), reducing serving latency by 42% while supporting 250K+ daily requests at 99.9% uptime.
• Designed and implemented MLOps and production deployment standards using Docker, Kubernetes, MLflow, and CI/CD pipelines on GCP Vertex AI, reducing release cycle time by 50% across 15+ production-grade ML systems.

### Machine Learning Engineer @ Cognizant
Jan 2021 – Jan 2023 | India
• Designed and deployed NLP document classification pipelines using BERT and Hugging Face Transformers, processing 1M+ legal and compliance documents and improving extraction accuracy by 85% across 6 enterprise clients.
• Built scalable end-to-end ML training and retraining pipelines using Apache Airflow and AWS SageMaker, reducing retraining cycle time by 48% and enabling automated weekly model refreshes across 7 production systems.
• Developed customer churn prediction models using XGBoost and LightGBM on large-scale CRM datasets, achieving 91% AUC and driving data-backed retention strategies that reduced churn by 25%.
• Engineered computer vision defect detection systems using ResNet-50 and PyTorch, achieving 93% classification accuracy and reducing manual inspection costs by 40% in manufacturing workflows.
• Led adoption of MLflow for experiment tracking, model registry, and lifecycle management across 10+ ML projects, standardizing MLOps workflows, improving model governance, and reducing production incidents by 38%.


## Education
### Master of Science - MS in Computer Science
The George Washington University

### Bachelor of Technology - BTech in Electronics and Computer Engineering
Vellore Institute of Technology (VIT)

### High School Diploma in Biology, General
Chinmaya Vidyalaya


## Contact & Social
- LinkedIn: https://linkedin.com/in/varadasainikhil
- Portfolio: https://shelfsmart-one.vercel.app

---
Source: https://flows.cv/sainikhilvarada
JSON Resume: https://flows.cv/sainikhilvarada/resume.json
Last updated: 2026-04-16