# Nivith Avula

> Generative AI Engineer @Blue Yonder | LLMs, RAG, Multi-Agent Systems, MCP | Azure OpenAI & Amazon Bedrock | Production AI at Scale

Location: United States
Profile: https://flows.cv/nivith

Generative AI Engineer with 5+ years of experience building and deploying scalable AI/ML systems across industries such as supply chain, banking, healthcare, and energy. Specialized in LLMs, Retrieval-Augmented Generation (RAG), and agentic AI systems, with a strong focus on delivering production-grade solutions. Experienced in designing end-to-end AI platforms, from data pipelines and model fine-tuning (LoRA/QLoRA) to API development and cloud deployment using Azure, AWS, Docker, and Kubernetes. Proven ability to improve model performance, reduce hallucinations, and optimize systems for scalability and reliability. Passionate about solving real-world business problems with AI, working in fast-paced environments, and owning systems from concept to production.

## Work Experience

### Generative AI Engineer @ Blue Yonder

Jan 2025 – Present | Coppell, TX

- Lead the architecture and enterprise deployment of Generative AI solutions for supply chain planning and forecasting using Azure OpenAI, LangChain, LangGraph, and Python, enabling AI-driven decision intelligence across mission-critical operations.
- Architect multi-agent and Agent-to-Agent (A2A) orchestration frameworks using LangGraph to coordinate retrieval, reasoning, and validation agents for complex, multi-step supply chain decision workflows.
- Design and implement scalable Retrieval-Augmented Generation (RAG) platforms leveraging Azure Cognitive Search (vector indexing), Azure OpenAI embeddings, Azure Blob Storage, and semantic chunking to power contextual enterprise knowledge retrieval.
- Apply parameter-efficient fine-tuning techniques (LoRA, QLoRA) using Hugging Face and PyTorch to adapt pretrained LLMs to domain-specific supply chain terminology while optimizing compute utilization.
- Develop and govern production-grade AI microservices using Python, Go, FastAPI, Azure Functions, and Azure App Service, exposing secure REST APIs consumed by enterprise planning and analytics systems.
- Evaluate and implement vector database strategies (Pinecone, Weaviate) to optimize semantic search performance, relevance scoring, and latency at enterprise scale.
- Establish enterprise data ingestion and transformation pipelines using Azure Data Factory, Databricks (PySpark), Pandas, and Blob Storage to convert ERP data and unstructured documentation into vectorized knowledge assets.
- Reduce hallucinations by 38% through structured prompt engineering, few-shot design, token optimization, and validation pipelines, improving model reliability and response accuracy.
- Enforce enterprise-grade security through OAuth2, Azure Entra ID, Managed Identities, Azure API Management, Key Vault, private endpoints, and RBAC controls.
- Operationalize GenAI workloads using Docker, AKS, ACR, Helm, and GitHub Actions CI/CD, with monitoring via Prometheus and Grafana.

### AI Engineer @ UBS

Jan 2024 – Jan 2025 | New York, United States

- Directed the architecture and deployment of enterprise AI and Generative AI platforms for banking operations using Amazon SageMaker, Bedrock, LangChain, LangGraph, and Hugging Face within regulated financial environments.
- Designed multi-agent orchestration frameworks to coordinate policy retrieval, compliance reasoning, and response validation for complex, audit-sensitive workflows.
- Implemented parameter-efficient fine-tuning (LoRA, QLoRA) to specialize LLMs for financial and regulatory language while maintaining cost and governance controls.
- Engineered secure Retrieval-Augmented Generation architectures using Amazon OpenSearch (vector search), FAISS, S3, and embedding models to enable contextual search across enterprise policy repositories.
- Developed high-throughput AI microservices using Python, FastAPI, Flask, Docker, and AWS Lambda, supporting millions of monthly inference requests with sub-second latency.
- Integrated Neo4j knowledge graphs to enhance entity-aware retrieval and context reasoning within compliance intelligence systems.
- Built document intelligence pipelines using Amazon Textract and NLP preprocessing workflows, reducing document processing time by 48%.
- Governed AI workloads using EC2, ECS, EKS, ECR, Terraform, and CI/CD pipelines aligned with regulatory standards and change management frameworks.
- Implemented comprehensive monitoring, audit logging, and compliance observability using CloudWatch, CloudTrail, and OpenSearch Dashboards.
- Ensured model risk governance through benchmarking, bias evaluation, human-in-the-loop validation, and UAT cycles in partnership with risk and compliance stakeholders.

### Full Stack AI Engineer @ Landis+Gyr

Jan 2024 – Jan 2024 | Atlanta, GA

- Delivered end-to-end AI and Generative AI solutions for smart energy analytics platforms, integrating LLM-driven insights, predictive modeling, and real-time IoT processing to support grid operations.
- Designed autonomous agent-based workflows using LangChain and CrewAI to orchestrate anomaly detection, contextual data retrieval, and operational insight generation.
- Developed LLM-enabled applications using Amazon Bedrock and Hugging Face within RAG architectures to enable natural-language querying of smart-meter, outage, and grid datasets.
- Built forecasting and anomaly detection models using SageMaker, XGBoost, TensorFlow, and time-series feature engineering, improving forecast accuracy by 14% over baseline models.
- Engineered scalable ingestion and processing pipelines using AWS IoT Core, Kinesis, Glue, S3, Spark (EMR), and OpenSearch to support both batch and real-time ML workloads.
- Operationalized AI models using MLOps best practices including SageMaker Pipelines, experiment tracking, automated retraining, and environment promotion.
- Deployed containerized AI services across Kubernetes and container environments (EKS, ECS, EC2) with CI/CD automation and production-grade monitoring frameworks.

### Python Full Stack Engineer @ UnitedHealth Group

Jan 2023 – Jan 2023 | Dallas, Texas, United States

- Developed scalable healthcare web applications using Python (Flask, Django, FastAPI) and built responsive user interfaces with React, Next.js, JavaScript, TypeScript, HTML, and CSS, ensuring seamless frontend-backend integration.
- Designed and implemented RESTful APIs and integrated them with React-based dashboards to enable real-time clinical data visualization, patient monitoring, and analytics reporting.
- Engineered machine learning and deep learning models including CNNs, RNNs/LSTMs, GANs, and Transformer-based NLP architectures to generate actionable healthcare insights from structured and unstructured datasets.
- Built and deployed end-to-end ML pipelines using Kubeflow on Google Kubernetes Engine (GKE) and managed scalable model serving through Google Vertex AI.
- Developed computer vision and NLP solutions using OpenCV, YOLO, and Hugging Face (BERT, GPT), integrating model outputs into user-facing applications to enhance diagnostic and patient communication workflows.
- Leveraged Google BigQuery, Cloud Storage, and Dataflow for large-scale data processing, and implemented MLflow for model lifecycle management, experiment tracking, and reproducibility.

### Python Developer @ Celanese

Jan 2023 – Jan 2023 | Irving, Texas, United States

- Designed and developed enterprise microservices using Python (Flask, Django, FastAPI) to support scalable, distributed backend systems.
- Deployed high-availability services using Azure Functions and AKS, implementing containerized and serverless architectures.
- Architected event-driven systems using Kafka, Azure Event Hubs, and Service Bus to enable real-time, asynchronous data processing.
- Built and automated CI/CD pipelines using Azure DevOps and GitHub Actions, implementing infrastructure provisioning and application deployments in multi-cloud environments using Terraform and Ansible.
- Optimized database performance across PostgreSQL, MySQL, Cassandra, and Azure Cognitive Search through indexing and schema improvements.
- Integrated AI services and analytics components into Azure-hosted enterprise applications.

### Python Developer @ Altimetrik

Jan 2020 – Jan 2022 | Pune District

- Designed and developed scalable ETL pipelines using Python and PySpark to process large-scale enterprise datasets, improving reporting efficiency by 50%.
- Built and optimized data workflows using Apache Airflow with robust scheduling, dependency management, and failure handling to ensure reliable pipeline execution.
- Migrated legacy on-premises data systems to an Azure-based Snowflake data warehouse architecture, improving scalability and reducing report generation time.
- Performed complex data transformations, cleansing, validation, and deduplication to enhance data quality and ensure accurate business intelligence reporting.
- Developed RESTful APIs using FastAPI and Flask to expose processed data to downstream applications and analytics platforms.
- Refactored monolithic components into modular, scalable Python-based microservices, improving system maintainability and deployment efficiency.
- Containerized applications using Docker and deployed them on Azure Kubernetes Service (AKS) to support scalable, cloud-native architecture.
- Optimized SQL queries and implemented indexing strategies to improve database performance and reduce API response times by up to 30%.
- Integrated CI/CD pipelines using Git and Jenkins to automate build, testing, and deployment workflows.
- Implemented structured logging, monitoring, and exception handling mechanisms to enhance production stability and reduce downtime.

## Education

### Master of Science (MS) in Information Systems and Technology

University of North Texas

### Bachelor's degree in Computer Science

CVR College of Engineering, Hyderabad

## Contact & Social

- LinkedIn: https://linkedin.com/in/nivith-avula-0b44a0140
- Portfolio: https://nivith-portfolio.vercel.app/

---

Source: https://flows.cv/nivith
JSON Resume: https://flows.cv/nivith/resume.json
Last updated: 2026-04-17