# Rajesh Munuma

> Senior GenAI Engineer @ bpx energy | Agentic AI & RAG Systems | AWS Bedrock, OpenSearch, Snowflake

Location: San Francisco, California, United States
Profile: https://flows.cv/rajeshmunuma

I am a Generative AI and Agentic AI Engineer with strong hands-on experience building production-grade AI platforms that combine Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), and Text-to-SQL. My recent work focuses on designing real-world GenAI systems that deliver accurate, explainable, and business-aligned insights, especially in complex enterprise environments.

I have architected and built end-to-end GenAI platforms using AWS and GCP, working with services such as Amazon Bedrock (Claude models), Vertex AI (Gemini), OpenSearch vector search, Snowflake, BigQuery, Lambda, EventBridge, DynamoDB, and S3. My projects include conversational AI, semantic search, SQL-RAG, session-aware chatbots, and multi-LLM routing, with a strong emphasis on reducing hallucinations, improving response relevance, and ensuring security and governance.

In addition to GenAI development, I bring a solid platform and cloud engineering background. I have owned cloud infrastructure, IAM, CI/CD, observability, and incident response across AWS and GCP. I work closely with product, data, and domain experts to translate business workflows into scalable AI solutions. I am especially interested in roles involving GenAI platforms, Agentic AI, RAG systems, and enterprise AI adoption.

## Work Experience
### Senior GenAI Engineer @ bpx energy
Jan 2025 – Present | Denver, Colorado, United States
• Designed and implemented a production-grade Generative AI RAG platform for BPX Energy, delivering secure, low-hallucination Q&A over enterprise data, with most user queries answered in under a second to a few seconds.
• Built a semantic retrieval layer using Amazon OpenSearch as the vector store, supporting thousands of indexed documents and embeddings with consistent retrieval performance under concurrent user load.
• Integrated LLM inference via Amazon Bedrock using Claude 3.5 Haiku, optimizing for low-latency responses and cost-efficient inference compared to larger foundation models.
• Implemented event-driven ingestion and indexing pipelines using AWS Lambda and Amazon EventBridge, enabling near-real-time document updates while keeping ingestion costs fully serverless and usage-based.
• Designed prompt templates and semantic routing logic to dynamically assemble prompts from user input, retrieved context, and session state, reducing hallucinations and improving answer relevance across multi-turn conversations.
• Built session and conversation memory management using Amazon DynamoDB, supporting high-concurrency access patterns with predictable single-digit millisecond read/write performance.
• Enforced least-privilege security controls using AWS IAM for Bedrock, Lambda, OpenSearch, and DynamoDB, ensuring secure agent execution in a regulated enterprise environment.
• Implemented end-to-end observability with Amazon CloudWatch, tracking request latency, error rates, and invocation counts to support operational stability and rapid troubleshooting.
• Architected the platform for horizontal scalability, allowing independent scaling of ingestion, retrieval, inference, and session layers to handle growth in users, documents, and query volume without redesign.
• Optimized overall system cost by combining serverless ingestion, a lightweight LLM (Claude 3.5 Haiku), and semantic retrieval, keeping per-query costs low while maintaining enterprise-grade accuracy.
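
The prompt assembly described above (user input, retrieved context, and session state combined into one prompt) can be illustrated with a minimal sketch. This is not the production code: `SYSTEM_RULES`, `Turn`, `build_prompt`, and `max_turns` are hypothetical names chosen for illustration, and the instruction wording is an assumption.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical instruction text, not the production template.
SYSTEM_RULES = (
    "Answer only from the numbered context passages. "
    "If the context does not contain the answer, say you do not know."
)


@dataclass
class Turn:
    role: str  # "user" or "assistant"
    text: str


def build_prompt(question: str, passages: List[str],
                 history: List[Turn], max_turns: int = 4) -> str:
    """Assemble one prompt from session state, retrieved context, and user input."""
    # Keep only the most recent turns so the prompt stays bounded in size.
    recent = history[-max_turns:] if max_turns > 0 else []
    history_block = "\n".join(f"{t.role}: {t.text}" for t in recent) or "(no prior turns)"
    # Number the passages so the model can refer to them explicitly.
    context_block = (
        "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
        or "(no passages retrieved)"
    )
    return (
        f"{SYSTEM_RULES}\n\n"
        f"Conversation so far:\n{history_block}\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    history = [Turn("user", "Which wells are in scope?"),
               Turn("assistant", "The ones listed in the Q3 report.")]
    print(build_prompt("What was their total output?",
                       ["Q3 report excerpt A", "Q3 report excerpt B"], history))
```

Grounding the instructions in numbered passages and capping the history length are the two choices in this sketch aimed at hallucination control and predictable prompt size.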

### GenAI Engineer | Application Platform Engineer @ Motive Practicing Wisely
Jan 2022 – Jan 2025 | San Francisco, California, United States
• Architected and led the development of production-grade GenAI systems using Retrieval-Augmented Generation (RAG) and Text-to-SQL to deliver explainable, guideline-grounded insights.
• Built LLM-powered APIs and chatbots using FastAPI and managed inference platforms (AWS Bedrock, Vertex AI).
• Designed hybrid retrieval workflows combining vector search and deterministic SQL, reducing hallucinations by >90% and improving response relevance by 35%.
• Implemented LangChain-based RAG pipelines with OpenSearch vector indexing to support semantic retrieval over large clinical document corpora.
• Developed secure Text-to-SQL execution layers enabling natural-language analytics over Snowflake with strict read-only enforcement.
• Integrated observability and drift monitoring using CloudWatch to track LLM latency, retrieval quality, and system health.
• Partnered with clinicians, data teams, and leadership to deliver role-aware GenAI experiences for physicians versus medical directors.
• Detected and contained an active AWS security breach caused by exposed credentials in source control.
• Performed deep IAM investigation and identified attacker-created IAM users, access keys, and privilege escalation paths used for persistence.
• Restored all production and non-production environments within 24 hours.
• Led post-incident security redesign using AWS SSO, Azure AD integration, least-privilege IAM, and Prowler.
• Sole owner of AWS and GCP cloud platforms, responsible for infrastructure, IAM, CI/CD, monitoring, and disaster recovery.
• Implemented AWS Organizations and GCP organizational structures with proper account, project, and access isolation.
• Designed and implemented end-to-end disaster recovery (DR) plans.
• Implemented continuous security posture management using Prowler, AWS Config, and Security Hub.
• Designed and enforced RBAC models, permission sets, and MFA-based access using IAM Identity Center and Azure AD.
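
The read-only enforcement on the Text-to-SQL layer mentioned above can be illustrated with a minimal application-side guard. This is a sketch under stated assumptions, not the production implementation: `is_read_only` and the `FORBIDDEN` keyword list are hypothetical, and a check like this would normally sit in front of a database role that itself has only read privileges.

```python
import re

# Hypothetical deny-list of statement keywords that modify data or schema.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|merge|drop|alter|create|truncate|grant|revoke|copy|put|call)\b",
    re.IGNORECASE,
)


def is_read_only(sql: str) -> bool:
    """Return True only for a single SELECT/WITH statement with no write keywords."""
    stmt = sql.strip().rstrip(";").strip()
    # Reject empty input and anything containing a second statement.
    if not stmt or ";" in stmt:
        return False
    # The statement must begin with SELECT or a WITH (CTE) clause.
    if not re.match(r"(?i)(select|with)\b", stmt):
        return False
    # Conservative: a write keyword anywhere (even inside a string literal) is rejected.
    return FORBIDDEN.search(stmt) is None


if __name__ == "__main__":
    for candidate in ("SELECT id FROM visits WHERE year = 2024;",
                      "SELECT 1; DROP TABLE visits"):
        print(candidate, "->", is_read_only(candidate))
```

The guard deliberately errs on the side of rejecting valid queries, since a false rejection costs a retry while a missed write could alter data.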

### Python Developer @ Optum
Jan 2021 – Jan 2022 | New York, United States
• Developed Python-based automation scripts to support healthcare data processing, system monitoring, and operational workflows.
• Worked with structured data formats (CSV, XML, JSON) using Python to clean, transform, and prepare datasets for downstream systems.
• Integrated Python scripts with Linux environments to schedule and execute recurring jobs using cron.
• Assisted in debugging, testing, and improving Python applications, following clean coding and basic performance optimization practices.

### DevOps Consultant @ General Dynamics Information Technology
Jan 2018 – Jan 2021 | New York, United States
• Migrated on-prem workloads to AWS and Azure using Terraform and ARM templates.
• Automated deployments via Ansible integrated with Jenkins and GitHub.
• Created reusable Terraform modules and IaC templates for repeatable multi-region deployments.

### DevOps Engineer @ Maveric Systems Limited
Jan 2014 – Jan 2015 | Chennai, Tamil Nadu, India
• Installed, configured, and maintained Jenkins/Hudson for continuous integration and end-to-end automation of builds and deployments.
• Provided system administration support for over 100 servers across diverse platforms and operating systems.
• Developed and documented software release management procedures, including release notes for scheduled releases.
• Managed Linux environments, deploying web applications using Puppet and automating processes with Bash and other shell scripts.


## Education
### Master's degree in Computer Information Systems and Information Technology
University of Central Missouri

### Bachelor of Technology - BTech in Computer Science
Jawaharlal Nehru Technological University


## Contact & Social
- LinkedIn: https://linkedin.com/in/rajesh-m-07a13a25m

---
Source: https://flows.cv/rajeshmunuma
JSON Resume: https://flows.cv/rajeshmunuma/resume.json
Last updated: 2026-04-01