Experience
United States
Deployed and managed Prometheus and Grafana for system metrics and alerting, improving detection of infrastructure bottlenecks.
Deployed and managed containerized applications on the OpenShift platform, ensuring seamless continuous integration and delivery (CI/CD) pipelines.
Collaborated with development teams using Git and integrated it with GCP-based CI/CD tools for automated versioning and code deployment.
Designed, developed, and maintained AWS Glue ETL jobs to process, transform, and load large-scale structured and semi-structured data.
Automated data pipelines using AWS Glue Workflows, triggers, and schedules to ensure reliable data processing.
Implemented CI/CD pipelines using GitHub Actions, automating build, test, and deployment processes to enhance software delivery efficiency.
Integrated GitHub with Jenkins to streamline automated testing and deployment workflows, improving developer productivity.
Designed, deployed, and managed scalable Azure cloud infrastructures using Azure Virtual Machines, Virtual Networks, and Load Balancers.
Designed and deployed multicluster Kubernetes environments on AWS EKS, leveraging KCP for API aggregation and workspace management.
Developed custom CRDs, APIResourceSchemas, APIExports, and APIBindings to enable dynamic API discovery and integration with external providers.
Automated infrastructure provisioning and configuration using Terraform and Helm for consistent, repeatable deployments.
Implemented centralized logging and auditing pipelines using Fluentd, CloudWatch, and S3 for compliance and troubleshooting.
Created real-time metrics collection and alerting with Prometheus, Grafana, and AWS CloudWatch to monitor platform health and resource usage.
Acted as on-call SRE supporting 24/7 production workloads, handling incident triage, mitigation, and escalation.
United States
Designed, deployed, and managed cloud infrastructure on AWS, Azure, and Google Cloud Platform (GCP) to optimize performance, scalability, and cost-efficiency.
Provisioned and maintained AWS and Azure infrastructure, including EC2, S3, IAM, VPC, Azure Web Apps, Storage, and Active Directory.
Managed microservices with Docker, Kubernetes, OpenShift, and Azure Kubernetes Service (AKS).
Implemented continuous integration and delivery pipelines with tools such as Git, TeamCity, Octopus, and AWS CodePipeline.
Designed and implemented automated pipelines for AWS EC2 to OCI Compute instance migration, ensuring minimal downtime and optimized performance.
Configured Prometheus to collect real-time metrics from cloud infrastructure, applications, and services for performance monitoring.
Provisioned and maintained cloud resources across AWS, Azure, and GCP, including EC2, S3, IAM, VPC, Azure Web Apps, and GCP Compute Engine for scalable deployments.
Developed automation scripts with PowerShell, Ansible, and Chef to streamline deployment and infrastructure management.
Defined and enforced SLOs/SLIs as part of the observability strategy, aligning system reliability targets with business objectives.
Deployed containerized applications and scaled Kubernetes clusters, enabling efficient orchestration and resource utilization.
Developed Infrastructure as Code (IaC) solutions using Terraform to automate provisioning of compute, networking, and storage resources in OCI.
Utilized Azure Recovery Services vaults and backups to ensure disaster recovery and data integrity.
Set up Prometheus Alertmanager to trigger alerts based on predefined thresholds, ensuring quick incident response and resolution.
Proficient in using Terraform to define, provision, and manage cloud infrastructure (AWS, GCP, Azure) through code, ensuring consistent and repeatable deployment processes for scalable and secure environments.
United States
Expertise in Prometheus, Grafana, ELK Stack, Datadog, and CloudWatch for proactive monitoring, logging, and incident response.
Experienced in Terraform, CloudFormation, and Ansible to automate provisioning and management of cloud resources.
Experienced across all stages of the DevOps workflow, beginning with SCM commit, build, integration build, and compilation.
Integrated monitoring and logging solutions using OCI Logging and Oracle Cloud Observability, ensuring proactive issue resolution and enhanced system reliability.
Performed kernel tuning and wrote shell scripts for system maintenance and file management.
Integrated observability into CI/CD pipelines, enabling shift-left monitoring and early detection of performance regressions during deployments.
Experienced with Chef: configured the Chef repo, set up multiple Chef workstations, and wrote cookbooks and recipes to automate deployments using Spinnaker, integrated with Jenkins jobs for the CD framework.
Skilled in integrating Git repositories with CI/CD tools (e.g., Jenkins, GitLab CI) for automated build, test, and deployment pipelines, accelerating the software delivery process.
Developed automation scripts in core Python, used alongside Puppet to deploy and manage Java applications across Linux servers.
Utilized Datadog security monitoring features to track vulnerabilities, detect threats, and ensure compliance with industry standards.
Integrated Grafana with multiple data sources, including Prometheus, Elasticsearch, and Datadog, for centralized monitoring.
Utilized Python for data extraction, transformation, and analysis, leveraging libraries such as Pandas and NumPy to process large datasets.
Created Python scripts integrated with the Amazon API to control instance operations.
Integrated Prometheus with Grafana for real-time visualization and with tools like Kubernetes and Docker for enhanced container monitoring.
Hyderabad, Telangana, India
Skilled in using tools such as Prometheus, Grafana, and kubectl to monitor cluster health, diagnose issues, and implement proactive measures for resource optimization and application reliability in Kubernetes environments.
Experienced with AWS EC2, EBS, ELB, Auto Scaling groups, Trusted Advisor, S3, CloudWatch, CloudFront, IAM, and Security Groups.
Expertise in using Git for version control to manage and track code changes, ensuring efficient collaboration across distributed teams and maintaining a clean project history.
Developed a custom AWS-to-OCI security policy mapping tool, converting AWS IAM roles, policies, and security groups to OCI IAM, ensuring compliance.
Planned and deployed hybrid cloud infrastructure in a production environment.
Analyzed cloud infrastructure and recommended improvements for performance and cost efficiency.
Designed the architecture and created the CloudFormation template to facilitate deployment.
Basic knowledge of the Linux OS (file system, file configuration, Linux structure, directories).
Worked on various incidents such as ESX/ESXi server down, datastore storage issues, vMotion, patching, snapshots, HA, and DRS.
Used VMware vSphere vCenter Update Manager to apply patches to ESX, ESXi, and virtual machines.
Maintained vCenter Servers and created virtual machine templates.
Performed ESX server and virtual machine tasks such as vMotion, Storage vMotion, High Availability (HA), Distributed Resource Scheduler (DRS), cloning, and snapshots.
Responsible for remote administration of 2003/2008/2012 servers in a domain environment.
Handled service requests: tickets for infrastructure changes, increases in memory, hard disk, and number of CPUs, V2V migrations, and software installation.
Education
Trine University
Master of Science - MS
Acharya Nagarjuna University (ANU), Guntur