• Worked as a key member of the DevOps team to ensure smooth development, continuous deployment, and reliable operation of software for robotic workcells
• Took on Site Reliability Engineer responsibilities, maximizing uptime and availability of deployed systems by implementing and tuning an open-source monitoring and alerting stack
• Developed and maintained scripts and tooling to automate routine tasks and improve productivity, including provisioning compute and networking equipment for production deployments
• Maintained the core cloud environment running in AWS, working with services including VPC, EC2, S3, IAM, and CloudWatch
• Collaborated with development teams to design and build software running natively in Docker and Kubernetes, packaged as Helm charts, resulting in highly scalable and resilient systems; personally designed and built the first high-availability cluster for production deployment