• Led end-to-end redesign of a monolithic analytics platform into 20+ cloud-native microservices on AWS ECS, with service isolation, horizontal scaling, and automated deployments, cutting release cycles by 50% and eliminating cascading failures
• Designed and owned distributed ingestion pipelines built on S3, Lambda, IAM, and event-driven workflows, processing 100GB+ per day and increasing throughput by 45% while maintaining 99.99% availability
• Built strongly typed Python–MySQL service layers with schema validation, idempotent writes, and structured observability, eliminating 200+ recurring data-integrity defects across production systems
• Migrated the databases behind 20+ production services from MSSQL to Amazon Aurora MySQL, re-sharding data and implementing replication, automated failover, and read scaling, reducing query latency by 40% and removing single points of failure
• Developed 30+ production-grade REST APIs with versioning, rate limiting, and backward-compatible contracts, reducing regression defects by 35% and enabling parallel feature development across teams
• Owned production reliability through metrics, logs, and distributed tracing, leading incident response for critical outages and consistently restoring service within the 1-hour SLA
• Built FastAPI-based ingestion services to migrate 100GB+ of unstructured data into Amazon S3, using parallel uploads, retry logic, and checksum validation, cutting storage costs by 80% while guaranteeing data correctness