Experienced software engineer with a background in statistics and biotechnology
Staff software engineer with extensive experience architecting and implementing distributed, cloud-native systems at scale. Demonstrated success leading complex technical initiatives that significantly improve performance, reduce costs, and enhance reliability across mission-critical applications.
• Designed and implemented an event-driven, cloud-native data analysis pipeline processing hundreds of terabytes of DNA sequencing data for lung cancer screening.
• Architected and led the Kubernetes migration of bioinformatic analysis pipelines, achieving a 2.5x increase in throughput and a 60% cost reduction while maintaining backward compatibility.
• Implemented a petabyte-scale data lake with discovery and query APIs, supporting 30+ data scientists in product development and validation efforts.
• Established automated test frameworks achieving >90% coverage across business-critical systems.

• Designed and implemented a service for enterprise data storage and discovery, fulfilling ISO 27001 compliance requirements and FAIR data principles.
• Promoted software engineering and architecture best practices through design reviews, mentoring, and documentation.
• Pioneered an event-driven service architecture, decoupling tightly integrated systems and enabling asynchronous releases between interconnected software components.
• Defined the architectural vision and roadmap, balancing tech debt with business priorities.

• Optimized a complex variant-calling algorithm, reducing runtime from hours to minutes using Java concurrency techniques and data structures.
• Built a serverless task service within a strictly regulated environment and integrated it with central business applications, enabling faster development cycles for new features.
• Implemented a Hadoop/Apache Impala back-end API for on-demand analytics, improving stability and latency by >50% across billions of rows of DNA sequencing records.
• Developed a configurable clinical trial data ingestion pipeline supporting 10+ clinical trials with multi-million-dollar revenue data transfer agreements.