Software development (Python, C) and algorithm optimization for the GenapSys bioinformatics pipeline. Played a significant role in designing and optimizing the customer-facing version of our analysis pipeline to support the Q4 2019 product launch.
• Optimized code via Cython, architecture changes, Unix piping (to/from samtools, bwa, bbmap, etc.), and general algorithm improvements (e.g., Aho-Corasick for DNA subsequence searches), delivering numerous 10x+ speedups and 3x+ reductions in RAM overhead.
• Developed a pipeline for converting machine learning models trained in TensorFlow to TensorRT, boosting inference speed by 2x+.
• Created, and helped manage and develop, multiple GitHub repositories that interface with the primary bioinformatics pipeline.