# Matthew Kim

> Principal Software Engineer, Bioinformatics at Roche

Location: Pleasanton, California, United States

Profile: https://flows.cv/matthewkim2

Experienced software engineer with a deep understanding of the software development life cycle and broad knowledge of data processing workflows. Proficient in solving complex problems, architecting software development stacks, and developing large-scale data processing and analysis software. Skilled in deploying and optimizing solutions on both cloud and High-Performance Computing (HPC) platforms.

## Work Experience

### Principal Bioinformatics Software Engineer @ Roche

Jan 2020 – Present | Pleasanton, California, United States

- Led the initiative to productionize and optimize bioinformatics software in collaboration with the scientific algorithm development team, using a tech stack encompassing Java, Python, C++, Docker, Nextflow, High-Performance Computing (HPC), and AWS.
- Led in-depth algorithm performance assessments, including runtime analysis and memory/CPU usage profiling, and then carried out targeted code optimization. These efforts significantly improved code efficiency and reduced turnaround time.
- Monitored every phase of the development process to ensure strict adherence to company standards for design principles and software quality.
- Developed and implemented algorithm software for primary and secondary analysis, based on prototype code from the scientific algorithm development team.
- Wrote clean, maintainable, documented code using engineering best practices and participated in design reviews.
- Communicated with the software team and mentored bioinformatics team members in software development best practices and skills development.
- Led the successful launch of the primary analysis pipelines for cancer diagnostic products, including AVENIO Circulating Tumor DNA (ctDNA) and AVENIO Non-Hodgkin’s Lymphoma (NHL) Cancer, at the Roche CLIA Lab.
- Led the development of a deep-learning-based bioinformatics tool designed to predict consensus DNA sequences, collaborating closely with the algorithm development team and using C++, the ONNX Runtime API, and the Armadillo library.

### Sr. Bioinformatician/Full-Stack Software Engineer @ Loop Genomics

Jan 2018 – Jan 2020 | San Francisco Bay Area

- Spearheaded the design and development of a robust software platform to lead bioinformatics initiatives in the creation of NGS DNA sequencing products.
- Used Django, AngularJS, Azure Batch, and an HPC cluster to build and deliver the platform.
- Collaborated closely with interdisciplinary “wet” development teams to define software roadmaps, establish project timelines, and keep communication clear.
- Communicated project progress and challenges, ensuring on-time and on-budget delivery of the product.
- Partnered with the CEO to develop and implement the company’s informatics strategy.
- Took a proactive leadership role alongside the CEO, contributing to the day-to-day operations of Loop Genomics.
- Took on various additional duties and roles as assigned, contributing to the overall success of the organization.

### Data Engineer @ Samsung Research America

Jan 2018 – Jan 2018 | Mountain View

- Responsible for the maintenance and scalable expansion of the AI infrastructure supporting Bixby, enabling fast and efficient processing of large volumes of data.
- Implemented automation to streamline existing processes and designed systems that promote self-service data consumption, improving efficiency and accessibility across teams.
- Interfaced with data scientists, analysts, product managers, and all other customers of the analytics infrastructure to understand their needs and expand the infrastructure.

### Bioinformatics Developer @ Bristol-Myers Squibb

Jan 2018 – Jan 2018 | Redwood City

- Designed and implemented integrated pipelines to process Next-Generation Sequencing (NGS) data originating from antibody research.
- Developed custom scripts to deploy and manage dockerized pipelines for execution on both the Amazon Domino platform and the SevenBridges platform.

### Staff Software Engineer @ Thermo Fisher Scientific

Jan 2015 – Jan 2018 | South San Francisco

- Played a key role in the development of Converge™ forensic analysis software, using Java, Spring, Hibernate, Tomcat, PostgreSQL, and AngularJS while adhering to agile development practices.
- Led the development of bioinformatics analysis pipelines and algorithms.
- Worked closely with the validation and verification team, as well as offshore teams, to ensure effective collaboration and project success.

### Bioinformatics Specialist/Full-Stack Software Engineer @ Bayer CropScience

Jan 2013 – Jan 2015 | Morrisville, NC

- Designed and implemented a template-based pipeline framework in Python to expedite the creation of diverse pipelines on an HPC cluster. These pipelines integrated multiple bioinformatics tools and processed Next-Generation Sequencing (NGS) data from various species.
- Conceptualized, designed, and developed a highly scalable genotype database using MongoDB, establishing it as a standardized solution for biomarker research. The database stored and managed over 36 billion genotype records.
- Architected and implemented a RESTful framework using Python, Bottle, JavaScript, HTML/CSS, Dojo, and MongoDB. This framework integrated a database, a remote file system, and multiple pipelines, improving overall system efficiency and functionality.
- Managed and implemented bioinformatics APIs to support ongoing trait research projects.
- Collaborated with fellow R&D scientists and contractors, contributing to trait discovery projects.

### Bioinformatician @ Duke University Medical Center, Center for Human Genome Variation

Jan 2011 – Jan 2013

- Conceived and implemented a template-based pipeline framework in Java, extending the capabilities of existing pipelines and streamlining their development and deployment.
- Architected and engineered web applications to integrate diverse pipelines on an HPC cluster, connecting them with databases and remote file systems. Technologies used included Apache, Perl, JavaScript, HTML/CSS, and MySQL.
- Processed and performed primary analysis on a substantial volume of genomic data, including over 3,000 human exomes, 900 human genomes (at 35x coverage), 1,000 custom-captured human samples, and various RNA sequences. These analyses were conducted in support of population genetics studies.
- Collaborated with scientists and provided customized pipelines to help them analyze large sequence datasets (published in a journal).

### Bioinformatics Analyst @ Baylor College of Medicine, Epigenome Center in Department of Human Genetics

Jan 2010 – Jan 2011

- Took an active role in the development of Next-Generation Sequencing (NGS) pipelines, implemented on a High-Performance Computing (HPC) cluster, for the detection of human DNA methylation.
- Implemented methylation comparison tools that incorporated linear regression analysis as server-side components in a RESTful architecture.

### Bioinformatics Specialist @ Kansas State University, Bioinformatics Center

Jan 2007 – Jan 2009

- Conceived, implemented, and cultivated a database, accessible at http://www.beetlebase.org, to provide comprehensive genomic data on the red flour beetle. This project used Apache, Perl, C, PostgreSQL, and HTML/CSS.
- Conceptualized and executed model-based gene expression analysis using whole-genome tiling-microarray datasets, resulting in publication in a journal. SAS was the primary analytical tool employed for the research.
- Led the collection, validation, and annotation of gene sequences, culminating in the successful re-assembly of the entire genome sequence of the red flour beetle. This work was subsequently published in a peer-reviewed journal.

### Graduate Research Assistant @ University of Illinois at Urbana-Champaign

Jan 2004 – Jan 2007

- Designed, developed, and conducted analysis on the MANET database, accessible at http://manet.illinois.edu, which integrated protein structural, functional, and phylogenetic information sourced from the KEGG and SCOP databases. This work was published in peer-reviewed journals.
- Developed object-oriented software packages, including DIM-Pack for statistical analyses and a supply chain simulator, using Visual Basic .NET as the primary development platform.
- Managed a Mac OS X cluster server and storage, and contributed to multiple research projects, resulting in publications in various academic journals.

## Education

### MS in Bioinformatics

University of Illinois Urbana-Champaign

### BS in Computer Science

University of Illinois Urbana-Champaign

### Certified Scrum Master

Scrum Alliance

## Contact & Social

- LinkedIn: https://linkedin.com/in/matthew-kim-28204428

---

Source: https://flows.cv/matthewkim2

JSON Resume: https://flows.cv/matthewkim2/resume.json

Last updated: 2026-04-12