# Matthew Kim

> Principal Software Engineer, Bioinformatics at Roche

Location: Pleasanton, California, United States
Profile: https://flows.cv/matthewkim2

Experienced software engineer with a deep understanding of the software development life cycle and a broad knowledge of various data processing workflows. Proficient in solving complex problems, capable of architecting software development stacks, and adept at developing large-scale data processing and analysis software. Skilled in deploying and optimizing solutions on both cloud and High-Performance Computing (HPC) platforms.

## Work Experience
### Principal Bioinformatics Software Engineer @ Roche
Jan 2020 – Present | Pleasanton, California, United States
- Led the initiative to productionize and optimize bioinformatics software in collaboration with the scientific algorithm development team, leveraging a tech stack encompassing Java, Python, C++, Docker, Nextflow, High-Performance Computing (HPC), and AWS.
- Led in-depth algorithm performance assessments, including runtime analysis and memory/CPU usage profiling, and then applied targeted code optimizations that significantly improved code efficiency and reduced turnaround time.
- Vigilantly monitored every phase of the development process to ensure strict adherence to company standards, both in terms of design principles and software quality.
- Developed and implemented algorithm software based on prototype code from the scientific algorithm development team for primary and secondary analysis.
- Wrote clean, maintainable, documented code using engineering best practices and participated in design reviews.
- Communicated with the software team and mentored bioinformatics team members in software development best practices and skills development.
- Led the successful launch of the primary analysis pipelines for cancer diagnostic products, including AVENIO Circulating Tumor DNA (ctDNA) and AVENIO Non-Hodgkin’s Lymphoma (NHL) Cancer, in the Roche CLIA Lab.
- Led the development of a cutting-edge deep-learning-based bioinformatics tool designed to predict consensus DNA sequences, collaborating closely with the algorithm development team and utilizing C++, ONNX Runtime API, and Armadillo library.

### Sr. Bioinformatician/Full-Stack Software Engineer @ Loop Genomics
Jan 2018 – Jan 2020 | San Francisco Bay Area
- Spearheaded the design and development of a robust software platform to lead bioinformatics initiatives in the creation of cutting-edge NGS DNA sequencing products.
- Leveraged technologies including Django, AngularJS, Azure Batch, and an HPC cluster to drive innovation and product excellence.
- Collaborated closely with interdisciplinary “wet” development teams to define software roadmaps, establish project timelines, and foster seamless communication. 
- Successfully communicated project progress and challenges, ensuring the on-time and on-budget delivery of the product.
- Formed a strategic partnership alongside the CEO to develop and implement the company’s informatics strategy.
- Assumed a proactive leadership role alongside the CEO, actively contributing to the day-to-day operations of Loop Genomics.
- Demonstrated versatility by taking on various additional duties and roles as assigned, contributing to the overall success of the organization.

### Data Engineer @ Samsung Research America
Jan 2018 – Jan 2018 | Mountain View
- Responsible for the robust maintenance and scalable expansion of the AI infrastructure supporting Bixby, enabling the processing of vast volumes of data with exceptional speed and efficiency.
- Implemented automation to streamline existing processes and designed systems that promote self-service data consumption, enhancing efficiency and accessibility across the teams.
- Interfaced with data scientists, analysts, product managers, and all other customers of the analytics infrastructure to understand their needs and expand the infrastructure.

### Bioinformatics Developer @ Bristol-Myers Squibb
Jan 2018 – Jan 2018 | Redwood City
- Designed and implemented integrated pipelines to process Next-Generation Sequencing (NGS) data originating from antibody research. 
- Developed custom scripts to efficiently deploy and manage dockerized pipelines, facilitating their seamless execution on both the Amazon Domino platform and the Seven Bridges platform.

### Staff Software Engineer @ Thermo Fisher Scientific
Jan 2015 – Jan 2018 | South San Francisco
- Played a key role in the development of Converge™ forensic analysis software, leveraging Java, Spring, Hibernate, Tomcat, PostgreSQL, and AngularJS while adhering to agile development practices.
- Led the development of bioinformatics analysis pipelines and algorithms.
- Worked closely with the validation and verification team, as well as offshore teams, to ensure effective collaboration and project success.

### Bioinformatics Specialist/Full-Stack Software Engineer @ Bayer CropScience
Jan 2013 – Jan 2015 | Morrisville, NC
- Designed and implemented a template-based pipeline framework in Python to expedite the creation of diverse pipelines on an HPC cluster. These pipelines seamlessly integrated multiple bioinformatics tools and efficiently processed Next-Generation Sequencing (NGS) data from various species.
- Conceptualized, designed, and developed a highly scalable genotype database using MongoDB, establishing it as a standardized solution for biomarker research. The database stored and managed more than 36 billion genotype records.
- Architected and implemented a RESTful framework, leveraging Python, Bottle, JavaScript, HTML/CSS, Dojo, and MongoDB. This framework seamlessly integrated a database, remote file system, and multiple pipelines, enhancing overall system efficiency and functionality.
- Managed and implemented bioinformatics APIs to support ongoing trait research projects.
- Engaged in collaborative efforts with fellow R&D scientists and contractors, making valuable contributions to trait discovery projects.

### Bioinformatician @ Duke University Medical Center, Center for Human Genome Variation
Jan 2011 – Jan 2013
- Conceived and executed a template-based pipeline framework in Java, elevating the capabilities of existing pipelines and streamlining their development and deployment.
- Architected and engineered web applications to seamlessly integrate diverse pipelines on an HPC cluster, while connecting with databases and remote file systems. Technologies used included Apache, Perl, JavaScript, HTML/CSS, and MySQL.
- Processed and performed primary analysis on a substantial volume of genomic data, including over 3,000 human exomes, 900 human genomes (at 35x coverage), 1,000 custom-captured human samples, and various RNA sequences, in support of population genetics studies.
- Collaborated with scientists and provided customized pipelines to help them analyze large sequence datasets (published in a journal).

### Bioinformatics Analyst @ Baylor College of Medicine, Epigenome Center in Department of Human Genetics
Jan 2010 – Jan 2011
- Took an active role in the development of Next-Generation Sequencing (NGS) pipelines, implemented on a High-Performance Computing (HPC) cluster, for the detection of human DNA methylation.
- Implemented methylation comparison tools that incorporated linear regression analysis as server-side components in a RESTful architecture.

### Bioinformatics Specialist @ Kansas State University, Bioinformatics Center
Jan 2007 – Jan 2009
- Conceived, implemented, and maintained a database, accessible at http://www.beetlebase.org, to provide comprehensive genomic data on the red flour beetle. This project used Apache, Perl, C, PostgreSQL, and HTML/CSS.
- Conceptualized and executed model-based gene expression analysis utilizing whole genome tiling-microarray datasets, resulting in publication in a journal. SAS was the primary analytical tool employed for the research.
- Led the collection, validation, and annotation of gene sequences, culminating in the successful re-assembly of the entire genome sequence of the red flour beetle. This achievement was subsequently published in a peer-reviewed journal.

### Graduate Research Assistant @ University of Illinois at Urbana-Champaign
Jan 2004 – Jan 2007
- Designed, developed, and conducted analysis on the MANET database, accessible at http://manet.illinois.edu, which seamlessly integrated protein structural, functional, and phylogenetic information sourced from the KEGG and SCOP databases. This groundbreaking work was published in peer-reviewed journals.
- Developed object-oriented software packages, including DIM-Pack for statistical analyses and a supply chain simulator, using Visual Basic .NET as the primary development platform.
- Served as the manager for a Mac OS X cluster server and storage and made significant contributions to multiple research projects, resulting in publications in various academic journals.


## Education
### MS in Bioinformatics
University of Illinois Urbana-Champaign

### BS in Computer Science
University of Illinois Urbana-Champaign

### Certified Scrum Master
Scrum Alliance


## Contact & Social
- LinkedIn: https://linkedin.com/in/matthew-kim-28204428

---
Source: https://flows.cv/matthewkim2
JSON Resume: https://flows.cv/matthewkim2/resume.json
Last updated: 2026-04-12