# Elliott Drabek, Ph.D.

> Bioinformatics pipelines you don’t have to think about

Location: Mountain View, California, United States
Profile: https://flows.cv/elliottdrabekphd

● Experienced bioinformatics software engineer and data scientist
● Effective communicator, planner, and leader
● Expert at delivering clean, performant bioinformatics software and clear insights from NGS data

Bioinformatics: NGS workflows, RNA-seq and differential expression (bulk and single-cell), metagenomics (amplicon-based), sequence assembly, population genomics, immune repertoire analysis, variant calling

Programming languages: Expert in Python, R, Bash, SQL, AWK. Capable in C/C++, Java, JavaScript

Software technologies: Nextflow, AWS (Batch, EC2, ECR, RDS, CloudWatch, S3, CodeBuild, Athena), Linux, Docker, mamba/conda, Git, Travis CI, MongoDB, PostgreSQL, Jupyter, R/Shiny

Data science: Machine learning, multivariate statistics, neural networks, sequence alignment, hidden Markov models, hypothesis testing, clustering, dimensionality reduction

Other: Test engineering and test-driven development, scrum, product ownership

Publications: 40+ peer-reviewed publications across a broad range of topics in biomedicine (and, earlier, in statistical natural language processing), contributing as a software engineer (pipelines and algorithms) and as a data scientist. https://scholar.google.com/citations?user=LRdWf5cAAAAJ&view_op=list_works&sortby=pubdate

## Work Experience
### Bioinformatics Software Engineer @ ClearNote Health
Jan 2024 – Present
I engineer AWS-native bioinformatics pipelines, with an emphasis on robustness, clean design, and testing.

### Data Engineer @ Johnson & Johnson
Jan 2023 – Jan 2023
DEVELOPED AWS SOLUTIONS for centralized management of petabyte-scale NGS data, downstream data, and metadata from all J&J bioinformatics platforms and translational research worldwide
● Improved reliability and operational efficiency by systematizing and automating six heterogeneous recurring database updates, replacing inconsistent semi-manual processes that had complex, undocumented dependencies with fully automated cron jobs using reproducible micromamba environments and uniform deployment, logging, and error notification
● Improved the speed and cost-effectiveness of cataloging data assets and identifying duplicate data using AWS Athena and S3 inventories

### Senior Bioinformatics Software Engineer @ Atreca, Inc.
Jan 2016 – Jan 2022 | San Carlos, California, United States
MANAGED SOFTWARE PROJECTS from requirements through delivery, leading multiple engineers
● Empowered scientists & liberated bioinformaticians with a self-service web UI that lets scientists launch & monitor bioinformatics pipelines on AWS Batch
● Enabled scientists to quickly contextualize drug candidates by searching very large internal & external databases for antibodies that potentially bind similar epitopes. The system comprised a workflow made tractable by a careful combination of exact & approximate sequence matching, a mechanism to automatically run the workflow & store the results for all antibodies from the company’s core pipeline, & a user-facing dashboard for exploring the results

DEVELOPED & MAINTAINED NGS WORKFLOWS
● Developed the primary analysis pipeline for an in-house NGS-based immunophenotyping platform adjunct to 10x immunosequencing
● Extended the company’s core pipeline with multiple annotations, such as sequence-based detection of potential development liabilities
● Deeply refactored the company’s core pipeline to isolate dependencies, improving maintainability, enabling containerization, & streamlining testing
● Implemented automated monitoring & reporting of possible laboratory contamination

ANALYZED NGS & OTHER DATA, communicating insights to dry lab, wet lab, & clinical audiences
● Supported development, optimization, & troubleshooting of multiple novel NGS platforms
● Identified signatures of selection & potential drug candidates in patient immune repertoires

MANAGED BIOINFORMATICS AWS INFRASTRUCTURE for a group of seven bioinformaticians & dozens of users, maintaining a dependable, cost-effective ecosystem, allowing my boss to focus on the scientific mission.

PROACTIVELY LED IMPROVEMENT OF SOFTWARE ENGINEERING PRACTICES
● Initiated & led adoption of automated CI/CD for all pipelines via Travis CI
● Initiated & led transition to Python 3 & Dockerization
● Coded a comprehensive test suite bringing all pipelines under test

### Senior Bioinformatics Software Engineer @ University of Maryland School of Medicine Institute for Genome Sciences
Jan 2010 – Jan 2016 | Baltimore
DEVELOPED NGS PIPELINES combining off-the-shelf software & novel algorithms for:
● Analysis of simultaneous host-pathogen RNA-seq data
● Hybrid assembly of bacterial genomes assisted by whole-genome optical maps
● Hybrid assembly of P. falciparum var genes from PacBio and Illumina data
● Population genomics analysis of homologous gene sets from unfinished whole-genome assemblies
● Primary analysis of 16S metagenomics data, starting from raw sequencing reads & ending with taxonomic abundance profiles

ANALYZED NGS & OTHER DATA & PRODUCED CUSTOM DATA VISUALIZATIONS for diverse systems including:
● 16S taxonomic profiles & associated clinical data
● Single- & multiple-genome gene expression profiles (transcriptomics, host/pathogen transcriptomics, & metatranscriptomics)
● Whole genome & single-gene comparisons of assemblies & SNP calls
● Population genomics analysis of annotated assemblies

PARTICIPATED IN THE DEVELOPMENT OF A STATISTICAL METHOD for identifying networks of interacting taxa in taxonomic profile data, & produced the R implementation of the distributed software (MetaDistance).

TAUGHT R PROGRAMMING & METAGENOMICS ANALYSIS for graduate courses & professional workshops.

MENTORED & SUPERVISED THE DAILY WORK OF DOCTORAL STUDENTS.

### PhD student/Research Assistant @ Johns Hopkins University
Jan 2001 – Jan 2009 | Center for Language and Speech Processing, Baltimore
Conducted original research in machine translation.

Developed statistical models of complexly structured multilingual data in large heterogeneous databases.

Advanced the state of the art in objectively measured performance on several tasks.

Wrote tens of thousands of lines of code, in six or more programming languages.

Wrote hundreds of pages of published scholarly prose.

Developed web-based interfaces for data collection from volunteers.


## Education
### PhD in Computer Science (natural language processing)
The Johns Hopkins University

### Master of Engineering (MEng) in Computer Science (natural language processing)
The Johns Hopkins University

### Master of Engineering (MEng) in Computer Science (natural language processing)
Tsinghua University

### Bachelor of Arts (BA) in Linguistics, with minors in Computer Science and Chinese Language
University of Arizona


## Contact & Social
- LinkedIn: https://linkedin.com/in/elliott-drabek-9b790423

---
Last updated: 2026-04-10