Stanford, California, United States
Joined Dr. Katja Weinacht’s lab, a leader in iPSC-derived thymic cell research, to study thymic biology through high-dimensional data analysis. Began by analyzing and visualizing bulk and single-cell RNA-seq datasets in R (Seurat, clusterProfiler, WGCNA) and Python (matplotlib, pandas), then moved into a machine learning role developing computational models of thymic cell development and differentiation.
Perturbation Prediction System (multiDCP, Bulk RNA-seq): Developed an end-to-end system for predicting transcriptional responses to diverse perturbation inputs. Trained multiDCP on tens of thousands of thymic bulk RNA-seq samples and built a ranking framework that surfaces the perturbations most likely to achieve a specified gene expression goal.
Biomni (Open-Source Bioinformatics AI Agent): Deployed and extended Stanford’s AI agent for natural language-driven data analysis, repurposing a MacBook as a server for lab-wide access. Expanded its capabilities by writing new MCP modules (e.g., WGCNA, Hardy–Weinberg analysis). Built server-side infrastructure with Docker to support long-running jobs and let multiple researchers run analyses concurrently, and integrated Kubernetes autoscaling to provision containers dynamically during peak demand.