# Cassie Lee

> Backend Software Engineer | Python | Java | NLP & LLM

Location: Cambridge, Massachusetts, United States
Profile: https://flows.cv/cassielee

I’m a software and machine learning engineer who turns clinical data into reliable, scalable tools that help clinicians move faster and improve patient outcomes. My work spans end-to-end ML pipelines and speech and language modeling for neurocognitive disorders in Python. With a strong engineering foundation and a research mindset, I take a pragmatic, data-driven approach: extracting data with LLMs and ML algorithms, cleaning messy real-world data, and feeding it into models that produce meaningful results.

➝ At Boston University’s Center for Brain Recovery, I built an automated diagnostic pipeline for clinical speech, processing 600+ samples, extracting 103 linguistic biomarkers, and accelerating clinician review by ~80%. I containerized the stack for reproducible runs and cut teammate onboarding from ~2 hours to ~10 minutes.

➝ I stood up an AWS transcription flow (S3 + Lambda + Transcribe) that eliminated manual speech-to-text and streamlined data ingestion for a 7-person cross-functional team.

➝ Previously at Sabre (global travel tech), I shipped Java Spring Boot services and gRPC/Protobuf APIs, including ranking logic for 176k+ hotels by venue proximity. I also automated cost-aware data pipelines on GCP (Cloud Functions + BigQuery), surfacing insights that drove ~$150K/year in savings, and built a Git metrics dashboard that improved engineering visibility in a large monorepo.

🚀 I’m currently seeking a remote, Boston-hybrid, or onsite role (Python Engineer, AI Software Engineer, ML Engineer, Data Scientist, or Backend Engineer), ideally in digital health, neurotech, or applied AI, where real-world impact meets robust engineering.


What I bring to every team:

✅ End-to-end ML/data pipelines (ingest → features → models → evaluation)

✅ Practical NLP for speech and clinical contexts

✅ Cloud and data engineering on AWS/GCP (Lambda, S3, BigQuery, Cloud Functions)

✅ Backend service design (Python, Java, REST/gRPC, Spring Boot)

✅ Reproducibility and MLOps basics (Dockerized environments, clear handoffs)

✅ Cross-functional collaboration with researchers, clinicians, and engineers

Two ways to learn more or get in touch:

💌 Email me at seunghee@bu.edu

💼 Connect with me on LinkedIn (Cassie Lee, Boston-based)

## Work Experience

### NLP Research Engineer @ Boston University Sargent College
Jan 2025 – Present | Boston, Massachusetts, United States

* Preprocessed 600+ clinical speech samples in Python using Stanza and spaCy, segmenting sentences and extracting 103 linguistic biomarkers that feed into ML models, reducing clinician diagnostic time by 80%.
* Automated Python environment setup with a shell script that installs libraries and created a Dockerfile for reproducible deployment, reducing onboarding time for team members from 2 hours to 10 minutes.
* Used Amazon Transcribe with AWS S3 buckets and Lambda functions to transcribe 600+ clinical speech samples into text, automating speech-to-text conversion and eliminating manual transcription.
* Implemented a semantic coherence error detection algorithm that uses cosine similarity of word embeddings to identify discourse-level impairments in patient speech, achieving 80% classification accuracy across coherence subtypes.
* Developed a novel unsupervised clustering algorithm for matching spoken utterances from patients with aphasia and dementia to narrative concepts (e.g., Cinderella), achieving 83.7% accuracy on 10 patient transcripts, surpassing K-means by 10.7%, reducing clinical analysis time by 30%, and enabling personalized treatment for 50+ patients.


### Software Engineer @ Sabre Corporation
Jan 2020 – Jan 2023 | Boston, Massachusetts, United States

* Created and consolidated cost-saving reports using Google’s Recommendation API to show stakeholders how to improve the company’s bottom line.
* Designed BigQuery tables that made raw data more accessible and readable, supporting daily reports over an influx of 10k+ new data entries.
* Incorporated an SDK library of APIs into the microservice application, expanding its functionality and enabling seamless communication with external systems.

### Software Engineer Intern @ Sabre Corporation
Jan 2020 – Jan 2020 | Boston

* Worked on Airbnb deep-link projects that redirect users to flight booking links for 50 airlines.
* Called APIs to retrieve valid booking links (Spring Boot, Maven).
* Role scope included developing cloud, data science, advanced analytics, and machine-learning solutions to understand the travel business and solve problems using Sabre’s and external data (AWS and GCP).

### Software Engineering Intern @ Constant Contact
Jan 2019 – Jan 2019 | Waltham

- Developed new front-end and back-end features on the Contacts team that help the company's clients engage with their customers through the website, in an Agile/Scrum environment.
- The application was integrated using Backbone.js.

### Student @ Boston University
Jan 2015 – Jan 2019 | United States

### Teaching Assistant (Probability) @ Boston University
Jan 2018 – Jan 2018 | Boston

- Assisted students with homework and exams in the ECE Probability course.
- Used MATLAB to help students with curriculum projects.

### Volunteer Web Developer @ Veri Nano
Jan 2018 – Jan 2019 | Greater Boston Area

- Developed the branding website using WordPress, CSS, and HTML.
- Clients can purchase products through the site.


### Software Engineer Intern @ NET-ID
Jan 2018 – Jan 2018 | Seocho-gu, Seoul, Korea

- Created mobile-responsive web pages for the company using HTML and CSS, based on responsive web samples and templates.
- Used a C MQTT client library and a Java MQTT server library with OpenSSL so that the app could send arrays of x and y pointer coordinates to the mobile device, enabling communication through a mobile blackboard.

## Education

### Master's degree in Data Science
Boston University

### Bachelor's degree in Computer Engineering
Boston University

## Contact & Social

- LinkedIn: https://linkedin.com/in/sh-cassie-lee

---

Source: https://flows.cv/cassielee
JSON Resume: https://flows.cv/cassielee/resume.json
Last updated: 2026-03-31