# Tianhao Zang

> MEng in AI @ UCLA | MLE Intern @ Meituan | AI Researcher @ Amazon | Research Assistant & BS in CS @ OSU

Location: Los Angeles, California, United States
Profile: https://flows.cv/tianhaozang

## Work Experience
### Machine Learning Engineer Intern @ Meituan
Jan 2025 – Jan 2025 | Beijing, China
1. Improved query–merchant relevance by +2.13% accuracy and +2.07% recall through multi-stage fine-tuning of a cross-encoder mBERT with feature augmentation, leading to +1.02% UV_CTR and +0.83% GMV in online A/B tests while resolving 83% of bad cases.
2. Expanded training data by 300K+ samples and raised labeling accuracy from 70%+ to 90%+ via user behavior-based sampling (view, click, order) and LLM-enhanced labeling.
3. Boosted benchmark metrics across full-intent queries by applying feature augmentation on both the query and merchant sides during multi-stage fine-tuning, and by designing multiple groups of experiments on user behavior-based training samples and the original ones.
4. Streamlined the offline pipeline by delivering end-to-end relevance modeling, including sample generation, LLM labeling, training, inference, and caching.

### Machine Learning Engineer Intern @ Mech-Mind Robotics
Jan 2024 – Jan 2024 | Beijing, China
1. Built a To-B chatbot in Python backed by a RAG pipeline (data pre-processing, text splitter, text embedding model, vector database, and large language model (LLM)) to help business customers quickly learn details of robot products.
2. Converted HTML text and tables into plain text using BeautifulSoup4 and stored all data, including id, text, url, title, and labels, in MongoDB.
3. Used Cross-segment BERT and the LangChain text splitter to semantically split Chinese and English text into 30,000 chunks of varying lengths.
4. Generated dense and sparse vector embeddings from text chunks using text-embedding-v2 from Ali DashScope and stored them in Ali DashVector with multiple attributes for content-based and keyword-based recall.
5. Used the TF-IDF algorithm to extract robot name keywords from user queries and filter for the intended content on specific robots.
6. Fine-tuned Qwen-plus (LLM) using prompt design and the top-k reranked documents (text, title, and url) retrieved from the vector database by cosine similarity and dot product to generate responses to users.
7. Deployed the RAG-based chatbot with Flask as a Docker image to the official company website and community.

### AI Researcher @ Amazon
Jan 2023 – Jan 2023 | Seattle, Washington, United States
1. Designed an automated pipeline and evaluated models for automatic speech recognition (ASR) correction based on the WikiHow dataset, generating a new knowledge base and improving the speech-to-text performance of task-oriented robots by 30%.
2. Addressed challenges in question type recognition, preprocessed conversational text datasets, and trained question classifiers to improve the robot's NLU.
3. Improved the accuracy of intent recognition to enhance the ability to analyze user instructions.
4. Developed multimodal user interfaces for Echo devices based on Alexa Presentation Language, incorporating voice, text, and video interactions to enhance user satisfaction by 20%.
5. Published a research paper, “TACO 2.0: A Task-Oriented Dialogue System with Mixed Initiatives and Multi-Modal Interaction”, on Amazon Science.

### Research Assistant @ The Ohio State University College of Engineering
Jan 2022 – Jan 2022 | Columbus, Ohio, United States
Analyzed specific audio datasets and generated corrupted audio files and transcriptions. Surveyed and analyzed state-of-the-art ASR models. Built a pipeline combining a text-to-speech model and a speech-to-text model. Improved the ASR error correction model.

### Machine Learning Engineer Intern @ RemoBytes
Jan 2022 – Jan 2022 | San Francisco, California, United States
1. Developed and deployed a convolutional neural network (CNN) for malware classification, achieving 98.7% accuracy on a dataset of 200,000+ binary samples by converting binaries into grayscale images for feature extraction.
2. Optimized the CNN architecture with advanced layers and hyperparameters, implementing data augmentation strategies to enhance model robustness against new malware variants.
3. Integrated the model into a real-time cloud-based malware detection service, using Docker and Kubernetes for scalable deployment, processing thousands of files per minute.
4. Collaborated with the engineering team to integrate the system into the cybersecurity platform, conducting performance evaluations and ensuring seamless API development and deployment.

### Undergraduate Student Researcher @ Carnegie Mellon University
Jan 2021 – Jan 2021 | United States
Trained an SVM model with active learning to classify emails as spam or not spam. Published a paper on the results.


## Education
### Master of Engineering - MEng in Artificial Intelligence
UCLA Henry Samueli School of Engineering and Applied Science

### Bachelor of Science - BS in Computer Software Engineering
The Ohio State University


## Contact & Social
- LinkedIn: https://linkedin.com/in/zangtianhao77

---
Source: https://flows.cv/tianhaozang
JSON Resume: https://flows.cv/tianhaozang/resume.json
Last updated: 2026-04-17