Experience
Mountain View, CA
Evaluation & Automation
Designed and implemented an end-to-end evaluation pipeline covering data ingestion, ground-truth generation, simulation, and performance analysis.
Built a multi-process framework achieving 10× faster runtime, enabling large-scale validation.
Created a flexible data pipeline ensuring reproducibility and traceability across test runs.
Defined evaluation metrics and regression tests aligned with Euro NCAP, GSR, and ASPICE standards with automated reporting.
Developed debug-image generation and interactive visualization tools to speed up root-cause analysis and data-driven release decisions.
Automated Looker dashboards that update with each dataset and software release, visualizing key metrics—detection rate, tracking accuracy, positional and velocity errors—to monitor trends, detect false AEB (Automatic Emergency Braking) events, and guide release planning.
Simulation
Designed a hardware-in-the-loop (HIL) framework validating perception software on TI AM62A and TDA4VH boards using real sensor replay.
Automated multi-board orchestration and YAML-based configuration for scalable parallel runs.
Built metadata-standardization tools for large rosbag datasets (~2 GB / 10 s), eliminating manual setup errors and improving efficiency.
Added validation checks (frame-rate, size, duration) with auto re-runs to ensure reliable ADAS results.
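The validation-with-auto-re-run step can be sketched as below. This is a minimal illustration, not the production pipeline: the field names (`fps`, `size_gb`, `duration_s`), thresholds, and helper functions are all hypothetical assumptions.

```python
# Hypothetical sketch of a rosbag replay sanity check with automatic re-runs.
# Thresholds and metadata fields are illustrative, not the real pipeline's.
MIN_FPS, MAX_SIZE_GB, EXPECTED_DURATION_S = 9.0, 3.0, 10.0

def is_valid(meta: dict) -> bool:
    """Validate frame rate, file size, and duration of one replay run."""
    return (
        meta["fps"] >= MIN_FPS
        and meta["size_gb"] <= MAX_SIZE_GB
        and abs(meta["duration_s"] - EXPECTED_DURATION_S) < 0.5
    )

def run_with_retries(replay, meta_of, max_tries=3):
    """Re-run a replay until its output passes validation, then return it."""
    for _ in range(max_tries):
        result = replay()          # execute one HIL replay (stub)
        if is_valid(meta_of(result)):
            return result
    raise RuntimeError("replay failed validation after retries")
```

Gating each run on cheap metadata checks before accepting its results is what keeps flaky captures from contaminating downstream ADAS metrics.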
Ground Truth Development
Built a semi-automated labeling pipeline replacing vendor and manual annotation with an in-house CVAT-based system.
Integrated Grounding-DINO detection and GRM tracking to auto-generate CVAT-compatible annotations.
Implemented label-schema version control and transitioned to teacher-model pseudo labeling, supporting 10 European countries and 416 traffic-sign variations.
Reduced manual labeling effort by >90% and enabled continuous updates aligned with new model releases.
2022 — 2022
Burlingame, California, United States
Image Clustering Project
Developed a diversity-based image-selection method (covering diverse scenes and contexts) for deep learning model validation, replacing randomly sampled datasets
Surfaced insights into rare and corner cases, so that additional training data could be collected for them
Selected mutually exclusive images across the clusters, which
1) curates the validation dataset by sampling a subset of images from each cluster, and 2) identifies the datasets that effectively improve model performance (in other words, training on large volumes of similar images does not improve metrics such as accuracy).
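The cluster-then-sample idea above can be sketched as follows. This is a hedged illustration, not the project's actual code: the embedding source, cluster count, and per-cluster sample size are assumptions, with random vectors standing in for image features.

```python
# Illustrative sketch of diversity-based image selection via clustering.
# Random vectors stand in for image embeddings; all parameters are assumed.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 128))  # stand-in for image features

# Group images into clusters of similar scenes/contexts.
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(embeddings)

# Curate the validation set by sampling a few images from each cluster,
# rather than drawing uniformly at random from the whole pool.
selected = []
for c in range(kmeans.n_clusters):
    members = np.flatnonzero(kmeans.labels_ == c)
    selected.extend(rng.choice(members, size=min(5, len(members)), replace=False))
```

Sampling per cluster guarantees every scene type is represented, which is what makes the curated set more informative than a same-size random draw.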
Mastered skills in Python, SQL, data wrangling, data visualization, exploratory data analysis, and machine learning
Completed two in-depth capstone projects:
(1) Prediction of Scores for Public Schools in California
I built prediction models using regression and classification algorithms to identify underperforming schools in need of support.
(2) Sentiment Analysis of Movie Reviews using a Deep Learning Convolutional Neural Network
I developed a deep learning model in Python with Keras to automatically classify movie reviews as positive or negative. In particular, I identified and mitigated overfitting and used a pre-trained embedding in the neural network model to improve accuracy.
Extra Project: Data Anonymization for Privacy
De-identified sensitive data by implementing anonymization techniques (e.g., generalization via bucketing, suppression, and pseudonymization)
Implemented k-anonymity and information-loss metrics to measure the privacy and accuracy of anonymized data
Compared the accuracy of machine learning models trained on data at different privacy levels (including the differential privacy technique)
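A k-anonymity check via generalization can be sketched as below. This is a minimal example under assumed column names and bucket widths, not the project's implementation.

```python
# Hedged sketch: generalize quasi-identifiers, then measure k-anonymity.
# Column names and bucketing rules are illustrative assumptions.
import pandas as pd

df = pd.DataFrame({
    "age": [23, 27, 31, 35, 36, 38, 52, 57],
    "zip": [94010, 94014, 94010, 94014, 94010, 94014, 94010, 94014],
})

# Generalization (bucketing): ages into decades, ZIP codes truncated.
anon = pd.DataFrame({
    "age_range": (df["age"] // 10) * 10,
    "zip_prefix": df["zip"] // 10,
})

# k-anonymity: every quasi-identifier combination appears at least k times,
# so k is the size of the smallest equivalence class.
k = anon.groupby(["age_range", "zip_prefix"]).size().min()
```

Coarser buckets raise k (more privacy) but increase information loss, which is exactly the privacy/accuracy trade-off the metrics above quantify.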
https://github.com/ahrimhan/springboard-course
https://github.com/ahrimhan/data-science-project
https://github.com/ahrimhan/data_anonymization
Anam-dong, Seoul
Advised graduate students on developing research topics and conducting experiments
Received two national grants for research projects, including "Efficient Refactoring Candidate Identification" (Principal Investigator)
Published research results in top conferences and journals, including IEEE Transactions on Software Engineering
[Research Project: Efficient Refactoring Candidate Identification]
I developed an efficient refactoring-candidate recommendation system that helps software developers change code more easily. The main challenge is the heavy computation required to evaluate a large number of refactoring candidates, so I developed the following methods to increase computational efficiency.
A fast graph-based coupling metric that calculates the effects of suggested refactorings via matrix computation. This sacrifices some precision (e.g., Connectivity: precision = 0.52, recall = 0.96) but significantly reduces computational complexity, which helps in analyzing large-scale software (our approach: 154 sec vs. EPM, the previous approach: 28,622 sec).
A two-phase search-based refactoring identification method that predicts the probability that each candidate will improve maintainability and evaluates only the candidates with higher chances. Compared with the previous no-reduction approach, the two-phase method is 2.6× (min) to 13.5× (max) faster.
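The matrix-computation idea behind the coupling metric can be illustrated as below. This is not the published method, only a toy sketch: the dependency graph and the coupling score are invented for illustration.

```python
# Illustrative sketch (not the published metric): a graph-based coupling
# measure computed with matrix operations instead of per-candidate graph
# traversal, which is what makes batch evaluation of candidates cheap.
import numpy as np

# Dependency graph among 5 classes: A[i, j] = 1 if class i depends on class j.
A = np.array([
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
])

# One matrix product gives all two-step reachability at once:
# (A @ A)[i, j] > 0 means class i reaches class j through an intermediary.
two_step = (A @ A) > 0

# A toy coupling score per class: direct plus two-step dependencies.
coupling = A.sum(axis=1) + two_step.sum(axis=1)
```

Because one matrix product scores every class pair simultaneously, re-evaluating coupling after each candidate refactoring avoids repeated graph traversals, at some cost in precision as noted above.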
Education
Korea Advanced Institute of Science and Technology
Doctor of Philosophy - PhD
Korea Advanced Institute of Science and Technology
Master of Science - MS
Sogang University