# Youngwan Lim

> AI/ML Engineer at Roblox | Ex-Naver | Ex-Coupang

Location: Sunnyvale, California, United States
Profile: https://flows.cv/youngwan

I am a 10+ year hands-on platform and backend engineer specializing in architecting, building, and scaling mission-critical distributed systems from scratch. I’ve built large-scale infrastructure supporting 100M+ concurrent users and 5B+ messages per day.

I extended this backend expertise into data engineering, spending over three years building data-intensive platforms, including a company-wide A/B testing system used across product teams.

I now apply this full-stack platform background to AI/ML at Roblox, where I launched RobloxGuard (SOTA LLM safety guardrails) and FAI-RL, a reinforcement learning library for LLM fine-tuning.

As a competitive programmer, I placed 6th and 7th in the ICPC Asia Programming Contest and earned multiple national medals.

## Work Experience

### Principal Software Engineer @ Roblox

Jan 2022 – Present | San Mateo County, California, United States

AI/ML Platform team at Roblox | Building SOTA LLM Safety Guardrails (RobloxGuard) | Reinforcement Learning (FAI-RL)

1. Developed RobloxGuard 1.0 Model: Roblox’s Open-Source, State-of-the-Art LLM Safety Guardrails
   - Achieved SOTA performance, outperforming leading models such as Llama Guard, ShieldGemma, NVIDIA NeMo Guardrails, and even GPT-4o on key benchmarks.
   - Launched a Text Generation API powered by RobloxGuard 1.0.
   - 🔗 GitHub: https://github.com/Roblox/RobloxGuard-1.0
   - 🔗 Paper: https://www.arxiv.org/abs/2512.05339
   - 🔗 Text Generation API Announcement: https://devforum.roblox.com/t/beta-introducing-text-generation-api/3556520
2. Implemented & Open-Sourced FAI-RL: A Production-Ready Reinforcement Learning Framework for LLM Fine-Tuning
   - Engineered a unified framework to support RL algorithms including DPO, PPO, GRPO, GSPO, and Supervised Fine-Tuning (SFT).
   - Designed a highly extensible system with YAML-based configs and support for custom reward functions and dataset templates.
   - 🔗 GitHub: https://github.com/Roblox/FAI-RL
3. Implemented an LLM Labeling Platform (Built from Scratch, Integrated with Label Studio)
   - Built an automated data pipeline for LLM labeling tasks.
   - Built a test case framework to streamline prompt engineering.
   - Implemented a daily search quality evaluation.
   - Added functionality to sample labeled data and evaluate its quality through human evaluators and LLMs.
   - Fine-tuned a CLIP model for image labeling, effectively addressing a cold start issue.
   - Added the ability to retrieve internal datasets for use in a RAG system.

   Impact:
   - Created over 30 datasets with the help of 300 human evaluators.
   - Trained more than 15 models across 5 different teams.
   - Enhanced search quality by 2.2% through a fine-tuned CLIP model.

### Staff Software Engineer @ Coupang

Jan 2018 – Jan 2022 | San Francisco Bay Area

Built and scaled a real-time, multi-country A/B testing platform handling 200K QPS, including monitoring and circuit breakers.

1. Built an in-house A/B test platform over more than 3 years, based on Java/Scala/Spring/Spark and Kafka/Hive/AWS.
   - Implemented a Circuit Breaker from scratch to detect problematic A/B tests within a few minutes, based on Scala/Spark Streaming/Kafka/S3/Hive/Oozie/Yarn/Sqoop.
   - Implemented Exploration mode from scratch on ClickHouse/ZooKeeper, reducing the wait for a full result set update from up to 24 hours to around 10 minutes.
   - Implemented a query-based monitoring system and a rule-based message generator from scratch, based on Prometheus/Kotlin/MySQL.
   - Created a new data pipeline from scratch for the exposure details widget to detect major exposure logging issues as early as possible.
   - Set up infra/batch/deployment/monitoring to expand the A/B test platform from one country to multiple countries.

### Senior Software Engineer @ Carousell

Jan 2017 – Jan 2018 | Singapore

Carousell is a simple and easy way to buy and sell with anyone.

1. Moved the Search Service from a monolith toward microservices, based on Go/gRPC/Protobuf/Envoy and Docker/Kubernetes/Elasticsearch.
   - Implemented real-time item quality and seller scores for search ranking from scratch, using Apache Beam/Redis/Kafka/Elasticsearch.
   - Ingested user impressions into Elasticsearch and implemented simple random buckets to support A/B testing.
   - Implemented boosting of new sellers’ items on the home page.
   - Implemented a cache layer to reduce the number of Elasticsearch accesses, lowering latency and cost.

### Staff Software Engineer @ Coupang

Jan 2015 – Jan 2017 | South Korea

Built high-performance messaging and data streaming platforms, managing massive user messages and company-wide Kafka/RabbitMQ clusters.

1. Implemented a Messaging Platform to handle massive message volumes for users, based on Java/Vert.x/Spring and Redis/Cassandra.
   - Implemented a user service to collect user activity logs from connected sessions using Vert.x/WebSocket/Redis.
   - Implemented an inbox service to send and receive massive message volumes using Spring/Redis/Cassandra.
2. Provided a company-wide Kafka cluster, built from scratch on Kafka/RabbitMQ/Spark and Mesos/Marathon, to aggregate data from different teams.
   - Set up and maintained Kafka and RabbitMQ clusters for the entire company on AWS.
   - Implemented cluster migration and data transmission between topics, based on Spark Streaming/Mesos/Marathon.

### Senior Software Engineer @ NAVER

Jan 2011 – Jan 2015 | South Korea

Built the LINE Push Platform handling 100M+ concurrent users and 5B daily messages, designing high-throughput distributed messaging and Redis-based services.

1. Implemented the LINE Push Platform from scratch over 4 years, serving over 100M concurrent users and 5B messages per day, based on Java/Netty/Spring and Redis/ZooKeeper/MySQL.
   - Implemented Session-Service to handle over 1M concurrent users per server using Netty.
   - Implemented Message-Service to distribute 5B messages per day using Spring/Redis.
   - Implemented a high-throughput, distributed Message Queue platform using Luxun, placed in front of the API Gateway, for asynchronous message processing at over 1M messages per second at peak time.
   - Implemented service discovery to look up servers, support multiple regions, and handle failover, replacing load balancers.
   - Implemented a Redis cluster manager for high scalability and availability using Redis/ZooKeeper.

## Education

### Bachelor's degree in Electrical and Electronics Engineering

Yonsei University

## Contact & Social

- LinkedIn: https://linkedin.com/in/youngwan-lim

---

Source: https://flows.cv/youngwan
JSON Resume: https://flows.cv/youngwan/resume.json
Last updated: 2026-04-12