# Kunyi Liu

> Sr Software Engineer at Microsoft

Location: Snoqualmie, Washington, United States

Profile: https://flows.cv/kunyi

Senior Software Engineer focused on infrastructure and backend systems. Experience designing and operating reliability-critical platforms. Currently exploring AI infrastructure and model-serving systems, with an emphasis on reliability, observability, and cost-efficient scaling. Interested in senior-level roles in infrastructure, platform, or AI systems engineering.

Feel free to contact me: kunyi.liu.swe@gmail.com.

## Work Experience

### Senior Software Engineer @ Microsoft

Jan 2024 – Present

Daemon Job Framework: Continuous Execution Infrastructure (2024 - 2025)

* Designed and built a daemon job framework for long-running and streaming workloads, boosting throughput by 70%+, improving reliability, and reducing service overhead with in-memory caching across job cycles.
* Implemented auto-version upgrades with zero-downtime rollouts and geo-redundancy with safe staged deployments.
* Delivered telemetry dashboards with latency, reliability, and timeline tracing metrics for deep operational visibility.

Surgical Failover: Data Pipeline for Failover “Doctor” Engine (2025 - now)

* Built a streaming topology pipeline using the daemon job framework to continuously ingest Azure and SharePoint data, serialize it with Protobuf, and publish it to blob storage.
* Designed and implemented Doctor, a central decision engine that continuously ingests topology and telemetry from blob storage, snapshots the data for algorithm execution, and outputs automated failover decisions.
* Optimized throughput with a producer–consumer model and sliding-window caching, processing 1M+ blobs per 30 minutes within 2 GB of steady-state memory.
### Software Engineer II @ Microsoft

Jan 2021 – Jan 2024

Directory Partition Migration & Failover Automation (2023 - 2024)

* Led the migration from an Active Directory–based solution to a SQL-based directory architecture, driving cross-team collaboration between the SharePoint Directory and Disaster Recovery teams, and designing schemas, APIs, and CRUD workflows for directory partitions and replicas.
* Architected automated partition failover with a traffic-light gating system and pre-failover validation, ensuring seamless recovery without data loss across 400+ production farms.
* Developed monitoring and alerting for long-running or failed failover operations, significantly improving failover reliability, recovery time, and operational performance.

### Software Engineer @ Microsoft

Jan 2020 – Jan 2021 | Washington

### Product Developer @ Indeed.com

Jan 2018 – Jan 2020 | Austin

Employment-Related Service for Contractor Jobs

* Worked on full-stack development of a web application built with Django, MySQL, and ReactJS that lets candidates apply for matched jobs and lets recruiting agencies manage requisition activities.
* Implemented automatic user-interface refresh by adding setInterval timers and React lifecycle methods to the relevant containers and components.
* Extended the project with a sourcing React app for creating, editing, and filtering job requisitions and sending invitations to potential candidates, reducing the time to receive user applications by at least 60%.
* Set up a Node.js cron job to create and refresh Elasticsearch indices from database query results, and wrote search queries with aggregations to score and rank the replying candidates.
* Delivered a phone-call transcription feature that lets users review contact history, using Twilio dual-channel recordings and customized IBM Watson Speech to Text models.

Find Me a Meeting
* Built an API-design-first web application whose API features include creating events on a shared calendar, fetching free/busy availability, generating unique meeting URLs, and modifying user preferences.
* Overrode the default login and logout views to let recruiters sign in via Google and Azure OAuth flows.
* Used token-based authentication, session-based authentication, and custom permission classes to enable role-based and object-level access control.
* Integrated the Datadog and Sentry monitoring tools to track unexpected errors and performance.

### Research Intern @ Columbia University Irving Medical Center

Jan 2017 – Jan 2018 | New York, New York

Project: Making evidence appraisal available and computable

1. Scraped and extracted useful fields, such as pages, authors, and edits, from wikijournalclub edits in XML format using the API, regex, and the beautifulsoup4 package in Python.
2. Classified edits as minor or substantive based on features such as flags and size change.
3. Used text classifiers to determine which substantive comments are appraisals of studies and which are other information or assertions.

### Research Team Member @ Bloomberg

Jan 2017 – Jan 2018 | Greater New York City Area

Automatic Summarization of Congressional Bills

1. Preprocessed the bills and summaries by writing XML parsers, and explored latent relationships through exploratory data analysis of text-length distributions.
2. Collected various summarizer implementations and ran algorithms such as LexRank and KL-Sum as baselines.
3. Planned research into ROUGE and other metrics for evaluating models, and an API for the best summarizer.

### Data Intern @ Knotel

Jan 2017 – Jan 2018 | Greater New York City Area

1. Extracted and integrated data from various platforms, including a MongoDB database and developer APIs, and built a Python data pipeline to prepare the data for analysis.
2. Performed ad hoc data analysis and data visualization in RStudio.
3. Assisted in designing and establishing the data warehouse (Redshift) and the ETL (Extract, Transform, Load) process to centralize data from disparate systems.

### Web Developer Intern @ SnowSugar Video

Jan 2017 – Jan 2017 | Greater New York City Area

1. Identified and fixed a Japanese keyword hack on a WordPress website built with PHP.
2. Explored MongoDB (CRUD and advanced operations, and programming-language drivers) and researched mLab, including connecting mLab to Heroku.
3. Learned how to create a MERN (MongoDB/Express/React/Node.js) stack application and deploy it to Heroku.

### Summer Intern @ Nanhai Administration of Power Supply, Communication Information Department

Jan 2015 – Jan 2015 | Guangdong, China

1. Learned the operation of the power supply information system and the topology of the power data network in Nanhai district, and practiced the communication fault handling process.
2. Applied knowledge of information security and information networks at work.
3. Improved teamwork skills and the ability to learn quickly.

### Spring Intern @ China CITIC Bank International Limited

Jan 2014 – Jan 2014 | Foshan, Guangdong, China

1. Assisted with statistical work for retail management accounting and other financial statements.
2. Managed customer, product, and distribution-channel data in the retail department, organized customer files, and used data mining techniques to provide customized services and improve service quality.
3. Integrated various kinds of data from financial statements to gain a concise overview of the operations of certain retail companies.
4. Learned the basics of banking and finance, applied knowledge of management information systems at work, and improved practical skills.

## Education

### Master of Science (MS) in Data Science

Columbia University

Jan 2016 – Jan 2018

### Bachelor's degree in Management Information Systems, General

Sun Yat-sen University

Jan 2012 – Jan 2016

## Contact & Social

- LinkedIn: https://linkedin.com/in/kunyi-liu-06a038b5

---

Source: https://flows.cv/kunyi

JSON Resume: https://flows.cv/kunyi/resume.json

Last updated: 2026-03-22