# Min Guo

> Software Engineer / Data Engineer at NCR Corporation

Location: Redwood City, California, United States
Profile: https://flows.cv/minguo

## Work Experience

### Software Engineer / Data Engineer @ NCR Corporation

Jan 2020 – Present | Redwood City, California, United States

Build a streaming API that allows third-party financial institutions to consume real-time banking activity data through the Apigee API service using Pub/Sub. Migrate the existing MapReduce-based on-prem data pipelines and platform to streaming-based data pipelines on GCP using Apache Beam.

- Build and deploy streaming Dataflow pipelines that process ~2k syslog messages per second. The pipelines consume data from Pub/Sub and Firestore, transform syslog data (filtering, validation, data replay, de-duplication, and grouping) into banking activity Avro data, and ingest it into BigQuery and Bigtable, which allows third-party financial institutions to query the real-time data through the Apigee API service.
- Build and deploy batch Dataflow pipelines that read data from BigQuery, transform it (filtering, validation, and grouping), and generate daily reports for third-party financial institutions.
- Introduce and deploy Apache Airflow as the workflow scheduling tool, along with Cloud Functions, to run the daily/hourly batch Dataflow pipelines that generate reports in Google Cloud Composer.

### Software Engineer @ Lumiata

Jan 2017 – Jan 2019 | San Mateo, CA

- Migrated and evolved a manual on-premises data processing tool into a cloud-based automated data pipeline.
  - Designed a generic ETL pipeline that receives raw healthcare data in different schemas and ingests it into a standard format using Spark 2.4 and Scala.
  - Introduced Apache Airflow as the workflow scheduling tool and brought it to production on AWS EMR.
  - Migrated the Lumiata ETL pipeline, with ~40 million patient healthcare records, from on-premises to AWS EMR and GCP Dataproc.
  - Transformed ~40 million raw patient healthcare records from CSV format to a standard healthcare data format (HAPI FHIR) and imported the standard output into BigQuery.
- Built a generic PySpark application to generate summary reports for ~40 million raw patient healthcare records. Validated the raw healthcare data based on data types and generated statistics reports for all values.
- Built integration tests for the Lumiata ETL pipeline. Generated statistics reports for both the raw data and the standard healthcare data to test the ETL.
- Verified and debugged data using Jupyter Notebook.

### Software Engineer Internship @ GoFind.ai

Jan 2017 – Jan 2017 | San Francisco Bay Area

Developed the search infrastructure on the Microsoft Azure cloud with the MEAN stack.

### Research Assistant @ The University of Queensland, Australia

Jan 2009 – Jan 2010

Successfully synthesized multilayer composite films of gold nanoparticles and semiconductor nanosheets via in-situ and layer-by-layer assembly methods.

## Education

### Master's degree in Computer Science

New York University

### Doctor of Philosophy - PhD in Physics

Xiamen University

### Bachelor's degree in Mathematics and Physics

Inner Mongolia University

## Contact & Social

- LinkedIn: https://linkedin.com/in/minguo1

---

Source: https://flows.cv/minguo
JSON Resume: https://flows.cv/minguo/resume.json
Last updated: 2026-03-29