# Rebecca Poch

> Data Engineer

Location: New York, New York, United States
Profile: https://flows.cv/rebeccapoch

Software Engineer with seven years of experience in Data Engineering and DevOps for internal engineering platforms, recently focused on frontend development with React. Experienced in Terraforming AWS infrastructure and setting up CI/CD pipelines to improve developer experience. Strong presentation skills for both technical and non-technical audiences, used for project alignment and knowledge sharing. Seeking opportunities to solve meaningful business problems through engineering while collaborating across technical and product teams.

## Work Experience

### Software Engineer, Data @ Vendelux
Jan 2024 – Present

### Senior Data Engineer @ Redesign Health
Jan 2023 – Jan 2024

Data Team - Internal Data Platform

- Built and deployed services for the internal data platform using open-source components: Argo Workflows for ETL job orchestration, Trino for querying across data stores, Hive Metastore for managing Iceberg tables, Apache Ranger for securing data permissions, Superset for BI visualizations, and DataHub for metadata discovery.
- Initiated Proof of Concept (POC) deployments using official or open-source third-party Helm charts on the existing EKS cluster. Used Terraform to manage AWS dependencies: IAM role permissions and S3 buckets.
- Enhanced deployment security with the DevOps team by implementing Okta SSO and Vault for secure management of deployment values.
- Used Prometheus and Grafana to monitor application uptime, track long-running and resource-intensive queries, and debug error logs.

Redesign Platform Core Team - Frontend Development

- Developed frontend pages for the Redesign Health Platform Portal with React, using reusable components where appropriate.
- Used Storybook for rapid page development, independent of API connection.
- Collaborated with a team member to integrate the API contract into unit tests, validating mocked data against that contract.
- Identified a gap in unit test coverage and bootstrapped tests to raise coverage from near 0% to 45%.
- Created Jira tasks aligned with product requirements and Figma design screens.

Implemented devcontainer setup with docs for streamlined onboarding for both data and core teams.

Unfortunately, no longer continuing in this role due to a staff reduction.

### Data Engineer @ Redesign Health
Jan 2022 – Jan 2023

### Data Engineer @ H1
Jan 2021 – Jan 2022

Data Platform Lead - Clinical Trials

- Ingested a daily full feed of clinical trial data and recorded changes over time.
- Expanded the process to incorporate multiple countries and establish linkages.
- Orchestrated and managed workflows using Argo Workflows.
- Wrote PySpark jobs for reading and writing Apache Hudi tables, along with Python jobs, ensuring comprehensive test coverage.
- Coordinated with internal data consumers to deliver normalized, cleaned data through shared message contracts on a queue that consumers pull from.
- Implemented Continuous Integration (CI) with unit and integration test runs for each commit, a Dockerized build job, and CircleCI integration. Managed AWS infrastructure using Terraform and coordinated with the DevOps team.
- Led agile grooming and planning sessions to strategize sprints and deliverables. Maintained alignment with technical manager and product owner priorities.
- Mentored and conducted code reviews for a team of 4 that started with minimal Python and no PySpark experience.
- Maintained up-to-date documentation for the evolving team and project.

### Software Engineer @ Elsevier
Jan 2017 – Jan 2021 | Greater New York City Area

Chemical Substance Normalization

- Collaborated with Subject Matter Experts (SMEs) to build and refine a rule-based approach for enriching substance metadata and linking substances.
- Utilized AWS Step Functions for end-to-end task sequencing and error handling. Created jobs in Lambda and AWS Glue with PySpark for each data processing step.
- Created Kibana dashboards for data sharing and quality analysis before data releases. Created Elasticsearch index schemas, indexed data, and provided query structure for application team requirements.
- Established linkage for 21 million unique substances to original content sources, available for consumption by application teams in both an RDS database and Elasticsearch.

Geolocation Tagging Engine

- Developed a service to identify and return the top locations mentioned in a document. The existing technique for determining top locations, based on area overlap count, was too slow and yielded lower-quality results than commercial software.
- Reduced total runtime per document from 2 hours to under 20 seconds.
- Achieved 65% improvement in top location result quality.
- Implemented algorithm for area overlap count using open-source Python libraries (GeoPandas, Shapely).
- Replaced the dictionary matching component for locations with spaCy PhraseMatcher for faster matching and simplified setup.
- Implemented unit and integration tests where there were none before.
- Integrated service into CI/CD pipeline using GitHub, Docker, Jenkins, and Terraform for scaling and deployment.

### Computer Center Operator @ The Cooper Union for the Advancement of Science and Art
Jan 2016 – Jan 2017 | Cooper Union Computer Center, 41 Cooper Square

- Assisted Cooper Union professors, staff, and students with day-to-day operations and troubleshooting of computers, printers, network connections, and other IT needs. Maintained inventory and stock.

### GMS Advanced Manufacturing and Innovation Intern @ Bristol-Myers Squibb
Jan 2016 – Jan 2016

- Trained a neural network with scikit-learn on time-series data to determine the critical quality attributes of a continuous direct compression tablet system.
- Created an annotated IPython notebook, with documentation, that allows changing process parameters for job runs.

## Education

### Bachelor of Engineering (B.E.) in Chemical Engineering
The Cooper Union for the Advancement of Science and Art
Jan 2013 – Jan 2017

### High School
Thomas Jefferson High School for Science and Technology
Jan 2009 – Jan 2013

## Contact & Social

- LinkedIn: https://linkedin.com/in/rebeccapoch

---

Source: https://flows.cv/rebeccapoch
JSON Resume: https://flows.cv/rebeccapoch/resume.json
Last updated: 2026-03-23