I enjoy learning and challenging myself. GitHub: https://github.com/zstatmanweil Website: https://www.zoestatmanweil.com

Experience

Vibrant Planet, Software Engineer

2024 — Now

The Earth Genome, Senior Software Engineer

2022 — 2024

Spearheaded the research, construction, and management of a vector/embeddings search engine using cutting-edge technologies like Qdrant and Milvus for an internal earth observation search project. Stayed up to date with the latest developments in vector databases, adapting quickly to new releases and a changing landscape.

Designed and implemented RESTful APIs in Go for the websites of climate change-focused organizations, delivering data for interaction and visualization. Leveraged Postgres and PostGIS for data storage, geospatial search, and vector tile delivery.

Managed multiple services within Kubernetes, including a SpatioTemporal Asset Catalog (STAC) service and the vector services.

BayGeo, President and Board Member

2019 — 2022

Led a non-profit professional geospatial technology organization dedicated to bringing together and supporting students, educators, developers, and other geo-professionals excited about the world of geography and mapping.

Impact Observatory, Data Engineer

2021 — 2022

Designed, built, and managed a robust data processing system on Azure Batch that deploys a land use/land cover ML model over more than 500 terabytes of satellite imagery to create an annual global map at 10-m resolution. Handled edge cases and transient errors to produce a fault-tolerant, scalable pipeline.

Managed an internal SpatioTemporal Asset Catalog (STAC) built from the stac-fastapi and pgstac open source projects. Participated in those projects via issues and pull requests.

Built the team's first CI/CD pipelines in GitLab to manage the deployment of Docker images and internal Python packages.

Premise Data, Data Engineer

2019 — 2021

Implemented streaming and batch Apache Beam / Google Cloud Dataflow pipelines for data transformation in Python and Scala. Projects included a streaming Scala pipeline to predict payment fraud and a batch Python pipeline to convert unprocessed Premise App submission data to KMZ files downloadable by Premise's clients and Data Science team.

Constructed batch Airflow data pipelines in Python, ranging from a simple pipeline that ingests Facebook ad data and exports it to a Google BigQuery table to complex pipelines, such as one that queries Premise App submission data through SQL, runs it through a Random Forest fraud prediction model via a Docker container, and outputs the results to BigQuery. Built a Delta ETL in Airflow to pre-process data in BigQuery for use by data analysts, saving hundreds of dollars per month.

Built and supported Python services for fraud detection, quality control, and data management. Significant contributions included building auto-quality control modules (Flask REST API modules) to assess incoming Premise App submission data, for example a module that uses a computer vision model developed by the Data Science team to predict the likelihood that incoming photos show the expected content.

Offered support and explained data engineering technology and concepts to the Data Science, Growth, Engineering, Product, and Management teams.

Education

Carleton College

Bachelor of Arts - BA

San Francisco State University

Master of Science - MS

School for International Training

Human Ecology and Resource Management