Architected and developed a RESTful API and database schema (tables, views, indexes) for an application that aggregates real estate transaction data.
Designed and built data ingestion processes using Python and SQL to validate, transform, and load 1M+ real estate transaction records.
Developed and automated ETL pipelines using Airflow and Python to ingest and standardize 15+ third-party sources, delivering consistent and reliable data for downstream applications.
Implemented event-driven CDC pipelines using Debezium, Apache Kafka (AWS MSK), AWS Lambda, and Python to capture real-time database changes and trigger metric calculations, updates, and audits.
Delivered analytics-ready datasets and Power BI dashboards providing insights into property performance, pricing, and investment KPIs.
Partnered with cross-functional teams to resolve production issues and ensure data reliability.