Led end-to-end development of real-time ingestion pipelines using GraphQL, Kafka, DynamoDB Streams, and AWS Kinesis, making both real-time and offline data queryable through ClickHouse, Druid, and Presto.
Developed GraphQL-based ingestion to stream data into real-time (ClickHouse, Druid) and offline (S3 Parquet) data stores.
Designed and implemented real-time CDC (Change Data Capture) pipelines from DynamoDB into ClickHouse and Apache Druid, enabling sub-second ingestion and millisecond-latency querying for user-facing dashboards.
Built GraphQL-based APIs that serve both streaming and historical data through a unified query interface.
Architected resilient data ingestion systems with replay, ordering guarantees, and schema evolution handling.
Integrated streaming and batch pipelines to support both real-time insights and offline reporting and analytics from a single data model.