# Ashish Joshi

> Software Engineer II @ZoomInfo | Java Spring Boot • Scala • Python • Kafka • AWS | Distributed Systems • Microservices • Event-Driven Architecture

Location: San Jose, California, United States
Profile: https://flows.cv/ashishjoshi

I'm a Software Engineer specializing in distributed systems and backend engineering, currently building real-time data processing platforms at ZoomInfo that handle 1M+ events daily across a 145M entity database. My work centers on designing and implementing scalable microservices, event-driven architectures, and high-throughput processing pipelines. I'm particularly passionate about solving complex systems problems at scale - from real-time entity resolution to building resilient data infrastructure that serves multiple engineering teams.

TECHNICAL FOCUS

- Languages: Java, Scala, Python, SQL
- Systems: Distributed systems, microservices, event-driven architecture, real-time processing
- Technologies: Spring Boot, Kafka, Apache Beam, Spark, BigQuery, Solr
- Cloud: AWS (EMR, S3, Redshift), GCP (BigQuery, Dataflow, Cloud Functions)

RECENT IMPACT

- Architected real-time entity resolution system processing 1M+ Kafka events/day with Java Spring Boot microservices
- Improved pipeline validation speed by 50% with distributed Scala-based framework on Airflow
- Reduced incident investigation time by 40% through distributed traceability system with CDC using Apache Hudi

I've progressed from building GCP data platforms at Quantiphi to engineering ML infrastructure at CCC to now designing distributed backend systems at ZoomInfo. Throughout this journey, I've contributed to 8+ production repositories, collaborated across teams, and shipped features that balance performance, reliability, and scalability.

Beyond code, I enjoy leading technical discussions, mentoring engineers, and occasionally unwinding with music and cricket.
Always interested in connecting with engineers working on distributed systems, platform engineering, or backend infrastructure challenges. Let's talk about building scalable systems.

## Work Experience

### Software Engineer II, Data Platform @ ZoomInfo

Jan 2024 – Present | San Mateo, California, United States

- Architected and developed microservices for a real-time entity resolution platform processing 1M+ Kafka events daily across a 145M company database, implementing event-driven data pipelines with Spring Boot that orchestrate data ingestion, profiling, scoring, and persistence to BigQuery and Solr.
- Designed and implemented pluggable profiling engine with multiple specialized profilers, enabling extensible attribute normalization and scoring logic for company data attributes across 150+ data sources.
- Contributed backend features across 8 production repositories serving multiple engineering teams, ensuring code quality through comprehensive testing and cross-team code reviews.
- Improved pipeline validation speed by 50% by building an Airflow-orchestrated validation framework using Scala, processing large-scale Parquet/ORC datasets from S3, querying Snowflake for validation rules, and integrating Slack alerting for real-time anomaly detection across ETL workflows.
- Reduced production incident investigation time by 40% by designing and implementing a distributed traceability system in Scala that captures data lineage, scoring algorithms, and decision rationale across 150+ data sources; architected CDC pipeline using Apache Hudi with Parquet serialization to S3, orchestrated via Airflow.
### Software Engineer, ML Platform @ CCC Intelligent Solutions

Jan 2023 – Jan 2024 | Chicago, Illinois, United States

- Built scalable image processing pipeline using Apache Beam on AWS EMR to process millions of images from S3, designing schema and ETL workflows to populate metadata warehouse in AWS Redshift
- Evaluated and prototyped zero-shot object detection models, presenting technical findings to 20+ engineers; developed proof-of-concept using PyTorch demonstrating production viability for computer vision use cases
- Developed model fine-tuning pipeline for Pix2Struct vision transformer achieving 93% precision on specialized document detection tasks (license plates, odometers, VINs), implementing training infrastructure and evaluation frameworks

### Software Engineer, Data Platform @ Colgate-Palmolive

Jan 2021 – Jan 2022 | Mumbai, Maharashtra, India

- Developed 20+ production ETL pipelines integrating vendor APIs with Salesforce and SAP using Airflow, Python, Docker, and PySpark, supporting customer retention initiative that improved retention by 30%
- Improved system reliability by 50% by building monitoring infrastructure with Airflow to track 50+ GCS storage locations, implementing automated alerting system with customized email notifications to stakeholders
- Designed and implemented SQL-based analytics pipelines on Snowflake using dbt for data transformations, enabling sales KPI reporting for Indian market in collaboration with senior leadership

### Software Engineer, Data Platform @ Quantiphi

Jan 2019 – Jan 2021 | Mumbai, Maharashtra, India

- Architected and built ELT pipelines from Salesforce to BigQuery using Google Cloud Dataflow, designing data models and implementing data quality validation framework to support customer targeting systems (25% accuracy improvement)
- Optimized large-scale data aggregation processing terabytes of user activity data on BigQuery, improving data pipeline efficiency by 40% through query optimization and partitioning strategies
- Developed Python-based image deduplication system processing binary-encoded property images, implementing similarity detection algorithms and hierarchical batching to improve database quality by 50%
- Built end-to-end ML pipeline automation using Airflow, Python, Docker, and PySpark, orchestrating data migration, feature engineering, model prediction, and retraining workflows, reducing manual intervention by 66%
- Implemented serverless data processing using Google Cloud Functions to extract and export daily prediction results from BigQuery to GCS, accelerating downstream system consumption by 50%

## Education

### Master's degree in Computer Science

North Carolina State University

### Bachelor of Engineering - BE in Computer Engineering

University of Mumbai

## Contact & Social

- LinkedIn: https://linkedin.com/in/ashish-joshi-here

---

Source: https://flows.cv/ashishjoshi
JSON Resume: https://flows.cv/ashishjoshi/resume.json
Last updated: 2026-04-11