# Karan Alang

> Principal Software Engineer/Architect - Distributed Systems, Cloud Computing, ML/AI | Speaker | Hackathon Judge | Mentor | IEEE Senior Member | ACM Member | Forbes Technology Council | Open Source Contributor

Location: Cupertino, California, United States
Profile: https://flows.cv/karanalang

Experienced Software Engineer/Tech Lead/Architect with extensive experience in the design and development of scalable Distributed Systems, Data Engineering, Cloud Computing, and ML/AI.

Currently working as a Software Engineer at Versa Networks Inc., as part of the ML/AI group focused on Data Engineering and Cloud Computing.

Other companies/clients worked with include Apple Inc., Intel, 3M, Lenovo, and US Steel.

Key technologies used: Python, Java, Scala, Apache Hadoop, Apache Spark, Apache Kafka, PySpark, Grafana/Loki, Prometheus, GCP, AWS, Kubernetes, Redis, MongoDB, Airflow, Terraform

## Work Experience

### Principal Software Engineer @ Versa Networks
Jan 2021 – Present | San Jose, California, United States

Part of the ML/AI group.

Tech Stack: Apache Kafka, Apache Spark, GCP, Kubernetes, Helm charts, Python, Java, Airflow, MongoDB, Prometheus, Grafana/Loki, Redis, Terraform

ML/AI: Anomaly detection (Isolation Forest), prediction using FB Prophet, LLM

### Incorta Development Project @ Apple
Jan 2019 – Jan 2021 | Sunnyvale

- The project involves creating an automated system that reconciles Apple employee compensation across the Apple Payroll and HR systems. Automating the payroll checks and increasing their frequency increases payroll accuracy and decreases the workload for both Apple and ADP with regard to last-minute escalations and fixes.
- Technology Stack: Apache Spark, Python, Apache Kafka, Apache Parquet, JSON, Incorta (in-memory data warehousing & visualization tool), Unix, GitHub

Accomplishments/Responsibilities:

- Led the design, architecture, and development activities for the Incorta Audit and Data Warehousing development project.
- Key participant in design and architecture decision making, working closely with upstream and downstream teams at Apple to finalize non-functional requirements and translate them into technical solutions.
- Designed/developed the data model and data ingestion pipelines.
- Used PySpark (primarily analytical functions) to read data from Parquet files and create the semantic layer.
- Anchored proof-of-concept (PoC) development to validate proposed solutions and reduce technical risk.
- Provided critical technical/architecture-related inputs to help finalize the plan/schedule and resource requirements.
- Involved in interviewing/hiring and mentoring team members, assigning/monitoring tasks and deliverables, and providing technical guidance to developers on the project.
- Supported the application during the warranty period, helping resolve/troubleshoot complex technical problems.

### iTunes Analytics @ Apple
Jan 2018 – Jan 2019 | Cupertino

The project involves using iTunes data for analytics/reporting. It involves interfacing with the Business and Core Induction teams to understand analytics requirements and translate them into technical architecture/BI solutions.

Responsibilities include design and development of aggregates, identifying and implementing architecture-related improvements/automation, and mentoring/helping junior developers (w.r.t. technical solutions, Scala/Spark coding standards, etc.).
Technology Stack: Apache Spark, Scala, Cloudera 5.11.1 (Hadoop, Hive), Jupyter notebooks, Apple Private Cloud (S3, EC2, Jenkins, Splunk)

Accomplishments:

- Designed/developed analytics solutions - reading data in Parquet format and using Scala + Apache Spark to implement complex business logic.
- Worked on a PoC to evaluate/configure Jupyter notebooks with Scala to enable complex logic and visualization charts. (By default, advanced visualization is available only when using Python in Jupyter notebooks.)

### Senior Technology Architect @ Infosys Technologies Ltd
Jan 2010 – Jan 2021

### Apple Data Center – Building a Highly Available, Scalable Real-Time Data Pipeline @ Apple
Jan 2016 – Jan 2018 | Newark, CA

Description: The project involves setting up a highly available, scalable, centralized real-time data pipeline for data center data. The data is used for analytics/reporting.

Technology Stack: Hortonworks 2.x platform, Apache/Confluent Kafka, Apache Spark, AWS (S3, EC2), Hadoop, Prometheus, Grafana, Hive, Scala, Jenkins

Accomplishments/Responsibilities:

- Phase 1: Developed data pipelines using Apache Spark + Scala, moving data from Kafka to HDFS/Hive (on the Hortonworks 2.x platform).
- Phase 2: Developed data pipelines using Apache Spark + Scala, moving data from Kafka to AWS S3.
- Confluent Kafka: Evaluated platform capabilities, published best practices, and implemented/optimized key Confluent modules - Control Center, Auto Data Balancer, Schema Registry, Replicator, Kafka Connect, data stream monitoring, etc.
- Evaluated and defined Hadoop security capabilities for HDP 2.x components - HDFS, Kafka, Hive, HBase, Spark, OpenTSDB, Grafana.
- Implemented security for HDP 2.x; components implemented include:
  - Kerberos for authentication
  - SSL for Confluent Kafka
  - Apache Ranger (role-based access)
  - Apache Knox (perimeter security)
  - Encryption using Hadoop KMS
- Defined best practices for Hadoop governance, encompassing security, lifecycle management, data quality, metadata management, operations, and reporting.

### Apple Data Center – Asset Management @ Apple
Jan 2016 – Jan 2016 | Newark, CA

Description: The project involves migration of the existing asset management solution on SQL Server to a NoSQL database (MongoDB).

Accomplishments/Responsibilities:

- Worked with the system owner to identify existing system capabilities, limitations, and future system requirements; the existing system database is SQL Server.
- Defined the MongoDB architecture; key requirements included multi-data-center support, active-active deployment, performance, and audit.
- Evaluated/compared Big Data architectural stack options for storing/querying hierarchical data – MongoDB, Cassandra, CouchDB, Oracle.
- Defined H/W requirements for setting up Dev/Test/Prod MongoDB clusters.
- Defined collections in MongoDB for enhanced CRUD performance.
- Performed code reviews for ETL into MongoDB.

### Apple Data Center BI Strategy @ Apple
Jan 2016 – Jan 2016 | Newark, CA

The project involves setting up a Big Data BI/Analytics system for the Apple Data Center group from the ground up.

Accomplishments/Responsibilities include:

- Worked with system owners to understand existing systems and BI capabilities
- Evaluated Big Data stack options
- Defined the optimal BI/Analytics technical architecture

### Big Data Architect/Lead - Apple Music (Business Intelligence) @ Apple
Jan 2015 – Jan 2015 | Cupertino

Key member of the GBI group at Apple, involved in the launch of Apple Music. Apple Music was launched as a subscription-based service in 100+ countries in June 2015.
The GBI group was responsible for ingesting iTunes data into Hadoop/Hive and developing semantic aggregates used for analytics.

### Big Data Architect/Lead - Core Induction @ Apple
Jan 2012 – Jan 2014 | Sunnyvale, CA

Led the Apple iTunes Hadoop Core Induction team, responsible for data management/induction into Hadoop/Hive. Key responsibilities include solution architecture, requirements/design, review, project planning, delivery, and management status reporting.

### Solution Architect @ i2 Technologies India Pvt. Ltd.
Jan 2001 – Jan 2010

## Education

### Post Graduate Diploma in Management in Marketing/Marketing Management, General
Institute of Management Development and Research, Pune

### Bachelor's Degree in Mechanical Engineering
Andhra University

### High School
Timpany School

## Contact & Social

- LinkedIn: https://linkedin.com/in/karan-alang-4173437

---

Source: https://flows.cv/karanalang
JSON Resume: https://flows.cv/karanalang/resume.json
Last updated: 2026-04-12