•Architected a Log Event Completeness Ranking and Replay System (US patent published)
that supports:
•Measuring log event completeness at the application host level, rolled up to the data center level.
•Providing a percentile completeness rate.
•Ranking log files in order of completeness impact and determining the replay candidates needed to
achieve the target completeness value.
Using Kafka, MapReduce, Python, shell scripts, and Argus.
•Integrated a distributed tracing system (Zipkin) into Salesforce, supporting more than one
billion transactions per day. Using Zipkin, Elasticsearch, and Java.
•Leading the design and implementation of a Kafka Pipeline Fidelity Check System, providing
real-time completeness at the topic/consumer-group level. Using Kafka, Postgres, and Java.
•Applied machine learning to anomaly detection during software releases. Using
Elasticsearch and Kibana.
•Worked on the physical delete framework, increasing deletion throughput.
•Created a data replication latency monitoring system.
•Writing detailed test plans and test cases that cover business use cases, error handling, and
boundary conditions as defined in technical specifications.