Spearheaded construction of data.com (acquired by Salesforce), an online database containing comprehensive information on millions of companies and hundreds of millions of contacts around the world
▪ Built all contribution flows, both single and bulk, with fraud checking, matching, normalization, and rules validation
▪ Implemented contact and company search indexing and index refresh using Apache Solr
▪ Created contact recommendation engine based on purchase activity and contact similarity, leading to 50% more contact purchases
▪ Crafted entity recognition software using Naive Bayes classification algorithms
▪ Established contact rank and department classification based on job title
▪ Formulated batch job framework, email transmitter, e-commerce platform, gamification system, etc.
▪ Performed large-scale contact deduplication, matching, and enrichment using Apache Camel and MongoDB
▪ Implemented company news ingestion and integration using Apache Camel, ZooKeeper, Kafka, HBase, and Solr
▪ Reviewed data.com projects for security holes, including XSS, SQL injection, CSRF, and DoS/DDoS