I helped build and scale Salesify's Lead Generation platform. My work involved building and maintaining the backend systems and pipelines that power the platform. I also worked on various data science initiatives, including NLP and Named Entity Recognition.
• Developed distributed pipelines over MongoDB to process terabytes of data
• Built various microservices using REST, Python & RabbitMQ
• Refactored existing ETL jobs to improve efficiency by 5x
• Developed a custom multithreaded MongoDB-to-Redshift data transfer pipeline using Python
• Built tools to automate data ingestion from AWS S3 to MongoDB
• Built a keyword extractor using Python & the RAKE algorithm
• Worked on clustering algorithms and near-duplicate detection that power recommendations
• Deployed an Elasticsearch cluster to reduce search latency
• Developed the initial version of Salesify's API