# Shouvik Sharma

> Enterprise Data & AI | Data Infrastructure and Analytics | Snowflake Pro Core Certified | Databricks Certified Data Engineer

Location: San Francisco Bay Area, United States
Profile: https://flows.cv/shouvik

🔍 Data Engineer | ETL Specialist | SQL Ninja 🚀

Passionate about transforming raw data into actionable insights, I am a results-driven Data Engineer with 3 years of hands-on experience in designing, developing, and maintaining robust data infrastructure. My expertise lies in crafting efficient ETL pipelines, optimizing data workflows, and ensuring data accuracy for informed decision-making.

Key Skills:

- 🛠️ ETL Development: Proficient in designing and implementing Extract, Transform, Load (ETL) processes, ensuring seamless data integration across diverse sources.
- 💻 SQL Mastery: Adept at writing complex SQL queries to extract, manipulate, and analyze large datasets, ensuring data accuracy and integrity.
- 📊 Data Modeling: Skilled in creating and maintaining data models to facilitate a clear understanding of data structures and relationships.
- 🛢️ Database Management: Experienced in managing and optimizing relational databases, including performance tuning and troubleshooting.

Professional Experience: In my previous roles, I have successfully collaborated with cross-functional teams to deliver data solutions that meet business requirements. I have a proven track record of implementing scalable and efficient data pipelines, reducing processing times and improving overall data quality.

Continuous Learning: I thrive on staying at the forefront of emerging technologies in the data engineering landscape. From cloud-based solutions like AWS and Azure to the latest tools and frameworks, I am committed to expanding my skill set to drive innovation and efficiency.

Let's Connect: I am enthusiastic about connecting with like-minded professionals, sharing insights, and exploring opportunities for collaboration.
If you are seeking a dedicated Data Engineer who can turn your data into a strategic asset, let's connect!

## Work Experience

### Mentor @ Mentoring Club
Jan 2026 – Present

### Software Engineer, Data @ Chime
Jan 2025 – Present | San Francisco Bay Area

- Spearheaded the development of Chime's Member-360 Full Funnel Data Foundation on Snowflake, enhancing data normalization.
- Implemented CDC-driven pipelines that reduced compute costs by approximately 35% and improved insight latency to near-real-time.
- Created a dynamic funnel dashboard used by the Growth, Marketing, and Product teams to optimize spend and identify drop-offs.
- Led the FinPlat Transaction-Alert golden cut-over: migrated Snowflake EDW alert datasets from Galileo to native FinPlat events, validated them via shadow checks, and cleared the path for FCM/BIN migrations.

### Data - Senior Associate @ Avant
Jan 2024 – Jan 2025

- Developed and maintained highly scalable ETL pipelines using Databricks and dbt, ensuring data integrity and seamless integration across multiple data sources.
- Designed and optimized data workflows, reducing processing times by 30% through the use of Apache Spark and Databricks, resulting in improved data pipeline efficiency.
- Orchestrated complex workflows using Apache Airflow to automate daily pipeline operations, increasing system reliability by 25% and minimizing system downtime.

Key aspects of the pipelines:

- Utilized dbt macros to standardize transformations and ensure code reusability.
- Implemented Liquid Clustering to optimize query performance, particularly on large datasets.
- Applied partitioning strategies to enable faster data retrieval and processing, improving overall pipeline efficiency.
- Leveraged Delta Lake for data versioning and for ensuring data consistency in a scalable manner.

### Data - Associate @ Avant
Jan 2022 – Jan 2023

- Implemented robust data quality checks using SODA for real-time validation, resulting in a 30% reduction in data discrepancies.
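Declarative row-level checks of the kind SODA runs can be sketched in plain Python. This is a minimal illustration of the idea, not SODA's actual API; the sample rows, check names, and the `run_checks` helper are all hypothetical.

```python
# Hypothetical stand-in for a SODA-style scan: declarative checks
# applied row by row, collecting the ids of failing records.

rows = [
    {"id": 1, "amount": 120.0, "email": "a@example.com"},
    {"id": 2, "amount": -5.0,  "email": "b@example.com"},
    {"id": 3, "amount": 40.0,  "email": None},
]

# Each check is a (name, predicate-over-one-row) pair.
CHECKS = [
    ("amount_non_negative", lambda r: r["amount"] >= 0),
    ("email_not_null",      lambda r: r["email"] is not None),
]

def run_checks(rows, checks):
    """Return {check_name: [ids of rows that fail the check]}."""
    failures = {name: [] for name, _ in checks}
    for r in rows:
        for name, pred in checks:
            if not pred(r):
                failures[name].append(r["id"])
    return failures

failures = run_checks(rows, CHECKS)
# failures maps each check to the offending row ids, which a pipeline
# could then surface as alerts or quarantine for review.
```

In a real deployment the predicates would live in SodaCL YAML rather than lambdas, but the fail-and-report shape is the same.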
- Led end-to-end data transformation workflows with dbt, achieving a 20% improvement in analytics process efficiency.
- Extracted insights from marketing data to optimize the customer application experience, contributing to a 4% increase in application rate.

### Data - Senior Analyst @ Avant
Jan 2021 – Jan 2022 | Racine, Wisconsin, United States

- Led the development and maintenance of data warehouses on Dremio and Databricks, improving data accessibility and efficiency.
- Implemented ELT processes and conducted data analysis using SQL and Python.
- Worked with AWS cloud computing services to develop scalable and efficient data solutions.
- Collaborated with cross-functional teams to implement BI tools for data visualization and reporting.
- Implemented and maintained CI/CD practices using GitHub.

### Analytics Engineer Intern @ CNH Industrial
Jan 2021 – Jan 2021 | Racine, Wisconsin, United States

- Developed predictive analytics and statistical forecasting models, using forecasting-related systems to improve forecast accuracy and reduce bias.
- Formalized assumptions about how demand forecasts are expected to behave, created definitions of outliers, developed methods to systematically identify them, and either explained why they were reasonable or identified fixes for them.
- Developed tools enabling process automation, analysis, and corrective-action implementation by the business.
- Performed forecast accuracy analysis and implemented corrective actions.

### Data Scientist @ Daten Solutions
Jan 2020 – Jan 2021 | Schaumburg, Illinois, United States

- Developed and automated a data migration pipeline from SQL Server to Snowflake using SnowSQL and Snowpipe, performed dimensional modeling on the migrated data, and created a data dictionary for the technical audience.
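The dimensional-modeling step in the migration bullet above can be illustrated with a small pure-Python sketch: flat source records are split into a dimension table with surrogate keys and a fact table that references them. The record layout, table names, and key scheme here are illustrative assumptions, not the actual warehouse schema.

```python
# Illustrative star-schema split: flat order records become a customer
# dimension (with surrogate keys) plus a fact table keyed to it.
# All names and sample data are hypothetical.

flat = [
    {"order_id": 101, "customer": "Acme", "city": "Chicago", "amount": 250.0},
    {"order_id": 102, "customer": "Acme", "city": "Chicago", "amount": 75.0},
    {"order_id": 103, "customer": "Beta", "city": "Racine",  "amount": 40.0},
]

def build_star_schema(records):
    """Return (dim_customer, fact_orders); assigns surrogate customer keys."""
    dim_customer, key_by_customer, fact_orders = [], {}, []
    for rec in records:
        cust = (rec["customer"], rec["city"])
        if cust not in key_by_customer:
            # New customer: mint the next surrogate key and add a dim row.
            key_by_customer[cust] = len(dim_customer) + 1
            dim_customer.append({
                "customer_key": key_by_customer[cust],
                "name": rec["customer"],
                "city": rec["city"],
            })
        # Fact rows carry the surrogate key instead of the raw attributes.
        fact_orders.append({
            "order_id": rec["order_id"],
            "customer_key": key_by_customer[cust],
            "amount": rec["amount"],
        })
    return dim_customer, fact_orders

dim_customer, fact_orders = build_star_schema(flat)
```

In the warehouse this split is done in SQL, but the payoff is the same: customer attributes live in one place, and facts stay narrow and join-friendly.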
- Automated ETL processes using Prefect (Python), simplifying data wrangling and reducing processing time by as much as 40% through large-scale data conversions and the transfer of BAAN data into standardized formats for integration into Snowflake.
- Created Tableau dashboards explaining variation in success metrics and time-series analysis for senior management.
- Automated reporting processes using Excel VBA (macros) and MySQL, maintaining accuracy and saving ~75% of the time; maintained version control with Git, Mercurial, and SVN.
- Formulated problem statements after gaining an understanding of the client's business problems.
- Created key result areas for driving the client's revenue.
- Developed B2C customer segmentation algorithms using demographic variables and the RFM model, then applied k-means clustering to decide the ideal set of customers for marketing campaigns.
- Created an end-to-end customer segmentation machine learning pipeline for a grocery chain using AWS SageMaker Studio and S3 buckets.

### Data Engineer and Analysis Intern @ Daten Solutions
Jan 2020 – Jan 2020 | Schaumburg, Illinois, United States

- Developed statistical models such as ARIMA using the statsmodels package in Jupyter Notebook; the model achieved an overall MAPE of 5.96%.
- Created an interactive dashboard using R Shiny.
- Automated the end-to-end model workflow using Azure DevOps and Azure Machine Learning, enabling a CI/CD architecture.

### Content Writer @ Medium
Jan 2020 – Jan 2021 | Chicago, Illinois, United States

### Data Scientist - Practicum Student @ Labelmaster
Jan 2020 – Jan 2020 | Chicago, Illinois, United States

- Analyzed sales of multiple departments using a dataset of over 300 features in Python, including KPIs such as transport market indicators, freight movement, and US economic indicators.
- Removed multicollinearity using domain knowledge and identified features important to the dependent variable using the Granger causality test.
- Forecasted sales for 2021
using a combination of statistical (SARIMAX) and deep learning time-series (LSTM and RNN) techniques, achieving a mean absolute percentage error of 8%.
- Deployed models via an interactive Flask application, created visualization dashboards using Tableau, and presented the analysis to senior leadership.

### Business Analyst @ Cartesian Consulting
Jan 2018 – Jan 2019 | Mumbai Area, India

1. Developed customer insights for one of India's largest grocery chains, helping their marketing team improve customer retention, reduce churn, and lift campaign responses and incremental revenue using statistical techniques such as RFM methodology, linear regression, and logistic regression.
2. Built CLTV and BTYD propensity models using the BTYDplus library in R; these models helped select the best customers for loyalty programs.
3. Incorporated market basket analysis to improve campaign ROI through cross-sell, improving top-line revenue by 3% year-on-year.
4. Created and automated various business trend reports and trackers to analyse patterns and movements in business KPIs.
5. Created interactive dashboards using QlikView to showcase key metrics to senior leadership.
6. Created a customer one-view and customer profiling based on customer geography for a global translation services platform.
7. Deployed predictive modelling techniques such as Random Forest to identify the "most valuable customer", leading to better customer targeting and a 3% improvement in yearly top-line revenue.
8. Deployed feature selection using the Boruta library in R to determine the most impactful features for predictive modeling.
9. Performed marketing mix modeling and ROI analysis to quantitatively estimate the effectiveness of various marketing elements for one of the top fantasy sports platforms based in India.
10.
Performed hypothesis testing to validate whether "fantasy sports is a game of skill or gamble" using the chi-square test, linear regression, and paired t-tests; the findings were successfully published in the Harvard Business Review.

### Business Strategy Intern @ Greeksoft Technologies Pvt. Ltd.
Jan 2017 – Jan 2017 | Mumbai Area, India

- Led a price forecasting project for Greeksoft Technologies Pvt. Ltd., Mumbai.
- Extracted stock price data in Python using the NSEpy library, which pulls historical and real-time data from the NSE website.
- Created external variables from technical stock indicators to determine their impact on the closing price.
- Performed data cleaning and manipulation to exclude stock market holidays.
- Built an RNN model for live positional trading using Keras with TensorFlow on AWS SageMaker; its outputs supplemented a bull spread strategy in options trading. The architecture was backtested over 2012–2017, correctly predicting market direction on 71% of days, and is now used for live trading.
- Deployed an automated end-to-end predictive modeling pipeline using AWS DevOps, supporting automated daily price forecasting with an LSTM neural network architecture.

### Data Engineer Intern @ Nielsen
Jan 2017 – Jan 2017 | Baroda

- Worked as a data science intern to automate sample design processes using R.
- Assisted in designing and developing the technical architecture for the sample design process.
- Reduced the time required to complete these processes by 25%, helping management make important decisions faster.
- Proposed potential research-on-research tests to improve current Nielsen methodologies and improve response and compliance.
- Worked with the data science team to develop new data-centric products involving new and innovative algorithms.
- Classified store types based on store attributes using a Random Forest
algorithm in PySpark, which resulted in better surveying and data collection.

### Relationship Manager @ Tata Capital
Jan 2015 – Jan 2016 | Lower Parel

1. Drove the acquisition channel for used-car and two-wheeler dealerships by building a customer scorecard analysing the parameters affecting repayment capacity.
2. Identified key metrics by extracting and analysing the client's customer data; gained an understanding of the client's financial and inventory processes to derive second-order variables and determine the customer's credit limit.
3. This led to a multi-fold increase in corporate lending for the two-wheeler and used-car segments, with 0% NPA cases reported over the course of 10 months.

## Education

### Master's degree in Data Science
Illinois Institute of Technology

### Master's in Statistics
SVKM's Narsee Monjee Institute of Management Studies (NMIMS)

### K. J. Somaiya Science & Commerce College

### VPM B.R. Tol High School

## Contact & Social

- LinkedIn: https://linkedin.com/in/shouvik-sharma
- GitHub: https://github.com/shouvik19

---

Source: https://flows.cv/shouvik
JSON Resume: https://flows.cv/shouvik/resume.json
Last updated: 2026-04-11