# Umer Shaikh

> Data Engineer | Semantic and context layers | Building data foundations that make AI work in production | Ex-Meta

Location: San Ramon, California, United States
Profile: https://flows.cv/umershaikh

12+ years of experience in data analytics as a Data Engineer using the modern data stack, cloud services, big data tools and traditional ETL tools.

• Expertise in marketing (digital & conventional) business processes, with good working knowledge of the manufacturing, e-commerce and retail domains.
• Competent in all aspects of data warehousing, including project planning, requirements gathering, data modeling, development, deployment, maintenance, enhancements, testing, production support and bug fixes.
• Experience designing, reviewing, implementing and optimizing data transformation processes in the Hadoop and ETL ecosystems. Able to consolidate, validate and cleanse data from a wide range of sources – applications, databases, files and data warehouses.
• Extensive experience across the software testing life cycle (STLC), including system testing, system integration testing, regression testing, UAT support and business user interaction; exposure to creating test strategies, detailed test plans, test case execution, defect analysis and preparing bug/pass reports.
• Experience with Agile Scrum methodology; adapt quickly to new development environments and changing business requirements.

## Work Experience

### Staff Software Engineer - Data @ Chime
Jan 2021 – Present

Built data foundations & frameworks for AI, ML & product analytics. Built in-house:
- experimentation data platform
- audiencing data platform
- member 360 platform
- data warehouse AI RAG bots

### Lead Data Engineer @ Facebook
Jan 2019 – Jan 2021 | Menlo Park, California, United States

Built foundational datasets to unlock product insights on the Creation team for the Newsfeed, Stories & Reels products in the Facebook app.
### Senior Data Engineer @ Lending Club
Jan 2016 – Jan 2019 | San Francisco Bay Area

Worked in the marketing scrum delivering data pipelines against business requirements each sprint cycle. Collaborated with business users and the data science team on their data needs, providing insights valuable for decision making.

• Automated the direct mail marketing process, reducing overall campaign operation time by 70%.
• Created ingestion pipelines in Hive that consume large volumes of credit bureau data and transform it for loading into data marts.
• Built credit and underwriting models in Hive SQL to process this data and send automated files to the print shop for mail printing.
• Created analytical tables capturing user interactions across marketing channels (direct mail, email, partner websites, digital ads, etc.) to deliver personalized offers and market to users efficiently.
• Automated ingestion of data from partners and built analytics on it to adjust our pricing based on competitors' pricing.
• Helped the email marketing team with the data attributes needed for their campaigns, such as eligible products, user loans, payment plans and prospective-borrower status.
• Developed UNIX shell scripts for business validations and operational use cases.
• Created Oozie workflows to schedule Hive ETL jobs.
• Built alerting with Splunk, Wavefront and Opsgenie to track SLAs for critical processes and send failure notifications.

### Data Engineer @ Macys.com
Jan 2015 – Jan 2016 | San Francisco, California

• Developed ETL pipelines using the DataStage ETL tool and complex SQL to integrate website & store data from sources such as Hadoop (HDFS), Oracle, DB2, Tibco, MQ & flat files into the data warehouse.
• Developed complex scripts to move large files from HDFS into the data warehouse using Hive & Sqoop.
• Mitigated risks & challenges involved in migrating ETL jobs from DataStage v8.5 to v11.3.
• Optimized the performance of existing production jobs that were bottlenecks.
• Developed UNIX shell scripts for validation frameworks and for triggering ETL jobs scheduled in the Control-M scheduler.
• Used XML stages to consume and publish XML files to MQ.
• Interacted with system analysts and business users to understand their requirements during the plan-and-define phase of each project.
• Managed multiple projects in parallel, driving them from development through deployment.
• Prepared deployment documentation for hand-over to the production support team.
• Followed the release engineering & defect management process using JIRA and used SVN for version management.
• Used VersionOne for the Agile software development cycle.

### Senior ETL Engineer @ Silicon Valley Bank (Via Tata Consultancy Services)
Jan 2014 – Jan 2015 | Santa Clara, California

• Provided solutions for SAP BODS and the Redwood scheduler.
• Handled SAP repository management and datastore setup.
• Worked on project modules covering financial data processing, business performance management, sales & marketing analysis and reporting, and e-statements.
• Interacted with clients, managers and the business team to understand their requirements during the plan-and-define phase of each change request.
• Used transformations such as Connected and Unconnected Lookups, Expression, Joiner, Update, Router, Aggregator and Sequence Generator.
• Developed jobs in the SAP Redwood scheduler following the SDLC cycle.
• Maintained the database and SAP BODS server, recycling services in coordination with the hosting team.
• Worked with Oracle and SAP product support on issues with the tools.
• Made configuration changes and recycled services on the Essbase server.
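The SLA alerting described for the critical pipelines above can be sketched in a few lines. This is a minimal illustration, not the production code: the job names, thresholds, and the shape of the check are hypothetical stand-ins for what Splunk/Wavefront/Opsgenie would track.

```python
from datetime import datetime

# Hypothetical SLA thresholds (minutes) per job; illustrative only,
# not the actual pipeline names or limits.
SLA_MINUTES = {
    "bureau_ingest": 120,    # credit-bureau ingestion into Hive
    "direct_mail_feed": 60,  # automated file hand-off to the print shop
}

def sla_breaches(job_starts, now):
    """Return (job, minutes_over_sla) for still-running jobs past their SLA.

    job_starts maps a job name to its start time (datetime); jobs without
    a configured SLA are skipped.
    """
    breaches = []
    for job, started in job_starts.items():
        limit = SLA_MINUTES.get(job)
        if limit is None:
            continue
        elapsed = (now - started).total_seconds() / 60
        if elapsed > limit:
            breaches.append((job, round(elapsed - limit)))
    return breaches
```

In practice a check like this would run on a schedule and hand each breach to a notifier (e.g. an Opsgenie alert) rather than returning a list.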
### Data Warehouse Developer @ General Motors (Via Tata Consultancy Services)
Jan 2010 – Jan 2014

• Used DataStage Designer to develop jobs for extracting, cleansing, transforming, integrating and loading data into the data warehouse database.
• Built new DataStage jobs and enhancements to meet requirements.
• Wrote and executed complex SQL queries to test the data being loaded into interim tables.
• Handled DataStage 7.5, 8.1 & 8.5 installation, configuration, administration and maintenance, including patch installation and upgrades.
• Provided solutions for IIS, Oracle and Linux/Unix issues.
• Interacted with clients, managers and the business team to understand their requirements during the plan-and-define phase of each change request.
• Extensively built DataStage jobs using stages such as Oracle Connector, Funnel, Transformer, Sequential File, Lookup, Join, Peek, Sort, Merge, Aggregator, Dataset and Remove Duplicates.
• Performed performance tuning to optimize the ETL.
• Resolved client queries quickly and handled client change requests.
• Created scripts and ETL jobs to automate manual steps.
• Customized the application per change requests and client requirements.
• Conducted performance testing and load testing.
• Designed the architecture to deliver data in the minimum possible time with the available tools and resources.
• Implemented the MFT2 pattern architecture for GSC27 SOX applications.
• Implemented DataStage EBCDIC file-based CDC and intelligent batch-job triggering for multiple applications; contributed to the ELIT setup for GSC27.
• Interacted and coordinated with the client and offshore team for smooth project execution.
• Conducted multiple DataStage & UNIX training sessions for teams.
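The load-validation queries mentioned above (testing data loaded into interim tables) typically start with a reconciliation between source and target. A minimal sketch, using SQLite as a stand-in database and hypothetical table names; real checks would also compare column sums or checksums, not just counts.

```python
import sqlite3

def rowcount_matches(conn, source_table, interim_table):
    """Compare row counts between a source table and an interim table.

    Returns (matches, source_count, interim_count). Table names are
    interpolated directly, so they must come from trusted configuration.
    """
    cur = conn.cursor()
    src = cur.execute(f"SELECT COUNT(*) FROM {source_table}").fetchone()[0]
    dst = cur.execute(f"SELECT COUNT(*) FROM {interim_table}").fetchone()[0]
    return src == dst, src, dst
```

A mismatch here would flag the load for investigation before the interim data is promoted downstream.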
## Education

### Bachelor of Engineering (BE) in Information Technology
University of Mumbai

## Contact & Social

- LinkedIn: https://linkedin.com/in/umer-shaikh

---

Source: https://flows.cv/umershaikh
JSON Resume: https://flows.cv/umershaikh/resume.json
Last updated: 2026-04-12