20+ years of experience as a Sr. Data Engineer working with various ETL tools to build data warehouse solutions for large-scale enterprises across the Fintech, Manufacturing, Banking & Agro domains.
Experience
2022 — Now
San Francisco Bay Area
2020 — 2022
San Francisco Bay Area
2016 — 2020
San Francisco Bay Area
Responsibilities:
Built ETL pipelines accelerating the extraction, transformation, and loading of massive volumes of structured and unstructured data.
Managed all activities necessary to take an ETL project from concept to production, including system architecture definition, data modeling (star and snowflake schema) design, prototyping, development and testing (SCD Types I, II, III), and scheduling (dependencies, backward chaining, forward firing).
Excellent analytical, organizational, management, technical, and creative skills, with a strong understanding of data warehousing technologies, data analysis, and OLTP vs. OLAP flows.
Configured streaming pipelines to perform real-time analytics, helping growers and dealers maintain good farming practices.
Loaded aggregated data into a relational database for reporting, dashboarding, and ad-hoc analyses, which revealed ways to lower operating costs and offset the rising cost of programming.
Experienced in parsing high-level design specs into simple ETL coding and mapping standards. Designed Mapping Documents, Detail Design Documents, and Install and Release documents that serve as guidelines for ETL coding.
Designed a large data warehouse using data/dimensional modeling.
Responsible for scheduling ETL pipelines in Airflow, along with error checking, production support, maintenance, and testing of ETL pipelines using Airflow logs.
Designed and developed a Big Data analytics platform for processing growers' (farmers') information from the FieldView product and other application logs using Python, Spark, Hive, etc.
Environment: Spark, Hive, Flink, AWS (EMR, S3, Lambda, Kinesis, ECS, EC2), YARN, Zeppelin Notebook, Python, Scala, Docker, Airflow, Kettle, Vertica, Git.
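The SCD Type II development and testing described above can be illustrated with a minimal sketch; the `dim_customer` table, its columns, and the sample values are hypothetical and chosen only for illustration, not taken from any actual project schema.

```python
import sqlite3

# Minimal illustrative sketch of an SCD Type II load against a
# hypothetical customer dimension (in-memory SQLite for simplicity).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dim_customer (
        customer_id INTEGER,
        city TEXT,
        valid_from TEXT,
        valid_to TEXT,        -- NULL marks the current row
        is_current INTEGER
    )
""")

def scd2_upsert(conn, customer_id, city, load_date):
    """Expire the current row if the tracked attribute changed, then
    insert a new current row (classic SCD Type II behavior)."""
    cur = conn.execute(
        "SELECT city FROM dim_customer WHERE customer_id = ? AND is_current = 1",
        (customer_id,),
    )
    row = cur.fetchone()
    if row is not None and row[0] == city:
        return  # no change: nothing to do
    if row is not None:
        conn.execute(
            "UPDATE dim_customer SET valid_to = ?, is_current = 0 "
            "WHERE customer_id = ? AND is_current = 1",
            (load_date, customer_id),
        )
    conn.execute(
        "INSERT INTO dim_customer VALUES (?, ?, ?, NULL, 1)",
        (customer_id, city, load_date),
    )
    conn.commit()

scd2_upsert(conn, 101, "Fresno", "2023-01-01")
scd2_upsert(conn, 101, "Modesto", "2023-06-01")  # attribute changed: new version
rows = conn.execute(
    "SELECT city, is_current FROM dim_customer WHERE customer_id = 101 "
    "ORDER BY valid_from"
).fetchall()
# Two versions of the customer now exist; only the latest is current.
```

A Type I change would instead overwrite `city` in place, losing history; Type II preserves every prior version with its validity window.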
2015 — 2016
San Francisco Bay Area
Description:
Salesforce.com (stylized as salesƒorce) is a cloud computing company headquartered in San Francisco, California. Though its revenue comes from a customer relationship management (CRM) product, Salesforce also capitalizes on commercial applications of social networking through acquisitions. As of 2015, it was one of the most highly valued American cloud computing companies, with a market capitalization of $50 billion, although the company had never turned a GAAP profit since its inception in 1999.
Responsibilities:
• Planned data integration from various source systems to the Hadoop system.
• Experienced in parsing high-level design specs into simple ETL coding and mapping standards.
• Designed, developed, integrated, tested, troubleshot, and debugged applications.
• Designed Mapping Documents, Detail Design Documents, and Install and Release documents that serve as guidelines for ETL coding.
• Managed smooth implementation within deadlines and deployment of the application at the client location.
• Designed and developed Informatica mappings and sessions based on user requirements and business rules to load data from source flat files and RDBMS tables into target tables.
Environment: Spark, Hive, Sqoop, Scala, Python, Informatica PowerCenter 9.5, Oracle 11g, TIDAL, Apache Flink, AWS (EMR, Kinesis, S3), Zeppelin Notebook
2011 — 2015
Charlotte, North Carolina Area
Description:
The Client is a $17 billion global diversified industrial company founded in 1871. Ingersoll Rand is a global provider of products, services, and integrated solutions to industries as diverse as transportation, manufacturing, construction, and agriculture.
The IR BTP BI program will allow the business to focus on analyzing information to make better business decisions and provide improved management reporting and analytics with data from disparate order entry and order management systems.
Responsibilities:
• Parsed high-level design specs into simple ETL coding and mapping standards. Designed Mapping Documents, Detail Design Documents, and Install and Release documents that serve as guidelines for ETL coding.
• Performed performance tuning of ETL components and database SQL queries, including index management and advanced Oracle tuning.
• Worked on the complete lifecycle of extraction, transformation, and loading of data using Informatica.
• Used Informatica's features to implement Type I and II changes in slowly changing dimension tables, and developed complex mappings to facilitate daily, weekly, and monthly data loads.
• Extensively used transformations such as Router, Aggregator, Normalizer, Joiner, Expression, Lookup, Update Strategy, Sequence Generator, and Stored Procedure.
• Prepared high-level design document for extracting data from complex relational database tables, data conversions, transformation and loading into specific formats.
• Designed and developed mappings using various transformations to suit business user requirements and business rules, loading data from Oracle, SQL Server, DB2, and flat files.
• Developed standard and re-usable mappings and mapplets using various transformations like Expression, Lookups, Joiner, Filter, Source Qualifier, Sorter, Update strategy and Sequence generator.
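The Lookup plus Update Strategy pattern mentioned in these bullets can be sketched in plain Python; the row shapes, the `order_id`/`amount` fields, and the target dictionary are illustrative assumptions, not an actual Informatica mapping.

```python
# Hedged sketch of a Lookup + Update Strategy flow re-expressed in plain
# Python: look each source row up against the target and flag it for
# insert, update, or reject, as an Update Strategy transformation would.
DD_INSERT, DD_UPDATE, DD_REJECT = "insert", "update", "reject"

def flag_rows(source_rows, target):
    """Flag each source row by comparing it against the target table
    (here a dict keyed on the natural key, order_id)."""
    flagged = []
    for row in source_rows:
        key, amount = row["order_id"], row["amount"]
        if amount is None:
            flagged.append((DD_REJECT, row))   # bad data: reject
        elif key not in target:
            flagged.append((DD_INSERT, row))   # new key: insert
        elif target[key]["amount"] != amount:
            flagged.append((DD_UPDATE, row))   # changed: update
        # unchanged rows are simply passed over
    return flagged

target = {1: {"order_id": 1, "amount": 100}}
source = [
    {"order_id": 1, "amount": 150},   # existing key, changed amount
    {"order_id": 2, "amount": 200},   # new key
    {"order_id": 3, "amount": None},  # invalid row
]
flags = [f for f, _ in flag_rows(source, target)]
# flags is ["update", "insert", "reject"]
```

In Informatica the same routing is done declaratively inside the mapping; the point of the sketch is only the per-row decision logic.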
Education
ACTS CDAC
DAC
All India Shri Shivaji Memorial Society College Of Engineering