I am currently a Senior Software Engineer specializing in Data Infrastructure at Toast in Boston, where I have built and scaled self-service tools for our Data Platform that allow other teams and engineers to build data products and reporting tools.
2024 — Now
Boston, Massachusetts, United States
Led the squad's effort to migrate the catalog format of Toast's data lake from AWS Lake Formation to Apache Iceberg
2022 — 2024
Boston, Massachusetts, United States
Implemented a deletion queuing system for the safe deletion of older generations of streaming data
Created a custom data backup system that allowed us to back up tables from our data catalog and store the metadata needed to re-register a backup as a snapshot table in the catalog
Helped migrate our stack from Spark 3.2 to Spark 3.4
Implemented a cost-tagging initiative to evaluate and reduce the team's spend on data storage and computation
Implemented Auth0 across our team's resources
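The re-registration half of the backup system above could be sketched roughly as follows, assuming a Glue-backed catalog (consistent with Lake Formation). The function name, the snapshot naming convention, and the approach are illustrative assumptions; only the field names follow the real AWS Glue `get_table`/`create_table` API.

```python
# Hypothetical sketch: turn a Glue get_table response into a TableInput
# that re-registers backed-up data as a snapshot table in the catalog.
# Glue's create_table rejects the read-only fields that get_table returns,
# so they must be stripped before re-registration.

READ_ONLY_FIELDS = {
    "DatabaseName", "CreateTime", "UpdateTime", "CreatedBy",
    "IsRegisteredWithLakeFormation", "CatalogId", "VersionId",
}

def snapshot_table_input(table, backup_location, snapshot_suffix):
    """Build a glue.create_table TableInput pointing at the backup files."""
    # Drop fields Glue will not accept back in a create_table call.
    table_input = {k: v for k, v in table.items() if k not in READ_ONLY_FIELDS}
    # Register under a distinct snapshot name (naming scheme is assumed).
    table_input["Name"] = f"{table['Name']}_{snapshot_suffix}"
    # Repoint the storage location at the backed-up data, without
    # mutating the original table dict.
    storage = dict(table_input.get("StorageDescriptor", {}))
    storage["Location"] = backup_location
    table_input["StorageDescriptor"] = storage
    return table_input
```

The result would then be passed to `boto3` as `glue.create_table(DatabaseName=..., TableInput=snapshot_table_input(...))`.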
Boston, Massachusetts, United States
Continuing to implement new ETL jobs from data sources such as SQL Server, S3, and client-side SFTP, using clean, testable Python code
Mentoring junior engineers
Working with R&D to implement highly performant Spark jobs in Scala
Converting legacy SQL code to Python for better performance and source control
Improving unit test coverage on existing pipelines
Implemented time-intensive Athena calculations and data loads in Scala to save engineering hours and reduce human error
Designed and built a PySpark job runner that automates submission of PySpark jobs to EMR, including handling of dependencies and job configurations
Continuing to maintain the existing Python code base, expanding test coverage and improving performance and scalability
Rewriting legacy Python code in Spark to improve performance and reduce the need for constantly running EC2 instances
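A minimal sketch of the kind of EMR job runner described above: it assembles a `spark-submit` step (dependencies via `--py-files`, per-job configuration via `--conf`) and submits it with the real boto3 EMR `add_job_flow_steps` API. The function names and step layout are illustrative assumptions, not the actual tool.

```python
# Hypothetical sketch of a PySpark-to-EMR job runner: build a spark-submit
# step for EMR's command-runner.jar, then submit it to a running cluster.

def build_spark_step(name, script_s3_path, py_files=None, conf=None):
    """Build an EMR step dict that runs spark-submit via command-runner.jar."""
    args = ["spark-submit", "--deploy-mode", "cluster"]
    if py_files:  # dependency zips/eggs shipped alongside the job
        args += ["--py-files", ",".join(py_files)]
    for key, value in (conf or {}).items():  # per-job Spark configuration
        args += ["--conf", f"{key}={value}"]
    args.append(script_s3_path)  # the job script itself, staged on S3
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {"Jar": "command-runner.jar", "Args": args},
    }

def submit_step(cluster_id, step):
    """Submit the step to a running EMR cluster (requires AWS credentials)."""
    import boto3  # imported lazily so the builder stays dependency-free
    emr = boto3.client("emr")
    resp = emr.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
    return resp["StepIds"][0]
```

For example, `build_spark_step("nightly-load", "s3://bucket/jobs/load.py", py_files=["s3://bucket/deps.zip"], conf={"spark.executor.memory": "4g"})` yields a step ready to hand to `submit_step`.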
2017 — 2021
Boston, Massachusetts, United States
Implemented dozens of new ETL jobs in Python and SQL, using both T-SQL and Postgres
Ensured data cleanliness across all four of our SQL Server databases as well as our S3 data storage
Ran, maintained and improved monthly and quarterly Spark jobs written in Scala
Implemented an on-call/pipeline-triage system for engineers
Assisted with the migration from on-premise, bare-metal SQL Servers to AWS RDS instances
Assisted with the migration of job scheduling from a combination of cron and CloudWatch Events to Airflow
London, United Kingdom
Lloyds Banking Group, remediation project, analytics. My involvement included:
Interrogated and analysed very large volumes of data from different sources to understand the business logic used by each source
Produced PowerPoint presentations detailing the business logic to be applied, for key business stakeholders
Received, analysed and prepared data to be loaded into the system database
Produced detailed reports for key business stakeholders that 'tell the story' of the data, informing its appropriate treatment
QBE Insurance, implementing the Solvency II Pillar 3 Reporting Solution. My involvement included end-to-end delivery of certain aspects, including:
Analysed and interrogated large volumes of data to gain insight into the functionality of the solution
Carried out business analysis, design and prototyping of analytics solutions to analyse and identify issues within the data
Designed and built an Exceptions Reporting solution in SQL
Served as a key contributor to the analysis, design and testing of complex components of the Financial Reporting solution built in SQL
Wrote the test approach and test model, and executed several batches of tests for the Technical Provisions Model
Analysed, wrote and executed Microsoft SQL scripts
Performed defect triage for a team of 5 testers
Ran client meetings to identify and gather requirements
Communicated solutions to key stakeholders
Education
University of Strathclyde