Worked on the Big Data Platform, responsible for implementing a variety of large-scale data storage and processing pipelines in the cloud.
Primarily worked on the ingestion and storage pipeline powering the ancestry and DNA products. The pipeline used Java, Hadoop, MapReduce, and S3 for ingestion, and HBase with Apache Phoenix for storage and querying. It ran on AWS, using services including EMR, S3, Lambda, CloudFormation, and RDS.