Experience

Vercel, Software Engineer

2026 — Now

Greater New York City Area

Datadog, Staff Engineer

2024 — 2025

Greater New York City Area

Staff Engineer in Datadog’s Infrastructure organization, focused on global networking, large-scale distributed systems, and reliability engineering. My work spans the Infrastructure organization, ensuring Datadog remains highly available, scalable, and ready for future growth.

Led the design and delivery of core platform capabilities, from deployment tooling and certificate automation to private connectivity and resilient edge infrastructure, supporting Datadog’s rapid global expansion.

Drove multi-region failover approaches and a repeatable, fast path for bringing new datacenters online, including enablement for regulated environments such as FedRAMP High and IL5.

Collaborated with product engineering teams, including On Call, MCP, and Status Pages, to ensure new services launched reliably on top of core infrastructure.

Built cross-organizational practices that strengthened operational readiness and accelerated incident response.

Managed technical execution with cloud and network service providers to deliver secure, cost-efficient connectivity and accelerate Datadog’s infrastructure roadmap.

Datadog, Senior Software Engineer

2021 — 2024

Greater New York City Area

Senior Software Engineer building Datadog’s global edge and networking platform.

Dropbox, Site Reliability Engineer

2019 — 2020

Greater New York City Area

Site Reliability Engineer at Dropbox focused on large-scale fleet lifecycle management, automation, and deployment orchestration across Dropbox’s hybrid cloud infrastructure.

Core contributor and member of the Fleet Management team responsible for the lifecycle, allocation, and OS installation of Dropbox’s on-premises fleet of over 75,000 bare-metal hosts.

Designed and implemented an image-based OS installer composed of several microservices that integrated with Dropbox’s deployment infrastructure stack. The new installer reduced p95 provisioning time for compute hosts to 10 minutes, allowing faster and more reliable deployment of servers into production.

Designed and implemented an OS image building framework that abstracts the customization of a target image behind a simple YAML-based configuration syntax. As a result, new bare-metal images could be created quickly without introducing redundant configuration or technical debt into the codebase.

Constructed a catalog of user-friendly Grafana dashboards for foundational services, improving the operability of the team’s stack by letting on-call engineers quickly identify and troubleshoot issues.

Led weekly operational meetings to systematically review incidents, user-reported issues, and alerts in order to identify regressions and improve on-call health for the team.

Palantir Technologies, Site Reliability Engineer

2016 — 2019

Greater New York City Area

Site Reliability Engineer supporting Palantir’s most impactful customer deployments in sensitive, self-hosted environments. Built and operated in-house monitoring and automation systems to improve observability and reliability at scale. Played hands-on roles in environments where uptime and trust were critical, working directly with customer IT organizations to ensure secure and resilient operations. Contributed to both open source and internal platforms while driving adoption of modern reliability practices across distributed infrastructure.

Provided infrastructure and systems support for large on-prem deployments, including server racking/cabling, OS installation and tuning, system administration, troubleshooting, and incident management in collaboration with customer IT teams.

Designed and implemented an in-house monitoring platform to replace Nagios with Prometheus across multiple sites, enabling deeper visibility into host and service health.

Built a Go-based health agent to surface critical hardware metrics from HP and Dell servers.

Developed a RESTful API for programmatic recording/alerting rule management in Prometheus.

Conducted code reviews and mentored engineers as they ramped up on Go.

Contributed upstream improvements to Prometheus and Grafana.

Core contributor to Palantir’s migration to Kubernetes and AWS for cloud-based platform deployments.

Implemented Datadog monitoring and real-time dashboards for cluster operations.

Reduced toil and improved resiliency of host bootstrapping via systemd-driven self-healing automation.

Education

University at Buffalo

Bachelor of Science (B.S.)
