# Bob Shannon

> Software Engineer at Vercel

Location: New York City Metropolitan Area, United States
Profile: https://flows.cv/bobshannon

## Work Experience
### Software Engineer @ Vercel
Jan 2026 – Present | Greater New York City Area

### Staff Engineer @ Datadog
Jan 2024 – Jan 2025 | Greater New York City Area
Staff Engineer in Datadog’s Infrastructure organization, focused on global networking, large-scale distributed systems, and reliability engineering. My work spans the Infrastructure organization, ensuring Datadog remains highly available, scalable, and ready for future growth.

• Led the design and delivery of core platform capabilities, from deployment tooling and certificate automation to private connectivity and resilient edge infrastructure, supporting Datadog’s rapid global expansion.
• Drove multi-region failover approaches and a repeatable, fast path for bringing new datacenters online, including enablement for regulated environments such as FedRAMP High and IL5.
• Collaborated with product engineering teams, including but not limited to On Call, MCP, and Status Pages, to ensure new services launched reliably on top of core infrastructure.
• Built cross-organizational practices that strengthened operational readiness and accelerated incident response.
• Managed technical execution with cloud and network service providers to deliver secure, cost-efficient connectivity and accelerate Datadog’s infrastructure roadmap.

### Senior Software Engineer @ Datadog
Jan 2021 – Jan 2024 | Greater New York City Area
Senior Software Engineer building Datadog’s global edge and networking platform.

### Site Reliability Engineer @ Dropbox
Jan 2019 – Jan 2020 | Greater New York City Area
Site Reliability Engineer at Dropbox focused on large-scale fleet lifecycle management, automation, and deployment orchestration across Dropbox’s hybrid cloud infrastructure.

• Core contributor and member of the Fleet Management team responsible for the lifecycle, allocation, and OS installation of Dropbox’s on-premise fleet consisting of over 75,000 bare metal hosts.
• Designed and implemented an image-based OS installer composed of several microservices that integrated with Dropbox’s deployment infrastructure stack. The new installer dropped compute host p95 provisioning time to just 10 minutes, which allowed for more rapid and reliable deployment of servers into production.
• Designed and implemented an OS image building framework that abstracts the customization of a target image using a simple YAML-based configuration syntax. As a result, new bare metal images could be created quickly without introducing redundant configuration and technical debt into the codebase.
• Constructed a catalog of user-friendly Grafana dashboards for foundational services, which increased the operability of the team’s stack by allowing on-call engineers to quickly identify and troubleshoot issues.
• Led weekly operational meetings to systematically review incidents, user-reported issues, and alerts to identify regressions and improve on-call health for the team.

### Site Reliability Engineer @ Palantir Technologies
Jan 2016 – Jan 2019 | Greater New York City Area
Site Reliability Engineer supporting Palantir’s most impactful customer deployments in sensitive, self-hosted environments. Built and operated in-house monitoring and automation systems to improve observability and reliability at scale. Played hands-on roles in environments where uptime and trust were critical, working directly with customer IT organizations to ensure secure and resilient operations. Contributed to both open source and internal platforms while driving adoption of modern reliability practices across distributed infrastructure.

• Provided infrastructure and systems support for large on-prem deployments, including server racking/cabling, OS installation and tuning, system administration, troubleshooting, and incident management in collaboration with customer IT teams.
• Designed and implemented an in-house monitoring platform to replace Nagios with Prometheus across multiple sites, enabling deeper visibility into host and service health.
      • Built a Go-based health agent to surface critical hardware metrics from HP and Dell servers.
      • Developed a RESTful API for programmatic recording/alerting rule management in Prometheus.
      • Conducted code reviews and mentored engineers as they ramped up on Go.
      • Contributed upstream improvements to Prometheus and Grafana.
• Core contributor to Palantir’s migration to Kubernetes and AWS for cloud-based platform deployments.
• Implemented Datadog monitoring and real-time dashboards for cluster operations.
• Reduced toil and improved resiliency of host bootstrapping via systemd-driven self-healing automation.

### Owner @ DDoS Hosting Solutions
Jan 2008 – Jan 2014
Founder and operator of a small web hosting business specializing in DDoS-resilient infrastructure and dependable customer service. Designed, deployed, and maintained virtualization platforms, while building mitigation strategies to ensure uptime during network attacks. Managed all operational aspects of the company, including engineering, support, and business administration.


## Education
### Bachelor of Science (B.S.) in Computer Engineering
University at Buffalo


## Contact & Social
- LinkedIn: https://linkedin.com/in/bobmshannon
- Portfolio: https://robert.sh

---
Source: https://flows.cv/bobshannon
JSON Resume: https://flows.cv/bobshannon/resume.json
Last updated: 2026-04-13