# Sandeep Bansal

> Principal Performance Engineer | Salesforce Agentforce Platform | Founder Jetmanlabs.com | Patented Instrumentation

Location: San Francisco Bay Area, United States
Profile: https://flows.cv/sandeepbansal

## Summary

15+ years of performance engineering and tools development experience improving the resiliency and performance of small to large scale cloud and distributed platforms: performance benchmarking, scalability experiments, and bottleneck identification with profiling, monitoring, instrumentation tools, and data science. Experience optimizing cloud platforms for companies large and small, including Salesforce, GE Digital, ClickUp, Lithium Technologies, and Intuit.

Experience building functional and performance benchmarking and scalability frameworks to test AI models against large data sets with real-time analytics, uncovering scale and latency issues. Optimized scalability and end-to-end response time for the natural language AI systems behind Einstein Search.

Architected and built multi-user microservices and user navigation performance testing and workload generation frameworks using open source tools such as JMeter, Locust, and Puppeteer to measure latency in production environments and identify latency and resilience issues. Troubleshot and profiled production slowness and uptime reliability issues across the end-to-end multi-tier stack: infrastructure, database, services runtime, and end user. Tuned platforms and identified optimization opportunities using custom tracing and observability collectors across the stack.

Observability expertise in real user monitoring (RUM) and performance metrics instrumentation to measure user page load time in production and pre-production environments. Capacity planning to optimize cloud cost by building models over data such as user latency, service health, platform usage patterns, and infrastructure resource usage, ensuring no customer impact.
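The capacity-planning models above start from periodic host-health samples. As a loose illustration only (the field names and JSON layout are invented here, not the production agent), such a collector can start as small as:

```python
import json
import os
import shutil
import time

def host_snapshot(path="/"):
    """Collect one host-health sample as a JSON-serializable dict."""
    load1, load5, load15 = os.getloadavg()        # 1/5/15-minute run-queue averages (Unix)
    disk = shutil.disk_usage(path)                # total/used/free bytes for the mount
    return {
        "ts": int(time.time()),                   # sample timestamp, epoch seconds
        "load_1m": load1,
        "load_5m": load5,
        "load_15m": load15,
        "disk_used_pct": round(100 * disk.used / disk.total, 2),
    }

# One line of the JSON stream an agent like this would publish to a metrics store.
sample = json.dumps(host_snapshot())
```

A real agent would run this on a timer and ship batches to the metrics store; the snapshot shape here is just a sketch.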
Built and maintained a performance and scalability lab to run benchmarks, synthetic monitoring, profiling, and capacity planning exercises. Built API-driven test data for workload generation to reproduce customer scenarios and profile causes of slowness. Shifted performance left by running perf tests in CI/CD to identify issues early in the development lifecycle.

## Patent and Innovation

- Patent on actionable instrumentation and observability for a real user monitoring platform.
- Developed a multi-user UI performance and synthetic user experience monitoring and observability platform that instruments DevTools performance and profiling metrics to identify slow page loads and loading issues.
- Built a Python machine agent that collects host and service runtime metrics at the edge and publishes them to a JSON store for decision-making and anomaly detection.

## Work Experience

### Principal Engineer - Agentforce Platform @ Salesforce

Jan 2023 – Present | San Francisco Bay Area

- Architecture and design to scale the Agentforce platform, which lets customers build and deploy agents that add AI capabilities to their products.
- LLM and model benchmarking across various LLMs and GPUs; built tools to benchmark models and find the optimal cost to serve.
- Built a mock platform that simulates LLM responses, saving millions of dollars in internal development costs.

### Founder @ JetManLabs

Jan 2019 – Present | San Francisco Bay Area

- Founder of jetmanlabs.com. Jetman is a developer productivity platform that enables development teams to design, build, mock, and test APIs and ship products faster with higher quality.
- Owned everything from ground zero: product vision, architecture, design, and full stack development in Node.js, JavaScript, HTML, and CSS.
- Hired and managed a team of 5 engineers to build supporting products.
- Developed platform-independent macOS and Windows clients for API development and testing.
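The API mocking idea behind platforms like Jetman (and the LLM mock platform above) can be sketched in miniature; the route table, endpoints, and payloads below are invented purely for illustration:

```python
import json

# Hypothetical route table: (method, path) -> (status code, canned JSON body).
MOCK_ROUTES = {
    ("GET", "/users/42"): (200, {"id": 42, "name": "Ada"}),
    ("POST", "/orders"): (201, {"order_id": "A-1001", "status": "created"}),
}

def mock_response(method, path):
    """Return (status, JSON body) for a registered route, or a 404 stub."""
    status, body = MOCK_ROUTES.get(
        (method.upper(), path), (404, {"error": "no mock registered"})
    )
    return status, json.dumps(body)

# A test client calls the mock instead of the unfinished (or expensive) real backend.
status, body = mock_response("get", "/users/42")
```

In practice the route table sits behind an HTTP server, but the core idea is just this lookup: canned responses stand in for a dependency that is not built yet or too costly to call.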
Developed the cloud platform and portal as microservices running on Google Cloud (GCP).

### Principal Engineer - Performance | Instrumentation | Observability @ ClickUp

Jan 2022 – Jan 2023 | San Francisco Bay Area

- Led the effort to scale and improve end-to-end user experience latency across the multi-tier, AWS-based cloud platform of the ClickUp CRM product.
- Architected and implemented distributed multi-user API and user experience benchmarking frameworks, run in perf and production environments, to ensure service and end-user latency and resiliency stay within SLA as releases and cloud technologies change.
- Designed and implemented test data generation frameworks for backend and AI use cases across OpenAI and CRM data applications.
- Monitored production services and infrastructure health with actionable trend analysis as new features shipped and customer usage patterns changed.
- Instrumented and profiled end-user latency issues across the front end, services, and AWS cloud infrastructure.
- Shortened the feedback cycle for service latency and impact by shifting performance testing left into CI/CD.
- Infrastructure capacity planning for databases (Postgres, Elasticsearch, Redis) and other app runtime services running on ECS by understanding user traffic patterns and time of use and mapping them to infrastructure resource usage.
- RCA and MTTR reduction for production performance and scalability issues, working closely with development teams on improvements.
- Architected instrumentation to collect performance metrics and signals for user page load time measurement and to find opportunities to improve loading time.

Perf tools: JMeter, LoadRunner, Locust, Playwright/Puppeteer (with added multi-user support)
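The multi-user support added on top of tools like JMeter and Locust comes down to concurrent virtual users sampling latency against a target. A stdlib sketch of that pattern (the stub target stands in for a real HTTP call; user counts are arbitrary):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_load(target, users=8, requests_per_user=25):
    """Run `users` concurrent virtual users against `target`, collecting per-call latency."""
    def one_user(_):
        samples = []
        for _ in range(requests_per_user):
            start = time.perf_counter()
            target()                                   # in practice: an HTTP request
            samples.append(time.perf_counter() - start)
        return samples

    with ThreadPoolExecutor(max_workers=users) as pool:
        latencies = sorted(s for user in pool.map(one_user, range(users)) for s in user)
    p95 = latencies[int(0.95 * (len(latencies) - 1))]  # nearest-rank p95
    return len(latencies), p95

# Stub target standing in for a real endpoint; ~1 ms of simulated work.
count, p95 = run_load(lambda: time.sleep(0.001))
```

Real frameworks add ramp-up, think time, and result streaming, but the latency-sampling core is this loop.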
APM/Observability: Datadog, New Relic, AppDynamics, Grafana, YourKit, JVM tuning, ELK

Programming languages: Node.js, JavaScript, Java, Python

Technologies and cloud: EC2, SSD, ECS, Docker, Node.js, JVM, Cassandra, Postgres, AWS, Redis, Kafka, RabbitMQ, Elasticsearch, Kibana

### Principal Engineer - Performance | Observability | Instrumentation @ Salesforce

Jan 2021 – Jan 2022 | San Francisco Bay Area

- Improved end-to-end user experience of the AI and natural-language-powered Einstein Search platform by identifying scale limits and optimization areas through measuring the breakdown of latency from UI interaction to the last hop.
- Designed and implemented a new capability to measure real user page load time, intercepting traffic and the time spent in various components and sending it to a metrics store for waterfall breakdown comparison.
- Acted as product owner to prioritize and align short-term and long-term performance goals; mentored and managed a small team.
- Patent on a UI load generation and waterfall-based analytics user experience platform.
- Custom instrumentation across the client and API tiers for latency breakdown.
- Backend and UI performance load generation and metrics collection framework for real-time analysis and regression detection.
- Built a system crawler to watch and collect system health and custom metrics from tens of thousands of DB and runtime production nodes; later used the data for capacity planning and estimating private datacenter capacity adds.
- SRE duties: monitoring, debugging, and RCA of production services in private and public clouds.
- Root-caused performance and scalability issues and worked closely with development teams to make improvements.
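The waterfall breakdown mentioned above can be reduced to a toy computation: given per-component timings for one request, report each hop's share of total latency and flag the slowest hop. Component names and numbers here are illustrative, not real Einstein Search data:

```python
def waterfall(spans):
    """spans: component -> milliseconds spent. Return (percent breakdown, slowest hop)."""
    total = sum(spans.values())
    breakdown = {name: round(100 * ms / total, 1) for name, ms in spans.items()}
    return breakdown, max(spans, key=spans.get)

# Illustrative single-request timings: browser render, API tier, search backend, DB.
breakdown, slowest = waterfall({"ui": 120, "api": 80, "search": 260, "db": 40})
```

Comparing these breakdowns across releases is what turns raw timings into a regression signal: a hop whose share grows release over release is the optimization target.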
Technology stack: Oracle, Solr, Node.js, Postgres, AWS, private cloud, Java, Spring, AngularJS

APM: AppDynamics, New Relic, VisualVM, YourKit, Splunk

Programming: Python, Java, Node.js, JavaScript, jQuery/HTML

Cloud: load balancers, AWS, Elasticsearch, Kibana, Splunk, Redis, auto scaling, ECS, Docker

### LMTS Performance and Observability @ Salesforce

Jan 2017 – Jan 2021 | San Francisco Bay Area

- Led the Salesforce Einstein Search performance initiative: improving user experience and scaling to millions of queries per day.
- Architected and designed backend and UI perf instrumentation frameworks to capture performance metrics.
- Managed and led an observability platform that accurately measures the waterfall breakdown of each request/webpage round trip, with actionable analytics for trending, drill-down into the cause of slowness, and insight into optimization opportunities.

### Sr Staff Performance Architect/Manager - Cloud and Infrastructure Platform @ GE Digital

Jan 2016 – Jan 2017 | San Francisco Bay Area

- Architected a Performance-as-a-Service platform to improve developer productivity.
- Telemetry design and anomaly tracing.
- Cloud platform and infrastructure benchmarking and architecture tools; RCA to improve MTTR.
- Infrastructure optimization and capacity planning.
- Uptime and SLA improvements for better customer experience.

### Staff Performance Scale and Capacity Engineer @ GE Digital

Jan 2014 – Jan 2016 | San Francisco Bay Area

The GE Predix IoT graph platform is all about scale and performance.

- Provided architectural and design input to scale a multi-tenant cloud platform based on microservices.
- Improved throughput, stability, and latency of cloud-based IoT microservices.
- Provided ways to safeguard apps during network, database, and load issues to maintain uptime and SLAs.
- Benchmarked and identified scale and operational limitations of the data platform.
- Addressed multi-tenancy issues in the cloud platform and the approach to scale within SLA.
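One standard way to safeguard an app during network, database, or load trouble, as described for the Predix platform above, is a circuit breaker. A minimal sketch, not any specific GE implementation (class names and thresholds are invented; real breakers also add a recovery timeout):

```python
class CircuitBreaker:
    """Fail fast after `max_failures` consecutive errors to protect a sick dependency."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0

    @property
    def open(self):
        return self.failures >= self.max_failures

    def call(self, fn, fallback):
        if self.open:
            return fallback()              # circuit open: skip the dependency entirely
        try:
            result = fn()
        except Exception:
            self.failures += 1             # count consecutive failures
            return fallback()
        self.failures = 0                  # any success resets the count
        return result

def flaky():
    raise TimeoutError("db timed out")     # stand-in for a failing database call

breaker = CircuitBreaker(max_failures=2)
served = breaker.call(flaky, lambda: "cached value")
```

The design choice is the trade: serve degraded results (cache, defaults) instead of piling more load onto a dependency that is already failing, preserving uptime and SLAs.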
Production infrastructure capacity planning for distributed NoSQL (Cassandra), Solr, and app runtimes to estimate cost to serve and achieve operational excellence.

- Reviewed platform architecture and provided recommendations.
- Built benchmark and result-analysis tools and frameworks in Node.js, Java, and JavaScript to simplify performance benchmarking, analysis, and monitoring as a replacement for JMeter and LoadRunner.
- Identified platform scale, performance, and stability issues by running various benchmark workloads.
- Integrated performance testing into the CI pipeline.
- Root cause analysis of performance issues in production and performance environments, working with developers to fix the right problem.
- Built JMeter-based cloud infrastructure to automate benchmarking.

Technology stack: Cassandra, Solr, Node.js, Postgres, Cloud Foundry, AWS, private cloud, Java, Hibernate, OpenJPA, Spring, AngularJS

APM: AppDynamics, New Relic, VisualVM, YourKit

Programming: Java, Node.js, JavaScript, jQuery/HTML

### Lead Performance and SRE Engineer @ Lithium Technologies

Jan 2012 – Jan 2014 | San Francisco Bay Area

- Led performance engineering and certification of a J2EE-based SaaS social application.
- Responsible for RCA of production performance issues and tuning the SaaS platform for better performance.
- Improved stability and reduced downtime of the cloud platform by redesigning the cloud deployment and microservices architecture.
- Established a release performance certification process within Agile software development.
- Improved the toolset needed to reduce MTTR for production issues.
- Worked with product owners to identify business stories for performance testing and defined performance test data, scenarios, and methodologies.
- Participated in product architecture and implementation discussions and shared recommendations.
- Identified performance bottlenecks, narrowed down root causes, and drove fixes with developers.
- Identified performance monitoring metrics and added them to performance acceptance criteria.
- Created a product performance report for every release and shared it with stakeholders.
- Found the RCA of production performance outages in the SaaS environment to help developers fix the right problem, and validated fixes in pre- and post-production environments.
- Designed and implemented a monitoring and trending framework to log product and infrastructure performance metrics using Graphite.

Tech stack: Java, Spring, microservices, MySQL, Cassandra, JMeter, AppDynamics, Graphite, big data analytics, SaaS, cloud ops, public and private cloud

### Sr Performance Engineer Consultant @ HP Software

Jan 2010 – Jan 2012 | San Francisco Bay Area

Worked on a web-based IT helpdesk management solution.

- Created performance testing scripts for a Flex-based application.
- Executed and analyzed performance tests in a J2EE environment.
- Bottleneck analysis and tuning recommendations.
- Monitored and analyzed the end-to-end system to pinpoint performance issues.
- Baselining and benchmarking of performance.
- Database monitoring: analyzed SQL plans and AWR/ADDM reports.
- Customized application code to solve LoadRunner Flex issues.
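Several roles above log performance metrics to Graphite for trending. Its plaintext protocol is simple enough to sketch: one `metric.path value timestamp` line per sample, sent over TCP. The metric names and host below are placeholders:

```python
import socket
import time

def graphite_line(path, value, ts=None):
    """Format one sample in Graphite's plaintext protocol: 'metric.path value timestamp'."""
    return f"{path} {value} {ts if ts is not None else int(time.time())}\n"

def send_metrics(lines, host="graphite.example.com", port=2003):
    """Ship formatted lines to a carbon listener (2003 is carbon's plaintext port)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))

# Example sample for a per-release trending dashboard; the metric name is invented.
line = graphite_line("app.search.p95_ms", 412, ts=1670000000)
```

The dotted metric path is what makes trending and drill-down cheap: dashboards can aggregate or split on any path segment.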
### Sr Performance Engineer Consultant @ Intuit

Jan 2010 – Jan 2010 | San Francisco Bay Area

- Set up a performance test lab.
- Created performance testing scripts.
- Executed and analyzed performance tests in a J2EE environment.
- Bottleneck analysis and tuning recommendations.
- Monitored and analyzed the end-to-end system to pinpoint performance issues.
- Database monitoring: analyzed SQL plans and AWR/ADDM reports.

### Senior Performance Engineer @ Symphony Services

Jan 2007 – Jan 2009

- Evaluated load testing tools to fit application requirements.
- Understood the product architecture and devised performance test cases.
- Created LoadRunner scripts and scenarios, configured monitors, and ran benchmark tests.
- Monitored the end-to-end application stack and identified app, database, and third-party service bottlenecks.
- Filed performance issues and worked with developers on fixes.
- Created customer-facing reports.
- Analyzed AWR/ADDM reports to identify database issues.
- Analyzed SQL explain plans to find missing or bad indexes.

### Software Test Engineer @ Sling Media

Jan 2006 – Jan 2007

### Network Engineer @ Hughes Communications India Limited

Jan 2003 – Jan 2005

## Education

### Bachelor's in Electronics and Communication

Punjab Technical University

## Contact & Social

- LinkedIn: https://linkedin.com/in/sandeep-bansal-656811

---

Source: https://flows.cv/sandeepbansal
JSON Resume: https://flows.cv/sandeepbansal/resume.json
Last updated: 2026-04-12