# Sandeep Bansal

> Principal Performance Engineer | Salesforce Agentforce Platform | Founder Jetmanlabs.com | Patented Instrumentation

Location: San Francisco Bay Area, United States
Profile: https://flows.cv/sandeepbansal

## Summary

15+ years of performance engineering and tools development experience improving the resiliency and performance of small to large scale cloud and distributed platforms: performance benchmarking, scalability experiments, and bottleneck identification with profiling, monitoring, instrumentation tools, and data science. Experience optimizing cloud platforms for companies large and small, including Salesforce, GE Digital, ClickUp, Lithium Technologies, and Intuit.

Experience building functional and performance benchmarking and scalability frameworks to test AI models against large data sets with real-time analytics, uncovering scale and latency issues. Optimized scalability and end-to-end response time for the natural language AI systems behind Einstein Search.

Architected and built multi-user microservices and user navigation performance testing and workload generation frameworks using open source tools such as JMeter, Locust, and Puppeteer to measure latency in production environments and identify latency and resilience issues. Troubleshot and profiled production slowness and uptime reliability issues across the end-to-end multi-tier stack: infrastructure, database, services runtime, and end user. Tuned platforms and identified optimization opportunities using custom tracing and observability collectors across the stack.

Observability expertise in real user monitoring (RUM) and performance metrics instrumentation to measure user page load time in production and pre-production environments. Capacity planning to optimize cloud cost by building models over data such as user latency, service health, platform usage patterns, and infrastructure resource usage, ensuring no customer impact.
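The capacity-planning models above start from periodic host-health samples. As a loose illustration only (the field names and JSON layout are invented here, not the production agent), such a collector can start as small as:

```python
import json
import os
import shutil
import time

def host_snapshot(path="/"):
    """Collect one host-health sample as a JSON-serializable dict."""
    load1, load5, load15 = os.getloadavg()        # 1/5/15-minute run-queue averages (Unix)
    disk = shutil.disk_usage(path)                # total/used/free bytes for the mount
    return {
        "ts": int(time.time()),                   # sample timestamp, epoch seconds
        "load_1m": load1,
        "load_5m": load5,
        "load_15m": load15,
        "disk_used_pct": round(100 * disk.used / disk.total, 2),
    }

# One line of the JSON stream an agent like this would publish to a metrics store.
sample = json.dumps(host_snapshot())
```

A real agent would run this on a timer and ship batches to the metrics store; the snapshot shape here is just a sketch.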
Built and maintained a performance and scalability lab to run benchmarks, synthetic monitoring, profiling, and capacity planning exercises. Built API-driven test data for workload generation to reproduce customer scenarios and profile causes of slowness. Shifted performance left by running perf tests in CI/CD to identify issues early in the development lifecycle.

## Patent and Innovation

- Patent on actionable instrumentation and observability for a real user monitoring platform.
- Developed a multi-user UI performance and synthetic user experience monitoring and observability platform that instruments DevTools performance and profiling metrics to identify slow page loads and loading issues.
- Built a Python machine agent that collects host and service runtime metrics at the edge and publishes them to a JSON store for decision-making and anomaly detection.

## Work Experience

### Principal Engineer - Agentforce Platform @ Salesforce

Jan 2023 – Present | San Francisco Bay Area

- Architecture and design to scale the Agentforce platform, which lets customers build and deploy agents that add AI capabilities to their products.
- LLM and model benchmarking across various LLMs and GPUs; built tools to benchmark models and find the optimal cost to serve.
- Built a mock platform that simulates LLM responses, saving millions of dollars in internal development costs.

### Founder @ JetManLabs

Jan 2019 – Present | San Francisco Bay Area

- Founder of jetmanlabs.com. Jetman is a developer productivity platform that enables development teams to design, build, mock, and test APIs and ship products faster with higher quality.
- Owned everything from ground zero: product vision, architecture, design, and full stack development in Node.js, JavaScript, HTML, and CSS.
- Hired and managed a team of 5 engineers to build supporting products.
- Developed platform-independent macOS and Windows clients for API development and testing.
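The API mocking idea behind platforms like Jetman (and the LLM mock platform above) can be sketched in miniature; the route table, endpoints, and payloads below are invented purely for illustration:

```python
import json

# Hypothetical route table: (method, path) -> (status code, canned JSON body).
MOCK_ROUTES = {
    ("GET", "/users/42"): (200, {"id": 42, "name": "Ada"}),
    ("POST", "/orders"): (201, {"order_id": "A-1001", "status": "created"}),
}

def mock_response(method, path):
    """Return (status, JSON body) for a registered route, or a 404 stub."""
    status, body = MOCK_ROUTES.get(
        (method.upper(), path), (404, {"error": "no mock registered"})
    )
    return status, json.dumps(body)

# A test client calls the mock instead of the unfinished (or expensive) real backend.
status, body = mock_response("get", "/users/42")
```

In practice the route table sits behind an HTTP server, but the core idea is just this lookup: canned responses stand in for a dependency that is not built yet or too costly to call.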
Developed the cloud platform and portal as microservices running on Google Cloud (GCP).

### Principal Engineer - Performance | Instrumentation | Observability @ ClickUp

Jan 2022 – Jan 2023 | San Francisco Bay Area

- Led the effort to scale and improve end-to-end user experience latency across the multi-tier, AWS-based cloud platform of the ClickUp CRM product.
- Architected and implemented distributed multi-user API and user experience benchmarking frameworks, run in perf and production environments, to ensure service and end-user latency and resiliency stay within SLA as releases and cloud technologies change.
- Designed and implemented test data generation frameworks for backend and AI use cases across OpenAI and CRM data applications.
- Monitored production services and infrastructure health with actionable trend analysis as new features shipped and customer usage patterns changed.
- Instrumented and profiled end-user latency issues across the front end, services, and AWS cloud infrastructure.
- Shortened the feedback cycle for service latency and impact by shifting performance testing left into CI/CD.
- Infrastructure capacity planning for databases (Postgres, Elasticsearch, Redis) and other app runtime services running on ECS by understanding user traffic patterns and time of use and mapping them to infrastructure resource usage.
- RCA and MTTR reduction for production performance and scalability issues, working closely with development teams on improvements.
- Architected instrumentation to collect performance metrics and signals for user page load time measurement and to find opportunities to improve loading time.

Perf tools: JMeter, LoadRunner, Locust, Playwright/Puppeteer (with added multi-user support)
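The multi-user support added on top of tools like JMeter and Locust comes down to concurrent virtual users sampling latency against a target. A stdlib sketch of that pattern (the stub target stands in for a real HTTP call; user counts are arbitrary):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_load(target, users=8, requests_per_user=25):
    """Run `users` concurrent virtual users against `target`, collecting per-call latency."""
    def one_user(_):
        samples = []
        for _ in range(requests_per_user):
            start = time.perf_counter()
            target()                                   # in practice: an HTTP request
            samples.append(time.perf_counter() - start)
        return samples

    with ThreadPoolExecutor(max_workers=users) as pool:
        latencies = sorted(s for user in pool.map(one_user, range(users)) for s in user)
    p95 = latencies[int(0.95 * (len(latencies) - 1))]  # nearest-rank p95
    return len(latencies), p95

# Stub target standing in for a real endpoint; ~1 ms of simulated work.
count, p95 = run_load(lambda: time.sleep(0.001))
```

Real frameworks add ramp-up, think time, and result streaming, but the latency-sampling core is this loop.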
APM/Observability: Datadog, New Relic, AppDynamics, Grafana, YourKit, JVM tuning, ELK

Programming languages: Node.js, JavaScript, Java, Python

Technologies and cloud: EC2, SSD, ECS, Docker, Node.js, JVM, Cassandra, Postgres, AWS, Redis, Kafka, RabbitMQ, Elasticsearch, Kibana

### Principal Engineer - Performance | Observability | Instrumentation @ Salesforce

Jan 2021 – Jan 2022 | San Francisco Bay Area

- Improved end-to-end user experience of the AI and natural-language-powered Einstein Search platform by identifying scale limits and optimization areas through measuring the breakdown of latency from UI interaction to the last hop.
- Designed and implemented a new capability to measure real user page load time, intercepting traffic and the time spent in various components and sending it to a metrics store for waterfall breakdown comparison.
- Acted as product owner to prioritize and align short-term and long-term performance goals; mentored and managed a small team.
- Patent on a UI load generation and waterfall-based analytics user experience platform.
- Custom instrumentation across the client and API tiers for latency breakdown.
- Backend and UI performance load generation and metrics collection framework for real-time analysis and regression detection.
- Built a system crawler to watch and collect system health and custom metrics from tens of thousands of DB and runtime production nodes; later used the data for capacity planning and estimating private datacenter capacity adds.
- SRE duties: monitoring, debugging, and RCA of production services in private and public clouds.
- Root-caused performance and scalability issues and worked closely with development teams to make improvements.
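The waterfall breakdown mentioned above can be reduced to a toy computation: given per-component timings for one request, report each hop's share of total latency and flag the slowest hop. Component names and numbers here are illustrative, not real Einstein Search data:

```python
def waterfall(spans):
    """spans: component -> milliseconds spent. Return (percent breakdown, slowest hop)."""
    total = sum(spans.values())
    breakdown = {name: round(100 * ms / total, 1) for name, ms in spans.items()}
    return breakdown, max(spans, key=spans.get)

# Illustrative single-request timings: browser render, API tier, search backend, DB.
breakdown, slowest = waterfall({"ui": 120, "api": 80, "search": 260, "db": 40})
```

Comparing these breakdowns across releases is what turns raw timings into a regression signal: a hop whose share grows release over release is the optimization target.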
Technology stack: Oracle, Solr, Node.js, Postgres, AWS, private cloud, Java, Spring, AngularJS

APM: AppDynamics, New Relic, VisualVM, YourKit, Splunk

Programming: Python, Java, Node.js, JavaScript, jQuery/HTML

Cloud: load balancers, AWS, Elasticsearch, Kibana, Splunk, Redis, auto scaling, ECS, Docker

### LMTS Performance and Observability @ Salesforce

Jan 2017 – Jan 2021 | San Francisco Bay Area

- Led the Salesforce Einstein Search performance initiative: improving user experience and scaling to millions of queries per day.
- Architected and designed backend and UI perf instrumentation frameworks to capture performance metrics.
- Managed and led an observability platform that accurately measures the waterfall breakdown of each request/webpage round trip, with actionable analytics for trending, drill-down into the cause of slowness, and insight into optimization opportunities.

### Sr Staff Performance Architect/Manager - Cloud and Infrastructure Platform @ GE Digital

Jan 2016 – Jan 2017 | San Francisco Bay Area

- Architected a Performance-as-a-Service platform to improve developer productivity.
- Telemetry design and anomaly tracing.
- Cloud platform and infrastructure benchmarking and architecture tools; RCA to improve MTTR.
- Infrastructure optimization and capacity planning.
- Uptime and SLA improvements for better customer experience.

### Staff Performance Scale and Capacity Engineer @ GE Digital

Jan 2014 – Jan 2016 | San Francisco Bay Area

The GE Predix IoT graph platform is all about scale and performance.

- Provided architectural and design input to scale a multi-tenant cloud platform based on microservices.
- Improved throughput, stability, and latency of cloud-based IoT microservices.
- Provided ways to safeguard apps during network, database, and load issues to maintain uptime and SLAs.
- Benchmarked and identified scale and operational limitations of the data platform.
- Addressed multi-tenancy issues in the cloud platform and the approach to scale within SLA.
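One standard way to safeguard an app during network, database, or load trouble, as described for the Predix platform above, is a circuit breaker. A minimal sketch, not any specific GE implementation (class names and thresholds are invented; real breakers also add a recovery timeout):

```python
class CircuitBreaker:
    """Fail fast after `max_failures` consecutive errors to protect a sick dependency."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0

    @property
    def open(self):
        return self.failures >= self.max_failures

    def call(self, fn, fallback):
        if self.open:
            return fallback()              # circuit open: skip the dependency entirely
        try:
            result = fn()
        except Exception:
            self.failures += 1             # count consecutive failures
            return fallback()
        self.failures = 0                  # any success resets the count
        return result

def flaky():
    raise TimeoutError("db timed out")     # stand-in for a failing database call

breaker = CircuitBreaker(max_failures=2)
served = breaker.call(flaky, lambda: "cached value")
```

The design choice is the trade: serve degraded results (cache, defaults) instead of piling more load onto a dependency that is already failing, preserving uptime and SLAs.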
Production infrastructure capacity planning for distributed NoSQL (Cassandra), Solr, and app runtimes to estimate cost to serve and achieve operational excellence.

- Reviewed platform architecture and provided recommendations.
- Built benchmark and result-analysis tools and frameworks in Node.js, Java, and JavaScript to simplify performance benchmarking, analysis, and monitoring as a replacement for JMeter and LoadRunner.
- Identified platform scale, performance, and stability issues by running various benchmark workloads.
- Integrated performance testing into the CI pipeline.
- Root cause analysis of performance issues in production and performance environments, working with developers to fix the right problem.
- Built JMeter-based cloud infrastructure to automate benchmarking.

Technology stack: Cassandra, Solr, Node.js, Postgres, Cloud Foundry, AWS, private cloud, Java, Hibernate, OpenJPA, Spring, AngularJS

APM: AppDynamics, New Relic, VisualVM, YourKit

Programming: Java, Node.js, JavaScript, jQuery/HTML

### Lead Performance and SRE Engineer @ Lithium Technologies

Jan 2012 – Jan 2014 | San Francisco Bay Area

- Led performance engineering and certification of a J2EE-based SaaS social application.
- Responsible for RCA of production performance issues and tuning the SaaS platform for better performance.
- Improved stability and reduced downtime of the cloud platform by redesigning the cloud deployment and microservices architecture.
- Established a release performance certification process within Agile software development.
- Improved the toolset needed to reduce MTTR for production issues.
- Worked with product owners to identify business stories for performance testing and defined performance test data, scenarios, and methodologies.
- Participated in product architecture and implementation discussions and shared recommendations.
- Identified performance bottlenecks, narrowed down root causes, and drove fixes with developers.
- Identified performance monitoring metrics and added them to performance acceptance criteria.
- Created a product performance report for every release and shared it with stakeholders.
- Found the RCA of production performance outages in the SaaS environment to help developers fix the right problem, and validated fixes in pre- and post-production environments.
- Designed and implemented a monitoring and trending framework to log product and infrastructure performance metrics using Graphite.

Tech stack: Java, Spring, microservices, MySQL, Cassandra, JMeter, AppDynamics, Graphite, big data analytics, SaaS, cloud ops, public and private cloud

### Sr Performance Engineer Consultant @ HP Software

Jan 2010 – Jan 2012 | San Francisco Bay Area

Worked on a web-based IT helpdesk management solution.

- Created performance testing scripts for a Flex-based application.
- Executed and analyzed performance tests in a J2EE environment.
- Bottleneck analysis and tuning recommendations.
- Monitored and analyzed the end-to-end system to pinpoint performance issues.
- Baselining and benchmarking of performance.
- Database monitoring: analyzed SQL plans and AWR/ADDM reports.
- Customized application code to solve LoadRunner Flex issues.
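Several roles above log performance metrics to Graphite for trending. Its plaintext protocol is simple enough to sketch: one `metric.path value timestamp` line per sample, sent over TCP. The metric names and host below are placeholders:

```python
import socket
import time

def graphite_line(path, value, ts=None):
    """Format one sample in Graphite's plaintext protocol: 'metric.path value timestamp'."""
    return f"{path} {value} {ts if ts is not None else int(time.time())}\n"

def send_metrics(lines, host="graphite.example.com", port=2003):
    """Ship formatted lines to a carbon listener (2003 is carbon's plaintext port)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))

# Example sample for a per-release trending dashboard; the metric name is invented.
line = graphite_line("app.search.p95_ms", 412, ts=1670000000)
```

The dotted metric path is what makes trending and drill-down cheap: dashboards can aggregate or split on any path segment.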
### Sr Performance Engineer Consultant @ Intuit

Jan 2010 – Jan 2010 | San Francisco Bay Area

- Set up a performance test lab.
- Created performance testing scripts.
- Executed and analyzed performance tests in a J2EE environment.
- Bottleneck analysis and tuning recommendations.
- Monitored and analyzed the end-to-end system to pinpoint performance issues.
- Database monitoring: analyzed SQL plans and AWR/ADDM reports.

### Senior Performance Engineer @ Symphony Services

Jan 2007 – Jan 2009

- Evaluated load testing tools to fit application requirements.
- Understood the product architecture and devised performance test cases.
- Created LoadRunner scripts and scenarios, configured monitors, and ran benchmark tests.
- Monitored the end-to-end application stack and identified app, database, and third-party service bottlenecks.
- Filed performance issues and worked with developers on fixes.
- Created customer-facing reports.
- Analyzed AWR/ADDM reports to identify database issues.
- Analyzed SQL explain plans to find missing or bad indexes.

### Software Test Engineer @ Sling Media

Jan 2006 – Jan 2007

### Network Engineer @ Hughes Communications India Limited

Jan 2003 – Jan 2005

## Education

### Bachelor's in Electronics and Communication

Punjab Technical University

## Contact & Social

- LinkedIn: https://linkedin.com/in/sandeep-bansal-656811

---

Source: https://flows.cv/sandeepbansal
JSON Resume: https://flows.cv/sandeepbansal/resume.json
Last updated: 2026-04-12