# Joel Maria

> Senior Staff Engineer | AI-Native & Event-Driven Systems at 50M+ Scale | LLM Infrastructure

Location: United States
Profile: https://flows.cv/joelmaria

Senior Staff Engineer with 16+ years architecting high-availability, distributed systems operating at massive concurrency and enterprise scale (50M+ users). I specialize in designing AI-native production infrastructure by integrating Retrieval-Augmented Generation (RAG), embedding pipelines, vector search, and LLM orchestration into streaming-first, event-driven systems.

My focus is not AI wrappers. I build cost-governed, fault-tolerant AI platforms that sustain real-world load.

Throughout my career, I’ve operated at the intersection of:

● High-scale distributed systems
● Real-time streaming architectures
● Generative AI infrastructure
● FinTech-grade reliability and security

At U.S. Bank, I co-defined the architecture for an AI-powered voice assistant, integrating secure mobile clients with ML inference services under strict banking constraints.

At Upwork, I led architectural design for creator and gaming platforms serving 50M+ users, re-architecting data access layers for 200% performance gains and building streaming-first telemetry contracts enabling AI-ready workloads.

Previously at GRAX, I scaled a Kubernetes-native SaaS platform to 100M+ monthly API requests, halving end-to-end latency through migration to WebSocket streaming and event-driven communication.

Core Architecture Domains

AI-Native Infrastructure

● Retrieval-Augmented Generation (RAG)
● Embedding pipelines & sub-120ms vector similarity search
● LLM orchestration (guardrails, routing, deterministic fallbacks)
● Token governance & inference cost optimization (38% reduction)
● Production-grade AI observability & performance controls

Distributed Systems at Scale

● Event-Driven Architecture (Kafka / Kafka Streams)
● High-concurrency microservices (Node.js / Python)
● GraphQL federation & data-layer optimization
● Sub-second telemetry & streaming pipelines
● Multi-tenant Kubernetes environments

Cost & Performance Governance

● Reduced cloud operational spend by $67K/month via container rightsizing & caching
● Engineered burst-resilient systems under financial transaction workloads
● Designed scalable identity layers (OAuth2/JWT) across distributed domains

## Work Experience

### Senior Software Engineer, AI-Scale Platform Architecture @ Upwork
Jan 2024 – Present

● Led architecture for high-scale creator and gaming platforms serving 50M+ global consumers, designing distributed, event-driven systems across Node.js (TypeScript) and Python (FastAPI, asyncio).
● Architected and scaled horizontally distributed microservices and Kafka-based streaming pipelines powering real-time telemetry and mission-critical financial transactions with strong consistency guarantees and resilience under burst traffic conditions.
● Re-architected GraphQL APIs and data access layers, achieving 200% performance gains through resolver optimization, request batching, caching strategies, and query cost governance.
● Engineered non-blocking I/O, clustering, and async concurrency controls to maximize throughput while sustaining low-latency SLAs across high-traffic endpoints.
● Designed streaming-first telemetry contracts and enrichment pipelines enabling AI/ML-ready workloads for behavioral analytics, fraud detection, and adaptive monetization.
● Improved platform reliability and cost efficiency through structured observability (metrics, tracing, logs) and increased mobile performance by 35% via deep React Native Hermes optimization.

### Logistics and Last-Mile AI Platform @ Upwork
Jan 2024 – Present | Boston, MA

● Architected a distributed AI-native logistics platform (React Native + Kubernetes) supporting real-time dispatch and routing across thousands of concurrent delivery flows.
● Designed Kafka streaming pipelines processing 50K+ GPS events/min, enabling sub-second anomaly detection and improving route deviation accuracy by 28%.
● Built a RAG-powered operational intelligence layer that reduced manual escalations by 35% through contextual retrieval and exception automation.
● Implemented embedding pipelines and low-latency vector search (<120ms), powering similarity-based decision augmentation for routing and driver behavior analysis.
● Engineered a production-grade LLM orchestration layer (prompt routing, guardrails, deterministic fallbacks), increasing automated resolution rates from 42% to 71%.
● Optimized LLM inference costs via selective invocation and token control strategies, reducing token usage by 38% and lowering monthly AI infrastructure costs by 31%.

### Lead Engineer – AI Voice & Distributed Banking Systems @ U.S. Bank
Jan 2021 – Jan 2024 | Boston, MA

● Co-defined architecture for AI voice assistant powered by Node.js, Python, and TensorFlow inference services.
● Designed secure integration layer between mobile clients, ML inference services, and core banking systems.
● Implemented streaming API interactions for low-latency inference workflows.
● Designed scalable microservice communication model across distributed domains.
● Reduced application footprint by 20% through modular build strategies and dependency rationalization.
● Led 22 engineers building enterprise UI platform with 80%+ test coverage.

### Fullstack Engineer – High-Scale Distributed SaaS Platform @ GRAX
Jan 2019 – Jan 2020 | Boston, MA

● Architected a Kubernetes-native Node.js platform handling 100M+ monthly API requests, enabling automated horizontal scaling and high-availability multi-tenant SaaS workloads.
● Led migration from REST polling to WebSocket streaming, reducing end-to-end latency by 50% and doubling real-time concurrency capacity.
● Optimized cloud infrastructure through container rightsizing, caching strategies, and batch processing, reducing operational costs by $67K/month.
● Designed event-driven data pipelines with BigQuery and implemented distributed OAuth2/JWT identity across services, establishing scalable retrieval and indexing patterns later leveraged for semantic and vector-based search architectures.

### Senior Software Engineer – Distributed Identity & Microservices @ CoStar
Jan 2016 – Jan 2019 | Boston, MA

● Designed OAuth2 + JWT authentication systems supporting 5M+ daily logins.
● Built Backend-for-Frontend layer consolidating multiple domain services.
● Developed 60+ REST services with resilience patterns and observability hooks.
● Improved fault isolation and error handling across distributed services.

### Fraud Detection Engineer – Streaming & Real-Time Systems @ Fidelity Investments
Jan 2015 – Jan 2016 | Boston, MA

● Built real-time fraud detection services using Kafka, Redis, Django.
● Designed event-driven anomaly detection pipelines.
● Integrated external intelligence APIs for automated risk verification.
● Delivered real-time visualization dashboards for transaction monitoring.

### Full-stack React.js Developer @ Verizon
Jan 2014 – Jan 2015 | Lawrence, MA

● Built RESTful microservices in Node.js + MongoDB to manage nationwide cable infrastructure and equipment mapping.
● Developed cloud management modules for compute, storage, and network automation on Verizon IaaS platform.
● Enhanced platform scalability and data synchronization across distributed systems.

### Senior Mobile Engineer @ Bank of America
Jan 2013 – Jan 2014 | Boston, MA

● Developed native mobile modules in Java and Objective-C, integrating secure biometrics and OCR scanning into a hybrid Cordova/PhoneGap app used by 46M+ customers.
● Modernized legacy banking workflows by introducing Backbone.js and AngularJS architectures, significantly improving UI performance and processing speed for millions of users.
● Refactored large-scale Java monoliths into RESTful web services, increasing scalability and reliability for systems handling 1.6B+ annual transactions.

### CSS3 Lead Developer @ TD Ameritrade
Jan 2012 – Jan 2013 | New Jersey, United States

● Built a SASS/Compass framework automating sprite generation and vendor prefixing, cutting load times by 40%.
● Applied OOCSS + Bootstrap/LESS for modular, maintainable, and responsive UI architecture.
● Delivered a hybrid Cordova/HTML5 app deployed on Android and iOS.

### Freelance Back-End and Front-End Developer @ Elance
Jan 2008 – Jan 2012 | Lawrence, MA

● Built full-stack PHP/CodeIgniter systems including RESTful APIs (JSON), data-feed processors (CSV/XML/RSS), web crawlers, and automation pipelines using cron jobs, cURL, GD, and MySQL/PostgreSQL, powering high-volume data ingestion and reporting.
● Delivered custom enterprise modules such as multi-currency invoicing with PDF generation, Facebook business-page integration apps, image-processing pipelines, and dynamic UI components using HTML/CSS, jQuery, AJAX, and Fancybox.
● Converted complex PSDs, Flash components, and legacy ColdFusion/JavaScript assets into high-performance, standards-compliant HTML/CSS interfaces, improving SEO, accessibility, and cross-browser reliability for multiple client platforms.

## Contact & Social

- LinkedIn: https://linkedin.com/in/joel-maria-960a7820
- Portfolio: https://joelmaria-developer.web.app/
- Portfolio: https://jmstechnologiesinc.com

---

Source: https://flows.cv/joelmaria
JSON Resume: https://flows.cv/joelmaria/resume.json
Last updated: 2026-04-20