# Nasr Mohiuddin Syed

> Senior AI/ML Engineer | LLMs (LLaMA, Sonar) | RAG & Agentic AI | LangChain, LangGraph, CrewAI | LLMOps, PyTorch, AWS | Ex-Meta, Perplexity

Location: San Francisco Bay Area, United States
Profile: https://flows.cv/nasr

AI/ML Engineer with 5+ years of experience building and deploying production-grade large language model (LLM) systems across search, finance, and enterprise applications.
 
Currently working on scaling LLM infrastructure and agentic AI systems, with hands-on experience fine-tuning and deploying models like LLaMA 3.x and Sonar LLM for real-time, high-throughput environments.
 
My work focuses on:

● Designing RAG pipelines using LangChain, LangGraph, FAISS, and Pinecone
● Building multi-agent AI systems with CrewAI
● Optimizing LLM inference and training using PyTorch, DeepSpeed, and vLLM
● Implementing LLMOps pipelines with MLflow, Airflow, and AWS
 
At Perplexity and Meta, I have:

● Improved reasoning and QA benchmark performance by 18% at Perplexity, and summarization and Q&A accuracy by 27% at Meta
● Built LLM-powered APIs and chat systems handling 10K+ daily interactions
● Increased user engagement by 27% and achieved sub-200ms inference latency in production
 
I also bring experience in:

● Full-stack AI development (React, Next.js, FastAPI)
● AI governance, safety, and compliance (HIPAA, PCI-DSS)
 
Interested in building scalable, reliable, and high-impact AI systems leveraging LLMs and agentic architectures.

## Work Experience
### AI/ML Engineer @ Perplexity
Jan 2025 – Present | San Francisco, California
Working on large-scale LLM systems powering search, finance, and enterprise AI applications. Focused on agentic AI, RAG pipelines, and high-performance inference.

● Fine-tuned and deployed Sonar LLM (LLaMA 3.3 70B) using PyTorch and DeepSpeed on AWS, achieving 1,200+ tokens/sec inference throughput
● Built end-to-end RAG pipelines using LangGraph, FAISS, and Pinecone, improving factual accuracy to 92%+ for financial and enterprise use cases
● Integrated LLMs into multi-agent workflows using CrewAI, enabling SEC-compliant insights and increasing user engagement by 27%
● Engineered high-performance inference systems using vLLM, ONNX, and Triton, reducing latency and cloud costs
● Applied prompt engineering techniques (CoT, few-shot) together with RLAIF-based tuning to improve reasoning and QA performance by 18% across benchmarks
● Developed full-stack AI features using React (Next.js), TypeScript, and Node.js, enabling real-time agent interactions and dynamic AI-driven UI
● Designed scalable training and evaluation pipelines using Airflow and Databricks, supporting rapid experimentation and model benchmarking
● Implemented trust and safety frameworks, including moderation APIs and citation validation, ensuring compliant and reliable AI outputs

### Software Engineer @ Meta
Jan 2024 – Jan 2025 | San Francisco, California
Focused on LLM systems, backend infrastructure, and scalable AI services for enterprise and internal platforms.

● Built scalable backend systems and microservices across hybrid cloud environments, improving system uptime by 40% and reducing issue resolution time by 30%
● Fine-tuned LLaMA models (8B & 70B) on domain-specific datasets, improving summarization and Q&A accuracy by 27% for enterprise users
● Developed RAG-based AI systems using LangChain, LlamaIndex, and Weaviate, reducing support ticket resolution time by 40%
● Built and deployed LLM-powered APIs and chat systems using FastAPI and Next.js, supporting 10K+ daily user interactions
● Optimized model inference using ONNX, quantization, and distributed GPU systems, achieving sub-200ms latency in production
● Implemented MLOps pipelines using MLflow, Prometheus, and AWS, enabling scalable model tracking, monitoring, and deployment
● Designed internal dashboards and developer tools using PostgreSQL, GraphQL, and REST APIs to monitor model performance and usage
● Ensured AI safety and compliance by implementing guardrails, PII filtering, and red-teaming workflows aligned with Responsible AI standards

### Software Engineer @ Accenture
Jan 2020 – Jan 2023 | India
Worked on ML models and AI systems in fintech and insurance domains, focusing on analytics, APIs, and deployment.

● Developed machine learning models for credit risk scoring using Python and scikit-learn, improving loan default prediction accuracy by ~18%
● Built AI-powered insurance solutions using TensorFlow and OpenCV, enabling automated health risk profiling and premium calculation
● Applied clustering techniques (KMeans, DBSCAN) to analyze customer behavior, increasing user engagement by 30% through personalization
● Developed RESTful APIs using FastAPI and Flask, deploying scalable services on AWS and Azure cloud platforms
● Contributed to full-stack development using React and Python backends, building dashboards and workflows for fintech and insurance clients
● Supported MLOps pipelines using MLflow, Airflow, and DVC, automating model versioning, tracking, and retraining processes
● Built data preprocessing and feature engineering pipelines to improve model performance and reliability across multiple use cases
● Delivered data visualization dashboards using Power BI and Streamlit, enabling business stakeholders to derive actionable insights


## Education
### Master's Degree in Computer Science
San José State University

### Bachelor's Degree in Computer Science
Osmania University

### All Saint's High School


## Contact & Social
- LinkedIn: https://linkedin.com/in/nasr-mohiuddin-syed-982689207
- Portfolio: https://syed-nasr08.github.io/syed-nasr08/

---
Source: https://flows.cv/nasr
JSON Resume: https://flows.cv/nasr/resume.json
Last updated: 2026-04-17