# Nasr Mohiuddin Syed

> Senior AI/ML Engineer | LLMs (LLaMA, Sonar) | RAG & Agentic AI | LangChain, LangGraph, CrewAI | LLMOps, PyTorch, AWS | Ex-Meta, Perplexity

Location: San Francisco Bay Area, United States
Profile: https://flows.cv/nasr

AI/ML Engineer with 5+ years of experience building and deploying production-grade large language model (LLM) systems across search, finance, and enterprise applications. Currently working on scaling LLM infrastructure and agentic AI systems, with hands-on experience fine-tuning and deploying models like LLaMA 3.x and Sonar LLM for real-time, high-throughput environments.

My work focuses on:

- Designing RAG pipelines using LangChain, LangGraph, FAISS, and Pinecone
- Building multi-agent AI systems with CrewAI
- Optimizing LLM inference and training using PyTorch, DeepSpeed, and vLLM
- Implementing LLMOps pipelines with MLflow, Airflow, and AWS

At Perplexity and Meta, I have:

- Improved model accuracy and reasoning performance across benchmarks
- Built scalable AI systems handling 10K+ daily interactions
- Delivered measurable impact on user engagement and system efficiency

I also bring experience in:

- Full-stack AI development (React, Next.js, FastAPI)
- AI governance, safety, and compliance (HIPAA, PCI-DSS)

Interested in building scalable, reliable, and high-impact AI systems leveraging LLMs and agentic architectures.

## Work Experience

### AI/ML Engineer @ Perplexity

Jan 2025 – Present | San Francisco, California

Working on large-scale LLM systems powering search, finance, and enterprise AI applications. Focused on agentic AI, RAG pipelines, and high-performance inference.

- Fine-tuned and deployed Sonar LLM (LLaMA 3.3 70B) using PyTorch and DeepSpeed on AWS, achieving 1,200+ tokens/sec throughput on large-scale inference infrastructure
- Built end-to-end RAG pipelines using LangGraph, FAISS, and Pinecone, improving factual accuracy to 92%+ for financial and enterprise use cases
- Integrated LLMs into multi-agent workflows using CrewAI, enabling SEC-compliant insights and increasing user engagement by 27%
- Engineered high-performance inference systems using vLLM, ONNX, and Triton, significantly reducing latency and cloud costs
- Applied prompt engineering techniques (CoT, few-shot) and RLAIF to improve reasoning and QA performance by 18% across benchmarks
- Developed full-stack AI features using React (Next.js), TypeScript, and Node.js, enabling real-time agent interactions and dynamic AI-driven UI
- Designed scalable training and evaluation pipelines using Airflow and Databricks, supporting rapid experimentation and model benchmarking
- Implemented trust and safety frameworks, including moderation APIs and citation validation, ensuring compliant and reliable AI outputs

### Software Engineer @ Meta

Jan 2024 – Jan 2025 | San Francisco, California

Focused on LLM systems, backend infrastructure, and scalable AI services for enterprise and internal platforms.

- Built scalable backend systems and microservices across hybrid cloud environments, improving system uptime by 40% and reducing issue resolution time by 30%
- Fine-tuned LLaMA models (8B & 70B) on domain-specific datasets, improving summarization and Q&A accuracy by 27% for enterprise users
- Developed RAG-based AI systems using LangChain, LlamaIndex, and Weaviate, reducing support ticket resolution time by 40%
- Built and deployed LLM-powered APIs and chat systems using FastAPI and Next.js, supporting 10K+ daily user interactions
- Optimized model inference using ONNX, quantization, and distributed GPU systems, achieving sub-200ms latency in production
- Implemented MLOps pipelines using MLflow, Prometheus, and AWS, enabling scalable model tracking, monitoring, and deployment
- Designed internal dashboards and developer tools using PostgreSQL, GraphQL, and REST APIs to monitor model performance and usage
- Ensured AI safety and compliance by implementing guardrails, PII filtering, and red-teaming workflows aligned with Responsible AI standards

### Software Engineer @ Accenture

Jan 2020 – Jan 2023 | India

Worked on ML models and AI systems in fintech and insurance domains, focusing on analytics, APIs, and deployment.

- Developed machine learning models for credit risk scoring using Python and scikit-learn, improving loan default prediction accuracy by ~18%
- Built AI-powered insurance solutions using TensorFlow and OpenCV, enabling automated health risk profiling and premium calculation
- Applied clustering techniques (KMeans, DBSCAN) to analyze customer behavior, increasing user engagement by 30% through personalization
- Developed RESTful APIs using FastAPI and Flask, deploying scalable services on AWS and Azure cloud platforms
- Contributed to full-stack development using React and Python backends, building dashboards and workflows for fintech and insurance clients
- Supported MLOps pipelines using MLflow, Airflow, and DVC, automating model versioning, tracking, and retraining processes
- Built data preprocessing and feature engineering pipelines to improve model performance and reliability across multiple use cases
- Delivered data visualization dashboards using Power BI and Streamlit, enabling business stakeholders to derive actionable insights

## Education

### Master's Degree in Computer Science

San José State University

### Bachelor's Degree in Computer Science

Osmania University

### All Saint's High School

## Contact & Social

- LinkedIn: https://linkedin.com/in/nasr-mohiuddin-syed-982689207
- Portfolio: https://syed-nasr08.github.io/syed-nasr08/

---

Source: https://flows.cv/nasr
JSON Resume: https://flows.cv/nasr/resume.json
Last updated: 2026-04-17