AI/ML Engineer | Data Science Specialist | Research-Driven Problem Solver

I’m Sasi Jyothirmai Bonu, an AI/ML Engineer with 5+ years of experience building machine learning and data science solutions across research, telecom, and agriculture.

Experience

Network Theory Applied Research Institute

AI/ML Engineer

2025 — Now

Worked with stakeholders to define real agricultural problems and translate them into clear use cases.

Collected and cleaned data from FAO, USDA, and research sources to create reliable training datasets.

Designed the data pipeline for ingestion, validation, and versioning so models are always trained on trusted data.

Trained and fine-tuned domain LLMs, optimizing accuracy, cost, and latency for production-ready deployment.

Mentored a graduate student in data preprocessing, pipeline design, and model evaluation, speeding up delivery and knowledge transfer.

SAMstream

AI/ML Engineer

2024 — 2025

Built an XGBoost model to predict federal contract bids, improving forecast accuracy by 20% and guiding pricing decisions.

Engineered and validated interpretable features to help pricing teams understand and justify model predictions.

Developed PySpark ETL pipelines for automated ingestion, reducing data latency by 60%.

Tuned hyperparameters and compared XGBoost with baseline models to justify the final model choice.

DEL Lab

Data Scientist

2024 — 2025

Boulder, Colorado, United States

Project 1: Researchers spent hours reviewing speech data and segmenting trials. The goal was to build a Streamlit dashboard powered by IBM Watson that segments trials and reduces review time.

Developed a Streamlit dashboard with IBM Watson Speech-to-Text to transcribe and segment 100+ learning trials, enabling faster identification of spoken number responses and reducing manual review time by an estimated 60%.

Project 2: The goal was to train an attention-based LSTM to convert between numbers, words, and visual blocks, helping compare how children learn numbers with how well a model can mimic that learning.

Provided data-driven insights into human cognition and children's development, education, and learning, leading to publications.

Trained an attention-based LSTM model to translate between Arabic numerals, number words, and visual blocks under three conditions: (1) numerals + words, (2) blocks + words, and (3) all three combined.

Project 3: The goal was to build a data analysis pipeline for studying how colors versus stickers affect how quickly new learners pick up typing on a keyboard.

Engineered data workflows for trial segmentation and gaze metric extraction (fixation area, duration, switching), resulting in high-quality derived datasets for downstream statistical modeling.

Boosted frame-level classification accuracy of egocentric video data by 30% using a Vision Transformer, enabling precise AOI tracking.

Wipro Limited

Data Engineer (Project Engineer): Data Analytics & AI

2020 — 2023

Project 1: The goal was to develop backfill pipelines that process late-arriving or missing subscriber data and to identify the source of the latency, ensuring accurate and timely campaign execution.

Extracted late-arriving records from upstream systems, then processed and analyzed the payloads to categorize errors by type, enabling targeted fixes.

Conducted multi-day analysis to identify patterns and error sources, informing preventive measures and improving pipeline reliability.

Project 2: Developed a data pipeline to generate clean, reliable datasets for weekly leadership reporting and business analysis.

Built an ETL pipeline that extracts data from PostgreSQL tables, then aggregates and transforms it weekly, ensuring accurate numbers for senior leadership's reporting to stakeholders.

Optimized pipeline performance to efficiently process growing historical data in HDFS, improving processing speed and resource utilization while maintaining data accuracy.

Project 3: Built a PySpark ETL job for automated monitoring that verifies data is processed reliably each day and raises an alert when it is not.

Developed PySpark transformations to calculate record counts and other metrics, ensuring daily data consistency and correctness across inputs and outputs.

Reduced manual monitoring efforts and improved data reliability by 45%, enabling faster detection of issues and minimizing potential disruptions.

Amrita Institute of Medical Sciences and Research Centre

Research Assistant

2020 — 2020

Kochi, Kerala, India

Improved epilepsy surgery planning by identifying seizure-onset zones from SEEG data and visualizing neural patterns to inform clinical decisions.

Developed statistical and ML models to localize seizure onset zones from SEEG recordings.

Built a clinical visualization tool that maps seizure-origin nodes to support surgical decisions, and authored a publication in a peer-reviewed UK journal.

Education

University of Colorado Boulder

Master of Science - MS

Amrita Vishwa Vidyapeetham

Bachelor of Technology - BTech