At LexisNexis, I worked on building scalable machine learning solutions to extract insights from complex legal documents. I developed an end-to-end document intelligence platform that automated classification and information extraction, significantly reducing manual processing time for legal teams.
I built and optimized machine learning models using Scikit-learn and XGBoost, and later enhanced performance by integrating transformer-based NLP models from Hugging Face, improving classification accuracy by 18%. My work involved designing robust preprocessing and feature engineering pipelines using Pandas and NumPy to handle highly unstructured legal text.
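A minimal sketch of the kind of baseline classifier described above, using a Scikit-learn Pipeline with TF-IDF features and a linear model; the document texts, label names, and parameter choices here are illustrative assumptions, not the production setup:

```python
# Hypothetical document-classification sketch: TF-IDF features feeding a
# linear model, standing in for the Scikit-learn/XGBoost baseline.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-in corpus; the real data was large-scale legal text.
docs = [
    "This lease agreement is entered into by the tenant and landlord.",
    "The tenant shall pay rent on the first day of each month.",
    "Plaintiff filed a motion for summary judgment in district court.",
    "The court granted the defendant's motion to dismiss the complaint.",
]
labels = ["contract", "contract", "litigation", "litigation"]

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True)),
    ("model", LogisticRegression(max_iter=1000)),
])
clf.fit(docs, labels)
pred = clf.predict(["The landlord may terminate the lease with notice."])[0]
```

In a setup like this, the linear model can later be swapped for an XGBoost classifier or replaced end to end by a fine-tuned Hugging Face transformer without changing the surrounding pipeline code.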
To scale the system, I engineered distributed data pipelines using PySpark, enabling batch and near real-time processing of large legal datasets. I also deployed models as REST APIs using FastAPI and Docker, reducing response latency by 25% and improving integration across internal platforms.
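The production service used FastAPI and Docker; as a dependency-free illustration of the request/response shape of such a model-serving endpoint, here is a sketch built on Python's stdlib `http.server`. The `/classify` route, the payload fields, and the keyword-based stand-in model are all assumptions for illustration:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for the trained model; the real service wrapped the
# Scikit-learn/transformer classifier described above.
def classify(text: str) -> str:
    return "contract" if "lease" in text.lower() else "litigation"

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Hypothetical endpoint shape; the production API used FastAPI routing.
        if self.path != "/classify":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"label": classify(payload["text"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the sketch quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# Round-trip usage: POST a document, read back the predicted label.
req = urllib.request.Request(
    f"http://127.0.0.1:{port}/classify",
    data=json.dumps({"text": "lease agreement"}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.loads(resp.read())
server.shutdown()
```

FastAPI adds request validation, async handling, and OpenAPI docs on top of this basic pattern, which is part of what makes it a good fit for low-latency model serving.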
Additionally, I implemented MLOps workflows using MLflow and CI/CD pipelines, which streamlined experimentation, versioning, and deployment. Continuous monitoring and retraining improved model reliability by 20% over time.
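The tracking-and-promotion pattern that MLflow provides can be sketched in a few lines of stdlib Python; this is an illustration of the pattern only, and the `RunTracker` class, its file layout, and the metric names are hypothetical, not MLflow's API:

```python
import json
import tempfile
import time
import uuid
from pathlib import Path

class RunTracker:
    """Minimal experiment-tracking sketch: logs params and metrics per run
    to JSON files, mimicking the pattern MLflow provides out of the box."""

    def __init__(self, root: str):
        self.root = Path(root)

    def start_run(self) -> str:
        run_id = uuid.uuid4().hex[:8]
        (self.root / run_id).mkdir(parents=True, exist_ok=True)
        return run_id

    def log(self, run_id: str, params: dict, metrics: dict) -> None:
        record = {"time": time.time(), "params": params, "metrics": metrics}
        (self.root / run_id / "run.json").write_text(json.dumps(record))

    def best_run(self, metric: str) -> str:
        # Pick the run with the highest value of `metric` for promotion.
        runs = []
        for path in self.root.glob("*/run.json"):
            rec = json.loads(path.read_text())
            runs.append((rec["metrics"][metric], path.parent.name))
        return max(runs)[1]

# Usage: compare two hypothetical experiments and pick the stronger one.
tracker = RunTracker(tempfile.mkdtemp())
a = tracker.start_run()
tracker.log(a, {"model": "xgboost"}, {"f1": 0.81})
b = tracker.start_run()
tracker.log(b, {"model": "transformer"}, {"f1": 0.89})
```

MLflow layers model registries, artifact storage, and UI on top of this core idea, and wiring `best_run`-style selection into a CI/CD pipeline is what enables automated promotion and retraining.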