AI engineering, data pipelines, predictive modeling, and peer-reviewed research
Production RAG pipeline powering a university-facing chatbot. Built with Sentence Transformers embeddings, Llama 3 70B served through the Groq API, and a FAISS vector index over 5,000+ JSOM webpages collected with a custom web scraper. Improved response accuracy by ~85% and cut latency by ~95% to under 3 seconds per query. Deployed and validated with UT Dallas JSOM leadership.
End-to-end ML pipeline that reduced loan default rates from 25.8% to 3%, an 88.4% relative reduction. Built with XGBoost and neural networks, with SQL-validated data pipelines for feature engineering and model deployment. Evaluated on precision, recall, and AUC.
U.S. renewable energy investment analysis platform with real-time EIA and FRED data feeds, IRR/NPV/LCOE financial modeling, and an integrated Claude AI research assistant for natural language queries. Includes interactive charts, state-level breakdowns, and investment scenario comparisons.
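The three financial metrics named above can be illustrated with a minimal sketch. This is not the platform's code: the function names, the bisection bounds, and the cash-flow layout (index 0 is the investment year) are assumptions made for illustration only.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[t] occurs at the end of year t (t = 0 is today)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, low=-0.99, high=1.0, tol=1e-7):
    """Internal rate of return found by bisection.

    Assumes NPV changes sign exactly once between `low` and `high`
    (a conventional project: outflow first, inflows later).
    """
    f_low = npv(low, cash_flows)
    f_high = npv(high, cash_flows)
    if f_low * f_high > 0:
        raise ValueError("NPV does not change sign in the search interval")
    while high - low > tol:
        mid = (low + high) / 2
        f_mid = npv(mid, cash_flows)
        if f_low * f_mid <= 0:
            high = mid              # root lies in [low, mid]
        else:
            low, f_low = mid, f_mid  # root lies in [mid, high]
    return (low + high) / 2


def lcoe(costs, energy_mwh, rate):
    """Levelized cost of energy: discounted lifetime costs / discounted lifetime energy."""
    return npv(rate, costs) / npv(rate, energy_mwh)
```

For example, `irr([-100, 110])` converges to a rate of about 10%, the discount rate at which `npv` of the same cash flows is zero.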
AI-powered travel planning assistant that generates personalized itineraries using LLM inference. Clean web frontend built with React, backed by a FastAPI service handling itinerary generation, routing logic, and user preference management. Modular backend designed for scalability.
Ensemble model combining SVC, Naive Bayes, and Decision Trees to predict multiple diseases, raising accuracy from 95% to 99%. Reduced data processing time by 42% through SQL optimization. Research published in IJRTE, demonstrating a 25% increase in predictive accuracy for multi-disease scenarios.
Automatic Speech Recognition system for Tamil, a low-resource language, using fine-tuned Wav2Vec2 with a custom linear layer and tokenizer. Achieved 61.3% WER (word error rate, lower is better) on the Mozilla Common Voice dataset, improving on the prior state-of-the-art 69.76% WER. Published in Springer's Advances in Data Science and Computing Technologies.
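For readers unfamiliar with the metric, WER is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below shows that standard definition; it is not the evaluation script used in the paper, and the function name and whitespace tokenization are assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein dynamic programming).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.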
Explored NVIDIA's NeMo framework for Automatic Speech Recognition on Indian regional languages. Investigated pre-trained model effectiveness and fine-tuning strategies for low-resource language processing. Research published in AIP Conference Proceedings.
Comprehensive study of ML applications in healthcare, covering predictive modeling, patient outcome prediction, and medical data analysis with ensemble methods and modern AI/ML techniques. Published in IEEE International Conference proceedings.
Customer churn analysis on the Olist Brazilian e-commerce dataset. Applied RFM segmentation and cohort analysis to identify at-risk customer segments, then built predictive models to classify churn likelihood. Delivered actionable retention insights from real transaction-level data.
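As a rough illustration of the RFM step, the sketch below computes recency, frequency, and monetary value per customer from order records and flags customers whose last purchase is older than a cutoff. The tuple layout, the function names, and the 90-day threshold are hypothetical and are not taken from the project or from the Olist schema.

```python
from collections import defaultdict
from datetime import date


def rfm_table(orders, snapshot):
    """Build per-customer RFM values.

    orders: iterable of (customer_id, order_date, amount) tuples (assumed layout).
    snapshot: the date from which recency is measured.
    """
    last_order = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, order_date, amount in orders:
        if customer_id not in last_order or order_date > last_order[customer_id]:
            last_order[customer_id] = order_date
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        cid: {
            "recency_days": (snapshot - last_order[cid]).days,
            "frequency": frequency[cid],
            "monetary": monetary[cid],
        }
        for cid in last_order
    }


def at_risk(rfm, max_recency_days=90):
    """Customers with no purchase in the last `max_recency_days` days (illustrative rule)."""
    return sorted(cid for cid, row in rfm.items() if row["recency_days"] > max_recency_days)
```

In practice the three values are usually binned into quantile scores before segmenting; the single recency cutoff here only keeps the example short.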
Chrome extension that lets you track job applications directly from any job board, save listings, update application status, and add notes without leaving the page. Streamlines the job hunt workflow with a clean, minimal UI and persistent local storage.
Developed an ASR system for Tamil, a low-resource language, using fine-tuned Wav2Vec2 with a custom tokenizer. Achieved 61.3% WER on Mozilla Common Voice, improving on the prior benchmark of 69.76% WER.
Ensemble of SVC, Naive Bayes, and Decision Trees raised disease prediction accuracy from 95% to 99%. SQL optimization cut data processing time by 42%. Demonstrated a 25% improvement in multi-disease prediction accuracy.
Investigated NVIDIA NeMo's pre-trained models and fine-tuning strategies for ASR on Indian regional languages, advancing low-resource speech processing research.
Comprehensive study of ML applications in healthcare analytics, predictive modeling, patient outcome prediction, and medical data analysis using modern AI/ML and ensemble methods.
I'm open to AI engineering, data engineering, and data science opportunities. Let's build something impactful together.