Portfolio

Caduceo

Developed a healthcare cost analysis chatbot that uses OCR, NLP, and machine learning to detect overcharges in medical bills, classifying each charge and explaining the result through a conversational interface. Engineered a multimodal AI pipeline integrating Azure AI Vision for OCR, a 4-bit quantized LLaMA 3.2B model, DBSCAN clustering, and MongoDB/Snowflake, achieving a silhouette coefficient above 0.90 when clustering charges for anomaly detection. The project earned 2nd Place in the Assurant Challenge: Revolutionize AI Solutioning with Multimodal Agentic AI, where I collaborated with a team to deliver an end-to-end solution combining large language models, clustering algorithms, cloud services, and real-time data infrastructure.

Detecting Flash Crash Precursors in Bitcoin Market Using Supervised and Unsupervised Learning

Investigated flash crashes in the Bitcoin market using features extracted from high-frequency order book data. Applied supervised models such as Random Forests and unsupervised autoencoders to detect pre-crash anomalies with strong predictive performance. Demonstrated that microstructure signals like liquidity imbalances and bid-ask spreads can provide advance warning of extreme price movements.
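As an illustration of the two microstructure signals named above, here is a minimal sketch of how a liquidity imbalance and a relative bid-ask spread can be derived from a single order book snapshot. The snapshot layout, the function names, and the prices are hypothetical and are not taken from the project's dataset or code.

```python
def order_book_imbalance(bid_sizes, ask_sizes):
    """(bid volume - ask volume) / total volume over the given levels, in [-1, 1]."""
    bid_volume, ask_volume = sum(bid_sizes), sum(ask_sizes)
    total = bid_volume + ask_volume
    return (bid_volume - ask_volume) / total if total else 0.0


def relative_spread(best_bid, best_ask):
    """Bid-ask spread divided by the mid price."""
    mid = (best_bid + best_ask) / 2
    return (best_ask - best_bid) / mid


# Hypothetical snapshot: (price, size) pairs, best level first.
snapshot = {
    "bids": [(29990.0, 0.5), (29985.0, 1.5)],
    "asks": [(30010.0, 3.0), (30015.0, 5.0)],
}

imbalance = order_book_imbalance(
    [size for _, size in snapshot["bids"]],
    [size for _, size in snapshot["asks"]],
)
spread = relative_spread(snapshot["bids"][0][0], snapshot["asks"][0][0])
print(f"imbalance={imbalance:.2f}, relative spread={spread:.6f}")
```

A strongly negative imbalance means resting sell volume outweighs buy volume. Features of this kind, computed per snapshot, are what a classifier or autoencoder would consume.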

Multi-Model Monte Carlo Portfolio Optimization

Developed a multi-model Monte Carlo framework to optimize equity portfolios under realistic price and interest-rate dynamics. Simulated asset paths using Geometric Brownian Motion, Merton jump-diffusion, CEV, and Heston models alongside Vasicek, CIR, and Ho-Lee rate scenarios to identify allocations that maximized the Sharpe ratio. Enhanced risk-adjusted performance by incorporating vanilla options that reduced downside exposure while preserving upside potential.
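The core loop of such a framework can be sketched with the simplest of the models listed, Geometric Brownian Motion. The example below is a simplified illustration under stated assumptions, not the project's implementation: two independent assets, a constant risk-free rate, a one-year horizon, hypothetical drift and volatility values, and a coarse grid search over weights. The jump-diffusion, CEV, Heston, and stochastic-rate models are not covered.

```python
import math
import random
from statistics import mean, stdev


def simulate_gbm_returns(mu, sigma, horizon, n_paths, rng):
    """Simple returns S_T / S_0 - 1 under GBM with drift mu and volatility sigma."""
    drift = (mu - 0.5 * sigma ** 2) * horizon
    vol = sigma * math.sqrt(horizon)
    return [math.exp(drift + vol * rng.gauss(0.0, 1.0)) - 1.0 for _ in range(n_paths)]


def sharpe_ratio(returns, risk_free):
    """Mean excess return divided by the sample standard deviation."""
    return (mean(returns) - risk_free) / stdev(returns)


rng = random.Random(42)  # fixed seed so the run is repeatable
risk_free = 0.03         # hypothetical constant rate

# Hypothetical parameters; the two assets are simulated independently.
asset_a = simulate_gbm_returns(mu=0.08, sigma=0.15, horizon=1.0, n_paths=5000, rng=rng)
asset_b = simulate_gbm_returns(mu=0.12, sigma=0.30, horizon=1.0, n_paths=5000, rng=rng)

# Grid search over the weight of asset A (the remainder goes to asset B).
best_weight, best_sharpe = None, -math.inf
for step in range(11):
    weight = step / 10
    portfolio = [weight * a + (1 - weight) * b for a, b in zip(asset_a, asset_b)]
    candidate = sharpe_ratio(portfolio, risk_free)
    if candidate > best_sharpe:
        best_weight, best_sharpe = weight, candidate

print(f"best weight on asset A: {best_weight:.1f}, Sharpe ratio: {best_sharpe:.3f}")
```

Swapping `simulate_gbm_returns` for another path generator leaves the allocation search unchanged, which is the appeal of comparing several models within one framework.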

Predicting New Bike Shares

Conducted a detailed exploratory analysis of a bike-sharing dataset and identified “hour of the day” as a key predictor of usage patterns. Built and evaluated multiple machine learning models (Random Forest, Feedforward Neural Network, and Gradient Boosting Machine) using k-fold cross-validation. Model performance was assessed using MSE and RMSE, with the Random Forest achieving the best results (RMSE: 258.57).
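The evaluation procedure can be illustrated with a small, dependency-free sketch of k-fold cross-validation scored by RMSE. To keep it short, the model here is a plain hour-of-day mean baseline rather than the Random Forest used in the project, and the data are synthetic. The function names and the shape of the synthetic demand curve are assumptions made for this example.

```python
import math
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k nearly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_hourly_mean(rows):
    """Baseline model: mean count per hour, plus the global mean as a fallback."""
    by_hour = {}
    for hour, count in rows:
        by_hour.setdefault(hour, []).append(count)
    global_mean = mean(count for _, count in rows)
    return {hour: mean(counts) for hour, counts in by_hour.items()}, global_mean


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(mean((a - p) ** 2 for a, p in zip(actual, predicted)))


def cross_validated_rmse(rows, k=5):
    """Average RMSE over k held-out folds."""
    scores = []
    for fold in k_fold_indices(len(rows), k):
        held_out = set(fold)
        train = [row for i, row in enumerate(rows) if i not in held_out]
        held_rows = [rows[i] for i in fold]
        hourly, fallback = fit_hourly_mean(train)
        predictions = [hourly.get(hour, fallback) for hour, _ in held_rows]
        scores.append(rmse([count for _, count in held_rows], predictions))
    return mean(scores)


# Synthetic (hour, count) rows: an evening peak plus Gaussian noise.
rng = random.Random(1)
rows = [
    (hour, 50 + 200 * math.exp(-((hour - 17) ** 2) / 8) + rng.gauss(0, 10))
    for _ in range(20)
    for hour in range(24)
]
cv_rmse = cross_validated_rmse(rows, k=5)
print(f"5-fold RMSE of the hour-of-day baseline: {cv_rmse:.2f}")
```

A baseline like this gives a reference point: a more flexible model is only worth its complexity if its cross-validated RMSE is clearly lower.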