My Portfolio
I build production-grade data science systems — from Bayesian models and LLM pipelines to enterprise CLV engines — that move real dollars and drive real decisions.
| 📍 Vancouver, BC | 🎓 B.Sc. Data Science, Simon Fraser University | 📧 Email: gandhivibhuti1802@gmail.com |
Jan 2025 – Present
Led end-to-end development and enterprise-wide deployment of advanced analytics solutions at BCLC’s Advanced Analytics division.
Jan 2023 – Dec 2023
Jan 2023 – Apr 2023
PyMC · Bayesian Inference · SciPy Optimisation · Hierarchical Modelling · ▶ Read the Article
Built a full-stack MMM pipeline for a fictional outdoor apparel brand across five markets. Generated synthetic data with known ground truth, applied geometric adstock and Hill saturation transformations, and fit a hierarchical Bayesian model to decompose revenue by channel. A SciPy budget optimizer identified a consistent reallocation from Social → Paid Search yielding an 8–15% weekly revenue uplift across all regions.
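The two media transformations named above can be sketched in a few lines of NumPy. This is a minimal illustration, not the project's actual code: the function names, the `decay`, `half_sat`, and `slope` parameters, and the sample spend values are all assumptions chosen for the example.

```python
import numpy as np

def geometric_adstock(spend, decay):
    """Carry a fraction `decay` of last period's adstocked spend into this period."""
    spend = np.asarray(spend, dtype=float)
    out = np.zeros_like(spend)
    carry = 0.0
    for t, x in enumerate(spend):
        carry = x + decay * carry  # current spend plus decayed carry-over
        out[t] = carry
    return out

def hill_saturation(x, half_sat, slope):
    """Hill curve: response in [0, 1), equal to 0.5 when x == half_sat."""
    x = np.asarray(x, dtype=float)
    return x**slope / (x**slope + half_sat**slope)

# Hypothetical weekly spend for one channel (illustrative values only)
weekly_spend = np.array([100.0, 0.0, 0.0, 50.0])
adstocked = geometric_adstock(weekly_spend, decay=0.5)
response = hill_saturation(adstocked, half_sat=100.0, slope=2.0)
```

In a Bayesian MMM the decay and saturation parameters would typically be given priors and estimated jointly with the channel coefficients rather than fixed by hand as they are here.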
Gaussian HMM · Ledoit-Wolf Shrinkage · Walk-Forward Validation · Canadian ETFs · ▶ Read the Article
Built a three-regime market detection system using a Gaussian HMM on 10 engineered features (including realized volatility, momentum, a credit spread proxy, and yield curve slope), combined with mean-variance optimization. Achieved a Sharpe ratio of 1.27 vs. 0.88 for a 60/40 benchmark, outperforming in 3 of 4 out-of-sample test periods.
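As a rough illustration of the portfolio step, the sketch below shrinks a sample covariance matrix toward a scaled identity and then computes unconstrained mean-variance weights. It is a simplification under stated assumptions: the shrinkage `intensity` is fixed by hand (the Ledoit-Wolf estimator chooses it from the data, for example via `sklearn.covariance.LedoitWolf`), the returns are simulated, the expected-return vector `mu` is hypothetical, and no long-only or leverage constraints are applied.

```python
import numpy as np

def shrink_covariance(returns, intensity):
    """Blend the sample covariance with a scaled-identity target.

    `intensity` is fixed here for illustration; Ledoit-Wolf would estimate it.
    """
    sample = np.cov(returns, rowvar=False)
    n_assets = sample.shape[0]
    target = np.eye(n_assets) * np.trace(sample) / n_assets
    return (1.0 - intensity) * sample + intensity * target

def mean_variance_weights(mu, cov):
    """Unconstrained weights proportional to inv(cov) @ mu, rescaled to sum to 1."""
    raw = np.linalg.solve(cov, mu)
    return raw / raw.sum()

# Simulated daily returns for three hypothetical assets (not real ETF data)
rng = np.random.default_rng(0)
returns = rng.normal(loc=0.0, scale=[0.010, 0.005, 0.008], size=(250, 3))

mu = np.array([0.0010, 0.0005, 0.0008])  # assumed expected returns
cov = shrink_covariance(returns, intensity=0.2)
weights = mean_variance_weights(mu, cov)
```

In a regime-aware setup, one plausible design is to estimate `mu` and `cov` separately from the observations assigned to each regime and to switch weights when the inferred regime changes.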
LLMs · Gemini-2.5-flash · Multi-Agent Architecture · NLP
Scraped and analyzed 2,500 videos from 25 creators. Developed a custom Authenticity Index and deployed a multi-agent LLM architecture (Gemini-2.5-flash) to quantify how sponsorship alters linguistic patterns and how audiences respond to those changes.
ARIMA · Time-Series · Python · ▶ Watch the video
Implemented an ARIMA model to forecast Canadian inflation with 92% accuracy during the pandemic — earning 1st place at Vancouver Datajam. Delivered actionable insights into consumer price dynamics during crisis periods.
| Project | Methods | Highlights |
|---|---|---|
| Explainable GNNs for Protein Classification | PointNet, GCN, GGS-NN | Integrated explainability analysis for trustworthy geometric deep learning |
| Diabetes Prediction: Binary Classifiers & SVMs | LR, KNN, LDA, SVM (RBF) | Comprehensive model comparison on the Pima Indians Diabetes dataset |
| Dow Jones Time-Series Analysis | ARIMA, Box-Jenkins | Addressed volatility clustering, heteroskedasticity, mean reversion |
| Bank Churn Analysis | Azure Synapse, MySQL, Power BI | Interactive 4-year churn dashboard |
| Vancouver Weather Forecast | NeuralProphet, PyTorch | 78% accuracy using 80 years of climate data |
| Wikipedia Summarizer GPT App | RAG, Flan-T5 XXL, Streamlit | End-to-end NLP app hosted on Streamlit |
Languages: Python · R · SQL · C · C++ · MATLAB
ML / AI: PyTorch · TensorFlow · Keras · Scikit-learn · XGBoost · LightGBM · LangChain · Hugging Face · PyMC
Specializations: Bayesian Modelling · Time-Series Forecasting · NLP & Generative AI · Deep Learning (CNNs, RNNs, GNNs) · Customer Segmentation · CLV · A/B Testing · Geospatial Analysis · Reinforcement Learning
Data & Cloud: Azure · AWS · Databricks · PySpark · Snowflake · Oracle · PostgreSQL · MySQL
MLOps: CI/CD · Docker · Azure ML Studio · Dataiku · Git/GitHub · Model Monitoring
Visualization: Power BI · Tableau · Plotly · Streamlit · Seaborn
B.Sc. Data Science, Simon Fraser University (Sep 2020 – Dec 2025)
Relevant Coursework: Object-Oriented Programming, Applied Multivariate Analysis, Sampling and Experimental Design, Statistical Learning and Prediction, Bayesian Statistics, Artificial Intelligence, Time Series Analysis, Database Systems, Linear Optimization