Home

Welcome!


Photo of Yash Soni

Hi, I'm Yash Soni. I build data and software projects, from fraud detection and analytics dashboards to automation tools and games.

Explore my projects, or check out some photos and fun stuff.

About

I am a Data Science student at Wilfrid Laurier University. I enjoy working with data and math, especially when they help explain patterns and real-world behavior. My projects focus on extracting meaningful insights from messy data through machine learning and analytics, where evaluation metrics and decision thresholds matter more than raw accuracy. I enjoy writing clean, readable code, building automation that saves time, and creating simple dashboards and workflows in Python and R that make results easy to understand.

Experience

Manulife logo

Manulife Financial — Data Scientist Intern

Built governed analytics and AI solutions for Risk, Treasury, and Finance stakeholders, including LLM-driven workflows, and transformed fraud datasets for downstream modeling.

PySpark, LLMs, Machine Learning, Data Preparation

Wilfrid Laurier logo

Wilfrid Laurier University — Machine Learning Research Assistant

Cleaned and standardized UVABench scenario data, prepared features and labels for core safety tasks, and analyzed benchmark outputs to support model evaluation workflows.

Machine Learning, Benchmarking, Python, Research

John Hancock logo

John Hancock — Data Engineer Intern

Built and optimized Databricks pipelines with PySpark and SQL, automated ML and LLM ingestion workflows, and improved monitoring for data quality and pipeline execution.

PySpark, SQL, Databricks, Data Pipelines

RBC logo

RBC — Data Analyst Intern

Validated daily mortgage disbursement data exceeding $2.5M with high accuracy and built Power BI dashboards and Excel trackers that reduced escalation time across operations.

Power BI, Excel, Data Validation, Reporting

Wilfrid Laurier logo

Wilfrid Laurier University — Teaching Assistant

Guided 200+ students in Data Analytics and Calculus by explaining concepts clearly and helping debug Python and R code in labs and coursework.

Python, R, Statistics, Teaching

Projects


Catanthropic icon

Catanthropic

Multi-agent Catan system modeling resource economies, trade incentives, and long-horizon board planning with TensorFlow decision pipelines and a Flask REST API for game state queries.

Python, Flask, TensorFlow, NumPy, Multi-Agent Systems

Cortex icon

Cortex

AI-powered platform analyzing GitHub repositories, identifying technical strengths and knowledge gaps, and visualizing learning through a brain-inspired interface.

Python, Gemini API, GitHub Analysis, RAG, Data Visualization

Trade Risk icon

Trade Risk

Scenario-based tariff risk engine for Canadian export sectors, combining explainable AI with cached, Gemini-generated risk explanations.

Python, Gemini API, Explainable AI, Caching, Risk Analysis

GenAI Finance Assistant icon

GenAI Finance Assistant

RAG-powered chatbot for querying finance data and terminology using LangChain, OpenAI, and vector similarity search over structured datasets.

Python, OpenAI API, LangChain, Streamlit, FAISS

Optimal Workout Case Study icon

Optimal Workout Case Study

Training program optimization using workout logs to analyze volume, frequency, and progression patterns under recovery and fatigue constraints.

Python, Pandas, NumPy, Jupyter Notebook

Credit Card Fraud Detection icon

Credit Card Fraud Detection (PyTorch)

End-to-end fraud detection under extreme class imbalance, evaluated with PR-AUC, ROC-AUC, threshold tuning, and top-K review queue simulation.

Python, PyTorch, NumPy, Pandas, Matplotlib
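
To illustrate why thresholds and review queues matter more than raw accuracy on imbalanced data, here is a minimal sketch in plain Python. It is not the project's code: the function names, the toy scores, and the choice of F1 as the tuning target are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def precision_recall_at(scores: List[float], labels: List[int],
                        threshold: float) -> Tuple[float, float]:
    """Precision and recall when every score >= threshold is flagged as fraud."""
    flagged = [y for s, y in zip(scores, labels) if s >= threshold]
    true_pos = sum(flagged)
    total_pos = sum(labels)
    precision = true_pos / len(flagged) if flagged else 0.0
    recall = true_pos / total_pos if total_pos else 0.0
    return precision, recall


def best_f1_threshold(scores: List[float],
                      labels: List[int]) -> Tuple[Optional[float], float]:
    """Try each observed score as a threshold and keep the one with the best F1."""
    best_t, best_f1 = None, -1.0
    for t in sorted(set(scores)):
        p, r = precision_recall_at(scores, labels, t)
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


def precision_at_top_k(scores: List[float], labels: List[int], k: int) -> float:
    """Fraction of true fraud among the k highest-scored cases (a review queue)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    return sum(y for _, y in top) / len(top) if top else 0.0


# Hypothetical model scores and ground-truth labels (1 = fraud).
scores = [0.95, 0.80, 0.70, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 0, 1, 0, 0, 0, 0, 0]

threshold, f1 = best_f1_threshold(scores, labels)
queue_precision = precision_at_top_k(scores, labels, k=3)
print(threshold, round(f1, 3), round(queue_precision, 3))
```

On this toy data, a model that never flags fraud would still be 75% accurate, which is why the threshold and the top-K queue are evaluated separately.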

LeetCode Problems icon

LeetCode Problems

Collection of solutions across difficulty levels, focused on clean implementations and fundamentals.

Python, Data Structures, Algorithms

LeetCode to Git icon

LeetCode to Git

CLI that collects problem metadata and code, then automatically commits and pushes solutions to GitHub using a consistent structure.

Python, Git, GitHub

Census Insights Dashboard icon

Census Insights Dashboard

Interactive dashboard using Dash callbacks and dropdown filters to explore census-style demographics in Jupyter.

Python, Pandas, Plotly, Dash, JupyterDash

Uber Analysis icon

Uber Analysis

Trip volume analysis by day and month in R, including interactive DT tables and date feature engineering to surface demand patterns.

R, DT, Data Analysis

Colosseum Fighters icon

Colosseum Fighters

Ancient Rome-themed fighting game built with Pygame, structured into modules for entities, UI rendering, and asset loading.

Python, Pygame

Extras


Photo roll

A few snapshots; scroll through with the arrows.

Contact


Contact Me

Reach out about opportunities, collaborations, or questions.