Academic and Professional Projects


Petroleum Data Analytics for Predictive Modeling (2024)

Led a data analytics project to predict Initial Production (IP) rates of shale gas wells, with the goal of improving resource estimation and operational planning in the petroleum sector.

  • Data Engineering: Preprocessed the data using spline interpolation to impute missing values, robust scaling for normalization, and an isolation forest for outlier detection, producing a clean dataset for model training.

  • Feature Engineering: Applied one-hot encoding to categorical variables, then selected the most predictive features using the Pearson correlation coefficient, F-score, and mutual information.

  • Model Tuning: Ran exhaustive hyperparameter tuning across four machine learning algorithms (Random Forest, KNN, Gradient Boosting, and XGBoost with regularization) using three-fold cross-validation, achieving a test accuracy of 0.896 and a train accuracy of 0.991 with Gradient Boosting.

  • Feature Importance: Used the optimized Gradient Boosting model to rank the features most critical to predicting IP, providing actionable insights for exploration and production strategies.

  • Deployment: Serialized the optimized preprocessors and Gradient Boosting model with Pickle for deployment.
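
For illustration only, the robust scaling and Pickle serialization steps listed above can be sketched with the Python standard library. This is not the project's code: the helper names `fit_robust_scaler` and `apply_robust_scaler`, the dictionary of fitted parameters, and the sample values are all hypothetical, and the actual project may well have used library implementations instead.

```python
import pickle
import statistics


def fit_robust_scaler(values):
    """Fit a robust scaler: center on the median, scale by the interquartile range."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"center": statistics.median(values), "scale": (q3 - q1) or 1.0}


def apply_robust_scaler(params, values):
    """Scale new values with previously fitted parameters."""
    return [(v - params["center"]) / params["scale"] for v in values]


# Made-up feature values with one extreme outlier; the outlier barely
# moves the median or the interquartile range.
train = [10.0, 12.0, 11.0, 13.0, 12.0, 500.0]
params = fit_robust_scaler(train)

# Serialize the fitted preprocessor and restore it, as one would before inference.
blob = pickle.dumps(params)
restored = pickle.loads(blob)

scaled = apply_robust_scaler(restored, [12.0, 14.0])
```

Persisting the fitted preprocessing parameters alongside the model means new data is scaled with the same statistics that were learned from the training set.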

Build-My-Bot (2022)

Created wireframes and mockups for a fictional online robot retailer company.

Monkey Runner (2019-2020)

Developed an automated stress-testing environment for Android applications at JPMorgan Chase & Co., using Python and Bash scripting, to mimic potential crashes and improve the overall stability and health of the app in production. This idea received special mention in the CIO’s town hall in 2020.

Automated Punching System (2017)

Created an Automated Punching System for the Ladies’ Hostel at our college as a course project for Database Management Systems, replacing the fully manual sign-in and sign-out process for students. Key features include fully automated login/logout for students, a login for security personnel to view the database, and a login for the Hostel Warden, who has admin access to the database and can grant special permissions or alter curfew timings as required. The system was built using HTML, CSS, JavaScript (Bootstrap), PHP, and MySQL.

Point-Of-Sale (POS) System (2022)

As part of DBMS coursework, I designed, implemented, and maintained a Point-Of-Sale (POS) system in a relational database on an AWS EC2 instance. I implemented ETL processing; views, materialized views, and indexes to speed up reads; and stored procedures and triggers to keep the POS data up to date. I scaled the system across multiple servers using standard replication and P2P clustering with Galera on appropriately configured AWS instances. Finally, I migrated the data from the relational schema into MongoDB, a NoSQL document database, using JSON aggregates, and ran basic analytical queries to answer business questions.
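
As a rough illustration of the index, trigger, and view ideas in the POS project, the sketch below uses Python's built-in `sqlite3` module. SQLite is a stand-in chosen so the example runs anywhere; it is not the database used in the project, and every table, column, index, trigger, and view name here is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    stock      INTEGER NOT NULL
);
CREATE TABLE sale_item (
    sale_item_id INTEGER PRIMARY KEY,
    product_id   INTEGER NOT NULL REFERENCES product(product_id),
    quantity     INTEGER NOT NULL
);
-- Index to speed up lookups of sales by product.
CREATE INDEX idx_sale_item_product ON sale_item(product_id);
-- Trigger: every recorded sale decrements the product's stock.
CREATE TRIGGER trg_sale_decrements_stock
AFTER INSERT ON sale_item
BEGIN
    UPDATE product SET stock = stock - NEW.quantity
    WHERE product_id = NEW.product_id;
END;
-- View: units sold per product.
CREATE VIEW v_units_sold AS
SELECT p.name AS name, COALESCE(SUM(s.quantity), 0) AS units_sold
FROM product p LEFT JOIN sale_item s ON s.product_id = p.product_id
GROUP BY p.product_id;
""")

# Made-up data: one product and two sales of it.
conn.execute("INSERT INTO product VALUES (1, 'widget', 10)")
conn.execute("INSERT INTO sale_item (product_id, quantity) VALUES (1, 3)")
conn.execute("INSERT INTO sale_item (product_id, quantity) VALUES (1, 2)")

stock = conn.execute(
    "SELECT stock FROM product WHERE product_id = 1").fetchone()[0]
units_sold = conn.execute(
    "SELECT units_sold FROM v_units_sold WHERE name = 'widget'").fetchone()[0]
```

The trigger keeps stock consistent without relying on application code, while the view gives reporting queries a stable name to read from.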

Proximity Payments (2018)

Worked on a Proof of Concept (POC) for transferring data between mobile devices over ultrasonic sound waves, using frequency modulation and specific encoding and decoding techniques, with the aim of executing payments between two mobile devices in close proximity. The idea and the POC received special mention in the Global Hackathon conducted at JPMorgan Chase & Co. in 2018.
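
The ultrasonic data-transfer idea in the Proximity Payments POC can be illustrated with a minimal sketch of one possible scheme, binary frequency-shift keying. This is an assumption made for illustration, not the POC's actual encoding: the sample rate, the two tone frequencies, the bit duration, and all function names are hypothetical, and real audio I/O, noise, and synchronization are ignored.

```python
import math

SAMPLE_RATE = 44100      # samples per second (assumed)
FREQ_ZERO = 18000.0      # tone for a 0 bit, in Hz (assumed)
FREQ_ONE = 19000.0       # tone for a 1 bit, in Hz (assumed)
SAMPLES_PER_BIT = 441    # 10 ms per bit at 44.1 kHz (assumed)


def encode(data):
    """Turn bytes into audio samples: one sine tone per bit, most significant bit first."""
    samples = []
    for byte in data:
        for bit_index in range(7, -1, -1):
            freq = FREQ_ONE if (byte >> bit_index) & 1 else FREQ_ZERO
            samples.extend(
                math.sin(2.0 * math.pi * freq * n / SAMPLE_RATE)
                for n in range(SAMPLES_PER_BIT)
            )
    return samples


def goertzel_power(samples, freq):
    """Signal power at a single frequency, computed with the Goertzel algorithm."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / SAMPLE_RATE)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def decode(samples):
    """Recover bytes by checking which of the two tones dominates each bit slot."""
    bits = []
    for start in range(0, len(samples) - SAMPLES_PER_BIT + 1, SAMPLES_PER_BIT):
        chunk = samples[start:start + SAMPLES_PER_BIT]
        is_one = goertzel_power(chunk, FREQ_ONE) > goertzel_power(chunk, FREQ_ZERO)
        bits.append(1 if is_one else 0)
    out = bytearray()
    for i in range(0, len(bits) - 7, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


payload = b"PAY:42"      # made-up message
decoded = decode(encode(payload))
```

Both tones complete a whole number of cycles within one bit slot, which keeps them cleanly separable by the detector in this idealized, noise-free setting.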

Product Canvas (2022)

Designed a product canvas for Destination Bryan, based on stakeholder analysis, to increase user traffic. Drafted requirements, designed wireframes and mockups, and outlined KPIs. Also designed the overall workflow and a storyboard showcasing the user experience, and crafted a product roadmap using user stories and sprint planning.

Placement Prediction and Training (2018-2019)

Developed a Placement Prediction Module in Python, in collaboration with the Training and Placement Cell of our college: an ML model that analyzed historical data on previous years’ students and estimated the probability of a student being placed. Alongside it, developed a Training Module based on Adaptive Learning, in Python and the Django framework, that helped students improve in the areas they needed to focus on by analyzing company-wise requirements and interview/test questions.

Design and Implementation of a Data Warehouse for a Retail Store (2023)

Led the design and development of a data warehouse for Dominick's Finer Foods, focused on retail analytics and store-level data aggregation. The project analyzed a large dataset from the Chicago Booth School of Business covering over 3,500 Universal Product Codes (UPCs) across approximately 100 stores. I applied Kimball’s methodology to design independent, conformable data marts and used dimensional modeling to build star schemas for efficient data organization and retrieval. I built the data pipelines with SQL Server Integration Services (SSIS) and managed database interactions through SQL Server Management Studio (SSMS), covering data mapping, ETL, and data integration. For visualization and reporting I used Power BI, SSRS, and SSAS, and I documented the data warehouse architecture, including data flow and system integration, in Microsoft Visio.
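
To make the star schema idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module rather than SQL Server. It is illustrative only: the `dim_store`, `dim_product`, and `fact_sales` tables, their columns, and all rows are invented and do not reflect the project's actual schema or the Dominick's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the "who/what/where" of each sale.
CREATE TABLE dim_store   (store_key INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, upc TEXT, category TEXT);
-- The fact table holds the measures, keyed by the dimensions.
CREATE TABLE fact_sales (
    store_key   INTEGER REFERENCES dim_store(store_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    week        INTEGER,
    units       INTEGER,
    revenue     REAL
);
""")

# Invented sample rows.
conn.executemany("INSERT INTO dim_store VALUES (?, ?)",
                 [(1, "north"), (2, "south")])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "0001", "cereal"), (2, "0002", "soda")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 1, 10, 30.0),
    (1, 2, 1, 5, 5.0),
    (2, 1, 1, 4, 12.0),
    (1, 1, 2, 6, 18.0),
])

# A typical star-schema query: join the fact table to its dimensions and aggregate.
rows = conn.execute("""
SELECT s.region, p.category, SUM(f.units), SUM(f.revenue)
FROM fact_sales f
JOIN dim_store s   ON s.store_key = f.store_key
JOIN dim_product p ON p.product_key = f.product_key
GROUP BY s.region, p.category
ORDER BY s.region, p.category
""").fetchall()
```

Keeping measures in a single fact table and descriptive attributes in small dimension tables is what lets store-level aggregations like this one stay simple.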