About

About Me

I am a data scientist with a computer science background, focused on data analysis and machine learning. I hold a Master’s in Data Science from Indiana University Bloomington and a B.Tech from DIT University, and I work in Python, SQL, Tableau, R, TensorFlow, and Keras. In my roles at NASA's USRA and Optum, I improved semantic segmentation models and automated healthcare data workflows. I also use QGIS, MySQL, and Snowflake to deliver data-driven solutions. My projects, including real-time Twitter sentiment analysis, reflect my commitment to practical applications of technology. Let's connect to discuss how I can contribute to your data science initiatives.

  • Profile: Data Science & Analytics
  • Domain: Healthcare & Geospatial
  • Education: MS in Data Science & B.Tech in Computer Science
  • Languages: English, Hindi

Experience


May 2023 - Aug 2023

Data Science Intern

USRA-NASA

Technologies: Python, QGIS, Keras with TensorFlow, deep learning.

  • Achieved 96% IoU on semantic segmentation of burned and unburned regions in satellite land images by optimizing a U-Net architecture with a ResNet-34 backbone and applying a wide array of ML techniques.
  • Used QGIS to produce labeled images for GeoTIFF files and applied data augmentation techniques to improve training data quality and model performance.

Jun 2021 - May 2022

Data Analyst

Optum - UnitedHealth Group

Technologies: SQL, Python (NumPy, pandas), Tableau, MySQL.

  • Saved 4-5 hours of daily manual effort and reduced claim sync errors by automating Python batch jobs for healthcare data extraction, transformation, and loading from the DB2 server to the application server.
  • Improved decision accuracy by 25% by developing automated statistical reports in Tableau for senior leadership and cross-functional teams. The reports tracked KPIs to explain business dynamics and forecast demand, identified high-rejection claim sites, and improved data source monitoring.
  • Wrote complex SQL queries in MySQL to improve data quality within the data center supply chain, reducing data defects by 20% and improving problem resolution efficiency by 30%.

May 2019 - Jul 2019

Machine Learning Intern

Ducat

Technologies: Python (pandas, Matplotlib, scikit-learn, Keras with TensorFlow), Power BI, Excel.

  • Increased crop yield potential by developing end-to-end models in Python and performing exploratory data analysis (EDA) and data manipulation on Indian soil data to identify nutrient deficiencies and inform agricultural decisions.
  • Achieved a 15% reduction in pesticide costs by designing Power BI dashboards that presented insights for optimizing farming techniques and pesticide use.

Projects


Graphical Live Twitter Sentiment Analysis with NLTK

Built a real-time data collection project on the Twitter API and applied NLP techniques with the NLTK library to reach 90.2% sentiment analysis accuracy. Created interactive dashboards with statistical graphs and visualizations to present the results.

Calorie Tracker

Built and deployed the Health Analyzer using LangChain and the Gemini Pro API, integrating an LLM to estimate calorie counts and nutrient breakdowns from food images. Built and connected APIs to turn these models into a full production system that encourages healthy food habits.

Gaze-Controlled Keyboard

Developed an eye detection project in Python and OpenCV that tracks eye movements with 95% accuracy and drives a hands-free keyboard control system, an alternative input method for individuals with quadriplegia.


AppFusion: Where Apps Align with Your Desires

Designed an app recommendation system with MySQL and Streamlit (Python), implementing CRUD operations to make exploring the Apple App Store more user-friendly.

Spotify Worldwide Daily Song Analysis

The dataset includes daily rankings of the top 200 songs in 53 countries from 2017 to 2018: over 2 million rows covering 6,629 artists and 18,598 songs, totaling 105 billion streams. The project analyzes Spotify's regional chart data to uncover trends in music preferences over time across countries.

More projects on GitHub

I love to solve business problems & uncover hidden data stories


GitHub

Education


2022-2024

MS in Data Science

Indiana University Bloomington

Courses: Statistics, Data Cleaning and Feature Engineering, Fundamentals of Data Mining.

Assistantship: Graduate Assistant for Introduction to Programming.

2017-2021

B.Tech in Computer Science and Engineering

DIT University - Dehradun

Courses: Artificial Intelligence (AI), Data Warehousing, Data Analytics.

Skills and Expertise


Skills

Python
Machine Learning
Deep Learning
Computer Vision
NLP
Generative AI
R
Tableau
Power BI
QGIS
SQL
MySQL
PostgreSQL
Version Control (Git/GitHub)
Unix
MS Excel

Expertise


Data Scientist

As a data scientist with a strong foundation in computer science and data analysis, I specialize in using machine learning techniques and statistical methods to extract actionable insights from complex datasets. I hold a Master’s in Data Science from Indiana University Bloomington, and during my hands-on work at NASA's USRA I achieved 96% IoU on a semantic segmentation task. Proficient in Python, TensorFlow, and Keras, I have also built real-time data analysis projects such as a Twitter sentiment analysis tool, applying current technologies to real-world problems.

Machine Learning Engineer

As a Machine Learning Engineer, I bring a solid understanding of machine learning and deep learning, both in theory and in practice. During my internship at Ducat, I implemented predictive models and conducted exploratory data analysis on agricultural datasets, work aimed at increasing crop yield potential and reducing pesticide costs. Proficient in Python, scikit-learn, Keras, and TensorFlow, I have developed and deployed models for a range of problems, including health data analysis and accessibility solutions.

Data Analyst

As a Data Analyst at Optum - UnitedHealth Group, I worked across data analytics and business intelligence. I automated data extraction and transformation processes, saving 4-5 hours of daily manual effort and reducing errors. The Tableau reports I developed improved decision-making accuracy by 25%, and my SQL queries improved data quality, leading to a 20% reduction in defects. This role showed my ability to optimize data workflows and provide insights that support better business outcomes.


Contact

Contact Me

Below are the details to reach out to me!

Address

San Diego, California

Contact Number

+1 812-803-9202

Download Resume
