
Timeline
About me
Executive - Data Scientist @ BARC India | Stanford ML Course Graduate | Machine Learning Enthusiast | Data Science Enthusiast | Automation | Power Platform
Education
AlmaBetter
DAV Public School, Ara
2012 - 2014, 12th, Science, A+
DAV Public School, Ara
2011 - 2012, 10th, 9.6
Gautam Buddha University
2014 - 2019, Integrated dual degree, B.Tech + MBA (Finance & HRM), A
Experience
Bihar Entrepreneurs Association (BEA) बिहार उद्यमी संघ
Jun 2018 - Jul 2018
Intern
1. Planning and development of the Bihar IT Summit proposal and of government schemes for entrepreneurship development.
2. Study of government schemes for encouraging entrepreneurship among girls and women.
3. Study of the latest IT interventions that can help farmers and other people advance their development.
AlmaBetter
Feb 2022 - Oct 2022
Data Science Trainee
Aug 2021 - Oct 2022
1. Learned skills such as EDA, data visualization, and data preprocessing, along with teamwork, time management, problem solving, and leadership. Got hands-on experience with tools such as R, Power BI, Tableau, and scikit-learn.
2. Good understanding and implementation knowledge of machine learning and deep learning algorithms such as LSTM, logistic regression, linear regression, support vector machines, and ensemble techniques such as XGBoost.
3. Expertise working on clustering, recommendation, time series, and NLP.
4. Worked on projects such as Health Insurance Cross Sell Prediction, Seoul Bike Sharing Demand Prediction, and Online Retail Customer Segmentation. Gained proficiency in Python programming.
5. Worked as a Subject Matter Expert, handling data science questions and queries on the doubt resolution forum.
6. Performed SQL case studies such as Pizza Runner, Store, and Danny's Diner. Gained proficiency in Python and SQL.
Netflix Movies and TV Shows Clustering
May 2022 - Jun 2022
1. Implemented text preprocessing techniques such as text cleaning and stemming. Vectorized the text using TF-IDF, followed by dimensionality reduction using PCA.
2. Implemented K-Means clustering on description, genre, and cast text to group 7.7K TV shows and movies into 5 clusters, which helped in gathering insights on Netflix and how its content is consumed.
3. Business achievement: helped improve customer retention and acquisition.
4. Developed K-Means clustering and evaluated the optimal number of clusters using the Silhouette score (0.35) and the Elbow method (elbow at k = 3).
5. Performed hierarchical clustering and evaluated the optimal number of clusters using the Silhouette score (0.32) and a dendrogram (threshold at 3 clusters).
Credit Card Default Prediction
Apr 2022 - May 2022
1. The objective was to investigate customers defaulting on credit cards.
2. Performed basic data inspection through exploratory data analysis using Matplotlib and Seaborn, giving an in-depth intuition for the important features of the dataset.
3. Developed a binary classification model using algorithms such as Logistic Regression, Random Forest, and XGBoost. The best model achieved a ROC AUC score of 0.91.
4. Analyzed missing-value imputation using statistical measures, applied SMOTE to oversample the minority class observations, and carried out hyperparameter tuning using GridSearchCV.
5. Deployed the model on Vercel using a Flask API. Link: Credit card default prediction
Yes Bank Stock Closing Price Prediction
Mar 2022 - Apr 2022
1. Predicted the monthly stock closing price from the given features. Built regression models using Linear Regression, regularization techniques, XGBoost, and Random Forest, and created an LSTM model with an accuracy of 77%.
2. Designed a model using Auto ARIMA that can be used in the business for short-term forecasting.
3. Performed basic data inspection through exploratory data analysis using Matplotlib and Seaborn, giving an in-depth intuition for the important features of the dataset.
4. Split the original dataset into train and test sets. Fitted Linear Regression, Random Forest, XGBoost, SVM, and KNN models, of which Random Forest performed well with an R2 score of 0.91.
5. Split the dataset with new lag columns using a time-series split. Used FBProphet, Linear Regression, regularized linear regression, and Random Forest, of which Linear Regression performed well with an R2 score of 0.98.
Airbnb Bookings Analysis
Feb 2022 - Mar 2022
1. We found that hosts who take good advantage of the Airbnb platform provide the most listings; our top host has 327 listings.
2. We then analyzed listing densities by borough and neighborhood and which areas were more popular than others.
3. Next, we put the latitude and longitude columns to good use and created a geographical heatmap color-coded by listing price.
4. Further, we returned to the first column of name strings and did some more coding to parse each title, analyze trends in how listings are named, and count the words hosts use most.
5. Low-cost rooms, in the $0-50 range, have more reviews. This suggests that people who pay more for rooms generally don't write reviews. People tend to write reviews more when they are unhappy with their experience; for costly rooms there is a high possibility that customers are happy, so they don't write many reviews.
6. Lastly, we found the most reviewed listings and analyzed some additional attributes.
KPMG
Feb 2022 - Sept 2022
Data analytics virtual consulting internship
It involved analyzing business data sets and improving their data quality, then targeting specific customers by analyzing customer behaviour trends and patterns, and finally presenting these insights in a dashboard built with Tableau, Power BI, and Excel.
Module 1: Data Quality Assessment
Analyzed data quality issues in the given datasets. This consisted of pointing out existing discrepancies and suggesting ways to overcome them to prevent future hindrances in the analysis. It needed a good knowledge of Excel, particularly the filter tool and other functions. All of this information had to be conveyed to the client by email.
Module 2: Data Insights
Gathered insights about potential customers by studying their past behaviour and choosing data analysis strategies for it. The factors considered included age distribution, number of purchases in the last three years, job industry category, wealth segment, and number of cars owned. A presentation had to be made to show the client the data analysis strategies and their results, to help them choose their target customers.
Module 3: Data Presentation
Displayed the findings of the previous task, such as target customers, top 10 goods, customer demographics, and spending habits, in a dashboard built with Power BI and Tableau to aid data visualization.
BARC India
Oct 2022 - now
Executive
Jan 2023 - now
Management Trainee
Oct 2022 - Jan 2023
Licenses & Certifications
Machine learning specialization
Deeplearning.ai, Stanford University, Jan 2024
Foundation AI - VILT
SSC NASSCOM, Jul 2022
British Airways - Data science job simulation
Forage, Sept 2023
Supervised learning algorithms - I
AlmaBetter, May 2022
Python (basic)
HackerRank, Jul 2022
PwC Switzerland - Power BI job simulation
Forage, Sept 2022
Inferential statistics
AlmaBetter, May 2022
SQL (basic)
HackerRank, Jul 2022
Python basics - 2
AlmaBetter, Feb 2022
Supervised learning algorithms - II
AlmaBetter, May 2022
Data analyst
AlmaBetter, Feb 2022
SQL (intermediate)
HackerRank, Jul 2022
Probability, statistics, calculus & linear algebra
AlmaBetter, Apr 2022
Unsupervised learning algorithms
AlmaBetter, May 2022
Python basics
WhiteHat Jr, Jan 2022
Data analyst
WhiteHat Jr, Feb 2022
Supervised learning algorithms - III
AlmaBetter, May 2022
Python basics
AlmaBetter, Jan 2022
Advanced machine learning
AlmaBetter, May 2022
Data science premium program
AlmaBetter, Sept 2022
Tableau
AlmaBetter, May 2022
KPMG AU - Data analytics job simulation
Forage, Sept 2022
SQL fundamentals
AlmaBetter, Feb 2022
Descriptive statistics
AlmaBetter, May 2022
Advanced Excel
AlmaBetter, Feb 2022
SQL (advanced)
HackerRank, Jul 2022
Honors & Awards
Star Student
AlmaBetter, Mar 2022
Awarded to Abhishek Anand
Selected as Star Student just 2.5 months after starting the course. Among the top 3 students of a cohort of more than 100 students.
Volunteer Experience
Student Volunteer
Issued by National Service Scheme in Aug 2014
Associated with Abhishek Anand
...