Data • ML • Storytelling

Turning Data into Decisions

I am Nayan, a data scientist dedicated to transforming complex datasets into actionable insights. I specialize in machine learning, data analytics, and storytelling, building end-to-end solutions—from robust data pipelines and predictive models to interactive dashboards—that help organizations make smarter, data-driven decisions.

Currently

Pune, Maharashtra, India

Focus

Production ML Systems

30+

Self-Initiated Projects

GitHub

Projects

About Me

Connecting the dots between data, engineering, and real-world impact.

Nayan Darokar

Aspiring Data Scientist

Passionate about building intelligent systems that are scalable, interpretable, and impactful.

💡

I build end-to-end ML systems that solve real problems.

From NLP to deep learning, I don't just train models—I design pipelines, ensure data quality, and deploy solutions with clean engineering. I bridge the gap between theoretical algorithms and deployed applications.

🎯

Focus Area

Full-cycle ML: from raw data collection to real deployment.

01 Data Prep
02 Predictive Modeling
03 Evaluation

⚡

The Edge

I blend analytical rigor with engineering best practices. I don't just run experiments; I write clean, modular code that is ready for production.

🔍

Research

💻

Engineering

Status

Open to Work

👨‍💻

Profile Verified

ID: #CAND-2025-DS

Name Nayan

Role Data Scientist

Experience Fresher

Match Score 98%

Skills

PYTHON SCIKIT-LEARN TENSORFLOW SQL POSTGRESQL NLTK DOCKER

🧰

Core Languages & Databases

Production-ready code and structured data pipelines.

Python, SQL, PostgreSQL, SQLite

📊

Data Analytics & Visualization

Exploratory analysis, statistics, and storytelling.

NumPy, Pandas, Matplotlib, Seaborn, Plotly, Power BI, Statistics, Exploratory Data Analysis (EDA)

🛠️

Machine Learning Libraries & Tools

Frameworks to build fast, reliable models.

XGBoost, Scikit-Learn, NLTK, TensorFlow, LightGBM, Gensim (basic), Sentence Transformers, OpenCV (basic), PyTorch (learning), Joblib

🤖

Machine Learning Algorithms

Classical algorithms for tabular problems.

Regression, Decision Tree, Random Forest, Support Vector Machine (SVM), K-Nearest Neighbors, Naive Bayes, Gradient Boosting, AdaBoost, Bagging, Boosting, Stacking, Feature Engineering

🧠

Unsupervised, Deep Learning & NLP

Neural networks, NLP, embeddings, and clustering.

Artificial Neural Network (ANN), Convolutional Neural Network (CNN), NLP, Clustering, K-Means, TF-IDF, Word Embeddings, Tokenization & Preprocessing, Image Preprocessing & Augmentation

🚀

Deployment Tools & Evaluation

Metrics, serialization, and production patterns.

Streamlit, Flask, Docker, Git, CI/CD (learning soon), Pickle / Joblib, Model Deployment, F1-Score, Precision, Recall, ROC

Workflow Ecosystem

01 Source

Connects to data sources and validates input integrity.

df = read_csv()

Connected
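A minimal sketch of what "validates input integrity" could look like at this stage, assuming a CSV source with a known set of required columns. The column names and the `load_and_validate` helper are hypothetical, and the sketch uses the standard library's `csv` module rather than pandas to stay dependency-free; it is not taken from the projects themselves.

```python
import csv
import io

REQUIRED_COLUMNS = {"customer_id", "age", "balance"}  # hypothetical schema


def load_and_validate(handle):
    """Read CSV rows and fail fast if the input does not match the expected schema."""
    reader = csv.DictReader(handle)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    rows = list(reader)
    if not rows:
        raise ValueError("source is empty")
    return rows


# In-memory stand-in for a real file handle.
sample = io.StringIO("customer_id,age,balance\n1,34,1200.50\n2,51,80.00\n")
rows = load_and_validate(sample)
print(len(rows))
```

Failing at load time, rather than deep inside feature code, keeps schema problems easy to trace back to the source.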

02 Clean

Removes noise and prepares structured data for modeling.

def clean(x):

Scanning Cleaned
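As an illustration of the cleaning step for text data, here is one possible `clean` function. The specific rules (lowercasing, dropping URLs and punctuation) are assumptions chosen for the example, not the exact rules used in the projects.

```python
import re


def clean(text):
    """Lowercase, strip URLs and punctuation, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"[^a-z0-9\s]", " ", text)    # drop punctuation and symbols
    return re.sub(r"\s+", " ", text).strip()    # collapse runs of whitespace


print(clean("  GREAT fries!!! see https://example.com  :) "))
```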

03 Pipeline

Trains the model using optimized feature pipelines.

model.fit(X,y)

Processing Trained
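A sketch of a feature pipeline feeding `model.fit`, assuming scikit-learn is installed. The data is synthetic, and the choice of scaler and classifier is illustrative only; the projects may use different steps.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; the real projects use their own datasets.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Bundling preprocessing with the estimator means the scaler is fitted on
# training data only, so no test-set statistics leak into training.
model = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Because the whole pipeline is a single object, it can be serialized and deployed as one artifact.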

04 Evaluation

Measures performance using real validation metrics.

loss: 0.021

Evaluating Verified
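To make the validation metrics concrete, the following sketch computes precision, recall, and F1 from scratch for a binary problem. The labels are made-up example data.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # made-up ground truth
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # made-up predictions
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)  # 0.75 0.75 0.75
```

Here there are 3 true positives, 1 false positive, and 1 false negative, so all three metrics come out to 0.75.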

05 Monitor

Tracks system health and model stability in production.

Status: 200 OK

Online
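One common way to track model stability is to compare the distribution of live scores against a training-time baseline. The sketch below uses the population stability index (PSI); this is a swapped-in illustration, since the page does not say which monitoring method is used, and the data is synthetic.

```python
import math


def psi(expected, actual, bins=5):
    """Population stability index between a baseline and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


baseline = [i / 100 for i in range(100)]          # scores seen at training time
shifted = [min(0.99, v + 0.3) for v in baseline]  # hypothetical drifted scores
print(psi(baseline, baseline), psi(baseline, shifted) > 0.2)
```

A PSI above roughly 0.2 is a commonly cited rule of thumb for a shift worth investigating; the right threshold depends on the application.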

Projects

Projects are hosted on Render's free tier with cold starts enabled. The service may go idle when inactive, so the first request can take about 40 to 60 seconds.

Similarity Engine

Movie Recommendation System

VectorCine AI is a high-fidelity movie recommendation system built on vector-based similarity and interpretable modeling.

Docs
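Vector-based similarity can be sketched as follows: each title is turned into a vector, and recommendations are the nearest neighbors by cosine similarity. The bag-of-words `embed` function and the catalog entries are hypothetical stand-ins; how VectorCine AI actually builds its vectors is not described here.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for real embeddings."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


# Hypothetical catalog entries described by genre-like tags.
catalog = {
    "Space Saga": "space adventure science fiction",
    "Star Drift": "space comedy science fiction",
    "Quiet Village": "rural drama romance",
}


def recommend(title, k=2):
    """Return the k titles most similar to the given one."""
    query = embed(catalog[title])
    scored = [
        (cosine(query, embed(desc)), other)
        for other, desc in catalog.items()
        if other != title
    ]
    return [name for _, name in sorted(scored, reverse=True)[:k]]


print(recommend("Space Saga", k=1))  # ['Star Drift']
```

Because each score is a sum over shared terms, it is easy to show which tags drove a recommendation, which is one way such a system can stay interpretable.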

Sentiment Intelligence

McDonald's Review Sentiment Classifier

VERITTA-AI is a high-fidelity review classification system for noisy real-world text using interpretable NLP.

Docs Visit Live App

Risk Intelligence

Bank Customer Churn Prediction

RiskFlow v2.0 is a high-fidelity churn risk intelligence system for structured banking-style customer data.

Docs Visit Live App

Diagnostic Intelligence

Brain Tumor Detection (CNN)

Neurovia is a high-fidelity brain tumor detection system for MRI scans, built on an interpretable deep learning model.

Docs

Content Intelligence

News Classification App

INFERSIS-AI is a high-fidelity news classification system for real-world content using interpretable NLP.

Docs Visit Live App

Message Intelligence

SMS / Email Spam Classifier

INBOXIS AI is a high-fidelity email and SMS classification system for real-world messages using interpretable NLP.

Docs Visit Live App

// Case Studies

Airbnb Price Leakage

Exploring pricing patterns through business-driven EDA. Uncovering critical factors like target leakage and neighbourhood data quality for robust ML pipelines.

#Numpy #Pandas #DataViz

View Case Study
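Target leakage happens when a feature effectively encodes the value being predicted. One rough screen, sketched below, is to flag features whose correlation with the target is suspiciously close to 1. The column names, data, and threshold are hypothetical, and this is not necessarily the check used in the case study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def flag_leaky(features, target, threshold=0.98):
    """Return names of features that correlate almost perfectly with the target."""
    return sorted(
        name for name, col in features.items()
        if abs(pearson(col, target)) >= threshold
    )


price = [80, 120, 95, 200, 150]  # made-up target values
features = {
    "bedrooms": [1, 2, 1, 3, 3],
    "price_with_fees": [p * 1.1 for p in price],  # derived from the target
}
print(flag_leaky(features, price))  # ['price_with_fees']
```

A correlation screen only catches linear leakage; features computed after the prediction time still need a manual review of how each column was produced.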

Blood Donor Modeling

Analyzing donation behaviors to highlight city-level activity and consistency. Emphasizes data quality and documents modeling failures for responsible data science.

#Seaborn #Scikit-Learn #Statistics

View Case Study

System Architecture

Why I Built It This Way.

Because I Built This Like a Product.

Most portfolios are templates. This is a custom-engineered system focusing on performance, scalability, and context-aware interactions. This isn't overengineering; it's craftsmanship.

// System Modules (Active)

Dynamic Island UI

State-driven attention control.

Technical Implementation

Centralized state management store.
Context-aware switching (Loading vs Audio).
Reduces viewport clutter by 40%.

Portfolio AI (RAG)

Context-aware recruiter assistant.

Backend Logic

Vector Embeddings for project matching.
Custom RAG Pipeline (Retrieval Augmented Gen).
Designed for zero-hallucination responses.
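The retrieval half of such a pipeline can be sketched as: embed the question, find the closest stored snippet, and decline to answer when nothing is close enough. The snippets, the bag-of-words `embed` function, and the `min_score` threshold are all hypothetical; the assistant's real embeddings and generation step are not shown on this page.

```python
import math
from collections import Counter

# Hypothetical knowledge snippets standing in for the assistant's corpus.
DOCS = [
    "RiskFlow predicts bank customer churn from structured customer data.",
    "VectorCine recommends movies using vector similarity.",
    "INBOXIS classifies SMS and email messages as spam or not spam.",
]


def embed(text):
    """Toy bag-of-words vector standing in for a learned embedding."""
    return Counter(w.strip(".,?!").lower() for w in text.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def answer(question, min_score=0.2):
    """Return the best-matching snippet, or a refusal if nothing is relevant."""
    q = embed(question)
    score, doc = max((cosine(q, embed(d)), d) for d in DOCS)
    # Refusing below the threshold keeps replies grounded in retrieved text.
    if score < min_score:
        return "I don't have that information in this portfolio."
    return doc


print(answer("Which project handles customer churn?"))
print(answer("What is the weather today?"))
```

Returning only retrieved text, and refusing otherwise, is one simple way to reduce made-up answers, though it cannot guarantee none occur once a generative model is added on top.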

GSAP + Lenis

Scroll inertia & storytelling logic.

Performance Metrics

Native JS animation loop (60fps).
Lenis implementation for scroll smoothing.
Zero layout shift on load.

// Interaction Philosophy

Why so much interaction?

I didn't take the easy way out with a template. Every interaction is built to prove that function implies form.

4 Months

bash -- contribution-graph

$ git log --graph --oneline

135 Commits

repo/data-science-portfolio

Interaction Feel

Fluid

Prioritizing human perception over metrics.

Architecture

Zero Dependencies

Pure Vanilla JS. No framework bloat.

Treated as a System

✓

Component Reusability

Modular sections driven by config, not hard-coded spaghetti.
✓

No Heavy Frameworks

Vanilla JS architecture proves understanding of the DOM and core performance principles.
✓

Scalability

Built to expand effortlessly as my project library grows.

The Hybrid Advantage

Background Transparency

Former MERN Stack Developer

Gave me the discipline to build robust, interactive interfaces.

Data Scientist (Current)

Allows me to build logic that actually parses complex data, not just displays it.

// Engineering Retrospective

Trade-offs & Iterations

Decision: Architecture

Why No React?

React is powerful, but Virtual DOM reconciliation creates overhead for frame-perfect animations.

The Trade-off:

Chose Vanilla JS for direct render cycle control, prioritizing raw performance over dev speed. This kept me close to how the browser actually renders UI.

Incident: Refactor

11,000 Line Problem

Managing 11k+ lines of Vanilla JS without a framework led to severe layout shifts and lag.

The Fix:

Refactored the game loop into an isolated layer, achieving 60fps stability alongside AI processing. It pushed me to rethink structure and performance in a real-world way.

Shipped > Perfected

When Ambition Met Reality

I pushed this portfolio beyond a static site, building interactive systems, client-side intelligence, and layered motion.

Learned:

Performance is about timing, prioritization, and architecture. I shipped to learn from constraints and iterate in real conditions fast, without sacrificing clarity.

Context-Aware AI Summaries

Feature Spotlight

The Inspiration

I noticed modern browsers providing smart page summaries. Instead of waiting for API access, I decided to build my own version to elevate the user experience, one that adapts naturally to wherever you navigate.

Design & Logic

I didn't want a generic chatbot. I built a system that feels "aware" of the portfolio's context.

Intentional Brand-Consistent Voice Control

The Philosophy

When something inspires me,
I don’t wait for access.
I build my own version and make it better.

This is how I build.

Philosophy

Product-First • Performance-First • User-First

Certificates

Selected certificates

Data Science, ML & NLP Bootcamp – Krish Naik

Issued: 15 Jul 2025

Short: Learned data preprocessing, model building, evaluation, and deployment using Python, Scikit-learn, TensorFlow, and advanced NLP techniques.

View Certificate

SQL Intermediate Skills Certification — HackerRank

Issued: 31 Jan 2025

Short: Gained strong proficiency in writing complex queries, performing joins, subqueries, and advanced data filtering techniques.

View Certificate

Python Machine Learning: Beginner to Pro — Udemy

Issued: 09 Feb 2025

Short: Built robust ML models using supervised and unsupervised learning techniques with Python and applied them in practical projects.

View Certificate

Experience & Education

You might wonder why this portfolio feels so modern and polished — it reflects my experience as a former full-stack developer and my transition into Data Science, combining engineering discipline with machine learning expertise.

🚀 Read My Job-Seeking Journey

🎓

ENGINEER

B.Tech — CSE

Graduated 2024

JARVIS OS Project

Built a desktop virtual assistant (JARVIS OS)
Implemented voice-based control & automation
Designed a modular and scalable architecture
Added voice authentication for security
Developed independently across 720+ hours

MERN Stack Developer

Early 2024

Full-Stack Architecture

Developed full-stack web applications using MERN
Built a UBER clone with AI assistance
Created an employee management system
Implemented APIs & authentication
Designed responsive UI using React

Remote Internships

2024

MERN Stack Developer

Worked with distributed remote teams
Built authentication systems
Developed pizza ordering websites
Debugged APIs and backend services
Used Git & GitHub

Transition to Data Science

Late 2024

ML & NLP Focus

Shifted from full-stack to data science
Mastered Python Data Stack (Pandas, NumPy, Scikit-Learn)
Specialized in NLP architectures & Predictive Modeling
Engineered robust features for high-accuracy models
Applied software discipline to experimental ML workflows

Data Science Practitioner

Jan 2025 – Present

Actively seeking opportunities in Data Science & AI
Architecting end-to-end Deep Learning solutions
Exploring Large Language Models (LLMs) & Transformers
Refining skills in advanced algorithms.
Translating complex data into actionable business insights

Portfolio AI · Online

Questions You Might Have

Responses are tailored to the context of this portfolio.

Get in Touch

Contact

Open to work: available for ML/AI projects and full-time roles.

Nayan Darokar

Data Scientist • ML Engineer

Connect

Let's Connect

↗

git commit -m "Hiring Nayan"

git push origin main

GitHub

Codebase

840+ contributions in the last year
