👋

Welcome to My Portfolio!

I’m a final-year Computer Science student passionate about solving data problems using Python, SQL, and MLOps. 📈

I’ve built several end-to-end projects and aspire to work in data analytics and AI. 🔍

Rajeev Kumar

About Me

I’m a Computer Science (Data Science) undergraduate from NIET, with a strong passion for uncovering insights from data to solve real-world problems.

My journey began with a curiosity for numbers and patterns, which evolved into hands-on experience through academic projects and self-driven learning.

I’ve successfully completed several data analysis projects that reflect my analytical thinking, technical skills, and commitment to continuous growth.

I enjoy transforming raw data into actionable insights. Explore my portfolio to see how I solve complex data challenges with practical, results-oriented solutions.

Skills & Technologies

💻

Programming Languages

Python
SQL
Java
📊

Data Visualization

Power BI
Tableau
Matplotlib / Seaborn
🗄️

Databases & Tools

MySQL
MongoDB
🧠

Analytics & ML

Machine Learning
Statistical Analysis
Data Mining
📈

Business Tools

Advanced Excel
Google Analytics
☁️

Cloud & Other

AWS S3
Git / GitHub
Kubernetes

Featured Projects

🚗
MLOps Project - Vehicle Insurance Data Pipeline
Domain: Machine Learning | Functions: Data Engineering & MLOps

An end-to-end MLOps pipeline for vehicle insurance data, covering data ingestion, transformation, model training, and deployment with CI/CD automation. The project demonstrates real-world data management and deployment practices that integrate AWS, Docker, and GitHub Actions.

Key Achievements:

  • Developed automated data ingestion from MongoDB into the ML pipeline
  • Implemented data validation, transformation, and model training modules
  • Deployed the ML model using AWS S3, EC2, ECR, and GitHub Actions
  • Enabled a CI/CD workflow with Docker integration for seamless deployment
Python, MongoDB, AWS (S3, EC2, ECR), Docker, GitHub Actions, Scikit-learn
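
To give a feel for the validation and transformation stages listed above, here is a minimal sketch in plain Python. It is not the project's actual code: the column names (`age`, `vehicle_damage`, `annual_premium`), the `ValidationReport` class, and the function names are all hypothetical, and the real pipeline reads from MongoDB rather than from an in-memory list.

```python
from dataclasses import dataclass

# Hypothetical schema: the real dataset's columns are not listed in this portfolio.
REQUIRED_COLUMNS = {"age": int, "vehicle_damage": str, "annual_premium": float}


@dataclass
class ValidationReport:
    valid_rows: list
    rejected_rows: list


def validate_records(records):
    """Split records into valid and rejected based on required columns and types."""
    valid, rejected = [], []
    for record in records:
        ok = all(
            column in record and isinstance(record[column], expected_type)
            for column, expected_type in REQUIRED_COLUMNS.items()
        )
        (valid if ok else rejected).append(record)
    return ValidationReport(valid_rows=valid, rejected_rows=rejected)


def transform(records):
    """Encode the categorical damage flag as 0/1 so a model can consume it."""
    return [
        {**record, "vehicle_damage": 1 if record["vehicle_damage"] == "Yes" else 0}
        for record in records
    ]


def run_pipeline(raw_records):
    """Chain the stages: validate first, then transform only the valid rows."""
    report = validate_records(raw_records)
    return transform(report.valid_rows), report.rejected_rows
```

Keeping each stage as its own function means a CI job can test validation and transformation separately before any model is trained or deployed.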
🤖
RockyBot – Research Tool
AI-powered News Insight Engine

Built an interactive news research assistant using Google Gemini and Streamlit. RockyBot allows users to extract, analyze, and query multiple news articles intelligently using LLM-driven question-answering.

Key Features:

  • Multi-article processing and content extraction via scraping and LangChain loaders
  • Semantic search using vector embeddings (FAISS)
  • Natural language Q&A with contextual memory and random prompts
  • Visual metrics dashboard and quick insights with chat history
Streamlit, LangChain, Google Gemini API, FAISS, BeautifulSoup, Plotly
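
The semantic search feature relies on comparing embedding vectors. The sketch below shows the underlying idea with a brute-force cosine-similarity ranking in plain Python. It is a stand-in, not RockyBot's code: FAISS does this job with optimized indexes, the similarity metric the project actually uses is not stated here, and the function names and toy two-dimensional vectors are invented for illustration (real embeddings have hundreds of dimensions).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, indexed_chunks, k=2):
    """Return the k chunk texts whose vectors are most similar to the query.

    indexed_chunks is a list of (text, vector) pairs.
    """
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in indexed_chunks
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]
```

In a question-answering flow, the top-ranked chunks would then be passed to the LLM as context for the user's question.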
📈
Sales Data Analysis
Domain: FMCG | Functions: Sales & Finance

Comprehensive analysis of Atliq Hardware sales, focusing on market penetration, growth opportunities, and the competitive landscape to support strategic decision-making.

Key Achievements:

  • Identified high- and low-performing cities based on profit margin
  • Tracked revenue and profit changes from 2017 to 2020
  • Broke down revenue and profit contribution by market
  • Analyzed revenue and profitability by customer
Power BI, Excel, Power Query, MySQL
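
The city ranking above rests on one formula: profit margin = profit / revenue × 100. The project itself was built in Power BI and MySQL, so the Python below is only a sketch of that calculation. The row fields (`city`, `revenue`, `profit`) and the function name are assumptions, not the project's schema.

```python
from collections import defaultdict


def profit_margin_by_city(rows):
    """Aggregate revenue and profit per city, then rank cities by margin (%)."""
    totals = defaultdict(lambda: {"revenue": 0.0, "profit": 0.0})
    for row in rows:
        totals[row["city"]]["revenue"] += row["revenue"]
        totals[row["city"]]["profit"] += row["profit"]

    margins = {
        # Guard against division by zero for cities with no revenue.
        city: (100.0 * t["profit"] / t["revenue"] if t["revenue"] else 0.0)
        for city, t in totals.items()
    }
    # Highest margin first, so top and bottom performers sit at the two ends.
    return sorted(margins.items(), key=lambda item: item[1], reverse=True)
```

Summing revenue and profit before dividing matters: averaging per-transaction margins would give small orders the same weight as large ones.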
📚
Smart Assistant
Domain: GenAI | Functions: Research Summarization & QA

An AI-powered research assistant built with Streamlit, LangChain, and Groq (LLaMA3). It summarizes uploaded PDF/TXT files, answers document-based questions with source references, generates logic-based MCQs, and runs conversational Q&A backed by FAISS vector search.

Key Features:

  • Upload and process .pdf / .txt research documents
  • AI-generated concise summaries with source citations
  • Document Q&A with conversational memory buffer
  • "Challenge Me" mode: logic-based MCQs + feedback
  • Fast inference using Groq LLaMA3; semantic search via FAISS
  • Secure API key handling (.env or Streamlit secrets)
Streamlit, LangChain, Groq (LLaMA3), FAISS, PyPDF2, Python
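
For the secure API key handling mentioned above, one common pattern is to read the key from the environment and fail early when it is missing, so it never appears in source code. The sketch below assumes the variable is called `GROQ_API_KEY` and uses a hypothetical `load_api_key` helper. Loading a .env file into the environment, or reading Streamlit secrets, is left out and would happen before this call.

```python
import os


def load_api_key(name="GROQ_API_KEY", env=None):
    """Return the API key from the environment, or raise if it is missing.

    env defaults to os.environ; passing a dict makes the function easy to test.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it to .env or Streamlit secrets")
    return key
```

Failing at startup with a clear message is usually easier to debug than an authentication error surfacing in the middle of a chat session.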

Let's Connect!

I'm always excited to discuss data projects and opportunities.