Hi, I'm Sai Sindhura Kollepara

Data Analyst

Dynamic and versatile data professional with expertise spanning data analysis, engineering, and science. Proficient in transforming complex, high-volume datasets into actionable insights through advanced analytics, scalable pipelines, and predictive modeling. Skilled in Python, SQL, and R, with experience in statistical analysis, machine learning, and data visualization. Adept at building robust ETL workflows, designing cloud-native architectures, and deploying solutions on AWS, GCP, and Snowflake. Experienced in BI tools such as Power BI, Tableau, and Looker Studio. Holds a Master’s degree in Data Science from Indiana University Bloomington, with a strong foundation in automation, experimentation, and cross-functional collaboration.

Biography

Summary

Technically grounded and outcome-focused data practitioner with experience building end-to-end solutions, from ingesting raw data to producing scalable analytics and decision-ready insights. Skilled in data modeling, cloud-native development, and dashboard-driven storytelling. Proficient in Python and SQL, with experience deploying solutions on AWS, GCP, and Snowflake. Known for adaptability, critical thinking, and applying data to solve complex real-world problems across technical and business contexts.

Work Experience

Data Scientist – Data Science & AI Lab, Kelley School of Business, Indiana University Bloomington

Jan 2025 - Present

  • Analyzed behavioral and survey data from 800+ students using Python and SQL to detect early signs of mental health concerns and enable AI-driven predictive analytics.
  • Created time-aligned sensor-survey datasets, improving feature accuracy by 15% for mental health models.
  • Built 5+ dashboards using Plotly Dash and Tableau to visualize mental health patterns, user retention, and demographic trends.
  • Conducted user retention analysis to uncover drop-off timelines, revealing a 25% dropout risk within the first 30 days.
  • Performed social network analysis to evaluate how closeness scores, contact frequency, and relationship types influence depression and anxiety scores.
  • Compared sensor data quality between Apple Watch and non-Apple Watch users, identifying an 18% decline in completeness among non-Apple Watch users.
  • Integrated behavioral signals across health, device usage, and social context to support symptom detection through within-user trend analysis.
  • Developed an interactive Plotly Dash dashboard with demographic filters and behavior-symptom mapping tools to support clinician and researcher decision-making.
  • Executed all analyses and visualizations in collaboration with Kelley's Data Science and AI Lab (DSAIL), supporting real-world mental health research and intervention design.

Data Analytics Engineer – Indiana University Bloomington

Aug 2024 - Dec 2024

  • Developed Python-based ETL workflows to clean, validate, and model CRM-based engagement data in Snowflake.
  • Refactored SQL for schema design and transformations, improving data quality and reducing manual processing time by 30%.
  • Created Power BI dashboards to visualize trends in student participation and outreach, supporting analysis of 1K+ student records.

Data Analyst – Wipro, India

Dec 2021 - Aug 2023

  • Analyzed 2M+ digital payment transactions in Mastercard’s Bill Pay platform using SQL and AWS Athena to uncover authorization issues, process delays, and system anomalies, improving reporting reliability.
  • Built and maintained 10+ Power BI dashboards tracking payment KPIs such as transaction success rates, onboarding timelines, and usage trends, reducing ad hoc reporting by 35%.
  • Developed rule-based anomaly detection scripts in Python/SQL to flag spikes in failed transactions and latency, cutting unresolved issues by 30% and improving resolution time by 50%.
  • Collaborated with product, risk, and operations teams to standardize definitions for adoption, engagement, and performance metrics, ensuring consistency across reporting.
  • Automated recurring reporting workflows in Python, reducing manual work by 40% and supporting compliance and audit readiness.
  • Contributed to daily operations reviews by presenting dashboards and insights, helping prioritize service stability and product improvements.

Data Analyst Intern – Societe Generale Global Solution Centre

Aug 2021 - Dec 2021

  • Audited 10K+ transactional records using MS SQL Server to track SLA adherence and operational KPIs.
  • Built Tableau dashboards that cut manual reporting time by 40% and improved visibility across 3+ departments.

Education

Master's in Data Science – Indiana University Bloomington

Aug 2023 - May 2025

Key Coursework: Data Visualization, Advanced Database Concepts, Applied Machine Learning, Data Mining, Cloud Computing (AWS), Statistics

Bachelor of Technology in Electronics and Communication Engineering – Gayatri Vidya Parishad College of Engineering

Aug 2017 - Jul 2021

Volunteer/Extracurricular Activities

Indiana University Bloomington

Systems Inventory Assistant – IU Dining & Hospitality

Jan 2025 - May 2025

Maintained accurate inventory levels across campus dining locations, including campus stores and Starbucks, by tracking and recording inventory, conducting monthly audits, and managing bins and storage areas. Monitored stock levels and ensured accurate data entry in inventory management software.

Part-Time Shift Supervisor – McNutt/Briscoe C-Stores, McNutt Eatery

Aug 2024 - May 2025

Supervised staff, managed inventory, and handled opening and closing procedures. Trained part-time employees, maintained service quality, addressed customer concerns, and oversaw staffing and security to keep operations running smoothly.

Part-Time Guest Services Supervisor – IU Event Services

Aug 2024 - May 2025

Supervised event staff, managed crowd control, and resolved guest complaints, ensuring staff roles were assigned effectively and event operations ran smoothly.

Gayatri Vidya Parishad College of Engineering

Volunteer – Rotaract Club

Dec 2017 - Aug 2021

Organized community service projects including awareness events, food distribution, and educational initiatives, while also volunteering at orphanages and old age homes to support marginalized communities and promote social welfare.

Portfolio

Projects

Reddit Comments to Post Relevance Analysis
Focus: NLP, embeddings, and clustering to assess the relevance of Reddit comments.
Tech: Python, Topic Modeling, Transfer Learning, API
Hybrid model combining semantic similarity, engagement, and sentiment to analyze and rank the relevance of Reddit comments to posts.

Stack Overflow Users Survey Analysis
Focus: Demographic and tech preference insights through Power BI.
Tech: Power BI, SQL, DAX
Power BI dashboard with drill-down features, built on Stack Overflow user survey data to derive insights on demographic trends and technology preferences.

Global Emissions Explorer – Time-Series Dashboard
Focus: Emissions, policy impact, and historical trends.
Tech: Plotly Dash, Python (pandas), Choropleth Maps
Developed an interactive web page using Plotly Dash and Python to visualize global greenhouse gas emissions, historical temperature changes, and carbon policy effects using data from 5+ public sources. Enabled dynamic filtering, animated maps, and KPI tracking across countries and sectors.

Per-Capita Emissions Visualization & Reporting
Focus: Regional emission trends and dashboard-based reporting.
Tech: SQL (BigQuery), Plotly, Looker Studio, GCP
Processed 42,000+ per-capita carbon records from 200 countries using SQL in Google BigQuery. Built animated choropleth maps with Plotly and deployed a real-time reporting dashboard via Looker Studio and GCP hosting to support emissions policy and research insights.

Personal Portfolio Website
Focus: Creating an interactive web presence.
Tech: Web Development, Visual Studio Code
Multi-page portfolio website featuring sections like Home, Bio, Portfolio, Resume, and Contact. Focused on clean UI and smooth navigation to showcase professional projects and personal brand.

Scalable Student Records Web App on AWS
Focus: CRUD app with AWS best practices.
Tech: AWS, Virtual Servers, Relational Database, Load Balancing, Virtual Networking, Access & Credential Management
Scalable web application for managing student records using AWS services. Implemented auto-scaling and load balancing for high availability and peak performance.

Sentiment Analysis on Gamesphere Reviews
Focus: Classifying sentiment in reviews.
Tech: NLTK, Scikit-learn, Pandas
Sentiment analysis on gaming reviews to classify sentiments into positive, neutral, or negative categories. Achieved 84% accuracy after hyperparameter tuning with GridSearchCV.

Emotion Classification Using NLP
Focus: Classifying text into emotion categories.
Tech: NLP, Topic Modeling, traditional ML models
Classification of text into emotion categories (e.g., joy, anger, sadness) using NLP techniques. Achieved 85% accuracy through strong feature engineering and model selection.

Skills

Programming Languages

  • Python - pandas, NumPy
  • SQL
  • R

Databases & Data Engineering

  • MySQL
  • PostgreSQL
  • SQL Server
  • MongoDB
  • Snowflake
  • Amazon Redshift
  • AWS Glue
  • BigQuery
  • ETL/ELT
  • Data Pipeline Automation
  • Data Modeling
  • Apache Spark
  • Hadoop
  • Databricks
  • Cloudera Impala

Data Visualization & BI

  • Power BI
  • Tableau
  • Excel
  • Plotly Dash
  • Looker Studio
  • Seaborn
  • Matplotlib

Machine Learning & AI

  • Predictive Analytics
  • Time Series Forecasting
  • Regression Analysis
  • Hypothesis Testing
  • Scikit-learn
  • MLflow
  • Model Monitoring

NLP & Deep Learning

  • Text Classification
  • Topic Modeling (LDA)
  • Sentiment Analysis
  • PyTorch
  • TensorFlow
  • LLMs
  • BERT
  • GPT
  • Hugging Face
  • Generative AI

Cloud & Big Data Platforms

  • AWS - SageMaker, Glue, Athena, Lambda, KMS, CloudTrail, QuickSight, Redshift
  • Google Cloud - BigQuery, Analytics
  • Azure Databricks

Analytics & Strategy

  • EDA
  • Behavioral Segmentation
  • Product Analytics
  • A/B Testing
  • Cohort Analysis
  • Retention Modeling
  • Anomaly Detection

DevOps & Tools

  • Docker
  • CI/CD
  • Git
  • GitHub
  • APIs

Other Technologies

  • Robotic Process Automation (UiPath)

Soft Skills

  • Customer Service
  • Operations Management
  • Leadership
  • Inventory Management
  • Problem-Solving
  • Time Management
