Naoko Ishibashi

Junior Data Analyst

View the Project on GitHub naokoi0408/Portfolio

Resume

Technical Skills: R, SQL, Java, Python, Tableau, Power BI

Education

Experience

Data Analyst and Programmer Intern @ Senior Grooming (August 2023)

Certifications

Projects

Regression Analysis in Base R

Publication

Applied simple and multivariate regression, together with statistical and probability analysis, to 2020 US National Election Survey data to assess voter sentiment toward the Democratic Party. Drew on sampling theory, cleaned the data, and created concise visualizations to summarize the findings.

Diabetes Research Analysis

Publication

B-Cell Analysis Project Presentation

Analyzed 20,000 rows of data to examine the correlation between clone size and mutation in healthy and diabetic samples. Hypothesis testing showed a statistically significant increase in mean clone size in diabetic individuals. Created bar graphs to assess V-gene usage consistency across six donors, which showed higher values in the diabetic samples. This research suggests potential factors for risk prediction and prevention in Type 1 diabetes.

R-Shiny

Publication

Developed a Shiny application with Leaflet map widgets for Philadelphia school and LEA (local education agency) datasets, focusing on attendance-based rankings. Used text analysis and integrated SQL within R for geospatial analysis.

SQL Data Project

Publication

The analysis compares median annual salaries across companies of different sizes. Companies with around 1,000 employees pay between $250,000 and $260,000, while larger mega-companies offer slightly higher median salaries of $268,000 to $275,000. This points to a trend of salaries increasing with company size.
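As a rough illustration of this kind of comparison (not the project's actual query), the sketch below loads a few made-up rows into an in-memory SQLite table and computes the median salary per company-size group. The table name `salaries`, its columns, the group labels, and every salary value are assumptions for illustration only. Because SQLite has no built-in median function, the rows are selected with SQL and the median is taken in Python.

```python
import sqlite3
from collections import defaultdict
from statistics import median

# Hypothetical schema and made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (company_size TEXT, annual_salary INTEGER)")
rows = [
    ("about_1000", 248_000), ("about_1000", 255_000), ("about_1000", 261_000),
    ("mega", 265_000), ("mega", 270_000), ("mega", 279_000),
]
conn.executemany("INSERT INTO salaries VALUES (?, ?)", rows)

# Pull the rows with SQL, then group them by company size in Python.
by_size = defaultdict(list)
query = "SELECT company_size, annual_salary FROM salaries"
for size, salary in conn.execute(query):
    by_size[size].append(salary)
conn.close()

# The median is less sensitive to a few very high salaries than the mean.
median_by_size = {size: median(values) for size, values in by_size.items()}
for size, value in sorted(median_by_size.items()):
    print(f"{size}: {value}")
```

In a database that provides a median or percentile aggregate, the grouping and the median could be done in a single SQL statement instead.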

Hypothesis Testing and Estimators

Publication

Hypothesis testing in these regressions evaluates whether the independent variables (such as education level or region) have a statistically significant effect on the dependent variables (such as median income or commute behavior). The null hypothesis assumes no effect and the alternative assumes a nonzero effect; the test asks whether the observed relationship is stronger than chance alone would plausibly produce.
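To make the idea concrete, here is a minimal sketch of testing whether a regression slope differs from zero. It uses a permutation test rather than the t-test that regression summaries usually report, because it needs only the standard library; this is a swapped-in technique, not the method used in the project. The variable names and the eight data points are made up for illustration.

```python
import random

def ols_slope(x, y):
    """Least-squares slope of y regressed on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

def permutation_p_value(x, y, n_permutations=5000, seed=0):
    """Two-sided permutation test of the null hypothesis: slope = 0."""
    rng = random.Random(seed)
    observed = abs(ols_slope(x, y))
    shuffled = list(y)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffling y breaks any real link between x and y,
        # which is exactly what the null hypothesis assumes.
        rng.shuffle(shuffled)
        if abs(ols_slope(x, shuffled)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)

# Made-up data: years of education vs. median income (thousands of dollars).
education_years = [10, 12, 12, 14, 16, 16, 18, 20]
median_income = [31, 36, 38, 45, 52, 55, 63, 70]

slope = ols_slope(education_years, median_income)
p_value = permutation_p_value(education_years, median_income)
print(f"slope = {slope:.2f}, p = {p_value:.4f}")
```

A small p-value means that very few shuffled datasets produced a slope as steep as the observed one, so the null hypothesis of no effect is rejected at the chosen significance level.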

Data Cleaning and Transformation

Publication

This project focuses on data cleaning and analysis using dplyr, including tasks like filtering non-cancelled flights, analyzing delay patterns, and identifying delay-prone carriers and destinations. A bonus task involves examining baseball batting averages while accounting for potential data biases.

Random Variables

Publication

In this project, I conducted a series of statistical analyses and simulations using R. I calculated probabilities and confidence intervals for normal distributions, simulated dice rolls to explore probabilities related to Yahtzee, and analyzed coin flip data to estimate streak lengths. This work involved applying statistical techniques, such as normal CDFs and Monte Carlo simulations, to derive meaningful insights and visualize results through plots. The project demonstrates my proficiency in data analysis, statistical modeling, and simulation.
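The sketch below illustrates the three techniques mentioned above in Python rather than R. It is not the project's code: the function names, trial count, and seed are assumptions chosen for illustration. It evaluates a normal CDF with the error function, estimates by Monte Carlo the chance of rolling five of a kind in a single roll of five dice, and measures the longest streak in a sequence of coin flips.

```python
import math
import random

def normal_cdf(x, mean=0.0, sd=1.0):
    """P(X <= x) for a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def estimate_yahtzee_probability(n_trials=200_000, seed=1):
    """Monte Carlo estimate of five of a kind in one roll of five dice."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        roll = [rng.randint(1, 6) for _ in range(5)]
        if len(set(roll)) == 1:  # all five dice show the same face
            hits += 1
    return hits / n_trials

def longest_streak(flips):
    """Length of the longest run of identical consecutive outcomes."""
    best = current = 0
    previous = None
    for flip in flips:
        current = current + 1 if flip == previous else 1
        best = max(best, current)
        previous = flip
    return best

# Probability that a standard normal value falls within 1.96 sd of the mean.
central_mass = normal_cdf(1.96) - normal_cdf(-1.96)

# Compare the simulated estimate with the exact value 6 / 6**5 = 1/1296.
yahtzee_estimate = estimate_yahtzee_probability()
exact_yahtzee = 6 / 6 ** 5

# Longest streak in 100 simulated fair coin flips.
coin_rng = random.Random(2)
flips = [coin_rng.choice("HT") for _ in range(100)]
print(central_mass, yahtzee_estimate, exact_yahtzee, longest_streak(flips))
```

Because the five-of-a-kind event is rare, the estimate needs a large number of trials before it settles close to the exact value; rerunning with a different seed shows how much it varies.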

Automotive Complaint Analysis

Subaru Quality Insight Project report | Power BI dashboard

I built a 3-page Power BI dashboard to analyze over 3,500 vehicle complaints from 2019 to 2025. The goal was to spot trends, compare brands, and identify quality issues. Using Subaru as a case study, I found a complaint spike in 2024, especially in popular models like the Crosstrek and Forester. This project shows how data dashboards can help monitor product performance and support decision-making.

GIS Analysis - Analyzing Age 65+ Population and Median Income

Publication

I analyzed senior population and income data across Philadelphia to identify ideal service areas for a grooming business. The report highlights key neighborhoods with high senior density and disposable income, using census data and geospatial visuals to support targeting strategy.
