Course Details:
| | |
|---|---|
| Course Code | MIS029 |
| Course Name | Data Visualization |
| Project Type | Final Project Dashboard |
Student Information:
| | |
|---|---|
| Name & Surname | Mert Efe Kurt |
| Student Number | 2307071061 |
| Submission Date | 15 January 2026 |
Dataset Overview:
| | |
|---|---|
| Dataset | Chocolate Bar Ratings |
| Source | TidyTuesday 2022-01-18 |
| Observations | 2,530 |
| Variables | 10 original + 3 derived |
| Period | 2006 - 2021 |
Expert ratings of over 2,500 chocolate bars from around the world, compiled by the Manhattan Chocolate Society via Flavors of Cacao.
🌍 67 countries | 🏭 580 manufacturers | 🌱 62 bean origins
Rating Scale:
| Score | Category |
|---|---|
| 4.0 - 5.0 | 🏆 Outstanding |
| 3.5 - 3.99 | ⭐ Highly Recommended |
| 3.0 - 3.49 | ✓ Recommended |
| 2.5 - 2.99 | ⚠️ Disappointing |
| 1.0 - 2.49 | ✗ Unpleasant |
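The scale above can be read as a lookup on half-open intervals. The following is a minimal base R sketch, assuming each band includes its lower bound (so a score of exactly 3.5 counts as Highly Recommended). `rating_to_category` is a hypothetical helper name used only for illustration.

```r
# Hypothetical helper: map numeric scores to the five labels of the rating scale.
# Assumption: each band includes its lower bound and excludes its upper bound.
rating_to_category <- function(rating) {
  labels <- c("Unpleasant", "Disappointing", "Recommended",
              "Highly Recommended", "Outstanding")
  bands <- cut(rating,
               breaks = c(-Inf, 2.5, 3, 3.5, 4, Inf),
               labels = labels,
               right = FALSE)  # intervals are [lower, upper)
  # Reorder the levels from best to worst for display
  factor(bands, levels = rev(labels))
}

example_scores <- c(4, 3.5, 3.25, 2.75, 2)
as.character(rating_to_category(example_scores))
```

Returning a factor with an explicit level order keeps tables and legends sorted from best to worst rather than alphabetically.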
| # | Variable | Type | Description |
|---|---|---|---|
| 1 | ref | Numeric | Unique reference ID |
| 2 | company_manufacturer | Character | Manufacturer name |
| 3 | company_location | Character | Manufacturer country |
| 4 | review_date | Numeric | Review year |
| 5 | country_of_bean_origin | Character | Bean origin country |
| 6 | specific_bean_origin_or_bar_name | Character | Bean variety/bar name |
| 7 | cocoa_percent | Character | Cocoa % (text) |
| 8 | ingredients | Character | Ingredient codes |
| 9 | most_memorable_characteristics | Character | Flavor notes |
| 10 | rating | Numeric | Rating (1-5) |
| 11 | cocoa_percent_num | Numeric | Cocoa % [Derived] |
| 12 | num_ingredients | Numeric | Ingredient count [Derived] |
| 13 | rating_category | Factor | Rating label [Derived] |
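The two numeric derived variables come from text columns. As a rough base R sketch of that conversion, the toy values below assume text formats such as "70%" and "3- B,S,C"; these sample strings are illustrative assumptions, not rows quoted from the dataset.

```r
# Toy values; the text formats ("70%", "3- B,S,C") are assumed for illustration.
cocoa_percent <- c("70%", "85%", "72.5%")
ingredients   <- c("3- B,S,C", "2- B,S", NA)

# Strip the percent sign, then convert the remaining text to a number
cocoa_percent_num <- as.numeric(sub("%", "", cocoa_percent, fixed = TRUE))

# Take the leading digit as the ingredient count; a missing value stays NA
num_ingredients <- as.numeric(substr(ingredients, 1, 1))

data.frame(cocoa_percent_num, num_ingredients)
```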
| Metric | Value |
|---|---|
| Rows (Observations) | 2,530 |
| Columns (Variables) | 13 |
| Memory Size | 611.5 Kb |
| Complete Cases | 2,443 |
| Missing Values | 174 |
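These counts are linked: 2,530 - 2,443 = 87 incomplete rows, and 174 missing cells is exactly two per incomplete row. That pattern would arise if a missing source column also makes the column derived from it missing; this is an inference from the counts, not something the dashboard states. A small sketch with hypothetical toy data:

```r
# Toy data frame (hypothetical values): a missing source column also makes
# the column derived from it missing.
toy <- data.frame(
  ingredients = c("3- B,S,C", NA, "2- B,S", NA),
  rating      = c(3.5, 3.0, 2.75, 3.25),
  stringsAsFactors = FALSE
)
toy$num_ingredients <- as.numeric(substr(toy$ingredients, 1, 1))

n_complete   <- sum(complete.cases(toy))  # rows with no NA in any column
n_missing    <- sum(is.na(toy))           # NA cells across the whole table
n_incomplete <- nrow(toy) - n_complete

c(complete = n_complete, incomplete = n_incomplete, missing_cells = n_missing)
```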
| Variable | N | Mean | Median | SD | Min | Max |
|---|---|---|---|---|---|---|
| Cocoa Percentage (%) | 2530 | 71.64 | 70.00 | 5.62 | 42 | 100 |
| Number of Ingredients | 2443 | 3.04 | 3.00 | 0.91 | 1 | 6 |
| Rating Score | 2530 | 3.20 | 3.25 | 0.45 | 1 | 4 |
| Review Year | 2530 | 2014.37 | 2015.00 | 3.97 | 2006 | 2021 |
| Country | Count | Pct | Cum.Pct |
|---|---|---|---|
| U.S.A. | 1136 | 44.9 | 44.9 |
| Canada | 177 | 7.0 | 51.9 |
| France | 176 | 7.0 | 58.9 |
| U.K. | 133 | 5.3 | 64.2 |
| Italy | 78 | 3.1 | 67.3 |
| Belgium | 63 | 2.5 | 69.8 |
| Ecuador | 58 | 2.3 | 72.1 |
| Australia | 53 | 2.1 | 74.2 |
| Switzerland | 44 | 1.7 | 75.9 |
| Germany | 42 | 1.7 | 77.6 |
| Spain | 36 | 1.4 | 79.0 |
| Denmark | 31 | 1.2 | 80.2 |
| Bean Origin | Count | Pct | Cum.Pct |
|---|---|---|---|
| Venezuela | 253 | 10.0 | 10.0 |
| Peru | 244 | 9.6 | 19.6 |
| Dominican Republic | 226 | 8.9 | 28.5 |
| Ecuador | 219 | 8.7 | 37.2 |
| Madagascar | 177 | 7.0 | 44.2 |
| Blend | 156 | 6.2 | 50.4 |
| Nicaragua | 100 | 4.0 | 54.4 |
| Bolivia | 80 | 3.2 | 57.6 |
| Colombia | 79 | 3.1 | 60.7 |
| Tanzania | 79 | 3.1 | 63.8 |
| Brazil | 78 | 3.1 | 66.9 |
| Belize | 76 | 3.0 | 69.9 |
| Category | Count | Pct | Cum.Pct |
|---|---|---|---|
| Outstanding | 112 | 4.4 | 4.4 |
| Highly Recommended | 865 | 34.2 | 38.6 |
| Recommended | 987 | 39.0 | 77.6 |
| Disappointing | 499 | 19.7 | 97.3 |
| Unpleasant | 67 | 2.6 | 99.9 |
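The final cumulative value stops at 99.9 rather than 100 because the running total is built from percentages that were already rounded. A short sketch using the counts from the table above compares that approach with cumulating the raw counts and rounding once:

```r
# Counts copied from the Rating Categories table above
counts <- c(Outstanding = 112, `Highly Recommended` = 865, Recommended = 987,
            Disappointing = 499, Unpleasant = 67)

pct <- counts / sum(counts) * 100

# Summing already-rounded percentages lets rounding error accumulate
cum_from_rounded <- cumsum(round(pct, 1))

# Cumulating raw counts and rounding once at the end avoids the drift
cum_from_counts <- round(cumsum(counts) / sum(counts) * 100, 1)

round(unname(cum_from_rounded), 1)
unname(cum_from_counts)
```

With these counts the first approach ends at 99.9 and the second at 100.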
The histogram shows that chocolate ratings are roughly bell-shaped with a slight left skew. The majority of bars (over 60%) receive ratings between 3.0 and 3.5, which spans the “Recommended” band and the lower edge of “Highly Recommended”. The mean (3.2) sits just below the median (3.25), which is consistent with that mild left skew rather than perfect symmetry.
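One quick way to put a number on the skew mentioned above is Pearson's second skewness coefficient, 3 × (mean − median) / SD. The sketch below uses the rounded summary values reported in this dashboard, so the result is only a rough indicator, not a statistic computed from the raw ratings.

```r
# Summary values as reported in the dashboard tables (rounded)
rating_mean   <- 3.20
rating_median <- 3.25
rating_sd     <- 0.45

# Pearson's second skewness coefficient: 3 * (mean - median) / sd
# Negative values indicate a longer left tail.
pearson_skew <- 3 * (rating_mean - rating_median) / rating_sd
round(pearson_skew, 2)
```

A value of about -0.33 points to a mild left skew.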
| Statistic | Rating | Cocoa % |
|---|---|---|
| Mean | 3.2 | 71.6% |
| Median | 3.25 | 70% |
| Std Dev | 0.45 | 5.6% |
| Range | 1 - 4 | 42 - 100% |
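Key Insights:
- The most frequent rating is 3.5
- About 75% of ratings fall between 2.75 and 3.5
- 70% cocoa is the most common formulation
- Only about 4% achieve “Outstanding” (4.0+)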
This boxplot compares rating distributions across the top 10 chocolate-manufacturing countries by review count. Australia shows the highest median rating (3.50) with comparatively low variability (SD 0.41). Belgium and Ecuador display the widest spreads (SD 0.66 and 0.55). Medians range from 3.00 to 3.50, with most countries at 3.25. Gold diamonds (means) sit within about 0.15 points of the medians.
| Country | N | Mean | Median | SD |
|---|---|---|---|---|
| Australia | 53 | 3.36 | 3.50 | 0.41 |
| Canada | 177 | 3.30 | 3.25 | 0.42 |
| France | 176 | 3.26 | 3.25 | 0.52 |
| Germany | 42 | 3.21 | 3.25 | 0.47 |
| Italy | 78 | 3.23 | 3.25 | 0.47 |
| Switzerland | 44 | 3.32 | 3.25 | 0.45 |
| U.S.A. | 1136 | 3.19 | 3.25 | 0.42 |
| Belgium | 63 | 3.10 | 3.00 | 0.66 |
| Ecuador | 58 | 3.04 | 3.00 | 0.55 |
| U.K. | 133 | 3.07 | 3.00 | 0.47 |
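The table can also be queried directly to check which country leads on median rating and which has the widest spread. A minimal sketch with the values copied from the table above; the object names are illustrative.

```r
# Per-country statistics copied from the table above
country_stats <- data.frame(
  country = c("Australia", "Canada", "France", "Germany", "Italy",
              "Switzerland", "U.S.A.", "Belgium", "Ecuador", "U.K."),
  n      = c(53, 177, 176, 42, 78, 44, 1136, 63, 58, 133),
  mean   = c(3.36, 3.30, 3.26, 3.21, 3.23, 3.32, 3.19, 3.10, 3.04, 3.07),
  median = c(3.50, 3.25, 3.25, 3.25, 3.25, 3.25, 3.25, 3.00, 3.00, 3.00),
  sd     = c(0.41, 0.42, 0.52, 0.47, 0.47, 0.45, 0.42, 0.66, 0.55, 0.47),
  stringsAsFactors = FALSE
)

highest_median <- country_stats$country[which.max(country_stats$median)]
widest_spread  <- country_stats$country[which.max(country_stats$sd)]

# Largest absolute gap between mean and median across the ten countries
max_gap <- max(abs(country_stats$mean - country_stats$median))

c(highest_median = highest_median, widest_spread = widest_spread)
max_gap
```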
This scatterplot explores the relationship between cocoa percentage and expert ratings. The LOESS curve reveals that ratings peak around 65-75% cocoa, then decline at higher percentages. The weak negative correlation (r = -0.147) indicates cocoa content alone doesn’t determine quality.
r = -0.147
⚠️ Weak Negative Correlation
Cocoa % explains only 2.2% of rating variance
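The 2.2% figure follows from squaring the correlation coefficient, since for a simple linear relationship the share of variance explained is r². As a worked check:

```r
r <- -0.147        # Pearson correlation reported above
r_squared <- r^2   # share of variance explained by a straight-line fit

round(r_squared * 100, 1)  # percentage of rating variance explained
```

Note that r² describes only the linear component; the curved LOESS pattern is not captured by it.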
| Region | Count | Avg Rating | Avg Cocoa |
|---|---|---|---|
| Central Am. & Caribbean | 683 | 3.22 | 72% |
| Africa | 357 | 3.21 | 71% |
| Asia-Pacific | 231 | 3.21 | 71% |
| South America | 953 | 3.20 | 72% |
| Other / Blend | 306 | 3.08 | 71% |
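As a consistency check, the regional rows can be recombined into an overall figure with a count-weighted mean. The sketch below uses the rounded values from the table above, so a small difference from the reported overall mean of 3.20 is plausibly due to that rounding.

```r
# Regional summary copied from the table above
region <- data.frame(
  name       = c("Central Am. & Caribbean", "Africa", "Asia-Pacific",
                 "South America", "Other / Blend"),
  count      = c(683, 357, 231, 953, 306),
  avg_rating = c(3.22, 3.21, 3.21, 3.20, 3.08),
  stringsAsFactors = FALSE
)

total_bars <- sum(region$count)

# Weight each regional average by the number of bars behind it
overall_from_regions <- weighted.mean(region$avg_rating, region$count)

total_bars
round(overall_from_regions, 2)
```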
Insights:
Native Plotly with WebGL for optimal performance. Stratified sample (40% per category). Hover for details, use toolbar to zoom/pan/download.
Hover over points to see: Manufacturer name, location, bean origin, cocoa %, rating & category.
Toolbar Options: Download PNG, Zoom in/out, Pan, Reset view.
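The stratified sample mentioned above keeps the same fraction of rows from every rating category, so rare categories are not lost when the data is thinned for plotting. The following base R sketch shows one way to do this; `stratified_sample` and the toy data are hypothetical and not taken from the dashboard source.

```r
# Hypothetical helper: keep `frac` of the rows within each group of `group_col`
stratified_sample <- function(df, group_col, frac) {
  idx_by_group <- split(seq_len(nrow(df)), df[[group_col]])
  keep <- unlist(lapply(idx_by_group, function(idx) {
    if (length(idx) == 0) return(integer(0))       # skip empty factor levels
    n_keep <- max(1, round(length(idx) * frac))    # at least one row per group
    idx[sample.int(length(idx), n_keep)]           # sample positions, not values
  }), use.names = FALSE)
  df[sort(keep), , drop = FALSE]
}

# Toy data: three categories of unequal size
set.seed(1)
toy <- data.frame(
  category = rep(c("A", "B", "C"), times = c(10, 20, 5)),
  value    = seq_len(35),
  stringsAsFactors = FALSE
)

sampled <- stratified_sample(toy, "category", 0.4)
table(sampled$category)
```

Sampling positions with `sample.int()` avoids the pitfall where `sample(x, n)` treats a single number `x` as the range `1:x`.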
Color Legend:
| Color | Category |
|---|---|
| 🟢 Green | Outstanding / Highly Recommended |
| 🟠 Orange | Recommended |
| 🔴 Red | Disappointing / Unpleasant |
How to use this table:
Column Guide:
| Column | Description |
|---|---|
| Manufacturer | Company that made the chocolate |
| Location | Country where company is based |
| Year | When the review was conducted |
| Bean Origin | Where cocoa beans came from |
| Cocoa | Percentage of cocoa content |
| Rating | Expert score (1.0 - 5.0) |
| Category | Rating classification |
| Flavors | Tasting notes from experts |
Primary Data Source:
📊 TidyTuesday - Chocolate Bar Ratings (2022-01-18)
| Attribute | Details |
|---|---|
| Repository | github.com/rfordatascience/tidytuesday |
| Direct CSV | chocolate.csv |
| Records | 2,530 chocolate bar reviews |
| Time Span | 2006 - 2021 |
| Variables | 10 original columns |
Original Data Provider:
🍫 Flavors of Cacao - flavorsofcacao.com
Data Collection Method:
Expert tasters evaluate chocolate bars on texture, flavor complexity, finish, and overall impression. Each bar is rated independently without brand knowledge.
| Package | Version | Purpose | Citation |
|---|---|---|---|
| flexdashboard | 0.6.2 | Dashboard framework & layout | Iannone et al. (2024) |
| tidyverse | 2.0.0 | Data wrangling (dplyr, tidyr, readr) | Wickham et al. (2019) |
| ggplot2 | 4.0.1 | Grammar of graphics visualizations | Wickham (2016) |
| plotly | 4.11.0 | Interactive charts with WebGL | Sievert (2020) |
| DT | 0.34.0 | Interactive searchable data tables | Xie et al. (2024) |
| knitr | 1.49 | Dynamic report generation | Xie (2024) |
| kableExtra | 1.4.0 | Advanced table formatting | Zhu (2024) |
| scales | 1.4.0 | Axis & label formatting | Wickham & Seidel (2022) |
Official Documentation:
| Resource | URL | Purpose |
|---|---|---|
| 📖 Flexdashboard | pkgs.rstudio.com/flexdashboard | Dashboard layouts & components |
| 📊 Plotly R | plotly.com/r | Interactive visualizations |
| 🎨 ggplot2 | ggplot2.tidyverse.org | Static graphics reference |
| 📋 DT Package | rstudio.github.io/DT | DataTables integration |
| 📚 kableExtra | haozhu233.github.io/kableExtra | Table styling |
Books & Learning Resources:
Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for Data Science (2nd ed.). O’Reilly. r4ds.hadley.nz
Sievert, C. (2020). Interactive Web-Based Data Visualization with R, plotly, and shiny. Chapman & Hall/CRC. plotly-r.com
Wilke, C. O. (2019). Fundamentals of Data Visualization. O’Reilly. clauswilke.com/dataviz
Healy, K. (2018). Data Visualization: A Practical Introduction. Princeton University Press.
Course Materials:
| Property | Value |
|---|---|
| R Version | 4.4.3 |
| Platform | aarch64-apple-darwin20 |
| Operating System | Darwin 25.1.0 |
| Locale | en_US.UTF-8 |
| Date Generated | 2026-01-15 19:09:27 |
| Timezone | Europe/Istanbul |
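Before rendering on a different machine, it can help to check which of the required packages are not yet installed. A minimal sketch, assuming the package list from the table above; `missing_packages` is a hypothetical helper name.

```r
# Hypothetical helper: report which of the given packages are not installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

required <- c("flexdashboard", "tidyverse", "plotly", "DT",
              "knitr", "kableExtra", "scales")

# Character vector of packages that still need install.packages()
missing_packages(required)
```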
To reproduce this analysis:
# 1. Install required packages
install.packages(c("flexdashboard", "tidyverse",
                   "plotly", "DT", "knitr",
                   "kableExtra", "scales"))
# 2. Render the dashboard
rmarkdown::render("MertEfeKurt_2307071061_Final.Rmd")
⚠️ Requirements:
Data License: TidyTuesday data is released under CC0 1.0 Universal license.
Dashboard Author: Mert Efe Kurt (2307071061)
Course: MIS029 - Data Visualization
Institution: Final Project Submission
Generated: January 15, 2026 at 19:09
---
title: "🍫 Chocolate Bar Ratings Analysis"
author: "Mert Efe Kurt | 2307071061"
output:
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: scroll
    theme:
      version: 4
      bootswatch: cosmo
    navbar:
      - { title: "MIS029 Final Project", align: right }
    source_code: embed
---
```{css, echo=FALSE}
/* ===== PREMIUM CHOCOLATE THEME ===== */
/* Import elegant fonts */
@import url('https://fonts.googleapis.com/css2?family=Playfair+Display:wght@400;600;700&family=Source+Sans+Pro:wght@300;400;600&display=swap');
/* Root variables for consistent theming */
:root {
--chocolate-dark: #2C1810;
--chocolate-medium: #5D4037;
--chocolate-light: #8D6E63;
--chocolate-cream: #D7CCC8;
--chocolate-milk: #EFEBE9;
--gold-accent: #D4AF37;
--gold-light: #F4E4BC;
--success-green: #4CAF50;
--warning-orange: #FF9800;
--danger-red: #E53935;
}
/* Prevent horizontal overflow - CRITICAL */
html, body {
overflow-x: hidden !important;
max-width: 100% !important;
}
/* Global body styling */
body {
font-family: 'Source Sans Pro', -apple-system, BlinkMacSystemFont, sans-serif;
background: linear-gradient(135deg, #FAFAFA 0%, #F5F5F5 100%);
color: var(--chocolate-dark);
width: 100%;
box-sizing: border-box;
}
/* Ensure all containers respect boundaries */
* {
box-sizing: border-box;
}
/* Main container */
.container-fluid, .row {
max-width: 100% !important;
overflow-x: hidden !important;
}
/* Navbar styling */
.navbar {
background: linear-gradient(135deg, var(--chocolate-dark) 0%, var(--chocolate-medium) 100%) !important;
border-bottom: 3px solid var(--gold-accent) !important;
box-shadow: 0 4px 20px rgba(44, 24, 16, 0.3);
}
.navbar-brand {
font-family: 'Playfair Display', serif !important;
font-weight: 700 !important;
font-size: 1.5rem !important;
color: var(--gold-light) !important;
text-shadow: 1px 1px 2px rgba(0,0,0,0.3);
}
.navbar-nav > li > a {
color: var(--chocolate-cream) !important;
font-weight: 500;
transition: all 0.3s ease;
}
.navbar-nav > li > a:hover {
color: var(--gold-accent) !important;
background: rgba(212, 175, 55, 0.15) !important;
}
.navbar-nav > .active > a {
background: rgba(212, 175, 55, 0.25) !important;
color: var(--gold-accent) !important;
border-bottom: 2px solid var(--gold-accent);
}
/* Page titles */
.section.level3 > h3 {
font-family: 'Playfair Display', serif;
color: var(--chocolate-dark);
font-weight: 600;
border-bottom: 2px solid var(--gold-accent);
padding-bottom: 8px;
margin-bottom: 15px;
}
/* Chart containers */
.chart-wrapper {
background: white;
border-radius: 12px;
box-shadow: 0 4px 15px rgba(44, 24, 16, 0.08);
padding: 15px;
transition: transform 0.3s ease, box-shadow 0.3s ease;
max-width: 100% !important;
overflow: hidden !important;
}
.chart-wrapper:hover {
transform: translateY(-2px);
box-shadow: 0 8px 25px rgba(44, 24, 16, 0.12);
}
/* Chart stage - allow vertical scroll when needed */
.chart-stage {
max-width: 100% !important;
overflow-x: hidden !important;
overflow-y: auto !important;
width: 100% !important;
}
/* Plotly and ggplot containers */
.plotly, .plotly-container, .html-widget {
max-width: 100% !important;
overflow: hidden !important;
}
/* Value boxes */
.value-box {
border-radius: 12px !important;
box-shadow: 0 4px 15px rgba(44, 24, 16, 0.15) !important;
transition: transform 0.3s ease !important;
}
.value-box:hover {
transform: scale(1.02);
}
.value-box .value {
font-family: 'Playfair Display', serif !important;
font-size: 2.2rem !important;
font-weight: 700 !important;
}
.value-box .caption {
font-family: 'Source Sans Pro', sans-serif !important;
font-size: 0.85rem !important;
font-weight: 500 !important;
text-transform: uppercase;
letter-spacing: 0.5px;
}
/* Tables - prevent horizontal scroll and fix backgrounds */
.dataTable {
font-size: 0.9rem !important;
max-width: 100% !important;
width: 100% !important;
table-layout: auto !important;
background-color: white !important;
}
.dataTables_wrapper {
max-width: 100% !important;
overflow-x: hidden !important;
background-color: white !important;
}
.dataTables_scroll {
max-width: 100% !important;
overflow-x: hidden !important;
background-color: white !important;
}
/* Fix white space issue in Data table - CRITICAL */
.dataTables_scrollBody {
background-color: white !important;
overflow-y: auto !important;
}
.dataTables_scrollHead {
background-color: white !important;
}
table.dataTable {
background-color: white !important;
}
table.dataTable tbody tr {
background-color: white !important;
}
table.dataTable tbody tr:nth-child(odd) {
background-color: #FAFAFA !important;
}
table.dataTable tbody tr:hover {
background-color: var(--gold-light) !important;
}
/* Data page specific - fill container */
.chart-wrapper.html-fill-container {
height: 100% !important;
min-height: calc(100vh - 200px) !important;
}
/* Ensure DataTable fills its container */
.html-widget.html-fill-item {
height: 100% !important;
min-height: calc(100vh - 250px) !important;
}
.dataTable thead th {
background: linear-gradient(135deg, var(--chocolate-dark), var(--chocolate-medium)) !important;
color: white !important;
font-weight: 600 !important;
text-transform: uppercase;
font-size: 0.8rem;
letter-spacing: 0.5px;
white-space: nowrap;
overflow: hidden;
text-overflow: ellipsis;
}
.dataTable tbody tr:hover {
background-color: var(--gold-light) !important;
}
.dataTable tbody td {
word-wrap: break-word;
max-width: 200px;
overflow: hidden;
text-overflow: ellipsis;
}
/* Kable tables */
.table-striped > tbody > tr:nth-of-type(odd) {
background-color: var(--chocolate-milk);
}
/* Custom card styling */
.info-card {
background: linear-gradient(145deg, #FFFFFF 0%, var(--chocolate-milk) 100%);
border-radius: 16px;
padding: 25px;
box-shadow: 0 8px 30px rgba(44, 24, 16, 0.1);
border-left: 5px solid var(--gold-accent);
margin-bottom: 20px;
}
.premium-card {
background: linear-gradient(135deg, var(--chocolate-dark) 0%, var(--chocolate-medium) 100%);
color: white;
border-radius: 16px;
padding: 30px;
box-shadow: 0 10px 40px rgba(44, 24, 16, 0.25);
}
.premium-card h4 {
color: var(--gold-accent);
font-family: 'Playfair Display', serif;
margin-bottom: 15px;
}
/* Interpretation boxes - MUST BE VISIBLE */
.interpretation-box {
background: linear-gradient(135deg, var(--chocolate-milk) 0%, #FFFFFF 100%);
border-radius: 12px;
padding: 18px 20px;
margin-top: 20px;
margin-bottom: 15px;
border-left: 4px solid var(--gold-accent);
font-size: 0.92rem;
line-height: 1.65;
box-shadow: 0 2px 8px rgba(44, 24, 16, 0.08);
position: relative;
z-index: 10;
}
/* Ensure chart wrappers don't overflow */
.chart-shim {
overflow: hidden !important;
max-width: 100% !important;
}
.section.level3 {
overflow-x: hidden !important;
overflow-y: auto !important;
padding-bottom: 10px;
max-width: 100% !important;
}
/* Column sections */
.section {
max-width: 100% !important;
overflow-x: hidden !important;
}
/* Flexdashboard columns */
.flexdashboard-column {
max-width: 100% !important;
overflow-x: hidden !important;
}
/* Stats highlight */
.stat-highlight {
background: linear-gradient(135deg, var(--gold-light) 0%, #FFFFFF 100%);
border-radius: 10px;
padding: 15px 20px;
text-align: center;
box-shadow: 0 4px 15px rgba(212, 175, 55, 0.2);
}
.stat-highlight .number {
font-family: 'Playfair Display', serif;
font-size: 2.5rem;
font-weight: 700;
color: var(--chocolate-dark);
}
.stat-highlight .label {
font-size: 0.85rem;
color: var(--chocolate-light);
text-transform: uppercase;
letter-spacing: 1px;
}
/* Gauge styling */
.gauge-container {
background: white;
border-radius: 12px;
padding: 20px;
text-align: center;
}
/* Custom scrollbar */
::-webkit-scrollbar {
width: 8px;
height: 8px;
}
::-webkit-scrollbar-track {
background: var(--chocolate-milk);
border-radius: 4px;
}
::-webkit-scrollbar-thumb {
background: var(--chocolate-light);
border-radius: 4px;
}
::-webkit-scrollbar-thumb:hover {
background: var(--chocolate-medium);
}
/* Animation for page load */
@keyframes fadeInUp {
from {
opacity: 0;
transform: translateY(20px);
}
to {
opacity: 1;
transform: translateY(0);
}
}
.chart-stage {
animation: fadeInUp 0.6s ease-out;
}
/* Responsive adjustments */
@media (max-width: 768px) {
.value-box .value {
font-size: 1.8rem !important;
}
.navbar-brand {
font-size: 1.2rem !important;
}
.dataTable {
font-size: 0.75rem !important;
}
.premium-card, .info-card {
padding: 15px !important;
}
}
/* Kable tables - responsive */
table.kable-table, .table {
max-width: 100% !important;
width: 100% !important;
table-layout: auto !important;
font-size: 0.9rem !important;
}
table.kable-table td, table.kable-table th,
.table td, .table th {
word-wrap: break-word;
overflow: hidden;
text-overflow: ellipsis;
padding: 6px 8px !important;
}
/* Ensure variable table fits all content */
.table-condensed td, .table-condensed th {
padding: 4px 6px !important;
font-size: 11px !important;
line-height: 1.3 !important;
}
/* Ensure all images and plots fit */
img, svg, canvas {
max-width: 100% !important;
height: auto !important;
}
/* Info cards and premium cards */
.info-card, .premium-card {
max-width: 100% !important;
word-wrap: break-word;
overflow-wrap: break-word;
overflow-y: auto !important;
}
/* Ensure tables in cards are compact */
.premium-card table, .info-card table {
margin-bottom: 8px !important;
font-size: 0.9rem !important;
}
.premium-card h4, .info-card h4 {
margin-bottom: 8px !important;
font-size: 1rem !important;
}
/* Value boxes container */
.value-box-container {
max-width: 100% !important;
}
/* Ensure proper spacing and no overflow in sections */
.section {
padding-left: 10px !important;
padding-right: 10px !important;
}
/* Grid alignment - ensure columns fill properly */
.flexdashboard-page > .dashboard-row {
display: flex !important;
flex-wrap: nowrap !important;
width: 100% !important;
}
.flexdashboard-page > .dashboard-row > .dashboard-column {
display: flex !important;
flex-direction: column !important;
}
/* Ensure panels stretch to fill available height */
.chart-wrapper {
display: flex !important;
flex-direction: column !important;
flex: 1 1 auto !important;
}
.chart-stage {
flex: 1 1 auto !important;
display: flex !important;
flex-direction: column !important;
}
/* No empty gaps between panels */
.section.level3 {
margin-bottom: 0 !important;
flex: 1 1 auto !important;
}
/* Fix any potential flexdashboard overflow */
.flexdashboard-content {
max-width: 100% !important;
overflow-x: hidden !important;
}
```
```{r setup, include=FALSE}
# ===== SETUP AND DATA LOADING =====
knitr::opts_chunk$set(
echo = FALSE,
message = FALSE,
warning = FALSE,
fig.retina = 2
)
# Load required libraries
library(flexdashboard)
library(tidyverse)
library(plotly)
library(DT)
library(scales)
library(knitr)
library(kableExtra)
# Load the chocolate dataset from TidyTuesday
chocolate <- read_csv(
"https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-01-18/chocolate.csv",
show_col_types = FALSE
)
# ===== DATA PREPARATION =====
chocolate <- chocolate %>%
mutate(
# Convert cocoa_percent to numeric
cocoa_percent_num = as.numeric(gsub("%", "", cocoa_percent)),
# Extract number of ingredients
num_ingredients = as.numeric(str_extract(ingredients, "^[0-9]")),
# Create rating categories
rating_category = case_when(
rating >= 4 ~ "Outstanding",
rating >= 3.5 ~ "Highly Recommended",
rating >= 3 ~ "Recommended",
rating >= 2.5 ~ "Disappointing",
TRUE ~ "Unpleasant"
),
rating_category = factor(rating_category, levels = c(
"Outstanding", "Highly Recommended", "Recommended", "Disappointing", "Unpleasant"
))
)
# ===== SUMMARY STATISTICS =====
total_reviews <- nrow(chocolate)
avg_rating <- round(mean(chocolate$rating, na.rm = TRUE), 2)
num_countries <- n_distinct(chocolate$company_location)
num_manufacturers <- n_distinct(chocolate$company_manufacturer)
avg_cocoa <- round(mean(chocolate$cocoa_percent_num, na.rm = TRUE), 1)
num_origins <- n_distinct(chocolate$country_of_bean_origin)
# ===== PREMIUM GGPLOT THEME =====
theme_premium <- function() {
theme_minimal(base_family = "sans") +
theme(
# Title styling
plot.title = element_text(
face = "bold",
size = 16,
color = "#2C1810",
margin = margin(b = 10)
),
plot.subtitle = element_text(
size = 11,
color = "#5D4037",
margin = margin(b = 15)
),
plot.caption = element_text(
size = 9,
color = "#8D6E63",
hjust = 0,
margin = margin(t = 10)
),
# Axis styling
axis.title = element_text(
face = "bold",
size = 11,
color = "#5D4037"
),
axis.text = element_text(
size = 10,
color = "#5D4037"
),
axis.line = element_line(color = "#D7CCC8", linewidth = 0.5),
# Legend styling
legend.title = element_text(face = "bold", size = 10, color = "#2C1810"),
legend.text = element_text(size = 9, color = "#5D4037"),
legend.background = element_rect(fill = "white", color = NA),
legend.key = element_rect(fill = "white", color = NA),
# Panel styling
panel.grid.minor = element_blank(),
panel.grid.major = element_line(color = "#EFEBE9", linewidth = 0.4),
panel.background = element_rect(fill = "white", color = NA),
plot.background = element_rect(fill = "white", color = NA),
# Margins
plot.margin = margin(15, 15, 15, 15)
)
}
# Premium color palettes
chocolate_palette <- c("#2C1810", "#4E342E", "#5D4037", "#6D4C41", "#795548",
"#8D6E63", "#A1887F", "#BCAAA4", "#D7CCC8", "#EFEBE9")
rating_colors <- c("Outstanding" = "#2E7D32", "Highly Recommended" = "#689F38",
"Recommended" = "#FFA000", "Disappointing" = "#F57C00",
"Unpleasant" = "#D32F2F")
```
About {data-icon="fa-info-circle"}
=====================================
Column {data-width=450}
-----------------------------------------------------------------------
### 📌 Project Information {data-height=500}
<div class="premium-card" style="padding: 20px;">
<h4 style="margin-top: 0;">📚 Course Details</h4>
| | |
|:--|:--|
| **Course Code** | MIS029 |
| **Course Name** | Data Visualization |
| **Project Type** | Final Project Dashboard |
<h4 style="margin-top: 15px;">👤 Student Information</h4>
| | |
|:--|:--|
| **Name & Surname** | Mert Efe Kurt |
| **Student Number** | 2307071061 |
| **Submission Date** | `r format(Sys.Date(), "%d %B %Y")` |
<h4 style="margin-top: 15px;">📊 Dataset Overview</h4>
| | |
|:--|:--|
| **Dataset** | Chocolate Bar Ratings |
| **Source** | TidyTuesday 2022-01-18 |
| **Observations** | `r format(total_reviews, big.mark = ",")` |
| **Variables** | 10 original + 3 derived |
| **Period** | 2006 - 2021 |
</div>
### 📖 About This Dataset {data-height=400}
<div class="info-card" style="padding: 15px;">
**Expert ratings of over 2,500 chocolate bars** from around the world, compiled by the **Manhattan Chocolate Society** via [Flavors of Cacao](http://flavorsofcacao.com/).
🌍 **`r num_countries`** countries | 🏭 **`r num_manufacturers`** manufacturers | 🌱 **`r num_origins`** bean origins
**Rating Scale:**
| Score | Category |
|:------|:---------|
| 4.0 - 5.0 | 🏆 Outstanding |
| 3.5 - 3.99 | ⭐ Highly Recommended |
| 3.0 - 3.49 | ✓ Recommended |
| 2.5 - 2.99 | ⚠️ Disappointing |
| 1.0 - 2.49 | ✗ Unpleasant |
</div>
Column {data-width=550}
-----------------------------------------------------------------------
### 📋 Variable Names and Types {data-height=650}
```{r}
# REQUIRED: List of variable names and types (taken from class() of each column)
var_info <- tibble(
  `#` = seq_len(ncol(chocolate)),
  Variable = names(chocolate),
  # vapply() guarantees exactly one character value per column
  Type = vapply(chocolate, function(x) {
    type <- class(x)[1]
    case_when(
      type == "character" ~ "Character",
      type == "numeric" ~ "Numeric",
      type == "factor" ~ "Factor",
      TRUE ~ type
    )
  }, character(1)),
Description = c(
"Unique reference ID",
"Manufacturer name",
"Manufacturer country",
"Review year",
"Bean origin country",
"Bean variety/bar name",
"Cocoa % (text)",
"Ingredient codes",
"Flavor notes",
"Rating (1-5)",
"Cocoa % [Derived]",
"Ingredient count [Derived]",
"Rating label [Derived]"
)
)
# Use kable for compact display - fits all content without scrolling
kable(var_info, align = c('c', 'l', 'l', 'l')) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = TRUE, font_size = 11) %>%
row_spec(0, bold = TRUE, background = "#2C1810", color = "white") %>%
column_spec(1, width = "30px", bold = TRUE) %>%
column_spec(2, width = "140px", color = "#5D4037") %>%
column_spec(3, width = "80px", color = "#1565C0", bold = TRUE) %>%
column_spec(4, width = "auto") %>%
  row_spec(11:13, background = "#FFF8E1")  # Highlight derived variables (rows 11-13)
```
### 🔍 Data Structure Preview {data-height=250}
```{r}
# Show glimpse-style output
structure_df <- tibble(
Metric = c("Rows (Observations)", "Columns (Variables)",
"Memory Size", "Complete Cases", "Missing Values"),
Value = c(
format(nrow(chocolate), big.mark = ","),
ncol(chocolate),
format(object.size(chocolate), units = "KB"),
format(sum(complete.cases(chocolate)), big.mark = ","),
sum(is.na(chocolate))
)
)
kable(structure_df, align = c('l', 'r')) %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = TRUE, font_size = 13) %>%
row_spec(0, bold = TRUE, background = "#2C1810", color = "white") %>%
column_spec(1, bold = TRUE, color = "#5D4037")
```
Summary {data-icon="fa-calculator"}
=====================================
Column {data-width=550}
-----------------------------------------------------------------------
### 📊 Summary Statistics for Numeric Variables {data-height=280}
```{r}
# REQUIRED: Summary statistics (mean, median, SD, min, max)
numeric_summary <- chocolate %>%
select(review_date, cocoa_percent_num, rating, num_ingredients) %>%
pivot_longer(everything(), names_to = "Variable", values_to = "Value") %>%
group_by(Variable) %>%
summarise(
N = sum(!is.na(Value)),
Mean = round(mean(Value, na.rm = TRUE), 2),
Median = round(median(Value, na.rm = TRUE), 2),
SD = round(sd(Value, na.rm = TRUE), 2),
Min = round(min(Value, na.rm = TRUE), 2),
Max = round(max(Value, na.rm = TRUE), 2),
.groups = 'drop'
) %>%
mutate(Variable = case_when(
Variable == "review_date" ~ "Review Year",
Variable == "cocoa_percent_num" ~ "Cocoa Percentage (%)",
Variable == "rating" ~ "Rating Score",
Variable == "num_ingredients" ~ "Number of Ingredients"
))
kable(numeric_summary, align = c('l', rep('c', 6)),
caption = NULL) %>%
kable_styling(bootstrap_options = c("striped", "hover", "responsive"),
full_width = TRUE, font_size = 13) %>%
row_spec(0, bold = TRUE, background = "#2C1810", color = "white") %>%
column_spec(1, bold = TRUE, color = "#5D4037", width = "180px") %>%
row_spec(3, bold = TRUE, background = "#FFF8E1")
```
### 📍 Frequency Table: Top Manufacturing Countries {data-height=470}
```{r}
# REQUIRED: Frequency table for categorical variable
country_freq <- chocolate %>%
count(company_location, sort = TRUE) %>%
head(12) %>%
mutate(
Pct = round(n / nrow(chocolate) * 100, 1),
    # Cumulate raw counts so rounding error does not accumulate
    `Cum.Pct` = round(cumsum(n) / nrow(chocolate) * 100, 1),
Bar = paste0(strrep("▓", round(Pct/2)), strrep("░", 25 - round(Pct/2)))
) %>%
rename(Country = company_location, Count = n)
kable(country_freq %>% select(-Bar), align = c('l', 'c', 'c', 'c')) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = TRUE, font_size = 12) %>%
row_spec(0, bold = TRUE, background = "#5D4037", color = "white") %>%
row_spec(1, bold = TRUE, background = "#D4AF37", color = "#2C1810") %>%
row_spec(2:3, background = "#F4E4BC")
```
Column {data-width=450}
-----------------------------------------------------------------------
### 🌱 Frequency Table: Top Bean Origin Countries {data-height=380}
```{r}
origin_freq <- chocolate %>%
count(country_of_bean_origin, sort = TRUE) %>%
head(12) %>%
mutate(
Pct = round(n / nrow(chocolate) * 100, 1),
    # Cumulate raw counts so rounding error does not accumulate
    `Cum.Pct` = round(cumsum(n) / nrow(chocolate) * 100, 1)
) %>%
rename(`Bean Origin` = country_of_bean_origin, Count = n)
kable(origin_freq, align = c('l', 'c', 'c', 'c')) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = TRUE, font_size = 12) %>%
row_spec(0, bold = TRUE, background = "#795548", color = "white") %>%
row_spec(1, bold = TRUE, background = "#D4AF37", color = "#2C1810") %>%
row_spec(2:3, background = "#F4E4BC")
```
### ⭐ Frequency Table: Rating Categories {data-height=250}
```{r}
rating_freq <- chocolate %>%
count(rating_category) %>%
mutate(
Pct = round(n / sum(n) * 100, 1),
    # Cumulate raw counts so the last row reaches exactly 100
    `Cum.Pct` = round(cumsum(n) / sum(n) * 100, 1)
) %>%
rename(Category = rating_category, Count = n)
kable(rating_freq, align = c('l', 'c', 'c', 'c')) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = TRUE, font_size = 12) %>%
row_spec(0, bold = TRUE, background = "#6D4C41", color = "white")
```
### 🔢 Quick Stats {data-height=120}
```{r}
valueBox(format(total_reviews, big.mark = ","),
caption = "Total Reviews", icon = "fa-chart-bar", color = "#2C1810")
```
Histogram {data-icon="fa-chart-bar"}
=====================================
Column {data-width=600}
-----------------------------------------------------------------------
### 📊 Histogram: Distribution of Expert Chocolate Ratings {data-height=500}
```{r fig.height=5, fig.width=8}
# REQUIRED: Histogram for numerical variable with appropriate labels and minimal theme
# Purpose: Visualize rating distribution to understand chocolate quality spread
p_hist <- ggplot(chocolate, aes(x = rating)) +
geom_histogram(binwidth = 0.25, fill = "#5D4037", color = "#2C1810",
alpha = 0.85, linewidth = 0.3) +
  # A constant xintercept draws each line once instead of once per data row
  geom_vline(xintercept = mean(chocolate$rating),
             color = "#C62828", linetype = "dashed", linewidth = 1) +
  geom_vline(xintercept = median(chocolate$rating),
             color = "#1565C0", linetype = "solid", linewidth = 1) +
annotate("label", x = 3.7, y = 480,
label = paste0("Mean = ", round(mean(chocolate$rating), 2)),
fill = "#FFEBEE", color = "#C62828", fontface = "bold", size = 3,
label.padding = unit(0.35, "lines")) +
annotate("label", x = 2.8, y = 420,
label = paste0("Median = ", round(median(chocolate$rating), 2)),
fill = "#E3F2FD", color = "#1565C0", fontface = "bold", size = 3,
label.padding = unit(0.35, "lines")) +
labs(
title = "Distribution of Expert Chocolate Bar Ratings",
    subtitle = "Most chocolates receive ratings between 3.0 and 3.5 (Recommended to Highly Recommended)",
x = "Expert Rating Score",
y = "Number of Chocolate Bars",
caption = "Dashed red = Mean | Solid blue = Median | Data: TidyTuesday 2022"
) +
  scale_x_continuous(breaks = seq(1, 5, 0.5)) +
  # Zoom with coord_cartesian(): scale limits can silently drop bins at the edges
  coord_cartesian(xlim = c(0.75, 4.5)) +
scale_y_continuous(labels = comma) +
theme_premium()
p_hist
```
### 📝 Histogram Interpretation {data-height=250}
The histogram shows that chocolate ratings are roughly **bell-shaped** with a slight left skew. The majority of bars (over 60%) receive ratings between **3.0 and 3.5**, which spans the "Recommended" band and the lower edge of "Highly Recommended". The mean (`r round(mean(chocolate$rating), 2)`) sits just below the median (`r round(median(chocolate$rating), 2)`), which is consistent with that mild left skew rather than perfect symmetry.
Column {data-width=400}
-----------------------------------------------------------------------
### 🍫 Cocoa Percentage Distribution {data-height=350}
```{r fig.height=3.5, fig.width=6}
p_cocoa_hist <- ggplot(chocolate, aes(x = cocoa_percent_num)) +
geom_histogram(binwidth = 5, fill = "#8D6E63", color = "#5D4037", alpha = 0.85) +
geom_vline(xintercept = 70, color = "#D4AF37", linetype = "dashed", linewidth = 0.8) +
annotate("label", x = 82, y = 650, label = "70% = Mode",
fill = "#FFF8E1", color = "#D4AF37", size = 2.5, fontface = "bold") +
labs(title = "Cocoa Percentage Distribution",
subtitle = "70% is the most common formulation",
x = "Cocoa %", y = "Count") +
scale_x_continuous(breaks = seq(40, 100, 10)) +
theme_premium() +
theme(plot.title = element_text(size = 12),
plot.subtitle = element_text(size = 9))
p_cocoa_hist
```
### 📈 Key Statistics {data-height=400}
| Statistic | Rating | Cocoa % |
|:----------|-------:|--------:|
| **Mean** | `r round(mean(chocolate$rating), 2)` | `r round(mean(chocolate$cocoa_percent_num, na.rm = TRUE), 1)`% |
| **Median** | `r median(chocolate$rating)` | `r median(chocolate$cocoa_percent_num, na.rm = TRUE)`% |
| **Std Dev** | `r round(sd(chocolate$rating), 2)` | `r round(sd(chocolate$cocoa_percent_num, na.rm = TRUE), 1)`% |
| **Range** | `r min(chocolate$rating)` - `r max(chocolate$rating)` | `r min(chocolate$cocoa_percent_num, na.rm = TRUE)` - `r max(chocolate$cocoa_percent_num, na.rm = TRUE)`% |
**Key Insights:**
- Most frequent rating: `r names(which.max(table(chocolate$rating)))`
- About 75% of ratings fall between 2.75 and 3.5
- 70% cocoa is the most common formulation
- Only ~5% achieve "Outstanding" (4.0+)
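
Figures like these can be derived from the data instead of typed by hand. A minimal sketch, assuming a made-up `ratings_demo` vector and a hypothetical `stat_mode()` helper (neither is part of this dashboard):

```r
# Hypothetical helper: most frequent value of a numeric vector.
# Ties resolve to the smallest value because table() sorts its names.
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

# Illustrative ratings only (not taken from the chocolate dataset)
ratings_demo <- c(2.5, 2.75, 3, 3, 3.25, 3.25, 3.5, 3.5, 3.5, 4)

mode_demo <- stat_mode(ratings_demo)

# Proportions are the mean of a logical vector
share_outstanding <- mean(ratings_demo >= 4)
share_core <- mean(ratings_demo >= 2.75 & ratings_demo <= 3.5)
```

Swapping `ratings_demo` for `chocolate$rating` would give the corresponding dashboard values.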
Boxplot {data-icon="fa-boxes-stacked"}
=====================================
Column {data-width=600}
-----------------------------------------------------------------------
### 📦 Multiple Boxplot: Rating Distribution by Manufacturing Country {data-height=500}
```{r fig.height=5, fig.width=8}
# REQUIRED: Multiple boxplot - numeric variable grouped by categorical variable
# Purpose: Compare rating distributions across countries (categorical grouping)
top_countries <- chocolate %>%
count(company_location, sort = TRUE) %>%
head(10) %>%
pull(company_location)
chocolate_top <- chocolate %>%
filter(company_location %in% top_countries) %>%
mutate(company_location = fct_reorder(company_location, rating, .fun = median, .desc = TRUE))
p_box <- ggplot(chocolate_top, aes(x = company_location, y = rating, fill = company_location)) +
geom_boxplot(alpha = 0.85, outlier.shape = 21, outlier.fill = "#D32F2F",
outlier.color = "#B71C1C", outlier.size = 1.8, outlier.alpha = 0.6,
width = 0.7, linewidth = 0.4) +
stat_summary(fun = mean, geom = "point", shape = 18, size = 3.5,
color = "#D4AF37") +
scale_fill_manual(values = rev(chocolate_palette[1:10])) +
labs(
title = "Chocolate Rating Distribution by Manufacturing Country",
subtitle = "Top 10 countries | Gold diamonds = mean | Red dots = outliers",
x = NULL,
y = "Expert Rating Score",
caption = "Countries ordered by median rating (descending)"
) +
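# Limits span the full 1-4 rating range so no bars are dropped before the boxplot statistics are computed
scale_y_continuous(breaks = seq(1, 5, 0.5), limits = c(1, 4.5)) +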
theme_premium() +
theme(
axis.text.x = element_text(angle = 40, hjust = 1, size = 11, face = "bold"),
legend.position = "none",
panel.grid.major.x = element_blank()
)
p_box
```
### 📝 Boxplot Interpretation {data-height=250}
This boxplot compares rating distributions across the **top 10 chocolate-manufacturing countries** by number of reviewed bars. **`r levels(chocolate_top$company_location)[1]`** ranks first when countries are ordered by median rating. **U.S.A.** and **France** display the widest spreads. All countries have median ratings around **3.0-3.25**. Gold diamonds (means) align closely with medians.
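
For readers unfamiliar with how a boxplot flags outliers, the sketch below applies the 1.5 × IQR rule with base R's `boxplot.stats()`. The `ratings_demo` vector is made up for illustration. It also shows why discarding low values before summarising matters: the mean shifts upward.

```r
# Illustrative ratings only (not taken from the chocolate dataset)
ratings_demo <- c(1, 2.5, 2.75, 3, 3, 3.25, 3.25, 3.5, 3.5, 3.75, 4)

bp <- boxplot.stats(ratings_demo)
box_stats <- bp$stats    # lower whisker, lower hinge, median, upper hinge, upper whisker
box_outliers <- bp$out   # values more than 1.5 * (hinge spread) beyond the hinges

# Dropping low values before summarising shifts the mean upward
mean_all <- mean(ratings_demo)
mean_trimmed <- mean(ratings_demo[ratings_demo >= 1.5])
```

Note that ggplot2 computes the box edges from quartiles, which can differ slightly from the hinges that `boxplot.stats()` uses. The whisker and outlier logic is the same idea.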
Column {data-width=400}
-----------------------------------------------------------------------
### 📊 Rating by Cocoa Range {data-height=350}
```{r fig.height=3.2, fig.width=6}
# cut() intervals are right-closed by default, so a 70% bar falls in "Medium 60-70%"
chocolate_cocoa <- chocolate %>%
mutate(cocoa_range = cut(cocoa_percent_num,
breaks = c(40, 60, 70, 80, 100),
labels = c("Low\n40-60%", "Medium\n60-70%",
"High\n70-80%", "Very High\n80-100%"),
include.lowest = TRUE)) %>%
filter(!is.na(cocoa_range))
p_box_cocoa <- ggplot(chocolate_cocoa, aes(x = cocoa_range, y = rating, fill = cocoa_range)) +
geom_boxplot(alpha = 0.85, width = 0.6, outlier.size = 1.5, outlier.alpha = 0.5, linewidth = 0.4) +
stat_summary(fun = mean, geom = "point", shape = 18, size = 2.5, color = "#D4AF37") +
scale_fill_manual(values = c("#EFEBE9", "#BCAAA4", "#795548", "#3E2723")) +
labs(title = "Rating by Cocoa Content Level",
x = NULL, y = "Rating") +
theme_premium() +
theme(legend.position = "none",
plot.title = element_text(size = 13))
p_box_cocoa
```
### 📋 Country Statistics {data-height=400}
```{r}
country_stats <- chocolate_top %>%
group_by(company_location) %>%
summarise(
N = n(),
Mean = round(mean(rating), 2),
Median = median(rating),
SD = round(sd(rating), 2),
.groups = 'drop'
) %>%
arrange(desc(Median)) %>%
rename(Country = company_location)
kable(country_stats, align = c('l', rep('c', 4))) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = TRUE, font_size = 11) %>%
row_spec(0, bold = TRUE, background = "#2C1810", color = "white") %>%
row_spec(1, background = "#F4E4BC", bold = TRUE)
```
Scatterplot {data-icon="fa-chart-scatter-bubble"}
=====================================
Column {data-width=600}
-----------------------------------------------------------------------
### 🔵 Scatterplot: Cocoa Percentage vs Rating by Bean Origin Region {data-height=500}
```{r fig.height=5, fig.width=8}
# REQUIRED: Scatterplot for two numerical variables with categorical coloring
# Purpose: Examine cocoa % vs rating relationship, colored by bean origin
chocolate_scatter <- chocolate %>%
mutate(
bean_region = case_when(
country_of_bean_origin %in% c("Venezuela", "Ecuador", "Peru", "Colombia",
"Bolivia", "Brazil") ~ "South America",
country_of_bean_origin %in% c("Madagascar", "Tanzania", "Ghana", "Ivory Coast",
"Cameroon", "Nigeria", "Uganda", "Togo",
"Congo", "Sao Tome") ~ "Africa",
country_of_bean_origin %in% c("Dominican Republic", "Nicaragua", "Guatemala",
"Mexico", "Belize", "Costa Rica", "Honduras",
"Haiti", "Jamaica", "Trinidad") ~ "Central Am. & Caribbean",
country_of_bean_origin %in% c("Papua New Guinea", "Indonesia", "Philippines",
"Vietnam", "India", "Fiji", "Vanuatu") ~ "Asia-Pacific",
TRUE ~ "Other / Blend"
)
) %>%
filter(!is.na(cocoa_percent_num))
p_scatter <- ggplot(chocolate_scatter,
aes(x = cocoa_percent_num, y = rating, color = bean_region)) +
geom_jitter(alpha = 0.5, size = 2, width = 0.8, height = 0.03) +
geom_smooth(aes(group = 1), method = "loess", se = TRUE,
color = "#2C1810", fill = "#D7CCC8", alpha = 0.3, linewidth = 1.2) +
scale_color_manual(
values = c(
"South America" = "#43A047",
"Africa" = "#FB8C00",
"Central Am. & Caribbean" = "#1E88E5",
"Asia-Pacific" = "#8E24AA",
"Other / Blend" = "#78909C"
),
name = "Bean Origin Region"
) +
labs(
title = "Relationship Between Cocoa Percentage and Expert Rating",
subtitle = "Points colored by bean origin region | LOESS curve shows overall trend",
x = "Cocoa Percentage (%)",
y = "Expert Rating Score",
caption = "Higher cocoa % doesn't guarantee higher ratings"
) +
# Limits leave a small margin so jittered points at 100% cocoa or a 1.0 rating are not dropped
scale_x_continuous(breaks = seq(40, 100, 10), limits = c(38, 102)) +
scale_y_continuous(breaks = seq(1, 5, 0.5), limits = c(0.9, 4.5)) +
theme_premium() +
theme(
legend.position = "bottom",
legend.box = "horizontal"
) +
guides(color = guide_legend(nrow = 2, override.aes = list(size = 3.5, alpha = 0.9)))
p_scatter
```
### 📝 Scatterplot Interpretation {data-height=250}
This scatterplot explores the relationship between **cocoa percentage** and **expert ratings**. The LOESS curve shows that ratings **peak around 65-75% cocoa**, then decline at higher percentages. The weak negative linear correlation (**Pearson r = `r round(cor(chocolate$cocoa_percent_num, chocolate$rating, use="complete.obs"), 3)`**) indicates that cocoa content alone explains little of the variation in ratings. Because Pearson's r only measures straight-line association, it understates the curved pattern the LOESS line picks up.
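
The gap between a weak linear correlation and a visible peak can be illustrated with a small example. The sketch below uses made-up values (`cocoa_demo`, `rating_demo`) shaped like an inverted U. It is not the dashboard's data, and the quadratic model is only one simple way to capture curvature.

```r
# Illustrative data only: ratings rise toward ~70% cocoa, then fall
cocoa_demo  <- c(55, 60, 65, 70, 75, 80, 85, 90, 100)
rating_demo <- c(3.0, 3.1, 3.25, 3.3, 3.25, 3.1, 3.0, 2.9, 2.5)

r_demo  <- cor(cocoa_demo, rating_demo)  # Pearson correlation
r2_demo <- r_demo^2                      # variance explained by a straight line

# A quadratic term captures the peak that a straight line misses
fit_linear    <- lm(rating_demo ~ cocoa_demo)
fit_quadratic <- lm(rating_demo ~ poly(cocoa_demo, 2))
r2_linear     <- summary(fit_linear)$r.squared
r2_quadratic  <- summary(fit_quadratic)$r.squared
```

For a one-predictor linear model, R-squared equals the squared Pearson correlation, which is why the "variance explained" figure can be read directly from r.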
Column {data-width=400}
-----------------------------------------------------------------------
### 📊 Correlation Analysis {data-height=200}
```{r}
cor_val <- round(cor(chocolate$cocoa_percent_num, chocolate$rating, use = "complete.obs"), 3)
```
<div class="stat-highlight" style="text-align: center; padding: 15px; background: linear-gradient(135deg, #FFF8E1 0%, #FFFFFF 100%); border-radius: 12px; border-left: 4px solid #FFC107;">
<div style="font-size: 2.5rem; font-weight: bold; color: #2C1810; font-family: 'Playfair Display', serif;">
r = `r cor_val`
</div>
<div style="font-size: 0.9rem; color: #5D4037; margin-top: 5px;">
⚠️ <strong>Weak Negative Correlation</strong>
</div>
<div style="font-size: 0.8rem; color: #8D6E63; margin-top: 8px;">
A straight-line fit on cocoa % explains only `r round(cor_val^2 * 100, 1)`% of rating variance
</div>
</div>
### 🌍 Statistics by Bean Region {data-height=280}
```{r}
region_stats <- chocolate_scatter %>%
group_by(bean_region) %>%
summarise(
Count = n(),
`Avg Rating` = round(mean(rating), 2),
`Avg Cocoa` = paste0(round(mean(cocoa_percent_num), 0), "%"),
.groups = 'drop'
) %>%
arrange(desc(`Avg Rating`)) %>%
rename(Region = bean_region)
kable(region_stats, align = c('l', 'c', 'c', 'c')) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = TRUE, font_size = 11) %>%
row_spec(0, bold = TRUE, background = "#5D4037", color = "white")
```
### 💡 Key Findings {data-height=270}
**Insights:**
- **Weak Correlation** (r = `r cor_val`): Cocoa % alone is a poor linear predictor of rating
- **Sweet Spot**: Ratings peak around 65-75% cocoa
- **Diminishing Returns**: Very dark chocolate (>85%) scores lower
- **Regional Consistency**: Average ratings differ little across bean regions (see table above)
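
The "sweet spot" and "diminishing returns" points can be checked by averaging ratings within cocoa bands. A minimal sketch with made-up values (`cocoa_demo`, `rating_demo`), using the same break points as the cocoa-range boxplot:

```r
# Illustrative data only (not taken from the chocolate dataset)
cocoa_demo  <- c(55, 60, 65, 70, 72, 75, 85, 90, 100)
rating_demo <- c(3.0, 3.0, 3.25, 3.5, 3.25, 3.25, 3.0, 2.75, 2.5)

# Intervals are right-closed, so 70 lands in "60-70", not "70-80"
cocoa_bin <- cut(cocoa_demo,
                 breaks = c(40, 60, 70, 80, 100),
                 labels = c("40-60", "60-70", "70-80", "80-100"),
                 include.lowest = TRUE)

# Mean rating per cocoa band
bin_means <- tapply(rating_demo, cocoa_bin, mean)
best_bin <- names(which.max(bin_means))
```

Because 70% is the most common formulation, which band it falls into has a large effect on the per-band averages.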
Interactive {data-icon="fa-hand-pointer"}
=====================================
Column {data-width=600}
-----------------------------------------------------------------------
### 🖱️ Interactive Visualization: Explore Each Chocolate Bar (plotly) {data-height=600}
```{r fig.height=6, fig.width=8}
# REQUIRED: Interactive plotly object for data exploration
# Purpose: Interactive scatterplot built with native plot_ly() instead of a ggplotly() conversion
# OPTIMIZED: WebGL rendering (scattergl) for better performance with large datasets
# Sample data for better performance (stratified by rating category)
set.seed(42)
chocolate_sample <- chocolate %>%
group_by(rating_category) %>%
slice_sample(prop = 0.4) %>% # Sample 40% from each category
ungroup()
# Create optimized plotly chart directly (faster than ggplotly conversion)
p_interactive <- plot_ly(
data = chocolate_sample,
x = ~jitter(cocoa_percent_num, amount = 1),
y = ~jitter(rating, amount = 0.03),
color = ~rating_category,
colors = rating_colors,
type = 'scattergl', # WebGL for better performance
mode = 'markers',
marker = list(size = 7, opacity = 0.6),
hoverinfo = 'text',
text = ~paste0(
"<b>", company_manufacturer, "</b>",
"<br>📍 ", company_location,
"<br>🌱 ", country_of_bean_origin,
"<br>🍫 ", cocoa_percent,
"<br>⭐ ", rating, " (", rating_category, ")"
)
) %>%
layout(
title = list(
text = "Interactive Explorer: Cocoa % vs Rating",
font = list(family = "Playfair Display", size = 16, color = "#2C1810")
),
xaxis = list(
title = list(text = "Cocoa Percentage (%)", standoff = 15),
tickvals = seq(40, 100, 10),
ticktext = paste0(seq(40, 100, 10), "%"),
tickfont = list(size = 12, color = "#5D4037"),
gridcolor = "#EFEBE9",
zerolinecolor = "#D7CCC8",
range = c(38, 102)
),
yaxis = list(
title = "Expert Rating",
range = c(1, 4.5),
tickfont = list(size = 12, color = "#5D4037"),
gridcolor = "#EFEBE9",
zerolinecolor = "#D7CCC8"
),
legend = list(
orientation = "h",
y = -0.18,
x = 0.5,
xanchor = "center",
font = list(size = 10),
bgcolor = "rgba(255,255,255,0.9)"
),
hoverlabel = list(
bgcolor = "white",
bordercolor = "#5D4037",
font = list(family = "Arial", size = 12, color = "#2C1810")
),
paper_bgcolor = "white",
plot_bgcolor = "white",
autosize = TRUE,
margin = list(l = 60, r = 30, t = 40, b = 90)
) %>%
config(
displayModeBar = TRUE,
modeBarButtonsToRemove = c("lasso2d", "select2d", "autoScale2d"),
displaylogo = FALSE
)
p_interactive
```
### 📝 Interactive Features {data-height=150}
Built with **native Plotly** and **WebGL** rendering for performance. The chart shows a stratified random sample (40% of each rating category), not the full dataset. **Hover** for details, and use the toolbar to **zoom/pan/download**.
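
The stratified sampling step can be sketched in base R as follows. This is an analogue of the grouped `slice_sample(prop = 0.4)` call in the chunk above, not the code the dashboard runs. `bars_demo` and `sample_stratified()` are hypothetical names used only for illustration.

```r
set.seed(42)  # fixed seed so the sample is reproducible

# Illustrative data: 10 "Recommended" and 5 "Outstanding" bars
bars_demo <- data.frame(
  id = 1:15,
  rating_category = rep(c("Recommended", "Outstanding"), times = c(10, 5))
)

# Draw a fixed proportion of rows within each group (rounded down)
sample_stratified <- function(df, group_col, prop) {
  groups <- split(df, df[[group_col]])
  sampled <- lapply(groups, function(g) {
    g[sample(nrow(g), size = floor(nrow(g) * prop)), , drop = FALSE]
  })
  do.call(rbind, sampled)
}

bars_sample <- sample_stratified(bars_demo, "rating_category", prop = 0.4)
sample_counts <- table(bars_sample$rating_category)
```

Sampling within each category keeps the category mix of the plotted points close to that of the full dataset, which a simple random sample does not guarantee for small categories.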
Column {data-width=400}
-----------------------------------------------------------------------
### 📖 How to Use This Chart {data-height=280}
**Hover** over points to see: Manufacturer name, location, bean origin, cocoa %, rating & category.
**Toolbar Options:** Download PNG, Zoom in/out, Pan, Reset view.
**Color Legend:**
| Color | Category |
|:------|:---------|
| 🟢 Green | Outstanding / Highly Recommended |
| 🟠 Orange | Recommended |
| 🔴 Red | Disappointing / Unpleasant |
### ⭐ Rating Distribution {data-height=470}
```{r fig.height=4}
cat_dist <- chocolate %>% count(rating_category) %>%
mutate(pct = round(n/sum(n)*100, 1))
plot_ly(cat_dist, labels = ~rating_category, values = ~n, type = 'pie',
textposition = 'inside', textinfo = 'percent',
marker = list(colors = unname(rating_colors),
line = list(color = '#FFFFFF', width = 2)),
hoverinfo = 'label+value+percent',
height = 320) %>%
layout(showlegend = TRUE,
legend = list(orientation = 'h', y = -0.1, x = 0.5, xanchor = 'center',
font = list(size = 10)),
margin = list(t = 10, b = 40, l = 10, r = 10)) %>%
config(displayModeBar = FALSE)
```
Data {data-icon="fa-table"}
=====================================
Column {data-width=1000 .tabset}
-----------------------------------------------------------------------
### 🔍 Complete Dataset Explorer
```{r}
chocolate_display <- chocolate %>%
select(
Manufacturer = company_manufacturer,
Location = company_location,
Year = review_date,
`Bean Origin` = country_of_bean_origin,
`Bar Name` = specific_bean_origin_or_bar_name,
`Cocoa` = cocoa_percent,
Rating = rating,
Category = rating_category,
Flavors = most_memorable_characteristics
)
datatable(chocolate_display,
filter = 'top',
extensions = 'Buttons',
fillContainer = TRUE,
options = list(
pageLength = 25,
scrollY = "calc(100vh - 280px)",
scrollCollapse = TRUE,
paging = FALSE,
scrollX = FALSE,
autoWidth = TRUE,
dom = 'Bfrti',
buttons = c('copy', 'csv', 'excel'),
columnDefs = list(
list(width = 'auto', targets = c(0, 1, 3, 4, 8)),
list(width = '55px', targets = c(2, 5, 6)),
list(width = 'auto', targets = 7)
),
language = list(
search = "🔍 Search:",
info = "Showing _START_ to _END_ of _TOTAL_ chocolate bars"
)
),
rownames = FALSE,
class = 'cell-border stripe hover compact',
width = '100%') %>%
formatStyle('Rating',
background = styleColorBar(range(chocolate$rating), '#8D6E63'),
backgroundSize = '95% 70%',
backgroundRepeat = 'no-repeat',
backgroundPosition = 'center') %>%
formatStyle('Category',
backgroundColor = styleEqual(
names(rating_colors),
c("#C8E6C9", "#DCEDC8", "#FFF9C4", "#FFCCBC", "#FFCDD2")
),
fontWeight = 'bold')
```
### ℹ️ Usage Tips
**How to use this table:**
- 🔍 **Filter:** Use the search boxes under each column header
- ↕️ **Sort:** Click on column headers to sort
- 📋 **Export:** Use Copy, CSV, or Excel buttons
- 📜 **Scroll:** Scroll within the table to see all `r nrow(chocolate)` records
**Column Guide:**
| Column | Description |
|:-------|:------------|
| Manufacturer | Company that made the chocolate |
| Location | Country where company is based |
| Year | When the review was conducted |
| Bean Origin | Where cocoa beans came from |
| Bar Name | Specific bean origin or bar name |
| Cocoa | Percentage of cocoa content |
| Rating | Expert score (1.0 - 5.0) |
| Category | Rating classification |
| Flavors | Tasting notes from experts |
References {data-icon="fa-book"}
=====================================
Column {data-width=500}
-----------------------------------------------------------------------
### 📚 Data Sources & Methodology
<div class="info-card">
**Primary Data Source:**
📊 **TidyTuesday - Chocolate Bar Ratings (2022-01-18)**
| Attribute | Details |
|:----------|:--------|
| Repository | [github.com/rfordatascience/tidytuesday](https://github.com/rfordatascience/tidytuesday/tree/master/data/2022/2022-01-18) |
| Direct CSV | [chocolate.csv](https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-01-18/chocolate.csv) |
| Records | 2,530 chocolate bar reviews |
| Time Span | 2006 - 2021 |
| Variables | 10 original columns |
**Original Data Provider:**
🍫 **Flavors of Cacao** - [flavorsofcacao.com](http://flavorsofcacao.com/chocolate_database.html)
- Compiled by: **Manhattan Chocolate Society**
- Rating methodology: Blind tasting by certified chocolate experts
- Scale: 1.0 (unpleasant) to 5.0 (elite/outstanding)
**Data Collection Method:**
Expert tasters evaluate chocolate bars on texture, flavor complexity, finish, and overall impression. Each bar is rated independently without brand knowledge.
</div>
### 📦 R Packages Used
```{r}
packages_df <- tibble(
Package = c("flexdashboard", "tidyverse", "ggplot2", "plotly", "DT", "knitr", "kableExtra", "scales"),
Version = c(
as.character(packageVersion("flexdashboard")),
as.character(packageVersion("tidyverse")),
as.character(packageVersion("ggplot2")),
as.character(packageVersion("plotly")),
as.character(packageVersion("DT")),
as.character(packageVersion("knitr")),
as.character(packageVersion("kableExtra")),
as.character(packageVersion("scales"))
),
Purpose = c(
"Dashboard framework & layout",
"Data wrangling (dplyr, tidyr, readr)",
"Grammar of graphics visualizations",
"Interactive charts with WebGL",
"Interactive searchable data tables",
"Dynamic report generation",
"Advanced table formatting",
"Axis & label formatting"
),
Citation = c(
"Iannone et al. (2024)",
"Wickham et al. (2019)",
"Wickham (2016)",
"Sievert (2020)",
"Xie et al. (2024)",
"Xie (2024)",
"Zhu (2024)",
"Wickham & Seidel (2022)"
)
)
kable(packages_df, align = c('l', 'c', 'l', 'l')) %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = TRUE, font_size = 11) %>%
row_spec(0, bold = TRUE, background = "#2C1810", color = "white")
```
Column {data-width=500}
-----------------------------------------------------------------------
### 🔗 Documentation & Tutorials
<div class="info-card">
**Official Documentation:**
| Resource | URL | Purpose |
|:---------|:----|:--------|
| 📖 Flexdashboard | [pkgs.rstudio.com/flexdashboard](https://pkgs.rstudio.com/flexdashboard/) | Dashboard layouts & components |
| 📊 Plotly R | [plotly.com/r](https://plotly.com/r/) | Interactive visualizations |
| 🎨 ggplot2 | [ggplot2.tidyverse.org](https://ggplot2.tidyverse.org/) | Static graphics reference |
| 📋 DT Package | [rstudio.github.io/DT](https://rstudio.github.io/DT/) | DataTables integration |
| 📚 kableExtra | [haozhu233.github.io/kableExtra](https://haozhu233.github.io/kableExtra/) | Table styling |
**Books & Learning Resources:**
1. Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). *R for Data Science* (2nd ed.). [r4ds.hadley.nz](https://r4ds.hadley.nz/)
2. Sievert, C. (2020). *Interactive Web-Based Data Visualization with R, plotly, and shiny*. Chapman & Hall/CRC. [plotly-r.com](https://plotly-r.com/)
3. Wilke, C. O. (2019). *Fundamentals of Data Visualization*. O'Reilly. [clauswilke.com/dataviz](https://clauswilke.com/dataviz/)
4. Healy, K. (2018). *Data Visualization: A Practical Introduction*. Princeton University Press.
**Course Materials:**
- MIS029 Data Visualization lecture notes
- Flexdashboard layout examples: [pkgs.rstudio.com/flexdashboard/articles/layouts.html](https://pkgs.rstudio.com/flexdashboard/articles/layouts.html)
</div>
### 🔄 Session Information
```{r}
session_df <- tibble(
Property = c("R Version", "Platform", "Operating System", "Locale", "Date Generated", "Timezone"),
Value = c(
paste(R.version$major, R.version$minor, sep = "."),
R.version$platform,
paste(Sys.info()["sysname"], Sys.info()["release"]),
Sys.getlocale("LC_TIME"),
format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
Sys.timezone()
)
)
kable(session_df, align = c('l', 'l')) %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = TRUE, font_size = 11) %>%
row_spec(0, bold = TRUE, background = "#455A64", color = "white")
```
### 📋 Reproducibility & License
<div class="info-card" style="border-left-color: #FF9800;">
**To reproduce this analysis:**
```r
# 1. Install required packages
install.packages(c("flexdashboard", "tidyverse",
"plotly", "DT", "knitr",
"kableExtra", "scales"))
# 2. Render the dashboard
rmarkdown::render("MertEfeKurt_2307071061_Final.Rmd")
```
⚠️ **Requirements:**
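- R version ≥ 4.0.0, with ggplot2 ≥ 3.4.0 (the plots use the `linewidth` argument) and dplyr ≥ 1.0.0 (for `slice_sample()`)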
- Internet connection (data fetched from GitHub)
- ~500 MB RAM for rendering
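
Before rendering, the software requirements can be checked with a short script. This is a sketch under stated assumptions: `missing_packages()` is a hypothetical helper, not part of the dashboard, and it only tests whether each package is installed, not its version.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

required <- c("flexdashboard", "tidyverse", "plotly", "DT",
              "knitr", "kableExtra", "scales")

r_ok <- getRversion() >= "4.0.0"
to_install <- missing_packages(required)

if (!r_ok) message("R 4.0.0 or newer is required; found ", getRversion())
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
}
```

Passing `to_install` to `install.packages()` would then install only what is missing.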
**Data License:** TidyTuesday data is released under CC0 1.0 Universal license.
---
**Dashboard Author:** Mert Efe Kurt (2307071061)
**Course:** MIS029 - Data Visualization
**Project Type:** Final Project Submission
**Generated:** `r format(Sys.time(), "%B %d, %Y at %H:%M")`
</div>