About

📌 Project Information

📚 Course Details

Course Code	MIS029
Course Name	Data Visualization
Project Type	Final Project Dashboard

👤 Student Information

Name & Surname	Mert Efe Kurt
Student Number	2307071061
Submission Date	15 January 2026

📊 Dataset Overview

Dataset	Chocolate Bar Ratings
Source	TidyTuesday 2022-01-18
Observations	2,530
Variables	10 original + 3 derived
Period	2006 - 2021

📖 About This Dataset

Expert ratings of over 2,500 chocolate bars from around the world, compiled by the Manhattan Chocolate Society via Flavors of Cacao.

🌍 67 countries | 🏭 580 manufacturers | 🌱 62 bean origins

Rating Scale:

Score	Category
4.0 - 5.0	🏆 Outstanding
3.5 - 3.99	⭐ Highly Recommended
3.0 - 3.49	✓ Recommended
2.5 - 2.99	⚠️ Disappointing
1.0 - 2.49	✗ Unpleasant
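
A minimal sketch of how the scale above could map a score to its label, using base R `cut()` with left-closed intervals. The helper name `label_rating` is hypothetical and exists only for this illustration.

```r
# Hypothetical helper: bin a numeric rating into the five categories above.
label_rating <- function(rating) {
  cut(
    rating,
    breaks = c(-Inf, 2.5, 3, 3.5, 4, Inf),
    labels = c("Unpleasant", "Disappointing", "Recommended",
               "Highly Recommended", "Outstanding"),
    right = FALSE  # intervals are [lower, upper), so 3.5 is "Highly Recommended"
  )
}

# One example score per category, lowest to highest.
as.character(label_rating(c(2.25, 2.75, 3.25, 3.5, 4)))
```

Left-closed intervals keep each boundary score (2.5, 3.0, 3.5, 4.0) in the higher category, which matches the lower bounds listed in the table.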

📋 Variable Names and Types

#	Variable	Type	Description
1	ref	Numeric	Unique reference ID
2	company_manufacturer	Character	Manufacturer name
3	company_location	Character	Manufacturer country
4	review_date	Numeric	Review year
5	country_of_bean_origin	Character	Bean origin country
6	specific_bean_origin_or_bar_name	Character	Bean variety/bar name
7	cocoa_percent	Character	Cocoa % (text)
8	ingredients	Character	Ingredient codes
9	most_memorable_characteristics	Character	Flavor notes
10	rating	Numeric	Rating (1-5)
11	cocoa_percent_num	Numeric	Cocoa % [Derived]
12	num_ingredients	Numeric	Ingredient count [Derived]
13	rating_category	Factor	Rating label [Derived]

🔍 Data Structure Preview

Metric	Value
Rows (Observations)	2,530
Columns (Variables)	13
Memory Size	611.5 Kb
Complete Cases	2,443
Missing Values	174

Summary

📊 Summary Statistics for Numeric Variables

Variable	N	Mean	Median	SD	Min	Max
Cocoa Percentage (%)	2530	71.64	70.00	5.62	42	100
Number of Ingredients	2443	3.04	3.00	0.91	1	6
Rating Score	2530	3.20	3.25	0.45	1	4
Review Year	2530	2014.37	2015.00	3.97	2006	2021

📍 Frequency Table: Top Manufacturing Countries

Country	Count	Pct	Cum.Pct
U.S.A.	1136	44.9	44.9
Canada	177	7.0	51.9
France	176	7.0	58.9
U.K.	133	5.3	64.1
Italy	78	3.1	67.2
Belgium	63	2.5	69.7
Ecuador	58	2.3	72.0
Australia	53	2.1	74.1
Switzerland	44	1.7	75.8
Germany	42	1.7	77.5
Spain	36	1.4	78.9
Denmark	31	1.2	80.1

🌱 Frequency Table: Top Bean Origin Countries

Bean Origin	Count	Pct	Cum.Pct
Venezuela	253	10.0	10.0
Peru	244	9.6	19.6
Dominican Republic	226	8.9	28.6
Ecuador	219	8.7	37.2
Madagascar	177	7.0	44.2
Blend	156	6.2	50.4
Nicaragua	100	4.0	54.3
Bolivia	80	3.2	57.5
Colombia	79	3.1	60.6
Tanzania	79	3.1	63.8
Brazil	78	3.1	66.8
Belize	76	3.0	69.8

⭐ Frequency Table: Rating Categories

Category	Count	Pct	Cum.Pct
Outstanding	112	4.4	4.4
Highly Recommended	865	34.2	38.6
Recommended	987	39.0	77.6
Disappointing	499	19.7	97.4
Unpleasant	67	2.6	100.0
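
A small sketch of how the cumulative column can be derived from the counts in the table above. It cumulates the raw counts and rounds once at the end, and, for comparison, also sums the already-rounded percentages, which can finish at 99.9 instead of 100. The variable names are illustrative.

```r
# Counts copied from the Rating Categories table above.
counts <- c(
  "Outstanding"        = 112,
  "Highly Recommended" = 865,
  "Recommended"        = 987,
  "Disappointing"      = 499,
  "Unpleasant"         = 67
)

pct <- round(counts / sum(counts) * 100, 1)

# Cumulate the raw counts first, then round once.
cum_pct <- round(cumsum(counts) / sum(counts) * 100, 1)

# For comparison: cumulating the already-rounded percentages.
cum_of_rounded <- cumsum(pct)

data.frame(Count = counts, Pct = pct, Cum.Pct = cum_pct,
           Cum.of.rounded = cum_of_rounded)
```

Rounding once after cumulating keeps the final row at exactly 100 whenever the categories cover every observation.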

🔢 Quick Stats

2,530 Total Reviews

Histogram

📊 Histogram: Distribution of Expert Chocolate Ratings

📝 Histogram Interpretation

The histogram shows a roughly bell-shaped distribution of ratings with a slight left skew. Most bars (over 60%) receive ratings between 3.0 and 3.5, a band that covers the “Recommended” category and the lower end of “Highly Recommended”. The mean (3.2) sits just below the median (3.25), which is consistent with that mild left skew.

🍫 Cocoa Percentage Distribution

📈 Key Statistics

Statistic	Rating	Cocoa %
Mean	3.2	71.6%
Median	3.25	70%
Std Dev	0.45	5.6%
Range	1 - 4	42 - 100%

Key Insights:

Mode rating is 3.5
75% of ratings fall between 2.75-3.5
70% cocoa is the most common formulation
Only 4.4% achieve “Outstanding” (4.0+)

Boxplot

📦 Multiple Boxplot: Rating Distribution by Manufacturing Country

📝 Boxplot Interpretation

This boxplot compares rating distributions across the top 10 chocolate-manufacturing countries. Australia shows the highest median rating (3.50) with the lowest variability (SD 0.41). Belgium (SD 0.66) and Ecuador (SD 0.55) vary the most. Median ratings range from 3.00 to 3.50, with most countries at 3.25. Gold diamonds (means) align closely with medians.

📊 Rating by Cocoa Range

📋 Country Statistics

Country	N	Mean	Median	SD
Australia	53	3.36	3.50	0.41
Canada	177	3.30	3.25	0.42
France	176	3.26	3.25	0.52
Germany	42	3.21	3.25	0.47
Italy	78	3.23	3.25	0.47
Switzerland	44	3.32	3.25	0.45
U.S.A.	1136	3.19	3.25	0.42
Belgium	63	3.10	3.00	0.66
Ecuador	58	3.04	3.00	0.55
U.K.	133	3.07	3.00	0.47
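
A hedged sketch of how a per-country table like the one above could be produced with dplyr. The column names `company_location` and `rating` come from the dataset. The toy rows are invented for illustration, and ordering by descending median is an assumption.

```r
library(dplyr)

# Toy data for illustration only; these are not rows from the real dataset.
toy <- tibble::tibble(
  company_location = c("A", "A", "A", "B", "B", "B"),
  rating           = c(3.00, 3.25, 3.50, 2.50, 3.00, 3.75)
)

country_stats <- toy %>%
  group_by(company_location) %>%
  summarise(
    N      = n(),
    Mean   = round(mean(rating), 2),
    Median = median(rating),
    SD     = round(sd(rating), 2),
    .groups = "drop"
  ) %>%
  arrange(desc(Median))  # assumed ordering: highest median first

country_stats
```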

Scatterplot

🔵 Scatterplot: Cocoa Percentage vs Rating by Bean Origin Region

📝 Scatterplot Interpretation

This scatterplot explores the relationship between cocoa percentage and expert ratings. The LOESS curve reveals that ratings peak around 65-75% cocoa, then decline at higher percentages. The weak negative correlation (r = -0.147) indicates cocoa content alone doesn’t determine quality.

📊 Correlation Analysis

r = -0.147

⚠️ Weak Negative Correlation

Cocoa % explains only 2.2% of rating variance
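
The 2.2% figure follows from squaring the correlation coefficient. A short worked check, using only the value of r reported above:

```r
r <- -0.147                         # Pearson correlation reported above
r_squared <- r^2                    # share of variance explained by a linear fit
pct_explained <- round(r_squared * 100, 1)
pct_explained
```

Because r is squared, the sign of the correlation does not affect the share of variance explained.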

🌍 Statistics by Bean Region

Region	Count	Avg Rating	Avg Cocoa
Central Am. & Caribbean	683	3.22	72%
Africa	357	3.21	71%
Asia-Pacific	231	3.21	71%
South America	953	3.20	72%
Other / Blend	306	3.08	71%

💡 Key Findings

Insights:

Weak Correlation (r = -0.147): Cocoa % has minimal impact
Sweet Spot: 65-75% cocoa achieves highest scores
Diminishing Returns: Very dark chocolate (>85%) scores lower
Regional Consistency: Single-origin regions average 3.20-3.22; Other / Blend is lower at 3.08

Interactive

🖱️ Interactive Visualization: Explore Each Chocolate Bar (ggplotly)

📝 Interactive Features

Native Plotly with WebGL for optimal performance. Stratified sample (40% per category). Hover for details, use toolbar to zoom/pan/download.
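
One way a 40% stratified sample per category could be drawn with dplyr is sketched below. The dashboard's own sampling code is not shown in this section, so treat this as an assumption about the approach. The toy data and the seed are invented for illustration.

```r
library(dplyr)

# Toy data for illustration only: 10 bars in one category, 5 in another.
toy <- tibble::tibble(
  bar_id          = 1:15,
  rating_category = rep(c("Recommended", "Outstanding"), times = c(10, 5))
)

set.seed(42)  # arbitrary seed so the sample is reproducible

stratified <- toy %>%
  group_by(rating_category) %>%
  slice_sample(prop = 0.4) %>%  # keep 40% of the rows within each category
  ungroup()

count(stratified, rating_category)
```

Sampling within each group preserves the relative size of the categories while reducing the number of points the browser has to draw.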

📖 How to Use This Chart

Hover over points to see: Manufacturer name, location, bean origin, cocoa %, rating & category.

Toolbar Options: Download PNG, Zoom in/out, Pan, Reset view.

Color Legend:

Color	Category
🟢 Green	Outstanding / Highly Recommended
🟡 Orange	Recommended
🔴 Red	Disappointing / Unpleasant

⭐ Rating Distribution

Data

🔍 Complete Dataset Explorer

ℹ️ Usage Tips

How to use this table:

🔍 Filter: Use the search boxes under each column header
↕️ Sort: Click on column headers to sort
📋 Export: Use Copy, CSV, or Excel buttons
📜 Scroll: Scroll within the table to see all 2530 records
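
A minimal sketch of a DT table offering the features listed above (column filters, sorting, export buttons, scrolling). The exact options used by this dashboard are not shown here, so the settings below are assumptions. The toy rows are invented for illustration.

```r
library(DT)

# Toy rows for illustration only; not taken from the real dataset.
toy <- data.frame(
  Manufacturer = c("Maker A", "Maker B"),
  Rating       = c(3.25, 3.75)
)

tbl <- datatable(
  toy,
  filter     = "top",       # per-column search boxes under the header
  extensions = "Buttons",   # enables the export buttons
  rownames   = FALSE,
  options = list(
    dom     = "Bfrtip",     # B = buttons, f = search, r = processing, t = table, i = info, p = paging
    buttons = c("copy", "csv", "excel"),
    scrollY = "400px"       # scroll inside the table body
  )
)

tbl
```

Sorting by clicking a column header is DataTables' default behavior and needs no extra option.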

Column Guide:

Column	Description
Manufacturer	Company that made the chocolate
Location	Country where company is based
Year	When the review was conducted
Bean Origin	Where cocoa beans came from
Cocoa	Percentage of cocoa content
Rating	Expert score (1.0 - 5.0)
Category	Rating classification
Flavors	Tasting notes from experts

References

📚 Data Sources & Methodology

Primary Data Source:

📊 TidyTuesday - Chocolate Bar Ratings (2022-01-18)

Attribute	Details
Repository	github.com/rfordatascience/tidytuesday
Direct CSV	chocolate.csv
Records	2,530 chocolate bar reviews
Time Span	2006 - 2021
Variables	10 original columns

Original Data Provider:

🍫 Flavors of Cacao - flavorsofcacao.com

Compiled by: Manhattan Chocolate Society
Rating methodology: Blind tasting by certified chocolate experts
Scale: 1.0 (unpleasant) to 5.0 (elite/outstanding)

Data Collection Method:

Expert tasters evaluate chocolate bars on texture, flavor complexity, finish, and overall impression. Each bar is rated independently without brand knowledge.

📦 R Packages Used

Package	Version	Purpose	Citation
flexdashboard	0.6.2	Dashboard framework & layout	Iannone et al. (2024)
tidyverse	2.0.0	Data wrangling (dplyr, tidyr, readr)	Wickham et al. (2019)
ggplot2	4.0.1	Grammar of graphics visualizations	Wickham (2016)
plotly	4.11.0	Interactive charts with WebGL	Sievert (2020)
DT	0.34.0	Interactive searchable data tables	Xie et al. (2024)
knitr	1.49	Dynamic report generation	Xie (2024)
kableExtra	1.4.0	Advanced table formatting	Zhu (2024)
scales	1.4.0	Axis & label formatting	Wickham & Seidel (2022)

🔗 Documentation & Tutorials

Official Documentation:

Resource	URL	Purpose
📖 Flexdashboard	pkgs.rstudio.com/flexdashboard	Dashboard layouts & components
📊 Plotly R	plotly.com/r	Interactive visualizations
🎨 ggplot2	ggplot2.tidyverse.org	Static graphics reference
📋 DT Package	rstudio.github.io/DT	DataTables integration
📚 kableExtra	haozhu233.github.io/kableExtra	Table styling

Books & Learning Resources:

Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for Data Science (2nd ed.). r4ds.hadley.nz
Sievert, C. (2020). Interactive Web-Based Data Visualization with R, plotly, and shiny. Chapman & Hall/CRC. plotly-r.com
Wilke, C. O. (2019). Fundamentals of Data Visualization. O’Reilly. clauswilke.com/dataviz
Healy, K. (2018). Data Visualization: A Practical Introduction. Princeton University Press.

Course Materials:

MIS029 Data Visualization lecture notes
Flexdashboard layout examples: pkgs.rstudio.com/flexdashboard/articles/layouts.html

🔄 Session Information

Property	Value
R Version	4.4.3
Platform	aarch64-apple-darwin20
Operating System	Darwin 25.1.0
Locale	en_US.UTF-8
Date Generated	2026-01-15 19:09:27
Timezone	Europe/Istanbul

📋 Reproducibility & License

To reproduce this analysis:

# 1. Install required packages
install.packages(c("flexdashboard", "tidyverse", 
                   "plotly", "DT", "knitr", 
                   "kableExtra", "scales"))

# 2. Render the dashboard
rmarkdown::render("MertEfeKurt_2307071061_Final.Rmd")

⚠️ Requirements:

R version ≥ 4.0.0
Internet connection (data fetched from GitHub)
~500 MB RAM for rendering
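
Before rendering, the package and R version requirements above can be checked with a few lines of base R. This is a sketch only. The package list is copied from the install step, and the variable names are illustrative.

```r
# Packages named in the install step above.
required <- c("flexdashboard", "tidyverse", "plotly", "DT",
              "knitr", "kableExtra", "scales")

# requireNamespace() returns FALSE (without attaching) when a package is absent.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

# The stated minimum R version is 4.0.0.
r_ok <- getRversion() >= "4.0.0"

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}
if (!r_ok) {
  message("R ", getRversion(), " is older than 4.0.0")
}
```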

Data License: TidyTuesday data is released under CC0 1.0 Universal license.

Dashboard Author: Mert Efe Kurt (2307071061)

Course: MIS029 - Data Visualization

Institution: Final Project Submission

Generated: January 15, 2026 at 19:09

---
title: "🍫 Chocolate Bar Ratings Analysis"
author: "Mert Efe Kurt | 2307071061"
output: 
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: scroll
    theme: 
      version: 4
      bootswatch: cosmo
    navbar:
      - { title: "MIS029 Final Project", align: right }
    source_code: embed
---

```{css, echo=FALSE}
/* ===== PREMIUM CHOCOLATE THEME ===== */

/* Import elegant fonts */
@import url('https://fonts.googleapis.com/css2?family=Playfair+Display:wght@400;600;700&family=Source+Sans+Pro:wght@300;400;600&display=swap');

/* Root variables for consistent theming */
:root {
  --chocolate-dark: #2C1810;
  --chocolate-medium: #5D4037;
  --chocolate-light: #8D6E63;
  --chocolate-cream: #D7CCC8;
  --chocolate-milk: #EFEBE9;
  --gold-accent: #D4AF37;
  --gold-light: #F4E4BC;
  --success-green: #4CAF50;
  --warning-orange: #FF9800;
  --danger-red: #E53935;
}

/* Prevent horizontal overflow - CRITICAL */
html, body {
  overflow-x: hidden !important;
  max-width: 100% !important;
}

/* Global body styling */
body {
  font-family: 'Source Sans Pro', -apple-system, BlinkMacSystemFont, sans-serif;
  background: linear-gradient(135deg, #FAFAFA 0%, #F5F5F5 100%);
  color: var(--chocolate-dark);
  width: 100%;
  box-sizing: border-box;
}

/* Ensure all containers respect boundaries */
* {
  box-sizing: border-box;
}

/* Main container */
.container-fluid, .row {
  max-width: 100% !important;
  overflow-x: hidden !important;
}

/* Navbar styling */
.navbar {
  background: linear-gradient(135deg, var(--chocolate-dark) 0%, var(--chocolate-medium) 100%) !important;
  border-bottom: 3px solid var(--gold-accent) !important;
  box-shadow: 0 4px 20px rgba(44, 24, 16, 0.3);
}

.navbar-brand {
  font-family: 'Playfair Display', serif !important;
  font-weight: 700 !important;
  font-size: 1.5rem !important;
  color: var(--gold-light) !important;
  text-shadow: 1px 1px 2px rgba(0,0,0,0.3);
}

.navbar-nav > li > a {
  color: var(--chocolate-cream) !important;
  font-weight: 500;
  transition: all 0.3s ease;
}

.navbar-nav > li > a:hover {
  color: var(--gold-accent) !important;
  background: rgba(212, 175, 55, 0.15) !important;
}

.navbar-nav > .active > a {
  background: rgba(212, 175, 55, 0.25) !important;
  color: var(--gold-accent) !important;
  border-bottom: 2px solid var(--gold-accent);
}

/* Page titles */
.section.level3 > h3 {
  font-family: 'Playfair Display', serif;
  color: var(--chocolate-dark);
  font-weight: 600;
  border-bottom: 2px solid var(--gold-accent);
  padding-bottom: 8px;
  margin-bottom: 15px;
}

/* Chart containers */
.chart-wrapper {
  background: white;
  border-radius: 12px;
  box-shadow: 0 4px 15px rgba(44, 24, 16, 0.08);
  padding: 15px;
  transition: transform 0.3s ease, box-shadow 0.3s ease;
  max-width: 100% !important;
  overflow: hidden !important;
}

.chart-wrapper:hover {
  transform: translateY(-2px);
  box-shadow: 0 8px 25px rgba(44, 24, 16, 0.12);
}

/* Chart stage - allow vertical scroll when needed */
.chart-stage {
  max-width: 100% !important;
  overflow-x: hidden !important;
  overflow-y: auto !important;
  width: 100% !important;
}

/* Plotly and ggplot containers */
.plotly, .plotly-container, .html-widget {
  max-width: 100% !important;
  overflow: hidden !important;
}

/* Value boxes */
.value-box {
  border-radius: 12px !important;
  box-shadow: 0 4px 15px rgba(44, 24, 16, 0.15) !important;
  transition: transform 0.3s ease !important;
}

.value-box:hover {
  transform: scale(1.02);
}

.value-box .value {
  font-family: 'Playfair Display', serif !important;
  font-size: 2.2rem !important;
  font-weight: 700 !important;
}

.value-box .caption {
  font-family: 'Source Sans Pro', sans-serif !important;
  font-size: 0.85rem !important;
  font-weight: 500 !important;
  text-transform: uppercase;
  letter-spacing: 0.5px;
}

/* Tables - prevent horizontal scroll and fix backgrounds */
.dataTable {
  font-size: 0.9rem !important;
  max-width: 100% !important;
  width: 100% !important;
  table-layout: auto !important;
  background-color: white !important;
}

.dataTables_wrapper {
  max-width: 100% !important;
  overflow-x: hidden !important;
  background-color: white !important;
}

.dataTables_scroll {
  max-width: 100% !important;
  overflow-x: hidden !important;
  background-color: white !important;
}

/* Fix white space issue in Data table - CRITICAL */
.dataTables_scrollBody {
  background-color: white !important;
  overflow-y: auto !important;
}

.dataTables_scrollHead {
  background-color: white !important;
}

table.dataTable {
  background-color: white !important;
}

table.dataTable tbody tr {
  background-color: white !important;
}

table.dataTable tbody tr:nth-child(odd) {
  background-color: #FAFAFA !important;
}

table.dataTable tbody tr:hover {
  background-color: var(--gold-light) !important;
}

/* Data page specific - fill container */
.chart-wrapper.html-fill-container {
  height: 100% !important;
  min-height: calc(100vh - 200px) !important;
}

/* Ensure DataTable fills its container */
.html-widget.html-fill-item {
  height: 100% !important;
  min-height: calc(100vh - 250px) !important;
}

.dataTable thead th {
  background: linear-gradient(135deg, var(--chocolate-dark), var(--chocolate-medium)) !important;
  color: white !important;
  font-weight: 600 !important;
  text-transform: uppercase;
  font-size: 0.8rem;
  letter-spacing: 0.5px;
  white-space: nowrap;
  overflow: hidden;
  text-overflow: ellipsis;
}

.dataTable tbody tr:hover {
  background-color: var(--gold-light) !important;
}

.dataTable tbody td {
  word-wrap: break-word;
  max-width: 200px;
  overflow: hidden;
  text-overflow: ellipsis;
}

/* Kable tables */
.table-striped > tbody > tr:nth-of-type(odd) {
  background-color: var(--chocolate-milk);
}

/* Custom card styling */
.info-card {
  background: linear-gradient(145deg, #FFFFFF 0%, var(--chocolate-milk) 100%);
  border-radius: 16px;
  padding: 25px;
  box-shadow: 0 8px 30px rgba(44, 24, 16, 0.1);
  border-left: 5px solid var(--gold-accent);
  margin-bottom: 20px;
}

.premium-card {
  background: linear-gradient(135deg, var(--chocolate-dark) 0%, var(--chocolate-medium) 100%);
  color: white;
  border-radius: 16px;
  padding: 30px;
  box-shadow: 0 10px 40px rgba(44, 24, 16, 0.25);
}

.premium-card h4 {
  color: var(--gold-accent);
  font-family: 'Playfair Display', serif;
  margin-bottom: 15px;
}

/* Interpretation boxes - MUST BE VISIBLE */
.interpretation-box {
  background: linear-gradient(135deg, var(--chocolate-milk) 0%, #FFFFFF 100%);
  border-radius: 12px;
  padding: 18px 20px;
  margin-top: 20px;
  margin-bottom: 15px;
  border-left: 4px solid var(--gold-accent);
  font-size: 0.92rem;
  line-height: 1.65;
  box-shadow: 0 2px 8px rgba(44, 24, 16, 0.08);
  position: relative;
  z-index: 10;
}

/* Ensure chart wrappers don't overflow */
.chart-shim {
  overflow: hidden !important;
  max-width: 100% !important;
}

.section.level3 {
  overflow-x: hidden !important;
  overflow-y: auto !important;
  padding-bottom: 10px;
  max-width: 100% !important;
}

/* Column sections */
.section {
  max-width: 100% !important;
  overflow-x: hidden !important;
}

/* Flexdashboard columns */
.flexdashboard-column {
  max-width: 100% !important;
  overflow-x: hidden !important;
}

/* Stats highlight */
.stat-highlight {
  background: linear-gradient(135deg, var(--gold-light) 0%, #FFFFFF 100%);
  border-radius: 10px;
  padding: 15px 20px;
  text-align: center;
  box-shadow: 0 4px 15px rgba(212, 175, 55, 0.2);
}

.stat-highlight .number {
  font-family: 'Playfair Display', serif;
  font-size: 2.5rem;
  font-weight: 700;
  color: var(--chocolate-dark);
}

.stat-highlight .label {
  font-size: 0.85rem;
  color: var(--chocolate-light);
  text-transform: uppercase;
  letter-spacing: 1px;
}

/* Gauge styling */
.gauge-container {
  background: white;
  border-radius: 12px;
  padding: 20px;
  text-align: center;
}

/* Custom scrollbar */
::-webkit-scrollbar {
  width: 8px;
  height: 8px;
}

::-webkit-scrollbar-track {
  background: var(--chocolate-milk);
  border-radius: 4px;
}

::-webkit-scrollbar-thumb {
  background: var(--chocolate-light);
  border-radius: 4px;
}

::-webkit-scrollbar-thumb:hover {
  background: var(--chocolate-medium);
}

/* Animation for page load */
@keyframes fadeInUp {
  from {
    opacity: 0;
    transform: translateY(20px);
  }
  to {
    opacity: 1;
    transform: translateY(0);
  }
}

.chart-stage {
  animation: fadeInUp 0.6s ease-out;
}

/* Responsive adjustments */
@media (max-width: 768px) {
  .value-box .value {
    font-size: 1.8rem !important;
  }
  .navbar-brand {
    font-size: 1.2rem !important;
  }
  .dataTable {
    font-size: 0.75rem !important;
  }
  .premium-card, .info-card {
    padding: 15px !important;
  }
}

/* Kable tables - responsive */
table.kable-table, .table {
  max-width: 100% !important;
  width: 100% !important;
  table-layout: auto !important;
  font-size: 0.9rem !important;
}

table.kable-table td, table.kable-table th,
.table td, .table th {
  word-wrap: break-word;
  overflow: hidden;
  text-overflow: ellipsis;
  padding: 6px 8px !important;
}

/* Ensure variable table fits all content */
.table-condensed td, .table-condensed th {
  padding: 4px 6px !important;
  font-size: 11px !important;
  line-height: 1.3 !important;
}

/* Ensure all images and plots fit */
img, svg, canvas {
  max-width: 100% !important;
  height: auto !important;
}

/* Info cards and premium cards */
.info-card, .premium-card {
  max-width: 100% !important;
  word-wrap: break-word;
  overflow-wrap: break-word;
  overflow-y: auto !important;
}

/* Ensure tables in cards are compact */
.premium-card table, .info-card table {
  margin-bottom: 8px !important;
  font-size: 0.9rem !important;
}

.premium-card h4, .info-card h4 {
  margin-bottom: 8px !important;
  font-size: 1rem !important;
}

/* Value boxes container */
.value-box-container {
  max-width: 100% !important;
}

/* Ensure proper spacing and no overflow in sections */
.section {
  padding-left: 10px !important;
  padding-right: 10px !important;
}

/* Grid alignment - ensure columns fill properly */
.flexdashboard-page > .dashboard-row {
  display: flex !important;
  flex-wrap: nowrap !important;
  width: 100% !important;
}

.flexdashboard-page > .dashboard-row > .dashboard-column {
  display: flex !important;
  flex-direction: column !important;
}

/* Ensure panels stretch to fill available height */
.chart-wrapper {
  display: flex !important;
  flex-direction: column !important;
  flex: 1 1 auto !important;
}

.chart-stage {
  flex: 1 1 auto !important;
  display: flex !important;
  flex-direction: column !important;
}

/* No empty gaps between panels */
.section.level3 {
  margin-bottom: 0 !important;
  flex: 1 1 auto !important;
}

/* Fix any potential flexdashboard overflow */
.flexdashboard-content {
  max-width: 100% !important;
  overflow-x: hidden !important;
}
```

```{r setup, include=FALSE}
# ===== SETUP AND DATA LOADING =====
knitr::opts_chunk$set(
  echo = FALSE, 
  message = FALSE, 
  warning = FALSE,
  fig.retina = 2
)

# Load required libraries
library(flexdashboard)
library(tidyverse)
library(plotly)
library(DT)
library(scales)
library(knitr)
library(kableExtra)

# Load the chocolate dataset from TidyTuesday
chocolate <- read_csv(
  "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-01-18/chocolate.csv",
  show_col_types = FALSE
)

# ===== DATA PREPARATION =====
chocolate <- chocolate %>%
  mutate(
    # Convert cocoa_percent to numeric
    cocoa_percent_num = as.numeric(gsub("%", "", cocoa_percent)),
    # Extract number of ingredients
    num_ingredients = as.numeric(str_extract(ingredients, "^[0-9]")),
    # Create rating categories
    rating_category = case_when(
      rating >= 4 ~ "Outstanding",
      rating >= 3.5 ~ "Highly Recommended",
      rating >= 3 ~ "Recommended",
      rating >= 2.5 ~ "Disappointing",
      TRUE ~ "Unpleasant"
    ),
    rating_category = factor(rating_category, levels = c(
      "Outstanding", "Highly Recommended", "Recommended", "Disappointing", "Unpleasant"
    ))
  )

# ===== SUMMARY STATISTICS =====
total_reviews <- nrow(chocolate)
avg_rating <- round(mean(chocolate$rating, na.rm = TRUE), 2)
num_countries <- n_distinct(chocolate$company_location)
num_manufacturers <- n_distinct(chocolate$company_manufacturer)
avg_cocoa <- round(mean(chocolate$cocoa_percent_num, na.rm = TRUE), 1)
num_origins <- n_distinct(chocolate$country_of_bean_origin)

# ===== PREMIUM GGPLOT THEME =====
theme_premium <- function() {
  theme_minimal(base_family = "sans") +
    theme(
      # Title styling
      plot.title = element_text(
        face = "bold", 
        size = 16, 
        color = "#2C1810",
        margin = margin(b = 10)
      ),
      plot.subtitle = element_text(
        size = 11, 
        color = "#5D4037",
        margin = margin(b = 15)
      ),
      plot.caption = element_text(
        size = 9, 
        color = "#8D6E63",
        hjust = 0,
        margin = margin(t = 10)
      ),
      # Axis styling
      axis.title = element_text(
        face = "bold", 
        size = 11, 
        color = "#5D4037"
      ),
      axis.text = element_text(
        size = 10, 
        color = "#5D4037"
      ),
      axis.line = element_line(color = "#D7CCC8", linewidth = 0.5),
      # Legend styling
      legend.title = element_text(face = "bold", size = 10, color = "#2C1810"),
      legend.text = element_text(size = 9, color = "#5D4037"),
      legend.background = element_rect(fill = "white", color = NA),
      legend.key = element_rect(fill = "white", color = NA),
      # Panel styling
      panel.grid.minor = element_blank(),
      panel.grid.major = element_line(color = "#EFEBE9", linewidth = 0.4),
      panel.background = element_rect(fill = "white", color = NA),
      plot.background = element_rect(fill = "white", color = NA),
      # Margins
      plot.margin = margin(15, 15, 15, 15)
    )
}

# Premium color palettes
chocolate_palette <- c("#2C1810", "#4E342E", "#5D4037", "#6D4C41", "#795548", 
                       "#8D6E63", "#A1887F", "#BCAAA4", "#D7CCC8", "#EFEBE9")
rating_colors <- c("Outstanding" = "#2E7D32", "Highly Recommended" = "#689F38",
                   "Recommended" = "#FFA000", "Disappointing" = "#F57C00", 
                   "Unpleasant" = "#D32F2F")
```

About {data-icon="fa-info-circle"}
=====================================

Column {data-width=450}
-----------------------------------------------------------------------

### 📌 Project Information {data-height=500}

<div class="premium-card" style="padding: 20px;">

<h4 style="margin-top: 0;">📚 Course Details</h4>

| | |
|:--|:--|
| **Course Code** | MIS029 |
| **Course Name** | Data Visualization |
| **Project Type** | Final Project Dashboard |

<h4 style="margin-top: 15px;">👤 Student Information</h4>

| | |
|:--|:--|
| **Name & Surname** | Mert Efe Kurt |
| **Student Number** | 2307071061 |
| **Submission Date** | `r format(Sys.Date(), "%d %B %Y")` |

<h4 style="margin-top: 15px;">📊 Dataset Overview</h4>

| | |
|:--|:--|
| **Dataset** | Chocolate Bar Ratings |
| **Source** | TidyTuesday 2022-01-18 |
| **Observations** | `r format(total_reviews, big.mark = ",")` |
| **Variables** | 10 original + 3 derived |
| **Period** | 2006 - 2021 |

</div>

### 📖 About This Dataset {data-height=400}

<div class="info-card" style="padding: 15px;">

**Expert ratings of over 2,500 chocolate bars** from around the world, compiled by the **Manhattan Chocolate Society** via [Flavors of Cacao](http://flavorsofcacao.com/).

🌍 **`r num_countries`** countries | 🏭 **`r num_manufacturers`** manufacturers | 🌱 **`r num_origins`** bean origins

**Rating Scale:**

| Score | Category |
|:------|:---------|
| 4.0 - 5.0 | 🏆 Outstanding |
| 3.5 - 3.99 | ⭐ Highly Recommended |
| 3.0 - 3.49 | ✓ Recommended |
| 2.5 - 2.99 | ⚠️ Disappointing |
| 1.0 - 2.49 | ✗ Unpleasant |

</div>

Column {data-width=550}
-----------------------------------------------------------------------

### 📋 Variable Names and Types {data-height=650}

```{r}
# REQUIRED: List of variable names and types using glimpse/str approach
var_info <- tibble(
  `#` = 1:ncol(chocolate),
  Variable = names(chocolate),
  Type = sapply(chocolate, function(x) {
    type <- class(x)[1]
    case_when(
      type == "character" ~ "Character",
      type == "numeric" ~ "Numeric",
      type == "factor" ~ "Factor",
      TRUE ~ type
    )
  }),
  Description = c(
    "Unique reference ID",
    "Manufacturer name",
    "Manufacturer country",
    "Review year",
    "Bean origin country",
    "Bean variety/bar name",
    "Cocoa % (text)",
    "Ingredient codes",
    "Flavor notes",
    "Rating (1-5)",
    "Cocoa % [Derived]",
    "Ingredient count [Derived]",
    "Rating label [Derived]"
  )
)

# Use kable for compact display - fits all content without scrolling
kable(var_info, align = c('c', 'l', 'l', 'l')) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE, font_size = 11) %>%
  row_spec(0, bold = TRUE, background = "#2C1810", color = "white") %>%
  column_spec(1, width = "30px", bold = TRUE) %>%
  column_spec(2, width = "140px", color = "#5D4037") %>%
  column_spec(3, width = "80px", color = "#1565C0", bold = TRUE) %>%
  column_spec(4, width = "auto") %>%
  row_spec(11:13, background = "#FFF8E1")  # Highlight the three derived variables (rows 11-13)
```

### 🔍 Data Structure Preview {data-height=250}

```{r}
# Show glimpse-style output
structure_df <- tibble(
  Metric = c("Rows (Observations)", "Columns (Variables)", 
             "Memory Size", "Complete Cases", "Missing Values"),
  Value = c(
    format(nrow(chocolate), big.mark = ","),
    ncol(chocolate),
    format(object.size(chocolate), units = "KB"),
    format(sum(complete.cases(chocolate)), big.mark = ","),
    sum(is.na(chocolate))
  )
)

kable(structure_df, align = c('l', 'r')) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), 
                full_width = TRUE, font_size = 13) %>%
  row_spec(0, bold = TRUE, background = "#2C1810", color = "white") %>%
  column_spec(1, bold = TRUE, color = "#5D4037")
```


Summary {data-icon="fa-calculator"}
=====================================

Column {data-width=550}
-----------------------------------------------------------------------

### 📊 Summary Statistics for Numeric Variables {data-height=280}

```{r}
# REQUIRED: Summary statistics (mean, median, SD, min, max)
numeric_summary <- chocolate %>%
  select(review_date, cocoa_percent_num, rating, num_ingredients) %>%
  pivot_longer(everything(), names_to = "Variable", values_to = "Value") %>%
  group_by(Variable) %>%
  summarise(
    N = sum(!is.na(Value)),
    Mean = round(mean(Value, na.rm = TRUE), 2),
    Median = round(median(Value, na.rm = TRUE), 2),
    SD = round(sd(Value, na.rm = TRUE), 2),
    Min = round(min(Value, na.rm = TRUE), 2),
    Max = round(max(Value, na.rm = TRUE), 2),
    .groups = 'drop'
  ) %>%
  mutate(Variable = case_when(
    Variable == "review_date" ~ "Review Year",
    Variable == "cocoa_percent_num" ~ "Cocoa Percentage (%)",
    Variable == "rating" ~ "Rating Score",
    Variable == "num_ingredients" ~ "Number of Ingredients"
  ))

kable(numeric_summary, align = c('l', rep('c', 6)),
      caption = NULL) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "responsive"),
                full_width = TRUE, font_size = 13) %>%
  row_spec(0, bold = TRUE, background = "#2C1810", color = "white") %>%
  column_spec(1, bold = TRUE, color = "#5D4037", width = "180px") %>%
  row_spec(3, bold = TRUE, background = "#FFF8E1")
```

### 📍 Frequency Table: Top Manufacturing Countries {data-height=470}

```{r}
# REQUIRED: Frequency table for categorical variable
country_freq <- chocolate %>%
  count(company_location, sort = TRUE) %>%
  head(12) %>%
  mutate(
    Pct = round(n / nrow(chocolate) * 100, 1),
    # Cumulate raw counts, then round once, so rounding error does not accumulate
    `Cum.Pct` = round(cumsum(n) / nrow(chocolate) * 100, 1)
  ) %>%
  rename(Country = company_location, Count = n)

kable(country_freq, align = c('l', 'c', 'c', 'c')) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE, font_size = 12) %>%
  row_spec(0, bold = TRUE, background = "#5D4037", color = "white") %>%
  row_spec(1, bold = TRUE, background = "#D4AF37", color = "#2C1810") %>%
  row_spec(2:3, background = "#F4E4BC")
```

Column {data-width=450}
-----------------------------------------------------------------------

### 🌱 Frequency Table: Top Bean Origin Countries {data-height=380}

```{r}
origin_freq <- chocolate %>%
  count(country_of_bean_origin, sort = TRUE) %>%
  head(12) %>%
  mutate(
    Pct = round(n / nrow(chocolate) * 100, 1),
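    # Cumulate raw counts, then round once, so rounding error does not accumulate
    `Cum.Pct` = round(cumsum(n) / nrow(chocolate) * 100, 1)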
  ) %>%
  rename(`Bean Origin` = country_of_bean_origin, Count = n)

kable(origin_freq, align = c('l', 'c', 'c', 'c')) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE, font_size = 12) %>%
  row_spec(0, bold = TRUE, background = "#795548", color = "white") %>%
  row_spec(1, bold = TRUE, background = "#D4AF37", color = "#2C1810") %>%
  row_spec(2:3, background = "#F4E4BC")
```

### ⭐ Frequency Table: Rating Categories {data-height=250}

```{r}
rating_freq <- chocolate %>%
  count(rating_category) %>%
  mutate(
    Pct = round(n / sum(n) * 100, 1),
    # Cumulate raw counts, then round once, so the last row reaches 100
    `Cum.Pct` = round(cumsum(n) / sum(n) * 100, 1)
  ) %>%
  rename(Category = rating_category, Count = n)

kable(rating_freq, align = c('l', 'c', 'c', 'c')) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE, font_size = 12) %>%
  row_spec(0, bold = TRUE, background = "#6D4C41", color = "white")
```

### 🔢 Quick Stats {data-height=120}

```{r}
valueBox(format(total_reviews, big.mark = ","), 
         caption = "Total Reviews", icon = "fa-chart-bar", color = "#2C1810")
```


Histogram {data-icon="fa-chart-bar"}
=====================================

Column {data-width=600}
-----------------------------------------------------------------------

### 📊 Histogram: Distribution of Expert Chocolate Ratings {data-height=500}

```{r fig.height=5, fig.width=8}
# REQUIRED: Histogram for numerical variable with appropriate labels and minimal theme
# Purpose: Visualize rating distribution to understand chocolate quality spread

p_hist <- ggplot(chocolate, aes(x = rating)) +
  geom_histogram(binwidth = 0.25, fill = "#5D4037", color = "#2C1810", 
                 alpha = 0.85, linewidth = 0.3) +
  # Constant reference lines: set xintercept directly instead of mapping it per row
  geom_vline(xintercept = mean(chocolate$rating),
             color = "#C62828", linetype = "dashed", linewidth = 1) +
  geom_vline(xintercept = median(chocolate$rating),
             color = "#1565C0", linetype = "solid", linewidth = 1) +
  annotate("label", x = 3.7, y = 480, 
           label = paste0("Mean = ", round(mean(chocolate$rating), 2)),
           fill = "#FFEBEE", color = "#C62828", fontface = "bold", size = 3,
           label.padding = unit(0.35, "lines")) +
  annotate("label", x = 2.8, y = 420,
           label = paste0("Median = ", round(median(chocolate$rating), 2)),
           fill = "#E3F2FD", color = "#1565C0", fontface = "bold", size = 3,
           label.padding = unit(0.35, "lines")) +
  labs(
    title = "Distribution of Expert Chocolate Bar Ratings",
    subtitle = "Most chocolates receive ratings between 3.0 and 3.5",
    x = "Expert Rating Score",
    y = "Number of Chocolate Bars",
    caption = "Dashed red = Mean | Solid blue = Median | Data: TidyTuesday 2022"
  ) +
  scale_x_continuous(breaks = seq(1, 5, 0.5)) +
  # Zoom with coord_cartesian(); scale limits would drop the edge bin at rating 1.0
  coord_cartesian(xlim = c(1, 4.5)) +
  scale_y_continuous(labels = comma) +
  theme_premium()

p_hist
```

### 📝 Histogram Interpretation {data-height=250}

The histogram shows a roughly **bell-shaped distribution** with a slight left skew; because ratings are recorded in 0.25-point steps, it is only approximately normal. About `r round(mean(chocolate$rating >= 3 & chocolate$rating <= 3.5) * 100)`% of bars receive ratings between **3.0 and 3.5**, which spans the "Recommended" band and the lower edge of "Highly Recommended". The mean (`r round(mean(chocolate$rating), 2)`) sits just below the median (`r round(median(chocolate$rating), 2)`), which is consistent with a mild left tail rather than perfect symmetry.
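
The direction of skew can also be checked numerically rather than by eye. The following is a minimal sketch, assuming a hypothetical helper `sample_skewness()` (not part of this dashboard) and a made-up toy vector standing in for `chocolate$rating`. A negative value indicates a left tail, and a mean below the median points the same way.

```r
# Moment-based sample skewness: negative = left tail, positive = right tail.
# `sample_skewness` is a hypothetical helper, not a function from the dashboard.
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  centered <- x - mean(x)
  mean(centered^3) / mean(centered^2)^1.5
}

# Toy ratings on the same 0.25-step scale (illustrative values, not the real data)
toy_ratings <- c(1.5, 2.5, 2.75, 3, 3, 3.25, 3.25, 3.25, 3.5, 3.5, 3.75, 4)

skew <- sample_skewness(toy_ratings)
mean_below_median <- mean(toy_ratings) < median(toy_ratings)
```

For the real data, the same helper could be called on `chocolate$rating`.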

Column {data-width=400}
-----------------------------------------------------------------------

### 🍫 Cocoa Percentage Distribution {data-height=350}

```{r fig.height=3.5, fig.width=6}
p_cocoa_hist <- ggplot(chocolate, aes(x = cocoa_percent_num)) +
  geom_histogram(binwidth = 5, fill = "#8D6E63", color = "#5D4037", alpha = 0.85) +
  geom_vline(xintercept = 70, color = "#D4AF37", linetype = "dashed", linewidth = 0.8) +
  annotate("label", x = 82, y = 650, label = "70% = Mode", 
           fill = "#FFF8E1", color = "#D4AF37", size = 2.5, fontface = "bold") +
  labs(title = "Cocoa Percentage Distribution",
       subtitle = "70% is the most common formulation",
       x = "Cocoa %", y = "Count") +
  scale_x_continuous(breaks = seq(40, 100, 10)) +
  theme_premium() +
  theme(plot.title = element_text(size = 12),
        plot.subtitle = element_text(size = 9))

p_cocoa_hist
```

### 📈 Key Statistics {data-height=400}

| Statistic | Rating | Cocoa % |
|:----------|-------:|--------:|
| **Mean** | `r round(mean(chocolate$rating), 2)` | `r round(mean(chocolate$cocoa_percent_num, na.rm = TRUE), 1)`% |
| **Median** | `r median(chocolate$rating)` | `r median(chocolate$cocoa_percent_num, na.rm = TRUE)`% |
| **Std Dev** | `r round(sd(chocolate$rating), 2)` | `r round(sd(chocolate$cocoa_percent_num, na.rm = TRUE), 1)`% |
| **Range** | `r min(chocolate$rating)` - `r max(chocolate$rating)` | `r min(chocolate$cocoa_percent_num, na.rm = TRUE)` - `r max(chocolate$cocoa_percent_num, na.rm = TRUE)`% |

**Key Insights:** 

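- Mode rating is `r names(which.max(table(chocolate$rating)))`
- `r round(mean(chocolate$rating >= 2.75 & chocolate$rating <= 3.5) * 100)`% of ratings fall between 2.75 and 3.5
- 70% cocoa is the most common formulation
- Only `r round(mean(chocolate$rating >= 4) * 100, 1)`% achieve "Outstanding" (4.0+)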
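
Figures such as the mode and the share of ratings inside a band can be computed from the data instead of being typed by hand. The following is a minimal sketch with two hypothetical helpers, `stat_mode()` and `pct_between()`, applied to a made-up toy vector rather than the real ratings.

```r
# Most frequent value of a discrete numeric vector (ties: smallest value wins).
# `stat_mode` is a hypothetical helper name.
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

# Share of values inside the closed interval [lo, hi], as a percentage.
# `pct_between` is a hypothetical helper name.
pct_between <- function(x, lo, hi) {
  round(mean(x >= lo & x <= hi, na.rm = TRUE) * 100, 1)
}

# Illustrative values only, not the chocolate data
toy_ratings <- c(2.5, 2.75, 3, 3, 3.25, 3.5, 3.5, 3.5, 3.75, 4)

toy_mode      <- stat_mode(toy_ratings)
toy_mid_share <- pct_between(toy_ratings, 2.75, 3.5)
toy_top_share <- pct_between(toy_ratings, 4, 5)
```

Applied to `chocolate$rating`, the same calls would keep the bullet points in step with the data if the dataset is ever refreshed.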


Boxplot {data-icon="fa-boxes-stacked"}
=====================================

Column {data-width=600}
-----------------------------------------------------------------------
 
### 📦 Multiple Boxplot: Rating Distribution by Manufacturing Country {data-height=500}

```{r fig.height=5, fig.width=8}
# REQUIRED: Multiple boxplot - numeric variable grouped by categorical variable
# Purpose: Compare rating distributions across countries (categorical grouping)

top_countries <- chocolate %>%
  count(company_location, sort = TRUE) %>%
  head(10) %>%
  pull(company_location)

chocolate_top <- chocolate %>%
  filter(company_location %in% top_countries) %>%
  mutate(company_location = fct_reorder(company_location, rating, .fun = median, .desc = TRUE))

p_box <- ggplot(chocolate_top, aes(x = company_location, y = rating, fill = company_location)) +
  geom_boxplot(alpha = 0.85, outlier.shape = 21, outlier.fill = "#D32F2F", 
               outlier.color = "#B71C1C", outlier.size = 1.8, outlier.alpha = 0.6,
               width = 0.7, lwd = 0.4) +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 3.5, 
               color = "#D4AF37") +
  scale_fill_manual(values = rev(chocolate_palette[1:10])) +
  labs(
    title = "Chocolate Rating Distribution by Manufacturing Country",
    subtitle = "Top 10 countries | Gold diamonds = mean | Red dots = outliers",
    x = NULL,
    y = "Expert Rating Score",
    caption = "Countries ordered by median rating (descending)"
  ) +
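  scale_y_continuous(breaks = seq(1, 5, 0.5)) +
  # Zoom with coord_cartesian() so ratings below the visible range still count
  # toward the boxplot statistics instead of being dropped by scale limits
  coord_cartesian(ylim = c(1, 4.5)) +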
  theme_premium() +
  theme(
    axis.text.x = element_text(angle = 40, hjust = 1, size = 11, face = "bold"),
    legend.position = "none",
    panel.grid.major.x = element_blank()
  )

p_box
```

### 📝 Boxplot Interpretation {data-height=250}

This boxplot compares rating distributions across the **top 10 chocolate-manufacturing countries** by number of reviews. **`r levels(chocolate_top$company_location)[1]`** ranks first by median rating, although all ten medians sit in a narrow band around **3.0-3.25**, so ties are common. **U.S.A.** and **France** display the widest spreads. Gold diamonds (means) lie close to the medians, which suggests little skew within each country.
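
Claims about which countries have the widest spread can be backed by the interquartile range (IQR), which is the height of each box. The following is a minimal sketch using made-up ratings for three hypothetical countries (`A`, `B`, `C`), not the dashboard data.

```r
# Toy stand-in: five ratings for each of three hypothetical countries
toy <- data.frame(
  country = rep(c("A", "B", "C"), each = 5),
  rating  = c(3, 3.25, 3.25, 3.5, 3.5,
              2, 2.75, 3.25, 3.75, 4,
              3, 3, 3.25, 3.25, 3.5)
)

# Median locates the centre line of each box; IQR is the box height
median_by_country <- tapply(toy$rating, toy$country, median)
iqr_by_country    <- tapply(toy$rating, toy$country, IQR)

# Country with the widest middle 50% of ratings
widest <- names(which.max(iqr_by_country))
```

All three toy medians are equal, yet country `B` has a much larger IQR. That is the situation in which a median-only ranking hides real differences in consistency.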

Column {data-width=400}
-----------------------------------------------------------------------

### 📊 Rating by Cocoa Range {data-height=350}

```{r fig.height=3.2, fig.width=6}
# Note: cut() intervals are right-closed, so a 70% bar falls in "Medium", not "High"
chocolate_cocoa <- chocolate %>%
  mutate(cocoa_range = cut(cocoa_percent_num, 
                           breaks = c(40, 60, 70, 80, 100),
                           labels = c("Low\n40-60%", "Medium\n60-70%", 
                                    "High\n70-80%", "Very High\n80-100%"),
                           include.lowest = TRUE)) %>%
  filter(!is.na(cocoa_range))

p_box_cocoa <- ggplot(chocolate_cocoa, aes(x = cocoa_range, y = rating, fill = cocoa_range)) +
  geom_boxplot(alpha = 0.85, width = 0.6, outlier.size = 1.5, outlier.alpha = 0.5, lwd = 0.4) +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 2.5, color = "#D4AF37") +
  scale_fill_manual(values = c("#EFEBE9", "#BCAAA4", "#795548", "#3E2723")) +
  labs(title = "Rating by Cocoa Content Level",
       x = NULL, y = "Rating") +
  theme_premium() +
  theme(legend.position = "none",
        plot.title = element_text(size = 13))

p_box_cocoa
```

### 📋 Country Statistics {data-height=400}

```{r}
country_stats <- chocolate_top %>%
  group_by(company_location) %>%
  summarise(
    N = n(),
    Mean = round(mean(rating), 2),
    Median = median(rating),
    SD = round(sd(rating), 2),
    .groups = 'drop'
  ) %>%
  arrange(desc(Median)) %>%
  rename(Country = company_location)

kable(country_stats, align = c('l', rep('c', 4))) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE, font_size = 11) %>%
  row_spec(0, bold = TRUE, background = "#2C1810", color = "white") %>%
  row_spec(1, background = "#F4E4BC", bold = TRUE)
```


Scatterplot {data-icon="fa-chart-scatter-bubble"}
=====================================

Column {data-width=600}
-----------------------------------------------------------------------

### 🔵 Scatterplot: Cocoa Percentage vs Rating by Bean Origin Region {data-height=500}

```{r fig.height=5, fig.width=8}
# REQUIRED: Scatterplot for two numerical variables with categorical coloring
# Purpose: Examine cocoa % vs rating relationship, colored by bean origin

chocolate_scatter <- chocolate %>%
  mutate(
    bean_region = case_when(
      country_of_bean_origin %in% c("Venezuela", "Ecuador", "Peru", "Colombia", 
                                     "Bolivia", "Brazil") ~ "South America",
      country_of_bean_origin %in% c("Madagascar", "Tanzania", "Ghana", "Ivory Coast",
                                     "Cameroon", "Nigeria", "Uganda", "Togo", 
                                     "Congo", "Sao Tome") ~ "Africa",
      country_of_bean_origin %in% c("Dominican Republic", "Nicaragua", "Guatemala",
                                     "Mexico", "Belize", "Costa Rica", "Honduras",
                                     "Haiti", "Jamaica", "Trinidad") ~ "Central Am. & Caribbean",
      country_of_bean_origin %in% c("Papua New Guinea", "Indonesia", "Philippines",
                                     "Vietnam", "India", "Fiji", "Vanuatu") ~ "Asia-Pacific",
      TRUE ~ "Other / Blend"
    )
  ) %>%
  filter(!is.na(cocoa_percent_num))

p_scatter <- ggplot(chocolate_scatter, 
                    aes(x = cocoa_percent_num, y = rating, color = bean_region)) +
  geom_jitter(alpha = 0.5, size = 2, width = 0.8, height = 0.03) +
  geom_smooth(aes(group = 1), method = "loess", se = TRUE, 
              color = "#2C1810", fill = "#D7CCC8", alpha = 0.3, linewidth = 1.2) +
  scale_color_manual(
    values = c(
      "South America" = "#43A047",
      "Africa" = "#FB8C00", 
      "Central Am. & Caribbean" = "#1E88E5",
      "Asia-Pacific" = "#8E24AA",
      "Other / Blend" = "#78909C"
    ),
    name = "Bean Origin Region"
  ) +
  labs(
    title = "Relationship Between Cocoa Percentage and Expert Rating",
    subtitle = "Points colored by bean origin region | LOESS curve shows overall trend",
    x = "Cocoa Percentage (%)",
    y = "Expert Rating Score",
    caption = "Higher cocoa % doesn't guarantee higher ratings"
  ) +
  scale_x_continuous(breaks = seq(40, 100, 10)) +
  scale_y_continuous(breaks = seq(1, 5, 0.5)) +
  # Zoom with coord_cartesian() so jittered points at the edges are not dropped
  coord_cartesian(xlim = c(40, 100), ylim = c(1, 4.5)) +
  theme_premium() +
  theme(
    legend.position = "bottom",
    legend.box = "horizontal"
  ) +
  guides(color = guide_legend(nrow = 2, override.aes = list(size = 3.5, alpha = 0.9)))

p_scatter
```

### 📝 Scatterplot Interpretation {data-height=250}

This scatterplot explores the relationship between **cocoa percentage** and **expert ratings**. The LOESS curve suggests that ratings **peak around 65-75% cocoa** and decline at higher percentages. The weak negative correlation (**r = `r round(cor(chocolate$cocoa_percent_num, chocolate$rating, use="complete.obs"), 3)`**) indicates that cocoa content alone does not determine quality. Because r captures only the linear part of a relationship, it understates the curved pattern that the LOESS line shows.
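
Pearson's r measures linear association only, so a peak-then-decline pattern can produce a coefficient near zero even when the two variables are closely related. The following is a minimal sketch with made-up values (a perfectly symmetric inverted U, not the chocolate data) that illustrates this.

```r
# Toy inverted-U relationship: score peaks at 70 and falls off on both sides.
# Values are illustrative only, not taken from the chocolate data.
cocoa <- seq(50, 90, by = 5)
score <- 3.5 - 0.002 * (cocoa - 70)^2

pearson_r  <- cor(cocoa, score)                       # linear association
spearman_r <- cor(cocoa, score, method = "spearman")  # monotonic association
r_squared  <- pearson_r^2                             # variance share a straight line explains
```

Both coefficients come out at essentially zero here although `score` is fully determined by `cocoa`, which is why a smoother such as LOESS is a useful companion to a single correlation value.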

Column {data-width=400}
-----------------------------------------------------------------------

### 📊 Correlation Analysis {data-height=200}

```{r}
cor_val <- round(cor(chocolate$cocoa_percent_num, chocolate$rating, use = "complete.obs"), 3)
```

<div class="stat-highlight" style="text-align: center; padding: 15px; background: linear-gradient(135deg, #FFF8E1 0%, #FFFFFF 100%); border-radius: 12px; border-left: 4px solid #FFC107;">
<div style="font-size: 2.5rem; font-weight: bold; color: #2C1810; font-family: 'Playfair Display', serif;">
r = `r cor_val`
</div>
<div style="font-size: 0.9rem; color: #5D4037; margin-top: 5px;">
⚠️ <strong>Weak Negative Correlation</strong>
</div>
<div style="font-size: 0.8rem; color: #8D6E63; margin-top: 8px;">
Cocoa % explains only `r round(cor_val^2 * 100, 1)`% of rating variance
</div>
</div>

### 🌍 Statistics by Bean Region {data-height=280}

```{r}
region_stats <- chocolate_scatter %>%
  group_by(bean_region) %>%
  summarise(
    Count = n(),
    `Avg Rating` = round(mean(rating), 2),
    `Avg Cocoa` = paste0(round(mean(cocoa_percent_num), 0), "%"),
    .groups = 'drop'
  ) %>%
  arrange(desc(`Avg Rating`)) %>%
  rename(Region = bean_region)

kable(region_stats, align = c('l', 'c', 'c', 'c')) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE, font_size = 11) %>%
  row_spec(0, bold = TRUE, background = "#5D4037", color = "white")
```

### 💡 Key Findings {data-height=270}

**Insights:**

- **Weak Correlation** (r = `r cor_val`): Cocoa % has only a weak linear association with rating
- **Sweet Spot**: Ratings peak at roughly 65-75% cocoa
- **Diminishing Returns**: Very dark chocolate (>85%) tends to score lower
- **Regional Consistency**: Bean regions show broadly similar patterns


Interactive {data-icon="fa-hand-pointer"}
=====================================

Column {data-width=600}
-----------------------------------------------------------------------

### 🖱️ Interactive Visualization: Explore Each Chocolate Bar (plotly) {data-height=600}

```{r fig.height=6, fig.width=8}
# REQUIRED: Interactive plotly object for data exploration
# Purpose: Interactive scatterplot of cocoa % vs rating with hover details
# NOTE: Built with native plot_ly() instead of a ggplotly() conversion,
#       for better performance with large datasets

# Sample data for better performance (stratified by rating category)
set.seed(42)
chocolate_sample <- chocolate %>%
  group_by(rating_category) %>%
  slice_sample(prop = 0.4) %>%  # Sample 40% from each category
  ungroup()

# Create optimized plotly chart directly (faster than ggplotly conversion)
p_interactive <- plot_ly(
  data = chocolate_sample,
  x = ~jitter(cocoa_percent_num, amount = 1),
  y = ~jitter(rating, amount = 0.03),
  color = ~rating_category,
  colors = rating_colors,
  type = 'scattergl',  # WebGL for better performance
  mode = 'markers',
  marker = list(size = 7, opacity = 0.6),
  hoverinfo = 'text',
  text = ~paste0(
    "<b>", company_manufacturer, "</b>",
    "<br>📍 ", company_location,
    "<br>🌱 ", country_of_bean_origin,
    "<br>🍫 ", cocoa_percent,
    "<br>⭐ ", rating, " (", rating_category, ")"
  )
) %>%
  layout(
    title = list(
      text = "Interactive Explorer: Cocoa % vs Rating",
      font = list(family = "Playfair Display", size = 16, color = "#2C1810")
    ),
    xaxis = list(
      title = list(text = "Cocoa Percentage (%)", standoff = 15),
      tickvals = seq(40, 100, 10),
      ticktext = paste0(seq(40, 100, 10), "%"),
      tickfont = list(size = 12, color = "#5D4037"),
      gridcolor = "#EFEBE9",
      zerolinecolor = "#D7CCC8",
      range = c(38, 102)
    ),
    yaxis = list(
      title = "Expert Rating",
      range = c(1, 4.5),
      tickfont = list(size = 12, color = "#5D4037"),
      gridcolor = "#EFEBE9",
      zerolinecolor = "#D7CCC8"
    ),
    legend = list(
      orientation = "h", 
      y = -0.18, 
      x = 0.5, 
      xanchor = "center",
      font = list(size = 10),
      bgcolor = "rgba(255,255,255,0.9)"
    ),
    hoverlabel = list(
      bgcolor = "white",
      bordercolor = "#5D4037",
      font = list(family = "Arial", size = 12, color = "#2C1810")
    ),
    paper_bgcolor = "white",
    plot_bgcolor = "white",
    autosize = TRUE,
    margin = list(l = 60, r = 30, t = 40, b = 90)
  ) %>%
  config(
    displayModeBar = TRUE,
    modeBarButtonsToRemove = c("lasso2d", "select2d", "autoScale2d"),
    displaylogo = FALSE
  )

p_interactive
```

### 📝 Interactive Features {data-height=150}

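The chart is built with **native Plotly** and rendered through **WebGL** (`scattergl`) for performance. It shows a stratified 40% sample of each rating category rather than every bar. **Hover** over a point for details, and use the toolbar to **zoom, pan, or download** the view.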
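
Sampling within each category keeps the category mix of the full data, which a simple random sample does not guarantee. The following is a minimal sketch of that idea on a made-up data frame (`toy`, with invented category sizes), using the same `group_by()` and `slice_sample()` calls as the chart.

```r
library(dplyr)

# Toy stand-in for the dashboard data: three categories with unequal sizes
set.seed(42)
toy <- data.frame(
  rating_category = rep(c("Recommended", "Disappointing", "Outstanding"),
                        times = c(50, 30, 20)),
  id = 1:100
)

# Draw 40% within each category so the category mix is preserved
toy_sample <- toy %>%
  group_by(rating_category) %>%
  slice_sample(prop = 0.4) %>%
  ungroup()

# Rows kept per category
sample_counts <- count(toy_sample, rating_category)
```

Each category contributes 40% of its own rows, so small categories stay visible in the sampled chart in proportion to their size.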

Column {data-width=400}
-----------------------------------------------------------------------

### 📖 How to Use This Chart {data-height=280}

**Hover** over points to see: Manufacturer name, location, bean origin, cocoa %, rating & category.

**Toolbar Options:** Download PNG, Zoom in/out, Pan, Reset view.

**Color Legend:**

| Color | Category |
|:------|:---------|
| 🟢 Green | Outstanding / Highly Recommended |
| 🟠 Orange | Recommended |
| 🔴 Red | Disappointing / Unpleasant |

### ⭐ Rating Distribution {data-height=470}

```{r fig.height=4}
cat_dist <- chocolate %>% count(rating_category) %>% 
  mutate(pct = round(n/sum(n)*100, 1))

plot_ly(cat_dist, labels = ~rating_category, values = ~n, type = 'pie',
        textposition = 'inside', textinfo = 'percent',
        marker = list(colors = unname(rating_colors),
                      line = list(color = '#FFFFFF', width = 2)),
        hoverinfo = 'label+value+percent',
        height = 320) %>%
  layout(showlegend = TRUE,
         legend = list(orientation = 'h', y = -0.1, x = 0.5, xanchor = 'center', 
                       font = list(size = 10)),
         margin = list(t = 10, b = 40, l = 10, r = 10)) %>%
  config(displayModeBar = FALSE)
```


Data {data-icon="fa-table"}
=====================================

Column {data-width=1000 .tabset}
-----------------------------------------------------------------------

### 🔍 Complete Dataset Explorer

```{r}
chocolate_display <- chocolate %>%
  select(
    Manufacturer = company_manufacturer,
    Location = company_location,
    Year = review_date,
    `Bean Origin` = country_of_bean_origin,
    `Bar Name` = specific_bean_origin_or_bar_name,
    `Cocoa` = cocoa_percent,
    Rating = rating,
    Category = rating_category,
    Flavors = most_memorable_characteristics
  )

datatable(chocolate_display,
          filter = 'top',
          extensions = 'Buttons',
          fillContainer = TRUE,
          options = list(
            pageLength = 25,
            scrollY = "calc(100vh - 280px)",
            scrollCollapse = TRUE,
            paging = FALSE,
            scrollX = FALSE,
            autoWidth = TRUE,
            dom = 'Bfrti',
            buttons = c('copy', 'csv', 'excel'),
            columnDefs = list(
              list(width = 'auto', targets = c(0, 1, 3, 4, 8)),
              list(width = '55px', targets = c(2, 5, 6)),
              list(width = 'auto', targets = 7)
            ),
            language = list(
              search = "🔍 Search:",
              info = "Showing _START_ to _END_ of _TOTAL_ chocolate bars"
            )
          ),
          rownames = FALSE,
          class = 'cell-border stripe hover compact',
          width = '100%') %>%
  formatStyle('Rating',
              background = styleColorBar(range(chocolate$rating), '#8D6E63'),
              backgroundSize = '95% 70%',
              backgroundRepeat = 'no-repeat',
              backgroundPosition = 'center') %>%
  formatStyle('Category',
              backgroundColor = styleEqual(
                names(rating_colors),
                c("#C8E6C9", "#DCEDC8", "#FFF9C4", "#FFCCBC", "#FFCDD2")
              ),
              fontWeight = 'bold')
```

### ℹ️ Usage Tips

**How to use this table:**

- 🔍 **Filter:** Use the search boxes under each column header
- ↕️ **Sort:** Click on column headers to sort
- 📋 **Export:** Use Copy, CSV, or Excel buttons
- 📜 **Scroll:** Scroll within the table to see all `r nrow(chocolate)` records

**Column Guide:**

| Column | Description |
|:-------|:------------|
| Manufacturer | Company that made the chocolate |
| Location | Country where company is based |
| Year | When the review was conducted |
| Bean Origin | Where cocoa beans came from |
| Bar Name | Specific bean origin or bar name |
| Cocoa | Percentage of cocoa content |
| Rating | Expert score (1.0 - 5.0) |
| Category | Rating classification |
| Flavors | Tasting notes from experts |


References {data-icon="fa-book"}
=====================================

Column {data-width=500}
-----------------------------------------------------------------------

### 📚 Data Sources & Methodology

<div class="info-card">

**Primary Data Source:**

📊 **TidyTuesday - Chocolate Bar Ratings (2022-01-18)**

| Attribute | Details |
|:----------|:--------|
| Repository | [github.com/rfordatascience/tidytuesday](https://github.com/rfordatascience/tidytuesday/tree/master/data/2022/2022-01-18) |
| Direct CSV | [chocolate.csv](https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-01-18/chocolate.csv) |
| Records | 2,530 chocolate bar reviews |
| Time Span | 2006 - 2021 |
| Variables | 10 original columns |

**Original Data Provider:**

🍫 **Flavors of Cacao** - [flavorsofcacao.com](http://flavorsofcacao.com/chocolate_database.html)

- Compiled by: **Manhattan Chocolate Society**
- Rating methodology: Blind tasting by certified chocolate experts
- Scale: 1.0 (unpleasant) to 5.0 (elite/outstanding)

**Data Collection Method:**

Expert tasters evaluate chocolate bars on texture, flavor complexity, finish, and overall impression. Each bar is rated independently without brand knowledge.

</div>

### 📦 R Packages Used

```{r}
pkg_names <- c("flexdashboard", "tidyverse", "ggplot2", "plotly", "DT", "knitr", "kableExtra", "scales")

packages_df <- tibble(
  Package = pkg_names,
  # Look up each installed version in one pass instead of repeating the call
  Version = vapply(pkg_names, function(p) as.character(packageVersion(p)),
                   character(1), USE.NAMES = FALSE),
  Purpose = c(
    "Dashboard framework & layout",
    "Data wrangling (dplyr, tidyr, readr)",
    "Grammar of graphics visualizations",
    "Interactive charts with WebGL",
    "Interactive searchable data tables",
    "Dynamic report generation",
    "Advanced table formatting",
    "Axis & label formatting"
  ),
  Citation = c(
    "Iannone et al. (2024)",
    "Wickham et al. (2019)",
    "Wickham (2016)",
    "Sievert (2020)",
    "Xie et al. (2024)",
    "Xie (2024)",
    "Zhu (2024)",
    "Wickham & Seidel (2022)"
  )
)

kable(packages_df, align = c('l', 'c', 'l', 'l')) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), 
                full_width = TRUE, font_size = 11) %>%
  row_spec(0, bold = TRUE, background = "#2C1810", color = "white")
```

Column {data-width=500}
-----------------------------------------------------------------------

### 🔗 Documentation & Tutorials

<div class="info-card">

**Official Documentation:**

| Resource | URL | Purpose |
|:---------|:----|:--------|
| 📖 Flexdashboard | [pkgs.rstudio.com/flexdashboard](https://pkgs.rstudio.com/flexdashboard/) | Dashboard layouts & components |
| 📊 Plotly R | [plotly.com/r](https://plotly.com/r/) | Interactive visualizations |
| 🎨 ggplot2 | [ggplot2.tidyverse.org](https://ggplot2.tidyverse.org/) | Static graphics reference |
| 📋 DT Package | [rstudio.github.io/DT](https://rstudio.github.io/DT/) | DataTables integration |
| 📚 kableExtra | [haozhu233.github.io/kableExtra](https://haozhu233.github.io/kableExtra/) | Table styling |

**Books & Learning Resources:**

1. Wickham, H. & Grolemund, G. (2023). *R for Data Science* (2nd ed.). [r4ds.hadley.nz](https://r4ds.hadley.nz/)

2. Sievert, C. (2020). *Interactive Web-Based Data Visualization with R, plotly, and shiny*. Chapman & Hall/CRC. [plotly-r.com](https://plotly-r.com/)

3. Wilke, C. O. (2019). *Fundamentals of Data Visualization*. O'Reilly. [clauswilke.com/dataviz](https://clauswilke.com/dataviz/)

4. Healy, K. (2018). *Data Visualization: A Practical Introduction*. Princeton University Press.

**Course Materials:**

- MIS029 Data Visualization lecture notes
- Flexdashboard layout examples: [pkgs.rstudio.com/flexdashboard/articles/layouts.html](https://pkgs.rstudio.com/flexdashboard/articles/layouts.html)

</div>

### 🔄 Session Information

```{r}
session_df <- tibble(
  Property = c("R Version", "Platform", "Operating System", "Locale", "Date Generated", "Timezone"),
  Value = c(
    paste(R.version$major, R.version$minor, sep = "."),
    R.version$platform,
    paste(Sys.info()["sysname"], Sys.info()["release"]),
    Sys.getlocale("LC_TIME"),
    format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
    Sys.timezone()
  )
)

kable(session_df, align = c('l', 'l')) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = TRUE, font_size = 11) %>%
  row_spec(0, bold = TRUE, background = "#455A64", color = "white")
```

### 📋 Reproducibility & License

<div class="info-card" style="border-left-color: #FF9800;">

**To reproduce this analysis:**

```r
# 1. Install required packages
install.packages(c("flexdashboard", "tidyverse", 
                   "plotly", "DT", "knitr", 
                   "kableExtra", "scales"))

# 2. Render the dashboard
rmarkdown::render("MertEfeKurt_2307071061_Final.Rmd")
```

⚠️ **Requirements:** 

- R version ≥ 4.0.0, ggplot2 ≥ 3.4.0 (for the `linewidth` argument), and dplyr ≥ 1.0.0 (for `slice_sample()`)
- Internet connection (data fetched from GitHub)
- ~500 MB RAM for rendering
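
Before rendering, it can help to confirm that every package is available. The following is a minimal sketch using base R's `requireNamespace()`; the variable names `required`, `is_installed`, and `missing_pkgs` are illustrative and do not appear elsewhere in this dashboard.

```r
# Packages the dashboard loads (same list as the install step)
required <- c("flexdashboard", "tidyverse", "plotly", "DT",
              "knitr", "kableExtra", "scales")

# requireNamespace() returns FALSE instead of erroring when a package is absent
is_installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Install before rendering: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are available.")
}
```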

**Data License:** TidyTuesday data is released under the CC0 1.0 Universal license.

---

**Dashboard Author:** Mert Efe Kurt (2307071061)

**Course:** MIS029 - Data Visualization

**Project Type:** Final Project Submission

**Generated:** `r format(Sys.time(), "%B %d, %Y at %H:%M")`

</div>