Here’s a guide to working with the Fjelstul World Cup Database in R. The data is usually accessed via the worldcup package or read directly from the CSV files.
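Install and load the package:

    # If the package is not available from your CRAN mirror,
    # install it from the Fjelstul GitHub repository instead.
    install.packages("worldcup")
    library(worldcup)

If you prefer the CSV files directly (from the Fjelstul World Cup Database GitHub repo), download the data-csv/ folder.

2. Database Schema (Main Tables)

The package contains several tibbles. The key ones are matches, goals, cards, substitutions, players, teams, and tournaments.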
Load them:

    data("matches")
    data("goals")
    data("cards")
    data("players")

Use dplyr for easy manipulation. The column names below are the ones this guide uses; run names(matches) to confirm them against your installed version.

Example 1: Matches in 2018 World Cup

    library(dplyr)

    matches %>%
      filter(tournament_id == "WC-2018") %>%
      select(match_id, home_team, away_team, home_score, away_score)

Example 2: Top goal scorers (all time)

    goals %>%
      group_by(player_id) %>%
      summarise(total_goals = n(), .groups = "drop") %>%
      arrange(desc(total_goals)) %>%
      left_join(players, by = "player_id") %>%
      select(player_name, total_goals) %>%
      slice_head(n = 10)

Example 3: Most cards in a single match

    cards %>%
      group_by(match_id) %>%
      summarise(total_cards = n()) %>%
      arrange(desc(total_cards)) %>%
      left_join(matches, by = "match_id") %>%
      select(match_id, home_team, away_team, total_cards)

Example 4: Goals by minute (including injury time)

    goals %>%
      # minute_regulation <= 90 keeps regulation-time goals and drops extra time
      filter(minute_regulation <= 90) %>%
      group_by(minute_regulation) %>%
      summarise(goals = n()) %>%
      arrange(minute_regulation) %>%
      plot(type = "h", xlab = "Minute", ylab = "Goals", main = "Goals by Minute")

4. Common Analysis Tasks

Team performance over time

    matches %>%
      group_by(tournament_id, winner) %>%
      summarise(wins = n(), .groups = "drop") %>%
      arrange(tournament_id, desc(wins))

Player with the most assists in each tournament

    goals %>%
      filter(!is.na(assist_player_id)) %>%
      group_by(tournament_id, assist_player_id) %>%
      summarise(assists = n(), .groups = "drop") %>%
      # slice_max() with `by` needs dplyr 1.1.0 or later
      slice_max(assists, by = tournament_id, n = 1) %>%
      left_join(players, by = c("assist_player_id" = "player_id"))

5. Working with CSV files directly

If you downloaded the CSV files (e.g., from the GitHub repo), you can read them straight into R without loading the package.
For reference, the main tables are:

| Table | Description |
|--------|-------------|
| matches | Match-level data (score, date, stadium, etc.) |
| goals | Goal-scoring events (scorer, assist, minute, type) |
| cards | Yellow/red cards |
| substitutions | Substitutions |
| players | Player metadata |
| teams | Team metadata |
| tournaments | World Cup editions (year, host, winner) |
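The event tables (goals, cards, substitutions) relate back to matches through shared key columns, which is what the joins in the examples rely on. Here is a minimal sketch of that pattern. It uses made-up toy tibbles rather than the real data, and assumes the key is called match_id as in the examples; `toy_matches`, `toy_goals`, and every value in them are placeholders.

```r
library(dplyr)
library(tibble)

# Toy stand-ins for the real tables; all values are placeholders, not real data.
toy_matches <- tibble(
  match_id  = c("M-01", "M-02"),
  home_team = c("Team A", "Team C"),
  away_team = c("Team B", "Team D")
)

toy_goals <- tibble(
  goal_id   = c("G-01", "G-02", "G-03"),
  match_id  = c("M-01", "M-01", "M-02"),
  player_id = c("P-01", "P-02", "P-03")
)

# Count goal events per match, then attach match details via the shared key.
goals_per_match <- toy_goals %>%
  count(match_id, name = "total_goals") %>%
  left_join(toy_matches, by = "match_id")

goals_per_match
```

Because the join starts from the goal counts, a match with no goal events would not appear in the result. Starting from the matches table and joining the counts onto it keeps such matches, with NA in the count column.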
Read the files you need with readr:

    library(readr)
    library(dplyr)

    matches_csv <- read_csv("data-csv/matches.csv")
    goals_csv <- read_csv("data-csv/goals.csv")
    players_csv <- read_csv("data-csv/players.csv")
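If you read several files, it can be convenient to load them in one step and fail early when a file is missing. The sketch below is one possible way to do that, not part of the package: `read_worldcup_csvs()` is a hypothetical helper name, and it assumes each table lives in a file called `<table>.csv` inside the folder, as in the paths above. The demo writes a tiny placeholder file to a temporary folder so the example runs without the real download.

```r
library(readr)

# Hypothetical helper: read the named tables from a folder of CSV files.
# Assumes each table is stored as "<table>.csv" inside `dir`.
read_worldcup_csvs <- function(dir, tables) {
  paths <- file.path(dir, paste0(tables, ".csv"))
  is_missing <- !file.exists(paths)
  if (any(is_missing)) {
    stop("Missing CSV file(s): ", paste(paths[is_missing], collapse = ", "))
  }
  # Return a named list so each table is reachable as wc$matches, wc$goals, ...
  out <- lapply(paths, read_csv, show_col_types = FALSE)
  names(out) <- tables
  out
}

# Demo on a temporary folder with a tiny placeholder file (not real data).
demo_dir <- file.path(tempdir(), "data-csv-demo")
dir.create(demo_dir, showWarnings = FALSE)
write_csv(
  data.frame(match_id = c("M-01", "M-02")),
  file.path(demo_dir, "matches.csv")
)

wc <- read_worldcup_csvs(demo_dir, "matches")
nrow(wc$matches)
```

With the real download, the call would look like `read_worldcup_csvs("data-csv", c("matches", "goals", "players"))`.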