Visualization Exercise

The graph I settled on to replicate plots different colleges’ fight songs by duration in seconds and beats per minute. I found this graph on fivethiryeight.com at this link https://projects.fivethirtyeight.com/college-fight-song-lyrics/ and the raw data at this link https://github.com/fivethirtyeight/data/tree/master/fight-songs. Here is a quick screenshot of the original graph since it is slightly interactive, but if you would like to view it in its full glory then check out the link above. Screenshot of Original Graph

Load packages

library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.4.0      ✔ purrr   1.0.1 
✔ tibble  3.1.8      ✔ dplyr   1.0.10
✔ tidyr   1.2.1      ✔ stringr 1.5.0 
✔ readr   2.1.3      ✔ forcats 0.5.2 
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
library(readr)
library(plotly)

Attaching package: 'plotly'

The following object is masked from 'package:ggplot2':

    last_plot

The following object is masked from 'package:stats':

    filter

The following object is masked from 'package:graphics':

    layout
library(shiny)

Load in data

fight <- read_csv("data/fight-songs.csv")
Rows: 65 Columns: 23
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (19): school, conference, song_name, writers, year, student_writer, offi...
dbl  (4): bpm, sec_duration, number_fights, trope_count

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Plotting

##create the plot with parameters and average lines
Plot1 <- ggplot(fight, aes(x=sec_duration, y=bpm, col=school, text= paste(school, "<br>")))+
  geom_point(size=3, alpha=0.8)+
  xlab("Duration")+ylab("Beats Per Minute")+
  geom_hline(yintercept = mean(fight$bpm, na.rm=TRUE), lty='dotted')+
  geom_vline(xintercept= mean(fight$sec_duration, na.rm=TRUE), lty='dotted')+
  scale_colour_hue(c=20, l=80)+theme(legend.position="none")+
  scale_x_continuous(limits= c(0,180), breaks = seq(0,180,20))+
  scale_y_continuous(limits= c(50,200), breaks= c(0,60,80,100,120,140,160,180,200))+
  theme(axis.title.y = element_text(face="bold", size=11), plot.title = element_text(face="bold", size=14.5, hjust=0.5), axis.title.x = element_text(face="bold", size=11))+
  ggtitle("How Colleges' Fight Songs Stack Up")+
  annotate("text", x = 140, y = 190, label = "Fast but long")+
  annotate("text", x = 30, y = 190, label = "Fast and short")+
  annotate("text", x = 30, y = 50, label = "Slow but short")+
  annotate("text", x = 140, y = 50, label = "Slow and Long")+
   theme(panel.background = element_rect(fill = 'white'),
          panel.grid.major = element_line(color = 'grey'),
          panel.grid.minor = element_line(color = 'grey'))+
  annotate("text", x=140, y=130, label="AVERAGE", size=3)+
  annotate("text", x=71, y=100, label="AVERAGE", angle=90, size=3)
Plot1

I started with plotting the basic parameters and then started tweaking things to make them more similar to the original. I started with the point size and transparency, then added axis labels and the average lines for bpm and duration. I altered the scale of both axes, added the title, changed the font, and added quadrant labels as well. Lastly I changed the background grid to how the original looked. As I was going along, I would view the graph and see how it was looking, then add to the code. I can see how this may be harder to follow than if I broke it down into separate steps.

Adding the Hovering Text Box

##create the hover text box
ggplotly(Plot1, tooltip = c("text"))

Comments

Stackoverflow was my best friend during this process and I would’ve been lost without it. I came into this exercise with a basic understanding of making plots but I didn’t know all the ways to customize the different aspects of a graph or any sort of interactive elements, and it was my first time trying to use ggplotly. I definitely learned a ton of details for personalization and formatting which took the majority of the time I spent on it. I had a difficult time figuring out what color palette was used in the original so I settled for the default colors which were fairly similar. I also couldn’t figure out how to add the labels of ‘AVERAGE’ within the dotted average lines I made, so I just added them adjacent to the lines. This was defintely a challenge but I feel like I learned a lot and found a bunch of new resources I can turn to in the future.