Flu Analysis Exploration

Load Packages

#load in necessary packages
library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.4.0      ✔ purrr   1.0.1 
✔ tibble  3.1.8      ✔ dplyr   1.0.10
✔ tidyr   1.2.1      ✔ stringr 1.5.0 
✔ readr   2.1.3      ✔ forcats 0.5.2 
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()

Load Clean Data

#load in the cleaned data
flurevise <- readRDS(file = "../data/flurevised.rds")
glimpse(flurevise)
Rows: 730
Columns: 32
$ SwollenLymphNodes <fct> Yes, Yes, Yes, Yes, Yes, No, No, No, Yes, No, Yes, Y…
$ ChestCongestion   <fct> No, Yes, Yes, Yes, No, No, No, Yes, Yes, Yes, Yes, Y…
$ ChillsSweats      <fct> No, No, Yes, Yes, Yes, Yes, Yes, Yes, Yes, No, Yes, …
$ NasalCongestion   <fct> No, Yes, Yes, Yes, No, No, No, Yes, Yes, Yes, Yes, Y…
$ CoughYN           <fct> Yes, Yes, No, Yes, No, Yes, Yes, Yes, Yes, Yes, No, …
$ Sneeze            <fct> No, No, Yes, Yes, No, Yes, No, Yes, No, No, No, No, …
$ Fatigue           <fct> Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Ye…
$ SubjectiveFever   <fct> Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, No, Yes…
$ Headache          <fct> Yes, Yes, Yes, Yes, Yes, Yes, No, Yes, Yes, Yes, Yes…
$ Weakness          <fct> Mild, Severe, Severe, Severe, Moderate, Moderate, Mi…
$ WeaknessYN        <fct> Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Ye…
$ CoughIntensity    <fct> Severe, Severe, Mild, Moderate, None, Moderate, Seve…
$ CoughYN2          <fct> Yes, Yes, Yes, Yes, No, Yes, Yes, Yes, Yes, Yes, Yes…
$ Myalgia           <fct> Mild, Severe, Severe, Severe, Mild, Moderate, Mild, …
$ MyalgiaYN         <fct> Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Ye…
$ RunnyNose         <fct> No, No, Yes, Yes, No, No, Yes, Yes, Yes, Yes, No, No…
$ AbPain            <fct> No, No, Yes, No, No, No, No, No, No, No, Yes, Yes, N…
$ ChestPain         <fct> No, No, Yes, No, No, Yes, Yes, No, No, No, No, Yes, …
$ Diarrhea          <fct> No, No, No, No, No, Yes, No, No, No, No, No, No, No,…
$ EyePn             <fct> No, No, No, No, Yes, No, No, No, No, No, Yes, No, Ye…
$ Insomnia          <fct> No, No, Yes, Yes, Yes, No, No, Yes, Yes, Yes, Yes, Y…
$ ItchyEye          <fct> No, No, No, No, No, No, No, No, No, No, No, No, Yes,…
$ Nausea            <fct> No, No, Yes, Yes, Yes, Yes, No, No, Yes, Yes, Yes, Y…
$ EarPn             <fct> No, Yes, No, Yes, No, No, No, No, No, No, No, Yes, Y…
$ Hearing           <fct> No, Yes, No, No, No, No, No, No, No, No, No, No, No,…
$ Pharyngitis       <fct> Yes, Yes, Yes, Yes, Yes, Yes, Yes, No, No, No, Yes, …
$ Breathless        <fct> No, No, Yes, No, No, Yes, No, No, No, Yes, No, Yes, …
$ ToothPn           <fct> No, No, Yes, No, No, No, No, No, Yes, No, No, Yes, N…
$ Vision            <fct> No, No, No, No, No, No, No, No, No, No, No, No, No, …
$ Vomit             <fct> No, No, No, No, No, No, Yes, No, No, No, Yes, Yes, N…
$ Wheeze            <fct> No, No, No, Yes, No, Yes, No, No, No, No, No, Yes, N…
$ BodyTemp          <dbl> 98.3, 100.4, 100.8, 98.8, 100.5, 98.4, 102.5, 98.4, …
#get a summary of the variables
flurevise %>% select(BodyTemp, Nausea) %>% summary()
    BodyTemp      Nausea   
 Min.   : 97.20   No :475  
 1st Qu.: 98.20   Yes:255  
 Median : 98.50            
 Mean   : 98.94            
 3rd Qu.: 99.30            
 Max.   :103.10            

Data Exploration

#create a histogram of the continuous variable BodyTemp 
ggplot(flurevise, aes(x = BodyTemp)) +
  geom_histogram() +
  labs(title = "Body Temperature Frequencies", x = "Body Temperature", y = "Count") +
  theme_bw()
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Most temperatures fall in the normal human range of about 98-99 degrees, and nothing looks implausible. The distribution is not quite symmetric, though: a smaller number of values sit in the 100-103 range, which could indicate a fever, and this right tail is consistent with the mean (98.94) being above the median (98.50) in the summary.
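The `stat_bin()` message above suggests choosing a bin width explicitly rather than relying on the default of 30 bins. A minimal sketch of how that could look is below; the helper name `plot_temp_hist` and the bin width of 0.2 degrees are assumptions for illustration, not values from the original analysis.

```r
library(ggplot2)

# Hypothetical helper: histogram of BodyTemp with an explicit bin width.
# binwidth = 0.2 is an assumed value; adjust it to the resolution of the data.
plot_temp_hist <- function(data, binwidth = 0.2) {
  ggplot(data, aes(x = BodyTemp)) +
    geom_histogram(binwidth = binwidth) +
    labs(title = "Body Temperature Frequencies",
         x = "Body Temperature", y = "Count") +
    theme_bw()
}

# Usage with the data loaded earlier:
# plot_temp_hist(flurevise)
```

Setting `binwidth` should also silence the `stat_bin()` message, since the bin choice is no longer left to the default.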

#compare BodyTemp between the Nausea groups with a box plot
ggplot(flurevise, aes(x = Nausea, y = BodyTemp, fill = Nausea)) +
  geom_boxplot()

Here I compared the two outcomes of interest, body temperature and nausea, in a box plot. The body temperature distribution looks very similar between those who reported nausea and those who did not.
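To put numbers next to what the box plot shows, the group medians and spreads can be tabulated. The following is a sketch only; `summarise_temp_by` is a hypothetical helper name, and it assumes the data frame has a numeric `BodyTemp` column as in `flurevise`.

```r
library(dplyr)

# Hypothetical helper: summarise BodyTemp within each level of a grouping column.
summarise_temp_by <- function(data, group_col) {
  data %>%
    group_by({{ group_col }}) %>%
    summarise(
      n = n(),                                      # observations per group
      median_temp = median(BodyTemp, na.rm = TRUE), # centre line of the box
      mean_temp = mean(BodyTemp, na.rm = TRUE),
      iqr_temp = IQR(BodyTemp, na.rm = TRUE),       # height of the box
      .groups = "drop"
    )
}

# Usage with the data loaded earlier:
# summarise_temp_by(flurevise, Nausea)
```

The same call works for any of the other categorical symptoms by swapping the second argument.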

ggplot(flurevise, aes(x = SwollenLymphNodes, y = BodyTemp, fill = SwollenLymphNodes)) +
  geom_boxplot() +
  labs(title = "Body Temperature Distribution by Lymph Node Status", x = "Swollen Lymph Nodes", y = "Body Temperature")

At a glance, body temperatures appear slightly higher in people who reported no swollen lymph nodes.

ggplot(flurevise, aes(x = CoughIntensity, y = BodyTemp, fill = CoughIntensity)) +
  geom_boxplot() +
  labs(title = "Body Temperature Distribution by Cough Intensity", x = "Cough Intensity", y = "Body Temperature")

Body temperature tended to be higher with more intense coughing.
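A trend across cough intensity is easiest to read when the boxes are ordered from least to most severe. If the factor levels are not already stored in that order, they could be set explicitly. This is a sketch under the assumption that `None`, `Mild`, `Moderate` and `Severe` (the values visible in the `glimpse()` output) are the only levels; `order_intensity` is a hypothetical helper name.

```r
# Hypothetical helper: put intensity levels in increasing order of severity.
# Assumes the only values are None, Mild, Moderate and Severe;
# any other value would be turned into NA.
order_intensity <- function(x) {
  factor(x, levels = c("None", "Mild", "Moderate", "Severe"))
}

# Usage with the data loaded earlier:
# levels(order_intensity(flurevise$CoughIntensity))
```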

ggplot(flurevise, aes(x = RunnyNose, y = BodyTemp, fill = RunnyNose)) +
  geom_boxplot() +
  labs(title = "Body Temperature Distribution by Presence of Runny Nose", x = "Runny Nose Status", y = "Body Temperature")