class: center, middle, inverse, title-slide # grammar of graphics ### 2021-08-27 --- class: middle, inverse # Announcements --- ## Announcements - If you cannot be in class any day due to COVID isolation, file an [incapacitation form](https://trinity.duke.edu/undergraduate/academic-policies/illness) regardless of due dates. - For those on the waitlist, we're not allowed to add seats to the class due to fire code. - Before Monday: Fill out "Getting to know you" survey. - Before Wednesday: Complete reading + first reading quiz (to be posted on course website). --- ## Dorianne Gray <img src="images/dorianne-gray.jpeg" width="70%" style="display: block; margin: auto;" /> --- class: middle, inverse # Data visualization --- ## Data visualization > *"The simple graph has brought more information to the data analyst's mind than any other device." --- John Tukey* - Data visualization is the creation and study of the visual representation of data - Many tools for visualizing data -- R is one of them - Many approaches/systems within R for making data visualizations -- **ggplot2** is one of them, and that's what we're going to use --- ## ggplot2 `\(\in\)` tidyverse .pull-left[ <img src="images/ggplot2-part-of-tidyverse.png" width="80%" /> ] .pull-right[ - **ggplot2** is tidyverse's data visualization package - `gg` in "ggplot2" stands for Grammar of Graphics - Inspired by the book **Grammar of Graphics** by Leland Wilkinson ] --- ## Grammar of Graphics .pull-left-narrow[ A grammar of graphics is a tool that enables us to concisely describe the components of a graphic ] .pull-right-wide[ <img src="images/grammar-of-graphics.png" width="90%" /> ] .footnote[ Source: [BloggoType](http://bloggotype.blogspot.com/2016/08/holiday-notes2-grammar-of-graphics.html) ] --- ## Hello ggplot2! .pull-left-wide[ - `ggplot()` is the main function in ggplot2 - Plots are constructed in layers - Structure of the code for plots can be summarized as ```r ggplot(data = [dataset], mapping = aes(x = [x-variable], y = [y-variable])) + geom_xxx() + other options ``` - The ggplot2 package comes with the tidyverse ```r library(tidyverse) ``` - For help with ggplot2, see [ggplot2.tidyverse.org](http://ggplot2.tidyverse.org/) ] --- class: middle, inverse # Why do we visualize? --- ## Why do we visualize? 1. Discover patterns that may not be obvious from numerical summaries --- ## Anscombe's quartet ```r library(Tmisc) quartet %>% slice_head(n = 15) ``` --- ## Summary statistics for Anscombe's quartet ```r quartet %>% group_by(set) %>% summarise( mean_x = mean(x), mean_y = mean(y), sd_x = sd(x), sd_y = sd(y), r = cor(x, y) ) ``` ``` ## # A tibble: 4 × 6 ## set mean_x mean_y sd_x sd_y r ## <fct> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 I 9 7.50 3.32 2.03 0.816 ## 2 II 9 7.50 3.32 2.03 0.816 ## 3 III 9 7.5 3.32 2.03 0.816 ## 4 IV 9 7.50 3.32 2.03 0.817 ``` --- ## Scatterplots for Anscombe's quartet <img src="02-grammar-of-graphics_files/figure-html/quartet-plot-1.png" width="100%" /> --- ## Why do we visualize? 1. Discover patterns that may not be obvious from numerical summaries 2. Convey information in a way that is otherwise difficult/impossible to convey --- ## Excess deaths due to COVID-19 <img src="images/ft-covid-excess-deaths.png" width="60%" style="display: block; margin: auto;" /> .footnote[ Source: [Financial Times](https://www.ft.com/content/a2901ce8-5eb7-4633-b89c-cbdf5b386938), 27 Aug 2021. ] --- ## COVID-19 vaccination in US Counties <img src="images/nytimes-us-covid-vaccine.png" width="60%" style="display: block; margin: auto;" /> .footnote[ Source: [New York Times](https://www.nytimes.com/interactive/2020/us/covid-19-vaccine-doses.html), 27 Aug 2021. ] --- class: middle, inverse # ggplot2 ❤️ 🐧 --- ## ggplot2 `\(\in\)` tidyverse .pull-left[ <img src="images/ggplot2-part-of-tidyverse.png" width="80%" /> ] .pull-right[ - **ggplot2** is tidyverse's data visualization package - Structure of the code for plots can be summarized as ```r ggplot(data = [dataset], mapping = aes(x = [x-variable], y = [y-variable])) + geom_xxx() + other options ``` ] --- ## Data: Palmer Penguins Measurements for penguin species, island in Palmer Archipelago, size (flipper length, body mass, bill dimensions), and sex. .pull-left-narrow[ <img src="images/penguins.png" width="80%" /> ] .pull-right-wide[ ```r library(palmerpenguins) glimpse(penguins) ``` ``` ## Rows: 344 ## Columns: 8 ## $ species <fct> Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adel… ## $ island <fct> Torgersen, Torgersen, Torgersen, Torgersen, Torgerse… ## $ bill_length_mm <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1, … ## $ bill_depth_mm <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18.1, … ## $ flipper_length_mm <int> 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, 186… ## $ body_mass_g <int> 3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475, … ## $ sex <fct> male, female, female, NA, female, male, female, male… ## $ year <int> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007… ``` ] --- .panelset[ .panel[.panel-name[Plot] <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-12-1.png" width="70%" /> ] .panel[.panel-name[Code] ```r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, colour = species)) + geom_point() + labs(title = "Bill depth and length", subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", x = "Bill depth (mm)", y = "Bill length (mm)", colour = "Species") ``` ] ] --- class: middle, inverse # Coding out loud --- .midi[ > **Start with the `penguins` data frame** ] .pull-left[ ```r *ggplot(data = penguins) ``` ] .pull-right[ <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-13-1.png" width="100%" /> ] --- .midi[ > Start with the `penguins` data frame, > **map bill depth to the x-axis** ] .pull-left[ ```r ggplot(data = penguins, * mapping = aes(x = bill_depth_mm)) ``` ] .pull-right[ <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-14-1.png" width="100%" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > **and map bill length to the y-axis.** ] .pull-left[ ```r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, * y = bill_length_mm)) ``` ] .pull-right[ <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-15-1.png" width="100%" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > and map bill length to the y-axis. > **Represent each observation with a point** ] .pull-left[ ```r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm)) + * geom_point() ``` ] .pull-right[ <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-16-1.png" width="100%" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > and map bill length to the y-axis. > Represent each observation with a point > **and map species to the colour of each point.** ] .pull-left[ ```r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, * colour = species)) + geom_point() ``` ] .pull-right[ <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-17-1.png" width="100%" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > and map bill length to the y-axis. > Represent each observation with a point > and map species to the colour of each point. > **Title the plot "Bill depth and length"** ] .pull-left[ ```r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, colour = species)) + geom_point() + * labs(title = "Bill depth and length") ``` ] .pull-right[ <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-18-1.png" width="100%" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > and map bill length to the y-axis. > Represent each observation with a point > and map species to the colour of each point. > Title the plot "Bill depth and length", > **add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins"** ] .pull-left[ ```r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, colour = species)) + geom_point() + labs(title = "Bill depth and length", * subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins") ``` ] .pull-right[ <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-19-1.png" width="100%" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > and map bill length to the y-axis. > Represent each observation with a point > and map species to the colour of each point. > Title the plot "Bill depth and length", > add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", > **label the x and y axes as "Bill depth (mm)" and "Bill length (mm)", respectively** ] .pull-left[ ```r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, colour = species)) + geom_point() + labs(title = "Bill depth and length", subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", * x = "Bill depth (mm)", y = "Bill length (mm)") ``` ] .pull-right[ <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-20-1.png" width="100%" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > and map bill length to the y-axis. > Represent each observation with a point > and map species to the colour of each point. > Title the plot "Bill depth and length", > add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", > label the x and y axes as "Bill depth (mm)" and "Bill length (mm)", respectively, > **label the legend "Species"** ] .pull-left[ ```r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, colour = species)) + geom_point() + labs(title = "Bill depth and length", subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", x = "Bill depth (mm)", y = "Bill length (mm)", * colour = "Species") ``` ] .pull-right[ <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-21-1.png" width="100%" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > and map bill length to the y-axis. > Represent each observation with a point > and map species to the colour of each point. > Title the plot "Bill depth and length", > add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", > label the x and y axes as "Bill depth (mm)" and "Bill length (mm)", respectively, > label the legend "Species", > **and add a caption for the data source.** ] .pull-left[ ```r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, colour = species)) + geom_point() + labs(title = "Bill depth and length", subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", x = "Bill depth (mm)", y = "Bill length (mm)", colour = "Species", * caption = "Source: Palmer Station LTER / palmerpenguins package") ``` ] .pull-right[ <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-22-1.png" width="100%" /> ] --- .midi[ > Start with the `penguins` data frame, > map bill depth to the x-axis > and map bill length to the y-axis. > Represent each observation with a point > and map species to the colour of each point. > Title the plot "Bill depth and length", > add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", > label the x and y axes as "Bill depth (mm)" and "Bill length (mm)", respectively, > label the legend "Species", > and add a caption for the data source. > **Finally, use a discrete colour scale that is designed to be perceived by viewers with common forms of colour blindness.** ] .pull-left[ ```r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, colour = species)) + geom_point() + labs(title = "Bill depth and length", subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", x = "Bill depth (mm)", y = "Bill length (mm)", colour = "Species", caption = "Source: Palmer Station LTER / palmerpenguins package") + * scale_colour_viridis_d() ``` ] .pull-right[ <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-23-1.png" width="100%" /> ] --- .panelset[ .panel[.panel-name[Plot] <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-24-1.png" width="70%" /> ] .panel[.panel-name[Code] ```r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, colour = species)) + geom_point() + labs(title = "Bill depth and length", subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", x = "Bill depth (mm)", y = "Bill length (mm)", colour = "Species", caption = "Source: Palmer Station LTER / palmerpenguins package") + scale_colour_viridis_d() ``` ] .panel[.panel-name[Narrative] .pull-left-wide[ .midi[ Start with the `penguins` data frame, map bill depth to the x-axis and map bill length to the y-axis. Represent each observation with a point and map species to the colour of each point. Title the plot "Bill depth and length", add the subtitle "Dimensions for Adelie, Chinstrap, and Gentoo Penguins", label the x and y axes as "Bill depth (mm)" and "Bill length (mm)", respectively, label the legend "Species", and add a caption for the data source. Finally, use a discrete colour scale that is designed to be perceived by viewers with common forms of colour blindness. ] ] ] ] --- ## Argument names .tip[ You can omit the names of first two arguments when building plots with `ggplot()`. ] .pull-left[ ```r ggplot(data = penguins, mapping = aes(x = bill_depth_mm, y = bill_length_mm, colour = species)) + geom_point() + scale_colour_viridis_d() ``` ] .pull-right[ ```r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm, colour = species)) + geom_point() + scale_colour_viridis_d() ``` ] --- class: middle, inverse # Aesthetics --- ## Aesthetics options Commonly used characteristics of plotting characters that can be **mapped to a specific variable** in the data are - `colour` - `shape` - `size` - `alpha` (transparency) --- ## Colour .pull-left[ ```r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm, * colour = species)) + geom_point() + scale_colour_viridis_d() ``` ] .pull-right[ <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-25-1.png" width="100%" /> ] --- ## Shape Mapped to a different variable than `colour` .pull-left[ ```r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm, colour = species, * shape = island)) + geom_point() + scale_colour_viridis_d() ``` ] .pull-right[ <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-26-1.png" width="100%" /> ] --- ## Shape Mapped to same variable as `colour` .pull-left[ ```r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm, colour = species, * shape = species)) + geom_point() + scale_colour_viridis_d() ``` ] .pull-right[ <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-27-1.png" width="100%" /> ] --- ## Size .pull-left[ ```r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm, colour = species, shape = species, * size = body_mass_g)) + geom_point() + scale_colour_viridis_d() ``` ] .pull-right[ <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-28-1.png" width="100%" /> ] --- ## Alpha .pull-left[ ```r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm, colour = species, shape = species, size = body_mass_g, * alpha = flipper_length_mm)) + geom_point() + scale_colour_viridis_d() ``` ] .pull-right[ <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-29-1.png" width="100%" /> ] --- .pull-left[ **Mapping** ```r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm, * size = body_mass_g, * alpha = flipper_length_mm)) + geom_point() ``` <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-30-1.png" width="100%" /> ] .pull-right[ **Setting** ```r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) + * geom_point(size = 2, alpha = 0.5) ``` <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-31-1.png" width="100%" /> ] --- ## Mapping vs. setting - **Mapping:** Determine the size, alpha, etc. of points based on the values of a variable in the data - goes into `aes()` - **Setting:** Determine the size, alpha, etc. of points **not** based on the values of a variable in the data - goes into `geom_*()` (this was `geom_point()` in the previous example, but we'll learn about other geoms soon!) --- class: middle, inverse # Faceting --- ## Faceting - Smaller plots that display different subsets of the data - Useful for exploring conditional relationships and large data --- .panelset[ .panel[.panel-name[Plot] <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-32-1.png" width="70%" /> ] .panel[.panel-name[Code] ```r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) + geom_point() + * facet_grid(species ~ island) ``` ] ] --- ## Various ways to facet .question[ In the next few slides describe what each plot displays. Think about how the code relates to the output. **Note:** The plots in the next few slides do not have proper titles, axis labels, etc. because we want you to figure out what's happening in the plots. But you should always label your plots! ] --- ```r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) + geom_point() + * facet_grid(species ~ sex) ``` <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-33-1.png" width="60%" /> --- ```r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) + geom_point() + * facet_grid(sex ~ species) ``` <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-34-1.png" width="60%" /> --- ```r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) + geom_point() + * facet_wrap(~ species) ``` <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-35-1.png" width="60%" /> --- ```r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) + geom_point() + * facet_grid(. ~ species) ``` <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-36-1.png" width="60%" /> --- ```r ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) + geom_point() + * facet_wrap(~ species, ncol = 2) ``` <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-37-1.png" width="60%" /> --- ## Faceting summary - `facet_grid()`: - 2d grid - `rows ~ cols` - use `.` for no split - `facet_wrap()`: 1d ribbon wrapped according to number of rows and columns specified or available plotting area --- ## Facet and color .panelset[ .panel[.panel-name[Plot] <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-38-1.png" width="60%" /> ] .panel[.panel-name[Code] ```r ggplot( penguins, aes(x = bill_depth_mm, y = bill_length_mm, * color = species)) + geom_point() + facet_grid(species ~ sex) + * scale_color_viridis_d() ``` ] ] --- ## Facet and color, no legend .panelset[ .panel[.panel-name[Plot] <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-39-1.png" width="60%" /> ] .panel[.panel-name[Code] ```r ggplot( penguins, aes(x = bill_depth_mm, y = bill_length_mm, color = species)) + * geom_point(show.legend = FALSE) + facet_grid(species ~ sex) + scale_color_viridis_d() ``` ] ] --- class: middle, inverse # Take a sad plot, and make it better --- The American Association of University Professors (AAUP) is a nonprofit membership association of faculty and other academic professionals. [This report](https://www.aaup.org/sites/default/files/files/AAUP_Report_InstrStaff-75-11_apr2013.pdf) by the AAUP shows trends in instructional staff employees between 1975 and 2011, and contains an image very similar to the one given below. <img src="images/staff-employment.png" width="80%" style="display: block; margin: auto;" /> --- Each row in this dataset represents a faculty type, and the columns are the years for which we have data. The values are percentage of hires of that type of faculty for each year. ```r staff <- read_csv("data/instructional-staff.csv") staff ``` ``` ## # A tibble: 5 × 12 ## faculty_type `1975` `1989` `1993` `1995` `1999` `2001` `2003` `2005` `2007` ## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 Full-Time Tenu… 29 27.6 25 24.8 21.8 20.3 19.3 17.8 17.2 ## 2 Full-Time Tenu… 16.1 11.4 10.2 9.6 8.9 9.2 8.8 8.2 8 ## 3 Full-Time Non-… 10.3 14.1 13.6 13.6 15.2 15.5 15 14.8 14.9 ## 4 Part-Time Facu… 24 30.4 33.1 33.2 35.5 36 37 39.3 40.5 ## 5 Graduate Stude… 20.5 16.5 18.1 18.8 18.7 19 20 19.9 19.5 ## # … with 2 more variables: 2009 <dbl>, 2011 <dbl> ``` --- ## Recreate the visualization In order to recreate this visualization we need to first reshape the data to have one variable for faculty type and one variable for year. In other words, we will convert the data from the long format to wide format. But before we do so... .task[ If the long data will have a row for each year/faculty type combination, and there are 5 faculty types and 11 years of data, how many rows will the data have? ] --- class: center, middle <img src="images/pivot.gif" width="80%" style="display: block; margin: auto;" /> --- ## `pivot_*()` functions <img src="images/tidyr-longer-wider.gif" width="60%" /> --- ## `pivot_longer()` ```r pivot_longer(data, cols, names_to = "name", values_to = "value") ``` - The first argument is `data` as usual. - The second argument, `cols`, is where you specify which columns to pivot into longer format -- in this case all columns except for the `faculty_type` - The third argument, `names_to`, is a string specifying the name of the column to create from the data stored in the column names of data -- in this case `year` - The fourth argument, `values_to`, is a string specifying the name of the column to create from the data stored in cell values, in this case `percentage` --- ## Pivot instructor data .midi[ ```r library(tidyverse) staff_long <- staff %>% pivot_longer(cols = -faculty_type, names_to = "year", values_to = "percentage") %>% mutate(percentage = as.numeric(percentage)) staff_long ``` ``` ## # A tibble: 55 × 3 ## faculty_type year percentage ## <chr> <chr> <dbl> ## 1 Full-Time Tenured Faculty 1975 29 ## 2 Full-Time Tenured Faculty 1989 27.6 ## 3 Full-Time Tenured Faculty 1993 25 ## 4 Full-Time Tenured Faculty 1995 24.8 ## 5 Full-Time Tenured Faculty 1999 21.8 ## 6 Full-Time Tenured Faculty 2001 20.3 ## 7 Full-Time Tenured Faculty 2003 19.3 ## 8 Full-Time Tenured Faculty 2005 17.8 ## 9 Full-Time Tenured Faculty 2007 17.2 ## 10 Full-Time Tenured Faculty 2009 16.8 ## # … with 45 more rows ``` ] --- .question[ This doesn't look quite right, how would you fix it? ] .small[ ```r staff_long %>% ggplot(aes(x = percentage, y = year, color = faculty_type)) + geom_col(position = "dodge") ``` <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-45-1.png" width="60%" /> ] --- .midi[ ```r staff_long %>% ggplot(aes(x = percentage, y = year, fill = faculty_type)) + geom_col(position = "dodge") ``` <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-46-1.png" width="60%" /> ] --- ## Some improvement... .midi[ ```r staff_long %>% ggplot(aes(x = percentage, y = year, fill = faculty_type)) + geom_col() ``` <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-47-1.png" width="60%" /> ] --- ## More improvement .midi[ ```r staff_long %>% ggplot(aes(x = year, y = percentage, group = faculty_type, color = faculty_type)) + geom_line() + theme_minimal() ``` <img src="02-grammar-of-graphics_files/figure-html/unnamed-chunk-48-1.png" width="100%" /> ] --- ## Goal: even more improvement! .task[ I want to achieve the following look but I have no idea how! ] <img src="images/sketch.png" width="70%" /> --- ## Asking good questions - Describe what you want - Describe where you are - Create a minimal **repr**oducible **ex**ample: `reprex::reprex()` --- .panelset[ .panel[.panel-name[Plot] <img src="02-grammar-of-graphics_files/figure-html/instructor-lines-1.png" width="100%" /> ] .panel[.panel-name[Code] ```r library(scales) staff_long %>% * mutate( * part_time = if_else(faculty_type == "Part-Time Faculty", * "Part-Time Faculty", "Other Faculty"), * year = as.numeric(year) * ) %>% ggplot(aes(x = year, y = percentage/100, group = faculty_type, color = part_time)) + geom_line() + * scale_color_manual(values = c("gray", "red")) + * scale_y_continuous(labels = label_percent(accuracy = 1)) + theme_minimal() + labs( title = "Instructional staff employment trends", x = "Year", y = "Percentage", color = NULL ) + * theme(legend.position = "bottom") ``` ] ]