Tutorial Week 9

While thinking back on the term and trying to decide what to write a tutorial on, I remembered the week 3 lab assignment, which involved visualizing a dataset of the most popular baby names in New Zealand from 2001 to 2010. I remember thinking it would have been easier for me to create the bar charts for that lab in R, since I had previous experience with it, so I decided to write a brief tutorial on how to create bar charts using R.

While this process may seem more complicated and tedious than using one of the tools we have explored in this class that do everything for you, I feel that by creating the visualizations yourself, you get to see more of what the computer normally does for you and gain some coding experience, which can be very beneficial in DGAH. With more experience, R can be very useful for DGAH: it has strong capabilities in data analysis and modeling, and it can also be used for text analytics to find meaningful patterns in text.

Step 1: Log in to your account

Everyone at Carleton already has access to an online R server at this link. Simply sign in with your Carleton credentials, and you will be set to proceed to the next step. If you would prefer to download the desktop version of R, follow the instructions here. The only downside to using the online server that Carleton provides is that you have to be connected to Carleton’s network or be using a VPN.

Step 2: Set up the R file

Once you log in, you should be taken to a page that looks like this (the layout may vary slightly). Next, go to the “File” menu in the top left, choose “New File”, then click “R Markdown”. From there you can give it any title you want, and you can pick any output format, since you can always change it later (I prefer selecting PDF as the output).

Once you have created the R Markdown file, your window should look something like this. You can go ahead and delete everything below the three dashes that come after the output line (everything from my blue selection down can be deleted).


Step 3: Uploading files

In the bottom right panel of the screen, click the Files tab, then click the Upload button (white paper with a yellow up arrow), which will open the window titled Upload Files, as seen in the image above. Then click Choose File and select the file you want. The data should be in .csv format (I selected the CSV 10MostPopularBabyNames). Then click OK. The CSV should then appear in the Files tab.

Next, click the file you want and then click Import Dataset, which will open the following window:

This window lets you preview the data and provides a chunk of code that loads the data. Copy the code, then click Import (in the bottom right).

Step 4: Writing the code

Next, create a code chunk, either with the keyboard shortcut (Command+Option+I on Mac or Ctrl+Alt+I on Windows) or by typing it yourself: a line containing three backticks followed by {r}, then a closing line of three backticks, with your code in between.

Then paste the code that you copied at the end of Step 3 into the code chunk you just created. You can also rename your data frame to make it easier to refer to later on, as I did.

Be sure to run the code by clicking the little green arrow at the top right of each code chunk.

Then load the tidyverse library (as I did above), which will allow you to create many different kinds of visualizations, although for this tutorial I will focus on creating bar charts.
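For reference, here is a minimal sketch of what the import chunk might look like. It assumes the uploaded file is named 10MostPopularBabyNames.csv and sits in your working directory; the exact path in the code RStudio generates for you may differ. The helper name load_baby_names is only for illustration and is not part of any package.

```r
library(tidyverse)  # loads readr (for read_csv) and ggplot2 (for the charts)

# Hypothetical helper: reads a CSV file into a data frame (tibble).
load_baby_names <- function(path) {
  read_csv(path)
}

# Assumed file name; change it to match the file you uploaded.
csv_path <- "10MostPopularBabyNames.csv"

if (file.exists(csv_path)) {
  MostPopularBabyNames <- load_baby_names(csv_path)
  glimpse(MostPopularBabyNames)  # quick look at the column names and types
}
```

Naming the data frame MostPopularBabyNames here matches the name used in the chart code later in this tutorial.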

Next, create a new code chunk and copy the following code into it. This code makes a series of bar charts, one per name, with the year on the x-axis, the count (number of babies with that name) on the y-axis, and the name as the title of each chart. Anywhere there is a # followed by text, it is just a comment; the comments describe the extra code I wrote to make the charts more readable.

# Assumes library(tidyverse) is loaded and MostPopularBabyNames was imported in an earlier chunk
MostPopularBabyNames %>%
  ggplot(aes(x = Year, y = Count)) +
  geom_col(fill = "blue") +  # Draw blue bars
  facet_wrap(~ Name, scales = "free_y") +  # One chart per name, each with its own y-axis scale
  theme_minimal(base_size = 5) +  # Small base text size
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 8),  # Angled x-axis labels, size 8
    axis.text.y = element_text(size = 8),  # y-axis labels, size 8
    strip.text = element_text(size = 10)  # Facet titles (the names), size 10
  ) +
  labs(
    title = "Popularity of Baby Names Over Time",
    x = "Year",
    y = "Count"
  )

This should give you the output that you want and I will provide it below.
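The chart code relies on the data frame having columns named Year, Name, and Count. If your CSV uses different headers, the plot will fail with an "object not found" error. Here is a minimal sketch for checking this before plotting; missing_columns is a hypothetical helper written for this tutorial, not a built-in function.

```r
# Hypothetical helper: returns the required column names that are missing from df.
missing_columns <- function(df, required = c("Year", "Name", "Count")) {
  setdiff(required, names(df))
}

# Usage, once the data has been imported in an earlier chunk.
# An empty result, character(0), means nothing is missing.
if (exists("MostPopularBabyNames")) {
  print(missing_columns(MostPopularBabyNames))
}
```

If a name is reported as missing, either rename that column in your data or change the matching name in the chart code.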

Step 5: Saving the file

The next step is to knit (save/download) the file. Use the arrow next to Knit to select the output format (I normally knit to PDF). A window then pops up asking you to name the file and choose where to save it. Once you select a location, click Save.
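Knitting can also be done from the R console instead of the Knit button. The sketch below uses the render() function from the rmarkdown package, which is what the button runs behind the scenes. The file name tutorial.Rmd and the helper knit_to_pdf are assumptions for illustration, and knitting to PDF also requires a LaTeX installation to be available on the machine.

```r
# Hypothetical helper: knits an R Markdown file to PDF if the file exists.
knit_to_pdf <- function(rmd_path) {
  if (!file.exists(rmd_path)) {
    message("File not found: ", rmd_path)
    return(invisible(NULL))
  }
  # render() returns the path of the output file it created
  rmarkdown::render(rmd_path, output_format = "pdf_document")
}

# Assumed file name; replace it with the name of your own .Rmd file.
knit_to_pdf("tutorial.Rmd")
```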

Here are two links to more tutorials and information on how to use R:

The Carleton intro stats R manual is here.

As with some of the tools used in this class, you can create animated visualizations in R. Here is a tutorial.

5 thoughts on “Tutorial Week 9”

  1. I love how you connected it back to our Week 3 lab and used your prior experience with R to offer an alternative method for creating visualizations. I agree that while R might seem a bit more complex at first, it’s super rewarding to build something from scratch and really understand what’s going on behind the scenes. Plus, your step-by-step instructions are super clear and easy to follow—I can definitely see myself coming back to this if I ever want to try R for a DGAH project. Thanks for sharing such a helpful resource!

  2. I really like this tutorial! I have some familiarity with R but it was really nice getting this refresher and it really helps show how R could be used in a Digital Humanities. I know R seems complicated at first but this post really helps ease people into R. Thanks for the really informative tutorial!

  3. This was a really detailed and informative tutorial! Although I have a little bit of experience in R from a previous class, it has been a good amount of time since I last used it. I found this tutorial to be a great refresher for myself, and I can easily see it being a good introduction to R for beginners. R is pretty difficult to get the hang of, I think you do a great job of making it approachable! Good work!

  4. Great tutorial! R can be extremely challenging at points and your tutorial is well-written and makes the steps clear. Something I appreciate is linking the intro stats lab manual as it has a bunch of helpful R tips and hacks, and people often forget to use it as a resource when data cleaning or doing analysis.

  5. Hi Lucas. I wanted to say your tutorial is very thoughtful and easy to read through. I have used R studio before but your tutorial was a great way for me to review some important concepts. The code you wrote is well-written and easy to write. I never thought about using R for that assignment but maybe for my final project I may think about incorporating it. Great job!
