Data Visualization Lab

I chose to visualize the number of babies given the most popular names, by year and by sex. Originally I tried to create a stacked bar chart or a box-and-whisker plot, which, I admit, might have done a better job of depicting the information I was trying to get from the data. However, I had some issues getting them to work, and I feel that the slightly more interactive aspect of this graph does the data justice in its own way. Although it isn’t perfect, I appreciate that this chart type lets you not only hover over values but also see individual data points, toggle the categories using the legend in the top left, and view the full spread of numbers. Having the name attached to each data point is a bonus of using a scatterplot instead of a bar chart, where it would be harder to hover over a single data point and see information for that particular name. This graph style also allows more questions to be generated about particular data points and trends. For example, do the most popular boys’ names account for more babies than the most popular girls’ names? This could even be used as a rough, albeit likely inaccurate, indicator of how many males versus females were born in a given year in the population our data set covers. For this visualization, I did not have to change much: I changed the colors and updated the legend to make it clear that the points are grouped by sex, which makes the patterns across time easier to see. If I were to do this again, I would make sure that the data are normalized.

One of the most important facets of Digital Humanities is accessibility of information. Fittingly, data visualization is founded on that same principle, as the goal of a visualization is to make data or information easier for “someone” to understand. For this graph in particular, I decided to look at the trends in the most popular baby names, separated by gender and year. I see this connecting to DH because the humanities are the study of people, and here we have a perfect example of using computational tools to do so. Zooming out on the graph a bit tells us that there is more information here than meets the eye. Many of the popular names carry over from year to year, and though their order may shuffle, it is apparent that the count for the most popular baby name has steadily decreased since 2006. This could mean a few things: the population we are researching has become more diverse in its naming; the less popular names on the list are being used more (note how the data points are more clustered toward the middle); or fewer babies are being born, as our data do not seem to be normalized. Graphs like this allow us to generate more questions about the people represented in the data and their historical context.
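Since the raw counts cannot separate "naming got more diverse" from "fewer babies were born," one way to check is to convert each count into a share of its year-and-sex group. The following is only a minimal sketch of that idea, not what I did for the graph: the function name `normalize_counts`, the tuple layout, and the sample rows are all hypothetical, and the sample numbers are made-up placeholders rather than values from the lab data. It also divides by the total of the listed names, which is a stand-in; dividing by total births per year would be better if that figure were available.

```python
from collections import defaultdict

def normalize_counts(rows):
    """Convert raw name counts to shares within each (year, sex) group.

    rows: iterable of (year, sex, name, count) tuples (hypothetical layout).
    Returns a list of (year, sex, name, share) tuples, where share is the
    count divided by the sum of counts for the same year and sex.
    """
    rows = list(rows)
    totals = defaultdict(int)
    for year, sex, _name, count in rows:
        totals[(year, sex)] += count
    return [
        (year, sex, name, count / totals[(year, sex)])
        for year, sex, name, count in rows
    ]

# Made-up placeholder rows, NOT values from the lab data set.
sample = [
    (2001, "F", "NameA", 300),
    (2001, "F", "NameB", 100),
    (2010, "F", "NameA", 150),
    (2010, "F", "NameC", 150),
]
shares = normalize_counts(sample)
```

Plotting the share instead of the raw count would show whether the top name is losing ground relative to the other listed names, independent of how many babies were counted in a given year.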

In the “What Gets Counted Counts” reading, Klein and D’Ignazio talk about rethinking social binaries and hierarchies, and about how the choices companies make when collecting data influence our thinking. One of the main issues brought up in the reading is the under-representation of minorities across a wide array of categories. When looking at the information we are given for this lab, we know nothing at all about the families naming these babies. Sure, we can make assumptions based on the names alone, but there is nothing to confirm our guesses about the people behind them. We know that these are popular baby names from 2001 to 2010, but we don’t know the demographics from which the data are derived. Having taken Lin’s Data Visualization as Activism class, I can confidently say that no matter what the subject is, as a data visualizer, you must put the audience first. Your message will not get across in the way you want it to if, at any stage in the process, you fail to keep in mind who you’re trying to reach. I believe that as data visualizers, we need to be transparent, not only with our audience but with ourselves, about where our data come from and the biases associated with those origins, rather than attempting to generalize to a population that is not represented in the data. We must recognize that wherever there are people involved, bias is present.

1 thought on “Data Visualization Lab”

  1. Hi Simon, I really like the interactive aspect of your graph! Making the name card appear on hover really cleans up the overall interface, something I definitely struggled with. I agree that data visualization helps make data or information easier to understand. I had never really noticed it before, but from your graph I can clearly see a comparison of total counts between female and male names, and how the total counts became more consistent in recent years.
