/ Code

R Wide to long

Here's another note to self on how to do something I do frequently enough to need to document but not often enough to memorize - transforming data from "wide" to "long." Tidyverse has made this easy enough but I still like seeing examples.

The data I'm using was mentioned in Maria Lupetini's Open Data Science talk "Unearthing Climate Change Trends with Earth."

Fire up R and import the data:

$ R --no-save --quiet
R> library(tidyverse)
R> df <- read.table("http://climate.nasa.gov/system/internal_resources/details/original/647_Global_Temperature_Data_File.txt")
R> head(df, n=2)
    V1    V2    V3
1 1880 -0.20 -0.13
2 1881 -0.12 -0.16
R> # not really sure what this is, but first col is "year"
R> df <- rename(df, year=V1)
R> # reshape from "wide" to "long"
R> df <- df %>% gather(key, value, -year)
R> head(df, n=2)
  year key value
1 1880  V2 -0.20
2 1881  V2 -0.12
R> # see what our data looks like:
R> ggplot(df, aes(year, value, colour=key)) + geom_point() + geom_line()

Which produces this:
V2 is the raw data and V3 has been smoothed.

For giggles I'll try to make my graph look like Nasa's. Here's theirs:

... wait ... The download image is a static PNG file. The image on the NASA page is dynamic. It shows the data values as you move your cursor across the screen. This looks more like plotly, though looking at the <canvas> source there's no references to which plot library they are using. Let's see if plotly produces something similar. Plotly appears to like "wide" data so re-read our data.

R> library(plotly)
R> df <- df %>% spread(key, value) # revert from "long" to "wide"
R> head(df, n=2)
  year    V2    V3
1 1880 -0.20 -0.13
2 1881 -0.12 -0.16
R> p <- plot_ly(df, x=~year) %>%
  add_trace(y=~V2, name='Annual mean', type='scatter', mode='lines+markers', marker=list(size=10, color="#8888", line=list(color="#8888", width=2)), line=list(width=2, color="#8888"), hovertext=~paste("Year: ", year, "<br>Change: ", V2, "&deg;C")) %>%
  add_trace(y=~V3, name='Lowess smoothing', type='scatter', mode='lines', line=list(width=2, color="black"), hovertext=~paste("Year: ", year, "<br>Change: ", V3, "&deg;C")) %>%
  layout(title="GLOBAL LAND-OCEAN TEMPERATURE INDEX", xaxis=list(zeroline=F), yaxis=list(zeroline=F), legend=list(x=0.05, y=0.95))
R> p
R> api_create(p, filename="marial")

That last command magically makes the plot available on plotly's servers:

It's not precisely like the original but pretty close.