Time Series and Graphics in R
The examples use astsa, ggplot2, and
library(astsa) library(ggplot2) library(ggfortify)SOME OF THE EXAMPLES BELOW THAT USED ggfortify DON'T WORK ANYMORE. SOME DEMONSTRATIONS REMAIN, BUT I'M NOT GOING TO SPEND ANY MORE TIME ON THAT MESS - EMAIL THE MAINTAINERS IF YOU HAVE QUESTIONS.![]()
The graphs are saved to a png file in the working directory. Skip the lines
png(...)
and dev.off()
if you just want to test the result.
First, here's a plot of
gtemp_land
using the base graphics. If you add a grid after you plot, it goes on top.
You have some work to do if you want the grid underneath... but at least you can work around that - read on.
# for a basic plot, all you need is plot(gtemp_land) # it can't get simpler than that (not shown) plot(gtemp_land, type='o', col=4) # a slightly nicer version (not shown) # but here's a pretty version that includes a grid # this code uses R graphics only png(file='gtemp.png', width=600, height=320) par(mar=c(2,2,0,.5)+.5, mgp=c(1.6,.6,0)) # trim the margins plot(gtemp_land, ylab='Temperature Deviations', type='n') # set up the plot grid(lty=1, col=gray(.9)) # add a grid lines(gtemp_land, type='o', col=4) # and now plot the line dev.off()# But if you use astsa, it's a one liner png(file='ts_soi.png', width=600, height=320) tsplot(soi, main='Southern Oscillation Index', col=4, nxm=5) dev.off()
# I asked for minor ticks to have 5 sections on the time axis, the default is 2 (like the y-axis)
And a ggplot2 of the global temperature series. ggplot2 doesn't play with time series so you have to create a data frame that strips the series to its bare naked data.
png(file='gtemp1.png', width=600, height=320) gtemp.df = data.frame(Time=c(time(gtemp_land)), gtemp=c(gtemp_land)) # strip the ts attributes ptemp = ggplot(data = gtemp.df, aes(x=Time, y=gtemp) ) + # store it ylab('Temperature Deviations') + geom_line(col=4, lwd=1) + geom_point(col=4) ptemp # plot it dev.off()It's not necessary to store the figure... it's just an example of what you can do.# To remove the gray background, run ptemp + theme_bw() # not shown
And here's a one line astsa version that resembles gg-plot.
png(file='ggtemp.png', width=600, height=320) tsplot(gtemp_land, gg=TRUE, ylab='Temperature Deviations', col=4, type='o', pch=20) dev.off()![]()
♦ You can see a major difference between gg-plotting and base-plotting. In base graphics, you do things one at a time. If I input
plot(x)
and then grid()
, the graphic is
developed according to the order entered. Thus the graphic is produced with the grid on top of the line.
With gg-plotting, nothing is done until after all the input is finished. You can even save the
results in an object and add instructions later.
Here's an example of
ggplot
for two time series, one at a time (not the best way for many time series).
png(file='gtemps1.png', width=600, height=320) gtemp.df = data.frame(Time=c(time(gtemp_land)), gtemp=c(gtemp_ocean), gtempl=c(gtemp_land)) ggplot(data = gtemp.df, aes(x=Time, y=value, color=variable ) ) + ylab('Temperature Deviations') + geom_line(aes(y=gtemp , col='Ocean Only'), size=1, alpha=.5) + geom_line(aes(y=gtempl, col='Land Only'), size=1, alpha=.5) + theme(legend.position=c(.1,.85)) dev.off()Here's a thing you may not notice at first. I entered ocean only before land only, but land only comes before ocean only in the graph legend. Why?? Because the plot is in alphabetical order, not your order. If you don't care, then good for you. If you do care, you have to fight it a bit. You'll see how to do that in the missing data example below.![]()
And a similar plot using astsa...
png(file='gtempsbase.png', width=600, height=320) tsplot(gtemp_land, ylab="Gloabal Temp Devs", lwd=2, col=astsa.col(.6)[6]) lines(gtemp_ocean, lwd=2, col=astsa.col(.6)[4]) legend('topleft', col=c(6,4), lwd=2, legend=c("Land Only", "Ocean Only"), bg='white') dev.off()The astsa color palette is attached when astsa is loaded. The script# or using the spaghetti and gg options: png(file='gtempsbase2.png', width=600, height=320) tsplot(cbind(gtemp_land, gtemp_ocean), ylab="Gloabal Temp Devs", lwd=2, col=astsa.col(.7)[c(6,4)], gg=TRUE, spaghetti=TRUE) legend('topleft', col=c(6,4), lwd=2, legend=c("Land Only", "Ocean Only"), bty='n') dev.off()
![]()
astsa.col()
can be used to adjust the opacity level of the palette.
The numbering 1 to 8 are (shades of) black, red, green, blue, cyan, magenta, gold, gray.
To revert back to the new R4 palette, use palette('default')
.

png(file='gtemps3.png', width=600, height=320) autoplot(cbind(gtemp, gtemp2), ylab="It's Getting Hot in Here", xlab="emiT", size=1, colour='variable') dev.off()![]()
png(file='cmort1.png', width=600) autoplot(cbind(Mortality=cmort, Temperature=tempr, Particulates=part), xlab='Time', colour='variable') + guides(colour=FALSE) # shuts off legend dev.off()![]()
If you just want it without color or all the same, this works now (with no guarantee that it will work later)...
png(file='cmort1bw.png', width=600) autoplot(cbind(Mortality=cmort, Temperature=tempr, Particulates=part), xlab='Time', ts.colour=4) + guides(colour=FALSE) # shuts off legend dev.off()Just don't ask me how to get the y-labels on the three different series without just plotting three separate graphs. I'm sure it can be done .... Update: I found how to plot differently scaled multiple time series with ggplot2 on GitHub. I'm not sure it's worth the trouble.![]()
A similar quick plot using base graphics (not shown):
plot(cbind(cmort, tempr, part), main='Hot and Smoggy')Here's a pretty version in base graphics ONLY. The code is a little long, but repetitive, so you can copy-and-paste to save your wrists a little anguish. This example shows the BEST thing about base graphics... you can put anything anywhere without having to search the internet to hope someone figured out how to get a ggpackage to do what you want it to do.
png(file='mtp_base.png', width=600) par(mfrow=c(3,1), mar=c(0,3.5,0,3), oma=c(3.5,0,2,0), mgp=c(2,.6,0), cex.lab=1.1, tcl=-.3, las=1) plot(cmort, ylab=expression(M[~t]~~~~(Number~Buried)), xaxt="no", type='n') grid(lty=1, col=gray(.9)) lines(cmort, col=rgb(0,0,.9)) text(1974, 132, 'Bad Year', col=rgb(.5,0,.5), cex=1.25) # just for fun arrows(1973.5, 130, 1973, 127, length=0.05, angle=30, col=rgb(.5,0,.5)) plot(tempr, ylab=expression(T[~t]~~~~(degree~F)), xaxt="no", yaxt='no', type='n') grid(lty=1, col=gray(.9)) lines(tempr, col=rgb(.9,0,.9)) axis(4) plot(part, ylab=expression(P[~t]~~~~(PPM))) grid(lty=1, col=gray(.9)) lines(part, col=rgb(0,.7,0)) title(xlab="Time (week)", outer=TRUE) dev.off()Maybe this is the answer to: how to plot differently scaled multiple time series with ggplot2 ... do it in base graphics.![]()
You can also do something similar to the above using tsplot.
png(file='mtp_base_ts_new.png', height=600, width=600) tsplot(cbind(Mortality=cmort, Temperature=tempr, Particulates=part), type='o', pch=19, col=4:2) dev.off()## and a spaghetti version png(file='mtp_base_ts.png', height=400, width=600) tsplot(cbind(Mortality=cmort, Temperature=tempr, Particulates=part), col=astsa.col(.7)[4:2], spaghetti=TRUE) legend("topright", legend=c("Mortality", "Temperature", "Pollution"), lty=1, col=4:2) dev.off()
![]()
This is a ggfortify quick plot (that may still work??) of all the series on the same graph (makes sense if the scales are at least similar, which they are in this case - if temperature were in ℃, you would have to convert it to ℉ for the plot to be useful).
png(file='cmort2.png', width=600) autoplot(cbind(Mortality=cmort, Temperature=tempr, Particulates=part), xlab='Time', facets=FALSE) dev.off()# a similar quick plot in base graphics (not shown) ts.plot(cmort, tempr, part, col=2:4) # needs a legend legend('topright', legend=c('Mortality', 'Temperature', 'Pollution'), lty=1, col=2:4)
All 8 explosion series, in exploding color, using ggplot.
png(file='eqexp.png', width=600 ) library(reshape) # install 'reshape' if you don't have it df = melt(eqexp[,9:16]) # reshape the data frame Time = rep(1:2048, 8) ggplot(data=df, aes(x=Time, y=value, col=variable)) + geom_line( ) + guides(colour=FALSE) + facet_wrap(~variable, ncol=2, scales='free_y') + ylab('') dev.off()# using ggfortify - if you don't want color (not shown) autoplot(ts(eqexp[,9:16]), ncol=2, xlab='Time') # and in base graphics (not shown) plot(ts(eqexp[,9:16]), main='') # you can get a nice plot using astsa without melting anything ... you'll also see the astsa palette png(file='eqexp2.png', width=600) tsplot(eqexp[,9:16], col=1:8, ncolm=2) dev.off()
# and the rainbow palette looks nice png(file='eqexp3.png', width=600) tsplot(eqexp[,9:16], col=rainbow(8, v=.8), ncolm=2, gg=TRUE) dev.off()
################################################################### # NOTE: The code below using ggfortify used to work - but now it # doesn't... but it may tomorrow ... the pain of relying on # packages rears its ugly head ################################################################### autoplot(ts(eqexp[,9:16]), ncol=2, xlab='Time', colour='variable') + guides(colour=FALSE) # shuts off legend
Next is a more difficult problem where ggfortify fails, but ggplot works but hurts.
Ok, let's make it hard... M.I.S.S.I.N.G. D.A.T.A. We'll use blood, which is a multiple time series file with lots of NAs.
# In base graphics, it is sooooooo simple and the result is decent png(file='blood0.png', width=600 ) plot(blood, type='o', pch=19, main='') dev.off()That wasn't so bad. I gave up trying to do this in ggfortify (this is when everything else worked). So, here it is with ggplot2. It works ok, but you get warnings and other frustrations you'll see along the way...# and tsplot: png(file='blood1.png', width=600 ) tsplot(blood, type='o', col=rainbow(3, v=.8, start=.6, end=1), pch=19, cex=1, gg=TRUE) dev.off()
# and just using base graphics png(file='blood.png', width=600, height=500 ) filit = function(){ rect(par("usr")[1], par("usr")[3], par("usr")[2], par("usr")[4], col=gray(.92), border=8)} culer = rainbow(3, s=.9, v=.8, start=.6, end=1) par(mfrow=c(3,1), mar=c(0,3.5,0,2), oma=c(3,0,2,0), mgp=c(2,.6,0), cex.lab=1.5, las=1, bg=gray(.95)) for (i in 1:3){ plot(blood[,i], ylab=colnames(blood)[i], xaxt="no", type='n') filit() grid(lty=1, col=gray(1)) lines(blood[,i], type='o', pch=19, col=culer[i]) } axis(1) mtext('Day', side=1, line=1) dev.off()
![]()
png(file='bloodgg.png', width=600 ) df = data.frame(day=c(time(blood)), blood=c(blood), Type=rep(c('WBC','PLT','HTC'), each=91) ) # df # uncomment to see the data frame levels(df$Type) # notice that the factor levels of Type are in alphabetical order... # [1] "HTC" "PLT" "WBC" # ... if I don't use the next line, the plot will be in alphabetical order ... # ... if I wanted the series in alphabetical order ... # ... I would have ordered it that way - so I need ... # ... the next line to reorder them back to the way I entered the dataWe got the graphic and survived the aliens, too.... df$Type = factor(df$Type, levels(df$Type)[3:1]) # any resemblance to the blood work of actual persons, living or dead, is purely coincidental ggplot(data=df, aes(x=day, y=blood, col=Type)) + ylab("Mary Jane's Blood Work") + geom_line() + geom_point() + guides(colour=FALSE) + facet_wrap(~Type, ncol=1, scales='free_y') # Danger, Will Robinson! Warning! Warning! NAs appearing! # Warning messages: # 1: Removed 9 rows containing missing values (geom_path). # 2: Removed 111 rows containing missing values (geom_point). # We're doomed! Crepes suzette! dev.off()
![]()
😝 Now imagine what you'd have to do if you want to identify the y-axis scales for each plot ... e.g., WBC is measured in 103/μL ... I guess that's not in the grammar of graphics, too.
But here's something that IS in the grammar of graphics, too ... a pretty ribbon plot of the Southern Oscillation Index:
png(file='soi.png', width=600) cblue = rgb(144,195,212, max=255) cred = rgb(195,144,212, max=255) df = data.frame(Time=c(time(soi)), SOI=c(soi), d=ifelse(c(soi)<0,0,1)) ggplot( data=df, aes(x=Time, y=SOI) ) + geom_ribbon(aes(ymax=d*SOI, ymin=0, fill = "cold")) + geom_ribbon(aes(ymax=0, ymin=(1-d)*SOI, fill = "hot")) + scale_fill_manual(name='SST', values=c("cold"=cblue,"hot"=cred)) + theme(legend.position=c(.05,.1)) dev.off()Well that might be pretty, but it obscures the trend, don't you think?![]()
Let's try a nicer plot ...
png(file='soi_nicer.png', width=600, height=320) df = data.frame(Time=c(time(soi)), SOI=c(soi)) ggplot( data=df, aes(x=Time, y=SOI) ) + geom_line(col=4) + stat_smooth(span=1/12, col=6, se=FALSE) + # El Niño stat_smooth(col=6) # La Tendencia (with CIs) dev.off()![]()
And a similar plot with astsa graphics - (lowess retains the time axis whereas loess does not)
png(file="soi_nicer_base.png", width=600, height=320) tsplot(soi, col=4) lines(lowess(soi, f=.05), lwd=2, col=6) # El Niño cycle lines(lowess(soi), lwd=2, col=6) # trend (with default span) dev.off()![]()
If you need the intervals (the gray swatch) in base, it's a bit of a pain. You have to use loess and then predict.loess to get the intervals. But those functions strip the time series attributes so you have to adjust for that....
# Here's how to get a CI, but I'll only do the trend png(file="soi_base_ci.png", width=600, height=320) tsplot(soi, col=4, gg=TRUE) lines(lowess(soi, f=.05), lwd=2, col=6) # El Niño cycle # code above is like the previous example... # ... and trend with CIs from here down x = c(time(soi)) # remove ts attributes... y = c(soi) # ... you do it or loess does it for you lo = predict(loess(y ~ x), se=TRUE) # trend in lo$fit lines(x, lo$fit, col=6, lwd=2) L = lo$fit - qt(0.975, lo$df)*lo$se U = lo$fit + qt(0.975, lo$df)*lo$se xx = c(x, rev(x)) yy = c(L, rev(U)) polygon(xx, yy, border=8, col=gray(.6, alpha=.3) ) dev.off()![]()
This is a plot of S&P 500 returns. The data are in astsa, but it's an xts file, so you have to load that first.
library(xts) png(file='sp500w_xts.png', width=600, height=320) plot(sp500w, main='S&P 500', col=rgb(0,.6,.6)) dev.off()The xts plot used to be UGLY, but it has been improved quite a bit.![]()
Here's a discrete-valued series plotted as a step function.
EQcount
in astsa is a count of certain types of earthquakes.
png(file='EQcount_basie.png', width=600, height=320) tsplot(EQcount, col=4, type='s') points(EQcount, pch=21, col=4, cex=1.1, bg = 6) # might look better without this dev.off()![]()
Here's the same data as bars (did I mention the duck that went into a bar, ordered a drink and said put it on my bill?)
png(file='EQcount2_basie.png', width=600, height=320) tsplot(EQcount, type='h', col=4, lwd=2, minor=FALSE) dev.off()![]()
If you did not know this already , with time series, the dimensions of the plot matters. Consider these two plots of the bi-annual sunspot numbers. In the first plot, you see that the series rises quickly ↑ and falls slowly ↘ . The second plot obscures this fact.
png(file='sun1.png', width=600, height=200) # wide and narrow tsplot(sunspotz, type='o', col=4, pch=19) dev.off()It is my understanding that R always defaults to a square, no matter which device is called. While squares are ok for toilet paper, they are typically not ok for plotting time series.png(file='sun2.png') # default square (480 px) tsplot(sunspotz, type='o', col=4, pch=19) dev.off()
![]()
And finally, a base graphics plot of the sunspot numbers:
png(file='sun_cc.png', width=600, height=400) x = sunspotz culer1 = rgb(242, 153, 216, max=255) culer2 = rgb(208, 73, 242, max=255) culer3 = rgb( 77, 161, 249, max=255) culer4 = rgb( 0, 200, 225, max=255) culer5 = rgb(124, 231, 251, max=255) par(mar=c(2,2,1,1)+2, mgp=c(3,.2,0), las=1, cex.main=2, tcl=0, col.axis=culer1, bg=rgb(.25,.1,.25)) plot(x, type='n', main='', ylab='', xlab='') rect(par("usr")[1], par("usr")[3], par("usr")[2], par("usr")[4], col='black') grid(lty=1, col=rgb(1,0,1, alpha=.5)) lines(x, lwd=3, col=culer1) lines(window(x, start=1800), lwd=3, col=culer2) lines(window(x, start=1850), lwd=3, col=culer3) lines(window(x, start=1900), lwd=3, col=culer4) lines(window(x, start=1950), lwd=3, col=culer5) title(expression('Psychedelic' * phantom(' Sunspots')), col.main=culer1) title(expression(phantom('Psychedelic') * ' Sunspots'), col.main=culer5) mtext('Time', side=1, line=2, col=culer3, font=2, cex=1.25) mtext('Sunspot Numbers', side=2, line=2, col=culer2, font=2, las=0, cex=1.25) text(1800, 180, "don't stare at the sunspots", col=culer5, srt=20, font=4) text(1900, 170, "s.t.a.y c.o.o.l", col=culer1, srt=330, font=4) text(1850, 160, "dave? dave? \n dave's not here!", col=culer3, font=4) dev.off()# this is an elegant graphic that was not done in R
![]()
That's all for now... if I think of anything else, I'll post it. Stay tuned.