◊ lag.plot1 example [the script] - grid of plots of x(t-h) vs. x(t)
◊ lag.plot2 example [the script] - grid of plots of x1(t-h) vs. x2(t)
◊ acf2 example [the script] - simultaneous plot of sample ACF and PACF on the same scale
◊ sarima example [the script] - fits (S)ARIMA models including diagnostics in a short command
◊ sarima.for example [the script] - gives (S)ARIMA forecasts (values and graphic) in a short command
◊ spec.arma example [the script] - gives ARMA spectrum, tests causality, invertibility, common zeros
You can get it all in one file: itall.R. Just download the file to a convenient location and then "source" it.
SOURCING CODE: Sourcing code is easy (and don't let the word source scare you ... just substitute the word read for source and it will lose its mystery). The basic idea is that you can have any number of R commands in an external file and execute them at any time by reading the file into R. This can be done by copying-and-pasting the commands into R, or by sourcing the file; in our case, issue the command source("itall.R"). In Windows, Source is also available on the File menu.
It is possible to source (read) this code without downloading the file.
For example, either
source(url("http://www.stat.pitt.edu/stoffer/tsa2/Rcode/itall.R")) or
source(url("http://lib.stat.cmu.edu/general/tsa2/Rcode/itall.R"))
will work. If you go this route, you should download itall.R anyway so you have it locally
should you need it (it's a small file, about 6kb).
Finally, save your workspace before closing R... that way, all these goodies will be available for the next session... and all the other stuff you did will be there, too. Just say, YES.
First an example using lag.plot2. The full call is lag.plot2(x1,x2,m,corr=TRUE,smooth=TRUE) and it will generate a grid of scatterplots of x1(t-h) versus x2(t) for h = 0,1,...,m, along with the cross-correlation values in blue and a lowess fit in red. Note that the first series, x1, is the one that gets lagged.
If you issue the command lag.plot2(x1,x2,m) you will see the cross-correlations but not the smoothed (red) fit. That is, corr=TRUE and smooth=FALSE is the default. I did it this way because the smooth can be informative but is a little annoying. If you just want the scatterplots and nothing else, then lag.plot2(x1,x2,m,corr=FALSE) is the command.
In the following example, x1 = SOI and x2 = Recruitment; I'm turning on the smooth (the correlations are already on).
soi = scan("/mydata/soi.dat")
rec = scan("/mydata/recruit.dat")
lag.plot2(soi, rec, 8, smooth=TRUE)

which results in the graph:
For a similar plot with one time series, use lag.plot1. The full call is lag.plot1(x,m,corr=TRUE,smooth=TRUE) and it will generate a grid of scatterplots of x(t-h) versus x(t) for h = 1,...,m, along with the autocorrelation values in blue and a lowess fit in red. The defaults are the same as in the previous script. If you don't want any correlations or lines, simply use R's lag.plot. Here's an example assuming soi is still available.
lag.plot1(soi, 12, smooth=TRUE)

which results in the graph:
Now, here's an example using acf2 on the Recruitment series (rec):
acf2(rec)   # computes to lag sqrt(n)+10
        ACF  PACF
 [1,]  0.92  0.92
 [2,]  0.78 -0.44
 [3,]  0.63 -0.05
   .     .     .
   .     .     .
[31,] -0.13  0.06
[32,] -0.11  0.05

you get the values as above ... and a picture:
You can specify the number of lags... 48 for example:
acf2(rec, 48)
        ACF  PACF
 [1,]  0.92  0.92
 [2,]  0.78 -0.44
 [3,]  0.63 -0.05
   .     .     .
   .     .     .
[46,]  0.12  0.05
[47,]  0.17  0.08
[48,]  0.20 -0.04

and the picture:
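By the way, acf2 comes from the script, not base R, but the numbers it tabulates can be reproduced with base R's acf and pacf (what acf2 adds is printing both and plotting them on the same scale). A quick sketch, using a simulated AR(2) so it runs without the data files; rec would work the same way:

```r
# what acf2 tabulates, via base R (illustrative sketch, not the acf2 script itself)
set.seed(1)
x <- arima.sim(list(ar = c(1.5, -.75)), n = 200)   # stand-in series

max.lag <- 48
ACF  <- as.vector(acf(x, max.lag, plot = FALSE)$acf)[-1]   # drop the lag-0 value
PACF <- as.vector(pacf(x, max.lag, plot = FALSE)$acf)
head(round(cbind(ACF, PACF), 2))
```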
The basics are as follows: If your time series is in x and you want to fit an ARIMA(p,d,q) model to your data, the call is sarima(x,p,d,q). The results are the parameter estimates, standard errors, AIC, AICc, BIC (as defined in Chapter 2) and diagnostics. To fit a seasonal ARIMA model, the call is sarima(x,p,d,q,P,D,Q,s). So, for example, sarima(x,2,1,0) will fit an ARIMA(2,1,0) model to the series in x, and sarima(x,2,1,0,0,1,1,12) will fit a seasonal ARIMA(2,1,0)×(0,1,1)₁₂ model to the series in x.
Here are some examples using sarima and sarima.for. First, suppose I generate the random walk with drift, x, mentioned in R Issue 2:
set.seed(1)         # so you can reproduce the results
v = rnorm(100,1,1)  # v contains 100 iid N(1,1) variates
x = cumsum(v)       # x is a random walk with drift = 1
and I try to fit an ARIMA(1,1,0) model:
sarima(x,1,1,0)   # <-- this is the most basic form ... easy, huh?

initial value -0.105745   <-- sarima shows the output from the optimization
iter   2 value -0.105750  <-- this is the Conditional Sum Sq (CSS) part
iter   3 value -0.105750  ** you can shut this off using sarima(x,1,1,0,details=FALSE)
iter   4 value -0.105750  ... additional notes are at the end of these examples
iter   5 value -0.105750
iter   6 value -0.105751
iter   7 value -0.105751
iter   7 value -0.105751
iter   7 value -0.105751
final  value -0.105751
converged
initial value -0.110799   <-- this is the Max Likelihood (ML) part
iter   1 value -0.110799
final  value -0.110799
converged

$fit
Call:
arima(x = data, order = c(p, d, q), seasonal = list(order = c(P, D, Q), period = S),
    xreg = constant, include.mean = F, optim.control = list(trace = 1, REPORT = 1,
    reltol = tol))

Coefficients:
          ar1  constant   <-- notice this says constant (see Issue 1)
      -0.0031    1.1169   <-- and a constant is fit even when differencing (see Issue 2)
s.e.   0.1002    0.0897

sigma^2 estimated as 0.8012:  log likelihood = -129.51,  aic = 265.01

$AIC
[1] 0.8184023   <-- this is AIC as defined in Chapter 2
                    (type ?AIC to see how R computes aic - the two are comparable)

$AICc
[1] 0.8409023   <-- this is AICc as defined in Chapter 2

$BIC
[1] -0.1294943  <-- this is SIC or BIC as defined in Chapter 2
The fit is like

∇x(t) = 1.1169 - .0031*∇x(t-1) + w(t)

with the estimate of σ² being .8012, and the AR term is not significant, as expected.
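Since sarima is a front end to R's arima (look at the Call in the output above), and an ARIMA(1,1,0) with a constant is the same model as an AR(1) with a mean fit to the differenced series, you can essentially reproduce the estimates with base R alone. A sketch of that equivalence (not the sarima code itself):

```r
set.seed(1)              # the same random walk with drift as above
v <- rnorm(100, 1, 1)
x <- cumsum(v)

# ARIMA(1,1,0) + constant on x  is  AR(1) + mean on diff(x)
fit <- arima(diff(x), order = c(1, 0, 0))   # include.mean = TRUE is the default
round(coef(fit), 4)      # compare with ar1 = -0.0031 and constant = 1.1169 above
```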
Note that the command will also generate diagnostics as a graphic, like this:
The innovations (residuals) are available in innov; typing its name prints them:

innov
Time Series:
Start = 1
End = 100
Frequency = 1
  [1] -0.0007433653  0.0667311688 -0.9523344897  1.4754290746  0.2171587300
  [6] -0.9367240697  0.3676241150  0.6225564230  0.4607874263 -0.4208839814
 [11]  1.3935659584  0.2772364838 -0.7373100486 -2.3338899339  1.0008228665
 [16] -0.1587343246 -0.1336016133  0.8265135730  0.7068615543  0.4791632451
 [21]  0.8035377023  0.6676999363 -0.0402937232 -2.1063942292  0.4964132042
 [26] -0.1714883965 -0.2732414093 -1.5885058950 -0.5999620095  0.2991931438
 [31]  1.2426968053 -0.2158670082  0.2700817021 -0.1698812111 -1.4944982876
 [36] -0.5365173470 -0.5128434489 -0.1778029899  0.9825696404  0.6492981652
 [41] -0.2794408221 -0.3711421293  0.5789087390  0.4415416170 -0.8043102945
 [46] -0.8268935803  0.2451257019  0.6523854908 -0.2272469038  0.7634883160
 [51]  0.2835526362 -0.7280703597  0.2219580831 -1.2455829430  1.3122654009
 [56]  1.8675500712 -0.4783818983 -1.1625406761  0.4492243732 -0.2505689069
 [61]  2.2839282635 -0.1491003904  0.5723455848 -0.0871417189 -0.8604594524
 [66]  0.0692256149 -1.9216486145  1.3427114231  0.0405039141  2.0558119984
 [71]  0.3649423437 -0.8257515011  0.4912625277 -1.0494853727 -1.3737890396
 [76]  0.1703043906 -0.5596650308 -0.1175354796 -0.0429279328 -0.7065641699
 [81] -0.6877608810 -0.2542064057  1.0603971169 -1.6372034567  0.4719712284
 [86]  0.2175108404  0.9468547768 -0.4181754728  0.2518073173  0.1509681372
 [91] -0.6589683338  1.0889207200  1.0468578741  0.5865224127  1.4717219044
 [96]  0.4461113121 -1.3921411813 -0.6944781185 -1.3436545925 -0.5944529125

Another way to get the innovations is to give the analysis a name, e.g., dog = sarima(x,1,1,0), and then you can pull out the innovations as dog$fit$resid or resid(dog$fit), or even resid(sarima(x,1,1,0)$fit) if you want to leave out the dog.
Now forecast the random walk 5 ahead:

sarima.for(x,5,1,1,0)
$pred
Time Series:
Start = 101
End = 105
Frequency = 1
[1] 112.0075 113.1244 114.2413 115.3582 116.4751

$se
Time Series:
Start = 101
End = 105
Frequency = 1
[1] 0.8951188 1.2639371 1.5472077 1.7861037 1.9966174
... and the graphic:
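Those forecasts behave the way random-walk-with-drift forecasts should: each step ahead adds the estimated drift (1.1169), and the one-step-ahead standard error is about the square root of the estimated σ². A quick check on the numbers above:

```r
pred <- c(112.0075, 113.1244, 114.2413, 115.3582, 116.4751)  # $pred from above
diff(pred)       # each forecast step adds the estimated drift, 1.1169
sqrt(0.8012)     # about 0.8951, the first $se value above
```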
Next, here's an AR(2) fit to the Recruitment series:

sarima(rec,2,0,0)
initial value 3.332380
iter   2 value 3.251366
iter   3 value 2.564654
iter   4 value 2.430141
iter   5 value 2.258212
iter   6 value 2.253343
iter   7 value 2.248346
iter   8 value 2.248345
iter   9 value 2.248345
iter  10 value 2.248341
iter  11 value 2.248332
iter  12 value 2.248331
iter  13 value 2.248330
iter  13 value 2.248330
iter  13 value 2.248330
final  value 2.248330
converged                 <-- CSS converged in 13 (or 14 or 15 ???) steps
initial value 2.248862
iter   2 value 2.248857
iter   3 value 2.248855
iter   4 value 2.248855
iter   5 value 2.248854
iter   6 value 2.248854
iter   7 value 2.248854
iter   8 value 2.248854
iter   9 value 2.248853
iter  10 value 2.248853
iter  10 value 2.248853
iter  10 value 2.248853
final  value 2.248853
converged                 <-- and from there, ML converged in 10-12 steps

$fit
Call:
arima(x = data, order = c(p, d, q), seasonal = list(order = c(P, D, Q), period = S),
    xreg = xmean, include.mean = F, optim.control = list(trace = 1, REPORT = 1,
    reltol = tol))

Coefficients:
         ar1      ar2    xmean
      1.3512  -0.4612  61.8585
s.e.  0.0416   0.0417   4.0039

sigma^2 estimated as 89.33:  log likelihood = -1661.51,  aic = 3331.02

$AIC
[1] 5.505631

$AICc
[1] 5.510243

$BIC
[1] 4.532889
The fitted model is like

x(t) = 61.8585*(1 - 1.3512 - (-.4612)) + 1.3512*x(t-1) - .4612*x(t-2) + w(t)

Sorry, I didn't feel like trying to get sarima to report the constant here (it's 61.8585*0.11 = 6.8044 if you want it). At least the wording is correct (see Issue 1).

Now forecast 5 ahead:

sarima.for(rec,5,2,0,0)
$pred
Time Series:
Start = 454
End = 458
Frequency = 1
[1] 20.36547 26.08036 32.65148 38.89474 44.30006

$se
Time Series:
Start = 454
End = 458
Frequency = 1
[1] 9.451686 15.888378 20.464325 23.492457 25.393693

Because there are more than 100 observations, for your viewing pleasure, the graphic will show only the final 100 observations and the forecasts (and error bounds):

(Looking ahead to the spec.arma examples below: if you don't want the logged spectrum, use spec.arma(ar=c(1.4,-.9), ma=.8, log="no"). The script will also test the roots, and you'll get warnings if the roots are not outside the unit circle or if there is parameter redundancy.)
Here's a seasonal model fit on the recruitment data, minus the diagnostic graphics and with the details off (this is just an example; it's not a very good model).
sarima(rec,2,0,0,1,0,0,12, details=FALSE)  # <-- it's an ARIMA(2,0,0)×(1,0,0)₁₂
Call:
arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, Q), period = S),
xreg = xmean, include.mean = F, optim.control = list(trace = trc, REPORT = 1,
reltol = tol))
Coefficients:
ar1 ar2 sar1 xmean
1.3466 -0.4514 0.1272 61.8056
s.e. 0.0419 0.0421 0.0468 4.7502
sigma^2 estimated as 87.86: log likelihood = -1657.85, aic = 3325.71
$AIC
[1] 5.493421
$AICc
[1] 5.498132
$BIC
[1] 4.529764
◊ In sarima, if there's no differencing (d=0 and D=0) you get the mean estimate.
If there's differencing of order one (either d=1 or D=1, but not both), you'll get the constant term.
In any other situation, no constant or mean term is included in the model. The idea is that if
you difference more than once (d + D > 1), any drift would have been removed.
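That last point is easy to see directly: drift survives one difference as a constant, but a second difference removes it. A little base-R illustration (no sarima needed):

```r
set.seed(1)
x <- cumsum(rnorm(100, 1, 1))     # random walk with drift 1

mean(diff(x))                     # one difference: mean is near the drift, 1
mean(diff(x, differences = 2))    # two differences: mean is near 0
```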
◊ The relative tolerance (reltol) used to assess convergence in sarima.R and sarima.for.R
is set to sqrt(.Machine$double.eps), the R default.
For details, see the help file for optim under the control
arguments. You can change this without changing the code. For example, sarima(rec,2,0,0,tol=.0001) will
do the second example with reltol = .0001. If there are many parameters
to estimate (e.g., seasonal models), the analysis may take a long time
using the default. That's not a problem if you're sitting at your desk, but if you're doing a demonstration you
may need a good joke to tell while you're waiting for convergence. The one I thought of on the spot was: This is
why cigarettes were invented. They last about as long as it takes to get results. The joke went over pretty
well; you can use it free of charge.
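The same reltol knob exists in base R's arima through its optim.control argument, so you can see the effect without the scripts. A sketch, using a simulated AR(2) stand-in series (the timing is what changes; the estimates barely move):

```r
set.seed(1)
x <- arima.sim(list(ar = c(1.5, -.75)), n = 500)   # stand-in AR(2) series

fit.default <- arima(x, order = c(2, 0, 0))        # reltol = sqrt(.Machine$double.eps)
fit.loose   <- arima(x, order = c(2, 0, 0), optim.control = list(reltol = 1e-4))
round(rbind(default = coef(fit.default), loose = coef(fit.loose)), 4)
```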
spec.arma
And finally, here are some examples using spec.arma, which will produce the spectral density of an
ARMA model. It can also be used to determine if the model is causal and/or invertible (and there's a check for common zeros). The basic call is spec.arma(ar, ma) where
ar and ma are vectors containing the model parameters. For example, to see the logged spectral density
of the ARMA(2,1) model x(t) = 1.4 x(t-1) - .9 x(t-2) + w(t) + .8 w(t-1), you issue the command:
spec.arma(ar=c(1.4,-.9), ma=.8)
and you would see
Here are some calls that won't work:

spec.arma(ar=1)
WARNING: Model Not Causal
Error in spec.arma(ar = 1) : Try Again
spec.arma(ma=1)
WARNING: Model Not Invertible
Error in spec.arma(ma = 1) : Try Again
spec.arma(ar=39.95, ma=c(.9,2))
WARNING: Model Not Causal
WARNING: Model Not Invertible
Error in spec.arma(ar = 39.95, ma = c(0.9, 2)) : Try Again
# common zeros:
spec.arma(ar=.8, ma=-.8)  # this will work, but you'll see the spectral density of white noise
WARNING: Parameter Redundancy   # ... and you'll get a warning
# approximate parameter redundancy (common zeros):
spec.arma(ar=.8, ma=-.7999)
WARNING: Parameter Redundancy
# note that the spectral ordinates are between 0.9999 and 1.0010, so you basically have white noise
# no arguments gives the spectral density of white noise with σ² = 1:
spec.arma()
# and if you want to change the variance, do something like this:
spec.arma(ar = c(1.5,-.75), var.noise = 39.95, log="no")
Finally, the frequencies and the spectral density ordinates are returned invisibly, e.g.,
spec.arma(ar=.9)$freq and spec.arma(ar=.9)$spec, if you're interested in
the actual values.
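If you don't have the script handy, the calculation behind a plot like this is short: under the usual parameterization, the ARMA spectral density is f(ν) = σ² |θ(e^(-2πiν))|² / |φ(e^(-2πiν))|² for 0 ≤ ν ≤ 1/2. Here's a bare-bones base-R sketch of just the ordinates (arma.spec.sketch is an illustrative name, and there is no causality/invertibility/redundancy checking here, which is the point of the real spec.arma):

```r
# spectral density ordinates of an ARMA model -- a sketch, not the spec.arma script
arma.spec.sketch <- function(ar = numeric(0), ma = numeric(0), var.noise = 1, n.freq = 500) {
  nu <- seq(0, 0.5, length.out = n.freq)   # frequencies in cycles per unit time
  z  <- exp(-2i * pi * nu)
  phi.z   <- rep(1 + 0i, n.freq)           # phi(z)   = 1 - ar[1]*z - ar[2]*z^2 - ...
  theta.z <- rep(1 + 0i, n.freq)           # theta(z) = 1 + ma[1]*z + ma[2]*z^2 + ...
  for (j in seq_along(ar)) phi.z   <- phi.z   - ar[j] * z^j
  for (j in seq_along(ma)) theta.z <- theta.z + ma[j] * z^j
  list(freq = nu, spec = var.noise * Mod(theta.z)^2 / Mod(phi.z)^2)
}

s <- arma.spec.sketch(ar = c(1.4, -.9), ma = .8)
# plot(s$freq, log(s$spec), type = "l")   # compare with the spec.arma graphic
```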