Here are examples of how to use lag.plot1, lag.plot2, acf2, sarima and sarima.for. Get it all in one file: itall.R. Just download the file to a convenient location and then "source" it.
SOURCING CODE: Sourcing code is easy. The basic idea is you can have any number of R commands in an external file and you can execute them at any time by reading the file into R. This can be done by copying-and-pasting the commands into R, or by sourcing the file, say ‘itall.R’, by issuing the command source("itall.R"). For Windows, Source is available on the File menu. Here's a little movie that shows how to source code: start the show
It is possible to source this code without downloading the file.
For example, either
source(url("http://www.stat.pitt.edu/stoffer/tsa2/Rcode/itall.R")) or
source(url("http://lib.stat.cmu.edu/general/tsa2/Rcode/itall.R"))
will work. If you go this route, you should download itall.R anyway so you have it locally
should you need it (it's a small file, about 4kb).
Finally, save your workspace before closing R... that way, all these goodies will be available for the next session... and all the other stuff you did will be there, too. Just say, YES.
First an example using lag.plot2. The full call is lag.plot2(x1,x2,m,corr=T,smooth=T) and it will generate a grid of scatterplots of x1(t-h) versus x2(t) for h = 0,1,...,m, along with the cross-correlation values in blue and a lowess fit in red. Note that the first series, x1, is the one that gets lagged.
If you issue the command lag.plot2(x1,x2,m) you will see the cross-correlations but not the smoothed (red) fit. That is, corr=T and smooth=F is the default. I did it this way because the smooth is informative but is a little annoying. If you just want the scatterplots and nothing else, then lag.plot2(x1,x2,m,corr=F) is the command.
In the following example, x1=SOI and x2=Recruitment and I'm turning on the smooth, the correlations are already on.
soi=scan("/mydata/soi.dat")
rec=scan("/mydata/recruit.dat")
lag.plot2(soi,rec,8,smooth=T)
which results in the graph:
|
For a similar plot with one time series, use lag.plot1. The full call is lag.plot1(x,m,corr=T,smooth=T) and it will generate a grid of scatterplots of x(t-h) versus x(t) for h = 1,...,m, along with the autocorrelation values in blue and a lowess fit in red. The defaults are the same as in the previous script. If you don't want any correlations or lines, simply use R's lag.plot. Here's an example assuming soi is still available.
lag.plot1(soi,12,smooth=T)which results in the graph:
|
Now, here's an example using acf2 on the Recruitment series (rec):
acf2(rec) # computes to lag sqrt(n)+10
ACF PACF
[1,] 0.92 0.92
[2,] 0.78 -0.44
[3,] 0.63 -0.05
. . .
. . .
[31,] -0.13 0.06
[32,] -0.11 0.05
you get the values as above ... and a picture:
|
You can specify the number of lags... 48 for example:
acf2(rec,48)
ACF PACF
[1,] 0.92 0.92
[2,] 0.78 -0.44
[3,] 0.63 -0.05
. . .
. . .
[46,] 0.12 0.05
[47,] 0.17 0.08
[48,] 0.20 -0.04
and the picture:
|
set.seed(1) # so you can reproduce the results v = rnorm(100,1,1) # v contains 100 iid N(1,1) variates x = cumsum(v) # x is a random walk with drift = 1
and I try to fit an ARIMA(1,1,0) model:
sarima(x,1,1,0)
initial value -0.105745 <-- sarima shows the output from the optimization
iter 2 value -0.105750 <-- this is the Conditional Sum Sq (CSS) part
iter 3 value -0.105750 (*for details, see the note at the bottom of the page)
iter 4 value -0.105750
iter 5 value -0.105750
iter 6 value -0.105751
iter 7 value -0.105751
iter 7 value -0.105751
iter 7 value -0.105751
final value -0.105751
converged
initial value -0.110799 <-- this is the Max Likelihood (ML) part
iter 1 value -0.110799
final value -0.110799
converged
$fit
Call:
arima(x = data, order = c(p, d, q), seasonal = list(order = c(P, D, Q), period = S),
xreg = constant, include.mean = F, optim.control = list(trace = 1, REPORT = 1, reltol=tol))
Coefficients:
ar1 constant <-- notice this says constant (see Issue 1)
-0.0031 1.1169 <-- and a constant is fit even when
s.e. 0.1002 0.0897 differencing (see Issue 2)
sigma^2 estimated as 0.8012: log likelihood = -129.51, aic = 265.01
$AIC
[1] 0.8184023 <-- this is AIC as defined in Chapter 2
(type ?AIC to see how R computes aic - the two are comparable)
$AICc
[1] 0.8409023 <-- this is AICc as defined in Chapter 2
$BIC
[1] -0.1294943 <-- this is SIC or BIC as defined in Chapter 2
Note that the command will also generate diagnostics as a graphic, like this:
|
The fit is like
∇x(t) = 1.1169 -.0031*∇x(t-1) + w(t)
with the estimate of
σ2 being .8012
and the AR term is not significant, as expected.
If you want to play with the innovations (i.e., the residuals) from the fit, they're
stored in innov:
innov Time Series: Start = 1 End = 100 Frequency = 1 [1] -0.0007433653 0.0667311688 -0.9523344897 1.4754290746 0.2171587300 [6] -0.9367240697 0.3676241150 0.6225564230 0.4607874263 -0.4208839814 [11] 1.3935659584 0.2772364838 -0.7373100486 -2.3338899339 1.0008228665 [16] -0.1587343246 -0.1336016133 0.8265135730 0.7068615543 0.4791632451 [21] 0.8035377023 0.6676999363 -0.0402937232 -2.1063942292 0.4964132042 [26] -0.1714883965 -0.2732414093 -1.5885058950 -0.5999620095 0.2991931438 [31] 1.2426968053 -0.2158670082 0.2700817021 -0.1698812111 -1.4944982876 [36] -0.5365173470 -0.5128434489 -0.1778029899 0.9825696404 0.6492981652 [41] -0.2794408221 -0.3711421293 0.5789087390 0.4415416170 -0.8043102945 [46] -0.8268935803 0.2451257019 0.6523854908 -0.2272469038 0.7634883160 [51] 0.2835526362 -0.7280703597 0.2219580831 -1.2455829430 1.3122654009 [56] 1.8675500712 -0.4783818983 -1.1625406761 0.4492243732 -0.2505689069 [61] 2.2839282635 -0.1491003904 0.5723455848 -0.0871417189 -0.8604594524 [66] 0.0692256149 -1.9216486145 1.3427114231 0.0405039141 2.0558119984 [71] 0.3649423437 -0.8257515011 0.4912625277 -1.0494853727 -1.3737890396 [76] 0.1703043906 -0.5596650308 -0.1175354796 -0.0429279328 -0.7065641699 [81] -0.6877608810 -0.2542064057 1.0603971169 -1.6372034567 0.4719712284 [86] 0.2175108404 0.9468547768 -0.4181754728 0.2518073173 0.1509681372 [91] -0.6589683338 1.0889207200 1.0468578741 0.5865224127 1.4717219044 [96] 0.4461113121 -1.3921411813 -0.6944781185 -1.3436545925 -0.5944529125Another way to get the innovations is to give the analysis a name, e.g., dog=sarima(x,1,1,0), and then you can pull out the innovations as dog$fit$resid or resid(dog$fit) or even resid(sarima(x,1,1,0)$fit) if you want to leave out the dog.
sarima.for(x,5,1,1,0) $pred Time Series: Start = 101 End = 105 Frequency = 1 [1] 112.0075 113.1244 114.2413 115.3582 116.4751 $se Time Series: Start = 101 End = 105 Frequency = 1 [1] 0.8951188 1.2639371 1.5472077 1.7861037 1.9966174
... and the graphic:
|
sarima(rec,2,0,0)
initial value 3.332380
iter 2 value 3.251366
iter 3 value 2.564654
iter 4 value 2.430141
iter 5 value 2.258212
iter 6 value 2.253343
iter 7 value 2.248346
iter 8 value 2.248345
iter 9 value 2.248345
iter 10 value 2.248341
iter 11 value 2.248332
iter 12 value 2.248331
iter 13 value 2.248330
iter 13 value 2.248330
iter 13 value 2.248330 CSS converged in 13 (or 14 or 15 ???) steps
final value 2.248330
converged
initial value 2.248862
iter 2 value 2.248857
iter 3 value 2.248855
iter 4 value 2.248855
iter 5 value 2.248854
iter 6 value 2.248854
iter 7 value 2.248854
iter 8 value 2.248854
iter 9 value 2.248853
iter 10 value 2.248853
iter 10 value 2.248853
iter 10 value 2.248853 and from there, ML converged in 10-12 steps
final value 2.248853
converged
$fit
Call:
arima(x = data, order = c(p, d, q), seasonal = list(order = c(P, D, Q), period = S),
xreg = xmean, include.mean = F, optim.control = list(trace = 1, REPORT = 1, reltol=tol))
Coefficients:
ar1 ar2 xmean
1.3512 -0.4612 61.8585
s.e. 0.0416 0.0417 4.0039
sigma^2 estimated as 89.33: log likelihood = -1661.51, aic = 3331.02
$AIC
[1] 5.505631
$AICc
[1] 5.510243
$BIC
[1] 4.532889
The fitted model is like
x(t) = 61.8585*(1-1.3512-(-.4612)) + 1.3512*x(t-1) - .4612*x(t-2) + w(t)
Sorry, I didn't feel like trying to get sarima to report the constant here.
At least the wording is correct (see Issue 1).
Now forecast 5 ahead:
sarima.for(rec,5,2,0,0) $pred Time Series: Start = 454 End = 458 Frequency = 1 [1] 20.36547 26.08036 32.65148 38.89474 44.30006 $se Time Series: Start = 454 End = 458 Frequency = 1 [1] 9.451686 15.888378 20.464325 23.492457 25.393693Because there are more than 100 observations, for your viewing pleasure, the graphic will show only the final 100 observations and the forecasts (and error bounds):
|
sarima(rec,2,0,0,1,0,0,12) <-- it's an ARIMA(2,0,0)×(1,0,0)12
initial value 3.342720
iter 2 value 3.261763
iter 3 value 2.566046
iter 4 value 2.423603
iter 5 value 2.277371
iter 6 value 2.258071
iter 7 value 2.251079
iter 8 value 2.250614
iter 9 value 2.250393
iter 10 value 2.250393
iter 10 value 2.250393
final value 2.250393
converged
initial value 2.240806
iter 2 value 2.240796
iter 3 value 2.240793
iter 4 value 2.240792
iter 5 value 2.240791
iter 6 value 2.240789
iter 7 value 2.240787
iter 8 value 2.240783
iter 9 value 2.240782
iter 10 value 2.240782
iter 10 value 2.240782
final value 2.240782
converged
$fit
Call:
arima(x = data, order = c(p, d, q), seasonal = list(order = c(P, D, Q), period = S),
xreg = xmean, include.mean = F, optim.control = list(trace = 1, REPORT = 1, reltol=tol))
Coefficients:
ar1 ar2 sar1 xmean
1.3466 -0.4514 0.1272 61.8056
s.e. 0.0419 0.0421 0.0468 4.7502
sigma^2 estimated as 87.86: log likelihood = -1657.85, aic = 3325.71
$AIC
[1] 5.493421
$AICc
[1] 5.498132
$BIC
[1] 4.529764