--- title: "Using expsmooth: worked examples" author: "Justin Carmody, Rob J Hyndman" date: "`r Sys.Date()`" output: rmarkdown::html_vignette: fig_width: 7.15 fig_height: 4.42 vignette: > %\VignetteIndexEntry{Using expsmooth: worked examples} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(ggplot2) library(expsmooth) ``` # Introduction This package contains a collection of datasets that are designed to accompany the book "Forecasting with Exponential Smoothing: The State Space Approach" by Rob Hyndman, Anne B. Koehler, J. Keith Ord, and Ralph D. Snyder (Wiley, 3rd ed., 1998). The book can be purchased [here](http://amzn.com/3540719164/?tag=otexts-20). [![](http://pkg.robjhyndman.com/expsmooth/reference/figures/expsmooth.jpg)](http://exponentialsmoothing.net) When the `expsmooth` package is loaded, the `forecast` package is also loaded, providing the functions to fit and forecast with exponential smoothing state space models. This vignette will replicate **Section 2.8** from the book, and provide worked solutions to some of the exercises in **Section 2.9**. The forecast package has been updated since the book was published, and some of the resulting model estimates may be different from those in the book. Figure numbers are taken from the book. # Data sets A graph of a time series often exhibits patterns, such as an upward or downward movement (trend) or a pattern that repeats (seasonal variation), that might be used to forecast future values. Chapters 1 and 2 reference four data sets that are included in the `expsmooth` package. - **bonds** 125 monthly US government bond yields (percent per annum) from January 1994 to May 2004. - **usnetelec** 55 observations of annual US net electricity generation (billion kwh) for 1949 through 2003. - **ukcars** 113 quarterly observations of passenger motor vehicle production in the UK (thousands of cars) for the first quarter of 1977 through the first quarter of 2005. - **visitors** 240 monthly observations of the number of short term overseas visitors to Australia from May 1985 to April 2005. These time series are shown in **Figure 1.1** which is reproduced below: ```{r fig.width=7.15, fig.height=5} plot_bonds <- autoplot(bonds) + ggtitle('(a) US 10-year Bonds Yield') + xlab('Year') + ylab('Percentage per Annum') plot_usnetelec <- autoplot(usnetelec) + ggtitle('(b) US Net Electricity Generation') + xlab('Year') + ylab('Billion kWh') plot_ukcars <- autoplot(ukcars) + ggtitle('(c) UK Passenger Vehicle Production') + xlab('Year') + ylab('Thousands of Cars') plot_visitors <- autoplot(visitors) + ggtitle('(d) Overseas Visitors to Australia') + xlab('Year') + ylab('Thousands of People') gridExtra::grid.arrange(plot_bonds, plot_usnetelec, plot_ukcars, plot_visitors, nrow=2) ``` # Model selection exercise This part of the vignette will follow the methodology described in **Section 2.8** of the book, and reproduce the results that are reported there. This also provides answers to **Exercise 2.3** and **Exercise 2.4**. The estimation and model selection are performed by the `ets()` function and the forecasting is done by the `forecast()` function. These are both a part of the `forecast` package. A basic introduction to using these functions is given in **Section 7.6** and **Section 7.7** of ["Forecasting: Principles and Practice" by George Athanasopoulos and Rob J. Hyndman](https://otexts.org/fpp2/). The automatic forecasting process will by carried out for each of the four data sets. The process will be explained for the first data set, and the relevant results will be reported for the others. ### bonds The `ets()` function is used to apply all appropriate models (optimising parameters in each case), and then it also selects the best model according to the AICc. The AICc is the default penalised likelihood, but others can be specified. ```{r} fit_bonds <- ets(bonds) summary(fit_bonds) ``` The `autoplot()` function can then be used to show the states over time. ```{r fig.width=7.15, fig.height=5} autoplot(fit_bonds) ``` The `forecast()` function is used to produce point forecasts and prediction intervals. When the `summary()` function is called on the forecast object, it prints the model information along with the point forecasts and prediction intervals. ```{r} forecast_bonds <- forecast(fit_bonds, level=80) summary(forecast_bonds) ``` The `autoplot()` function can then be called on the forecast object to produce the graph shown in **Figure 2.1**. ```{r fig.width=7.15, fig.height=5} autoplot(forecast_bonds) + xlab('Year') + ylab('Percentage per annum') + ggtitle("US 10-year bond yields") + labs(caption="Fig. 2.1a") ``` ### usnetelec ```{r} fit_usnetelec <- ets(usnetelec) forecast_usnetelec <- forecast(fit_usnetelec, level=80) summary(forecast_usnetelec) autoplot(forecast_usnetelec) + xlab('Year') + ylab('Billion kwh') + ggtitle("US net electricity generation") + labs(caption="Fig. 2.1b") ``` ### ukcars ```{r} fit_ukcars <- ets(ukcars) forecast_ukcars <- forecast(fit_ukcars, level=80) summary(forecast_ukcars) autoplot(forecast_ukcars) + xlab('Year') + ylab('Thousands of cars') + ggtitle("UK passenger motor vehicle production") + labs(caption="Fig. 2.1c") ``` ### visitors ```{r} fit_visitors <- ets(visitors) forecast_visitors <- forecast(fit_visitors, level=80) summary(forecast_visitors) autoplot(forecast_visitors) + xlab('Year') + ylab('Thousands of people') + ggtitle("Overseas visitors to Australia") + labs(caption="Fig. 2.1d") ```