Introduction to the tsdl package

The Time Series Data Library (TSDL) was created by Rob Hyndman, Professor of Statistics at Monash University, Australia. It includes data from many time series textbooks, as well as many other series that he either collected for student projects or received from helpful people.

The data library was hosted on Professor Hyndman’s personal website from about 1992. In 2012 it was moved to DataMarket, which provides much better facilities for maintaining and using time series data. You can still access the data library from the website, or use the rdatamarket package to read it into R, but the tsdl package provides a simpler means.

If you use any data from the TSDL in a publication, please use the following citation:

Rob Hyndman and Yangzhuoran Yang (2018). tsdl: Time Series Data Library. v0.1.0. https://pkg.yangzhuoranyang./tsdl/.

The data files will remain on Professor Hyndman’s personal website so that existing links will not be broken.

Installation

You can install the development version from GitHub:

# install.packages("devtools")
devtools::install_github("FinYang/tsdl")

Usage

tsdl is a list of 648 series of class tsdl. Each series within tsdl is of class ts.

library(tsdl)
tsdl
#> Time Series Data Library: 648 time series  
#> 
#>                        Frequency
#> Subject                 0.1 0.25   1   4   5   6  12  13  52 365 Total
#>   Agriculture             0    0  37   0   0   0   3   0   0   0    40
#>   Chemistry               0    0   8   0   0   0   0   0   0   0     8
#>   Computing               0    0   6   0   0   0   0   0   0   0     6
#>   Crime                   0    0   1   0   0   0   2   1   0   0     4
#>   Demography              1    0   9   2   0   0   3   0   0   2    17
#>   Ecology                 0    0  23   0   0   0   0   0   0   0    23
#>   Finance                 0    0  23   5   0   0  20   0   2   1    51
#>   Health                  0    0   8   0   0   0   6   0   1   0    15
#>   Hydrology               0    0  42   0   0   0  78   1   0   6   127
#>   Industry                0    0   9   0   0   0   2   0   1   0    12
#>   Labour market           0    0   3   4   0   0  17   0   0   0    24
#>   Macroeconomic           0    0  18  33   0   0   5   0   0   0    56
#>   Meteorology             0    0  18   0   0   0  17   0   0  12    47
#>   Microeconomic           0    0  27   1   0   0   7   0   1   0    36
#>   Miscellaneous           0    0   4   0   1   1   3   0   1   0    10
#>   Physics                 0    0  12   0   0   0   4   0   0   0    16
#>   Production              0    0   4  14   0   0  28   1   1   0    48
#>   Sales                   0    0  10   3   0   0  24   0   9   0    46
#>   Sport                   0    1   1   0   0   0   0   0   0   0     2
#>   Transport and tourism   0    0   1   1   0   0  12   0   0   0    14
#>   Tree-rings              0    0  34   0   0   0   1   0   0   0    35
#>   Utilities               0    0   2   1   0   0   8   0   0   0    11
#>   Total                   1    1 300  64   1   1 240   3  16  21   648
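Because tsdl is described as a list whose elements are ts objects, a single series can presumably be pulled out with the usual `[[` operator and handed to any base R function that accepts a ts. The following is a minimal sketch, assuming the package is installed; the name `first_series` is only illustrative.

```r
library(tsdl)

# Extract one element of the library; it should be an ordinary ts object
first_series <- tsdl[[1]]

class(first_series)      # class of the extracted element
frequency(first_series)  # observations per unit of time
start(first_series)      # first time point of the series
length(tsdl)             # number of series in the library

# Base R plotting works directly on ts objects
plot(first_series)
```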

To extract series with specific features, use the subset function. The most common way to extract series is to specify the frequency or the subject (type) of the series. The positions of these two conditions are interchangeable.

# Subset by frequency
tsdl_quarterly <- subset(tsdl, 4)
tsdl_quarterly
#> Time Series Data Library: 64 time series with frequency 4 
#> 
#>                        Frequency
#> Subject                  4
#>   Demography             2
#>   Finance                5
#>   Labour market          4
#>   Macroeconomic         33
#>   Microeconomic          1
#>   Production            14
#>   Sales                  3
#>   Transport and tourism  1
#>   Utilities              1
#>   Total                 64

# Subset by subject
tsdl_industry <- subset(tsdl, "Industry")
tsdl_industry
#> Time Series Data Library: 12 Industry time series  
#> 
#>           Frequency
#> Subject     1 12 52 Total
#>   Industry  9  2  1    12

# Subset by frequency and subject
# Frequency 12 corresponds to monthly data
tsdl_monthly_industry <- subset(tsdl, 12, "Industry")
tsdl_monthly_industry
#> Time Series Data Library: 2 Industry time series with frequency 12 
#> 
#>           Frequency
#> Subject    12
#>   Industry  2
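The text states that the frequency and subject conditions can be given in either order. A short sketch of what that implies, assuming the swapped call behaves as described; the names `a` and `b` are illustrative only.

```r
library(tsdl)

# Frequency first, subject second
a <- subset(tsdl, 12, "Industry")

# Subject first, frequency second
b <- subset(tsdl, "Industry", 12)

# Both calls are expected to select the same number of series
length(a)
length(b)
```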

You can also subset the library by start year, or by keywords in the source or description attributes of the series.

# Subset by source
tsdl_abs <- subset(tsdl, source = "Australian Bureau of Statistics")
tsdl_abs
#> Time Series Data Library: 65 time series  
#> 
#>                        Frequency
#> Subject                  1  4 12 Total
#>   Agriculture            0  0  1     1
#>   Demography             0  2  1     3
#>   Finance                0  1  2     3
#>   Labour market          0  0  4     4
#>   Macroeconomic          1 19  1    21
#>   Production             0 13 16    29
#>   Sales                  0  0  1     1
#>   Transport and tourism  0  0  2     2
#>   Utilities              0  1  0     1
#>   Total                  1 36 28    65

# Subset by starting year
tsdl_1948 <- subset(tsdl, start = 1948)
tsdl_1948
#> Time Series Data Library: 10 time series  
#> 
#>                Frequency
#> Subject          4 12 Total
#>   Hydrology      0  1     1
#>   Labour market  1  5     6
#>   Macroeconomic  3  0     3
#>   Total          4  6    10

# Subset by description
tsdl_nettraffic <- subset(tsdl, description = "Internet traffic")
tsdl_nettraffic
#> Time Series Data Library: 6 Computing time series with frequency 1 
#> 
#>            Frequency
#> Subject     1
#>   Computing 6

To see the metadata attached to a time series, extract its attributes directly.

attributes(tsdl[[1]])
#> $tsp
#> [1] 1948.00 1979.75    4.00
#> 
#> $class
#> [1] "ts"
#> 
#> $source
#> [1] "Abraham & Ledolter (1983)"
#> 
#> $description
#> [1] "Quarterly Iowa nonfarm income (1948 – 1979)"
#> 
#> $subject
#> [1] "Macroeconomic"
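When only one piece of metadata is needed, base R's `attr()` returns a single named attribute instead of the whole list. A minimal sketch, using the attribute names shown in the output above; the variable `x` is illustrative.

```r
library(tsdl)

x <- tsdl[[1]]

# Retrieve individual attributes by name
attr(x, "source")
attr(x, "description")
attr(x, "subject")

# The time-series parameters (start, end, frequency) are also available via tsp()
tsp(x)
```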

The attributes of all series are collected in the data frame meta_tsdl. It also shows the possible values of subject and of the other fields that can be used when subsetting the time series.

str(meta_tsdl)
#> Classes 'tbl_df', 'tbl' and 'data.frame':    648 obs. of  5 variables:
#>  $ source     : chr  "Abraham & Ledolter (1983)" "Abraham & Ledolter (1983)" "Abraham & Ledolter (1983)" "Abraham & Ledolter (1983)" ...
#>  $ description:List of 648
#>   ..$ : chr "Quarterly Iowa nonfarm income (1948 – 1979)"
#>   ..$ : chr "Monthly demand repair parts large/heavy equip. Iowa 1972 – 1979"
#>   ..$ : chr "Montly av. residential gas usage Iowa (cubic feet)*100 –71 – –79"
#>   ..$ : chr "Monthly gasoline demand Ontario gallon millions 1960 – 1975"
#>   ..$ : chr "Monthly % changes in Canadian wages and salaries –67 – –75"
#>   ..$ : chr "Monthly sales of U.S. houses (thousands) 1965 – 1975"
#>   ..$ : chr "Quarterly growth rates of Iowa nonfarm income (1948 – 1979)"
#>   ..$ : chr "Monthly Av. residential electricity usage Iowa city 1971 – 1979"
#>   ..$ : chr "Monthly car sales in Quebec 1960-1968"
#>   ..$ : chr "Monthly traffic fatalities in Ontario 1960-1974"
#>   ..$ : chr "Monthly U.S. housing starts (privately owned 1-family) 1965 – 1975"
#>   ..$ : chr "Quarterly U.S. new plant/equip. expenditures –64 – –76 billions"
#>   ..$ : chr "Four-weekly totals of beer shipments"
#>   ..$ : chr "Monthly diffs yields: mortgages and govt. loans, Holland –61 – –74"
#>   ..$ : chr "More advertising and sales data: 36 consecutive monthly sales and advertising expenditures of a dietary weight control product"
#>   ..$ : chr "Monthly U.S. male (16-19 years) unemployment figures (thousands) 1948-1981"
#>   ..$ : chr "Monthly U.S. male (20 years and over) unemployment figures (10**3) 1948-1981"
#>   ..$ : chr "Monthly U.S. female (16-19 years) unemployment figures (thousands) 1948-1981"
#>   ..$ : chr "Monthly U.S. female (20 years and over) unemployment figures (10**3) 1948-1981"
#>   ..$ : chr "Zuerich monthly sunspot numbers 1749-1983"
#>   ..$ : chr "Annual yield of grain on Broadbalk field at Rothamsted 1852-1925: plot–2B"
#>   ..$ : chr "Annual yield of grain on Broadbalk field at Rothamsted 1852-1925: plot 3"
#>   ..$ : chr "Annual yield of grain on Broadbalk field at Rothamsted 1852-1925: plot 5"
#>   ..$ : chr "Annual yield of grain on Broadbalk field at Rothamsted 1852-1925: plot 6"
#>   ..$ : chr "Annual yield of grain on Broadbalk field at Rothamsted 1852-1925: plot 7"
#>   ..$ : chr "Annual number of lynx pelts (Hudson–s Bay company, Canada) 1857 – 1911"
#>   ..$ : chr "Annual yield of grain on Broadbalk field at Rothamsted 1852-1925: plot 8"
#>   ..$ : chr "Annual yield of grain on Broadbalk field at Rothamsted 1852-1925: plot 9"
#>   ..$ : chr "Annual yield of grain on Broadbalk field at Rothamsted 1852-1925: plot 10"
#>   ..$ : chr "Annual yield of grain on Broadbalk field at Rothamsted 1852-1925: plot 11"
#>   ..$ : chr "Annual yield of grain on Broadbalk field at Rothamsted 1852-1925: plot 12"
#>   ..$ : chr "Annual yield of grain on Broadbalk field at Rothamsted 1852-1925: plot 13"
#>   ..$ : chr "Annual yield of grain on Broadbalk field at Rothamsted 1852-1925: plot 14"
#>   ..$ : chr "Annual yield of grain on Broadbalk field at Rothamsted 1852-1925: plot 15"
#>   ..$ : chr "Annual yield of grain on Broadbalk field at Rothamsted 1852-1925: plot 16"
#>   ..$ : chr "Annual yield of grain on Broadbalk field at Rothamsted 1852-1925: plot 17"
#>   ..$ : chr "Annual unit price of lynx pelts (Hudson–s Bay company, Canada) 1857 – 1911"
#>   ..$ : chr "Annual yield of grain on Broadbalk field at Rothamsted 1852-1925: plot 18"
#>   ..$ : chr "Annual yield of grain on Broadbalk field at Rothamsted 1852-1925: plot 19"
#>   ..$ : chr "Annual yield of straw on Broadbalk field at Rothamsted 1852-1925: plot–2B"
#>   ..$ : chr "Annual yield of straw on Broadbalk field at Rothamsted 1852-1925: plot–3"
#>   ..$ : chr "Annual yield of straw on Broadbalk field at Rothamsted 1852-1925: plot–5"
#>   ..$ : chr "Annual yield of straw on Broadbalk field at Rothamsted 1852-1925: plot–6"
#>   ..$ : chr "Annual yield of straw on Broadbalk field at Rothamsted 1852-1925: plot–7"
#>   ..$ : chr "Annual yield of straw on Broadbalk field at Rothamsted 1852-1925: plot–8"
#>   ..$ : chr "Annual yield of straw on Broadbalk field at Rothamsted 1852-1925: plot–9"
#>   ..$ : chr "Annual yield of straw on Broadbalk field at Rothamsted 1852-1925: plot–10"
#>   ..$ : chr "Monthly mean thickness (Dobson units) ozone column Arosa, Switzerland 1926-1971"
#>   ..$ : chr "Annual yield of straw on Broadbalk field at Rothamsted 1852-1925: plot–11"
#>   ..$ : chr "Annual yield of straw on Broadbalk field at Rothamsted 1852-1925: plot–12"
#>   ..$ : chr "Annual yield of straw on Broadbalk field at Rothamsted 1852-1925: plot–13"
#>   ..$ : chr "Annual yield of straw on Broadbalk field at Rothamsted 1852-1925: plot–14"
#>   ..$ : chr "Annual yield of straw on Broadbalk field at Rothamsted 1852-1925: plot–15"
#>   ..$ : chr "Annual yield of straw on Broadbalk field at Rothamsted 1852-1925: plot–16"
#>   ..$ : chr "Annual yield of straw on Broadbalk field at Rothamsted 1852-1925: plot–17"
#>   ..$ : chr "Annual yield of straw on Broadbalk field at Rothamsted 1852-1925: plot–18"
#>   ..$ : chr "Annual yield of straw on Broadbalk field at Rothamsted 1852-1925: plot–19"
#>   ..$ : chr "Annual changes in the earth–s rotation, day length (sec*10**-5) 1821-1970"
#>   ..$ : chr "Annual male melanoma incidence (age-adjusted per 10**5) Connecticut 1936-1972"
#>   ..$ : chr "Annual total melanoma incidence (age-adjusted per 10**5) Connecticut 1936-1972"
#>   ..$ : chr "Annual sunspot relative number 1936-1972"
#>   ..$ : chr "Monthly Canadian total unemployment figures (thousands) 1956-1975"
#>   ..$ : chr "Half-hourly precipitation and stream flow, River Hirnant, Wales, UK, November and December 1972."
#>   ..$ : chr "Sacramento River at Keswick, California, Oct. 1939 – Sep. 1960"
#>   ..$ : chr "Clearwater River at Kamiah, Idaho. 1911 – 1965"
#>   ..$ : chr "Judith River near Utica, MT. 1920 – 1960"
#>   ..$ : chr "Madison River near west Yellowstone, MT. 1923 – 1960"
#>   ..$ : chr "Whiterocks River near Whiterocks, Utah, 1930 – 1960"
#>   ..$ : chr "Middle Boulder Creek at Nederland, CO. 1912 – 1960"
#>   ..$ : chr "South Platte River below Cheesman Lake, CO. 1925 – 1960"
#>   ..$ : chr "Neches River near Rockland, Texas. 1940 – 1960"
#>   ..$ : chr "Big Ford River at big falls, MN. 1929 – 1960"
#>   ..$ : chr "Skunk River at Augusta, Iowa. 1915 – Sep. 1960"
#>   ..$ : chr "Current River at Van Buren, MO. 1922 – Sep. 1960"
#>   ..$ : chr "Trinity River at Lewiston, California, Oct. 1912 – Sep. 1960"
#>   ..$ : chr "Wolf River at New London, WI. 1914 – 1960"
#>   ..$ : chr "Mad River near Springfield, OH. 1915 – Sep. 1960"
#>   ..$ : chr "West Branch Delaware River at Hale Eddy, NY. 1916 – Sep. 1960"
#>   ..$ : chr "Pemigewasset River at Plymouth, NH. 1904 – 1960"
#>   ..$ : chr "Rappahannock River near Fredericksburg, VA. 1911 – 1960"
#>   ..$ : chr "James River at Buchanan, VA. 1911 – 1960"
#>   ..$ : chr "Oostanaula River at Resaca, GA. 1893 – 1960"
#>   ..$ : chr "Feather River at Oroville, California, Oct. 1902 – Sep. 1977"
#>   ..$ : chr "American River at Fair Oaks, California, Oct. 1906 – Sep. 1960"
#>   ..$ : chr "Eel River above Dos Rios, California, Oct. 1952 – Sep. 1960"
#>   ..$ : chr "Rock Creek at Little Round Valley, NR. Bishop, California, Sep. 1960"
#>   ..$ : chr "McKenzie River at McKenzie Bridge, Oregon, Oct. 1911 – Sep. 1960"
#>   ..$ : chr "S. F. Skykomish River near Index, Washington, Oct. 1923 – Sep. 1960"
#>   ..$ : chr "Boise River near Twin Springs, Idaho, Oct. 1912 – Sep. 1960"
#>   ..$ : chr "Mean maximum temperature in Melbourne: degrees C. Jan 71 – Dec 90."
#>   ..$ : chr "Daily maximum temperatures in Melbourne, Australia, 1981-1990"
#>   ..$ : chr "Daily minimum temperatures in Melbourne, Australia, 1981-1990"
#>   ..$ : chr "Daily rainfall in Melbourne, Australia, 1981-1990"
#>   ..$ : chr "Col 1: Quarterly average weekly male earnings in Australia, all industries. Col 2: CPI for same quarters. Col 3"| __truncated__
#>   ..$ : chr "Total building and construction activity in Australia: approvals each quarter in $m at 1989/90 prices. Sep 1973 – Mar 1995"
#>   ..$ : chr "Monthly basic iron production in Australia: thousand tonnes. Jan 1956 – Aug 1995"
#>   ..$ : chr "Basic quarterly iron production in Australia: thousand tonnes. Mar 1956 – Sep 1994"
#>   ..$ : chr "Monthly beer production in Australia: megalitres. Includes ale and stout. Does not include beverages with alcoh"| __truncated__
#>   ..$ : chr "Quarterly beer production in Australia: megalitres. March 1956 – June 1994"
#>   .. [list output truncated]
#>  $ frequency  : num  4 12 12 12 12 12 4 12 12 12 ...
#>  $ start      : num  1948 1972 1971 1960 1967 ...
#>  $ subject    : chr  "Macroeconomic" "Industry" "Utilities" "Sales" ...
unique(meta_tsdl$subject)
#>  [1] "Macroeconomic"         "Industry"              "Utilities"            
#>  [4] "Sales"                 "Transport and tourism" "Microeconomic"        
#>  [7] "Production"            "Labour market"         "Physics"              
#> [10] "Agriculture"           "Ecology"               "Health"               
#> [13] "Hydrology"             "Meteorology"           "Demography"           
#> [16] "Finance"               "Tree-rings"            "Chemistry"            
#> [19] "Sport"                 "Miscellaneous"         "Crime"                
#> [22] "Computing"
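Since meta_tsdl is a data frame, ordinary base R tools can be used to explore the library before subsetting it. The sketch below uses only the columns shown in the `str()` output above. The correspondence between row i of meta_tsdl and `tsdl[[i]]` is an assumption (the first row matches the attributes of `tsdl[[1]]` shown earlier), so the last line is a check rather than a guarantee. The names `idx` and `monthly_sales` are illustrative.

```r
library(tsdl)

# Number of series per subject, largest first
sort(table(meta_tsdl$subject), decreasing = TRUE)

# Rows describing monthly series in the "Sales" subject;
# which() drops any NA comparisons
idx <- which(meta_tsdl$frequency == 12 & meta_tsdl$subject == "Sales")
monthly_sales <- meta_tsdl[idx, ]
nrow(monthly_sales)

# Earliest and latest start year recorded in the metadata
range(meta_tsdl$start, na.rm = TRUE)

# Assumption check: does the row order match the order of series in tsdl?
attr(tsdl[[idx[1]]], "subject")
```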

License

This package is free and open source software, licensed under GPL-3.