
# Introduction to Time Series Analysis and Forecasting with Applications of SAS and SPSS

(Part 1 of 6)

Introduction to Time Series Analysis and Forecasting with Applications of SAS and SPSS

Robert A. Yaffee

Statistics and Social Science Group, Academic Computing Service of the Information Technology Services, New York University, New York, New York
and
Division of Geriatric Psychiatry, State University of New York Health Science Center at Brooklyn, Brooklyn, New York

with

Monnie McGee

Hunter College, City University of New York, New York, New York

ACADEMIC PRESS, INC. San Diego London Boston New York Sydney Tokyo Toronto


For Liz and Mike


Contents

Preface

Chapter 1 Introduction and Overview

Chapter 2 Extrapolative and Decomposition Models

2.2. Goodness-of-Fit Indicators
2.3. Averaging Techniques
2.3.1. The Simple Average
2.3.2. The Single Moving Average
2.3.3. Centered Moving Averages
2.3.4. Double Moving Averages
2.3.5. Weighted Moving Averages
2.4. Exponential Smoothing
2.4.1. Simple Exponential Smoothing
2.4.2. Holt’s Linear Exponential Smoothing
2.4.3. The Dampened Trend Linear Exponential Smoothing Model
2.4.4. Exponential Smoothing for Series with Trend and Seasonality: Winters’s Methods
2.4.5. Basic Evaluation of Exponential Smoothing
2.5. Decomposition Methods
2.5.1. Components of a Series
2.5.2. Trends
2.5.3. Seasonality
2.5.4. Cycles
2.5.5. Background
2.5.6. Overview of X-11
2.6. New Features of Census X-12
References

Chapter 3 Introduction to Box–Jenkins Time Series Analysis

3.1. Introduction
3.2. The Importance of Time Series Analysis Modeling
3.3. Limitations
3.4. Assumptions
3.5. Time Series
3.5.1. Moving Average Processes
3.5.2. Autoregressive Processes
3.5.3. ARMA Processes
3.5.4. Nonstationary Series and Transformations to Stationarity
3.6. Tests for Nonstationarity
3.6.1. The Dickey–Fuller Test
3.6.2. Augmented Dickey–Fuller Test
3.6.3. Assumptions of the Dickey–Fuller and Augmented Dickey–Fuller Tests
3.6.4. Programming the Dickey–Fuller Test
3.7. Stabilizing the Variance
3.8. Structural or Regime Stability
3.9. Strict Stationarity
3.10. Implications of Stationarity
3.10.1. For Autoregression
3.10.2. Implications of Stationarity for Moving Average Processes
References

Chapter 4 The Basic ARIMA Model

4.1. Introduction to ARIMA
4.2. Graphical Analysis of Time Series Data
4.2.1. Time Sequence Graphs
4.2.2. Correlograms and Stationarity
4.3. Basic Formulation of the Autoregressive Integrated Moving Average Model
4.4. The Sample Autocorrelation Function
4.5. The Standard Error of the ACF
4.6. The Bounds of Stationarity and Invertibility
4.7. The Sample Partial Autocorrelation Function
4.7.1. Standard Error of the PACF
4.8. Bounds of Stationarity and Invertibility Reviewed
4.9. Other Sample Autocorrelation Functions
4.10. Tentative Identification of Characteristic Patterns of Integrated, Autoregressive, Moving Average, and ARMA Processes
4.10.1. Preliminary Programming Syntax for Identification of the Model
4.10.2. Stationarity Assessment
4.10.3. Identifying Autoregressive Models
4.10.4. Identifying Moving Average Models
4.10.5. Identifying Mixed Autoregressive–Moving Average Models
References

Chapter 5 Seasonal ARIMA Models

5.1. Cyclicity
5.2. Seasonal Nonstationarity
5.3. Seasonal Differencing
5.4. Multiplicative Seasonal Models
5.4.1. Seasonal Autoregressive Models
5.4.2. Seasonal Moving Average Models
5.4.3. Seasonal Autoregressive Moving Average Models
5.5. The Autocorrelation Structure of Seasonal ARIMA Models
5.6. Stationarity and Invertibility of Seasonal ARIMA Models
5.7. A Modeling Strategy for the Seasonal ARIMA Model
5.7.1. Identification of Seasonal Nonstationarity
5.7.2. Purely Seasonal Models
5.7.3. A Modeling Strategy for General Multiplicative Seasonal Models
5.8. Programming Seasonal Multiplicative Box–Jenkins Models
5.8.1. SAS Programming Syntax
5.8.2. SPSS Programming Syntax
5.9. Alternative Methods of Modeling Seasonality
5.10. The Question of Deterministic or Stochastic Seasonality
References

Chapter 6 Estimation and Diagnosis

6.1. Introduction
6.2. Estimation
6.2.1. Conditional Least Squares
6.2.2. Unconditional Least Squares
6.2.3. Maximum Likelihood Estimation
6.2.4. Computer Applications
6.3. Diagnosis of the Model
References


Chapter 7 Metadiagnosis and Forecasting

7.1. Introduction
7.2. Metadiagnosis
7.2.1. Statistical Program Output of Metadiagnostic Criteria
7.3. Forecasting with Box–Jenkins Models
7.3.1. Forecasting Objectives
7.3.2. Basic Methodology of Forecasting
7.3.3. The Forecast Function
7.3.4. The Forecast Error
7.3.5. Forecast Error Variance
7.3.6. Forecast Confidence Intervals
7.3.7. Forecast Profiles for Basic Processes
7.4. Characteristics of the Optimal Forecast
7.5. Basic Combination of Forecasts
7.6. Forecast Evaluation
7.7. Statistical Package Forecast Syntax
7.7.1. Introduction
7.7.2. SAS Syntax
7.7.3. SPSS Syntax
7.8. Regression Combination of Forecasts
References

Chapter 8 Intervention Analysis

8.1. Introduction: Event Interventions and Their Impacts
8.2. Assumptions of the Event Intervention (Impact) Model
8.3. Impact Analysis Theory
8.3.1. Intervention Indicators
8.3.2. The Intervention (Impulse Response) Function
8.3.3. The Simple Step Function: Abrupt Onset, Permanent Duration
8.3.4. First-Order Step Function: Gradual Onset, Permanent Duration
8.3.5. Abrupt Onset, Temporary Duration
8.3.6. Abrupt Onset and Oscillatory Decay
8.3.7. Graduated Onset and Gradual Decay
8.4. Significance Tests for Impulse Response Functions
8.5. Modeling Strategies for Impact Analysis
8.5.1. The Box–Jenkins–Tiao Strategy
8.5.2. Full Series Modeling Strategy
8.6. Programming Impact Analysis
8.6.1. An Example of SPSS Impact Analysis Syntax
8.6.2. An Example of SAS Impact Analysis Syntax
8.6.3. Example: The Impact of Watergate on Nixon Presidential Approval Ratings
8.7. Applications of Impact Analysis
8.8. Advantages of Intervention Analysis
8.9. Limitations of Intervention Analysis
References

Chapter 9 Transfer Function Models

9.1. Definition of a Transfer Function
9.2. Importance
9.3. Theory of the Transfer Function Model
9.3.1. The Assumption of the Single-Input Case
9.3.2. The Basic Nature of the Single-Input Transfer Function
9.4. Modeling Strategies
9.4.1. The Conventional Box–Jenkins Modeling Strategy
9.4.2. The Linear Transfer Function Modeling Strategy
9.5. Cointegration
9.6. Long-Run and Short-Run Effects in Dynamic Regression
9.7. Basic Characteristics of a Good Time Series Model
References

Chapter 10 Autoregressive Error Models

10.1. The Nature of Serial Correlation of Error
10.1.1. Regression Analysis and the Consequences of Autocorrelated Error
10.2. Sources of Autoregressive Error
10.3. Autoregressive Models with Serially Correlated Errors
10.4. Tests for Serial Correlation of Error
10.5. Corrective Algorithms for Regression Models with Autocorrelated Error
10.6. Forecasting with Autocorrelated Error Models
10.7. Programming Regression with Autocorrelated Errors
10.7.1. SAS PROC AUTOREG
10.7.2. SPSS ARIMA Procedures for Autoregressive Error Models
10.8. Autoregression in Combining Forecasts
10.9. Models with Stochastic Variance
10.9.1. ARCH and GARCH Models
10.9.2. ARCH Models for Combining Forecasts
References

Chapter 11 A Review of Model and Forecast Evaluation

11.1. Model and Forecast Evaluation
11.2. Model Evaluation
11.3. Comparative Forecast Evaluation
11.3.1. Capability of Forecast Methods
11.4. Comparison of Individual Forecast Methods
11.5. Comparison of Combined Forecast Models
References

Chapter 12 Power Analysis and Sample Size Determination for Well-Known Time Series Models

Monnie McGee

12.1. Census X-11
12.2. Box–Jenkins Models
12.3. Tests for Nonstationarity
12.4. Intervention Analysis and Transfer Functions
12.5. Regression with Autoregressive Errors

Preface

This book is the product of an intellectual odyssey in search of an understanding of historical truth in culture, society, and politics, and of the scenarios likely to unfold from them. The quest for this understanding of reality and its potential is not always easy. Those who fail to understand history will not fully understand the current situation. If they do not understand their current situation, they will be unable to take advantage of its latent opportunities or to sidestep the emergent snares hidden within it. George Santayana appreciated the dangers inherent in this ignorance when he said, "Those who fail to learn from history are doomed to repeat it." Kierkegaard lamented that history is replete with examples of men condemned to live life forward while only understanding it backward. Even if, as Nobel laureate Niels Bohr once remarked, "Prediction is difficult, especially of the future," many great pundits and leaders emphasized the real need to understand the past and how to forecast from it. Winston Churchill, with an intuitive understanding of extrapolation, remarked that "the farther back you can look, the farther forward you can see."

Tragic tales abound where vital policies failed because decision makers did not fathom the historical background—with its flow of cultural forces, demographic resources, social forces, economic processes, and political processes—of a problem for which they had to make policy. Too often lives were lost or ruined for lack of adequate diplomatic, military, political, or economic intelligence and understanding. Conversely, policies succeeded in accomplishing vital objectives when policy makers understood the likely scenarios of events. Having learned from the past, we need to study and understand the current situation to appreciate its future possibilities and probabilities. Indeed, the journalistic and scientific quest for "what is" may reveal the outlines of "what can be." The qualitative investigation of "what has been" and "what is" may be the mere beginning of this quest.

The principal objective of this textbook is to introduce the reader to the fundamental approaches to time series analysis and forecasting. Although the book explores the basic nature of a time series, it presumes that the reader has an understanding of the methodology of measurement and scale construction. The book briefly addresses the imputation of missing data; for the most part, however, it assumes that there are not significant amounts of missing data in the series and that any missing data have been properly replaced or imputed. Designed for the advanced undergraduate or the beginning graduate student, this text examines the principal approaches to the analysis of time series processes and their forecasting. In simple and clear language, it explains moving average, exponential smoothing, decomposition (Census X-11 plus comments on Census X-12), ARIMA, intervention, transfer function, regression, error correction, and autoregressive error models. These models are generally used for analysis of historical, recent, current, or simulated data with a view toward forecasting. The book also examines the evaluation of models, forecasts, and their combinations. Thus, the text discusses the basic approaches to time series analysis and forecasting.
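Of the methods just listed, simple exponential smoothing is the most compact: each new smoothed level is a weighted average of the latest observation and the previous level. As a rough illustration only (sketched in Python rather than the SAS or SPSS syntax the book actually uses, and with a hypothetical smoothing weight `alpha` chosen purely for the example), the recursion can be written as:

```python
def simple_exponential_smoothing(series, alpha=0.3):
    """Return the exponentially smoothed values of a series.

    alpha is the smoothing weight (0 < alpha <= 1); the default here
    is illustrative, not a recommendation.
    """
    smoothed = [series[0]]  # a common convention: start at the first observation
    for y in series[1:]:
        # new level = alpha * latest observation + (1 - alpha) * previous level
        smoothed.append(alpha * y + (1 - alpha) * smoothed[-1])
    return smoothed

print(simple_exponential_smoothing([10.0, 12.0, 11.0, 13.0], alpha=0.5))
# prints [10.0, 11.0, 11.0, 12.0]
```

In practice the smoothing weight is not picked arbitrarily but chosen by minimizing a goodness-of-fit criterion of the kind discussed in Chapter 2.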

Another objective of this text is to explain and demonstrate novel theoretical features and their applications. Some of the relatively new features include treatment of Y2K problem circumventions, Census X-12, different transfer function modeling strategies, a scenario analysis, an application of different forecast combination methods, and an analysis of sample size requirements for different models. Although Census X-12 is not yet part of either statistical package, its principal features are discussed because it is used by governments as the standard method of deseasonalization. In fact, SAS plans to implement PROC X12 in a forthcoming version. When dealing with transfer function models, both the conventional Box–Jenkins–Tiao and the linear transfer function approaches are presented. The newer approach, which does not use prewhitening, is more amenable to complex, multiple-input models. In the chapter on event impact or intervention analysis, an approach is taken that compares the impact of an intervention with what would have happened had all things remained the same. A "what if" baseline is posited against which the impact is measured and modeled. The book also briefly addresses cointegration and error correction models, which embed both long-run and short-run changes in the same model. In the penultimate chapter, the evaluation and comparison of models and forecasts are discussed. Attention is paid to the relative advantages and disadvantages of one approach over another under different conditions. This section is especially important in view of the discovery, in some of the forecast competitions, that the more complex models do not always provide the best forecasts. The methods, as well as the relative advantages and disadvantages, of combining forecasts

to improve forecast accuracy are also analyzed. Finally, to dispel erroneous conventional wisdom concerning sample size, the final chapter empirically examines the selection of the proper sample size for different types of analysis. In so doing, Monnie McGee makes a scholarly methodological contribution to the study of the sample size required for time series tests to attain a power of 0.80, an aspect of the power of time series tests that has not received sufficient discussion in the literature until now.

As theory and modeling are explained, the text shows how popular statistical programs, using recent and historical data, are prepared to perform the time series analysis and forecasting. The statistical packages used in this book—namely, the Statistical Analysis System (SAS) and the Statistical Package for the Social Sciences (SPSS)—are arguably the most popular general-purpose statistical packages among university students in the social or natural sciences today. An understanding of theory is necessary for their proper application under varying circumstances. Therefore, after explaining the statistical theory, along with basic preprocessing commands, I present computer program examples that apply the SAS Econometric Time Series (SAS/ETS) module, the SPSS Trends module, or both. The programming syntax, rather than the graphical interface, of each package is presented because the syntax tends to remain constant over time while the graphical interfaces change frequently.

In the presentation of data, the real data are first graphed. Because graphical display can be critical to understanding the nature of the series, the graphs of the data (especially the SAS graphs) are elaborately programmed to produce high-resolution output. The data are culled from areas of public opinion research, policy analysis, political science, economics, sociology, and even astronomy, and occasionally come from areas of great historical, social, economic, or political importance during the period of time analyzed. The graphs include not only the historical data; after Chapter 7 explains forecasting, they also include forecasts and their profiles.

SAS and SPSS computer programs, along with their data, are posted on the Academic Press Web site (http://www.academicpress.com/sbe/authors) to assist instructors in teaching and students in learning this subject matter. Students may run these programs and examine the output for themselves. Through their application of these time series programming techniques, they can enhance their capabilities in the quest for understanding the past, the present, and, to a limited extent, que sera.
