Time series analysis paper — analyze completely

For Problem 2, you are to evaluate the given analysis and interpretation for clarity, completeness,
sufficiency, accuracy, and consistency. Indicate what you think is good, not good, and what you
would do differently. Note: points will be deducted for comments on format. The critique must
be about predictive analytics, not layout. Do not copy the report into your exam, rather use the

2 Return Surgeries, 15 points
  • For this problem, you are to evaluate the analysis and interpretation for clarity, completeness,
    sufficiency, accuracy, and consistency. Indicate what you think is good, not good, and what you
    would do differently. Keep your assessments by the numbering system to assure your criticisms
    coincide with the respective material.

    I do not want comments on format. Whether a figure is not in a convenient place is of no
    interest. Grammar and spelling are of no interest. I will mark down for non-essential criticism.
    Focus on the analysis and the interpretation of the results.

    2.1 Introduction

    A continuing question on daily return surgeries time series indices is whether any one series is
    interchangeable with another; i.e., does one surgery index time series have the same daily counts
    as some other particular surgery index time series within specified statistical error? Statistical
    differences in surgery index time series include, e.g., networks of multiple observers or counting
    methodology. The Debrecen index is compared to the Surgery Tracking And Recognition Algorithm
    (STARA) index, and with the Addendum of Authenticated and Verified Surgery Observations
    (AAVSO) index.

    A pairwise comparison of index counts is confounded by the possible autocorrelation of each
    series, and hence a traditional regression-type comparison is inappropriate as the autocorrelation
    violates the regression independence assumption. In addition, two aspects of a time series must
    be examined for a comparison; when the count occurred and the count magnitude. The analytical
    methodologies include autocorrelation and cross-correlation from statistical time series analysis to
    determine when a count occurred, and the nonparametric Wilcoxon signed rank test to compare
    magnitudes of the count. The time series analyses are used to determine pairwise day-by-day
    alignment. Once the paired series are time-aligned, the count magnitudes are compared using the
    Wilcoxon signed rank test as the counts data do not follow a normal (Gaussian) distribution.

    Section 2.2 describes the three returning surgeries time series data sets; Section 2.3 discusses
    the time series statistical analyses of each of the three data sets; Section 2.4 presents the data set
    comparisons, or statistical time series cross-correlation analysis, including a brief explanation of why
    regression is inappropriate; Section 2.5 presents the count magnitude comparisons; and Section 2.6 gives
    the conclusions.

    2.2 Data Sets

    This section describes the daily returning surgeries time series of the AAVSO, Debrecen, and
    STARA data sets. The descriptions indicate some of the characteristics that must be accounted for
    prior to a statistical comparison. The AAVSO series is described first, followed by the Debrecen
    series, and ending with the STARA series.

    2.2.1 AAVSO Data

    The AAVSO’s program of data-gathering and analysis of surgeries has been active since its inception
    in 1944. AAVSO raw data are submitted monthly as sets of date- and time-stamped values. The
    pre-scrubbed AAVSO data contain 34,435 returning surgery counts that span from May 1, 2010
    through July 12, 2013. The left panel of Figure 1 shows that these data are truncated on the left
    at zero counts and skewed to the right. The histogram suggests these count data follow a Poisson
    distribution.

    2.2.2 Debrecen Data

    The pre-scrubbed Debrecen data contain 41,866 daily returning surgery counts that span from
    December 4, 1981 through January 5, 2011. As with the AAVSO data, the middle panel of Figure
    1 shows that these data are truncated on the left at zero counts and skewed to the right. The histogram
    suggests these data follow a Poisson distribution.

    Figure 1: AAVSO, Debrecen, and STARA index counts histograms of the pre-scrubbed data. The
    green dashed curves are best-fit exponential distributions, and the black solid curves are best-fit
    gamma distributions.

    2.2.3 STARA Data

    The STARA data contain 1,152 daily returning surgery counts that span from May 1, 2010
    through July 12, 2013. The right panel of Figure 1 shows that these data are truncated on the
    left at zero counts, and are skewed to the right. This suggests these counts data follow a Poisson
    distribution.

    The Poisson distributions of each of these data sets affect the accuracy of the paired count
    magnitude comparisons, as will be seen below.

    2.3 Autocorrelation Analysis

    A time series is a stochastic process whose index set consists of countable time increments; i.e., a
    time series is a set of observations, $x_t$, each recorded at a specified time $t$. To allow for the possibly
    unpredictable nature of future observations we may suppose that each observation is a realization
    of a random variable $X_t$. The time series $\{x_t, t \in T_0\}$ is a realization of the family of random
    variables $\{X_t, t \in T\}$, where $T \supseteq T_0$; i.e., the realization $x_t$ is a subset of all possible values of $X_t$.
    The following time series process analyses assess whether the count pairings are index set (time)
    aligned. This alignment is necessary before paired counts magnitude comparisons can be made.
    The time series autocorrelation analysis is preceded by a descriptive analysis of the data sets.

    2.3.1 Descriptive Analysis

    The AAVSO and Debrecen series have days with multiple observations which we summarize by the
    count median. Further, the time span of each data set must be matched. The common span is
    found to be from May 1, 2010 through January 5, 2011. The time series that result from using the
    daily median and matched spans are displayed in Figures 2 and 3. Figure 2 depicts the three series
    in a stacked, matched-span plot. The AAVSO data are in the upper panel, the Debrecen data are
    in the middle panel, and the lower panel has the STARA data. These plots show ambiguously
    matched count magnitudes.
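The two preparation steps described above (one median count per day, then restriction to the common span) can be sketched in Python. This is a minimal illustration on made-up toy records, not the report's data; the function names `daily_median`, `common_span`, and `restrict` are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def daily_median(records):
    """Collapse (day, count) records to one median count per day."""
    by_day = defaultdict(list)
    for day, count in records:
        by_day[day].append(count)
    return {day: median(counts) for day, counts in by_day.items()}


def common_span(*series):
    """Return (start, end): the latest first day and the earliest last day."""
    start = max(min(s) for s in series)
    end = min(max(s) for s in series)
    return start, end


def restrict(series, start, end):
    """Keep only the days inside [start, end], in date order."""
    return {d: v for d, v in sorted(series.items()) if start <= d <= end}


# Toy records (assumed values, for illustration only).
a = daily_median([
    (date(2010, 5, 1), 3), (date(2010, 5, 1), 7),
    (date(2010, 5, 2), 4), (date(2010, 5, 3), 5),
])
b = daily_median([
    (date(2010, 4, 30), 2), (date(2010, 5, 1), 6), (date(2010, 5, 2), 1),
])
start, end = common_span(a, b)
a_matched = restrict(a, start, end)
b_matched = restrict(b, start, end)
print(start, end, a_matched, b_matched)
```

Days missing from one series inside the common span are not filled in by this sketch; how such gaps should be handled is a separate modelling decision.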

    Figure 2: The AAVSO (top panel), Debrecen (middle panel), and STARA (bottom panel)
    matched-span time series plot. The data are daily.

    Figure 3 has the three matched-span series superimposed over each other. The AAVSO series
    is the solid black curve, the Debrecen series is the dashed red curve, and the STARA series is the
    dotted green curve. As with the stacked plot, this plot also shows no obvious coincidence in count
    magnitude.

    Figure 3: The three matched-span series superimposed. The AAVSO series is the solid black
    curve, the Debrecen series is the dashed red curve, and the STARA series is the dotted green curve.
    The data are daily.

    Fortunately, statistical time series analysis is able to remove much of the apparent ambiguity.
    Time series analysis will help determine if the counts are time-aligned. Once this outcome is
    available, a magnitude comparison is possible.

    2.3.2 Autocorrelation Models

    Before the counts time series magnitudes can be compared, the individual time series must be
    examined for autocorrelation, as autocorrelation inflates the series variability. A critical property of
    any time series is stationarity, which is required to assess the autocorrelations and cross-correlations.
    Stationarity is the property of a time series in which, over a specified time span, the mean and
    variance of the series are constant. This is the time series analysis equivalent of the mean zero,
    constant variance assumption required for such statistical methods as the t-test, analysis of
    variance, and regression. If a time series follows a Gaussian distribution, then it can be shown that
    the time series is stationary.

    We saw above that the three counts time series do not follow a normal distribution, and hence
    stationarity may not be assumed. A commonly used transformation to obtain a stationary time
    series is differencing. A first difference transformation is

    \[
    \nabla X_t = X_t - X_{t-1} = X_t - B X_t = (1 - B) X_t, \tag{1}
    \]

    where $\nabla X_t$ is the first difference between the $t$th and the $(t-1)$st values of the random
    variable $X$, and $B$ is the back shift operator such that $B X_t = X_{t-1}$. The differencing operator
    may be extended to second ($\nabla^{(2)}$), third ($\nabla^{(3)}$), etc., differences, as can the back shift operator $B$,

    but higher order differencing is not needed for the return surgeries time series. The first difference
    transformation results in stationarity for each of the three series.
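As a small illustration of the differencing operation in equation (1), the following Python sketch applies first and second differences to toy sequences. The helper name `difference` is hypothetical and the inputs are made up; the sketch shows only that a first difference removes a linear trend in the mean and a second difference removes a quadratic one.

```python
def difference(x, order=1):
    """Apply the first-difference operator (1 - B) `order` times."""
    for _ in range(order):
        x = [b - a for a, b in zip(x, x[1:])]
    return x


linear = [2 * t + 1 for t in range(6)]   # 1, 3, 5, 7, 9, 11
quadratic = [t * t for t in range(6)]    # 0, 1, 4, 9, 16, 25

print(difference(linear))                # [2, 2, 2, 2, 2]
print(difference(quadratic, order=2))    # [2, 2, 2, 2]
```

Each application shortens the series by one observation, which is why a differenced series has one fewer value than the original.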

    With stationarity established, we can examine each series for autocorrelation. The sample
    autocorrelation function (ACF) and the sample partial autocorrelation function (PACF), and their
    associated plots, are used to identify if and what types of autocorrelation exist. The sample ACF
    measures time series white noise autocorrelation as a moving average order. The sample PACF
    measures time series autocorrelation as the order of autoregression.

    In Figures 4 and 5, the panels on the diagonal depict the first-differenced (lag 1) series sample
    ACF and sample PACF respectively. The off-diagonal panels are unadjusted cross-correlations
    between paired series, and are here ignored pending further time series analysis. In each figure, the
    row one column one panel is the AAVSO series, the second row second column panel is the Debrecen
    series, and the third row third column is the STARA series, each after taking first differences. We
    are interested in the plot lag values of each panel that extend above or below the horizontal blue
    dashed 95% confidence interval (CI) lines. Each series has 211 days of return surgery counts, and
    at the 95% CI, this suggests that there are 0.05 × 211 ≈ 11 expected CI marginal overreaches. We
    therefore are interested in those lag patterns that strongly extend outside the CI band.
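A minimal Python sketch of the sample ACF and of an approximate 95% white-noise band follows. The names `sample_acf` and `white_noise_band` are hypothetical, and the band of plus or minus 1.96 divided by the square root of n is the usual large-sample approximation, assumed here to be what the dashed lines in the figures represent.

```python
import math


def sample_acf(x, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / n / c0
        for k in range(max_lag + 1)
    ]


def white_noise_band(n, z=1.96):
    """Half-width of the approximate 95% band for white-noise autocorrelations."""
    return z / math.sqrt(n)


acf = sample_acf([1, 2, 3, 4, 5], max_lag=1)
band = white_noise_band(211)
print(acf, band)
```

With n = 211 the band half-width is roughly 0.135. Under white noise about 5% of the plotted lags would be expected to fall outside the band by chance, so isolated marginal exceedances carry little weight.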

    Figure 4 is the sample ACF of the three series. The zeroth lag (t = t) is ignored in each sample
    ACF plot. The AAVSO plot suggests a lag 1 (preceding day) moving average model should be
    examined. The Debrecen plot indicates that both a lag 1 and a lag 3 moving average model may
    be appropriate. The STARA plot, like the AAVSO plot, suggests a lag 1 moving average model
    should be tested.

    Figure 4: The sample ACFs of the AAVSO, Debrecen, and STARA time series.

    Figure 5 is the sample PACF of the three series. In each panel on the diagonal of the plot, there
    are no systematic overreaches of the CIs, i.e., the overreaches appear random, which suggests no
    autoregressive behavior in these three series.

    Figure 5: The sample PACFs of the AAVSO, Debrecen, and STARA time series.

    The sample ACF and sample PACF suggest the types of time series models for each surgery
    count source. The models take the form of Autoregressive Integrated Moving Average (ARIMA)
    models. The models are denoted as ARIMA(p,d,q), where AR refers to the autoregressive
    component, I refers to the integrated component which determines the order of differencing to establish
    stationarity, MA refers to the moving average component, and p, d, and q are the non-negative
    integers indicating the orders of autoregression, integration, and moving averaging, respectively.
    The ARIMA analysis of the AAVSO series gives an ARIMA(0, 1, 1) model, the Debrecen model is
    ARIMA(0, 1, 3), and the STARA model is ARIMA(1, 1, 3).

    Goodness-of-fit indicators for the ARIMA models are cumulative periodograms of the model
    standardized residuals, and time series plots of the standardized residuals. The behavior of the
    model residuals is particularly important for the cross-correlation analysis below. Figures ??
    and ?? are the diagnostics for the AAVSO ARIMA(0, 1, 1) model. Figure ?? is the cumulative
    periodogram. The blue dashed diagonal lines define a 95% CI band that, if the black cumulative
    periodogram curve lies within, suggests the model is adequate. Containment of the curve within the
    CI band suggests it follows a normal distribution, which is an indicator of model adequacy. Figure
    ?? has three diagnostic plots. The top panel is the standardized residuals time series plot which
    indicates an adequate model when no more than 11 residuals exceed the plus or minus 3 standard
    deviation levels. The middle panel is the sample ACF of the residuals which suggests the ARIMA

    model is adequate as all the lags lie within the horizontal red dashed 95% CI levels. The bottom
    panel is the Ljung-Box p-value plot in which no p-values fall below the threshold line indicating an
    adequate model. Hence, the ARIMA(0, 1, 1) may be considered a reasonable model of the AAVSO
    series.
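The Ljung-Box check mentioned above can be sketched in Python as follows. The statistic is Q = n(n + 2) times the sum over lags k = 1, ..., h of r_k squared divided by (n - k), where r_k is the lag-k sample autocorrelation of the residuals. The function names are hypothetical, the input is a toy sequence rather than model residuals, and the critical value 3.841 is the tabulated 95% point of a chi-square distribution with 1 degree of freedom.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / n / c0
        for k in range(max_lag + 1)
    ]


def ljung_box_q(residuals, h):
    """Ljung-Box Q statistic using the first h residual autocorrelations."""
    n = len(residuals)
    r = sample_acf(residuals, h)
    return n * (n + 2) * sum(r[k] ** 2 / (n - k) for k in range(1, h + 1))


toy = [1, 2, 3, 4, 5]          # toy input, not ARIMA residuals
q1 = ljung_box_q(toy, h=1)     # approximately 1.4
CHI2_95_DF1 = 3.841            # tabulated chi-square 95% point, 1 df
print(q1, q1 > CHI2_95_DF1)
```

When Q is computed on residuals of a fitted ARIMA(p, d, q) model, the reference chi-square distribution is usually taken with h - p - q degrees of freedom rather than h; the sketch leaves that adjustment to the caller.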

    Figure 6: AAVSO series ARIMA model diagnostic plots.

    Figures ?? and ?? are the diagnostics for the Debrecen ARIMA(0, 1, 3) model. Figure ?? is
    the cumulative periodogram which suggests it follows a normal distribution. Figure ?? has the
    time-based diagnostic plots. The standardized residuals time series plot has only 2 of the possible
    11 values that lie outside ±3 standard deviations. The sample ACF of the residuals suggests the
    ARIMA model has all the lags within the 95% CI band. The Ljung-Box p-value plot has no p-
    values below the horizontal red threshold line. Hence, the ARIMA(0, 1, 3) may be considered a
    reasonable model of the Debrecen series.

    Figures ?? and ?? are the diagnostics for the STARA ARIMA(1, 1, 3) model. Figure ?? is the
    cumulative periodogram which suggests the periodogram is normally distributed. Figure ?? has
    the three time series diagnostic plots. The standardized residuals time series plot has no residuals
    outside the plus or minus 3 standard deviation levels. The sample ACF of the residuals has all the
    lags within the horizontal red dashed 95% CI levels. The Ljung-Box p-value plot has no p-values
    below the horizontal red threshold line. Hence, the ARIMA(1, 1, 3) may be considered a reasonable
    model of the STARA series.

    The autocorrelation of each of the three return surgery data sets has been identified and
    described. The residuals analyses show that the residuals of each time series are reduced to white
    noise, and thus the residuals are independent between any series pair. This is an important property
    for the series comparisons. We may now make pairwise comparisons of the data sets.

    Figure 7: Debrecen series ARIMA model diagnostic plots.

    2.4 Cross-Correlation Analysis

    The panel of scatter plots of the count sources in Figure 9 shows the paired series associations. The
    second row, column one panel shows that the Debrecen versus AAVSO data have a clear nonlinear
    relationship with the smaller counts having the greater nonlinearity, and the large counts have the
    greater variability. A similar nonlinear relationship exists between the Debrecen and STARA series,
    which is depicted in the second row, column three panel. However, the STARA versus AAVSO data
    exhibit a more nearly linear relationship, though the variability of the larger counts is greater. This
    relationship is shown in the panel in the third row of the first column. Some of these characteristics
    have been addressed by constructing ARIMA models for each series, and it is with these models
    that the cross-correlations, i.e., the time-based alignment, may be developed.

    With autocorrelated data it is difficult to assess the dependence or comparison between any
    two time series. It is therefore necessary to disentangle the linear association between any two
    series from their respective autocorrelations. Another property that must be satisfied is that the
    two series must be stationary and independent of each other. While the data may be stationary,
    they must still be transformed to white noise to assure independence. The transformation may be
    accomplished by using the residuals from the respective series ARIMA models. We saw from the
    ARIMA model diagnostics that the residuals from the series ARIMA models are white noise, thus
    implying that the residuals of the ARIMA models are independent. For example, it was shown
    that the AAVSO data are adequately modeled by an ARIMA(0, 1, 1) with no intercept term, so,
    for $x_t$ representing the AAVSO counts,

    \[
    \bar{x}_t = z_t - \theta z_{t-1} = (1 - \theta B) z_t, \tag{2}
    \]

    where $\bar{x}_t$ is the white noise model return surgery count at time $t$, $z_t$ is the white noise value at
    time $t$, and $\theta$ is the white noise parameter that is estimated from the ARIMA model analysis.
    The ARIMA model residuals $\bar{x}_t$, $t = 0, \pm 1, \pm 2, \ldots$, are white noise and this process is known as
    prewhitening.

    Figure 8: STARA series ARIMA model diagnostic plots.
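As a rough numerical illustration, the right-hand side of equation (2) can be read as a first-order moving average of a white noise sequence (this reading is an assumption about the intended meaning). Under that reading the lag-1 autocorrelation is -θ/(1 + θ²) and higher lags are zero. The Python sketch below simulates such a series with an arbitrary θ = 0.6 and compares the sample values with theory; all names and parameter values are illustrative.

```python
import random


def simulate_ma1(theta, n, seed=0):
    """Simulate w_t = z_t - theta * z_{t-1} with standard normal z_t."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [z[t] - theta * z[t - 1] for t in range(1, n + 1)]


def lag_autocorr(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n
    return sum(dev[t] * dev[t + k] for t in range(n - k)) / n / c0


theta = 0.6                                  # arbitrary illustrative value
w = simulate_ma1(theta, n=20000)
rho1_theory = -theta / (1 + theta ** 2)      # about -0.441
print(lag_autocorr(w, 1), rho1_theory, lag_autocorr(w, 2))
```

A single significant spike at lag 1 in the sample ACF of a differenced series is the pattern that motivates trying an ARIMA(0, 1, 1) model.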

    We now compare the two series using the cross-correlation function (CCF) by prewhitening one
    series with its ARIMA model. The other series then is filtered through this same ARIMA model.
    Stationarity is assured by incorporating the first difference in the ARIMA filter. As prewhitening is a
    linear operation, any linear relationship between the two series will be preserved after prewhitening.
    For example, to compare the AAVSO data with the Debrecen data, first prewhiten the AAVSO
    data using its ARIMA model. Then filter the Debrecen data with the AAVSO ARIMA model.
    Finally, use the CCF to look for lags between the two series.
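The prewhiten-then-cross-correlate procedure can be sketched in Python on simulated data. Everything here is an assumption for illustration: the moving average parameter θ is taken as known rather than estimated, the second series is built as a copy of the first delayed by two days plus small noise, and the names `ma1_whiten` and `cross_corr` are hypothetical. The sign convention is that `cross_corr(a, b, k)` measures the correlation between a[t + k] and b[t].

```python
import math
import random


def difference(x):
    """First difference (1 - B) of the sequence x."""
    return [b - a for a, b in zip(x, x[1:])]


def ma1_whiten(w, theta):
    """Invert w_t = e_t - theta * e_{t-1} via e_t = w_t + theta * e_{t-1}."""
    e, prev = [], 0.0
    for v in w:
        prev = v + theta * prev
        e.append(prev)
    return e


def cross_corr(a, b, k):
    """Sample correlation between a[t + k] and b[t] (equal-length inputs)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sa = math.sqrt(sum((v - ma) ** 2 for v in a) / n)
    sb = math.sqrt(sum((v - mb) ** 2 for v in b) / n)
    ts = range(max(0, -k), min(n, n - k))
    cov = sum((a[t + k] - ma) * (b[t] - mb) for t in ts) / n
    return cov / (sa * sb)


rng = random.Random(1)
theta, n = 0.5, 600                       # assumed values, not estimates
z = [rng.gauss(0, 1) for _ in range(n + 1)]
w = [z[t] - theta * z[t - 1] for t in range(1, n + 1)]
x, level = [], 50.0
for v in w:                               # integrate: x follows ARIMA(0, 1, 1)
    level += v
    x.append(level)
y = [x[max(t - 2, 0)] + rng.gauss(0, 0.1) for t in range(n)]  # x delayed 2 days

ex = ma1_whiten(difference(x), theta)     # prewhitened x
ey = ma1_whiten(difference(y), theta)     # y passed through the same filter
best_lag = max(range(-5, 6), key=lambda k: abs(cross_corr(ex, ey, k)))
print(best_lag)                           # -2 under this sign convention
```

Two series that are aligned in time would instead show their dominant cross-correlation at lag 0, which is the pattern the report describes for its pairings.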

    Often a regression model is used to measure the relationship of one counts series to another. The
    fallacy of this method arises from the violation of two assumptions of regression: (i) the response
    must follow a normal distribution, and (ii) the two series must be independent. The first assumption
    was shown above to be violated as the counts follow a Poisson distribution. The second assumption
    is violated as demonstrated by the autocorrelation identified in the ARIMA model analyses, which
    is an indication of non-independence.

    Figure 10 is the sample CCF between the ARIMA(0, 1, 1) filtered Debrecen counts and the
    ARIMA(0, 1, 1) prewhitened AAVSO counts. It is clear from the plot that the only lag is at zero,
    which suggests that the two series are nearly aligned in time.

    Figure 9: Scatter plots of the return surgeries count sources show the paired series associations.

    Figure 11 is the sample CCF between the ARIMA(0, 1, 1) filtered STARA counts and the
    ARIMA(0, 1, 1) prewhitened AAVSO counts. The plot shows balance between the AAVSO and the
    STARA data. The AAVSO series and the STARA series are balanced at lag 0. This balance
    suggests that the two series are aligned in time.

    Figure 12 is the sample CCF between the ARIMA(1, 1, 3) filtered Debrecen counts and the
    ARIMA(1, 1, 3) prewhitened STARA counts. The AAVSO series and the STARA series are
    balanced at lag 0. This balance suggests that the two series are aligned in time.

    The cross-correlation analysis gives the pairwise time alignments to compare the magnitude of
    the counts for each series. The cross-correlation between the AAVSO and Debrecen series has zero
    lag and hence they are aligned. The same result holds for the cross-correlation between the AAVSO
    and STARA data, i.e., they are aligned. Similarly, the cross-correlation between the STARA and
    Debrecen data shows they are aligned.

    2.5 Magnitude Comparison

    With the appropriate shifts for each return surgery counts series if needed, the counts magnitude
    comparison is tested with the Wilcoxon signed ranks test. This test is used over the t-test as the
    counts data do not follow a normal distribution, which is an assumption required for the t-test. For
    the $n^{*}$ time-ordered data pairs $(x_{1,1}, x_{2,1}), (x_{1,2}, x_{2,2}), \ldots, (x_{1,n^{*}}, x_{2,n^{*}})$, the differences
    are found such that

    \[
    D_i = x_{1,i} - x_{2,i}, \quad i = 1, \ldots, n^{*}. \tag{3}
    \]

    Simplistically, all differences with the value 0 are eliminated, so that $n \le n^{*}$ differences remain.
    The $n$ absolute differences $|D_i|$ are ordered from lowest to highest, and then are ranked 1 to $n$. The $i$th rank
    $R_i$ is designated as a positive rank if $D_i > 0$, or as a negative rank if $D_i < 0$. The test statistic is the sum of the positive signed ranks:

    \[
    T^{*} = \sum_{i \,:\, D_i > 0} R_i, \quad i = 1, \ldots, n. \tag{4}
    \]

    Figure 10: The sample CCF between the ARIMA(0, 1, 1) filtered Debrecen counts and the
    ARIMA(0, 1, 1) prewhitened AAVSO count residuals.

    Figure 11: The sample CCF between the ARIMA(0, 1, 1) filtered STARA counts and the
    ARIMA(0, 1, 1) prewhitened AAVSO count residuals.

    Figure 12: The sample CCF between the ARIMA(1, 1, 3) filtered Debrecen counts and the
    ARIMA(1, 1, 3) prewhitened STARA count residuals.

    The test statistic T∗ is compared to the quantiles of a distribution whose shape varies depending
    on conditions.
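The signed rank computation in equations (3) and (4) can be sketched in Python as follows. This is an illustration on made-up paired values, with hypothetical function names; tied absolute differences receive the average of the ranks they span, and the large-sample normal approximation shown omits the tie correction to the variance for simplicity.

```python
import math


def signed_rank_statistic(x1, x2):
    """Return (T, n): sum of positive signed ranks and count of nonzero differences."""
    d = [a - b for a, b in zip(x1, x2) if a != b]   # drop zero differences
    n = len(d)
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        midrank = (i + j) / 2 + 1                   # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    t_plus = sum(r for r, v in zip(ranks, d) if v > 0)
    return t_plus, n


def normal_approx_z(t_plus, n):
    """Large-sample z score for T under the null (no tie correction)."""
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (t_plus - mean) / sd


x1 = [5, 3, 8, 6, 2, 7]   # toy paired counts, not the report's data
x2 = [4, 3, 5, 8, 1, 3]
t_plus, n = signed_rank_statistic(x1, x2)
print(t_plus, n, normal_approx_z(t_plus, n))   # T = 12.0 with n = 5
```

In the toy example one pair is tied and dropped, the two differences of size 1 share rank 1.5, and the positive ranks sum to 12 out of a possible 15.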

    Table 2 lists the surgery counts time series pairs and their respective Wilcoxon signed rank
    test statistics. The last column in the table indicates if the count magnitudes may be considered
    statistically equal. Only the STARA and Debrecen time series have statistically identical daily
    counts.

    Table 2: Wilcoxon rank sum test with continuity correction counts magnitude comparison.

    X       Y         n    W        P(>W)      X = Y
    AAVSO   Debrecen  211  35368.5  < 2.2e-16  no
    AAVSO   STARA     211  34903    < 2.2e-16  no
    STARA   Debrecen  210  22286.5  0.8468     yes

    2.6 Conclusions

    Three time series of daily returning surgeries counts were compared for interchangeability; i.e.,
    does one return surgery time series have the same daily counts as some other particular time series
    within specified statistical error? Each series had peculiarities, e.g., networks of multiple observers
    or counting methodology, for which some adjustments were made in the time series and magnitude
    analyses.

    The Debrecen time series was compared to the STARA time series, and with the AAVSO time
    series. Also, the STARA and AAVSO series were compared. These daily time series were shown to
    be autocorrelated which was accounted for before the series were compared.

    Each time series was made stationary by taking the first difference. The autocorrelation function
    and the partial autocorrelation function were used to identify the order and type of
    autocorrelation for each of the series. The analysis of the AAVSO series gave the ARIMA(0, 1, 1) model,
    the Debrecen series analysis gave the ARIMA(0, 1, 3) model, and the STARA analysis gave the
    ARIMA(1, 1, 3) model.

    The cross-correlation function (CCF) between the ARIMA(0, 1, 1) filtered Debrecen counts
    and the ARIMA(0, 1, 1) prewhitened AAVSO counts showed the count changes occurred on the
    same days. It was clear from the plot that there was no lagging, which suggested that the two
    series were time-aligned. The CCF between the ARIMA(0, 1, 1) filtered STARA counts and the
    ARIMA(0, 1, 1) prewhitened AAVSO counts showed that the count series were time-aligned. The
    CCF between the ARIMA(1, 1, 3) filtered Debrecen counts and the ARIMA(1, 1, 3) prewhitened
    STARA counts suggested that the Debrecen series and the STARA data are time aligned.

    After the appropriate series shifts were made, the magnitude of the series counts was compared.
    Table 2 gives the details of the counts magnitude comparisons, and the table shows that only the
    STARA and Debrecen series are interchangeable.

    We showed that returning surgeries time series counts comparisons are best made after a
    statistical time series analysis is performed. We also showed that, as the counts do not follow a normal
    distribution, the appropriate magnitude comparison statistical method is the Wilcoxon signed ranks
    test provided the series pairings first are time-aligned. The results showed that only the Debrecen
    series and the STARA series are interchangeable.

3 Bonus, 3D VAR(2) Model, 5 points
  • Set up the three-dimensional (3D) VAR(2) where the third variable does not Granger-cause the
    first variable. The Bonus.R script may help.

    4 Bonus, “Best Model”, 5 points

    Give criteria for aiding in the choice of a “best” time series model when two or more such models
    are available. What is, arguably, the most important criterion?
