高清视频一区,91久久天天躁狠狠躁夜夜

01引言

上一篇推文【Python量化基礎(chǔ)】時(shí)間序列的自相關(guān)性與平穩(wěn)性

著重介紹了時(shí)間序列的一些基礎(chǔ)概念，包括自相關(guān)性、偏自相關(guān)性、白噪聲和平穩(wěn)性，以及Python的簡(jiǎn)單實(shí)現(xiàn)。本文在此基礎(chǔ)上，以滬深300指數(shù)收益率數(shù)據(jù)為例，探討如何使用Python對(duì)平穩(wěn)時(shí)間序列進(jìn)行建模和預(yù)測(cè)分析。時(shí)間序列經(jīng)典模型主要有自回歸模型AR，移動(dòng)回歸模型MA，移動(dòng)自回歸模型ARMA，以及差分移動(dòng)自回歸模型ARIMA，今天主要介紹這四種模型的基本原理以及Python的實(shí)現(xiàn)步驟。

02AR模型

AR模型全稱為Autoregressive Models，即自回歸模型，用于刻畫(huà)因變量能由它的多個(gè)滯后項(xiàng)表示。p階自回歸模型可以寫(xiě)成：

下面模擬一個(gè)AR(1)模型。

import pandas as pd

import numpy as np

import statsmodels.tsa.api as smt

#tsa為T(mén)ime Series analysis縮寫(xiě)

import statsmodels.api as sm

import scipy.stats as scs

from arch import arch_model

#畫(huà)圖

import matplotlib.pyplot as plt

import matplotlib as mpl

%matplotlib inline

#正常顯示畫(huà)圖時(shí)出現(xiàn)的中文和負(fù)號(hào)

from pylab import mpl

mpl.rcParams['font.sans-serif']=['SimHei']

mpl.rcParams['axes.unicode_minus']=False

#先定義一個(gè)畫(huà)圖函數(shù)，后面都會(huì)用到

def ts_plot(data, lags=None,title=''):

if not isinstance(data, pd.Series):

data = pd.Series(data)

#matplotlib官方提供了五種不同的圖形風(fēng)格，

#包括bmh、ggplot、dark_background、fivethirtyeight和grayscale

with plt.style.context('ggplot'):

fig = plt.figure(figsize=(10, 8))

layout = (3, 2)

ts_ax = plt.subplot2grid(layout, (0, 0), colspan=2)

acf_ax = plt.subplot2grid(layout, (1, 0))

pacf_ax = plt.subplot2grid(layout, (1, 1))

qq_ax = plt.subplot2grid(layout, (2, 0))

pp_ax = plt.subplot2grid(layout, (2, 1))

data.plot(ax=ts_ax)

ts_ax.set_title(title+'時(shí)序圖')

smt.graphics.plot_acf(data, lags=lags, ax=acf_ax, alpha=0.5)

acf_ax.set_title('自相關(guān)系數(shù)')

smt.graphics.plot_pacf(data, lags=lags, ax=pacf_ax, alpha=0.5)

pacf_ax.set_title('偏自相關(guān)系數(shù)')

sm.qqplot(data, line='s', ax=qq_ax)

qq_ax.set_title('QQ 圖')

scs.probplot(data, sparams=(data.mean(),

data.std()), plot=pp_ax)

pp_ax.set_title('PP 圖')

plt.tight_layout()

return

# 模擬AR(1) 過(guò)程

#設(shè)置隨機(jī)種子（括號(hào)里數(shù)字無(wú)意義）

np.random.seed(1)

#模擬次數(shù)

n=5000

#AR模型的參數(shù)

a = 0.8

#擾動(dòng)項(xiàng)為正態(tài)分布

x = w = np.random.normal(size=n)

for t in range(1,n):

x[t] = a*x[t-1] + w[t]

#畫(huà)圖

ts_plot(x, lags=30)

模擬的AR(1)模型是正態(tài)的。自相關(guān)系數(shù)圖（ACF）顯示滯后值之間存在顯著的序列相關(guān)性，偏自相關(guān)系數(shù)圖（PACF）則顯示在滯后1期時(shí)截尾（迅速降為0）。下面使用statsmodels構(gòu)建AR(p)模型，先用AR模型擬合上述模擬的數(shù)據(jù)，并返回估計(jì)的系數(shù)參數(shù)），然后選擇最佳滯后階數(shù)，最后與原模型設(shè)置對(duì)比看是否選擇了正確的滯后項(xiàng)。假如AR模型是正確的，那估計(jì)的系數(shù)參數(shù)將很接近真實(shí)的系數(shù)0.8，選擇的階數(shù)也會(huì)等于1。

#估計(jì)數(shù)據(jù)的AR模型參數(shù)和滯后階數(shù)

def simu_ar(data,a,maxlag=30,true_order = 1):

'''data:要擬合的數(shù)據(jù)；a為參數(shù),可以為列表；maxlag:最大滯后階數(shù)'''

# 擬合AR(p)模型

result = smt.AR(data).fit(maxlag=maxlag, ic='aic', trend='nc')

#選擇滯后階數(shù)

est_order = smt.AR(data).select_order(maxlag=maxlag,

ic='aic', trend='nc')

#參數(shù)選擇標(biāo)準(zhǔn)ic : 有四個(gè)選擇 {‘a(chǎn)ic’,’bic’,’hqic’,’t-stat’}

#趨勢(shì)項(xiàng)：trend：c是指包含常數(shù)項(xiàng)，nc為不含常數(shù)項(xiàng)

#打印結(jié)果

print(f'參數(shù)估計(jì)值：{result.params.round(2)}，

估計(jì)的滯后階數(shù)：{est_order}')

print(f'真實(shí)參數(shù)值：{a}，真實(shí)滯后階數(shù) {true_order}')

simu_ar(x,a=0.8)

參數(shù)估計(jì)值：[0.8]，估計(jì)的滯后階數(shù)：1

真實(shí)參數(shù)值：0.8，真實(shí)滯后階數(shù) 1

看下如何用AR(p)模型來(lái)擬合滬深300的對(duì)數(shù)收益

# Select best lag order for hs300 returns

import tushare as ts

token='輸入token'

pro=ts.pro_api(token)

df=pro.index_daily(ts_code='000300.SH')

df.index=pd.to_datetime(df.trade_date)

del df.index.name

df=df.sort_index()

df['ret']=np.log(df.close/df.close.shift(1))

max_lag = 30

Y=df.ret.dropna()

ts_plot(Y,lags=max_lag,title='滬深300')

result = smt.AR(Y.values).fit(maxlag=max_lag, ic='aic', trend='nc')

est_order = smt.AR(Y.values).select_order(maxlag=max_lag,

ic='aic', trend='nc')

print(f'滬深300擬合AR模型的參數(shù)：{result.params.round(2)}')

print(f'滬深300擬合AR模型的最佳滯后階數(shù) {est_order}')

滬深300擬合AR模型的參數(shù)：[ 0.03 -0.03 ...]

滬深300擬合AR模型的最佳滯后階數(shù) 15

最好的階數(shù)選擇是15或者說(shuō)有15個(gè)參數(shù)！任何模型有這么多參數(shù)在實(shí)際中不可能有用。顯然有比這個(gè)模型更好的模型可以解釋滬深300收益率走勢(shì)。

03MA模型

MA(q)模型與AR(p)模型非常相似。不同之處在于，MA(q)模型是對(duì)過(guò)去的白噪聲誤差項(xiàng)的線性組合，而不是過(guò)去觀測(cè)的線性組合。MA模型的動(dòng)機(jī)是我們可以直接通過(guò)擬合誤差項(xiàng)的模型來(lái)觀察誤差過(guò)程中的“沖擊”。在一個(gè)AR(p)模型中，通過(guò)在一系列過(guò)去的觀察中使用ACF間接觀察到這些沖擊。MA(q)模型的公式是：

下面使用Python模擬MA(1) 過(guò)程。

#這里使用arma模型進(jìn)行模擬，設(shè)定ar階數(shù)為0，即得到ma模型

alphas = np.array([0.])

betas = np.array([0.6])

ar = np.r_[1, -alphas]

ma = np.r_[1, betas]

#模擬MA的樣本數(shù)據(jù)

ma_sample = smt.arma_generate_sample(ar=ar, ma=ma, nsample=1000)

ts_plot(ma_sample, lags=30,title='MA(1)模型')

ACF函數(shù)顯示滯后1階系數(shù)顯著異于0，表明MA（1）模型適合擬合的數(shù)據(jù)。

# 對(duì)上述模擬數(shù)據(jù)進(jìn)行ARMA模型擬合

max_lag = 30

result = smt.ARMA(ma1, order=(0, 1)).fit(maxlag=max_lag,

method='mle', trend='nc')

print(result.summary())

模型估計(jì)d 滯后系數(shù)為0.6277，與真實(shí)值0.6比較接近。注意到，95％置信區(qū)間確實(shí)包含該真實(shí)值。

下面嘗試用MA(3)模型去擬合滬深300股價(jià)的對(duì)數(shù)收益，但這次并不知道真實(shí)的參數(shù)值。結(jié)果顯示，擬合的殘差自相關(guān)系數(shù)和偏自相關(guān)系數(shù)比較符合白噪聲過(guò)程，但由于存在厚尾，MA模型并不是預(yù)測(cè)滬深300未來(lái)回報(bào)的最佳模型。

max_lag = 30

result=smt.ARMA(Y.values,order(0,3)).fit(maxlag=max_lag,

method='mle', trend='nc')

print(result.summary())

resid=pd.Series(result.resid,index=Y.index)

ts_plot(resid, lags=max_lag,title='滬深300指數(shù)MA擬合殘差')

04ARMA模型

ARMA模型全稱為自回歸移動(dòng)平均模型Autoregressive Moving Average Models - ARMA(p, q)，是AR（p）和MA（q）模型之間的結(jié)合，從金融的角度理解，AR和MA模型的理論意義在于：AR（p）模型試圖捕捉（解釋）交易市場(chǎng)中經(jīng)常觀察到的動(dòng)量和均值回復(fù)效應(yīng)。MA（q）模型嘗試捕捉（解釋）在白噪聲條件下觀察到的沖擊效應(yīng)。這些沖擊效應(yīng)可以被認(rèn)為是影響觀察過(guò)程的意外事件。ARMA模型的弱點(diǎn)在于忽視了大多數(shù)金融時(shí)間序列中的波動(dòng)聚集效應(yīng)。模型的公式可以表示為:

print(result.summary())

# 下面使用ARMA(2, 2) 模型進(jìn)行模擬分析

max_lag = 30

n = 5000

burn = int(n/10)

alphas = np.array([0.5, -0.25])

betas = np.array([0.5, -0.3])

#注意ar模型1代表0階（自身），然后在其他系數(shù)前加負(fù)號(hào)

ar = np.r_[1, -alphas]

ma = np.r_[1, betas]

arma22 = smt.arma_generate_sample(ar=ar, ma=ma, nsample=n, burnin=burn)

_ = ts_plot(arma22, lags=max_lag)

result = smt.ARMA(arma22, order=(2, 2)).fit(maxlag=max_lag,

method='mle', trend='nc', burnin=burn)

結(jié)果顯示模型估計(jì)的參數(shù)與真實(shí)參數(shù)基本上吻合。下面使用ARMA模型來(lái)擬合滬深300的收益數(shù)據(jù)。ACF和PACF沒(méi)有顯示出明顯的自相關(guān)性。QQ和概率圖顯示殘差大致為正態(tài)分布但厚尾?？傮w而言，這個(gè)模型的殘差看起來(lái)不像白噪聲，說(shuō)明模型還是沒(méi)有很好的擬合其波動(dòng)性特性。

#不事先確定滯后階數(shù)，而是通過(guò)信息準(zhǔn)則選擇最佳的滯后階數(shù)

#先將初始值設(shè)置為無(wú)窮大

best_aic = np.inf

best_order = None

best_mdl = None

rng = range(5)

for i in rng:

for j in rng:

try:

tmp_mdl = smt.ARMA(Y.values, order=(i,j))

.fit(method='mle', trend='nc')

tmp_aic = tmp_mdl.aic

if tmp_aic < best_aic:

best_aic = tmp_aic

best_order = (i, j)

best_mdl = tmp_mdl

except: continue

print(f'最佳滯后階數(shù):{best_order}')

print(best_mdl.summary())

resid=pd.Series(best_mdl.resid,index=Y.index)

ts_plot(resid, lags=30,title='滬深300指數(shù)ARMA擬合殘差')

最佳滯后階數(shù):(4, 4)

05ARIMA模型

ARIMA模型全稱是差分移動(dòng)自回歸模型（Autoregressive Integrated Moving Average Models），是ARMA模型的拓展。由于現(xiàn)實(shí)中很多時(shí)間序列不是平穩(wěn)的，但可以通過(guò)差分來(lái)實(shí)現(xiàn)平穩(wěn)，即通過(guò)一階差分可以將非平穩(wěn)機(jī)游走其轉(zhuǎn)化為平穩(wěn)的白噪聲。由于前三個(gè)模型都有時(shí)間序列平穩(wěn)的假設(shè)在，如果時(shí)間序列存在明顯的上升或者下降趨勢(shì)，模型預(yù)測(cè)的效果大大折扣。對(duì)于有明顯下降或者上升趨勢(shì)的數(shù)據(jù)集，可以使用差分的方式將其轉(zhuǎn)化為平穩(wěn)序列，然后使用ARMA模型進(jìn)行擬合。假設(shè)模型經(jīng)過(guò)d次差分通過(guò)了時(shí)間序列平穩(wěn)的檢驗(yàn)，ARMA的系數(shù)為p，q，ARIMA模型為ARIMA(p，d，q）。

下面通過(guò)迭代（p，d，q）的不同組合，找到擬合滬深300收益率數(shù)據(jù)的最佳ARIMA模型。通過(guò)AIC信息準(zhǔn)則來(lái)評(píng)估每個(gè)模型，最后選取AIC最小的。

#原理與擬合ARMA模型類似

best_aic = np.inf

best_order = None

best_mdl = None

#假定最多滯后4階

pq_rng = range(5)

#假定最多差分一次

d_rng = range(2)

for i in pq_rng:

for d in d_rng:

for j in pq_rng:

try:

tmp_mdl = smt.ARIMA(Y.values, order=(i,d,j))

.fit(method='mle', trend='nc')

tmp_aic = tmp_mdl.aic

if tmp_aic < best_aic:

best_aic = tmp_aic

best_order = (i, d, j)

best_mdl = tmp_mdl

except: continue

print(f'ARIMA模型最佳階數(shù)選擇：{best_order}')

# 對(duì)擬合殘差進(jìn)行可視化

print(best_mdl.summary())

resid=pd.Series(best_mdl.resid,index=Y.index)

_ = ts_plot(resid, lags=30,title='滬深300指數(shù)ARIMA殘差')

ARIMA模型最佳階數(shù)選擇：(4, 0, 4)

最好的模型是差分為0，因?yàn)槲覀兪褂玫氖鞘找媛蕯?shù)據(jù)，相對(duì)于已經(jīng)采用了第一次對(duì)數(shù)差分來(lái)計(jì)算股票收益率。模型殘差圖結(jié)果與上面使用的ARMA模型基本相同。顯然，ARIMA模型同樣無(wú)法解釋時(shí)間序列中的條件波動(dòng)性。到這一步，時(shí)間序列的基本模型和建模步驟基本上大家已經(jīng)熟知，下面利用模型的forecast()方法進(jìn)行預(yù)測(cè)。

# 對(duì)滬深300收益率未來(lái)20天進(jìn)行預(yù)測(cè)

n_steps = 20

#分別設(shè)置95%和99%的置信度

f, err95, ci95 = best_mdl.forecast(steps=n_steps)

_, err99, ci99 = best_mdl.forecast(steps=n_steps, alpha=0.01)

date=(df.index[-1]).strftime('%Y%m%d')

cal=pro.trade_cal(exchange='', start_date=date)

idx = cal[cal.is_open==1][:20]['cal_date'].values

fc_95 = pd.DataFrame(np.column_stack([f, ci95]),

index=idx,columns=['forecast', 'lower_ci_95', 'upper_ci_95'])

fc_99 = pd.DataFrame(np.column_stack([ci99]),

index=idx, columns=['lower_ci_99', 'upper_ci_99'])

fc_all = fc_95.combine_first(fc_99)

#fc_all.head()

# 對(duì)預(yù)測(cè)的20日收益率數(shù)據(jù)進(jìn)行可視化

plt.style.use('ggplot')

fig = plt.figure(figsize=(12,7))

ax = plt.gca()

ts = df['ret'][-500:].copy()

ts.plot(ax=ax, label='HS300收益率')

# 樣本內(nèi)預(yù)測(cè)

pred=best_mdl.predict(np.arange(len(ts)) [0], np.arange(len(ts))[-1])

pf=pd.Series(pred,index=ts.index)

pf.plot(ax=ax, style='r-', label='樣本內(nèi)預(yù)測(cè)')

fc_all.index=pd.to_datetime(fc_all.index)

fc_all.plot(ax=ax)

plt.fill_between(fc_all.index, fc_all.lower_ci_95,

fc_all.upper_ci_95, color='gray', alpha=0.7)

plt.fill_between(fc_all.index, fc_all.lower_ci_99,

fc_all.upper_ci_99, color='gray', alpha=0.2)

plt.title('{} 天HS300收益率預(yù)測(cè)ARIMA{}'.format(n_steps,

best_order))

plt.legend(loc='best', fontsize=10)

plt.show()

06結(jié)語(yǔ)

本文主要以滬深300指數(shù)收益率數(shù)據(jù)為例，簡(jiǎn)要介紹了時(shí)間序列四大經(jīng)典模型的基本原理和Python的簡(jiǎn)單應(yīng)用，不難發(fā)現(xiàn)，這些模型在擬合和預(yù)測(cè)滬深300指數(shù)收益率上顯得力不從心。實(shí)際上，這些模型有一個(gè)潛在假設(shè)是干擾項(xiàng)的方差是固定不變的，但是研究者發(fā)現(xiàn)金融經(jīng)濟(jì)數(shù)據(jù)（如股票收益率）大都存在異方差現(xiàn)象，因此傳統(tǒng)的時(shí)間序列模型無(wú)法獲得可靠的估計(jì)結(jié)果。為了解決金融資產(chǎn)收益率序列波動(dòng)聚集的難題，學(xué)者們提出了ARCH、GARCH以及協(xié)整模型，后續(xù)推文將會(huì)對(duì)這一方面的應(yīng)用進(jìn)行詳細(xì)介紹。

參考資料：

1. statsmodels官方文檔

2. Time Series Analysis (TSA) in Python - Linear Models to GARCH

著作權(quán)歸作者所有

本站僅提供存儲(chǔ)服務(wù)，所有內(nèi)容均由用戶發(fā)布，如發(fā)現(xiàn)有害或侵權(quán)內(nèi)容，請(qǐng)點(diǎn)擊舉報(bào)。

九色国产,午夜在线视频,新黄色网址,九九色综合,天天做夜夜做久久做狠狠,天天躁夜夜躁狠狠躁2021a,久久不卡一区二区三区