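Having covered the simple TWAP, let's look at the slightly more complex VWAP. TWAP's defining trait is simplicity: simplicity means a low chance of systematic error, but it also means mediocre results. If a stock's intraday volume fluctuates little, TWAP can do quite well; if intraday volume fluctuates a lot, TWAP may perform very poorly.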
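Hence VWAP (Volume Weighted Average Price). It uses the average intraday volume as its weights: trade more when volume is high and less when volume is low. In theory a VWAP strategy can track the market VWAP price perfectly, since both are volume-weighted, but in practice the forecast of "intraday volume" is not that accurate. As things stand, whichever method is used to forecast intraday volume, the realised VWAP ends up within roughly 10 bp above or below the market VWAP, that is, about one part in a thousand.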
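To make the definition and the "10 bp" figure concrete, here is a minimal sketch in base R. The prices and volumes are made up purely for illustration, and `vwap` and `slippage_bp` are hypothetical helper names, not functions from any library.

```r
# Volume-weighted average price: sum(price * volume) / sum(volume).
vwap <- function(price, volume) {
  stopifnot(length(price) == length(volume), sum(volume) > 0)
  sum(price * volume) / sum(volume)
}

# Deviation of an execution price from a benchmark, in basis points
# (1 bp = 0.01%, so 10 bp = 0.1%).
slippage_bp <- function(exec_price, benchmark_price) {
  (exec_price - benchmark_price) / benchmark_price * 1e4
}

# Made-up market data for four intraday buckets (illustrative only).
mkt_price  <- c(10.00, 10.10, 10.05, 9.95)
mkt_volume <- c(4000, 1000, 1000, 4000)

# Made-up fills: our order traded at the same prices but with a flatter
# size distribution than the market.
our_volume <- c(300, 200, 200, 300)

market_vwap <- vwap(mkt_price, mkt_volume)
our_vwap    <- vwap(mkt_price, our_volume)
twap_price  <- mean(mkt_price)  # equal time weights, volume ignored

cat("market VWAP:", market_vwap, "\n")
cat("our VWAP:   ", our_vwap, "\n")
cat("TWAP price: ", twap_price, "\n")
cat("slippage vs market VWAP (bp):", slippage_bp(our_vwap, market_vwap), "\n")
```

The closer the distribution of our fills is to the market's volume distribution, the smaller the slippage against the market VWAP becomes.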
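As noted above, the intraday volume forecast is the deciding factor in VWAP performance. Intraday volume is just as hard to forecast accurately as intraday price, so most algorithmic trading products on the market today use the average trading volume over some past period to approximate the current day's intraday volume. It is not accurate, but at least it has some basis... This "average intraday volume over a period" is called the Volume Profile, and it is the most important part of the VWAP algorithm. The figure below shows the Volume Profile of one stock (grey) against its actual volume on a particular day (red); you can see that there is still a gap between the two.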
For some large-cap stocks, however, the gap is not large.
So for most stocks with good liquidity (small fluctuations), VWAP is still applicable.
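The idea of a Volume Profile can be sketched in a few lines of base R. This is a toy illustration under stated assumptions: the volumes are invented, only four 5-minute buckets and three look-back days are used, and the column names `TDATE`, `MINTIME` and `MINTQ` are simply borrowed from the database query later in this post.

```r
# Toy history: 3 trading days x 4 intraday buckets for one stock
# (all volumes are made up for illustration).
hist_volume <- data.frame(
  TDATE   = rep(c("20150302", "20150303", "20150304"), each = 4),
  MINTIME = rep(c(935, 940, 945, 950), times = 3),
  MINTQ   = c(500, 200, 100, 400,
              700, 100, 200, 600,
              600, 300, 300, 200)
)

# Volume Profile: average volume per intraday bucket over the look-back days.
profile <- aggregate(MINTQ ~ MINTIME, data = hist_volume, FUN = mean)

# Normalise so the profile sums to 1 and can be used directly as weights.
profile$weight <- profile$MINTQ / sum(profile$MINTQ)
print(profile)

# Compare with one made-up "actual" day: mean absolute gap between shapes.
actual        <- c(900, 100, 100, 300)
actual_weight <- actual / sum(actual)
profile_gap   <- mean(abs(actual_weight - profile$weight))
cat("mean absolute weight gap:", profile_gap, "\n")
```

A small gap means the historical profile is a usable stand-in for the day's real volume curve; a large gap is what drives the tracking error against the market VWAP.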
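As mentioned, the main work in the VWAP algorithm is computing the Volume Profile. This needs trading data for the whole market. Depending on the precision required, it generally takes 1-minute, 5-minute or 10-minute data, and the look-back window might be 7, 21, 30 or 60 days. The computation is fairly time-consuming: for three weeks of 5-minute data, SQL takes about 30 minutes and R about 3 hours... Fortunately the Volume Profile does not change much from day to day, so once it has been computed it can be used for a week.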
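The R procedure for computing the VolumeProfile is shown below; I am putting it out there in the hope that it draws out better ideas.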

    # Get 21 days of 5-minute data to build the Volume Profile
    library(RODBC)
    library(timeDate)
    library(reshape2)

    # Connect to the database
    channel <- odbcConnect("Nanshan", uid = "", pwd = "")

    # Build and run one query per monthly table
    tmp.ReadyRbind <- NULL   # accumulates the rows returned by each query
    MonthBack <- 6           # loop control: number of months to read
    SearchMap <- timeSequence(from = "2014-03-05", to = Sys.timeDate(), by = "month")
    SearchMap <- SearchMap[(length(SearchMap) - (MonthBack - 1)):length(SearchMap)]
    while (MonthBack > 0) {
      Date_yyyymm <- gsub("-", "", substr(SearchMap[MonthBack], 1, 7))
      SQLQuery <- paste("SELECT SECCODE,SECNAME,TDATE,MINTIME,MINTQ FROM [GTA_SEL1_TRDMIN_",
                        Date_yyyymm, "].[dbo].[SHL1_TRDMIN05_", Date_yyyymm, "]", sep = "")
      T1 <- sqlQuery(channel, SQLQuery)             # query and save
      tmp.ReadyRbind <- rbind(T1, tmp.ReadyRbind)   # append this month's rows
      MonthBack <- MonthBack - 1
    }
    rm(Date_yyyymm, SQLQuery, T1, MonthBack)
    odbcClose(channel)

    # Count the valid trading days available for each stock
    DuplicatedMap <- tmp.ReadyRbind[, c("SECCODE", "TDATE")]
    DuplicatedMap <- DuplicatedMap[!duplicated(DuplicatedMap), ]
    CountedMap <- dcast(melt(DuplicatedMap, id = "SECCODE"), SECCODE ~ variable, length)

    # Stocks with at least 21 days go into tmp.Volume_Profile_Code
    tmp.Volume_Profile_Code <- CountedMap[CountedMap$TDATE >= 21, 1]
    cat("We have", length(tmp.Volume_Profile_Code), "stocks with a valid Volume Profile\n")
    # Stocks with fewer than 21 days go into tmp.Cycle_Code
    tmp.Cycle_Code <- CountedMap[CountedMap$TDATE < 21, ]
    cat("Unfortunately", nrow(tmp.Cycle_Code), "stocks do not meet the Volume Profile rule\n")
    cat("tmp.Cycle_Code holds the codes and valid-day counts of the invalid stocks\n")
    tmp.VolumeProfile <- tmp.ReadyRbind[tmp.ReadyRbind$SECCODE %in% tmp.Volume_Profile_Code, ]

    # Average the most recent 21 days per stock and 5-minute bucket
    VolumeProfile <- NULL
    for (i in tmp.Volume_Profile_Code) {
      tmp <- tmp.VolumeProfile[tmp.VolumeProfile$SECCODE == i, ]
      AllCalendar <- tmp[!duplicated(tmp[, 1:3]), ]
      ValiedCalendar <- AllCalendar[order(AllCalendar$TDATE, decreasing = TRUE), ][1:21, ]
      VolumeProfile_RWA <- tmp[tmp$TDATE %in% ValiedCalendar$TDATE, ]
      VolumeProfile_part <- dcast(melt(VolumeProfile_RWA, id = c("SECCODE", "SECNAME", "MINTIME")),
                                  SECCODE + SECNAME + MINTIME ~ variable, mean)
      VolumeProfile <- rbind(VolumeProfile, VolumeProfile_part)
    }

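With the VolumeProfile in hand, the VWAP order-splitting algorithm is very close to the TWAP one. Roughly: allocate the order according to the VolumeProfile distribution, strip off any odd remainder below 100 shares, allocate the stripped remainders by the VolumeProfile again, and so on until the leftover is below 100 shares. An R example follows; of course, the core of a production algorithm is generally not written in R.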

    # Get the order
    ## Format: symbol,price,startTime,endTime
    ## e.g.    600123,10.05,HHMMSS,HHMMSS
    tm.p <- read.csv(file = "E:Research DocTWAPorder.csv", header = TRUE)
    # Load the Volume Profile
    load(file = "F:TESTVWAPVolumeProfile-20150305.Rdata")

    # Transform and standardise the order (the script assumes a single order row)
    MyOrder <- data.frame(
      Symbol     = formatC(tm.p$Symbol, width = 6, flag = "0"),
      Price      = tm.p$Price,
      StartTime  = formatC(tm.p$StartTime, width = 6, flag = "0"),
      EndTime    = formatC(tm.p$EndTime, width = 6, flag = "0"),
      T_Quantity = tm.p$Quantity,
      stringsAsFactors = FALSE
    )
    rm(tm.p)

    # Use VWAP to split the big order
    library(timeDate)
    Interval_mins <- difftimeDate(timeDate(MyOrder$EndTime, format = "%H%M%S"),
                                  timeDate(MyOrder$StartTime, format = "%H%M%S"),
                                  units = "mins")

    # Build the order queue: snap the HHMMSS start/end times to 5-minute HHMM marks
    StartTime_5min <- floor(as.numeric(MyOrder$StartTime) / 500) * 5
    EndTime_5min <- ceiling(as.numeric(MyOrder$EndTime) / 500) * 5
    in_window <- VolumeProfile$SECCODE == MyOrder$Symbol &
      VolumeProfile$MINTIME >= StartTime_5min &
      VolumeProfile$MINTIME <= EndTime_5min
    Order_Plan <- VolumeProfile[in_window, c(3, 5, 4)]
    Order_Plan[, 3] <- 0.0
    names(Order_Plan) <- c("Time", "Quantity", "Price")
    # debug Start Here 2150306 18:16

    # Part I: allocate in proportion to the profile, rounded down to lots of 100
    Raw_Quantity <- MyOrder$T_Quantity * Order_Plan[, "Quantity"] / sum(Order_Plan[, "Quantity"])
    Order_Plan[, "Quantity"] <- Raw_Quantity - (Raw_Quantity %% 100)

    # Part II: spread the leftover in 100-share lots over evenly spaced buckets
    leaved_quantity <- MyOrder$T_Quantity - sum(Order_Plan$Quantity)
    # never index past the rows that actually exist in Order_Plan
    n_slots <- min(floor(as.numeric(Interval_mins) / 5), nrow(Order_Plan))
    while (leaved_quantity > 100 && n_slots >= 1) {
      Avaliable_Order <- floor(leaved_quantity / 100)
      Time_Gap <- ceiling(n_slots / Avaliable_Order)
      idx <- seq(from = Time_Gap, by = Time_Gap, to = n_slots)
      Order_Plan[idx, "Quantity"] <- Order_Plan[idx, "Quantity"] + 100
      # subtract exactly what was added (the original subtracted a fractional count)
      leaved_quantity <- leaved_quantity - length(idx) * 100
    }

    # Last part: add the final leftover, rounded to the nearest lot, to the last bucket
    last_row <- nrow(Order_Plan)
    Order_Plan[last_row, 2] <- Order_Plan[last_row, 2] + round(leaved_quantity / 100) * 100
    # End of Order_Plan

    # Debug
    sum(Order_Plan$Quantity)
    Order_Plan$Quantity
    # some of these objects only exist if the loop ran, hence suppressWarnings
    suppressWarnings(rm(Avaliable_Order, Interval_mins, Time_Gap, idx, leaved_quantity))

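The splitting rule described above (allocate by profile, strip the sub-100 remainders, allocate the remainders again) can also be written as a small standalone function. This is only a sketch under assumptions: `split_by_profile` is a hypothetical name, and the final step, which hands the last whole lots to the heaviest buckets once no bucket earns a full lot on its own, is my own choice rather than the evenly spaced placement used in the script above.

```r
# Split `total` shares across buckets in proportion to `weights`,
# trading only whole lots of `lot` shares (hypothetical helper).
split_by_profile <- function(total, weights, lot = 100) {
  stopifnot(total >= 0, length(weights) > 0, all(weights >= 0), sum(weights) > 0)
  weights   <- weights / sum(weights)
  plan      <- numeric(length(weights))
  remaining <- total

  # Each pass allocates what is left proportionally, floored to whole lots.
  repeat {
    pass <- floor(remaining * weights / lot) * lot
    if (sum(pass) == 0) break
    plan      <- plan + pass
    remaining <- remaining - sum(pass)
  }

  # The passes stall once no bucket earns a full lot. Assumption: give the
  # remaining whole lots, one each, to the buckets with the largest weights.
  lots_left <- floor(remaining / lot)
  if (lots_left > 0) {
    idx       <- order(weights, decreasing = TRUE)[seq_len(lots_left)]
    plan[idx] <- plan[idx] + lot
    remaining <- remaining - lots_left * lot
  }

  # `leftover` is the odd-lot residue (< lot) that cannot be scheduled.
  list(plan = plan, leftover = remaining)
}

# Usage with a made-up four-bucket profile.
res <- split_by_profile(10000, c(600, 200, 200, 400))
print(res$plan)
cat("scheduled:", sum(res$plan), " leftover:", res$leftover, "\n")
```

Returning the odd-lot residue separately, instead of rounding it into the last bucket, keeps the schedule from ever exceeding the order size; what to do with that residue is left to the caller.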
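OK. For the next installment I will decide, depending on how things go, whether to learn a bit more about algorithmic trading or to pick up some quantitative timing / stock selection.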