Data Format Conversion and Iteration
ChannelCMT edited this page Jun 24, 2019
·
2 revisions
aList = [1,2,3,4,5]
bList = [1,3,5,7,9]
aList+bList
[1, 2, 3, 4, 5, 1, 3, 5, 7, 9]
import numpy as np
aArray = np.array(aList)
bArray = np.array(bList)
aArray+bArray
array([ 2, 5, 8, 11, 14])
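The two results above differ because `+` concatenates Python lists but adds NumPy arrays element by element. The following is a minimal sketch of getting the element-wise result without leaving plain lists, and of converting an array back to a list; the snake_case variable names are illustrative and not part of the original page.

```python
import numpy as np

a_list = [1, 2, 3, 4, 5]
b_list = [1, 3, 5, 7, 9]

# `+` on lists concatenates, so element-wise addition needs an explicit loop.
elementwise_sum = [x + y for x, y in zip(a_list, b_list)]

# Converting to arrays gives the same numbers; tolist() converts the
# resulting ndarray back to a plain Python list.
array_sum = (np.array(a_list) + np.array(b_list)).tolist()

print(elementwise_sum)  # [2, 5, 8, 11, 14]
print(array_sum)        # [2, 5, 8, 11, 14]
```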
import pandas as pd
aSeries = pd.Series(aArray, index=pd.date_range("20180101",periods=len(aArray), freq="D"))
bSeries = pd.Series(bArray, index=pd.date_range("20180101",periods=len(bArray), freq="D"))
cSeries = aSeries+bSeries
dSeries = aSeries*bSeries
abDF = pd.DataFrame({'a': aSeries, 'b':bSeries})
abDF
| | a | b |
|---|---|---|
| 2018-01-01 | 1 | 1 |
| 2018-01-02 | 2 | 3 |
| 2018-01-03 | 3 | 5 |
| 2018-01-04 | 4 | 7 |
| 2018-01-05 | 5 | 9 |
pd.concat([aSeries, bSeries], axis=1, keys=['a','b'])
| | a | b |
|---|---|---|
| 2018-01-01 | 1 | 1 |
| 2018-01-02 | 2 | 3 |
| 2018-01-03 | 3 | 5 |
| 2018-01-04 | 4 | 7 |
| 2018-01-05 | 5 | 9 |
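The conversions so far go from list to array to Series to DataFrame. As a rough sketch of the reverse direction, the example below pulls a column back out as a Series, then as an array, then as a list. It rebuilds the same data under illustrative names and assumes a pandas version that provides `to_numpy()` (0.24 or later).

```python
import pandas as pd

index = pd.date_range("20180101", periods=5, freq="D")
ab_df = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [1, 3, 5, 7, 9]}, index=index)

# DataFrame -> Series: selecting one column keeps the date index.
a_series = ab_df["a"]

# Series -> ndarray -> list.
a_array = a_series.to_numpy()
a_list = a_array.tolist()

# DataFrame -> 2-D ndarray (one row per date, one column per 'a' and 'b').
ab_array = ab_df.to_numpy()

print(a_list)          # [1, 2, 3, 4, 5]
print(ab_array.shape)  # (5, 2)
```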
cdDF = pd.DataFrame({'a': cSeries, 'b':dSeries})
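# Note: pd.Panel was deprecated in pandas 0.20 and removed in 0.25, so the Panel lines here need an older pandas.
abcdPN = pd.Panel({'abDF': abDF, 'cdDF': cdDF})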
abcdPN.transpose(2,1,0).to_frame(False)
| major | minor | a | b |
|---|---|---|---|
| 2018-01-01 | abDF | 1 | 1 |
| | cdDF | 2 | 1 |
| 2018-01-02 | abDF | 2 | 3 |
| | cdDF | 5 | 6 |
| 2018-01-03 | abDF | 3 | 5 |
| | cdDF | 8 | 15 |
| 2018-01-04 | abDF | 4 | 7 |
| | cdDF | 11 | 28 |
| 2018-01-05 | abDF | 5 | 9 |
| | cdDF | 14 | 45 |
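`pd.Panel` is not available in current pandas releases. One possible way to get the same long-format table is a `MultiIndex` built with `pd.concat`. This is a sketch under that assumption, not the page author's method; the level names `major` and `minor` are chosen only to mirror the table above.

```python
import pandas as pd

index = pd.date_range("20180101", periods=5, freq="D")
a_series = pd.Series([1, 2, 3, 4, 5], index=index)
b_series = pd.Series([1, 3, 5, 7, 9], index=index)

ab_df = pd.DataFrame({"a": a_series, "b": b_series})
cd_df = pd.DataFrame({"a": a_series + b_series, "b": a_series * b_series})

# Stack the two frames under an outer key, then move the date level first
# so that each date lists its 'abDF' row followed by its 'cdDF' row.
stacked = pd.concat({"abDF": ab_df, "cdDF": cd_df}, names=["minor", "major"])
long_df = stacked.swaplevel().sort_index()

print(long_df.head(4))
```

Keeping the frame name as an index level means a single date can be selected with `long_df.loc["2018-01-02"]`, which returns both the `abDF` and `cdDF` rows for that day.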
Compute the sum of each column
{item: value.sum() for item, value in abDF.items()}  # items() replaces iteritems(), which was removed in pandas 2.0
{'a': 15, 'b': 25}
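Iterating over columns shows how a DataFrame can be traversed, but the same totals are also available without a loop. A minimal sketch, rebuilding the same data under an illustrative name:

```python
import pandas as pd

index = pd.date_range("20180101", periods=5, freq="D")
ab_df = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [1, 3, 5, 7, 9]}, index=index)

# sum() reduces over rows by default (axis=0), giving one total per column;
# to_dict() turns the resulting Series into a {column: total} mapping.
column_totals = ab_df.sum().to_dict()

print(column_totals)
```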
Compute the sum of each row
{index: value.sum() for index, value in abDF.iterrows()}
{Timestamp('2018-01-01 00:00:00', freq='D'): 2,
Timestamp('2018-01-02 00:00:00', freq='D'): 5,
Timestamp('2018-01-03 00:00:00', freq='D'): 8,
Timestamp('2018-01-04 00:00:00', freq='D'): 11,
Timestamp('2018-01-05 00:00:00', freq='D'): 14}
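For row totals, `sum(axis=1)` gives the same numbers as the `iterrows()` loop in one call. When explicit iteration is still wanted, `itertuples()` is a common alternative. The sketch below shows both on the same data; the variable names are illustrative.

```python
import pandas as pd

index = pd.date_range("20180101", periods=5, freq="D")
ab_df = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [1, 3, 5, 7, 9]}, index=index)

# axis=1 reduces across columns, giving one total per row (per date).
row_totals = ab_df.sum(axis=1)

# itertuples() yields one namedtuple per row; the `Index` field holds the
# row label and the remaining fields are named after the columns.
manual_totals = {row.Index: row.a + row.b for row in ab_df.itertuples()}

print(row_totals.tolist())  # [2, 5, 8, 11, 14]
```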