
Data Format Conversion and Iteration

ChannelCMT edited this page Jun 24, 2019 · 2 revisions

1. List to NumPy

aList = [1,2,3,4,5]
bList = [1,3,5,7,9]

aList+bList
[1, 2, 3, 4, 5, 1, 3, 5, 7, 9]
import numpy as np

aArray = np.array(aList)
bArray = np.array(bList)
aArray+bArray
array([ 2,  5,  8, 11, 14])
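
The `+` operator concatenates lists but adds NumPy arrays element by element. The sketch below shows the same contrast from the other side: an element-wise sum written with plain lists, and the NumPy result converted back with `tolist()`. The names `pairSum` and `backToList` are illustrative and do not appear in the original page.

```python
import numpy as np

aList = [1, 2, 3, 4, 5]
bList = [1, 3, 5, 7, 9]

# With plain lists, an element-wise sum needs an explicit comprehension.
pairSum = [a + b for a, b in zip(aList, bList)]

# The same sum via NumPy, converted back to a plain Python list.
backToList = (np.array(aList) + np.array(bList)).tolist()

print(pairSum)     # [2, 5, 8, 11, 14]
print(backToList)  # [2, 5, 8, 11, 14]
```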

2. NumPy to Series

import pandas as pd

aSeries = pd.Series(aArray, index=pd.date_range("20180101",periods=len(aArray), freq="D"))
bSeries = pd.Series(bArray, index=pd.date_range("20180101",periods=len(bArray), freq="D"))
cSeries = aSeries+bSeries
dSeries = aSeries*bSeries
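
Series arithmetic matches values by index label rather than by position. The following sketch rebuilds the two series and then adds a series whose dates start one day later, to show what happens when the indexes only partly overlap. The `shifted` and `misaligned` names are hypothetical additions for illustration.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("20180101", periods=5, freq="D")
aSeries = pd.Series(np.array([1, 2, 3, 4, 5]), index=idx)
bSeries = pd.Series(np.array([1, 3, 5, 7, 9]), index=idx)

cSeries = aSeries + bSeries  # values 2, 5, 8, 11, 14
dSeries = aSeries * bSeries  # values 1, 6, 15, 28, 45

# Hypothetical series starting one day later: only four dates overlap with aSeries.
shifted = pd.Series(
    np.array([1, 3, 5, 7, 9]),
    index=pd.date_range("20180102", periods=5, freq="D"),
)

# The result covers the union of both indexes; dates present in only one
# series come out as NaN.
misaligned = aSeries + shifted
print(misaligned)
```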

3. Series to DataFrame

abDF = pd.DataFrame({'a': aSeries, 'b':bSeries})
abDF
            a  b
2018-01-01  1  1
2018-01-02  2  3
2018-01-03  3  5
2018-01-04  4  7
2018-01-05  5  9
pd.concat([aSeries, bSeries], axis=1, keys=['a','b'])
            a  b
2018-01-01  1  1
2018-01-02  2  3
2018-01-03  3  5
2018-01-04  4  7
2018-01-05  5  9
cdDF = pd.DataFrame({'a': cSeries, 'b':dSeries})
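
The conversion also works in the reverse direction. A minimal sketch, assuming pandas 0.24 or later for `to_numpy()`; the names `aBack` and `values` are illustrative only.

```python
import pandas as pd

idx = pd.date_range("20180101", periods=5, freq="D")
abDF = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [1, 3, 5, 7, 9]}, index=idx)

# Selecting one column gives back a Series that keeps the date index.
aBack = abDF["a"]

# to_numpy() drops the index and column labels and returns a 2-D array.
values = abDF.to_numpy()

print(type(aBack).__name__)  # Series
print(values.shape)          # (5, 2)
```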

4. DataFrame to Panel

abcdPN = pd.Panel({'abDF': abDF, 'cdDF': cdDF})
abcdPN.transpose(2,1,0).to_frame(False)
                   a   b
major      minor
2018-01-01 abDF    1   1
           cdDF    2   1
2018-01-02 abDF    2   3
           cdDF    5   6
2018-01-03 abDF    3   5
           cdDF    8  15
2018-01-04 abDF    4   7
           cdDF   11  28
2018-01-05 abDF    5   9
           cdDF   14  45
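
`pd.Panel` was deprecated in pandas 0.20 and removed in 0.25, so the call above fails on current releases. One possible way to get the same long-format table is a MultiIndex DataFrame built with `pd.concat`. This is a sketch under that assumption rather than an official drop-in replacement; the variable `stacked` is an illustrative name.

```python
import pandas as pd

idx = pd.date_range("20180101", periods=5, freq="D")
abDF = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [1, 3, 5, 7, 9]}, index=idx)
cdDF = pd.DataFrame({"a": abDF["a"] + abDF["b"], "b": abDF["a"] * abDF["b"]})

# Passing a dict to concat uses its keys as the outer index level.
stacked = pd.concat({"abDF": abDF, "cdDF": cdDF})

# Put the date first, as in the Panel output, and label the levels to match.
stacked = stacked.swaplevel().sort_index()
stacked.index.names = ["major", "minor"]

print(stacked)
```

A single cell can then be selected with a (date, frame name) pair, for example `stacked.loc[(pd.Timestamp("2018-01-02"), "cdDF"), "b"]`.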

5. iteritems()

Compute the sum of each column.

{item: value.sum() for item, value in abDF.iteritems()}
{'a': 15, 'b': 25}
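
`DataFrame.iteritems()` was deprecated in pandas 1.5 and removed in 2.0; `DataFrame.items()` yields the same (column name, Series) pairs. A minimal sketch of the column sums with `items()`, alongside the vectorized `sum()`; `colSums` and `vectorized` are illustrative names.

```python
import pandas as pd

idx = pd.date_range("20180101", periods=5, freq="D")
abDF = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [1, 3, 5, 7, 9]}, index=idx)

# items() iterates over columns as (name, Series) pairs.
colSums = {name: col.sum() for name, col in abDF.items()}

# The same totals without an explicit loop.
vectorized = abDF.sum().to_dict()

print(colSums)
print(vectorized)
```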

6. iterrows()

Compute the sum of each row.

{index: value.sum() for index, value in abDF.iterrows()}
{Timestamp('2018-01-01 00:00:00', freq='D'): 2,
 Timestamp('2018-01-02 00:00:00', freq='D'): 5,
 Timestamp('2018-01-03 00:00:00', freq='D'): 8,
 Timestamp('2018-01-04 00:00:00', freq='D'): 11,
 Timestamp('2018-01-05 00:00:00', freq='D'): 14}
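
`iterrows()` builds a new Series for every row, which tends to be slow on large frames. For a plain row total, `sum(axis=1)` gives the same numbers in one call. A short sketch comparing the two; `rowSums` and `vectorized` are illustrative names.

```python
import pandas as pd

idx = pd.date_range("20180101", periods=5, freq="D")
abDF = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [1, 3, 5, 7, 9]}, index=idx)

# iterrows() yields (index label, row as Series) pairs.
rowSums = {index: row.sum() for index, row in abDF.iterrows()}

# Vectorized equivalent: sum across the columns of each row.
vectorized = abDF.sum(axis=1)

print(vectorized)
```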