# Indexing

In [1]:
import pandas as pd

df = pd.read_pickle("ftx_btc-rerp_20220306-20220312.pickle")
df.head()

,startTime,time,open,high,low,close,volume
0,2022-03-09 20:00:00+00:00,2022-03-09 20:00:00,42229.0,42271.0,42228.0,42254.0,4986935.0
1,2022-03-09 20:05:00+00:00,2022-03-09 20:05:00,42254.0,42357.0,42246.0,42253.0,15411970.0
2,2022-03-09 20:10:00+00:00,2022-03-09 20:10:00,42253.0,42318.0,42123.0,42146.0,17410200.0
3,2022-03-09 20:15:00+00:00,2022-03-09 20:15:00,42147.0,42183.0,42080.0,42089.0,8337687.0
4,2022-03-09 20:20:00+00:00,2022-03-09 20:20:00,42089.0,42190.0,42089.0,42110.0,5248858.0


## The loc indexer

`DataFrame.loc` is an indexer for accessing DataFrame elements by label.

Specify elements in the form `loc[row label, column label]`.

```{note}
Row and column labels are sometimes called "indexes", but this section calls them "labels" to distinguish them from positional indexes.
```

In [2]:
df.loc[3, "open"]

42147.0

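`loc` also accepts slices and lists to select elements. Unlike an ordinary Python slice, a label slice includes its end label, so `2:5` returns rows 2 through 5.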

In [3]:
df.loc[2:5, ["time", "volume"]]

,time,volume
2,2022-03-09 20:10:00,17410200.0
3,2022-03-09 20:15:00,8337687.0
4,2022-03-09 20:20:00,5248858.0
5,2022-03-09 20:25:00,3678867.0
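
The end-inclusive behavior of label slices is easier to see with non-numeric labels. The following is a minimal sketch using a small hypothetical frame (`prices` and its values are made up for illustration and are not part of the dataset above).

```python
import pandas as pd

# Hypothetical frame with string row labels (not the FTX data above)
prices = pd.DataFrame(
    {"open": [10.0, 11.0, 12.0, 13.0], "volume": [100.0, 200.0, 300.0, 400.0]},
    index=["a", "b", "c", "d"],
)

# A label slice includes both endpoints, so "b":"c" selects two rows
subset = prices.loc["b":"c", ["volume"]]
print(subset)

# A single row label plus a single column label returns a scalar
value = prices.loc["c", "open"]
print(value)
```

Passing a list of column labels (`["volume"]`) keeps the result a DataFrame, while a single column label would return a Series.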


## The iloc indexer

`DataFrame.iloc` is an indexer for accessing DataFrame elements by integer position.

Specify elements in the form `iloc[row position, column position]`.

In [4]:
df.iloc[3, 2]

42147.0

In [5]:
df.iloc[2:5, [1, -1]]

,time,volume
2,2022-03-09 20:10:00,17410200.0
3,2022-03-09 20:15:00,8337687.0
4,2022-03-09 20:20:00,5248858.0
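
When the row labels are not `0, 1, 2, ...`, the difference between `loc` and `iloc` becomes visible. The sketch below uses a hypothetical frame (`frame` and its values are illustrative assumptions) to select the same rows both ways.

```python
import pandas as pd

# Hypothetical frame whose row labels differ from the row positions
frame = pd.DataFrame(
    {"open": [10.0, 11.0, 12.0, 13.0], "close": [10.5, 11.5, 12.5, 13.5]},
    index=[10, 20, 30, 40],
)

# iloc: positions 1 and 2 (the end position 3 is excluded), first and last column
by_position = frame.iloc[1:3, [0, -1]]

# loc: labels 20 through 30 (the end label is included)
by_label = frame.loc[20:30, ["open", "close"]]

print(by_position.equals(by_label))

# Negative positions count from the end, as with Python lists
single = frame.iloc[0, -1]
print(single)
```

Position slices follow Python's end-exclusive convention, which is why `iloc[2:5]` in the cell above returns three rows while `loc[2:5]` returned four.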


## DatetimeIndex

`DatetimeIndex` is an index specialized for time series data. It lets you access elements by specifying dates and times flexibly.

The `set_index` method makes the specified column the index.

In [6]:
df.set_index("time", inplace=True)
df.head()

time,startTime,open,high,low,close,volume
2022-03-09 20:00:00,2022-03-09 20:00:00+00:00,42229.0,42271.0,42228.0,42254.0,4986935.0
2022-03-09 20:05:00,2022-03-09 20:05:00+00:00,42254.0,42357.0,42246.0,42253.0,15411970.0
2022-03-09 20:10:00,2022-03-09 20:10:00+00:00,42253.0,42318.0,42123.0,42146.0,17410200.0
2022-03-09 20:15:00,2022-03-09 20:15:00+00:00,42147.0,42183.0,42080.0,42089.0,8337687.0
2022-03-09 20:20:00,2022-03-09 20:20:00+00:00,42089.0,42190.0,42089.0,42110.0,5248858.0


In [7]:
df.index

DatetimeIndex(['2022-03-09 20:00:00', '2022-03-09 20:05:00',
               '2022-03-09 20:10:00', '2022-03-09 20:15:00',
               '2022-03-09 20:20:00', '2022-03-09 20:25:00',
               '2022-03-09 20:30:00', '2022-03-09 20:35:00',
               '2022-03-09 20:40:00', '2022-03-09 20:45:00',
               ...
               '2022-03-15 00:10:00', '2022-03-15 00:15:00',
               '2022-03-15 00:20:00', '2022-03-15 00:25:00',
               '2022-03-15 00:30:00', '2022-03-15 00:35:00',
               '2022-03-15 00:40:00', '2022-03-15 00:45:00',
               '2022-03-15 00:50:00', '2022-03-15 00:55:00'],
              dtype='datetime64[ns]', name='time', length=1500, freq=None)
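
`set_index` produces a `DatetimeIndex` only when the column already holds datetime values. As a sketch, assuming the timestamps arrive as strings (the frame `raw` below is a hypothetical three-row excerpt), the column can be converted with `pd.to_datetime` first, and `set_index` can return a new frame instead of modifying the original in place.

```python
import pandas as pd

# Hypothetical excerpt in which the timestamps are plain strings
raw = pd.DataFrame(
    {
        "time": ["2022-03-09 20:00:00", "2022-03-09 20:05:00", "2022-03-09 20:10:00"],
        "close": [42254.0, 42253.0, 42146.0],
    }
)

# Convert the strings to datetime64 values
raw["time"] = pd.to_datetime(raw["time"])

# Without inplace=True, set_index returns a new frame and leaves raw unchanged
indexed = raw.set_index("time")

print(type(indexed.index).__name__)
print(indexed.index.name)
```

Returning a new frame keeps the original available for comparison, which can be convenient in a notebook where cells are rerun out of order.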

You can select rows with `datetime` or `time` objects.

In [8]:
import datetime

df.loc[datetime.datetime(2022, 3, 9, 20), "volume"]

4986934.9804

In [9]:
df.loc[datetime.time(20, 0), :]

time,startTime,open,high,low,close,volume
2022-03-09 20:00:00,2022-03-09 20:00:00+00:00,42229.0,42271.0,42228.0,42254.0,4986935.0
2022-03-10 20:00:00,2022-03-10 20:00:00+00:00,39437.0,39455.0,39372.0,39402.0,7649057.0
2022-03-11 20:00:00,2022-03-11 20:00:00+00:00,38797.0,38837.0,38759.0,38787.0,7301834.0
2022-03-12 20:00:00,2022-03-12 20:00:00+00:00,39031.0,39055.0,38991.0,39010.0,2578932.0
2022-03-13 20:00:00,2022-03-13 20:00:00+00:00,38865.0,38866.0,38738.0,38774.0,19958910.0
2022-03-14 20:00:00,2022-03-14 20:00:00+00:00,38824.0,38920.0,38808.0,38904.0,14238040.0


In [10]:
df.loc[pd.Timestamp("2022-03-09 20:00"), :]

startTime    2022-03-09 20:00:00+00:00
open                           42229.0
high                           42271.0
low                            42228.0
close                          42254.0
volume                    4986934.9804
Name: 2022-03-09 20:00:00, dtype: object
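
The difference between the two kinds of key can be sketched on a small hypothetical frame (`ticks`, with made-up volumes at 12-hour intervals): a `datetime` picks one exact timestamp, while a `time` picks that time of day on every date.

```python
import datetime

import pandas as pd

# Hypothetical data: 03-09 20:00, 03-10 08:00, 03-10 20:00, 03-11 08:00
idx = pd.date_range("2022-03-09 20:00", periods=4, freq=pd.Timedelta(hours=12))
ticks = pd.DataFrame({"volume": [1.0, 2.0, 3.0, 4.0]}, index=idx)

# A datetime matches one exact timestamp and returns a scalar
one = ticks.loc[datetime.datetime(2022, 3, 10, 20), "volume"]
print(one)

# A time matches every row at that time of day, across all dates
daily = ticks.loc[datetime.time(20, 0), :]
print(daily)

# at_time is a method that performs the same time-of-day selection
same = ticks.at_time("20:00")
```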

You can also select rows with strings.

In [11]:
# 20:00 on March 9, 2022
df.loc["2022-03-09 20:00", :]

startTime    2022-03-09 20:00:00+00:00
open                           42229.0
high                           42271.0
low                            42228.0
close                          42254.0
volume                    4986934.9804
Name: 2022-03-09 20:00:00, dtype: object

In [12]:
# March 9, 2022
df.loc["2022-03-09", :].head()

time,startTime,open,high,low,close,volume
2022-03-09 20:00:00,2022-03-09 20:00:00+00:00,42229.0,42271.0,42228.0,42254.0,4986935.0
2022-03-09 20:05:00,2022-03-09 20:05:00+00:00,42254.0,42357.0,42246.0,42253.0,15411970.0
2022-03-09 20:10:00,2022-03-09 20:10:00+00:00,42253.0,42318.0,42123.0,42146.0,17410200.0
2022-03-09 20:15:00,2022-03-09 20:15:00+00:00,42147.0,42183.0,42080.0,42089.0,8337687.0
2022-03-09 20:20:00,2022-03-09 20:20:00+00:00,42089.0,42190.0,42089.0,42110.0,5248858.0


In [13]:
df.loc["2022-03-09", :].tail()

time,startTime,open,high,low,close,volume
2022-03-09 23:35:00,2022-03-09 23:35:00+00:00,42128.0,42150.0,42043.0,42043.0,30642210.0
2022-03-09 23:40:00,2022-03-09 23:40:00+00:00,42043.0,42062.0,41971.0,42041.0,7510289.0
2022-03-09 23:45:00,2022-03-09 23:45:00+00:00,42041.0,42083.0,41990.0,42003.0,5706220.0
2022-03-09 23:50:00,2022-03-09 23:50:00+00:00,42003.0,42019.0,41975.0,41997.0,3546661.0
2022-03-09 23:55:00,2022-03-09 23:55:00+00:00,41997.0,42033.0,41943.0,41960.0,9030276.0


In [14]:
# March 2022
df.loc["2022-03", :].head()

time,startTime,open,high,low,close,volume
2022-03-09 20:00:00,2022-03-09 20:00:00+00:00,42229.0,42271.0,42228.0,42254.0,4986935.0
2022-03-09 20:05:00,2022-03-09 20:05:00+00:00,42254.0,42357.0,42246.0,42253.0,15411970.0
2022-03-09 20:10:00,2022-03-09 20:10:00+00:00,42253.0,42318.0,42123.0,42146.0,17410200.0
2022-03-09 20:15:00,2022-03-09 20:15:00+00:00,42147.0,42183.0,42080.0,42089.0,8337687.0
2022-03-09 20:20:00,2022-03-09 20:20:00+00:00,42089.0,42190.0,42089.0,42110.0,5248858.0


In [15]:
df.loc["2022-03", :].tail()

time,startTime,open,high,low,close,volume
2022-03-15 00:35:00,2022-03-15 00:35:00+00:00,39558.0,39562.0,39460.0,39481.0,7827251.0
2022-03-15 00:40:00,2022-03-15 00:40:00+00:00,39481.0,39574.0,39469.0,39567.0,5586437.0
2022-03-15 00:45:00,2022-03-15 00:45:00+00:00,39567.0,39614.0,39505.0,39506.0,9568077.0
2022-03-15 00:50:00,2022-03-15 00:50:00+00:00,39506.0,39548.0,39506.0,39531.0,3365713.0
2022-03-15 00:55:00,2022-03-15 00:55:00+00:00,39535.0,39580.0,39532.0,39571.0,2119015.0


In [16]:
df.loc["2022-03-10 13:00":"2022-03-11 03:00", :]

time,startTime,open,high,low,close,volume
2022-03-10 13:00:00,2022-03-10 13:00:00+00:00,39262.0,39311.0,39246.0,39281.0,1.068628e+07
2022-03-10 13:05:00,2022-03-10 13:05:00+00:00,39281.0,39284.0,39201.0,39216.0,7.875157e+06
2022-03-10 13:10:00,2022-03-10 13:10:00+00:00,39216.0,39265.0,39213.0,39243.0,7.673845e+06
2022-03-10 13:15:00,2022-03-10 13:15:00+00:00,39243.0,39326.0,39236.0,39307.0,1.134806e+07
2022-03-10 13:20:00,2022-03-10 13:20:00+00:00,39307.0,39342.0,39273.0,39309.0,1.251940e+07
...,...,...,...,...,...,...
2022-03-11 02:40:00,2022-03-11 02:40:00+00:00,38407.0,38435.0,38369.0,38377.0,4.307957e+06
2022-03-11 02:45:00,2022-03-11 02:45:00+00:00,38377.0,38415.0,38364.0,38373.0,4.826126e+06
2022-03-11 02:50:00,2022-03-11 02:50:00+00:00,38373.0,38455.0,38243.0,38426.0,1.326355e+07
2022-03-11 02:55:00,2022-03-11 02:55:00+00:00,38426.0,38451.0,38240.0,38292.0,1.406263e+07
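
String selection can be sketched on a small hypothetical frame (`bars`, six made-up 5-minute rows spanning midnight). A string that is coarser than the index, such as a date, selects every row in that period, and a string slice includes both endpoints.

```python
import pandas as pd

# Hypothetical rows: 23:50 and 23:55 on 03-09, then 00:00 to 00:15 on 03-10
idx = pd.date_range("2022-03-09 23:50", periods=6, freq=pd.Timedelta(minutes=5))
bars = pd.DataFrame({"close": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}, index=idx)

# A date string selects every row that falls on that date
day = bars.loc["2022-03-10", :]
print(len(day))

# A string slice includes the start and the end timestamp
window = bars.loc["2022-03-09 23:55":"2022-03-10 00:05", :]
print(window)
```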


In [17]:
# From 10:00 to 11:00
df.between_time("10:00", "11:00")

time,startTime,open,high,low,close,volume
2022-03-10 10:00:00,2022-03-10 10:00:00+00:00,39161.0,39232.0,39122.0,39224.0,1.232851e+07
2022-03-10 10:05:00,2022-03-10 10:05:00+00:00,39224.0,39238.0,39172.0,39202.0,9.854052e+06
2022-03-10 10:10:00,2022-03-10 10:10:00+00:00,39202.0,39238.0,39159.0,39220.0,9.966978e+06
2022-03-10 10:15:00,2022-03-10 10:15:00+00:00,39220.0,39234.0,39144.0,39205.0,9.997023e+06
2022-03-10 10:20:00,2022-03-10 10:20:00+00:00,39205.0,39246.0,39175.0,39202.0,8.136784e+06
...,...,...,...,...,...,...
2022-03-14 10:40:00,2022-03-14 10:40:00+00:00,39015.0,39079.0,38988.0,39040.0,9.049719e+06
2022-03-14 10:45:00,2022-03-14 10:45:00+00:00,39040.0,39048.0,39016.0,39025.0,4.972424e+06
2022-03-14 10:50:00,2022-03-14 10:50:00+00:00,39025.0,39096.0,39022.0,39089.0,2.936812e+06
2022-03-14 10:55:00,2022-03-14 10:55:00+00:00,39089.0,39134.0,39072.0,39112.0,4.526722e+06
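
By default `between_time` includes both boundary times. The sketch below, on a hypothetical 30-minute frame (`bars` and its values are illustrative), also shows the `inclusive` argument, which is assumed to be available here (it was added in pandas 1.4).

```python
import pandas as pd

# Hypothetical rows at 09:30, 10:00, 10:30, 11:00 and 11:30
idx = pd.date_range("2022-03-10 09:30", periods=5, freq=pd.Timedelta(minutes=30))
bars = pd.DataFrame({"close": [1.0, 2.0, 3.0, 4.0, 5.0]}, index=idx)

# Default: both 10:00 and 11:00 are included
both = bars.between_time("10:00", "11:00")
print(both)

# inclusive="left" keeps 10:00 but drops 11:00 (requires pandas 1.4 or later)
left = bars.between_time("10:00", "11:00", inclusive="left")
print(left)
```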


## Boolean indexing

Applying a comparison operator to a Series or DataFrame returns a Series or DataFrame of Boolean values.

In [18]:
df.loc[:, "volume"] > 1e+7

time
2022-03-09 20:00:00    False
2022-03-09 20:05:00     True
2022-03-09 20:10:00     True
2022-03-09 20:15:00    False
2022-03-09 20:20:00    False
                       ...  
2022-03-15 00:35:00    False
2022-03-15 00:40:00    False
2022-03-15 00:45:00    False
2022-03-15 00:50:00    False
2022-03-15 00:55:00    False
Name: volume, Length: 1500, dtype: bool

Passing Boolean values to `loc` selects the elements where the value is `True`.

In [19]:
df.loc[df.loc[:, "volume"] > 1e+7, :]

time,startTime,open,high,low,close,volume
2022-03-09 20:05:00,2022-03-09 20:05:00+00:00,42254.0,42357.0,42246.0,42253.0,1.541197e+07
2022-03-09 20:10:00,2022-03-09 20:10:00+00:00,42253.0,42318.0,42123.0,42146.0,1.741020e+07
2022-03-09 20:40:00,2022-03-09 20:40:00+00:00,41947.0,41993.0,41826.0,41850.0,1.928753e+07
2022-03-09 20:50:00,2022-03-09 20:50:00+00:00,41877.0,41920.0,41724.0,41865.0,2.270338e+07
2022-03-09 21:05:00,2022-03-09 21:05:00+00:00,41897.0,41968.0,41819.0,41822.0,1.060387e+07
...,...,...,...,...,...,...
2022-03-14 23:55:00,2022-03-14 23:55:00+00:00,39620.0,39731.0,39614.0,39689.0,1.003464e+07
2022-03-15 00:00:00,2022-03-15 00:00:00+00:00,39689.0,39835.0,39608.0,39700.0,5.662910e+07
2022-03-15 00:05:00,2022-03-15 00:05:00+00:00,39700.0,39759.0,39647.0,39718.0,1.092920e+07
2022-03-15 00:10:00,2022-03-15 00:10:00+00:00,39718.0,39734.0,39552.0,39555.0,1.141035e+07
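
Several conditions can be combined with `&` (and), `|` (or) and `~` (not). A minimal sketch follows, using the first four `close` and `volume` values shown at the top of this page in a hypothetical frame named `bars`. The 42200 price threshold is an arbitrary value chosen for illustration.

```python
import pandas as pd

# First four rows of the data above, rebuilt by hand as a small frame
bars = pd.DataFrame(
    {
        "close": [42254.0, 42253.0, 42146.0, 42089.0],
        "volume": [4986935.0, 15411970.0, 17410200.0, 8337687.0],
    },
    index=pd.date_range("2022-03-09 20:00", periods=4, freq=pd.Timedelta(minutes=5)),
)

# Each condition needs its own parentheses because & binds tighter than > and <
busy_and_low = bars.loc[(bars["volume"] > 1e7) & (bars["close"] < 42200), :]
print(busy_and_low)

# ~ inverts a Boolean Series: rows whose volume is NOT above 1e7
quiet = bars.loc[~(bars["volume"] > 1e7), :]
print(quiet)
```

Python's `and`, `or` and `not` keywords do not work element-wise on a Series, so the operator forms are used instead.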


## Modifying elements

Assigning to elements selected by an index changes their values.

In [20]:
df.loc["2022-03-09 20:00", "volume"] = 0
df.head()

time,startTime,open,high,low,close,volume
2022-03-09 20:00:00,2022-03-09 20:00:00+00:00,42229.0,42271.0,42228.0,42254.0,0.0
2022-03-09 20:05:00,2022-03-09 20:05:00+00:00,42254.0,42357.0,42246.0,42253.0,15411970.0
2022-03-09 20:10:00,2022-03-09 20:10:00+00:00,42253.0,42318.0,42123.0,42146.0,17410200.0
2022-03-09 20:15:00,2022-03-09 20:15:00+00:00,42147.0,42183.0,42080.0,42089.0,8337687.0
2022-03-09 20:20:00,2022-03-09 20:20:00+00:00,42089.0,42190.0,42089.0,42110.0,5248858.0


In [21]:
df.loc["2022-03-09 20:00", :] = 0
df.head()

time,startTime,open,high,low,close,volume
2022-03-09 20:00:00,0,0.0,0.0,0.0,0.0,0.0
2022-03-09 20:05:00,2022-03-09 20:05:00+00:00,42254.0,42357.0,42246.0,42253.0,15411970.0
2022-03-09 20:10:00,2022-03-09 20:10:00+00:00,42253.0,42318.0,42123.0,42146.0,17410200.0
2022-03-09 20:15:00,2022-03-09 20:15:00+00:00,42147.0,42183.0,42080.0,42089.0,8337687.0
2022-03-09 20:20:00,2022-03-09 20:20:00+00:00,42089.0,42190.0,42089.0,42110.0,5248858.0
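
As the output above shows, assigning to a whole row also overwrites the `startTime` cell with `0`. One way to avoid that, sketched below on a hypothetical three-row frame (`bars` and `edited` are illustrative names), is to name only the numeric columns in the assignment. The same pattern works with a Boolean mask.

```python
import pandas as pd

# Hypothetical three-row excerpt with numeric columns only
bars = pd.DataFrame(
    {
        "close": [42254.0, 42253.0, 42146.0],
        "volume": [4986935.0, 15411970.0, 17410200.0],
    },
    index=pd.date_range("2022-03-09 20:00", periods=3, freq=pd.Timedelta(minutes=5)),
)

# Work on a copy so the original frame stays intact
edited = bars.copy()

# Conditional assignment: zero the volume wherever it exceeds 1e7
edited.loc[edited["volume"] > 1e7, "volume"] = 0.0

# Assign to selected columns of one row instead of the whole row
edited.loc["2022-03-09 20:00", ["close", "volume"]] = 0.0

print(edited)
```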
