Reindexing in pandas with reindex()

Conventions:



import pandas as pd
import numpy as np

Reindexing with reindex

reindex() is an important method of pandas objects. It creates a new object whose data conforms to a new index.
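
As a minimal sketch of the "new object" behaviour described above (the Series `s` and its labels are made up for illustration), the following shows that values travel with their labels and that the original object is left untouched:

```python
import pandas as pd

# Hypothetical data, used only for this illustration.
s = pd.Series([10, 20, 30], index=['x', 'y', 'z'])

# reindex returns a new Series arranged according to the requested labels.
reordered = s.reindex(['z', 'y', 'x'])

print(reordered.tolist())   # values follow their labels, not their positions
print(s.index.tolist())     # the original Series keeps its own index
print(reordered is s)       # a different object is returned
```

Because the result is a separate object, it has to be assigned to a name if it is to be used later.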
1. Reindexing a Series object



se1=pd.Series([1,7,3,9],index=['d','c','a','f'])
se1

Output:

d    1
c    7
a    3
f    9
dtype: int64


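Calling reindex rearranges the data to match the new index. Labels that do not exist in the original are filled with NaN, which is also why the dtype changes from int64 to float64.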



se2=se1.reindex(['a','b','c','d','e','f'])
se2

Output:

a    3.0
b    NaN
c    7.0
d    1.0
e    NaN
f    9.0
dtype: float64
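
The dtype change above can be checked directly. This is a small sketch that reuses the article's `se1`; the names `labels`, `with_nan`, `with_zero` and `missing` are introduced here only for illustration. It also shows that supplying `fill_value` on a Series avoids NaN, so the integers are kept:

```python
import pandas as pd

se1 = pd.Series([1, 7, 3, 9], index=['d', 'c', 'a', 'f'])
labels = ['a', 'b', 'c', 'd', 'e', 'f']

# Default behaviour: new labels get NaN, which forces a float dtype.
with_nan = se1.reindex(labels)

# With fill_value, new labels get 0 instead, so no NaN is introduced.
with_zero = se1.reindex(labels, fill_value=0)

# Labels that were absent from the original Series.
missing = with_nan[with_nan.isna()].index.tolist()

print(with_nan.dtype, with_zero.dtype)
print(missing)
```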


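Pass method=... to choose how missing values are filled in when reindexing:

method='ffill' or 'pad': forward fill (carry the previous valid value forward)
method='bfill' or 'backfill': backward fill (take the next valid value)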


se3=pd.Series(['blue','red','black'],index=[0,2,4])
se4=se3.reindex(range(6),method='ffill')
se4

Output:

0     blue
1     blue
2      red
3      red
4    black
5    black
dtype: object
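
For comparison, here is a sketch of backward filling on the same `se3`, plus the optional `limit` argument, which caps how many consecutive labels may be filled. The names `back` and `limited` and the longer range of 8 labels are illustrative choices, not part of the original example:

```python
import pandas as pd

se3 = pd.Series(['blue', 'red', 'black'], index=[0, 2, 4])

# Backward fill: each new label takes the value of the next existing label.
# Label 5 has no later label to borrow from, so it stays NaN.
back = se3.reindex(range(6), method='bfill')

# Forward fill, but fill at most one consecutive new label after each value.
# Labels 6 and 7 are more than one step past label 4, so they stay NaN.
limited = se3.reindex(range(8), method='ffill', limit=1)

print(back)
print(limited)
```

Note that method-based filling requires the original index to be monotonically increasing or decreasing.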


2. Reindexing a DataFrame object

For a DataFrame, reindex can change both the row index and the column index.


df1=pd.DataFrame(np.arange(9).reshape(3,3),index=['a','c','d'],columns=['one','two','four'])
df1

Output:



  
    
      
   one  two  four
a    0    1     2
c    3    4     5
d    6    7     8
    
  


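By default, reindex applies to the row index.

A single sequence passed on its own is treated as row labels, so it cannot be used to reindex the columns.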


df1.reindex(['a','b','c','d'])

Output:


  
    
      
   one  two  four
a  0.0  1.0   2.0
b  NaN  NaN   NaN
c  3.0  4.0   5.0
d  6.0  7.0   8.0
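
To reindex only the columns, the labels have to be passed by keyword. The sketch below reuses the article's `df1`; the names `cols`, `same` and `rows` are illustrative. It also shows what happens when column names are passed positionally by mistake:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(9).reshape(3, 3),
                   index=['a', 'c', 'd'],
                   columns=['one', 'two', 'four'])

# Reindex the columns only; the row index is left as it is.
cols = df1.reindex(columns=['one', 'two', 'three', 'four'])

# Equivalent spelling using the axis argument.
same = df1.reindex(['one', 'two', 'three', 'four'], axis='columns')

# Passed positionally, the labels are treated as ROW labels.
# None of them exist in the row index, so every value is NaN.
rows = df1.reindex(['one', 'two'])

print(cols)
print(rows)
```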
    
  




df1.reindex(index=['a','b','c','d'],columns=['one','two','three','four'])


Output:


  
    
      
   one  two  three  four
a  0.0  1.0    NaN   2.0
b  NaN  NaN    NaN   NaN
c  3.0  4.0    NaN   5.0
d  6.0  7.0    NaN   8.0
    
  


Pass fill_value=n to use n in place of the missing values:



df1.reindex(index=['a','b','c','d'],columns=['one','two','three','four'],fill_value=100)

Output:



  
    
      
   one  two  three  four
a    0    1    100     2
b  100  100    100   100
c    3    4    100     5
d    6    7    100     8
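
One detail worth checking is that `fill_value` only applies to labels newly introduced by the reindex. NaN values that were already in the data are not replaced. A minimal sketch, using a made-up DataFrame `df` with one pre-existing NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 'b' already holds a NaN before reindexing.
df = pd.DataFrame({'x': [1.0, np.nan]}, index=['a', 'b'])

# 'c' is a new label and receives fill_value; 'b' keeps its original NaN.
out = df.reindex(['a', 'b', 'c'], fill_value=0)

print(out)
```

To replace pre-existing NaN values as well, fillna() can be called on the result.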
    
  

