Basic
Basic¶
indexing¶
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html
avoid what is called chained indexing (not just for performance):
change column names¶
df.columns = df.columns.str.lower()
#common rows
df_index = np.intersect1d(df_percentage.index, df_repeated_EV_num.index)
df_index = df_percentage.index.intersection(df_repeated_EV_num.index)
df1 = df2.loc[df_index,:] * df3.loc[df_index,df2.columns]
get one val by row index and col label¶
df.at[df.index[0], 'A']
df.loc[df.index[0], 'A']
df.get_value(df.index[0], 'A')
df.iat[0, df.columns.get_loc('A')]
df.iloc[0, df.columns.get_loc('A')]
df.get_value(0, df.columns.get_loc('A'), takable=True)
replace col vals by dic¶
#slower
df.replace({'col1': dic})
df['col1'].replace(dic, inplace=True)
#faster
df['col1'].map(dic) #not matched will be changed to NaN
df['col1'].map(di).fillna(df['col1']) #keep unmatched values