Thursday, September 21, 2017

Pandas

This is just a running list of useful things about Pandas that I pick up as I learn.




If you have data coming in from a .csv file, use this: df = pd.read_csv('file.csv')
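Here is a minimal sketch of the CSV load. The CSV text and column names are made up for illustration, and io.StringIO stands in for a real file path so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical CSV content; in real use, pass a path such as 'file.csv'.
csv_text = "name,score\nalice,90\nbob,85\n"

# read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)
print(df.dtypes)
```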

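If a column in your dataframe holds strings and you want numbers, use this: df['column'] = pd.to_numeric(df['column'], errors='coerce'). With errors='coerce', anything that can't be parsed becomes NaN instead of raising an error.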
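A quick sketch of what the conversion does with bad values. The "price" column and its contents are invented for the example.

```python
import pandas as pd

# Hypothetical column of strings, one of which is not a number.
df = pd.DataFrame({"price": ["1.5", "2", "n/a"]})

# errors='coerce' turns unparseable values into NaN instead of raising.
df["price"] = pd.to_numeric(df["price"], errors="coerce")

print(df["price"])
```

The column ends up as floats, with NaN where "n/a" used to be.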

If you have date/time values in Unix epoch format (seconds since 1970-01-01), you can convert them using this: df['date'] = pd.to_datetime(df['date'], unit='s')
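A small sketch of the epoch conversion, assuming the values are whole seconds. The two timestamps are sample values chosen for the example.

```python
import pandas as pd

# Hypothetical epoch values in seconds: the epoch itself and a 2017 date.
df = pd.DataFrame({"date": [0, 1505952000]})

# unit='s' tells pandas the integers are seconds since 1970-01-01 (UTC).
df["date"] = pd.to_datetime(df["date"], unit="s")

print(df["date"])
```

If the source data is in milliseconds instead, unit='ms' is the matching option.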

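If you want to select columns by position: df.iloc[:, :2] (older code used df.ix[:, :2], but .ix was deprecated and then removed in pandas 1.0; use .iloc for positions and .loc for labels)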

df.describe() gives you count, mean, std, min, percentiles (25%, 50%, 75%), and max for each numeric column
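To see what describe() returns, here is a sketch on a tiny made-up dataframe.

```python
import pandas as pd

# Hypothetical single numeric column.
df = pd.DataFrame({"x": [1, 2, 3, 4]})

# describe() returns a DataFrame with one row per summary statistic.
summary = df.describe()

print(summary)
print(summary.loc["mean", "x"])
```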

Group by a column and average the other numeric columns: df.groupby('column').mean()

Filter rows where a column equals some value: df[df['column'] == value]
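A sketch of grouping and filtering on the same dataframe. The "team" and "points" columns and the value "a" are placeholders I picked for the example, not anything from real data.

```python
import pandas as pd

# Hypothetical data: two teams with some points each.
df = pd.DataFrame({"team": ["a", "a", "b"], "points": [1, 3, 10]})

# Average points per team; selecting the column first keeps the result numeric.
means = df.groupby("team")["points"].mean()

# Boolean mask: keep only the rows where team equals "a".
only_a = df[df["team"] == "a"]

print(means)
print(only_a)
```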


Grab specific columns by name: df1 = df[['a','b']]

data.iloc[:, 0:2] # first two columns of data frame with all rows
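The two ways of grabbing columns (by name and by position) side by side, on a made-up dataframe with columns a, b, and c.

```python
import pandas as pd

# Hypothetical three-column dataframe.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# By name: a list of column labels inside the brackets.
by_name = df[["a", "b"]]

# By position: all rows, columns 0 and 1 (the end of the slice is excluded).
by_pos = df.iloc[:, 0:2]

print(by_name.equals(by_pos))
```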

This will square each cell in columns 1 through 6 (the slice 1:7 excludes position 7), skipping the first column: df.iloc[:, 1:7] = df.iloc[:, 1:7].apply(numpy.square, axis=0)
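A smaller version of the squaring trick, assuming a dataframe whose first column is an id that should be left alone. The column names are invented, and the slice 1: here just means "everything after the first column".

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe: an id column followed by two numeric columns.
df = pd.DataFrame({"id": [1, 2], "x": [2, 3], "y": [4, 5]})

# Square every cell except those in the first column.
df.iloc[:, 1:] = df.iloc[:, 1:].apply(np.square, axis=0)

print(df)
```

For plain squaring, df.iloc[:, 1:] ** 2 should give the same values without apply.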

Great link on managing Jupyter: https://www.datacamp.com/community/tutorials/tutorial-jupyter-notebook

Great guide to pandas: http://www.lining0806.com:1234/pandas/Pandas%20DataFrame%20Notes.pdf

https://www.shanelynn.ie/select-pandas-dataframe-rows-and-columns-using-iloc-loc-and-ix/
https://www.youtube.com/watch?v=POe1cufDWFs
