python - Fastest way to add an extra row to a groupby in pandas
I'm trying to create a new row for each group in a DataFrame by copying the group's last row and then modifying some of its values. My approach is as follows, but the concat step appears to be the bottleneck (I tried append too). Any suggestions?
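    import pandas as pd

    def gennewobs(df):
        # Index label of the row with the highest observation number
        lastrowindex = df.obsnumber.idxmax()
        # Double brackets keep the copy as a one-row DataFrame
        # (.loc replaces .ix, which newer pandas versions no longer provide)
        row = df.loc[[lastrowindex]].copy()
        # changes other values in row here
        df = pd.concat([df, row], ignore_index=True)
        return df

    df = df.groupby("group").apply(gennewobs)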
Edit 1: I have a bunch of data where the last observation of each group falls on a different date. I want to create a final observation for all groups on the current date.
    group  date       days since last observation
    a      1/1/2014   0
    a      1/10/2014  9
    b      1/5/2014   0
    b      1/25/2014  20
    b      1/27/2014  2
If we pretend the current date is 1/31/2014, this becomes:
    group  date       days since last observation
    a      1/1/2014   0
    a      1/10/2014  9
    a      1/31/2014  21
    b      1/5/2014   0
    b      1/25/2014  20
    b      1/27/2014  2
    b      1/31/2014  4
I've also tried setting with enlargement, and it was the slowest of the techniques. Any ideas?
Thanks to user1827356, I sped it up by a factor of 100 by taking the operation out of the apply. For some reason first was dropping the group column, so I used idxmax instead.
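    def gennewobs(df):
        # A single idxmax call finds the last row of every group, with no apply
        lastrowindex = df.groupby("group")["date"].idxmax()
        rows = df.loc[lastrowindex]
        # changes other values in rows here
        df = pd.concat([df, rows], ignore_index=True)
        # sort_values replaces DataFrame.sort, which newer pandas versions no longer provide
        df = df.sort_values(["group", "date"], ascending=True)
        return df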
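For anyone who wants to try this on the example data from the edit above, here is a minimal sketch of the same idea. The helper name `add_current_date_rows` and the column name `days_since_last` are my own assumptions for illustration, and it assumes the date column is already a datetime dtype.

```python
import pandas as pd

def add_current_date_rows(df, today):
    """Append one row per group dated `today` (hypothetical helper)."""
    # Index label of the latest observation in each group, found in one pass.
    last_idx = df.groupby("group")["date"].idxmax()
    rows = df.loc[last_idx].copy()
    # Days elapsed between each group's last observation and `today`.
    rows["days_since_last"] = (today - rows["date"]).dt.days
    rows["date"] = today
    # One concat for all groups instead of one concat per group.
    out = pd.concat([df, rows], ignore_index=True)
    return out.sort_values(["group", "date"]).reset_index(drop=True)

# Example data from the question (group labels "a" and "b").
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b"],
    "date": pd.to_datetime(
        ["2014-01-01", "2014-01-10", "2014-01-05", "2014-01-25", "2014-01-27"]
    ),
    "days_since_last": [0, 9, 0, 20, 2],
})

result = add_current_date_rows(df, pd.Timestamp("2014-01-31"))
print(result)
```

With 1/31/2014 as the current date, this should give group a a new row with 21 days and group b a new row with 4 days, matching the expected table. The speedup comes from calling concat once on all the copied rows rather than once inside every group.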