-- Living Mobile --: AI/ML: Using dataframe to group and summarise

Friday, June 3, 2022

AI/ML: Using dataframe to group and summarise

data.groupby(['month']).groups.keys()

Out[59]: ['2014-12', '2014-11', '2015-02', '2015-03', '2015-01']

len(data.groupby(['month']).groups['2014-11'])

Out[61]: 230

data.groupby('month').first()

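===> This gives the first non-null value of each column for each month. That is usually, but not always, the first row; use .head(1) for the literal first row of each group.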
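A minimal sketch of how first() differs from head(1) when a group starts with a missing value. The toy frame and its values are made up for illustration; only the month and duration column names come from this post.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; "item" is an assumed extra column.
data = pd.DataFrame({
    "month": ["2014-11", "2014-11", "2014-12"],
    "duration": [np.nan, 34.0, 13.0],
    "item": ["call", "sms", "data"],
})

# first(): first non-null value of each column, taken per group
first_values = data.groupby("month").first()

# head(1): the literal first row of each group, missing values included
first_rows = data.groupby("month").head(1)

print(first_values)
print(first_rows)
```

Here first() mixes values from two different rows for 2014-11 (duration from the second row, item from the first), while head(1) keeps the original row intact.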

data.groupby('month')['duration'].sum()

===> This gives the total duration for each month

data.groupby('month')['date'].count()

===> This gives the number of non-null date entries in each month (use .size() to count rows regardless of nulls)
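A small sketch of count() versus size(), assuming a made-up frame in which one date is missing:

```python
import pandas as pd

# Hypothetical toy data; one date is deliberately missing.
data = pd.DataFrame({
    "month": ["2014-11", "2014-11", "2014-12"],
    "date": ["15/11/14 06:58", None, "01/12/14 09:00"],
})

# count(): non-null values of the selected column, per group
non_null_dates = data.groupby("month")["date"].count()

# size(): number of rows per group, whether or not values are missing
rows_per_month = data.groupby("month").size()

print(non_null_dates)
print(rows_per_month)
```

The two only agree when the counted column has no missing values.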

data.groupby('month')['duration'].sum()

===> Produces a pandas Series (single brackets select one column)

data.groupby('month')[['duration']].sum()

===> Produces a pandas DataFrame (double brackets select a list of columns)
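A quick check of the two return types, using a made-up frame with the same column names:

```python
import pandas as pd

# Hypothetical toy data for illustration only.
data = pd.DataFrame({
    "month": ["2014-11", "2014-11", "2014-12"],
    "duration": [10.0, 20.0, 5.0],
})

# Single brackets: one column selected, so the result is a Series
as_series = data.groupby("month")["duration"].sum()

# Double brackets: a list of columns selected, so the result is a DataFrame
as_frame = data.groupby("month")[["duration"]].sum()

print(type(as_series).__name__, type(as_frame).__name__)
```

The numbers are identical; only the container differs, which matters when the result is merged or plotted later.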

data.groupby('month', as_index=False).agg({"duration": "sum"})

===> The groupby output has an index (or MultiIndex) on its rows built from the grouping columns. To keep the grouping columns as regular columns instead, pass as_index=False to groupby.
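A short sketch comparing the default output with as_index=False, again on made-up data:

```python
import pandas as pd

# Hypothetical toy data for illustration only.
data = pd.DataFrame({
    "month": ["2014-11", "2014-11", "2014-12"],
    "duration": [10.0, 20.0, 5.0],
})

# Default: "month" becomes the row index of the result
indexed = data.groupby("month").agg({"duration": "sum"})

# as_index=False: "month" stays a regular column, rows get a 0..n-1 index
flat = data.groupby("month", as_index=False).agg({"duration": "sum"})

print(indexed)
print(flat)
```

Calling reset_index() on the default result gives the same layout, so as_index=False is mostly a convenience.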

df_analyze.groupby(['week_label','evt']).agg({'evt':'count','t' : 'sum'})

===> Passing a dict to agg is powerful: each column gets its own aggregation, here a count of the evt field and a sum of t (time)

agg_procedure = {

'evt':'count',

't' : 'sum'

}

df_analyze.groupby(['week_label','evt']).agg(agg_procedure)

===> This is equivalent to the previous call; the aggregation mapping is simply defined once as a dict (agg_procedure) so it can be reused

df_analyze.groupby(['week_label','evt']).agg({

'dt' : ['min','max', 'sum'],

'evt' : 'count',

't' : ['min', 'first', 'nunique']

})
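When a column is given a list of functions, as in the call above, the result has two-level (MultiIndex) columns such as ("dt", "min"). A minimal sketch of flattening them, assuming a made-up df_analyze with the same column names. The evt count is left out here to keep the example small, and the joined names like dt_min are just one possible naming choice.

```python
import pandas as pd

# Hypothetical toy data; values are invented for illustration.
df_analyze = pd.DataFrame({
    "week_label": ["w1", "w1", "w1", "w2"],
    "evt": ["open", "open", "close", "open"],
    "dt": [5, 7, 2, 4],
    "t": [10, 10, 30, 40],
})

summary = df_analyze.groupby(["week_label", "evt"]).agg({
    "dt": ["min", "max", "sum"],
    "t": ["min", "first", "nunique"],
})

# Join each (column, function) pair into a single name, e.g. "dt_min"
flat = summary.copy()
flat.columns = ["_".join(col) for col in flat.columns]

# Turn the grouping keys back into regular columns
flat = flat.reset_index()

print(flat)
```

Flat column names are easier to work with when the summary is exported or joined to another table.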

references:

https://www.shanelynn.ie/summarising-aggregation-and-grouping-data-in-python-pandas/
