Python Pandas
Pivoting - Dataframe
pandas provides two functions for pivoting a
DataFrame:
1. pivot()
2. pivot_table()
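from collections import OrderedDict
from pandas import DataFrame

# Column name -> list of values (note: the amounts are stored as strings)
table = OrderedDict((
    ('ITEM', ['TV', 'TV', 'AC', 'AC']),
    ('COMPANY', ['LG', 'VIDEOCON', 'LG', 'SONY']),
    ('RUPEES', ['12000', '10000', '15000', '14000']),
    ('USD', ['700', '650', '800', '750'])
))
d = DataFrame(table)
print("DATA OF DATAFRAME")
print(d)
# One row per ITEM, one column per COMPANY, cells filled from RUPEES
p = d.pivot(index='ITEM', columns='COMPANY', values='RUPEES')
print("\n\nDATA OF PIVOT")
print(p)
# Select the TV row, then the LG column
print(p[p.index == 'TV'].LG.values)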
#pivot() creates a new table/DataFrame whose columns are the unique values in
COMPANY and whose rows are indexed by the unique values of ITEM. The last
statement of the program above returns the value for item TV and company LG, i.e. 12000.
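A single cell of the pivoted table can also be read by label. The following is a minimal sketch that reuses the table from the example above; the use of .loc and the variable name tv_lg are illustrative choices, not part of the original program.

```python
from collections import OrderedDict
import pandas as pd

table = OrderedDict((
    ("ITEM", ["TV", "TV", "AC", "AC"]),
    ("COMPANY", ["LG", "VIDEOCON", "LG", "SONY"]),
    ("RUPEES", ["12000", "10000", "15000", "14000"]),
    ("USD", ["700", "650", "800", "750"]),
))
d = pd.DataFrame(table)
p = d.pivot(index="ITEM", columns="COMPANY", values="RUPEES")

# Label-based lookup of one cell: row 'TV', column 'LG'
tv_lg = p.loc["TV", "LG"]
print(tv_lg)  # the amounts were stored as strings, so this is the string '12000'

# An ITEM/COMPANY combination that is absent from the data becomes NaN
print(p.loc["TV", "SONY"])
```

Because the RUPEES column holds strings, the pivoted cells are strings too; converting the column to numbers first would be needed before any arithmetic.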
#Common Problem in Pivoting
The pivot() method takes at least two column names as parameters: the index and the
columns. A problem arises when multiple rows have the same values for these two
columns: pivot() cannot decide which value belongs in the corresponding cell of the
pivoted table, so it raises a ValueError instead of returning a result. The following
diagram depicts the problem:
# RUPEES must hold numeric values for the mean to be computed
d.pivot_table(index='ITEM', columns='COMPANY',
              values='RUPEES', aggfunc=np.mean)
In essence pivot_table is a generalisation of pivot, which allows you to
aggregate multiple values with the same destination in the pivoted table.
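The difference between the two functions can be sketched with a small table in which one ITEM/COMPANY pair occurs twice. The data and the name sales are hypothetical, and the string "mean" is passed as aggfunc here in place of np.mean; both request the average.

```python
import pandas as pd

# Hypothetical data: the (TV, LG) pair appears in two rows
sales = pd.DataFrame({
    "ITEM": ["TV", "TV", "TV", "AC"],
    "COMPANY": ["LG", "LG", "SONY", "LG"],
    "RUPEES": [12000, 13000, 10000, 15000],
})

# pivot() has two candidate values for the (TV, LG) cell and refuses to choose
try:
    sales.pivot(index="ITEM", columns="COMPANY", values="RUPEES")
    pivot_failed = False
except ValueError as err:
    pivot_failed = True
    print("pivot() failed:", err)

# pivot_table() combines the duplicates with the aggregation function
pt = sales.pivot_table(index="ITEM", columns="COMPANY",
                       values="RUPEES", aggfunc="mean")
print(pt)
```

In the result, the (TV, LG) cell holds the mean of 12000 and 13000, while the (AC, SONY) cell is NaN because no such row exists.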
Sorting - Dataframe
Sorting means arranging the contents in ascending or
descending order. There are two kinds of sorting
available for a pandas DataFrame:
1. By value(column)
2. By index
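import pandas as pd

# Create a DataFrame from the dictionary d (its definition is not shown here)
df = pd.DataFrame(d)
# Reorder the rows by index label
df = df.reindex([1, 4, 3, 2, 0])
print("Dataframe contents without sorting")
print(df)
# Sort the rows by index label (ascending by default)
df1 = df.sort_index()
print("Dataframe contents after sorting")
print(df1)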
# In the example above, a dictionary object is used to create the
DataFrame. The rows of df are first reordered with the reindex()
method, so the row labelled 1 moves to position 0, the row labelled 4
to position 1, and so on. The rows are then sorted with sort_index(),
which by default sorts in ascending order of index.
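Sorting by value, the first kind listed above, uses sort_values(). The following is a minimal sketch; the column names and data are hypothetical and do not come from the original example.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meena", "John"],
    "score": [82, 95, 67, 95],
})

# Sort by a single column (ascending by default)
by_score = df.sort_values(by="score")
print(by_score)

# Sort by score descending, breaking ties by name ascending
by_two = df.sort_values(by=["score", "name"], ascending=[False, True])
print(by_two)
```

Like sort_index(), sort_values() returns a new DataFrame and leaves df unchanged unless inplace=True is passed.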
Sorting pandas dataframe by index in descending order:
import pandas as pd
import numpy as np
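The rest of this example is not present in the source. As a minimal sketch of the idea, assuming a hypothetical single-column DataFrame, descending order is obtained by passing ascending=False to sort_index():

```python
import pandas as pd

# Hypothetical data; the dictionary used in the original example is not shown
df = pd.DataFrame({"marks": [50, 60, 70, 80, 90]})
df = df.reindex([1, 4, 3, 2, 0])   # shuffle the row order first

# ascending=False reverses the default ascending order of the index
df_desc = df.sort_index(ascending=False)
print(df_desc)
```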
Data aggregation
Aggregation is the process of turning the values of a dataset (or a
subset of it) into a single value. In other words, an aggregation is a
function that takes multiple values and returns a single value as a
result. Many aggregations are possible, such as count, sum, min, max,
median and quartile. These (count, sum, etc.) are descriptive
statistics and related operations on a DataFrame.
Let us make this clear! If we have a DataFrame like…
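The DataFrame the text goes on to use is not shown here. As a minimal sketch, assuming a hypothetical DataFrame that reuses the RUPEES and USD figures from the pivot example as numbers, the aggregations named above can be computed as follows:

```python
import pandas as pd

# Hypothetical numeric data for illustration
df = pd.DataFrame({
    "RUPEES": [12000, 10000, 15000, 14000],
    "USD": [700, 650, 800, 750],
})

total = df["RUPEES"].sum()                # one value from four
counts = df.count()                       # non-null entries per column
stats = df.agg(["min", "max", "median"])  # several aggregations at once
q1 = df["RUPEES"].quantile(0.25)          # first quartile

print(total)
print(counts)
print(stats)
print(q1)
```

Each call reduces a whole column to a single number; agg() simply collects several such reductions into one table.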