Seaborn - Count, Point, Bar and Box Plots
Count Plot
Let's look at the dataset we will be using to better understand the plot: a simple pandas dataframe containing Canada immigration data, that is, immigrants from various countries in Europe, Asia and Oceania.
Let's check the total number of immigrants based on various regions in those 3 continents.
To plot the same information using matplotlib's pyplot we can do the following:
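The original dataframe is not reproduced here, so the following is only a minimal sketch on a made-up stand-in. The column names `Country` and `Region` and the rows themselves are assumptions for illustration, with one row per country.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the immigration dataframe (one row per country).
df = pd.DataFrame({
    "Country": ["China", "Japan", "India", "Nepal",
                "France", "Germany", "Australia"],
    "Region": ["Eastern Asia", "Eastern Asia", "Southern Asia", "Southern Asia",
               "Western Europe", "Western Europe", "Australia and New Zealand"],
})

# Count how many rows fall into each region.
region_counts = df["Region"].value_counts()

# Draw one bar per region by hand.
fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(region_counts.index, region_counts.values)
ax.set_xlabel("Region")
ax.set_ylabel("Number of countries")
ax.tick_params(axis="x", rotation=45)
fig.tight_layout()
```

Note that the counting has to be done separately with value_counts() before anything can be drawn.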
As always, matplotlib is good for plotting but it lacks "STYLE", and the best alternative is? You got it right, SEABORN. It ships with a ton of plotting options, a few of which were discussed in the earlier post. Let's see how to plot the above graph using seaborn.
Description:
All we did was use the countplot() method ('Yeah, I know it's pretty obvious, and at times I believe seaborn was created for lazy bums like us'). It's a simple plot that takes a column of the dataframe, counts how many times each distinct value occurs in that column, and plots those counts as bars.
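As a rough sketch of what such a call can look like, here is countplot() on a small made-up dataframe. The `Region` column name and its values are assumptions, not the original data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the immigration dataframe (one row per country).
df = pd.DataFrame({
    "Region": ["Eastern Asia", "Eastern Asia", "Southern Asia", "Southern Asia",
               "Western Europe", "Western Europe", "Australia and New Zealand"],
})

# countplot() does the counting itself: one bar per distinct Region value.
ax = sns.countplot(x="Region", data=df)
ax.tick_params(axis="x", rotation=45)
ax.figure.tight_layout()
```

There is no separate value_counts() step here; seaborn aggregates the column before drawing.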
Description:
The graph shown earlier did a simple counting operation for each of the Regions. Managers won't stop at this point, do they? What if they want us to plot the same counting graph with an additional feature added to it? Say, the same graph with each region's count also split by whether it is 'developed' or 'developing'.
All we need to do is pass the 'hue' parameter to countplot().
Let's explore the graph a bit further to see what's happening. Notice the bars for 'Eastern Asia': it has bars for both developing and developed, whereas a few other regions have only one bar. Need more proof? Look at the output below.
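A minimal sketch of the hue idea, again on made-up data. The column names `Region` and `DevName` and the rows are assumptions for illustration. A pandas crosstab gives the same counts as a table, which is one way to double check which regions have only one bar.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import pandas as pd
import seaborn as sns

# Hypothetical stand-in: region and development status, one row per country.
df = pd.DataFrame({
    "Region": ["Eastern Asia", "Eastern Asia", "Eastern Asia",
               "Southern Asia", "Southern Asia", "Western Europe"],
    "DevName": ["Developed regions", "Developing regions", "Developing regions",
                "Developing regions", "Developing regions", "Developed regions"],
})

# hue splits every region's count into one bar per DevName value.
ax = sns.countplot(x="Region", hue="DevName", data=df)
ax.tick_params(axis="x", rotation=45)
ax.figure.tight_layout()

# The same counts as a table; a 0 means that bar is missing in the plot.
table = pd.crosstab(df["Region"], df["DevName"])
print(table)
```

In this toy data only 'Eastern Asia' has a non-zero count in both columns, so it is the only region with two bars.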
Bar Plot
I am pretty bored of using the same immigration data. Let's switch gears and use another pre-shipped dataset from the seaborn library.
load_dataset() is used to load the dataset we would like to use for analysis. There are other datasets to play with, and the info can be found at the link below.
https://github.com/mwaskom/seaborn-data
Description:
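Now barplot is no fancy graph. In fact, count plot and bar plot look alike. The difference is that a count plot counts the rows in each category, while a bar plot shows an aggregate (the mean, by default) of a numeric column for each category, with an error bar on top. So each bar here is the average pulse for one of the diet types.
palette: Defines the color palette used to draw the plot. Another coloring scheme is shown in the next example.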
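Here is a hedged sketch of a bar plot with a palette. To keep it runnable offline, it builds a tiny made-up dataframe with `diet`, `kind` and `pulse` columns instead of calling load_dataset(). The numbers and the "Set2" palette choice are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import pandas as pd
import seaborn as sns

# Made-up pulse readings for each (diet, kind) combination.
samples = {
    ("low fat", "rest"): [85, 88, 91],
    ("low fat", "walking"): [93, 96, 99],
    ("low fat", "running"): [110, 116, 122],
    ("no fat", "rest"): [90, 94, 98],
    ("no fat", "walking"): [100, 104, 108],
    ("no fat", "running"): [122, 132, 142],
}
df = pd.DataFrame(
    [{"diet": d, "kind": k, "pulse": p}
     for (d, k), pulses in samples.items() for p in pulses]
)

# Each bar is the mean pulse for one diet; palette picks the bar colors.
# Newer seaborn versions may warn that palette without hue is deprecated.
ax = sns.barplot(x="diet", y="pulse", data=df, palette="Set2")

# The bar heights should match a plain groupby mean.
means = df.groupby("diet")["pulse"].mean()
print(means)
```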
Description:
Notice the use of 'hue' to add one more feature in the analysis and a new coloring scheme in the palette parameter.
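A sketch of the same bar plot with hue added, on the same kind of made-up data. The column names, the numbers and the "Blues" palette are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import pandas as pd
import seaborn as sns

# Made-up pulse readings for each (diet, kind) combination.
samples = {
    ("low fat", "rest"): [85, 88, 91],
    ("low fat", "walking"): [93, 96, 99],
    ("low fat", "running"): [110, 116, 122],
    ("no fat", "rest"): [90, 94, 98],
    ("no fat", "walking"): [100, 104, 108],
    ("no fat", "running"): [122, 132, 142],
}
df = pd.DataFrame(
    [{"diet": d, "kind": k, "pulse": p}
     for (d, k), pulses in samples.items() for p in pulses]
)

# hue="kind" draws one bar per activity inside every diet group;
# palette now assigns one color per hue level.
ax = sns.barplot(x="diet", y="pulse", hue="kind", data=df, palette="Blues")

# Legend entries correspond to the hue levels.
legend_labels = {t.get_text() for t in ax.get_legend().get_texts()}
```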
Point Plot
Description:
Tired of seeing bars? Here comes an ice breaker. A point plot marks the estimate for each category (the mean, by default) with a point and joins those points with lines. We can leverage the styling options here using a choice of 'markers' and 'linestyles'.
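A minimal sketch of a point plot with custom markers and line styles, on made-up data. The column names, numbers, and the particular marker and line style choices are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import pandas as pd
import seaborn as sns

# Made-up pulse readings for each (diet, kind) combination.
samples = {
    ("low fat", "rest"): [85, 88, 91],
    ("low fat", "walking"): [93, 96, 99],
    ("low fat", "running"): [110, 116, 122],
    ("no fat", "rest"): [90, 94, 98],
    ("no fat", "walking"): [100, 104, 108],
    ("no fat", "running"): [122, 132, 142],
}
df = pd.DataFrame(
    [{"diet": d, "kind": k, "pulse": p}
     for (d, k), pulses in samples.items() for p in pulses]
)

# One line per diet; markers and linestyles take one entry per hue level.
ax = sns.pointplot(
    x="kind", y="pulse", hue="diet", data=df,
    markers=["o", "x"], linestyles=["-", "--"],
)

# Each point sits at the mean pulse of its (kind, diet) group.
group_means = df.groupby(["kind", "diet"])["pulse"].mean()
```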
Box Plots
Description:
Do they look familiar? If you have gone through the posts related to matplotlib, you must have learned about boxplots there. Seaborn just makes them look much better and provides us with options to enhance the readability.
orient: The default is vertical orientation. Here we have used "h" to change it.
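A sketch of a horizontal box plot on made-up data (column names and numbers are assumptions). The numeric column goes on x and the categorical one on y, with orient="h".

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import pandas as pd
import seaborn as sns

# Made-up pulse readings for each (diet, kind) combination.
samples = {
    ("low fat", "rest"): [85, 88, 91],
    ("low fat", "walking"): [93, 96, 99],
    ("low fat", "running"): [110, 116, 122],
    ("no fat", "rest"): [90, 94, 98],
    ("no fat", "walking"): [100, 104, 108],
    ("no fat", "running"): [122, 132, 142],
}
df = pd.DataFrame(
    [{"diet": d, "kind": k, "pulse": p}
     for (d, k), pulses in samples.items() for p in pulses]
)

# Horizontal boxes: pulse runs along the x axis, one box per kind on y.
ax = sns.boxplot(x="pulse", y="kind", data=df, orient="h")
```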
Description:
In the first figure, since we are using hue to add a third variable to the analysis, a box is drawn for each combination of kind (rest, walking, running) and diet (no fat, low fat). The hue values are easy to tell apart because each one gets its own box.
If the requirement is to have the 2 boxes drawn on top of each other rather than side by side, set the 'dodge' parameter to False.
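A sketch of both variants on made-up data (column names and numbers are assumptions): the default side by side boxes, and the overlapping ones with dodge=False.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up pulse readings for each (diet, kind) combination.
samples = {
    ("low fat", "rest"): [85, 88, 91],
    ("low fat", "walking"): [93, 96, 99],
    ("low fat", "running"): [110, 116, 122],
    ("no fat", "rest"): [90, 94, 98],
    ("no fat", "walking"): [100, 104, 108],
    ("no fat", "running"): [122, 132, 142],
}
df = pd.DataFrame(
    [{"diet": d, "kind": k, "pulse": p}
     for (d, k), pulses in samples.items() for p in pulses]
)

fig, (ax_side, ax_stacked) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

# Default: the two diet boxes sit next to each other within each kind.
sns.boxplot(x="kind", y="pulse", hue="diet", data=df, ax=ax_side)

# dodge=False: both diet boxes are drawn at the same x position.
sns.boxplot(x="kind", y="pulse", hue="diet", data=df, dodge=False, ax=ax_stacked)

fig.tight_layout()
```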