Matplotlib vs Seaborn: A Guide for Beginners

Published by Navneet Kishor on

data visualization

Data visualization is the graphical representation of data using histograms, heatmaps, pie charts, and similar plots. Good visualization tools are therefore of great importance. Python offers a plethora of libraries that let us visualize data the way we want. Some of the commonly used ones are Matplotlib, Seaborn, Plotly, ggplot, and Gleam. In this article, our prime focus is on Matplotlib vs Seaborn.

Understanding Matplotlib

Now, what is Matplotlib? Matplotlib is the most widely used plotting library in Python. Look carefully and you will see a striking similarity between Matplotlib’s plots and MATLAB’s: the library was designed to plot much like MATLAB does. A key difference is that MATLAB requires a paid license, whereas Matplotlib is free. Every aspect of a figure can be controlled using this library. It was originally created by John Hunter, is now maintained by a community of developers, and is distributed under a BSD-style license.

This open-source plotting library also provides an object-oriented API that helps you embed plots in applications. One of its advantages is that the interface is quite easy to understand. Using Matplotlib we can plot lines, scatter plots, pie charts, and much more.

Installing Matplotlib

Learning Matplotlib is easy most of the time, but it requires attention and some basic knowledge of Python. You can install Matplotlib with the pip or conda command. First, open a terminal or the Anaconda command prompt.

Use either of the two commands:

pip install matplotlib

Or

conda install matplotlib

Or you can click here for the installation guide of Matplotlib.

Importing Matplotlib

Now you are ready to use Matplotlib and can create as many plots as your data calls for. The next step is to import the Matplotlib library in your notebook.

import matplotlib.pyplot as plt
%matplotlib inline

If you are using Jupyter Notebook, you need to write %matplotlib inline to display your plots inside the notebook; otherwise this command can be left out.

Before plotting, we need some data to plot, held in DataFrames or arrays. For example, let us define two arrays x and y using the NumPy library. Here, I’m assuming that you have a basic knowledge of creating and printing arrays with NumPy.
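
As a minimal sketch of this step, the two arrays could be defined as below. The specific values (50 evenly spaced points and their squares) are an assumption chosen for illustration and are not taken from the original data.

```python
import numpy as np

# Hypothetical sample data: 50 evenly spaced values from 0 to 10
x = np.linspace(0, 10, 50)
# y holds the square of each x value
y = x ** 2

# Print the first few entries to check the arrays
print(x[:5])
print(y[:5])
```

Any pair of equal-length arrays would work here; the squared relation just gives the later plot a recognizable curve.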

Plotting Data

Once you have the arrays initialized and printed, it is time to plot their relation using Matplotlib. As noted, Matplotlib and MATLAB produce almost identical plots; what makes this possible is the pyplot module, whose command-style functions mirror MATLAB’s. To plot the relation we use the command:

plt.plot(x,y)

Running this command will instantly generate the plot of the relation between the two arrays x and y.

Figure: a plot between NumPy arrays using Matplotlib

In case you are using an IDE other than Jupyter, then use:

plt.plot(x,y)
plt.show()

Not only can you plot a graphical representation, you can also customize it, and Matplotlib can display it through Python GUI toolkits such as wxPython, Tkinter, PyQt, and more. Using Matplotlib you can:

  • Plot multiple graph relations in a single plot
  • Set the color of the plots according to your preference
  • Give a title to your plot
  • Set the labels for the axes of the graph
  • Create gradient background plots
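
Several of the customizations listed above can be sketched together as follows. The data, colors, and label text are assumptions chosen for illustration; gradient backgrounds are left out of this sketch.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical sample data
x = np.linspace(0, 10, 50)

plt.figure()
# Two relations in a single plot, each with its own color
plt.plot(x, x ** 2, color="tab:blue", label="x squared")
plt.plot(x, x ** 3, color="tab:red", label="x cubed")

# Title and axis labels
plt.title("Two curves in one plot")
plt.xlabel("x")
plt.ylabel("y")

# The legend uses the label given to each plot() call
plt.legend()
```

Outside Jupyter, add plt.show() at the end to display the figure.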

We can create several types of plots using Matplotlib. Some of the most frequently used ones are:

Scatter Plot

To display the relationship between two variables graphically we create scatter plots. Matplotlib provides a direct way to draw them: the scatter() method. Scatter plots are used to examine the correlation between variables, and they remain useful with large data sets.

Figure: a scatter plot
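
A scatter plot of two correlated variables might look like the sketch below. The data is randomly generated for illustration and is not the data behind the figure above.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: y depends linearly on x, plus some noise
rng = np.random.default_rng(seed=0)
x = rng.normal(size=100)
y = 2 * x + rng.normal(scale=0.5, size=100)

plt.figure()
# scatter() draws one marker per (x, y) pair
plt.scatter(x, y)
plt.xlabel("x")
plt.ylabel("y")
```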

Histograms

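We can use a histogram to plot the frequency of numerical values in continuous data. You may think of histograms as bar charts, but there are significant differences: a histogram groups continuous values into bins, whereas a bar chart compares discrete categories.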

Figure: a histogram
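
As a rough sketch, a histogram can be drawn with the hist() function. The data and the choice of 20 bins are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: 1000 values drawn from a normal distribution
rng = np.random.default_rng(seed=0)
data = rng.normal(loc=50, scale=10, size=1000)

plt.figure()
# hist() returns the count per bin, the bin edges, and the drawn bars
counts, bin_edges, patches = plt.hist(data, bins=20)
plt.xlabel("value")
plt.ylabel("frequency")
```

With 20 bins there are 21 bin edges, and the counts add up to the number of data points.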

Fill Between

You can fill the area between two horizontal curves using the fill_between() method. The curves are defined by the points (x, y1) and (x, y2). This creates one or multiple polygons describing the filled area.

Figure: fill between plot
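
The following sketch fills the area between a sine and a cosine curve. The two curves are an assumption chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical curves sharing the same x values
x = np.linspace(0, 2 * np.pi, 100)
y1 = np.sin(x)
y2 = np.cos(x)

plt.figure()
plt.plot(x, y1)
plt.plot(x, y2)
# Shade the region between (x, y1) and (x, y2); alpha makes it translucent
filled = plt.fill_between(x, y1, y2, alpha=0.3)
```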

Some Advantages of Matplotlib

One of the advantages of Matplotlib is that its interface is quite easy to understand for those who use MATLAB, as it offers a pylab-style interface. This makes it a true open-source alternative for MATLAB users.

  • Based on Python, a modern object-oriented programming language, which makes it suitable for enterprise development too
  • Suitable for CGI scripting (and basically other fast scripting methods)
  • Offers native vector graphics support (SVG)
  • No license fee required; a completely open-source library
  • Very accurate and efficient

That is all about Matplotlib in general for this post. To learn more and to see how to plot various graphs using Matplotlib, read our article.

Understanding Seaborn

Let’s dive into another beautiful Python library that helps us visualize data: Seaborn. Seaborn is a statistical plotting library built on top of Matplotlib, so the knowledge we gained about Matplotlib is going to be useful in understanding Seaborn. Designed to work well with pandas DataFrame objects, Seaborn comes with attractive default styles. It also typically needs less code than Matplotlib for the same plot.

Again, Seaborn can be installed using the conda or pip commands:

pip install seaborn

or,

conda install seaborn

Using Seaborn we can plot heatmaps, violin plots, factor plots (renamed catplot in newer Seaborn versions), histograms, joint distribution plots, and much more. Seaborn is built so that you can easily visualize data held in a pandas DataFrame.

Now it is time to import the libraries. Seaborn works with pandas, so we will import pandas first:

import pandas as pd

Then for further customization of our plots we import Matplotlib

import matplotlib.pyplot as plt
%matplotlib inline

Then finally we will import Seaborn

import seaborn as sns

As Seaborn works with a data set as a whole, you need to import pandas. Seaborn does not rely on pyplot’s stateful interface for its data, so you pass the data object explicitly to each plotting function. Seaborn is mainly used for statistical plotting, which means it is geared toward specific purposes.

You can use seaborn for plotting various plots such as:

  • Heatmaps
  • Jointplot
  • Boxplots

Heatmaps

Let us quickly understand what these beautiful checker-like pictures are. They are heatmaps, which are used to understand and visualize complex data. A heatmap is a graphical representation of data arranged in a grid of boxes, with the values depicted as colors.

Figure: a heatmap
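
A common use of heatmaps is to display a correlation matrix. The sketch below assumes a small, randomly generated DataFrame with hypothetical column names; it is not the data behind the figure above.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical data: 100 rows, four numeric columns
rng = np.random.default_rng(seed=0)
df = pd.DataFrame(rng.normal(size=(100, 4)), columns=["a", "b", "c", "d"])

# Pairwise correlations between the columns (a 4 x 4 table)
corr = df.corr()

plt.figure()
# Each cell is colored by its value; annot=True also writes the number in it
ax = sns.heatmap(corr, annot=True)
```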

Jointplot

A joint plot shows the relationship between two variables (the bivariate view) together with the distribution of each variable on its own (the univariate views).

Figure: a joint plot

BoxPlots

A box plot is a different technique for visualizing data: it portrays groups of numerical data through their quartiles. With the help of boxes and whiskers, it conveys a summary of the data efficiently.

The main components of box plots are:

  • minimum
  • first quartile
  • median (second quartile)
  • third quartile
  • maximum

Figure: a box plot
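
The sketch below draws a box plot per group and also computes the five components listed above with pandas. The data and column names (group, value) are hypothetical.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical data: three groups with 50 numeric values each
rng = np.random.default_rng(seed=0)
df = pd.DataFrame({
    "group": np.repeat(["A", "B", "C"], 50),
    "value": rng.normal(size=150),
})

plt.figure()
# One box per group, summarizing the values in that group
ax = sns.boxplot(data=df, x="group", y="value")

# Minimum, first quartile, median, third quartile, maximum of all values
summary = df["value"].quantile([0, 0.25, 0.5, 0.75, 1.0])
print(summary)
```

Note that, by default, points far from the box are drawn as individual outlier markers, so the whiskers do not always reach the true minimum and maximum.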

Some advantages of seaborn

If you compare Seaborn with Matplotlib you will see a big difference in default aesthetics. Matplotlib’s defaults can make a plot look unattractive, with ticks on every side and a plain color scheme and background, which leads many of us to find its output unprofessional unless it is customized.

This unaesthetic nature of Matplotlib gave rise to Seaborn and its salient features. The main advantages of this library are:

  • Some plots that are built into Seaborn are missing in Matplotlib. For example, distribution plots and matrix plots can be coded using Matplotlib, but they are built into Seaborn. When a plot is already built into a library, there is no point in toiling to code it from scratch with another library, which would take much longer.
  • Seaborn’s aesthetics can be easily customized. For example, you can easily change the background color in Seaborn. It provides two commands for this: set_style and set_palette.

sns.set_style

You can change the background style of your plot using this command. It accepts five style names as its argument, four background styles plus ticks:

  • darkgrid (default)
  • dark
  • whitegrid
  • white
  • ticks

sns.set_palette

Using this command you can set the color palette in your plots. Setting color palettes makes it easier to distinguish between multiple plots in a single graph. You can click here to know more about the color sets for choosing the color palettes.
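
Both commands can be sketched together as below. The choice of the whitegrid style, the husl palette with three colors, and the plotted curves are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Use a white background with grid lines
sns.set_style("whitegrid")
# Use three evenly spaced colors from the "husl" palette
sns.set_palette("husl", 3)

# Hypothetical sample data
x = np.linspace(0, 10, 50)

plt.figure()
# Each line picks the next color from the palette set above
for power in (1, 2, 3):
    plt.plot(x, x ** power, label=f"x^{power}")
plt.legend()
```

Both settings apply to plots created afterwards, including plain Matplotlib plots like the one in this sketch.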

Figure: some colors for color palette

  • Though it may sometimes lead to out-of-memory issues, Seaborn is capable of generating multiple figures in one go.

That’s all about Seaborn in general. You can read our articles if you want to learn how to plot various graphs using Seaborn.

Conclusion

The choice between the two libraries depends solely on our purpose of plotting. We can use either of the two libraries we discussed, but Seaborn clearly has an edge over Matplotlib in its built-in default themes and aesthetics. Matplotlib, however, has its own significance too, not least because Seaborn is built on top of it.

Hope you liked this post. If you have any query or doubt, feel free to ask in comments or contact us personally.

If you want to get started with pandas then click here. Also to learn about data manipulation using python check out our post data manipulation using python.
