The MatPlotLib Story
In our previous discussions, we gave a brief overview of the mechanics of MatPlotLib. The first covered the basics of downloading the library and working with the code itself. This discussion may be found here. In the second article, we presented the basics of the line plot and introduced mechanisms for plot customization. These customization functions are essential for more sophisticated plot structures, so if you haven’t investigated this subject in depth, we highly recommend it. You may find it here. In this article, we move on to the construction of basic scatter plots in MatPlotLib.
The scatter plot is one of the most fundamental plots in data analysis: it displays each raw data point individually, so that it can be read in relation to the other observations in the sample. For that reason, we spend a good deal of time here exploring the intricacies of these plots, in the hope that you can employ them in your own data models.
Introducing Scatter Plots
In scatter plots, data points are individually represented by dots, circles, or a variety of other symbols. In our previous discussion, we used MatPlotLib plotting functions like ‘plt.plot’ and ‘ax.plot’ to construct line plots. The same functions create a scatter plot rather than a line plot when we pass in an argument denoting the symbol used to represent each data point. The code for executing this plot appears as follows:
Notice here that, as in the line plot, we create our input data using the np.linspace function. We specify the bounds as 0 and 10, and pick 25 equally spaced points between these bounds, endpoints included. We construct the plot by specifying these x-values and creating the ‘y’ values by passing the input into the sine function. We then specify that the data points be drawn as filled circles (dots) by passing the ‘o’ format string. We also specify the color as black using the color keyword argument. Doing this yields a plot of the following form:
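The original listing is not reproduced in this text, so the following is a minimal sketch reconstructed from the surrounding description. The variable names and the output file name are assumptions made for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# 25 evenly spaced x-values between 0 and 10 (both endpoints included)
x = np.linspace(0, 10, 25)
y = np.sin(x)

plt.figure()
# 'o' draws a circle marker at each point and no connecting line
(points,) = plt.plot(x, y, 'o', color='black')
plt.savefig('sine_markers.png')  # file name is an assumption
```

In an interactive session, calling plt.show() in place of plt.savefig displays the figure directly.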
MatPlotLib also provides a wide variety of data point markers. We can take a look at a few others:
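As an illustration, the sketch below plots a handful of random points for each of several marker codes. The particular list of markers, the seeded random data, and the file name are assumptions chosen for this example rather than the original listing.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # seeded so the figure is reproducible
markers = ['o', '.', 'x', '+', 'v', '^', '<', '>', 's', 'd']

plt.figure()
for marker in markers:
    # five random points per marker style, labeled for the legend
    plt.plot(rng.random(5), rng.random(5), marker,
             label=f"marker='{marker}'")
plt.legend()
plt.xlim(0, 1.8)  # extra horizontal room so the legend is less likely to cover points
plt.savefig('marker_styles.png')  # file name is an assumption
```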
These are just a few of the marker specifications that MatPlotLib has to offer, but they produce the following unique plots:
When creating scatter plots, if the data is derived from a specific function, it is sometimes useful to draw a line connecting the individual data points. As with the linestyle and color arguments we explored in the previous article, MatPlotLib makes it possible to specify the marker, color, and line style together in a single format string. When coding the plot, this string is passed as a positional (non-keyword) argument immediately after the x and y values. For example, suppose we want to plot our sine function with circle markers, but also have these points connected by a black line. To do this, we would utilize the following code:
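A minimal sketch of such a call follows, assuming the same 25-point sine data as before. In the format string ‘-ok’, the ‘-’ requests a solid line, ‘o’ a circle marker, and ‘k’ the color black. The file name is an assumption.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 25)
y = np.sin(x)

plt.figure()
# '-ok': solid line ('-'), circle markers ('o'), black ('k')
(curve,) = plt.plot(x, y, '-ok')
plt.savefig('sine_line_markers.png')  # file name is an assumption
```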
The preceding code predictably yields the plot that follows:
Keyword Arguments in Plotting
In our discussion of basic MatPlotLib plotting and customization, we focused primarily on the color and line style. However, MatPlotLib supplies a variety of other keyword arguments for customizing the plot. We can explore some of them here.
When constructing scatter plots, MatPlotLib lets us specify the size of the marker. This is done using the ‘markersize’ keyword argument, which takes a number giving the marker size in points.
MatPlotLib also allows the width of a plotted line to be controlled explicitly. This is done using the ‘linewidth’ keyword argument. Again, the user passes a number, which sets the width of the line in points.
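To make these two arguments concrete, here is a small sketch that enlarges the markers and thickens the connecting line. The specific values (10 and 3) and the file name are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 25)
y = np.sin(x)

plt.figure()
# markersize and linewidth are both given in points
(curve,) = plt.plot(x, y, '-o', color='black', markersize=10, linewidth=3)
plt.savefig('sine_sizes.png')  # file name is an assumption
```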
Marker Face Color
While we previously specified the color of the marker with the ‘color’ keyword argument, we can also set the marker’s color through two separate attributes: the primary (face) color of the marker, and the color of its border, or edge. We specify the primary color of the marker using the ‘markerfacecolor’ keyword argument, passing in a string that names the color we desire.
Marker Edge Color
As previously stated, MatPlotLib also affords the ability to control the border color of the marker. We do this with the ‘markeredgecolor’ keyword argument. As with ‘markerfacecolor’, the marker edge color is specified by passing in a string that names the color we desire.
Marker Edge Width
Just as we can specify the marker size, we can also specify the width of the marker border. This is done with the ‘markeredgewidth’ keyword argument. As with the marker size keyword argument, we pass in a number to control the width of the border.
Tying the Arguments Together
We can combine all of these keyword arguments in a single call to control each of these aspects of the scatter plot at once. The code to do so appears as follows:
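The original listing is not available in this text, so the sketch below is one possible combination of the arguments discussed above. The particular marker, colors, sizes, and file name are assumptions chosen to make each setting visible.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 25)
y = np.sin(x)

plt.figure()
(curve,) = plt.plot(
    x, y, '-o',
    color='gray',             # color of the connecting line
    linewidth=4,              # line width in points
    markersize=15,            # marker size in points
    markerfacecolor='white',  # fill color of each marker
    markeredgecolor='gray',   # border color of each marker
    markeredgewidth=2,        # border width in points
)
plt.ylim(-1.2, 1.2)  # a little headroom so large markers are not clipped
plt.savefig('sine_styled.png')  # file name is an assumption
```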
The style of the plot produced here appears as follows:
Creating Scatter Plots with plt.scatter
The previous examples relied primarily upon developing scatter plots with the ‘plt.plot’ function. Alternatively, we are able to create scatter plots with the ‘plt.scatter’ function. The advantage of this function is that properties such as size and color can be controlled for each point individually. However, ‘plt.plot’ is a bit more efficient than ‘plt.scatter’, because ‘plt.scatter’ must do extra work to render each point separately, whereas ‘plt.plot’ draws every point with the same appearance. We may demonstrate the utility of the plt.scatter function with a random data plot.
First, we can use a random integer generator to obtain random values for the ‘x’ and ‘y’ coordinates. We then plot these points as colored circles, where the color and the size of each circle encode values associated with that point. Let us observe the code which allows us to create this plot.
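Since the original listing is missing here, the following is a sketch under stated assumptions: the coordinates are random integers between 0 and 99, and the per-point color and size values are simply drawn at random. The point count, value ranges, alpha value, and file name are all illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # seeded so the figure is reproducible
n_points = 100

# random integer coordinates in the range 0..99
x = rng.integers(0, 100, size=n_points)
y = rng.integers(0, 100, size=n_points)

# per-point values: one color value and one marker area for each point
colors = rng.random(n_points)
sizes = 1000 * rng.random(n_points)

plt.figure()
points = plt.scatter(x, y, c=colors, s=sizes, alpha=0.3, cmap='viridis')
plt.colorbar()  # legend mapping colors back to the values in `colors`
plt.savefig('random_scatter.png')  # file name is an assumption
```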
Here, we use the ‘alpha’ argument to control the transparency of the data points in the scatter plot. Furthermore, we set the ‘cmap’ keyword argument to ‘viridis’, a colormap that maps each point’s numeric color value onto a smooth gradient of colors. We also use the ‘plt.colorbar’ function, which creates a color bar that serves as a reference for what the colors in the plot mean. The plot created appears as follows:
The Take Away
The present article has discussed the basics of constructing scatter plots with MatPlotLib, created from a variety of different data inputs. Furthermore, we expanded upon the controls available for specifying various aspects of the plot. As we have seen, scatter plots in MatPlotLib offer a wide variety of options for careful control of the plot. If you seek to explore scatter plots in greater depth, consider checking out the MatPlotLib manual here. In our next article, we explore in depth the tools available for visualizing error in its various forms with MatPlotLib.