How to Make Extremely Beautiful Charts with Python

The average data visualization still looks meh. Some tweaks can provide a serious upgrade

Dec 13, 2024

TL;DR: The average data analyst might want to quickly visualize a dataset to understand what is worth focusing on. They might then present these same charts to their peers and bosses, in the hope that the charts get the same message across. The reality, however, is that many other people just see a sloppy or overly complicated graph. The good news is that you do not need to invest hours in learning data visualization in order to make your charts more impactful. In most cases, you can blow your colleagues’ socks off by using simple tricks, like adjusting the font, adding a logo, highlighting relevant sections, or adding some very basic interactive elements.

With some simple tweaks you can make your graphs look much more impressive. Image generated by Leonardo AI

Most graphs are not worth looking at. No matter which industry you work in or what your seniority level is, you likely come across plenty of mediocre data visualizations in your work.

Good graphs bring home a specific message, and they do this fast. At the same time, they are visually appealing enough to invite the spectator to spend some more time on them and deepen their understanding of the key message. Good graphs also convey where they are from—which company or department—because they speak the same visual language.

Bad graphs might contain plenty of information, but they fail to get the key message across to the spectator. Either they are trying to show too many things at once, or they are just so unappealing visually that the spectator looks elsewhere before having understood the message. And unless the creator says so, the spectator has no idea who made a bad graph.

Most data scientists, analysts, and similarly graph-creating professionals are smart enough to figure out how to create a good graph. The problem is that there is never enough time to do it.

Luckily, there are many workflows that you can literally copy-paste from various places. Once you have created a template for your flavor of good graphs—a process which should not take you more than your average lunch break—you’ll be able to reuse it again and again.

I’m biased but I’d say that a good starting point is reading this article. I’ll show you with pieces of my own work how you might pimp your own graphs.

I assume that you are already familiar with some data analysis in Python, and that you’ve used matplotlib before. If you are unfamiliar with matplotlib, there are plenty of tutorials like this one which you might want to check out before coming back to this article.

Tinkering with the looks of a graph might feel unproductive in the moment, but your efforts will likely be worth the time investment. Your bosses will thank you!

Trick #1: Use Seaborn

One very quick-and-easy way to take your charts to the next level is by using Seaborn in conjunction with matplotlib. In principle, you could create everything that Seaborn offers with matplotlib. Seaborn is based on matplotlib after all. That being said, since Seaborn already exists, you might as well skip the time investment.

What makes Seaborn so nice to use is that it allows you to create nice-looking graphs with minimal boilerplate code. It interoperates quite seamlessly with pandas, which makes it excellent for exploring tabular data.

There are situations where Seaborn can be somewhat constraining. There are only so many chart types available. When you want to do something a little more exotic, Seaborn might not have all the answers for you. In that case, you’ll need to go back to matplotlib. However, in any case you will be able to leverage Seaborn’s beautiful color palettes and its powerful controls on figure aesthetics.
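As a minimal sketch of that last point, the snippet below draws a plain matplotlib bar chart but borrows a Seaborn theme and color palette. The categories and values are made up for illustration, and the choice of the "ticks" style and "colorblind" palette is just an assumption about what might look good.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import seaborn as sns

# Apply a Seaborn theme globally (this changes matplotlib's rc settings)
sns.set_theme(style="ticks", context="talk")

# Grab three colors from one of Seaborn's built-in palettes
palette = sns.color_palette("colorblind", n_colors=3)

# Made-up example data
categories = ["A", "B", "C"]
values = [3, 7, 5]

# The chart itself is plain matplotlib
fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(categories, values, color=list(palette))
ax.set_ylabel("Value")

# Remove the top and right spines for a cleaner look
sns.despine(ax=ax)
```

The chart type here comes from matplotlib, so the same pattern should carry over to plots that Seaborn does not offer.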

You’ll find an example of a plot of mine below, which I made two years ago for my PhD thesis in particle physics. I will spare you the details, but at its simplest it shows how well two different algorithms perform when testing models of dark matter. You can see that the blue-marked points delimit a much more compact area. The algorithm that generated these points (AL stands for Active Learning) vastly outperforms its competitor, a Markov Chain Monte Carlo algorithm.

Seaborn’s displot (or distplot, as it was still called back then; it has since been deprecated) helped me create a beautiful two-dimensional kernel density estimate (KDE) in this case. This would have been a pain for me to code from scratch and implement in matplotlib, and it got the message across much more clearly than my previous attempts with matplotlib-based histograms and scatter plots.

Graph produced by the author during his PhD. It shows the performance of two different algorithms exploring a theoretical model of dark matter.

Trick #2: Adjust the Font

Another feature you’ll see in the above chart is that I adjusted the font of the title, labels, and all other text. This is quite easy to do with matplotlib, and you can choose whether you want that layout to affect just this one chart, just a specific section of a chart, or all future charts you create.

All styles of matplotlib are controlled by the rc settings. In the example above, I modified them in the header of the plot routine as follows:

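import matplotlib.pyplot as plt
import matplotlib as mpl

# Use the Computer Modern Roman font for all text in subsequent plots
mpl.rcParams['font.family'] = 'cmr10'
# Render math text (e.g. the axis variables) in Computer Modern as well
mpl.rcParams['mathtext.fontset'] = 'cm'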

In this case, I used the font cmr10 for all text because the font with serifs matched that of my thesis and made it look more pleasing than the default option. I also specified cm for the math text, i.e., the variables stated on the x- and y-axis.
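Setting rcParams directly affects every chart created afterwards. To limit a font change to a single chart, or to a single piece of text, something like the following sketch could work. It uses the generic serif and monospace families as stand-ins rather than cmr10, and the plotted numbers are made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib as mpl
import matplotlib.pyplot as plt

# Remember the global setting so we can check it is untouched afterwards
default_family = list(mpl.rcParams["font.family"])

# Scope 1: settings apply to this one chart only
with mpl.rc_context({"font.family": "serif", "mathtext.fontset": "cm"}):
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])
    ax.set_xlabel(r"$x$")
    ax.set_title("Serif font inside the context only")

    # Scope 2: override the font for a single piece of text
    ax.set_ylabel(r"$x^2$", fontfamily="monospace")

    # Render inside the context, since some settings may only be
    # read when the figure is actually drawn
    fig.canvas.draw()

# Outside the context, the global default is back in place
family_after = list(mpl.rcParams["font.family"])
```

Charts created after the with block fall back to whatever the global rc settings were before.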

In order to avoid copy-pasting such lines of code, one can also specify rc settings in style sheets. An appropriate style sheet called ./images/presentation.mplstyle for the settings above might look like the following:

font.family: cmr10
mathtext.fontset: cm

In which case, the style sheet needs to be referenced in the main code as follows:

import matplotlib.pyplot as plt
plt.style.use('./images/presentation.mplstyle')
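If a style sheet should apply to only part of a script, plt.style.context can be used in place of plt.style.use. The sketch below writes a small style sheet to a temporary directory (an assumption made so that the example is self-contained; in practice the file would live next to your code) and applies it inside a with block. Style sheets use the same key: value syntax as the matplotlibrc file.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib as mpl
import matplotlib.pyplot as plt

# Contents of the style sheet: one "key: value" pair per line
style_text = "font.family: cmr10\nmathtext.fontset: cm\n"

fontset_before = mpl.rcParams["mathtext.fontset"]

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical location; any path ending in .mplstyle would do
    style_path = Path(tmp_dir) / "presentation.mplstyle"
    style_path.write_text(style_text)

    # The style is active only inside this block
    with plt.style.context(str(style_path)):
        family_inside = list(mpl.rcParams["font.family"])
        fontset_inside = mpl.rcParams["mathtext.fontset"]

# After the block, the previous settings are restored
fontset_after = mpl.rcParams["mathtext.fontset"]

print(family_inside, fontset_inside, fontset_after)
```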

Finally, if one wants to make all future charts default to some specified styling without having to refer to a style sheet, one can edit the matplotlibrc file. One can either create one's own matplotlibrc file or modify the default one, which sits where matplotlib was installed. Refer to the matplotlib docs for more information on this.
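To find out which matplotlibrc file is currently in effect, and where a personal one could be placed, matplotlib offers two helper functions. A short sketch:

```python
import matplotlib as mpl

# Path of the matplotlibrc file that matplotlib is currently using
rc_path = mpl.matplotlib_fname()

# User configuration directory, where a personal matplotlibrc can live
config_dir = mpl.get_configdir()

print("Active matplotlibrc:", rc_path)
print("User config directory:", config_dir)
```

The printed paths depend on your operating system and installation, so they will differ from machine to machine.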

Trick #3: Highlight Your Highlights

You might notice two little stars in the chart above. They refer to the highest dark matter mass that either algorithm found. This points out a possible weakness of the Active Learning (AL) algorithm versus the Markov Chain Monte Carlo (MCMC) simulations: The AL was much more focused on an interesting area of the parameter space (which is good in principle), but it failed to identify the point with the highest mass. The MCMC found a higher mass.

Such comparisons are important, hence I added a star to mark the points that trigger such a reflection. Such markers are very easy to add. In my case, it was done in two lines of code:

plt.plot(MCMCmaxpoint[var1], 0, marker='*', color='red')
plt.plot(maxpoint[var1], 0, marker='*', color='purple')

Aside from stars (*), many other markers are available to choose from. One can also change the position, color, and size.
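As a self-contained sketch of the same idea, the snippet below scatters some random points and marks the one with the largest x value with a big red star. The data are synthetic and the variable names are hypothetical; only the marker, markersize, and color arguments matter here.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic example data
rng = np.random.default_rng(seed=1)
x = rng.uniform(0, 10, 50)
y = rng.uniform(0, 1, 50)

# Index of the point we want to highlight
i_max = int(np.argmax(x))

fig, ax = plt.subplots()
ax.scatter(x, y, s=15, color="steelblue", label="samples")

# A single highlighted point: marker shape, size, and color are set here
(star,) = ax.plot(
    x[i_max], y[i_max],
    marker="*", markersize=18, color="red",
    linestyle="none", label="largest x",
)
ax.legend()
```

Swapping marker="*" for, say, "o", "s", or "v" gives a circle, square, or triangle instead.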

In other charts, it might make sense to also add a text box to explain the highlight. This can be easily achieved by using matplotlib’s figtext.
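A possible sketch of such a text box is shown below. The data, the wording of the note, and its position are made up; figtext places text in figure coordinates, where (0, 0) is the bottom-left corner of the figure and (1, 1) the top-right.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt

# Made-up example data with a highlighted maximum
fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [1, 3, 2, 5], marker="o")
ax.plot(3, 5, marker="*", markersize=18, color="red", linestyle="none")

# Text box in figure coordinates; bbox draws a frame around the text
note = plt.figtext(
    0.15, 0.8,
    "Star: highest value in the series",
    fontsize=10,
    bbox=dict(boxstyle="round", facecolor="white", edgecolor="gray"),
)
```

If the text should stay attached to a data point instead of a fixed spot in the figure, matplotlib's ax.annotate is an alternative worth looking at.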
