Sample chart

Retrieve and display a data set

(First part of the “Practical Python in 10 lines or less” series)

Python is a simple but powerful language, and it comes with a wealth of libraries. The chart above took just 10 lines of Python. All the hard work is done by the pandas and Matplotlib libraries.

The code

import pandas, matplotlib
data = pandas.read_csv('http://www.compassmentis.com/wp-content/uploads/2019/04/cereal.csv')
data = data.set_index('name')
data = data.calories.sort_values()[-10:]
ax = data.plot(kind='barh')
ax.set_xlabel('Calories per serving')
ax.set_ylabel('Cereal')
ax.set_title('Top 10 cereals by calories')
matplotlib.pyplot.subplots_adjust(left=0.45)
matplotlib.pyplot.show()

How it works

You will need Python and the pandas and Matplotlib libraries. See the installation instructions.

Get started

1. import pandas, matplotlib
Grab the libraries we need to load, clean up and display the data.
PEP 8 recommends putting each import on its own line. In this example I have combined them on one line, to leave enough lines to make the chart look good.
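For comparison, here is a sketch of the one-import-per-line layout. The `pd` and `plt` aliases are a widespread convention rather than something the 10-line example uses, and importing `matplotlib.pyplot` explicitly means the later `pyplot` calls do not depend on another library having loaded it first.

```python
# One import per line, in the PEP 8 style.
# The aliases pd and plt are a common convention, not part of the 10-line example.
import matplotlib.pyplot as plt
import pandas as pd
```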

2. data = pandas.read_csv('http://www.compassmentis.com/wp-content/uploads/2019/04/cereal.csv')
Load the CSV data from a website. This gives us a pandas DataFrame, a two-dimensional data structure similar to a page in a spreadsheet.
I downloaded the data from https://www.kaggle.com/crawford/80-cereals/version/2, under Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) [https://creativecommons.org/licenses/by-sa/3.0/].
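To see what read_csv returns without going online, here is a minimal sketch that reads a few rows from an in-memory string. The rows, the cereal names and the `sugars` column are made up for illustration; they are not values from the cereal data set.

```python
import io

import pandas as pd

# Made-up rows for illustration only; NOT values from the real cereal data set.
csv_text = """name,calories,sugars
Cereal A,110,6
Cereal B,70,5
Cereal C,150,12
"""

# read_csv accepts a URL, a file path, or a file-like object such as StringIO.
sample = pd.read_csv(io.StringIO(csv_text))

print(type(sample).__name__)  # DataFrame
print(sample.shape)           # (3, 3): three rows, three columns
print(list(sample.columns))   # ['name', 'calories', 'sugars']
```

The first line of the text becomes the column names, and each following line becomes one row.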

3. data = data.set_index('name')
Set the row names (index) to the 'name' column. When we plot the data, these names become the bar labels.

4. data = data.calories.sort_values()[-10:]
Take the 'calories' column, sort it in ascending order and keep the last 10 values. This gives us the 10 cereals with the highest calories per serving.
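The index, sort and slice steps can be tried on a tiny table. This is only a sketch: the cereal names and calorie values are invented, and it keeps the top 2 rather than the top 10 so the result is easy to check by eye.

```python
import pandas as pd

# Hypothetical cereals and calorie values, invented for this sketch.
sample = pd.DataFrame({
    "name": ["Cereal A", "Cereal B", "Cereal C", "Cereal D"],
    "calories": [110, 70, 150, 90],
})

by_name = sample.set_index("name")          # row labels are now the cereal names
ascending = by_name.calories.sort_values()  # lowest first, highest last
top2 = ascending[-2:]                       # the last two rows hold the two highest values

print(list(top2.index))  # ['Cereal A', 'Cereal C']

# nlargest picks the same rows, but lists the highest first.
print(list(by_name.calories.nlargest(2).index))  # ['Cereal C', 'Cereal A']
```

Either approach selects the same cereals; only the row order differs, which in turn changes the order of the bars in the chart.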

5. ax = data.plot(kind='barh')
Plot the data as a horizontal bar chart.

6. ax.set_xlabel('Calories per serving')
7. ax.set_ylabel('Cereal')
8. ax.set_title('Top 10 cereals by calories')

Set the labels for the x and y axes and the title.
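A small sketch of the plotting and labelling steps, using two invented values instead of the real data. It selects Matplotlib's off-screen "Agg" backend (my choice, so that it runs without a display) and reads the labels back from the axes object to show where they end up.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend, chosen so the sketch runs without a display
import pandas as pd

# Hypothetical values, invented for this sketch.
calories = pd.Series([110, 150], index=["Cereal A", "Cereal C"])

# plot returns a Matplotlib Axes object; the labels and title are set on it.
ax = calories.plot(kind="barh")
ax.set_xlabel("Calories per serving")
ax.set_ylabel("Cereal")
ax.set_title("Top 10 cereals by calories")

print(len(ax.patches))  # 2, one bar per row
print(ax.get_xlabel())  # Calories per serving
print(ax.get_title())   # Top 10 cereals by calories
```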

9. matplotlib.pyplot.subplots_adjust(left=0.45)
Set the left margin (from the left of the image to the left of the chart area) to 45% of the image width, to give enough space for the cereal names.
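To check what the left margin does, this sketch plots two invented long names and reads back the position of the chart area. As assumptions of mine, it uses the off-screen "Agg" backend and skips show() so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend, chosen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical long names and values, invented for this sketch.
calories = pd.Series(
    [110, 150],
    index=["A cereal with quite a long name", "Another cereal with a long name"],
)

ax = calories.plot(kind="barh")
plt.subplots_adjust(left=0.45)

# Positions are fractions of the figure size, so x0 is the left margin.
left_edge = ax.get_position().x0
print(round(left_edge, 2))  # 0.45
```

An alternative is plt.tight_layout(), which works out the margins from the size of the labels instead of using a fixed fraction.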

10. matplotlib.pyplot.show()
Show the chart in a window.