Category Archives: 10 Lines

Use Python to update a spreadsheet

How would you like to grab a share price daily and store it in a spreadsheet? Or add a new column to dozens of spreadsheets – automatically?

Python is a simple but powerful language, and comes with a wealth of libraries. Its openpyxl library lets you easily open a spreadsheet and make some changes.

Here is an example which adds a new column (“Next age”) to all spreadsheets in the source_folder. The left side of the image above shows an original spreadsheet. The Python script opens this, adds a new column (Next age), then saves it to the target_folder. The right side of the image shows the result

Here is the annotated code. You can find the raw code at the GitHub repository

Before installing openpyxl, to keep your Python version(s) clean, you may want to set up a virtual environment first

To install openpyxl: pip install openpyxl

1. import openpyxl
2. import os
3. for name in os.listdir('source_files'):
4.     workbook = openpyxl.load_workbook(filename='source_files/' + name)
5.     sheet = workbook['Sheet1']
6.     sheet['C1'].value = 'Next age'
7.     for row in range(2, 100):
8.         if sheet[f'B{row}'].value:
9.             sheet[f'C{row}'].value = sheet[f'B{row}'].value + 1
10.     workbook.save(filename='target_files/' + name)

1. import openpyxl
Load the openpyxl library.

2. import os
Load the os library. We will use this list the files in a folder

3. for name in os.listdir(‘source_files’):
For each file in our ‘source_files’ folder. Note that this includes all files, regardless of whether it is a spreadsheet or not

4.     workbook = openpyxl.load_workbook(filename=’source_files/’ + name)
Open the workbook

5.     sheet = workbook[‘Sheet1’]
Take the worksheet called ‘Sheet1’

6.     sheet[‘C1’].value = ‘Next age’
Enter something in cell C1

7.     for row in range(2, 100):
For rows 2 – 99 (Python stops just before reaching 100), do the following:

8.          if sheet[f’B{row}’].value:
If cell B2, B3, B4, etc is not empty, do the following:

9.               sheet[f’C{row}’].value = sheet[f’B{row}’].value + 1
Take the age from column B, add one to it and store in the cell to the right, i.e. in column C

10. workbook.save(filename=’target_files/’ + name)
Save the updated workbook to the target_files folder, using the same name

1/2 + 1/3 = 1/6

Fractions in Python

When you ask your spreadsheet to calculate 1/2 + 1/3 you get something like this:
This is obviously an approximation. The 3’s after the decimal point repeat indefinitely.

The correct answer is:

  • 1/2 = 3/6
  • 1/3 = 2/6
  • 1/2 + 1/3 = 3/6 + 2/6 = 5/6

Python is a simple but powerful language, and comes with a wealth of libraries. Its Fractions library gives you the correct answer in a couple of lines

Here is the annotated code. You can find the raw code at the GitHub repository

1. from fractions import Fraction
Load the Fractions library

2. half = Fraction(‘1/2’)
3. third = Fraction(‘1/3’)
Create the two fractions

4. total = half + third
Add them up

5. print(half, ‘+’, third, ‘=’, total)
Show the result.
The more modern way is to use an “f-string”, which was introduced in Python 3.6, December 2016. This is often more readable, but not here. It would look like this:
print(f'{half} + {third} = {total}’)

Sample chart

Retrieve and display a data set

(First part of the “Practical Python in 10 lines or less” series)

Python is a simple but powerful language, and comes with a wealth of libraries. The chart above took just 10 lines of Python. All the hard work is done by the Pandas and MatPlotLib libraries.

The code

import pandas, matplotlib
data = pandas.read_csv('http://www.compassmentis.com/wp-content/uploads/2019/04/cereal.csv')
data = data.set_index('name')
data = data.calories.sort_values()[-10:]
ax = data.plot(kind='barh')
ax.set_xlabel('Calories per serving')
ax.set_ylabel('Cereal')
ax.set_title('Top 10 cereals by calories')
matplotlib.pyplot.subplots_adjust(left=0.45)
matplotlib.pyplot.show()

How it works

You will need Python and the Pandas and MatPlotLib libraries. See the installation instructions

Get started

1. import pandas, matplotlib
Grab the libraries we need to load, clean up and display the data.
The recommended approach (PEP 8) is to have two import statements on separate lines. To leave enough lines to make the chart look good, in this example I have combined them.

2. data = pandas.read_csv(‘http://www.compassmentis.com/wp-content/uploads/2019/04/cereal.csv’)
Load the csv data from a website. This gives us a pandas DataFrame, a two dimensional datastructure similar to a page in a spreadsheet.
I downloaded the data from https://www.kaggle.com/crawford/80-cereals/version/2, under Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) [https://creativecommons.org/licenses/by-sa/3.0/]

3. data = data.set_index(‘name’)
Set the row names (index) to the ‘name’ column. When we plot the data this becomes the data labels.

4. data = data.calories.sort_values()[-10:]
Take the ‘calories’ column, sort it and limit to the last 10 values. This gives us the 10 cereals with the highest calories per serving

5. ax = data.plot(kind=’barh’)
Plot the data as a horizontal bar chartax.set_xlabel(‘Calories per serving’)

6. ax.set_ylabel(‘Cereal’)
7. ax.set_title(‘Top 10 cereals by calories’)
8. ax.set_xlabel(‘Area in millions square kilometers’)

Set the label for the x and y axes and the title

9. matplotlib.pyplot.subplots_adjust(left=0.45)
Set the left margin (from the left of the image to the left of the chart area) to 45% to give enough space for the cereal names.

10. matplotlib.pyplot.show()
Show the chart