Take out your turtle for a recursive walk

Python is a simple but powerful language, and comes with a wealth of libraries. It took just 10 lines of code and the Turtle library to create the black line in the image above.

Turtle graphics is popular for introducing programming, particularly to children. It shows a ‘turtle’ (triangle) which moves over the ‘paper’ whilst drawing a line. It is a fun way to try out some programming ideas.

The idea of recursion is that you break a large task down into a set of smaller tasks. In this case, to draw a long line, we give it a bit of flourish by drawing 4 shorter lines instead, like this:

Each of those shorter lines is broken down into 4 even shorter lines, like this:

And again:

And once more to get the final picture (above)

Here is the code for the black line. You can find the raw code at the GitHub repository

1. import turtle
Import the turtle library

2. def line(length):
Create a new function which draws a line of length ‘length’. However, it may make the line a bit wiggly

3.    if length <= 5:
If it is a short line, just draw it

4.        turtle.forward(length)
Move the turtle ‘length’ steps forward

5.        return
Exit this function

6.    for angle in (60, -120, 60, 0):
Go forward (straight or wiggly) for one-third of the required distance, then turn left 60 degrees. Repeat for right 120 degrees, left 60 degrees, and straight again. By the end of this the turtle will be pointing in exactly the same direction as before.

7.        line(length / 3)
Draw the line (straight or wiggly)

8.        turtle.left(angle)
Turn the requested angle

9. line(810)
Ask the ‘line’ function to draw a line of length 810. The recursive pattern will make it very wiggly

10. turtle.done()
Wait until the user closes the window. Without this the window would close as soon as the pattern has been drawn
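To see what the recursion does without opening a turtle window, here is a rough sketch which records the turtle's moves in a list and then traces them with a bit of trigonometry. The names 'line_moves' and 'trace' are my own and are not part of the Turtle library. It assumes the same rules as the code above: draw straight when the length is 5 or less, otherwise split into 4 lines of one-third the length.

```python
import math


def line_moves(length, moves):
    """Record the moves the 'line' function would make, instead of drawing them."""
    if length <= 5:
        moves.append(('forward', length))
        return
    for angle in (60, -120, 60, 0):
        line_moves(length / 3, moves)
        moves.append(('left', angle))


def trace(moves):
    """Follow the recorded moves and return the end position, heading and segment count."""
    x = y = 0.0
    heading = 0.0  # degrees, 0 means pointing right
    segments = 0
    for kind, amount in moves:
        if kind == 'forward':
            x += amount * math.cos(math.radians(heading))
            y += amount * math.sin(math.radians(heading))
            segments += 1
        else:
            heading += amount
    return x, y, heading, segments


moves = []
line_moves(810, moves)
x, y, heading, segments = trace(moves)
print(segments, round(x, 6), round(y, 6), heading)
```

For a length of 810 the line is split five times (810, 270, 90, 30, 10), so the sketch records 4 to the power 5, or 1024, short segments. The turtle ends up 810 steps from where it started, pointing in its original direction.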

To draw the coloured snowflake (see the image at the start of this article), use the following code:

Note: the “turtle.color()” call sets the line colour and the colour used to fill in the final shape

import turtle

def line(length):
    if length <= 5:
        turtle.forward(length)
        return
    for angle in (60, -120, 60, 0):
        line(length / 3)
        turtle.left(angle)

turtle.begin_fill()
turtle.color('firebrick3', 'wheat')
for _ in range(3):
    line(90)
    turtle.right(120)
turtle.end_fill()
turtle.done()

A simple plot with Python and Bokeh

Python is a simple but powerful language, and comes with a wealth of libraries. The chart above took just 9 lines of Python. All the hard work is done by the Bokeh library. It shows the chart in your browser, where you can zoom in and move around the chart.

Here is the annotated code. You can find the raw code at the end of this post or at the GitHub repository

Before installing Bokeh, to keep your Python version(s) clean, you may want to set up a virtual environment first

To install Bokeh: pip install bokeh

1. from bokeh.plotting import figure, show
Import part of bokeh, so we can create and show a figure

2. import math
We’ll use the math module to generate the points on the chart

3. x_values = range(0, 720)
The x axis contains the numbers from 0 to 719 (Python stops just before 720)

4. y_values = [math.sin(math.radians(x)) for x in x_values]
For each of the x values, the y value is sine of x. Python’s sin function expects the angle in radians rather than degrees. math.radians converts from degrees to radians.

We use something called ‘list comprehension’ here, to build up the list of y axis values. It creates a new list, which consists of the sine of each x (converted from degrees to radians) in the original list.

5. p = figure(title='10 Sine waves', x_axis_label='x (degrees)', y_axis_label='y = sin(x)', plot_width=850, plot_height=350)
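Create an empty Bokeh figure, and set the title, labels, width and height. (From Bokeh 3.0 onwards the last two arguments are called width and height instead of plot_width and plot_height.)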

6. for i in range(10):
We’re drawing the same sine curve 10 times, at 10%, 20%, … 100% of the full height

For i equal to 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, do the following:

7.     factor = 1 - i/10
Calculate the scaling factor: (1 - 0/10) = 1, (1 - 1/10) = 0.9, (1 - 2/10) = 0.8, then 0.7, … down to 0.1

8.     p.line(x_values, [y * factor for y in y_values])
Add a line to the figure, using the original list of x_values, but scale down the y_values by the current factor

9. show(p)
Ask Bokeh to show the result in your browser
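If you would like to check the numbers before plotting them, the sketch below builds the same lists without Bokeh. The name 'factors' is my own; the rest mirrors lines 3, 4 and 7 of the code above.

```python
import math

# The same x and y values as in the chart: one point per degree
x_values = range(0, 720)
y_values = [math.sin(math.radians(x)) for x in x_values]

# One scaling factor per curve, from full height down to a tenth
factors = [1 - i / 10 for i in range(10)]

print(len(y_values))                     # number of points per curve
print(round(y_values[90], 3))            # sine of 90 degrees
print([round(f, 1) for f in factors])    # scaling factors, largest first
```

The list comprehension on the 'factors' line works in the same way as the one for 'y_values': it creates a new list by evaluating the expression once for each value of i.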

from bokeh.plotting import figure, show
import math
x_values = range(0, 720)
y_values = [math.sin(math.radians(x)) for x in x_values]
p = figure(title='10 Sine waves', x_axis_label='x (degrees)', y_axis_label='y = sin(x)', plot_width=850, plot_height=350)
for i in range(10):
    factor = 1 - i/10
    p.line(x_values, [y * factor for y in y_values])
show(p)

Use Python to update a spreadsheet

How would you like to grab a share price daily and store it in a spreadsheet? Or add a new column to dozens of spreadsheets – automatically?

Python is a simple but powerful language, and comes with a wealth of libraries. Its openpyxl library lets you easily open a spreadsheet and make some changes.

Here is an example which adds a new column (“Next age”) to all spreadsheets in the source_files folder. The left side of the image above shows an original spreadsheet. The Python script opens this, adds a new column (Next age), then saves it to the target_files folder. The right side of the image shows the result

Here is the annotated code. You can find the raw code at the GitHub repository

Before installing openpyxl, to keep your Python version(s) clean, you may want to set up a virtual environment first

To install openpyxl: pip install openpyxl

import openpyxl
import os
for name in os.listdir('source_files'):
    workbook = openpyxl.load_workbook(filename='source_files/' + name)
    sheet = workbook['Sheet1']
    sheet['C1'].value = 'Next age'
    for row in range(2, 100):
        if sheet[f'B{row}'].value:
            sheet[f'C{row}'].value = sheet[f'B{row}'].value + 1
    workbook.save(filename='target_files/' + name)

1. import openpyxl
Load the openpyxl library.

2. import os
Load the os library. We will use this to list the files in a folder

3. for name in os.listdir('source_files'):
For each file in our ‘source_files’ folder. Note that this includes all files, regardless of whether they are spreadsheets or not

4.     workbook = openpyxl.load_workbook(filename='source_files/' + name)
Open the workbook

5.     sheet = workbook['Sheet1']
Take the worksheet called ‘Sheet1’

6.     sheet['C1'].value = 'Next age'
Enter the new column heading in cell C1

7.     for row in range(2, 100):
For rows 2 – 99 (Python stops just before reaching 100), do the following:

8.         if sheet[f'B{row}'].value:
If cell B2, B3, B4, etc is not empty, do the following:

9.             sheet[f'C{row}'].value = sheet[f'B{row}'].value + 1
Take the age from column B, add one to it and store in the cell to the right, i.e. in column C

10.     workbook.save(filename='target_files/' + name)
Save the updated workbook to the target_files folder, using the same name
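As noted for line 3, os.listdir returns every file in the folder. One possible way around this, sketched below, is to keep only the names ending in ‘.xlsx’. The function name 'spreadsheet_names' and the file names in the demonstration are made up for this example; it uses only the standard library, so it does not open any workbooks.

```python
import os
import tempfile


def spreadsheet_names(folder):
    """Return the names of the .xlsx files in 'folder', sorted alphabetically."""
    return sorted(
        name for name in os.listdir(folder)
        if name.lower().endswith('.xlsx')
    )


# Demonstration in a temporary folder, using made-up file names
with tempfile.TemporaryDirectory() as folder:
    for name in ('ages.xlsx', 'notes.txt', 'TEAM.XLSX'):
        open(os.path.join(folder, name), 'w').close()
    found = spreadsheet_names(folder)

print(found)
```

In the script above you could then loop over spreadsheet_names('source_files') instead of os.listdir('source_files'), at the cost of going over the 10 lines.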

1/2 + 1/3 = 5/6

Fractions in Python

When you ask your spreadsheet to calculate 1/2 + 1/3 you get something like 0.8333333333.
This is obviously an approximation. The 3’s after the decimal point repeat indefinitely.

The correct answer is:

  • 1/2 = 3/6
  • 1/3 = 2/6
  • 1/2 + 1/3 = 3/6 + 2/6 = 5/6

Python is a simple but powerful language, and comes with a wealth of libraries. Its fractions module gives you the correct answer in a couple of lines

Here is the annotated code. You can find the raw code at the GitHub repository

1. from fractions import Fraction
Load the Fraction class from the fractions module

2. half = Fraction('1/2')
3. third = Fraction('1/3')
Create the two fractions

4. total = half + third
Add them up

5. print(half, '+', third, '=', total)
Show the result.
The more modern way is to use an “f-string”, which was introduced in Python 3.6, December 2016. This is often more readable, but not here. It would look like this:
print(f'{half} + {third} = {total}')
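To compare the exact answer with the approximation a spreadsheet (or ordinary Python arithmetic) gives you, here is a small sketch. It builds the fractions from two whole numbers rather than from a string, which is another way the Fraction class lets you create them. The name 'float_total' is my own.

```python
from fractions import Fraction

half = Fraction(1, 2)    # same value as Fraction('1/2')
third = Fraction(1, 3)   # same value as Fraction('1/3')
total = half + third
print(total)             # prints 5/6

# Ordinary division gives floating point numbers, which are approximations
float_total = 1/2 + 1/3
print(float_total)

# Because of rounding, the two float results below may not be exactly equal
print(float_total == 5/6)

# The Fraction result is exact
print(total == Fraction(5, 6))
```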


Retrieve and display a data set

(First part of the “Practical Python in 10 lines or less” series)

Python is a simple but powerful language, and comes with a wealth of libraries. The chart above took just 10 lines of Python. All the hard work is done by the Pandas and MatPlotLib libraries.

The code

import pandas, matplotlib
data = pandas.read_csv('http://www.compassmentis.com/wp-content/uploads/2019/04/cereal.csv')
data = data.set_index('name')
data = data.calories.sort_values()[-10:]
ax = data.plot(kind='barh')
ax.set_xlabel('Calories per serving')
ax.set_ylabel('Cereal')
ax.set_title('Top 10 cereals by calories')
matplotlib.pyplot.subplots_adjust(left=0.45)
matplotlib.pyplot.show()

How it works

You will need Python and the Pandas and MatPlotLib libraries. See the installation instructions

Get started

1. import pandas, matplotlib
Grab the libraries we need to load, clean up and display the data.
The recommended approach (PEP 8) is to have two import statements on separate lines. To leave enough lines to make the chart look good, in this example I have combined them.

2. data = pandas.read_csv('http://www.compassmentis.com/wp-content/uploads/2019/04/cereal.csv')
Load the csv data from a website. This gives us a pandas DataFrame, a two dimensional data structure similar to a page in a spreadsheet.
I downloaded the data from https://www.kaggle.com/crawford/80-cereals/version/2, under Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) [https://creativecommons.org/licenses/by-sa/3.0/]

3. data = data.set_index('name')
Set the row names (index) to the ‘name’ column. When we plot the data this becomes the data labels.

4. data = data.calories.sort_values()[-10:]
Take the ‘calories’ column, sort it and limit to the last 10 values. This gives us the 10 cereals with the highest calories per serving

5. ax = data.plot(kind='barh')
Plot the data as a horizontal bar chart

6. ax.set_xlabel('Calories per serving')
7. ax.set_ylabel('Cereal')
8. ax.set_title('Top 10 cereals by calories')
Set the labels for the x and y axes and the title

9. matplotlib.pyplot.subplots_adjust(left=0.45)
Set the left margin (from the left of the image to the left of the chart area) to 45% to give enough space for the cereal names.

10. matplotlib.pyplot.show()
Show the chart
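Line 4 relies on a handy trick: sort the values from low to high, then take a slice from the end. The sketch below shows the same idea in plain Python, without Pandas. The cereal names and calorie values are made up for illustration and are not taken from cereal.csv, and it keeps the top 3 rather than the top 10.

```python
# Made-up calorie values, for illustration only
calories = {
    'Cereal A': 110,
    'Cereal B': 70,
    'Cereal C': 150,
    'Cereal D': 120,
    'Cereal E': 90,
}

# Sort the (name, calories) pairs from lowest to highest calories
ranked = sorted(calories.items(), key=lambda item: item[1])

# A negative slice start counts from the end: keep the last three pairs
top_three = ranked[-3:]
print(top_three)
```

Because the values are sorted from low to high, the highest value ends up last.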

Getting started with Python for Scientific Computing

So you’d like to do some data analysis or other scientific computing with Python. How do you start?

The Anaconda distribution

A Python ‘distribution’ is a bundle of Python goodies, typically Python itself, a set of Python libraries and possibly an integrated development environment.

Anaconda is a Python distribution specifically for data science. It includes the most popular data science and machine learning Python packages, Jupyter for quick exploratory data analysis and Spyder for creating and running Python scripts.

For more information and to install Anaconda go to the Anaconda Distribution page

Jupyter Notebook

A Jupyter notebook lets you try out different Python commands and create a story which shows your steps and the results. For instance:

Once you have installed Anaconda, or otherwise installed Jupyter:

  1. Open a Terminal or Command Prompt
  2. jupyter notebook
  3. Jupyter will open in your browser
  4. Click on the ‘New’ button (right hand side), and select ‘Python 3’
  5. Start typing
  6. To execute a cell, hit Ctrl-Enter
  7. Jupyter automatically saves the notebook. Click on the title (top left hand corner, next to Jupyter logo) to give it a sensible name

Getting started with Python

So you’d like to give Python a go. How do you start?

(If you are going to be using Python for Scientific Computing, including Data Analysis, have a look at this article instead)

Installing Python

Make sure you install Python 3, which is the modern version of Python. There is also a legacy version of Python, Python 2.7, but this is being phased out and should not be used for new projects.

You can find installation files for Windows and Mac OSX at https://www.python.org/downloads/. When you start the installation on Windows there will be an option to add Python to the system path. I recommend you select this option, as it makes it easier to run your Python scripts. I have not tried this on Mac OSX; it may have the same option.

For Linux you can use your software package manager, such as aptitude, yum or zypper to install ‘python3’. This will give you Python 3

Running Python – REPL/Console

For trying out some simple Python commands you can use the Python Console. This is also called the REPL (Read-Eval-Print Loop). To start the Python Console, just run Python. This will give you something like this:

Have a little play with this. For instance:

When you are done, press ^Z followed by Enter (Windows) or ^D (Mac OSX and Linux). Or enter ‘exit()’
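If you are not sure what to try in the console, here are a few lines you could type in one at a time. They are only suggestions and the variable names are my own. In the console you can also type just an expression, such as 2 + 3, and Python will show the result without needing print.

```python
# A simple sum
total = 2 + 3
print(total)

# Joining two pieces of text
name = 'Python'
greeting = 'Hello ' + name
print(greeting)

# A list of the first five square numbers
squares = [n * n for n in range(5)]
print(squares)
```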

Running Python – IDLE editor

The console is great for quick experiments. For anything more permanent it is better to create a script, a text file which contains Python code. When you installed Python it came with IDLE, a very simple integrated development environment.

Start IDLE from your operating system’s menu. You will see something like this:

Now select File, New File. Enter some Python commands, like:

Hit ‘F5’ to run the program. You will be prompted to save the file first, so give it a name and save it. You will see the result of your script in the original (shell) window:

Running a Python script from the command line

Say you’ve written a Python script, or someone else has given you a script. How do you run it?

  1. Start a Terminal or (as Windows calls it) a Command Prompt.
  2. Use the ‘cd <path to folder>’ command to go to the folder which contains the script
  3. Enter: ‘python <scriptname>.py’. For instance: ‘python test.py’
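If you do not have a script to hand, you could save the sketch below as ‘test.py’ and run it with the steps above. The function name 'greet' is made up for this example.

```python
# test.py: a tiny example script


def greet(name):
    """Return a friendly greeting for 'name'."""
    return f'Hello, {name}!'


message = greet('world')
print(message)
```

Running ‘python test.py’ should print the greeting in your Terminal or Command Prompt.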

Other editors

IDLE is great for getting you started quickly, but for any serious Python development I suggest you use a professional text editor or IDE (Integrated Development Environment). Both a text editor and an IDE let you create and edit text files. An IDE can also run, debug, test and more. For instance:

  • PyCharm. My favourite IDE. It gives you so much power to write, run, debug and test your scripts, I don’t know where to start. Just check it out at …. Start with the free Community edition.
  • Visual Studio Code. I hear good things about this IDE, and it recently became more popular than PyCharm, so it must be doing something right.
  • Sublime Text. An excellent text editor

Data Analysis with Python

Python is a very popular tool for data extraction, clean up, analysis and visualisation. I’ve recently done some work in this area, and would love to do some more. I particularly enjoy using my maths background and creating pretty, clear and helpful visualisations

  • Short client project, analysing sensor data. I took readings from two accelerometers and rotated the readings to get the relative movement between them. Using NumPy, Pandas and MatplotLib, I created a number of different charts, looking for a correlation between the equipment’s setting and the movement. Unfortunately the sensors aren’t sensitive enough to return usable information. Whilst not the outcome they were hoping for, the client told me “You’ve been really helpful and I’ve learned a lot”
  • At PyCon UK (Cardiff, September 2018) I attended 14 data analysis sessions. It was fascinating to see the range of tools and applications in Python data analytics. At a Bristol PyData MeetUp I summarised the sessions in a 5 minute lightning talk. This made me pay extra attention and keep useful notes during the conference
  • Short client project, researching the best way to import a large data set, followed by implementation. The client regularly accesses large datasets using a folder hierarchy to structure that data. They were looking to replace this with a professional database, i.e. PostgreSQL. I analysed their requirements, researched the different storage methods in PostgreSQL, reported my findings and created an import script.

Django Rest Framework API Microservice

I recently completed a small project for Zenstores. They simplify the shipping process for ecommerce sites. Their online service lets online businesses use multiple shipping companies for deliveries.

Each shipping company offers its own API, for booking shipments, etc. My client uses a separate microservice for each shipping company. These microservices listen to requests from the main system and translate them to the shipping company’s standard.

My client asked me to use Django Rest Framework to create a microservice which supports a new shipping company. DRF is a popular and powerful library to create RESTful APIs using Django.

The supplier provided me with a sandbox API and extensive documentation. The documentation was somewhat incomplete and out of date. Fortunately their support contact was very helpful all along.

I used Test Driven Design for complex functions where I understood the functionality well. For the rest I used a more experimental approach and added unit tests afterwards. Testing coverage was over 90%.

The client has integrated the microservice within their system and the first test shipments have gone through.

Teaching Python

Recently Learning Tree, a well-respected training company, invited me to teach Python for them. Last week I delivered my first course for them, their Advanced Python course

A room full of people, nearly 500 slides, about 10 step-by-step practical exercises and four days to make sure everyone left with a better understanding of Python

Even though I’ve been programming in Python for 6 years, I still don’t know it all. The language itself is constantly growing, there are 150,000+ open source Python packages, and only so many bytes of storage in my brain. In preparation I read through the slides, and looked up anything which I wasn’t fully clear on myself. I was pleasantly surprised by how much I do know

And, on the flip side, I added some of my own experiences whilst delivering the slides, adding some depth and flavour to the course

I made sure to regularly check the delegates’ understanding, and to fine tune my delivery. I’ve yet to receive a compilation of the feedback but, as far as I can tell, everyone made good progress and enjoyed it.