Author Archives: coen

Namepy – on the shoulders of giants

Whilst my core skill/tool is Python, I’m always learning new things, either inside or outside the Python ecosystem. I recently had the pleasure of working with Angular and Python/Flask. Here is a playful application based on these, plus Highcharts.

Going through “Python for Data Analysis”, some of the examples use a database of frequency of (US) baby names since 1880. I thought I’d combine this with a bit of Scrabble™.

In the Python world it’s common to add “py” to a word when making up names, so I’m calling this project “namepy”.

Since I’ll be using various frameworks and libraries, all created by others, I’ve subtitled this “On the shoulders of giants”.

Taking small steps often results in faster progress, so that’s what I’ll be doing here.

Technical set up

The source code is at https://github.com/CoachCoen/namepy, with one branch per step.

Many production sites use Content Delivery Networks (CDNs) to serve JavaScript frameworks and libraries, usually minified, which helps to take the load off the server and may speed up the first page load. To keep things simple and stable over time, I’m using full-sized, downloaded copies.

I’m using WebFaction (affiliate link) as the host, since they make it easy to create Flask, Django and similar projects. And, because it is a popular host among developers, you’ll find lots of helpful documentation online.

Getting started

Create a project folder

mkdir namepy
cd namepy

At the start of each of the steps

cd (my folder for personal projects)
cd namepy
git clone https://github.com/CoachCoen/namepy.git -b step1 step1

Note: “-b step1” specifies the name of the branch to clone. The second “step1” is the target folder, i.e. namepy/step1.
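
The per-step checkout can also be scripted. This is a minimal sketch, assuming git is on the PATH and that later branches follow the same naming pattern as step1 (step2, step3, and so on). The helper names clone_command and clone_step are hypothetical and not part of the namepy repository.

```python
import subprocess

REPO_URL = "https://github.com/CoachCoen/namepy.git"


def clone_command(step):
    """Build the git command that clones one step's branch into its own folder."""
    branch = "step{}".format(step)
    # "-b <branch>" picks the branch; the final argument is the target folder
    return ["git", "clone", REPO_URL, "-b", branch, branch]


def clone_step(step):
    """Run the clone; raises CalledProcessError if git reports a failure."""
    subprocess.check_call(clone_command(step))


if __name__ == "__main__":
    # Only print the command here; call clone_step(1) to actually run it
    print(" ".join(clone_command(1)))
```

Run it from inside the namepy folder so that each step ends up in its own subfolder.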

Next

Continue to Step 1 – Angular “Hello World”

Investment Tracking System – Django/Python

My client, a start up with a lot of experience in their field, had identified an important gap in the market. Large sums of money were being invested, with very long payback periods, without access to effective performance tracking tools.

They designed a tool to cover the gap and asked me to create a demonstration system in preparation for generating interest and raising capital.

I developed the system in Django, Python, PostgreSQL and Javascript. The front end uses a dashboard template based on Bootstrap and jQuery. Graphs are created using the excellent Highcharts charting library.

The resulting system imports the base data and generates monthly cost and revenue forecasts, taking into account seasonal variations, tax allowances and more.

The main management screen gives quick access to some key performance indicators.

Constraints can be defined, and potential investments can be checked against them.

Actual results can be compared against the projections.

Different heat maps show absolute or relative performance by state or county.

This was an intensive eight-month project, resulting in a demo site which generated a lot of interest in the industry and allowed the client to achieve their first round of funding.

VirtualBox – moving virtual disks

Having installed VirtualBox I created a few virtual machines, with 20GB virtual disks each. After all, I’ve got a 1TB hard disk in my computer.

When Linux started complaining about running out of disk space I realised that my main drive is a 256GB Solid State Drive, and that is where VirtualBox was storing the virtual disks. Time to move the virtual disks. Please note that it is a bit of a manual process and, whilst it worked fine for me, if you’ve already invested a lot of time setting up your virtual machines, I suggest that you test this out by moving one or two VMs first.

  1. Stop all VMs, stop VirtualBox
  2. Find a folder called “VirtualBox VMs” (probably in your home folder), and move it to the new location
    1. Or move just a few files first, to make sure it works
  3. Restart VirtualBox
    1. In File -> Preferences -> General, point the Default Machine Folder to the new location
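    2. Your existing VMs will now show as broken (inaccessible). Remove them from the list; this only unregisters them and leaves the files you moved in place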
    3. One at a time: Machine -> Add
      1. Select the .vbox file from the new folder
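
Before re-adding the machines one at a time, it can help to list the .vbox files that ended up in the new location. A minimal sketch using only the standard library; the function name find_vbox_files and the example path are my own assumptions, so substitute the folder you moved things to.

```python
from pathlib import Path


def find_vbox_files(machine_folder):
    """Return every .vbox settings file below machine_folder, sorted by path."""
    root = Path(machine_folder).expanduser()
    if not root.is_dir():
        return []
    # "*.vbox" does not match backup files such as "name.vbox-prev"
    return sorted(root.rglob("*.vbox"))


if __name__ == "__main__":
    # Hypothetical location; replace with wherever you moved the folder
    for vbox in find_vbox_files("~/VirtualBox VMs"):
        print(vbox)
```

Each path it prints is a file you can select in Machine -> Add.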

Virtual machines using VirtualBox

Tapping into all the wonderful tools and languages that make our lives as developers so interesting can throw up some fun (?) challenges.

You start installing the latest linter or library, and before you know it you’re in dependency hell. Required libraries require even more libraries, with different version numbers that clash with your existing setup. If you’re really unlucky you break some essential software along the way.

A safer approach is to use virtual machines, for experimentation, and to keep different (and clashing) environments separate.

For more detail see the step by step instructions at Everyday Linux User.

These instructions are for installing VirtualBox on Linux Mint (17.2), and then creating a Linux Mint virtual machine

Installation and setup

I mostly used the default settings. This may not be right for everyone.

  1. Use a package manager to install VirtualBox.
    1. I used Synaptic, and selected virtualbox-5.0
  2. When I started VirtualBox and tried to create my first virtual machine (VM), it only let me create 32 bit VMs, even though my computer is fully 64 bit
    1. Fix this by going into the BIOS and enabling (Intel) Virtualisation
  3. If like me you have multiple drives (SSDs and/or HDDs) or multiple partitions, specify where VirtualBox stores the virtual drives (files):
    1. File -> Preferences -> General -> Default Machine Folder

Your first virtual machine

  1. Download a copy of the relevant Linux distro
  2. Start VirtualBox: Menu -> Administrate -> Oracle VM VirtualBox
  3. Click on “New”
    1. Name: E.g. “Linux Mint 1”
    2. Type: “Linux”
    3. Version: Linux 2.6/3.x (64 bit) or Other Linux (64 bit)
    4. Memory: Recommended size or higher (note: you can always change it later)
      1. During my first attempt I used the default of 256MB. When trying to boot Linux Mint off the install ‘disk’, the virtual machine ground to a halt. Increasing this to 1GB fixed it
      2. 512MB minimum, 1GB+ is probably better
  4. Hard drive: Create a virtual drive now
    1. Note: VirtualBox creates a file and pretends that it is a whole hard disk (i.e. a “virtual drive”)
    2. Type: leave as is (VDI)
    3. Size: Dynamically allocated
    4. Limit: leave as is or increase
    5. Create
  5. The machine has been created, but isn’t running yet. Click on the “Start” button
    1. Click on the folder with green ‘arrow’ icon
    2. Select the previously downloaded distro
    3. Click Start
      1. This will start the VM, which will boot off the downloaded distro
    4. Follow the instructions to install the new OS
    5. Suggestion: Use a different password from your main password. This is generally good security practice, but in this case it may also stop you doing something on your actual Linux install whilst thinking you’re working on a VM. I realised this when I tried to strip down the software on a VM to only what is relevant to development work, and nearly removed some software from the host operating system
  6. If you get a large error message “Running in software rendering mode”:
    1. With the VM stopped, go into Settings -> Display -> Enable 3D Acceleration

I suggest you clone this VM before you start experimenting, to save you having to reinstall the OS should you need another (clean) VM

Installing Python 2.7.10 on Linux Mint 17.2

Warning: You may have some packages on your machine which rely on Python, and which may no longer work after installing a different version of Python. Ideally you should use a virtual machine for this

From https://slobaexpert.wordpress.com/2015/07/09/upgrade-python-2-7-6-to-python-2-7-10-on-linux-mint-os/:

  1. Download Gzipped source tarball from the Python website
  2. Unzip the downloaded file
  3. Switch to the new folder (e.g. ~/Downloads/Python-2.7.10)
  4. In a terminal window:
    1. sudo apt-get install libc-dev
      1. If you miss this, you’ll get an error: error: C compiler cannot create executables
    2. ./configure
    3. sudo make install
  5. Run “python --version”
    1. You should now see 2.7.10

Bottle – Python micro framework

Like Flask, Bottle is a Python micro-framework. It is so micro that it consists of only a single file. Whilst Flask is already a fairly small framework, some developers prefer Bottle, mainly for its ease of setup (single file, no dependencies).

The Bottle home page gives a Hello World example

Local trial

  1. Create and activate a new virtualenv or use a virtual machine.
    1. I use a VM for experimenting with Python libraries and frameworks, to keep it separate from my client work
  2. pip install bottle (inside an activated virtualenv; use sudo only for a system-wide install)
  3. Create hello_world.py with the code from the Bottle home page
  4. Run the script: python hello_world.py
  5. Check in your browser: http://localhost:8080/hello/world
    1. This should show “Hello world!”
    2. Also try it with your own name, e.g. http://localhost:8080/hello/Coen

On a server – using wsgi

For Python I use WebFaction, which makes it very easy to create new Python applications

In your WebFaction control panel:

  1. Domains / Websites -> Websites
    1. Select the domain
  2. Click on “Add an application” -> Create a new application
    1. Name: bottle_hello_world
    2. Category: mod_wsgi
    3. Type: mod_wsgi / Python 2.7 (note: Bottle also works with Python 3.x)
    4. URL: /bottle-hello-world
    5. Save
    6. Click Save again
  3. ssh into the host
    1. cd webapps
    2. cd <app name> (e.g. bottle_hello_world)
    3. cd htdocs
    4. pip install bottle
  4. Check bottle is installed
    1. python2.7
    2. import bottle
  5. Adapt code for wsgi (based on instructions at http://bottlepy.org/docs/dev/deployment.html#apache-mod-wsgi) and replace the contents of index.py with:

    import os
    # Change working directory so relative paths (and template lookup) work again
    os.chdir(os.path.dirname(__file__))

    import bottle
    from bottle import route, template
    # mod_wsgi looks for a module-level callable named "application"
    application = bottle.default_app()

    @route('/hello/<name>')
    def index(name):
        return template('<b>Hello {{name}}</b>!', name=name)

  6. Test it, e.g. http://cm-demo.com/bottle-hello-world/index.py/hello/fred
    1. This should show “Hello fred!”
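
mod_wsgi serves whatever WSGI callable it finds under the name application, which is why index.py assigns bottle.default_app() to that name. To show roughly what such a callable looks like underneath, here is a sketch using only the standard library. It imitates the /hello/<name> route but is not how Bottle implements it; the routing logic and the call_app helper are simplifications of my own, and it is written for Python 3.

```python
from html import escape
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Tiny WSGI app: answers /hello/<name>, returns 404 for anything else."""
    parts = environ.get("PATH_INFO", "").strip("/").split("/")
    if len(parts) == 2 and parts[0] == "hello":
        # Escape the name so user input cannot inject HTML
        status, body = "200 OK", "<b>Hello {}</b>!".format(escape(parts[1]))
    else:
        status, body = "404 Not Found", "Not found"
    start_response(status, [("Content-Type", "text/html; charset=utf-8")])
    return [body.encode("utf-8")]


def call_app(path):
    """Call the app once, the way a WSGI server would, and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], body.decode("utf-8")


if __name__ == "__main__":
    print(call_app("/hello/fred"))
```

A framework such as Bottle does the same job with far more care: it builds the callable for you and dispatches each request to the function registered with @route.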

 

Flask and Angular on Heroku

I am working my way through this excellent tutorial, covering Python3, Flask, Angular, Heroku, SQLAlchemy, Alembic, requests, Beautiful Soup, NLTK, Redis and D3. Here are some extra notes

  • To stop me from blindly copying/pasting the code, I printed off the tutorial and worked from the paper version.
  • I had some problems installing Virtualenvwrapper (on Linux Mint 17.2), until I followed these instructions
  • I had some clashes with Anaconda
    • Virtualenvwrapper’s deactivate clashed with Anaconda’s deactivate. Prompted by these instructions I renamed ~/anaconda/bin/activate and ~/anaconda/bin/deactivate
    • “pip install psycopg2” resulted in:
      Error: “setuptools must be installed to install from a source distribution”
      After much experimentation I guessed that this might be due to Anaconda. I created a new virtual machine (without Anaconda) and re-started the tutorial. This fixed the psycopg2 problem
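
When two Python installations clash like this, it can help to check which interpreter a command actually runs. A small sketch using only the standard library; the looks_like_conda check is a rough heuristic of my own (it just searches for "conda" in the version string and install path), not an official API.

```python
import sys


def interpreter_report():
    """Describe which Python installation is running this script."""
    # base_prefix exists on Python 3.3+; real_prefix is set by older virtualenv releases
    base = getattr(sys, "real_prefix", getattr(sys, "base_prefix", sys.prefix))
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "in_virtualenv": base != sys.prefix,
        # Heuristic only: conda builds usually mention it in the version or path
        "looks_like_conda": "conda" in (sys.version + sys.prefix).lower(),
    }


if __name__ == "__main__":
    for key, value in sorted(interpreter_report().items()):
        print("{}: {}".format(key, value))
```

Running this with the same python command that pip uses shows whether an install is going into the virtualenv or into another Python on the PATH.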

Part 1, set up Heroku

  • I used a free Heroku account. Between a dedicated server, a WebFaction account and a HotDrupal account I’m already paying enough for hosting
  • “heroku create wordcounts-pro” gave me an error “Name is already taken”. According to this Heroku page, app names are in the global namespace, so I guess I’m not the first one to follow this tutorial. To work around this, I prefixed the app name with my initials, i.e. “heroku create cdg-wordcounts-pro”, etc
  • So that I could push the changes to Heroku, I set up public key access
  • Before running “git push stage/pro master”, make sure to check in the changes to git (git add, git commit)

Part 2, set up databases

  • To create the Postgres database:
    • sudo su - postgres
    • psql
      • # CREATE DATABASE wordcount_dev;
      • # CREATE USER <your user name>;
      • # GRANT ALL PRIVILEGES ON DATABASE wordcount_dev TO <your user name>;
  • After running “heroku run python manage.py db upgrade …” I got the error message:
    No such file or directory: ‘/app/migrations/versions’

    • Locally I had an empty directory <app folder>/migrations/versions. However, git ignores empty directories. This is why I could run “.. manage.py db upgrade” locally but not on heroku
    • Oops, I’d forgotten to run
      python manage.py db migrate
      Now it worked fine
    • If you make the same mistake, remember to propagate the changes to heroku and then re-run db migrate on heroku
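
Git tracks files rather than directories, which is why an empty migrations/versions folder never reaches Heroku. A common workaround (a general git convention, not something from the tutorial) is to commit a placeholder file inside the folder. The sketch below does that; the name .gitkeep is only a convention with no special meaning to git, and keep_empty_dir is a hypothetical helper.

```python
from pathlib import Path


def keep_empty_dir(directory, placeholder=".gitkeep"):
    """Create directory if needed and drop an empty placeholder file into it."""
    path = Path(directory)
    path.mkdir(parents=True, exist_ok=True)
    marker = path / placeholder
    # git tracks files, so committing this file makes the folder part of the repo
    marker.touch()
    return marker
```

Calling keep_empty_dir("migrations/versions") from the app folder, followed by git add and git commit, would make the folder exist on the remote as well.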

Part 3, requests, Beautiful Soup and NLTK

  • At one stage I got a server error. To sort this I looked at the heroku log:
    heroku logs --app <heroku app name>
  • When I ran the nltk downloader I didn’t get the usual GUI but a “TUI” (text user interface). It was fairly simple to navigate, but I didn’t bother to specify the location of the tokenizers. Instead I used the default (~/nltk_data) and then moved nltk_data into my app folder
  • The links to Bootstrap and jQuery didn’t work, either because I mistyped them or because they are out of date. The Bootstrap and jQuery websites give you up-to-date CDN links, so use those instead

Part 4, Redis task queue

  • I used these instructions to install Redis on Linux Mint
  • Apart from the inevitable few typing mistakes, everything worked remarkably smoothly. Nothing else to add

Part 5, Adding in Angular

  • It all went well, until I added the getWordCount function to the controller. I’d put the function inside the main.js file, but outside of the controller. When poller got called, none of the dependencies were included, so it couldn’t find $http (first line of poller function)
    • The error was: $http not defined
    • Despite comparing my version with the author’s GitHub one, I couldn’t see the difference. In the end I used the author’s version (of main.js) instead of mine. That worked fine. It took another line by line comparison to find the problem
  • The word/frequency list is no longer sorted. jsonify loses the order
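
A JSON object does not guarantee key order, whereas a JSON array does. One way to keep the ranking is to send a list of [word, count] pairs instead of a dictionary. The sketch below uses the standard library's json module rather than Flask's jsonify, so it illustrates the idea and is not the tutorial's code; the function name and sample sentence are made up.

```python
import json
from collections import Counter


def top_words_payload(words, limit=10):
    """Serialise the most frequent words as an ordered list of [word, count] pairs."""
    counts = Counter(words)
    # Sort by descending count, then alphabetically so ties are deterministic
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return json.dumps(ranked[:limit])


if __name__ == "__main__":
    print(top_words_payload("the cat and the hat and the bat".split(), limit=3))
```

On the client side the array can be iterated in order, with no need to sort again.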

Part 6, Staging the changes, including Redis

  • So far I’ve been using a free account. When I tried to add on Redis, heroku tells me: Please verify your account to install this add-on plan (please enter a credit card)
    • If I understand it correctly, it is still free (but don’t take my word for it – and don’t come back to me if you end up getting charged)
    • I entered my credit card details for my Heroku account. Now I can add Redis
  • “heroku addons:add redistogo --app” gave a warning to say that “addons:add” has been deprecated.
    • I used “addons:create” instead

 

A simple Python Flask app on WebFaction

Flask is a popular Python-based micro framework. Here is how to install it on WebFaction.

This is based on the Flask instructions and the WebFaction instructions.

  1. Log into your WebFaction control panel
  2. Domains/Websites -> Applications -> Add new application
    1. Name: FlaskTest
    2. mod_wsgi
    3. mod_wsgi / Python 2.7
    4. (keep port closed)
  3. Domains/Websites -> Websites
    1. Click on website
    2. Add an application -> Reuse an existing application
    3. Select flasktest, url: http://cm-demo.com/flasktest
    4. Save the Website (important: this step is easy to miss, I’ve missed it myself a few times)
  4. ssh into host
    1. cd webapps/flasktest
    2. vi setup.sh
    3. cut and paste script from https://community.webfaction.com/questions/12718/installing-flask
    4. Change APPNAME and URLPATH (e.g. /foo)
    5. exit vi (<esc>ZZ)
    6. sh setup.sh
  5. Test it: point your browser to <domain>/foo
    1. This should show “Hello World!”
    2. Edit webapps/flasktest/flasktest/__init__.py, line:
      return "Hello World!"
      to return some different text
    3. Refresh page in browser
    4. Browser should now show the new text
  6. Set up the static files – make all files in /foo/static static:
    1. Domains/Websites -> Websites
    2. Click on website
    3. Add an application
      1. Create a new application
      2. Name: flasktest_static
      3. Symbolic link
      4. Symbolic link to static-only app
      5. Extra info: /home/<user_name>/webapps/<app>/<app>/static
    4. Create the static folder
    5. Create a test page in the new static folder
    6. Try loading the page in your browser

Python for Data Analysis and Natural Language Processing

As I’m making my way through Natural Language Processing with Python and Data Science from Scratch: First Principles with Python, the first step is to set up the development environment.

My first attempt was to install numpy, python, nltk, matplotlib, IPython, etc, one at a time. However, I hit a few clashes between Python versions, so switched to Anaconda instead:

  1. Download Anaconda
  2. From the download folder, execute
    1. sh <name of downloaded file>
    2. Accept the defaults, but answer yes to prepend to PATH
  3. To check it works, start Python
    1. import numpy
    2. import pandas
    3. import matplotlib
  4. Download the nltk assets. Start Python and enter:
    1. import nltk
    2. nltk.download()
    3. use GUI to download “all”
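
The import check in step 3 can be done in one go. A minimal sketch, written for Python 3.4 or later, that reports which packages the current interpreter cannot find; missing_packages is a hypothetical helper name, and the package list simply mirrors the ones mentioned above.

```python
import importlib.util

PACKAGES = ["numpy", "pandas", "matplotlib", "nltk"]


def missing_packages(names):
    """Return the names that cannot be found on the current interpreter's path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(PACKAGES)
    print("All found" if not missing else "Missing: " + ", ".join(missing))
```

It only checks that each package can be located, not that it imports cleanly, so it complements the manual imports rather than replacing them.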

Python, Bottle and websockets

Here is a simple websockets demo, using Python and Bottle, based on the example on the Bottle website

  1. Requirements:
    • pip install bottle
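    • sudo apt-get install python-dev (a system package rather than a pip package; it provides the Python headers, which gevent may need in order to build)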
    • pip install gevent-websocket
  2. Create websockets.py, from the source code at http://bottlepy.org/docs/dev/async.html#finally-websockets
  3. python websockets.py
    • This starts the (websockets) server. Note that it doesn’t show any output
  4. Create websockets.html, again from http://bottlepy.org/docs/dev/async.html#finally-websockets
    • Make sure to change the websocket address (from “ws://example.com:8080/websocket”). On your local machine this should be “ws://localhost:8080/websocket”
  5. Load websockets.html in your browser (e.g. as a local file, at file:/// etc)
    • This should come up with an alert saying: Your message was ‘Hello, world’