Python2 & Python3 on the same Apache server

Notes on setting up a Python 3 application on an Apache server already running Python 2 applications. This covers Flask, mod_wsgi-express, Apache2 & HTTPS support.

For my latest personal project – a web app allowing my wife to fetch activities tracked on her Garmin GPS watch & upload them to Runkeeper – I finally decided to take the plunge & switch to Python 3.(1) All was well when testing locally using Flask’s built-in web server, but things got tricky when I tried to deploy on my VPS, where Apache was already serving Python 2 applications.

Running both isn’t directly possible, though the linked page starts to explain the solution. I found bits of this scattered across the web, but no complete description. Hopefully this post might help others in a similar position. Throughout, I will show the setup using paths on my system, with a Flask application deployed in /var/www/apps/garmin with a WSGI script called flask.wsgi, which we want to serve on the URL /garmin.

mod_wsgi-express

What makes this possible at all is mod_wsgi-express. With the standard mod_wsgi, Apache uses an embedded Python, which prevents running two versions simultaneously. Instead, mod_wsgi-express configures its own Apache instance, which can target whatever Python version we like. The mod_wsgi pip package provides mod_wsgi-express, though you will need the Python & Apache headers installed (e.g. apt-get install python3-dev apache2-dev).

We can then start a new web server using the following:

This of course needs to be run as a user that can access these files. The --reload-on-changes flag means that we do not have to restart the server after making changes to our application. A new Apache server will be listening on localhost:8000 by default, with the configuration & service scripts available in a directory something like /tmp/mod_wsgi-localhost:8000:106.
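As a rough sketch, the start-up command can also be assembled from Python, which is handy if the server is launched from a deployment script. The helper name express_command is made up for illustration; the flags are the ones mentioned in this post (--reload-on-changes, --mount-point, --trust-proxy-header) plus --port, and the script path is the one from my system.

```python
import shlex


def express_command(wsgi_script, port=8000, mount_point=None,
                    trusted_headers=(), reload_on_changes=True):
    """Build the argument list for `mod_wsgi-express start-server`.

    Hypothetical helper: it only assembles arguments, it does not run anything.
    """
    cmd = ["mod_wsgi-express", "start-server", wsgi_script,
           "--port", str(port)]
    if reload_on_changes:
        # Restart worker processes when application files change.
        cmd.append("--reload-on-changes")
    if mount_point:
        # URL path under which the application is served, e.g. /garmin.
        cmd += ["--mount-point", mount_point]
    for header in trusted_headers:
        # Proxy headers the internal server should believe.
        cmd += ["--trust-proxy-header", header]
    return cmd


basic = express_command("/var/www/apps/garmin/flask.wsgi")
print(shlex.join(basic))
```

Passing the resulting list to subprocess.run(cmd, check=True) would start the server in the foreground, as the user running the script.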

Apache2 reverse proxy

With mod_wsgi-express providing a local Apache server, we now need our public-facing Apache instance to forward requests. We set up a reverse proxy that forwards requests for the appropriate resource to the internal web server. We can do this like so:

Unlike standard mod_wsgi, there is no need for us to configure WSGIDaemonProcess, WSGIScriptAlias, WSGIProcessGroup, WSGIApplicationGroup etc. Great!

Fixing the URLs

But wait! Although this appears to work and our site is being served to the world, all of our links that are relative to our site root (i.e. beginning with a /) are broken and point to the public-facing site root. This includes URLs generated by Flask. Fortunately, the fix is quite straightforward – though quite a few guides seem to have this slightly wrong. First, we update the public-facing Apache configuration:

We then provide the --mount-point flag to mod_wsgi-express:

Finally, we need to make sure XHR requests understand where to route to. We can do this by passing through the script root in our base Flask template:

We then simply prefix this to any site-relative links generated in JavaScript.
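To see why the mount point matters: a WSGI server tells the application where it is mounted via the SCRIPT_NAME entry of the request environment, and frameworks prefix generated URLs with it (Flask exposes it as request.script_root). The following is a minimal sketch of that prefixing, not Flask's actual implementation; site_url and the /activities route are made up for illustration.

```python
def site_url(environ, path):
    """Build a site-relative URL by prefixing the WSGI mount point."""
    # SCRIPT_NAME is empty when the app is served from the site root.
    script_root = environ.get("SCRIPT_NAME", "").rstrip("/")
    return script_root + "/" + path.lstrip("/")


# Without a mount point the link targets the public site root...
unmounted = site_url({"SCRIPT_NAME": ""}, "/activities")
# ...with the mount point set, it stays inside the application.
mounted = site_url({"SCRIPT_NAME": "/garmin"}, "/activities")

print(unmounted)  # /activities
print(mounted)    # /garmin/activities
```

The same idea applies on the JavaScript side: the script root passed through the template plays the role of SCRIPT_NAME for any URL built in the browser.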

HTTPS support

The final challenge was providing HTTPS support – since we’re forced to scrape Garmin’s website, users must provide their Garmin username and password. With an SSL certificate in place, at least they’re not sent in the clear.(2) Using the absolutely awesome Let’s Encrypt, I set up a free, signed HTTPS certificate.

The problem is that the internal, forwarded request is a standard HTTP request (not really a security issue, since it is all internal traffic). However, this means URLs constructed by Flask will use HTTP and not HTTPS. This might not be a problem for most people if you’re forwarding HTTP requests to HTTPS, but it’s a huge issue if the URL were to be used in, say, a hash function, as with the OAuth redirect_uri! To fix, make sure the public-facing Apache sets an appropriate header:

and mod_wsgi-express knows to trust such a header:

In newer versions of Flask (I’m running 0.12.2) this is good enough and Flask will take care of the rest. It appears that older versions might not do so automatically and you may need to experiment. However, AFAICT the ProxyFix middleware shouldn’t be necessary, as our HTTP server is correctly setting the headers for us.
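If you do end up experimenting with an older setup, the logic involved is small. Below is a minimal sketch of a WSGI wrapper that copies a forwarded scheme into wsgi.url_scheme, assuming the header in question is X-Forwarded-Proto. It is not the actual code of mod_wsgi-express or of Werkzeug's ProxyFix, and the names trust_forwarded_proto, show_scheme and call are made up for illustration.

```python
def trust_forwarded_proto(app):
    """Wrap a WSGI app so wsgi.url_scheme follows X-Forwarded-Proto."""
    def wrapper(environ, start_response):
        # WSGI exposes request headers as HTTP_* keys in the environment.
        proto = environ.get("HTTP_X_FORWARDED_PROTO", "").strip().lower()
        if proto in ("http", "https"):
            environ["wsgi.url_scheme"] = proto
        return app(environ, start_response)
    return wrapper


def show_scheme(environ, start_response):
    """Toy application that reports the scheme it believes it is served on."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["wsgi.url_scheme"].encode("ascii")]


def call(app, environ):
    """Invoke a WSGI app with a throwaway start_response and join the body."""
    return b"".join(app(environ, lambda status, headers: None))


wrapped = trust_forwarded_proto(show_scheme)
plain = call(wrapped, {"wsgi.url_scheme": "http"})
forwarded = call(wrapped, {"wsgi.url_scheme": "http",
                           "HTTP_X_FORWARDED_PROTO": "https"})
print(plain, forwarded)  # b'http' b'https'
```

A wrapper like this is only safe when every request reaches the application through the trusted proxy, since otherwise a client could set the header itself.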

Aside: certbot doesn’t understand WSGI directives

certbot, the tool provided by Let’s Encrypt to auto-configure your setup, doesn’t yet properly understand WSGI directives. In particular, the name given to each WSGIDaemonProcess must be unique. certbot attempts to duplicate these, which causes an Apache config error and makes certbot roll back the changes. For now, the simplest fix is to comment out such lines before using certbot & then uncomment one of each (or both, renaming the HTTPS daemon process).

1. When I began my PhD way back in 2012 and started learning Python, library support for Python 3 wasn’t yet complete enough for me and I’ve been stuck on Python 2 since
2. Yes, this really isn’t nice – originally I’d planned to keep this a private application for my wife and keep her credentials on the server, but this at least means it could be used by more than one person and means I don’t have to authenticate users myself

Analysing the UK General Election Results

On Thursday, 7th May, the UK voted in a Conservative government by a small majority, to the surprise of most. The surprise was not so much that the Conservatives won the largest number of seats, but that a majority government could be formed at all – a hung parliament was widely expected. As the final votes were being counted on Friday morning, I set about analysing some of the results. In particular I was interested in how the UK system – in which 650 MPs are elected into parliament by separate votes in each constituency – affected things. I created a webpage to present this; this post is about how I did that.

Getting the data

The dataset I wanted included both the overall results – number of seats and votes won by each party – and the per-constituency results. I couldn’t find any website providing this data in an easily accessible form, so I set about scraping the results from the BBC’s live coverage.

Obtaining the overall results proved to be very straightforward: the above page contains the results in a JSON string. So far, so good.

Per constituency results turned out to be a little more tricky. The BBC provided a results page for each constituency (e.g. the constituency I voted in). Unfortunately these results weren’t in a nice JSON format, so I’d have to parse the HTML. The BeautifulSoup Python module is great for this:
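As an illustration of the kind of extraction involved, here is a minimal sketch that uses the standard library's html.parser in place of BeautifulSoup, so that it runs without extra dependencies. The markup in SAMPLE_HTML, the class names and the party names are all hypothetical and do not reflect the BBC's actual pages.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for a constituency results page.
SAMPLE_HTML = """
<ul>
  <li class="party"><span class="name">Party A</span><span class="votes">21,345</span></li>
  <li class="party"><span class="name">Party B</span><span class="votes">9,876</span></li>
</ul>
"""


class ResultsParser(HTMLParser):
    """Collect (party, votes) pairs from <span class="name"/"votes"> elements."""

    def __init__(self):
        super().__init__()
        self.results = []    # list of (party, votes) tuples
        self._field = None   # class of the <span> currently open, if any
        self._name = None    # most recently seen party name

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("name", "votes"):
            self._field = cls

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None

    def handle_data(self, data):
        if self._field == "name":
            self._name = data.strip()
        elif self._field == "votes":
            # Vote counts are written with thousands separators.
            self.results.append((self._name, int(data.replace(",", ""))))


parser = ResultsParser()
parser.feed(SAMPLE_HTML)
results = parser.results
print(results)  # [('Party A', 21345), ('Party B', 9876)]
```

With BeautifulSoup the same job is usually a couple of find_all calls over the parsed tree, which is a large part of its appeal.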

Cool. But I still needed to get the data for all the individual constituencies. The BBC results pages clearly had an ID in the URL (e.g. http://www.bbc.co.uk/news/politics/constituencies/E140006020), so I just needed to find a list of IDs somewhere. After a bit of digging, I found that the winner of each constituency was being fetched on the results page in a nice JSON string. The network traffic analyser in the Chrome developer tools was great for finding this. A quick bit of manipulation of this JSON structure later, and we have a text file containing all 650 constituency IDs. A simple bash pipeline later…

…and we have all the pages we need, fetched two at a time with parallel and wget. By the way, I love GNU Parallel.
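The same steps can be sketched in Python, with a two-worker thread pool standing in for parallel and wget. This is an illustration under stated assumptions rather than the pipeline I ran: the JSON layout (a list of records with an "id" key), the second ID and the party names are hypothetical, and the demo uses a fake fetcher so that it does not touch the network.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

BASE_URL = "http://www.bbc.co.uk/news/politics/constituencies/"


def constituency_ids(winners_json):
    """Extract constituency IDs from a JSON list of winner records."""
    return [record["id"] for record in json.loads(winners_json)]


def fetch(url):
    """Download one page and return its raw bytes."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def fetch_all(ids, fetcher=fetch, workers=2):
    """Fetch every constituency page, at most `workers` at a time."""
    urls = [BASE_URL + cid for cid in ids]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pages = list(pool.map(fetcher, urls))  # results keep input order
    return dict(zip(ids, pages))


# Hypothetical sample data; the real JSON structure is not shown in this post.
sample = ('[{"id": "E140006020", "winner": "Party A"},'
          ' {"id": "EXAMPLE0002", "winner": "Party B"}]')
ids = constituency_ids(sample)
# Fake fetcher for the demo: returns the URL itself instead of downloading.
pages = fetch_all(ids, fetcher=lambda url: url.encode("ascii"))
print(sorted(pages))
```

Keeping the number of workers small, as with fetching two at a time, avoids hammering the server being scraped.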

With the data now in an easy format to work with, it was pretty straightforward to knock together some quick(1) Python scripts to extract some results.

Generating graphs

Initially I planned to create a static image with a few graphs. But I decided that it would be easier to share the analysis as a webpage. First up, I would need a JavaScript graphing library.(2) I’ve used Flot in the past, but I decided to try out Google Charts for no reason other than that it’s good to experiment. The documentation was pretty clear, as were the examples, so it was easy to get started. First up, a bar chart of how the result would be affected if seats were allocated based on the proportion of the national vote won by each party:

Getting a PNG image of the graph from Google Charts for this post was also straightforward:

[Figure: effect of proportional representation]
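The allocation behind such a chart is simple arithmetic. Here is a minimal sketch, assuming the largest-remainder method for handling fractional seats (the rounding used for the chart may differ) and entirely made-up vote totals; proportional_seats is a hypothetical helper name.

```python
def proportional_seats(votes, total_seats=650):
    """Allocate seats in proportion to votes (largest-remainder method)."""
    total_votes = sum(votes.values())
    # Exact fractional entitlement of each party.
    quotas = {p: v * total_seats / total_votes for p, v in votes.items()}
    # Everyone first gets the whole-number part of their entitlement.
    seats = {p: int(q) for p, q in quotas.items()}
    leftover = total_seats - sum(seats.values())
    # Remaining seats go to the largest fractional remainders.
    by_remainder = sorted(quotas, key=lambda p: quotas[p] - seats[p],
                          reverse=True)
    for party in by_remainder[:leftover]:
        seats[party] += 1
    return seats


# Made-up vote totals, purely for illustration.
votes = {"Party A": 11_000_000, "Party B": 9_000_000,
         "Party C": 4_000_000, "Party D": 1_500_000}
seats = proportional_seats(votes)
print(seats)
# {'Party A': 280, 'Party B': 230, 'Party C': 102, 'Party D': 38}
```

Comparing such numbers with the seats actually won under the constituency system is what the bar chart visualises.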

 

I also generated a pie chart of how seats are allocated under the current system. Setting the colours of the pie slices could not be done in the same way as with the colours of the bars. I’ve no idea why; the latter method seemed much simpler and less bloated.

[Figure: election_wheel (pie chart of seats under the current system)]

Creating the webpage

Since (web) design is really not my forte, I thought I’d try out Bootstrap to help ensure everything looked halfway reasonable. This was the first time I’ve tried using one of these frameworks, but it was pretty straightforward to get going with for my simple use case. I used the panel components to separate out the different parts of the analysis, with a simple grid to present tables and graphs with simple analysis alongside. I was pleased that this seemed to look ok on mobile devices pretty much straight away, although the table is at risk of being cut off on smaller screens.

Summing up

Despite not having previously used Google Charts or Bootstrap, I found it pretty straightforward to generate this simple webpage. Both of these tools passed the test of being easy to get started with. Overall, I found this a good way to present the information, and it made it easy to share with friends.

Finally, putting any personal political viewpoints aside, I think it demonstrates how crazy the current system is.

1. Time was of the essence; I wanted to have this ready to share whilst interest in the results was high
2. Ok, so I could have created the images offline, but I was committed to the web endeavour! Plus this would give a little interactivity to the plots