Visualising the confirmed cases of COVID-19 in England with Mapbox

Preamble

In [9]:
import numpy as np                   # for multi-dimensional containers 
import pandas as pd                  # for DataFrames
import plotly.graph_objects as go    # for data visualisation
import plotly.io as pio              # to set shahin plot layout
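from geopy.geocoders import Nominatim              # for geocoding with OpenStreetMap data
from geopy.exc import GeocoderTimedOut             # raised when a geocoding request times out
from IPython.display import display, clear_output  # for the progress bar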

pio.templates['shahin'] = pio.to_templated(
    go.Figure().update_layout(
        legend=dict(orientation="h", y=1.1, x=.5, xanchor='center'),
        margin=dict(t=0, r=0, b=0, l=0),
    )
).layout.template
pio.templates.default = 'shahin'

Introduction

This section is similar to the previous one on Visualising the confirmed cases of COVID-19 in England with Scattergeo. This time, however, we will use more recent data and plot with Mapbox instead of Scattergeo.

I came across the Table of confirmed cases of COVID-19 in England provided by Public Health England and thought it would be useful to visualise it. I have no doubt a similar visualisation already exists, but I thought it would be an interesting exercise. The data used throughout this notebook was last updated at 9:00 am on 11 March 2020.

Note taken from the data source

Data may be subject to delays in case confirmation and reporting, as well as ongoing data cleaning.

Location is based on case residential postcode. When this is not available, NHS trust or reporting laboratory postcode is used. The data is therefore subject to change.

Counts for Isles of Scilly and City of London are combined with Cornwall and Hackney respectively for disclosure control.

Visualising the Table

The first step was to copy and paste the data from the table into a CSV, and then add two column headings, lat and lon. I have hosted the CSV so the results are easy to reproduce.

Let's load the data into a pandas.DataFrame and look at the first few records.

In [2]:
data = pd.read_csv('https://shahinrostami.com/datasets/phe_covid_uk_11032020.csv')
data.head()
Out[2]:
local_authority confirmed_cases lat lon
0 Barking and Dagenham 1 NaN NaN
1 Barnet 8 NaN NaN
2 Barnsley 2 NaN NaN
3 Bath and North East Somerset 0 NaN NaN
4 Bedford 0 NaN NaN

We have the local authority (we can consider these to be locations) and the number of confirmed cases. You can also see that our lat and lon columns are empty. Let's populate these by making requests through GeoPy.

First, we need an instance of the Nominatim (geocoder for OpenStreetMap data) object. We don't want to violate the usage policy, so we'll also pass in a user_agent.

In [3]:
geolocator = Nominatim(user_agent="covid_shahinrostami.com")
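The usage policy for the public Nominatim service also asks for no more than one request per second. Below is a minimal sketch of one way to respect that, assuming a hypothetical `throttled` helper (not part of GeoPy) that wraps any callable, such as `geolocator.geocode`, and sleeps between calls. The stand-in `fake_geocode` function and the short 0.2 second delay are only there so the example runs quickly and offline. GeoPy also ships its own `RateLimiter` in `geopy.extra.rate_limiter`, which may be preferable in practice.

```python
import time

def throttled(func, min_delay_seconds=1.0):
    """Hypothetical helper: wrap func so consecutive calls start at least min_delay_seconds apart."""
    last_call = [None]  # mutable cell holding the time the previous call started

    def wrapper(*args, **kwargs):
        if last_call[0] is not None:
            wait = min_delay_seconds - (time.monotonic() - last_call[0])
            if wait > 0:
                time.sleep(wait)
        last_call[0] = time.monotonic()
        return func(*args, **kwargs)

    return wrapper

# Stand-in for geolocator.geocode so the sketch runs without network access
def fake_geocode(query):
    return query.upper()

geocode = throttled(fake_geocode, min_delay_seconds=0.2)
start = time.monotonic()
results = [geocode(name) for name in ["Barnet, UK", "Barnsley, UK", "Bedford, UK"]]
elapsed = time.monotonic() - start
print(results, f"{elapsed:.2f}s")
```

In this notebook, `geocode = throttled(geolocator.geocode)` could then be called wherever `geolocator.geocode` is called directly.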

Let's see if we can get some location data using one of our local_authority items. To demonstrate, we'll use the one for my hometown, Bournemouth.

In [4]:
data.local_authority[10]
Out[4]:
'Bournemouth, Christchurch and Poole'

We'll now pass this into the geocode() method, appending "UK" to the string for disambiguation (France has a "Bury" too, for example).

In [5]:
location = geolocator.geocode(f"{data.local_authority[10]}, UK")
location
Out[5]:
Location(Bournemouth, Bournemouth, Christchurch and Poole, South West England, England, United Kingdom, (50.744673199999994, -1.8579577396350433, 0.0))

It looks like it's returned all the information we need. The latitude and longitude can also be read directly from the returned object's attributes.

In [6]:
print(location.latitude, location.longitude)
50.744673199999994 -1.8579577396350433

Now we need to do this for every local_authority in our dataset and fill in the missing lat and lon values.

In [7]:
for index, row in data.iterrows():
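    location = geolocator.geocode(f"{row.local_authority}, UK", timeout=100)

    # geocode() returns None when no match is found; leave lat/lon as NaN in that case
    if location is not None:
        data.loc[index, 'lat'] = location.latitude
        data.loc[index, 'lon'] = location.longitude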

    # None of the following code is required
    # I just wanted a progress bar!
    clear_output(wait = True)
    amount_unloaded = np.floor(((data.shape[0]-index)/data.shape[0])*25).astype(int)
    amount_loaded = np.ceil((index/data.shape[0])*25).astype(int)
    loading = f"Retrieving locations >{'|'*amount_loaded}{'.'*amount_unloaded}<"
    display(loading)

print("Done!")
'Retrieving locations >|||||||||||||||||||||||||<'
Done!
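Two things can go wrong in a loop like this: a request can time out, and `geocode()` returns `None` when it finds no match, so its result needs checking before `latitude` and `longitude` are read. The sketch below shows one way to guard against both, assuming a hypothetical `geocode_with_retry` helper (not part of GeoPy). It takes the geocoding callable and the exception type to retry on as arguments, so in this notebook they would be `geolocator.geocode` and the `GeocoderTimedOut` imported in the preamble. The `FakeLocation` and `flaky_geocode` stand-ins are assumptions that only exist so the example runs offline.

```python
import time

def geocode_with_retry(geocode, query, retry_on=Exception, retries=3, pause_seconds=1.0):
    """Hypothetical helper: call geocode(query), retrying when retry_on is raised.

    Returns (lat, lon), or (None, None) if there is no match or every attempt fails.
    """
    for attempt in range(retries):
        try:
            location = geocode(query)
        except retry_on:
            time.sleep(pause_seconds)  # pause briefly before trying again
            continue
        if location is None:           # no match for this query
            return (None, None)
        return (location.latitude, location.longitude)
    return (None, None)

# Offline stand-ins for a geocoder: the first call times out, later calls succeed
class FakeLocation:
    latitude, longitude = 50.7, -1.9

calls = {"n": 0}

def flaky_geocode(query):
    calls["n"] += 1
    if calls["n"] == 1:
        raise TimeoutError("simulated timeout")
    return FakeLocation() if "Bournemouth" in query else None

print(geocode_with_retry(flaky_geocode, "Bournemouth, UK", retry_on=TimeoutError, pause_seconds=0))
print(geocode_with_retry(flaky_geocode, "Nowhere, UK", retry_on=TimeoutError, pause_seconds=0))
```

Returning `(None, None)` rather than raising keeps the loop running, and the affected rows can be found afterwards by looking for missing lat and lon values.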

Now let's put this on the map! We'll go for a bubble plot on a map of the UK, where larger bubbles indicate more confirmed cases.

Note

To plot on Mapbox maps with Plotly you will need a Mapbox account and a public Mapbox Access Token. I've removed mine from mapbox_access_token in the cell below.

In [8]:
data['text'] = data['local_authority'] + '<br>Confirmed Cases ' + (data['confirmed_cases']).astype(str)

mapbox_access_token = "your_mapbox_access_token"

fig = go.Figure(go.Scattermapbox(
    lon = data['lon'],
    lat = data['lat'],
    mode = 'markers',
    marker = go.scattermapbox.Marker(
        size = data['confirmed_cases'] * 2,   # marker diameter: 2 px per confirmed case
        color = 'rgb(180,0,0)',
    ),
    text = data['text'],
))

fig.update_layout(
    autosize = True,
    hovermode = 'closest',
    mapbox = dict(
        accesstoken = mapbox_access_token,
        bearing = 0,
        center = {'lat': (data.lat.min() + data.lat.max())/2,
                'lon': (data.lon.min() + data.lon.max())/2},
        pitch = 0,
        zoom = 5,
        style = "basic", # try basic, dark, light, outdoors, or satellite.
    ),
)

fig.show()
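Because marker size is interpreted as a diameter by default, a bubble's area grows with the square of the case count, which can visually exaggerate the larger counts. Plotly markers also accept `sizemode='area'` together with a `sizeref` scaling factor. The sketch below computes `sizeref` with the formula suggested in Plotly's bubble chart documentation. The `area_sizeref` helper and the 40 pixel maximum are assumptions chosen for illustration, not values from the cell above.

```python
def area_sizeref(sizes, max_marker_px=40):
    """Hypothetical helper: sizeref such that the largest value is drawn
    max_marker_px wide when the marker uses sizemode='area'."""
    return 2.0 * max(sizes) / (max_marker_px ** 2)

confirmed_cases = [1, 8, 2, 0, 0]   # the first five counts from the table
sizeref = area_sizeref(confirmed_cases)
print(sizeref)
```

With the full dataset, the marker could then be defined as `go.scattermapbox.Marker(size=data['confirmed_cases'], sizemode='area', sizeref=area_sizeref(data['confirmed_cases']), color='rgb(180,0,0)')`, so that bubble area rather than diameter is proportional to the number of confirmed cases.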