Technical writing on data science, visualization, and artificial intelligence.

A collection of technical notebooks from my books and publications, presented in chronological order.

Pareto-Front Visualisation with PlotAPI

Dominance relations can be clearly visualised when working in a two-objective space. Let's do this with some arbitrary solutions. We'll use the `ParetoFront` visualisation from PlotAPI.

Node sorting

The nodes in a PlotAPI Sankey diagram can be sorted (e.g. alphabetically) within their columns.

Goal Rush

Everything you need to create beautiful, engaging, and interactive Goal Rush visualizations.

Details and thumbnails

The popup supports displaying additional details and thumbnails. This makes exploration even better with PlotAPI Chord.

Troubleshooting

Frequently asked questions and their answers.

Node start index

BarFight allows you to scroll up and down through the bars and set a starting point.

Split Chord

Everything you need to create beautiful, engaging, and interactive Split Chord visualizations.

Chord

Everything you need to create beautiful, engaging, and interactive Chord visualizations.

Pulled out arcs

PlotAPI Chord supports pulling out arcs as a way to highlight or group elements.

Showcase

Explore the different demonstrations of PlotAPI in this showcase.

WebGL or canvas

We may want to switch between the WebGL or D3.js canvas animations depending on our requirements. These have been configured to look the same.

Vertical pipes

Let's take a look at the vertical property of the Terminus diagram. Using it can give the impression that the particles are "falling" through the pipes.

Terminus

Everything you need to create beautiful, engaging, and interactive Terminus visualizations.

Stats styling

The stats panel that appears in a default Terminus diagram can be adjusted or hidden entirely. This gives us even more control over the presentation of our Terminus diagram.

Pixels per unit

Let's look at a unit-based approach to handling data with very high values or many sources and targets.

Pixels per percentage

Let's look at a percentage-based approach to handling data with very high values or many sources and targets.

Pixel size

The Terminus pixel size can be modified to improve the presentation of our diagram. This works as intended in both WebGL and D3.js canvas modes.

Pixel colors

The pixel color scheme can be set to one of many beautiful presets, or customized by supplying a list of colors (HEX, RGB, etc.).

Pipe styling

The Terminus pipes can be modified to improve the presentation of the diagram, giving control over the spacing and slope positions.

Pipe colors

The pipe color and opacity can be changed to improve the presentation of our Terminus diagram.

Pipe bundling

Let's take a look at the bundling parameter. This groups the inputs together, giving the impression that they are entering through the same pipe.

Pipe alignment

Pipes are aligned to the top-left by default. However, changing their alignment can often significantly improve the presentation of a Terminus diagram.

Layout properties

The title, width, height, margin, and position can be changed with these layout properties.

Dynamic title count

The Terminus diagram supports setting the title and also including the dynamic count of pixels currently on their journey.

Bar styling

The bars that appear at each terminus can be modified or hidden entirely. This gives us even more control over the presentation of our Terminus diagram.

Animation

The Terminus animation can be modified to improve the presentation of our diagram, giving us control over a pixel's journey duration, how many pixels are dispatched at once, and the delay before the animation begins.

Text formatting

The many text formatting options include multi-line label wrapping, size, color, and much more.

Sankey

Everything you need to create beautiful, engaging, and interactive Sankey visualizations.

Reverse gradients

Depending on our data, it may make more sense to reverse the gradient directions used for colouring the links.

Popup layout and format

The popup supports many customizations, including the text format, width, and disabling it entirely.

Node styling

The nodes in a PlotAPI Sankey diagram can be styled with many different parameters.

Node and link colors

The color scheme can be set to one of many beautiful presets, or customized by supplying a list of colors (HEX, RGB, etc.). Overrides also allow colouring specific nodes.

Node alignment

PlotAPI Sankey supports different layouts using the specification of node alignment.

Linked data table

PlotAPI Chord supports a linked data table. This means that as you hover over arcs and chords in the Chord diagram, a data table is filtered in real time to show more information.

Link opacity

The opacity of the links in the Sankey diagram can be changed. The opacity will change on mouse-over to highlight the selected connections.

Link background color

Until we mouse over a Sankey link, what we actually see is the background color. By default, this background color is set to be the same as the link foreground color, as defined by the `colors` parameter, which by default is `"gradient"`.

Layout properties

The title, width, height, margin, position, background color, and border can be changed with these layout properties.

Flat link colors

PlotAPI Sankey uses gradients to colour the links by default. This can be changed to use flat colors instead.

Animation

PlotAPI Sankey supports animations - both for interactions and for nice introductions to your visualisation.

Temporal format

Temporal formats can be used for order values and presentation.

Pie Fight

Everything you need to create beautiful, engaging, and interactive Pie Chart Race visualizations.

Node icons and colors

Let's take a look at how we can change the colours and icons for nodes in our Pie Fight diagram. These configurations override the color scheme.

Node text and values

Let's take a look at how we can change the presentation of the text and values in our Pie Fight diagram.

Theme colors

The color scheme can be set to one of many beautiful presets, or customized by supplying a list of colors (HEX, RGB, etc.).

Event mode

Event information can be displayed using different modes, including paused, running, and interactive.

Event information

Let's take a look at how we can display event content during our visualisation at different times. This can be useful for displaying additional information or images that are relevant to specific events.

Pareto Front

Everything you need to create beautiful, engaging, and interactive Pareto Front visualizations.

Temporal format

Temporal formats can be used for order values and presentation.

Node icons and colors

Let's take a look at how we can change the colours and icons for nodes in our Line Fight diagram. These configurations override the color scheme.

Node colors

The color scheme can be set to one of many beautiful presets, or customized by supplying a list of colors (HEX, RGB, etc.).

Line Fight

Everything you need to create beautiful, engaging, and interactive Line Chart Race visualizations.

Event mode

Event information can be displayed using different modes, including paused, running, and interactive.

Event information

Let's take a look at how we can display event content during our visualisation at different times. This can be useful for displaying additional information or images that are relevant to specific events.

Popup layout and format

The popup supports many customizations, including the text format, width, and disabling it entirely.

Heat Map

Everything you need to create beautiful, engaging, and interactive Heat Map visualizations.

Ticks

PlotAPI Chord supports displaying and formatting ticks around the diagram.

Text formatting

The many text formatting options include multi-line label wrapping, size, and color.

Reverse gradients

Depending on our data, it may make more sense to reverse the gradient directions used for colouring the ribbons.

Radius scales

The inner and outer radius scales of the Chord diagram can be adjusted. This will change the thickness of the arcs.

Popup layout and format

The popup supports many customizations, including the text format, width, and disabling it entirely.

Linked data table

PlotAPI Chord supports a linked data table. This means that as you hover over arcs and chords in the Chord diagram, a data table is filtered in real time to show more information.

Layout properties

The title, width, margin, position, and rotation can be changed with these layout properties.

Equal-sized arcs

The arcs of a Chord diagram can be customised to be of equal size, regardless of the value of their relationships.

Directed chords

The default behaviour of PlotAPI Chord is to represent both sides of a relationship with a single chord. However, it may be more suitable to use two different chords with arrows to indicate the direction of a relationship.

Curved labels

In some cases, enabling curved labels will improve the look of a PlotAPI Chord diagram.

Hidden diagonals

By default, Chord displays occurrences where a category is not related to another category. These values appear in the diagonal of the matrix. It may be desirable to hide (but not remove) these values.

Chord opacity

The opacity of the inner section of the Chord diagram, or the _chords_, can be changed. The opacity will change on `mouseover` to highlight the selected connections.

Chord colors

The color scheme can be set to one of many beautiful presets, or customized by supplying a list of colors (HEX, RGB, etc.).

Asymmetric relationships

By turning symmetric mode off, each end of a chord can be of different size, and it also changes the information presented in the popup.

Arc padding

Adjusting padding may not sound exciting, but in this case, it can have some interesting effects.

Arc numbers

PlotAPI Chord supports displaying the quantity associated with each arc as a label.

Animation

PlotAPI Chord supports animations - both for looping and for nice introductions to your visualisation.

Temporal format

Temporal formats can be used for order values and presentation.

Node visibility

Let's take a look at how we can change the number of bars that are visible by default.

Node icons and colors

Let's take a look at how we can change the colours and icons for nodes in our Bar Fight diagram. These configurations override the color scheme.

Node colors

The color scheme can be set to one of many beautiful presets, or customized by supplying a list of colors (HEX, RGB, etc.).

Layout properties

The title, width, height, margin, position, background color, and border can be changed with these layout properties.

Event mode

Event information can be displayed using different modes, including paused, running, and interactive.

Event information

Let's take a look at how we can display event content during our visualisation at different times. This can be useful for displaying additional information or images that are relevant to specific events.

Bar Fight

Everything you need to create beautiful, engaging, and interactive Bar Chart Race visualizations.

Animation

The BarFight animation can be modified to improve the presentation of our diagram, giving us control over animation durations, delays, and behavior.

Visualizations

Easily turn your data into engaging visualizations with PlotAPI's friendly interface — with or without code.

Getting started

Welcome to the PlotAPI documentation. It contains all the information you need to get started using PlotAPI.

Uploading to cloud

PlotAPI supports uploading visualizations to the cloud, where they can be embedded and shared privately, publicly, or within your Team.

Saving locally

PlotAPI supports saving visualizations locally. This includes saving to PNG, PDF, SVG, animated MP4, and interactive HTML.

REST API

PlotAPI supports more than just Python and Rust. With the REST API, you can create beautiful visualizations from any language.

Julia

How to use PlotAPI with Julia.

JavaScript

How to use PlotAPI with JavaScript.

HTTP

How to use PlotAPI with HTTP.

License activation

Activate your license and get access to Pro and Business features with the instructions below.

Installation

Get access to the API with the instructions below. Libraries are available for Python and Rust. For everything else, there's the REST API.

Displaying in notebook

PlotAPI supports displaying interactive visualizations directly within Jupyter Notebooks. This includes Jupyter Lab, Jupyter Notebook Classic, Google Colab, and more.

Desktop Browsers Market Share with Pie Fight

In this notebook we're going to use PlotAPI Pie Fight to visualise desktop browser market share over time. We'll use Python, but PlotAPI can be used from any programming language.

Pokemon Trends with Bar Fight

In this notebook we're going to use PlotAPI Bar Fight to visualise Pokémon search trends over time. We'll use Python, but PlotAPI can be used from any programming language.

Pokemon Types with Chord

In this notebook we're going to use PlotAPI Chord to visualise the co-occurrences between Pokémon types. We'll use Python, but PlotAPI can be used from any programming language.

Animal Crossing Villager Species and Personality

In this notebook we're going to use PlotAPI Chord to visualise the co-occurrences between the species and personality of Animal Crossing villagers. We'll use Python, but PlotAPI can be used from any programming language.

Apple 2021 Q3 Results with Sankey

In this notebook we're going to use PlotAPI Sankey to visualise some of Apple's filings for the third quarter of 2021.

Apple 2021 Q4 Results with Sankey

In this notebook we're going to use PlotAPI Sankey to visualise some of Apple's filings for the fourth quarter of 2021.

League of Legends Classes

In this notebook we're going to use PlotAPI Chord to visualise the co-occurrences between League of Legends classes. We'll use Python, but PlotAPI can be used from any programming language.

Pokemon Types with Heat Map

In this notebook we're going to use PlotAPI Heat Map to visualise the co-occurrences between Pokémon types. We'll use Python, but PlotAPI can be used from any programming language.

Animal Crossing Villager Style

In this notebook we're going to use PlotAPI Chord to visualise the style co-occurrences of Animal Crossing villagers. We'll use Python, but PlotAPI can be used from any programming language.

IMDb Top 1000 with Chord

In this notebook we're going to use PlotAPI Chord to visualise the co-occurrences of genres in the IMDb "Top 1000" (Sorted by IMDb Rating Descending). We'll use Python, but PlotAPI can be used from any programming language.

Degree Classification by Graduate Gender with Terminus

In this notebook we're going to use PlotAPI Terminus to visualise how degree classes vary by graduates' gender. We'll use Python, but PlotAPI can be used from any programming language.

Degree Classification by Graduate Ethnicity with Terminus

In this notebook we're going to use PlotAPI Terminus to visualise how degree classes vary by graduates' ethnicity. We'll use Python, but PlotAPI can be used from any programming language.

Global Email Spam with Terminus

In this notebook we're going to use PlotAPI Terminus to visualise the average daily email & spam volume for August 2021.

DataFrame to Samples Dict

Data often needs wrangling prior to visualisation. Let's take a look at how we can transform our data from a DataFrame to a Dictionary for PlotAPI.

Generating Background Colours

Choosing the right colours can often make all the difference when adding finishing touches to a visualization. Finding colours that work together can be difficult, even more so when we need to find colours for hundreds of elements. So let's see how we can generate some colours with code!

Creating a Dataset by Web Scraping

Data analysis and visualization tutorials often begin with loading an existing dataset. Sometimes, especially when the dataset isn't shared, it can feel like being taught how to draw an owl. So let's take a step back and think about how we may create a dataset for ourselves.

Path Construction with 2D Arrays

Whilst you could say that it's possible to draw a zigzag using multiple rect elements at different positions and rotations, it is certainly an infeasible and inefficient exercise. This is where the SVG path element comes in. A path describes an outline of some shape that can be filled and/or stroked.

CVSS Exploratory Data Analysis

A comparison between CVSS-2 and CVSS-3 using Python and the data science stack.

Colour Transitions

Whilst we could use the string literal magenta as our argument to change the fill colour, we'll use its hexadecimal equivalent, #ff00ff, instead. This will give us more opportunity should we want to tinker with the colours. We could also use an RGB value, e.g. rgb(255, 0, 255).
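
To make the relationship between the hexadecimal and RGB notations concrete, here is a minimal Python sketch (not the D3.js code from the notebook itself). The helper names `hex_to_rgb` and `rgb_to_css` are hypothetical and exist only for this illustration.

```python
def hex_to_rgb(hex_colour: str) -> tuple:
    """Convert a '#rrggbb' string into an (r, g, b) tuple of 0-255 integers."""
    digits = hex_colour.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected six hex digits, got {hex_colour!r}")
    # Each pair of hex digits encodes one colour channel.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_css(rgb: tuple) -> str:
    """Format an (r, g, b) tuple as a CSS rgb() string."""
    r, g, b = rgb
    return f"rgb({r}, {g}, {b})"


print(rgb_to_css(hex_to_rgb("#ff00ff")))  # rgb(255, 0, 255)
```

Working with the numeric channels is what makes tinkering easy: each channel can be nudged independently before converting back to a string.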

Animated Transitions with Easing

D3.js transitions support easing functions which can change the speed at which an animation progresses throughout its duration. There are many different easing functions available in D3.js. Examples, descriptions, and visualisations of each one can be found in the API reference.

Looping Animated Transitions

We've introduced this new invocation of the transition.on(typenames, listener) function. This can add a listener function to the selection, which is invoked based on the event type. We've used an event type of end because we want our transition to end before calling the next one.

Animated Transitions

Much like selections, transitions can be used to modify attributes and styles. The difference is that whilst selections apply the changes instantly, transitions apply the changes gradually (and smoothly) over a specified duration.

Grouping Elements

The g SVG element is a container used to group other SVG elements. Transformations applied to the g element are performed on its child elements, and its attributes are inherited by its children. We can create a group element with D3.js by appending a g element using any selection.

Attributes and Styles

We can set CSS style properties by invoking .style(name, value) on the selection, and set SVG attributes by invoking .attr(name, value). In both cases, the first argument should be the name of the property or attribute we want to set, and the second argument should be the value we want to set it to.

Selections and Selecting Elements

Besides d3.create() and selection.append(), which return selections, we can use the d3.select() and d3.selectAll() functions to return selections by matching a CSS selector.

Creating Shape Elements

To create a circle element with D3.js we can invoke the append(name) function on our svg selection and pass in the name of the element. In this case, we're passing in circle as our argument for the name parameter.

Creating an Empty SVG

To create an element with D3.js we invoke the create(name) function and pass in the name of the element. In this case, we're passing in svg as our argument for the name parameter.

Software and Page Setup

It helps to have the right tools and templates available so that we can focus on the examples and exercises. Getting the hang of D3.js may involve plenty of tinkering, saving, and refreshing of HTML documents - don't be discouraged if your visualisations don't look right the first, second, or third time!

Preface

There is a wealth of cookbook-style resources available for D3.js visualisations, meaning you can create some interesting visualisations by copying some code and passing in your data. However, what this book aims to be is a practical journey through the many components of D3.js. By the end of this book, we want to be able to create new visualisations from the ground up and modify the behaviour of existing ones.

Box Plots at the Olympics

We work towards illustrating the age and height of athletes grouped by games in the 120 years of Olympic history.

Getting Started with PlotAPI and Rust

Generate visualisations through the API from Python, Rust, and more. Super-charge your notebooks with inline visualisations!

Olympic Weightlifting Medals with Stacked Bar Charts

We're going to use 120 years of Olympic history to create a visualisation. Let's set our sights on something that illustrates the distribution of Olympic medals awarded for the weightlifting sport.

Video Game Publishers and Genres with SplitChord

In this notebook we're going to use PlotAPI SplitChord to visualise the co-occurrences between genres and publishers in video game titles. We'll use Python, but PlotAPI can be used from any programming language.

Top Olympic Medal Earning Countries

In this notebook we're going to use PlotAPI Chord to visualise the co-occurrences between countries and medals earned in the Olympic Games. We'll use Python, but PlotAPI can be used from any programming language.

League of Legends World Championship

In this notebook we're going to use PlotAPI Chord to visualise the matches between teams throughout the League of Legends World Championship 2019. We'll use Python, but PlotAPI can be used from any programming language.

StamiStudios Panels and Colours

In this notebook we're going to use PlotAPI Chord to visualise the co-occurrences between panels and colours purchased for the StamiStudios Everyday Ita Bag. We'll use Python, but PlotAPI can be used from any programming language.

Visualisation of Co-occurring Types

We're going to use the Complete Pokemon Dataset to visualise the co-occurrence of Pokémon types from generations one to eight. We'll make this happen using a chord diagram.

Interactive Chord Diagrams

Chord diagrams are useful when trying to convey relationships between different entities, and they can be beautiful and eye-catching.

NDArray Index Arrays and Mask Index Arrays

NumPy has many features that Rust's NDArray doesn't have yet, e.g. index arrays and mask index arrays. However, there is more than one way to index an array!

Unique Array Elements and their Frequency

Let's demonstrate a few approaches to identifying the unique elements in an array, counting the number of unique elements, and the frequency of these unique elements.
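
As one possible approach, sketched here with only Python's standard library rather than an array library, `collections.Counter` gives us the unique elements, how many there are, and how often each occurs. The function name `unique_with_counts` is hypothetical.

```python
from collections import Counter


def unique_with_counts(values):
    """Return (sorted unique elements, number of unique elements, frequency map)."""
    counts = Counter(values)   # maps each element to its frequency
    unique = sorted(counts)    # iterating a Counter yields its distinct keys
    return unique, len(unique), dict(counts)


unique, n_unique, frequency = unique_with_counts([3, 1, 2, 3, 3, 1])
print(unique)      # [1, 2, 3]
print(n_unique)    # 3
print(frequency)   # {3: 3, 1: 2, 2: 1}
```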

API Requests and JSON Data

We'll often need to make HTTP requests to retrieve data - let's see one common approach in Python.

Software Setup

Preface

Better Output for 2D Arrays

Let's improve the presentation of the cell output for our arrays. This will generally improve the presentation of our notebooks.

Finishing Touches for Visualisation

Let's take our Plotly workaround a step further to its final destination - a reusable function that we can use throughout our analyses.

Better Plotting with Plotly

Let's improve our workaround for data visualisation with Plotly for Rust in Jupyter notebooks.

Plotting with Plotly

How to embed Plotly visualisations in a Jupyter Notebook with a small workaround.

Plotting with Plotters

I had originally planned to use Plotters for all the graphing in this book. However, shortly after finding Plotters, I found out that a Rust library had enabled Plotly support.

Setup Anaconda, Jupyter, and Rust

We are taking a practical approach in the following sections. As such, we need the right tools and environments available in order to keep up with the examples and exercises.

Preface

The Rust programming language has become a popular choice amongst software engineers since its release in 2010. Besides being something new and interesting, Rust promised to offer exceptional performance and reliability.

YAML for Configuration Files

What this section is most interested in is using YAML for configuration files, enabling us to extract parameters that we use within our programs so that they can be separated.

MNE DataFrame Scaling

MNE DataFrame scaling with Pandas.

Time Domain Analysis

Plotting multiple channels in the time domain.

Topographical Plots

Plotting topographical maps with MNE.

Fast Fourier Transform

How to plot multiple different sine waves onto different subplots.

Algorithm Performance and Statistical Significance

Let's test the significance of our pairwise comparison. The significance test you select depends on the nature of your data-set and other criteria. We will use the Wilcoxon signed-rank test.

Summing Sine Waves

We move on from creating and plotting individual sine waves, to summing them and plotting our new and more complicated wave.
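
A minimal sketch of the idea, using only the standard library: each component wave is sampled at the same time points and the samples are added together. The function name `sum_of_sines`, the `(frequency, amplitude)` pair format, and the default sample rate are assumptions made for this illustration.

```python
import math


def sum_of_sines(components, duration_s=1.0, sample_rate_hz=100):
    """Sample the sum of several sine waves.

    components: iterable of (frequency_hz, amplitude) pairs.
    Returns (times, values) as two lists of equal length.
    """
    components = list(components)
    n_samples = int(duration_s * sample_rate_hz)
    times = [n / sample_rate_hz for n in range(n_samples)]
    values = [
        sum(a * math.sin(2 * math.pi * f * t) for f, a in components)
        for t in times
    ]
    return times, values


# A 1 Hz wave plus a weaker 3 Hz wave.
times, values = sum_of_sines([(1, 1.0), (3, 0.5)])
```

The resulting `times` and `values` lists can be passed to any line-chart function to see the combined wave.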

Sine Waves in the Time Domain

Let's look at a simple sine wave, how to create one in Python and how to visualise it in the time domain using a line chart.
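
As a rough sketch of the first step, a sine wave can be sampled with nothing more than the `math` module. The function name `sine_wave` and its default parameters are assumptions for this example, and plotting is left to whichever charting library you prefer.

```python
import math


def sine_wave(frequency_hz, duration_s=1.0, sample_rate_hz=100, amplitude=1.0):
    """Sample amplitude * sin(2*pi*f*t) at evenly spaced time points."""
    n_samples = int(duration_s * sample_rate_hz)
    times = [n / sample_rate_hz for n in range(n_samples)]
    values = [amplitude * math.sin(2 * math.pi * frequency_hz * t) for t in times]
    return times, values


# One second of a 1 Hz wave: a single full cycle.
times, values = sine_wave(frequency_hz=1)
```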

Sample Size Sufficiency

Before conducting a comparison between algorithms we need to determine whether our sample size will be sufficient, i.e. is our sample size large enough to support our hypothesis?

Using a Framework to Compare Algorithm Performance

Let's use the Platypus framework to compare the performance of the Non-dominated Sorting Genetic Algorithm II (NSGA-II) and the Pareto Archived Evolution Strategy (PAES).

Using a Framework to Generate Results

Let's demonstrate how we can use a popular multi-objective optimisation algorithm, NSGA-II, to approximate multiple trade-off solutions to the DTLZ2 test problem.

Contributing Hypervolume Indicator

The Contributing Hypervolume (CHV) indicator is a population sorting mechanism based on an adaptation of the hypervolume indicator.

Hypervolume Indicator

The hypervolume indicator is a performance metric for indicating the quality of a non-dominated approximation set.
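
For intuition, here is a minimal sketch of the hypervolume in the two-objective case, assuming both objectives are minimised and a user-chosen reference point. It is a simple sweep over sorted points, not the implementation used by any particular framework, and the name `hypervolume_2d` is hypothetical.

```python
def hypervolume_2d(points, reference):
    """Area dominated by `points` and bounded by `reference` (minimisation)."""
    ref_x, ref_y = reference
    # Points that are not strictly better than the reference contribute nothing.
    inside = sorted((x, y) for x, y in points if x < ref_x and y < ref_y)

    area = 0.0
    best_y = ref_y
    for x, y in inside:
        # Sweeping in ascending x, a point only adds area if it improves on
        # the best second objective seen so far; otherwise it is dominated.
        if y < best_y:
            area += (ref_x - x) * (best_y - y)
            best_y = y
    return area


print(hypervolume_2d([(1, 3), (2, 2), (3, 1)], reference=(4, 4)))  # 6.0
```

A larger value indicates that the approximation set covers more of the objective space bounded by the reference point.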

Non-Dominated Sorting

Non-dominated sorting is important during the selection stage of an Evolutionary Algorithm because it allows us to prioritise the selection of solutions based on their dominance relations with respect to the rest of the population.
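
To illustrate, here is a deliberately naive sketch that peels off one non-dominated front at a time, assuming every objective is minimised. It is not the fast non-dominated sort used by NSGA-II, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(population):
    """Return a list of fronts, each a list of indices into `population`."""
    remaining = list(range(len(population)))
    fronts = []
    while remaining:
        # A solution belongs to the current front if nothing left dominates it.
        front = [
            i for i in remaining
            if not any(dominates(population[j], population[i])
                       for j in remaining if j != i)
        ]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts


population = [(1, 4), (2, 2), (4, 1), (3, 3), (4, 4)]
print(non_dominated_sort(population))  # [[0, 1, 2], [3], [4]]
```

Solutions in earlier fronts would be prioritised during selection.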

Pareto Optimality and Dominance Relations

We often look for a single solution with the best objective value, but this is not possible in multi-objective problems because they often involve conflicts between multiple objectives.
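
The dominance relation itself is short enough to sketch directly. This version assumes every objective is to be minimised, and the function name `dominates` is chosen for this illustration.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates objective vector b.

    Assumes minimisation: a must be no worse than b in every objective
    and strictly better in at least one.
    """
    if len(a) != len(b):
        raise ValueError("objective vectors must have the same length")
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


print(dominates((1, 2), (2, 3)))  # True
print(dominates((1, 3), (2, 2)))  # False
print(dominates((2, 2), (1, 3)))  # False
```

The last two calls show the conflict in action: neither solution dominates the other, so both are valid trade-offs.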

Single Objective Problems: Rastrigin

In single-objective problems, the objective is to find a single solution which represents the global optimum in the entire search space. Let's take the Rastrigin function as an example.
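
For reference, a minimal sketch of the Rastrigin function in its standard form, f(x) = 10n + Σ(xᵢ² − 10·cos(2πxᵢ)), whose global optimum is f(0, ..., 0) = 0:

```python
import math


def rastrigin(x, a=10.0):
    """Rastrigin function for a sequence of decision variables x."""
    return a * len(x) + sum(xi ** 2 - a * math.cos(2 * math.pi * xi) for xi in x)


print(rastrigin([0.0, 0.0]))  # 0.0
```

The cosine term creates many local optima around the global one, which is what makes this function a common test for optimisers.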

Single Objective Problems

In single-objective problems, the objective is to find a single solution which represents the global optimum in the entire search space. Let's take the Sphere function as an example.
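
For reference, the Sphere function is simply the sum of squared decision variables, with its global optimum of 0 at the origin. A minimal sketch:

```python
def sphere(x):
    """Sphere function: the sum of the squares of the decision variables."""
    return sum(xi ** 2 for xi in x)


print(sphere([0, 0, 0]))  # 0
print(sphere([1, 2, 3]))  # 14
```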

Using a Framework and the ZDT Test Suite

Let's use the Platypus implementation of ZDT1, which will save us from having to implement it in Python ourselves.

Population Initialisation

Before the main optimisation process can begin, we need to complete the initialisation stage of the algorithm. There are many schemes for generating the initial population - let's start simple.
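
One simple scheme is to sample every decision variable uniformly at random within its bounds. The sketch below assumes that scheme, and the function name and parameters are hypothetical rather than taken from the notebook.

```python
import random


def initialise_population(size, lower_bounds, upper_bounds, seed=None):
    """Create `size` solutions, each sampled uniformly within the given bounds."""
    if len(lower_bounds) != len(upper_bounds):
        raise ValueError("bounds must have the same length")
    rng = random.Random(seed)  # a seed makes the initial population reproducible
    return [
        [rng.uniform(lo, hi) for lo, hi in zip(lower_bounds, upper_bounds)]
        for _ in range(size)
    ]


# 20 solutions with three decision variables, each in [0, 1].
population = initialise_population(20, [0.0] * 3, [1.0] * 3, seed=42)
```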

Synthetic Objective Functions and ZDT2

We will be using a synthetic test problem throughout this notebook called ZDT2. It is part of the ZDT test suite, consisting of six different two-objective synthetic test problems.

Synthetic Objective Functions and ZDT1

We will be using a synthetic test problem throughout this notebook called ZDT1. It is part of the ZDT test suite, consisting of six different two-objective synthetic test problems.
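
As a point of reference, here is a minimal sketch of ZDT1 in its usual formulation, where each decision variable lies in [0, 1] and both objectives are minimised. It is written from the standard definition rather than copied from the notebook.

```python
import math


def zdt1(x):
    """Evaluate ZDT1 for decision variables x, each assumed to lie in [0, 1]."""
    if len(x) < 2:
        raise ValueError("ZDT1 needs at least two decision variables")
    f1 = x[0]
    # g depends only on the remaining variables and equals 1 on the Pareto front.
    g = 1.0 + 9.0 * sum(x[1:]) / (len(x) - 1)
    f2 = g * (1.0 - math.sqrt(f1 / g))
    return f1, f2


# A Pareto-optimal solution: all variables after the first are zero.
print(zdt1([0.25] + [0.0] * 29))  # (0.25, 0.5)
```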

Objective Functions

Objective functions are perhaps the most important part of any Evolutionary Algorithm, whilst simultaneously being the least important part too.

Block Diagrams in Notebooks

Throughout this book, we will programmatically generate block diagrams to illustrate concepts and processes. Let's see how they're generated.

Python Crash Course

This crash course assumes that you already have some programming experience, but perhaps none with Python.

Software Setup

We are taking a practical approach in the following sections. As such, we need the right tools and environments available in order to keep up with the examples and exercises.

Preface

Evolutionary Algorithms (EAs) are a fascinating class of algorithms for meta-heuristic optimisation. Perhaps the most difficult question to answer is where do we start? There is so much to cover, and many potential starting points.

Class Imbalance and Oversampling

Let's take a quick look at the problem of imbalanced datasets and one way to address it with oversampling.

Results Analysis

Supported by figures and statistics, we will have a look at how our solution performed and discuss anything interesting about the results.

Data Wrangling and Experiment Implementation

We will use the Keras API on top of TensorFlow to implement our experiment. All code will be in Python, and at the time of publishing everything is guaranteed to work within a Kaggle Notebook.

Experimental Design

It's important to know what we're looking for, how we're going to use our dataset, what algorithms we will be employing, and how we will determine whether the performance of our approach is successful.

Exploratory Data Analysis

We present and discuss a dataset selected for our machine learning experiment. This includes some analysis and visualisations to give us a better understanding of what we're dealing with.

Getting Started with Kaggle

Let's go through the process of signing up to Kaggle, firing up a Kernel, and executing a Hello World program in Python.

Pairwise Comparison

Pairwise comparison of data-sets is very important. It allows us to compare two sets of data and make decisions based on the outcome.

Standard Deviation

A brief recap on calculating the standard deviation with and without NumPy.
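
A minimal sketch of the manual calculation, checked against Python's built-in `statistics` module (used here in place of NumPy so the example has no dependencies). It computes the population standard deviation, which is also what `numpy.std` returns by default.

```python
import math
import statistics


def population_std(values):
    """Population standard deviation: the root of the mean squared deviation."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance)


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_std(data))     # 2.0
print(statistics.pstdev(data))  # 2.0
```

For a sample rather than a whole population, divide by `len(values) - 1` instead, which corresponds to `statistics.stdev`.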

Software Setup

We are taking a practical approach in the following sections. As such, we need the right tools and environments available in order to keep up with the examples and exercises.