Hypervolume Indicator

Preamble

In [1]:
import numpy as np                   # for multi-dimensional containers
import pandas as pd                  # for DataFrames
import plotly.graph_objects as go    # for data visualisation
import plotly.io as pio              # to set shahin plot layout
import platypus as plat              # multi-objective optimisation framework
import pygmo as pg                   # multi-objective optimisation framework
import plotly.express as px

pio.templates['shahin'] = pio.to_templated(
    go.Figure().update_layout(
        legend=dict(orientation="h", y=1.1, x=.5, xanchor='center'),
        margin=dict(t=0, r=0, b=40, l=40),
    )
).layout.template
pio.templates.default = 'shahin'

Introduction

In this section we're going to take a look at how to calculate the hypervolume indicator value for a population of solutions. The hypervolume indicator (or $s$-metric) is a performance metric for indicating the quality of a non-dominated approximation set, introduced by [1] where it is described as the "size of the space covered or size of dominated space". It can be defined as:

$$ HV\big(f ^{ref}, \mathrm{X}\big) = \Lambda \left( \bigcup_{\mathrm{X}_n \in \mathrm{X} } \Big[ f_{1}(\mathrm{X}_n), f_{1}^{ref} \Big] \times \dots \times \Big[ f_{m}(\mathrm{X}_n), f_{m}^{ref} \Big] \right) $$

where $HV\left(f ^{ref}, \mathrm{X}\right)$ gives the size of the space covered by an approximation set $\mathrm{X}$, $f^{ref} \in \mathbb{R}^m$ is a chosen reference point in the $m$-dimensional objective space, and $\Lambda \left(.\right)$ is the Lebesgue measure. This can be illustrated in two-dimensional objective space (to allow for easy visualisation) with a population of three solutions.

[Figure: Hypervolume Indicator]
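As a minimal sketch of the definition above for the two-objective minimisation case, the union of rectangles can be measured with a sweep over the points sorted by $f_1$. The function name `hypervolume_2d` and the three example points are illustrative assumptions, not part of Platypus or PyGMO.

```python
import numpy as np

def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref` (two objectives, minimisation)."""
    pts = np.asarray(points, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # Only points that are strictly better than the reference point contribute.
    pts = pts[np.all(pts < ref, axis=1)]
    # Sort by f1 ascending, breaking ties by f2 ascending.
    pts = pts[np.lexsort((pts[:, 1], pts[:, 0]))]
    area = 0.0
    best_f2 = ref[1]
    for f1, f2 in pts:
        # A point adds a new rectangle only if it improves on the best f2 seen so far.
        if f2 < best_f2:
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return float(area)

# Hypothetical population of three mutually non-dominated solutions.
population = [[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]]
reference = [4.0, 4.0]
print(hypervolume_2d(population, reference))  # 6.0
```

The three rectangles have areas 3, 4 and 3, but they overlap, so the measure of their union is 6 rather than 10. Dominated points fall inside the union and add nothing.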

The hypervolume indicator is appealing because it is compatible with any number of problem objectives and requires no prior knowledge of the true Pareto-optimal front. This is important when working with real-world problems which have not yet been solved. The hypervolume indicator is currently used in the field of multi-objective optimisation as both a proximity and diversity performance metric, as well as in the decision-making process.

Unlike dominance-based criteria, which require only two solutions for performing a comparison (and can therefore be used on an ordinal scale), the HV indicator requires a reference vector (i.e. it requires the objectives to be measured on an interval scale). When used for pairwise or multiple comparison of optimisation algorithms, this reference vector must be the same, otherwise the resulting HV indicator values are not comparable. The reference vector can be set to sufficiently large values for each problem objective so that every objective vector in every approximation set lies within it.
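One possible way to obtain a common reference vector for several approximation sets is sketched below. It takes the worst value observed for each objective across all sets and adds an offset. The helper name `shared_reference_point` and the default `margin` of 1.0 are assumptions made for illustration, and minimisation is assumed.

```python
import numpy as np

def shared_reference_point(approximation_sets, margin=1.0):
    """Return one reference vector that bounds every objective vector in every set."""
    # Stack all objective vectors from all sets into a single (rows, objectives) array.
    stacked = np.vstack([np.asarray(s, dtype=float) for s in approximation_sets])
    # Worst (largest) value per objective, pushed outwards by the margin.
    return stacked.max(axis=0) + margin

# Hypothetical approximation sets produced by two different algorithms.
set_a = [[1.0, 5.0], [2.0, 3.0]]
set_b = [[0.5, 6.0], [4.0, 1.0]]
print(shared_reference_point([set_a, set_b]))
```

Using the same returned vector for both sets keeps their HV indicator values comparable.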

Calculating the Hypervolume Indicator with Platypus

Let's define some necessary variables before invoking the Platypus implementation of the hypervolume indicator algorithm. We will use the ZDT1 test function with $\mathrm{D}=30$ design variables and a population size of $\mathrm{N}=100$ throughout this example.

In [2]:
problem = plat.ZDT1()
D = 30
N = 100

With these variables defined, we will now move on to generating our initial population. We will be using Platypus Solution objects for this, which we will initialise with random problem variables, evaluate, and then append to a list named solutions.

In [3]:
solutions = []

for i in range(N):
    solution = plat.Solution(problem)
    solution.variables = np.random.rand(D)
    solution.evaluate()
    solutions.append(solution)

Let's print out the variables and objectives for the first solution in this list to see what they look like.

In [4]:
print(f"Design variables:\n {solutions[0].variables}")
print(f"Objective values:\n {solutions[0].objectives}")
Design variables:
 [0.60334361 0.87787542 0.03142044 0.14487709 0.92481308 0.95661311
 0.33349403 0.37519477 0.03984425 0.01261005 0.67984533 0.2307125
 0.96553172 0.15568727 0.50446299 0.51060793 0.56343829 0.59170951
 0.32728624 0.02539182 0.92936501 0.32965463 0.95077326 0.80808745
 0.10041791 0.9403275  0.61565891 0.39527713 0.56914812 0.59565309]
Objective values:
 [0.6033436096584304, 3.674672751755749]
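For reference, the first objective value above equals the first design variable, which follows from the standard ZDT1 definition. The following is a minimal NumPy sketch of that definition, written independently of Platypus; the function name `zdt1` is an illustrative assumption.

```python
import numpy as np

def zdt1(x):
    """Evaluate the two ZDT1 objectives for a vector of design variables in [0, 1]."""
    x = np.asarray(x, dtype=float)
    f1 = x[0]
    # g depends on all design variables except the first.
    g = 1.0 + 9.0 * np.sum(x[1:]) / (len(x) - 1)
    f2 = g * (1.0 - np.sqrt(f1 / g))
    return [float(f1), float(f2)]

# Random design vector with D = 30 variables (the seed is arbitrary).
rng = np.random.default_rng(0)
print(zdt1(rng.random(30)))
```

For random design variables $g$ is well above $1$, which is why $f_2$ for a random solution is far from the Pareto-optimal front, where $g = 1$.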

Now that we have a population of solutions stored in the solutions variable, we can prepare an instance of the platypus.indicators.Hypervolume class (accessed here as plat.indicators.Hypervolume) with the desired reference point for the hypervolume indicator calculation. For ZDT1, the reference point is typically $\langle11,11\rangle$.

In [5]:
hyp = plat.indicators.Hypervolume(minimum=[0, 0], maximum=[11, 11])

We can now use this hyp object to calculate the hypervolume indicator for any population.

Note

The Platypus implementation of the hypervolume indicator requires either a minimum and a maximum point, or a reference_set (not the same as the hypervolume reference point). Normally, a hypervolume indicator algorithm would only require a single vector that defines the reference point $f ^{ref}$. In the case of Platypus, $f ^{ref}$ actually corresponds to maximum, but Platypus also forces us to provide a vector for minimum, which we have set to $\langle0, 0\rangle$.

Let's calculate the hypervolume indicator value for the population of solutions we created above and named solutions.

In [6]:
print(f"Hypervolume indicator value: {hyp.calculate(solutions)}")
Hypervolume indicator value: 0.7554161688220257

We now have a single hypervolume indicator value that we can use to gauge and compare the performance of our population. The higher the value, the "better" the quality of the population under this indicator.

Calculating the Hypervolume Indicator with PyGMO

We can also use a different framework's implementation of the hypervolume indicator on our population. We should be expecting the same value, but this is a good exercise to learn how to use a different framework, and perhaps to check that they do indeed arrive at the same value.

This time we will use the PyGMO framework. PyGMO's hypervolume indicator function can work with a few different data-types, including numpy.array(). We have previously moved our Platypus solutions to a pandas.DataFrame (which can easily be output as a numpy.array()). Let's begin by creating a new DataFrame with the columns f1 and f2 which will be used to store our objective values for each solution.

In [7]:
solutions_df = pd.DataFrame(index=range(N),columns=['f1','f2']).astype(float)
solutions_df.head()
Out[7]:
f1 f2
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN

We can see that we've also defined an index range that covers the number of solutions in our population, $\mathrm{N}=100$. This means we have $100$ rows ready, but their values are initialised to NaN (Not A Number), which in this case simply indicates missing data.

Let's use a loop to iterate through our solutions list of $100$ solutions and assign the desired values to the corresponding row in our solutions_df DataFrame.

In [8]:
for i in range(N):
    # index by (row, column) label so the assignment writes to solutions_df itself
    solutions_df.loc[i, 'f1'] = solutions[i].objectives[0]
    solutions_df.loc[i, 'f2'] = solutions[i].objectives[1]
        
solutions_df.head()
Out[8]:
f1 f2
0 0.603344 3.674673
1 0.806312 2.628754
2 0.059896 5.468507
3 0.322516 4.536322
4 0.928396 2.653431
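As an alternative sketch, the same DataFrame can be built in one step from a list of objective vectors, without pre-allocating NaN rows. To keep the snippet runnable on its own, it uses stand-in objects with an objectives attribute; these are hypothetical substitutes for Platypus Solution objects, and their values are made up.

```python
from types import SimpleNamespace

import pandas as pd

# Hypothetical stand-ins for evaluated Platypus solutions.
solutions = [
    SimpleNamespace(objectives=[0.6, 3.7]),
    SimpleNamespace(objectives=[0.8, 2.6]),
    SimpleNamespace(objectives=[0.1, 5.5]),
]

# One row per solution, one column per objective.
solutions_df = pd.DataFrame(
    [s.objectives for s in solutions],
    columns=['f1', 'f2'],
)

# The underlying values as a NumPy array of shape (number of solutions, 2).
objectives_array = solutions_df[['f1', 'f2']].values
print(solutions_df)
```

With real Platypus solutions the list comprehension stays the same, since only the objectives attribute is read.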

We can see our DataFrame now contains the desired values. We can now easily get this data as a numpy.array() to feed into PyGMO's hypervolume indicator object constructor.

In [9]:
hyp = pg.hypervolume(solutions_df[['f1','f2']].values)

Now we can invoke the compute() method on our hypervolume object and pass in the reference point to calculate the value.

In [11]:
hyp.compute([11, 11])
Out[11]:
91.40535642746507

At first glance, this value appears to differ from the one calculated by Platypus, so we might conclude that we've done something wrong or that at least one of the implementations contains a mistake. In fact, one of them (Platypus) has simply normalised its output. We can do the same with the output from PyGMO by dividing the hypervolume indicator value by the product of the reference point's components, which, with a minimum of $\langle0, 0\rangle$, is the area of the box spanned by minimum and maximum.

In [13]:
hyp.compute([11, 11]) / np.prod([11, 11])
Out[13]:
0.7554161688220253

Now we can see that both frameworks have arrived at the same hypervolume indicator value, up to floating-point rounding in the last digits.
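The relationship between the two values can be checked on a small hand-made example. The sketch below assumes that the normalisation is a min-max scaling of each objective onto $[0, 1]$, followed by a hypervolume calculation against the reference point $\langle1, 1\rangle$. That reading of the Platypus behaviour is an assumption made here for illustration, and `hypervolume_2d` is an illustrative helper rather than a function from either library.

```python
import numpy as np

def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref` (two objectives, minimisation)."""
    pts = np.asarray(points, dtype=float)
    ref = np.asarray(ref, dtype=float)
    pts = pts[np.all(pts < ref, axis=1)]            # drop points outside the reference box
    pts = pts[np.lexsort((pts[:, 1], pts[:, 0]))]   # sort by f1, then f2
    area, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < best_f2:                            # only non-dominated points add area
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return float(area)

# Hypothetical objective vectors and bounds.
points = np.array([[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]])
minimum = np.array([0.0, 0.0])
maximum = np.array([4.0, 4.0])

# Route 1: raw hypervolume divided by the area of the bounding box.
raw = hypervolume_2d(points, maximum)
normalised_from_raw = raw / np.prod(maximum - minimum)

# Route 2: scale the objectives first, then measure against <1, 1>.
scaled = (points - minimum) / (maximum - minimum)
normalised_from_scaled = hypervolume_2d(scaled, [1.0, 1.0])

print(raw, normalised_from_raw, normalised_from_scaled)
```

Both routes give the same normalised value because scaling each objective by a constant scales the measured area by the product of those constants.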

Conclusion

In this section we have introduced the hypervolume indicator as a criterion that can be used in the selection of a population. We also demonstrated the application of two implementations of the hypervolume indicator, one in Platypus, and one in PyGMO.

Exercise

Experiment with the hypervolume indicator: try using different reference points, different population sizes, and different problems.


  1. E. Zitzler, S. Künzli, Indicator-based selection in multiobjective search, in: Parallel Problem Solving from Nature - PPSN VIII, Springer, 2004, pp. 832–842.