# Population Initialisation

## Preamble

In [9]:
# used to create block diagrams
%xdiag_output_format svg

import numpy as np                   # for multi-dimensional containers
import pandas as pd                  # for DataFrames
import plotly.graph_objects as go    # for data visualisation
import plotly.io as pio              # to set shahin plot layout
import plotly.express as px

pio.templates['shahin'] = pio.to_templated(go.Figure().update_layout(margin=dict(t=0,r=0,b=40,l=40))).layout.template
pio.templates.default = 'shahin'


## Introduction

Before the main optimisation process (the "generational loop") can begin, we need to complete the initialisation stage of the algorithm. Typically, this involves generating the initial population of solutions by randomly sampling the search-space. We can see in the figure below that this initialisation stage is the first real stage, and it's only executed once. There are many schemes for generating the initial population, and some simply load a population from an earlier run of an algorithm.

In [2]:
%%blockdiag
{
orientation = portrait
Initialisation ->Evaluation -> "Terminate?" ->Selection -> Variation -> Evaluation
Initialisation [color = '#ffffcc']
}
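
As a minimal sketch of the "load a population from an earlier run" scheme mentioned above, the cell below writes a population to CSV and reads it back. The population itself, the seed, and the in-memory buffer standing in for a file on disk are all assumptions made for illustration.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical population from an "earlier run": 5 solutions, 3 problem variables.
rng = np.random.default_rng(seed=0)  # arbitrary seed, chosen only for repeatability
earlier_population = pd.DataFrame(rng.random((5, 3)))

# Store the population as CSV. An in-memory buffer stands in for a real file here.
buffer = io.StringIO()
earlier_population.to_csv(buffer, index=False)

# In a later run, read the stored population and use it as the initial population.
buffer.seek(0)
initial_population = pd.read_csv(buffer)

# read_csv returns string column labels, so restore the integer labels.
initial_population.columns = initial_population.columns.astype(int)
```

With a real file, the buffer would be replaced by a path in both `to_csv()` and `read_csv()`.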


## Randomly sampling the search-space

When generating an initial population, it's often desirable to have a diverse representation of the search space. This supports better exploitation of problem variables earlier on in the search, without having to rely solely on exploration operators.

We previously defined a solution $x$ as consisting of many problem variables.

$$x=\langle x_{1},x_{2},\ldots,x_{\mathrm{D}} \rangle \tag{1}$$

We also defined a multi-objective function $f(x)$ as consisting of many objectives.

$$f(x) =(f_{1}(x),f_{2}(x),\ldots,f_{\mathrm{M}}(x))\tag{2}$$

However, we need to have a closer look at how we describe a general multi-objective optimisation problem before we initialise our initial population.

$$\left.\begin{array}{lll} \text{optimise} & f_{m}(x), & m=1,2,\ldots,\mathrm{M};\\ \text{subject to} & g_{j}(x)\geq0, & j=1,2,\ldots,J;\\ & h_{k}(x)=0, & k=1,2,\ldots,K;\\ & x_{d}^{(L)}\leq x_{d}\leq x_{d}^{(U)}, & d=1,2,\ldots,\mathrm{D}; \end{array}\right\}\tag{3}$$

We may already be familiar with some parts of Equation 3, but there are some we haven't covered yet. There are $\mathrm{M}$ objective functions which can be either minimised or maximised. The constraint functions $g_j(x)$ and $h_k(x)$ impose inequality and equality constraints which must be satisfied by a solution $x$ in order for it to be considered a feasible solution. Another condition which affects the feasibility of a solution $x$ is whether the problem variables fall between (inclusively) the lower $x_{d}^{(L)}$ and upper $x_{d}^{(U)}$ boundaries within the decision space.
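
The feasibility conditions in Equation 3 can be sketched as a small check. The function name `is_feasible`, the tolerance `tol` used for the equality constraints, and the two-variable example problem are all hypothetical and only serve to illustrate the idea.

```python
import numpy as np


def is_feasible(x, lower, upper, inequality=(), equality=(), tol=1e-9):
    """Return True if x satisfies the bounds and constraints of Equation 3."""
    x = np.asarray(x, dtype=float)
    # x_d^(L) <= x_d <= x_d^(U) for every problem variable
    within_bounds = np.all((x >= lower) & (x <= upper))
    # g_j(x) >= 0 for every inequality constraint
    g_ok = all(g(x) >= 0 for g in inequality)
    # h_k(x) = 0 for every equality constraint, up to a numerical tolerance
    h_ok = all(abs(h(x)) <= tol for h in equality)
    return bool(within_bounds and g_ok and h_ok)


# Hypothetical problem with two variables, for illustration only.
lower = np.array([0.0, 0.0])
upper = np.array([1.0, 1.0])
g = [lambda x: 1.0 - x[0] - x[1]]   # requires x_1 + x_2 <= 1
h = [lambda x: x[0] - x[1]]         # requires x_1 = x_2

print(is_feasible([0.25, 0.25], lower, upper, g, h))  # True
print(is_feasible([0.75, 0.75], lower, upper, g, h))  # False, g is violated
print(is_feasible([1.5, 1.5], lower, upper))          # False, outside the bounds
```

The tolerance is a design choice: floating-point arithmetic rarely produces an exact zero, so equality constraints are usually checked approximately.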

The lower $x_{d}^{(L)}$ and upper $x_{d}^{(U)}$ boundaries may not be the same for each problem variable. For example, we can define the following upper and lower boundaries for a problem with 10 problem variables.

In [2]:
bounds_lower = [-2, -2, -2, 0, -5, 0.5, 1, 1, 0, 1]
bounds_upper = [ 1,  2,  3, 1, .5, 2.5, 5, 5, 8, 2]


With NumPy, we can use np.random.rand() to generate random numbers drawn uniformly from $[0, 1)$. If we want to generate a population of 20 solutions, each with 10 problem variables ($\mathrm{D} = 10$), we could try something like the following.

In [4]:
D = 10
population = pd.DataFrame(np.random.rand(20,D))
population

Out[4]:
0 1 2 3 4 5 6 7 8 9
0 0.766612 0.184047 0.601579 0.068389 0.284365 0.731083 0.345808 0.490977 0.216370 0.469142
1 0.931453 0.285878 0.219297 0.950589 0.149372 0.755915 0.224620 0.021315 0.784383 0.233293
2 0.591960 0.876968 0.728366 0.460648 0.291675 0.075145 0.633870 0.974640 0.327474 0.757012
3 0.731576 0.058537 0.871623 0.786730 0.639722 0.411750 0.478481 0.151831 0.096821 0.064846
4 0.768519 0.624092 0.897223 0.847986 0.307039 0.966692 0.422969 0.068752 0.111435 0.576825
5 0.786212 0.876639 0.617626 0.385611 0.475770 0.394879 0.662741 0.611269 0.851260 0.735206
6 0.216132 0.827222 0.519841 0.184116 0.349358 0.851645 0.330379 0.803641 0.028491 0.660653
7 0.340328 0.433944 0.876099 0.154460 0.438385 0.345494 0.822077 0.948882 0.579029 0.437911
8 0.046793 0.527956 0.927804 0.452686 0.468774 0.098060 0.053914 0.592822 0.872358 0.951044
9 0.712764 0.586972 0.269684 0.260473 0.114770 0.284557 0.352796 0.300339 0.939495 0.371977
10 0.834033 0.416069 0.599284 0.782922 0.777133 0.367259 0.769624 0.197171 0.326048 0.620496
11 0.596868 0.235184 0.076150 0.525104 0.111583 0.842872 0.242970 0.489983 0.438350 0.079297
12 0.908824 0.750178 0.984141 0.967807 0.991128 0.376226 0.362957 0.700389 0.333699 0.054738
13 0.775101 0.095440 0.394862 0.509590 0.289470 0.717739 0.721343 0.320663 0.512721 0.591190
14 0.957106 0.921012 0.296528 0.978763 0.844241 0.852089 0.845572 0.133657 0.626378 0.336043
15 0.884786 0.390959 0.985424 0.511147 0.364119 0.036313 0.988521 0.871238 0.171361 0.907771
16 0.886475 0.091019 0.546598 0.497058 0.174661 0.292781 0.366122 0.519969 0.630355 0.917971
17 0.147351 0.904364 0.020057 0.434303 0.385260 0.438435 0.551153 0.608886 0.180581 0.357245
18 0.961474 0.264556 0.873333 0.141266 0.711996 0.504578 0.002061 0.057101 0.058072 0.933630
19 0.181440 0.594304 0.543607 0.167866 0.432604 0.965145 0.334304 0.926761 0.073111 0.959622

This works fine if all of our problem variables are to be within the boundaries 0 and 1 ($x_d \in [0,1]$). However, in this case we have 10 different upper and lower boundaries, so we can use np.random.uniform() instead.

In [5]:
population = pd.DataFrame(np.random.uniform(low=bounds_lower, high=bounds_upper, size=(20,D)))
population

Out[5]:
0 1 2 3 4 5 6 7 8 9
0 -1.595909 1.535486 1.293814 0.814964 -0.062594 2.054281 3.126897 1.335578 0.936931 1.926560
1 0.096592 -1.640662 1.577867 0.317075 -0.830324 1.178568 3.476282 1.638450 0.396965 1.772735
2 -0.535852 -0.754006 -0.633159 0.373528 -4.507362 1.864907 4.120304 4.970927 2.330047 1.580339
3 0.188357 1.989350 0.873236 0.896850 -3.036866 0.952019 2.549332 2.924610 7.829042 1.505618
4 -0.370044 1.333593 2.932775 0.049951 -3.143218 1.463486 1.493686 4.895625 5.184477 1.016235
5 -1.496762 -0.706285 2.829133 0.585000 -2.631146 2.404434 4.764201 2.688590 3.714554 1.469213
6 -0.539600 -0.772007 -1.589400 0.450851 -2.227573 2.009811 4.203952 2.225340 3.437317 1.559837
7 -0.011132 0.604741 2.719304 0.766913 -1.688981 0.902308 3.465328 1.969914 1.263797 1.471288
8 0.649533 1.990616 -0.517845 0.746607 0.221777 1.657132 2.769852 4.612056 2.622929 1.766202
9 0.741051 -1.016850 1.810758 0.550817 -3.508683 1.902297 3.286318 4.174131 2.016121 1.746620
10 -0.083028 1.801292 1.885415 0.594444 0.250987 2.101817 3.633468 4.289277 1.906456 1.076899
11 -0.848767 0.690134 -1.808839 0.686618 0.134730 2.497157 1.523919 3.302480 6.988643 1.666587
12 0.817963 1.533294 1.511442 0.169367 -0.632227 1.056512 4.888487 2.274547 4.583653 1.361091
13 -0.459185 -0.959863 2.251957 0.351321 -3.568598 2.257585 3.572641 2.365158 7.767679 1.725867
14 0.958412 0.537646 -0.567982 0.589296 -1.531263 2.261984 2.415958 4.144767 6.123147 1.894044
15 -0.175442 -1.134751 2.724895 0.393374 -1.612171 2.270715 1.241392 2.801125 3.548206 1.195132
16 0.552039 -1.775044 1.950477 0.326143 -3.116347 2.362907 4.494731 2.442099 2.221567 1.209989
17 -0.213648 -1.197165 2.250890 0.465618 -1.696848 2.293076 4.367690 1.685938 3.076453 1.692199
18 -0.827996 -0.844561 1.263769 0.204334 -0.885901 2.371207 1.637415 3.092980 5.956112 1.490452
19 -1.056114 1.811104 2.737403 0.257598 -0.467132 1.158290 4.192725 2.173034 7.834752 1.256477
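
To see what the bounded sampling above amounts to, unit samples $u_d \in [0, 1)$ can be scaled by hand with $x_d = x_d^{(L)} + u_d\,(x_d^{(U)} - x_d^{(L)})$. The sketch below is one way to do this under some assumptions: it uses NumPy's `default_rng` generator rather than the legacy `np.random` functions used elsewhere in this notebook, and the seed and the names `unit_samples` and `scaled_population` are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

bounds_lower = np.array([-2, -2, -2, 0, -5, 0.5, 1, 1, 0, 1])
bounds_upper = np.array([ 1,  2,  3, 1, .5, 2.5, 5, 5, 8, 2])

# A seeded generator makes the initial population repeatable between runs.
rng = np.random.default_rng(seed=42)  # arbitrary seed

# 20 solutions with 10 problem variables each, sampled uniformly from [0, 1).
unit_samples = rng.random((20, 10))

# Stretch and shift each column into its own [lower, upper] interval.
# Broadcasting applies the 10 bounds to every one of the 20 rows.
scaled = bounds_lower + unit_samples * (bounds_upper - bounds_lower)
scaled_population = pd.DataFrame(scaled)
```

Fixing the seed is optional, but it makes experiments that depend on the initial population easier to reproduce.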

Let's double check to make sure our solutions fall within the problem variable boundaries.

In [6]:
population.min() >= bounds_lower

Out[6]:
0    True
1    True
2    True
3    True
4    True
5    True
6    True
7    True
8    True
9    True
dtype: bool

In [7]:
population.max() <= bounds_upper

Out[7]:
0    True
1    True
2    True
3    True
4    True
5    True
6    True
7    True
8    True
9    True
dtype: bool
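
The column-wise checks above say whether any value in a column breaks a boundary, but not which solution is responsible. As a sketch, a per-solution check could look like the following. The helper name `rows_out_of_bounds`, the seed, and the deliberately broken copy of the population are assumptions for illustration.

```python
import numpy as np
import pandas as pd


def rows_out_of_bounds(population, lower, upper):
    """Return a boolean Series marking solutions with any variable outside the bounds."""
    values = population.to_numpy()
    # Bounds are inclusive, so only strictly smaller or larger values are flagged.
    outside = (values < np.asarray(lower)) | (values > np.asarray(upper))
    return pd.Series(outside.any(axis=1), index=population.index)


bounds_lower = [-2, -2, -2, 0, -5, 0.5, 1, 1, 0, 1]
bounds_upper = [ 1,  2,  3, 1, .5, 2.5, 5, 5, 8, 2]

rng = np.random.default_rng(seed=1)  # arbitrary seed
population = pd.DataFrame(rng.uniform(low=bounds_lower, high=bounds_upper, size=(20, 10)))

# Deliberately push one variable of solution 3 above its upper boundary of 1.
broken = population.copy()
broken.iloc[3, 0] = 5.0

flagged = rows_out_of_bounds(broken, bounds_lower, bounds_upper)
print(flagged[flagged].index.tolist())  # prints [3]
```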

Great! Now all that's left is to visualise our population in the decision space. We'll use a parallel coordinate plot.

In [24]:
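fig = go.Figure(layout=dict(xaxis=dict(title='problem variables', range=[1, 10]), yaxis=dict(title='value')))

# draw each solution as one line across the problem variables (numbered from 1)
for index, row in population.iterrows():
    fig.add_trace(go.Scatter(x=population.columns.values + 1, y=row, mode='lines', name=f'solution {index + 1}'))

fig.show()

# alternative view: pairwise scatter plots of the problem variables
fig = px.scatter_matrix(population, title=' ')
fig.show()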