YAML for Configuration Files

Preamble

In [1]:
# used to create block diagrams
%reload_ext xdiag_magic
%xdiag_output_format svg
    
import os

Introduction

In this section, we're going to have a look at YAML, which is a recursive acronym for "YAML Ain't Markup Language". It is a data interchange file format that is often found with a .yaml or .yml file extension.

What this section is most interested in is using YAML for configuration files, which lets us move the parameters our programs use out of the source code and into a separate file. This means that configuration files can be shared between scripts/programs, and they can be modified without needing to modify source code.

In [2]:
%%blockdiag
{
    orientation = portrait
    "config.yaml" <-> "notebook.ipynb"
    "config.yaml" [color = '#ffffcc']
}

Of course, there are many alternatives such as JavaScript Object Notation (JSON) or Tom's Obvious, Minimal Language (TOML), and they all have their advantages and disadvantages. We won't do a full comparison of YAML vs. alternatives, but some advantages of YAML are:

  • It is human-readable, making YAML files easy to read and to write by hand.
  • Many popular programming languages have support for managing YAML files.
  • YAML is a superset of JSON, meaning that JSON can be easily converted to YAML if needed.
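The last point can be checked directly. The following is a minimal sketch, assuming PyYAML is already installed (installation is covered later in this section). The JSON string is a made-up example, and `yaml.safe_load` is used here because it only builds plain Python types.

```python
import json
import yaml  # PyYAML

# hypothetical JSON document, used only for illustration
json_text = '{"learning_rate": 0.1, "categories": ["hotdog", "not a hotdog"]}'

# parse the same text once as JSON and once as YAML
from_json = json.loads(json_text)
from_yaml = yaml.safe_load(json_text)

# both parsers should produce the same dictionary
print(from_json == from_yaml)
```

For simple documents like this one, the YAML parser accepts the JSON text unchanged, which is what makes moving from JSON to YAML straightforward.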

What Does YAML Look Like?

Let's start by showing an example of what a .yaml file would look like.

learning_rate:  0.1
random_seed: 789108
maintainer: Shahin Rostami
categories:
- hotdog
- not a hotdog

We can see that we have some key-value mappings, where the key appears before the colon and the value appears after. The first key we have is learning_rate with a value of 0.1, the second is random_seed with a value of 789108, and the third is maintainer with a value of Shahin Rostami. Finally, I have included a key that maps to a sequence: categories has a list as its value, which contains the items "hotdog" and "not a hotdog".
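To make the mapping between YAML values and Python types concrete, here is a small sketch that parses the example above from a string. It assumes PyYAML is already installed (installation is covered later in this section), and it uses `yaml.safe_load` purely for illustration.

```python
import yaml  # PyYAML

yaml_text = """learning_rate:  0.1
random_seed: 789108
maintainer: Shahin Rostami
categories:
- hotdog
- not a hotdog"""

parsed = yaml.safe_load(yaml_text)

# show which Python type each YAML value was converted to
for key, value in parsed.items():
    print(key, type(value).__name__)
```

The scalar values become a float, an int, and a str, while the sequence under categories becomes a Python list.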

You can read more about YAML at the YAML Specification web page.

So that we can work with this file later on in this section, you can paste the above YAML into a file in the same directory as this notebook and name it config.yaml. Alternatively, you can just run the cell below which will do it for you.

In [3]:
config_string = '''learning_rate:  0.1
random_seed: 789108
maintainer: Shahin Rostami
categories:
- hotdog
- not a hotdog'''

with open('config.yaml', 'w') as f:
    f.write(config_string)

Getting Python Ready for YAML

Before we begin working with YAML files in Python, we need to make sure we have PyYAML installed. There are alternatives to PyYAML available, but they may not be compatible with the following instructions. One such alternative is ruamel.yaml, which is derived from PyYAML, although you may need to adapt some of the calls below if you choose it.

Some options to install PyYAML are with:

Anaconda

conda install -c conda-forge pyyaml

pip

pip install PyYAML

Once you have the package installed you should be ready to import the PyYAML package within Python.

In [4]:
import yaml

Loading YAML with Python

It is surprisingly easy to load a YAML file into a Python dictionary with PyYAML.

In [5]:
with open('config.yaml') as f:
    config = yaml.load(f, Loader=yaml.FullLoader)

We can confirm that it worked by displaying the contents of the config variable.

In [6]:
config
Out[6]:
{'learning_rate': 0.1,
 'random_seed': 789108,
 'maintainer': 'Shahin Rostami',
 'categories': ['hotdog', 'not a hotdog']}
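For configuration files that only hold plain scalars, lists, and mappings, `yaml.safe_load` is a commonly recommended alternative to `yaml.load` with an explicit Loader, because it refuses YAML tags that ask for arbitrary Python objects. The sketch below uses in-memory strings instead of config.yaml so that it runs on its own. Both strings are made up for illustration.

```python
import yaml  # PyYAML

# plain configuration data loads as usual
plain = "learning_rate: 0.1\ncategories:\n- hotdog\n- not a hotdog\n"
settings = yaml.safe_load(plain)
print(settings)

# a tag requesting an arbitrary Python call is rejected by the safe loader
tagged = "value: !!python/object/apply:os.getcwd []"
try:
    yaml.safe_load(tagged)
    rejected = False
except yaml.YAMLError:
    rejected = True
print(rejected)
```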

It's as easy as that. We can now access the various elements of the data structure like a normal Python dictionary.

In [7]:
config['learning_rate']
Out[7]:
0.1
In [8]:
config['categories']
Out[8]:
['hotdog', 'not a hotdog']
In [9]:
config['categories'][0]
Out[9]:
'hotdog'
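Because the loaded configuration is an ordinary dictionary, the usual dictionary tools apply when a key might be absent. The following sketch uses a stand-in dictionary instead of reading config.yaml, and the names required_keys and batch_size are hypothetical, introduced only for this example.

```python
# stand-in for the dictionary loaded from config.yaml
config = {
    'learning_rate': 0.1,
    'random_seed': 789108,
    'maintainer': 'Shahin Rostami',
    'categories': ['hotdog', 'not a hotdog'],
}

# hypothetical list of settings our program cannot run without
required_keys = ['learning_rate', 'random_seed', 'categories']
missing = [key for key in required_keys if key not in config]
if missing:
    raise KeyError(f'missing configuration keys: {missing}')

# 'batch_size' is a hypothetical optional setting, so fall back to a default
batch_size = config.get('batch_size', 32)
print(batch_size)
```

Checking for required keys right after loading tends to give a clearer error than a KeyError raised later, deep inside the program.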

Updating YAML with Python

Let's say we want to update our learning_rate to 0.2 and add an extra category to our category list. We can do this using the normal Python dictionary manipulation.

In [10]:
config['learning_rate'] = 0.2
In [11]:
config['categories'].append('kind of hotdog')

We can then write this back to the config.yaml file to save our changes.

In [12]:
with open('config.yaml', 'w') as f:
    # yaml.dump returns None when a stream is given, so we don't assign its result
    yaml.dump(config, stream=f,
              default_flow_style=False, sort_keys=False)

All done! We can confirm this by loading our YAML file again and displaying the dictionary.

In [13]:
with open('config.yaml') as f:
    config = yaml.load(f, Loader=yaml.FullLoader)
config
Out[13]:
{'learning_rate': 0.2,
 'random_seed': 789108,
 'maintainer': 'Shahin Rostami',
 'categories': ['hotdog', 'not a hotdog', 'kind of hotdog']}
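The two keyword arguments passed to yaml.dump control how the file is laid out. As a rough sketch, we can dump to a string instead of a file and compare the results. This assumes PyYAML 5.1 or later, where the sort_keys argument is available, and uses a shortened stand-in dictionary.

```python
import yaml  # PyYAML

# shortened stand-in for our configuration dictionary
config = {'learning_rate': 0.2, 'categories': ['hotdog', 'not a hotdog']}

# with no stream argument, yaml.dump returns the YAML text as a string
block_text = yaml.dump(config, default_flow_style=False, sort_keys=False)
sorted_text = yaml.dump(config, default_flow_style=False, sort_keys=True)

print(block_text)   # keys stay in insertion order
print(sorted_text)  # keys are sorted alphabetically
```

With sort_keys=False the keys are written in the order they appear in the dictionary, which keeps the saved file close to the layout of the original.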

Conclusion

In this section, we briefly introduced YAML before using the PyYAML package to load, manipulate, and save a collection of configuration settings that we stored in a file named config.yaml. Keeping your configuration settings separate from your source code comes with multiple benefits, e.g. allowing modification of these configurations without modifying source code, automation and search throughout your project, and sharing configurations between multiple bits of work.

Using a Framework with a Custom Objective Function

Preamble

In [3]:
# used to create block diagrams
%reload_ext xdiag_magic
%xdiag_output_format svg
    
import numpy as np                   # for multi-dimensional containers
import pandas as pd                  # for DataFrames
import plotly.graph_objects as go    # for data visualisation
import plotly.io as pio              # to set shahin plot layout
import platypus as plat              # multi-objective optimisation framework

pio.templates['shahin'] = pio.to_templated(go.Figure().update_layout(legend=dict(orientation="h",y=1.1, x=.5, xanchor='center'),margin=dict(t=0,r=0,b=40,l=40))).layout.template
pio.templates.default = 'shahin'

Introduction

When applying multi-objective optimisation algorithms to real-world problems, we will often need to implement the objective functions ourselves. This problem comes in two parts:

  1. We need to design an objective function that correctly represents our real-world problem, taking the problem variables and producing the correct objective values;
  2. We need to implement this objective function in a way that can work with our optimiser.

This comes down to passing the desired number of problem variables to a custom objective function and receiving the desired number of objective values.

In [24]:
%%blockdiag
{
    orientation = portrait
    "Problem Variables" -> "Objective Function" -> "Objective Values"
    "Objective Function" [color = '#ffffcc']
}

When preparing to implement multi-objective optimisation experiments, it's often more convenient to use a ready-made framework/library instead of programming everything from scratch. Many libraries and frameworks have been implemented in many different programming languages. With our focus on multi-objective optimisation, our choice is an easy one. We will choose Platypus which has a focus on multi-objective problems and optimisation.

Platypus is a framework for evolutionary computing in Python with a focus on multiobjective evolutionary algorithms (MOEAs). It differs from existing optimization libraries, including PyGMO, Inspyred, DEAP, and Scipy, by providing optimization algorithms and analysis tools for multiobjective optimization.

In this section, we will use the Platypus framework to apply the Non-dominated Sorting Genetic Algorithm II (NSGA-II) [1] to a custom objective function.

The Custom Objective Function

For our custom objective function we will look to implement F2 from Schaffer (1985) [2], which is described as being a two-valued function of one variable. The function is given in Equation 1.

$$ \text{Minimize} \begin{cases} f_{1}\left(x\right) = x^{2} \\ f_{2}\left(x\right) = \left(x-2\right)^{2} \end{cases} \tag{1} $$

Let's implement this objective function using Python.

In [11]:
def schaffer_f1(x):
    f1 = x[0]**2
    f2 = (x[0]-2)**2
    return [f1, f2]
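Before handing the function to an optimiser, it can help to evaluate it by hand at a few points. The sketch below repeats the function so that it runs on its own. The three input values are chosen only for illustration.

```python
def schaffer_f1(x):
    f1 = x[0]**2
    f2 = (x[0]-2)**2
    return [f1, f2]

# the function expects a list holding one problem variable
for x in ([0.0], [1.0], [2.0]):
    print(x, schaffer_f1(x))
```

At x = 0 the first objective is at its minimum, at x = 2 the second objective is at its minimum, and x = 1 sits in between, which is the trade-off we expect the optimiser to explore.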

Now let's get this Python function into the Platypus Problem object which can be used during the evaluation stage of Platypus' optimisation process.

First, we will create an instance of the Problem object, passing in the arguments 1 and 2, indicating that we want 1 problem variable and 2 objective values, respectively.

In [14]:
problem = plat.Problem(1, 2)

Next, we need to specify the type of the problem variables and their boundaries. In this case, we want real-valued problem variables between -10 and 10.

In [ ]:
problem.types[:] = plat.Real(-10, 10)

Finally, we will assign our implementation of the Schaffer F1 function to our Problem object.

In [15]:
problem.function = schaffer_f1

Now we're ready to apply an optimisation algorithm to the problem. Let's create an instance of the NSGA-II optimiser, and pass in our problem object as an argument.

In [16]:
algorithm = plat.NSGAII(problem)

Now we can execute the optimisation process. Let's give the optimiser a budget of 10,000 function evaluations as a termination criterion. This may take some time to complete depending on your processor speed and the number of function evaluations.

In [17]:
algorithm.run(10000)

Finally, we can display the results. In this case, we will be printing the objective values for each solution in the final population of the above execution.

In [18]:
for solution in algorithm.result:
    print(solution.objectives)
[4.453351359150486e-07, 4.002669782738808]
[4.00008015015591, 4.0149894575182075e-10]
[0.1916439111493342, 2.4405577972219388]
[0.14247947716777792, 2.6326213266501712]
[2.912722530875832, 0.08604248168550006]
[1.5417514375854118, 0.5750600840416275]
[3.8408134578188857, 0.0016160929669092707]
[3.077490401031466, 0.06037942738259151]
[3.7116568949830153, 0.005392539296257204]
[1.6363359146616123, 0.5195620644132161]
[1.492223130046025, 0.6059597232354179]
[2.3739127336051826, 0.21090991536122847]
[1.0200911100097365, 0.9801087139892063]
[0.029225142651433323, 3.34541063386785]
[1.0686258778952966, 0.933651386278118]
[0.0006487501268842501, 3.8987664551480803]
[2.052222620644453, 0.3219903452204237]
[1.2757739621522382, 0.7577673890676204]
[3.436466946363341, 0.02138291839619627]
[0.8524303051296667, 1.1593442132446337]
[0.04058380734917394, 3.234766881747097]
[0.04449665126733351, 3.2007274765060507]
[1.4165274699386203, 0.6558090872764065]
[2.8038239714106497, 0.10597479853654006]
[2.7387854132130136, 0.119074950405126]
[0.12705394061242367, 2.7012688762360364]
[1.228557617693898, 0.794944882789961]
[2.267223451432706, 0.24430262986089188]
[0.7983927694043604, 1.224279689146579]
[2.167912613186228, 0.2783793709861598]
[0.20702504935941213, 2.3870248323949763]
[1.191068637098248, 0.8256249948584392]
[0.003500151747508683, 3.766851830477023]
[1.8873535782252802, 0.39211407010513977]
[0.007228361021465059, 3.6671492873585287]
[0.021274451879960535, 3.4378438818209616]
[3.2992527428173406, 0.033714642643517305]
[1.720372650430989, 0.4738535751873066]
[0.4206979658258525, 1.8262486112972967]
[2.6749821042839037, 0.1328330943222337]
[1.9812946126961428, 0.3509559192303327]
[1.148831920460887, 0.8614888355918685]
[0.058268585450140194, 3.0927131146447375]
[0.901851636359289, 1.1032168576856898]
[0.002366073946340473, 3.80779684939773]
[2.446101091215531, 0.1900945807131245]
[0.9765587992108801, 1.0237192138780675]
[1.6753978246401262, 0.4979114466530224]
[2.8708869040056286, 0.09341020068232929]
[0.26408235141564357, 2.208524611372738]
[0.24827407620732173, 2.255189727937749]
[0.3034920248950914, 2.0998876143164904]
[3.1611162227527414, 0.04930495106928492]
[1.3665464534842686, 0.6905713406603885]
[2.288830151307554, 0.23727808419940896]
[2.5362936988815616, 0.16599549690506493]
[0.014674435253989709, 3.5301221015852358]
[0.11021384985246038, 2.7822749967273]
[3.6369470134344004, 0.008634472973672376]
[0.3628914960259589, 1.9532724521903275]
[0.9361940123477354, 1.0659092427779604]
[3.601319469626585, 0.010462369534042762]
[0.22510242419018528, 2.3273040189208154]
[3.929924384329358, 0.0003096301770651386]
[0.44258902328378813, 1.781494441623713]
[0.6154523531365325, 1.4774214976182953]
[0.0762097772462723, 2.9719650417187005]
[0.7383727888763719, 1.3012279531119622]
[0.8720319868299435, 1.1367258696182503]
[0.0970024998688544, 2.851193526590178]
[0.7819881421237153, 1.2447844055572952]
[3.5546141492431895e-05, 3.976187312163294]
[0.31891679267520895, 2.0600080555184057]
[0.6941524520648898, 1.3615199741222297]
[3.5208173251091917, 0.015280888468099278]
[1.7682626917285436, 0.4492211363675877]
[0.02470362304302411, 3.396008173172114]
[0.6440389190105195, 1.4339575019482806]
[0.7180811014769725, 1.3284944565257686]
[2.101813459452626, 0.3027604892477481]
[0.07004153007017616, 3.0114271143725535]
[3.2376877085727775, 0.04025737982596567]
[0.23264564624068063, 2.3033114761574134]
[3.101799115479861, 0.05702900938855038]
[1.1085848373894185, 0.897010618854181]
[2.4306368151410274, 0.19443692720795713]
[0.5138028010316759, 1.6466012356535433]
[2.59411836599673, 0.1516115669356936]
[1.3212375570746797, 0.7234336342007524]
[1.9350994994484758, 0.3707853466372228]
[3.2123227439837496, 0.043141207452540103]
[3.5313938421728595, 0.01459256272122349]
[1.3972708592454008, 0.6690223729741459]
[0.46747041317761945, 1.7325980924207784]
[0.2844624139717213, 2.151061695975325]
[0.13716712190253666, 2.655723923382733]
[0.955596990910819, 1.0454113361658806]
[0.3856475268585109, 1.9016265064691555]
[0.10718720195055548, 2.797608937313849]
[0.48749031964350936, 1.694670040016453]
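Each printed pair is one trade-off between the two objectives. As a rough check, we can confirm that a handful of these objective vectors do not dominate one another. This is a plain Python sketch and does not use Platypus. The helper name dominates is our own, and the sample is a small subset copied from the output above.

```python
def dominates(a, b):
    """Return True if objective vector a dominates b (minimisation)."""
    no_worse = all(ai <= bi for ai, bi in zip(a, b))
    strictly_better = any(ai < bi for ai, bi in zip(a, b))
    return no_worse and strictly_better

# a few objective vectors copied from the output above
sample = [
    [4.453351359150486e-07, 4.002669782738808],
    [4.00008015015591, 4.0149894575182075e-10],
    [0.1916439111493342, 2.4405577972219388],
    [1.0200911100097365, 0.9801087139892063],
]

# no vector in the sample should dominate any other vector in the sample
mutually_non_dominated = not any(
    dominates(a, b) for a in sample for b in sample if a is not b
)
print(mutually_non_dominated)
```

For each pair in the sample, an improvement in f1 comes with a worse f2, so none of the vectors dominates another.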

Visualising Solutions in Objective Space

In the last section, we concluded by printing the objective values for every solution. This information will be easier to digest using a plot, so let's quickly put the data into a Pandas DataFrame and then use Plotly to create a scatterplot. Let's start by moving our Platypus data structure to a DataFrame.

In [19]:
objective_values = np.empty((0, 2))

for solution in algorithm.result:
    y = solution.objectives
    objective_values = np.vstack([objective_values, y])
    
# convert to DataFrame
objective_values = pd.DataFrame(objective_values, columns=['f1','f2'])
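As an aside, the same DataFrame can also be built in one step from a list of objective vectors, which avoids growing a NumPy array inside a loop. The sketch below uses a short stand-in list so that it runs without Platypus. In the notebook, the list would come from something like `[solution.objectives for solution in algorithm.result]`.

```python
import pandas as pd

# stand-in for the objective vectors collected from algorithm.result
objectives = [
    [4.453351359150486e-07, 4.002669782738808],
    [4.00008015015591, 4.0149894575182075e-10],
    [0.1916439111493342, 2.4405577972219388],
]

# build the DataFrame directly from the list of rows
objective_values = pd.DataFrame(objectives, columns=['f1', 'f2'])
print(objective_values.shape)
```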

With that complete, we can have a peek at our DataFrame to make sure we've not made any obvious mistakes.

In [20]:
objective_values
Out[20]:
f1 f2
0 4.453351e-07 4.002670e+00
1 4.000080e+00 4.014989e-10
2 1.916439e-01 2.440558e+00
3 1.424795e-01 2.632621e+00
4 2.912723e+00 8.604248e-02
... ... ...
95 1.371671e-01 2.655724e+00
96 9.555970e-01 1.045411e+00
97 3.856475e-01 1.901627e+00
98 1.071872e-01 2.797609e+00
99 4.874903e-01 1.694670e+00

100 rows × 2 columns

With no obvious issues, let's visualise the results using a scatterplot.

In [21]:
fig = go.Figure()
fig.add_scatter(x=objective_values.f1, y=objective_values.f2, mode='markers')
fig.show()

Great! If you search the literature for the true Pareto-optimal front for Schaffer F1, you can see that our approximation is looking as expected.
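For this particular problem, the true front can also be worked out by hand. Both objectives get worse when x moves below 0 or above 2, so the Pareto-optimal values of x lie between 0 and 2. There, x is the square root of f1, which gives f2 = (sqrt(f1) - 2)^2. The sketch below checks this relationship with NumPy against a few objective vectors copied from the output printed earlier. It is a sanity check under these assumptions and not part of the Platypus workflow.

```python
import numpy as np

# Pareto-optimal values of the problem variable lie in [0, 2]
x = np.linspace(0, 2, 101)
true_f1 = x**2
true_f2 = (x - 2)**2

# on the front, f2 can be written in terms of f1 alone
on_curve = np.allclose(true_f2, (np.sqrt(true_f1) - 2)**2)
print(on_curve)

# a few objective vectors copied from the output printed earlier
approx = np.array([
    [0.1916439111493342, 2.4405577972219388],
    [1.0200911100097365, 0.9801087139892063],
    [2.912722530875832, 0.08604248168550006],
])

# distance in f2 between each solution and the analytic front
residual = np.abs(approx[:, 1] - (np.sqrt(approx[:, 0]) - 2)**2)
print(residual.max())
```

A residual close to zero suggests that these solutions sit on the analytic front, which agrees with what the scatterplot shows.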

Conclusion

In this section, we have demonstrated how we can use a popular multi-objective optimisation algorithm, NSGA-II, to approximate multiple trade-off solutions to the Schaffer F1 test problem. We did this using the Platypus framework, and by implementing our custom objective function. You can use this approach to write your own objective functions that can be optimised by any algorithm in the Platypus framework.


  1. Deb, K., Pratap, A., Agarwal, S., & Meyarivan, T. (2002). A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2), 182-197.

  2. Schaffer, J. D. (1985). Multiple Objective Optimization with Vector Evaluated Genetic Algorithms. Proceedings of the First International Conference on Genetic Algorithms, Ed. J. J. Grefenstette, Lawrence Erlbaum, 93-100.

Fourier Transform Algorithm

Preamble

In [55]:
# used to create block diagrams
%reload_ext xdiag_magic
%xdiag_output_format svg
    
import numpy as np                   # for multi-dimensional containers
import pandas as pd                  # for DataFrames
import plotly.graph_objects as go    # for data visualisation
import plotly.io as pio              # to set shahin plot layout
from plotly.subplots import make_subplots
import scipy.fftpack                 # discrete Fourier transforms

pio.templates['shahin'] = pio.to_templated(go.Figure().update_layout(margin=dict(t=0,r=0,b=40,l=40))).layout.template
pio.templates.default = 'shahin'

Fast Fourier Transform

Preamble

In [55]:
# used to create block diagrams
%reload_ext xdiag_magic
%xdiag_output_format svg
    
import numpy as np                   # for multi-dimensional containers
import pandas as pd                  # for DataFrames
import plotly.graph_objects as go    # for data visualisation
import plotly.io as pio              # to set shahin plot layout
from plotly.subplots import make_subplots
import scipy.fftpack                 # discrete Fourier transforms

pio.templates['shahin'] = pio.to_templated(go.Figure().update_layout(margin=dict(t=0,r=0,b=40,l=40))).layout.template
pio.templates.default = 'shahin'

In a previous section, we looked at how to create a single sine wave and visualise it in the time domain.

In [2]:
sample_rate = 1000
start_time = 0
end_time = 10

time = np.arange(start_time, end_time, 1/sample_rate)

frequency = 3
amplitude = 1
theta = 0

sinewave = amplitude * np.sin(2 * np.pi * frequency * time + theta)

fig = go.Figure(layout=dict(xaxis=dict(title='Time (sec)'),yaxis=dict(title='Amplitude')))
fig.add_scatter(x=time, y=sinewave)
fig.show()
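As a preview of where a Fourier transform takes us, the sketch below recovers the frequency of this sine wave from its samples. It uses NumPy's `np.fft` module instead of the `scipy.fftpack` import in the preamble, purely to keep the example self-contained, and it repeats the signal definition so that it runs on its own.

```python
import numpy as np

sample_rate = 1000
start_time = 0
end_time = 10

time = np.arange(start_time, end_time, 1/sample_rate)

frequency = 3
amplitude = 1
theta = 0

sinewave = amplitude * np.sin(2 * np.pi * frequency * time + theta)

# magnitude spectrum of the real-valued signal
spectrum = np.abs(np.fft.rfft(sinewave))

# frequency (in Hz) that each spectrum bin corresponds to
frequencies = np.fft.rfftfreq(len(sinewave), d=1/sample_rate)

# the largest peak should sit at the frequency of the sine wave
peak_frequency = frequencies[np.argmax(spectrum)]
print(peak_frequency)
```

The peak should land at, or very close to, the 3 Hz we used to generate the signal.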