
Game-Benchmark for Evolutionary Algorithms

On this site, we provide some details on the constructed benchmark. Everything is work in progress.

The benchmark was integrated into the COCO framework in order to make it convenient to run and analyse. You can find it on GitHub here. For now, it is on the rw-new-suite branch. For more details on the COCO framework, including how to run your own algorithms and post-processing, please check out the separate documentation.
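
Below is a minimal sketch of how an algorithm might be run on one of the suites through COCO's Python module cocoex, using plain random search as a stand-in optimiser. The observer name and option strings are assumptions based on the standard COCO example experiment; please check the documentation on the rw-new-suite branch for the exact identifiers.

    # Sketch: random search on the rw-top-trumps suite via cocoex.
    # Observer name and option strings are assumptions; consult the
    # rw-new-suite branch documentation for the exact identifiers.
    import numpy as np
    import cocoex

    suite = cocoex.Suite("rw-top-trumps", "", "")
    observer = cocoex.Observer("rw-top-trumps",
                               "result_folder: random-search-on-rw-top-trumps")

    for problem in suite:
        problem.observe_with(observer)      # log evaluations for post-processing
        rng = np.random.default_rng(1)
        budget = 500 * problem.dimension    # e.g. 500 * dimension, as in the results below
        best_f = float("inf")
        for _ in range(budget):
            x = rng.uniform(problem.lower_bounds, problem.upper_bounds)
            f = problem(x)                  # one game-based evaluation
            best_f = min(best_f, f)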

Function Suites

In the following, we describe the function suites contained in the GBEA benchmark. For more detailed and technical descriptions of the context, please refer to the respective papers. For each of the suites, we also present the data we obtained when running established optimisation algorithms. Additionally, we computed some ELA features using the R package flacco, which we also make available.

Top Trumps

The Top Trumps problem suites are based on Demonstrating the Feasibility of Automatic Game Balancing, where the optimisation task is to fine-tune the values on the cards of a Top Trumps deck. The optimisation is based on fitness functions taken from the paper, but the game was re-implemented in C++ to allow for easier integration into COCO.

Below is an illustrative video of the game. For a detailed description, please refer to the paper. You can also find a more detailed and formalised description of the fitness functions there.

Game Rules

  1. Shuffle deck and distribute evenly among players
  2. Starting player chooses a characteristic (category)
  3. All players compare the corresponding values on their cards
  4. The player with the highest value wins the trick
  5. The winning player announces a new characteristic; go to step 3
  6. The game ends once all cards have been played exactly once

Fitness Functions

We provide two function suites related to Top Trumps. The first, rw-top-trumps, comprises 5 single-objective functions. The second, rw-top-trumps-biobj, contains 3 bi-objective functions.

A solution to the problem is an array that specifies all values of all cards in a deck. Each deck has 4 categories, so the search space dimension is a multiple of 4 and depends only on the number of cards specified to be in the deck.

The instances represent different potential themes of a deck. We express that by allowing different value ranges for each category, depending on the instance. The exact boundaries are generated using the instance number as a random seed.
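
The snippet below illustrates this encoding with hypothetical helpers: it only shows how a solution vector can be read as a deck and how per-category value ranges could be derived from the instance number used as a seed. The actual bounds are produced inside the C++ implementation and may differ.

    import numpy as np

    N_CATEGORIES = 4  # fixed number of categories per deck

    def decode_deck(x):
        """Read a solution vector as a deck: one row per card, one column per category."""
        x = np.asarray(x)
        assert x.size % N_CATEGORIES == 0, "dimension must be a multiple of 4"
        return x.reshape(-1, N_CATEGORIES)

    def instance_value_ranges(instance):
        """Hypothetical illustration of instance-seeded value ranges per category."""
        rng = np.random.default_rng(instance)  # instance number as random seed
        lower = rng.uniform(0, 50, N_CATEGORIES)
        upper = lower + rng.uniform(10, 50, N_CATEGORIES)
        return lower, upper

    deck = decode_deck(np.zeros(88))  # an 88-dimensional solution is a deck of 22 cards
    print(deck.shape)                 # (22, 4)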

rw-top-trumps
    objectives    1
    dimensions    88, 128, 168, 208 (= 4 × 22, 32, 42, 52 cards)
    functions     5
    instances     15

    function id   description
    1             deck hypervolume
    2             standard deviation of category averages
    3             win rate of the better player
    4             switches of the trick winner
    5             trick difference at the end of the game

rw-top-trumps-biobj
    objectives    2
    dimensions    88, 128, 168, 208 (= 4 × 22, 32, 42, 52 cards)
    functions     3
    instances     15

    function id   description
    1             (function 1, function 2)
    2             (function 3, function 4)
    3             (function 5, function 4)

Results

The data was obtained by running CMA-ES and SMS-EMOA on the respective Top Trumps suites for dimension 128 and up to 500*dimension function evaluations. The runs contained in the data differ in terms of budget.
Optimisation
Raw performance RData:
  • rw-top-trumps:
  • rw-top-trumps-biobj:
ELA
Raw feature data:
  • rw-top-trumps:
Feature correlation plots:
  • rw-top-trumps:

Mario GAN

The MarioGAN suite is based on Evolving Mario Levels in the Latent Space of a Deep Convolutional Generative Adversarial Network, where the optimisation task is to find good Mario levels in a specifically defined search space. The fitness of the discovered levels is measured using several functions as surveyed in Understanding Mario: An Evaluation of Design Metrics for Platformers. The existing implementation from Evolving Mario Levels in the Latent Space of a Deep Convolutional Generative Adversarial Network, as available on GitHub here, was modified to be integrated with COCO. In addition, new training examples were added and the encoding was replaced. The modified version of the code can be accessed on the gbea branch.

Below is an illustrative video of the generation method. For a detailed description, please refer to the paper. You can also find a more detailed and formalised description of the fitness functions there.

Fitness Functions

rw-gan-mario
    objectives    1
    dimensions    10, 20, 30, 40
    functions     84
    instances     5

Results

The data was obtained by running CMA-ES and Random Search (RS) on the rw-gan-mario suite for dimension 10 and up to 1000*dimension function evaluations. The runs contained in the data differ in terms of budget.
Optimisation
Raw performance RData:
  • CMA-ES:
  • Random Search:
ELA
Raw feature data:
  • CMA-ES:
  • Random Search:
Feature correlation plots:
  • CMA-ES:
  • Random Search:

Problem Submission

We will be using the COCO framework to produce benchmarking results based on your submitted problem. To do that, we need the following information from you:

  • possible search space dimensions
  • possible objective space dimensions
  • number of instances

There are three ways to submit your problem to us; they are detailed below in order of preference.

Server

If you have your problem running on a server, we just need an interface to your evaluation function. In order to keep calls consistent between all submissions, please adhere to the following specifications and formats. We want to be able to send our request with the following parameters to the address and port you provide.

n d i x_1 x_2 ... x_n, where
  • n is the search space dimension
  • d is the objective space dimension
  • i is the instance identifier
  • x_1 to x_n are the input values

You should then return the objective value(s) y_1 to y_d as

y_1 y_2 ... y_d.
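
As an illustration, the sketch below shows one way such an endpoint could look as a plain TCP service in Python. The transport, the port, and the placeholder evaluate() are assumptions; only the request and response formats follow the specification above.

    # Sketch of a TCP endpoint for the "n d i x_1 ... x_n" -> "y_1 ... y_d" exchange.
    # Port and evaluate() are placeholders; plug in your own evaluation function.
    import socket

    def evaluate(n, d, instance, x):
        return [sum(xi ** 2 for xi in x)] * d   # placeholder objective value(s)

    with socket.create_server(("0.0.0.0", 7000)) as server:
        while True:
            conn, _ = server.accept()
            with conn:
                fields = conn.recv(65536).decode().split()
                n, d, i = int(fields[0]), int(fields[1]), int(fields[2])
                x = [float(v) for v in fields[3:3 + n]]
                y = evaluate(n, d, i, x)
                conn.sendall((" ".join(f"{v:.12g}" for v in y) + "\n").encode())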

Code

If you prefer to have us run the code locally, we need an interface to your evaluation function. To make things easier, please adhere to the following specifications and formats. We want to be able to call your code from the command line with the following parameters

n d i x_1 x_2 ... x_n, where
  • n is the search space dimension
  • d is the objective space dimension
  • i is the instance identifier
  • x_1 to x_n are the input values

You should then return the objective value(s) y_1 to y_d to stdout as

y_1 y_2 ... y_d.
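
A minimal sketch of such a command-line wrapper in Python is shown below; only the argument parsing and the output format follow the specification, while evaluate() is a placeholder for your actual problem.

    #!/usr/bin/env python3
    # Sketch of the command-line interface: evaluate.py n d i x_1 ... x_n
    import sys

    def evaluate(n, d, instance, x):
        return [sum(xi ** 2 for xi in x)] * d   # placeholder objective value(s)

    if __name__ == "__main__":
        n, d, i = int(sys.argv[1]), int(sys.argv[2]), int(sys.argv[3])
        x = [float(v) for v in sys.argv[4:4 + n]]
        print(" ".join(f"{v:.12g}" for v in evaluate(n, d, i, x)))

Called as python evaluate.py 2 1 1 0.5 -0.3, this prints a single objective value to stdout.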

Please include a readme file with

  • details on the setup and installation process
  • an example of calling your problem from the command line

The code will be run on an Ubuntu machine, so please make it either platform independent or executable on Ubuntu.

Data

Finally, if none of the above methods seems feasible to you, you can also send us output values as raw data. For this option, please contact us at gamesbench@lists.cs.uni-dortmund.de. We will then provide you with a text file containing input vectors for each possible combination of search space dimension n, objective space dimension d, and instance as

x_1 x_2 ... x_n separated by linebreaks.

For each input vector in each text file, you should then compute the objective value(s) and send them back to us in the same format, i.e. text files that include output vectors as

x_1 x_2 ... x_n : y_1 y_2 ... y_d separated by linebreaks.
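
For illustration, a small script along the following lines could produce the required output files; the file names and evaluate() are placeholders.

    # Sketch: read the provided input vectors, evaluate them, and write the
    # "x_1 ... x_n : y_1 ... y_d" format back out. File names are placeholders.
    def evaluate(x):
        return [sum(xi ** 2 for xi in x)]   # placeholder objective value(s)

    with open("inputs.txt") as fin, open("outputs.txt", "w") as fout:
        for line in fin:
            if not line.strip():
                continue
            x = [float(v) for v in line.split()]
            y = evaluate(x)
            fout.write(" ".join(map(str, x)) + " : " + " ".join(map(str, y)) + "\n")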

If you are interested in this option, please note that we would like to receive the data by the problem submission deadline on June 30th, 2018. Therefore, please contact us early enough to be able to compute the output values in time. We are looking for at least 50 samples per input dimension if possible.