Calculates the Solow-Polasky measure [1] for a set of points given as columns of a numeric matrix.

solow_polasky(x, d = NULL, theta = 1, ...)

Arguments

x

[matrix]
Numeric input matrix (e.g., a EA-population) where each column holds one point / individual.

d

[matrix | NULL]
Optional distance matrix. If NULL, the default, the Euclidean distance is calculated automatically via dist.

theta

[numeric(1)]
Single scalar value (see details section). Default is 1.

...

[any]
Further argument passed down to dist in case d is NULL.

Value

Solow-Polasky diversity measure (scalar numeric value).

Details

This measure was introduced by Solow and Polasky back in 1994 to measure the amount of diversity between species in biology [1]. Later, Ulrich and Thiele [2] adopted this measures for Evolutionary Diversity Optimization where the goal is to come up with a population \(P = \{P_1, \ldots, P_{\mu}\}\) of \(\mu\) individuals such that the following holds:

  1. All individuals in \(P\) adhere to a minimum quality threhold, i.e., \(f(x) \leq v_{\min}\) for some threhold value \(v_{\min}\) (we assume the fitness function \(f\) w.l.o.g. to be minimized).

  2. The population should be “diverse” with respect to some diversity measure \(D\) that maps the population to a single scalar numeric value.

Given \(P = \{P_1, \ldots, P_{\mu}\}\) and pairwise distances \(d(P_i, P_j)\) \(1 \leq, i,j \leq \mu\) let \(M\) be a \((\mu \times \mu)\) matrix with $$ M_{ij} = \exp(-\theta \cdot d(P_i, P_j)). $$ Then the Solow-Polasky diversity is defined as $$ D_{SP}(P) = \sum_{1 \leq i,j \leq \mu} M_{ij}^{-1} \in [1, \mu] $$ where matrix \(M^{-1}\) is the Moore-Penrose generalized inverse of a matrix \(M\). \(D_{SP}\) can be interpreted as “as the number of different species in the population” [2]. Note however that the measure calculates a real valued diversty in \([1, \mu]\) and no integer value. Hence, it can be seen as a more fine-grained measure of diversity that captures a distance other than the “binary” distance \(d'\) were \(d'(P_i, P_j) = 1\) if \(P_i \neq P_j\) and \(d'(P_i, P_j) = 0\) otherwise.

The runtime is dominated by the inverse matrix calculation and hence is upper bounded by \(O(\mu^3)\). In [2], the authors propose an alternative method to update the calculation once the population changes. However, the measure implemented here is meant to be used a-posteriori to assess the “diversity” of a final population.

References

[1] Andrew R. Solow and Stephen Polasky. Measuring biological diversity. English (US). In: Environmental and Ecological Statistics 1.2 (June 1994), pp. 95–103. issn: 1352-8505. doi: 10.1007/BF02426650.

[2] Tamara Ulrich and Lothar Thiele. Maximizing population diversity in single-objective optimization. In: 13th Annual Genetic and Evolutionary Computation Conference (GECCO 2011). ACM, 2011, pp. 641–648. doi: 10.1145/2001576.2001665.

Examples

# Generate a random point cloud in [0,1] x [0,1] x = matrix(runif(100), nrow = 2L) solow_polasky(x)
#> [1] 1.912474
solow_polasky(x, theta = 2)
#> [1] 3.049745
# All points equal x = matrix(rep(1, 100), nrow = 2L) solow_polasky(x)
#> [1] 1