Gács-Körner Common Information

The Gács-Körner common information [5] takes a very direct approach to the idea of common information: it extracts a random variable that is contained within each of the random variables under consideration.

The Common Information Game

Let’s play a game. We have an n-variable joint distribution and one player for each variable. Each player is given the probability mass function of the joint distribution and is then isolated from the others. In each round of the game a joint outcome is generated from the distribution, and each player is told the symbol that their particular variable took. The goal is for the players to simultaneously write the same symbol on a piece of paper, and for the entropy of the players’ symbols to be as large as possible. They must do this using only their knowledge of the joint random variable and the particular outcome of their own marginal variable. The matching symbol produced by the players is called the common random variable, and its entropy is the Gács-Körner common information, \(\K\).
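
As an illustration (not part of the original text), here is one winning strategy for the distribution used in the canonical example below; the symbols 'a', 'b', and 'c' are arbitrary labels chosen for this sketch:

>>> # Joint outcomes of (X0, X1) with positive probability (see the canonical example below).
>>> outcomes = ['00', '01', '10', '11', '22', '33']
>>> # Each player's strategy is a fixed lookup table from their own symbol to a written symbol.
>>> f = {'0': 'c', '1': 'c', '2': 'a', '3': 'b'}   # player 0, who sees x0
>>> g = {'0': 'c', '1': 'c', '2': 'a', '3': 'b'}   # player 1, who sees x1
>>> all(f[x0] == g[x1] for x0, x1 in outcomes)     # the players agree on every outcome
True

With the probabilities given below, the written symbol is 'c' with probability 1/2 and 'a' or 'b' with probability 1/4 each, so its entropy is 1.5 bits, which turns out to be the best achievable for that distribution.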

Two Variables

Consider a joint distribution over \(X_0\) and \(X_1\). We want a function \(f(X_0)\) and a function \(g(X_1)\) such that for every outcome \(x_0 x_1\) of \(X_0 X_1\) with positive probability, \(f(x_0) = g(x_1) = v\). Of all such pairs of functions \(f(X_0) = g(X_1) = V\), there exists a “largest” one, known as the common random variable \(X_0 \meet X_1\) (the meet). The entropy of that common random variable is the Gács-Körner common information:

\[\begin{split}\K[X_0 : X_1] &= \max_{f(X_0) = g(X_1) = V} \H[V] \\ &= \H[X_0 \meet X_1]\end{split}\]
The Gács-Körner common information \(\K[X_0:X_1]\)

As a canonical example, consider the following:

>>> from __future__ import division
>>> from dit import Distribution as D
>>> from dit.algorithms import common_information as K
>>> outcomes = ['00', '01', '10', '11', '22', '33']
>>> pmf = [1/8, 1/8, 1/8, 1/8, 1/4, 1/4]
>>> d = D(outcomes, pmf)
>>> K(d)
1.5

So, the Gács-Körner common information is 1.5 bits. But what is the common random variable?

>>> from dit.algorithms import insert_meet
>>> crv = insert_meet(d, -1, [[0],[1]])
>>> print(crv)
Class:          Distribution
Alphabet:       (('0', '1', '2', '3'), ('0', '1', '2', '3'), ('2', '0', '1'))
Base:           linear
Outcome Class:  str
Outcome Length: 3
RV Names:       None

x     p(x)
002   0.125
012   0.125
102   0.125
112   0.125
220   0.25
331   0.25

Looking at the third index of the outcomes, we see that the common random variable maps 2 to 0 and 3 to 1, preserving the information in those values. When \(X_0\) or \(X_1\) is 0 or 1, however, it maps them to 2. This is because \(f\) and \(g\) must act independently: if \(x_0\) is a 0 or a 1, there is no way to know whether \(x_1\) is a 0 or a 1, and vice versa. Therefore the 0s and 1s are aggregated into a single symbol, 2.
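
The aggregation can also be seen by grouping symbols that co-occur with positive probability; the common random variable is constant on each resulting group. Here is a rough plain-Python sketch of that idea (this is not how dit computes the meet internally), reusing the outcomes list from above:

>>> groups = []
>>> for x0, x1 in outcomes:
...     # merge the X0-symbol and X1-symbol of this outcome into one group
...     merged = {('X0', x0), ('X1', x1)}
...     rest = []
...     for grp in groups:
...         if grp & merged:
...             merged |= grp
...         else:
...             rest.append(grp)
...     groups = rest + [merged]
>>> len(groups)   # one group per value of the common random variable: {0, 1}, {2}, {3}
3

The three groups carry probabilities 1/2, 1/4, and 1/4, so their entropy is the 1.5 bits computed above.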

Properties & Uses

The Gács-Körner common information satisfies an important inequality:

\[0 \leq \K[X_0:X_1] \leq \I[X_0:X_1]\]
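
As a quick check with dit (reusing D and K from the example above, and importing mutual_information under the name I), here is a distribution for which the inequality is strict: the variables are correlated, yet they share no deterministic common part:

>>> from dit.algorithms import mutual_information as I
>>> d2 = D(['00', '01', '10', '11'], [3/8, 1/8, 1/8, 3/8])
>>> K(d2) == 0                 # the meet is a constant, so the common information vanishes
True
>>> I(d2, [0], [1]) > 0        # yet the variables carry positive mutual information
True

For the canonical example above, by contrast, the bound is tight: \(\K[X_0:X_1] = \I[X_0:X_1] = 1.5\) bits.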

One use of the common information is as a measure of redundancy [4]. Consider a function that takes two inputs, \(X_0\) and \(X_1\), and produces a single output \(Y\). The output can be influenced redundantly by both inputs, uniquely by either one, or synergistically by the two together. Determining how to quantify the redundant influence is an open problem, but one proposal is:

\[\I[X_0 \meet X_1 : Y]\]
The zero-error redundancy \(\I[X_0 \meet X_1 : Y]\)

This quantity can be computed easily using dit:

>>> from dit.example_dists import RdnXor
>>> from dit.algorithms import insert_meet, mutual_information as I
>>> d = RdnXor()
>>> d = insert_meet(d, -1, [[0], [1]])
>>> I(d, [3], [2])
1.0
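
Here RdnXor combines a redundant bit, copied into both inputs and the output, with an XOR of two independent input bits; the meet of the inputs recovers exactly the copied bit, so this measure reports 1 bit of redundancy.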

\(n\)-Variables

With an arbitrary number of variables, the Gács-Körner common information is defined similarly:

\[\begin{split}\K[X_0 : \ldots : X_n] &= \max_{\substack{V = f_0(X_0) \\ \vdots \\ V = f_n(X_n)}} \H[V] \\ &= \H[X_0 \meet \ldots \meet X_n]\end{split}\]

The common information is monotonically non-increasing as more variables are added:

\[\K[X_0 : \ldots : X_{n-1}] \ge \K[X_0 : \ldots : X_n]\]
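
For instance (a small illustrative check using dit, with D and K imported as in the examples above):

>>> # Three copies of one fair bit: the bit itself is the common random variable.
>>> giant_bit = D(['000', '111'], [1/2, 1/2])
>>> K(giant_bit)
1.0
>>> # Appending an independent fair coin as a fourth variable destroys the common part.
>>> with_noise = D(['0000', '0001', '1110', '1111'], [1/4]*4)
>>> K(with_noise) == 0
True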

The multivariate common information obeys an inequality analogous to the two-variable version:

\[0 \leq \K[X_0 : \dots : X_n] \leq \min_{i, j \in \{0, \dots, n\}} \I[X_i : X_j]\]
The Gács-Körner common information \(\K[X_0 : \ldots : X_n]\)
common_information(dist, rvs=None, crvs=None, rv_names=None)

Returns the Gacs-Korner common information K[X1:X2...] over the random variables in rvs.

Parameters:
  • dist (Distribution) – The distribution from which the common information is calculated.
  • rvs (list, None) – The indexes of the random variables for which the Gacs-Korner common information is to be computed. If None, then the common information is calculated over all random variables.
  • crvs (list, None) – The indexes of the random variables to condition the common information on. If None, then there is no conditioning.
  • rv_names (bool, None) – If True, then the elements of rvs are treated as random variable names. If False, then the elements of rvs are treated as random variable indexes. If None, then the value True is used if the distribution has specified names for its random variables.
Returns:

K (float) – The Gacs-Korner common information of the distribution.
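
A hedged usage sketch (the call below assumes rvs accepts a flat list of variable indexes, as the parameter description suggests; the distribution is the two-input XOR):

>>> from dit import Distribution
>>> from dit.algorithms import common_information
>>> # Two independent input bits and their XOR.
>>> xor = Distribution(['000', '011', '101', '110'], [1/4]*4)
>>> common_information(xor, rvs=[0, 1]) == 0   # the inputs alone share nothing
True
>>> common_information(xor) == 0               # nor do all three variables jointly
True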
