Hidden Markov Processes

Preliminaries

Information Theory

Convex and Concave Functions

Let us first start with the formal definition of a convex set, which is stated in terms of a convex combination of 2 vectors. We will then break the definition into parts and analyze it. Finally, we will formally define and analyze a convex combination of a finite number of vectors in the same manner.

A subset \(S \subseteq \mathbb{R}^{n}\) is said to be a convex set if:

\(\lambda x + (1 - \lambda)y \in S \;\forall \lambda \in [0,1], \forall x,y \in S\)

Our formal definition here involves a convex combination of 2 vectors. The vectors \(x\) and \(y\) are elements of the subset \(S\), which is a subset of \(\mathbb{R}^{n}\), the space of \(n\)-dimensional real vectors. The variable \(\lambda\) takes any value between 0 and 1. The set \(S\) is convex if the combination \(\lambda x + (1 - \lambda)y\) is again an element of \(S\) for every such choice of \(x\), \(y\), and \(\lambda\).
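To make the condition concrete, here is a minimal numerical sketch (not from the source; the unit disk and the sample points are illustrative choices) that checks the defining condition for two points of the closed unit disk in \(\mathbb{R}^{2}\), a familiar convex set.

```python
import numpy as np

def in_unit_disk(v):
    """Membership test for the closed unit disk in R^2, which is a convex set."""
    return np.linalg.norm(v) <= 1.0

x = np.array([0.6, 0.0])
y = np.array([0.0, -0.8])

# If the set is convex, lambda*x + (1 - lambda)*y must lie in the set
# for every lambda in [0, 1].
for lam in np.linspace(0.0, 1.0, 11):
    z = lam * x + (1.0 - lam) * y
    assert in_unit_disk(z), f"combination left the set at lambda={lam}"

print("every tested combination of x and y stays in the unit disk")
```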

Now let us formally define a convex combination of any finite number of vectors. Suppose that \(S \subseteq \mathbb{R}^{n}\) and \(x_{1},\ldots,x_{k}\in S\). Then a vector of the form

\(y=\sum_{i=1}^{k}\lambda_{i}x_{i}, \quad \lambda_{i} \geq 0 \;\forall i, \quad \sum_{i=1}^{k} \lambda_{i}=1\)

is called a convex combination of the vectors \(x_{1},\ldots,x_{k}\). Here every \(\lambda_{i}\) is greater than or equal to 0, and the summation of all the \(\lambda_{i}\) is equal to 1.
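The following short sketch (an illustration added here, with arbitrary example vectors and weights) computes such a convex combination in \(\mathbb{R}^{2}\) and checks the two conditions on the weights \(\lambda_{i}\).

```python
import numpy as np

# Example vectors x_1, ..., x_k in R^n (here k = 3, n = 2).
x = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0, 2.0]])

# Weights lambda_1, ..., lambda_k: each non-negative, summing to 1.
lam = np.array([0.5, 0.3, 0.2])
assert np.all(lam >= 0.0), "every lambda_i must be >= 0"
assert np.isclose(lam.sum(), 1.0), "the lambda_i must sum to 1"

# The convex combination y = sum_i lambda_i * x_i.
y = lam @ x
print(y)  # [0.9, 0.7]
```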

Entropy

Definition of Entropy

Let us first formally define entropy. Say \(\mu \in S_{n}\) is an \(n\)-dimensional probability distribution. The entropy of the distribution is denoted \(H(\mu)\) and is defined by:

\(H(\mu) := \sum_{i=1}^{n}\mu_{i}\log(1/\mu_{i}) = -\sum_{i=1}^{n}\mu_{i}\log\mu_{i}\)

Here, \(\mu\) is a probability distribution drawn from the set \(S_{n}\), the set of all \(n\)-dimensional probability distributions. The entropy of \(\mu\) is the value of the function \(H\) at \(\mu\). We take the sum, over every index \(i\), of the probability \(\mu_{i}\) multiplied by \(\log(1/\mu_{i})\). The result of this summation over every probability in the distribution is the entropy of \(\mu\).
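A small sketch of this computation follows (illustrative only; the source does not fix the base of the logarithm, so base 2 is assumed here, giving entropy in bits).

```python
import numpy as np

def entropy(mu, base=2.0):
    """H(mu) = sum_i mu_i * log(1/mu_i), with 0 * log(1/0) treated as 0."""
    mu = np.asarray(mu, dtype=float)
    assert np.all(mu >= 0.0) and np.isclose(mu.sum(), 1.0), "mu must be a probability distribution"
    nz = mu[mu > 0.0]                       # zero-probability entries contribute nothing
    return float(-np.sum(nz * np.log(nz)) / np.log(base))

print(entropy([0.5, 0.5]))                  # 1.0 bit: the maximum for n = 2
print(entropy([1.0, 0.0]))                  # 0.0: a sure outcome has no uncertainty
print(entropy([0.25, 0.25, 0.25, 0.25]))    # 2.0 bits: uniform distribution on n = 4
```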

Properties of the Entropy Function

To continue.