Information: processed data
vs.
Knowledge: information that has been modeled to be useful
We need INFORMATION to be able to get KNOWLEDGE
X : random variable with distribution p(x)
Self-information of an outcome x:
$I(x) = \log_2\left(\frac{1}{p(x)}\right)$
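A minimal sketch of the definition above in Python; the function name `self_information` is illustrative, not from the notes:

```python
import math

def self_information(p: float) -> float:
    """Bits of surprise for an outcome with probability p: log2(1/p)."""
    return math.log2(1.0 / p)

# A fair coin flip (p = 0.5) carries exactly 1 bit.
print(self_information(0.5))    # 1.0
# A rarer event (p = 1/8) is more surprising: 3 bits.
print(self_information(0.125))  # 3.0
```

Note the intuition: halving the probability of an outcome adds one bit of surprise.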
Entropy H(Y) of a random variable Y:
the expected number of bits needed to encode a randomly drawn value of Y
$H(Y) = -\sum_{k=1}^{K} P(Y=k)\log_2 P(Y=k) = \sum_{k=1}^{K} P(Y=k)\log_2\left(\frac{1}{P(Y=k)}\right)$
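The entropy formula can be sketched directly in Python; `entropy` is an illustrative name, and terms with zero probability are skipped since they contribute nothing to the sum:

```python
import math

def entropy(probs) -> float:
    """H(Y) = -sum_k P(Y=k) * log2 P(Y=k), skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Fair coin: maximal uncertainty over 2 outcomes -> 1 bit.
print(entropy([0.5, 0.5]))   # 1.0
# Biased coin: less uncertainty than a fair one.
print(entropy([0.9, 0.1]))
# Uniform over 4 outcomes -> 2 bits.
print(entropy([0.25] * 4))   # 2.0
```

Entropy is just the expectation of the self-information $\log_2(1/P(Y=k))$ under the distribution of Y, so a uniform distribution (every outcome equally surprising) maximizes it.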