
Professor Jo Jeonghyo’s Information Geometry and Machine Learning Trilogy

Haebom
Professor Jo Jeonghyo of the Department of Physics at Seoul National University wrote this series last year. It shows how information geometry and machine learning are connected, and just how crucial mathematics is to the field of AI.
I picked it up because a PhD acquaintance of mine recommended it, and I found it not only fascinating but also very well written, so I wanted to share it here too.
In machine learning, models are generally categorized as either classification models or generative models.
You can represent models using probabilities.
Classification models are described by the conditional probability P(y|x;θ), while generative models correspond to the probability P(x;θ) for the data x.
With a probabilistic model, you can generate samples by drawing data x in proportion to their probability.
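To make the two kinds of models concrete, here is a minimal sketch in NumPy (the categorical probabilities and the logistic weights are made-up values, not taken from the series): a generative model P(x; θ) over a small discrete alphabet that we sample from in proportion to its probabilities, and a classification model P(y|x; θ) given by logistic regression.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generative model P(x; theta): a categorical distribution over 4 symbols.
theta_gen = np.array([0.1, 0.2, 0.3, 0.4])           # hypothetical parameters
samples = rng.choice(len(theta_gen), size=10, p=theta_gen)
print("samples from P(x; theta):", samples)          # high-probability symbols appear more often

# Classification model P(y | x; theta): logistic regression with weights w and bias b.
w, b = np.array([1.5, -2.0]), 0.3                    # hypothetical parameters
x = np.array([0.8, 0.1])
p_y_given_x = 1.0 / (1.0 + np.exp(-(w @ x + b)))     # P(y = 1 | x; theta)
print("P(y=1 | x; theta) =", p_y_given_x)
```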
The explanation centers on probability models from the exponential family.
For exponential-family models, the cumulants (mean, variance, and higher-order statistics) can be computed from the cumulant generating function, i.e. the log-partition function.
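As a minimal sketch of this fact, using the Bernoulli family as the example (my choice, not necessarily the one used in the series): the log-partition function A(θ) = log(1 + e^θ) acts as the cumulant generating function, so its first and second derivatives give the mean and the variance.

```python
import numpy as np

def log_partition(theta):
    # Cumulant generating function A(theta) of the Bernoulli exponential family.
    return np.log1p(np.exp(theta))

theta = 0.7
eps = 1e-4
# Numerical first and second derivatives of A(theta).
mean = (log_partition(theta + eps) - log_partition(theta - eps)) / (2 * eps)
var = (log_partition(theta + eps) - 2 * log_partition(theta) + log_partition(theta - eps)) / eps**2

p = 1 / (1 + np.exp(-theta))              # mean parameter of the Bernoulli
print(mean, p)                            # A'(theta)  ~= E[x]   = p
print(var, p * (1 - p))                   # A''(theta) ~= Var[x] = p(1 - p)
```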
The Bregman divergence (often called the Bregman distance, though it is not symmetric) can be used to define the distance between different probability models.
For exponential-family models, the Bregman divergence satisfies a generalized Pythagorean theorem.
By using projections in both the original parameter space and its dual space, you can compute the distance between models.
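A minimal sketch of the Bregman divergence D_F(p, q) = F(p) − F(q) − ⟨∇F(q), p − q⟩, with two generating functions chosen by me for illustration: the squared norm, which recovers the squared Euclidean distance, and the negative entropy, which recovers the KL divergence on the probability simplex.

```python
import numpy as np

def bregman(F, grad_F, p, q):
    # Bregman divergence D_F(p, q) = F(p) - F(q) - <grad F(q), p - q>.
    return F(p) - F(q) - grad_F(q) @ (p - q)

# F(x) = 0.5 * ||x||^2  -> half the squared Euclidean distance.
sq = lambda x: 0.5 * x @ x
sq_grad = lambda x: x

# F(p) = sum p log p (negative entropy) -> KL divergence on the simplex.
negent = lambda p: np.sum(p * np.log(p))
negent_grad = lambda p: np.log(p) + 1.0

p = np.array([0.6, 0.3, 0.1])
q = np.array([0.2, 0.5, 0.3])
print(bregman(sq, sq_grad, p, q), 0.5 * np.sum((p - q) ** 2))          # identical values
print(bregman(negent, negent_grad, p, q), np.sum(p * np.log(p / q)))   # identical values (KL)
```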
The Kullback–Leibler (KL) divergence is one way to define the distance between probability models.
The KL divergence can be used to characterize sufficient statistics, which compress the data without losing information about the model.
Sufficient statistics contain all the information necessary to estimate the model's parameters; other information is irrelevant.
When you transform the data before measuring the distance between probability models, information can only be lost, so the distance can only decrease or stay the same (the data-processing inequality).
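A minimal sketch of that monotonicity (the two discrete distributions below are made up for illustration): merging outcomes of a discrete variable is a deterministic transformation of the data, and the KL divergence computed after the merge is never larger than before.

```python
import numpy as np

def kl(p, q):
    # Kullback-Leibler divergence KL(p || q) for discrete distributions.
    return np.sum(p * np.log(p / q))

p = np.array([0.50, 0.20, 0.20, 0.10])
q = np.array([0.25, 0.25, 0.25, 0.25])

# Coarse-grain: merge outcomes {0,1} and {2,3} into two bins (a deterministic map of x).
p_coarse = np.array([p[0] + p[1], p[2] + p[3]])
q_coarse = np.array([q[0] + q[1], q[2] + q[3]])

print(kl(p, q))                  # divergence on the original variable
print(kl(p_coarse, q_coarse))    # divergence after the transformation: never larger
```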
The f-divergence is a family of distance measures defined by a convex function f with f(1) = 0.
The Kullback–Leibler divergence is a particular f-divergence, and it is invariant under sufficient statistics.
The KL divergence is frequently used in machine learning to compare models or data distributions.
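A minimal sketch of the f-divergence D_f(P || Q) = Σ_x q(x) f(p(x)/q(x)), with two convex functions chosen for illustration: f(t) = t log t recovers the KL divergence, and f(t) = ½|t − 1| recovers the total variation distance.

```python
import numpy as np

def f_divergence(f, p, q):
    # D_f(P || Q) = sum_x q(x) * f(p(x) / q(x)), with f convex and f(1) = 0.
    return np.sum(q * f(p / q))

p = np.array([0.6, 0.3, 0.1])
q = np.array([0.2, 0.5, 0.3])

kl_f = lambda t: t * np.log(t)            # f for the KL divergence
tv_f = lambda t: 0.5 * np.abs(t - 1.0)    # f for the total variation distance

print(f_divergence(kl_f, p, q), np.sum(p * np.log(p / q)))    # same value
print(f_divergence(tv_f, p, q), 0.5 * np.sum(np.abs(p - q)))  # same value
```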
Gradient descent is the basic tool for optimizing model parameters.
Natural gradient descent updates the parameters more efficiently because it takes the curvature of the parameter space, measured by the Fisher information, into account.
Natural gradient descent is invariant under reparameterizations (for example, changes of scale) of the parameters.
Gradient descent and natural gradient descent are closely related: through the convex potential and its Legendre transform, an ordinary gradient step in the dual coordinates corresponds to a natural gradient step in the original coordinates.
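A minimal sketch comparing the two update rules on a one-parameter Bernoulli model in its natural parameter θ (the data mean, step size, and iteration count are my own toy choices): the gradient of the negative log-likelihood is σ(θ) − x̄, and the natural gradient divides it by the Fisher information σ(θ)(1 − σ(θ)).

```python
import numpy as np

sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

x_bar = 0.8                    # empirical mean of hypothetical 0/1 data
lr = 0.5
theta_gd = theta_ngd = -2.0    # same (poor) starting point for both methods

for _ in range(20):
    # Gradient of the negative log-likelihood of a Bernoulli model in its natural parameter.
    grad_gd = sigmoid(theta_gd) - x_bar
    grad_ngd = sigmoid(theta_ngd) - x_bar
    fisher = sigmoid(theta_ngd) * (1.0 - sigmoid(theta_ngd))   # Fisher information I(theta)

    theta_gd -= lr * grad_gd                 # ordinary gradient descent
    theta_ngd -= lr * grad_ngd / fisher      # natural gradient descent (precondition by 1/I)

theta_star = np.log(x_bar / (1.0 - x_bar))   # maximum-likelihood natural parameter
print(theta_gd, theta_ngd, theta_star)       # natural gradient ends up much closer to theta_star
```

With the same step size and starting point, the natural gradient iterate ends up much closer to the maximum-likelihood value, because each step is rescaled by the local Fisher information rather than treating all directions of the parameter space equally.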
Mirror descent, which performs the update in the dual space, includes natural gradient descent as a special case.
Mirror descent achieves the effect of natural gradient descent without having to explicitly compute the curvature (the Fisher information matrix).
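A minimal sketch of a mirror descent step with the negative entropy as the mirror map, on a toy objective of my own choosing: the plain gradient step is taken in the dual (log-probability) space, which yields a multiplicative update that stays on the probability simplex, with no Fisher matrix ever formed.

```python
import numpy as np

def mirror_descent_step(p, grad, eta):
    # Negative-entropy mirror map: move to the dual space (log p), take a plain
    # gradient step there, then map back and renormalize onto the simplex.
    logits = np.log(p) - eta * grad
    w = np.exp(logits - logits.max())      # subtract the max for numerical stability
    return w / w.sum()

# Toy objective on the simplex: L(p) = 0.5 * ||p - target||^2 (hypothetical).
target = np.array([0.7, 0.2, 0.1])
loss_grad = lambda p: p - target

p = np.ones(3) / 3.0                       # start from the uniform distribution
for _ in range(200):
    p = mirror_descent_step(p, loss_grad(p), eta=0.5)

print(p)   # close to the target, and every iterate stayed on the simplex
```

This is the exponentiated-gradient form of mirror descent; the curvature information is carried implicitly by the mirror map rather than by an explicit Fisher matrix.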
Using a data-dependent Bregman divergence lets you update the model parameters with the data taken into account.
The distance between models and the curvature of the objective function are key geometric concepts in machine learning.
The development of information geometry and machine learning is closely interconnected.
You may find this useful as a reference: Maths_for_ML.pdf (5.47 MB)