Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Single-pass Adaptive Image Tokenization for Minimum Program Search

Created by
  • Haebom

Authors

Shivam Duggal, Sanghyun Byun, William T. Freeman, Antonio Torralba, Phillip Isola

Outline

This paper proposes KARL, a single-pass adaptive tokenizer that, following the principles of Algorithmic Information Theory (AIT), assigns each image a variable number of tokens according to its complexity. KARL approximates the image's Kolmogorov complexity (KC) and halts token generation once the minimum description length is reached; its training procedure resembles the inverse reinforcement learning paradigm. Unlike conventional adaptive tokenizers, which must search over multiple encodings, KARL matches their performance in a single pass. The paper also presents scaling laws for factors such as encoder/decoder size and continuous versus discrete tokenization. Finally, through a conceptual study connecting adaptive image tokenization to AIT, it examines how image complexity (KC) relates to structure versus noise and to in-distribution versus out-of-distribution familiarity, and finds that the results align with human intuition.
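The stopping criterion described above (use the fewest tokens that still reconstruct the input to a target quality) can be illustrated with a toy sketch. This is not KARL's implementation: the "tokenizer" here simply lists the entries of a 1-D signal from largest to smallest magnitude, the halting decision is computed exactly rather than predicted by a learned network, and every function name is hypothetical. It only contrasts a multi-pass search over token budgets with a single pass that halts once the target error is met.

```python
def encode_prefix_tokens(signal):
    """Toy 'tokenizer' (assumption): one (index, value) token per nonzero
    entry, ordered so that any prefix is the best reconstruction of its size."""
    tokens = [(i, v) for i, v in enumerate(signal) if v != 0]
    tokens.sort(key=lambda t: abs(t[1]), reverse=True)
    return tokens


def decode(tokens, length):
    """Rebuild the signal from tokens; entries without a token stay at zero."""
    recon = [0.0] * length
    for i, v in tokens:
        recon[i] = v
    return recon


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def token_count_by_search(signal, target_error):
    """Multi-pass baseline: re-encode and decode at every budget
    until the reconstruction error meets the target."""
    for budget in range(len(signal) + 1):
        tokens = encode_prefix_tokens(signal)[:budget]
        if squared_error(signal, decode(tokens, len(signal))) <= target_error:
            return budget
    return len(signal)


def token_count_single_pass(signal, target_error):
    """Single pass: encode once, then walk the tokens and halt as soon as
    the remaining (unexplained) energy is within the target error."""
    tokens = encode_prefix_tokens(signal)
    residual = sum(v * v for v in signal)
    kept = 0
    for _, v in tokens:
        if residual <= target_error:
            break
        residual -= v * v
        kept += 1
    return kept


# A structured signal (one spike) versus a noisy-looking one.
simple = [0, 0, 5, 0, 0, 0, 0, 0]
noisy = [3, -4, 5, -2, 6, -1, 4, -3]

for name, sig in [("simple", simple), ("noisy", noisy)]:
    print(name,
          token_count_by_search(sig, target_error=1),
          token_count_single_pass(sig, target_error=1))
```

In this toy setting both routines agree on the token count, and the structured signal needs far fewer tokens than the noisy one, which mirrors the intuition that the token count acts as a proxy for description length.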

Takeaways, Limitations

Takeaways:
Shows that a single-pass adaptive tokenizer can make image tokenization more efficient than existing methods.
Offers a new perspective on image understanding by measuring and analyzing image complexity through Kolmogorov complexity.
Provides insight into optimizing model performance by presenting scaling laws for factors such as encoder/decoder size and tokenization method.
Confirms consistency with human intuition by analyzing how image complexity relates to structure versus noise and to in-distribution versus out-of-distribution familiarity.
Limitations:
Because KARL only approximates Kolmogorov complexity, its estimates may differ from the true KC.
Further validation is needed to see how well KARL's performance generalizes across image datasets and tasks.
Further analysis is needed of the complexity and stability of the training process based on inverse reinforcement learning.
Information on specific experimental results and comparison models is lacking.