Hita is a novel image tokenizer designed to improve autoregressive image generation. Existing tokenizers map local image patches to tokens and therefore capture little global information. To address this limitation, Hita introduces a global-local tokenization scheme that combines learnable global queries with local patch tokens. Two design choices align the tokenizer with the autoregressive generation process: a sequential structure that places the global tokens first, followed by the local tokens in order, and a lightweight fusion module that prioritizes the global tokens before the dequantized tokens are passed to the decoder. Hita achieves an FID of 2.59 and an IS of 281.9 on the ImageNet benchmark, outperforming existing tokenizer-based models, and is also effective for zero-shot style transfer and image inpainting.
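The global-first ordering described above can be illustrated with a minimal sketch. It is not Hita's implementation: the function names `build_sequence` and `causal_mask`, the token ids, and the token counts are all hypothetical, and the lower-triangular mask is simply the standard causal mask of an autoregressive model, assumed here for illustration. The sketch shows only why placing global tokens at the front makes them visible to every later local token.

```python
from typing import List


def build_sequence(global_ids: List[int], local_ids: List[int]) -> List[int]:
    """Place global token ids first, then local patch token ids in order."""
    return list(global_ids) + list(local_ids)


def causal_mask(length: int) -> List[List[bool]]:
    """Standard lower-triangular mask: position i may attend to positions 0..i."""
    return [[j <= i for j in range(length)] for i in range(length)]


# Hypothetical ids and sizes, chosen only for illustration.
global_ids = [901, 902]        # 2 global tokens
local_ids = [11, 12, 13, 14]   # 4 local patch tokens, e.g. a 2x2 grid in raster order

sequence = build_sequence(global_ids, local_ids)
mask = causal_mask(len(sequence))

# Because the global tokens occupy the first slots, every local position
# can attend to all of them under the causal mask.
num_global = len(global_ids)
local_sees_all_global = all(
    all(mask[i][j] for j in range(num_global))
    for i in range(num_global, len(sequence))
)
```

In this toy setup, the first global token cannot see anything after it, while each local token is conditioned on the full set of global tokens plus the local tokens that precede it.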