Summary of costs for each LLM
Haebom
In English, token counts map to text roughly as follows. For Korean, multiply the token count by 3 to 4, because the current token-based splitting is not very efficient for Korean text.
Tokens: Approximate amount of English text
1: About 4 to 10 characters
30: About 1 to 2 sentences
100: About 75 words
2048: About 1500 words
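As a rough illustration of the ratios above, here is a minimal sketch that estimates token counts from a word count. It is not a real tokenizer: the function names are hypothetical, and the only inputs are the rules of thumb from this post (100 tokens for about 75 English words, and 3 to 4 times more tokens for Korean).

```python
# Rule-of-thumb ratios taken from the table above (not a real tokenizer).
TOKENS_PER_WORD = 100 / 75      # 100 tokens is about 75 English words
KOREAN_MULTIPLIER = (3, 4)      # Korean needs roughly 3 to 4 times more tokens


def estimate_english_tokens(text: str) -> int:
    """Estimate the token count of English text from its word count."""
    words = len(text.split())
    return round(words * TOKENS_PER_WORD)


def estimate_korean_token_range(english_tokens: int) -> tuple[int, int]:
    """Return a (low, high) token estimate for the same content in Korean."""
    low, high = KOREAN_MULTIPLIER
    return english_tokens * low, english_tokens * high


if __name__ == "__main__":
    sample = "word " * 1500
    english = estimate_english_tokens(sample)
    print("English estimate:", english)
    print("Korean estimate range:", estimate_korean_token_range(english))
```

For anything billing-related, an actual tokenizer for the model in question will give different numbers; this is only a back-of-the-envelope check.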
Tokens can be easily calculated on the following website. The 1 million tokens used in the examples below are roughly equivalent to about 5 books.
These are the costs listed on each company's official page.
OpenAI GPT-4 Turbo: $10 to $30 per 1 million tokens.
OpenAI GPT-3.5 Turbo: $1 to $2 per 1 million tokens.
Anthropic Claude 2.1: $11 to $32.70 per 1 million tokens.
Anthropic Claude Instant: $1.60 to $5.50 per 1 million tokens.
Here are the figures for the Llama 2 13B and 70B models that I run myself (CUDA, on an A100).
Llama 2 13B: $0.7 to $1 per 1 million tokens.
Llama 2 70B: $1 to $2 per 1 million tokens.
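To compare these ranges for a given workload, a small sketch like the following can help. The prices are copied from the lists above; the dictionary name, the model labels, and the `cost_range` helper are hypothetical names made up for this example, and the figures will drift as vendors change their pricing.

```python
# (low, high) USD per 1 million tokens, copied from the lists above.
PRICE_PER_MILLION_USD = {
    "GPT-4 Turbo": (10.0, 30.0),
    "GPT-3.5 Turbo": (1.0, 2.0),
    "Claude 2.1": (11.0, 32.70),
    "Claude Instant": (1.60, 5.50),
    "Llama 2 13B (self-hosted)": (0.7, 1.0),
    "Llama 2 70B (self-hosted)": (1.0, 2.0),
}


def cost_range(model: str, tokens: int) -> tuple[float, float]:
    """Return the (low, high) cost in USD for processing `tokens` tokens."""
    low, high = PRICE_PER_MILLION_USD[model]
    scale = tokens / 1_000_000
    return low * scale, high * scale


if __name__ == "__main__":
    monthly_tokens = 50_000_000  # assumed workload, for illustration only
    for model in PRICE_PER_MILLION_USD:
        low, high = cost_range(model, monthly_tokens)
        print(f"{model}: ${low:,.2f} to ${high:,.2f}")
```

The self-hosted figures exclude the initial hardware cost, so they are only comparable to the API prices once that cost is accounted for separately.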
I haven't yet done a detailed cost estimate for running on CPU on a Mac Studio. Still, Llama 2 is definitely more economical. GPT-4 is overwhelmingly better at reasoning, but Llama 2 seems sufficient for general text generation and intent classification. I'm thinking of experimenting with Mistral for the 7B size, but I'm not really interested.
I'm a bit curious about the domestic (Korean) models such as CLOVA, Mi:dm, and EXAONE. Usually, when we build our own cloud, we see a cost reduction of more than 30%, excluding the initial cost. If you work on one of these models, feel free to contact me; I think it would be good to talk frankly. haebom@kakao.com
    Haebom
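    I've heard the H100 can run this about 3 times cheaper and with high efficiency, but I couldn't get an allocation, so I haven't been able to try it...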