
???: The boss keeps telling me to try something—anything—with AI...

Haebom
Recently, foundation LLMs have been delivering impressive results across a variety of tasks. This has a lot of industries eyeing AI adoption with anticipation, but I personally think it's worth pausing to reconsider right now. At the moment, it's hard to get a good ROI, and with new technologies and models constantly emerging, this phase can easily turn into a game of chicken. If you jump into AI without careful thought, you might see a spike in user numbers, but losses from churn and weak revenue could outweigh the gains. (Personally, I think we'll see more cost-effective alternatives within the next 1–2 years.)
Still, if you believe you need to implement AI, I suggest starting with a small, lightweight model; there's no need to bring out a sledgehammer for a simple job. That said, there are situations where a model tailored to a particular task is genuinely worth building, despite my reservations about jumping in at this point. I see three main ones.

When handling personal data

If your private data, such as legal, medical, or business records, isn't available on the web, a task-specific model trained on that data can make sense, provided you also have a substantial amount of it. Otherwise, a model with a longer context window or a simple retrieval setup can handle these situations just fine.
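The "simple retrieval" alternative mentioned above can be sketched in a few lines: instead of fine-tuning a model on private documents, you find the most relevant document and paste it into the prompt of a general model. The documents and the bag-of-words scoring below are illustrative assumptions, not a production pipeline (real systems typically use embedding models and a vector store).

```python
# Minimal retrieval sketch: score each private document against the query
# with bag-of-words cosine similarity, then build a prompt from the best hit.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str]) -> str:
    q = Counter(query.lower().split())
    return max(docs, key=lambda d: cosine(q, Counter(d.lower().split())))

# Hypothetical in-house documents that never appeared on the public web.
docs = [
    "Contract 17 renewal terms: 12-month term, 30-day notice period.",
    "Patient intake workflow: triage, consent form, insurance check.",
]

context = retrieve("What is the notice period for contract 17?", docs)
prompt = f"Answer using only this context:\n{context}\n\nQ: What is the notice period?"
print(context)
```

The point is that none of this requires training anything: the private data stays in a store you control, and the general model only ever sees the retrieved snippet at inference time.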

When working on domain-specific tasks

Models for specific tasks are also valuable when you need to get work done quickly and affordably within a certain domain. For example, Codex only needs to be good at coding and can be deployed at scale, quickly and cheaply. Likewise, instead of relying on a massive, unwieldy LLM for every domain challenge, it’s often better to develop a small, specialized model for the task. (Of course, this approach has obvious limitations outside its target domain.)
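The "quickly and cheaply" argument is easy to make concrete with back-of-envelope arithmetic. All prices and workload numbers below are illustrative assumptions (USD per million tokens), not real vendor quotes.

```python
# Rough monthly serving-cost comparison: a large general LLM vs. a small
# task-specific model, under assumed per-token prices and a fixed workload.
PRICE_PER_M_TOKENS = {
    "large_general_llm": 10.00,  # assumed frontier-model price
    "small_task_model": 0.50,    # assumed fine-tuned small-model price
}

def monthly_cost(model: str, requests_per_day: int, tokens_per_request: int) -> float:
    tokens = requests_per_day * 30 * tokens_per_request  # 30-day month
    return tokens / 1_000_000 * PRICE_PER_M_TOKENS[model]

# A hypothetical coding-assistant workload: 50k requests/day, 2k tokens each.
big = monthly_cost("large_general_llm", 50_000, 2_000)
small = monthly_cost("small_task_model", 50_000, 2_000)
print(f"general: ${big:,.0f}/mo  specialized: ${small:,.0f}/mo  ratio: {big/small:.0f}x")
```

Under these assumptions the specialized model is 20x cheaper at the same volume, which is the whole appeal: at high request rates inside one domain, the savings compound every month, while the limitation outside that domain may simply never matter.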

When chasing the highest possible performance

Task-specific models also make sense when you want peak performance on a specific task and can afford to keep retraining as each new generation of models arrives. Even if your fine-tuned model outperforms GPT-4 on its task today, that's usually a short-term win: each GPT-n+1 release is bigger and smarter, and there's a real risk your specialized model falls behind. (At the end of the day, what matters is not the budget but the method and speed of learning.)

Conclusion

Models specialized for particular tasks are needed when you’re handling personal data, working on highly specific domains, or going after top-tier performance. But considering how quickly general models are advancing, focusing too much effort on specialized tasks can be risky.
