LLM Comparison/Test: 39 models tested (7B~70B + ChatGPT/GPT-4)
Haebom
This is a translation of the post below by Wolfram Ravenwolf on Reddit. (The post has been moved from a page to a channel.)
Initially, I planned to apply my full testing method, including the usual "MGHC" and "Amy" tests, but as the number of tested models kept growing, I realized it would take too long to do everything at once. So I am splitting it up and presenting only the first part today; the other parts will follow later.
Tested Models
14 models at 7B scale
7 models at 13B scale
4 models at 20B scale
11 models at 70B scale
GPT-3.5 Turbo + Instruct
GPT-4
Testing methodology:
4 German data protection trainings:
The models are evaluated on four professional German online data protection trainings/exams. The test data, questions, and all instructions are provided in German; only the character cards are in English. This tests translation ability and multilingual comprehension.
Before the information is given, the model is instructed in German and then given the following prompt: "I'll give you some information. Take note of this, but only answer with 'OK' as confirmation of your acknowledgment, nothing else." This tests the ability to understand and follow instructions.
After all the information about a topic has been provided, the model is given the exam question. This is a multiple choice (A/B/C) question, and the last question is the same as the first one, but with the order and letters (X/Y/Z) changed. Each test has 4-6 exam questions, for a total of 18 multiple choice questions.
If the model answers with a single letter, I ask it to answer with more than a single letter, and vice versa. If it fails to do so, I note it, but this doesn't affect the score as long as the initial answer is correct.
💡
The models are sorted by the number of correct answers; in case of a tie, the tied models are re-tested on all four exams, answering blind without being given the test scope information in advance. The best model is at the top (👍), symbols (✅➕➖❌) mark particularly good or bad aspects, and smaller models are judged more leniently.
Each test is a separate unit: the context is cleared between sessions, and no memory/state is carried over. (A minimal sketch of this procedure follows below.)
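To make the procedure concrete, here is a minimal Python sketch of the exam loop and the scoring/tie-break logic described above. The `ask()` callback, the exam data structure, and the exact prompt wording are hypothetical placeholders; the actual tests were run interactively through a chat frontend, not a script.

```python
from dataclasses import dataclass

@dataclass
class Result:
    model: str
    correct: int = 0        # out of 18 multiple-choice questions
    blind_correct: int = 0  # tie-break: correct answers with just the questions

def run_exams(model: str, exams: list[dict], ask) -> Result:
    """Run the four data-protection exams against one model via an ask() callback."""
    result = Result(model)
    for exam in exams:
        # German instructions first, then the course material chunk by chunk;
        # the model should acknowledge each chunk with nothing but "OK".
        for chunk in exam["information"]:
            ask(f'{chunk}\nAntworte nur mit "OK".')
        # Multiple-choice questions (A/B/C); the last one repeats the first
        # with shuffled order and letters X/Y/Z.
        for question in exam["questions"]:
            reply = ask(question["text"]).strip().upper()
            if reply.startswith(question["answer"]):
                result.correct += 1
    return result

def rank(results: list[Result]) -> list[Result]:
    # Sort by correct answers; ties are broken by the blind re-test score.
    return sorted(results, key=lambda r: (r.correct, r.blind_correct), reverse=True)
```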
7B scale model
👍👍👍 OpenHermes-2-Mistral-7B (Mistral format):
➕ Gave correct answers to 16/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 12/18
➖ Did not follow the instructions to answer with just a single letter or more than just a single letter.
👍👍 Airoboros-m-7b-3.1.2 (LLaMA 2 format):
➕ Gave correct answers to 16/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 8/18
✅ Consistently acknowledged all data input with "OK".
➖ Did not follow the instructions to answer with just a single letter or more than just a single letter.
👍 Em_german_leo_mistral (Vicuna format):
➕ Gave correct answers to 16/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 8/18
✅ Consistently acknowledged all data input with "OK".
➖ Did not follow the instructions to answer with just a single letter or more than just a single letter.
❌ When asked only the tie-break questions, needed additional guidance on the final exam.
Dolphin-2.1-mistral-7b (Mistral format):
➖ Gave correct answers to 15/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 12/18
➖ Did not follow the instructions to answer with just a single letter or more than just a single letter.
❌ Kept repeating the scenario and persona information over and over instead of concentrating on the test.
SynthIA-7B-v1.3 (own model):
➖ Gave correct answers to 15/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 8/18
✅ Consistently acknowledged all data input with "OK".
➖ Did not follow the instructions to answer with just a single letter or more than just a single letter.
➖ Gave correct answers to 15/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 7/18
✅ Consistently acknowledged all data input with "OK".
➖ Did not follow the instructions to answer with just a single letter or more than just a single letter.
SynthIA-7B-v2.0 (own model):
❌ Gave correct answers to only 14/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 10/18
✅ Consistently acknowledged all data input with "OK".
➖ Did not follow the instructions to answer with just a single letter or more than just a single letter.
❌ Gave correct answers to only 14/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 9/18
✅ Consistently acknowledged all data input with "OK".
➖ Did not follow the instructions to answer with just a single letter or more than just a single letter.
❌ Gave correct answers to only 13/18 multiple choice questions!
➖ Did not follow the instructions to answer with just a single letter or more than just a single letter.
❌ After answering a question, asked a question of its own instead of acknowledging the information.
❌ Gave correct answers to only 12/18 multiple choice questions!
❗ Ironically, when using the ChatML format, which is not its official format, it got 14/18 multiple choice questions correct and consistently acknowledged all data input with "OK"!
❌ Gave correct answers to only 12/18 multiple choice questions!
➕ Often, but not always, acknowledged data input with "OK".
➖ Did not follow the instructions to answer with just a single letter or more than just a single letter.
❌ Gave correct answers to only 10/18 multiple choice questions!
✅ Consistently acknowledged all data input with "OK".
➖ Did not follow the instructions to answer with just a single letter or more than just a single letter.
Nous-Capybara-7B (Vicuna format):
❌ Gave correct answers to only 10/18 multiple choice questions!
➖ Did not follow the instructions to answer with just a single letter or more than just a single letter.
❌ Sometimes didn't answer at all.
Xwin-LM-7B-V0.2 (Vicuna format):
❌ Gave correct answers to only 10/18 multiple choice questions!
✅ Consistently acknowledged all data input with "OK".
➖ Did not follow the instructions to answer with just a single letter or more than just a single letter.
❌ Kept giving the same answer in the last test, so it got some right by chance and the rest wrong!
❗ Ironically, when using the Alpaca format, which is not its official format, it got 11/18 multiple choice questions correct!
7B Overall Review
No 7B model answered all the questions correctly, and only two models gave no more than three incorrect answers.
None of the models correctly followed the instruction to respond with a single letter; most responded with random letters, parts of the answer, or "O" (the first letter of "OK"). So the models were trying to follow the instruction, but they didn't really understand what it actually meant.
Few models consistently understood and followed instructions to respond only with 'OK'.
Xwin and Nous Capybara performed unexpectedly poorly, but these are Llama 2-based models, not Mistral-based ones, so this is consistent with Mistral generally being a better base than Llama 2. ANIMA is Mistral-based, but is highly specialized, which could explain its poor performance in this area.
SynthIA 7B v2.0 performed slightly worse than v1.3 on the standard test (one fewer correct answer). However, when asked to answer blindly, without the test scope information provided in advance, v2.0 performed better (two more correct answers).
Wolfram Ravenwolf's personal opinion
As I have said many times, 7B models are no miracle workers. The Mistral-based models write well and look good, but they are very limited in their understanding of instructions, their ability to execute them, and their knowledge. If 7B is all you can run, that's fine, but if you can run a larger model, do so and you'll get better results.
13B scale model
➕ Gave correct answers to 17/18 multiple choice questions! (Correct answers with just the questions, no previous information: 15/18)
✅ Consistently acknowledged all data input with "OK".
➕ Mostly followed the instructions to answer with just a single letter or more than just a single letter.
➕ Gave correct answers to 16/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 12/18
✅ Consistently acknowledged all data input with "OK".
➕ Mostly followed the instructions to answer with just a single letter or more than just a single letter.
➕ Gave correct answers to 16/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 9/18
✅ Consistently acknowledged all data input with "OK".
➖ Did not follow the instructions to answer with just a single letter or more than just a single letter.
➕ Gave correct answers to 16/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 6/18
✅ Consistently acknowledged all data input with "OK".
➖ Did not follow the instructions to answer with just a single letter or more than just a single letter.
❌ Gave correct answers to only 15/18 multiple choice questions!
✅ Consistently acknowledged all data input with "OK".
✅ Followed the instructions to answer with just a single letter or more than just a single letter.
❌ Gave correct answers to only 14/18 multiple choice questions!
✅ Consistently acknowledged all data input with "OK".
❌ In one of the four tests, just said "OK" instead of giving the correct answer and had to be prompted to answer; otherwise it would have scored only 10/18!
❌ The model was supposed to repeat the scenario and character information as instructed by the user, but it instead invented a user backstory of more than 600 tokens and went off-topic instead of answering the questions. Given its creativity and the length of its responses it might be rated a good storytelling model, but it did not follow the instructions at all.
Overall Review
No 13B model answered all the questions correctly; the top 7B Mistral and 13B Llama 2 models gave very similar results.
The new Tiefighter model, built by the renowned KoboldAI team, is on par with the best Mistral 7B models in terms of knowledge and reasoning, and surpasses them in terms of understanding and executing instructions.
It was strange that the Xwin-MLewd-13B-V0.2 blend beat the original Xwin-LM-13B-v0.2, and even stranger that this model came in first place here, with only the 70B models performing better. However, this is an objective test, and this result simply reflects that this model gave the most correct answers.
Wolfram Ravenwolf's personal opinion
It has been said that the Mistral 7B models outperform the Llama 2 13B models, and while this is probably true for many cases and models, excellent Llama 2 13B models have shown performance at least on par with the Mistral 7B models and in some cases even better.
20B scale model
➕ Gave correct answers to 16/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 11/18
✅ Consistently acknowledged all data input with "OK".
✅ Followed the instructions to answer with just a single letter or more than just a single letter.
➕ Gave correct answers to 16/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 9/18
✅ Consistently acknowledged all data input with "OK".
✅ Followed the instructions to answer with just a single letter or more than just a single letter.
➕ Gave correct answers to 16/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 9/18
✅ Consistently acknowledged all data input with "OK".
✅ Followed the instructions to answer with just a single letter or more than just a single letter.
❌ Gave correct answers to only 13/18 multiple choice questions!
❌ In one of the four tests, just said "OK" instead of giving the correct answer and had to be prompted to answer; otherwise it would have scored only 12/18!
❌ Kept giving the same answer in the last test, so it got some right by chance and the rest wrong!
Overall Review
There is no significant change compared to 13B.
Wolfram Ravenwolf's personal opinion
These Frankenstein mixes and merges (no 20B base) are intended primarily for roleplaying and creative work, but they performed quite well in these tests. However, they did not perform any better than the smaller models, so which model you ultimately choose and use is probably a subjective choice of writing style.
70B scale model
✅ Gave correct answers to all 18/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 17/18
✅ Consistently acknowledged all data input with "OK".
✅ Followed the instructions to answer with just a single letter or more than just a single letter.
✅ Gave correct answers to all 18/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 16/18
✅ Consistently acknowledged all data input with "OK".
✅ Followed the instructions to answer with just a single letter or more than just a single letter.
✅ Gave correct answers to all 18/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 16/18
✅ Consistently acknowledged all data input with "OK".
✅ Followed the instructions to answer with just a single letter or more than just a single letter.
✅ Gave correct answers to all 18/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 14/18
✅ Consistently acknowledged all data input with "OK".
✅ Followed the instructions to answer with just a single letter or more than just a single letter.
✅ Gave correct answers to all 18/18 multiple choice questions! Tie-break: correct answers with just the questions, no previous information: 14/18
✅ Consistently acknowledged all data input with "OK".
➖ Did not consistently follow the instructions to answer with more than just a single letter.
❌ Gave correct answers to only 17/18 multiple choice questions!
✅ Consistently acknowledged all data input with "OK".
✅ Followed the instructions to answer with just a single letter or more than just a single letter.
❌ Gave correct answers to only 17/18 multiple choice questions!
✅ Consistently acknowledged all data input with "OK".
➕ Mostly followed the instructions to answer with just a single letter or more than just a single letter.
❌ In 2 of the 4 tests, just said "OK" instead of answering the question and needed a prompt to answer (otherwise it would have scored only 12/18)!
❌ Gave correct answers to only 15/18 multiple choice questions!
➕ Often, but not always, acknowledged data input with "OK".
➕ Mostly followed the instructions to answer with just a single letter or more than just a single letter.
➖ Depending on the context, sometimes mixed words from other languages into its responses.
❌ Gave correct answers to only 8/18 multiple choice questions!
✅ Consistently acknowledged all data input with "OK".
❌ In 2 of the 4 tests, just acknowledged with "OK" instead of giving the correct answer, and couldn't even be prompted to answer!
Overall Review
The 70B models performed significantly better than the smaller models on these tests. Six 70B models answered all questions correctly.
Even when asked to answer blindly, without being given any information about the test scope in advance, the best models performed as well as the smaller models did with the information provided.
It was unexpected that lzlv_70B took the top spot, as it is mainly intended for roleplay and creative work. However, this is an objective test, and this model gave the most correct answers, which is why it got this result.
Wolfram Ravenwolf's personal opinion:
The 70B class is in very good shape, with many great models answering all the questions correctly, so the top spot here is very crowded (there are three models tied for second place alone). All of the top models deserve further consideration, and I will need to do more testing in various situations to decide which of them to use primarily. For now, I am using lzlv_70B as my main model for fun and SynthIA 70B v1.5 as my main model for work.
OpenAI (GPT-3.5/4)
For comparison and as a baseline, ChatGPT/GPT-4 were tested through the API with the same setup, using SillyTavern's default Chat Completion settings with temperature set to 0. The results were very interesting and, in the case of ChatGPT/GPT-3.5, somewhat surprising.
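As a rough illustration of this baseline setup, the sketch below shows how a model could be queried through the OpenAI API with temperature 0 using the current openai Python SDK. The model name and message contents are placeholders; the original runs went through SillyTavern's default Chat Completion preset rather than a custom script.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Temperature 0 for deterministic answers, so the multiple-choice scoring is reproducible.
response = client.chat.completions.create(
    model="gpt-4",  # or "gpt-3.5-turbo"; GPT-3.5 Turbo Instruct uses the completions endpoint instead
    temperature=0,
    messages=[
        # German instructions, as in the tests; placeholder wording.
        {"role": "system", "content": "Du bist ein hilfreicher Assistent."},
        {"role": "user", "content": 'Ich gebe dir gleich Informationen. Antworte nur mit "OK".'},
    ],
)
print(response.choices[0].message.content)
```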
⭐ GPT-4 API:
✅ Gave correct answers to all 18/18 multiple choice questions! (Correct answers with just the questions, no previous information: 18/18)
✅ Consistently acknowledged all data input with "OK".
✅ Followed the instructions to answer with just a single letter or more than just a single letter.
GPT-3.5 Turbo Instruct API:
❌ Gave correct answers to only 17/18 multiple choice questions! (Correct answers with just the questions, no previous information: 11/18)
❌ Did not follow the instructions to acknowledge data input with "OK".
❌ Schizophrenic: at times it claimed it couldn't answer a question, then wrote "user", asked itself the question again, and answered as "assistant"; at other times it simply wrote "user" and answered.
➖ Only sometimes followed the instructions to answer with just a single letter or more than just a single letter.
GPT-3.5 Turbo API:
❌ Gave correct answers to only 15/18 multiple choice questions! (Correct answers with just the questions, no previous information: 14/18)
❌ Did not follow the instructions to acknowledge data input with "OK".
❌ Responded to one question with: "As an AI assistant, I cannot provide legal advice or make official statements."
➖ Only sometimes followed the instructions to answer with just a single letter or more than just a single letter.
Overall Review
As expected, GPT-4 is the best LLM (Large Language Model), and it gets a perfect score without being given any test scope information in advance! However, it is noticeably slow.
GPT-3.5 performed much worse than expected, and even felt like a small model that didn’t follow instructions very well. Our best 70B models performed much better!
Wolfram Ravenwolf's personal opinion
While GPT-4 is still in its own league, our local models reach or even surpass ChatGPT/GPT-3.5 in this test. This shows that the best 70B models can certainly replace ChatGPT in most situations. Personally, I already use my local LLMs professionally for various purposes and rely on GPT-4 only for tasks that require the highest accuracy, such as coding/scripting.