AI Speed Box

AI News and Information Link Collection (Mintbear's ignorant scraps, matured to become Visual AI News)

OpenAI Agent: 'Operator'

Category
AI
Gen
OpenAI
Date
2025/01/24
Summary 🍀🧸
OpenAI has announced an agent called 'Operator' that performs automated tasks.

Initially, it is available only to Pro ($200/month) users in the US, but it will soon be released worldwide. I expect to use Operator daily for both personal and business purposes.

Now imagine and prepare for tasks and routines that are handled automatically.

Convenience is convenience, but let's be careful with office work... 🍀🧸
URL
https://openai.com
URL
https://youtu.be/V8BSApvy3e8?si=QWuPwehLpEsx8D4K
URL
https://openai.com/index/introducing-operator/
Release
Limited Release (Partial Release)
Informer
OpenAI
2025.01.24 - Page continues to be updated.
Finally, the first AI agent model has arrived. AI is now moving from a simple tool to a partner that supports decisions and execution. Check out the interesting technology first, and then imagine and prepare for how your daily life and work will change. It is no longer an abstract story or a distant future.
2025.01.24 Mint Bear
How is everything going for you all?
OpenAI Operator, or AI agents in general, won't replace every uniquely human job or role in 2025, but tasks and domains that can simply be imitated will quickly begin to be replaced.
I am concerned about fields where preparing for or learning about AI is not encouraged, and about people who lack access to information. This is not on the level of kiosk illiteracy, which can be expected to resolve itself through better guidance, better interfaces, or on-site assistants. I started studying AI myself because I imagined that, if I did not, I would become AI illiterate.
I still spend a lot of time on piecemeal market research, data collection, data organization, drafting, and simple posting, all of which I plan to hand over to agents. Anyone in charge of this kind of work at an ordinary company can start the same way in their daily routine. Even that small amount of use, however, will create a considerable gap in time and efficiency.
2025.01.25 Mint Bear

[Korean subtitles] OpenAI presentation video

OpenAI Operator presentation video translated almost in real time by Jun Park

YouTube Summary (Lylis)

Key Features of Operator

Perform automated tasks: Operator can automate a variety of web-based tasks, such as making restaurant reservations, purchasing concert tickets, and shopping online.
Computer-Using Agent (CUA) model: Operator is based on the CUA model, which combines the vision capabilities of OpenAI’s GPT-4o model with enhanced reasoning capabilities gained through reinforcement learning.
Browser manipulation: It “views” web pages via screenshots and “interacts” with them via mouse and keyboard actions.
Self-correcting ability: When it runs into a problem, it can use its reasoning capabilities to correct itself.
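
The features above describe a loop: look at a screenshot, decide on an action, perform it, and recover when something goes wrong. The following is only a toy sketch of that loop under stated assumptions. `FakeBrowser`, `choose_action`, and the page names are all hypothetical stand-ins invented for illustration; they are not OpenAI's API or the actual CUA implementation, and the "screenshot" here is just a text label.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a real browser; a 'screenshot' is a page label."""
    page: str = "search"
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        return self.page

    def act(self, action: str) -> None:
        self.log.append(action)
        # Made-up page transitions; the first click deliberately fails.
        transitions = {
            ("search", "type_query"): "results",
            ("results", "click_first"): "error",
            ("error", "go_back"): "results_retry",
            ("results_retry", "click_second"): "done",
        }
        self.page = transitions.get((self.page, action), self.page)


def choose_action(observation: str) -> str:
    """Toy lookup table standing in for the model's reasoning step."""
    policy = {
        "search": "type_query",
        "results": "click_first",
        "error": "go_back",  # self-correction after a failed step
        "results_retry": "click_second",
    }
    return policy[observation]


def run_agent(browser: FakeBrowser, max_steps: int = 10) -> bool:
    """Observe, decide, act; stop when the task is done or steps run out."""
    for _ in range(max_steps):
        observation = browser.screenshot()
        if observation == "done":
            return True
        browser.act(choose_action(observation))
    return browser.screenshot() == "done"
```

Running `run_agent(FakeBrowser())` walks through the failed click, the step back, and the retry. A real agent would replace the lookup table with a vision-and-reasoning model and the text labels with actual screenshots.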

Use Cases

Restaurant reservations (e.g. OpenTable)
Online grocery shopping (e.g. Instacart)
Ticket reservations (e.g. StubHub)
Various delivery orders (e.g. DoorDash, Uber, etc.)
Book cleaning/repair services (Thumbtack, etc.)
Online shopping (Target, eBay, etc.)
Beyond these, it can be applied to any task that can be done with only a web browser, even when the site has no API.

How to use and availability

For ChatGPT Pro Subscribers: Currently available to ChatGPT Pro subscribers in the US ($200/month).
Simple to use: Users describe what they want to do and the Operator takes care of the rest.
User Intervention: Users can take control at any time if needed.

Cooperation and Privacy

OpenAI is working with several companies (DoorDash, Instacart, Priceline, StubHub, Uber, etc.) to ensure that Operator respects the norms of these businesses.
Users can delete all browsing data and log out of all websites with “one click” in privacy settings.
The launch of Operator is a significant milestone that demonstrates how AI technology is moving beyond simple model development to practical automation solutions, which are expected to drive increased productivity and efficiency across a wide range of industries.

Korea is still waiting

The United States is the first country to get the release. Currently, only Pro plan ($200/month) users in the United States can test it, and it will be released worldwide in the future.
Even when it launches in Korea, it will go first to Pro accounts that pay the $200 subscription fee. Presumably, after a little while, it will be released more widely, possibly along with a cheaper Lite version.

Restaurant Reservations: Make Restaurant Reservations on OpenTable

🍽️
Restaurant Reservations (OpenTable)
“Make a reservation for two at a restaurant called Beretta.”
Operator opens the OpenTable site in a (remote) web browser and searches for available reservation times.
It finds an appropriate time based on the region the user has set (San Francisco), and if the requested time is not available, it suggests a different one.
Just before completing the reservation, it asks for the user's 'confirmation' and only then makes the final booking.
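
As a rough illustration of the flow above (suggest another time when the requested one is unavailable, then wait for confirmation before booking), here is a minimal sketch. The function names, the "HH:MM" time format, and the `confirm` callback are assumptions made for this example and do not reflect how Operator or OpenTable actually work.

```python
from typing import Callable, List, Optional


def to_minutes(hhmm: str) -> int:
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def pick_slot(requested: str, available: List[str]) -> Optional[str]:
    """Return the requested slot if free, else the closest alternative."""
    if not available:
        return None
    if requested in available:
        return requested
    target = to_minutes(requested)
    return min(available, key=lambda slot: abs(to_minutes(slot) - target))


def book(requested: str, available: List[str],
         confirm: Callable[[str], bool]) -> str:
    """Book only after the user explicitly confirms the proposed slot."""
    slot = pick_slot(requested, available)
    if slot is None:
        return "no availability"
    if not confirm(f"Book a table for two at {slot}?"):
        return "cancelled"
    return f"booked {slot}"
```

For example, `book("19:00", ["18:30", "19:45"], confirm=lambda msg: True)` proposes 18:30 as the closest slot. The key design point is that nothing is submitted unless `confirm` returns `True`.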

Grocery shopping: Recognizes handwritten text and automatically purchases online

🛒
Online shopping (Instacart)
Upload a photo of a handwritten note (“eggs, spinach, mushrooms, chicken thighs, chili crunch”) and GPT-4o's image recognition will automatically extract the text.
Operator opens the Instacart website, searches for the listed groceries, and adds them to the shopping cart.
Users can specify a particular store (e.g. “Gus's Market”); if none is specified, Operator proceeds on its own through web search, etc.
Through “Take Control” mode, the user can step in and adjust the quantities in the shopping cart, then hand control back to Operator.

Operator multitasks step by step online: book tickets, reserve a space, book cleaning services, order pizza

🏀
Ticket Reservation (StubHub)
“Get me four tickets to the basketball game (Warriors game) in San Francisco this weekend. Under $500 per seat, just good seats.”
After connecting to StubHub, Operator displays a list of seats that meet the conditions and asks the user to confirm again before making the final purchase.
If the site requires a login, the user takes over and enters the login information manually (Operator cannot see what the user enters in the browser while the user has control), after which Operator resumes and assists with the purchase.
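
The ticket request above boils down to filtering listings by seat count and a price cap, then ranking what remains. The sketch below shows one way that could look. The `Listing` fields and the numeric `quality` score are hypothetical; real listing data and the way Operator judges "good seats" are not described in the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Listing:
    section: str
    price_per_seat: float
    seats_available: int
    quality: int  # hypothetical 1-10 score; higher means better seats


def shortlist(listings: List[Listing], seats_needed: int,
              max_price: float, top_n: int = 3) -> List[Listing]:
    """Keep listings with enough seats priced under the cap, best seats first."""
    eligible = [
        item for item in listings
        if item.seats_available >= seats_needed
        and item.price_per_seat < max_price  # "under $500" is a strict limit
    ]
    # Rank by quality (descending), then by price (ascending) to break ties.
    return sorted(eligible, key=lambda item: (-item.quality, item.price_per_seat))[:top_n]
```

The shortlist would then be shown to the user for confirmation rather than purchased automatically.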
🎾
Tennis Court Reservation
Simple instructions like “Reserve a nearby tennis court.”
Even when a site has no API, Operator can find the site through the browser and a search engine, as an ordinary user would, and reserve the desired court.
🧹
Book House Cleaning/Service
Operator can book a range of household services such as cleaning and moving. In the demonstration, it was given a command like “Book a cleaning service.”
🍕
Order Pizza (DoorDash)
Enter specific conditions such as “Order pizza. Multiple flavors including barbecue, medium size. If the store is closed, substitute a similar store.”
Operator connects to DoorDash, selects the menu items, and then confirms with the user before checking out the shopping cart.
The case studies above use materials provided by the AI image/video community Lumios.X.

Operator Case Studies | Mintbear

2025.01 Mintbear is running [ OpenAI Workflow Study ] at GPTers.