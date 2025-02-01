OpenAI showed off its first AI Agent, Operator, last week, but it already has a scrappy competitor offering an AI tool called Browser Use that can complete tasks online for you. This Computer-Using Agent (CUA) can write, search, click buttons, and copy information from websites without you needing to touch the mouse or keyboard and without the $200-a-month ChatGPT Pro subscription.

Browser Use is actually free, at least if you're willing and able to spend some time playing with API code. I'm not very code-literate, but I naively thought I knew enough of how GitHub works to use the API version. Hours of sifting through documentation, tweaking settings, and watching examples later, I decided this would need a deeper level of coding knowledge than I have, let alone the average person browsing the web.
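For a sense of what that self-hosted route involves, here is a minimal sketch modeled on the quick-start in the project's README around the time of writing. It assumes the `browser-use` and `langchain-openai` packages are installed and an OpenAI API key is configured. The import paths and `Agent` arguments are version-dependent and may have changed in newer releases, and `run_task` is just an illustrative wrapper name.

```python
import asyncio


async def run_task(task: str):
    """Hand one natural-language task to a Browser Use agent and return its result."""
    # Imported inside the function so the sketch loads even without the packages.
    # These names follow the README quick-start; treat them as assumptions
    # that may not match the release you install.
    from browser_use import Agent
    from langchain_openai import ChatOpenAI

    agent = Agent(task=task, llm=ChatOpenAI(model="gpt-4o"))
    return await agent.run()


# Example call (opens a real browser session and uses paid API credits):
# asyncio.run(run_task("Search for 'MacBook Air M2' on Amazon and report the first price."))
```

Even in this stripped-down form, you are on the hook for installing dependencies, managing keys, and debugging whatever the agent does in the browser, which is where my own attempt stalled.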

Happily for me, Browser Use just debuted a cloud version that employs OpenAI's own GPT-4o model. It cuts out a lot of the heavy technical lifting and streamlines things into a more familiar chat format without any extra work. It has its limitations and costs $30, but after my inept API mess, it felt like a bargain. Even in this obviously unfinished form, you need to put some effort into engineering prompts and negotiating how the AI functions. The most limiting aspect is that you can issue only one prompt before having to start a new interaction. Despite the text box, you can’t respond to what the AI does or refine your request.

Buying AI

(Image credit: Screenshots from Browser Use)

With everything set up, I put Browser Use through a few real-world tests. First up was a price comparison task. I entered the prompt: "Navigate to Amazon, Best Buy, and Walmart and search for 'MacBook Air M2'. Extract the product name, price, and stock availability from the first five results on each site. Compare the prices and identify the lowest one. If discounts or coupons are present, record them. Provide a final summary with the best deal and where to buy it."

It did the job well, though it didn’t find any hidden discounts or coupons. Still, the fact that I could automate price tracking across multiple sites was pretty exciting. That said, a recurring problem for any agent like this comes when a website wants to check that you're human. Browser Use has a button that lets you take over whenever you want, and it will also alert you when a site needs a human. You can prove your humanity and then hit resume to let the AI take over again.
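The "compare the prices and identify the lowest one" step is simple enough to sketch on its own. The snippet below is not how Browser Use does it internally; it is a rough illustration of the logic, and the `Listing` fields, the `best_deal` helper, and every price in `sample` are made up for the example rather than taken from my test.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    store: str
    name: str
    price: float   # in US dollars
    in_stock: bool


def best_deal(listings):
    """Return the cheapest in-stock listing, or None if nothing is available."""
    available = [item for item in listings if item.in_stock]
    return min(available, key=lambda item: item.price, default=None)


# Hypothetical rows for illustration only.
sample = [
    Listing("Amazon", "MacBook Air M2 13-inch", 899.00, True),
    Listing("Best Buy", "MacBook Air M2 13-inch", 849.00, False),
    Listing("Walmart", "MacBook Air M2 13-inch", 879.99, True),
]

deal = best_deal(sample)
print(f"Best deal: {deal.name} at {deal.store} for ${deal.price:.2f}")
```

Filtering on stock first matters: in this made-up data the lowest sticker price is out of stock, so the cheapest option you can actually buy is a different one.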

Fly AI

(Image credit: Screenshots from Browser Use)

Next came a travel planning task with the prompt: "Search for a round-trip flight from New York to London on Dec 15, 2025 on British Air. Select the cheapest option and extract details, including price, airline, and departure time."

Browser Use delivered, pulling up a British Airways flight at $750, complete with departure time and other relevant details. This could be incredibly useful for people who book a lot of travel, especially if you automate it to check for price drops regularly.
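If you did rerun the search on a schedule, the price-drop check itself would be a small piece of arithmetic. Here is a minimal sketch, assuming you save the last fare you saw and compare it with the newest one. The `check_fare` name, the $25 threshold, and the $699 follow-up fare are all hypothetical; only the $750 figure comes from my test.

```python
def check_fare(last_seen: float, latest: float, threshold: float = 25.0) -> str:
    """Compare the newest fare with the last recorded one and describe the change."""
    change = latest - last_seen
    if change <= -threshold:
        return f"Price drop: ${-change:.2f} cheaper, now ${latest:.2f}"
    if change >= threshold:
        return f"Price rise: ${change:.2f} more, now ${latest:.2f}"
    return f"No meaningful change, still about ${latest:.2f}"


# $750 is the fare from the test above; $699 is an invented later result.
print(check_fare(750.00, 699.00))
```

The threshold keeps small day-to-day wobbles from triggering an alert every time the agent checks.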

Fair weather AI friend

(Image credit: Screenshots from Browser Use)

Finally, I tested out weather prediction and planning with the prompt: “Check the 7-day weather forecast for New York City on weather.com and summarize temperature trends, rain chances, and any severe weather warnings and then suggest how to dress for it.”

Weather is one of the most popular uses for voice assistants, so I wanted to see how the AI handled a more complex request in that vein. It did very well, not only extracting the information from the forecast but suggesting which days to wear a light coat and which days I should “insulate with a warm coat and scarf, as it will be chilly with low rain chance.”

Power trip

The key difference between Browser Use and Operator is accessibility. Browser Use is like a Swiss Army knife for developers. It has the flexibility to do almost anything within a browser, but you need to know how to use the tools. You can dig into the code, tweak it, and mold it to your exact needs. If a feature is missing, nothing’s stopping you from adding it. Because Browser Use is open-source, it also has an active developer community constantly refining it. That means if you run into issues, there are forums and GitHub discussions where you can likely find answers.

OpenAI’s Operator, on the other hand, is like hiring a butler. It does a lot for you but within certain constraints. Operator’s strength is its integration with OpenAI’s broader AI ecosystem, giving it access to proprietary models that can make more nuanced decisions. However, you’re locked into OpenAI’s pricing structure and limited customization options.

Browser Use isn’t perfect. Even its cloud version demands some patience. You need to craft your prompts carefully, brace yourself for troubleshooting, and occasionally start over. The cloud version may make up for some of this later, but for now, being unable to edit or respond within the conversation puts a hard limit on its otherwise flexible nature.

The speed can be frustrating as well. Check out a video of my second test, which is sped up to four times the pace of the actual process.

Right now, Browser Use is best suited for people who enjoy tinkering, such as developers, researchers, and automation geeks who don’t mind getting their hands dirty. If you’re willing to put in the effort, you’ll get a powerful, flexible tool that costs way less than its competition.

But if you’d rather not spend your weekend wrestling with configuration files, Operator may be the more forgiving option. Either way, web automation is ready for a boom.