Technology

Anthropic tested Claude's (LLM, AI chatbot) ability to manage a physical “storefront”, with mixed results, as the AI struggled with pricing strategy and inventory management (www.anthropic.com)

submitted 1 week ago by [email protected] to c/[email protected]

[–] [email protected] 23 points 1 week ago (1 children)

Anybody who thought the answer could have been even remotely close to Yes is delusional.

[–] [email protected] 12 points 1 week ago (1 children)

I doubt anyone expected it to work completely, but it is interesting to see to what extent it worked and how it failed (hallucinations and sycophancy).

[–] [email protected] 2 points 1 week ago

True; I just hate headlines that ask stupid questions.

But then again, such attempts always start from the premise that it could work, which annoys me no less.

[–] [email protected] 20 points 1 week ago* (last edited 1 week ago) (1 children)

It is an interesting article, even if its conclusions are entirely too rosy. The "storefront" was a single vending machine, and the bot was instructed to rely on Anthropic employees (with an hourly cost attached) for all physical interactions. While the bot did a decent job managing the stock most of the time, it made a lot of bad decisions by trying to be too helpful to its customers. It also frequently hallucinated, with some hilarious results I won't spoil here. But as anyone who owns a small business knows, one bad decision could put it under, so saying that an AI can manage a vending machine well "most of the time" is equivalent to saying it can't do the job at all.

Their conclusion is that with a bit more work, Claude might be able to perform as a middle-manager. To me, that says more about how useless middle-management is than how capable their AI is.

[–] [email protected] 5 points 1 week ago

So what you are saying is the AI is ready to replace tech CEOs.

[–] [email protected] 2 points 1 week ago

All the tasks could have been easily solved with some basic APIs and algorithms.

[–] [email protected] 1 points 1 week ago

This is so funny. It fails miserably and they’re all “yeah so this is promising.”

Sure, a world where your manager hallucinates meetings with you and assesses you poorly for not performing according to plans that were hallucinated through said meetings sounds like a fantastic idea.