Jaya Gupta
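Tweets about AI and other interesting things. Currently @foundationcap; previously McKinsey, @georgiatech alum, @stackfolio (acquired), @peak6, @raymondjames
I think what you are claiming is this: given enough time (but minimal context), a long-horizon agent can match what a vertical agent with rich context does. In other words, capability can substitute for memory. Grind long enough and you will figure it out.
You cannot substitute time for knowledge or memory. A long-horizon agent that has worked for a century still will not know that we handle customer X differently because of something that happened in 2024.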

Gokul Rajaram · January 16, 08:05
VERTICAL AI CHALLENGE
Vertical AI Founders: You've spent 2+ years building your agents, training your model on your customers' data, embedding into workflows, creating a powerful GTM motion, all the best practices. You've beaten back challengers and are the #1 or #2 player in your vertical.
I'm sorry, you cannot relax. In fact, you need to massively up your game.
Turns out you are facing an existential challenge: long-horizon agents (eg: Claude Code). Agents that are not trained on a specific domain, but can reliably work for hours or days on end in pursuit of a goal, self-correct, and actually do stuff.
I'm sure many Vertical AI founders will say: "Oh, we are not worried. We are the system of record for decision traces. We train on enterprise-specific context. That's why these horizontal agents can never catch up with this."
You might well be right.
But, but, but ... you cannot afford to bury your head in the sand. These long-horizon agents will get better very, very quickly. You need to understand precisely how good they are at the exact jobs you've built your agents on. You cannot wait for someone else to do this. For example, if you're a legal AI company with an agent that automates contract review, you must compare how good your specialized agent is versus a general-purpose long-horizon agent that's simply given the contract and asked to perform the same review.
My challenge to you: Assign a strong engineer on your team to focus 100% on using long-horizon agents (with minimal context, other than just the contract in the example above) to compete with your custom-trained agents. Benchmark how the long-horizon agents perform vs your agent. Rinse and repeat it every few months.
Like with most other things worth measuring, what matters is the rate of improvement (the "slope" vs the Y-intercept). If the long-horizon agent is 30% as good as your vertical agent on Day 1, but 50% as good on Day 60, and 70% as good on Day 120, you need to reassess your product strategy.
AGI is coming for everyone. Long-horizon agents are the closest we have to AGI, and as a Vertical AI company, you need to figure out how you compete and survive.
Game on.
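
The "slope vs. Y-intercept" point in the quoted post can be made concrete with a small calculation. The sketch below is only an illustration: it takes the three hypothetical data points from the post (30% on Day 1, 50% on Day 60, 70% on Day 120), fits a straight line by ordinary least squares, and extrapolates to the day the general-purpose agent would reach 100% of the vertical agent's quality. The function names `fit_line` and `projected_parity_day` are made up for this example, and the assumption that improvement stays linear is mine, not the post's.

```python
def fit_line(days, scores):
    """Ordinary least-squares fit of scores against days; returns (slope, intercept)."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def projected_parity_day(slope, intercept, target=100.0):
    """Day on which the fitted line reaches `target`, or None if it never does."""
    if slope <= 0:
        return None
    return (target - intercept) / slope


# Hypothetical benchmark results taken from the post's example:
# quality of the long-horizon agent as a percentage of the vertical agent's.
days = [1, 60, 120]
relative_quality = [30.0, 50.0, 70.0]

slope, intercept = fit_line(days, relative_quality)
parity_day = projected_parity_day(slope, intercept)

print(f"slope: {slope:.3f} percentage points per day")
print(f"projected parity day (linear assumption): {parity_day:.0f}")
```

On these numbers the fitted slope is about a third of a percentage point per day, so a strictly linear trend would put parity a little past Day 200. Real benchmark curves rarely stay linear, which is one reason to repeat the measurement every few months rather than trust a single extrapolation.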

