
Chinese translation for Fortune China (財富中文網) by 劉進龍; reviewed by 汪皓.
OpenAI’s GPT-5 announcement last week was meant to be a triumph—proof that the company was still the undisputed leader in AI—until it wasn’t. Over the weekend, a groundswell of pushback from customers turned the rollout into more than a PR firestorm: It became a product and trust crisis. Users lamented the loss of their favorite models, which had doubled as therapists, friends, and romantic partners. Developers complained of degraded performance. Industry critic Gary Marcus predictably called GPT-5 “overdue, overhyped, and underwhelming.”
The culprit, many argued, was hiding in plain sight: a new real-time model “router” that automatically decides which one of GPT-5’s several variants to spin up for every job. Many users assumed GPT-5 was a single model trained from scratch; in reality, it’s a network of models—some weaker and cheaper, others stronger and more expensive—stitched together. Experts say that approach could be the future of AI as large language models advance and become more resource-intensive. But in GPT-5’s debut, OpenAI demonstrated some of the inherent challenges in the approach and learned some important lessons about how user expectations are evolving in the AI era.
For all the benefits promised by model routing, many users of GPT-5 bristled at what they perceived as a lack of control. Some even suggested OpenAI might purposefully be trying to pull the wool over their eyes.
In response to the GPT-5 uproar, OpenAI moved quickly to bring back the main earlier model, GPT-4o, for pro users. It also said it fixed buggy routing, increased usage limits, and promised continual updates to regain user trust and stability.
Anand Chowdhary, cofounder of AI sales platform FirstQuadrant, summed the situation up bluntly: “When routing hits, it feels like magic. When it whiffs, it feels broken.”
The promise and inconsistency of model routing
Jiaxuan You, an assistant professor of computer science at the University of Illinois Urbana-Champaign, told Fortune his lab has studied both the promise—and the inconsistency—of model routing. In GPT-5’s case, he said, he believes (though he can’t confirm) that the model router sometimes sends parts of the same query to different models. A cheaper, faster model might give one answer while a slower, reasoning-focused model gives another, and when the system stitches those responses together, subtle contradictions slip through.
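The dispatch logic You describes can be sketched in a few lines. This is a hypothetical illustration, not OpenAI's implementation: the difficulty heuristic, the threshold, and the model names ("fast-model", "reasoning-model") are all invented for the example.

```python
# Illustrative sketch of a real-time model router (NOT OpenAI's actual
# system): a heuristic scorer sends queries that look hard to a slower
# reasoning model and everything else to a cheap, fast model.

REASONING_HINTS = ("prove", "step by step", "debug", "derive")

def estimate_difficulty(query: str) -> float:
    """Crude difficulty score in [0, 1] from length and keyword cues."""
    score = min(len(query) / 500, 0.5)            # longer prompts look harder
    if any(hint in query.lower() for hint in REASONING_HINTS):
        score += 0.5                              # reasoning cues bump the score
    return min(score, 1.0)

def route(query: str, threshold: float = 0.5) -> str:
    """Pick a model tier; the names are placeholders, not real API identifiers."""
    if estimate_difficulty(query) >= threshold:
        return "reasoning-model"
    return "fast-model"

print(route("What is the capital of France?"))                   # fast-model
print(route("Prove step by step that sqrt(2) is irrational."))   # reasoning-model
```

The inconsistency You flags follows directly from this design: if two sub-queries of one request land on opposite sides of the threshold, two different models answer them, and the stitched-together reply can contradict itself.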
The model routing idea is intuitive, he explained, but “making it really work is very nontrivial.” Perfecting a router, he added, can be as challenging as building Amazon-grade recommendation systems, which take years and many domain experts to refine. “GPT-5 is supposed to be built with maybe orders of magnitude more resources,” he explained, pointing out that even if the router picks a smaller model, it shouldn’t produce inconsistent answers.
Still, You believes routing is here to stay. “The community also believes model routing is promising,” he said, pointing to both technical and economic reasons. Technically, single-model performance appears to be hitting a plateau: You pointed to the commonly cited scaling laws, which say that models improve as data and compute grow. “But we all know that the model wouldn’t get infinitely better,” he said. “Over the past year, we have all witnessed that the capacity of a single model is actually saturating.”
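The saturation You describes falls out of the scaling laws themselves. A minimal sketch, assuming a Chinchilla-style power-law loss; the constants below are made up for illustration, not fitted values:

```python
# Under a power-law scaling curve L(N) = E + A / N**alpha, each 10x
# increase in parameter count N buys a smaller absolute improvement,
# and the loss approaches an irreducible floor E.  Constants are
# illustrative only (loosely shaped like published fits, not real ones).

E, A, ALPHA = 1.69, 406.4, 0.34

def loss(n_params: float) -> float:
    """Hypothetical pretraining loss as a function of model size."""
    return E + A / n_params**ALPHA

for n in (1e9, 1e10, 1e11, 1e12):
    print(f"{n:.0e} params -> loss {loss(n):.3f}")
```

Running the loop shows the diminishing returns: each tenfold jump in parameters shaves off less loss than the previous one, which is the economic argument for routing to cheaper models instead of betting everything on the next giant monolith.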
Economically, routing lets AI providers keep using older models rather than discarding them when a new one launches. Current events require frequent updates, but static facts remain accurate for years. Directing certain queries to older models avoids wasting the enormous time, compute, and money already spent on training them.
There are hard physical limits, too. GPU memory has become a bottleneck for training ever-larger models, and chip technology is approaching the maximum memory that can be packed onto a single die. In practice, You explained, physical limits mean the next model can’t be 10 times bigger.
An older idea that is now being hyped
William Falcon, founder and CEO of AI platform Lightning AI, points out that the idea of using an ensemble of models is not new: it has been around since roughly 2018, and since OpenAI’s models are a black box, we don’t know whether GPT-4 also used a model routing system.
“I think maybe they’re being more explicit about it now, potentially,” he said. Either way, the GPT-5 launch was heavily hyped, including the model routing system. The blog post introducing the model called it the “smartest, fastest, and most useful model yet, with thinking built in.” In the official ChatGPT blog post, OpenAI confirmed that GPT-5 within ChatGPT runs on a system of models coordinated by a behind-the-scenes router that switches to deeper reasoning when needed. The GPT-5 system card went further, clearly outlining multiple model variants (gpt-5-main, gpt-5-main-mini for speed, gpt-5-thinking, gpt-5-thinking-mini, plus a thinking pro version) and explaining how the unified system automatically routes between them.
In a press pre-briefing, OpenAI CEO Sam Altman touted the model router as a way to tackle what had been a hard-to-decipher list of models to choose from. Altman called the previous model picker interface a “very confusing mess.”
But Falcon said the core problem was that GPT-5 simply didn’t feel like a leap. “GPT-1 to 2 to 3 to 4—each time was a massive jump. Four to five was not noticeably better. That’s what people are upset about.”
Will multiple models add up to AGI?
The debate over model routing led some to call out the ongoing hype over the possibility of artificial general intelligence, or AGI, being developed soon. OpenAI officially defines AGI as “highly autonomous systems that outperform humans at most economically valuable work,” but Altman notably said last week that it is “not a super useful term.”
“What about the promised AGI?” wrote Aiden Chaoyang He, an AI researcher and cofounder of TensorOpera, on X, criticizing the GPT-5 rollout. “Even a powerful company like OpenAI lacks the ability to train a super-large model, forcing them to resort to the Real-time Model Router.”
Robert Nishihara, co-founder of AI production platform Anyscale, says scaling is still progressing in AI, but the idea of one all-powerful AI model remains elusive. “It’s hard to build one model that is the best at everything,” he said. That’s why GPT-5 currently runs on a network of models linked by a router, not a single monolith.
OpenAI has said it hopes to unify these into one model in the future, but Nishihara points out that hybrid systems have real advantages: You can upgrade one piece at a time without disrupting the rest, and you get most of the benefits without the cost and complexity of retraining an entire giant model. As a result, Nishihara thinks routing will stick around.
Aiden Chaoyang He agrees. In theory, scaling laws still hold—more data and compute make models better—but in practice, he believes development will “spiral” between two approaches: routing specialized models together, then trying to consolidate them into one. The deciding factors will be engineering costs, compute and energy limits, and business pressures.
The hyped-up AGI narrative may need to adjust, too. “If anyone does anything that’s close to AGI, I don’t know if it’ll literally be one set of weights doing it,” Falcon said, referring to the “brains” behind LLMs. “If it’s a collection of models that feels like AGI, that’s fine. No one’s a purist here.”