The Economist, 2023: ChatGPT is a multilingual marvel

 

    Culture

    文艺版块

    Johnson

    约翰逊专栏

    Speaking in many tongues

    讲多国语言

    ChatGPT may make things up, but it does so fluently in more than 50 languages.

    ChatGPT可能会编假话,但它能用50多种语言流利地编假话。

    The hype that followed ChatGPT's public launch last year was, even by the standards of tech innovations, extreme.

    ChatGPT自去年公开发布后所引发的炒作,即使以科技创新的标准来看也是极端的。

    OpenAI's natural-language system creates recipes, writes computer code and parodies literary styles.

    OpenAI的这一自然语言系统能创造食谱,编写计算机代码,模仿各种文学风格。

    Its latest iteration can even describe photographs.

    其最新版本甚至可以描述照片。

    It has been hailed as a technological breakthrough on a par with the printing press.

    ChatGPT被誉为与印刷机相媲美的技术突破。

    But it has not taken long for huge flaws to emerge, too.

    但没过多久,巨大的缺陷也显现出来。

    It sometimes "hallucinates" non-facts that it pronounces with perfect confidence, insisting on those falsehoods when queried.

    它有时会"幻想"出并非事实的东西,并自信满满地把这些东西讲出来,就算被质疑也坚持这些谎言。

    It also fails basic logic tests.

    它也未能通过基本的逻辑测试。

    In other words, ChatGPT is not a general artificial intelligence, an independent thinking machine.

    换句话说,ChatGPT不是通用人工智能,不是一台能独立思考的机器。

    It is, in the jargon, a large language model.

    用行话来说,它是一个大型语言模型。

    That means it is very good at predicting what kinds of words tend to follow which others, after being trained on a huge body of text -- its developer, OpenAI, does not say exactly from where -- and spotting patterns.

    这意味着,在用大量文本进行训练后,它非常擅长预测哪些单词之后往往接着哪些其他单词并找出其中的规律,其开发者OpenAI没有具体说明这些文本的来源。
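
    To make "predicting what kinds of words tend to follow which others" concrete, here is a toy sketch in Python. It only counts adjacent word pairs in a tiny made-up corpus; it is not how ChatGPT is built (large language models use neural networks over far longer contexts), and the names `train_bigrams`, `predict_next` and `corpus` are hypothetical, chosen for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Tiny invented corpus (an assumption for this example, not real training data).
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."
model = train_bigrams(corpus)

print(predict_next(model, "sat"))    # the word seen most often after "sat"
print(predict_next(model, "zebra"))  # a word never seen in the corpus
```

    The sketch also hints at why training data matters: a word that never appears in the corpus, such as "zebra" here, gives the model nothing to predict from.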

    Amid the hype, it is easy to forget a minor miracle.

    在炒作中,很容易忘记一个小小的奇迹。

    ChatGPT has aced a problem that long served as a far-off dream for engineers: generating human-like language.

    ChatGPT成功解决了一个长期以来一直被工程师们视为遥远梦想的问题:生成类似人类的语言。

    Unlike earlier versions of the system, it can go on doing so for paragraphs on end without descending into incoherence.

    与早期版本不同,ChatGPT可以长篇大段地一直说下去,而不会出现语句不通的情况。

    And this achievement's dimensions are even greater than they seem at first glance.

    这一成就的影响范围甚至比它在初看之时所表现的更大。

    ChatGPT is not only able to generate remarkably realistic English.

    ChatGPT不仅能生成非常逼真的英语。

    It is also able to instantly blurt out text in more than 50 languages -- the precise number is apparently unknown to the system itself.

    还能立即脱口而出50多种语言 -- 系统自己显然也不知道确切数字是多少。

    Asked (in Spanish) how many languages it can speak, ChatGPT replies, vaguely, "more than 50", explaining that its ability to produce text will depend on how much training data is available for any given language.

    当被问及(用西班牙语)它会说几种语言时,ChatGPT含糊地回答说"超过50种",并解释说,它以某种语言生成文本的能力取决于这一语言的训练数据有多少。

    Then, asked a question in an unannounced switch to Portuguese, it offers up a sketch of your columnist's biography in that language.

    然后,在没有通知的情况下转而用葡萄牙语提问时,它又用葡萄牙语提供了您的专栏作家的生平简介。

    Most of it was correct, but it had him studying the wrong subject at the wrong university.

    大部分内容是正确的,但他就读的大学和专业搞错了。

    The language itself was impeccable.

    而语言本身无可挑剔。

    Portuguese is one of the world's biggest languages.

    葡萄牙语是世界上最大的语种之一。

    Trying out a smaller language, your columnist probed ChatGPT in Danish, spoken by only about 5.5m people.

    为了试一个更小的语种,您的专栏作家又用丹麦语对ChatGPT进行了追问,大约只有550万人说丹麦语。

    Danes do much of their online writing in English, so the training data for Danish must be orders of magnitude scarcer than what is available for English, Spanish or Portuguese.

    丹麦人在网上写东西大部分都是用英语,所以丹麦语的训练数据肯定比英语、西班牙语或葡萄牙语能提供的训练数据要少几个数量级。

    ChatGPT's answers were factually askew but expressed in almost perfect Danish.

    ChatGPT的回答歪曲了事实,但其丹麦语几近完美。

    (A tiny gender-agreement error was the only mistake caught in any of the languages tested.)

    (在所有测试的语言中,只发现了一个微小的性别一致性错误。)

    Indeed, ChatGPT is too modest about its own abilities.

    的确,ChatGPT对自己的能力过于谦虚。

    On request, it furnishes a list of 51 languages it can work in, including Esperanto, Kannada and Zulu.

    它应要求提供了它可以使用的51种语言的清单,其中包括世界语、卡纳达语和祖鲁语。

    It declines to say that it can "speak" these languages, but rather "generates text" in them.

    它拒绝说自己会"说"这些语言,而是说能用这些语言"生成文本"。

    This is too humble an answer.

    这个回答真是过谦了。

    Addressed in Catalan -- a language not on the list -- it replies in that language with a cheerful "Yes, I do speak Catalan -- what can I help you with?"

    在用加泰罗尼亚语(这种语言不在清单上)和它说话时,它用这种语言愉快地回答道:"是的,我会说加泰罗尼亚语,有什么可以帮你的吗?"

    A few follow-up questions do not trip it up in the slightest, including a query about whether it is merely translating answers first generated in another language into Catalan.

    一些后续的提问也丝毫没能让它出差错,包括询问它是否只是先用另一种语言生成答案,然后再翻译成加泰罗尼亚语。

    This, ChatGPT denies: "I don't translate from any other language; I look in my database for the best words and phrases to answer your questions."

    ChatGPT否认了这一点:"我不翻译任何其他语言,我在我的数据库中寻找最佳词句来回答您的问题。"

    Who knows if this is true?

    谁知道这是不是真的?

    ChatGPT not only makes things up, but incorrectly answers questions about the very conversation it is having.

    ChatGPT不仅编造故事,而且错误地回答了有关正在进行的对话的问题。

    (It has no "memory", but rather feeds the last few thousand words of each conversation back into itself as a new prompt.

    (它没有"记忆",而是将每次对话的最后几千个单词反馈给自己,作为新的提示符。

    If you have been speaking English for a while it will "forget" that you asked a question in Danish earlier and say that the question was asked in English.)

    如果你说了一段时间的英语,它就会"忘记"你之前用丹麦语问了一个问题,并说那个问题是用英语问的。)

    ChatGPT is untrustworthy not just about the world, but even about itself.

    ChatGPT在关于世界,甚至关于它自己的方面是不可信赖的。
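
    The parenthetical remark about "memory" above can be pictured with a minimal sketch in Python: keep only the most recent turns that fit a word budget and send them back as the next prompt. This is an illustration under stated assumptions, not OpenAI's implementation. Real systems count tokens rather than words, and the function `build_prompt`, the `max_words` parameter and the sample conversation are all hypothetical.

```python
def build_prompt(turns, max_words=3000):
    """Join the most recent turns that fit within a budget of `max_words` words."""
    kept = []
    used = 0
    # Walk backwards from the newest turn, stopping once the budget is exceeded.
    for turn in reversed(turns):
        n = len(turn.split())
        if used + n > max_words:
            break
        kept.append(turn)
        used += n
    # Restore chronological order before joining.
    return "\n".join(reversed(kept))


# Invented conversation; the budget is tiny so that truncation is visible.
conversation = [
    "User: (question asked in Danish)",
    "Bot: (answer in Danish)",
    "User: Now let us talk in English for a while.",
    "Bot: Sure, happy to continue in English.",
]
prompt = build_prompt(conversation, max_words=20)
print(prompt)
```

    With this small budget the Danish turns fall outside the window, so a model that sees only `prompt` has no record that Danish was ever used.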

    This should not overshadow the achievement of a model that can effortlessly mimic so many languages, including those with limited training data.

    但这不应该掩盖这一模型的成就,它可以毫不费力地模仿如此多的语言,包括那些训练数据有限的语言。

    Speakers of smaller languages have worried for years about language technologies passing them by.

    多年来,较小语种的使用者一直担心语言技术会与他们擦肩而过。

    Their justifiable concern had two causes: the lesser incentive for companies to develop products in Icelandic or Maltese, and the relative lack of data to train them.

    他们这一合理担忧有两个原因:公司开发冰岛语或马耳他语产品的动力较小,以及训练数据相对缺乏。

    Somehow the developers of ChatGPT seem to have overcome such problems.

    ChatGPT的开发者似乎不知如何已经克服了这些问题。

    It is too early to say what good the technology will do, but this alone gives one reason to be optimistic.

    现在说这项技术会有什么好处还为时过早,但只是这一点就给了我们一个保持乐观的理由。

    As machine-learning techniques improve, they may not require the vast resources, in programming time or data, traditionally thought necessary to make sure smaller languages are not overlooked online.

    随着机器学习技术的进步,要确保较小的语种不在网上被忽视,可能不再需要传统上认为必不可少的大量编程时间或数据资源。

  Source: http://www.tingroom.com/lesson/jjxrhj/2023jjxr/565751.html