Why does the bart-large-cnn summarization model give funny output with different length settings?

I have a piece of text of 4226 characters (316 words + special characters).

I am trying different combinations of min_length and max_length to get a summary:

      print(summarizer(INPUT, max_length = 1000, min_length=500, do_sample=False))

The full code is:

      from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

INPUT = """We see ChatGPT as an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. As ChatGPT stated, large language models can be put to work as a communication engine in a variety of applications across a number of vertical markets. Glaringly absent in its answer is the use of ChatGPT in search engines. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. The use of a large language model enables more complex and more natural searches and extract deeper meaning and better context from source material. This is ultimately expected to deliver more robust and useful results. Is AI coming for your job? Every wave of new and disruptive technology has incited fears of mass job losses due to automation, and we are already seeing those fears expressed relative to AI generally and ChatGPT specifically. The year 1896, when Henry Ford rolled out his first automobile, was probably not a good year for buggy whip makers. When IBM introduced its first mainframe, the System/360, in 1964, office workers feared replacement by mechanical brains that never made mistakes, never called in sick, and never took vacations. There are certainly historical cases of job displacement due to new technology adoption, and ChatGPT may unseat some office workers or customer service reps. However, we think AI tools broadly will end up as part of the solution in an economy that has more job openings than available workers. However, economic history shows that technology of any sort (i.e., manufacturing technology, communications technology, information technology) ultimately makes productive workers more productive and is net additive to employment and economic growth. How big is the opportunity? The broad AI hardware and services market was nearly USD 36bn in 2020, based on IDC and Bloomberg Intelligence data. We expect the market to grow by 20% CAGR to reach USD 90bn by 2025. 
Given the relatively early monetization stage of conversational AI, we estimate that the segment accounted for 10% of the broader AI’s addressable market in 2020, predominantly from enterprise and consumer subscriptions. That said, user adoption is rapidly rising. ChatGPT reached its first 1 million user milestone in a week, surpassing Instagram to become the quickest application to do so. Similarly, we see strong interest from enterprises to integrate conservational AI into their existing ecosystem. As a result, we believe conversational AI’s share in the broader AI’s addressable market can climb to 20% by 2025 (USD 18–20bn). Our estimate may prove to be conservative; they could be even higher if conversational AI improvements (in terms of computing power, machine learning, and deep learning capabilities), availability of talent, enterprise adoption, spending from governments, and incentives are stronger than expected. How to invest in AI? We see artificial intelligence as a horizontal technology that will have important use cases across a number of applications and industries. From a broader perspective, AI, along with big data and cybersecurity, forms what we call the ABCs of technology. We believe these three major foundational technologies are at inflection points and should see faster adoption over the next few years as enterprises and governments increase their focus and investments in these areas. Conservational AI is currently in its early stages of monetization and costs remain high as it is expensive to run. Instead of investing directly in such platforms, interested investors in the short term can consider semiconductor companies, and cloud-service providers that provides the infrastructure needed for generative AI to take off. In the medium to long term, companies can integrate generative AI to improve margins across industries and sectors, such as within healthcare and traditional manufacturing. 
Outside of public equities, investors can also consider opportunities in private equity (PE). We believe the tech sector is currently undergoing a new innovation cycle after 12–18 months of muted activity, which provides interesting and new opportunities that PE can capture through early-stage investments."""

print(summarizer(INPUT, max_length = 1000, min_length=500, do_sample=False))


My questions are:

Q1: What does the following warning message mean? Your max_length is set to 1000, ...

Your max_length is set to 1000, but you input_length is only 856. You might consider decreasing max_length manually, e.g. summarizer(‘…’, max_length=428)

Q2: After the above message, it outputs a summary of 2211 characters. How did it get that?

Q3: Of the above 2211 characters, the first 933 characters are valid content from the text, but then it outputs text like

For confidential support call the Samaritans on 08457 90 90 90 or visit a local Samaritans branch, see www.samaritans.org for details. For support …

Q4: How do min_length and max_length actually work (they do not seem to follow the given restrictions)?

Q5: What is the maximum input that I can actually give to this summarizer?

1 Answer

Q2: After the above message, it outputs a summary of 2211 characters. How did it get that?

A: The length that the model sees is not the number of characters, so Q2 is somewhat beside the point. It is more appropriate to check whether the model's output is shorter than the input when both are measured in subword tokens.

How we humans count the number of words is a little different from how the model counts the number of tokens, i.e.

      from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")

text = """We see ChatGPT as an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. As ChatGPT stated, large language models can be put to work as a communication engine in a variety of applications across a number of vertical markets. Glaringly absent in its answer is the use of ChatGPT in search engines. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. The use of a large language model enables more complex and more natural searches and extract deeper meaning and better context from source material. This is ultimately expected to deliver more robust and useful results. Is AI coming for your job? Every wave of new and disruptive technology has incited fears of mass job losses due to automation, and we are already seeing those fears expressed relative to AI generally and ChatGPT specifically. The year 1896, when Henry Ford rolled out his first automobile, was probably not a good year for buggy whip makers. When IBM introduced its first mainframe, the System/360, in 1964, office workers feared replacement by mechanical brains that never made mistakes, never called in sick, and never took vacations. There are certainly historical cases of job displacement due to new technology adoption, and ChatGPT may unseat some office workers or customer service reps. However, we think AI tools broadly will end up as part of the solution in an economy that has more job openings than available workers. However, economic history shows that technology of any sort (i.e., manufacturing technology, communications technology, information technology) ultimately makes productive workers more productive and is net additive to employment and economic growth. How big is the opportunity? The broad AI hardware and services market was nearly USD 36bn in 2020, based on IDC and Bloomberg Intelligence data. We expect the market to grow by 20% CAGR to reach USD 90bn by 2025. 
Given the relatively early monetization stage of conversational AI, we estimate that the segment accounted for 10% of the broader AI’s addressable market in 2020, predominantly from enterprise and consumer subscriptions. That said, user adoption is rapidly rising. ChatGPT reached its first 1 million user milestone in a week, surpassing Instagram to become the quickest application to do so. Similarly, we see strong interest from enterprises to integrate conservational AI into their existing ecosystem. As a result, we believe conversational AI’s share in the broader AI’s addressable market can climb to 20% by 2025 (USD 18–20bn). Our estimate may prove to be conservative; they could be even higher if conversational AI improvements (in terms of computing power, machine learning, and deep learning capabilities), availability of talent, enterprise adoption, spending from governments, and incentives are stronger than expected. How to invest in AI? We see artificial intelligence as a horizontal technology that will have important use cases across a number of applications and industries. From a broader perspective, AI, along with big data and cybersecurity, forms what we call the ABCs of technology. We believe these three major foundational technologies are at inflection points and should see faster adoption over the next few years as enterprises and governments increase their focus and investments in these areas. Conservational AI is currently in its early stages of monetization and costs remain high as it is expensive to run. Instead of investing directly in such platforms, interested investors in the short term can consider semiconductor companies, and cloud-service providers that provides the infrastructure needed for generative AI to take off. In the medium to long term, companies can integrate generative AI to improve margins across industries and sectors, such as within healthcare and traditional manufacturing. 
Outside of public equities, investors can also consider opportunities in private equity (PE). We believe the tech sector is currently undergoing a new innovation cycle after 12–18 months of muted activity, which provides interesting and new opportunities that PE can capture through early-stage investments."""

tokenized_text = tokenizer(text)

print(len(tokenized_text['input_ids']))

[out]:

      800

We see that the input text in your example has 800 input subword tokens, not 300 words.
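To make the difference between the three length measures concrete, here is a minimal sketch. `length_report` and `toy_tokenize` are hypothetical helpers, not part of transformers. The tokenizing function is passed in as a parameter so the sketch runs without downloading a model; the toy tokenizer is only a stand-in and does not reproduce BART's subword splitting.

```python
from typing import Callable, Dict, List


def length_report(text: str, tokenize: Callable[[str], List[int]]) -> Dict[str, int]:
    """Count the same text in characters, whitespace-separated words and tokens."""
    return {
        "characters": len(text),
        "words": len(text.split()),
        "tokens": len(tokenize(text)),
    }


def toy_tokenize(text: str) -> List[int]:
    """Stand-in tokenizer (assumption for illustration): one id per 4-character chunk."""
    return list(range(0, len(text), 4))


report = length_report("Microsoft is integrating ChatGPT into Bing.", toy_tokenize)
print(report)
```

With the real tokenizer loaded as above, the call would look like `length_report(text, lambda t: tokenizer(t)["input_ids"])`, and the "tokens" entry is the number that the length settings are compared against.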


Q1: What does the following mean? Your max_length is set to 1000 ...

The warning message is as follows:

Your max_length is set to 1000, but you input_length is only 856. You might consider decreasing max_length manually, e.g. summarizer(‘…’, max_length=428)
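The numbers in the warning hint at how it is produced. Both warnings quoted in this thread (input_length 856 with a suggestion of 428, and input_length 800 with a suggestion of 400) are consistent with the pipeline comparing max_length against the token count of the input and suggesting half of that count. A minimal sketch of that arithmetic follows; `max_length_hint` is a hypothetical helper, and the exact check inside the pipeline is an assumption inferred from those two messages.

```python
from typing import Optional


def max_length_hint(input_length: int, max_length: int) -> Optional[int]:
    """Return the suggested max_length if the warning would apply, else None.

    Assumption: the warning fires when max_length exceeds the number of
    input tokens, and the suggestion is half of the input token count.
    """
    if input_length < max_length:
        return input_length // 2
    return None


print(max_length_hint(856, 1000))  # the case from the question
print(max_length_hint(800, 900))   # the case reproduced later in this answer
```

In other words, the message only says that asking for a summary longer than the input makes little sense; it does not change what the model generates.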

Let's first try feeding the input to the model directly and look at the number of tokens it outputs (without the pipeline).

[code]:

      
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")


text = """We see ChatGPT as an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. As ChatGPT stated, large language models can be put to work as a communication engine in a variety of applications across a number of vertical markets. Glaringly absent in its answer is the use of ChatGPT in search engines. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. The use of a large language model enables more complex and more natural searches and extract deeper meaning and better context from source material. This is ultimately expected to deliver more robust and useful results. Is AI coming for your job? Every wave of new and disruptive technology has incited fears of mass job losses due to automation, and we are already seeing those fears expressed relative to AI generally and ChatGPT specifically. The year 1896, when Henry Ford rolled out his first automobile, was probably not a good year for buggy whip makers. When IBM introduced its first mainframe, the System/360, in 1964, office workers feared replacement by mechanical brains that never made mistakes, never called in sick, and never took vacations. There are certainly historical cases of job displacement due to new technology adoption, and ChatGPT may unseat some office workers or customer service reps. However, we think AI tools broadly will end up as part of the solution in an economy that has more job openings than available workers. However, economic history shows that technology of any sort (i.e., manufacturing technology, communications technology, information technology) ultimately makes productive workers more productive and is net additive to employment and economic growth. How big is the opportunity? The broad AI hardware and services market was nearly USD 36bn in 2020, based on IDC and Bloomberg Intelligence data. We expect the market to grow by 20% CAGR to reach USD 90bn by 2025. 
Given the relatively early monetization stage of conversational AI, we estimate that the segment accounted for 10% of the broader AI’s addressable market in 2020, predominantly from enterprise and consumer subscriptions. That said, user adoption is rapidly rising. ChatGPT reached its first 1 million user milestone in a week, surpassing Instagram to become the quickest application to do so. Similarly, we see strong interest from enterprises to integrate conservational AI into their existing ecosystem. As a result, we believe conversational AI’s share in the broader AI’s addressable market can climb to 20% by 2025 (USD 18–20bn). Our estimate may prove to be conservative; they could be even higher if conversational AI improvements (in terms of computing power, machine learning, and deep learning capabilities), availability of talent, enterprise adoption, spending from governments, and incentives are stronger than expected. How to invest in AI? We see artificial intelligence as a horizontal technology that will have important use cases across a number of applications and industries. From a broader perspective, AI, along with big data and cybersecurity, forms what we call the ABCs of technology. We believe these three major foundational technologies are at inflection points and should see faster adoption over the next few years as enterprises and governments increase their focus and investments in these areas. Conservational AI is currently in its early stages of monetization and costs remain high as it is expensive to run. Instead of investing directly in such platforms, interested investors in the short term can consider semiconductor companies, and cloud-service providers that provides the infrastructure needed for generative AI to take off. In the medium to long term, companies can integrate generative AI to improve margins across industries and sectors, such as within healthcare and traditional manufacturing. 
Outside of public equities, investors can also consider opportunities in private equity (PE). We believe the tech sector is currently undergoing a new innovation cycle after 12–18 months of muted activity, which provides interesting and new opportunities that PE can capture through early-stage investments."""

tokenized_text = tokenizer(text, return_tensors="pt")

outputs = model.generate(tokenized_text['input_ids'])

tokenizer.decode(outputs[0], skip_special_tokens=True)

[stderr]:

      /usr/local/lib/python3.9/dist-packages/transformers/generation/utils.py:1288: 

UserWarning: Using `max_length`'s default (142) to control the generation length. This behaviour is deprecated and will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.

[stdout]:

ChatGPT is an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. The broad AI hardware and services market was nearly USD 36bn in 2020, based on IDC and Bloomberg Intelligence data.

Checking the number of output tokens:

      print(outputs.shape)

print(len(tokenizer.decode(outputs[0], skip_special_tokens=True)))

[out]:

      torch.Size([1, 73])
343

So the model summarizes the 800 input subword tokens and outputs 73 subwords, which make up 343 characters.

I'm not sure how you got more than 2000 characters, so let's try with the pipeline.

[code]:

      from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

text = """We see ChatGPT as an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. As ChatGPT stated, large language models can be put to work as a communication engine in a variety of applications across a number of vertical markets. Glaringly absent in its answer is the use of ChatGPT in search engines. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. The use of a large language model enables more complex and more natural searches and extract deeper meaning and better context from source material. This is ultimately expected to deliver more robust and useful results. Is AI coming for your job? Every wave of new and disruptive technology has incited fears of mass job losses due to automation, and we are already seeing those fears expressed relative to AI generally and ChatGPT specifically. The year 1896, when Henry Ford rolled out his first automobile, was probably not a good year for buggy whip makers. When IBM introduced its first mainframe, the System/360, in 1964, office workers feared replacement by mechanical brains that never made mistakes, never called in sick, and never took vacations. There are certainly historical cases of job displacement due to new technology adoption, and ChatGPT may unseat some office workers or customer service reps. However, we think AI tools broadly will end up as part of the solution in an economy that has more job openings than available workers. However, economic history shows that technology of any sort (i.e., manufacturing technology, communications technology, information technology) ultimately makes productive workers more productive and is net additive to employment and economic growth. How big is the opportunity? The broad AI hardware and services market was nearly USD 36bn in 2020, based on IDC and Bloomberg Intelligence data. We expect the market to grow by 20% CAGR to reach USD 90bn by 2025. 
Given the relatively early monetization stage of conversational AI, we estimate that the segment accounted for 10% of the broader AI’s addressable market in 2020, predominantly from enterprise and consumer subscriptions. That said, user adoption is rapidly rising. ChatGPT reached its first 1 million user milestone in a week, surpassing Instagram to become the quickest application to do so. Similarly, we see strong interest from enterprises to integrate conservational AI into their existing ecosystem. As a result, we believe conversational AI’s share in the broader AI’s addressable market can climb to 20% by 2025 (USD 18–20bn). Our estimate may prove to be conservative; they could be even higher if conversational AI improvements (in terms of computing power, machine learning, and deep learning capabilities), availability of talent, enterprise adoption, spending from governments, and incentives are stronger than expected. How to invest in AI? We see artificial intelligence as a horizontal technology that will have important use cases across a number of applications and industries. From a broader perspective, AI, along with big data and cybersecurity, forms what we call the ABCs of technology. We believe these three major foundational technologies are at inflection points and should see faster adoption over the next few years as enterprises and governments increase their focus and investments in these areas. Conservational AI is currently in its early stages of monetization and costs remain high as it is expensive to run. Instead of investing directly in such platforms, interested investors in the short term can consider semiconductor companies, and cloud-service providers that provides the infrastructure needed for generative AI to take off. In the medium to long term, companies can integrate generative AI to improve margins across industries and sectors, such as within healthcare and traditional manufacturing. 
Outside of public equities, investors can also consider opportunities in private equity (PE). We believe the tech sector is currently undergoing a new innovation cycle after 12–18 months of muted activity, which provides interesting and new opportunities that PE can capture through early-stage investments."""

output = summarizer(text)

print(output)

[out]:

      [{'summary_text': 'ChatGPT is an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. The broad AI hardware and services market was nearly USD 36bn in 2020, based on IDC and Bloomberg Intelligence data.'}]

Checking the size of the output:

      print(len(output[0]['summary_text']))

[out]:

      343

This matches what we got using the model without the pipeline: a summary of 343 characters.

Q: Does that mean I don't have to set a length limit?

Yes, more or less. You don't need to do anything, since the summary is already shorter than the input text.

Q: What does setting a length limit do?

We know that the default summary output gives us 73 tokens. Let's see what happens if we set the limit down to 30 tokens!

      
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")


text = """We see ChatGPT as an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. As ChatGPT stated, large language models can be put to work as a communication engine in a variety of applications across a number of vertical markets. Glaringly absent in its answer is the use of ChatGPT in search engines. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. The use of a large language model enables more complex and more natural searches and extract deeper meaning and better context from source material. This is ultimately expected to deliver more robust and useful results. Is AI coming for your job? Every wave of new and disruptive technology has incited fears of mass job losses due to automation, and we are already seeing those fears expressed relative to AI generally and ChatGPT specifically. The year 1896, when Henry Ford rolled out his first automobile, was probably not a good year for buggy whip makers. When IBM introduced its first mainframe, the System/360, in 1964, office workers feared replacement by mechanical brains that never made mistakes, never called in sick, and never took vacations. There are certainly historical cases of job displacement due to new technology adoption, and ChatGPT may unseat some office workers or customer service reps. However, we think AI tools broadly will end up as part of the solution in an economy that has more job openings than available workers. However, economic history shows that technology of any sort (i.e., manufacturing technology, communications technology, information technology) ultimately makes productive workers more productive and is net additive to employment and economic growth. How big is the opportunity? The broad AI hardware and services market was nearly USD 36bn in 2020, based on IDC and Bloomberg Intelligence data. We expect the market to grow by 20% CAGR to reach USD 90bn by 2025. 
Given the relatively early monetization stage of conversational AI, we estimate that the segment accounted for 10% of the broader AI’s addressable market in 2020, predominantly from enterprise and consumer subscriptions. That said, user adoption is rapidly rising. ChatGPT reached its first 1 million user milestone in a week, surpassing Instagram to become the quickest application to do so. Similarly, we see strong interest from enterprises to integrate conservational AI into their existing ecosystem. As a result, we believe conversational AI’s share in the broader AI’s addressable market can climb to 20% by 2025 (USD 18–20bn). Our estimate may prove to be conservative; they could be even higher if conversational AI improvements (in terms of computing power, machine learning, and deep learning capabilities), availability of talent, enterprise adoption, spending from governments, and incentives are stronger than expected. How to invest in AI? We see artificial intelligence as a horizontal technology that will have important use cases across a number of applications and industries. From a broader perspective, AI, along with big data and cybersecurity, forms what we call the ABCs of technology. We believe these three major foundational technologies are at inflection points and should see faster adoption over the next few years as enterprises and governments increase their focus and investments in these areas. Conservational AI is currently in its early stages of monetization and costs remain high as it is expensive to run. Instead of investing directly in such platforms, interested investors in the short term can consider semiconductor companies, and cloud-service providers that provides the infrastructure needed for generative AI to take off. In the medium to long term, companies can integrate generative AI to improve margins across industries and sectors, such as within healthcare and traditional manufacturing. 
Outside of public equities, investors can also consider opportunities in private equity (PE). We believe the tech sector is currently undergoing a new innovation cycle after 12–18 months of muted activity, which provides interesting and new opportunities that PE can capture through early-stage investments."""

tokenized_text = tokenizer(text, return_tensors="pt")

outputs = model.generate(tokenized_text['input_ids'], max_new_tokens=30)

[stderr]:

      ValueError                                Traceback (most recent call last)
<ipython-input-26-665cd5fbe802> in <module>
      3 tokenized_text = tokenizer(text, return_tensors="pt")
      4 
----> 5 model.generate(tokenized_text['input_ids'], max_new_tokens=30)

1 frames
/usr/local/lib/python3.9/dist-packages/transformers/generation/utils.py in generate(self, inputs, generation_config, logits_processor, stopping_criteria, prefix_allowed_tokens_fn, synced_gpus, **kwargs)
   1304 
   1305         if generation_config.min_length is not None and generation_config.min_length > generation_config.max_length:
-> 1306             raise ValueError(
   1307                 f"Unfeasible length constraints: the minimum length ({generation_config.min_length}) is larger than"
   1308                 f" the maximum length ({generation_config.max_length})"

ValueError: Unfeasible length constraints: the minimum length (56) is larger than the maximum length (31)

A-ha, so there is a minimum length (56 tokens, according to the error message) that the model wants to output as a summary!
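The arithmetic behind the error can be written out as a small check. This is a sketch that mirrors the condition visible in the traceback, not the library's own function: `check_length_constraints` is a hypothetical name, and the one extra token added to max_new_tokens is an assumption that matches the "maximum length (31)" reported for max_new_tokens=30.

```python
def check_length_constraints(min_length: int, max_new_tokens: int, prompt_tokens: int = 1) -> int:
    """Return the effective maximum length, or raise if min_length cannot be met.

    prompt_tokens=1 is an assumption taken from the traceback above,
    where max_new_tokens=30 resulted in a maximum length of 31.
    """
    max_length = max_new_tokens + prompt_tokens
    if min_length > max_length:
        raise ValueError(
            f"Unfeasible length constraints: the minimum length ({min_length}) is larger than"
            f" the maximum length ({max_length})"
        )
    return max_length


# A limit of 60 new tokens leaves room for the minimum of 56 from the error message.
print(check_length_constraints(min_length=56, max_new_tokens=60))

try:
    check_length_constraints(min_length=56, max_new_tokens=30)
except ValueError as err:
    print(err)
```

So any limit on new tokens has to leave room for the model's configured minimum, otherwise generation is rejected before it starts.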

So let's just try setting the limit to 60:

      tokenized_text = tokenizer(text, return_tensors="pt")

outputs = model.generate(tokenized_text['input_ids'], max_new_tokens=60)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

[out]:

      ChatGPT is an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. The broad AI hardware and services market was nearly USD 36bn

We see that the summary output is now shorter than the default output of 73 tokens and fits within the max_new_tokens limit of 60 that we set.

And if we check print(len(outputs[0])), we get 61 subword tokens; the additional one on top of max_new_tokens accounts for the end-of-sentence symbol. If you print outputs, you'll see that the first token id is 2, which is the id of that end-of-sentence token.

When you specify skip_special_tokens=True, decoding removes the </s> token as well as the start-of-sentence token <s>.


Q4: How do min_length and max_length actually work (they do not seem to follow the given restrictions)?

Given the examples above, it is actually hard to say, because the model has to decide the minimum number of subword tokens it needs to produce a good summary. Keep in mind that both limits are counted in subword tokens, not in characters or words. Remember the Unfeasible length constraints: the minimum length (56) ... error?
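One way to picture the two limits is a toy greedy decoder. As far as I understand, min_length works by blocking the end-of-sequence token until enough tokens have been produced, and max_length simply cuts generation off. The sketch below illustrates that idea only: `greedy_decode`, `toy_model`, the token ids and the scores are all made up for illustration, and the real generation code (beam search and so on) is considerably more involved.

```python
from typing import Callable, Dict, List

EOS = 2      # hypothetical end-of-sequence id in the toy vocabulary
FILLER = 99  # hypothetical "padding content" token the toy model falls back to


def greedy_decode(
    next_scores: Callable[[List[int]], Dict[int, float]],
    min_length: int,
    max_length: int,
) -> List[int]:
    """Pick the best-scoring token each step, honouring both length limits."""
    tokens: List[int] = []
    while len(tokens) < max_length:          # max_length: hard stop
        scores = dict(next_scores(tokens))
        if len(tokens) < min_length:         # min_length: EOS is not allowed yet
            scores[EOS] = float("-inf")
        best = max(scores, key=scores.get)
        tokens.append(best)
        if best == EOS:
            break
    return tokens


def toy_model(tokens: List[int]) -> Dict[int, float]:
    """Toy scorer that has three tokens of real content and then wants to stop."""
    if len(tokens) >= 3:
        return {EOS: 5.0, FILLER: 1.0}
    return {EOS: 0.0, 10 + len(tokens): 3.0}


print(greedy_decode(toy_model, min_length=0, max_length=20))  # stops on its own
print(greedy_decode(toy_model, min_length=6, max_length=20))  # forced to keep going
print(greedy_decode(toy_model, min_length=0, max_length=2))   # cut off early
```

The second call is the interesting one: once the toy model has nothing left to say, a large min_length forces it to emit filler until it is allowed to stop, which is the same shape as the padded-out summaries in the question.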


Q5: What is the maximum input that I can actually give to this summarizer?

A sensible max_length (or, more appropriately, a limit on the number of newly generated tokens) will most probably be smaller than the length of your input. If there are UI limitations or compute/latency limitations, it is best to keep it low and close to what is necessary.

That is, when setting the limit, just make sure it is lower than the number of tokens in your input text and reasonable enough for your application. If you want a ballpark number, try the model without setting a limit, see whether the summary output matches how you expect the model to behave, and then adjust accordingly.

Like seasoning while cooking: "add/reduce max_new_tokens to taste".
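The advice above can be sketched as a small helper that keeps a requested limit inside a sensible range. `clamp_output_limit` is a hypothetical name, and the default floor of 56 is taken from the minimum length reported in the error message earlier in this answer; both are assumptions for illustration, not library behaviour.

```python
def clamp_output_limit(requested: int, input_tokens: int, min_length: int = 56) -> int:
    """Keep the output limit below the input length but not below the model minimum.

    requested    -- the limit the caller would like to use
    input_tokens -- number of subword tokens in the input text
    min_length   -- assumed model minimum (56 in the error message above)
    """
    upper = max(min_length, input_tokens - 1)   # never ask for more than the input
    return max(min_length, min(requested, upper))


print(clamp_output_limit(1000, input_tokens=800))  # too large, pulled below the input
print(clamp_output_limit(30, input_tokens=800))    # too small, raised to the minimum
print(clamp_output_limit(60, input_tokens=800))    # already sensible, unchanged
```

Starting from the model's unconstrained output (73 tokens in this example) and nudging the limit up or down from there is usually easier than guessing a number up front.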


Q3: Of the above 2211 characters, the first 933 characters are valid content from the text, but then it outputs text like...

When you set min_length to an arbitrarily large number, far larger than the model's default output of 73 subwords,

      print(summarizer(text, max_length=900, min_length=300, do_sample=False))

print(summarizer(text, max_length=900, min_length=500, do_sample=False))

it first warns you,

[stderr]:

      Your max_length is set to 900, but you input_length is only 800. You might consider decreasing max_length manually, e.g. summarizer('...', max_length=400)

and then it starts to hallucinate things beyond the first 300 subword tokens. Perhaps the model thinks that, past 300 subwords, nothing else in the input text is important.

The output looks something like this:

      [{'summary_text': 'ChatGPT is an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. ... They recommend semiconductor companies, cloud-service providers that provides the infrastructure needed for generative AI to take off, and private equity firms that provide the infrastructure for cloud-based services. They also suggest investors can consider opportunities in private equity (PE) to invest in AI platforms in the short-term and in the medium to long-term.'}]

[{'summary_text': "ChatGPT is an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. ... They say AI tools broadly will end up as part of the solution in an economy that has more job openings than available workers. The technology of any sort (i.e., manufacturing technology, communications technology, information technology) ultimately makes productive workers more productive and is net additive to employment and economic growth, they say. The authors believe the tech sector is currently undergoing a new innovation cycle after 12–18 months of muted activity, which provides interesting and new opportunities that PE can capture through early-stage investments. They recommend semiconductor companies, cloud-service providers that provides the infrastructure needed for generative AI to take off, and private equity firms that provide the infrastructure for cloud-based services. They also suggest investors can consider opportunities in private equity (PE) to invest in AI platforms in the short-term and in the medium to long-term, such as within healthcare and traditional manufacturing. The author's firm is based in New York and they have worked with Microsoft, Google, Facebook, and others on AI projects in the past. The firm has also worked with Google, Microsoft, Facebook and others to develop AI products and services in the U.S. and abroad. For confidential support, call the National Suicide Prevention Lifeline at 1-800-273-8255 or visit http://www.suicidepreventionlifeline.org/. For confidential. support on suicide matters call the Samaritans on 08457 90 90 90 or visit a local Samaritans branch or click here for details. 
In the UK, contact Samaritans at 08457 909090 or visit\xa0the Samaritans’\xa0online helpline at http:// www.samaritans.org\xa0or\xa0click\xa0here for details on how to get involved in the UK’s national suicide prevention Lifeline (in the UK or the UK). For confidential help in the United States, call\xa0the National suicide Prevention Line at\xa0800\xa0273\xa08255."}]

Q: Why did the model start to hallucinate after 300 subwords?

Good question, and also an active research area, see https://aclanthology.org/2022.naacl-main.387/ , and there is a lot more work in this area.

[Opinion]: Personally, my hunch is that most of the training examples the model learnt from, where the text has about 800 subwords, have summaries of 80 to 300 subwords, and that the training examples whose summaries have 300-500 subwords always contain the SOS hotline. So the model starts to overfit whenever min_length is >300.

To test the hunch, try another random text of 800 subwords and set min_length to 500 again. It will most probably hallucinate the SOS sentence again beyond 300 subwords.
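A minimal sketch of how such an experiment could be scored. `hotline_rate` is a hypothetical helper, and the marker strings are simply phrases taken from the hallucinated output shown above. The summarizing function is passed in as a parameter, so the sketch itself does not load a model.

```python
from typing import Callable, Iterable, Sequence

# Phrases copied from the hallucinated output above (lower-cased for matching).
HOTLINE_MARKERS = ("samaritans", "suicide prevention")


def hotline_rate(
    texts: Iterable[str],
    summarize: Callable[[str], str],
    markers: Sequence[str] = HOTLINE_MARKERS,
) -> float:
    """Fraction of inputs whose summary contains any of the hotline phrases."""
    texts = list(texts)
    if not texts:
        raise ValueError("need at least one input text")
    hits = 0
    for text in texts:
        summary = summarize(text).lower()
        if any(marker in summary for marker in markers):
            hits += 1
    return hits / len(texts)
```

With the pipeline from earlier, the summarizing function could be `lambda t: summarizer(t, max_length=900, min_length=500, do_sample=False)[0]["summary_text"]`. A rate close to 1.0 over several unrelated 800-subword texts would support the hunch, and a rate near 0.0 would count against it.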
