By Helen Coster
NEW YORK, Oct 19 (Reuters) - You may never have to read
another news story in your life, if you have artificial
intelligence that can digest all the web’s information and serve
up a summary on demand.
That’s the stuff of nightmares for media barons as Google and
others experiment with what’s called generative AI, which
creates new content drawing from past data.
Since May, Google has been rolling out a new form of search
powered by generative AI, after industry observers questioned
the tech giant's future prominence in providing consumers with
information following the rise of OpenAI's query-answering
chatbot, ChatGPT.
The product, called Search Generative Experience (SGE), uses AI
to create summaries in response to some search queries,
triggered by whether Google’s system determines the format would
be helpful. Those summaries appear on the top of the Google
search homepage, with links to “dig deeper,” according to
Google’s overview of SGE.
If publishers want to prevent their content from being used
by Google’s AI to help generate those summaries, they must use
the same tool that would also prevent them from appearing in
Google search results, rendering them virtually invisible on the
web.
Searching for “Who is Jon Fosse” – the recent Nobel Prize in
Literature winner – for instance, generates three paragraphs on
the writer and his work. Drop-down buttons provide links to
Fosse content on Wikipedia, NPR, The New York Times and other
websites; additional links appear to the right of the summary.
Google says that the AI-generated overviews are synthesized
from multiple web pages and that the links are designed to be a
jumping off point to learn more. It describes SGE as an opt-in
experiment for users, to help it evolve and improve the product,
while it incorporates feedback from news publishers and others.
To publishers, the new search tool is the latest red flag in a
decades-long relationship in which they have both struggled to
compete against Google for online advertising and relied on the
tech giant for search traffic.
The still-evolving product – now available in the United
States, India and Japan – has raised concerns among publishers
as they try to figure out their place in a world where AI could
dominate how users find and pay for information, according to
four major publishers who spoke to Reuters on the condition of
anonymity to avoid complicating ongoing negotiations with
Google.
Those concerns relate to web traffic, whether publishers
will be credited as the source of information that appears in
the SGE summaries, and the accuracy of those summaries, those
publishers say. Most significantly, publishers want to be
compensated for the content on which Google and other AI
companies train their AI tools – a major sticking point around
AI.
A Google spokesperson said in a statement: “As we bring
generative AI into Search, we’re continuing to prioritize
approaches that send valuable traffic to a wide range of
creators, including news publishers, to support a healthy, open
web.”
On compensation, Google says it is working to develop a
better understanding of the business model of generative AI
applications and get input from publishers and others.
In late September Google announced a new tool, called
Google-Extended, that gives publishers the option to block their
content from being used by Google to train its AI models.
Giving publishers the option to opt out of being crawled for
AI is a “good faith gesture,” said Danielle Coffey, president
and chief executive of the News Media Alliance, an industry
trade group that has been lobbying Congress over these issues.
“Whether payments will follow is a question mark, and to what
extent there is openness to having a healthier value exchange.”
The new tool doesn’t allow publishers to block their content
from being crawled for SGE, either the summaries or the links
that appear with them, without disappearing from traditional
Google search.
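For readers who want to see the mechanics, the sketch below shows
how such an opt-out is typically expressed in a site's robots.txt
file and checked with Python's standard-library parser. It is an
illustration under stated assumptions, not Google's or any
publisher's actual configuration: the domain example.com and the
article URL are hypothetical, and "GPTBot" is assumed here as the
robots.txt token for ChatGPT's bot, a name the publishers did not
give.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a publisher that opts out of AI training
# crawlers but stays visible in traditional search.
# "Google-Extended" is the token Google announced in September;
# "GPTBot" is assumed to be the token for ChatGPT's bot.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical article URL used only for the check below.
ARTICLE_URL = "https://example.com/news/story.html"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Report which crawler tokens may fetch the article under these rules.
for agent in ("Google-Extended", "GPTBot", "Googlebot"):
    status = "allowed" if parser.can_fetch(agent, ARTICLE_URL) else "blocked"
    print(f"{agent}: {status}")
```

Under these rules the two AI tokens are reported as blocked while
Googlebot, the search crawler, remains allowed. The sketch also
shows the limit publishers describe: there is no separate token
for SGE in this file, so keeping content out of SGE would mean
blocking Googlebot itself.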
Publishers want clicks to secure advertisers, and showing up
in Google search is key to their business. The design for SGE
has pushed the links that appear in traditional search further
down the page, with potential to reduce traffic to those links
by as much as 40%, according to an executive at one of the
publishers.
More alarming is the possibility that web surfers will avoid
clicking any of the links if the SGE passage fulfills the users’
need for information – satisfied, for example, to learn the best
time of year to go to Paris, without having to click on a travel
publication’s website.
SGE is “definitely going to decrease publishers’ organic
traffic and they’re going to have to think about a different way
to measure the value of that content, if not click through
rate,” said Forrester Research Senior Analyst Nikhil Lai. Even
so, he believes publishers’ reputations will remain strong as a
result of their links appearing in SGE.
Google says that it designed SGE to highlight web content.
“Any estimates about specific traffic impacts are speculative
and not representative, as what you see today in SGE may look
quite different from what ultimately launches more broadly in
Search,” a company spokesperson said in a statement.
While publishers and other industries have spent decades
adjusting their websites to show up prominently in traditional
Google search, they don’t have enough information to do the same
for the new SGE summaries, these publishers say.
“The new AI section is a black box for us,” said an
executive at one publisher. “We don’t know how to make sure
we’re a part of it or the algorithm behind it.”
Google said publishers do not need to do anything different
from what they have been doing to appear in search.
Publishers have long allowed Google to “crawl” their content
for the purposes of appearing in search results – using a bot,
or piece of software, to automatically scan and index it.
“Crawling” is how Google indexes the web to make content show up
in search.
Publishers’ concerns with SGE boil down to a key point: They
say that Google is crawling their content, for free, to create
summaries that users may read instead of clicking on their
links, and that Google hasn’t been clear about how they can
block content from being crawled for SGE.
Google’s new search tool, said one publisher, “is even more
threatening to us and our business than a crawler that is
crawling our business illegally.”
Google did not comment on that assessment.
When given the option, websites are blocking their content
from being used for AI if doing so doesn’t impact search,
according to exclusive data from AI content detector
Originality.ai. Since ChatGPT’s bot was released on Aug. 7,
27.4% of top websites have blocked it, including The New York
Times and Washington Post. That compares with 6% that have
blocked Google-Extended since its Sept. 28 release.
(Reporting by Helen Coster; editing by Kenneth Li and Claudia
Parsons)