Now, Hindi publishers barring AI & tech firms from using content without permission

Dainik Bhaskar and Amar Ujala have updated their terms and conditions to bar tech firms from using their content without written permission

by Kanchan Srivastava
Published - September 14, 2023

Indian news publishers have decided to take on not only OpenAI but all tech firms that are working on large language models (LLMs) to build their own generative AI tools.

Taking the lead in the fight against digital giants, Dainik Bhaskar and Amar Ujala have barred all “AI and tech firms” from scanning and using their digital content to train LLMs without written permission.

Both publishers updated the terms and conditions on their websites this week, defining “non-commercial use” of content in detail to warn AI and tech firms that seek to train their LLMs on news websites' content.

The updated “terms and conditions” of Dainik Bhaskar’s website read, “All materials published or available on the Services are protected by copyright, and owned or controlled by DBCL solely or in association with third parties or with such other parties who are given credit as the provider of the Content. Non-commercial use of the Service shall also include the use of Content only upon obtaining prior written consent from DBCL in connection with: (1) the development of any software program, including, but not limited to, training a machine learning or artificial intelligence (AI) system or large language model (LLM); or (2) providing archived or cached data sets containing Content to another person or entity.”

The company has also defined the content as “including, but not limited to all text, photographs, images, illustrations, designs, audio clips, video clips, ‘look and feel,’ metadata, data, or compilations, all referred to as the ‘Content’.”

Amar Ujala has added similar conditions, in Hindi, on its website. The daily has barred the makers of all devices and services, including robots and spiders, from using its content without written permission.

e4m earlier reported that over 70 per cent of Digital News Publishers Association (DNPA) members have restricted Microsoft-backed OpenAI's access to their websites. Global media houses like The New York Times, The Guardian, CNN and Reuters have already blocked OpenAI’s access to their online offerings.

https://www.exchange4media.com/media-others-news/70-dnpa-members-blocked-openais-access-to-their-websites-129775.html

Generative AI tools, including ChatGPT, are based on large language models (LLMs), which are trained on vast numbers of documents taken from the internet: news articles, authored essays, technical reports, blogs and social media posts, among others.

OpenAI has allegedly neither acknowledged publishers' contributions to ChatGPT's development nor offered them any revenue-sharing model so far. On top of that, the firm is making money through subscriptions, DNPA members told e4m, adding that media houses invest huge amounts in producing content that AI firms are using for free.

More publishers are in the process of updating the terms and conditions of their websites to guard against copyright violations, though none is ready to speak on the record. They are also blocking the web crawlers of all such firms.

"Many of our members have taken action to block web crawlers and are currently in the process of updating their terms and conditions," DNPA’s Secretary General Sujata Gupta had told e4m on Tuesday.

Digital ecosystem disrupted

“Backed by Microsoft, OpenAI has disrupted the digital ecosystem by launching its powerful generative AI tool ChatGPT last November,” the digital head of a prominent TV channel said.

While news publishers are struggling on the revenue front for a range of reasons, including the loss of referral traffic over the past year, OpenAI was valued at $27 billion in a funding round in April, within months of ChatGPT's launch.

Meanwhile, tech giants like Google and Meta are in the process of launching their own generative AI tools, which has alarmed the publishers. Some firms, like Chatsonic, have already rolled out ChatGPT-type platforms.

Indian publishers allege that Google built its business model on their content but never gave them a fair share of the revenue, a charge that Google India has rejected many times.

It is noteworthy that DNPA took Google India to the Competition Commission of India (CCI) two years ago, alleging that the tech giant was not giving publishers their due share of its advertising revenue, a charge that Google has always denied. The matter is still pending at the CCI.
