Microsoft's CEO of AI said that content on the open web can be copied and used to create new content.
What you need to know
- Microsoft's AI CEO claimed that content shared on the web is "freeware" that can be copied and used to create new content.
- The remarks centered on Microsoft and other companies using preexisting content to train AI models.
- The CEO said content from publishers that explicitly state "do not scrape or crawl me for any other reason than indexing me so that other people can find that content" falls into a separate category, which he called a "gray area."
Microsoft may have opened a can of worms with recent comments made by the tech giant's CEO of AI, Mustafa Suleyman. The CEO spoke with CNBC's Andrew Ross Sorkin at the Aspen Ideas Festival earlier this week. In his remarks, Suleyman claimed that all content shared on the open web is available to be used for AI training unless the content producer specifically says otherwise.
"With respect to content that is already on the open web, the social contract of that content since the 90s has been that it is fair use. Anyone can copy it, recreate with it, reproduce with it. That has been freeware, if you like. That's been the understanding," said Suleyman.
"There's a separate category where a website or a publisher or a news organization had explicitly said, 'do not scrape or crawl me for any other reason than indexing me so that other people can find that content.' That's a gray area and I think that's going to work its way through the courts."
Suleyman's quote raises several questions:
- Is it actually okay to use other people's work to create new content?
- If so, is it okay to profit off those recreations or work derivative of preexisting content?
- How could websites and organizations "explicitly" say that their work cannot be used for AI training before AI became commonplace?
- Has Microsoft respected the wishes of organizations that specified their content should only be used for search indexing?
- Have Microsoft's partners, including OpenAI, respected any demands that content not be used for AI training?
Several ongoing lawsuits suggest that publishers do not agree with Suleyman's take.
Training vs. stealing
Generative AI is one of the hottest topics in tech in 2024. It's also a hot-button topic among creators. Some claim that AI trained on other people's work is a form of theft. Others equate training AI on existing work to artists studying at school. Contention often centers on monetizing work that's derivative of other content.
YouTube has reportedly offered "lumps of cash" to major record labels to train its AI models on their music libraries. The difference in that situation is that the record labels and YouTube would have agreed to terms. Suleyman, by contrast, claims that a company could use any content on the web to train AI, as long as there was no explicit statement demanding that it not be done.
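For context on what "explicitly saying do not scrape" looks like in practice, publishers typically rely on the long-standing Robots Exclusion Protocol, a robots.txt file served from their site, to tell crawlers what they may fetch. The minimal sketch below uses Python's standard urllib.robotparser module to show how a site might allow a search crawler while blocking an AI crawler; the user-agent names and URLs are illustrative assumptions, not something Suleyman referenced.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt directives a publisher might serve.
# "GPTBot" and "Bingbot" are used here as illustrative crawler names.
EXAMPLE_ROBOTS_TXT = """
User-agent: GPTBot
Disallow: /

User-agent: Bingbot
Allow: /
""".splitlines()

parser = RobotFileParser()
parser.parse(EXAMPLE_ROBOTS_TXT)

# A well-behaved crawler checks its own user agent before fetching a page.
print(parser.can_fetch("GPTBot", "https://example.com/article"))   # False: AI crawler blocked
print(parser.can_fetch("Bingbot", "https://example.com/article"))  # True: search indexing allowed
```

It's worth noting that compliance with robots.txt is voluntary: the file expresses a publisher's wishes, but nothing technically prevents a crawler from ignoring it.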