How Tech Giants Cut Corners to Harvest Data for A.I.
Meta debated buying a publisher like Simon & Schuster for AI training data...
The New York Times
Various
There is a lot in this NYT report – bylined by five reporters: Cade Metz, Cecilia Kang, Sheera Frenkel, Stuart A. Thompson and
As much as the blog author drags this idea, if there is any sort of legal crackdown (whether through legislation or legal precedent via lawsuit) on using scraped, non-public-domain data to train commercial models, this is where the companies are going to turn.
They’re going to buy up or strike deals with companies that hold actionable rights over data they can legally license and/or sell. It’s the logical route, and as long as the majority of consumers aren’t affected as far as content access goes, they’re not going to care. Likewise, content creators will be stuck between a rock and a hard place, trying to find replacement platforms, publishers, etc. that are built around safeguarding creators against such deals.