• jballs@sh.itjust.works
    8 months ago

    Lol, the part about non-copyrighted text should definitely be read with a wink.

    You can use any text that you want, but please, do not choose something copyrighted. The New York Times is currently suing OpenAI for training ChatGPT on its copyrighted material. Reddit’s data is uniquely valuable, since it’s not subject to those kinds of copyright restrictions, so it would be tragic if users were to decide to intermingle such a robust corpus of high-quality training data with copyrighted text.