Authors using a new tool to search a list of 183,000 books used to train AI are furious to find their works on the list.

    • just another dev@lemmy.my-box.dev
      1 year ago

      Fair use is any copying of copyrighted material done for a limited and “transformative” purpose, such as to comment upon, criticize, or parody a copyrighted work.

      I don’t see why it should.

      • FaceDeer@kbin.social
        1 year ago

        The creation of the AI model is transformative. The AI’s model does not contain a literal copy of the copyrighted work.

        • just another dev@lemmy.my-box.dev
          1 year ago

          No, but the training data does contain a copy. And making a model is not criticising, commenting upon, or creating a parody of it.

          • FaceDeer@kbin.social
            1 year ago

            That list is not exhaustive; it’s just a list of examples of fair use.

            The training data is not distributed with the AI model.

            • just another dev@lemmy.my-box.dev
              1 year ago

              “it’s just a list of examples of fair use.”

              Yes, it’s a list of quite similar ways of commenting upon a work. Please explain how training an LLM is like any of those things, and thus how fair use would apply.

              • FaceDeer@kbin.social
                1 year ago

                I’m not saying that training an LLM is like any of those things. I’m saying it doesn’t have to be like those things in order for it to still be fair use.

              • FontMasterFlex@lemmy.world
                1 year ago

                It’s not. The humans who trained it (presumably) purchased the material used to train it. What’s the problem?

                • BURN@lemmy.world
                  1 year ago

                  The problem is the use of the material to create a commercial product, as well as the reality that the humans training it never buy the data on an individual level.