@Mahlzeit

Mahlzeit@feddit.de · 11 months ago

I doubt if it is clear-cut enough to bring down enforcement in any case. However, that does not mean that the clause is enforceable.

It is easy to circumvent such a ban. Eventually, the only option that MS has is suing. Then what?

Mahlzeit@feddit.de · 11 months ago

The issue is that what they are doing there is blatantly anticompetitive.

https://www.ftc.gov/advice-guidance/competition-guidance/guide-antitrust-laws/single-firm-conduct/refusal-deal

Mahlzeit@feddit.de · 11 months ago

I wonder if that clause is legal. It could be argued that it legitimately protects the capital investment needed to make the model. I’m not sure if that’s true, though.

Mahlzeit@feddit.de · 11 months ago

FunSearch (so called because it searches for mathematical functions, not because it’s fun)

I’m probably not the only one who wondered.

Mahlzeit@feddit.de · 11 months ago

Understandable.

Mahlzeit@feddit.de · 11 months ago

Why is it important to you what some corporation does or doesn’t do?

Mahlzeit@feddit.de · edited 11 months ago

Can I ask why this is important to you? Did you donate and don’t like how your money is used?

ETA: I asked, because I wondered if it has to do with AI-tech specifically, as many here obviously believe. OP kindly answered my question in DMs. They obviously do not wish the details to be public, but I believe I can say that the answer was very reasonable and not connected to AI-tech. (There’s nothing in the answer which is private or couldn’t be made public, but it’s up to them.)

Mahlzeit@feddit.de · 1 year ago

It’s likely a reference to Yudkowsky or someone along those lines. I don’t follow that crowd.

Mahlzeit@feddit.de · 1 year ago

This touches several difficult topics.

I think my disagreement with you about AI copyright infringement is that you think that AI can create new things whereas I don’t think that.

I don’t think that matters to copyright law, as it exists.

Copyright law is all about substantial similarity in copyrightable elements. All portraits are similar by virtue of being portraits. Portraits are not copyrighted, nor can one copyright genres and such. A translation of a text has superficially no similarity with the original, but has to be authorized.

What you are saying would mean that similarity is no longer a requirement for an infringement. That’s a big change. It is copyright, after all.

Furthermore it really wouldn’t take a huge change to copyright law, just clear differences between the rules that apply to sentient vs non-sentient sources.

Non-sentient sources are not new. Take cameras, for example. Cameras have been improved over time so that less skill is necessary to operate one. It’s no longer necessary to manually focus, to set the exposure time, to develop the film, … This also means that photos today have less human creative input. In current smartphone cameras, neural AIs make many decisions and also “photoshop” the result.

It doesn’t really make sense to me to treat modern cameras differently to old ones. Or: Someone poses and renders a figure in Blender. What difference does it make if they use an old-fashioned physically based renderer or a genAI?

Nevertheless, the question of whether AIs can create something new can be answered. The formal definition of “information” is that it is a reduction in uncertainty. For example, take the sequence of letters: “creativit_”. You probably have a very clear idea what the last, missing letter is. So learning that it is “y” doesn’t give you much information.

But take the sequence: “juubfpvoi_”. The missing letter could be any lower-case letter. You may not feel very informed when you learn that it is “f”, but it does represent a much bigger reduction in uncertainty.
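To put rough numbers on the two examples, here is a minimal sketch in Python. It uses the standard measure of information for a single outcome, the surprisal, which is minus the base-2 logarithm of the outcome's probability. The function name `surprisal_bits` and both probabilities are assumptions chosen for illustration: 0.99 for a reader guessing “y”, and a uniform 1/26 for the random string.

```python
import math


def surprisal_bits(p: float) -> float:
    """Information, in bits, gained by observing an outcome of probability p."""
    return -math.log2(p)


# Assumption: a reader is about 99% sure that "creativit_" ends in "y".
p_predictable = 0.99
# Assumption: after "juubfpvoi_" all 26 lower-case letters are equally likely.
p_random = 1 / 26

bits_predictable = surprisal_bits(p_predictable)
bits_random = surprisal_bits(p_random)

print(f"'y' after 'creativit_': {bits_predictable:.3f} bits")
print(f"'f' after 'juubfpvoi_': {bits_random:.3f} bits")
```

Under these assumptions the random letter carries about 4.7 bits, while the predictable one carries only about a hundredth of a bit.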

When we write texts, we use the same old words in the dictionary; a few tens of thousands at most. We string them together with the same old rules of grammar to tell the same old things. The sky is blue, things fall down, not up; people love and hate, and in the end the good guys win. You can probably think of exceptions to all these. They are exceptions. We create small variations on the same old themes. We rehash.

If a story does not cater to expectations, then it’s not believable. People should behave as we know people to behave. The laws of nature should be consistent and familiar. Most of all: The conventions of the genre should be followed. As a human, you are supposed to lift ideas from previous works. New ideas may be appreciated, but are not required.

The second string was, in fact, created by a machine; not an AI, but an RNG. Even with many GBs of output, it should be impossible to find any biases or patterns that allow one to guess at the next letter. I didn’t make one up myself because humans are not very random even when we try. And when we write, we do our best to reduce our randomness even further. We try not to invent new spellings; i.e., make spelling errors.

AIs receive input from a pRNG, which means that they create new things. What they are supposed to do is to strip away all that novel information and create something largely predictable. They often fail and, say, create images of humans with an innovative number of fingers. LLMs make continuity errors, or simply start to spout gibberish. The problem is that AIs create too many new things, not that they don’t.

Mahlzeit@feddit.de · 1 year ago

Can we get back to this? I am confused why you believe that AIs like ChatGPT spit out “exact copies”. That they spit out memorized training data is unusual in normal operation. Is there some misunderstanding here?

Mahlzeit@feddit.de · 1 year ago

Ok, where did GPT-4 copy the ransomware code? You can’t reshuffle lines of code much before the program breaks. Should be easy to find.

Mahlzeit@feddit.de · 1 year ago

Well, that’s simply not true.

Mahlzeit@feddit.de · 1 year ago

Well, that is a philosophical or religious argument. It’s somewhat reminiscent of the claim that evolution can’t add information. That can’t be the basis for law.

In any case, it doesn’t matter to copyright law as is that you see it that way. The AI is the equivalent of that book on how to write bestsellers in my earlier reply. People extract information from copyrighted works to create new works, without needing permission. A closer example is programmers, who look into copyrighted references while they create.

Mahlzeit@feddit.de · 1 year ago

I didn’t downvote you. (Just gave you an upvote, though.) You’re reasonable and polite, so a downvote would be very inappropriate. Sorry for that.

Music is having ongoing problems with copyright litigation, most recently with Ed Sheeran. From what I have read, it’s blamed on juries without the necessary musical background. As far as I know, higher courts usually strike down these cases, as with Sheeran. Hip hop was neutered, in a blow to (African-)American culture. While it was obviously wrong not to find for fair use in that case, samples are copies.

It’s not so bad outside of music. You can write books on “how to write a bestseller”, or “how to draw comics” without needing permission. Of course, you would study many novels and images to get material. The purpose of books is that we learn from them. That we go on to use this to make our own thing is intended (in the US).

What you’re proposing there would be a great change to copyright law and probably disastrous. Even if one could limit the immediate effect to new technologies, it would severely limit authors in adopting these technologies.

Mahlzeit@feddit.de · 1 year ago

Yes, if it’s new content, it’s obviously no copy; so no copyvio (unless derivative, like fan fiction, etc.). I was thinking of memorized training data being regurgitated.

Mahlzeit@feddit.de · 1 year ago

I understand. The idea would be to hold AI makers liable for contributory infringement, reminiscent of the Betamax case.

I don’t think that would work in court. The argument is much weaker here than in the Betamax case, and even then it didn’t convince. But yes, it’s prudent to get the explicit permission, just in case of a case.

Mahlzeit@feddit.de · 1 year ago

That shouldn’t be an issue. If you look at an unauthorized image copy, you’re not usually on the hook (unless you are intentionally pirating). It’s unlikely that they needed to get explicit “consent” (i.e., license the images) in the first place.

Mahlzeit@feddit.de · 1 year ago

The models are deliberately engineered to create “good” images, just like cameras get autofocus, anti-shake and stuff. There are many tools that will auto-prettify people, not so many for the reverse.

There are enough imperfect images around for the model to know what that looks like.

Mahlzeit@feddit.de · 1 year ago

That ought to satisfy all those who wanted “consent” for training data.

Mahlzeit@feddit.de · 1 year ago

It just seems that Google should have been able to move faster. Yes, they did publish a lot of important stuff, but seeing the splash that came from Stability and OpenAI, they seem to have done so little with it. What their researchers published was important, but I can’t help thinking that a public university would have disseminated such research more openly and widely. Well, I may be wrong. I don’t have inside knowledge.