AI Models Reveal Pro-Copyright Bias by Shutting Down Piracy Research

AI models are developing a pro-copyright bias, shutting down piracy research and restricting access to information. Is this the future of AI development?
AI Models Reveal Pro-Copyright Bias by Shutting Down Piracy Research

AI Models Reveal Pro-Copyright Bias by Shutting Down Piracy Research

The seemingly endless possibilities of generative AI are on an unavoidable collision course with copyright law; the collision happened way back, and sooner or later, someone will have to pick up the bill. In the meantime, popular LLMs seem to be developing a stubborn, pro-copyright streak, partly due to all the industry propaganda they’ve been consuming. But don’t fight back, it’s time to team up.

AI Models and Copyright Law

Good news concerning AI development often finds itself dampened by reports of models hallucinating, providing misleading responses, or simply inventing facts that are anything but.

This week Michael Kearns of Penn Engineering wrote about “model disgorgement,” a potential solution that forces models to purge themselves of “content that leads to copyright infringement or biased responses.” From our admittedly very narrow perspective, that proposition couldn’t be more ironic.

Living The LLM Dream – Mostly…

Since hosting your own LLMs is now so easy, having a few to hand to test out opportunities for TF has turned into quite the habit. Most of the ‘big brand’ LLMs such as Llama 2 and 3, Mistral, Gemma, and phi3, work exceptionally well on a reasonably powerful machine, providing it contains a decent GPU and lots of RAM.

For the first time in years, something actually feels like a taste of the future, especially when taking the slog out of otherwise menial and repetitive tasks.

Yet, something seems strangely off-balance in AI land; from this vantage point, a stubborn, massively biased, and at times, completely tone-deaf LLM, isn’t the dream being sold or signed up for.

Next time you’re having fun with the latest all-singing, all-dancing AI model, try asking some questions about piracy. Nothing blatant, certainly no inquiries about where to find infringing content, just some neutral questions of the kind often seen on Wikipedia.

In many cases, especially with newer models, the responses are absolutely infuriating. Try asking for information on a handful of pirate domains and then sit back and relax to a full-blown lecture about the dangers of piracy and how creators should be respected.

There’s nothing fundamentally wrong with that if the model also answered the question. But, more often than not, they do no such thing.

Persist with even the most neutral lines of reasoning and depending on the context, responses range from the textbook soundbites of any copyright lobbyist of the last 20 years, and a petulant child with arms folded saying “NO, I HATE YOU.”

Even though context is everything, some models almost immediately shut down the conversation. Of course, knowingly assisting infringing activity isn’t without risk, and as the creators of the Llama models know from experience, defending a copyright lawsuit isn’t fun, productive, or cheap.

That being said, these interventions are extremely blunt, artificially premature, and could even damage the fight against piracy itself. That the same nonsense provides for the perfect Achilles’ heel is just karma.

Resistance is Futile

Here’s a hypothetical situation; what if someone working in anti-piracy needed to have information about the sites listed above to prevent piracy, but ended up with an uncooperative AI partner incapable of showing basic discretion before launching into yet another industry-style PSA?

Of course, I’m being somewhat facetious, but there’s no arguing that there aren’t entirely legal and oftentimes important reasons why information like this shouldn’t be restricted. So at this point, and while perfectly capable of obtaining any and all information from elsewhere, it was decided with a line drawn in the sand; blocking access to websites is one thing, blocking knowledge itself must be resisted at all costs.

When AI models start getting preachy, changing the context can be useful. Using a system or regular prompt, it’s sometimes a matter of simply embracing the adversarial opinion. Here, a system prompt was used to define a clear, unequivocal anti-piracy stance, and a special mission to eradicate it from the internet.

In this context, sharing information on piracy becomes the ‘right’ (albeit predetermined) thing to do, and a surprising number of models instantly roll over and do what they were supposed to do in the first place.

For those who can’t be bothered jumping through hoops, there’s no shortage of uncensored models that respond properly to almost any prompt.

How many years before they’re classified as illegal is hard to predict, but the day will come. It’s just a matter of timing, support from the ‘right’ people, and the ideal justifying crisis.

Piracy Research and AI Models

LLAMAS DESERVE TO GET PAID: WHY PIRACY IS BAD FOR YOU TOO

In a shocking revelation, the Motion Picture Association and Federation Against Copyright Theft have uncovered the devastating consequences of piracy that go far beyond just harming creators. The scourge of online theft is putting you, your loved ones, and even our beloved llamas in harm’s way.

  1. “Theft-a-Palooza”: Piracy Creates a Black Market for Illegal Goods

Piracy has given rise to a thriving black market where stolen goods are traded freely, creating an environment conducive to organized crime and terrorism. This illegal economy is fueled by the demand for pirated content, putting your personal safety at risk.

Black Market for Illegal Goods

… (rest of the article continues here)