0 votes
ago in Meta by

LLM use appears to have a bunch of ethical and practical problems; the most pressing seems to be plagiarism, which looks excessive even for code and even when not baited:

Video clip of what appears to be a lawyer live-demoing Co-Pilot plagiarism

I don't know anything about law, but the lawyer at one point says: "This is a copyright infringement." I've also found this: https://www.twobirds.com/en/insights/2025/landmark-ruling-of-the-munich-regional-court-(gema-v-openai)-on-copyright-and-ai-training It seems to discuss "fair use" in the context of AI model training.

Also see this high-profile incident: https://www.pcgamer.com/software/ai/microsoft-uses-plagiarized-ai-slop-flowchart-to-explain-how-github-works-removes-it-after-original-creator-calls-it-out-careless-blatantly-amateuristic-and-lacking-any-ambition-to-put-it-gently/

Also this field study, which appears to put the plagiarism rate at at least 2-5%: https://dl.acm.org/doi/10.1145/3543507.3583199

This article mentions a study that apparently puts the plagiarism rate at a minimum of 8–15% for the easily detectable kind: https://www.theatlantic.com/technology/2026/01/ai-memorization-research/685552/

I don't know what this means legally, but at least morally and ethically this seems sad.

Apparently, some people in the Clojure space already spoke out against LLMs, but I wasn't able to find any anti-LLM policy. If there were such a policy, I would expect it to be mentioned in the obvious places.

I simply wanted to suggest that perhaps the project may want to adopt such a policy.

Here are some other projects that have already done so: Asahi Linux, elementaryOS, Forgejo, Gedit, Gentoo, GIMP, GoToSocial, Löve2D, Loupe, NetBSD, postmarketOS, Qemu, RedoxOS, Servo, stb libraries, Zig.

My deepest apologies if there already is such a policy, or if I'm asking in the wrong space.

2 Answers

+1 vote
ago by

My feeling here, as a contributor to several Contrib libraries that are under the Clojure CLA (and a one-time contributor to core!), is that the CLA already covers this from a legal standpoint:

"You covenant, represent, warrant and agree that:

  • each contribution that you submit is and shall be an original work of authorship and you can legally grant the rights set out in this RHCA;"

(plus various other clauses that cover copyright ownership/grant/etc)

I think any public declarations that "we don't accept AI-generated contributions" are performative.

If you don't have a legally-binding contributors' license agreement? Sure, then you probably need a stated policy about contributions -- but if you're concerned enough about AI-generated contributions to think that, you probably ought to have a legally-binding contributors' license agreement...

ago by
Something I should have mentioned in my answer (but wanted to confirm details before I added it):

The US Copyright Office has ruled that AI-generated output -- text, code, images -- cannot be copyrighted and the courts have agreed. Which means that you cannot legally submit AI-generated code to a project that requires you to have copyright assignment authority -- as Clojure does.
ago by
Hmmm, interesting. Does that mean that certain open source projects are not _allowed_ to use LLM generated content, even if they wanted to?

Will we end up with two parallel infrastructures?

One built on licenses that are guaranteed to have zero autocomplete used in the creation of the software (given some bureaucrat's definition of autocomplete), and another built on the public domain?

Would you end up with a "FreeThis" and a "FreeThat", etc. - just with different names, that have most of the same features but don't force you to use Vim or notepad.exe to write your code in order to satisfy some random person's definition of "too much automation"?
ago by
I should take that back. Vim users are launching missiles at me with three and a half keystrokes right now...
ago by
If you have a copyright-assigning contributors' license agreement, I don't believe you can legally submit AI-generated code -- since you don't own and cannot transfer copyright on that code.

I suspect the big legal question which we'll need the courts to figure out when a challenge arises is: at what point does "auto-complete" on "your" code become "AI-generated" and therefore non-copyrightable?

I don't think anyone knows the answer to that yet.

If I write the code and have Claude fix a bug in it that amounts to a few lines that I could have been inspired to write myself based on, say, a code fragment I found on StackOverflow as a fix for a similar bug... Is that still copyrightable (i.e., does that small bug fix inclusion invalidate my copyright as a whole on the rest of the code I personally wrote)? What is "fair use" in the context of AI-generated code fragments?
ago by
Mmhmm, very interesting.
0 votes
ago by

Hi Ellie, I'm one of these Clojure users who has vocally favored the use of LLMs in the Clojure community so I've been looking forward to contributing to a conversation about this. Thanks for bringing it up.

I'd like to address each of your points.

LLM use appears to have a bunch of ethical and practical problems

Information technology has a bunch of ethical and practical problems. That isn't new. One of the major ethical and practical problems with information technology today is copyright and patent law.

I don't know anything about law, but the lawyer at one point says: "This is a copyright infringement."

I'm okay with laws that prevent merchants from lying about the provenance of their goods, but I take offense at the notion that people are not free to transact honestly with one another using the information they already possess. And I reject the notion that people are not sovereign owners of the information they possess. I take offense at those who try to impose constraints on my sovereignty in that regard.

Also this field study, which appears to put the plagiarism rate at at least 2-5%

Impersonation will increase with AI. And people who pretend to accomplish things they didn't actually accomplish deserve to be found out. Fraud is illegal, and cheating is punishable in most environments. But that should not be construed as restricting the reuse of publicly available information about how the world works. Reusing public information and repurposing it in some honest way that is useful in a transaction of mutual interest between people - that freedom must not be curtailed.

I don't know what this means legally, but at least morally and ethically this seems sad.

The legality of copyright and patent law under a liberal system where the subjects of a state are taken to be sovereign agents, with inalienable rights, has always been on a precarious crash course. Human individual sovereignty is simply not compatible with artificially imposed information embargoes for the sake of (supposedly) temporary economic monopolies.

And this crash course was always destined to explode when AI arrived. These two worlds cannot coexist. And we've known this for decades: all byte streams can be reinterpreted; you can't actually own bytes; a universally coherent patent system in time and space is not mathematically definable without letting one attacker patent the whole thing.

An AI future was never going to be compatible with this artificial information monopoly system that we greedily invented a few hundred years ago, under the newfound powers of the panoptical state.

Apparently, some people in the Clojure space already spoke out against LLMs, but I wasn't able to find any anti-LLM policy.

I would encourage people in the technology community to debate these topics further before jumping to conclusions.

But if we look at where things are going with AI, I think it's fair to say that some percentage of their outputs will be BS for a very long time.

But there's a silver lining to that: people will still be needed to filter that BS for a very long time.

So I think we'll develop human filtration systems to channel and filter out the noise coming out of these generators.

And the larger and more important a software project is, and the more risk there is to changing a given piece of code, the more human filtration we'll want in that pipeline.

For the Linux kernel, one would hope, so many human eyeballs have reviewed a given piece of code before it's committed that it shouldn't even matter whether a human wrote it - many humans agree the code is the right direction. That's what matters.

So for projects like Clojure, you can have rules like "humans have to see this first," but that's already so obvious - everybody knows Rich would never let a branch of code enter core without his full agreement, even if an alien came from space and handed it to him.

For projects with very distributed control, where the direction of the project needs to be some principled philosophy that can survive any future leader's opinion on the project's direction, I can see the point in creating more abstract rules of engagement for how human filtration systems will limit the rate of BS leaking into a codebase.

Clojure isn't one of these distributed-control projects. It's a collection of cool technology bits from a guy we trust not to let slop in, whether from humans or computers. That's already the value proposition. If I were to propose that Rich let more LLM-generated content into Clojure, do we all not already know what the answer would be? Are these anti-LLM policy documents symbols of political solidarity around group grievances regarding climate, plagiarism, and slop? Or are you really worried Clojure might end up with slop in it?

Ultimately, going forward, given the avalanche of code that LLMs are about to create for us, I think having human-oriented slop filtration systems is going to be a necessary component of most open source ecosystems - so I'm not against Clojure having slop-prevention systems. But Clojure IMO is one of the least likely to ever have that happen anyway. It already has the strongest possible slop filtration system: all Clojure core code changes must transact through the mind of a single person named Rich Hickey. That's already the contract.

As a proponent of using more LLMs to help us explore the boundaries of what is possible, I'm also in favor of communities like Clojure adopting tools, policies, and procedures to constrain the rate of change. So I wouldn't be mad at seeing policies from Clojure around it. I would just caution everyone against producing policy documents "Against AI" that won't even mean anything in two years, when everyone has moved on and it's normal. I would frame it as preserving Clojure code quality, as opposed to some sense in which we can turn back time and somehow not have code being generated by LLMs. That's not going to happen, folks.

Anyway. I have some strong opinions on these topics, so I very much appreciate it when folks bring them up, giving us all the opportunity to think about these things a little more deeply - so thank you for posting this. I think it would be unhelpful for the Clojure community to "go to war with AI," but I'm totally in favor of arguing and having debates about pushing this stuff in the right direction, and some of that will be policy docs and guidelines and whatnot. Just my 2 cents; take it with a grain of salt.

ago by
edited ago by
I thank you for your opinions. I would like to keep my response short and focused:

1. If Clojure doesn't allow LLM code contributions for the Clojure core contributors, does that impact you as a user? (My apologies if you're actually one of those contributors.)

2. While copyright has its issues, people who are in favor of dropping it typically don't propose an alternative. In the absence of copyright, wouldn't all code be CC0? Even if you personally like that, what would that do to the FOSS ecosystem? I personally doubt it would be in anybody's interest as a whole. The chardet incident seems to show this isn't purely hypothetical.

3. Isn't there a difference between pure knowledge and e.g. valuing somebody's concrete work with as little as attribution? There seems to be some nuance here occasionally missed.

4. A ban is reversible. Having tons of LLM code in your code base that you derive from may not be. Wouldn't banning it for now perhaps be the measured choice to allow more debate?

5. "[...] some sense in which we can turn back time and somehow not have code being generated by LLMs. That's not going to happen, folks." Plenty of FOSS projects seem to be willing to put that to the test, so perhaps Clojure might consider joining them?

I don't know if that makes a ban a good idea, but I hope my input helps others figure that out.
ago by
No, I think those questions make sense. Things to consider...

I'll definitely respond soon, but I'd like to take some time to really consider your questions and give some others the opportunity to chime in.

Again, thanks for bringing this up!
ago by
> 1. If Clojure doesn't allow LLM code contributions for the Clojure core contributors, does that impact you as a user? (My apologies if you're actually one of those contributors.)

Does Clojure advertise some method by which you can be guaranteed to have your code contributed into core? Just because you're a human doesn't mean you should be able to contribute to projects. Clojure already doesn't accept drive-by PRs. It's already highly insulated from that noise. If you bring a sloppy patch, regardless of how you contrived it, you deserve to have it rejected quickly. If you had an LLM contrive it and didn't do your due diligence to vet it, you deserve negative feedback for that, or be blocked if you persist. And if literal bots start slinging PR slop at your repo, by all means block that - we'll have to start dealing with bot spam at a societal level if they start flooding our inboxes and ringing our phones like that ending scene from that Lawnmower Man movie lol. But I mean, I don't get the feeling Clojure core is super at risk here.

> 2. While copyright has its issues, people who are in favor of dropping it typically don't propose an alternative.

So before copyright and patents, artists and engineers did exist. They mostly received payment via direct commission - someone would ask them to make the piece of art or engineering beforehand, and then they would go make it. This still happens even with copyright today, where publishers give authors or artists an "advance" on their work. But then the publisher mostly takes your ownership rights. And that's the thing: most artists and engineers these days get paid that way - salary. 99% of all art and engineering is produced by people who don't actually hold the copyright. And many of the companies holding them are only doing so defensively, to protect themselves from legal predators.

And even today, most musicians for instance will tell you that they make more money touring than they do off royalties. The distribution and publishing companies have monopolized the system and are essentially gatekeeping innovation instead of incentivizing it. The opposite of what it advertises as its benefit.

> In the absence of copyright, wouldn't all code be CC0? Even if you personally like that, what would that do to the FOSS ecosystem?

So copyleft was an immoral attack against an immoral system - a sort of fighting fire with fire.

It was a "legal hack" over the copyright system - "Oh, you think you can invent a contract that gives you the right to tell me what I do with information within the confines of my own living room? Well, watch me create a contract that gives ME the right to tell YOU what to do in the confines of YOUR living room."

And it was a clever hack, because if you want to believe the first contract is morally acceptable, you have to accept that the second contract is morally acceptable.

But neither contract was ever morally acceptable. Copyleft was just a hacky bandaid over that original sin of greed and avarice.

> I personally doubt it would be in anybody's interest as a whole. The chardet incident seems to show this isn't purely hypothetical.

The chardet incident is a perfect example of how copyleft is just as nonsensical as copyright. It just doesn't make sense. It was never going to make sense. It only made sense in a few people's imaginations, starting a few hundred years ago, and given how unnatural it is I doubt it will stay there forever in the future. The liberal order of things is based on "natural law," and copyright and patents are fundamental perversions of that.

> 3. Isn't there a difference between pure knowledge and e.g. valuing somebody's concrete work with as little as attribution? There seems to be some nuance here occasionally missed.

I don't follow. What do you mean here?

If you're asking whether I think information has more value if it comes from a human, I'd say absolutely yes, for so many reasons. First and foremost, that they actually _cared_ makes it 100 times more important to me. And if someone uses AI as a _substitute_ for the care that I expect from their art or engineering, especially when that substitution is being concealed - I'm likely to block them. I don't have time for any of that, and if y'all want to go to war against that kind of slop, I'll join you.

> 4. A ban is reversible. Having tons of LLM code in your codebase that you derive from may not be. Wouldn't banning it for now perhaps be the measured choice to allow more debate?

Yeah, it's probably reversible. Probably not a big deal. I just think it's premature at the present juncture. It wouldn't be prudent. Nobody would actually think that Rich would have let LLM slop in anyway, so it would just end up looking like a performative gesture. Which is fine, maybe... Again, certain kinds of slop I'm down to go to war with. But going to war with "AI"... I mean, there's farming and carpentry and lots of other pursuits that won't push the world in this direction - but developing more code is not one of them... So "technology projects against AI"... It's just too much cognitive dissonance; people might think it's silly... Be a farmer or something. I still might one day. Might be fun. There's some degree of anti-AI psychosis going around right now. Pro-AI psychosis too, but the anti-AI variant is growing. If shit gets super scary one day, I may join the ranks of the anti-AI crazies too, but I think this'll plateau in like 2 or 3 years, and then there'll be this constant 1% slop factor that requires humans in the loop for most important things for another decade or maybe even a century.

> Plenty of FOSS projects seem to be willing to put that to the test, so perhaps Clojure might consider joining them?

Maybe. I think it'd be silly but at the end of the day it's just a document, right?

> I don't know if that makes a ban a good idea, but I hope my input helps others figure that out.

And honestly I don't know if a ban is a bad idea. If it's becoming a problem for the core team and they're getting harassed and inundated by bot slop then yeah, maybe it would be necessary. I could imagine that becoming more of a problem in the future, to where you'd have to throttle that.

Anyway, it's an interesting debate.
ago by
edited ago by
"But then the publisher mostly takes your ownership rights."

This no longer works without copyright: the store (e.g. Amazon) could sell the book without paying the writer or publisher. This is kind of what many feel chardet is demonstrating.

"Isn't there a difference between pure knowledge and e.g. valuing somebody's concrete work with as little as attribution?"

What I meant is that LLMs break this attribution contract, too. LLMs break everything the contributor may have asked for via the license; see chardet. This will make people leave the FOSS ecosystem if it becomes the norm, and the question is how bad that effect will be. This is why LLMs are so disliked by some FOSS people: it's not just copyright, it's the moral contract.

And once you add an LLM plagiarized item to your project's code, you may be part of those who broke those moral contracts.

"Nobody would actually think that Rich would have let LLM slop in anyway"

True, but I doubt he'd know who might be using AI autocomplete without saying so.

A no-LLM policy might disincentivize LLM contributors beyond the obvious slop.
ago by
> might be using AI autocomplete without saying so

Yeah, if we can't use autocomplete I feel like we're starting to lose our grip on reality.

And I don't mean to be too offensive when I say that - we're living in very interesting times and we all have to process these changes - I too am having to recalibrate my intuition.

But just listen to what you're saying...

And walk out the consequences of your strategy over the coming years...
...