admin: Draft policy on use of AI coding assistants #5072
lgritz wants to merge 2 commits into AcademySoftwareFoundation:main from
Conversation
Signed-off-by: Larry Gritz <lg@larrygritz.com>
docs/dev/AI_Policy.md
> **AI tools may not be used to fix GitHub issues labeled as "good first issue"**, and are strongly discouraged for "Dev Days" work. Cultivating and educating new contributors is part of our job, and as such, we do not want people to swoop in and use tools to trivially solve these tasks that were curated specifically for somebody to actually learn from.
Just as some students will secretly use AI to do their assignments, I suspect the same will happen here whether we allow it or not.
What we could do, however, is only allow a contributor to pick one item from the "good first issue". After that they would need approval from the maintainers if they want to keep submitting from this list. This would prevent someone grabbing the entire list. If they use AI for their one submission, hopefully they still got to learn something about OIIO...
Unlike a school assignment, there is literally nothing to be gained by breaking the rules here. I think we just need to let people know our expectations, and not worry about or have any overhead of allowing/enforcing this particular point. The same situation would occur if a senior developer just spent the day (without AI) doing 10 different GFIs and left none for beginners. We'd tell them to knock it off, but we wouldn't worry about putting any kind of approval system in place.
I think some people might see plenty of gain in this. I believe some might see getting a bunch of PRs submitted and accepted as helping their prospects of getting hired. Others might see it as a gamification where their goal is to get as many PRs completed across projects so that their "score" goes up. Whether they learned anything or whether they truly care about the project might be irrelevant to them.
I'm fine with starting with simple requests, like in the proposal, and only adding in enforcement if this ever starts to become a problem.
I see your point, @ThiagoIze. I guess we've already seen something like that in people who breeze through with a fuzz-induced crash who want to round it up to a major security issue to get the credit for finding it. I assume they are receiving some kind of accolades elsewhere?
There is nothing about this that lasts forever. If we're seeing a problem in practice, we can adjust and find some way to prevent it.
I think that orthogonal to AI tools, we can certainly say in our DD explanations that people can do more than one PR if they want, but only one should be GFI.
I highly recommend designing project-wide SKILL.md and AGENTS.md files. For example, llama.cpp has a prebuilt agents file that already defines some rulesets. From our discussions and my personal experience, the most important is SKILL.md, where you define the coding style, allowed language subsets, level of abstraction, and optimizations (like no inline lambdas or no exceptions on hot paths). I can share my cpp-hpc skill, which could be adapted for OIIO style (e.g., loosening the rule against lambdas 😅).
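To make this concrete, here is a minimal sketch of what such a SKILL.md fragment might look like; every rule below is a hypothetical illustration, not actual OIIO policy:

```
# SKILL: cpp-hpc (hypothetical example fragment)

## Coding style
- Follow the formatting of the surrounding file; never reformat untouched code.

## Language subset
- C++17 only; no compiler-specific extensions.

## Hot paths
- No exceptions and no heap allocation inside per-pixel loops.
- Avoid inline lambdas in performance-critical inner loops.

## Testing
- Every new feature must come with a unit test; run the full test suite
  before proposing a change.
```

A tool that reads this file up front applies the same constraints to every generated change, rather than relying on per-prompt reminders.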
Another critical rule is fuzzing and tests. For example, both OpenMeta implementations (C++ and Rust) are covered not only by corpus tests but also by fuzz tests, and the LLM runs these tests on every new feature it adds.
I would also recommend requiring cross-review using a different LLM. For example, code from Claude Opus/Sonnet must be reviewed by GPT5.4/Gemini 3.1 or the latest Kimi, GLM, or Qwen. Part of the PR could be code-review.md files from LLMs that were not used for the coding. We could also define an LLM_REVIEW_RULES.md with the set of questions an LLM should answer when reviewing code.
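As a sketch, such an LLM_REVIEW_RULES.md question set might look like the following; the filename convention and questions are hypothetical examples, not an agreed standard:

```
# LLM_REVIEW_RULES.md (hypothetical example)

For each changed function, the reviewing model must answer:
1. Does the change alter any public API or ABI?
2. Are the new code paths covered by existing or new tests?
3. Could any input trigger undefined behavior (overflow, out-of-bounds
   access, uninitialized reads)?
4. Does the code match the idioms and style of the surrounding file?
5. Does any added code closely resemble well-known external code?

Record the answers in a code-review file attached to the PR, noting
which model produced the review.
```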
Regarding "prompt" sharing: that idea probably comes from a lack of understanding of what an LLM-assisted coding workflow actually looks like.
docs/dev/AI_Policy.md
> to take reasonable care that code is not copied from a source with an incompatible license.

> We believe that the reputable coding assistant tools automatically exclude
I can read this as (1) wishful thinking ;) or (2) an ironic statement that there are no reputable coding assistant tools. I don't believe anything in this paragraph can be demonstrated, or is actually implemented. I can find no evidence in any source-available assistants, and the secondary implication that such avoidance may be in the training or fine-tuning is, again, not demonstrable or documented. Better just to omit, I think.
Am I wrong in reading the claims from Claude, Copilot, and maybe others as implying that they have some internal safeguards that prevent them from giving you answers that contain wholesale copying?
But I can omit that if you think it's best.
I think we just don't know, and I certainly don't trust the documentation from those tools being honest.
meshula left a comment
Just a comment that we should probably not imply that coding tools are more capable than they are; there is no evidence, for example, that GPL-based code is absent from the training corpus of any closed-provenance LLM.
I highly recommend watching the 3blue1brown video series about neural networks and how they work: https://youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi&si=xKDbI8nsk8rrmpjT
I'm pretty sure I said that if you pre-write a spec, you should share the spec, but if there was an extended conversation, a short summary was sufficient. Like, the following would be fine as far as I'm concerned: "I used the Coffee 5.2 model to design a rewrite of class X to change from a hash map to a sorted list with binary search, there were several iterations about the specifics, and when I was satisfied, I had it perform the analogous transformation of classes Y and Z in a totally automated fashion with no further intervention needed." But if you happened to write several specific paragraphs of instructions as a full spec and it proceeded from that in a nearly automated way, and you think it would be at all helpful for people to see exactly what you did, then you are encouraged to just paste the spec into the commit (and that's probably less work than writing a summary paragraph).
We can recommend trying multiple tools to check each other as a sound practice if available, but I don't see how we can require it. It's not reasonable to assume people have access to multiple paid services as a prerequisite to using any of them. The review that counts is the human one; the AI reviews are merely advisory anyway.
Yes, I'm hoping that as people gain experience, we'll end up populating the repo with various skills and other useful things that can be shared by the developers using the popular tools.
Yes, this is a recommendation from practice. An LLM, especially in a continuous session, can miss critical points just as a human can.
Oh, I hope I didn't make it sound like that. I'm sure they're all trained on everything. But I am under the impression that the big paid tools like Claude Code and Copilot have some provisions built into the tooling that prevent the answers they give you from containing substantial portions exactly copied from anything. But even aside from that, in practice, my experience is that when operating inside an existing code base, these things are so good at making code that conforms to the style, practice, and idioms of the surrounding code and meshes into it well, that my intuition is that it's not a direct copy of anything elsewhere. If I saw something that looked like it didn't belong, I'd be suspicious and probably reject it or rewrite it myself. It's hard to codify that intuition, but my feeling and experience are that I can kinda tell when something it suggests is almost certainly on the safe side of the line, versus when it is suspicious. I have much less intuition, and much less trust, about asking it to create a codebase from scratch -- then I can't help but wonder where it may have been copied from. But of course, these rules are for this project, which is a large established code base into which wholesale copying from another code base is likely to look foreign. We should also take a step back and remember where we are today, with no AI: We actually have no safeguards in place that prevent somebody from (purposely or inadvertently) not abiding by the DCO/CLA, and having their "human-written" PR actually contain code that is too similar to code elsewhere that they have seen, referred to while writing, or directly copied. We have always lived with a certain amount of that risk. It's not at all clear to me that using code assistants increases that risk.
I think we all know how they work. The concern about GPL is that sometimes, transformers really can spit out text that's so similar to an extended passage from the corpus that it's essentially a copy, even though it was a probabilistic process that got there. It's only the end product that's important in some sense -- if we end up with a large section "copied" (or very nearly so) from a GPL project, saying "but the AI doesn't work like that" is not going to help; we'll need to fix it. Fortunately, in an established code base, I tend to see the results match the surrounding code base in style and idiom so well that it seems highly improbable that a protectable amount of the code matches some other codebase in a close to verbatim way. I'm not super worried about it in practice in this project, as perhaps I would be in a new project starting from scratch that had no surrounding code to conform to.
As an overall comment, I think it's really important to state a few assumptions: The genie is out of the bottle -- we know that people are going to use these tools for code in their PRs. They already are. Most of the senior developers I know actually want to use these tools. And for the most part, I trust them to have good judgment about when and how to use them.

If we don't have any guidelines, people will use them any which way, including ways we wish they wouldn't or that are detrimental to the project. But if we try to ban the tools or make the rules too burdensome, people will just lie to evade the rules. It will be hard to "catch" them, except in the most obvious and inept cases. So all we're really trying to do here is find a balance of how to communicate our values and expectations in a way that uses the lightest touch necessary to prevent most of the unwanted behaviors.

So we're boiling it down to just the basics: Use coding tools if you find them helpful, but you're still on the hook for fully understanding and standing behind what you submit. Interact with the project and community yourself, not by agent. Disclose what tools you used and how (to a non-burdensome level of detail). Don't waste maintainer time with low quality PRs or interfere with other project goals like saving the curated introductory issues for actual beginners.

The rest is just fleshing these points out with a little more explanation and rationale. We don't want to get TOO prescriptive, though we might have more specific recommendations over time as we gain experience with how the tools pan out in this code base.
@lgritz how about creating an LLM-friendly folder with Codex/Claude subfolders, to use for tailored SKILL and AGENTS files? (Sadly there is no single standard there.)
Yes, that's precisely my intent.
It's worth highlighting up front that the project maintainers will not even LOOK at a PR until the CLA is signed, and the CLA must be signed by an actual human or corporation. Maybe this is in place already, but the policy should be to auto-reject PRs without a signed CLA. In legit cases where the submitter signs after submission (which happens frequently), maintainers can always re-open the PR. Admittedly I do occasionally help a submitter through fixing a PR and getting the CLA signed at the same time, when they appear legit and worth helping.
It's not frequently, it's always for new contributors. When, as CLA manager, I add an employee to the CLA list in EasyCLA, that does not complete the process. They still have to submit the PR, get the "you don't have a CLA on record" comment appended, and then get diverted to a secondary process where they have to "accept" being included in the company's CLA. And God forbid they do the submission from the wrong account or didn't have every commit signed using the very same email associated with the GH userid that the CLA system knows about. Given the current state of how the system works, auto-rejecting the entire PR of literally every first-time contributor (and every regular who makes any of several simple mistakes) feels like it will be perceived as intentionally hostile to developers. I think the auto-close suggestion could make sense if EasyCLA were completely overhauled -- by which I mean, there was a simple and foolproof way to do all the paperwork ahead of the PR submission and be 100% sure that the PR would go through without a hitch, and a totally reliable way to see while preparing your submission (but before you hit the final "submit" button) whether you are all clear from the CLA perspective. And anyway, it would only protect against a true drive-by by an agent unknown to us. It wouldn't stop somebody who had already signed the CLA once but then turns on agents to do things using their GH credentials.
@meshula and others, here's one of the links supporting this impression, for Claude: And for GitHub Copilot:
Personally I wish we could just ban AI contributions outright, but I know I'm in the minority in this view.
I hear you, brother. I wish this tech had never come along. But it's here and I don't think it's possible to keep people from using it. The best we can do is try to shape the norms of how it's used. Also, the more time goes on, the more developers I like and respect really want to use these tools, and I don't want to be on the side of not trusting them to choose their own tools and use them responsibly. And now I'm becoming one of them; I don't know exactly what role it will have in my workflow a year from now, but I know I don't want to be prevented from figuring that out. But it's on us -- through norms, rigorous code review, etc. -- to ensure that what people are doing doesn't negatively affect the quality of these projects or the way we interact with each other.
I tried to make an AGENTS.md file with Codex 5.4 ExtraHigh.
Claude needs its own CLAUDE.md; it can reference AGENTS.md, but it's probably worth making a tailored file. Similarly for Gemini. It's also worth designing (again, per vendor) SKILL files with more technical specs on coding patterns. By the way, I needed to build llama.cpp on a Jetson and used Codex CLI for this, and one of its first questions was about contribution, with a warning that fully AI-based commits are prohibited by their AGENTS.md.
I'm still a beginner at this, so this is a serious question: Your AGENTS.md seems pretty generic, not model- or tool-specific, and seems like it would be totally adequate as a CLAUDE.md. So what would you imagine needs more customization per-tool, rather than having the tools just refer to it and all share the same general instructions?
* Brief TLDR summary of our principles at the beginning.
* A little more permissive language with what's expected for disclosures, but added "Assisted-by:" sign-off suggestion as the minimum.
* At the risk of being more wordy, revised the IP/DCO section to make suggestions about what factors might make people more confident that it's not copied (detailed spec or revision by human, extending existing code vs new code, seeming to tightly fit existing idioms, use of tools that make statements about copying guardrails).
* Clarify that these rules don't apply to using LLMs to explain or learn about the code base, nor to asking them to review your code prior to submission (assuming you don't ask them to do fixes for you).

Signed-off-by: Larry Gritz <lg@larrygritz.com>
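As a sketch of the minimal disclosure, a commit message carrying the suggested "Assisted-by:" trailer alongside the usual DCO sign-off might look like this; the tool name, author, and commit subject are placeholders, and the trailer format is assumed to mirror the Signed-off-by: convention:

```
Fix off-by-one in tile boundary computation

Initial patch drafted with an AI coding assistant; reviewed, tested,
and revised by hand before submission.

Assisted-by: SomeCodingAssistant vX.Y
Signed-off-by: A. Developer <dev@example.com>
```

Keeping the disclosure as a trailer means it is machine-greppable later without adding any real burden to the contributor.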
@lgritz looks like AGENTS.md can serve as a well-supported generic LLM readme.
I think the policy is entirely reasonable, and I don't really have concrete suggestions to improve it. From experience it's very helpful to have a policy to point to. But it still leaves tricky situations, and probably there is not much to be done other than deal with them case by case.
And some other thoughts about AI contributions:
Thanks for the thoughtful response, @brechtvl. I agree with all of that. Thankfully, I think on most of the ASWF projects, we have small developer communities and can rely a lot on trust of the developers, and trust that the maintainers can make mostly-correct judgments. It may be that none of these things are real problems in practice for us; we shall see. I don't envy the position you're in with Blender, which operates at an entirely different scale and probably will be presented with thorny dilemmas on a daily basis.
Just to close the loop: I believe that there is a general intent, but in practice see https://arxiv.org/abs/2505.12546 and all the "Claude etc. output Harry Potter books" stories. So my vector here is just to omit any expectation that such mechanisms might be operational :)
Draft for comments and discussion!
Have I forgotten something important? Am I over-emphasizing something irrelevant?
Nothing is set in stone. Feedback requested.