Updates to GitHub Copilot interaction data usage policy

If you scroll down to "Allow GitHub to use my data for AI model training" in GitHub settings, you can enable or disable it. However, what really gets me is how they pitch it like it’s some kind of user-facing feature:

Enabled = You will have access to the feature

Disabled = You won't have access to the feature

As if handing over your data for free is a perk. Kinda hilarious.

> On April 24 we'll start using GitHub Copilot interaction data for AI model training unless you opt out. Review this update and manage your preferences in your GitHub account settings.

Now "Allow GitHub to use my data for AI model training" is enabled by default.

Turn it off here: https://github.com/settings/copilot/features

Do they have this set on business accounts also by default? If so, this is really shady.

Fun fact: Copilot gives you no way to ignore sensitive files with API keys, passwords, DB credentials, etc.: https://github.com/orgs/community/discussions/11254#discussi...

So by default you send all this to Microsoft by opening your IDE.

If I'm paying, which I am, I want to have to opt in, not opt out. Mario Rodriguez / @mariorod needs to give his head a wobble.

What on earth are they thinking...

Thanks to GitHub and the AI apocalypse, all my software is now stored on a private git repository on my server.

Why would I even spend time choosing a copyleft license if any bot will use my code as training data for commercial applications? I'm not planning on creating any more open source code, and whichever of my projects still have users will be left on GH for posterity.

If you're still serious about open source, time to move to Codeberg.

What is the legal basis of this in the EU? Ignoring the fact they could end up stealing IP, it seems like the collected information could easily contain PII, and consent would have to be

> freely given, specific, informed and unambiguous. In order to obtain freely given consent, it must be given on a voluntary basis.

> This approach aligns with established industry practices

"others are doing it too so it's ok"

So basically they want to retain everyone's full codebases?

> The data used in this program may be shared with GitHub affiliates, which are companies in our corporate family including Microsoft

So every Microsoft owned company will have access to all data Copilot wants to store?

Why is there no option to cancel the Copilot subscription here? The docs say there should be...

Mobile

https://github.com/settings/billing/licensing

EDIT:

https://docs.github.com/en/copilot/how-tos/manage-your-accou...

> If you have been granted a free access to Copilot as a verified student, teacher, or maintainer of a popular open source project, you won’t be able to cancel your plan.

Oh, jeez.

I appreciated the notification at the top of the screen because it prompted me to disable every single Copilot feature I possibly could from my account. I also appreciated Microsoft for making Windows 11 horrible so I could fall back in love with Linux again.

For what it's worth they're not trying to hide this change at all and are very upfront about it and made it quite simple to opt out.

Microsoft doing dumb things once again.

Who in their right mind will opt into sharing their code for training? Absolutely nobody. This is just a dark pattern.

Btw, even if disabled, I have zero confidence they are not already training on our data.

I would also recommend sprinkling copyright notices all over the place and changing the license of every file, just in case they have some sanity checks before your data gets consumed - just to be sure.

Serious question: let's say I host proprietary code, written for my various clients, on this platform. Who can guarantee me that AI won't replicate it for competitors who decide to create something similar to my product?

It’s not clear to me how GitHub would enforce the “we don’t use enterprise repos” stuff alongside “we will use free tier Copilot for training”.

A user can be a contributor to a private repository without having a Copilot license from the organisation that owns it. They can still use their personal free tier Copilot on that repository.

How can enterprises be confident that their IP isn’t being absorbed into the GH models in that scenario?

The fact that this is on by default, especially for paid accounts and even more especially for organizations, where certain types of privacy are sometimes mandated by the industry your business is in, is ridiculous.

There should also be a much easier one-click to opt out without having to scroll way down on the settings page.

I am not certain this is that big of a deal outside of "making AI better".

At this point, is there any magic in software development?

If you have super-secret content, is a third party the best location?

I just checked my GitHub settings, and found that sharing my data was "enabled".

This setting does not represent my wishes and I definitely would not have set it that way on purpose. It was either defaulted that way, or when the option was presented to me I configured it the opposite of how I intended.

Fortunately, none of the work I do these days with Copilot enabled is sensitive (if it was I would have been much more paranoid).

I'm in the USA and pay for Copilot as an individual.

Shit like this is why I pay for duck.ai where the main selling point is that the product is private by default.

They use data from the poor student tier, but arguably, large corporates and businesses hiring talented devs are going to create higher quality training data. Just looking at it logically, not that I like any of this...

On my Android phone I was able to change the setting using Firefox by logging into GitHub and not allowing it to launch the GitHub app.

I was unable to change the setting when I used the GitHub app to open up the web page in a container; button clicks weren't working. Quite frustrating.

I wish GitHub would focus on making their service reliable instead of Copilot and opting folks into their data being stolen for training.

Mine was defaulted to disabled. I’m on the Education pro plan (academic), so maybe that’s different than personal?

I have GitHub Copilot Pro. I don't believe I signed up for it. I neither use it nor want it.

1. A lot of settings are 'Enabled' with no option to opt out. What can I do?

2. How do I opt out of data collection? I see the message informing me to opt out, but 'Allow GitHub to use my data for AI model training' is already disabled for my account.

I'm ready to abandon GitHub. Enshittification of the world's source infrastructure is just a matter of time.

So, how does this work with source-available code, that’s still licensed as proprietary - or released under a license which requires attribution?

If someone takes that code and pokes around on it with a free tier Copilot account, GitHub will just absorb it into their model - even if it’s explicitly against that code’s license to do so?

Bold move. Who uses Copilot these days? Unless they have free credit I mean.

Finally. The option for me to enable Copilot data sharing has been locked as disabled for some time, so until now I couldn't even enable it if I wanted to.

Two issues with this:

1. Vulnerabilities and secrets can be leaked to other users.

2. Intellectual property can also be leaked to other users.

Most smart clients won't opt out, they will just cut usage entirely.

Making this option opt-out rather than opt-in is a very shady choice, GitHub.

Checked and mine was already on disabled. Don't remember if I previously toggled it or not...

> Content from your issues, discussions, or private repositories at rest. We use the phrase “at rest” deliberately because Copilot does process code from private repositories when you are actively using Copilot. This interaction data is required to run the service and could be used for model training unless you opt out.

Sounds like it's even likely to train on content from private repositories. This feels like a bit of an overstep to me.

We all knew Microsoft was going to destroy GitHub eventually when it was first bought.

How much longer do you want to tolerate the enshittification? How much longer CAN you tolerate it?

Is it legal? Surely not in any EU country.

If this doesn't sound bad enough, it's possible that Copilot is already enabled. As we know, these kinds of features are pushed to users instead of being asked for.

Maybe it's already active in our accounts and we don't realize it, so our code will be used to train the AI.

Now we can't be sure if this will happen or not, but a company like GitHub should be staying miles away from this kind of policy. I personally wouldn't use GitHub for private corporate repositories. Only as a public web interface for public repos.

So I do all the work of thinking about how to do something, and as soon as I tell Copilot about it, now it's in the training data and anyone can ask the LLM and it'll tell them the solution I came up with? Great. I'm going to cancel.

I'll be moving off GitHub now.

> From April 24 onward, interaction data—specifically inputs, outputs, code snippets, and associated context—from Copilot Free, Pro, and Pro+ users will be used to train and improve our AI models unless they opt out.

Now is the time to run off of GitHub and consider Codeberg or self hosting like I said before. [0]

[0] https://news.ycombinator.com/item?id=22867803

As it's enabled by default, does that mean everything has already been siphoned off and now I'm just closing the gate behind the animals escaping?

Shit like this shouldn't be allowed.