> The data used in this program may be shared with GitHub affiliates, which are companies in our corporate family including Microsoft
So every Microsoft owned company will have access to all data Copilot wants to store?
Mobile
https://github.com/settings/billing/licensing
EDIT:
https://docs.github.com/en/copilot/how-tos/manage-your-accou...
> If you have been granted a free access to Copilot as a verified student, teacher, or maintainer of a popular open source project, you won’t be able to cancel your plan.
Oh. jeez.
Who in their right mind will opt into sharing their code for training? Absolutely nobody. This is just a dark pattern.
Btw, even if disabled, I have zero confidence they are not already training on our data.
I would also recommend sprinkling copyright notices all over the place and changing the license of every file, in case they run some sanity checks before your data gets consumed - just to be sure.
This setting does not represent my wishes and I definitely would not have set it that way on purpose. It was either defaulted that way, or when the option was presented to me I configured it the opposite of how I intended.
Fortunately, none of the work I do these days with Copilot enabled is sensitive (if it was I would have been much more paranoid).
I'm in the USA and pay for Copilot as an individual.
Shit like this is why I pay for duck.ai where the main selling point is that the product is private by default.
I was unable to change the setting when I used the GitHub app to open the web page in a container: button clicks weren't working. Quite frustrating.
Sounds like it's even likely to train on content from private repositories. This feels like a bit of an overstep to me.
Maybe it's already active in our accounts and we don't realize it, so our code will be used to train the AI.
Now we can't be sure if this will happen or not, but a company like GitHub should be staying miles away from this kind of policy. I personally wouldn't use GitHub for private corporate repositories. Only as a public web interface for public repos.
Shit like this shouldn't be allowed.
Enabled = You will have access to the feature
Disabled = You won't have access to the feature
As if handing over your data for free is a perk. Kinda hilarious.
Now "Allow GitHub to use my data for AI model training" is enabled by default.
Turn it off here: https://github.com/settings/copilot/features
Do they have this set on business accounts also by default? If so, this is really shady.
So by default you send all this to Microsoft by opening your IDE.
What on earth are they thinking...
Why would I even spend time choosing a copyleft license if any bot will use my code as training data to be used in commercial applications? I'm not planning on creating any more opensource code, and what projects of mine still have users will be left on GH for posterity.
If you're still serious about opensource, time to move to Codeberg.
> freely given, specific, informed and unambiguous. In order to obtain freely given consent, it must be given on a voluntary basis.
"others are doing it too so it's ok"
A user can be a contributor to a private repository, but not have that repository owner organisation’s license to use copilot. They can still use their personal free tier copilot on that repository.
How can enterprises be confident that their IP isn’t being absorbed into the GH models in that scenario?
At this point, is there any magic in software development?
If you have super-secret content, is a third party the best location?
1. A lot of settings are 'Enabled' with no option to opt out. What can I do?
2. How do I opt out of data collection? I see the message informing me to opt out, but 'Allow GitHub to use my data for AI model training' is already disabled for my account.
If someone takes that code and pokes around on it with a free tier copilot account, GitHub will just absorb it into their model - even if it’s explicitly against that code’s license to do so?
1. Vulnerabilities and secrets can be leaked to other users. 2. Intellectual property can also be leaked to other users.
Most smart clients won't opt-out, they will just cut usage entirely.
Now is the time to run off of GitHub and consider Codeberg or self hosting like I said before. [0]
Dark pattern and dick move.
It could be incompetence but it shouldn't matter. This level of incompetence should be punished equally to malice.
What does “my code...for my clients” mean (is it yours or theirs)? If it’s theirs let them house it and delegate access to you. If they want to risk it being, ahem...borrowed, that’s their business decision to make.
If it’s yours, you can host it yourself and maintain privacy, but the long tail risk of maintaining it is not as trivial as it seems on the surface. You need to have backups, encrypted, at different locations, geographically distant, so either you need physical security, or you’re using the cloud and need monitoring and alerting, and then need something to monitor the monitor.
It’s like life. Freedom means freedom from tyranny, not freedom from obligation. Choosing a community or living solo in the wilderness both come with different obligations. You can pay taxes (and hope you’re not getting screwed, too much), or you can fight off bears yourself, etc.
Sounds like you are already opted out because you'd previously opted out of the setting allowing GitHub to collect this data for product improvements. But I can check that.
Note, it's only _usage_ data when using Copilot that is being trained on. Therefore if you are not using Copilot there is no usage data. We do not train on private data at rest in your repos etc.
I'm not sure there are any good GitHub alternatives. I don't trust Gitlab either. Their landing page title currently starts with "Finally, AI". Eek.
In contrast, when you create a GCS bucket, it uses a checkmark for enabling “public access prevention”. Who designed that modal? It takes me a solid minute to figure out if I’m publishing private data or not.
Before anyone comes to sell me on AI: this is on my personal account. I have and use it on my business account (a completely different user account); I just make it a point not to use it in my personal time so I can keep my skills sharp.
> Business and Copilot Enterprise users are not affected by this update.
It's just unusual how quickly they're going for the shakedown this time
I scratch my open source itch by contributing to existing language and OS projects where incremental change means eventually having to retrain models to get accurate inference :)
There should also be a much easier one-click to opt out without having to scroll way down on the settings page.
To add on to your (already helpful!) instructions:
- Go to https://github.com/settings/copilot/features
- Go to the "Privacy" section
- Find: "Allow GitHub to use my data for AI model training"
- Set to disabled
> Why are you only using data from individuals while excluding businesses and enterprises?
> Our agreements with Business and Enterprise customers prohibit using their Copilot interaction data for model training, and we honor those commitments. Individual users on Free, Pro, and Pro+ plans have control over their data and can opt out at any time.
Looks like not, but would it actually have been shadier, or are we just used to individual users being fucked over?
On top of that, Gemini 3 refuses to refactor open source code, even if you fork it, if Gemini thinks your changes would violate the spirit of the intent of the original developers in a safety/security context. Even if you think you're actually making it more secure, but Gemini doesn't, it won't write your code.
@mariorod's public README says one of his focuses is "shaping narratives and changing \"How we Work\"", so there you go.
You can't believe Microslop is force-feeding people Copilot in yet another way?
> and didn't even post the direct URLs to disable in their blog post
You can't believe Microshaft didn't tell you how to not get shafted?
Ah, so when the inevitable "bug" appears, and we all learn that you've completely failed to honor anything, what will be your "commitment" then? An apology and a few free months?
Time to start pushing for a self hosted git service again.
People are weirdly willing to shrug when it's some solo coder getting fleeced instead of a company with lawyers and procurement people in the room. If an account tier is doing all the moral cleanup, the policy is bad.
I'd be curious to see which countries are affected
On Android, for instance, I invite you to use the GitHub app and modify your opt-in or opt-out settings... You will find that nothing works on the settings page, once you actually find it after digging through a couple of layers and scrolling about 2 ft.
I agree that it feels like a dark pattern for the most part; it makes me want to use Codeberg or self-hosted git.
I'm a little surprised the options aren't "Enable" and "Ask me later".
But English is not my first language so please correct me if I'm wrong.
https://old.reddit.com/r/TheSimpsons/comments/26vdkf/dont_do...
(I prefer Emacs anyway, but VSCode is a worthy tool.)
Some think this applies only to personal data, and that's true. But it takes only one line of code to use my phone number for testing while I locally test a registration form in the application I'm developing.
Once it gets sent to Copilot, I can threaten legal action if they don't take it down.
Today, we’re announcing an update on how GitHub will use data to deliver more intelligent, context-aware coding assistance. From April 24 onward, interaction data—specifically inputs, outputs, code snippets, and associated context—from Copilot Free, Pro, and Pro+ users will be used to train and improve our AI models unless they opt out. Copilot Business and Copilot Enterprise users are not affected by this update.
Not interested? Opt out in settings under “Privacy.” If you previously opted out of the setting allowing GitHub to collect this data for product improvements, your preference has been retained—your choice is preserved, and your data will not be used for training unless you opt in.
This approach aligns with established industry practices and will improve model performance for all users. By participating, you’ll help our models better understand development workflows, deliver more accurate and secure code pattern suggestions, and improve their ability to help you catch potential bugs before they reach production.
Our initial models were built using a mix of publicly available data and hand-crafted code samples. This past year, we’ve started incorporating interaction data from Microsoft employees and have seen meaningful improvements, including increased acceptance rates in multiple languages.
The improvements we’ve seen by incorporating Microsoft interaction data indicate we can improve model performance for a more diverse range of use cases by training on real-world interaction data. Should you decide to participate in this program, the interaction data we may collect and leverage includes:
This program does not use:
The data used in this program may be shared with GitHub affiliates, which are companies in our corporate family including Microsoft. This data will not be shared with third-party AI model providers or other independent service providers.
We believe the future of AI-assisted development depends on real-world interaction data from developers like you. It’s why we’re using Microsoft interaction data for model training and will begin using interaction data from GitHub employees as well.
If you choose to help us improve our models with your interaction data, thank you. Your contributions make a meaningful difference in building AI tools that serve the entire developer community. If you prefer not to participate, that’s fine too—you will still be able to take full advantage of the AI features you know and love.
Together, we can continue to build AI that accelerates your workflows and empowers you to build better, more secure software faster than ever.
If you have questions, visit our FAQ and related discussion.
Mario Rodriguez leads the GitHub Product team as Chief Product Officer. His core identity is being a learner and his passion is creating developer tools—so much so that he has spent the last 20 years living that mission in leadership roles across Microsoft and GitHub. Mario most recently oversaw GitHub’s AI strategy and the GitHub Copilot product line, launching and growing Copilot across thousands of organizations and millions of users. Mario spends time outside of GitHub with his wife and two daughters. He also co-chairs and founded a charter school in an effort to progress education in rural regions of the United States.
If you don't want to wait until your PII inevitably gets sent through, you can already now file a complaint to your local supervisory authority: https://www.edpb.europa.eu/about-edpb/about-edpb/members_en
Nowadays, it genuinely feels a lot less so, because there are now services that will rewrite the code to get around the license.
Previously, I used to think that somewhat non-proprietary licenses like the SSPL might be interesting approaches, but I feel like they aren't much protection against this anymore either.
So now I am not exactly sure.
Big Tech is known for clearing illegal things by their legal departments all the time.