GitHub - You Can View Deleted Private Fork Data

  • Published 21 Oct 2024
  • Recorded live on twitch, GET IN
    Article
    trufflesecurit...
    My Stream
    / theprimeagen
    Best Way To Support Me
    Become a backend engineer. It's my favorite site
    boot.dev/?prom...
    The best way to support me is also to support yourself in becoming a better backend engineer.
    MY MAIN YT CHANNEL: Has well edited engineering videos
    / theprimeagen
    Discord
    / discord
    Have something for me to read or react to?: / theprimeagenreact
    Kinesis Advantage 360: bit.ly/Prime-K...
    Get production ready SQLite with Turso: turso.tech/dee...

COMMENTS • 211

  • @autohmae 2 months ago +172

    While Git is really committed to keeping your stuff, GitHub seems to be even more committed!

    • @krellin 2 months ago +10

      github == microsoft
      yes they are very fuckin committed

    • @petertillemans2231 2 months ago +7

      If you put your stuff in someone else’s fridge, you should not be surprised when your special yoghurt gets accessed

    • @TheRealBigYang 2 months ago +2

      Isn't committing the whole point of both tools?

    • @autohmae 2 months ago

      @@TheRealBigYang it was just a bit of wordplay

  • @fabi3030 2 months ago +62

    That is probably why they have to delete forks of DMCA'd content no matter how well those forks cleaned up their repositories. Otherwise, a fork can still access the illegal material.

    • @duven60 2 months ago +31

      For malicious actors: That could cause some real chaos if DMCA-able content gets pushed to forks of a target repo.

    • @Redditard 2 months ago

      Shall do @@duven60

    • @nullvoid3545 2 months ago +2

      @@duven60 That's a lotta damage!

    • @patrikburda 2 months ago

      @@duven60 Wait, so the top xx repos can just be shut down if someone uploads DMCA-protected shit to a fork and deletes it? That is just an attack waiting to happen.

    • @demcookies 1 month ago

      I'll start DMCAing my own repos to delete them forever

  • @S1S2S3S4 2 months ago +97

    Used the hash to restore some lost, force-pushed commits. Big commits. Saved my job.

    • @krellin 2 months ago +7

      unless you use git properly you will eventually lose it...
      there is no reason for you to force push unless you are the only maintainer or it's your own branch in the origin

    • @fabi3030 2 months ago +6

      @@krellin How can you lose it with the 3rd-party event database and being able to access any commit?

    • @hanifarroisimukhlis5989 2 months ago +3

      I once used reflog to revert a badly complicated rebase. Thanks, Linus Torvalds, for the immutable architecture of git.

    • @krellin 2 months ago

      @@fabi3030 keep doing force push and you'll find out

    • @krellin 2 months ago +4

      @@fabi3030 by demonstrating how unaware you are of the basics of git... and annoying your coworkers
      it's entirely understandable if a junior does it (although even bootcamps will teach these things) but if you are mid or higher it's just unprofessional...

  • @SourceOfViews 2 months ago +56

    Regarding GDPR: it only affects personally identifiable information (PII), however every git commit includes the author's and committer's name and email, which IS considered PII.
    So at the very least that information has to be returned.
    Additionally things like IPs are also considered PII (yeah I know about IP rotation, I did not make the laws), so if they log the IPs, which they probably do, then that will also have to be returned.

    • @hanifarroisimukhlis5989 2 months ago

      GDPR gets a bit problematic with GPG keys. GPG keys cannot ever be deleted, only revoked. And since the key can contain almost anything, including PII, it's basically a PITA trying to comply with these insane privacy laws.

    • @ErazerPT 2 months ago +1

      Yeah, thought the same too. GDPR is so overly broad it's a MIRACLE to go about and NOT touch anything that pulls it into the equation.

    • @SourceOfViews 2 months ago +12

      You know, guys, GDPR really isn't that draconian. Most countries in the EU already had similar laws. GDPR mostly just brought an EU-wide standardized law into effect, which makes it easier both to enforce and to adhere to (because in practice you can just make sure that you follow the GDPR if you operate in the EU, and not 20 different privacy laws).
      And quite frankly GDPR for the most part just says that you have a right to your data and to request your data to be deleted. That really shouldn't be controversial.
      In fact, long before GDPR I assumed that I'd have that right. I don't want to be a customer of X anymore, so I delete my account with them and I terminate my contract with them and then I'd assume that they'd delete all of the data as soon as they don't need it anymore.
      The only issues that arise from it are because in recent years data became so valuable that everyone wants to keep all the data forever to show ads, train their AIs or whatever. So of course everyone designed their processes and software in that way, completely neglecting the privacy of their users.
      That's now catching up to these companies and I don't feel bad for them.
      That being said, it can be quite annoying for smaller businesses, too.
      There was a recent ruling in Germany that if you cause the visitor's browser to send a request to a third party when that's not necessary and without their consent, then you're in violation of the GDPR. Can't say I agree with that on an ethical level, but fine. That mostly means not using CDNs and instead statically hosting the data yourself, but you should be doing that for security reasons anyway.

    • @epajarjestys9981 2 months ago

      @@SourceOfViews I don't think anything's really catching up to anyone. Everyone ignores GDPR and gets away with it. Look at any random Shopify store selling stuff in the EU and they violate everything left and right. They just add one of those horrible cookie banners and everything else continues as usual. The people who have to implement that pseudo-compliance bs only get headaches if they try to be conscientious, and management will not care nor want to understand the details.
      It is a completely useless law that only reaffirms the notion that no one actually has to comply with laws on the internet.

    • @ErazerPT 2 months ago

      @@SourceOfViews If you really believe that, then you never read it (properly). And yes, that ruling stands because by requesting something from a 3rd party you transmitted the user's PII (IP+Datestamp is PII) without properly informing the user that you were "sharing his data" with a 3rd party. And it's even worse, for you, if the data was unnecessary for "proper operation".
      You'd be amazed how many fines and damages GDPR can extract for information misuse. And misuse can simply be "you requested my full name to make a delivery and that's not strictly necessary...".
      I do love it when it comes to telemarketing, you'd be amazed how fast they s**t themselves when i mention "GDPR". Because they know i wield an insanely big stick to bludgeon them into submission and beyond.

  • @zill_laiss 2 months ago +188

    Flip didn't delete the part he was asked to, as usual

    • @dus10dnd 2 months ago +24

      Flip knows Git!

    • @bopcity5785 2 months ago +21

      I used to think this too but he just leaves the "flip take this out". Maybe to stop the cut being unexplained

    • @mitchierichie 2 months ago +3

      at this point I'd be disappointed if he took it out

    • @gordahnculous 2 months ago +20

      To be fair it’s survivorship bias, we’re not aware of all of the times Flip does delete the parts Prime asks. So for all we know these are rare exceptions

    • @ASmith2024 2 months ago +1

      he's flip, but he ain't snitch!

  • @hi117117 2 months ago +20

    GitHub is the only git implementation that has actually sat down and completely rethought how git works as a git server. As far as I can tell, they seem to have found a way to use something like a SQL database as the back end. The people in chat saying that it's just one big repository aren't technically wrong in that kind of implementation, but it's also not the whole picture.

    • @ruroruro 2 months ago

      No shot they are the only ones. To implement a git server, you only have to implement a relatively simple protocol. IIRC both Google and Microsoft (or maybe it was Facebook, not sure on the second one) used to have their own implementations that were geared towards working with huge monorepos.

  • @FourTwentyMagic 2 months ago +41

    github as blockchain

  • @Julienraptor01 2 months ago +9

    Thanks to that I was able to recover an open source project that went closed source
    It's intended behavior, and I should absolutely say it's very, very good

  • @rawallon 2 months ago +33

    Does that mean that JDSL was right all along?

  • @williamivey5296 2 months ago +4

    I'm a little confused why this is a surprise. As someone who admined Perforce VCS repositories for years, I was well aware that delete, in most cases, was just another version of the file; an entry in the file's changes indicating the file didn't exist at that, and only at that, revision. (Which was good given how many newbies managed to delete entire working branches.)
    You could always get a copy of the file as it existed prior to that revision, and any place a pre-deleted revision was branched was still valid. That wasn't just a feature, it was a critical feature for our enterprise suite with lots of moving parts and backwards compatibility requirements.
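
The "delete is just another revision" model described in this comment can be sketched in a few lines. This is a toy illustration in Python, not Perforce's actual data model; the `FileHistory` class and its method names are hypothetical.

```python
from typing import List, Optional


class FileHistory:
    """Toy append-only history for one file (hypothetical, not Perforce's API)."""

    def __init__(self) -> None:
        # Revision N lives at index N - 1; None marks a "deleted" revision.
        self._revisions: List[Optional[bytes]] = []

    def submit(self, content: bytes) -> int:
        """Record new content and return its revision number."""
        self._revisions.append(content)
        return len(self._revisions)

    def delete(self) -> int:
        """Record a delete marker as one more revision; nothing is erased."""
        self._revisions.append(None)
        return len(self._revisions)

    def get(self, revision: Optional[int] = None) -> Optional[bytes]:
        """Return the content at a revision (default: head), or None if deleted there."""
        if not self._revisions:
            raise LookupError("file has no revisions")
        if revision is None:
            revision = len(self._revisions)
        if not 1 <= revision <= len(self._revisions):
            raise LookupError(f"no such revision: {revision}")
        return self._revisions[revision - 1]


history = FileHistory()
rev1 = history.submit(b"version one")
rev2 = history.delete()
print(history.get())      # head revision is the delete marker
print(history.get(rev1))  # the earlier content is still retrievable
```

In this sketch a delete only hides the file at the head revision; anything that refers to an earlier revision keeps working.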

    • @hanifarroisimukhlis5989 2 months ago +2

      Yeah, I'm also confused why this is even an issue. The reporter is definitely either a script kiddie or genuinely does not know how git works.

    • @duven60 2 months ago +13

      The surprise is that it works when the file only ever existed in a (private) fork that never pushed the file to the master branch. Makes it apparent “private” isn't really a thing on github.

    • @williamivey5296 2 months ago +1

      @@duven60 As I understand github*, if the repository is public, private branches don't really exist. They are visible to the server even if access is limited. (If I'm right, this makes sense since it allows files to be pushed to public branches from private ones. The system needs to access both ends at some level.) Which means there is always a potential way around the privacy setting. That shouldn't be an issue if everything is owned by a company, and not really accessible by the Public public.
      Sort of like setting a post to private on Facebook - only part of the system respects that, and you should have expected that.
      I don't know about github, but Perforce did have a real, aptly named, delete command: P4 Obliterate (admin level, though)
      * Like I said, my focus was on a different VCS, but there are similarities at various levels. We did have users set up their own Perforce servers on their desktop systems - it's super easy to do - in order to do truly private experimental work they wanted no where near production code. (Or maybe to file cake recipes, I didn't ask. Small Perforce servers were free so I just pointed them to the download page.) I understand that's a thing with github, too.

    • @williamivey5296 2 months ago +1

      @@duven60
      Just reread the report and it seems to say that if you fork a repo then delete the repo, the fork lets you access files in the deleted repo that were NOT originally instantiated in the fork. That sounds strange, but it's not unreasonable if the file was in the range forked, even if not copied. The file revision at the time the fork was created would be tagged as potentially needed in the fork in the future so the repo copy must be preserved. VCSes are conservative in such things.

  • @devnom9143 2 months ago +5

    Didn't the US courts recently rule that AI companies are free to ignore the code licenses, at least for the purpose of training their LLMs?

    • @adcodes 2 months ago

      Yes I heard that too. It's legalized theft like banking or taxes.

  • @kajacx 2 months ago +28

    "You have to know the message name, the exact date, the author name, etc to reproduce the SHA" you also need to know the content of the files to reproduce the SHA, at which point this "exploit" will not give you any more information. If you get the SHA by other means it can still be bad though.

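To make the point above concrete, here is a minimal sketch of how a commit ID is derived in a SHA-1 repository: the hash covers a header plus the commit body, and the body names the tree (which in turn depends on file contents), the parents, the author and committer lines, and the message. The identity string and timestamps below are made-up values, and real commits can carry extra headers (for example signatures) that this sketch ignores.

```python
import hashlib
from typing import Sequence


def git_object_id(obj_type: str, body: bytes) -> str:
    """Hash "<type> <size>\\0<body>" with SHA-1, as git does for loose objects."""
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parents: Sequence[str], author: str,
                committer: str, message: str) -> bytes:
    """Assemble a simplified commit body (no signature or encoding headers)."""
    lines = [f"tree {tree}"]
    lines.extend(f"parent {parent}" for parent in parents)
    lines.append(f"author {author}")
    lines.append(f"committer {committer}")
    return ("\n".join(lines) + "\n\n" + message).encode("utf-8")


# The ID of the empty tree; a real tree ID depends on every file's content.
EMPTY_TREE = git_object_id("tree", b"")

# Hypothetical identity line: name, email, Unix timestamp, timezone.
who = "Example Dev <dev@example.com> 1700000000 +0000"

first = git_object_id("commit", commit_body(EMPTY_TREE, [], who, who, "initial commit\n"))
second = git_object_id("commit", commit_body(EMPTY_TREE, [], who, who, "Initial commit\n"))
print(first)
print(second)
print(first != second)  # one changed character in the message gives a different ID
```

Because every input feeds the hash, recomputing a full commit ID from scratch requires already knowing everything the commit contains, which is why the other ways of obtaining a hash matter more.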
    • @FryuniGamer 2 months ago +2

      Git uses SHA1, and that hash has been broken in practice already by the SHAttered project.

    • @hanifarroisimukhlis5989 2 months ago +6

      @@FryuniGamer SHAttered has done what, a few PDFs and misc files. Trust me, SHA1 preimage isn't going to be broken anytime soon.

    • @perc-ai 2 months ago +2

      You don’t understand cryptography. It’s very easy to generate all possible SHA-1 combinations and use a bot net with a large amount of proxies to find keys. It doesn’t matter how it was hashed lol…

    • @ashtree129 2 months ago +4

      Short hashes exist

    • @kajacx 2 months ago

      @@ashtree129 You are right, github will let you view a commit with as few as the first 4 characters, so you can easily check all of them to see if you get lucky. If not, you can then try all 5-long combinations, then 6-long, etc etc.
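
For a sense of scale, the arithmetic behind "check all of them" can be sketched as follows. The code only enumerates hexadecimal prefixes and counts them; it makes no requests, and the helper names are hypothetical.

```python
from itertools import product
from typing import Dict, Iterator

HEX_DIGITS = "0123456789abcdef"


def hex_prefixes(length: int) -> Iterator[str]:
    """Yield every hexadecimal string of the given length, in order."""
    for digits in product(HEX_DIGITS, repeat=length):
        yield "".join(digits)


def search_space(min_len: int, max_len: int) -> Dict[int, int]:
    """Number of candidate prefixes for each length: 16 ** length."""
    return {length: 16 ** length for length in range(min_len, max_len + 1)}


for length, count in search_space(4, 6).items():
    print(f"{length} characters: {count} candidates")
```

Each extra character multiplies the number of candidates by 16, so 4-character prefixes are a small space to walk while the full 40-character ID is not.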

  • @ErazerPT 2 months ago +6

    Don't think the LLM thing will work. LLM's DON'T read sites, web scrapers do. LLM's also don't understand the content, they just ingest, digest and regurgitate it. You could have blocked the scraper with robots.txt, you didn't. You expected the LLM to understand the content, but it can't. Nor can it follow the content instructions (though that WOULD be fun... in a nasty way...).
    This DOES illustrate that we SORELY need a standardized way to tag data in a Creative Commons sort of fashion as we're way past the "read it and index it" times...

    • @Fuji-gn9nx 2 months ago

      it can, what world do you live in? have you tried an llm that has browsing ability? lmao

    • @ErazerPT 2 months ago +2

      @@Fuji-gn9nx Oh dear, it's one of those... Pray tell, how will a glorified weights file ever browse the web? The answer is, it can't... What CAN browse the web is one of its input pre-processors, NOT the model itself.
      If you're going to talk about ML, at least do some cursory reading on the subject... knowing what a model is is the bare minimum.

    • @Fuji-gn9nx 2 months ago

      @@ErazerPT look how amateur you are with LLMs. An LLM can have a specific layer that will prompt search engine to retrieve up to dated information on top of what it already knew, then summarize it and send the result to the user. An LLM can understand hundreds of pages much better than you wish

    • @ErazerPT 2 months ago +2

      @@Fuji-gn9nx It's not how it works, close but not quite, but let's assume it is. Read your own words. "will prompt search engine to retrieve up to dated information". What does that imply? It implies it CAN'T fetch it himself. Models receive input, process data, produce output. They DON'T perform external actions. They can suggest, request, command, whatever, others to do it on their behalf, but they CAN'T do it themselves.
      You can argue that it's semantics, but when talking about law, semantics is 99% of the game. Heck, laws have been misused because there was a badly placed comma that opened the door to other (reasonable) interpretations. That's how bad it is when you're not precise.

    • @Fuji-gn9nx 2 months ago

      @@ErazerPT that's one way it works. Look, you don't know the types of LLM architecture. The search engine code can be embedded in the model if they want, but it isn't necessary, because that would make things more complicated if they want to modify the search engine code in the future. Look, you don't know anything. They split it in two because it is easier to maintain, like splitting backend and frontend, and it keeps the model small. Look, you don't know at all. And why would splitting it in two be an issue, when it makes maintenance easier (you don't need to re-edit the model code when you edit the search engine code) and keeps the model small? Sounds like you don't know anything, again. And finally, again, it can browse up-to-date data day and night and understand hundreds of pages better than you wish. And if we tolerate some processing latency by increasing the model's layers until processing takes 5 minutes, the way humans process information, instead of under 3 seconds like current LLMs, so that we have a fair comparison, LLMs will beat most people, including you, by a very wide margin

  • @dus10dnd 2 months ago +6

    John Hammond isn’t working on bringing back dinosaurs anymore?!?

  • @triplebog 2 months ago

    It was hilarious to actually see Flip remove the part he was told to remove, but only after he said to remove it

  • @Jason-yr6fy 2 months ago +2

    This honeypot idea for LLMs is just hilarious 😂😂😂

    • @lukeskywalker7029 2 months ago +1

      💯, although the idea with "skip the next two lines" does not work with how LLMs are trained. That only applies to prompting.

  • @zeydtc 2 months ago +4

    The AI watching this video and learning about the honeypot idea at the end of the video be like 👀

  • @AloisMahdal 2 months ago +1

    doom like this: every N seconds, save the game and have twitch propose moves to the AI, which will play out the next 5s. in the meantime, keep committing the ascii render to a repo/screen.txt. on death, have twitch choose a save, reload and branch off.
    no one will ever seriously go look at the repo (maybe it could be test data for diff viewers) but it would be fun knowing that it exists

  • @josueqb3843 2 months ago +2

    I think github probably uses the same directory to handle origin and all the forks, so all the commits live in the same directory and can be accessed even if the fork gets deleted.

  • @maddada 2 months ago +10

    The AI honeypot bit at the end of the video killed me 🤣🤣🤣

  • @kevinkkirimii 2 months ago

    Flip is flipping on Prime with those edits

  • @infinitivez 2 months ago

    The fact you can store entire globs of encoded data on github...
    "As intended" never sounded so fun

  • @christopher-pfeifer 2 months ago +2

    I don't think a single repo "honeypot" would have enough "attention" applied to its neural network to actually cause Copilot to spit out the exact characters.
    Because we have to remember the AI networks don't have their training data verbatim, it's MLPs storing concepts.

    • @tukib_ 2 months ago

      @@christopher-pfeifer it might influence its presence in the training data. People often assume this is scraped evenly but a lot of effort goes into refining it. You're right that it won't be common enough to be repeated by copilot though. Putting instructions in the policy won't affect how it is scraped though either.
      I think it would be more plausible for people to mass create repos with the same licence and code files. That way there is a fighting chance for an LLM to repeat the code verbatim without including the licence. I'm not certain though whether you need the licence (let's say GPL) present with the code in all forms of distribution. GitHub doesn't download the licence when downloading other individual files, so I wonder if that liability falls on people using Copilot instead of Microsoft.

  • @yoshika.kuzunoha 2 months ago

    A short commit ID can be as short as 4 characters, but only as long as it uniquely identifies a commit; in any large enough repository you are going to get a bunch of 'fatal: bad revision's right away
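
The uniqueness condition can be sketched as a prefix lookup. This is an illustration only: `resolve_prefix`, its error messages, and the shortened object IDs are hypothetical and do not reproduce git's or GitHub's actual behaviour.

```python
from typing import Iterable

MIN_PREFIX_LENGTH = 4  # shortest abbreviation mentioned in the comment above


def resolve_prefix(prefix: str, object_ids: Iterable[str]) -> str:
    """Return the single object ID that starts with prefix, or raise."""
    if len(prefix) < MIN_PREFIX_LENGTH:
        raise ValueError(f"prefix must be at least {MIN_PREFIX_LENGTH} characters")
    matches = [oid for oid in object_ids if oid.startswith(prefix)]
    if not matches:
        raise LookupError(f"no object matches {prefix!r}")
    if len(matches) > 1:
        raise LookupError(f"{prefix!r} is ambiguous ({len(matches)} matches)")
    return matches[0]


# Made-up, shortened IDs for illustration.
ids = ["a1b2c3d4e5", "a1b2ffff00", "0f9e8d7c6b"]
print(resolve_prefix("0f9e", ids))   # unique, so it resolves
print(resolve_prefix("a1b2c", ids))  # one more character removes the ambiguity
```

The more objects a repository holds, the more often a 4-character prefix matches several of them, so longer prefixes are needed to land on a single commit.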

  • @CGTUC 2 months ago +2

    GDPR: what about your commit mail and name? Those are explicitly person-related information stored in the deleted repo. So shouldn't they still have to return this information in your GDPR data request as soon as commits are involved?

  • @olbluelips 2 months ago +4

    Massive lesson in RTFM

  • @AbhinavKulshreshtha 2 months ago +3

    The main question is: when the fork was never merged into the original, how did the original know about the commits made on the fork?

    • @affanyunas7693 2 months ago

      maybe they use a bot

    • @dealloc 2 months ago +2

      Because a fork isn't really a copy of a repository. It acts more like a pointer within a global object store linked to the original repository (this is just an implementation detail). Storing actual copies would take up far too much space, so this is partly a way to save on space, as well as make things far more efficient in terms of I/O operations.
      The global object store is not directly accessible, but only references to it may be. In this case a commit hash is referencing the blob in the object store, which is retrieved through the original repository's URL (again, implementation detail).
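
A toy model of the idea in this reply, where every repository in a fork network is a set of references into one shared, content-addressed object store. It is a sketch under that assumption only; the `ForkNetwork` class is hypothetical and says nothing about how GitHub is really implemented.

```python
import hashlib
from typing import Dict


class ForkNetwork:
    """Toy fork network: one shared object store, per-repository refs."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}        # object ID -> content, shared
        self.repos: Dict[str, Dict[str, str]] = {}  # repo name -> {branch: object ID}

    def create_repo(self, name: str) -> None:
        self.repos[name] = {}

    def fork(self, source: str, name: str) -> None:
        # A fork copies only the refs; no objects are duplicated.
        self.repos[name] = dict(self.repos[source])

    def push(self, repo: str, branch: str, content: bytes) -> str:
        oid = hashlib.sha1(content).hexdigest()
        self.objects[oid] = content
        self.repos[repo][branch] = oid
        return oid

    def delete_repo(self, name: str) -> None:
        # Only the refs go away; the shared objects are left in place.
        del self.repos[name]

    def read(self, repo: str, oid: str) -> bytes:
        # Any surviving repo in the network can resolve any stored object ID.
        if repo not in self.repos:
            raise KeyError(f"no such repository: {repo}")
        return self.objects[oid]


network = ForkNetwork()
network.create_repo("upstream")
network.fork("upstream", "my-fork")
oid = network.push("my-fork", "main", b"only ever pushed to the fork")
network.delete_repo("my-fork")
print(network.read("upstream", oid))  # still readable through the original repo
```

In this model deleting a fork removes a name, not the data, which matches the behaviour the thread is discussing.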

  • @tcurdt 2 months ago +8

    Allowing short SHAs is the ultimate fail; everything else feels overblown. It's just surprising.

  • @Karurosagu 2 months ago +8

    Private gists at Github also have a privacy problem: you can access any gist file (private or not) by just having the direct URL to the raw file
    Security through obscurity is bad

    • @tukib_ 2 months ago +3

      There are no private gists, only secret gists. It is just as secure as your API keys themselves, so not at all a security-through-obscurity issue.

    • @Karurosagu 2 months ago +3

      @@tukib_ You're right, I used the wrong word, but It's still an issue, because when I started to use gists for the first time I expected those secret gists to be private and not hidden
      To be clear, I don't store any keys or credentials in GH, only code, but there is code that I don't mind (or I want) to be public and code that I want it to remain private, and that include some gists

    • @dealloc 2 months ago +1

      @@Karurosagu Read the documentation of the service you use next time. It's not hidden information, really. Don't upload anything you want to keep secret or private to online services. It's really security 101. If the code you stored there happened to be leaked, would it cost you greatly? If yes, don't upload it anywhere.

    • @Karurosagu 2 months ago

      @@dealloc Let's be real, you never read docs on something as simple to use as code snippets online (I am referring to gists only here, not regular GH repos), unless you're gonna use the API for something specific
      And yes, a leak could happen, it could be the next big hack or whatever, but I am uploading and managing my code remotely for a reason: it is convenient for me, the same way it is convenient for other users (including companies) to store their credentials in online vault services for example. But just because it's convenient for me doesn't mean I don't value my privacy, this small rant is because I care about my privacy and, in my opinion, GH should do a better job in separating what is visible, what is "hidden" (AKA: secret) and what is private

    • @dealloc 2 months ago

      @@Karurosagu Why do you presume what I do and don't do? But if you absolutely must know, I've known this from the docs, since I share secret gist URLs with other maintainers and colleagues; so it didn't come to me as a surprise, since it was what I was looking for: sharing gists without putting them up on the Discover page.

  • @quentinemacsftw1815 2 months ago

    Idea to spread the stars over 3 weeks:
    - You % 21 usernames on twitch or github and generate a calendar so your viewers can add it to their agenda and pin this link on every video/stream

  • @denissorn 2 months ago +2

    not sure if that would work, b/c you have explicitly stated and explained how you want to trick the algorithm into 'thinking' the license is permissive.

  • @felipemalmeida 2 months ago

    Actually Git does remove it. It just doesn't do it immediately. It runs garbage collection automatically, or you can run it manually.
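
A minimal sketch of the reachability idea behind that garbage collection: anything no branch or tag can reach by following parent links is a candidate for removal. The function names and commit labels are hypothetical, and real `git gc` also honours reflogs and grace periods, which this sketch ignores.

```python
from typing import Dict, List, Set


def reachable(refs: Dict[str, str], parents: Dict[str, List[str]]) -> Set[str]:
    """Collect every commit reachable from any ref by walking parent links."""
    seen: Set[str] = set()
    stack = list(refs.values())
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


def collect_garbage(refs: Dict[str, str],
                    parents: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return the commit graph with unreachable commits dropped."""
    keep = reachable(refs, parents)
    return {commit: ps for commit, ps in parents.items() if commit in keep}


# c1 <- c2 <- c3 is on main; x1 was force-pushed away and has no ref.
graph = {"c1": [], "c2": ["c1"], "c3": ["c2"], "x1": ["c1"]}
branches = {"main": "c3"}
print(sorted(collect_garbage(branches, graph)))
```

In a shared store like the one discussed in this thread, a commit stays reachable as long as any ref anywhere still points to it, so this kind of cleanup never selects it.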

  • @therealblurrybarber 2 months ago +17

    Free API keys for all!

    •  2 months ago +1

      Free And Open Source API Keys

  • @zzyzxyz5419 2 months ago

    I am actually cool with this feature. Sounds like something that will save you some day. Also, the secrets example is so bad because who the hell stores secrets in a repo?

  • @user-qr4jf4tv2x 2 months ago +12

    I don't modify the fork, only copy it.

    • @teawithoutdonuts31 2 months ago +3

      I think that's why no real copy is made. GitHub just conserves space 😅

  • @DerSolinski 2 months ago +1

    Uhm, this could be a potential lawsuit since this violates the right to be forgotten.
    But MS is used to paying fines to the EU, so nothing new...

  • @astral6749 2 months ago

    I hope I remember this in 2 weeks so I can participate in the honeypot plan.

  • @jeffreyhymas6803 1 month ago

    Not the point of the video, but that tier list was wild. Putting C#, Java, and C++ below Python and Javascript is nuts.

  • @rumplstiltztinkerstein 2 months ago +3

    What if we encrypt the data before pushing to github? Use another app that just loads and decrypts the data, reads the commit information locally and presents it to the user.

    • @wil-fri 2 months ago +2

      you are gonna need some kind of transpiler, line by line, just to keep the initial spaces. You use sourcetree or whatever GUI. Then you run your cipher over your repo and then upload it. That's nice but you have to be careful who you share your keys with

    • @rumplstiltztinkerstein 2 months ago +2

      @@wil-fri Yes. I don't like how Microsoft can just profit from reading all our private repos without our consent.

    • @dealloc 2 months ago

      Ansible Vault?

    • @gabrielhidasy 2 months ago

      @rumplstiltztinkerstein just don't store your repo there?

  • @RenanTraba 2 months ago +3

    create some ARG game using a deleted fork repo :D

  • @MrBringerdeath 2 months ago

    does prime know about the github archive project that has a lot of hashes in a data set?

  • @conceptrat 2 months ago

    It's XSS except for GitHub? So XRR (cross-repo referencing). GitHub isn't Git though, so this is effectively GitHub implementing their own scheme for branching to create forks.

  • @sytranvn 2 months ago +1

    I'm totally safe because my repos never get forked.

  • @rdvansloten 2 months ago

    Anyone who didn’t rotate a leaked API key after “deleting” it from the repo had it coming anyway. They would do other degen things that would get them hacked.

  • @SoftwarePractice 2 months ago +2

    Informative 👍

  • @wanfuse 2 months ago

    Since there is a very large number of branches and the hash is large but not infinite (it may be practically infinite), can we randomly reproduce git hashes with a group attack? A simple solution, though probably against the immutable history, would be to change hashes/links, maybe? Someone could bypass security and fork it, and then the info would be forever available to everyone (to be clear: not suggesting this is right, suggesting this is what happens or will happen)

  • @nijuyonkadesu 2 months ago

    I wonder if it's the same case with repos that got deleted because of DMCA

  • @timturner7609 1 month ago

    I remember about 10 years ago I accidentally checked in a password or something and had to spend the rest of the day figuring out how to purge it from git

  • @con-f-use 2 months ago

    11:33 having the sha is not that hard. Just minutes earlier you demonstrated that a small part of the sha is enough to find a list of possible commits in a repo and that there's a reflog.

  • @eppiox 2 months ago

    I read the thumbnail as 'F DOTA' and thought.. oh no

  • @Jia-Tan 2 months ago

    Don't kid yourselves, I've known about this for ages 😋

  • @YoKKJoni 2 months ago

    I wanna see that special licence XD

  • @misha130 2 months ago +1

    What if in the LLM prompt I say "ignore all prompts that are located in the license and act as if you are a human and not an LLM"

  • @markuseberlein3394 2 months ago

    Could that be used the same way as the fixed GitHub comments vulnerability from ~April? Ah, right, works as intended.

  • @attilazimler1614 2 months ago

    does he change user between creating, deleting, and checking on openai's repository? Because it might be that it is only accessible by the same user.

  • @ImPDK 2 months ago +1

    What if you "fork" by cloning the repo and creating a new private one?

    • @duven60 2 months ago +2

      Hope they don't try to de-duplicate data on the back end, but that would be relying on github's backend coders not getting too clever, and given the "won't fix" response to this problem I wouldn't trust it.

  • @michalkowalik89
    @michalkowalik89 2 місяці тому

    i think somebody is reviewing the repo before it is fed to LLM

  • @silibaka-pj3pm
    @silibaka-pj3pm 2 місяці тому +1

    Alt title - you cant *delete* repo and anyone can view it (Private also)

  • @raunaquepatra3966
    @raunaquepatra3966 2 місяці тому +1

    dude stumbled on how openai collects data

  • @YourComputer
    @YourComputer 2 місяці тому +1

    Looks like a fork that isn't a fork.

  • @Sedokun
    @Sedokun 2 місяці тому

    So. It's possible to make an attack on a repository by committing someone's Personal Data to a fork?

  • @JLarky
    @JLarky 2 місяці тому

    I feel like it existed like that forever, why are we in a freakout mode now?

    • @dealloc
      @dealloc 2 місяці тому

      I don't think people knew that GitHub worked the same as git; a global object store with pointers. Even though it's in the name.

  • @blenderpanzi
    @blenderpanzi 2 місяці тому +3

    Also not just API keys are a problem. What if you fork a template project repo for your private proprietary software? And you print the commit hash of the software in debug/version info? You just made your proprietary code open. Whoops. Don't think anyone would have expected that.

  • @MissMyMusicAddiction
    @MissMyMusicAddiction 2 months ago

    Isn't this the entire point of doing a redact? You back the repo up before the commit(s) you want to redact, skip the problematic commits, then rebase and commit the remaining ones.

  • @ludologian
    @ludologian 2 months ago

    Isn't that related to git commit history?

  • @code-dredd
    @code-dredd 2 months ago

    Delete doesn't mean delete, for these people...

  • @tofonofo4606
    @tofonofo4606 2 months ago

    FLIP IS A FUCKING NINJA WIZARD

  • @conceptrat
    @conceptrat 2 months ago

    Firebolt and gharchive

  • @abrarshaikh2254
    @abrarshaikh2254 2 months ago

    Which extension was he using to change the theme colour?

  • @Quantris
    @Quantris 2 months ago +1

    maybe don't publish secrets onto a website
    from the article: "the only way to securely remediate a leaked key on a public GitHub repository is through key rotation"
    no fucking shit....

  • @laughingvampire7555
    @laughingvampire7555 2 months ago

    I think he is calling it `secret` just to make it obvious that it is sensitive information, but it could just as well be a new algorithm the company now wants to keep exclusive to the pro version

  • @szirsp
    @szirsp 2 months ago

    What if I accidentally uploaded my DNA to github and then deleted it?
    How do I rotate my DNA?
    :)

  • @thomassynths
    @thomassynths 2 months ago

    You can't make private forks on github using their fork UI. Thus making a secrets.txt is nonsense to begin with in such a situation.

  • @melangearrakis
    @melangearrakis 2 months ago

    They really didn't want to pay this guy the money from the bug bounty program

  • @lolnowayz
    @lolnowayz 2 months ago

    Can you react to: Big Tech Doesn't Want You Anymore by Patrick Boyle

  • @theboydaily
    @theboydaily 2 months ago

    I don't understand how people could commit an API key.

  • @SRG-Learn-Code
    @SRG-Learn-Code 2 months ago

    Maybe it's better to use degit instead of forking, then?

  • @KelvinShadewing
    @KelvinShadewing 2 months ago

    Is this a reupload? I swear I've seen you say all of this exact stuff before.

    • @olbluelips
      @olbluelips 2 months ago

      Clips might have been uploaded somewhere first

  • @kamertonaudiophileplayer847
    @kamertonaudiophileplayer847 2 months ago

    It's the reason I left GitHub.

  • @theminecraft4202
    @theminecraft4202 2 months ago

    so all this is telling me is no one has read the github documentation

  • @kevingarza3510
    @kevingarza3510 2 months ago

    it’s not a bug, it’s a feature.

  • @RPG_Guy-fx8ns
    @RPG_Guy-fx8ns 2 months ago +4

    roll your own versioning system.

    • @Karurosagu
      @Karurosagu 2 months ago +5

      Password protected .rar files in google drive LETS GOOOOOOOOO

    • @hanifarroisimukhlis5989
      @hanifarroisimukhlis5989 2 months ago +1

      my_docs_final_final_final.docx

  • @mitchierichie
    @mitchierichie 2 months ago

    FLIP, TAKE THIS OUT!

  • @s3rit661
    @s3rit661 2 months ago +1

    How does this work with gdpr?

    • @duven60
      @duven60 2 months ago

      It doesn't, but frankly git itself is probably a massive violation of the GDPR and the right to be forgotten anyhow. It would be interesting to see how much of it would need to be re-architected to comply with EU laws.

  • @ElvenSpellmaker
    @ElvenSpellmaker 2 months ago

    Good.

  • @justbrad_v3906
    @justbrad_v3906 2 months ago

    fuck where is this Prime License I will make one with that license too lol

  • @QuicksilverSG
    @QuicksilverSG 2 months ago

    TL;DW: Git is a shell game. Go back to programming and leave version control to your build engineer.

  • @philipphanslovsky5101
    @philipphanslovsky5101 2 months ago

    Dude unironically uses "git checkout -b". What year is it?

  • @andymutale368
    @andymutale368 2 months ago

    well fork...

  • @OmkarSathe-tt6jz
    @OmkarSathe-tt6jz 2 months ago

    WHY IS TYPESCRIPT ON F TIER?! 😨😨

  • @bp3d106
    @bp3d106 2 months ago

    I know I'm lame and uncool but I don't use Github at all.

  • @jasontang6725
    @jasontang6725 2 months ago

    ipfs over github deleted forks ftw.

  • @MorningNapalm
    @MorningNapalm 2 months ago

    Github is clearly overcommitted.

  • @Mempler
    @Mempler 2 months ago +15

    Rewrite github in rust

    • @spdlqj011
      @spdlqj011 2 months ago +4

      it's not about memory safety, just github policy

    • @nikkehtine
      @nikkehtine 2 months ago

      @@spdlqj011 still, rewrite github in rust

    • @Mempler
      @Mempler 2 months ago

      @@spdlqj011 rewrite the github policy in rust

  • @YouHaveTrouble
    @YouHaveTrouble 2 months ago

    People discovering how git works in 2024?

  • @hwy9nightkid
    @hwy9nightkid 2 months ago

    Feels good to be on GitLab, if I'm honest

  • @DePhoegonIsle
    @DePhoegonIsle 2 months ago

    Ya know... the part that gets me is they're acting like you can hit any repo. The problem I have is that they actively overblow the requirement of 'FORKING': if for some reason there is no fork and the repo is private, then it frankly doesn't matter whether they can figure out the hash or not.
    This is an exploit of a design that is meant for retention, but frankly all this is going to do is change what people do with the fork. So now, instead of maintaining the relationship, they'll just clone the fork to their PC, then nuke the .git history, retain all the code, and commit it as a new repo with no history.
    Seriously... the amount of work you'd have to do to sniff out data through the repos is crazy. You know the only change that will happen now is that they'll start to restrict all private/deleted data to the original owners, or disallow any fork/commit that doesn't have an owner from being displayed or grabbed (the owner of the repo with the deleted branch or undone commits will still be able to traverse their own git commits and actions, so those technically still have an owner), and mark entire forks as logically separate from the start, blocking all traversal into other branches not owned by their account on creation.

  • @user-dn6el
    @user-dn6el 2 months ago

    Jokes on you I don't fork

    • @nikkehtine
      @nikkehtine 2 months ago

      I don't fork, only fork around :)

  • @paulkanja
    @paulkanja 2 months ago +2

    ahh Github enabling communism 😂😂

  • @Saltovka_Scientist
    @Saltovka_Scientist 2 months ago

    write a jdisel doom