Why doesn't Facebook use git?
- Published 13 Jun 2024
- Huge shoutout to Graphite for sponsoring this video, always happy to nerd out about stacked diffs for a bit (even if Mercurial is a thought that gives me nightmares)
SOURCES
graphite.dev/blog/why-faceboo...
devblogs.microsoft.com/bharry...
stacking.dev/
arstechnica.com/information-t...
engineering. 2014/01/07...
Check out my Twitch, Twitter, Discord and more at t3.gg
S/O Ph4se0n3 for the awesome edit 🙏 - Science and Technology
I should go to bed
There's someone at the door 😶🌫
Stop it subconscious
Same…
come over i have one
Night night
tldw: fb chose to move away from git when it wasn't scaling well for very large repos. Git can handle such large repos now, but FB is still using mercurial
Thanks bro
maybe add that the git maintainers at the time were not the most cooperative in helping out fb to address the performance issues which resulted in fb adopting mercurial again
Real MVP.
Also, FB Engineering uses stacked diffs, which are the norm and recommendation with Mercurial, but not with Git. The sponsor Graphite implements stacked diffs for / on top of git.
That and it supports diff stacking which git doesn't
Microsoft tried splitting up .NET into separate repositories, versioning hell, until they made it into a monorepo. They can still build stuff separately, but their entire development environment is in one repo.
Monorepos do not magically solve communication between teams and interface definitions. You can "just create a PR" all the same with multiple repositories. And in a monorepo scenario you'll still get flak for touching stuff other teams are responsible for without talking with them. There's always some level of communication, diplomacy and bureaucracy when things are large scale.
I'm the lead developer at my company and I refuse
Facebook seem to have a tendency to try to idiot-proof their tooling.
Of course, but at least this is possible!
The other teams can review the parts they're responsible for in your PR, but at least with a monorepo, the whole feature is an atomic unit that's either present or missing. With distributed repos, you can have different versions of each component, and it's possible that one component supports the feature, but the other doesn't, breaking things. So you need to build version management on top of it, which adds complexity.
Yes, that is true. However, you don't lose this in a monorepo. If you want, you can still make changes just in one part and wait for someone else to do his part.
In a monorepo, you gain the ability to potentially do changes in all parts and services of the software. And it can also be much easier to work together with other teams. You can start a branch, start doing the changes you need, and before you are completely done and it's completely working, the frontend dev can come along, grab your branch and start the project on his machine, can start implementing stuff.
Then, it depends a bit on your workflow, but potentially, you could make a PR with the combined changes to update everything at once, without needing to worry about breaking your staging temporarily.
Even better, when doing changes elsewhere, you can run your build, run your tests, etc and see unexpected issues earlier.
The alternative is, you update a project, it gets published to your company's package feed, you pull the package in your other projects, down the line you notice something broke somewhere, you do fixes in the first repository again, pull updates again in other repositories, see that sth else is broken now, and so on. It's just so much more annoying and time consuming.
Do we have to support you on patreon to make you use Dark Reader? 😅
LOL
@syedmohammadsannan964 ever heard of open-source?
@syedmohammadsannan964 what's wrong with it mane?
As a software engineer making video games I can tell you that perforce is the only source control used in basically every AAA video game, would be pretty cool if you looked into it to see a perspective from a different industry!
As a game developer (who worked at AAA too), Perforce is pretty much universally hated by the whole industry. It's not used because it's good, it's more like OracleDB (very specific and hell to migrate)
@@krisitof3 Facts... I hate OracleDB. Currently using it on a project I'm working for contract. Currently my personal company started working on a game. After 12 years as a software engineer pivoting into game development as I always wanted to do. But never touch your heart of perforce. Currently using Mercurial because the last project I was working on for about 9 years in Mercurial. For me and the team that I'm leading it's out of familiarity.
never heard, is it VC for source code or assets as well
Perforce may be more agile now, @@Special1122 , but when I was considering it, alongside Git, Mercurial and Bazaar, to replace VSS, in the noughties, it felt very old school.
As someone who was at Facebook at the time and personally knew some of the source control folks working on this I can corroborate this article.
The only point is that I believe stacked diff workflows were a thing independent from Mercurial. We had those on git, too. The workflows and tooling significantly improved over the years though and a lot of that work happened on top of mercurial obviously.
Yeah, if I understand the timeline correctly, it looks like mq (mercurial queues) predated Facebook involvement in the project.
In game dev land where codebases and file sizes can get huge mercurial and perforce are a lot more common at least before git lfs
Alienbrain perhaps.
I have only ever used Mercurial once in a small startup that I worked for and I have to say that I enjoyed it a lot more than Git.
MS getting git to scale video would be great
We tried mercurial in 2012 I think and had quite a lot of merge issues, I don't remember the specifics but unfortunately the engineering team running the pilot were really defensive and presumed we were using it wrong, only to be repeatedly shown to be wrong.
They ended up moving to Git in the end.
I didn't really have any preference, and us guinea pig devs were open minded about it, unfortunately the team proposing it seemed to be the problem, though I'm sure they blamed us.
I've used CVS, SVN, Mercurial, Git, and I don't really care as long as it works!
Like most people use CLI anyways. If anything if it is possible to make so many GUIs for CLI app why in the world doesn't someone make a custom CLI front end for those tools. You like Mercurial CLI but need to use git, use Mercurial commands for git.
That really surprises me. Even after 14 years of using git, I'm appalled by its terrible default merge heuristics. I long for the quick and simple merges of my mercurial days. Even now, when merges get difficult, I pull out kdiff3, the default Mercurial merge tool, and it makes complex git merges *so* much easier.
1:06 “Mercurial, svn, and git” um no, the newly formed dvcs front was pushed by mercurial, git, and bazaar. SVN was a reigning king of centralized version control at the time of passing.
Also better support for monorepos and branch management is why Mozilla is using Hg for Firefox.
*was using
They're phasing out mercurial in favour of git.
@@isaactfa it's true that they've been migrating to git for a while now, but it's mostly because newer developers don't know anything but git, and official firefox docs still point to hg repo
He said back in the day svn was one of the big 3. Svn is still widely used. Svn came out only 4/5 years before git.
@@isaactfa Look at the state of mozilla right now, I don't think version control should be a priority.
You forgot to mention: Linus had tried mercurial, but mercurial didn't scale for the Linux kernel at the time, so git was created.
Mercurial was beautifully engineered from the start: what needed highest performance is in C and the rest in Python. And it always started with “how can we do that right in the UI? What does the backend need so that this always works?”. For example before adding mutability to the code, they added "phases" to the core (whether something has moved to other repos: secret/draft/public) and now with the hg evolve extension, you can have safe collaborative history rewriting.
Also that’s why everything felt like it just works - very different from things like git ... --autostash which only works for some commands.
As an ex Facebook employee I can say I love landing 13 high stacks of diffs but hate it when it breaks half way through from a merge conflict
I used mercurial before git and it’s actually really nice since the CLI is much easier to understand. Mercurial has become a Betamax and every IDE and CICD system has really good git support to cover up the uglier parts of git.
The GIT CLI is so bad, you have to literally browse the GIT source to learn to actually use it properly.
@@Bozebo unfortunately true. Hence why I have so many GIT aliases
Considering the fact that they had to basically neuter the shit of flow (their worst version of typescript) to the point where it doesn't do type inference anymore and everything has to be explicitly defined because the code base is so large that running type inference was basically impossible, I'm not sure how this whole idea of one giant repo for everything is such a good idea for everyone.
I am all for mono-repos, but there is a certain size limit where unless your company has the budget to have a team writing custom programming languages/type systems to solve your code base scaling problems, I suggest sticking to the "standard" git/repos model ;)
Please do a video on Microsoft scaling git for their purposes. Very interested.
The fascinating thing about this story to me is that it mirrors my own personal growth in certain ways. The brazen young developer who is convinced that everyone else is stupid and they should just do it your way. Eventually growing up to be a wizened individual that knows collaboration leads to new ideas and potentially improvements you'd never imagined.
for distributed version control (dvcs), the big three are/were git, mercurial (hg), and bazaar (bzr). subversion (svn) was more of a traditional client-server, file-locking version control system, alongside cvs and others; svn was the last/best of that breed before it became completely supplanted by dvcs.
Nah, I don’t rate monorepos at all. There are huge downsides and few upsides. All of the downsides you’ve said to having smaller repos aren’t issues with the repos, they’re issues with shitty processes.
As one who misses mercurial still, after 5 years of having to use git, I find the sentence "git might be more user friendly" a bit crazy. No stress but *my* 2 cents: Mercurial is a well designed application. Git is a framework based on an object store, with tools bolted left and right onto it.
Git is complete insanity..
No wait, I meant: Git is complete insanity...
The third dot is very important!
What is also interesting to mention, if Mercurial had scaled at the time Linus needed it for the Linux kernel, he would have used Mercurial.
@@autohmae Exactly. Totally sad story for the Mercurial author, such a beautiful tool and then this. github then was the nail in the coffin
Same here. I still use Mercurial for all my personal problems, because it is just so much nicer to use, and can only work well with git at work thanks to magit.
Yes. git is capable, but lovable? Depends.
This is really cool. Inspired me to look up other control systems for Mac and Unix.
hey man love your videos and streams, keep it up!
This is why i really like gerrit (git-based), you can do stacked reviews and cross repo reviews so that things get merged simultaneously so that build changes across the global change.
Agreed, the use of topics is essential to multi-repo workflows, ensuring changes to a dependency don't break things using it.
More folks need to check out the patch-theory-based DVCSs like Darcs & Pijul
This 100%
Love this type of content Theo!
Google uses mercurial also. I think it's great. Such an improvement over perforce.
Was going to mention Google. I know at least chromium was and probably still is on mercurial
@@kienanvella Since chromium is open source it actually doesn't use the Google internal tooling. It's just git with gerrit on top. The internal projects at Google "kinda" use mercurial as in they have a mercurial-based client but there is a ton of custom code around it.
fig (google's mercurial version of git) and critique (Google's pull request tool) are so much better than GitHub.
I recently (at the start of this year) left Google to join a startup, and man, do I miss fig.
Been meaning to read this. Love that there are people on youtube that read me stuff 😅
The Perforce consistency problem was considered enough of a security loophole at Amazon that it sealed the deal on moving the company to Git in 2013 or so
learning how Meta solves these optimization problems at scale is such an interesting topic that needs to be discussed more!
I think another important thing when it comes to dev push back is personal incentives. If my company wanted to move to some piece of technology, even if it might be the most optimal for what we want I might push back because it could harm personal development.
I feel like that's a downside/risk of all these custom workflows and getting too used to big tech internal tooling.
If only the Git CLI were as usable as the Mercurial CLI.
Git aliases help a ton
This is going to be an odd question. How did you get your browser window to be equidistant from the edges of the screen? I've been doing this obsessively for a decade, so I'm curious.
I think it's the Arc Browser for Mac (beta for Windows)
At my first job I needed to teach myself to use a version control system. I went with mercurial because it was so much easier to understand how it worked! That was followed by 8 years at a different job where I learned git. But I still have a soft spot in my heart for mercurial. Git was tough to learn at first but I figured out most of it eventually
Great video Theo! Thank you 🙏🏾
I would definitely watch the Microsoft git back story video.
Yeah but I bet you can't tell me why they use windows at microsoft
Re discussion around 8:50 - I've seen too many senior engineers go this route, and they have huge, _huge_ PRs. I think I can read between the lines here, and see that you want feature _slices_ to ship to prod so you can iterate quickly (multiple ships a week or a day to get stuff out), but i worry too many others hear "mono repo lets me cook" and take >1 week to do a huge feature release. And i know SW release is separate from product release, so the flags could still be off and you keep merging fixes to your broken feature, but c'mon - monorepos may be enabling some antipatterns here. small prs ftw.
I do agree that _flexibility_ is a powerful tool. many small repos do not lend themselves to being flexible, monorepos don't take away any paths to production from you.
@@seannewell397 honestly asking because we’ve been having the same discussions at work recently: how do many smaller repositories not contribute to flexibility? How is a large monorepo, conversely, more flexible?
We have so many architectural & engineering principles that show modularity - generally speaking - has less cognitive complexity, is more maintainable, is more flexible & easier to change, easier to test at a low-level, separates concerns, supports encapsulation, etc etc…. And we see how these principles manifest themselves in things like shared libraries & packages that I pull in to a project instead of keeping all of that code in my own codebase. Or how large monolithic programs are less optimal for all of those reasons compared to modular programs, except where other concerns like maybe performance is critical.
I’m still trying to wrap my own head around the apparent contradictions between writing small, modular, loosely coupled code but using a huge monolithic repository that co-mingles it with everyone else’s code and takes away my code’s fine-grained version control. The workarounds for most of these problems just go back to simulating smaller discrete repos like using a codeowners file to prevent others from modifying my code in the monorepo’s subdirectory.
Lol… It really just seems like we enjoy going around and around by rehashing & relearning the same concepts & pitfalls over and over again. I dunno.
That's why feature branches are a thing, you have offshoots from that for the smaller PRs but the feature remains one cohesive unit until it's ready to ship
@@TurtleKwitty trunk based development with features hidden behind flags is a much better system imo. If feature branches are too long-lived, merge conflicts become inevitable.
@@AZaqZaqProduction and add huge file sizes on top, you get conflict and you don't even know was that line supposed to be 200 lines below.
saw this live, good as always!
8:18 how does a monorepo solve the problem of synced deployment?
You can achieve snapshot like behavior with almost anything. Be it Git submodules or custom file that keeps track hashes (like webOS did).
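To make the "custom file that keeps track of hashes" idea a bit more concrete, here is a minimal sketch. The manifest format (one `name commit` pair per line), the repo names and the function names are all made up for illustration; this is not how webOS or any particular tool actually does it.

```python
def parse_manifest(text):
    """Parse a hypothetical manifest: one 'repo-name commit-hash' pair per line."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, commit = line.split()
        pins[name] = commit
    return pins


def find_drift(pins, checked_out):
    """Return repos whose checked-out commit differs from the pinned one."""
    return {
        name: (pinned, checked_out.get(name))
        for name, pinned in pins.items()
        if checked_out.get(name) != pinned
    }


# Hypothetical snapshot file and local state.
manifest = """
# repo      pinned commit
frontend    abc123
backend     def456
"""
local_heads = {"frontend": "abc123", "backend": "0ld999"}

print(find_drift(parse_manifest(manifest), local_heads))
```

With this kind of file, a "snapshot" is just one commit to the manifest, but somebody still has to bump the pins, which is the bookkeeping a monorepo gives you for free.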
This. It seems they never heard of this...
Back in 2009/2010 the small dev shop I was working in made the call to switch away from SVN. We switched to Mercurial after a long discussion simply because we felt more comfy creating in-codebase addons to Mercurial whereas Git seemed much more hostile to that sort of activity. However, that only lasted about 4 years and we eventually ended up switching to Git (which was a very smooth transition) after the company bought another that was using Git.
Never worked on a monorepo but I love the sound of it. If it breaks, it breaks locally, not on dev test or prod.
Surely it's generally "horizontal" not "vertical" (though quite likely vertical in terms of networking and that's the mistake people make not using a monorepo because they think every svc needs an entire repo). You don't have a different prod and dev repo do you?
@@Bozebo I worked on a project where they had, I think 5 front end apps, which for the most part had a lot in common and a shit ton could have been reused. You could not have people from one app switch to another as it was that much different structure.
i need to stop procrastinating
Besides: do you know that with the infocalypse extension there’s Mercurial tooling to have fully decentralized repositories over Freenet / Hyphanet, including pull-requests? (though this was hurt a lot by incompatibilities in byte/string handling between Python2 and Python3 - took ages to debug, because with Python 3 the different interfaces file system, network, version control give and require different data structures, but with Python2 they were all plain strings)
Even as a CS student, how source control fundamentally works and best practices are almost never actually talked about in classes outside of the basics. Even now, I only have a very surface level knowledge of how it's actually used in a professional environment.
1:10 Mercurial, git and Bazaar were the main 3 distributed version control systems.
SVN (Subversion) and CVS were the other most used (open) version/revision control systems.
Mercurial, git and Bazaar were the pretenders to the throne, but Mercurial, git and svn were most commonly talked about in those early days, unless you were part of a few very specific communities. CVS's popularity was already waning, and my first VCS (RCS) was thankfully no longer anywhere to be seen.
I would love to see a video about lfs and other things Microsoft did to improve git
I need that Microsoft git video!! That sounds sick!
I was on the initial team from Perforce visiting Facebook around 2009. They didn't reject us because of some fundamental design flaw though, by 2009 it just wasn't a forward looking solution.
really cool piece of history!
Hey Theo, when you say that you can add breaking changes to the BE and update the FE consumer in the same PR for a monorepo, I get it. But how are those services being deployed? I mean how do you guarantee that the FE won't be deployed before the BE or vice versa? How do you deploy those together? As always, great video and content! Thanks
Very good question! I have two answers, "how most do it" and "how I do it"
How MOST do it: Build automated CI/CD, point clients at 'versioned' servers, leave old servers deployed for X amount of time (see: "skew protection")
How I prefer to do it: Serve the frontend THROUGH your backend, so the server generates the "most current" client on every request
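For anyone wondering what "point clients at versioned servers and leave old servers deployed for a while" could look like, here is a rough sketch of the routing idea. The `Router` class, the `retention` setting and the build IDs are hypothetical names for illustration only; real platforms implement skew protection in their own ways.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Toy router: keeps the newest deployment plus `retention` older ones alive."""
    retention: int = 2
    deployments: list = field(default_factory=list)  # oldest first

    def deploy(self, build_id):
        self.deployments.append(build_id)
        # Retire anything older than the retention window.
        self.deployments = self.deployments[-(self.retention + 1):]

    def route(self, client_build_id):
        # A client keeps talking to the server build it was shipped with,
        # as long as that build is still deployed.
        if client_build_id in self.deployments:
            return client_build_id
        # Otherwise fall back to the newest build (the client should refresh).
        return self.deployments[-1]


router = Router(retention=1)
router.deploy("build-1")
router.deploy("build-2")
print(router.route("build-1"))  # old client still reaches its matching server
router.deploy("build-3")        # build-1 falls out of the retention window
print(router.route("build-1"))  # now routed to the newest build
```

The trade-off is in the retention window: too short and long-lived tabs break, too long and you pay to keep old servers running.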
@@t3dotgg Why on every request? I mean I know what you mean but it sounds like build the thing on every request xD
Mono or multi repo makes literally no difference to that problem whatsoever?
@@t3dotgg Thanks for your reply! Maybe you could do a video on the "deploying monorepos" topic? That would be awesome! After watching this video, I tried searching for it, and I didn't find anything.
I'd be curious to know more about how the versioned builds/servers work, how they rollback (I mean, if you rollback the server, the client might hit a dead endpoint, or some old endpoint which is not compatible, because the breaking change is not there anymore), how long the "old" servers are kept running, how the CI/CD avoids redeploying the services that didn't change, how often they deploy (they probably have thousands of merges every day), etc.
On a much smaller scale, I believe Vercel/NextJS does something like this. I mean, a single NextJS repo is basically a monorepo (api folder + frontend), and vercel generates new deployments on every push, including preview/testing ones.
And I'm curious to know more about your preferred approach as well. Would that work for an SPA?
So nice to hear, maybe you can make a series
Isn't a stacked diff just branching off an existing branch in order to create a chain of PRs, each based off the previous one? If so, my company does that when building a feature that needs a lot of changes delivered at once since it breaks down each chunk of work into a smaller PR for review. The article made it sound like a Mercurial invention but I would have thought git always had this feature? Or maybe it was always possible in both tools but Facebook popularized it as a workflow?
This is the thing. Although there is a suggestion that the stack can be a branched graph hinting at another feature that would be maybe a bit harder to replicate simply with git. Not impossible. One way to do it would be to have a featureA and featureB branch and then branch off of one of them e.g. featureA creating featureC and then merge featureB into featureC. And here we go, we have a stacked diff workflow.
I suspect meta built more tooling around stacked diffs to facilitate it easily, but it can be feasibly replicated in git.
Coincidentally we tried stacking workflow without even realizing it when a developer just created a new branch off an existing branch that's awaiting PR review (because the new feature is dependent on it). It became a nightmare when we approved the review and squashed the PR. Now the new branch, even when rebased, would duplicate all the commits from the original PR even if those diffs were squashed into a single commit. Without a highly focused developer training covering "interactive rebase", this is not an easy situation to fix. The second PR for the new branch erroneously shows all the commits from the first PR even though none of those commits contribute to the overall diff of the second PR at all.
Conclusion: Git is not designed for stacking workflow.
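One commonly suggested way out of the squash problem described above is `git rebase --onto`, which replays only the commits that are unique to the stacked branch. The toy model below is a sketch of why the plain rebase drags the old commits along and the `--onto` form does not. It uses plain Python lists as stand-ins for commit history, with made-up commit labels; it is not git's real algorithm and it ignores conflicts and patch-id detection.

```python
def rebase_onto(new_base, old_upstream, branch):
    """Toy model of `git rebase --onto <new_base> <old_upstream> <branch>`.

    Histories are lists of commit labels, oldest first. Only commits on
    `branch` that are not reachable from `old_upstream` get replayed.
    """
    already_upstream = set(old_upstream)
    to_replay = [c for c in branch if c not in already_upstream]
    return new_base + to_replay


# Hypothetical history: feature_b is stacked on top of feature_a.
main = ["m1"]
feature_a = main + ["a1", "a2"]
feature_b = feature_a + ["b1"]

# feature_a is squash-merged: main gains ONE new commit, and a1/a2
# themselves never become part of main.
main_after_squash = main + ["A-squashed"]

# Like `git rebase main` on feature_b: a1 and a2 are not in main,
# so they are replayed again on top of the squashed commit.
naive = rebase_onto(main_after_squash, main_after_squash, feature_b)

# Like `git rebase --onto main feature_a feature_b`: only b1 is replayed.
restacked = rebase_onto(main_after_squash, feature_a, feature_b)

print(naive)      # ['m1', 'A-squashed', 'a1', 'a2', 'b1']
print(restacked)  # ['m1', 'A-squashed', 'b1']
```

As far as I understand it, stacking tools mostly automate this bookkeeping for every branch in the stack, so you don't have to remember the old parent yourself.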
Above a certain size, monorepos become more of a burden than a help. And at any size they disincentivize code modularity.
need microsoft ish video as well, entering in git land, acquiring github and related things at that time
I've worked at several companies that use monorepos, and every one of them proudly proclaimed that google does it so it must be right. All of them also had terrible engineering leadership and zero architectural convention. They used monorepos because they built businesses on top of a mishmash of uncoordinated and badly-planned apps, all stitched together into a tangled mass that no one dared change. Monorepos are like waterfall - sometimes they're the right approach. Most of the time though, they're a smell.
Where I work once had a monorepo in svn. Merges were a nightmare. I remember merge reports that were so long they kept breaking Confluence.
When we moved to git, we split out the project into dozens of repos, with a complex system to deploy all of those repositories, and a gerrit review system to make sure changes in one didn't break another. Merges were easier, but we had many more of them.
When we moved away from the arcane orchestration system we were using, to using maven, we started coalescing those repos back into something closer to a mono repo, as dependencies got increasingly difficult to manage. Now, we rarely commit to more than two or three repos, depending on which project we support.
Please do the scaling git and cover the technical aspects of how it works and the tools and technologies that are used to accomplish such a feat!
Our company uses mercurial instead of git too. It works great.
i like how the command and short name for mercurial is hg, which is the chemical symbol for mercury :)
Mercury is after all, very mercurial, as is software development. *8')
"Ideally you should have one thing you clone that has all the services you run and all the services they need to run"
Does nobody use submodules? Does having one "monorepo" that just links to the correct versions of various other repos not work?
Just a classical skill issue. They don't know how to use the tool properly so they blame the tool.
I've tried to promote the use of submodules many times at work, but they are quite a blunt tool, and every time I've tried, I've faced pushback.
For git status, they brought in file system monitor integration (inotify-style), so it doesn't need to check every file; the monitor notifies it about the changed files, although I don’t know why this option is not on by default.
Please do a video on stacked diffs. This is new to me (2 yr casual git novice, speaking)
This video improved my impression of Meta
Git really shot itself in the foot ignoring its scaling issues for so long. I was asked to make an assessment to see if git could replace Source Depot (Microsoft's version of Perforce) over a decade ago. I proved pretty convincingly that it couldn't. I asked the git team at the time if there was a way to mitigate these issues and got pretty much the same response that Facebook got.
that's pretty sad. i think this behavior is something ingrained in the Linux community
The way these early git maintainers acted is indicative of Linus Torvalds' influence lol.
I'm really interested in the video about how Microsoft scaled up git.
im going to bed, unlike that dude V
What Browser / OS does he use?
I love this sidebar, instead of a bottom task bar
How well was bazaar working on large monorepos back then? I always thought it was the nicest as a sole developer.
Phabricator is now called phorge and community maintained
Curious about something now…what if git sort of had multiple repositories in a larger repository, say along module lines, such that versioning is synced across them all but it doesn’t have to process modules that you didn’t edit? So in a sense it’s internally sharded, but without the drawbacks of version async. Under the hood it would be like separate repos that are forced to undergo all branches, commits, and PRs together, but it looks like one repo at the front end.
Is that like git submodules or git subtrees?
More of these please!
Does Facebook have any open source projects that use Mercurial though? So, they still use git for some things
Great video that brings back a lot of nostalgia. I especially love the focus on the human aspects and on communication.
Early on when I just joined the React Native team, I remember visiting our London office. A random engineer from the office found me during lunch and told me I was one of the last ~100 engineers still using Git instead of Mercurial, and he asked me what was missing for me to change.
I had only held out because I worried about disrupting my workflow before Mercurial was really solid, and I told that engineer that I would switch (and I did).
I was waiting to hear where sapling fits into the story, but nothing.
OCM is a major part of any org when they have to make hard decisions, you can’t just make a change and expect everyone to get onboard or get out.
Radicle decentralized peer to peer git looks interesting as well
7:37 "magic of the monorepo is... that it is in sync" ⬅ this so much! 🙂 Performance of git is bad and/or complicated, but being in sync is a great point of using something that actually supports monorepos.
It’s crazy the things these companies have to do to scale the repos. I use lfs (at work not for lols) and you have to be careful with things like making sure you don’t read every file or accidentally download everything. I think I ran vim once and downloaded like 70gb of the repo before i noticed and my computer went crazy.
and seeing the stack diff thing does make me a little jealous because I’ve had to learn to rebase and do all these things to keep things updated correctly when doing multiple features 😭
i would be really interested in a direct comparison of pijul against mercurial
followup should be about Sapling! :)
Please make a video about how you can make a tweet into a 20 minute video. This could have been a half-length YouTube Short with the same info.
I've learned that game devs typically use perforce. It works. Seems to handle merge conflicts better than git.
Depends. It's a commercial solution so I doubt most gamedevs use it, maybe some big game studios do. Git LFS seems to do the job for most.
Game devs also use C++ and "fat IDEs" like visual studios. Not everything the cool kids do works for us nerds ;)
Perforce is good at managing binary files, common in game development.
@@elirane85 VS's C++ debugger is insanely good. I've never seen any C++ vscode setup that comes close to Visual Studio's functionality.
@@nikkehtine Perforce is considerably more preferred over git + lfs in the game dev industry. From my experience, the fact it's a commercial solution doesn't play much impact in the conversation of viability. git + lfs still has a lot of complications over perforce for game dev, especially for larger games.
That said, I personally don't like perforce, and using it felt really clunky and unintuitive for the years I've done game dev with it.
Interested in this vid, former FB swe, never knew the why behind mercurial, but loved the experience with FB tooling
Regarding things being in sync: Aren't git submodules or git subtrees a way better solution for that? That way you can do version pinning but retain a lot more flexibility as to how your projects are integrated with each other.
Yes.
git submodules are a nightmare when anything goes wrong. In the free software projects where we had used them, we painstakingly moved away again, because they broke too hard when something was wrong. Missing robustness. Except for one where we had to re-introduce that, but that’s just a shell repo with some github actions.
At first I hated the monorepo, but over time I came to love it and now it's the git flow that feels weird lol
So what is stacked diff? To me it just looks like a mess of constant cascading rebases
The tooling does that for you, but yes behind the scenes that's what happens.
I feel like I understand it, but I also feel like I don't know how I'd work with it.
Like someone describes to me how to drive a car, you get in, pull the parkingbrake, check mirrors, release the brake, slowly release the clutch and accelerate slightly.
That's the feeling I have, I have it described how to drive a car, but I don't know how to drive a car.
I use stacked diffs (in git) to break a widespread change into comprehensible steps. If you were just faced with the overall diff to hundreds of files it would be overwhelming, you wouldn’t see the wood for the trees. For example, show the kernel of the change in one or two files first to demonstrate the concept and then show the same change applied to all files. Meanwhile only when it’s applied to all files can you test the final change.
How can a website have more lines of code than the Linux kernel? Maybe the issue is not Git but the bloated Facebook website.
Too much abstraction while Linux is mostly low level meat stuff?
I think the key is the kernel vs the entire operating system. Also, I think it's a factor that this was most likely pre-React, meaning every single thing was separate code. No reusable objects. And all the CSS and JS code and stylings add up to a lot of text. Also, 17M lines of code is roughly 500 MB to 1 GB of raw text at around 30-60 characters per line, which could be on the high end. Additionally, there are a ton of backend services that run Facebook that the user doesn't see, like all the data management services, user feed AI code, etc.
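A quick back-of-the-envelope check of that size estimate, assuming one byte per character and decimal megabytes (both are simplifying assumptions, and the helper name is made up):

```python
LINES = 17_000_000  # line count quoted in this thread


def raw_text_megabytes(lines, chars_per_line):
    # Assumes 1 byte per character and 1 MB = 1,000,000 bytes.
    return lines * chars_per_line / 1_000_000


for chars in (30, 60):
    print(f"{chars} chars/line -> {raw_text_megabytes(LINES, chars):.0f} MB")
```

At 30 characters per line that is about 510 MB, and at 60 it is about 1,020 MB, before counting any version history.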
"A website"
Talk about underestimation lmao
We're talking about code written by roughly 20,000 engineers over the span of 20+ years. The largest ever social media app, marketplace, advertisement platform, machine learning and AI research center, VR development center, home of React and React Native, countless internal MVPs, analytics and infrastructure tooling, websites, blogs and who knows what else...
Facebook is not just a website!! Lots of things happen on the back end
17M is not a lot for all the things Facebook is doing. It's even kind of small. I wouldn't be surprised to hear they are over 100M by now.
What browser are you using?
Lol my new company has a repo per microservice... I tried explaining we can still do microservices with a monorepo but they don't listen lol.
So if you go to make a common change (e.g. edit some linter rules) you have to do that in like 10-20 places now.
And yeah, multiple PRs for a feature (when it could be 1) is counterproductive.
"I tried explaining we can still do microservices with a monorepo but they don't listen lol."
I've been there. Apparently engineers don't know what a directory is!
please talk about scale git
Mercurial is the Pepsi of source control systems.
Nah, more the Dr Pepper, I'd say. *8')
"Interestingly, the Git maintainers appear to change their tone two years later "
That's normal, new changes always meet a little friction
if Szorc is a Polish name, as it appears to be, then it’s pronounced, ‘shorts’
Would love a video on Git-LFS 😃
I kinda wish fossil was more of a thing. It seems like such a pleasant alternative to the other VCS.
Oh god. I use it at work for some repos and I have to disagree. OOTB it seems nice but you quickly miss many git features and run into many nuisances.
@@WHYUNODYLAN Dang, I am sorry to hear that. Are the devs aware of the missing features?
@@WHYUNODYLAN Maybe it's because you're used to the git way of working? When you pilot something, try to embrace the solution as a whole.
@@Dhalucario The fossil devs? Yeah, sorta.
The main difference for version control is that fossil doesn't allow rewriting of history, which is very intentional. I do a lot of weird stuff with my repos so that's already a fairly big nuisance for me. However, I sorta misrepresented my point because the other features I prefer in git, well, they aren't actually to do with git.
Fossil is a whole "project management" system--issue tracking, wiki pages, etc. So it provides the same stuff as e.g. Github, but, frankly, it's not as good as what I'm used to. For instance, there's no concept of pull requests so at work we have to perform code reviews in Jira tickets. Fossil devs consider it to be more "featureful" for having this stuff built-in, but it's also very restrictive, since you can't jump between tools like you can with Github/Gitlab/Codeberg.
@@d3stinYwOw I could maybe agree with this if we were able to lean into fossil fully. We still have to use a whole bunch of other tools for project management, so we only use fossil as a VCS system. That being said, I think I'd still much prefer something like Forgejo if we were to go the route of "one tool for everything".
I'd be interested in a dedicated video on that. Put it on my queue
Infra dev at airbnb. I worked at airbnb... definitely nothing to brag about or even mention. Holy crap...
14:47 Buy-in is so critical for ANY project. If you don't have end-user buy-in (i.e. from the people who will be most impacted by a change), then the system will go entirely unused. Software development is more about getting social acceptance than about the language or code being written/used. Yes, you still have to develop the software, but if no one uses it, then it was a waste of time.
Stacked diffs are the way, the light, and the future, especially when you have multiple goals for a single codebase.