The Anatomy of a Modern CPU Cache Hierarchy

  • Published Dec 14, 2024

COMMENTS • 32

  • @TheUpriseConvention 4 days ago +10

    Thank you so much for your videos! I’m currently a machine learning engineer trying to cover the computer science theory I never learnt at school. These videos are a goldmine!

  • @KrisRyanStallard 4 days ago +2

    Excellent video. Informative without getting bogged down in too many unnecessary details

  • @harshnj 3 days ago +4

    You've earned a subscriber.
    Just don't stop making these quality videos

  • @szymonozog7862 4 days ago +2

    Love the series so far, keep it going!

  • @mateuszpragnacy8327 4 days ago +6

    Really good videos. They're really helping me design a cache for my Minecraft CPU ❤

  • @stefanopilone957 4 days ago +3

    Thanks, very clear. I liked & subscribed

  • @dj.yacine 4 days ago +3

    Thanks 👍. High quality 💯

  • @abunapha 3 days ago

    Thank you

  • @chetan_naik 4 days ago +4

    Informative video, but why stop at L3 or L4 cache? Why not add L5, L6, L7, and so on to improve performance?

    • @turner7777 4 days ago +11

      Probably diminishing returns with increased cost and complexity

    • @der.Schtefan 4 days ago +8

      It's down to the way memory is implemented. By the time you get past L3, the latency is almost as "slow" as main system RAM on a memory bus. L1 is usually very expensive, space-eating SRAM.

    • @jedijackattack3594 3 days ago +2

      This has been done before. Intel did an L4 cache on Broadwell for certain C-suffix CPU chips, using a big external eDRAM die.
      The first problem is that cache is rather expensive. Die cost scales roughly quadratically with die size, so a 100 mm^2 die is about 4x as expensive as a 50 mm^2 one. And modern CPU cores are actually quite small: a Zen 5 core is only around 4 mm^2, yet on the full Zen 5 CCD about half the die area is the 32 MiB of L3 cache and the 8 MiB of L2. As an additional problem, thanks to the high clock speeds, the latency grows as the cache gets bigger, just from moving the data from the cache back to the processor.
      Cache is also quite power hungry even when idle, so designers tend to minimise it on consumer platforms, especially devices that idle a lot like phones; Intel went as far as allowing the whole cache to be powered off on the performance cores.
      As for why we still don't see L4 or L5 caches, there are a lot of so-called cache-unfriendly workloads. These workloads tend to have a few things in common: low levels of exploitable instruction-level parallelism, high levels of random branches (especially consecutive branches), and a large, randomly and sparsely accessed dataset. That combination leaves the processor unable to speculate ahead or reorder instructions to hide latency, so it is purely at the mercy of the memory subsystem for how long each stall lasts. Thanks to the random sparse accesses, the caches are unlikely to contain the right data, since lines are constantly thrashed and discarded and the randomness defeats the prefetchers. A bigger cache may let you brute-force the problem, as AMD has proven with the X3D line of chips, but if the dataset is still so much bigger than the cache that the hit rate does not improve, you will see no improvement in performance. And if your new big cache doesn't have enough bandwidth to feed the improved hit rate, the cores end up stalling while they wait for the cache to get around to servicing their requests. Making a cache higher bandwidth makes it bigger and more expensive.
      This is part of the reason so much modern CPU optimisation and HPC software optimisation focuses on making the data and instruction streams as predictable and cache friendly as possible.
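
A minimal C sketch of the cache-unfriendly behaviour described in this comment, for readers who want to see it directly. The array size, shuffle, and timing approach are illustrative assumptions, not something from the video: a sequential scan lets the hardware prefetcher stream lines in, while a dependent random pointer chase over a working set larger than L3 defeats both prefetching and caching.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (16u * 1024 * 1024)   /* 16M elements: well past a typical L3 */

int main(void) {
    long long sum = 0;
    int *a = malloc(N * sizeof *a);
    size_t *next = malloc(N * sizeof *next);
    if (!a || !next) return 1;

    /* Sequential scan: addresses are predictable, so the hardware
       prefetcher hides almost all of the memory latency. */
    for (size_t i = 0; i < N; i++) a[i] = (int)i;
    clock_t t0 = clock();
    for (size_t i = 0; i < N; i++) sum += a[i];
    double seq = (double)(clock() - t0) / CLOCKS_PER_SEC;

    /* Random pointer chase: build a random permutation, then follow it.
       Every load depends on the previous one and hits a random line, so
       neither reordering nor prefetching can hide the miss latency. */
    for (size_t i = 0; i < N; i++) next[i] = i;
    for (size_t i = N - 1; i > 0; i--) {          /* Fisher-Yates shuffle */
        size_t j = (size_t)rand() % (i + 1);
        size_t t = next[i]; next[i] = next[j]; next[j] = t;
    }
    t0 = clock();
    size_t p = 0;
    for (size_t i = 0; i < N; i++) p = next[p];
    double rnd = (double)(clock() - t0) / CLOCKS_PER_SEC;

    printf("sequential:   %.2fs (sum=%lld)\n", seq, sum);
    printf("random chase: %.2fs (end=%zu)\n", rnd, p);
    free(a);
    free(next);
    return 0;
}
```

On a typical desktop the chase runs an order of magnitude slower despite executing the same number of loads, which is exactly the "at the mercy of the memory subsystem" stall described above.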

    • @chetan_naik 3 days ago

      @@jedijackattack3594 Well explained. I also wonder: when a cache miss occurs, is the latency just the RAM latency, or the RAM latency plus the latencies of all the cache levels combined?

    • @jyotiradityasatpathy3546 3 days ago

      @@chetan_naik It depends on the access architecture. Lookups are usually serial rather than parallel, which means the latencies are added. However, a main-memory access takes far, far longer than a register-file access.
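
To put rough numbers on the "latencies are added" point: the standard back-of-the-envelope model is average memory access time (AMAT), where each level contributes its hit time and a miss adds the cost of the level below. The cycle counts and miss rates here are illustrative assumptions in the ballpark of a modern desktop core, not measured values.

```c
#include <stdio.h>

int main(void) {
    /* Illustrative, assumed numbers (cycles and miss rates), not specs. */
    double l1 = 4, l2 = 14, l3 = 50, dram = 200;
    double m1 = 0.05, m2 = 0.30, m3 = 0.40;

    /* AMAT(level) = hit_time + miss_rate * AMAT(level below).
       A full miss on a serial lookup really does pay L1 + L2 + L3 + DRAM,
       but the DRAM access dominates the sum. */
    double amat3 = l3 + m3 * dram;     /* 130 cycles  */
    double amat2 = l2 + m2 * amat3;    /* 53 cycles   */
    double amat1 = l1 + m1 * amat2;    /* ~6.7 cycles */

    printf("full-miss latency: %.0f cycles\n", l1 + l2 + l3 + dram);
    printf("average (AMAT):    %.1f cycles\n", amat1);
    return 0;
}
```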

  • @der.Schtefan 4 days ago +7

    You did not explain associativity.
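
For anyone curious, the gist of associativity: in a set-associative cache the address is split into tag, set index, and block offset, and a line may live in any of the N ways of its set. Below is a minimal sketch of the index math; the geometry (32 KiB, 8-way, 64-byte lines, roughly a common L1d) is an illustrative assumption.

```c
#include <inttypes.h>
#include <stdio.h>

/* Assumed geometry: 32 KiB, 8-way, 64-byte lines.
   32768 / (8 * 64) = 64 sets -> 6 index bits after 6 offset bits. */
#define OFFSET_BITS 6
#define SET_BITS    6
#define NUM_SETS    (1u << SET_BITS)

int main(void) {
    uint64_t addr = 0x7ffdc0de1234;   /* arbitrary example address */

    uint64_t offset = addr & ((1u << OFFSET_BITS) - 1);
    uint64_t set    = (addr >> OFFSET_BITS) & (NUM_SETS - 1);
    uint64_t tag    = addr >> (OFFSET_BITS + SET_BITS);

    /* On a lookup the cache reads all 8 tags of 'set' and compares them
       to 'tag' in parallel; the line may reside in any of those 8 ways. */
    printf("addr=%#" PRIx64 " -> tag=%#" PRIx64 " set=%" PRIu64
           " offset=%" PRIu64 "\n", addr, tag, set, offset);
    return 0;
}
```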

  • @stachowi 2 days ago

    Very good (and to the point).

  • @abskrnjn 2 days ago

    How did you make this video? Cool visuals

  • @anonymoususerinterface 2 days ago

    Can I ask where you get this knowledge from? I would like to know more!

    • @BitLemonSoftware 2 days ago

      My own knowledge as a software/firmware engineer, plus the research I do for each video. You can see the sources I used in the description.

  • @hatsuneadc 16 hours ago

    What happens if a line is not yet available in L3 when another core tries to access it? Does it wait for it to propagate, or does it take the last known (old) state?

    • @BitLemonSoftware 8 hours ago

      I didn't fully understand the question. In any case, a cache will never pass stale values to the processor core.
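
For context on the exchange above: multi-core CPUs keep their private caches coherent with a MESI-style protocol, which is why a stale read cannot happen. Here is a simplified sketch of the state machine for the scenario the question describes; this is the textbook model, not the exact protocol of any particular CPU.

```c
#include <stdio.h>

/* Textbook MESI states for one cache line in one core's private cache. */
typedef enum { MODIFIED, EXCLUSIVE, SHARED, INVALID } mesi_state;

/* Another core has requested to read a line we hold. If our copy is
   MODIFIED, we must supply the fresh data (and/or write it back) before
   the requester's load can complete -- the requester stalls; it never
   sees an old value from memory or L3. */
mesi_state on_remote_read(mesi_state mine, int *must_supply_data) {
    *must_supply_data = (mine == MODIFIED);
    switch (mine) {
    case MODIFIED:
    case EXCLUSIVE:
    case SHARED:  return SHARED;   /* both copies end up clean + shared */
    default:      return INVALID;  /* we held no valid copy anyway      */
    }
}

int main(void) {
    int supply = 0;
    mesi_state after = on_remote_read(MODIFIED, &supply);
    printf("state after remote read: %s, supplied fresh data: %s\n",
           after == SHARED ? "SHARED" : "INVALID", supply ? "yes" : "no");
    return 0;
}
```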

  • @mikevirutal79 2 days ago

    Great video. Do you have courses on Udemy?