Amazing lectures! Thank you for making them public. I hope to attend one of your live lectures someday and interact with you.
Lecture starts at 9:04
Very helpful, thank you
Main memory starts at 7:40
Hi Prof. Onur Mutlu,
All the facts are beautifully explained. Hats off! Thanks for the lectures.
I have two questions. First, how does the memory controller select the rank number from a given address? You have shown only row, channel, and column selection from an incoming address. (Time: 1:19:22 - ua-cam.com/video/IUk9o9wvX1Y/v-deo.html)
Second, do you have a video explaining how the memory controller and PHY interconnect and work with the DRAM chip, and how timing signals such as DQS work together? If so, please share the link.
If you pipeline subarray accesses within a bank, won't you essentially end up thrashing the global row buffer? It seems this would increase the frequency of precharging (and therefore power consumption, AFAICT).
Regarding Thread Cluster Memory Scheduling, my questions are:
1. Which threads are you referring to here? Are they hardware threads or kernel/user threads?
2. How does the memory controller get information about the thread?
1. They are hardware threads. More specifically, each is identified by a hardware context ID.
2. The hardware context ID is communicated to the memory controller along with each request.
I would suggest reading earlier work on thread-aware memory scheduling, which covers this in more detail:
Mutlu and Moscibroda, "Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors", MICRO 2007.
people.inf.ethz.ch/omutlu/pub/stfm_micro07.pdf
Mutlu and Moscibroda, "Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems", ISCA 2008.
people.inf.ethz.ch/omutlu/pub/parbs_isca08.pdf
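To illustrate the point about the hardware context ID travelling with each request, here is a minimal Python sketch. The names `MemRequest`, `hw_context_id`, and `requests_per_context` are hypothetical and not taken from the lecture or the papers. The sketch only shows how a controller could attribute queued requests to hardware threads, not how any particular scheduler ranks them.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class MemRequest:
    """One request as seen by the memory controller (hypothetical layout)."""
    address: int
    is_write: bool
    hw_context_id: int  # assumption: tag attached by the core that issued the request


def requests_per_context(queue):
    """Count queued requests per hardware context ID."""
    return Counter(req.hw_context_id for req in queue)


# A small request queue with two hardware contexts (made-up addresses).
queue = [
    MemRequest(address=0x1000, is_write=False, hw_context_id=0),
    MemRequest(address=0x2040, is_write=True, hw_context_id=1),
    MemRequest(address=0x1040, is_write=False, hw_context_id=0),
]
counts = requests_per_context(queue)
print(dict(counts))
```

Because the tag arrives with the request itself, the controller needs no knowledge of operating-system threads. It only sees which hardware context issued each access.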
I don't understand why each column gives 8 bits. Isn't the column decoder also a mux that selects one of the bits in the row buffer?
A single row buffer holds many bytes. For example, a DRAM can have 8 KB row buffers. The column decoder selects a block within the row (the block size can vary; for example, 64 bytes).
I think the images abstract this to show a single bit, but under the hood each column actually consists of 8 stacked memory cells (connected to the same row-column intersection).
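The numbers in the replies above can be worked through in a short Python sketch. It assumes the example sizes mentioned there (an 8 KB row buffer and 64-byte blocks). The function name `select_block` and the flat `bytes` model of the row buffer are illustrative simplifications, not a description of real DRAM circuitry.

```python
ROW_BUFFER_BYTES = 8 * 1024   # example row buffer size from the reply above
BLOCK_BYTES = 64              # example block size from the reply above

BLOCKS_PER_ROW = ROW_BUFFER_BYTES // BLOCK_BYTES           # 128 blocks per row
COLUMN_ADDRESS_BITS = (BLOCKS_PER_ROW - 1).bit_length()    # bits needed to pick one block


def select_block(row_buffer: bytes, column: int) -> bytes:
    """Model the column decoder as a mux that picks one block out of the open row."""
    if len(row_buffer) != ROW_BUFFER_BYTES:
        raise ValueError("row buffer has unexpected size")
    if not 0 <= column < BLOCKS_PER_ROW:
        raise ValueError("column index out of range")
    start = column * BLOCK_BYTES
    return row_buffer[start:start + BLOCK_BYTES]


# Fill a fake open row so that every byte of block i holds the value i.
row = bytes(i // BLOCK_BYTES for i in range(ROW_BUFFER_BYTES))
block = select_block(row, 5)
print(BLOCKS_PER_ROW, COLUMN_ADDRESS_BITS, len(block), block[0])
```

With these sizes, a 7-bit column address is enough to choose among the 128 blocks in the open row.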
Why are there so many "@"?
I know.