Question About CU Selection Logic. #73

diantaowang · 2024-12-19T11:34:38Z

When we assign a CU to a workgroup, do we use a simple round-robin algorithm? The code looks like this. Have we considered other prioritization algorithms?

cu_next := cu // switch default
switch(fsm) {
  is(FSM.CU_PREFER) {
    // FSM.CU_PREFER: generate preferred CU ID. Currently we prefer CU after the last allocated CU
    cu_next := cu + 1.U
  }
  is(FSM.RESOURCE_CHECK) {
    when(fsm_next === FSM.ALLOC) {
      // RESOURCE_CHECK -> ALLOC:  generated selected CU ID
      cu_next := cu + PriorityEncoder(resource_check_result.asUInt)
    } .otherwise {
      // RESOURCE_CHECK -> REJECT: DontCare
      // RESOURCE_CHECK: if(cache_updated) {do nothing} else {CU step}
      cu_next := Mux(resource_check_repeat, cu, cu + RESOURCE_CHECK_CU_STEP.U)
    }
  }
}

And why was a serial scanning method chosen? As the number of CUs increases, the performance overhead of this scan will become very large.

Thanks!

Humber-186 (Maintainer) · 2024-12-21T08:27:16Z

Yes, we use round-robin when selecting a CU for a workgroup (WG).

We have researched some complex thread-block scheduling strategies, but we currently have no plans to apply them in practice. Perhaps we will work on this in the future.

Of course, some less complex strategies are relatively easy to implement, such as BFS and DFS (see this). There is evidence suggesting that Nvidia uses similar strategies.
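To make the difference between the two simpler strategies concrete, here is a minimal plain-Scala software model (not Chisel, and not code from this repository). It assumes that "BFS" means breadth-first placement (move to the next CU after every allocation) and "DFS" means depth-first placement (keep filling the same CU until it is full); the linked reference may define them differently. The names `WgPlacementModel`, `place`, and `slotsPerCu` are hypothetical, and a single slot count per CU stands in for the real resource check.

```scala
// Hypothetical software model of breadth-first vs. depth-first WG placement.
object WgPlacementModel {
  /** Places `numWg` workgroups onto `numCu` CUs that each hold `slotsPerCu`
    * workgroups. Returns the CU chosen for each workgroup, in dispatch order. */
  def place(numWg: Int, numCu: Int, slotsPerCu: Int, depthFirst: Boolean): Vector[Int] = {
    require(numCu > 0 && slotsPerCu > 0)
    require(numWg <= numCu * slotsPerCu, "model assumes every workgroup fits")
    val used = Array.fill(numCu)(0) // workgroups currently resident on each CU
    var cu = 0                      // preferred CU for the next allocation
    Vector.tabulate(numWg) { _ =>
      // Skip CUs that are already full.
      while (used(cu) == slotsPerCu) cu = (cu + 1) % numCu
      val chosen = cu
      used(chosen) += 1
      // Breadth-first moves on after every allocation; depth-first stays put.
      if (!depthFirst) cu = (cu + 1) % numCu
      chosen
    }
  }
}
```

With 4 workgroups, 2 CUs, and 2 slots per CU, `place(4, 2, 2, depthFirst = false)` yields `Vector(0, 1, 0, 1)`, while `depthFirst = true` yields `Vector(0, 0, 1, 1)`. Under this reading, the current "prefer the CU after the last allocated one" behavior corresponds to the breadth-first case.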

For the second question, FSM.RESOURCE_CHECK isn't purely serial. Within a single clock cycle, RESOURCE_CHECK_CU_STEP CUs are checked in parallel. If RESOURCE_CHECK_CU_STEP equals the number of CUs, the check is fully parallel.
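The windowed scan can be sketched as a plain-Scala software model (again not Chisel, and only an approximation of the hardware). Each modeled cycle checks `step` consecutive CUs starting at the current CU, picks the first one that passes (the role `PriorityEncoder` plays in the snippet above), and otherwise advances the window by `step`. The names `CuScanModel`, `scan`, and `fits` are hypothetical; `fits` stands in for the per-CU resource check, and the modulo wraparound is an assumption about how the CU index wraps.

```scala
import scala.annotation.tailrec

// Hypothetical software model of the RESOURCE_CHECK window scan.
object CuScanModel {
  /** Returns Some((selected CU, cycles spent checking)), or None if no CU fits.
    * `step` plays the role of RESOURCE_CHECK_CU_STEP. */
  def scan(start: Int, numCu: Int, step: Int, fits: Int => Boolean): Option[(Int, Int)] = {
    require(numCu > 0 && step > 0 && step <= numCu)
    // ceil(numCu / step) windows are enough to cover every CU once.
    val maxCycles = (numCu + step - 1) / step

    @tailrec
    def loop(cu: Int, cycle: Int): Option[(Int, Int)] =
      if (cycle > maxCycles) None
      else {
        // One clock cycle: `step` consecutive CUs are checked in parallel.
        val window = (0 until step).map(off => (cu + off) % numCu)
        // Lowest offset wins, like a priority encoder over the check results.
        window.find(fits) match {
          case Some(hit) => Some((hit, cycle))
          case None      => loop((cu + step) % numCu, cycle + 1) // advance the window
        }
      }

    loop(start % numCu, 1)
  }
}
```

For example, with 8 CUs where only CU 0 has room and the scan starts at CU 3, `scan(3, 8, 2, _ == 0)` returns `Some((0, 3))` (three cycles), while `scan(3, 8, 8, _ == 0)` returns `Some((0, 1))` (one cycle). In this model the worst case is `ceil(numCu / step)` cycles, so the step size trades scan latency against the number of parallel checkers.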

In most cases, the CTA scheduler is not the throughput bottleneck of the whole GPGPU, so we do not use a fully parallel structure, which saves hardware area.

DSP Lab of Tsinghua University
