qGPT-J: qparam vary depending on GPU device #17

deeplearningfromscratch · 2024-04-01T09:11:48Z

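While validating the calibration step, we found that the qparam calibrated on an A100 differ from the qparam calibrated on an H100. As a result, calibrating and evaluating on an A100 cannot reproduce the numbers reported in the "measured results".
- The qparam used for the "measured results" were calibrated on an H100.
In addition, no H100 is currently available. For the v1.1 result verification step, we therefore intend to judge by whether the accuracy and qparam match those attached below.
- Here, the accuracy is the result of calibrating and evaluating on an A100, and the qparam are the output of that same calibration run.
[TODO] Starting with the next release, end-to-end accuracy of the quantized model needs to be verified with calibration on an H100 included. @BeomGeunCho
[TODO] We need to identify what difference causes the qparam to vary between devices.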
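Since the v1.1 check comes down to whether a freshly calibrated set of qparam matches the attached one, a small comparison helper may be useful both for that check and for the second TODO. The following is only a minimal sketch: it assumes the qparam have already been loaded into a mapping from tensor name to a flat list of floats, which this issue does not confirm. The function name `diff_qparams`, the key names, and the toy values are all hypothetical and not taken from the repository.

```python
import math


def diff_qparams(ref, new, rel_tol=0.0, abs_tol=0.0):
    """Compare two qparam mappings (name -> list of floats).

    Returns a dict mapping each mismatching name to a short reason.
    With the default tolerances of 0.0, values must match exactly.
    """
    mismatches = {}
    for name in sorted(set(ref) | set(new)):
        if name not in ref:
            mismatches[name] = "missing in reference"
            continue
        if name not in new:
            mismatches[name] = "missing in new run"
            continue
        a, b = ref[name], new[name]
        if len(a) != len(b):
            mismatches[name] = f"length {len(a)} vs {len(b)}"
            continue
        same = all(
            math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
            for x, y in zip(a, b)
        )
        if not same:
            worst = max(abs(x - y) for x, y in zip(a, b))
            mismatches[name] = f"max abs diff {worst:.3e}"
    return mismatches


# Toy values for illustration only; these are NOT real calibration results.
reference = {"layer0.scale": [0.5, 0.25], "layer0.zero_point": [0.0, 0.0]}
candidate = {"layer0.scale": [0.5, 0.375], "layer0.zero_point": [0.0, 0.0]}
print(diff_qparams(reference, candidate))
```

An empty result means the two sets match under the chosen tolerances. Running the same helper on A100 and H100 outputs with a non-zero `abs_tol` would show which tensors diverge and by how much, which could help narrow down the device-dependent difference.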

deeplearningfromscratch mentioned this issue Apr 1, 2024

construct mlperf qGPT-J evaluation #16

Merged
