A Deep Dive into Query Execution Engine of Spark SQL - Maryann Xue

  • Published 29 Aug 2024

COMMENTS • 4

  • @user-sw9kd9pv4n (3 years ago)

    Excellent session, very well explained.

  • @megharaina7561 (4 years ago)

    What is the difference between a normal stage in a job and a WSCG? Do multiple pipelines within a single WSCG correspond to separate stages?

    • @vinayakalagwadi5866 (4 years ago)

      According to my understanding, stages are separated only when there is a blocking operation that requires aggregation.

    • @reesewinterspoon9417 (3 years ago, +3)

      Pipelines are separated by blocking operations, while stages are separated by shuffles (in most cases). A stage can have multiple pipelines. For example, a stage containing a hash aggregate will have multiple pipelines, because hash aggregate is a blocking operation. If it is only a partial hash aggregate, which does not need a shuffle, all of its operations stay in a single stage. Hope that helps.