Databricks - Data Quality - PyDeequ - Introduction

  • Published Jan 8, 2025

COMMENTS • 11

  • @jbernece
    @jbernece 2 months ago

    Excellent stuff Apostolos. Thanks for sharing this very detailed and complete solution.

  • @NoahPitts713
    @NoahPitts713 3 months ago

    It's like you read my mind on this topic. I've been looking for DQ libraries for the last couple of weeks. Thanks for sharing!

  • @Bijuthtt
    @Bijuthtt 3 months ago

    Your videos are awesome. Good job, sir.

  • @emmazagorianou8389
    @emmazagorianou8389 3 months ago

    As always amazing tutorial!!!

  • @MartinOlowe
    @MartinOlowe 2 months ago

    Hi, amazing vid. What is your cluster config, i.e. Databricks runtime, worker type, Spark config?

    • @AthanasiouApostolos
      @AthanasiouApostolos 2 months ago

      Thank you :) Just a regular cluster, the cheapest option for the demo. For production you would obviously need something more powerful.

  • @ParasUpadhyay-q6y
    @ParasUpadhyay-q6y 1 month ago

    Does it work with UC-enabled clusters in Databricks?

    • @AthanasiouApostolos
      @AthanasiouApostolos 1 month ago

      @ParasUpadhyay-q6y Yes, it shouldn't matter.

    • @ParasUpadhyay-q6y
      @ParasUpadhyay-q6y 1 month ago

      @AthanasiouApostolos PyDeequ is not working on a shared access mode cluster. The same code works fine for me in single user access mode. Is this expected?