PySpark Kickstart - Your first Apache Spark data pipeline

  • Published 5 Sep 2024

COMMENTS • 6

  • @khandoor7228
    1 year ago

    Great series. Thanks for this! I like the Databricks env and work with it every day, but I'm mostly interested in the PySpark commands and your workflow for structuring data, data manipulation, and general data engineering methods. PySpark stuff generally.

  • @Toast_d3u
    5 months ago

    great content, thank you

  • @patzeranalysis3452
    1 year ago

    When I saw pipeline in the title, I was imagining a pipeline similar to ETL workflows in some tools

    • @DustinVannoy
      11 months ago +1

      The term has multiple meanings, but here it means building the ETL functionality in code rather than with a UI tool or a declarative language.

  • @torandbuch3028
    1 year ago +1

    Great, but for some people, the first 15 minutes are really not what they are looking for when they click on your video :)

  • @rainfeedermusic
    1 year ago

    After downloading, how do I upload the dataset to Databricks?