Real-Time Data Pipelines Made Easy with Structured Streaming in Apache Spark | Databricks

  • Published Dec 2, 2024

COMMENTS • 18

  • @tejusization 4 years ago +6

    28:45 to 29:40 is the best!!! :D just don't miss that. sets the context

  • @danielmackie82 6 years ago +4

    What would be an open-source equivalent of Databricks Delta?

  • @ashwinkumar5223 1 year ago

    Superb Explanation

  • @mohitmehta3788 4 years ago +2

    A very simplified approach to explaining streaming.

  • @HridyanshiB. 6 years ago +1

    Good explanation of streaming. Thanks.

  • @yourstruly5DA 5 years ago +1

    Good presentation. Would like to understand more how it could integrate and scale with Apache Kafka.

    • @satria5403 4 years ago

      Hi, please let me know if you have good resources for this. Thank you.

  • @JanekBogucki 4 years ago +3

    23:30 A single rogue timestamp which is one hour ahead of the second max timestamp would drop all earlier buckets except one bucket corresponding to this single anomalous value. This is fragile.

    • @onewithsixonewithsix601 4 years ago

      Unless there is some crazy issue in the code manipulating the timestamp, getting a timestamp ahead of the actual Unix time is not a probable scenario.
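
      For anyone who wants to see the concern from 23:30 concretely, here is a minimal plain-Python sketch. It is not the Spark API. It assumes a simplified rule in which the watermark trails the maximum event time seen so far by a fixed threshold and any event older than the watermark is discarded; the function name `simulate_watermark`, the 10-minute threshold, and the sample timestamps are all made up for illustration. The real engine advances the watermark at trigger boundaries and works on windows rather than single events, so treat this only as a rough model.

      ```python
      from datetime import datetime, timedelta


      def simulate_watermark(event_times, threshold):
          """Return (accepted, dropped) under a simplified watermark rule.

          The watermark is (max event time seen so far) - threshold.
          An event older than the current watermark is treated as too late.
          """
          max_seen = None
          accepted, dropped = [], []
          for ts in event_times:
              if max_seen is not None and ts < max_seen - threshold:
                  dropped.append(ts)  # behind the watermark: discarded
                  continue
              accepted.append(ts)
              if max_seen is None or ts > max_seen:
                  max_seen = ts  # the watermark only ever moves forward
          return accepted, dropped


      base = datetime(2024, 1, 1, 12, 0)
      normal = [base + timedelta(minutes=m) for m in (0, 2, 4)]
      rogue = base + timedelta(hours=1, minutes=4)  # one hour ahead of the real maximum
      later = [base + timedelta(minutes=m) for m in (5, 6, 7)]
      threshold = timedelta(minutes=10)  # hypothetical lateness threshold

      ok_accepted, ok_dropped = simulate_watermark(normal + later, threshold)
      bad_accepted, bad_dropped = simulate_watermark(normal + [rogue] + later, threshold)

      print("dropped without rogue event:", len(ok_dropped))
      print("dropped with rogue event:", len(bad_dropped))
      ```

      In this toy model the single rogue event pushes the watermark about an hour forward, so every on-time event that arrives after it is rejected. How likely such a timestamp is in practice is exactly the point debated in the two comments above.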

  • @zhengfang303 5 years ago +1

    Once data has entered the DataFrame, if it is later updated or deleted at the source, how can I update or delete it in the DataFrame?

    • @parthadeb3723 4 years ago

      A DataFrame is immutable. You cannot update a DataFrame. You have to create a new DataFrame.
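
      As a rough analogy for the reply above, here is a plain-Python sketch (not Spark code) of what "create a new one instead of updating" can look like. The `Row` class, the `apply_changes` helper, and the sample data are hypothetical names invented for this illustration; the snapshot is modelled as an immutable tuple of frozen rows, and updates and deletes produce a second snapshot while the first stays untouched.

      ```python
      from dataclasses import dataclass


      @dataclass(frozen=True)
      class Row:
          """An immutable record, standing in for one row of a table."""
          id: int
          value: str


      def apply_changes(snapshot, upserts=(), deleted_ids=()):
          """Build and return a new snapshot; the input tuple is never modified."""
          by_id = {row.id: row for row in snapshot}
          for row in upserts:
              by_id[row.id] = row  # insert a new row or replace an existing one
          for row_id in deleted_ids:
              by_id.pop(row_id, None)  # ignore ids that are not present
          return tuple(sorted(by_id.values(), key=lambda r: r.id))


      v1 = (Row(1, "a"), Row(2, "b"), Row(3, "c"))
      v2 = apply_changes(v1, upserts=[Row(2, "B")], deleted_ids=[3])

      print(v1)  # the original snapshot is unchanged
      print(v2)  # the new snapshot reflects the update and the delete
      ```

      The same idea applies to DataFrames: a transformation returns a new DataFrame, and the old one remains as it was.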

  • @venkat.k4392 4 years ago

    Appreciated. Thank you for a great knowledge share.

  • @GrayMatterSoftware 4 years ago +2

    Want to know about the best practices for real-time analytics architecture on big data?
    Read here: www.graymatter.co.in/real-time-analytics-bigdata-architecture/
    Know more: www.graymatter.co.in/real-time-analytics/
    Watch here: ua-cam.com/video/lXdYk3hak54/v-deo.html

  • @thesleepyhead7273 6 years ago +1

    How about integrating this with TensorFlow Serving for an end-to-end analytics paradigm?

  • @albertoandreotti7940 5 years ago +1

    This is a big disappointment. You cannot stream pipelines built with DataFrames. Unified processing framework?? Come on!
    You have to build new versions of all your algorithms so that they can work with a DStream? What a waste of time.