Implementing Pyspark Real Time Application || End-to-End Project || Part-1

  • Published 14 Jun 2023
  • In this video we discuss implementing a PySpark application in PyCharm and reading files dynamically from their respective folders.
    Prerequisites:
    Spark and Hadoop installed, Python, PyCharm
    Links to the dataset:
    Download the City Dimension file at the link below:
    prescpipeline1.blob.core.wind...
    Download the Prescriber Fact file at the link below:
    prescpipeline1.blob.core.wind...
    #azuredatabricks
    #dataengineering
    #dataanalysis
    #pyspark
    #pythonprogramming
    #python
    #sql
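
As a rough illustration of "reading the files dynamically from the respective folders", here is a minimal sketch, not the code from the video. It assumes each source folder holds a CSV or parquet file and picks the reader format from the file extension. The names `FORMATS`, `discover_source`, and `read_source` are hypothetical, and `spark` is an existing SparkSession passed in by the caller.

```python
from pathlib import Path

# Map file extensions to the format names Spark's DataFrameReader understands.
FORMATS = {".csv": "csv", ".parquet": "parquet"}


def discover_source(folder):
    """Return (path, format) for the first supported file found in `folder`."""
    for entry in sorted(Path(folder).iterdir()):
        fmt = FORMATS.get(entry.suffix.lower())
        if entry.is_file() and fmt:
            return str(entry), fmt
    raise FileNotFoundError(f"no csv or parquet file found in {folder}")


def read_source(spark, folder):
    """Load whatever file sits in `folder` using an existing SparkSession."""
    path, fmt = discover_source(folder)
    reader = spark.read.format(fmt)
    if fmt == "csv":
        # Assumption: the CSV has a header row; adjust if it does not.
        reader = reader.option("header", True).option("inferSchema", True)
    return reader.load(path)
```

With this shape, dropping a different file into the folder changes what gets loaded without touching the code, which is the point of detecting the path and format at run time.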

COMMENTS • 39

  • @erwinfrerick3891
    @erwinfrerick3891 2 months ago +1

    Great explanation, very clear. This video was very helpful for me.

  • @Ravi_Teja_Padala_tAlKs
    @Ravi_Teja_Padala_tAlKs 10 months ago +2

    Good explanation 😊. Now I am confident about how the folder structure in PySpark works.
    Thanks

  • @prabhatgupta6415
    @prabhatgupta6415 1 year ago +3

    You are ahead of everyone in explanation.

  • @rohilarohi
    @rohilarohi 3 months ago

    This video helped me a lot. I hope we can expect more real-time scenarios like this.

  • @user-fn9sg9xp5p
    @user-fn9sg9xp5p 1 year ago +2

    good content

  • @shibajena4205
    @shibajena4205 1 year ago +2

    good explanation

  • @pawansalwe1926
    @pawansalwe1926 1 year ago +2

    👍👍

  • @ravisamal3533
    @ravisamal3533 10 months ago

    Hey, great explanation. Could you please reshare the CSV file that is used? I am not able to extract the file mentioned in your description.

    • @DataSpark45
      @DataSpark45 10 months ago

      drive.google.com/drive/folders/1XMthOh9IVAScA8Lk-wfbBnKCEtmZ6UKF?usp=sharing

  • @sainadhvenkata
    @sainadhvenkata 14 days ago

    @dataspark Could you please provide those data links again? The links have expired.

  • @skateforlife3679
    @skateforlife3679 8 months ago

    Instead of get_env_variables.py, we could use a .env file, couldn't we?
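
To make the `.env` idea concrete, here is a minimal standard-library sketch under stated assumptions. It assumes a plain `KEY=VALUE` file; the helper names `load_env_file` and `apply_env` and any variable names you put in the file are hypothetical, not taken from the video. The third-party `python-dotenv` package offers a `load_dotenv()` function that covers the same ground more thoroughly.

```python
import os


def load_env_file(path):
    """Parse KEY=VALUE lines from `path` and return them as a dict."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            # Skip blank lines, comments, and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Drop surrounding whitespace and optional quotes around the value.
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def apply_env(values):
    """Copy parsed values into os.environ without overriding existing ones."""
    for key, value in values.items():
        os.environ.setdefault(key, value)
```

A `.env` file only replaces the static settings. Anything computed at run time, such as locating the file inside a source folder, would still need Python code.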

  • @skateforlife3679
    @skateforlife3679 8 months ago +1

    I think the code looks too verbose and needs some refactoring to simplify things. Overall, good content.

  • @akaile2233
    @akaile2233 11 months ago

    Sir, can we use Scala in the IntelliJ IDE for the project?

    • @DataSpark45
      @DataSpark45 10 months ago

      Yes, you can, brother.

  • @komalibellana9514
    @komalibellana9514 1 year ago

    I am not able to download the fact file. I get an error when extracting the file.

    • @DataSpark45
      @DataSpark45 10 months ago

      drive.google.com/drive/folders/1XMthOh9IVAScA8Lk-wfbBnKCEtmZ6UKF?usp=sharing

  • @0adarsh101
    @0adarsh101 4 months ago

    Can I use Databricks Community Edition?

    • @DataSpark45
      @DataSpark45 3 months ago

      Hi, you can use Databricks. You will then have to work with the dbutils.fs methods to get the file list / file path, as we did in the get_env.py file. Thank you.
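
A minimal sketch of what the reply above describes, assuming a Databricks notebook where `dbutils` is predefined. `dbutils.fs.ls` returns entries that expose `name` and `path`; the helper name `find_source_file` and the default extensions are hypothetical choices, not the code from the video.

```python
def find_source_file(dbutils, folder, extensions=(".csv", ".parquet")):
    """Return the path of the first file in `folder` with a matching extension."""
    for info in dbutils.fs.ls(folder):
        # Directory entries have names ending in "/", so they never match here.
        if info.name.lower().endswith(extensions):
            return info.path
    raise FileNotFoundError(f"no matching file found in {folder}")
```

In a notebook this would be called as `find_source_file(dbutils, folder)` with whatever folder holds the source data. Passing `dbutils` in as an argument keeps the function testable outside Databricks.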

  • @commenterdek3241
    @commenterdek3241 8 months ago

    Hello. Does anyone know Hindi and can explain this project to me entirely in Hindi (not in much detail, just briefly) in 30 minutes or so? I'm a fresher and all this is going over my head. Please help out 😢😢😢

  • @vishavsi
    @vishavsi 5 months ago +1

    I am getting an error with logging:
    Python\Python39\lib\configparser.py", line 1254, in __getitem__
    raise KeyError(key)
    KeyError: 'keys'
    Can you share the code written in the video?

    • @DataSpark45
      @DataSpark45 5 months ago +1

      Sure, here is the link: drive.google.com/drive/folders/1QD8635pBSzDtxI-ykTx8yquop2i4Xghn?usp=sharing

    • @vishavsi
      @vishavsi 5 months ago

      Thanks @DataSpark45

    • @subhankarmodumudi9033
      @subhankarmodumudi9033 5 months ago

      @vishavsi Was your problem resolved?
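
On the `KeyError: 'keys'` reported at the top of this thread: one likely cause, assuming the project loads its logging setup with `logging.config.fileConfig`, is a config file in which the `[loggers]`, `[handlers]`, or `[formatters]` section exists but has no `keys=` entry. The sketch below shows a minimal config that has all three. The logger, handler, and formatter names and the `configure_logging` helper are hypothetical, not the ones from the video.

```python
import logging
import logging.config
import os
import tempfile

# Each of the three top-level sections must list its members under `keys`.
CONFIG = """\
[loggers]
keys=root

[handlers]
keys=console

[formatters]
keys=simple

[logger_root]
level=INFO
handlers=console

[handler_console]
class=StreamHandler
level=INFO
formatter=simple
args=(sys.stdout,)

[formatter_simple]
format=%(asctime)s %(name)s %(levelname)s %(message)s
"""


def configure_logging(config_text=CONFIG):
    """Write the config to a temporary file and load it with fileConfig."""
    with tempfile.NamedTemporaryFile("w", suffix=".conf", delete=False) as handle:
        handle.write(config_text)
    try:
        logging.config.fileConfig(handle.name, disable_existing_loggers=False)
    finally:
        os.remove(handle.name)
    return logging.getLogger()
```

In a real project the config would live in its own file and be loaded by path. It is also worth checking that the path is resolved from the directory the script is actually run from, because a wrong path is a common source of `fileConfig` errors.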

  • @ChetanSharma-oy4ge
    @ChetanSharma-oy4ge 2 months ago

    How can I find this code? Is there a repo where you have uploaded it?

    • @DataSpark45
      @DataSpark45 2 months ago

      Sorry to say this, bro, but unfortunately we lost those files.

  • @prabhatgupta6415
    @prabhatgupta6415 1 year ago +1

    Sir, why have you not used Databricks for the transformation?

    • @DataSpark45
      @DataSpark45 1 year ago +2

      Hi, generally all application development is done in an IDE, and it is also easier to maintain a folder structure there. You can develop in Databricks, but it is mainly used for the analysis part.

    • @nandesh783
      @nandesh783 10 months ago +1

      @DataSpark45 But Databricks uses Spark internally, and isn't it also used in DEV, QA, and PROD? The current trend is also Databricks, right? Please correct me if my understanding is wrong!

    • @skateforlife3679
      @skateforlife3679 8 months ago

      @nandesh783 Any answers?

  • @SaadAhmed-js5ew
    @SaadAhmed-js5ew 4 months ago

    Where is your parquet file located?

    • @DataSpark45
      @DataSpark45 3 months ago

      Hi, are you talking about the source parquet file? It's under the source folder.

  • @aiviet5497
    @aiviet5497 5 months ago

    I can't download the dataset 😭.

    • @DataSpark45
      @DataSpark45 5 months ago +1

      Take a look at this:
      drive.google.com/drive/folders/1XMthOh9IVAScA8Lk-wfbBnKCEtmZ6UKF?usp=sharing

  • @pranaykumar581
    @pranaykumar581 3 months ago

    Can you provide me the source data file?

    • @DataSpark45
      @DataSpark45 3 months ago

      Hi, I provided the link in the description, bro.

  • @ritesh_ojha
    @ritesh_ojha 6 months ago

    AuthenticationFailed
    Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature. RequestId:ea8e17b4-701e-004d-1db1-573f6a000000 Time:2024-02-04T21:31:20.0816196Z
    Signature not valid in the specified time frame: Start [Tue, 22 Nov 2022 07:36:34 GMT] - Expiry [Wed, 22 Nov 2023 15:36:34 GMT] - Current [Sun, 04 Feb 2024 21:31:20 GMT]

    • @DataSpark45
      @DataSpark45 5 months ago +1

      Where did you get this error, bro?

    • @ritesh_ojha
      @ritesh_ojha 5 months ago

      @DataSpark45 While downloading the data. But I got the data from Part 2.
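
The error message in this thread already explains the failure: the request time falls outside the signature's validity window. A small sketch that checks this using only the three timestamps quoted in the error (the function name `is_within_window` is hypothetical):

```python
from email.utils import parsedate_to_datetime

# Timestamps copied from the error message quoted above.
start = parsedate_to_datetime("Tue, 22 Nov 2022 07:36:34 GMT")
expiry = parsedate_to_datetime("Wed, 22 Nov 2023 15:36:34 GMT")
current = parsedate_to_datetime("Sun, 04 Feb 2024 21:31:20 GMT")


def is_within_window(start, expiry, now):
    """Return True if `now` falls inside the signature's validity window."""
    return start <= now <= expiry


valid = is_within_window(start, expiry, current)
# Whole days elapsed between the expiry and the failed request.
days_past_expiry = (current - expiry).days
print(valid, days_past_expiry)
```

The request was made after the expiry time, so the download links in the description can no longer authenticate, which is consistent with the other comments asking for the files to be reshared.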