Keshav- Learn !T - Self

Videos

Merge multiple part files with ADF / Merge partition files / merge multiple files into a single file
614 views · 1 year ago
Merge multiple part files with ADF / Merge partition files / merge files with ADF / combine multiple files into a single file with ADF
Create Dynamic Dataframes in PySpark
1.4K views · 1 year ago
Create Dynamic Dataframes in PySpark
CONTENT-MD5 In Azure Storage Account | Content-MD5 in ADF Get Metadata Activity
834 views · 1 year ago
CONTENT-MD5 In Azure Storage Account | Content-MD5 in ADF Get Metadata Activity
RDD #3: All About Pair RDDs in Spark | Pair RDD | flatMap() Vs map()
356 views · 1 year ago
Pair RDD: a key-value pair RDD of (key, value) tuples. Create a pair RDD: 1. from a regular RDD; 2. from an in-memory collection (with parallelize directly). Pair RDD functions: reduceByKey, countByKey, countByValue. Scenarios: how many times is each word repeated in the given document? Total sales amount by product (by key)? Count by key? Count by value? flatMap() vs map(): flatMap: each e...
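A minimal runnable sketch of the word-count scenario with a pair RDD (the sample data is illustrative, not from the video):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    sc = spark.sparkContext

    # Build a pair RDD from an in-memory collection: flatMap splits each
    # line into words, then map turns each word into a (word, 1) pair.
    lines = sc.parallelize(["big data big value", "spark loves big data"])
    pairs = lines.flatMap(lambda line: line.split(" ")).map(lambda w: (w, 1))

    # reduceByKey sums the counts per key (word).
    print(pairs.reduceByKey(lambda a, b: a + b).collect())
    # e.g. [('big', 3), ('data', 2), ('value', 1), ...]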
RDD #2 : RDD Operations In PySpark | RDD Actions & Transformations
396 views · 1 year ago
As a continuation of the previous video on RDD concepts, in this video I covered the different operations supported by RDDs.
RDD #1 : RDD In PySpark | Different ways to create RDD in PySpark
336 views · 1 year ago
RDD In PySpark | Different ways to create RDD in PySpark
End to End Data Migration ETL Framework with Azure Data Factory & T-SQL
3.3K views · 1 year ago
End to End Data Migration ETL Framework with Azure Data Factory & T-SQL

    create table ETLMeta.SourceSystemMetaData
    (
        SNO int identity(1,1),
        sourceSystemId int,
        sourceType nvarchar(500),
        SourceFilePah nvarchar(500),
        SourceFileName nvarchar(500),
        sourceSystemName nvarchar(50),
        sourceSchema nvarchar(100),
        sourceTable nvarchar(100),
        targetSchema nvarchar(100),
        targetTable nvarchar(100)
    )
    go
    insert i...
Collect_Set Vs Collect_List | PySpark
1.2K views · 1 year ago
Collect_Set Vs Collect_List | PySpark
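A quick sketch of the difference (sample data is mine, not from the video): collect_set removes duplicates and ignores order, while collect_list keeps every value.

    import pyspark.sql.functions as F
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([("a", 1), ("a", 1), ("a", 2)], ["k", "v"])

    df.groupBy("k").agg(
        F.collect_set("v").alias("as_set"),    # [1, 2]    - duplicates removed
        F.collect_list("v").alias("as_list"),  # [1, 1, 2] - duplicates kept
    ).show()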
Create Dynamic DataFrame | Exec Method | Dynamic DataFrame
1.5K views · 2 years ago
Create Dynamic DataFrame | Exec Method | Dynamic DataFrame
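A tiny sketch of the exec() idea, assuming the goal is to create DataFrame variables whose names are built at runtime (the naming scheme here is my illustration, not necessarily the video's):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Create df_sales and df_orders dynamically at module scope.
    for name in ["sales", "orders"]:
        exec(f'df_{name} = spark.range(10)')

    print(df_sales.count(), df_orders.count())  # 10 10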
Convert DataFrame Columns Into Dictionary (Map) | create_map() | Columns to Dictionary
2.1K views · 2 years ago
Convert DataFrame Columns Into Dictionary (Map) | create_map() | Columns to Dictionary
Convert Dictionary Key Value Pairs to Multiple Fields in DataFrame | Key Value Pairs to Columns
2.4K views · 2 years ago
Convert Dictionary Key Value Pairs to Multiple Fields in DataFrame | Key Value Pairs to Columns In this video I showed how to export dictionary key-value pairs and convert the keys into multiple columns in a DataFrame.
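One common way to do this, as a hedged sketch (the column and key names are illustrative): pull each map key out into its own column with getItem().

    import pyspark.sql.functions as F
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([({"name": "abc", "city": "hyd"},)], ["props"])

    # Each dictionary key becomes a separate DataFrame column.
    df.select(
        F.col("props").getItem("name").alias("name"),
        F.col("props").getItem("city").alias("city"),
    ).show()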
Define DataType for Keys and Values in Python Dictionary | MapType()
220 views · 2 years ago
Define DataType for Keys and Values in Python Dictionary | MapType() In this video I spoke about MapType and showed how to define the structure and data types for dictionary key-value pairs.
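A minimal sketch of declaring key and value types explicitly with MapType (the field names are my assumption):

    from pyspark.sql import SparkSession
    from pyspark.sql.types import IntegerType, MapType, StringType, StructField, StructType

    spark = SparkSession.builder.getOrCreate()

    # MapType(keyType, valueType): string keys mapped to integer values.
    schema = StructType([
        StructField("scores", MapType(StringType(), IntegerType()), True)
    ])

    df = spark.createDataFrame([({"math": 90, "physics": 85},)], schema)
    df.printSchema()  # scores: map<string, int>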
Different ways to Create Dictionaries & Fetching Key Value Pairs from Dictionary
609 views · 2 years ago
Different ways to Create Dictionaries & Fetching Key Value Pairs from Dictionary
map() vs flatMap() In PySpark | PySpark
6K views · 2 years ago
In this video I showed the difference between map and flatMap in PySpark with an example. I hope it helps. Please have a look, and check my channel for more on PySpark, ADF, and other Azure concepts.
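For reference, a compact sketch of the difference (sample data is mine): map returns one output element per input element, while flatMap flattens the lists it produces.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    rdd = spark.sparkContext.parallelize(["a b", "c"])

    print(rdd.map(lambda s: s.split(" ")).collect())      # [['a', 'b'], ['c']]
    print(rdd.flatMap(lambda s: s.split(" ")).collect())  # ['a', 'b', 'c']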
Convert DataFrame Column values to List | PySpark
4K views · 2 years ago
Convert DataFrame Column values to List | PySpark
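One common approach, sketched under the assumption of a single-column select (the column name is illustrative):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(1,), (2,), (3,)], ["id"])

    # collect() brings the rows to the driver; the comprehension
    # extracts the column values into a plain Python list.
    ids = [row.id for row in df.select("id").collect()]
    print(ids)  # [1, 2, 3]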
Repartition Vs Coalesce
516 views · 2 years ago
Repartition Vs Coalesce
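A short sketch of the behavioural difference: repartition can increase or decrease the number of partitions and triggers a full shuffle, while coalesce only merges existing partitions downward without a full shuffle.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.range(100).repartition(8)

    print(df.rdd.getNumPartitions())                  # 8
    print(df.coalesce(2).rdd.getNumPartitions())      # 2  - merged, no full shuffle
    print(df.repartition(16).rdd.getNumPartitions())  # 16 - full shuffle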
Data Reconciliation | Counts, Measures, Values Reconciliation | Data Recon
4.3K views · 2 years ago
Data Reconciliation | Counts, Measures, Values Reconciliation | Data Recon
Map() Transformation in PySpark | PySpark | Lambda function
3.4K views · 2 years ago
Map() Transformation in PySpark | PySpark | Lambda function
#9 UDFs in PySpark | Convert Functions to UDF in PySpark | Register UDFs to Spark SQL
1.1K views · 2 years ago
#9 UDFs in PySpark | Convert Functions to UDF in PySpark | Register UDFs to Spark SQL
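A minimal sketch of both styles named in the title: wrapping a Python function as a column UDF and registering it for Spark SQL (function and view names are illustrative):

    import pyspark.sql.functions as F
    from pyspark.sql import SparkSession
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([("keshav",)], ["name"])

    def to_upper(s):
        return s.upper() if s else None

    # 1. Column-API UDF
    upper_udf = F.udf(to_upper, StringType())
    df.select(upper_udf("name").alias("upper_name")).show()

    # 2. Register the same function for use in Spark SQL
    spark.udf.register("to_upper_sql", to_upper, StringType())
    df.createOrReplaceTempView("people")
    spark.sql("select to_upper_sql(name) as upper_name from people").show()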
#5 Dynamic Joins in PySpark | Create Function
1.7K views · 2 years ago
#5 Dynamic Joins in PySpark | Create Function
#3 Joins in PySpark | Semi & Anti Joins | Join Data Frames in PySpark
2.6K views · 2 years ago
#3 Joins in PySpark | Semi & Anti Joins | Join Data Frames in PySpark
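A compact sketch of the two less common join types (the tables are illustrative): left_semi keeps left-side rows that have a match on the right, left_anti keeps left-side rows that do not.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    emp = spark.createDataFrame([(1, "a"), (2, "b"), (3, "c")], ["dept_id", "name"])
    dept = spark.createDataFrame([(1,), (2,)], ["dept_id"])

    emp.join(dept, "dept_id", "left_semi").show()  # dept_id 1 and 2 only
    emp.join(dept, "dept_id", "left_anti").show()  # dept_id 3 only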
#2 Different ways of creating Data Frame in PySpark | Data Frame in PySpark
2.8K views · 2 years ago
#2 Different ways of creating Data Frame in PySpark | Data Frame in PySpark
#1 Spark Architecture | PySpark | Azure Databricks | Spark Cluster | Cluster Nodes
2.9K views · 2 years ago
#1 Spark Architecture | PySpark | Azure Databricks | Spark Cluster | Cluster Nodes
Nested If in ADF | If Condition in ADF | Nested Condition
7K views · 2 years ago
Nested If in ADF | If Condition in ADF | Nested Condition
Access Secrets From KeyVault In ADF | Pass Connection Strings Securely to ADF through KeyVault
985 views · 3 years ago
Access Secrets From KeyVault In ADF | Pass Connection Strings Securely to ADF through KeyVault
Invoke Logic Apps in ADF | Trigger Logic App | Send Mail Notification With Logic Apps
3.2K views · 3 years ago
Invoke Logic Apps in ADF | Trigger Logic App | Send Mail Notification With Logic Apps
Custom Logging in ADF | Audit log For ADF | Dynamic table load with ADF
5K views · 3 years ago
Custom Logging in ADF | Audit log For ADF | Dynamic table load with ADF
Configure Azure Data Factory Pipeline | Drive ADF Pipeline with SQL Table| Load data dynamically
4.3K views · 3 years ago
Configure Azure Data Factory Pipeline | Drive ADF Pipeline with SQL Table| Load data dynamically
Loop through Folders & Files | Iterate through Folders | Execute one Pipeline in other pipeline
9K views · 3 years ago
Loop through Folders & Files | Iterate through Folders | Execute one Pipeline in other pipeline

COMMENTS

  • @kiranchavadi7881
    @kiranchavadi7881 17 days ago

    You are doing great, please do not stop making videos!

  • @kiranchavadi7881
    @kiranchavadi7881 17 days ago

    Very detailed and clear understanding!

  • @naren2146
    @naren2146 20 days ago

    I'm highly interested in taking the Azure course. Could you please share the online course details and let me know when the new batch is scheduled to begin?

  • @ceciliaayala3923
    @ceciliaayala3923 23 days ago

    In 2024 the "Post SQL Script" option is not available, but I solved that by setting the post-SQL in the "Pre SQL Script" of the next task, and at the end of the pipeline I added a Script object with only a "Pre SQL Script" for the last post-SQL. Thanks for the video!

  • @Ks-oj6tc
    @Ks-oj6tc 1 month ago

    Good session, thank you!

  • @pallaviak11
    @pallaviak11 3 months ago

    This is a very helpful video. I am new to ADF and wanted to implement a similar scenario; thanks for this.

  • @Lolfy23
    @Lolfy23 3 months ago

    voice is very low... not acceptable

  • @vru5696
    @vru5696 5 months ago

    Can you also create a video on copying files from a SharePoint location to ADLS? Thanks

  • @WolfmaninKannada
    @WolfmaninKannada 5 months ago

    Sir, thanks a lot for demonstrating how to perform IoT data streaming on Azure. It really helped me get an in-depth understanding of the service use cases. Please make more end-to-end videos on streaming data analytics.

  • @g.suresh430
    @g.suresh430 6 months ago

    Nice explanation. I want all the columns in both examples.

  • @PS65501
    @PS65501 7 months ago

    If I have 4-5 levels of subfolders, how will this work?

  • @GaneshNaik-lv6jh
    @GaneshNaik-lv6jh 7 months ago

    Good explanation sir, thank you.

  • @chandrashekar3649
    @chandrashekar3649 7 months ago

    You are not updating the playlist, sir... please look into it.

  • @AnandZanjal
    @AnandZanjal 7 months ago

    Great video on Azure! Really helpful and easy to follow. Thanks for sharing!

  • @Tsoy78
    @Tsoy78 7 months ago

    Thanks. I think this can be enhanced and shortened, as you could have a dynamic expression with "if-else" inside the actual "If" condition expression.

  • @rathnamaya6263
    @rathnamaya6263 7 months ago

    Thank you so much😊

  • @soumyabag5268
    @soumyabag5268 8 months ago

    Can I get a link to the dataset?

  • @mounicagvs9020
    @mounicagvs9020 8 months ago

    Keshav, can you share the link to the next video where you compared the contents of files using MD5?

  • @gauravpandey211088
    @gauravpandey211088 9 months ago

    Once you have this column as an RDD post-transformation, how do you add it back to the existing data frame as a new column?

    • @berglh
      @berglh 8 months ago

      If you want to do this in a Spark data frame and store the results, use the pyspark.sql.functions.split function to split the string by a delimiter; this will return an array column, much like a map. Then, to get the same sort of effect as flatMap inside an existing data frame, you can use the pyspark.sql.functions.explode function on the array column of split values:

          import pyspark.sql.functions as f
          df = df.withColumn("split_values", f.split(f.col("product_descriptions"), " "))
          df = df.withColumn("exploded", f.explode(f.col("split_values")))

      Keep in mind, it depends on what you're trying to do; map and flatMap are useful if you want to return the column from a data frame to then do other work in the programming language outside the context of Spark, say, getting a list in Python to iterate through using another Python library. If you want to retain the data in the data frame, you're usually better off using the built-in Spark functions on the data frame columns directly. In some cases these call map and flatMap internally on the RDD, but it typically results in less code for the same performance. There are circumstances where the map and flatMap methods can be slower in my experience; sticking to the Spark/PySpark built-in column functions is best.

      You can build a data frame from an RDD using RDD.toDF(), but you will need some kind of index value to join it back onto the source data frame in a meaningful way. Due to the way that Spark does partitioning between executors, there is no inherent order to the data, which would make joining an RDD back (at scale) pointless without a column to join on. So this goes back to the point that using the built-in functions avoids all this hassle.

  • @SriVanshi
    @SriVanshi 10 months ago

    How to do this in Scala Spark?

  • @handing2857
    @handing2857 10 months ago

    Very clearly explained

  • @spawar2443
    @spawar2443 11 months ago

    Too many ads

  • @vishwavihaari
    @vishwavihaari 11 months ago

    It worked for me. Thank you.

  • @CloudandTechie
    @CloudandTechie 11 months ago

    Great! Can you share the scripts and resources used in the project on GitHub or Dropbox if possible? Thanks again for this wonderful session.

  • @tejpatnala709
    @tejpatnala709 11 months ago

    I have a doubt: in case we need to connect to an on-prem SQL Server, do we need to have SSMS (connected to the SQL Server) installed on the same machine where ADF is present? Or will the linked service test connection pass without anything installed on the same machine, with the server on a different machine?

    • @Saikumarguturi
      @Saikumarguturi 6 months ago

      I think so; there's no need to install SSMS.

  • @dhp106
    @dhp106 1 year ago

    This video helped me SO MUCH thank you.

  • @muzaffar527
    @muzaffar527 1 year ago

    Where do we define the nested if, nested else, outer elif, and outer else?

    • @muzaffar527
      @muzaffar527 1 year ago

      I thought we were calling outer activities in these conditions, but they are simple SQL queries. Now I understand the concept. Great approach; it helped me in my pipeline. Thank you.👍🏼

  • @muzaffar527
    @muzaffar527 1 year ago

    I didn't understand how and where to write the queries (query1, query2, query3...). Could you please help?

  • @morriskeller
    @morriskeller 1 year ago

    Nice and clear, thank you!

  • @muzaffar527
    @muzaffar527 1 year ago

    As discussed in the video, is the real-time scenario recorded? Please share the link if it is; I didn't find it in your playlist. Thank you.

  • @sethuramalingam3i
    @sethuramalingam3i 1 year ago

    super bro

  • @s.ifreecoachinginstitute0077

    Nice video, bro. But I have a small doubt: after completion of the data mapping, which data is stored in Azure SQL DB?

  • @shankarshiva5587
    @shankarshiva5587 1 year ago

    I need to display data from a table that has a special character in Databricks, e.g. select first-name from tablename. It throws an error.

  • @Rmkreddy92
    @Rmkreddy92 1 year ago

    Nice explanation and collective information shared on each concept. Thank you very much, bro.

  • @kachipamarthy8254
    @kachipamarthy8254 1 year ago

    Hi Keshav, I would like to discuss the training with you. Could I have your email ID, please?

  • @hiteshlalwani5519
    @hiteshlalwani5519 1 year ago

    Hi Keshav, can we add a DATE table in the modelling portion which acts as a bridge table between two respective tables?

  • @SK-wp4tm
    @SK-wp4tm 1 year ago

    Hey, can you please share the PPT that you used in this video?

  • @azyamp
    @azyamp 1 year ago

    Thank you, it was very useful.

  • @MrTejasreddy
    @MrTejasreddy 1 year ago

    Hi bro, recently found your channel; great content! Can you please make a video on a Power BI report on Azure?

  • @88edits
    @88edits 1 year ago

    better than most lecturers

  • @funtimewithlekyasri9445
    @funtimewithlekyasri9445 1 year ago

    Nice presentation.

  • @abhinavrai6519
    @abhinavrai6519 1 year ago

    Thank you so much for this perfect explanation.

  • @jadhavsakshi5834
    @jadhavsakshi5834 1 year ago

    Hey, we can also do it this way, right? For example, if we want to insert data into multiple tables using different CSV files in a dynamic way.

  • @aravind5310
    @aravind5310 1 year ago

    Great efforts. If you had added Databricks, it would be so helpful.

  • @khandoor7228
    @khandoor7228 1 year ago

    awesome, nice piece of code!!

  • @mateen161
    @mateen161 1 year ago

    Nice one. Thank you!

  • @shaikshavalishaik4457
    @shaikshavalishaik4457 1 year ago

    Thank you Keshav for the excellent videos. Can you please do a video on reading data from an Azure SQL Database server and writing a CSV file to Blob storage? If it's already done, can you please share the link to the video?

  • @krishnachaitanyareddy2781

    Can you share these PPTs?