Azure Data Lake Storage (Gen 2) Tutorial | Best storage solution for big data analytics in Azure

  • Published 16 Oct 2024

COMMENTS • 252

  • @fernandos1790
    @fernandos1790 4 years ago +39

    I don't usually comment on YouTube, but Adam, I will make an exception for you. Your videos are easy to follow and educational, but most of all, straight to the point. The amount of time allocated to the video is perfect. Best wishes, and please continue making training videos.

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago +3

      Wow! Thank you, Fernando, for such heart-warming feedback. More videos are coming!

  • @ajeetdwivedi2294
    @ajeetdwivedi2294 2 years ago +3

    I don't know why, but I understand everything you teach without even repeating the video twice. It's so clear and to the point, especially the demo part; from the background to the practical, everything you present is just wow. God bless you !!

  • @sekhara
    @sekhara 1 month ago

    Adam you are the best! Like everything in your content (tone, pace, crisp, concise, no nonsense, straight to the point, good coverage, well thought through, list goes on...). I'm one of those who rarely post a public comment; here you go... doing it for you.

  • @fsfs5665
    @fsfs5665 4 years ago +9

    I have been watching a lot of Azure videos. This one is the best and I will study more of your catalog. Thanks!

  • @CrypticPulsar
    @CrypticPulsar 4 years ago +3

    I spent days scouring the web for documentation and I was left out cold, then I thought I'd give YouTube a try, and I found yours up at the top, and I couldn't be happier.. thanks so so sooo much Adam!! Keep up the excellent work.. very easy to follow, simple, rich content.. loved it!!!

  • @sushmamc8904
    @sushmamc8904 29 days ago

    This was crisp and you have covered everything that the beginner should know. Thanks a lot!

  • @shockey3084
    @shockey3084 4 years ago +5

    Each and every second you took is informative.
    Great learning from you.

  • @carl33p
    @carl33p 4 years ago +1

    Your demos are fantastic. Love that you go step by step and don't skip things. Much appreciated.

  • @shailendersingh7093
    @shailendersingh7093 2 years ago

    This is the first time I am commenting on anyone's YouTube videos, and I have seen thousands, maybe.
    "The best videos and so easy to understand." Keep it up, Adam

  • @adamolekunfemi6314
    @adamolekunfemi6314 3 years ago +1

    Excellent video. Taught with simplicity and clarity, without any noise.

  • @estebanrodriguez.11
    @estebanrodriguez.11 4 years ago +1

    This is excellent. I'm preparing for DP-200 and 201, and your videos have a lot of information concentrated, summarized and explained very simply. Thanks!

  • @刘振-t1u
    @刘振-t1u 2 years ago +1

    Thanks, Adam, for the wonderful video; it makes it really easy to understand ADLS!

  • @discovery-dx3ry
    @discovery-dx3ry 4 years ago +1

    Your videos are very easy to follow. Many thanks for your effort to create all the azure videos.

  • @tarvinder91
    @tarvinder91 2 years ago

    This is such a great tutorial, especially that you share all the differences from a normal storage account. It was extremely helpful for me.

  • @alexandrupopovici2366
    @alexandrupopovici2366 2 years ago +3

    Great video as per usual, your channel is my go-to for preparing for Azure Certification Exams!
    I do have a question regarding the ADLS Gen 2 hierarchy, as one of my friends is preparing for the DP-900 exam (regarding a practice question).
    The question asks you to match 2 of the following 3 terms ([Azure Storage Account], [File share], [Container]) in the following hierarchy (only one answer is allowed):
    Azure Resource Group - [TERM 1] - [TERM 2] - Folders - Files
    The suggested correct answer is:
    Azure Resource Group - [Azure Storage Account] - [File Share] - Folders - Files
    But I don't see any reason why the following answer would not also be correct (besides maybe because containers being called file systems in ADLS Gen 2?):
    Azure Resource Group - [Azure Storage Account] - [Container] - Folders - Files
    What is your take on this? Thanks for taking your time!

  • @javimaci4615
    @javimaci4615 3 years ago +1

    Adam, you are a rock star. Your videos are extremely well done. Thanks and keep up the good work!

  • @empowerpeoplebytech2347
    @empowerpeoplebytech2347 4 years ago +1

    Great explanation of many things together and also explaining the differences and linkage between ADL, ADB, PBI, etc. Thank you very much Adam for this one.

  • @sau002
    @sau002 2 years ago

    I came to this video not expecting to learn much. I was wrong. Very useful.

  • @saurabhrai8817
    @saurabhrai8817 4 years ago

    Hey Adam !! You are becoming one of the BEST AZURE AUTHORITIES / SME on the Internet. Keep up the good work. Thanks for sharing your knowledge in such a simple way. Kudos !!

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago

      Wow, thanks, that's a nice thing to say. I wouldn't say so since I'm just a trainer, but I love your enthusiasm and appreciation. Thank you kindly my friend :)

  • @anushamantrala5527
    @anushamantrala5527 4 years ago +1

    Your videos are really worth watching, Adam; really, thanks for the beautiful content 😁 Want many more videos from your side. Thanks in advance.

  • @alperakbash
    @alperakbash 3 years ago +1

    Unbelievable tutorial. Thank you so much for helping me to find everything I look for at one place.

    • @alperakbash
      @alperakbash 3 years ago

      excluding Power BI of course. I am a tableau fan =D

    • @AdamMarczakYT
      @AdamMarczakYT  3 years ago +1

      Awesome, thanks!! :D

    • @AdamMarczakYT
      @AdamMarczakYT  3 years ago +1

      Hehe, thanks alright, we all have our preferences :)

  • @vijayt3678
    @vijayt3678 4 years ago +1

    Wow such a clear and simple explanation about Data lakes. Absolutely awesome thank you Adam for your great efforts for the community... More power to you...👍👍

  • @mgvlogs5948
    @mgvlogs5948 4 years ago +1

    You make very simple and clearly explained videos. Well done, Adam!

  • @lg061870
    @lg061870 4 years ago +6

    this is insanely complete! wow.

  • @totanlj18
    @totanlj18 4 years ago +4

    I really, really thank you!! Your video made my week!

  • @Rana-zi4ht
    @Rana-zi4ht 2 years ago

    Hey Adam!
    That was a very informative and clear explanation of data lakes 👏. Thank you a lot

  • @emiliogarza6446
    @emiliogarza6446 1 year ago

    Your content is gold, thanks a lot for making these videos

  • @TechnoQuark
    @TechnoQuark 4 years ago

    Hi Adam... Bravo. Excellent work. I recently watched a few of your videos and they are absolutely fabulous... Thanks

  • @afzaalawan
    @afzaalawan 3 years ago +1

    What a great explanation with practicals.. you are a star

  • @yemshivakumar
    @yemshivakumar 4 years ago

    Never seen this kind of KT videos made public. Thanks Adam for the spoon-feeding video to improve Azure knowledge.

  • @bethuelthipe-moukangwe7786
    @bethuelthipe-moukangwe7786 1 year ago

    Thanks for the lesson; your videos are very helpful to me.

  • @SatyaPParida
    @SatyaPParida 4 years ago

    Fabulous tutorial. Wish to see more like these. Informative content ✌️. I'll be using this knowledge in my project. Much-needed video. Thanks

  • @Haribabu-zj4hd
    @Haribabu-zj4hd 3 years ago +1

    Very nice video; explained the concept clearly. Thank you so much, Mr. Adam 🙏.

  • @allanramos5721
    @allanramos5721 3 years ago +1

    Thanks for the contribution, Adam!

  • @max_frame
    @max_frame 4 years ago +1

    Excellent video, extremely clear and concise. Thank you!

  • @RohitJadhav-ik8gt
    @RohitJadhav-ik8gt 3 years ago +1

    You are fantastic !!! Thanks for sharing valuable content.

  • @GuilhermeMendesG
    @GuilhermeMendesG 2 years ago

    What a great video. Thanks Adam!

  • @noahmcaulay4420
    @noahmcaulay4420 4 years ago +2

    Thank you! Extremely helpful video, and very informative. :)

  • @vzntoup
    @vzntoup 5 months ago

    This tut was a blast! Thank you

  • @shantanudeshmukh4390
    @shantanudeshmukh4390 4 years ago +1

    You are amazing, Adam !! How can one know all these things: Azure, Power BI, ADF, Data Lake. You are a genius. Thanks for sharing your knowledge!

  • @ANAND237
    @ANAND237 1 year ago

    Great demo. Thank you Adam

  • @fivesontheinside
    @fivesontheinside 4 years ago

    Your videos are wonderful, sir! would love to see an in-depth one on Azure Monitor, perhaps how services such as these (storage/blob/data lake) can tie into it. I find the variety of monitoring options a bit overwhelming without knowing which are worthwhile. Have a great day! (please let me know if I just missed an Azure Monitor video somewhere in here)

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago

      Great suggestion, I surely have Monitor on the list! Thanks for watching :)

  • @sinyali8370
    @sinyali8370 3 years ago

    Very good and comprehensive tutorial, thank you!

  • @seagoat666
    @seagoat666 1 year ago

    Amazing Demo!! Many Thanks!!

  • @mallikarjunap7302
    @mallikarjunap7302 4 years ago

    It's an excellent video on connecting ADLS to Databricks and Power BI

  • @frclasso
    @frclasso 3 years ago +1

    Very good tutorial, very helpful. Thank you.

  • @niluparupasinghe7307
    @niluparupasinghe7307 3 years ago

    Excellent and very practical tutorial, thank you...

  • @GiovanniOrlandoi7
    @GiovanniOrlandoi7 2 years ago

    Great video. Thanks Adam!

  • @icici321
    @icici321 4 years ago

    Great Video. Your explanation is very nice and easy to understand. Thanks very much.

  • @janvi_gupta_group
    @janvi_gupta_group 4 years ago +1

    Awesome demonstration of how to create and connect ADLS and run Scala code with Databricks

  • @michalhutny7356
    @michalhutny7356 4 years ago +2

    Great work, as always!

  • @zulumedia6374
    @zulumedia6374 11 months ago

    Fantastic and useful video. Thanks!

  • @AdamMarczakYT
    @AdamMarczakYT  4 years ago +1

    Please note that since the release of the video there have been some changes made to the service. For instance, an immutable storage feature is now in preview for ADLS :) azure.microsoft.com/fr-ca/updates/immutable-storage-for-azure-data-lake-storage-now-in-public-preview/?WT.mc_id=AZ-MVP-5003556

    • @avnish.dixit_
      @avnish.dixit_ 4 years ago +1

      Fabulous work. Just one thing: always try to make your videos from a production point of view.
      And it would be great if you uploaded a few new videos on "Mapping Data Flows", on Delta Lake, and on Databricks features such as mounting, caching and streaming operations

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago

      I'm working on improving my workflow. By the end of 2020 I want to have a new streaming PC with a better setup, which will allow me to create videos more freely and reduce the time required to make them. When this happens I will be able to make more videos faster, and MDF in ADF is surely a big topic of interest to me. :) Thanks for tuning in!

    • @sammail96
      @sammail96 3 months ago

      @@AdamMarczakYT Bro, why did you stop making videos?

  • @clapton79
    @clapton79 3 years ago +2

    For Databricks mounting: please note that the Azure portal as of today will copy the secret ID, not the secret value itself, if you hit copy at the end of the line just as Adam does. Copying the secret value seems possible only immediately after creating the secret, via the copy button that appears right next to it. Took me some time to figure this out..

    • @AdamMarczakYT
      @AdamMarczakYT  3 years ago

      Hehe! Good catch! Microsoft updated the UI and new keys now have two columns, Value and ID. Both have a copy button. Just make sure to use the copy button in the Value column :) Thanks!
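
The mount discussed in this thread can be sketched like so in a Databricks Scala notebook. All names (secret scope, key names, account, file system, mount point) are placeholders, and the client secret stored in the scope must be the secret's *Value*, not its ID:

```scala
// Sketch: mount ADLS Gen2 with a service principal (placeholder names).
val configs = Map(
  "fs.azure.account.auth.type" -> "OAuth",
  "fs.azure.account.oauth.provider.type" ->
    "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
  "fs.azure.account.oauth2.client.id" -> "<application-client-id>",
  // Must be the secret's Value column, not the secret ID.
  "fs.azure.account.oauth2.client.secret" ->
    dbutils.secrets.get(scope = "my-scope", key = "my-sp-secret"),
  "fs.azure.account.oauth2.client.endpoint" ->
    "https://login.microsoftonline.com/<tenant-id>/oauth2/token"
)

dbutils.fs.mount(
  source = "abfss://myfilesystem@mystorageaccount.dfs.core.windows.net/",
  mountPoint = "/mnt/datalake",
  extraConfigs = configs
)
```

This only runs inside a Databricks notebook (it relies on `dbutils`), so treat it as a template rather than standalone code.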

  • @pratimaprabhu2029
    @pratimaprabhu2029 3 years ago

    Very well explained 👍🏻

  • @enavea
    @enavea 3 years ago

    An interesting contextual overview of how to manage the data lake we have today, and how we can transform that data into information and prepare it for artificial intelligence on top of it.

  • @jurges8544
    @jurges8544 2 years ago

    Hi Adam, thank you for the video, it was great. I have just one question about Hadoop-compatible access: does this mean that it can be connected to Hadoop, or that it uses Hadoop every time there is some action inside the data lake? Thanks a lot once again

  • @constantini82
    @constantini82 2 years ago

    Amazing tutorial, you explain so well, thanks

  • @selwynalexander9750
    @selwynalexander9750 4 years ago

    Super, Adam!... Good for analytical use cases

  • @dianpriyambudi
    @dianpriyambudi 4 years ago +1

    Love it, many thanks Adam!

  • @ahsanijaz6318
    @ahsanijaz6318 3 years ago +1

    great video... can you please also make a video on how we can move Microsoft Navision data to the data lake

    • @AdamMarczakYT
      @AdamMarczakYT  3 years ago +1

      Thanks! Unfortunately Microsoft Dynamics is not my field of expertise. Nav is a pretty old system too, so it's hard to find any useful examples :(

  • @GG-uz8us
    @GG-uz8us 4 years ago +1

    A quick correction: you should use select.write.csv(...) at 23:20; otherwise you would write all columns from the original csv to the new csv file.

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago +2

      Ah a good eye indeed. Coincidentally I noticed this as well yesterday as I was conducting training on this very topic. Cheers 😀
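
The correction above boils down to writing the projected DataFrame instead of the original one. A Scala sketch, with hypothetical column names and paths:

```scala
// Read the source CSV from the mounted lake (placeholder path).
val df = spark.read
  .option("header", "true")
  .csv("/mnt/datalake/input/data.csv")

// Project down to the columns of interest first; writing `df` directly
// would emit every original column to the output.
val selected = df.select("Country", "Population")

selected.write
  .option("header", "true")
  .csv("/mnt/datalake/output/")
```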

  • @gurubazi
    @gurubazi 4 years ago

    I'm really happy with your explanations and presentations.. they help me a lot

  • @MohMOh-kv9gg
    @MohMOh-kv9gg 1 year ago

    Hi Adam! Great video! One question: I have created issue-reporting, inspection and ideas apps in one of my teams in Microsoft Teams. How do I export that data into Azure Data Lake?

  • @drScorp1on
    @drScorp1on 1 year ago

    Great video, just subbed.

  • @sanniepatron8260
    @sanniepatron8260 4 years ago +1

    thank you for the videos! I am starting with Databricks and it's super clear! Do you have some videos on Delta Lake in Databricks, like merge operations? It would be awesome to learn more about it!

  • @seb6302
    @seb6302 4 years ago

    Love your videos Adam!

  • @francisjohn6638
    @francisjohn6638 1 year ago

    Really awesome !

  • @justair07
    @justair07 4 years ago +1

    When creating containers, how do I know exactly how many containers I should make? For example, if I'm creating 5 apps that are completely independent of each other and the apps save pictures that the users take to the storage account, should I have 5 containers (1 for each app)? Or 1 container to support all apps?

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago

      Hey Justin. There aren't any specific limits scoped to containers, so this is a design decision. There aren't any specific guidelines either, so you should match what feels right for your organization and use case. But there are storage-account-level limits, so those could be a deciding factor between one and many storage accounts; check those out here: docs.microsoft.com/en-us/azure/azure-resource-manager/management/azure-subscription-service-limits?WT.mc_id=AZ-MVP-5003556

  • @manishalankala1622
    @manishalankala1622 3 years ago +1

    Well Explained

  • @ashseth7885
    @ashseth7885 3 years ago

    Hi Adam, thanks for your valuable time creating this video. I faced a problem while performing the Add Role Assignment step: I saw that Azure has removed Azure AD from the "Assign access to" drop-down list. Please suggest any other approaches to mount the data lake. Appreciate your efforts. Thanks

  • @sammail96
    @sammail96 3 months ago

    4:52 ADLS Gen2 supports soft delete for blobs. When enabled, deleted blobs are retained for a specified period before permanent deletion.
    However, soft delete for containers is not supported during the upgrade process.

  • @SS-eu4eb
    @SS-eu4eb 4 years ago

    Very clear explanation. Thanks!

  • @nmhoang310
    @nmhoang310 4 years ago

    Good tutorial. Easy to understand.

  • @venkatx5
    @venkatx5 4 years ago

    Excellent Adam!

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago

      Thanks as always, you are very active :) nice to see that.

    • @venkatx5
      @venkatx5 4 years ago

      @@AdamMarczakYT Azure is interesting + your videos are great, as they have clear explanations with demos.

  • @MrLenzi1983
    @MrLenzi1983 3 years ago +1

    Adam, your tutorials are amazing! Is it possible to copy metadata and files from SharePoint and ingest them into the data lake using ADF?

    • @AdamMarczakYT
      @AdamMarczakYT  3 years ago +1

      It should be possible using the REST API, but I would advise against it. This is what Logic Apps were created for. Thanks for watching!

  • @prashanthkommana7105
    @prashanthkommana7105 4 years ago +1

    Wonderful demo. Can you please give us a demo on Data Factory and APIs as well?

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago

      Thanks. Actually I already have a few Data Factory videos (4) using Blob Storage and SQL, but Blob and ADLS are so similar that if you watched those and changed the connector to ADLS you wouldn't notice the difference. For the API, what do you mean?

    • @prashanthkommana7105
      @prashanthkommana7105 4 years ago

      @@AdamMarczakYT Hello friend. Something on PaaS services. Also plz plz plz plz give a full demo on Azure Site Recovery, migrating on-prem infra to Azure. Please.

  • @yuliyacher67
    @yuliyacher67 3 years ago +1

    Thank you for the information.

  • @mohitjoshi1361
    @mohitjoshi1361 2 years ago +1

    At 10:30, I understand read and write, but how does execute work here? What is the execute permission in ADLS?

  • @pdsqsql1493
    @pdsqsql1493 2 years ago

    Fantastic video

  • @hotkissfuck101
    @hotkissfuck101 4 years ago +1

    can you make a video on service principals? with a sample demo

  • @shantanuchakraborty4266
    @shantanuchakraborty4266 4 years ago

    very good for beginners.. thanks to you.

  • @labelledamesansmerci8607
    @labelledamesansmerci8607 4 years ago +2

    Can you make more videos on Azure Databricks: calling multiple notebooks, making RDBMS calls, logging, etc.?

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago +1

      I want to make a series on Databricks in the future. :)

  • @terryliu3635
    @terryliu3635 4 years ago

    Good hands-on intro, thanks!

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago

      Thank you! :) Glad you enjoyed it Terry.

  • @ObjectDesigner
    @ObjectDesigner 3 years ago

    Hey Adam, do you have input on the 3 questions below, on the ADLS + CDM topic? Best, Daniel
    Question #1: am I right that CDM does not have restrictions on how the CDM folders in ADLS should be organized (folders, subfolders, data grouping, data isolation)? As I understand it, it can have any folder structure as long as the manifest / cdm / model JSONs are placed in it. I saw examples on docs.microsoft, but can a different logic be implemented as well?
    Question #2: access (read / write ...) can be defined using Azure Active Directory and Access Control Lists; there is no additional feature if a folder is a CDM folder, so there is no difference in the possibilities whether an ADLS folder is a CDM folder or not?
    Question #3: is it possible to automate the entity.cdm.json file generation?

  • @puttarinkesh8535
    @puttarinkesh8535 3 years ago

    Thank you, very nice video

  • @shawndeggans
    @shawndeggans 4 years ago

    It looks like we don't necessarily need to use Databricks, because ADF now supports "Data Flows," which are a kind of no-code data transformation process. What are your thoughts on that? Is ADF a good substitute for Databricks (it's actually using Databricks under the hood) for more advanced data transformation jobs?

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago +1

      Hey Shawn, very good question. Since Microsoft removed the Data Flow step which allowed you to input your own script blocks, I'd say that for advanced scenarios I would use Databricks, since I would want full control. Microsoft also removed the ability to provide your own linked service for Data Flow, which in turn means that if you want to connect to data sources within your private networks, public integration runtimes will not be able to connect (they can, however, connect to firewall/VNet-protected resources using managed identity), nor will you be able to add custom libraries to your Data Flow (again, you don't own the cluster, so you can't control this), again narrowing down some scenarios. Net net, my opinion is that the general direction of Data Flow is simple cloud transformation scenarios at this point in time.

  • @agupta51
    @agupta51 4 years ago

    Excellent presentation.

  • @rohitkarnatakapu4760
    @rohitkarnatakapu4760 4 years ago

    Really nice and informative video. Can you give me some context on metadata storage as well? Like, if I store 1 TB of data in my ADLS, how much metadata will be generated and stored? I am looking more at cost, as Blob Storage doesn't charge you for metadata. Looking forward to your reply.

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago

      It shouldn't be much, unless you use Delta Tables, which contain the entire history of changes for your tables. Thanks for watching.

  • @GG-uz8us
    @GG-uz8us 4 years ago

    Very good introduction video, thank you. A quick question: why use an access key to mount Azure Blob Storage, but a service principal to mount the data lake? How do I use an access key to mount the data lake?

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago +1

      Good question! There is pretty much no difference, I just wanted to show both approaches, service principal is recommended but access key will work too.
      Check out how to use access key for data lake in the docs: docs.databricks.com/data/data-sources/azure/azure-datalake-gen2.html
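
Following that doc, the access-key route can skip mounting entirely and set the key in the Spark session configuration. A sketch with placeholder names (secret scope, account, file system are all assumptions):

```scala
// Sketch: read from ADLS Gen2 using the storage account access key
// instead of a service principal (placeholder names).
spark.conf.set(
  "fs.azure.account.key.mystorageaccount.dfs.core.windows.net",
  dbutils.secrets.get(scope = "my-scope", key = "storage-account-key")
)

val df = spark.read
  .option("header", "true")
  .csv("abfss://myfilesystem@mystorageaccount.dfs.core.windows.net/data.csv")
```

Note the access key grants full account access, which is one reason the service principal approach is recommended.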

    • @GG-uz8us
      @GG-uz8us 4 years ago +1

      @@AdamMarczakYT Thank you. Another favor: will you be able to give a quick demo of Databricks? My impression of Databricks is that it's all about in-memory processing, a good candidate for data streaming. Do you have a demo going from Event Hubs or Service Bus to Databricks?

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago

      Check this out: docs.microsoft.com/en-us/azure/azure-databricks/databricks-stream-from-eventhubs
      I only have an intro video on Databricks as of now.

  • @suchintyapaul
    @suchintyapaul 4 years ago

    Great video. One query: when you wrote back to the lake at the end of the demo, it was in partitions. How can we write back to a single file without partitions?

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago

      Good question! Since Databricks is based on Spark, and Spark uses the Hadoop file system, it is normal behavior for files to be split into partitions. You can force a single partition by using the repartition or coalesce function with a parameter value of 1. If you want to skip the entire folder with all the Hadoop part files, you can google for some Scala/Python scripts to do it. In general, partitioning is good practice, so merging is not recommended for bigger files, as they will need to be loaded into memory, which will cause issues with bigger data sets. Most other technologies, like SQL DW (Synapse) and Data Factory, are able to read from partitioned data sets just fine.

    • @sekhar8994
      @sekhar8994 4 years ago +1

      Adding to Adam's answer: you can write into a single partition using coalesce/repartition, and then, using os.path, delete the files that don't match a *.csv/*.parquet pattern and rename the remaining file.
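
The coalesce approach from this thread, as a minimal Scala sketch (the mount path and `df` are assumed from the demo):

```scala
// Collapse the DataFrame to a single partition before writing, so the
// output folder contains one part file instead of many. Spark will still
// write Hadoop bookkeeping files (e.g. _SUCCESS) alongside it.
df.coalesce(1)
  .write
  .option("header", "true")
  .csv("/mnt/datalake/output/single/")
```

`coalesce(1)` avoids a full shuffle but forces the whole dataset through one task, so this is only sensible for small outputs.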

  • @CoopmanGreg
    @CoopmanGreg 4 years ago

    Excellent. Thanks!

  • @aniketsamant455
    @aniketsamant455 4 years ago

    Nice explanation... I have one question: if I upload a file into a folder created in Data Lake Gen2, will that file follow a hierarchical file system or a flat namespace?

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago

      Thanks. Everything in ADLS is handled under hierarchical structure.

  • @satori8626
    @satori8626 1 year ago

    If I want to use blob storage to store some files at low cost, and data lake storage to store other files in a structured directory, do I need to create two separate storage accounts?

  • @RaviKumar-op8gb
    @RaviKumar-op8gb 2 years ago +1

    Is it possible to attach two data lakes to a single instance of Databricks?

    • @AdamMarczakYT
      @AdamMarczakYT  2 years ago

      You can attach as many as you like. :)

  • @prakash4190
    @prakash4190 3 years ago

    Thanks for the video. I have two ADLS instances, dev and prod. Data is sourced from various systems into prod and then migrated to the dev instance as well. Is there any service or tool available to compare all the folders and files between these two containers on dev and prod?

    • @AdamMarczakYT
      @AdamMarczakYT  3 years ago

      Not that I know of. You would probably need to write a PowerShell script yourself for this and compare their MD5 hashes.

  • @DavidOkeyode
    @DavidOkeyode 4 years ago

    Awesome!

  • @GAMER-zz4cc
    @GAMER-zz4cc 1 year ago

    Hi Adam,
    please provide an ADF series from basic to advanced level; it would be helpful for me

  • @simple-security
    @simple-security 4 years ago +1

    Can you use data lakes with Azure Sentinel?
    I see other SIEMs boasting about their data lake backends..

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago

      Can you explain what kind of scenario we are talking about? I haven't had exposure to many SIEM systems.

    • @simple-security
      @simple-security 4 years ago

      Another way to put it is: can you use an Azure analytics workspace to connect to a data lake and search it using the Kusto query language, or are these two different animals?

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago

      Don't think that's possible. Here is the list of supported data sources for Sentinel: docs.microsoft.com/en-us/azure/sentinel/connect-data-sources

  • @fsfehico
    @fsfehico 4 years ago

    Hey Adam, that's a great demo. I want to know how to programmatically put files in folders based on the date of the file, if I have a year-->month-->day subdirectory structure, and then use a search pattern to only choose files of a particular month of a year when processing data within the data lake. Any ideas how?

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago

      Hey, thanks. What technology? Scala in Databricks?

    • @fsfehico
      @fsfehico 4 years ago

      @@AdamMarczakYT I'm able to do that with ADF, but I guess it wouldn't be bad if you had a way of doing that in Scala also.

    • @AdamMarczakYT
      @AdamMarczakYT  4 years ago

      With Databricks you can simply run something like
      val year = 2019
      val month = 12
      df.write.json(s"/mnt/data/$year/$month/myfile.json")
      and to get the files
      val files = dbutils.fs.ls(s"/mnt/data/$year/$month")
      note I wrote this on my phone so there might be typos, but you get the general principle

    • @fsfehico
      @fsfehico 4 years ago

      @@AdamMarczakYT Right on. Thanks Adam!!

    • @sekhar8994
      @sekhar8994 4 years ago

      When you say the date of the file, do you mean the last modified date of the file, or a date column in the file? If it's the last modified date, you can use the input_file_name() function to capture the source file in a new column of the dataframe, then derive year, month & day as new columns, and finally, when you write back, just use partitionBy() with year, month & day.