Azure Databricks Monitoring with Log Analytics

  • Published Sep 5, 2024

COMMENTS • 59

  • @KamranAli-yj9de
    @KamranAli-yj9de 4 months ago

    Hey Dustin,
    Thanks for the tutorial! I've successfully integrated the init script and have been receiving logs. However, I'm finding it challenging to identify the most useful logs and create meaningful dashboards. Could you create a video tutorial focused on identifying the most valuable logs and demonstrating how to build dashboards from them? I think this would be incredibly helpful for me and others navigating the data. Looking forward to your insights!

    • @DustinVannoy
      @DustinVannoy  4 months ago

      This is what I have, plus the related blog posts: ua-cam.com/video/92oJ20XeQso/v-deo.htmlsi=OS-WZ_QrL-_kkwWu
      We mostly used our custom logs for driving dashboards, but we also evaluated some of the heap memory metrics regularly.
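
      For anyone exploring this, here is a minimal sketch of pulling heap memory metrics out of Log Analytics for a dashboard, using the azure-monitor-query Python package; the workspace id placeholder and the exact name_s metric value are assumptions, so inspect your own SparkMetric_CL table for the names the library actually emits:

        from datetime import timedelta
        from azure.identity import DefaultAzureCredential
        from azure.monitor.query import LogsQueryClient

        client = LogsQueryClient(DefaultAzureCredential())

        # KQL sketch: average JVM heap usage per 5-minute bin from SparkMetric_CL.
        # "heap.used" is an assumed example filter; check your table for the
        # exact metric names before building a dashboard on top of it.
        query = """
        SparkMetric_CL
        | where name_s contains "heap.used"
        | summarize avg(value_d) by bin(TimeGenerated, 5m)
        """

        result = client.query_workspace(
            workspace_id="<log-analytics-workspace-id>",  # placeholder
            query=query,
            timespan=timedelta(days=1),
        )
        for table in result.tables:
            for row in table.rows:
                print(row)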

    • @KamranAli-yj9de
      @KamranAli-yj9de 4 months ago

      @@DustinVannoy Thank you. It means a lot :)

  • @atanuchatterjee7470
    @atanuchatterjee7470 1 year ago

    Hi Dustin, wonderful explanation, very well presented. I have one small question: can I use PySpark to capture logs with Log Analytics? The example you gave at the 10:49 mark is not working.

  • @venkatapavankumarreddyra-qx2sc
    @venkatapavankumarreddyra-qx2sc 9 months ago

    Hi Dustin. How can I implement the same thing using Scala? I tried, but the same solution is not working for me. Any advice?

  • @xinosistemas
    @xinosistemas 8 months ago

    Hi Dustin, great content. Quick question: where can I find the library for Runtime v14?

    • @DustinVannoy
      @DustinVannoy  7 months ago

      Check out this video and the related blog for the latest tested versions. It may work with 14 as well, but it has only been tested with LTS runtimes.
      ua-cam.com/video/CVzGWWSGWGg/v-deo.html

  • @annas6516
    @annas6516 1 year ago

    Hi, that's very helpful, thanks! I am building a new jar file that was recently updated for Runtime 11. The init script provided in the GitHub repo does not have the regex filter (export LA_SPARKLOGGINGEVENT_NAME_REGEX) in it. What is this filter for? I am currently not getting my logs in Log Analytics even though the cluster is running. I wonder if that's because this filter is not set.

  • @ronatienza3747
    @ronatienza3747 2 years ago

    Hello Sir Dustin, I just want to ask why my cluster doesn't run; the error says spark error: driver down. I'm using spark-listeners-loganalytics_3.2.0_2.12-1.0.0.jar and my cluster version is Scala 2.12 and Spark 3.2.0. Hope you can help me, thank you.

  • @film-masti-777
    @film-masti-777 1 year ago

    In Log Analytics workspaces there are some custom tables (e.g., SparkLoggingEvent_CL). Did you create those manually in the Log Analytics workspace after bringing log data into the workspace, or are they there by default?

    • @DustinVannoy
      @DustinVannoy  1 year ago +1

      Those should get created once the monitoring library is initialized and sending data. I've seen a delay of several minutes when a cluster writes to them for the first time.

  • @mathieu1917
    @mathieu1917 2 years ago +1

    Thank you very much for the video. Do we need to create the custom logs first in Log Analytics?

    • @DustinVannoy
      @DustinVannoy  2 years ago +1

      You just need to create the Log Analytics workspace and set the correct variables (in the video I put them in as secrets, then referenced them in the advanced cluster configuration). When the cluster initializes with the Spark monitoring init script, it will create the custom logs.
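
      For reference, a minimal sketch of that cluster setup expressed through the Databricks Clusters REST API; the secret scope name "monitoring", the secret key names, and the workspace URL and token placeholders are assumptions, while the env var names are the ones the spark-monitoring init script reads:

        # Sketch: create a cluster with the spark-monitoring init script attached
        # and the Log Analytics env vars resolved from Databricks secrets at start.
        import requests

        payload = {
            "cluster_name": "spark-monitoring-demo",
            "spark_version": "10.4.x-scala2.12",
            "node_type_id": "Standard_DS3_v2",
            "num_workers": 2,
            "spark_env_vars": {
                # {{secrets/...}} references are resolved by Databricks, so the
                # workspace id/key never appear in plain text in the cluster config.
                "LOG_ANALYTICS_WORKSPACE_ID": "{{secrets/monitoring/la-workspace-id}}",
                "LOG_ANALYTICS_WORKSPACE_KEY": "{{secrets/monitoring/la-workspace-key}}",
            },
            "init_scripts": [
                {"dbfs": {"destination": "dbfs:/databricks/spark-monitoring/spark-monitoring.sh"}}
            ],
        }

        resp = requests.post(
            "https://<workspace-url>/api/2.0/clusters/create",  # placeholder URL
            headers={"Authorization": "Bearer <personal-access-token>"},  # placeholder
            json=payload,
        )
        resp.raise_for_status()
        print(resp.json()["cluster_id"])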

  • @film-masti-777
    @film-masti-777 1 year ago

    I see that for DBR > 11 this method won't work (due to restrictions on the Spark side). Any suggestions on alternatives we can use for DBR >= 12 to bring Log4j output, Databricks cluster event logs, and job and notebook logs into a Log Analytics workspace?

    • @DustinVannoy
      @DustinVannoy  1 year ago

      Use this version, contributed by some of the Databricks folks, to get the same capabilities: github.com/mspnp/spark-monitoring/tree/l4jv2

  • @mohamedelghazal1036
    @mohamedelghazal1036 1 year ago

    Thank you Dustin for the video.
    I followed the tutorial and got results in Log Analytics for SparkLoggingEvent_CL and SparkMetric_CL. However, I got an error regarding SparkListenerEvent.
    Here is the issue:
    Data of type SparkListenerEvent was dropped because number of fields * is above the limit of 500 custom fields per data type.
    Any advice, please.
    Thank you

  • @user-ni1ms6nx4j
    @user-ni1ms6nx4j 1 year ago

    Hi Dustin, excellent content and delivery. We built the Microsoft PnP library for the v13 runtime into jars and added them to clusters. It works well for single user clusters, but doesn't work with shared or SQL clusters. Do you have any advice?

    • @DustinVannoy
      @DustinVannoy  9 months ago

      Shared clusters now allow init scripts and it can be configured, but you can't use the LogManager.getLogger call to add your own custom logging like I show for single user (assigned) access mode. I am close to having a README and video for newer runtimes ready.
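
      For context, a minimal PySpark sketch of that LogManager.getLogger pattern as it works in single user access mode; the logger name and messages are invented examples:

        # Grab the JVM's log4j LogManager through the py4j gateway; with the
        # spark-monitoring library attached, these records flow to the
        # SparkLoggingEvent_CL table in Log Analytics.
        from pyspark.sql import SparkSession

        spark = SparkSession.builder.getOrCreate()  # already defined in Databricks notebooks
        logger = spark._jvm.org.apache.log4j.LogManager.getLogger("com.example.pipeline")
        logger.info("pipeline step completed")
        logger.warn("row count lower than expected")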

  • @yatharthkaushikk
    @yatharthkaushikk 1 year ago

    Hey Dustin, thanks for the video.
    I am facing an init script failure after creating the environment variables and the shell script in the init script path. Could you help me understand the possible reasons for that?

  • @sauravsanthosh7270
    @sauravsanthosh7270 2 years ago

    Hey Dustin, is there any documentation or guide (in the GitHub repo or otherwise) where I can make sense of the column names and name_s values? I can't seem to find any :(

  • @sagnikachakraborty5887
    @sagnikachakraborty5887 2 years ago

    Hi. A few questions, but these are related to Azure Data Studio and not Databricks.
    1. How do we log PySpark scripts in Azure Data Studio? Do we need to download any jars, or can we get the template, modify it, and pass it to spark-submit?
    2. Where can one find the Spark home directory in Data Studio?

  • @antony_micheal
    @antony_micheal 8 months ago

    Hi Dustin, how can we send stderr logs to Azure Monitor?

    • @DustinVannoy
      @DustinVannoy  6 months ago

      I'm not sure of a way to do this, and I haven't put much time into it. I do not believe the library used in this video can do that, but if you figure out how to get it to also write to log4j, then it will go to Azure Monitor / Log Analytics with the approach shown.

  • @ew3995
    @ew3995 1 year ago

    Can this be Terraformed?
    I believe it can; happy to share the process if necessary.

    • @DustinVannoy
      @DustinVannoy  1 year ago

      I haven't done much with using Terraform for cluster creation. Getting the right init script and uploading the correct jars is the important part. If you create an example that involves Terraform, please let me know.

  • @ravikumarkumashi7065
    @ravikumarkumashi7065 11 months ago

    Great video, thank you very much... We tried to set up monitoring for 12.2 using the log4j2 version of the jar, and we see important fields like message, level, etc. coming in blank. Do you have it working for 11.3 LTS and above?

    • @DustinVannoy
      @DustinVannoy  11 months ago

      I plan to release a video at some point showing 11.3+. Some of the field names may have changed, so take a look at everything coming through, but you are probably experiencing something I haven't seen. Make sure to use this branch for 11.3+: github.com/mspnp/spark-monitoring/tree/l4jv2

    • @ravikumarkumashi7065
      @ravikumarkumashi7065 11 months ago

      That would be really great... I am waiting for your video on 11.3+, and yes, I am using the same branch.

  • @geisonlourenco8750
    @geisonlourenco8750 2 years ago

    Hello. Is it possible to configure the init script as a global init script? When I try to put the same text from spark-monitoring.sh into a global script, my cluster terminates. I already tested it as a cluster-scoped init script and it works fine.

    • @DustinVannoy
      @DustinVannoy  2 years ago

      I moved away from global init scripts, so I'm not sure. I had too many issues with global init scripts doing things I had not planned on new clusters. Are you sure you have the right Spark version and environment variables set?

  • @Shivamsinghtv
    @Shivamsinghtv 1 year ago

    This solution sends the log4j logs to Log Analytics. Is there a way to capture standard output as well?

    • @DustinVannoy
      @DustinVannoy  1 year ago

      I have not seen a way to send standard output from Databricks to Log Analytics. I tried briefly but decided it wasn't important for my project. One thing I don't like about that idea is that anytime someone does a DataFrame.show() it prints actual data to stdout, and I would not want that in my Log Analytics environment.

  • @maheshrathi1354
    @maheshrathi1354 2 years ago

    Hey Dustin Vannoy, I'm getting an init script issue and am not able to start the cluster. Please help.

    • @DustinVannoy
      @DustinVannoy  2 years ago

      Please send some details and a screenshot to me on LinkedIn: the runtime, which jars are uploaded to dbfs:/databricks/spark-monitoring, and what the event log error says.

  • @industrial6
    @industrial6 2 years ago

    Is it possible to run the init script on Spark 3.1.2? I tried to add a new profile "scala-2.12_spark-3.1.2" but it doesn't work.

    • @DustinVannoy
      @DustinVannoy  2 years ago +1

      I want to test it and add a new commit, but the steps you need would be:
      1. Add a Maven profile.
      2. Build with Maven.
      3. Upload the new jar that will have 3.1.2 in the name.
      4. You may need to update the init script, but if I remember correctly it uses the actual cluster version.

    • @DustinVannoy
      @DustinVannoy  2 years ago

      It works for me with these compiled jars I just added. Upload all of them, or at least the 3.1.2 one. This has not been tested extensively.
      github.com/datakickstart/spark-monitoring/tree/master/src/target

  • @jaymajor4086
    @jaymajor4086 1 year ago

    Hi Dustin, I was wondering if there is a video for Windows users.

    • @DustinVannoy
      @DustinVannoy  1 year ago

      Of how to build the project on Windows? Would you need guidance on installing Java and sbt on Windows as well?

    • @jaymajor4086
      @jaymajor4086 1 year ago

      Hi @@DustinVannoy, thanks for responding.
      Yes please, a video on how to implement/build this from a Windows environment. I managed to install Java and Scala but would welcome a brief run-through to see if I am doing things correctly, as I am struggling to implement this.
      Thank you

  • @user-eg1ss7im6q
    @user-eg1ss7im6q 1 year ago

    Can you do an Azure Overwatch and Datadog demo please?

    • @DustinVannoy
      @DustinVannoy  1 year ago

      I cannot do Datadog, but I will put Overwatch on the list to see if I can cover it in the future. Thanks for the suggestion.

  • @NaisDeis
    @NaisDeis 9 months ago

    How can I do this today on Windows?

    • @DustinVannoy
      @DustinVannoy  9 months ago

      I am close to finalizing a video on how to do this for newer runtimes, and I built it on Windows this time, using WSL. For Databricks Runtime 11.3 and above there is a branch named l4jv2 that works.

  • @darta1094
    @darta1094 1 year ago

    You nicely omitted most of the configuration options, making this video useless.

  • @soucianceeqdamrashti1701
    @soucianceeqdamrashti1701 2 years ago

    @dustin, I configured the init script but got this error: Cluster terminated. Reason: Init script failure.
    Cluster scoped init script dbfs:/databricks/spark-monitoring/spark-monitoring.sh failed: Script exit status is non-zero

    • @DustinVannoy
      @DustinVannoy  2 years ago +1

      Did you see the couple of troubleshooting steps in my written post?
      dustinvannoy.com/2021/08/09/monitoring-azure-databricks-with-log-analytics/ Check those out first, and if that isn't it, let me know which cluster runtime you are using.

  • @saikumarreddydevareddy8245
    @saikumarreddydevareddy8245 2 years ago

    Thanks for sharing the knowledge as a video. After sending logs from Databricks to the Log Analytics workspace, we observe a few errors like the ones below in the workspace.
    Could you help with how to fix this?
    For your reference:
    The following fields' values sparkPlanInfo_children, Properties_spark_executor_extraClassPath, physicalPlanDescription of type SparkListenerEvent have been trimmed to the max allowed size, 32766 bytes. Please adjust your input accordingly. Ingestion (Field content validation) Warning
    Data of type SparkListenerEvent was dropped because number of fields 2429 is above the limit of 500 custom fields per data type.
    The following fields' values sparkPlanInfo_children, Properties_spark_executor_extraClassPath of type SparkListenerEvent have been trimmed to the max allowed size, 32766 bytes. Please adjust your input accordingly.
    Data of type SparkListenerEvent is * dropped * to * * * *: -*. * message: The *'* * of * * been *. * '[*].sparkPlanInfo.children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[*].children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[0].children[0].*[0]', * *, * *.

    • @DustinVannoy
      @DustinVannoy  2 years ago

      Is the physical execution plan really long? I haven't seen all the children[0] text in any logs I viewed, but we saw similar messages when our query plans were super long because we used a lot of withColumn statements with nested logic like case statements.
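
      As a rough illustration of how that happens, here is a small PySpark sketch (the column logic is an invented example) showing how chained withColumn calls grow the query plan that the listener serializes into sparkPlanInfo:

        from pyspark.sql import SparkSession
        from pyspark.sql import functions as F

        spark = SparkSession.builder.getOrCreate()
        df = spark.range(10)

        # Each withColumn adds another projection with nested CASE WHEN logic,
        # so the analyzed plan (and the SparkListenerEvent payload) grows fast.
        for i in range(50):
            df = df.withColumn(
                f"col_{i}",
                F.when(F.col("id") % 2 == 0, F.lit(i)).otherwise(F.lit(-i)),
            )

        # Inspect how large the plan text has become.
        print(len(df._jdf.queryExecution().toString()))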

    • @saikumarreddydevareddy8245
      @saikumarreddydevareddy8245 2 years ago

      @@DustinVannoy Is there any way I can control or set the number of columns sent to the Log Analytics workspace from Databricks? I have seen how to control the events with LA_SPARKLISTENEREVENT_REGEX, but that is not helping me.

    • @saikumarreddydevareddy8245
      @saikumarreddydevareddy8245 2 years ago

      @@DustinVannoy Also, because of this error, I believe the needed columns are not properly generated in the Log Analytics workspace, and that's why the Grafana dashboards are not showing up properly. For example, the STAGE LATENCY dashboard has the error "Failed to resolve scalar expression named: Stage_Info_Stage_ID_d".

  • @shantanudas7514
    @shantanudas7514 3 years ago

    Hi @dustin, I am getting "Cluster scoped init script dbfs:/databricks/spark-monitoring/spark-monitoring.sh failed: Script exit status is non-zero" while running.

    • @DustinVannoy
      @DustinVannoy  3 years ago +1

      There are a few common causes. Check out the writeup I just added a link to in the description, including the troubleshooting steps at the end. Make sure you are using a Spark 3.1.1 cluster and that the Key Vault secrets are set up. If you want to email me the message you see when looking at the event log JSON for the failed step, I can probably provide more guidance.
      dustinvannoy.com/2021/08/09/monitoring-azure-databricks-with-log-analytics/

    • @shantanudas7514
      @shantanudas7514 2 years ago

      @@DustinVannoy I managed to resolve the issue; thanks a lot for your quick reply.

    • @prasanthiv4855
      @prasanthiv4855 2 years ago

      @@shantanudas7514 I am getting the same error. Could you please let me know how you managed to resolve it?

    • @pavansoma4566
      @pavansoma4566 2 years ago

      @@shantanudas7514 Please let me know how you managed to resolve the issue.