MS just open-sourced their take on eBPF: Retina!
What do you think of Pixie? Is it worthwhile exploring?
I'll definitely explore it!! Quality content as always... Salute 🫡 V. Farcic!!
Are there any other tools that have auto telemetry built in?
Auto telemetry for infra and, often, for third-party apps is not a problem. There is a variety of exporters and, lately, eBPF can discover a lot. The problem is with our own apps. They need to be instrumented with otel.
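To give an idea of what that means in practice, here is a minimal sketch of manual otel instrumentation in Python (the service name, collector endpoint, and handler are placeholders, not anything specific to Pixie):

```python
# Minimal sketch of instrumenting your own app with OpenTelemetry in Python.
# "my-app", the collector endpoint, and handle_request are placeholders.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Describe the service and ship spans to any OTLP-compatible collector.
provider = TracerProvider(resource=Resource.create({"service.name": "my-app"}))
provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(endpoint="http://otel-collector:4317", insecure=True)
    )
)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

def handle_request(order_id: str) -> None:
    # Each unit of work becomes a span; the backend that stores the data can
    # change without touching this code.
    with tracer.start_as_current_span("handle-request") as span:
        span.set_attribute("order.id", order_id)
        # ... business logic goes here ...
```

That instrumentation effort stays the same no matter which backend ends up storing and visualizing the data.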
I find it shortsighted that it supports only Kubernetes. Most companies have some EC2 instance somewhere, be it a database instance or some historical deployment. By going with Pixie, I feel I have to accept being blind to the rest of the infra. Not sure how hard it would be to have an eBPF-enabled agent run on Linux instances and report to Pixie. Not on their roadmap, unfortunately.
You can import data from other sources.
00:41 Nice!
Hello Viktor, any plans to make a video on Apache SkyWalking?
Adding it to my to-do list...
Hi Viktor, thank you for always inspiring me with your videos. Can you share the name of the tool you use for creating technical videos/presentations?
I externalized video editing to an agency and I'm not sure what they use 😔
Cool
Nice, however the security guys would have to go and change their armour... :)
What I think is that Pixie is better than groundcover because groundcover is a SaaS solution, so we'd be sharing our important data with its cloud. For security purposes I'll use self-hosted Pixie; if groundcover provides a self-hosted version, I will definitely use groundcover. What's your view on this?
If you need a self-hosted solution, Pixie is definitely the way to go. We'll see what happens when/if groundcover starts offering a self-hosted version.
You can obfuscate the data for groundcover. Also, the data is stored in the cluster, and the portal container only makes a connection out of the cluster to the cloud to give you the UI, so the data is actually still safe.
DeepFlow is a nice tool too.
OpenTelemetry was supposed to decouple data collection from data visualization. Isn't this a step back? Can't Pixie be just another collector in the otel ecosystem? Or a UI competing with Grafana?
Pixie is a few things at the same time. It collects data as exporters do, but through eBPF; it stores data like, let's say, Prometheus; and it visualizes data like Grafana. Whether that's good or bad really depends on how well it does all those things. As for otel... it is a standard API and not really a data collector. It describes how to expose data so that collectors can get it.
@@DevOpsToolkit, what do you mean it is not a collector of observability data? That is exactly what it is! And more... After collecting, we may want to filter data or mask personal data for compliance, create span metrics from traces, set up multiple exporters, and much more. Also, in enterprise environments we need to send the same observability data to ELK or Splunk because some teams love to do analytics on that data, or send traces to an E2E Jaeger or whatever. With otel I don't have to install an agent for tool X just because some team prefers that tool; I just add an exporter to otel! And there is more, like semantic standardization of logs, metrics, and traces.
I think we misunderstood each other. I thought that by collector you meant an agent that collects data from some data source (in this case otel). My bad.
@@DevOpsToolkit ok... fair enough. To be clear, otel is supposedly the collecting layer of observability data, right? From what I understand, it's a large ecosystem that can collect observability data from multiple technologies, like talking to Prometheus endpoints, reading files, and even eBPF; that is defined in the "receiver" part of the otel collector. From what I understand, Pixie has its own eBPF agent, but it also stores the data it collects and offers a UI that looks really cool, and I like the script and CLI approaches. Then it says that it can export its data to OpenTelemetry... IMO, Pixie should decouple the data-collecting part (leave that for an eBPF receiver in otel) and then consume the data from the OpenTelemetry collector into its local observability data store and UI. I hope I am being understood :) In my enterprise we are really trying to get everyone to "use only otel to collect observability data" so that we stop the proliferation of different observability agents.
@@hugolopes5604 You're right where collectors are concerned. However, when I talk about otel, I talk first and foremost about instrumentation itself and, to be more precise, instrumentation of your own apps. I do think that otel is important on all levels, but I see it much more relevant for instrumentation inside my own code than for external collectors (e.g., node resources, Kubernetes API, etc.). The reason I'm saying that lies in the amount of effort I need to put into it. When I add instrumentation to my code, that is a "real" effort and I do not want to get into the situation where I need to rewrite that code because I changed the tool where that data is stored. That's where otel gets fanatical support from me. Then there is the rest of the data (e.g., memory, CPU, Kube, etc.). It's great if that is otel, but not that critical since the effort is small (I just had to install some agent to expose that data). I'm not saying that should not be otel (it should), but, rather, that the gain (or the loss) is smaller.
All in all, what I'm trying to say is that the main question is how you instrument "stuff" rather than how you collect data that is exposed through that instrumentation.
Hi Viktor,
Kudos to the good work you are doing.
I have questions & ideas with respect to observability when chaos engineering is in place.
Can you suggest how I can reach you?
Ping me on LinkedIn or Twitter and I'll send you back my Calendly link. You'll find my info in the description of any video.
To execute eBPF, we also need a pod on each node. Is that correct? If so, it's similar to an agent.
Yes. You normally run it as a DaemonSet, which runs a pod on every node. The difference from typical agents is that it is eBPF, meaning that it gets "injected" into the kernel with all the benefits that brings.
Not related to the video, but I can't find material on your channel regarding distributed file systems. Any opinions? (SeaweedFS, JuiceFS + MinIO, Ceph, GlusterFS, HDFS, Swift)
Points I'm looking into for these options at the moment: Open Source, Object Storage, Block Storage, File Storage, Scalability, Fault Tolerance, High Performance, Easy to Use, Commercial Support, Community Support, Replication, Data Deduplication, Compression, Encryption, REST API, Multi-site Support, Tiered Storage, Erasure Coding, POSIX Compliant, Snapshots, Data Versioning, Geo-Replication, Data Integrity Checks, Access Control, Quota Management, Audit Logs, Rack Awareness
I haven't done any videos on that subject. Adding it to my to-do list.