Break all the things: eBPF as an agent of chaos - Scott Gerring

  • Published 21 Sep 2024
  • eBPF is known for its use in the networking, security, and observability domains, improving the resilience and performance of distributed systems. However, with its extensive reach into the kernel and user space, it is also well-positioned to disrupt processes running on a local machine, selectively injecting chaos and allowing us to observe and measure the impact.
    Join us for this talk where we take a look at the different types of chaos experiments eBPF is well suited for and how this might help us improve our systems.
    ---
    Don't forget to subscribe to the channel and join the Cilium & eBPF Slack here: slack.cilium.io.
    If you're learning eBPF for the first time, Liz Rice's eBPF book is a great resource. Download it here: isovalent.com/...
    #ebpf #cloudnative

COMMENTS • 1

  • @wolpumba4099 9 days ago

    *Break All the Things: Leveraging eBPF for Chaos Engineering*
    * *0:19 Chaos Engineering Background:* Scott introduces chaos engineering, highlighting its origins at Apple and popularization by Netflix. It involves experimenting on systems to enhance their resilience to failures.
    * *0:34 Traditional Chaos Experiments:* Common methods include disrupting VMs and network connectivity. Tools like Chaos Mesh (sometimes using eBPF) and cloud platforms like AWS facilitate these experiments.
    * *0:50 Benefits of Chaos Engineering:* It helps identify emergent failures in multi-service systems and weaknesses in individual services under specific failure conditions.
    * *1:14 Limitations of Traditional Approaches:* Large-scale cloud environments may be unnecessary for testing individual services, and traditional component testing with stubs can be cumbersome.
    * *1:28 Shifting Left with eBPF:* eBPF offers a "universal language of destruction" for injecting failures directly at the kernel and network level, simplifying testing and allowing earlier detection of issues.
    * *2:16 eBPF-Based Failure Injection Examples:*
      * *2:27 Network I/O Disruption:* eBPF programs can intercept and drop network traffic based on criteria like process ID or port, simulating network outages.
      * *3:01 System Call Failures:* eBPF can intercept system calls (like `openat`) and return error codes, simulating resource access failures. This approach is potentially generalizable across different system calls.
      * *3:45 Traffic Control (TC) for Packet Loss:* Leveraging the TC subsystem, eBPF can selectively drop outgoing traffic for specific processes, mimicking network packet loss.
    * *4:23 Future Directions and Potential:*
      * *4:26 Integration with Test Frameworks:* Scott proposes integrating eBPF-based chaos experiments into common testing frameworks like JUnit or Go test, enhancing developer workflows.
      * *4:37 Exploring New Probe Points:* The eXpress Data Path (XDP) could be used for coarser network failures, and the Linux Security Modules (LSM) for simulating security breaches.
      * *4:52 Simulating Resource Exhaustion:* eBPF could be used to manipulate cgroup limits to create resource exhaustion scenarios.
    * *5:03 Call to Action:* Scott encourages feedback and contributions to the project, providing a link to his GitHub repository with example code.
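
    As a rough illustration of the filtering described in the network and system-call bullets above, here is a plain-Python model of the decision a chaos program makes for each event: match on process ID and, optionally, port, then fail the event with some probability. This is not eBPF code and is not taken from Scott's repository. `FaultRule`, `should_inject`, and `syscall_result` are hypothetical names chosen for this sketch; a real program would run an equivalent check inside a kernel hook.

```python
import errno
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FaultRule:
    """Hypothetical rule describing which events to fail and how often."""
    target_pid: int                    # only this process is affected
    target_port: Optional[int] = None  # None matches any port
    probability: float = 1.0           # 1.0 = always inject, 0.0 = never
    error: int = errno.ENOENT          # errno reported for failed syscalls


def should_inject(rule: FaultRule, pid: int, port: Optional[int] = None,
                  rng: Optional[random.Random] = None) -> bool:
    """Return True if this event (a packet or a syscall) should be failed."""
    if pid != rule.target_pid:
        return False
    if rule.target_port is not None and port != rule.target_port:
        return False
    if rule.probability >= 1.0:
        return True
    if rule.probability <= 0.0:
        return False
    source = rng if rng is not None else random
    return source.random() < rule.probability


def syscall_result(rule: FaultRule, pid: int, real_result: int) -> int:
    """Mimic the kernel convention that a failed syscall returns -errno."""
    return -rule.error if should_inject(rule, pid) else real_result


if __name__ == "__main__":
    # Fail every openat-style call made by (hypothetical) PID 4242.
    rule = FaultRule(target_pid=4242, error=errno.EACCES)
    print(syscall_result(rule, pid=4242, real_result=3))  # negative errno
    print(syscall_result(rule, pid=1, real_result=3))     # untouched
```

    Keeping the rule as plain data (PID, port, probability, error code) is one possible design: the same values could be handed to a kernel-side program through a map, and the matching logic stays easy to unit test without root privileges.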
    I used gemini-1.5-pro-exp-0801 on rocketrecap dot com to summarize the transcript.
    Cost (if I didn't use the free tier): $0.05
    Input tokens: 13312
    Output tokens: 494