Using NETCONSOLE to debug Linux (and Proxmox) Kernel Panics

  • Published 13 Jul 2024
  • In this video, I'm going to set up Netconsole, so you can capture kernel panics and logs on headless systems. I know some of you are doing wild things with graphics drivers and iGPU passthrough, so hopefully this helps you debug them when your other display outputs aren't working.
    Blog with commands:
    www.apalrd.net/posts/2024/pve...
    Support me on Ko-Fi if you enjoy my content and find it useful:
    ko-fi.com/apalrd
    Feel free to chat about my upcoming projects on Discord!
    / discord
    Timestamps:
    00:00 - Introduction
    00:45 - Overview
    02:44 - Simple Setup
    04:56 - Log on Boot
    09:20 - Systemd Journal
  • Science & Technology

COMMENTS • 20

  • @LampJustin
    @LampJustin 8 days ago +14

    Btw, systemd-journald also logs kernel messages, which is quite helpful when they get truncated by dmesg or when the system has rebooted, since you can show the logs from the previous boot: journalctl -b -1 -ke
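
    A minimal Python sketch of that command, assuming a systemd host with a persistent journal (without one, logs from earlier boots are not kept). The helper names `kernel_log_argv` and `read_kernel_log` are hypothetical; the only journalctl options used are `--boot=`, `-k`, and `--no-pager`.

    ```python
    import subprocess


    def kernel_log_argv(boot_offset: int = -1) -> list[str]:
        """Build the journalctl argument list for one boot's kernel messages.

        boot_offset follows journalctl's convention: 0 is the current boot,
        -1 the previous one, -2 the one before that.
        """
        if boot_offset > 0:
            raise ValueError("this sketch only handles offsets 0, -1, -2, ...")
        # --boot=-1 is the long form of "-b -1"; -k restricts output to kernel messages.
        return ["journalctl", f"--boot={boot_offset}", "-k", "--no-pager"]


    def read_kernel_log(boot_offset: int = -1) -> str:
        """Run journalctl and return its output as text (needs a systemd host)."""
        result = subprocess.run(
            kernel_log_argv(boot_offset), capture_output=True, text=True, check=True
        )
        return result.stdout
    ```

    Calling `read_kernel_log(-1).splitlines()[-50:]` would give the last 50 kernel lines of the previous boot, which is usually where a panic trace ends up if it reached the disk at all.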

  • @blevenzon
    @blevenzon 8 days ago +3

    Thanks so much, you rock dude. Awesome content as always

  • @AntranigVartanian
    @AntranigVartanian 8 days ago +1

    Somehow doing this on Solaris 10, around 20 years ago, was much simpler and debugging was easier as well. Good to see Linux catching up.

    • @autohmae
      @autohmae 7 days ago

      This is how easy it has been since 2001, so 'catching up' is a big word, nothing changed.
      BTW, a warning that this might make you feel old: Solaris 10 was released in 2005, 19 years ago.

  • @drunkbear889
    @drunkbear889 8 days ago +2

    Another viewpoint: if you are using a P2P L3 link to the server (from a router or L3 switch), I would not see a problem with leaving it on all the time.
    Some switches (think MT CRS devices) can apply inbound rules on the interface connected to the server, forwarding packets that match a filter only to selected destinations, so that even devices on the same LAN segment will not receive the broadcast packets destined to ff:ff:ff:ff:ff:ff (switch chip ACL rules). It is roughly port mirroring, but filtered for very specific traffic.

    • @apalrdsadventures
      @apalrdsadventures 8 days ago +3

      It's also possible to not use the broadcast MAC if you want to leave it on all the time; you then need to set both the unicast IP and the unicast MAC, since it doesn't do ARP / NDP.
      I also tried using multicast and that works fine on the Netconsole side, but receiving multicast is a lot harder than netcat, so I didn't go down that path. That may be easier to filter depending on your switch.
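
      To make the unicast-versus-broadcast point concrete, here is a small Python sketch that formats the value of the `netconsole=` parameter. The field order (`src-port@src-ip/dev,tgt-port@tgt-ip/tgt-mac`) and the default ports 6665/6666 come from the kernel's netconsole documentation; the class name `NetconsoleTarget`, the interface name, and all addresses are placeholders, not values from the video.

      ```python
      import re
      from dataclasses import dataclass

      BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"
      MAC_PATTERN = re.compile(r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}")


      @dataclass
      class NetconsoleTarget:
          """One netconsole target; every example value here is a placeholder."""
          dev: str                       # sending interface, e.g. "eno1"
          src_ip: str                    # address of the machine being debugged
          tgt_ip: str                    # address of the log receiver
          tgt_mac: str = BROADCAST_MAC   # broadcast unless a unicast MAC is given
          src_port: int = 6665           # kernel default source port
          tgt_port: int = 6666           # kernel default target port

          def param(self) -> str:
              """Return the string that follows `netconsole=`."""
              mac = self.tgt_mac.lower()
              if not MAC_PATTERN.fullmatch(mac):
                  raise ValueError(f"not a MAC address: {self.tgt_mac!r}")
              return (f"{self.src_port}@{self.src_ip}/{self.dev},"
                      f"{self.tgt_port}@{self.tgt_ip}/{mac}")


      # Broadcast MAC: fine for a short debugging session.
      broadcast = NetconsoleTarget(dev="eno1", src_ip="192.0.2.10", tgt_ip="192.0.2.20")
      # Unicast MAC: the receiver's MAC has to be written out, since netconsole won't resolve it.
      unicast = NetconsoleTarget(dev="eno1", src_ip="192.0.2.10", tgt_ip="192.0.2.20",
                                 tgt_mac="00:11:22:33:44:55")
      ```

      With these placeholder values, `unicast.param()` yields `6665@192.0.2.10/eno1,6666@192.0.2.20/00:11:22:33:44:55`, while `broadcast.param()` ends in `ff:ff:ff:ff:ff:ff`.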

  • @PCMagikHomeLab
    @PCMagikHomeLab 8 days ago

    Thumbs UP! :)

  • @someoneelse3876
    @someoneelse3876 8 days ago +2

    I got a lot of kernel panics when I switched from iSCSI to NVMe over Fabrics/TCP, so I gave up and went back to regular iSCSI.

  • @ws_stelzi79
    @ws_stelzi79 8 days ago

    Well, *someone* "had got" the need to make a video on advanced Linux/kernel debugging! 😉😇

  • @DawidKellerman
    @DawidKellerman 8 days ago

    I like your original videos! The "let's do the same video as someone else" approach is tiresome! Any chance of a video on Proxmox Offline Mirror?

  • @NiHaoMike64
    @NiHaoMike64 8 days ago +1

    Wouldn't serial console also be a good option if the machine in question has a serial port?

    • @apalrdsadventures
      @apalrdsadventures 8 days ago

      Yes, and I have a video on that too. A lot of machines don't have them, though.

    • @autohmae
      @autohmae 7 days ago

      @apalrdsadventures every Linux VM on KVM/QEMU does, of course... What would be good serial console server software that has the option to log everything? After a quick look around, I think I can probably just run a bunch of stty processes to capture the logs.

    • @apalrdsadventures
      @apalrdsadventures 6 days ago +2

      For VMs of course, but when you are doing experimental hardware passthrough you tend to crash the host kernel.

    • @autohmae
      @autohmae 6 days ago

      @apalrdsadventures Ahh, that's sad; that shouldn't really be able to happen.

    • @apalrdsadventures
      @apalrdsadventures 6 days ago +1

      Usually it's because the host kernel was already using that device and you took it away improperly to give to the VM. If you do that with the primary GPU, then you have no graphics device to debug from either.

  • @OscarCarlsson1986
    @OscarCarlsson1986 8 days ago +1

    The blog entry isn't live yet =)

    • @apalrdsadventures
      @apalrdsadventures 8 days ago +1

      fixed it, the timer didn't do it on time for some reason