Can a single 1.2 GHz core process 10 Gb/s? Yes, it can!

  • Published Nov 21, 2024

COMMENTS • 142

  • @alexgartrellwork335
    @alexgartrellwork335 1 day ago +157

    The reason "downsized" packet performance is important is that TCP-ACKs and other small packets exist organically. So packets-per-second is actually a relevant and important metric for router performance. With sufficiently large payloads, throughput is just a direct memory access benchmark because you're just copying stuff around and not doing that much "thinking."
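
    To put rough numbers on that (my own back-of-the-envelope sketch, not from the video): at 10 GbE line rate with minimum-size frames, a 1.2 GHz core has roughly 80 cycles per packet.

        /* Per-packet CPU budget at 10 GbE line rate with 64-byte frames.
         * On the wire each frame also carries ~20 bytes of preamble,
         * start-of-frame delimiter and inter-frame gap. */
        #include <stdio.h>

        int main(void) {
            const double link_bps   = 10e9;        /* 10 Gb/s          */
            const double wire_bytes = 64 + 20;     /* frame + overhead */
            const double pps        = link_bps / (wire_bytes * 8);
            const double cpu_hz     = 1.2e9;       /* 1.2 GHz core     */
            printf("packets per second: %.0f\n", pps);          /* ~14.88M */
            printf("ns per packet     : %.1f\n", 1e9 / pps);    /* ~67 ns  */
            printf("cycles per packet : %.0f\n", cpu_hz / pps); /* ~81     */
            return 0;
        }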

    • @jamess1787
      @jamess1787 1 day ago +13

      DPDK has a way to handle this, but you're right, it's optimized for heavier throughput, with smaller packets seeing higher latency than you would normally see. (Set your ICMP packet sizes larger and you'll see the latency problem disappear.)
      Think of DPDK as more of a "pipe" connecting the two end devices: it doesn't matter how much fluid you put in, it'll get there in a timely fashion, especially if there is heavy throughput on the system.
      DPDK works great for things like SCTP. 🤘

    • @jfbeam
      @jfbeam 12 hours ago +1

      Properly efficient routing (and switching) is done WITHOUT copying. The hardware receives the frame into memory, and that's where it stays. Any forwarding is done by reference to that memory buffer. Copying is what kept Linux and BSD networking so slow for so long.
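
      As an illustration of "forwarding by reference": a minimal sketch with DPDK's mbuf API (my example, not the video's code; assumes both ports are already initialized):

          #include <rte_ethdev.h>
          #include <rte_mbuf.h>

          #define BURST 32

          /* Zero-copy L2 forward: the buffers received on one port are
           * handed, untouched, to the other port's TX queue. */
          static void forward_once(uint16_t rx_port, uint16_t tx_port)
          {
              struct rte_mbuf *bufs[BURST];
              uint16_t n = rte_eth_rx_burst(rx_port, 0, bufs, BURST);
              if (n == 0)
                  return;
              uint16_t sent = rte_eth_tx_burst(tx_port, 0, bufs, n);
              for (uint16_t i = sent; i < n; i++)
                  rte_pktmbuf_free(bufs[i]);  /* drop what TX couldn't take */
          }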

    • @alexgartrellwork335
      @alexgartrellwork335 9 hours ago +1

      @@jfbeam that’s kind of true but irrelevant because the Direct Memory Access thing I’m talking about is how the NIC gets the data from main memory via PCIe.
      en.m.wikipedia.org/wiki/Direct_memory_access

    • @rnts08
      @rnts08 5 hours ago +3

      There's a reason real enterprise routers are rated in pps not bps.

  • @BitZorg
    @BitZorg 1 day ago +74

    I'm very happy to hear that NXP was willing to open source what would be needed, both options seem very promising to me.

    • @hubertnnn
      @hubertnnn 1 day ago +7

      Yep, with them open sourcing the necessary parts, the proprietary solution starts to feel better than VPP, due to compatibility with commonly used tools.

    • @jfbeam
      @jfbeam 12 hours ago +1

      It's not so much "open source" as them not caring what you do with the SDK after your $40k check clears. (When almost everything is in a microcode black box, there are no secrets in the SDK.)

  • @BobWidlefish
    @BobWidlefish 1 day ago +115

    High-end networking geek here. Small packet performance is critical for core internet equipment. If you can’t send 14.88M 64-byte frames per second and receive 14.88M 64-byte frames at the same time, you’re not doing 10 GbE.
    Bulk throughput is trivial and doesn’t require any fancy hardware: a mundane PC can easily do tens of Gbps with large packets.

    • @lyth1um
      @lyth1um 1 day ago +5

      There is Linux VPP stuff; there is a guy doing a ring with off-the-shelf x86 hardware.

    • @desperateopportunist586
      @desperateopportunist586 1 day ago +2

      @@lyth1um Do you know which guy is doing that stuff? I want to check it out

    • @Galileocrafter
      @Galileocrafter 1 day ago +9

      That’s what I have been saying all along. 10 Gb/s the easy way is 812,743 pps with 1500-byte packets. 10 Gb/s the hard way is 14.88 Mpps with 64-byte packets. 10 Gb/s the realistic way is a realistic IMIX traffic profile.
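
      For reference, the arithmetic behind those figures (a sketch; exact IMIX profiles vary by vendor):

          /* 10 GbE line-rate packet rates. On-wire size = frame + 20 bytes
           * of preamble + inter-frame gap; a 1500-byte payload is a
           * 1518-byte frame with L2 header + FCS. */
          #include <stdio.h>

          static double pps(double frame_bytes) {
              return 10e9 / ((frame_bytes + 20) * 8);
          }

          int main(void) {
              printf("64-byte frames  : %.2f Mpps\n", pps(64) / 1e6); /* ~14.88   */
              printf("1518-byte frames: %.0f pps\n",  pps(1518));     /* ~812,744 */
              /* One common "simple IMIX": 7 x 64 + 4 x 594 + 1 x 1518 bytes. */
              double avg = (7 * 64.0 + 4 * 594.0 + 1 * 1518.0) / 12.0;
              printf("IMIX (~%.0f B)  : %.2f Mpps\n", avg, pps(avg) / 1e6);
              return 0;
          }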

  • @bennetb01
    @bennetb01 1 day ago +44

    It's not that the packet size needs to be downsized, but the PPS that VPP can do. There are a lot of things that use small packets, like DNS, and there is specifically a test called IMIX. It's not perfect, but the idea is to test throughput using various packet sizes that mimic more of a real-world workload.
    A lot of commercial routers can put up huge numbers with 1500 (and more with 9000) byte packets, but even when your MTU is set that high, you will find the average packet size is much lower. It would be good to know the performance of the router with 64-byte packets (the lowest) as well as IMIX (or something else that is not the ideal max packet size).
    Again, it doesn't matter what your MTU is set to; it's the average packet size that counts. Things like DNS or ACKs are going to be a lot of smaller packets.

  • @minifig404
    @minifig404 1 day ago +11

    Thank you for fighting through this. I'm really glad to hear you have fully open-source options on the table.
    CPU microcode being closed is not new, and I consider that just something that you have to put up with in this world (so far).

  • @BrinkGG
    @BrinkGG 1 day ago +33

    Well, I'm not getting anything done for the next 21 minutes.
    EDIT: I'm glad NXP is allowing their binaries to be shared under the project's licensing. That's huge! Looking forward to seeing future parts of this project. :D

  • @salvadorseekatzrisquez2947
    @salvadorseekatzrisquez2947 1 day ago +29

    The encryption-decryption part is really impressive

    • @morsikpl
      @morsikpl 1 day ago +8

      That's true! Many enterprise solutions only guarantee about 3 Gbps when encrypting/decrypting traffic on a 10 Gbps interface!

  • @originaljws
    @originaljws 1 day ago +6

    Not only am I excited about the results (which are solid; nicely done), I'm so grateful you listened to the comments here and to the other people following this project. Thank you for listening. I can't wait for these routers to be available. This is a fun project to follow, and at the end of this rainbow is a useful and maintainable tool.

  • @spx2327
    @spx2327 1 day ago +17

    Tomaz talks like a US drill sergeant, I am always stressed out after watching his videos. Sometimes I even start doing push-ups 😅

  • @gcs8
    @gcs8 1 day ago +10

    Nice. I'm used to only seeing DPDK and its friends in the data center as part of NSX-T. I never brought it up, as I only ever saw it for specific NICs and not on any embedded stuff outside of something like a DPU. I think this would be a super cool thing to get into more common use. If you make the router OS virtualizable with the same feature set when paired with a supported NIC (it was mostly Intel and Broadcom last I checked), that could really open up home lab stuff for some cool things like "pocket universes", or just an easier way to play with OSPF/BGP with enough oomph behind it to make it fun for the lab. This could open a lot of SDN fun up.

  • @danielberes3099
    @danielberes3099 1 day ago +4

    If you want the VPP interfaces to be visible in the kernel, you can use the lcp plugin.
    Also, using 2 workers for VPP could give you better performance, but at the price of higher power consumption...
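
    For reference, both suggestions map onto standard VPP configuration. A sketch with illustrative names (core numbers and interface names will differ, and the linux-cp plugin has to be enabled in startup.conf):

        # startup.conf: pin the main loop and add two worker threads
        cpu {
            main-core 0
            corelist-workers 1-2
        }

        # linux-cp plugin: mirror a VPP interface into the kernel
        vppctl lcp create TenGigabitEthernet0/0/0 host-if vpp0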

  • @woobm
    @woobm 22 hours ago +2

    Choosing DPDK & VPP and open sourcing everything is what makes this project very interesting to me! I'm glad you didn't go with a HW-specific solution; this will make future HW updates much easier. I also hope you will give a push to DPDK adoption in other open source projects, helping to improve home networking equipment performance. As a side effect this might save some energy and resources. Bravo!

  • @geogmz8277
    @geogmz8277 20 hours ago +1

    😊 I'm glad everyone said it: PPS vs throughput. As a former WISP owner who used Mikrotik at first 😂 I had my days of sadness with Tilera vs MIPS vs ARM!

  • @ludufre
    @ludufre 18 hours ago

    Even though I am from Brazil and consequently will not be able to buy it when it is finished because of our import tax, I am following this project closely. I am amazed to see it being built in public. Congratulations to everyone involved.

  • @SwissPGO
    @SwissPGO 14 hours ago +3

    A hacker that somehow gets access to the terminal will be instantly lost because he can't use ping 😂

  • @raresmalene5569
    @raresmalene5569 1 day ago +11

    64-byte packets, or it is not 10 Gbit/s. Small packets are quite important: if it doesn't pass RFC 2544 at IMIX and at 64-byte packet size, it's not going to do anything better than a 100-dollar router. Max MTU on your device is just the limit at which the computer starts to fragment packets. It's like having an earthmover (MTU 9000) vs a wheelbarrow (MTU 64): yes, you can carry more, faster, but if you are building a clay pot, you cannot use an earthmover to carry the clay.

    • @originaljws
      @originaljws 1 day ago +2

      Recognize that he's capping the CPU performance and testing the performance limits. Internet-Mix packet sizes and full-clock-rate operational benchmarks are obviously important, as you point out, but they don't make a good thumbnail. All-zero-payload benchmarks are almost as useless as all-max(MTU) without more context. I'm impressed with the metrics he shared, and am pleased to see the positive response to the kernel/source license problems. I look forward to this project continuing to develop and becoming a tangible, orderable device.

  • @chrisdixon5241
    @chrisdixon5241 21 hours ago +2

    Pleased to hear that NXP responded positively to the feedback and could even suggest an alternative

  • @mazensmz
    @mazensmz 1 day ago +4

    I liked your video because you put the answer in the title.

  • @SkeptiSquid
    @SkeptiSquid 1 day ago +2

    Hats off to you, this is a great development.

  • @jfbeam
    @jfbeam 11 hours ago

    For those having trouble following along, DPDK is network drivers and a network stack done entirely in userspace. ('Tho in this case, it sounds like they still had hardware offload enabled.)
    For the record, there is nothing one can do in userspace that cannot be done faster in kernel space. But kernel code requires a great deal more detail. When your crap-code crashes in userspace, you just run it again. When your crap-code crashes in the kernel, at minimum you'll need to reboot, at worst _power cycle_ the whole system. (And in userland you can use whatever stupid programming language(s) you want.)
    In general, interrupts are significantly more efficient than polling. A "while(1) { are we there yet? }" system never stops, so it never goes into any low-power state. If your loop is "check everything and sleep (wait 1ms)", now you've artificially created latency and jitter. (Hint: how does it know when to wake up? Interrupts.) On an active networking device, one would _assume_ there's an unending stream of traffic to process, but there isn't; there are sizable gaps between packets, and it takes time to send and receive frames, during which the CPU isn't involved.
    Similarly, hardware routing and switching systems are unimaginably faster than a general-purpose CPU. I've thrown away single-digit-dollar chips that did IPv4 route lookups in a single clock cycle. DRAM is not TCAM. No ARM core can do a route computation (much less a full table lookup) in a single clock cycle. In a consumer-level device, with consumer-level traffic, that's not such a big deal.

    • @tomazzaman
      @tomazzaman  10 hours ago

      Thank you for the explanation!

    • @owenhilyard3157
      @owenhilyard3157 5 hours ago

      Kernel and user space are tied for networking efficiency, it’s all a matter of where you map the PCIe registers for the NIC into memory and where you tell the NIC to deliver packets. Being in userspace and not needing to share the NIC also means you can do things with offloads that would be patently insane in a kernel context, like specialize the entire network stack for routing 128 byte packets.
      Interrupts are more efficient, unless you have some more modern ISA extensions to x86 which let you put a core to sleep until the NIC writes to a particular cache line. This means you have two loops, a “while I have seen a packet in the last 500 microseconds” loop, and an outer one which puts the processor to sleep until the NIC delivers new packets. This processor feature exists almost entirely because of DPDK, and helps immensely with power usage when not under load (because all of your cores can go into sleep states, potentially lower than interrupt servicing would allow). This means you avoid the interrupt machinery in the CPU entirely. When at high packet rates, you are forced to transition to polling since even the 14.8 million interrupts per second for 10G would cause issues for a system. A CPU should never be idle under load, you feed the NIC some more descriptors with packets to transmit and then go process the next round of packets while it DMAs them out and sends them.
      DPDK has more or less equal access to hardware offloads compared to the kernel, but you can use them much more carefully. You can spawn one queue per CPU core, tell the NIC to do hash-based load balancing, and then let the cores loose. If you hold back one core for the OS, those CPU cores never need to interact with the kernel again.
      For a long time, DPDK was also using unrestricted DMA, meaning that any error would mean power cycling the system, the same as a bad kernel driver, but now can use iommu protection to stop that.
      DPDK has been more or less tied with the Linux kernel for a while in terms of raw packet IO, but if you can specialize DPDK will start running away with a win rather quickly unless you are rewriting chunks of the kernel network stack.
      As an added benefit, DPDK abstracts most relevant NICs (and AWS/Azure/GCP NICs) if you wanted to test your software in CI with a normal server.
      Yes, dedicated hardware is faster, but if you want that, go buy an FPGA to string the chips together, and the cost of your router spikes quickly. A CPU is plenty for 10G.
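
      A sketch of that two-loop structure (my own illustration, not code from the video; the cache-line-monitor sleep would use DPDK's x86 power-management support, replaced here by a plain usleep for portability):

          /* Busy-poll while traffic is recent, back off when idle.
           * Assumes EAL and the port are already initialized (not shown). */
          #include <unistd.h>
          #include <rte_cycles.h>
          #include <rte_ethdev.h>
          #include <rte_mbuf.h>

          #define BURST 32

          static void poll_loop(uint16_t port)
          {
              const uint64_t idle_limit = rte_get_tsc_hz() / 2000; /* ~500 us */
              struct rte_mbuf *bufs[BURST];

              for (;;) {
                  uint64_t last_rx = rte_rdtsc();
                  /* Inner loop: spin while a packet was seen "recently". */
                  while (rte_rdtsc() - last_rx < idle_limit) {
                      uint16_t n = rte_eth_rx_burst(port, 0, bufs, BURST);
                      if (n > 0) {
                          last_rx = rte_rdtsc();
                          for (uint16_t i = 0; i < n; i++)
                              rte_pktmbuf_free(bufs[i]); /* real code would forward */
                      }
                  }
                  /* Outer loop: idle; newer CPUs could instead sleep until
                   * the NIC's next descriptor write (UMWAIT-style). */
                  usleep(50);
              }
          }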

  • @schectermf350
    @schectermf350 1 day ago

    Great video, thanks for covering DPDK and VPP in depth. I've spent many hours reading and have always been frustrated with how developer-centric the documentation is. Really looking forward to getting hands-on with your product! I really hope VyOS will pull their finger out and get VPP working in mainline too.

  • @hubertnnn
    @hubertnnn 1 day ago +6

    I see two possible issues with this approach.
    First is the power usage and heat generation, since one of the cores is constantly at 100% even when your network does nothing at all.
    Second is a possible latency increase: when you poll instead of using interrupts, you don't respond to events immediately when they happen but on the next poll, so the time between polls is your extra latency. This may not be an issue, since 100% CPU suggests busy-waiting without any sleep, but I would still like to see a test confirming that latency is not increased.

    • @BirknerAlex
      @BirknerAlex 20 hours ago +3

      DPDK has no impact on latency under real-world circumstances. Many datacenters (ISPs) run DPDK appliances for DDoS mitigation. At my previous employer we had a DPDK appliance to protect game server traffic from DDoS, and game server traffic is one of the most latency-sensitive things there is. And guess what, it worked like a charm: it used iBGP to route entire networks through the appliance, after being processed by the router, before hitting the core switching stack.

    • @ZleekWolf
      @ZleekWolf 15 hours ago +1

      @@BirknerAlex I'm sure you're right that latency isn't an issue. The core's only job is busy-waiting when it's not already processing data anyway. It even saves on ISR entry/exit delays, so it should be faster rather than slower in my view.
      I'd still share the concern about the wasted energy though. Sure, those cores are more efficient than some beefy x86 server CPU, but they're not "free"; that's why ARM also has power-saving modes and dynamic frequency scaling, for example. I also don't whine about one core on a 64-core server with a daily load average of 40 being dedicated to burn 1% more energy in my rack at work, and that core will have plenty of real work to do as well. But let's be honest: any home router is going to spend 99% of its time idling, or exchanging a few dozen packets a second of background noise. I've got a 6-core Xeon E-2146G in my server sipping sub 10-12 W system total with the 10G NIC (excluding SSDs and some other extensions/peripherals; I don't have exact numbers in my head right now), so this router HW (without Wi-Fi and other stuff, of course) hopefully isn't going to sip such numbers for way less performance just because it's busy all day checking if maybe, finally, someone has a packet for DPDK. I'm very interested in this project, because hosting your router/firewall/networking on your main server has plenty of downsides, and I still run a separate AP for Wi-Fi of course, but not if I'd be investing in yet another 24/7 energy hog 😅

    • @sasjadevries
      @sasjadevries 11 hours ago

      Polling *can* have way less latency than interrupts. Interrupts are good for occasional events, but when you have a lot of them, you have to do a context switch for every interrupt separately, and finish the previous one before you can take on the next. When polling, you can bunch multiple events together into an array/vector and process them in one go. So the result is worse latency at low load in optimal conditions, but better performance under heavy load.
      In computer graphics the same kind of optimisation is used. Graphics drivers receiving data/drawcalls used to wait for more data in order to send a bigger chunk, and the driver needed some algorithm to estimate how long to wait and how often to send data to the GPU; with the introduction of Vulkan this task (along with others) moved to the application/game. A lot of other optimisations and tweaks that used to be done in the driver (kernel mode) in the DirectX 11/OpenGL days are now done inside the game (user mode) thanks to Vulkan.
      Do you see the resemblance to where DPDK+VPP is going 😉?

    • @hubertnnn
      @hubertnnn 8 hours ago

      @@sasjadevries Yep, I mentioned in one of the other comments that I worked with microcontrollers that had a similar batching interrupt system, where an SPI controller would send an interrupt when a message arrived but would not send any more interrupts for subsequent messages until you emptied the buffer, so you could batch-process multiple messages on a single interrupt without creating hundreds of context switches. That would be good to have on Linux, but I guess it might not be so easy without hardware support.

    • @sasjadevries
      @sasjadevries 7 hours ago

      @@hubertnnn VPP stands for Vector Packet Processing, so sending a vector of packets is the whole idea of that software package 😉, they even put it in the name.
      I'm not a network expert, but I got curious a few months ago when someone in the comments mentioned VyOS with the DPDK + FD.io VPP stack. That's when I looked into it...
      Basically the whole vectorisation is done in VPP, and its main selling points are that VPP runs 100% in userspace, and that it's hardware agnostic and deployment agnostic.
      I kinda like this approach for high throughput. The low-level interface (DPDK) is simpler and more predictable, polling instead of handling interrupts, and the application-level software (VPP) can pick the vector size that fits the polling rate.

  • @sekanderbast452
    @sekanderbast452 1 day ago +5

    First, I‘m very impressed by the performance!
    One question though: as one core is now constantly pegged, how does this impact power consumption? Is there a notable difference between this solution and the old proprietary SDK when at idle or when routing?

    • @xdevs23
      @xdevs23 1 day ago

      It's pegged, but probably not actually using much power. I guess it's just busy-waiting on packets, which should be just some conditional branches and compares, nothing too crazy. Nevertheless, it's taking CPU time that could have been used elsewhere.

    • @tomazzaman
      @tomazzaman  1 day ago +5

      You're right, the core is at 100%, but it's basically just a constant loop of polling the interfaces.

  • @Seandotcom
    @Seandotcom 1 day ago +4

    lmao as an embedded developer I totally feel the cross-compiling mess

  • @c0p0n
    @c0p0n 22 hours ago +1

    I'm fine with the compromise that the microcode be proprietary. I do think the sources should be available, or at least they should have them security-audited and the results published.

  • @0zux45
    @0zux45 1 day ago +5

    I really like performance graphing with Grafana/Prometheus or whatever else. I assume VPP already has the capability for external software to pull that info?

    • @0zux45
      @0zux45 18 hours ago

      My previous answer was hidden because I linked to the VPP manual.
      Short answer: yes, yes there is.
      Long answer: they have a Python package that can access a lot of statistics; it's really easy to use!
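
      Independent of the Python package, the same counters can also be read from the CLI, e.g.:

          vppctl show interface   # per-interface rx/tx packet and byte counters
          vppctl show runtime     # per-graph-node calls, vectors and clocks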

  • @uis246
    @uis246 12 hours ago +1

    Try testing flow offloading in the kernel too. It might allow processing 10 Gb/s on a vanilla kernel without any additional userspace software.
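
    For anyone who wants to try it: kernel flow offloading is enabled with an nftables flowtable. A minimal sketch (interface names are illustrative; "flags offload" can be added where the NIC supports hardware offload):

        table inet fastpath {
            flowtable ft {
                hook ingress priority 0;
                devices = { eth0, eth1 };
            }
            chain forward {
                type filter hook forward priority 0; policy accept;
                ip protocol { tcp, udp } flow add @ft
            }
        }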

  • @RichardLofty
    @RichardLofty 19 hours ago +2

    1 GHz is not that far from today's standard 3 GHz.
    The absolute record maximum today is 5 GHz.
    I would understand a 100x difference, but everyone seems to forget how fast current tech is.
    And it is forgotten because of bad software.

  • @cjcox
    @cjcox 17 hours ago

    Yeah, 10 Gbit was a bigger issue in the pre-Nehalem days (talking DC side, where you have more powerful processors; there was a time when there were struggles). Just like entropy exhaustion was also a huge issue. Advancements have made those things less of a problem.

  • @MarkoCloud
    @MarkoCloud 1 day ago

    Sweet! RoutSI is going to be super power efficient if that's all it takes to run 10Gb!

  • @MikkoRantalainen
    @MikkoRantalainen 12 hours ago +1

    Wouldn't it be enough to use the Linux kernel and the io_uring interface to minimize the overhead you normally get with interrupts?

  • @outseeker
    @outseeker 1 day ago +2

    Mm, I like what a few people have mentioned in the comments here about testing with 10 Gb/s in each direction, with the data being all tiny packets like you might see on a super busy network. 10 Gb/s in a solid data stream isn't the same as 10 Gb/s of all varieties of packets hammering the device?

  • @deeeezel
    @deeeezel 1 day ago +1

    I need to get my hands on one of these routers when it's available

  • @JoJoDramo-ih7qk
    @JoJoDramo-ih7qk 1 day ago +1

    Pardon my ignorance: would it make any sense to replace the current Linux stack with DPDK + VPP for performance? Or would it create a big security hole on a normal client PC without any real improvement, given bulky multicore x86 CPUs?

  • @sezam84
    @sezam84 1 day ago

    Nice video… are you planning to put this project on Kickstarter or a similar platform? I am interested in the product :)

  • @xdevs23
    @xdevs23 1 day ago +9

    I didn't know DPDK existed. Looks very promising. However, it feels like it is non-standard. I don't know if administrators really want to deal with DPDK/VPP if Linux already provides really good infrastructure. But hey, as long as it works and doesn't interfere with what I do, that's fine. It would be interesting to see the latency on this. The bandwidth may be high, but latency is also something to keep in mind.

    • @jamess1787
      @jamess1787 1 day ago +1

      DPDK is used in server environments where the host OS doesn't need to interfere with the data, and where you can tunnel the traffic to separate containers or guest VMs.
      Cellular infrastructure/mobility cores for LTE do this to optimize hardware requirements (and the space available). It's really cool tech; weird to think about its complexities though. Not sure how Linus or any of the kernel maintainers agreed to implement it 😂

    • @hubertnnn
      @hubertnnn 1 day ago

      Never heard of it either, but the interface looks like Cisco's router command-line interface, and since it was made by Cisco, I wouldn't be surprised if it actually is their router CLI.
      And if it is, then administrators already know it. It would be a bit worse for non-administrators, because Cisco's CLI is a pain to learn, with many non-obvious things.

    • @xdevs23
      @xdevs23 14 hours ago

      @@hubertnnn But administrators don't necessarily know how to work with Cisco. I also don't quite trust Cisco: these are big corporations, you don't really know what you're getting into, and it could turn out to be a huge enterprise mess.

  • @rnts08
    @rnts08 5 hours ago

    I'd like to see a performance run over a couple of hours with the standard IMIX stream. It is more realistic for day-to-day use.

  • @MoraFermi
    @MoraFermi 1 day ago +1

    dpdk, yep, called it.

  • @Adam130694
    @Adam130694 1 day ago +2

    All of that work for 2xSFP+ & 3x2.5GbE?

  • @jamess1787
    @jamess1787 1 day ago

    I ran an EPC (vEPC) LTE core using DPDK. It was a "black box" aside from the usual sysadmin/sysop stuff.
    It's cool how fast you can push your hardware. The vendor had some weird data-plane (DPE) bug, but that's beside the point.

  • @SwissPGO
    @SwissPGO 14 hours ago

    Would a variable polling frequency lower the worry some have about power consumption? With little traffic, the frequency could be lowered, with up to 100 ms of sleep, and as soon as packets arrive, the sleep is eliminated?

  • @hasanismail.
    @hasanismail. 1 day ago +1

    also huge fan of the project

  • @YonatanAvhar
    @YonatanAvhar 1 day ago +1

    How does having a single core pinned to 100% affect "idle" power consumption over using Linux kernel based networking?

  • @pianoman4Jesus
    @pianoman4Jesus 1 day ago +2

    Oh wow! The magic of DPDK+VPP! +1 vote to go in that direction from me. And I do not do Twitter, or X, or Bluesky... I hope you still check your old-fashioned email address for when I need to get in touch with you outside of a YouTube reply comment. 😎 I will next send this to my "right-hand man in America" who built our custom Linux firewall platform 20 years ago. I really like your enthusiasm, dedicated to this much-needed space in the IT industry. Thank you SO much! Over time, I hope you will have enough volume growth that you would consider a next-level-up model with more network ports. 🥳🧐

  • @EndreSzasz
    @EndreSzasz 1 day ago +7

    100% non-stop on one core... there goes power efficiency and heat. So you keep the CPU pegged 24/7 for the 5 minutes of 10 Gb transfer you do per day.

  • @D9ID9I
    @D9ID9I 1 day ago +4

    I guess a Mikrotik like the RB4011 or RB5009 can do that without any issue

  • @DanielRodriguez-ff5cs
    @DanielRodriguez-ff5cs 11 hours ago

    Thanks!

  • @Quozul
    @Quozul 1 day ago +1

    I'm a bit concerned about the 100% core usage, won't it increase the idle power usage of the router? 🤔

    • @tomazzaman
      @tomazzaman  22 hours ago +1

      Negligible. Maybe half a watt.

  • @derickasamani5730
    @derickasamani5730 2 hours ago

    What about 25 Gbps traffic? Can a single core push that much with DPDK and VPP enabled?

  • @MikkoRantalainen
    @MikkoRantalainen 12 hours ago

    UEFI and ACPI, which are used in regular PC hardware instead of a device tree, are technically the worse option. A superior way would be to have a device tree provided by the BIOS, and there would be no need for UEFI or ACPI after that.
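
    For readers who haven't seen one: a device tree is a declarative hardware description that the kernel parses at boot, instead of discovering hardware via ACPI tables. A made-up example node:

        /* hypothetical NIC node in a .dts file */
        ethernet@1f00000 {
            compatible = "vendor,example-10g-mac";  /* made-up binding */
            reg = <0x1f00000 0x10000>;              /* MMIO register window */
            interrupts = <0 42 4>;                  /* IRQ the driver may use */
            phy-mode = "sgmii";
        };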

  • @DubitoErgoSum-pd6cr
    @DubitoErgoSum-pd6cr 19 hours ago

    Damn, yes!

  • @foxfoxfoxfoxfoxfoxfoxfoxfoxfox
    @foxfoxfoxfoxfoxfoxfoxfoxfoxfox 1 day ago +1

    What kind of performance difference is there between VPP and native Linux packet forwarding? And what happens performance-wise if you switch the network driver to polling mode in Linux?

    • @qdaniele97
      @qdaniele97 1 day ago +3

      My guess would be that at zero to very little traffic, kernel networking with interrupts would be slightly more efficient/faster, but at any level of traffic above that, VPP would gain the advantage.
      That's because, even with no traffic at all, VPP would still be polling the NICs to see if there's anything new, while the kernel would be doing other things, waiting for the NICs to tell it something happened.
      With more traffic, things wouldn't change much for VPP, but they would a lot for the kernel, which would be receiving lots of interrupts and constantly having to stop what it was doing to listen to what the NICs have to say.

    • @hubertnnn
      @hubertnnn 1 day ago +1

      @@qdaniele97 I see it that way as well. Maybe release 2 versions of the OS/firmware: one with the classic Linux kernel (the default one) and one with VPP.

    • @qdaniele97
      @qdaniele97 22 hours ago +1

      @@hubertnnn That could be an option (or you could install your own OS).
      But I think it's likely a router will always be in a situation where VPP has the advantage.

  • @johnjbateman
    @johnjbateman 18 hours ago

    Where do you buy your custom keyboards now?

    • @tomazzaman
      @tomazzaman  7 hours ago

      Still have a couple in stock, so no need to, but I must admit, I did buy the Wooting to test out their Hall effect switches :)

  • @sledgex9
    @sledgex9 1 day ago +1

    I wonder why DPDK chose to constantly poll the interface vs asking the kernel to notify it when a packet arrives, and then continuing in userspace. I.e., use the kernel only for the raw packet notification.

    • @triffid0hunter
      @triffid0hunter 1 day ago +2

      Context switching is expensive - en.wikipedia.org/wiki/Context_switch#Cost

    • @sledgex9
      @sledgex9 1 day ago

      @@triffid0hunter Doesn't polling need context switching too? I mean, each time you ask the kernel "is there a packet yet?" you momentarily enter kernel space for the kernel's response. I could be totally wrong though.

    • @Bronko15344
      @Bronko15344 23 hours ago +1

      @@sledgex9 The whole point is to allow direct device-to-userspace interaction without the kernel being involved.

  • @FredrikRambris
    @FredrikRambris 1 day ago

    What performance can we expect using stock linux drivers and networking stack?

    • @tomazzaman
      @tomazzaman  22 hours ago

      I was still able to get 10 Gb, but never on a single thread; those get to about half that.

  • @AnIdiotAboard_
    @AnIdiotAboard_ 23 hours ago

    Fun fact: many 48-port 10 Gbps switches used to run on a 333 MHz single-core CPU. So 1200 MHz should piss it without any stress :)

    • @heheelium
      @heheelium 22 hours ago +1

      Those do HW offloading.

    • @mrlazda
      @mrlazda 15 hours ago

      But they have a switching ASIC, and the CPU is there only for configuration and housekeeping functions. All switching is done in the ASIC, not in the processor. For a 48-port 10 Gbps switch, for example, you can use a Marvell Prestera ASIC, and the processor will be at 0% usage during Layer 2 and Layer 3 wire-speed switching.
      This SoC is missing that functionality. Basically, on a good day, this SoC is realistically a gigabit-capable L3 switch (comparable ARM SoCs are sub-gigabit in real-world use cases, most in the 300-600 Mbps range).

  • @LuminousWatcher
    @LuminousWatcher 13 hours ago

    So on twitter a post was called a tweet. Is a BlueSky post called a BS?

  • @tomaszjeniec
    @tomaszjeniec 1 day ago

    What’s the power consumption change with that core always blasted?

    • @tomazzaman
      @tomazzaman  22 hours ago +2

      Negligible difference. Around half a watt.

  • @kristopherleslie8343
    @kristopherleslie8343 1 day ago +1

    What about L3?

  • @yerdude
    @yerdude 1 day ago

    Tomaž is gold

  • @Saturn2888
    @Saturn2888 21 hours ago

    I have no clue what's happening in this video anymore. So many acronyms. I clicked on it not knowing what I was getting into, and then it got into topics I have no clue about. I've never heard of your channel before, so this is all random for me, and I eventually got very lost trying to figure out what the video is even talking about.

  • @shephusted2714
    @shephusted2714 1 day ago +1

    Can you make a more enterprise one, with more RAM etc.? A cheap 100G router/switch? I think quite a few people will start going from 10G to 100G fiber; it is the first place you invest.

  • @maksiodzidek1
    @maksiodzidek1 1 day ago

    good job

  • @trendingtopicresearch9440
    @trendingtopicresearch9440 21 hours ago

    Also with 50 firewall rules?

  • @georgehooper429
    @georgehooper429 1 day ago

    Very nicely done. From the outside it seems a long way to get this type of throughput. It looks like you added a few more gray hairs setting this all up, but in the end it works. Well done!
    I realize you might have a limited lab environment, but it would be interesting to set up all the 10GbE ports with an iperf system. I think there were 4 in your build (sorry, poor memory). See if you can push 10GbE per port through the router; you might have a 2x2 setup for iperf testing. The idea is to see at what point you saturate that single core and then need to dedicate a second or third core to networking, while keeping the remaining core(s) for the kernel and system management (SNMP, DHCP and such).
    I think it's a good plan to keep one of the interfaces (1GbE) connected to the kernel as an out-of-band management interface. This will keep port forwarding on the data interfaces offline until the system is fully booted and the system status/setup is confirmed.

  • @ms2649
    @ms2649 1 day ago

    What about the thermals now that one core is pinned at 100% all the time?

    • @tomazzaman
      @tomazzaman  22 hours ago

      Adds about half a watt to total power consumption. Negligible, but once we have the cases manufactured, we'll of course run the proper tests to make sure it really doesn't impact anything.

    • @ms2649
      @ms2649 20 hours ago

      @tomazzaman Does the software only work with polling? I haven't done much with networking on embedded Linux or device drivers myself, but it seems not ideal to constantly poll for data (though I have 0 experience with this, so who am I to judge).

  • @noxos.
    @noxos. 1 day ago

    Can you make a homelab tour?

  • @massimilianogilli1164
    @massimilianogilli1164 19 hours ago +1

    Isn't this quite a clickbait title? The CPU doesn't do shit, since everything is offloaded, as you demonstrated many times.

  • @CRCinAU
    @CRCinAU 1 day ago

    Soooooo, no iptables? no nftables?

    • @tomazzaman
      @tomazzaman  22 hours ago

      Correct. VPP does come with its own firewall though, both stateful and stateless.

  • @chewbaccabg
    @chewbaccabg 13 hours ago +1

    Sure.. now add some VLANs, firewall rules and so forth & try again :)

  • @fgfgfgfgfgfg1003
    @fgfgfgfgfgfg1003 22 hours ago

    A single core can't; a single core plus a lot of ASIC additions: yes.

  • @scottxiong5844
    @scottxiong5844 1 day ago

    Nice!

  • @dave7244
    @dave7244 2 hours ago

    WHY ARE YOU SHOUTING AT ME??

  • @matjazmuhic550
    @matjazmuhic550 1 day ago

    Sixpthy.... Ptheven...

  • @pcislocked
    @pcislocked 1 day ago

    Perfect timing for... well, it's 1am, but a good video is a good video, I guess.

  • @SB-qm5wg
    @SB-qm5wg 1 day ago

    The MikroTik CRS305 has 4 SFP+ ports and runs on a tiny single-core 32-bit 800 MHz CPU.

    • @EndreSzasz
      @EndreSzasz 1 day ago

      That is a switch; the packets don't reach the CPU, the switch chip deals with them. If they do go to the CPU, it can barely do 1 Gb. Check the test results on their product page.

  • @ShadowFandub
    @ShadowFandub 1 day ago

    HAHAHAHA

  • @AtTheLetterM
    @AtTheLetterM 1 day ago

    Please, no white backgrounds, I'm dying.

  • @hasanismail.
    @hasanismail. 1 day ago

    4th

  • @jackipiegg
    @jackipiegg 1 day ago +1

    You can do 10G but refuse to do 2.5G instead of 1G, facepalm

    • @hubertnnn
      @hubertnnn 1 day ago +1

      He said it in one of the previous videos: that CPU has modes with specific lists of interfaces in each mode.
      You cannot just distribute the bandwidth as you please.

    • @jackipiegg
      @jackipiegg 1 day ago

      @@hubertnnn
      It's 2024 and he's still releasing a 1 GbE NIC and calling it "pro". Instant fail, and no one will buy it.

    • @zackey_tnt
      @zackey_tnt 1 day ago

      You are one of those newfangled modern-day Grinches, aren't ya

  • @Holy_Hobo
    @Holy_Hobo 1 day ago +14

    Bluesky is cringe

    • @chimpo131
      @chimpo131 1 day ago +1

      this guy also sounds like such an insecure douche whenever he talks 😂

    • @rapamune
      @rapamune 1 day ago

      It has virtually already turned into an echo chamber with extreme moderation. Only viable for radical progressive left users at this point in time.

    • @PR-cj8pd
      @PR-cj8pd 1 day ago +3

      There's nothing wrong with twitter

    • @marshallb5210
      @marshallb5210 1 day ago +5

      xitter is cringe

    • @eat.a.dick.google
      @eat.a.dick.google 1 day ago +4

      X is the worst cringe.

  • @cheako91155
    @cheako91155 1 day ago

    Not great for a home network: 1 core at 100%, 24/7? Interrupts and DMA* are wonderful; why go back to a world without them? (* You shouldn't be able to get physical addresses from userspace.)

    • @hubertnnn
      @hubertnnn 1 day ago

      Interrupts use CPU resources. They are good for low traffic, but for high traffic, polling is better.
      A perfect situation would be the ability to disable interrupts until all data in the queue is processed.
      Some microcontrollers I worked with had this feature, where you would receive just one interrupt and no more until it is cleared (which happens after the queue empties).
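
      Linux does have exactly this model for NICs: NAPI. The IRQ handler masks further RX interrupts and schedules a poll loop that drains the queue in batches; the interrupt is only re-armed once the queue is empty. A hedged sketch of the driver side (the mynic_* helpers and struct are hypothetical; napi_schedule and napi_complete_done are the real kernel API):

          #include <linux/interrupt.h>
          #include <linux/netdevice.h>

          /* struct mynic is hypothetical; it embeds struct napi_struct napi. */

          /* IRQ handler: one interrupt, then hand off to polling. */
          static irqreturn_t mynic_irq(int irq, void *data)
          {
              struct mynic *nic = data;
              mynic_disable_rx_irq(nic);   /* hypothetical hw helper */
              napi_schedule(&nic->napi);   /* defer work to the poll loop */
              return IRQ_HANDLED;
          }

          /* Poll callback: process up to `budget` packets per call. */
          static int mynic_poll(struct napi_struct *napi, int budget)
          {
              struct mynic *nic = container_of(napi, struct mynic, napi);
              int done = mynic_rx_drain(nic, budget);  /* hypothetical */

              /* Queue drained below budget: stop polling, re-arm the IRQ. */
              if (done < budget && napi_complete_done(napi, done))
                  mynic_enable_rx_irq(nic);
              return done;
          }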

    • @cheako91155
      @cheako91155 1 day ago

      @@hubertnnn Because 100% is free... wtf are you talking about.

  • @Triro
    @Triro 16 hours ago

    Lol, Bluesky is just as bad now as Twitter....
    So you've officially lost me on the project.

  • @craftefixxxx
    @craftefixxxx 1 day ago

    first,
