Why can't powermeters get more accurate? Even if companies weren't making false claims

  • Published 25 Jul 2024
  • I've been asked this a few times over the years. I did phone in the bit about the duty cycling problem, but didn't cut it because I think it might lead into something else later.
    0:00 Intro
    0:38 Error calculated wrong
    1:59 The right error calculation
    2:40 So which device is right
    4:33 Method 1 Odd man out
    5:47 Method 2 Transitive
    8:46 The Problem with Sig Figs
    10:13 The duty cycling problem
    11:40 Summary
    12:14 Rambles
    14:04 One more dig
  • Science & Technology

COMMENTS • 29

  • @gplama
    @gplama 1 year ago +24

    Great work Keith. This video popped up at the end of another day here testing and retesting meters and spending way too much time down the rabbit hole. You bring up a LOT of very good points about making claims of what is 'correct'. I'm sure I've slipped up a lot with my wording and/or assumptions many times. This is a great reminder to always question the data, even when it appears good.

    • @kwakeham
      @kwakeham 1 year ago +9

      You and Ray have been following this advice just about all the time. It's confidence in your reviews that means that if people spend a few minutes searching, they can see whether the marketing numbers hold true or are just lies. I'm no wordsmith, so I don't really notice minor hiccups and miswords. It's the core that matters. You're on your A game, unlike the peer-reviewed stuff.

  • @Dcrainmaker
    @Dcrainmaker 1 year ago +19

    Great video Keith. Indeed, as we ‘lose’ that third wheel (so to speak) due to lack of PowerTap hub disc option, it’s become more challenging to have three concurrent units on road. As you noted, there’s still a good opportunity for developing guidelines for how to properly compare power meters, depending on time/budget - min thresholds. Various napkin-level attempts have been made over the years, but collectively we’ve never quite escalated to something more widely circulated. Of course, as Shane pointed out above, eventually you reach the rabbit holes that take days of untangling, or years of historical testing experience. Either way, appreciate the kind words!

    • @kwakeham
      @kwakeham 1 year ago +6

      That kind of makes me want to do two things. One is a video on the flaws at an internal level inside devices and the tests that can exploit them. The other is to build a new trainer test rig using a giant (7 hp) servo motor left over from some work I did for a trainer company. However, unless someone needed it, I don't have a good reason to build said thing (again, and improved; the original was mostly torn down).

    • @Dcrainmaker
      @Dcrainmaker 1 year ago +3

      @kwakeham First, I meant to mention this in my initial comment but got distracted halfway through: the video quality and graphics of this video were a solid step up! Nicely done. As for flaws and how to show them, I think that'd definitely be useful. And not so much in the manner of trying to kill devices, but to help companies catch the issues earlier, which in turn makes for better consumer devices. Plus, of course, such a video is helpful for reviewers to find more areas to dig into. As for a trainer test rig, I've always loved the concept on paper. The challenge I've had with ever moving forward on building one is that these days it seems 99% of the issues that power meters have (once they arrive on my doorstep) tend to be ones that machines wouldn't typically find in a lab setting: everything from how I as a human pedal, to how the meter handles rough outdoor conditions, to indoor heat drift (which a rig might help with, assuming the rig can compensate correctly).

  • @tedwingate
    @tedwingate 1 year ago +3

    There's a whole other argument to be made on accuracy vs. precision. It's very easy to conflate the two concepts.

  • @Ed.R
    @Ed.R 1 year ago +3

    Having developed and tested my own power meter I appreciate the problem here. Explained very well as your videos always are.

  • @chuckb4375
    @chuckb4375 1 year ago +1

    I imagine that the "flock concept" plays a role here, like when cars queue up in a lane without valid reason. If company 'A' makes the current de facto gold standard that other companies have been conforming to, new makers/models might be making tweaks to avoid being flagged by analyzers/reviewers as your "Odd man out". The nail that sticks out gets hammered down!
    It would be sad if the deviations of a new innovative design were actually more accurate, but suppressed.
    Early in my career when phone line data modems were still a thing and somewhat of a dark art, I brought my product designs to a small (2 people) respected independent test house that had custom-design test equipment and automated torture tests that took several days to run. At the end of it, I knew if I had a design/component problem to fix, and incidentally how I stacked up against competition. The key point is that the measuring stick was primarily a calibrated custom stimulus-response test harness, NOT another vendor's product. The power meter + smart trainer market size might not be large enough to support a small independent test house, but in an industry where accuracy claims are on every sales sheet, it seems that's what is needed...

  • @BiffBruise
    @BiffBruise 1 year ago +1

    FWIW, a useful analysis "tool" for comparing / evaluating two measurements is the Tukey mean-difference (aka Bland-Altman) plot. It extends the average comparison mentioned in the video to a visual method which evaluates the entirety of the observations.

    • @kwakeham
      @kwakeham 1 year ago +1

      Ah yes, I haven't used that often but very useful too! Thanks for the suggestion.
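
The mean-difference comparison mentioned in this thread can be sketched in a few lines. This is a minimal illustration, not anything shown in the video: the function name `bland_altman` and the two sample series are made up, and the 1.96 multiplier for the limits of agreement assumes the differences are roughly normally distributed.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (means, diffs, bias, lower_loa, upper_loa) for two paired series.

    means/diffs are the per-sample points of the plot; bias is the mean
    difference; the limits of agreement are bias +/- 1.96 standard deviations.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length series with at least 2 samples")
    means = [(x + y) / 2 for x, y in zip(a, b)]
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return means, diffs, bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical time-aligned readings in watts from two meters.
meter_a = [100, 152, 201, 249, 303]
meter_b = [98, 150, 204, 252, 300]

means, diffs, bias, lower, upper = bland_altman(meter_a, meter_b)
print(f"bias {bias:.2f} W, limits of agreement {lower:.2f} to {upper:.2f} W")
```

Plotting `diffs` against `means` gives the plot itself. Note that it only describes how two devices disagree; it does not say which one is right.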

  • @StuartLynne
    @StuartLynne 1 year ago

    The same discussion also holds for other sensor types, heart rate being a good example (which for some can include ECG and accelerometer data.)
    There is also data loss during transmission which can cause dropouts for power and especially for heart rate (which impacts HRV statistics derived from RR intervals.)
    While most head units appear to store (e.g.) power once per second, you can get and save it more often, although that depends on the power meter/trainer implementation, on how they measure the data, and on whether they are willing to report it faster. The Wahoo Kickr Race Mode is an example; that works via (sort of) Bluetooth over Ethernet. But ANT+ allows for 8 Hz reporting (the default is typically 4 Hz), and the BLE Cycling Power Service has the Request Sampling Rate procedure to set the sampling rate (which is a UINT8 with 1 Hz resolution).
    I'm building a Windows app that can record any of the training-related sensors at least for indoor rides. Power, trainers, heart rate, SMO2, VO2, etc. And as many data streams as you want, ANT+ or BLE. Work in progress, but the intent is to implement as much of this type of testing as possible for use in realtime while on your trainer. www.fitnesshrv.com

  • @adamsinbox
    @adamsinbox 1 year ago

    Very much appreciate the in-the-weeds view of this topic.

  • @RolfOstergaard
    @RolfOstergaard 1 year ago

    Keith. Good nerdy video. And you didn't even talk about the many ways error (percentage) can be defined (something never really specified). Or how, with the method you described, we should describe error at/near 0 W :) :) :)
    After I went from designing power meters to designing torque sensors for ebikes, it is fun to see that we now achieve much better accuracy, lower latency, and better resolution (time/angle/torque/power) at a much lower price. I suspect there is a future sports PM based on ebike technology just waiting to happen :)
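
The point about error definitions can be made concrete with two common conventions, percent of reading and percent of full scale. This is only a sketch: the function names and the 2000 W full-scale value are assumptions for illustration, not figures from the video or from any manufacturer.

```python
def percent_of_reading(measured_w, reference_w):
    """Error relative to the reference value; undefined when the reference is 0 W."""
    if reference_w == 0:
        raise ZeroDivisionError("percent-of-reading error is undefined at 0 W")
    return 100.0 * (measured_w - reference_w) / reference_w


def percent_of_full_scale(measured_w, reference_w, full_scale_w):
    """Error relative to a fixed full-scale range; stays finite near 0 W."""
    return 100.0 * (measured_w - reference_w) / full_scale_w


# The same 2 W offset at a 10 W reference, under both definitions.
# 2000 W is an assumed full-scale range for this example.
print(percent_of_reading(12, 10))
print(percent_of_full_scale(12, 10, 2000))
```

The same absolute offset looks large as a percentage of a small reading and tiny as a percentage of full scale, which is why a bare "±1%" says little unless the definition is stated.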

  • @robertchung4914
    @robertchung4914 1 year ago +3

    Keith, I agree with you that most reviews/reviewers are flawed. However, I disagree that there are no usable references. When I came up with VE 20 years ago, it was exactly because I was interested in evaluating PM accuracy. So, I don't compare one PM against another (or against two others): I compare power and speed against known physical parameters.

    • @tanhalt
      @tanhalt 1 year ago

      ...and strain gauge PMs that allow some sort of "stomp test" (i.e. static check of torque or force measurement) go a LONG way towards ensuring accuracy...all that's left in the calculation is accurate rotational speed measurement, which shouldn't be THAT difficult to accomplish ;-)

    • @kwakeham
      @kwakeham 1 year ago

      If two things say they have similar accuracy, you can't just go around saying one is right. That's amateur hour, and anyone who's studied any statistics or design of experiments knows this. The other fact is that these are load cells. There is no "magic", and in effect they are pseudo-"static" devices if you move / rotate your reference frame. Most failures show up when testing forces that shouldn't give results but tend to produce an apparent response. As for comparing to physical parameters, personally I'm not a fan of trying to compare a strain gauge to a Rube Goldberg machine of math and assumptions. It might "work", but load cells and strain gauges don't need such methods to actually test; there are way more detailed stats that come with a professionally calibrated load cell than anyone can get trying to test a powermeter against estimation methods.
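
The static check and the "all that's left is rotational speed" remark in this thread rest on two textbook relations: torque is force times lever arm, and power is torque times angular velocity. A rough sketch follows, assuming a crank-based meter, a horizontal crank, and a known mass hung at the pedal spindle; the function names and example values are illustrative only.

```python
import math

G = 9.80665  # standard gravity in m/s^2


def static_torque_nm(mass_kg, crank_length_m):
    """Torque from a mass hung at the pedal spindle of a horizontal crank."""
    return mass_kg * G * crank_length_m


def power_w(torque_nm, cadence_rpm):
    """Power = torque x angular velocity, with cadence converted to rad/s."""
    return torque_nm * 2.0 * math.pi * cadence_rpm / 60.0


# Example: 20 kg hung on a 172.5 mm crank, then the same torque at 90 rpm.
torque = static_torque_nm(20.0, 0.1725)
print(f"expected static torque: {torque:.2f} N·m")
print(f"power if held at 90 rpm: {power_w(torque, 90):.1f} W")
```

A check like this only exercises the on-axis force; it says nothing about the off-axis responses discussed elsewhere in the comments.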

  • @bleigh6562
    @bleigh6562 1 year ago +1

    Thanks for the video.
    Makes one look at everything with more of a critical eye.

  • @victrolaman2007
    @victrolaman2007 1 year ago

    Excellent video, thanks for making. I have always questioned what I see as the elephant in the room, measurement accuracy traceability back to a national standard. Such traceability is required for the calibration of most measurement instruments used to verify results. In the various reviews, I have not heard any mention of this traceability being provided by the power meter supplier. Also, I was aware that there would be limitations introduced due to sampling quantization, however, I found interesting that the data is only reported in integer values. I assumed that floating point would have been specified in the standard. Thanks again, very interesting and insightful.

    • @kwakeham
      @kwakeham 1 year ago

      You're very right. They don't provide it, nor do companies provide the standard load-cell-like characteristics. This is more down to marketing. People can at least feign understanding of "1% error is better than 1.5%", but give them creep response, off-axis response, full-scale error, etc. and those are no longer marketing specs. This is one of the difficulties with consumer devices. Almost all companies use weights that have some level of traceability or tend to be checked regularly against scales calibrated with traceable standards. Nobody in the consumer public cares. Most don't know what NIST is.
      But that's only half the problem. Just using a traceable weight isn't quite enough. PMs don't only get one force; they are subjected to many forces in various directions that don't contribute to the torque that is used to calculate power. These are the most telling of failures at the force-sensing level. If a device senses in the X direction and I apply forces in Y, it should always remain at zero, but most don't. How far off they are is where problems arise. Now add forces in Z and the 3 associated torques (which can be tested by offsetting forces as well) and you have the basis of what should be tested.
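
The integer-reporting point raised at the top of this thread can be put in numbers. The sketch below assumes the reported value is the true power rounded to the nearest whole watt, which is an assumption made for illustration; real devices may truncate or filter before reporting. The function name is hypothetical.

```python
def worst_case_rounding_error_percent(power_w):
    """Largest percent-of-reading error caused by rounding to the nearest watt.

    Rounding to an integer can be off by at most 0.5 W, so the relative
    effect shrinks as the power being measured grows.
    """
    if power_w <= 0:
        raise ValueError("power must be positive")
    return 100.0 * 0.5 / power_w


for watts in (50, 100, 250, 1000):
    pct = worst_case_rounding_error_percent(watts)
    print(f"{watts:>5} W: up to {pct:.2f}% from integer rounding alone")
```

At low power the rounding step alone is comparable to a claimed 1% accuracy for a single sample, although averaging over many samples reduces its effect.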

  • @Justin14100
    @Justin14100 11 months ago

    Hi Keith,
    Once again, a great video. I've enjoyed a lot of your work. I have an observation that I am wondering if you could help me explain. I've noticed that on rough and undulating MTB terrain, my Quarq reads higher than my Power2max (both on MTBs, with a couple of years of data). DC Rainmaker noticed it too in his Garmin Rally review. The Quarq matches my road data much more consistently (going back to 2014).
    Might this be a sampling thing? I've read (without strong source material) that the Quarq sample rate is much higher. Or perhaps a data smoothing issue with the Power2max on rough terrain?
    Thanks!

  • @davet003.5
    @davet003.5 1 year ago

    Thanks Keith. What would it take to change fit files to record decimal power data and how much will it disrupt all the apps that use that data point?

  • @StavrosAvramidis42
    @StavrosAvramidis42 1 year ago

    Very nice video!
    In a similar vein, I reverse engineered Wahoo's Direct Connect protocol to capture that data at 10 Hz 😌.

    • @kwakeham
      @kwakeham 1 year ago +1

      That sounds cool, did you publish your work anywhere?

    • @StuartLynne
      @StuartLynne 1 year ago

      Yes, please! Can we look at anything?

  • @user-iw9hg4wq6r
    @user-iw9hg4wq6r 1 year ago

    Hello! I am developing a cycling analysis service similar to Intervals or WKO, and I use statistical analysis of power data uploaded by users to determine the accuracy of power meters. Detecting the errors in power meter readings is as challenging and interesting as estimating FTP or power duration models.
    I have created a detailed cycling physics simulator, similar to bikecalculator. By comparing actual PR (personal record) data from Strava with computed values from the simulator for millions of cases, I can define the range of error between actual power and the simulator. This error range includes factors such as air resistance, rolling resistance, transmission loss, bike weight, incorrect rider weight input, and riding habits.
    Based on this, I have developed software that can determine power meter accuracy, and if the maximum error range is within 10-11% according to our simulator, the power meter is considered to be usable without significant issues.
    Users of my service can input their power, time, weight, and other data for uphill rides of 4% or more over a period of 3 months into the simulator, and receive a report that allows them to easily compare the difference in values. This report is very useful. Of course, it may be difficult to expect the level of accuracy per second mentioned in the video, but overall it is not too challenging to track fitness.

    • @kwakeham
      @kwakeham 1 year ago

      Estimating things is not quite my realm. I'm a bigger proponent of directly measuring things. So I can't really help here. You're potentially getting into aero stuff.

    • @user-iw9hg4wq6r
      @user-iw9hg4wq6r 1 year ago

      @kwakeham Ok. I respect your viewpoint. Personally, I believe that any method is fine as long as users can determine if their power meter is "shit" or not. If I see any bogus powermeters, I'll be sure to visit this beautiful realm again.
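
The kind of physics estimate discussed in this thread is usually built on the standard steady-state road-load model: gravity, rolling resistance, and aerodynamic drag, divided by drivetrain efficiency. The sketch below is not the commenter's simulator. The function name and every default parameter (rolling coefficient, CdA, air density, efficiency) are assumed values chosen for illustration, and it ignores wind and acceleration.

```python
import math

G = 9.80665  # standard gravity in m/s^2


def required_power_w(speed_ms, total_mass_kg, grade,
                     crr=0.005, cda=0.32, rho=1.225, drivetrain_eff=0.97):
    """Estimate rider power at steady speed on a constant grade (rise/run).

    All defaults are illustrative assumptions, not measured values.
    """
    theta = math.atan(grade)
    gravity = total_mass_kg * G * math.sin(theta) * speed_ms
    rolling = total_mass_kg * G * math.cos(theta) * crr * speed_ms
    aero = 0.5 * rho * cda * speed_ms ** 3
    return (gravity + rolling + aero) / drivetrain_eff


# Example: 80 kg rider plus bike climbing a 6% grade at 4 m/s.
estimate = required_power_w(4.0, 80.0, 0.06)
print(f"estimated power: {estimate:.0f} W")
```

On steep climbs the gravity term dominates, so mass and grade errors matter far more than CdA. That is consistent with the comment's restriction to uphill rides of 4% or more, and it also shows how many assumed inputs stand between such an estimate and a direct force measurement.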

  • @SMarkGee
    @SMarkGee 1 year ago

    I thought the new Verve PM claims better than 1% accuracy.

  • @jd0johnson
    @jd0johnson 1 year ago

    I think a good question is how accurate power meters need to be for them to be useful to a cyclist. After all, we as cyclists are not seeking a scientific instrument; we are seeking a training aid.