6 Ways Scientists Fake Their Data

  • Published 22 Dec 2024

COMMENTS • 166

  • @PeteJudo1
    @PeteJudo1  1 year ago +4

    You can protect your privacy and support the channel by getting 20% off DeleteMe at joindeleteme.com/JUDO20

  • @themartdog
    @themartdog 1 year ago +213

    I really, really like your initial point. There is no reason why big journals shouldn't publish when something doesn't work. The fact that they don't makes no sense at all.

    • @MimouFirst
      @MimouFirst 1 year ago +26

      Agreed.
      I find it quite against the scientific method to ignore 'no positive result' results.

    • @sri5086
      @sri5086 1 year ago +3

      @@MimouFirst but then the problem is deciding which 'no positive result' studies to publish.

    • @MimouFirst
      @MimouFirst 1 year ago +19

      @@sri5086 I think we should publish everything in a database, so that researchers know what's been done, what the result was, and what the method of testing was. That would be the best imo. Right now there may be studies sitting unpublished 'in a drawer' that weren't statistically significant due to a mistake, but without the mistake could have been significant and useful. We won't know, since they're not published.
      Financial incentives won't let this become a reality though. Quite sad.

    • @ahmednematallah
      @ahmednematallah 1 year ago +11

      They don't because it would significantly increase the volume of papers, which people are very unlikely to read or cite, leading to a lower h-index. People are interested in positive results. Not saying it's fair, but a better solution would be starting negative-results journals with a different measurement metric.
      Another problem is that allowing negative results to be published and giving them the same weight as positive results incentivises authors in the wrong way.
      The main problem is journals rejecting perfectly good papers to brag about a low acceptance rate, or just because the reviewers are biased or picky.

    • @themartdog
      @themartdog 1 year ago +3

      @ThePowerMoves no there wouldn't. There are plenty of metrics that could be used to decide what to publish and what not to publish.

  • @9adam4
    @9adam4 1 year ago +101

    Why did the urologist accept only certain patient specimens for his data set?
    He was pee hacking!!! 😅

  • @theondono
    @theondono 1 year ago +53

    I left academia altogether because after spending 500h in the lab collecting data, I was constantly pushed to re-do all my measurements with the sole intent of statistically faking them.
    Prior to that I had spent a month running tests to validate how many samples I should take for each experiment. My result was that I needed at least 250 samples to have a good estimator of the average. I proposed moving forward with that number, but was told to do 500 samples per experiment instead (doubling my time in the lab unnecessarily).
    My measurements looked good to my supervisor (they matched his simulations). Then one day he left for a conference where a competing research group showed their results, and suddenly my numbers did not match anymore.
    I asked if there was something wrong with the previous simulations; the answer was no. He just told me "These numbers must be wrong", so he wanted me to "run the experiments again", only this time with 100 samples. I plainly refused. He ran the experiments using 80 samples, and then presented a culled set of fewer than 50 as his "results".

    • @johndor7793
      @johndor7793 11 months ago +3

      Out of curiosity, what was the subject? Maybe provide a little more detail; it would be interesting to know.

    • @theondono
      @theondono 11 months ago

      @@johndor7793 My supervisor designed a chip (an ASIC) that performed an electronics task (a programmable delay cell).
      I was supposed to measure a particular parameter of that behaviour. I did my job, and just for completeness presented a comparison of the same value measured with different numbers of samples (different runs). Then I measured each of the prototypes individually. I sent all the graphs his way, including the stdev results (which was the primary figure of merit).
      Then he left for a conference where another team presented a very similar device (built with a different technology) with much better performance (their stdev was orders of magnitude smaller).
      On his return, he wanted to repeat all measurements with a lower number of samples. His justification was that the high number of samples was artificially increasing the stdev. I complained that this made no sense; if anything we should use *more* samples to capture rarer events, even if their contribution was small.
      The "procedure" he ended up using for data processing involved not only reducing the sample count to 80, but also removing the 15 smallest and the 15 largest values to "remove outliers" (see the sketch below). Unsurprisingly, his measurements had a very low stdev.

  • @griof
    @griof 1 year ago +54

    As a mathematician/statistician, I have helped a few friends run the statistics for their research (mostly "low profile" medical research). They did a good job collecting data, and they actually got significant p-values (technical details below). The problem was the uninteresting topic they chose... For example: a very simple technique to reduce pain during some simple but common nose interventions. These friends all got refused by journals and conferences favouring other researchers with less evidence but more "engagement".
    Tech. details: no exotic stats, just a standard test for homoscedasticity, normality (Shapiro-Wilk) and then a t-test/chi-squared test for the null hypothesis (a sketch follows after this thread).

    • @coolieo2222
      @coolieo2222 1 year ago

      .

    • @hungrymusicwolf
      @hungrymusicwolf 1 year ago +3

      Yep, and that's why we get bogus science today. Science is a common good; it should not be measured by its popularity as it currently is in journals.

    • @kayakMike1000
      @kayakMike1000 1 year ago +1

      There are lies, damned lies, and statistics. (I have a math degree I don't really use.)
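
The sketch referenced in @griof's comment above: a minimal, illustrative version of that "no exotic stats" pipeline (Levene's test for homoscedasticity, Shapiro-Wilk for normality, then a two-sample t-test). The group names and pain scores are made up, not taken from the study described.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Illustrative pain scores for a control group and a treatment group.
control   = rng.normal(6.0, 1.5, size=40)
treatment = rng.normal(5.0, 1.5, size=40)

# 1. Homoscedasticity: Levene's test for equal variances.
_, p_levene = stats.levene(control, treatment)

# 2. Normality: Shapiro-Wilk on each group.
_, p_norm_c = stats.shapiro(control)
_, p_norm_t = stats.shapiro(treatment)

# 3. Two-sample t-test for the null hypothesis of equal means
#    (Welch's variant if the variances look unequal).
_, p_ttest = stats.ttest_ind(control, treatment, equal_var=p_levene > 0.05)

print(f"Levene p={p_levene:.3f}, Shapiro p=({p_norm_c:.3f}, {p_norm_t:.3f}), t-test p={p_ttest:.3f}")
```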

  • @stephenmcinerney9457
    @stephenmcinerney9457 1 year ago +7

    2:01 Stopping rules / 'data peeking'
    3:00 Deleting outliers / data trimming
    4:14-6:03 Ad
    6:03 Variable manipulation
    7:31 Excessive hypothesis testing
    8:29 Excessive model fitting
    11:36 Conclusion and acknowledgment

  • @Kagrenackle
    @Kagrenackle 11 months ago +9

    I'm a mathematician and I love that I don't have to do experiments or find "statistically significant" results. It's still publish or perish, and it's difficult to establish important results, but the logical nature of it makes it harder to fake.

  • @sunway1374
    @sunway1374 1 year ago +60

    "No. 3 Variable manipulation" is what I see happening the most in my field of research. Most of my eminent colleagues don't consider it a problem. It's called discovery. In fact, often there is not even a hypothesis; just run the analysis and see which variables work.
    Sometimes a student or a postdoc reports that a particular set of variables gives the highest correlation. The supervisor says great, here is the explanation and here is the story; it is reassuring that this fits the theory. However, in a subsequent meeting the student comes back and says that unfortunately he made a mistake in the analysis, and here is the correct set of new variables. Well, don't you worry. We find another theory and story; it fits something else!
    I don't even work in social science, psychology or behavioural science. I am in physical science. It is still possible to fish for any relationship in the data and find a physical explanation backed up by mathematics to explain it. These studies are published in top journals in my field, as well as in the Nature family.

    • @tech_priestess_channel
      @tech_priestess_channel 1 year ago +3

      Can you please explain why "variable manipulation" is bad at all? That was unclear from the video, and it's unclear from this comment as well.

    • @PeteJudo1
      @PeteJudo1  1 year ago +15

      @@tech_priestess_channel Variable manipulation significantly increases the chance of false positives. This is because the world has some randomness to it, so by engaging in variable manipulation you are essentially relying on randomness to produce your effect, rather than intentionally testing what works in a systematic way. Sometimes variable manipulation is justified if the authors clearly state that this is "exploratory" research that needs to be reproduced. For example, they might say, "Ah, this wasn't what we intended to study, but we observed this other interesting effect that warrants further research." The problem is that often, as I said in the video, scientists will write up the experiment as if they were intending this the whole time, which leads to lots of legitimate-looking studies that are actually just the result of luck.

    • @sunway1374
      @sunway1374 1 year ago +5

      @@tech_priestess_channel No. Sorry. I can't explain it well. I am not sure it's actually bad myself. That's why I didn't explicitly say it one way or the other. But let me try...
      Say you have a dataset of recorded observations of 100 independent variables (x's) and 1 variable (y) you want to predict. How many of these x variables would you expect to show a statistically significant correlation with y? It's actually another 'higher' level of statistical significance you can also specify and test. You would expect that by some fluke some x variables would be highly correlated with y even though there is no actual physical connection. This is called false discovery. You can look it up (see the simulation after this thread).
      So, if you do variable manipulation, you are just allowing yourself a high probability of false discovery. Often, the researchers will then write "More research is needed" in the conclusion.
      But here is another common problem of the academic culture. The original researchers have already had their moment of fame and moved on to other new ideas. Who wants to do the further research that supports other people's old studies? If you do, you are actually doing good science. But you won't be a star and you will struggle in your career. The original researchers will get all the glory.

    • @meneldal
      @meneldal 1 year ago

      I think it's okay to go fishing for some relations, but you should apply a stricter standard than 0.05 for those. I don't think it's bad if you test 30 things and find one with a significance of 0.005, but at 0.05 it is likely just random chance (see the sketch after this thread).

    • @deusex9731
      @deusex9731 1 year ago

      @@PeteJudo1 OK, I understand it way better now. So the issue is more the false representation of a "study", rather than the method itself. If everything is stated clearly as an exploration of an interesting effect, and from there they work in standard ways to reproduce it, this wouldn't be much of an issue?
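
To put numbers on the false-discovery discussion in the thread above, here is a purely synthetic sketch: 100 noise predictors tested against an unrelated outcome, plus the stricter Bonferroni-style threshold that @meneldal's suggestion amounts to. Nothing here refers to a real dataset.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

n_obs, n_vars, alpha = 200, 100, 0.05
X = rng.normal(size=(n_obs, n_vars))   # 100 predictors that are pure noise
y = rng.normal(size=n_obs)             # outcome unrelated to every predictor

p_values = np.array([stats.pearsonr(X[:, j], y)[1] for j in range(n_vars)])
print("'significant' at p < 0.05:", int((p_values < alpha).sum()))  # ~5 by chance alone

# Chance of at least one fluke when fishing across all 100 variables.
print("P(at least one false discovery):", round(1 - (1 - alpha) ** n_vars, 3))  # ~0.994

# Bonferroni-style correction: tighten the per-test threshold to alpha / number of tests.
print("corrected threshold:", alpha / n_vars)                          # 0.0005
print("survive correction:", int((p_values < alpha / n_vars).sum()))   # almost always 0
```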

  • @Batmans_Pet_Goldfish
    @Batmans_Pet_Goldfish 1 year ago +44

    Thank you for speaking on this. P-hacking is part of what makes reading scientific studies so confusing for the layperson, and causes results to be so often misinterpreted by communicators.

    • @heyman620
      @heyman620 1 year ago +1

      No bro, you just don't know the field. Communicators often don't give a F about science, kind of like this channel lately (the dude just rides the fraud hype and diminishes the names of many honest scientists to the uneducated public while he communicated pseudo-science for years; I liked this channel during the Gino thing but it got embarrassing). In my eyes, expecting to understand papers without knowledge of a field is just not a serious type of behavior and a severe case of the Dunning-Kruger effect, which has been reproduced multiple times (not p-hacked). Like, my field is CS, and I can't understand math papers. Or physics. Come on bro, a little self-reflection.

    • @forstuffjust7735
      @forstuffjust7735 1 year ago +2

      @@heyman620 Sadly I have to agree with you. While the academic world has a lot of shady stuff (heck, I would say most sources I used for my MSc thesis are not that honest), this channel deteriorated into a clickbait rage-against-academia channel, because that's where the views are.

  • @SugarBoxingCom
    @SugarBoxingCom 1 year ago +15

    Feynman summarized your video with a single quote: trying to find a theory in a pile of data.

  • @extraleben6734
    @extraleben6734 1 year ago +17

    I accompanied diploma, master's, and doctoral theses for 6 years as a computer scientist specializing in driving simulation. Removing outliers from studies was standard procedure for the psychologists, and I have seen quite a bit. Since that time, psychologists and doctors of all kinds are no longer highly regarded by me. It is a shame how I now look down on these fields.
    But I don't consider variable manipulation to be a problem, because it can certainly happen that one comes across something much more interesting in their studies, and omitting that would be sad.

    • @Qkano
      @Qkano 1 year ago +2

      Boy was this kind of "fraud" common... but more often than not it was "innocent" / "well motivated", not because the researchers were intentionally setting out to cheat the system.
      Personally, I had a rigid approach to it: if I collected the data point, it was listed in the results section. When there were good (supporting) reasons for rejecting a data point, they were listed in the results and the point flagged as excluded from the analysis. The reason could even be as simple as "so erroneous it had to be wrong", but still it was listed.
      Spurious results can occur in a number of unavoidable ways: operator error, instrument glitches, transcription (recording) errors, etc. Clearly, if I were trying to estimate the distance to the moon and had 49 results at ca. 250,000 miles and one at 25,000 miles, few would argue the outlier had to be included in the average.
      So while my analysis of the results might exclude outliers, their existence was never hidden from the reader.

  • @srremus9781
    @srremus9781 1 year ago +20

    From the chemical sciences: it's a waste of resources if several groups perform the same experiments over and over and then don't publish what doesn't work out. Maybe you publish your procedure/idea and someone else finds the key step for success. That's part of the progress, in my opinion. At least other scientists won't waste resources on it again if it's a dud.
    There should be a database for failed experiments. Like very short publications: what you did, why, what you aimed at and what came out.

    • @joerudnik9290
      @joerudnik9290 1 year ago +2

      Absolutely, failure is valuable information!!!

  • @markcarey67
    @markcarey67 1 year ago +4

    I had a statistically significant amount of drugs in my system so I had to resort to pee hacking....

  • @sillysad3198
    @sillysad3198 1 year ago +5

    I was taught to delete exactly one outlier on each end.
    It was given a complicated justification, which is roughly that it is more LIKELY that we are deleting a measurement mistake,
    but if we delete more outliers it becomes less likely.

    • @FalkonNightsdale
      @FalkonNightsdale 1 year ago +2

      Exactly… I was studying statistics and we were told to cut 2.5% from each side…

  • @psychotropicalresearch5653
    @psychotropicalresearch5653 10 months ago +2

    P-hunting: The illogical in pursuit of the indefensible [KG]
    Oscar Wilde [foxhunting: the unspeakable in pursuit of the uneatable]

  • @jivekiwi
    @jivekiwi 1 year ago +4

    Thanks Pete. I'm a 44-year-old guy who reads a huge amount, yet I have learnt so much from your videos. You have brought up many disturbing aspects of research, which is a bit gutting to be honest, but a sad truth is better than a promising lie. After the first one I watched, YouTube has been throwing many similar vids my way, and this needs to be more widely known. Even Dan Ariely! Deleted his book a week or so ago.

  • @kayakMike1000
    @kayakMike1000 1 year ago +2

    Scientists often run mathematical models and try to claim their dumb software is an experiment.

  • @AlvinRamasamy
    @AlvinRamasamy 1 year ago +3

    “If you graph the numbers of any system, patterns emerge.”

  • @GooseCee
    @GooseCee 1 year ago +1

    This video was EXTREMELY fascinating and I was so captivated the whole time! Good work :)

  • @WisdomThumbs
    @WisdomThumbs 9 months ago +1

    Funny. Explaining this, and other types of scientific fraud, to my friends in 2021 earned me a cussing-out and the "contrarian" moniker.

  • @jasonmoy5452
    @jasonmoy5452 1 year ago +1

    Selective sampling can just be an interaction effect, which is perfectly fine as long as you justify it and are honest about it.

    • @allagnstall
      @allagnstall 10 months ago

      I'm no scientist. Forgive me, but are you prepared to say how _justify_ is operationally defined?

  • @spagzs
    @spagzs 1 year ago +4

    If you pitch a study that examines homosexuality in seagulls…you'll get a grant 😂

    • @samsonsoturian6013
      @samsonsoturian6013 5 months ago

      Pitch a study that examines piracy in seagulls, you won't

  • @kylejohnson8447
    @kylejohnson8447 10 months ago +1

    Absolutely baffled. I always assumed that if a result was surprising, the first thing these scientists would do is replicate the experiment before giving the conclusions any credit. How is it that this isn't the case?! As a mechanical engineering senior who wants to go to grad school, I can't imagine that if an experiment showed any sort of statistical significance for the first time, it wouldn't be immediately tested by others. Is this isolated to behavioral science/sociology type fields, or is this also a problem in STEM?

  • @luszczi
    @luszczi 1 year ago +5

    Often it's not fraudulent and researchers are just fooling themselves: "I KNOW this effect exists, I can show it if I just get rid of those outliers. Those outliers must be due to something else anyway."

    • @geokm7717
      @geokm7717 1 year ago +1

      That makes lots of sense

    • @straightfacts5352
      @straightfacts5352 1 year ago

      The easiest person in the world to lie to is yourself.

  • @Guishan_Lingyou
    @Guishan_Lingyou 1 year ago +1

    I don't see any excuse for reputable journals not instituting mandatory preregistration of any paper that might be published: the hypothesis, methods, etc. should be given to the journal before the study is run. That would vastly reduce the freedom to use questionable and non-transparent techniques to make results look more significant than they are.

  • @klikkolee
    @klikkolee 6 months ago

    Regarding variable manipulation and excessive hypothesis testing: to me, they're different forms of the same idea. You acquire data to test a hypothesis, the hypothesis is not supported, but the data indicates that a similar hypothesis may be correct. If we just ignore that, then the line of inquiry dead-ends. There needs to be a way to still pursue those ideas. Is negating the increased chance of false positives just a matter of requiring a new, independent data set?

  • @weeb3277
    @weeb3277 7 months ago +1

    I like how it happens so often that there are patterns.
    Soon there will be best practices too.

  • @adonm6998
    @adonm6998 1 year ago +6

    "Trust the science." I do; it's the corporations and scientists I don't trust.

  • @Ken-er9cq
    @Ken-er9cq 11 months ago +1

    Many studies are smaller than they should be, which means things like confidence intervals are large. Then the conclusion is going to be that there is no effect, but there may be a large effect; we just don't know. Journals should not publish this type of paper, because you are rewarding someone for doing bad science. However, if I get a result like "eating X increases the risk of Y by a factor of 1.01 (95% CI 0.98, 1.04)", that tells me it can only have a small effect. Compare with 1.1 (95% CI 0.5, 1.7), where it could have a sizeable effect in either direction (see the sketch below).
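
A rough sketch of how such intervals behave, using the usual log-scale normal approximation for a risk ratio. The 2x2 counts are invented for illustration and are not taken from any study mentioned here.

```python
import math

def risk_ratio_ci(cases_exp, n_exp, cases_unexp, n_unexp, z=1.96):
    """Point estimate and ~95% CI for a risk ratio (normal approximation on the log scale)."""
    rr = (cases_exp / n_exp) / (cases_unexp / n_unexp)
    se_log_rr = math.sqrt(1 / cases_exp - 1 / n_exp + 1 / cases_unexp - 1 / n_unexp)
    lo = math.exp(math.log(rr) - z * se_log_rr)
    hi = math.exp(math.log(rr) + z * se_log_rr)
    return round(rr, 2), round(lo, 2), round(hi, 2)

# Large study: narrow interval, so only a small effect remains plausible.
print(risk_ratio_ci(1010, 100_000, 1000, 100_000))
# Small study: similar point estimate, but the wide interval tells us almost nothing.
print(risk_ratio_ci(11, 1_000, 10, 1_000))
```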

  • @debasishraychawdhuri
    @debasishraychawdhuri 2 months ago

    No data-trimming should be allowed. "I don't understand why it is that way" is not an argument in favor of throwing it out.

  • @aayambasnet548
    @aayambasnet548 1 year ago +1

    Can you explain why number 3 is wrong?
    Wouldn't seeing patterns where you did not expect them be the very basis of new science? If you unexpectedly see such patterns in physics and dig deeper into them, it could give a whole new paradigm. Of course, it may be a false positive like you mentioned, but you can't really say that focusing on such a correlation is entirely bad. If the researcher sees the correlation, and then researches more about that correlation, I do not see anything wrong with it.

    • @metalslegend
      @metalslegend 1 year ago

      You have to report this in your paper then:
      "1. We found no associations between variables X and Y",
      "2. But we found associations between X and Z, and we did this and this after that" ...
      But most papers just report the XZ story to begin with, even changing their original hypotheses for XY to similar ones for XZ.
      And that's super bad! Not only did you not report the XY story, you claimed that testing XZ was successful, which is not right. XZ came up randomly while testing XY.

    • @joinedupjon
      @joinedupjon 1 year ago +2

      It's not obvious to a lot of people why it's super bad... The p-value is the chance of getting that correlation by sheer fluke. If you throw enough data against the wall, you'll eventually get a p-value under the threshold by luck alone.

    • @stephenmcinerney9457
      @stephenmcinerney9457 9 months ago

      @@joinedupjon Yes, the "Multiple Testing Problem". You're not allowed to keep shopping multiple hypotheses against one dataset until you find one that satisfies the magical p-value threshold.

  • @HellRaiZOR13
    @HellRaiZOR13 1 year ago +1

    The main reason why I don't want to go into academia and don't want to become a scientist after I finish my PhD.

  • @philidor9657
    @philidor9657 10 months ago +1

    Can someone explain to me why "variable manipulation" is considered misconduct? The way I understood it, it seems totally reasonable to pivot your research to something else when you notice a certain result during an experiment that is more interesting than the one you're currently studying. It's really common in my field, chemistry, to discover interesting reactions while you're doing unrelated research and turn that into a paper instead...especially if the work you were doing before wasn't working well.

    • @zyrohnmng
      @zyrohnmng 9 months ago +2

      If, when doing your research, you collect data on many variables (let's say 20 variables in addition to the one you're testing), the probability that 1 of those 21 variables will show a statistically significant result when there is none is much, much higher than the chance of the 1 variable you were planning to test showing a statistically significant result.
      The ethical thing would be to discard this collected dataset, showing that your experiment failed, note that the other variable has potential promise, and collect a new set of data specifically for an experiment on that other variable.
      It's bad practice because it increases the odds of you getting a false positive.

    • @Wanhope2
      @Wanhope2 8 months ago

      I strongly suspect that case can be comorbid with stopping rules, unfortunately.
      Ideally people would be comfortable with, and actually able to, publish negative results.

  • @EvilDMMk3
    @EvilDMMk3 1 year ago +1

    Say you were doing a study and you noticed a strong correlation that wasn't your original hypothesis: what should you do? Clearly you should not publish falsely, but there might also be a real result there.

    • @PeteJudo1
      @PeteJudo1  1 year ago +2

      You can call it out in your write-up, but make it clear that it was not the original intention of the study, and that it needs replication in its own randomised controlled trial to verify. What you should never do is pretend that was the intention of the study all along, which unfortunately happens more often than we would like to admit.

  • @mathijs58
    @mathijs58 1 year ago

    Great video on an initially confusing term. Well explained, with recognizable examples. Now try to find some stock footage of 'scientists' that are not so squeaky clean; real science is quite messy, even if you don't resort to p-hacking etc...

  • @Queen1001N
    @Queen1001N 1 month ago

    A related problem is that just because something is statistically significant doesn't mean something real is occurring. There's an episode on SciShow about p-values. They opened with a team that put a salmon in an MRI machine and ran an analysis to see if the salmon could distinguish various emotions on human faces. The results did show statistical significance. The problem? The fish was dead. It was literally a fish they had just bought at the supermarket. (Know that this wasn't meant to be a real study. It was meant to be kind of a stunt.)

  • @leannevandekew1996
    @leannevandekew1996 1 year ago

    Another video on Zimbardo from Stanford University published research on the Brown Bread Study and the Prison Study.

  • @DoriansPortrait
    @DoriansPortrait 8 months ago

    Wait.....so you're saying I can be like a scientist too, I can just delete my data!?

  • @poornoodle9851
    @poornoodle9851 1 year ago +2

    When scientists pick and choose data, science devolves into belief…basically it's religion.

  • @halneufmille
    @halneufmille 1 year ago

    My feelings about 3, 4 and 6 are:
    - You should openly say what your initial intent was in terms of variables and samples, and tell us it didn't work out.
    - You should openly say all the other things you tried in terms of alternative variables or samples.
    - The fact that there seems to be an effect for this other variable or this subsample may be interesting if the statistical significance is high. It may constitute an interesting hypothesis for future research and be valuable for the advancement of science.
    Just imagine if Fleming had said: well, it looks like this mold has some antibiotic properties, but since I didn't set out to study this from the start, I will just ignore it and not report it. If he had been out of time or funding, he wouldn't have discovered penicillin.

  • @BrendenFP
    @BrendenFP 1 year ago +1

    Researchers should be required to outsource their statistical analysis to (an) independent statistician(s). The person who makes a hypothesis should not be the one who (statistically) tests it.

  • @the19trier
    @the19trier 1 year ago

    Now you get an even better topic and channel!!

  • @conradsieber7883
    @conradsieber7883 1 year ago

    Journals should be responsible for auditing a sample of the studies they publish...

  • @TheSpiritualCamp
    @TheSpiritualCamp 11 months ago

    I'm not an expert, so can someone please explain to me what is wrong with #3 ("variable manipulation")? If I run an experiment for one specific correlation (like clubbing and extraversion) but the data happen to show a significant result for another correlation (like clubbing and agreeableness), what is unethical about that? Why not consider it just a lucky finding? Just like the scientists who tried to develop a drug to cure angina and accidentally discovered Viagra? Is their invention less valuable because it wasn't the goal they tried to achieve in the first place?

  • @zackinator1439
    @zackinator1439 6 months ago

    I am not a professional in the field by any means, but I'm casually interested. One thing I don't understand is why the example of clubbing affecting extraversion is wrong. If you collect a sample of people who do and don't go clubbing and have them take a personality test, you can use that data to attempt to find correlations between clubbing and personality type. If you think it may affect extraversion but it actually affects agreeableness, you still found a correlation that is just as valid as if you had found one for extraversion, no? If the test is a Big Five personality test that gives results for different aspects of personality, and not just intro/extraversion, then that test can be said to reasonably approximate your sample's personality trends. So in both cases your sample is a mix of people who go clubbing and people who don't, and your collected data is personality data. Other than the fact that you're changing the specific trait you thought might have a correlation, both are still under the same question of "Is there a correlation between clubbing and personality?". As long as your sample is large enough and as free of bias as reasonably possible, and none of the questions asked were specifically tailored to one trait or the other, your question may have been ever-so-slightly different, but in the other case you would have collected the same data, from the same sample, and done the same statistics.
    In the example of sample slicing, where you chop up the sample until it fits your hypothesis, that wouldn't necessarily work, because if you take a subset of your sample, that subset is by definition smaller than your original sample and may not be large enough to be statistically significant. If you take a varied sample and find no correlation, but it seems like there might be a correlation for a specific subgroup, you go and collect a new sample of that subgroup that is of appropriate size and bias.

  • @danielastoica3354
    @danielastoica3354 1 year ago

    Great job Pete!

  • @0.-.0
    @0.-.0 1 year ago

    Thanks for your source in the description!

  • @brootalbap
    @brootalbap 11 months ago

    The majority of researchers p-hack in one or more ways. There are rewards for doing so and it hurts one's career not to.

  • @NICE-NG163
    @NICE-NG163 1 year ago

    We must never forget when they coerced the children for use as paper-logical to temporarily and marginally "protect" adults.

  • @SquizzMe
    @SquizzMe 1 year ago

    The number of people who blindly accept anything scientists publish is astonishing and frightening.

  • @vw9659
    @vw9659 1 year ago

    To imply that this is widespread is nonsense. Maybe in the social sciences, but certainly not in the areas I know.
    Trial registration is widely practiced, which means that before you start the study you have to publish what you intend to do and how you intend to do it, including what statistical analysis you will use and what its statistical power is to answer the specified research questions only. And then you do the study. When you want to publish it in a decent journal, you have to demonstrate that it was originally registered and that you did the study in the exact way it was designed.
    Also, in team-based research it would be very hard for any individual to tamper with the data. The senior investigators, for example, often don't know exactly where the raw data is, or how to find any given variable in it (there may be hundreds of variables).
    But in physics, chemistry and other hard sciences, you often don't know what you're going to find until you do the study. If no one has done that research before, you can't define how to analyse the data before you see the data. That's how new knowledge is discovered. If you analyse the data in a way that is not convincing, or draw conclusions that are not justified, other scientists will recognize that. And then when they try to replicate the results, they won't be able to.
    Most data fraud examples you have covered are lone-wolf researchers in "soft" disciplines, manipulation of things like exemplar images in the published work that have no actual relationship to the study's data, or rare graduate students or postdocs who may have fabricated data.
    Finally, scientific consensus is based on replication, not single studies. Anyone fabricating data to achieve a particular result knows that their study will never be able to be replicated by others. You do see "null" studies that find no significant result published - particularly when they are counter to an initial study that may have suggested something different. And then the consensus will change to match what those later studies have found. If that happens more than once for a given scientist, they will get a reputation for bad work.

  • @alexstewart8097
    @alexstewart8097 4 months ago +1

    1 - Not in vain was something to the effect of "what makes a scientist great isn't his or her natural talent but their good moral character" attributed to Albert Einstein, so perhaps in throwing away Judeo-Christian values from society they might have thrown the baby out with the bathwater.
    2 - So, Judo, it is time for you to bring those back ASAP. Hajime!
    3 - Love your videos... Shema!!!

  • @chinhhoang6304
    @chinhhoang6304 1 year ago +1

    You're definitely not wrong about the state of science being crooked to some extent. But you're definitely not right to say that some of those practices are used to "fake data". There's a huge difference between intentional p-hacking and data exploration or robustness checks for a certain research design. And they all depend on the nature of your research questions. Some practices might be fair in some designs but seem unfair in others. Don't overgeneralise things like that.
    Still, I like your videos. Keep improving!

  • @1789Bastille
    @1789Bastille 1 year ago

    Can you interview Elsevier, Springer Nature, or Wiley about this?

  • @doctorlolchicken7478
    @doctorlolchicken7478 11 months ago

    Like everything else, the problem is incentives. Incentivize scientists to be biased and they will be. Your explanation of the various issues is very simplified. Those are all real problems, but it's all a question of degree. Unfortunately it is not easy to self-identify whether what you are doing is valid or you've gone too far. The only "rule" of best practice I can think of is that you must include everything you did in the paper. Also, you should sensitivity/stress test your results. In other words, show how slight variations impact the conclusion (see the sketch below). If your conclusion doesn't hold up under variations, then chances are it's not a reliable result. In a sense, "p-hacking" is choosing to only selectively document some of the research.
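
One minimal way to do the sensitivity/stress test suggested above: rerun the same comparison under a few small, defensible analysis variations and check whether the conclusion survives. The data and the particular variations below are just examples.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
a = rng.normal(0.3, 1.0, size=60)   # toy measurements, group A
b = rng.normal(0.0, 1.0, size=60)   # toy measurements, group B

# Same question, several defensible analysis choices.
print("Student t, all data   :", stats.ttest_ind(a, b).pvalue)
print("Welch t (unequal var) :", stats.ttest_ind(a, b, equal_var=False).pvalue)
print("drop 1 extreme / side :", stats.ttest_ind(np.sort(a)[1:-1], np.sort(b)[1:-1]).pvalue)
print("Mann-Whitney U        :", stats.mannwhitneyu(a, b).pvalue)
# A robust conclusion should not flip between significant and not across such small variations.
```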

  • @picahudsoniaunflocked5426
    @picahudsoniaunflocked5426 1 year ago

    6:32 what of us whose whole being is clubbing???

  • @Patrizsche
    @Patrizsche 1 year ago +3

    This video isn't about faking data, but faking results 😭😭😭

  • @templeodoom4634
    @templeodoom4634 11 months ago

    All my time in academia taught me is that academic vigilante should be a paid profession

  • @gladiatorzz2061
    @gladiatorzz2061 1 year ago

    I think overfitting is the most insidious of these. It can be very difficult to detect because science is inherently messy. Worse, it implies a causal relationship where one does not exist. In contrast, many of the others can be identified as anomalous results or as special cases. (See the sketch below.)
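
A small illustration of the point above, on synthetic data where there is deliberately nothing to find: a high-degree polynomial "explains" the training points almost perfectly yet fails on new data, which is exactly why overfitting implies relationships that are not there.

```python
import numpy as np

rng = np.random.default_rng(3)

x_train, y_train = np.linspace(0, 1, 12), rng.normal(size=12)    # pure noise
x_test,  y_test  = np.linspace(0, 1, 100), rng.normal(size=100)  # more pure noise

coefs = np.polyfit(x_train, y_train, deg=9)   # excessive model fitting

def r2(y, yhat):
    return 1 - np.sum((y - yhat) ** 2) / np.sum((y - np.mean(y)) ** 2)

print("train R^2:", round(r2(y_train, np.polyval(coefs, x_train)), 3))  # close to 1
print("test  R^2:", round(r2(y_test,  np.polyval(coefs, x_test)),  3))  # near zero or negative
```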

  • @takiyaazrin7562
    @takiyaazrin7562 1 year ago

    Enlightenment of academia - you are a good channel.

  • @steveboel12
    @steveboel12 1 month ago

    How widespread is this publish-or-perish culture?

  • @echen1716
    @echen1716 10 months ago

    I don't agree with number 3. It can be an interesting second-order question and an interesting additional finding.

  • @TigerTzu
    @TigerTzu 1 year ago +3

    I don't understand why variable manipulation is a problem. Every explanation I've seen of why it's bad is just the presenter saying "it's bad science" or "it causes false positives" (this video is no exception), but no one ever actually explains why essentially re-titling a study when you notice a correlation you didn't initially predict is a problem.

    • @ironyelegy
      @ironyelegy 1 year ago +1

      Calling your hypotheticals actual data that represents real stuff does seem like a recipe for disaster, but I am no scientist

    • @Oler-yx7xj
      @Oler-yx7xj 1 year ago +3

      The problem is that the p-value (the probability that the correlation arises purely by chance) gets calculated incorrectly. If you look for multiple potential correlations, the chance for at least one of them to randomly appear in the data is higher than if you only look for one correlation. Therefore the p-value is estimated lower than it really is, and your work is called significant even though it is not. It's like throwing dice: with 5 dice the chance of getting at least one 6 is higher than with 1 die (see the calculation below).
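
The dice analogy above, written out as a minimal calculation (assuming independent tests): the chance of at least one "significant" fluke grows quickly with the number of correlations you look for.

```python
# One die vs five dice: probability of seeing at least one six.
print("one die:", round(1 / 6, 2), " five dice:", round(1 - (5 / 6) ** 5, 2))  # 0.17 vs 0.60

# Same logic for p-values under a true null, at a 0.05 threshold.
for k in (1, 5, 10, 20):
    print(f"{k:2d} independent tests -> P(at least one p<0.05 by chance) = {1 - 0.95 ** k:.2f}")
# 1 -> 0.05, 5 -> 0.23, 10 -> 0.40, 20 -> 0.64
```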

  • @charlescrawford9972
    @charlescrawford9972 8 months ago

    Excessive Hypothesis Testing: Gerrymandering for science!

  • @pedromenchik1961
    @pedromenchik1961 1 year ago +3

    Not the chair tricking us into thinking that Pete has super buff shoulders

    • @PeteJudo1
      @PeteJudo1  1 year ago

      I occasionally do lateral raises :P

  • @adilneves6527
    @adilneves6527 3 months ago

    So how can we trust the research?

  • @nilshorgby3080
    @nilshorgby3080 10 months ago

    If you have a large dataset and you find strong significance in a subset of the data, would it still be valid if the p-value is less than the originally specified significance level divided by the number of subsets you examined? For example, if I look at men and women separately, this doubles the possibility of finding a false significance. If I have specified a significance level of 0.05 and get a p-value < 0.025 for women but not for men, would the result still be considered significant, as the p-value is less than 0.05 even when multiplied by the number of groups? (See the sketch below.)
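
That is essentially the Bonferroni correction, and dividing the significance level by the number of subgroups examined is the standard conservative answer. A minimal sketch with made-up p-values for the two subgroups in the question; statsmodels' multipletests is one common helper for this.

```python
from statsmodels.stats.multitest import multipletests

pvals = [0.024, 0.41]   # made-up p-values: women, men
alpha = 0.05

# Manual Bonferroni: compare each p-value to alpha / number of tests.
print([p < alpha / len(pvals) for p in pvals])            # [True, False]

# Same decision via statsmodels (which also offers less conservative methods, e.g. Holm).
reject, p_adjusted, _, _ = multipletests(pvals, alpha=alpha, method="bonferroni")
print(reject, p_adjusted)                                  # [ True False] [0.048 0.82]
```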

  • @dadsonworldwide3238
    @dadsonworldwide3238 1 year ago +1

    They're following the structuralism in place, though. We predetermine a theory of everything, then change the interpretation of evidence to fit the mythology we've made.
    You can't follow the evidence where it leads and appease a grand unified evolutionary theory of everything; cause and effect won't permit it.
    Causal unity here pushes effective complexity there.
    The modernization act in America, and basically across the West, created a Chaldean-minded structuralism granting simplicity to the top of the hierarchy and pushing the complexity and division down onto many different disciplines. Within each discipline there are 2 competing cults, 2 opposite archetypical minds. Then under them we have actual industry, where engineers, workers, lab techs and mechanics all absorb the complexity of these classical physical lawisms.
    This is antithetical not only to the classical American founding; it also traps and causes serious division between industries working with the same elements but forced into rationalizing the system they work for, for the benefit of all. It creates different terminology and language that requires mediating translator bishops to bring them together for collaborations.
    This is the 1900s modernization act structuralism; it is a pagan model we have imposed upon ourselves.
    The unity and simplicity clearly need to be in the field, following the evidence where it goes, not in fitting it into grand theories of everything.

    • @dadsonworldwide3238
      @dadsonworldwide3238 1 year ago

      Of course this means many boomers 60+ love this structuralism, but why wouldn't they? It allows them to deterministically imagine a simple form, and all the Chaldean-minded philosophical arguments to bring evidence back under its theory of everything have 5000 years of record to plagiarize and use if measurements challenge this belief system.

  • @mackss9468
    @mackss9468 1 year ago

    We must start publishing the NULLS!!! It’s still very important information to have.

  • @shawnmclean7707
    @shawnmclean7707 1 year ago

    The outliers are what we need to pay attention to.
    The rest of the data is just like following the masses; nothing really interesting there.

  • @bammeldammel
    @bammeldammel 1 year ago +1

    Thank you for bringing this to the attention of the public. However, as a scientist I would ask you to be careful with wording, as I got the impression that you were often talking as if scientists in general were doing something wrong, while the black sheep are luckily still a tiny minority.
    I am still waiting for a journal to publish rigorously done studies that did not result in confirmation of the hypothesis.

  • @huypt7739
    @huypt7739 9 months ago

    Cold fusion was just around the corner...

  • @picahudsoniaunflocked5426
    @picahudsoniaunflocked5426 1 year ago

    Statistics By Jim might be a good interview?

  • @joshuaryan1946
    @joshuaryan1946 8 months ago +3

    You are great--BUT FOUR SOLID MINUTES out of twelve are spent selling your advertisers, PLUS an interruption by a YouTube sponsor. This is ridiculous. And the result is that you go through your presentation so fast that several parts are hard to follow, with no examples to make them clear.

    • @Wanhope2
      @Wanhope2 8 months ago +1

      This right here! Need to spend more time considering the brutal ad ratio. Though I understand that creators are slaves to the algorithm's trends on video length.

  • @davidnorman5488
    @davidnorman5488 1 year ago

    How trustworthy are the graphs?

    • @metalslegend
      @metalslegend 1 year ago

      Any graph?

    • @QP9237
      @QP9237 11 months ago +1

      If they aren't clearly scaled/bounded, more often than not it's deceptive; expecting the reader to do the formal scaling calculations for the data you presented is unreasonable and dubious at best, since the presentation method was your (the researcher's) deliberate choice. Think of every time Apple posts their stupid graphs comparing "how much faster x is than y" while presenting nothing more than a generalized curve without any bounds (see the sketch below).

  • @vsm1456
    @vsm1456 1 year ago

    I think some of these methods have similarities with common cognitive biases.

  • @MK-ih6wp
    @MK-ih6wp 5 months ago

    How many of these deceptive tactics were used to justify the safe & effective v’s during c19?

  • @blist14ant
    @blist14ant 1 year ago +1

    evil scientists

  • @kayakMike1000
    @kayakMike1000 1 year ago

    Climate scientists do this crap all the time.

  • @idcharles3739
    @idcharles3739 1 year ago

    Mainly what you are saying is that statistical significance isn't significant.
    For example, your first point about peeking - "if they had carried on, they might have found that the data took them away from statistical significance".
    Firstly, that could be applied to every experiment ever conducted. Either they had statistical significance or they didn't. The point of statistics is that it's supposed to tell you when there's enough data. And if more data can disturb that conclusion, then there's something wrong with the concept of statistical significance in the first place.
    Ditto your idea that you should know what you're looking for before you start. If statistics works, it shouldn't matter what you're looking for - if the data you find is significant, the truth doesn't care whether you stumbled across it by accident. (See the simulation sketch below.)
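
The "peeking" point the video makes can be simulated directly: under a true null (no effect at all), repeatedly checking the p-value as data accumulates and stopping as soon as it dips under 0.05 yields far more than 5% false positives. A sketch with arbitrary simulation settings, not a claim about any specific study.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def stops_early(max_n=200, peek_every=10, alpha=0.05):
    """One simulated 'experiment' with no real effect, peeking as data accumulates."""
    a = rng.normal(size=max_n)
    b = rng.normal(size=max_n)                  # both groups drawn from the same distribution
    for n in range(peek_every, max_n + 1, peek_every):
        if stats.ttest_ind(a[:n], b[:n]).pvalue < alpha:
            return True                         # stop early and declare 'significance'
    return False

runs = 2000
false_positive_rate = sum(stops_early() for _ in range(runs)) / runs
print(f"false positive rate with peeking: {false_positive_rate:.2f}")  # well above 0.05
```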

  • @two_horus7337
    @two_horus7337 1 year ago +1

    Hey, let me just put this here for the algorithm :)

  • @jorgebuitrago1016
    @jorgebuitrago1016 1 year ago

    Fake data = Click

  • @RUHappyATM
    @RUHappyATM 1 year ago

    Global Warming data fudge...OMG!

  • @9adam4
    @9adam4 1 year ago +4

    I don't think there's any pressure to perish.

    • @sunway1374
      @sunway1374 1 year ago

      I see what you mean. His English is not correct, at least not in the normal usage. But... "Publish or perish" can be considered a single cultural phenomenon, and these words are used together so often that most people would not find it strange or wrong when they hear it said like this. Still, I agree with you. It would be better to say "the pressure of publish or perish" instead of "the pressure to publish or perish."

  • @finite-element
    @finite-element 1 year ago

    Number 5 is p-hacking? This uploader might need to learn a little more math. A neural network can generalize pretty well with a nonlinear decision boundary, especially when the relation under study is a multidimensional nonlinear one. Higher-order polynomial features are legit. Kernel machines exist for a reason. (See the sketch below.)
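
Both things can be true: flexible, nonlinear models are legitimate, and the usual guard against fitting noise is out-of-sample validation. A minimal scikit-learn sketch on synthetic data where the true relation is linear, comparing cross-validated scores for a linear fit and an unnecessarily high-degree polynomial.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(60, 1))
y = 2 * X[:, 0] + rng.normal(scale=0.5, size=60)   # truly linear relation plus noise

for degree in (1, 12):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    scores = cross_val_score(model, X, y, cv=5)    # out-of-sample R^2 per fold
    print(f"degree {degree:2d}: mean cross-validated R^2 = {scores.mean():.2f}")
# With no real nonlinearity present, the high-degree model's out-of-sample score typically drops:
# that drop is the overfit a single in-sample fit would hide.
```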

  • @expensivepink7
    @expensivepink7 8 months ago

    is anything real😪