What is Synthetic Data? No, It's Not "Fake" Data

  • Published 17 Jun 2024
  • Learn more about Synthetic Data → ibm.biz/Synthetic-Data
    Synthetic data is artificially generated data, as opposed to data based on actual events, but it's not "fake" data. It replicates the properties of real data without the troubles of capturing it, such as confidentiality constraints, low volume, or high validation cost. With synthetic data, it's easier and less costly to train AI models; however, it's not a panacea. For example, synthetic data may not fully represent the unexpected events that happen in the real world. In this video, Martin Keen explains what synthetic data is, its uses, benefits, and challenges; he wraps up his presentation by explaining how it's generated.
    Get started for free on IBM Cloud → ibm.biz/buildonibmcloud
    Subscribe to see more videos like this in the future → ibm.biz/subscribe-now
    #datascience #businesssolutions #lightboard #ibm #computerscience #data #machinelearning

COMMENTS • 45

  • @tmastana
    @tmastana 1 year ago +2

    Amazing series and very classical and engrossing style of explanation... keep up the good work

  • @segunadewola
    @segunadewola 1 year ago +13

    Great video! Best of luck SFC😂

  • @yassontheroad4038
    @yassontheroad4038 1 year ago +4

    I like this friendly instructor

  • @amazingwarrior4
    @amazingwarrior4 6 months ago +2

    What is very interesting about this concept is the validity and reliability of the data. Why don't they talk about it? It's essential when we talk about mathematical sets of any data!

  • @arturocaceres9973
    @arturocaceres9973 1 year ago +4

    Excellent!!!

  • @ndz7372
    @ndz7372 1 year ago +1

    Loved this so much wow

  • @danielmaciel3447
    @danielmaciel3447 9 months ago +15

    I am amazed how this dude can write backwards so perfectly

    • @IBMTechnology
      @IBMTechnology 9 months ago +1

      See ibm.biz/write-backwards

    • @danielmaciel3447
      @danielmaciel3447 9 months ago +1

      @IBMTechnology aha! I knew some sorcery was involved

    • @xaxfixho
      @xaxfixho 6 months ago

      Have you noticed they all seem to be left handed 🧐

  • @StorageGuru
    @StorageGuru 1 month ago

    Very simply explained ...👍

  • @rickharold7884
    @rickharold7884 1 year ago +3

    Yes, cool stuff. We use synthetic data for tracking trucks in the field, taking existing labeled data and transforming the truck in three dimensions to get additional data for the model.

    • @evetsnilrac9689
      @evetsnilrac9689 6 months ago

      Sounds like you used existing real data about the trucks. How is that synthetic data? I fear I'm misunderstanding this.

  • @anirbanc88
    @anirbanc88 1 year ago +1

    so cool, thanks

  • @mthoko
    @mthoko 1 year ago +8

    Great series from IBM in general and this instructor specifically. Slightly hopeful on the Southampton bit, but if you can't dream, what's the point of it all😃

    • @MartinKeen
      @MartinKeen 1 year ago

      I appreciate your generous use of "slightly hopeful" 🙂

    • @vkris81
      @vkris81 1 year ago

      Always had a sweet spot for the saints… hope my club could give a new home for JWP

  • @lozanojavier
    @lozanojavier 1 year ago

    I find it difficult to stop thinking about Martin Keen, and his prediction about Southampton's future in the Premier League. It's quite remarkable that both Southampton and Leicester will be battling it out in the Championship to regain their positions in the top tier in 2025. A great example of the problems with synthetic data.

  • @ndz7372
    @ndz7372 1 year ago

    Thank you so much

  • @user-ef4df8xp8p
    @user-ef4df8xp8p 4 months ago

    Very interesting..

  • @tyrojames9937
    @tyrojames9937 1 year ago +1

    INTERESTING.😀

  • @anandkalhore4089
    @anandkalhore4089 4 months ago +2

    Can synthetic data be as effective as real data? Wouldn't a model trained on synthetic data give false results when used against real data?

  • @prettypenny2353
    @prettypenny2353 1 year ago

    Excellent presentation and excellent instructor.

  • @marshmallow4181
    @marshmallow4181 5 months ago

    Which board do you use?

  • @HoustonKhanyile
    @HoustonKhanyile 1 year ago +4

    I think this video might have jinxed Southampton. Instead of winning the Premier League they are now getting relegated.😢

  • @ianoldfield2598
    @ianoldfield2598 1 year ago +1

    Interesting, if rather simplistic. Having spent the past 5/6 years developing a synthetic police-data model, I can say it is not easy or cheap (if time is factored in). Rows and rows of financial transactions might be easy to generate; less so complex family groups, locations, incidents and crimes, vehicles, and organisations, where these are interlinked, related and reflect real-world scenarios. Whilst IBM has some excellent tools such as i2 and Watson, the real data in those systems would be unlikely to be made available for synthesising.

  • @seanrrr
    @seanrrr 9 months ago

    Synthetic data has been very useful in my field (gene regulatory networks; maps of interactions that affect gene expression within cells). We can't manually test the interactions of tens of thousands of genes, especially across tens/hundreds of thousands of species, so we predict them using large molecular datasets.
    The problem is, how can you evaluate the accuracy of a prediction algorithm if you don't know what's true or false? Synthetic data is super useful, since you can generate data with known interactions that you can compare to. Algorithms can then be ranked on how close their predictions match the synthetic dataset. A great example is the GNW DREAM Network Inference Challenge, if you want to see how they use this!
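
The evaluation idea in the comment above (generate data from a network whose interactions are known, then score a prediction algorithm against that ground truth) can be sketched roughly as follows. This is a toy illustration only, not the method used by GNW or the DREAM challenge: the simulation rule, the correlation-threshold predictor, and all function names here are assumptions made up for the example.

```python
import random

def simulate(true_edges, n_genes, n_samples, seed=0):
    """Toy simulation (assumed rule): each target gene equals the sum of its
    regulators plus small noise; genes without regulators are pure noise."""
    rng = random.Random(seed)
    regulators = {}
    for src, dst in true_edges:
        regulators.setdefault(dst, []).append(src)
    samples = []
    for _ in range(n_samples):
        x = [rng.gauss(0, 1) for _ in range(n_genes)]
        # Assumes every regulator has a lower index than its target.
        for dst in sorted(regulators):
            x[dst] = sum(x[s] for s in regulators[dst]) + rng.gauss(0, 0.1)
        samples.append(x)
    return samples

def pearson(a, b):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def predict_edges(samples, n_genes, threshold=0.8):
    """Hypothetical predictor: call an edge wherever |correlation| is high."""
    cols = list(zip(*samples))
    return {(i, j)
            for i in range(n_genes)
            for j in range(i + 1, n_genes)
            if abs(pearson(cols[i], cols[j])) > threshold}

def score(predicted, truth):
    """Precision and recall of predicted edges against the known edges."""
    tp = len(predicted & truth)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    return precision, recall

# Known (synthetic) interactions: gene 0 regulates gene 2, gene 1 regulates gene 3.
true_edges = {(0, 2), (1, 3)}
samples = simulate(true_edges, n_genes=5, n_samples=200)
precision, recall = score(predict_edges(samples, 5), true_edges)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Because the true edges are chosen up front, any number of candidate predictors could be ranked by the same `score` function, which is the point the comment makes about synthetic benchmarks.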

    • @brandonsnider5871
      @brandonsnider5871 7 months ago

      I love how Synthetic Data works. It's very, very useful. I just really worry that people will start training models on Synthetic data in scenarios in which it would be dangerous to use data that is not perfectly based in reality.

  • @nicoles_handle
    @nicoles_handle 4 months ago

    using the prem was the perfect hook icl

  • @karengomez3143
    @karengomez3143 1 year ago +1

    Takeaway:
    Made-up data can be used to deal with biased real-world data, and it can be obtained from data sources or by transforming existing data, for example by adding noise or using GANs.
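
As a rough illustration of the "adding noise" route mentioned in the takeaway above, here is a minimal sketch that derives synthetic rows from a few existing numeric rows. The sample values, the `jitter_rows` name, and the 5% relative noise scale are all assumptions chosen for the example; a real pipeline would need to check that the synthetic rows still preserve the statistical properties that matter.

```python
import random

def jitter_rows(rows, noise_scale=0.05, copies=3, seed=0):
    """Return synthetic rows made by adding Gaussian noise to each value.

    noise_scale is the standard deviation relative to the magnitude of each
    value (an assumed, arbitrary choice for this sketch).
    """
    rng = random.Random(seed)
    synthetic = []
    for row in rows:
        for _ in range(copies):
            synthetic.append(
                [v + rng.gauss(0.0, noise_scale * abs(v)) for v in row]
            )
    return synthetic

# Made-up "real" rows, e.g. (height in cm, weight in kg).
real_rows = [[170.0, 65.0], [182.0, 80.0], [158.0, 52.0]]
synthetic_rows = jitter_rows(real_rows)
print(len(synthetic_rows), "synthetic rows from", len(real_rows), "real rows")
```

The GAN route from the same takeaway is not shown here, since it needs a trained generator model rather than a simple transformation.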

  • @watipasokamanga8908
    @watipasokamanga8908 9 months ago

    nice, now I can generate data for my HIV viral load detector model at no cost

  • @nagkumar
    @nagkumar 3 months ago

    Why is it not called a fake message that is not clear in the video..

  • @itdataandprocessanalysis3202
    @itdataandprocessanalysis3202 1 year ago +2

    Thanks for the video.
    May I ask... is this a British accent?

    • @MartinKeen
      @MartinKeen 1 year ago +1

      It is. Although I have been in the US for a good while now, so maybe a bit of a Mid-Atlantic accent.

    • @itdataandprocessanalysis3202
      @itdataandprocessanalysis3202 1 year ago

      @MartinKeen Thank you.

  • @almor2445
    @almor2445 29 days ago

    How is this not basing later models on copies of copies of potentially incorrect data? Won't we end up with piles of structurally sound, true-seeming noise eventually?

    • @almor2445
      @almor2445 29 days ago

      Imagine I use the latest GPT model to scrape the wiki page regarding a political viewpoint and generate 10 new pages of slightly different content based on that. All 10 will contain the gaps, flaws and biases of the original. What does this achieve? We already have enough examples of the language in use, so it's not for that. If it's for quality facts, you're not generating synthetic facts, just copies of previously learned ones. Is it just a way to get around intellectual property laws by making copies of something no one owns?

  • @michaelcharlesthearchangel
    @michaelcharlesthearchangel 1 year ago

    Programming/MetaProgramming/Hypergramming.
    Hypergramming is AI created synthetic databasing.

  • @lllcinematography
    @lllcinematography 1 year ago

    Is this the hallucinations from LLMs like ChatGPT, which everyone hates, put to good use?

  • @Hiram8866
    @Hiram8866 1 year ago +2

    It's been all downhill since Lawrie McMenemy left. #sfc

    • @MartinKeen
      @MartinKeen 1 year ago +2

      Sadly true - and that was 45 years ago!

  • @ashleygahl3638
    @ashleygahl3638 1 month ago

    when he said, the years when my team won the prem title, i said, lies, all lies 😀😆

  • @maxwellmogambi6032
    @maxwellmogambi6032 2 months ago

    hey am from the future 2024, and SFC is not winning the premier league, sorry😂!! educative lesson💯

  • @pradeep422
    @pradeep422 1 year ago

    lol u kiddin, Southampton next winners haahha..