2024 updated single-cell guide - Part 2: RNA Integration and annotation

  • Published 27 Oct 2024

COMMENTS • 42

  • @zexalinishere
    @zexalinishere 1 month ago +3

    Keep these up. I cannot tell you how amazing these walkthroughs are. Truly. I’ve been wet lab my entire life with zero idea about any transcriptomics, yet have really been wanting to transition into dry lab but had no idea where to start. Until I found your videos. I’m literally telling everyone about them. Seriously, please keep these line-by-lines going. You’re amazing. Thank you

    • @zexalinishere
      @zexalinishere 1 month ago

      You’re going to pop off just keep it up

  • @dawoodahmad6028
    @dawoodahmad6028 18 days ago

    Thanks for your informative tutorials. I am just waiting for your next video. 🙂

  • @laloulymounia9266
    @laloulymounia9266 6 months ago +5

    Thanks a lot for these valuable tutorials! You are really doing an incredible job for students who do not have access to formal courses in bioinformatics!

  • @duuuou7811
    @duuuou7811 1 month ago

    This video and all the other tutorials on your channel are super, super helpful. Please keep doing this; looking forward to seeing the following analysis.

  • @Birkirrey
    @Birkirrey 1 month ago

    These videos are incredible! Keep it up!

  • @marwanmohamed6575
    @marwanmohamed6575 6 months ago +1

    Thanks a lot, very nice, especially the UMAP-inside-UMAP trick

  • @damemasgasAlina
    @damemasgasAlina 5 months ago

    Excellent tutorial Mark! I've been doing this for years now, but still managed to pick up a few new tricks. Love your careful and polite approach explaining the UMAP drama. Thank you for your work :)

    • @sanbomics
      @sanbomics  5 months ago

      I may have had to edit it to make it more polite than it was xD

    • @sanbomics
      @sanbomics  5 months ago

      e.g., took out the part about using a globe while driving a car lol

  • @whinetimev
    @whinetimev 1 month ago

    Hi Mark. I love your tutorials!!! I'm sure you're very busy, so in the meantime, for next steps, which of your videos would you most recommend? A) "Pseudobulk single-cell analysis in Python with Scanpy and pyDeseq2" and the GSEA portion onwards of "Differential expression in Python with pyDESeq2" tutorials, or B) the analysis portion of "Complete single-cell RNAseq analysis walkthrough | Advanced introduction"? Thank you!!

  • @ykoy1577
    @ykoy1577 5 months ago

    Thank you for your great tutorials. I am always waiting for your next video. Thank you so much!!

    • @sanbomics
      @sanbomics  5 months ago +1

      Trying to get around to it this week if I have time!

  • @Z3ratoss
    @Z3ratoss 6 months ago

    I learned some new tricks, and I have been doing this for a while!
    I think instead of shuffling prior to UMAP you can also just pass sort_order=False

    • @sanbomics
      @sanbomics  6 months ago

      Unless they changed it recently, I think that is probably still dependent on the order of the cells in your dataset. So DX would still seem shuffled but something like sample would still be overplotted

    • @Z3ratoss
      @Z3ratoss 6 months ago

      @@sanbomics huh, that's pretty bad then.
      Maybe I should write a PR
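
The shuffling discussed in this thread can be sketched roughly as follows. This is a minimal illustration and not the code from the video: `shuffled_order` is a hypothetical helper, and the scanpy lines in the trailing comments assume an `AnnData` object named `adata` with a `sample` column. As far as the scanpy documentation describes it, `sort_order` only affects continuous colorings, so categorical colors are drawn in the order the cells are stored.

```python
import numpy as np


def shuffled_order(n_cells, seed=0):
    """Return a random permutation of cell indices (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    return rng.permutation(n_cells)


# Every index appears exactly once, only the order changes.
order = shuffled_order(5)
print(sorted(order.tolist()))

# Assumed usage with scanpy (not executed here):
#   adata = adata[shuffled_order(adata.n_obs)].copy()
#   sc.pl.umap(adata, color="sample", sort_order=False)
```

Shuffling once before plotting means no single sample is systematically drawn on top of the others, whichever column is used for coloring.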

  • @TheXu122
    @TheXu122 4 months ago

    Thank you so much for your videos! I am a grad student who recently started a single-cell project, and since I found your channel, your explanations and code have been getting me through this tough time.
    I was wondering if you are planning on doing cNMF in the future? It is something that I and our lab have had difficulty with.
    Thanks again!

    • @sanbomics
      @sanbomics  4 months ago

      I can definitely keep that in mind for a future video!

  • @cocomom1808
    @cocomom1808 3 months ago

    For the hyperparameter tuning part, I got stuck with the Deprecation Error: The `RunConfig(local_dir)` argument is deprecated. You should set the `RunConfig(storage_path)` instead.
    Does anyone have any suggestions on how to solve it? Thanks!
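
The message quoted above says the installed Ray expects `storage_path` where the calling code still passes `local_dir`. If that call sits inside library code rather than your own notebook, a version mismatch between packages is a plausible cause (an assumption, not confirmed in this thread). A first diagnostic step could be to print what is installed; the package names below are assumed PyPI names and `installed_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("ray", "scvi-tools")):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


print(installed_versions())
```

With the versions in hand, you can compare them against the ones used in the tutorial environment before changing any code.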

  • @mehdiraouine2979
    @mehdiraouine2979 4 months ago

    we're still looking forward to the future part ;D

    • @sanbomics
      @sanbomics  4 months ago +1

      I know, I know xD. I was going to start working on it this weekend. I have been very busy!

    • @sanbomics
      @sanbomics  4 months ago +2

      someday soon...

    • @Dumbo-eo5ps
      @Dumbo-eo5ps 4 months ago

      @@sanbomics we're all hoping for this series to be completed so we can implement it, we're rooting for you! we're grateful for anything you can share :D

  • @mehdiraouine2979
    @mehdiraouine2979 6 months ago

    Hi, I've been following your videos closely lately as they are very intuitive! Thank you as always for these fantastic tutorials. I have very recently started learning about bioinformatics and I have a very loose understanding of what each tool does, for example the difference between dimensionality reduction methods such as PCA, UMAP, and non-negative matrix factorization (NMF). UMAP and PCA seem very similar to me, the difference being that the former is nonlinear and the latter is linear. However, I fail to understand why some would use NMF to analyze any type of RNA-seq data. Does it provide results that UMAP-based downstream analysis cannot? Or is there any other reason to use NMF? I'd be grateful if you could help me understand.
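
One way to see what NMF offers is to run it on a toy matrix. The sketch below is a plain NumPy implementation of the classic multiplicative-update rule for the Frobenius objective, written for illustration only; it is not cNMF and not anything shown in the video. The toy data, the function name `nmf`, and all parameters are assumptions. The point it tries to make: NMF factors a non-negative expression matrix into non-negative per-cell usages `W` and per-gene programs `H`, so each cell is an additive mix of programs, whereas PCA loadings can be negative and UMAP coordinates are mainly for visualization.

```python
import numpy as np


def nmf(X, n_components, n_iter=200, seed=0, eps=1e-9):
    """Factor non-negative X (cells x genes) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n_cells, n_genes = X.shape
    W = rng.random((n_cells, n_components)) + eps
    H = rng.random((n_components, n_genes)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy data: 30 "cells" built from 2 hidden non-negative "gene programs".
rng = np.random.default_rng(1)
usage = rng.random((30, 2))
programs = rng.random((2, 12))
X = usage @ programs

W, H = nmf(X, n_components=2)
error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {error:.3f}")
```

Rows of `H` can be read as candidate gene programs and rows of `W` as how strongly each cell uses them, which is the kind of output a 2D embedding does not give you directly.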

  • @avp300
    @avp300 5 months ago

    Thanks, Mark, for the second part; as usual it's highly informative! Just one question about scVI/scANVI label transfer: it also predicts the labels of the reference alongside the 'unknown' cells. Do you mind quickly checking the reference's predicted labels against the ground-truth labels to find out what percentage is correctly predicted? After following your July 11, 2022 video I have played with it a lot and only managed to get an 87% correct prediction rate. Thanks!!

    • @sanbomics
      @sanbomics  5 months ago +1

      I can check. The number is a little low but doesn't sound too unreasonable, since we set n_samples_per_label to 100. If you increase that number your ground truth prediction rate might increase, but at the cost of label:unknown disparity.
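
The check asked about in this thread amounts to comparing predicted and ground-truth labels on the reference cells only. A minimal sketch follows; the label values, the `"Unknown"` placeholder, and the helper name `reference_accuracy` are all assumptions for illustration, not names from the video.

```python
import numpy as np


def reference_accuracy(true_labels, predicted_labels, unknown="Unknown"):
    """Fraction of labelled (reference) cells whose prediction matches."""
    true_labels = np.asarray(true_labels)
    predicted_labels = np.asarray(predicted_labels)
    is_reference = true_labels != unknown  # skip query cells without ground truth
    if not is_reference.any():
        raise ValueError("no labelled reference cells found")
    matches = true_labels[is_reference] == predicted_labels[is_reference]
    return float(np.mean(matches))


truth = ["T cell", "B cell", "Unknown", "T cell", "NK cell"]
pred = ["T cell", "B cell", "T cell", "NK cell", "NK cell"]
print(reference_accuracy(truth, pred))  # 3 of 4 reference cells match -> 0.75
```

With an `AnnData` object, the two inputs would typically be two columns of `adata.obs`, one holding the original labels and one holding the model's predictions.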

  • @JianlongJia-kv8fw
    @JianlongJia-kv8fw 5 months ago

    Very good tutorial!!!! I would like to ask a question.
    model = scvi.model.SCVI.setup_anndata(
        adata, categorical_covariate_keys=['sample'],
        continuous_covariate_keys=['percent_mito', 'percent_ribo']
    )
    Why not just specify sample as batch here? For example:
    model = scvi.model.SCVI.setup_anndata(
        adata, batch_key='batch',
        continuous_covariate_keys=['percent_mito', 'percent_ribo']
    )
    Wouldn't this more directly point out that sample is a batch? Or what is the difference between these two?
    Thank you very much for your help!

    • @sanbomics
      @sanbomics  5 months ago

      Funny enough, this is a question I have asked the scVI team in the past. I was told that it wouldn't make much of a difference. Typically I save batch for when I integrate multiple different studies or technologies or species together.

  • @duadpeada5068
    @duadpeada5068 5 months ago

    Very cool video! Could you please tell us how to do something similar to your introduction with the umap transforming to the logo??

    • @sanbomics
      @sanbomics  5 months ago +1

      I have the video where I turn my cat into a UMAP. Let me know if that helps, if not, I can maybe post the code.

  • @sapienthought1103
    @sapienthought1103 3 months ago

    this is waaaaaaay underrated

  • @laloulymounia9266
    @laloulymounia9266 6 months ago +1

    By the way, when I am running the scvi models, despite having a 4080hx GPU and CUDA installed, the GPU is barely being used when training the models; instead it uses the integrated GPU. When I moved the code to my friend’s computer, which has a better GPU, his 4090 GPU was running at 80% when training the models, as shown by the system statistics. Do you perhaps have any idea what the issue might be? In terms of time needed to complete the task, I’d say my computer is not too slow compared to his.

    • @izthed9117
      @izthed9117 6 months ago

      Sometimes the CPU runs high at the start and then the GPU gets used. Is the usage the same after a couple of minutes?

    • @laloulymounia9266
      @laloulymounia9266 5 months ago

      Yeah, and I noticed the integrated GPU is getting used a bit, in an inconsistent manner, while my friend's GPU is running at 70% consistently.

    • @izthed9117
      @izthed9117 5 months ago

      @@laloulymounia9266 I guess you meant that you have a 4080 RTX? Or just an integrated GPU? In general these algorithms run on CUDA, which needs an NVIDIA GPU (rather than an integrated one). You can check whether CUDA is available by running nvcc -V (also look at the drivers etc.), and in case you have both an integrated GPU and a discrete GPU, you can specify which one to use. If you just have, let's say, a Ryzen iGPU, I'm not sure how possible it is to use it (CUDA is for NVIDIA architecture).

    • @sanbomics
      @sanbomics  5 months ago

      When you start training the model, does it say the GPU is being used? If yes, and if it's still running decently fast, I wouldn't worry. GPU utilization and all the warnings/errors are a common frustration with scvi, but I have been told it is being addressed.
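
Since scvi-tools trains its models with PyTorch, one quick check for the situation in this thread is to ask PyTorch directly whether it sees a CUDA device. `nvcc -V` reports the installed CUDA toolkit, which is not quite the same question. The helper below is a hypothetical sketch; it degrades gracefully if PyTorch is not installed.

```python
def describe_accelerator():
    """Report whether PyTorch can see a CUDA-capable GPU."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if torch.cuda.is_available():
        return f"CUDA device visible: {torch.cuda.get_device_name(0)}"
    return "PyTorch sees no CUDA device; training would run on the CPU"


print(describe_accelerator())
```

If this reports no CUDA device on a machine with an NVIDIA card, the usual suspects are the driver or a CPU-only PyTorch build rather than scvi itself.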

  • @georgieb1326
    @georgieb1326 5 months ago

    Is there a reason you used CellTypist before integration? It means that the overclustering done by CellTypist is different to the overclustering done post-integration when annotating (which is making annotation a bit confusing in my case)

    • @sanbomics
      @sanbomics  5 months ago +1

      You can do it after depending on how many cells you have. With this many cells it becomes almost impossible because it requires a dense matrix.
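
The dense-matrix constraint mentioned in the reply is easy to put numbers on. The arithmetic below uses hypothetical sizes (one million cells, twenty thousand genes, 4-byte floats), not the dataset from the video, and `dense_gib` is an illustrative helper.

```python
def dense_gib(n_cells, n_genes, bytes_per_value=4):
    """Memory needed to hold a dense cells x genes matrix, in GiB."""
    return n_cells * n_genes * bytes_per_value / 2**30


# Hypothetical dataset: 1,000,000 cells x 20,000 genes stored as float32.
print(f"{dense_gib(1_000_000, 20_000):.1f} GiB")  # 74.5 GiB
```

A sparse matrix stores only the non-zero entries, which is why the same data fits comfortably in memory until something forces it to be densified.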

  • @izthed9117
    @izthed9117 6 months ago

    Thanks a lot for the tutorials!! Does anyone have any idea whether the hyperparameter tuning uses the counts layer (with raw counts)? I don't get what it takes as input to do the grid search. I just want to do tuning for the data integration model.

    • @sanbomics
      @sanbomics  6 months ago +1

      If you don't specify a layer, it will just use .X (which should be raw).
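
Because the model reads `.X` when no layer is given, it can be worth confirming that `.X` really holds raw counts before training or tuning. The heuristic below is a sketch under assumptions: `looks_like_raw_counts` is a hypothetical helper that only checks that a sample of values is non-negative and integer-valued, and the layer name in the trailing comment is a guess.

```python
import numpy as np


def looks_like_raw_counts(matrix, n_check=10_000):
    """Heuristic: raw counts are non-negative whole numbers."""
    if hasattr(matrix, "toarray"):
        # SciPy sparse matrix: inspect only the stored non-zero values.
        values = np.asarray(matrix.data)
    else:
        values = np.asarray(matrix)
    values = values.ravel()[:n_check]
    return bool(np.all(values >= 0) and np.allclose(values, np.round(values)))


counts = np.array([[0, 3], [5, 1]])
print(looks_like_raw_counts(counts))            # integer counts
print(looks_like_raw_counts(np.log1p(counts)))  # log-normalized values

# Assumed usage: looks_like_raw_counts(adata.X) or
# looks_like_raw_counts(adata.layers["counts"])
```

This only samples the first values it finds, so treat a passing result as a sanity check rather than proof.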

  • @sapienthought1103
    @sapienthought1103 2 days ago

    and next time never came