Update the phenotype column in PLINK

  • Published 11 Oct 2024

COMMENTS • 20

  • @ultimatesmuckle 10 months ago +2

    If only there were a word that could accurately express how much gratitude and love I feel every time I watch one of your videos and resolve a doubt... you are awesome, Professor.

  • @lbalbuen 2 years ago +1

    Thank you for creating this video, Professor. Very helpful!

  • @andreaclark1143 1 year ago

    Thank you so much for this wonderful content, professor!! Your materials have been invaluable for my research project. 🙏

  • @farradzaki4379 2 years ago +1

    Thank you professor for your hard work.

  • @samrawitgebeyehu7648 1 year ago

    Thank you, Prof. Gabor, for another amazing video. I wonder how you set up the phenotype file (height data.tex) that you used for --pheno.

    • @samrawittsehay1610 1 year ago

      For your information, I have a ped and a map file.

    • @GenomicsBootCamp 1 year ago

      This comment seems to be disconnected. Could you clarify?

    • @GenomicsBootCamp 1 year ago

      The "PhenotypeMeasures.txt" file used for --pheno should be in the same directory as PLINK. It has a header line, so its columns can be selected by name as needed, as explained in the second half of the video.
      Was this your question, or did I miss something?
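
A minimal sketch of what such a phenotype file could look like and how one column might be picked by name. The sample IDs, the trait names (Height, Weight), and the `mydata` / `mydata_height` prefixes are made up for illustration; only `--file`, `--pheno`, `--pheno-name`, `--make-bed`, and `--out` are PLINK 1.9 options. It is written in Python for convenience, and the command is only built, not run.

```python
from pathlib import Path

# Hypothetical phenotype table: FID and IID first, then one column per trait.
rows = [
    ["FID", "IID", "Height", "Weight"],
    ["FAM1", "IND1", "172.5", "68.0"],
    ["FAM2", "IND2", "160.1", "55.3"],
]

pheno_path = Path("PhenotypeMeasures.txt")
pheno_path.write_text("\n".join("\t".join(row) for row in rows) + "\n")


def trait_columns(path):
    """Return the trait names from the header line (everything after FID and IID)."""
    with open(path) as handle:
        header = handle.readline().split()
    return header[2:]


def plink_pheno_command(prefix, pheno_file, trait, out_prefix):
    """Build (but do not run) a PLINK call that reads the phenotype from a named column."""
    return [
        "./plink", "--file", prefix,
        "--pheno", str(pheno_file), "--pheno-name", trait,
        "--make-bed", "--out", out_prefix,
    ]


print(trait_columns(pheno_path))
print(" ".join(plink_pheno_command("mydata", pheno_path, "Height", "mydata_height")))
```

Selecting by name with `--pheno-name` relies on the header line starting with FID and IID, which is why the sketch writes one.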

  • @darrylhadsell8813 2 years ago +1

    Hello Professor,
    Thank you so much for the work you are doing with these tutorials. I have found them very helpful. I have some work that involves close to 200 phenotypes, and I have been using the one-by-one approach, but I would certainly like to create a set of loops that could handle the whole workflow.
    After creating the phenotype-specific bed file, I use GEMMA to run the association. Then, with a combination of R scripts and PLINK scripts (both 1.07 and 1.9), I create region plots where the p-value is plotted on the left y-axis, the estimated recombination rate on the right y-axis, and the SNPs are color-coded as a function of LD. Do you have any further thoughts on how looping could be employed without creating a massive number of files?

    • @GenomicsBootCamp 2 years ago +1

      Hi, my approach with GEMMA is the following (you probably do something similar, but I list it here to see if we deviate at any point):
      1) create a file which contains all individuals (rows) and phenotypes (columns). This file has a header, to make use of PLINK's --pheno and --pheno-name options
      2) create a vector in R with the phenotype names - the same names as in the file from 1)
      The loop starts here, running once for each element of the vector and replacing the PHENOTYPE name for the PLINK update
      3) prepare the PLINK file for analysis using PLINK
      system(paste0("./plink --file ...... --pheno pheno_file_from_point1.txt --pheno-name ", PHENOTYPE, " --make-bed --out phenoForGemma"))
      4) run GEMMA - I run the relationship matrix preparation and the actual GWAS each time. Technically, the relationship matrix could be computed just once. For me, time was not a factor; with so many phenotypes you would probably compute it once, outside of the loop.
      5) plot the results in R. The output file name from GEMMA is always the same, so a standardized script could be created. Then save the GEMMA output files (with the actual results) and the Manhattan plot by renaming them to include the PHENOTYPE name from the loop
      The loop ends here.
      This still creates a bunch of files, depending on the number of phenotypes you are after, but at least you get rid of the temporary ones and can concentrate on the plot and result files.
      Was this what you were after?
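
The loop described above could be sketched roughly as follows. This is written in Python rather than the R used in the reply, and it is only an outline under assumptions: the `mydata` genotype prefix, the `results` directory, the `kinship` and `gwas` output names, and the location of the `./plink` and `./gemma` binaries are all hypothetical. The GEMMA calls assume the `-bfile`, `-gk`, `-k`, `-lmm`, and `-o` options and GEMMA's default `output/` directory. Plotting (step 5) is left to the existing R script.

```python
import shutil
import subprocess
from pathlib import Path

PHENO_FILE = "pheno_file_from_point1.txt"  # step 1: header plus one column per phenotype
GENO_PREFIX = "mydata"                      # hypothetical .ped/.map prefix
TMP_PREFIX = "phenoForGemma"                # temporary prefix, overwritten on every pass


def commands_for(phenotype):
    """Return the PLINK and GEMMA calls for one phenotype (steps 3 and 4)."""
    plink = ["./plink", "--file", GENO_PREFIX,
             "--pheno", PHENO_FILE, "--pheno-name", phenotype,
             "--make-bed", "--out", TMP_PREFIX]
    # Relationship matrix, written to output/kinship.cXX.txt (assumed GEMMA default).
    kinship = ["./gemma", "-bfile", TMP_PREFIX, "-gk", "1", "-o", "kinship"]
    # Association run, written to output/gwas.assoc.txt.
    gwas = ["./gemma", "-bfile", TMP_PREFIX, "-k", "output/kinship.cXX.txt",
            "-lmm", "1", "-o", "gwas"]
    return [plink, kinship, gwas]


def run_all(phenotypes, results_dir="results", runner=subprocess.run):
    """Run the workflow per phenotype and keep one renamed result file for each."""
    Path(results_dir).mkdir(exist_ok=True)
    for phenotype in phenotypes:
        for cmd in commands_for(phenotype):
            runner(cmd, check=True)
        # Rename the fixed GEMMA output so the next pass does not overwrite it.
        shutil.move("output/gwas.assoc.txt", f"{results_dir}/{phenotype}.assoc.txt")


# Step 2: the phenotype names, identical to the header of PHENO_FILE (made up here).
for cmd in commands_for("Height"):
    print(" ".join(cmd))
```

Because the temporary prefix is reused on every pass, only the renamed result files accumulate. Calling `run_all(["Height", "Weight"])` would execute the commands, provided both programs and the input files are actually present.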

  • @valeriateodosieva271 2 years ago +1

    Really helpful! Thanks! Just curious: when using the --assoc option to analyze data, does it calculate p-values based on the phenotype of interest (present in the .fam file), or do we need to specify the phenotype somehow after the command?

    • @GenomicsBootCamp 2 years ago

      To be honest, I prefer GEMMA for GWAS analyses, so a more detailed check on this would be needed. But from the info I see on the PLINK website, I think the phenotype column of the .fam file is used for this.
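
A small sketch for checking which phenotype values an `--assoc` run would see by default, assuming the standard six-column .fam layout with the phenotype in column 6. The `demo.fam` contents and the `mydata` / `assoc_results` prefixes are hypothetical. The optional `--pheno` / `--pheno-name` arguments show how a different phenotype could be supplied instead.

```python
from collections import Counter
from pathlib import Path


def fam_phenotype_counts(fam_path):
    """Count the values in column 6 of a .fam file (the default phenotype column)."""
    counts = Counter()
    with open(fam_path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 6:
                counts[fields[5]] += 1
    return counts


def assoc_command(bfile_prefix, out_prefix, pheno_file=None, pheno_name=None):
    """Build (but do not run) an --assoc call, optionally with an external phenotype."""
    cmd = ["./plink", "--bfile", bfile_prefix, "--assoc", "--out", out_prefix]
    if pheno_file is not None:
        cmd += ["--pheno", pheno_file]
        if pheno_name is not None:
            cmd += ["--pheno-name", pheno_name]
    return cmd


# Made-up example file: one case (2), one control (1), one missing (-9).
Path("demo.fam").write_text(
    "FAM1 IND1 0 0 1 2\n"
    "FAM2 IND2 0 0 2 1\n"
    "FAM3 IND3 0 0 1 -9\n"
)
counts = fam_phenotype_counts("demo.fam")
print(dict(counts))
print(" ".join(assoc_command("mydata", "assoc_results")))
```

If column 6 turns out to contain only missing values, that is a hint the phenotype still has to be supplied, for example through `--pheno`.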

  • @mohammadj.shamim9342 2 years ago +1

    Thank you so much, professor. I really appreciate it. I have another question about something that often comes in handy when explaining the genetics of phenotypes: narrow-sense and broad-sense heritability. I wonder if PLINK can estimate them? I read the documentation, but somehow could not figure it out in the acgt64 software.

    • @GenomicsBootCamp 2 years ago

      Hi, this is a bit more complex and comes down to modeling: the broad sense considers the entire genetic variance (so including dominance and other effects), and the narrow sense considers just the additive variance. Maybe there will be a video on this at some point, but not now, as it is further away from the current focus and quite deep in quantitative genetics.
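
For reference, the distinction in the reply corresponds to the standard textbook variance decomposition. The symbols below are the usual ones, not notation taken from the video: $V_P$ is the phenotypic variance, $V_G$ the total genetic variance, $V_A$, $V_D$, and $V_I$ the additive, dominance, and epistatic (interaction) components, and $V_E$ the environmental variance.

```latex
\begin{align}
  V_P &= V_G + V_E, \qquad V_G = V_A + V_D + V_I \\
  H^2 &= \frac{V_G}{V_P} \quad \text{(broad-sense heritability)} \\
  h^2 &= \frac{V_A}{V_P} \quad \text{(narrow-sense heritability)}
\end{align}
```

Since $V_A \le V_G$, it follows that $h^2 \le H^2$. Estimating either quantity requires fitting a model for these variance components, which is the modeling step the reply refers to.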

  • @MaleenPeters 7 months ago +1

    Dear Professor, this is very helpful, thank you. I have two questions. 1. What if I want to update the phenotype column with an outcome coded as 0 and 1 (No, Yes)? Wouldn't the zeros be considered missing? 2. What happens if the number of individuals in the phenotype file differs from that in the genotype files?

    • @GenomicsBootCamp 7 months ago +1

      Hi,
      Question 1: I am not sure about the behaviour, so it is best to try it out on a small sample. This is relevant only if you want to keep using the file with PLINK. In the worst case, you can recode it to 1 = No, 2 = Yes, which matches PLINK's default coding of 1 = unaffected and 2 = affected.
      Question 2: It does not matter. The phenotype file can have any number of entries; only the ones with a matching FID+IID will be updated. If the phenotype file has more entries, the extra ones will be ignored; if it has fewer, the remaining individuals will not be updated in your ped file.
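
A sketch of two checks that could be done before running PLINK: recoding a 0/1 outcome to 1/2, and reporting how the FID+IID pairs in the two files overlap. All sample IDs below are invented. For context, the PLINK 1.9 documentation describes 0 as an alternate missing code for case/control phenotypes by default and lists a `--1` flag for 0 = unaffected, 1 = affected coding; it is worth confirming this against the documentation for your version.

```python
def recode_01_to_12(value):
    """Map 0/1 (No/Yes) to 1/2 (unaffected/affected); anything else becomes -9 (missing)."""
    return {"0": "1", "1": "2"}.get(value, "-9")


def overlap_report(fam_ids, pheno_ids):
    """Compare (FID, IID) pairs from the genotype data and the phenotype file."""
    fam_set, pheno_set = set(fam_ids), set(pheno_ids)
    return {
        "matched": sorted(fam_set & pheno_set),
        "only_in_genotypes": sorted(fam_set - pheno_set),
        "only_in_phenotypes": sorted(pheno_set - fam_set),
    }


# Hypothetical IDs: the phenotype file has one extra and one missing individual.
fam_ids = [("FAM1", "IND1"), ("FAM2", "IND2"), ("FAM3", "IND3")]
pheno = {("FAM1", "IND1"): "0", ("FAM2", "IND2"): "1", ("FAM9", "IND9"): "1"}

report = overlap_report(fam_ids, pheno.keys())
for name, ids in report.items():
    print(name, ids)

recoded = {key: recode_01_to_12(value) for key, value in pheno.items()}
print(recoded[("FAM2", "IND2")])
```

The `only_in_genotypes` list shows which individuals would get no value from the phenotype file, which makes it easier to verify their phenotype in the output afterwards.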

    • @MaleenPeters 7 months ago

      @GenomicsBootCamp Thank you so much for the feedback.

  • @Crass1000 1 year ago +1

    Would this work with the binary files, just by switching to --bfile?

    • @GenomicsBootCamp 1 year ago +1

      Yes, it should work the same way across all PLINK file types. If binary files (.bed/.bim/.fam) are used, --bfile has to be specified, as you mentioned.
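
As a small illustration of that switch, the helper below picks `--bfile` when the three binary files exist for a prefix and falls back to `--file` otherwise. The helper names and the `mydata` / `mydata_updated` prefixes are hypothetical, and the command is only built, not run.

```python
from pathlib import Path


def input_flag(prefix):
    """Use --bfile if prefix.bed/.bim/.fam all exist, otherwise --file (.ped/.map)."""
    binary = all(Path(f"{prefix}.{ext}").exists() for ext in ("bed", "bim", "fam"))
    return ["--bfile", prefix] if binary else ["--file", prefix]


def update_pheno_command(prefix, pheno_file, out_prefix):
    """Build the phenotype update call with the input flag that fits the files on disk."""
    return ["./plink", *input_flag(prefix),
            "--pheno", pheno_file, "--make-bed", "--out", out_prefix]


print(" ".join(update_pheno_command("mydata", "PhenotypeMeasures.txt", "mydata_updated")))
```

The rest of the command stays the same for both input types, which is the point made in the reply.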