This is the most underrated bioinformatics learning channel on YouTube. To those who watch, please subscribe and share. As always, good work.
you are a life saver!
Okay, let's see and practice... Alex has made a new tutorial for us.
Haha I like the practice part. A great way to learn is to do!
Amazing stuff!
Great video, very helpful! Thank you
Really great video. Are there any specific advantages that you find when using a Snakefile as opposed to a Makefile?
Makefile has awful, terrible, disgusting syntax. Because Make is a pretty ancient program, it also has other limitations that Snakemake doesn't have. Snakemake is essentially a modernised Make.
Please create another video on how to run a pipeline on an HPC cluster (e.g. Slurm) using snakemake --profile.
Hey! I can try. The thing is, right now I just have my own workstation, so I'll need to play around with getting a few VMs spun up and then getting them together into a cluster. I haven't ever done this, but it does sound really interesting to at least try to get working!
Amazing!
Thanks for the tutorial. I have a question: is it good practice to include the STAR genome indexing step in the Snakemake workflow, given that it is a resource-intensive step?
So this could depend on the process, the pipeline, and the core that you work in.
If you are in a bioinformatics core, then one could argue for keeping a folder of different genome versions and using whichever one a PI wants. Alternatively, if you have a "one and done" sample run, or perhaps a couple, then incorporating the genome building into the workflow *could* save keystrokes. Really, it's whatever you think is best.
Also, you could have a check in a job so that if there is an already built genome you just use it, and otherwise you build the genome. This would be beautiful if working in a core, where maybe someone has a mouse, maybe someone wants hg38 for a comparison, and maybe another wants hg19 because that's what they've always used. Hopefully this makes sense.
Ultimately it comes down to the situation for the project and future project possibilities. If you don't need to be flexible, then no, it doesn't make much sense to include it, but for adapting to unique projects it could be valuable.
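For anyone who wants to see the "use it if it's there, otherwise build it" check spelled out, here is a minimal sketch in plain Python. It is only an illustration, not what the video does: the function name ensure_index, the injected build callable, and the use of the SAindex file as a marker for a finished STAR index are all assumptions made for the example. In practice Snakemake gives you this behaviour on its own if the index is declared as the output of a rule, since rules whose outputs already exist and are up to date are skipped.

```python
from pathlib import Path
from typing import Callable


def ensure_index(
    index_dir: Path,
    build: Callable[[Path], None],
    marker: str = "SAindex",  # assumed marker file for a finished index
) -> bool:
    """Reuse an existing genome index, or build it if it is missing.

    Returns True if the index had to be built, False if it was reused.
    `build` is a hypothetical callable that writes the index into index_dir
    (for example, a wrapper around the STAR genome generation step).
    """
    if (index_dir / marker).exists():
        # Index already present: nothing to do.
        return False
    index_dir.mkdir(parents=True, exist_ok=True)
    build(index_dir)
    return True


if __name__ == "__main__":
    import tempfile

    def fake_build(directory: Path) -> None:
        # Stand-in for the real, expensive indexing step.
        (directory / "SAindex").touch()

    with tempfile.TemporaryDirectory() as tmp:
        hg38 = Path(tmp) / "hg38"
        print("built:", ensure_index(hg38, fake_build))  # first call builds
        print("built:", ensure_index(hg38, fake_build))  # second call reuses
```

Keeping one index folder per genome (mouse, hg38, hg19, ...) and passing the right folder in would cover the core-facility case described above.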
@alexsoupir Thanks for the clarification
Hey Alex, my question is this: whenever I try to install Snakemake using the command conda create -y --name snakemake snakemake-minimal, it fails with a "command not found" error. Even though I have Anaconda installed on my machine, the conda command doesn't work for me. Could you kindly suggest a solution?
Amazing! I have one question. Why don't you need to add the files from the Trimmomatic rule as inputs in rule all? You only added the ones from STAR and FastQC. Thanks!
Think of it like a chain of commands. Rule all is like the back of the train; just in front of that is STAR, in front of that is Trimmomatic, and in front of that are the raw reads at the start of the train. As long as the output of something like Trimmomatic is used as an input by STAR, it gets triggered by calling STAR, but there's nothing following STAR that would trigger STAR itself to run, which is why STAR is called in rule all. I think I could have simplified it even further by calling featureCounts in rule all instead of STAR: calling that would trigger STAR having to run, which would trigger Trimmomatic needing to run, etc.
FastQC doesn't have a downstream program that uses its output, so it needs its own trigger call in rule all.
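To make the train picture concrete, here is a toy sketch in plain Python of how a workflow manager can walk backwards from the requested files to figure out which rules must run. This is not Snakemake's actual code; the file names, the RULES table, and the function rules_to_run are all made up for illustration, and it assumes FastQC runs on the raw reads.

```python
from typing import Dict, List, Tuple

# Hypothetical workflow: output file -> (rule name, input files).
RULES: Dict[str, Tuple[str, List[str]]] = {
    "trimmed.fastq": ("trimmomatic", ["raw.fastq"]),
    "aligned.bam": ("star", ["trimmed.fastq"]),
    "counts.txt": ("featurecounts", ["aligned.bam"]),
    "raw_fastqc.html": ("fastqc", ["raw.fastq"]),
}


def rules_to_run(
    targets: List[str],
    rules: Dict[str, Tuple[str, List[str]]] = RULES,
) -> List[str]:
    """Return the rules needed to produce `targets`, upstream rules first."""
    order: List[str] = []

    def visit(path: str) -> None:
        if path not in rules:
            # No rule makes this file, so it is a source file (the raw reads).
            return
        name, inputs = rules[path]
        for dep in inputs:
            visit(dep)  # resolve everything upstream before this rule
        if name not in order:
            order.append(name)

    for target in targets:
        visit(target)
    return order


if __name__ == "__main__":
    # Asking for the STAR and FastQC outputs, like the rule all in the question.
    print(rules_to_run(["aligned.bam", "raw_fastqc.html"]))
    # Asking only for the counts pulls in STAR and Trimmomatic, but not FastQC.
    print(rules_to_run(["counts.txt"]))
```

Trimmomatic never has to be listed as a target because STAR's input points at it, while FastQC only runs when its own output is requested.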
Hope this helps! There are a few workflow managers, but I think Snakemake is the easiest to understand, while I find Nextflow and Cromwell/WDL scripts very confusing. I might do the Cromwell one in a future video as I learn, though; it can use Docker containers that have all the tools installed inside for highly reproducible workflows.
Perfect! :)
Great video...
I am unable to find the raw files in the repo
Thanks, Ali.
The reads that I have on GitHub aren't linked for this video (sorry about that). Example read files can be found at:
github.com/ACSoupir/Bioinformatics_UA-cam/tree/master/Raw%20Reads/files
They are subsampled to cut down on the size and allow upload to GitHub, but the "SRA Download, QC, and Trimming" video has directions for downloading the full files from the Sequence Read Archive (SRA).
more snakemake vids
Need to think of more cool things to do with it! When my wetlab stuff starts working out I'll be able to throw some more together.