Great info!
Thank you! You are great!
thanks a lot! very informative and fast explanation
Like your explanation. Do you know how to annotate a sam file? Can you send a link?
It would be great if you added links to the relevant information in the comments. Thanks!
Hi, how do I create .bam and .bai files?
Usually, aligners will create the '.bam' file for you, and some create the '.bai' file as well. BWA, for example, only generates a SAM file; you can use SAMTOOLS to convert it to a '.bam' file and also to create the '.bai' file. The BAM must be sorted by coordinate before it can be indexed, so sort while converting:
bwa xxxxxx (map command) | samtools sort -o xxxx.bam -
then,
samtools index xxxx.bam
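A minimal sketch of the steps above, assuming `bwa mem` as the map command and paired-end reads. The file names are placeholders of mine, not from the video. It uses `samtools sort` rather than a plain `samtools view` conversion, because `samtools index` expects a coordinate-sorted BAM. The function only prints the commands, so you can check them before running anything.

```shell
#!/usr/bin/env bash
# Print (do not run) one possible FASTQ -> sorted BAM -> BAI command sequence.
# Arguments: reference FASTA, read 1 FASTQ, read 2 FASTQ, output BAM name.
bam_commands() {
    local ref=$1 r1=$2 r2=$3 out=$4
    cat <<EOF
bwa index "$ref"
bwa mem "$ref" "$r1" "$r2" | samtools sort -o "$out" -
samtools index "$out"
EOF
}

# Placeholder file names (assumptions); replace with your own.
bam_commands ref.fa reads_1.fq reads_2.fq sample.bam
```

If the printed commands look right, piping the output into `bash` runs them (this needs bwa and samtools installed).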
Hope this helps.
@@loganchen7889 Thanks. My second question is how to create a VCF file from FASTA or FASTQ files. Is it necessary to go through bcftools first, or is there a direct way to create a VCF file? Is it mandatory to have a BCF file first and then the VCF? Please elaborate on the answer with syntax and an example.
@@haroonzeb7087 I think what you are asking about is variant calling. The VCF format is used to store variant information, including the contig (chromosome), position, reference base, alternative base, and other related information. The simplest way to get a VCF file from a raw fastq/fasta file involves two steps. 1. Mapping: align the sequences in the fasta/fastq file to the genome. 2. Variant calling: use a variant caller, such as deepvariant (mentioned by Shirely) or GATK (widely used), to call the variants. The default output format is VCF. BCF, which you mentioned, is the binary version of VCF, if I remember correctly. Examples of this process may be presented in later videos, but I am not sure, as I am also just a viewer of the course. Hope this message helps.
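Since syntax was requested, here is a rough sketch of the two steps (mapping, then variant calling) using bwa, samtools and GATK4 HaplotypeCaller. The file names, the sample label and the read-group string are placeholders I made up. The GATK best-practice steps such as duplicate marking and base recalibration are left out, so treat this as a starting point rather than a recommended pipeline. The function only prints the commands.

```shell
#!/usr/bin/env bash
# Print (do not run) one possible FASTQ -> VCF command sequence.
# Arguments: reference FASTA, read 1 FASTQ, read 2 FASTQ, sample label.
fastq_to_vcf_commands() {
    local ref=$1 r1=$2 r2=$3 sample=$4
    cat <<EOF
bwa index "$ref"
samtools faidx "$ref"
gatk CreateSequenceDictionary -R "$ref"
bwa mem -R '@RG\tID:$sample\tSM:$sample\tPL:ILLUMINA' "$ref" "$r1" "$r2" | samtools sort -o "$sample.bam" -
samtools index "$sample.bam"
gatk HaplotypeCaller -R "$ref" -I "$sample.bam" -O "$sample.vcf.gz"
EOF
}

# Placeholder inputs (assumptions); replace with your own.
fastq_to_vcf_commands ref.fa reads_1.fq reads_2.fq sample1
```

The first three commands prepare the reference (bwa index, .fai, .dict), the next two do the mapping, and the last one calls variants straight to a compressed VCF, with no BCF file in between.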
@@loganchen7889 I do know what VCF is, but how do I create a VCF file? Is bcftools (or the BCF format) mandatory for creating it? Could you shed light on whether the VCF is created via bcftools, or on any other tool you would recommend for creating a VCF file?
Thanks in advance.
@@haroonzeb7087 I am not sure I understand correctly. I think many tools other than the bcftools you mentioned, such as the gatk and deepvariant I mentioned before, can create vcf/bcf files. "The relationship between BCF and VCF is similar to that between BAM and SAM." (evomics.org/vcf-and-bcf/). I am not familiar with bcftools, so I don't know how widely it is still used to call variants. There are best-practice pipelines for GATK for both somatic and germline variant calling; you can refer to (gatk.broadinstitute.org/hc/en-us/articles/360035894731-Somatic-short-variant-discovery-SNVs-Indels-) and (gatk.broadinstitute.org/hc/en-us/articles/360035535932-Germline-short-variant-discovery-SNPs-Indels-).
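For completeness, a hedged sketch of what calling with bcftools could look like, and of the BCF/VCF versus BAM/SAM analogy from the quote above. It assumes a sorted BAM already exists; the file names are placeholders of mine. The function only prints the commands.

```shell
#!/usr/bin/env bash
# Print (do not run) bcftools calling plus the binary <-> text conversions.
# Arguments: reference FASTA, sorted BAM, output prefix.
bcftools_commands() {
    local ref=$1 bam=$2 prefix=$3
    cat <<EOF
bcftools mpileup -Ou -f "$ref" "$bam" | bcftools call -mv -Ob -o "$prefix.bcf"
bcftools view -Ov -o "$prefix.vcf" "$prefix.bcf"
samtools view -h -o "$prefix.sam" "$bam"
EOF
}

# Placeholder inputs (assumptions); replace with your own.
bcftools_commands ref.fa sample.bam calls
```

The first line writes binary BCF, the second converts BCF to text VCF, and the third does the matching BAM to SAM conversion. Using `-Ov` on `bcftools call` instead would write VCF directly, so a BCF file is optional rather than mandatory.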