A quick tip when you find yourself opening something in Excel just to format data: There are usually better ways to do it that are actually easier. Here are 3 ways that are more reproducible:
1. `samtools depth -H reads.bam`. Yup, samtools depth actually has a flag for adding the header. Use --help on any tool to find such useful parameters.
2. `df = pd.read_csv('raw_depth.tsv', sep='\t', names=['chromosome', 'position', 'depth'])`.
3. If neither the input nor output tools can adapt, then you can use bash or python to format the data. For example: To make it comma-separated: `samtools depth reads.bam | tr '\t' ','`. To add a header with awk: `awk 'NR==1{print "chromosome,position,depth"}{print}' depth.csv`.
So always look first for ways the tool outputting the file can format it, then for ways the tool reading the file can take it in as-is, and finally for ways to make the change programmatically without manual editing.
Okay so that became longer than a quick tip, but hopefully this helps!
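Since option 3 mentions python but only shows bash, here is a rough sketch of the same idea using only the standard library. The function name `depth_tsv_to_csv` and the sample rows are made up for illustration; it assumes the input is headerless, tab-separated text with three columns, like the default `samtools depth` output.

```python
import csv
import io


def depth_tsv_to_csv(tsv_lines, out_handle,
                     header=("chromosome", "position", "depth")):
    """Copy headerless tab-separated rows to CSV, writing a header row first."""
    reader = csv.reader(tsv_lines, delimiter="\t")
    writer = csv.writer(out_handle, lineterminator="\n")
    writer.writerow(header)
    for row in reader:
        if row:  # skip blank lines
            writer.writerow(row)


# Made-up rows standing in for real `samtools depth reads.bam` output
sample = io.StringIO("chr1\t100\t12\nchr1\t101\t15\n")
result = io.StringIO()
depth_tsv_to_csv(sample, result)
print(result.getvalue())
```

With real files you would pass open file handles (or `sys.stdin` and `sys.stdout` to use it in a pipe) in place of the `StringIO` objects.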
This is very helpful thank you!
Thanks!
Just Wonderful... We need more.
Damn! Waiting for that!
this content is so very important, thank you so much for this
Hi, love the videos on bioinformatics! Saw that AWS Omics just released, have you used it yet? If so, what are your thoughts on it?
I was learning Python on freeCodeCamp. Should I switch to Kaggle for Python? With all these tutorials, I just wanted to make sure I'm not hopping around too much.
Hello sir, I have been following you for a while now, ever since I got admission to study MS Bioinformatics in the UK. You really did inspire me to go for it. I want to know where to start, as I am new to this field. I would also love to know if I can connect with you for mentorship. I love the field and would like to give it my best through proper guidance.
I'd love to hear from you.
Hi, I’m a beginner and I was trying to convert a file from CRAM to BAM, but it kept saying chr is not present. How could I solve that?
Hello! I have the same problem, have you found a solution?
Amazing.
thanks a lot