Excellent description of E-values
Perfect! Exactly what I was looking for! Thank you sooo much.
Shouldn't a smaller query sequence increase the probability of getting a match by chance?
I thought the exact same thing. Say your query sequence is one amino acid: the chance of finding that amino acid in any database is extremely high.
Yes, but with a smaller query sequence, you'll have a smaller score too. Let's say you have one amino acid; then the maximum score the alignment can achieve is the identity score for that single residue. But if you have a hundred amino acids, the scores of the individual positions add up and the total can be much higher.
Tell me if you got it. It took me a while to wrap my head around it too.
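A rough numeric sketch of that point, assuming the usual E-value formula E = K * m * n * exp(-λS). The K and λ numbers are the commonly quoted ungapped BLOSUM62 values, the database size and the score of 500 are made up for illustration, and `e_value` is just a hypothetical helper name:

```python
import math

# Assumed parameters: commonly quoted ungapped BLOSUM62 values (illustrative only).
K = 0.134
LAMBDA = 0.3176


def e_value(raw_score, query_len, db_len, k=K, lam=LAMBDA):
    """Expected number of chance hits with at least this score: K*m*n*exp(-lambda*S)."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


DB_LEN = 100_000_000  # hypothetical database size in residues

# One-residue query: the best possible score is a single identity
# (W aligned with W scores 11 in BLOSUM62).
e_short = e_value(raw_score=11, query_len=1, db_len=DB_LEN)

# 100-residue query: per-position scores add up; 500 is a made-up total.
e_long = e_value(raw_score=500, query_len=100, db_len=DB_LEN)

print(f"1 residue:    E = {e_short:.3g}")
print(f"100 residues: E = {e_long:.3g}")
```

The one-residue query ends up with an E-value far above 1, meaning you expect a huge number of hits that good purely by chance, so the match tells you nothing. The long query can reach a score high enough to push the E-value close to 0.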
@ernstdarwin2695 Now I get it, thanks for explaining!
you saved my life. tks
Thanks so much! You made it much easier to understand.
How do you find the values of K and λ for a matrix? I can't find an answer anywhere and it's driving me insane.
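In the video they show up as the constants that let you go from an exponential curve to a straight line (just like he did with the functions for those 2 fittings), but they aren't arbitrary numbers. λ and K are the Karlin-Altschul statistical parameters, and they depend on the scoring matrix and the background residue frequencies. For ungapped alignments they can be calculated analytically; for gapped alignments they are estimated from simulations with random sequences. BLAST lists the values it used at the bottom of its report.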
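For what it's worth, in the standard Karlin-Altschul statistics for ungapped alignments, λ is the positive solution of sum over i,j of p_i * p_j * exp(λ * s_ij) = 1, where p are the background frequencies and s_ij are the matrix scores. Here is a minimal sketch that solves this by bisection. The toy alphabet, frequencies and scores are assumptions for illustration, `solve_lambda` is a hypothetical helper name, and K (which has a more involved formula) is not computed here:

```python
import math


def solve_lambda(freqs, score, tol=1e-12):
    """Find the positive root of sum_ij p_i*p_j*exp(lam*s_ij) = 1 by bisection."""
    letters = list(freqs)

    def f(lam):
        total = sum(
            freqs[a] * freqs[b] * math.exp(lam * score(a, b))
            for a in letters
            for b in letters
        )
        return total - 1.0

    # f(0) = 0, and f dips below zero just above 0 when the expected score
    # is negative, so start slightly above 0 and grow the upper bound
    # until f becomes positive.
    lo, hi = 1e-6, 1.0
    while f(hi) < 0:
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Toy system (assumption): 4 equally likely letters, +1 match, -1 mismatch.
toy_freqs = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def toy_score(a, b):
    return 1 if a == b else -1


lam = solve_lambda(toy_freqs, toy_score)
print(lam, math.log(3))
```

For this toy system the equation reduces to 0.25*x + 0.75/x = 1 with x = exp(λ), which has the solutions x = 1 and x = 3, so the numerical result should agree with ln 3. A real protein matrix works the same way, just with 20 letters and realistic background frequencies.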
What happens when we have an E-value of "0"? What does that tell us about our aligned sequence?
Mathematically, the E-value can only approach 0; it can never be exactly 0. Remember, the E-value depends on 3 variables: the length of the query sequence (m), the length of the database (n) and the score. The score sits in a negative exponent, and an exponential is never exactly 0, so a high score alone can't make the E-value 0. To get exactly 0 you'd need a query length of 0 or a database length of 0, and then you're not comparing anything. The E-value approaches 0 when you have a high score, because the score is in the exponent and so has a much greater effect on the E-value than m and n, which only enter as linear factors. So when BLAST prints 0.0, the value is just too small to display, and the hit is about as significant as it gets. Also note that the database length does affect the E-value: for the same score, a small database gives a smaller (more significant) E-value than a large one.
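A small sketch of both effects, assuming the formula E = K * m * n * exp(-λS) and working with ln(E) so the numbers stay representable. The K and λ values are the commonly quoted ungapped BLOSUM62 ones, and the query length, scores and database sizes are made up for illustration:

```python
import math

# Assumed parameters: commonly quoted ungapped BLOSUM62 values (illustrative only).
K = 0.134
LAMBDA = 0.3176


def log_e_value(raw_score, query_len, db_len, k=K, lam=LAMBDA):
    """Natural log of E = K*m*n*exp(-lambda*S), computed without underflow."""
    return math.log(k) + math.log(query_len) + math.log(db_len) - lam * raw_score


m = 300                      # hypothetical query length
small_db, big_db = 1e6, 1e9  # hypothetical database sizes in residues

# A very high (made-up) score: ln(E) is hugely negative but still finite...
ln_e = log_e_value(raw_score=3000, query_len=m, db_len=big_db)
# ...while E itself underflows to 0.0 as a 64-bit float.
e_as_float = math.exp(ln_e)

# Same score, different database sizes: E scales linearly with n.
ratio = math.exp(log_e_value(200, m, big_db) - log_e_value(200, m, small_db))

print(ln_e, e_as_float, ratio)
```

The exact value is never 0, but it can be far smaller than anything a program can store or print, which is how a "0" ends up in the output. The last line shows the database effect: with a database 1000 times larger, the same score gets an E-value about 1000 times larger.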
Hi, I wonder how I can find the % similarity in BLAST?
Excellent