One of the best introductory tutorials on speech recognition
You know what? You ARE the man of the year. This is undoubtedly a top-notch video. SUPERB explanation!!! I'm working on my master's thesis which is about speech recognition, and this video simply hit the spot. Your explanation just cleared up lots of questions I had before.
nice explanation
Wow! We are in 2020 and not even close to what was expected for 2010! :D
It was for 2000!
2:30 Even Kai-Fu Lee is drinking the Apple Kool-Aid here. Here in 2018, John has multiple interfaces: Twitter, Facebook, telephone (potentially more than one), and probably others. It's not just "computer, call John", it's "computer, call John by some means appropriate to the task at hand, with decent prospects of getting a response in a timely manner, without pissing him off too much".
What's artificial about the interaction is not thumbing through the phone book, but the remote nature of the interaction, where we don't really have a clue what John is presently doing, and whether we _should_ contact John in the present moment, or with what degree of insistence. In the natural setting, it's pretty hard to interrupt someone without at least some clue about their present task, social, or cultural context. So the unnatural interface here *is the computer itself* and its inherent one-sided remoteness. And so now we can observe that the unnatural interface via the unnatural appliance is to look a number up in a phone directory, and not be too surprised.
Steve Jobs has since forever (I was there) set himself up as the electron-whisperer. His message is not that digital technology is inherently divergent from our human evolutionary context, it's merely that you need a sufficiently enlightened design guru to intervene, and all will be set right.
Now I *love* digital technology, but even so I've never once in my life claimed it was natural to human nature, or would ever be entirely natural to human propensities. Perhaps I simply understand evolution better than most people, because making this error doesn't even seem possible from where I sit. Even in the Kurzweil scenario, where humans become half-man, half-machine, I suspect our human and machine halves would retain some significant Spock-McCoy boundary friction. We're not engineering ourselves out of this problem any century soon.
Why is there no distinction between "phone" and "phoneme"? You cannot see phonemes on a spectrogram because they are abstract concepts.
3:45 "For example, when we are driving ... " But research shows that humans don't do a good job of avoiding fleshy objects when merely trying to talk and steer at the same time. Again, the unnatural, remote nature of your telephone connection makes talking and driving more dangerous than when you chat with your passenger, because your passenger is typically monitoring your attentional demands, and playing ball (small children excepted, but these are dangerous cargo when not strapped in place and suitably distracted with a Nintendo device). Just as soon as we have self-driving cars (to save us from our multitasking primitive selves), we'll go straight back to the visual interface for whatever we're doing while the machine drives, because visual is the fat channel of human apprehension.
However, I'm not going to dock Kai-Fu Lee any points for failing to predict FOMO fatalities back in 1993 (yeah, the person killed by your hurtling hunk of metal misses out on a lot, but what can you do? "Not text? Like, how's that possible?") FOMO used to be the survival reflex, but times have changed.