Page Ranking and Search Engines - Computerphile

Computerphile

Published 7 Feb 2025

COMMENTS

@necrofalcata 9 years ago ⁺⁵
This is awesome :) If you had told someone 17 years ago that the idea explained in this video would spawn a multibillion-dollar company, I don't think they would have believed you.
@CaptTerrific 9 years ago ⁺³⁴
As a marketing data analyst, I can tell you that although Google may use these complicated, multi-faceted algorithms, at the end of the day, if there are retailers that match a given search, one of them will invariably show up on the first page.
Sure, it's logical that Google needs to point people to purchases in order to be able to sell their own ad business... but I miss the old Google (circa 2000) when if you searched for, say, "sweater," you'd get nothing but results for how sweaters are made, the history of sweaters, and how sweaters are a secret government plot.
I'm honestly afraid to search for things I'm legitimately interested in anymore - not because I believe Google cares about my life, but because I'm sick of getting retargeted ads on every single website after I perform a search.
@klieu90210 9 years ago ⁺²
+Higgins2001 I'd say fewer people are writing about the history of sweaters, their manufacture, etc., and more retailers are selling them and have websites on which they can be sold.
@JasonOlshefsky 9 years ago ⁺¹⁰
+Higgins2001 I too miss the days when Google search was comprehensible: I used to know that it would favor things where the words I typed were near one another in the text of the page, and if I put it in quotes it would ONLY show me quoted strings (e.g. an arbitrary choice of "toasty horse" returns "Toasty's horse rider", which is not what I typed). Now I have to fight against it guessing what my habits, preferences, and location are. And especially that it will always point me to a new product to buy, often when I'm actually looking for information on an older model, for instance.
@stoppi89 9 years ago ⁺⁴
+Higgins2001 It could be that a great portion of that is not Google changing the algorithm, but rather more people buying stuff online, on more online platforms that didn't exist back then. So while people in 2000 may have searched for "sweater" mostly to learn about how they are made, today most people might search for "sweater" because they want to buy one online. In 2000 they bought the sweater in a RL shop. This might be a huge factor, it might be a tiny factor, I don't know, and I'm not a marketing data analyst.
@stoppi89 9 years ago ⁺⁵
+Jason Olshefsky Try DuckDuckGo, for the habits, preferences and location I mean. That search engine doesn't use these factors (AFAIK).
@rogerwilco2 9 years ago ⁺¹
+Higgins2001 I use NoScript to make it hard for website owners to figure out who I am, so they can't tell Google and have them do targeted ads. This doesn't just apply to Google.
Google knows when I click on a link in their search engine and what videos I watch on YouTube. That already gives them a lot of information.
@HexerPsy 9 years ago ⁺⁴²
What about images? How does a search engine determine how to rank images?
@stoppi89 9 years ago ⁺¹⁷
+HexerPsy I guess by filename, the description on the website where it was found, the website itself, the text on the website, and the metadata of the file. Maybe even color (if you search for yellow car, pictures with a lot of yellow in the center and named car.jpg).
@HexerPsy 9 years ago
Stoppi Hmmm... probably...
@forevatrolling 9 years ago ⁺⁴
+HexerPsy I think they sort images based on the page the image is on, the text around it and the name of the image; they also store a bunch of metadata about it.
@stoppi89 9 years ago
username_unavailable
I agree
@modellking 9 years ago ⁺²
+HexerPsy Further, I think they search for "similar" pictures and look at how often they were clicked in the search to categorize the picture, adding their own metadata to their DBs this way.
@DisdainforPlebs 9 years ago ⁺³³
He wears sunglasses indoors, so you know he's a real hacker.
@romaknafel4116 2 years ago
Hahahahaha
@Triantalex 2 months ago
false.
@vkillion 9 years ago ⁺¹
Before Google started getting big, I really liked Dogpile. It was a metasearch engine. It searched using other search engines and compiled the results.
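A minimal Python sketch of the "search several engines and compile the results" idea. The merging rule here is reciprocal rank fusion, which is an assumption chosen for illustration and not necessarily what Dogpile did; the engine result lists and the `merge_rankings` name are hypothetical.

```python
from collections import defaultdict

def merge_rankings(rankings, k=60):
    """Combine several ranked result lists with reciprocal rank fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        for position, url in enumerate(ranking, start=1):
            # A result earns more credit the higher an engine placed it.
            scores[url] += 1.0 / (k + position)
    # Highest combined score first; ties broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Made-up result lists from three hypothetical engines.
engine_a = ["a.example", "b.example", "c.example"]
engine_b = ["b.example", "a.example", "d.example"]
engine_c = ["b.example", "c.example", "a.example"]

merged = merge_rankings([engine_a, engine_b, engine_c])
print(merged)
```

A result that several engines rank highly rises to the top, while one that only a single engine returns sinks toward the bottom.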
@Roxor128 9 years ago ⁺²
You could use an evolutionary approach to balancing the different weighting factors.
Have 100 servers running versions of your algorithm with different weightings, each server getting 1% of all your search engine's total traffic, then mate the ones which get the top results the most often to produce the next generation.
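The evolutionary idea above could be sketched roughly like this in Python. Everything here is an assumption for illustration: the `evolve` function, the three weights, the mutation size, and above all `simulated_fitness`, which stands in for "how often this server's top result was the one users wanted".

```python
import random

def evolve(fitness, n_weights=3, population=100, generations=30, seed=0):
    """Tune a list of weights with a very small genetic algorithm."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_weights)] for _ in range(population)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: population // 5]  # keep the best 20% unchanged
        children = []
        while len(parents) + len(children) < population:
            mum, dad = rng.sample(parents, 2)
            # Crossover: each weight comes from one parent.
            # Mutation: add a small Gaussian nudge.
            child = [rng.choice(pair) + rng.gauss(0, 0.02) for pair in zip(mum, dad)]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Hypothetical stand-in for live traffic: pretend users are happiest
# when the weights are close to TARGET.
TARGET = [0.6, 0.3, 0.1]

def simulated_fitness(weights):
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))

best = evolve(simulated_fitness)
print(best)
```

In a real deployment the fitness would come from measured user behaviour on each server's share of traffic, which is far noisier than this toy function.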
@BorealSelfReliance 9 years ago
The way this video described search engines in the first part was really interesting (no new information for me, but it was presented in a really interesting way).
@Sharir1701 9 years ago ⁺²
I like the shoutout to Tom Scott there ;) He definitely deserves it!
@BrunsterCoelho 9 years ago ⁺¹
keep these videos coming, really interesting subject and this guy explains it really well
@MaxJNorman 9 years ago
Very happy with the amount of Father Ted in these videos
@wlfbck20 9 years ago ⁺²⁸
How do search engines keep their (necessarily) high response speed if they cross check pages with the user profile? Especially for more complex checks, for example: You mentioned the search for a horse show and the conclusion from you being a CS that it's probably a present. This seems like something which just takes too long to evaluate.
@CelmorSmith 9 years ago ⁺⁴
+wlfbck20 The search engine probably first presents the best result it can right away and does the checks in the background after you click a link in the results. For example, if it was a present and you clicked a link where you can order it, it evaluates the effectiveness and the order of the results after the fact.
@stoppi89 9 years ago ⁺¹¹
+wlfbck20 Most of its computing is already done when you search for something. Google already knows where you are and whether you are a CS. So the conclusion that the horse might be a present can be made really fast, because the complicated stuff (finding out which sites are for horse presents, where you live and so on) is already computed. It's the same strategy as the "spiders". Google doesn't search the entire web every time you search for something. They already did it, and know what to show you as soon as you search, because they can just look at their "handy lists".
@detro1tjok3r 9 years ago ⁺¹
+wlfbck20 To leverage caching, they probably already categorized your profile as one of many similar profiles ahead of time. From there, it just takes one person in that group (of millions?) to take the latency hit of caching it. In other words, when you search, they may not uniquely cross check with only your profile, but a representative profile that your profile contributed towards.
@ldesltroy 9 years ago
+wlfbck20
They have autocomplete, right? :)
So it's processing the search before you click the button.
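The "handy lists" point in this thread, that the expensive work happens before anyone searches, can be illustrated with a tiny inverted index in Python. This is a sketch under simple assumptions (whitespace tokenising, every query word must match, made-up pages and function names), not a description of any real engine.

```python
from collections import defaultdict

def build_index(pages):
    """Done ahead of time: map each word to the set of pages containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Done at query time: just intersect the precomputed lists."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Made-up crawl results.
pages = {
    "horse-shows.example": "horse show tickets and horse gifts",
    "sweaters.example": "history of the sweater",
    "ponies.example": "pony and horse breeds",
}
index = build_index(pages)
print(search(index, "horse gifts"))
```

Answering a query touches only the short lists for the words typed, never the pages themselves, which is why lookups stay fast however long the crawl took.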
@harborned 9 years ago ⁺²
Would be keen to see a discussion on speed of searches and image searches (as mentioned in the comments below)
@paul_4223 9 years ago
Thanks. Very interesting video and well explained. Please keep making them about the various topics you only touched on: relevance, what methods were used in the past, why a few methods may work great at small scale but not at large scale (of users), etc.
@klieu90210 9 years ago ⁺⁸
It's hard to dislike a guy who's infatuated with horses :D
@foobars3816 9 years ago
You said there were so many other things you couldn't cover and that we should leave a comment about the type of things we want to hear about. My request is for ALL of the things you haven't discussed. If this takes 100 videos then I'm happy with that.
@rolandgharfine534 9 years ago
You're bringing back Tom Scott for another video, YAY!
@Stormfox93 9 years ago ⁺⁵⁷
So many Tom Scott shout-outs XD
@RobTyleruk 9 years ago ⁺¹
+Stormfox there were?
@firstnamelastname-oy7es 9 years ago
+Rob Tyler In the fake web page URLs featured in the nice animations.
@RobTyleruk 9 years ago ⁺²
Bungis Albondigas Oh thanks, I'll re-look at the video
@RobTyleruk 9 years ago ⁺³
+Bungis Albondigas haha saw them, they weren't even subtle!
@banandanand1515 6 years ago
Stormfox seaxyc
@RandomNUser 9 years ago ⁺¹
I would like to learn more!
I click the first link after a search; if I don't find the content I was looking for, I click the second one, and so on. Do they (search engines) also measure this stuff?
How do they measure relevance?
Would a ranking system (user driven) work for positioning, or do they expect users to just fake their scoring?
Bring us more information! =)
@VikramVetrivel1 9 years ago
+RandomVidsFrapsUser Yes, that is measured too. If for example you click on a page and then return to the search results by pressing the "back" button or by searching for the same keywords again, they track how long you spent on the page that they sent you to, and based on the time spent they'll know whether the page they served you was satisfactory or not.
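As a rough Python illustration of that time-on-page signal: the log format, the 30-second cut-off and the `satisfaction_rates` name are all assumptions made up for this sketch, not anything a search engine has published.

```python
from collections import defaultdict

def satisfaction_rates(click_log, min_seconds=30):
    """Share of visits per URL that lasted at least min_seconds."""
    totals = defaultdict(int)
    satisfied = defaultdict(int)
    for url, seconds_on_page in click_log:
        totals[url] += 1
        if seconds_on_page >= min_seconds:
            satisfied[url] += 1
    return {url: satisfied[url] / totals[url] for url in totals}

# Hypothetical log: (result clicked, seconds before returning to the results).
log = [
    ("a.example", 10),
    ("b.example", 300),
    ("a.example", 5),
    ("b.example", 20),
]
rates = satisfaction_rates(log)
print(rates)
```

A result whose visitors keep bouncing straight back would score low here and could be demoted for that query.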
@DarkParadeHF 9 years ago ⁺²
I realistically would not mind knowing the full 60-year history. By the way, thanks for the time you have already invested.
@NazimUddin-ns4pm 8 years ago
I recommend you check out the case studies around organic SERP CTR. I've seen a lot of sites move from the 20s up to first page with SerpClix.
@BunnyFett 9 years ago ⁺¹
SEO is serious business. Web development is fun.
@linkVIII 9 years ago ⁺⁹
Does the semantic web actually matter?
@Scy 9 years ago ⁺⁶
+linkviii It does if you care about people using screen readers as well as developers' sanity.
@gokiyono 9 years ago
+linkviii It makes it so much easier to read when you develop it
@dominikrodriguez456 9 years ago
+Haniff Diiltiown
@TrebleWing 9 years ago
FUN FACT: "Page Rank" is actually not named as such because it is about webpages. It was named after co-founder Larry Page.
@bikutoso 9 years ago ⁺¹
I wonder if there are some static weights over certain domains, like if I search for "Horse" I will get Wikipedia as one of the first results. Or is it just what was explained in the video?
@gizmoguyar 9 years ago
+Crozix That's a really good point. They probably do weight the Wikipedia domain as a whole because it's such an incredibly popular site. So it might be that for single word searches, or phrase-noun searches, that the domain's weight is increased on the assumption that information on the subject is being sought. For example "Horse grooming" probably won't return Wikipedia, while "Horse Breeds" might.
@stoppi89 9 years ago
+Crozix If I understand you correctly, then yes, these static weights are exactly what they were explaining. It's the link and get linked to thing. Many sites link to Wikipedia, and it is in the top 5 most visited websites. That gives it an enormous static weight. Plus, many people who search for single words (nouns) want to go to Wikipedia to learn about them. Many people know what Wikipedia is, so if an unknown website comes up first, explaining what a horse is, then many users will probably still click on the wiki link, because they know wiki explains stuff. This process makes the unknown website on top fall below Wikipedia, which is what more people want. Popular websites like Wikipedia and YouTube often show up because they get clicked a lot in searches and because of their big static weight. Although that static weight of yours is probably always changing (in small amounts). Sorry for the wall of text, thanks for reading anyway ;)
@gizmoguyar 9 years ago
Stoppi That makes sense. I think the original question was about the weighting of the entire domain. For example a very very obscure Wikipedia page might not have a high page ranking; it might have more links to sources than there are links to itself. But because the Wikipedia domain is popular the page is ranked higher. I think your explanation is good though, and definitely still applies.
@stoppi89 9 years ago
gizmoguyar Ah, I see. Well that is probably taken into account, too, yes. But another thing is, a very obscure wiki page only shows up if the search is very obscure, and therefore might not have many results, and certainly no highly ranked ones. So I guess if you search for underground stuff, it is probably even more likely to get a wiki result. Also because most people know what a horse is, but maybe not so many know what Epizootic lymphangitis is, so for that search term, more people who search it want to know what it is. (I totally searched Google and Wikipedia for that example :D)
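The "link and get linked to" weight discussed in this thread is the core of PageRank, and a minimal power-iteration sketch in Python looks roughly like the following. The four-site link graph is made up, the damping factor of 0.85 is the commonly quoted textbook value, and real engines combine this with many other signals.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Plain power iteration over a dict of page -> list of pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share ("random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Made-up link graph: everyone links to "wikipedia".
links = {
    "wikipedia": ["blog"],
    "blog": ["wikipedia"],
    "shop": ["wikipedia"],
    "forum": ["wikipedia", "blog"],
}
ranks = pagerank(links)
print(max(ranks, key=ranks.get))
```

Mathematically, the ranks this loop settles on form the dominant eigenvector of the damped link matrix, equivalently the stationary distribution of a Markov chain in which a random surfer keeps following links.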
@LikelyToBeEatenByAGrue 9 years ago
There is a feature that Google deployed which can match an image to similar images around the web. I'd love an explanation of how it works.
@1flovera 8 years ago
So once you build a graph with webpages and their ranks, what can you use to draw a conclusion from the graph?
@simoncowell1029 9 years ago
Great video, thank you! Are eigenvectors used? If so, please could you make a video about it?
@jasongladen82 9 years ago
Can you talk about the... the conflict between commercial interest (like advertisers wanting to be on the first page at any cost) and major search engines wanting to not be broken...
@Scratchifier 9 years ago
Would that mean that a forums section with external links takes away from a site's reputation? Or does user content not matter to Google?
@Malonomy 9 years ago
A thing I'd love to know is this: is it possible to do this backwards? Let me explain: here, we're talking about searching for terms X and Y in a set of documents and finding which documents are the most relevant ones.
But, say I'm reading a new document, what kind of algorithm can I use to know what it is about, and have it put in the same place with other documents that are about the same topic? TF-IDF only gives scores for individual words (as far as I know), and most often, a document's subject cannot be described with a single word (think about news articles, for instance)...
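One common way to go from per-word TF-IDF scores to whole documents is to treat each document's scores as a vector and compare vectors with cosine similarity. The Python below is a sketch of that idea with made-up headlines, naive whitespace tokenising and hypothetical function names; it is one possible approach, not the only one.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Turn each document into a {word: tf-idf weight} vector."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    # Number of documents each word appears in.
    doc_freq = Counter(word for words in tokenized.values() for word in set(words))
    vectors = {}
    for name, words in tokenized.items():
        counts = Counter(words)
        vectors[name] = {
            word: (count / len(words)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Made-up news headlines.
docs = {
    "news1": "horse wins the derby race",
    "news2": "derby race horse breaks record",
    "news3": "google updates the search ranking",
}
vectors = tfidf_vectors(docs)
print(cosine(vectors["news1"], vectors["news2"]))
print(cosine(vectors["news1"], vectors["news3"]))
```

A new document can then be filed with whichever existing documents (or cluster of documents) it is most similar to.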
@veggiet2009 9 years ago ⁺¹
Isn't there a meta search engine out there which sends a search to multiple engines and then compares and combines the results?
@NikolajLepka 9 years ago ⁺⁵
A thing to note here is that Yahoo uses Bing internally
@pcfreak1992 9 years ago
+Nikolaj Lepka Do you have any proof of that? As far as I know Yahoo was there before Bing came along...
@HydroByte 9 years ago ⁺¹
+Nikolaj Lepka
Hi, are you sure? Do you have any references about this subject? In my tests, Yahoo and Bing give different responses for the same search.
@NikolajLepka 9 years ago ⁺²
Do a quick Google search about Yahoo and Bing.
It's something about Yahoo making a deal with Microsoft a few years ago to use Bing as its back-end search technology
@HydroByte 9 years ago
+Jonas Heinrich
Nice! Thanks. We developed an anti-plagiarism app and this might help to improve performance, since Bing and Yahoo have the same engine, in theory.
@NikolajLepka 9 years ago
***** how would this information help performance?
@jasongladen82 9 years ago
How about how MapReduce is used to produce part of the index?
@lvachon 9 years ago
I'd like to know more about how these massive indexes are actually stored on the computer.
@TheRealDrWho 9 years ago
When I search, I'll typically click, have a look, and either like website #1 or press back after, say, time T1. Then I'll click on a second link, read that, and so on; either continuing for, say, 4 or 5 pages (links from my search) or maybe stopping there after the second link.
Q. Does the length of time I stay on each opened link get measured and count towards which website is better for me? I.e. if I stay on link 1 for 10 sec but link 2 for 5 min and then refine my search, is link 2 promoted as a result?
Q. Why oh why did they stop Google Desktop? It was brilliant in its day...
Q. How can an organisation (don't know if it's the right word) measure the popularity of Google vs Yahoo vs Bing and so on? Can we see how Google always stays at the top or maybe one day falls from existence...
@6l0w135 9 years ago
I'd like to learn about how online JavaScript programs such as games are found by Google. They don't contain much text except for their title, so how does Google know that they are relevant?
@jeremiejollivet4683 9 years ago
Now that people ask questions rather than typing keywords, I wonder how the outputs of search engines differ.
@CarterColeisInfamous 9 years ago
PageRank is so old; it's evolved into something much more as social media cannibalizes the link graph and the signals become spammed by link buyers
@SuviTuuliAllan 9 years ago ⁺²
So what do you guise use? Seeks? YaCy? Startpage? DuckDuckGo?
@Humineral 9 years ago ⁺²
DuckDuckGo primarily > Google if I don't get the hits I want > Bing if I manage to sift through every result Google throws at me.
@unvergebeneid 9 years ago
+Suvi-Tuuli Allan DuckDuckGo with a browser plugin that provides a "Search on Google" button ;)
@tllong2 9 years ago ⁺¹
+Suvi-Tuuli Allan DuckDuckGo & if that fails then the same search with !G at the end, DuckDuckGo's way of allowing you to do a search using Google, or other search engines using different letters after the '!'.
@SuviTuuliAllan 9 years ago
Neat, yeah. I hear that Bing is good for pr0n.
@shabasupermayn 9 years ago
How are matrices used in PageRank?
@shabasupermayn 9 years ago
+vandos1 Thanks, I'm trying to learn for uni as well; I want to make a search engine for a project lol
@Yupppi 3 years ago
It must be infuriatingly difficult to boost the right sites for a person who, for example, is generally curious but wants to get cold hard facts, really high-quality scientific-level information and educational content on a topic that's fairly common (or the search words are fairly common), when there exists a really fantastic individually maintained expert site that is barely known in the mainstream, but has all the possible info you would ever want about the subject. The Internet used to be full of those sites.
@hakimmolla8915 8 years ago
It is a great video. Thanks for posting it.
@MrRolnicek 9 years ago ⁺¹
Sooo, just a question to throw out there. How do you guys think Google uses its own DNS to improve the search engine? (If at all) And is it even possible?
@stoppi89 9 years ago
+MrRolnicek As far as I understand DNS and Google (not as much as you probably), their DNS is helping the spiders with locating websites and contents, but I can't really see how a DNS could improve the quality of search results. Because it doesn't read content, nor does it know why users (who use the DNS) are there. It sees WHAT they visit, but Google sees that anyway, even if you don't use their DNS. Does this help?
@MrRolnicek 9 years ago
Stoppi I guess, but how does Google see what they visit "anyway"? I mean, sure, Google knows what anyone who uses their DNS wants to visit, but unless they go there from actual Google, if you don't use their DNS then they wouldn't know?
@stoppi89 9 years ago
MrRolnicek
Oh, I meant when you visit after a Google request, because that's probably what mostly matters for the rating of sites. But apart from that, and maybe more for AdWords (Google ads) and less for the search, they track you via trackers (I don't know about that, but the Ghostery thing detects stuff), and via Google Analytics, which is a service on basically every site, they know who visited what, when, from where and so on.
@cubedude76 9 years ago
How does that ranking system work for websites that are useful because they link to so many things, like Google for example? Google links to WAY more things than it gets linked to, so shouldn't it be a very unimportant website?
@nerdytshirts8188 9 years ago
+cubedude76 Yes. But you are not going to search for "Google" on Google. If you search for "Google" on Bing, for example, Google will be at the top of the list because of the page rank for the specific term "google"; it has the most authority, namely the keyword in the URL and other people linking to it as Google.
@felipevareschi7773 9 years ago
3:38 to hell with the topics in the video, time to read all of this :D
@JuanThomas 8 years ago
Thanks, why not get some great SEO tools from *FollowingLike*?
@Caelum1337 9 years ago
I want to see more SEO related videos!
@DeltaHedra 9 years ago
My lovely Horse? Nice Father Ted reference!
@popcorn908 9 years ago
How was this type of research used 60 years ago?
@zatoichiMiyamoto 9 years ago ⁺²
hi guys, greetings from Chile! land of the earthquakes and tasty wine!!
@Pianoguy32 9 years ago ⁺¹
+zatoichiMiyamoto you forgot about copper that makes most of our computers work :p
@PJemus 9 years ago ⁺¹
+zatoichiMiyamoto your earthquakes have pathetic ground acceleration.
@legitt6093 9 years ago
I don't know what they've (Google) done, but the relevancy of my searches has only gone downhill since 2010.
@malcolmbryant 9 years ago
Please slow down and speak more clearly. You have things of value to convey.
@tomatensalat7420 9 years ago
I always thought spiders were called crawlers. Is that simply another name, or is there actually a difference?
@jjppmm29 8 years ago
spider... is that an English/British thing?.. because I have always just called them crawlers
@Muzer0 9 years ago
Is "My Lovely Horse" a Father Ted reference?
@goshisanniichi 9 years ago ⁺¹
Bayesian Networks and other probability-based methods...
@pridemechanical815 2 years ago
So a small family-owned plumbing company with limited SEO knowledge and a limited budget could not realistically compete with large-scale competitors in the area who hire a team to manage their website every day using all the Google SEO tactics.
@griffinesq 9 years ago
Cool. Father Ted reference?
@amihartz 9 years ago
I read this as "Rage Pranking" for some reason. I think I'm dyslexic.
@j7ndominica051 9 years ago
That panda looks fittingly evil. What is Google getting when it obfuscates search results under mile-long internal proxy links? Then, when that result is followed, a Google site must be called up first, to receive a redirect to the real site, which takes a while if a new secure connection needs to be established or if the Google server is busy.
@LazyMasterGamer 9 years ago ⁺²
Tom Scott FTW!
@trevorWilkinson 9 years ago
nice advertising for Tom Scott
@GuildOfCalamity 9 years ago ⁺¹⁶
Talk about the "Deep Web"
@lightsidemaster 9 years ago ⁺¹
+GuildOfCalamity Yes please
@osalbaro 6 years ago ⁺¹
That's pretty much the pages Google doesn't index
@pcfreak1992 9 years ago ⁺²⁹
This guy really seems to be nervous or on too much sugar :-D
@ButzPunk 9 years ago ⁺³⁵
+pcfreak1992 he was just distracted thinking about horses
@AnstonMusic 9 years ago ⁺²
+Ben Rowe My Lovely Horse sounds a bit fishy...
@TheWeepingCorpse 9 years ago
+Anston Music I'll just fetch my stool.
@AnimeReference 9 years ago ⁺¹⁶
+pcfreak1992 This guy's fantastic. He has the clearest expression of any of the Computerphile guys.
@stoppi89 9 years ago ⁺³
+pcfreak1992 Typical Computer Science guy when not in front of a screen and having to talk to RL people
@brookygamesvr 9 years ago
Did you say 60 years or 16 years of research? It sounds like 60.
@Obama_OReilly 7 years ago
1:23 "my lovely lovely lovely horse"???
@redex6004 9 years ago
Is Siri ever going to be any good? (Natural Language question)
@apburner1 9 years ago
Why does this guy bounce so much? Meth, coke, amphetamines?
@ian1842 9 years ago ⁺²
The way this guy wears his sunglasses annoys me.
@LeDinx 9 years ago
My Lovely Horse!
@troyadams19 9 years ago
When a spider crawls the web, why doesn't the server it's requesting the code from recognize it as a bot and refuse its request?
@ScarfmonsterWR 9 years ago ⁺⁴
+Troy Adams well, it can and some probably do, but considering how much money traffic from search engines generates it'd be counter-productive for most.
@qtheplatypus 9 years ago ⁺⁷
They can and sometimes do. Also you can add a "robots.txt" file that tells spiders not to look at some things. However, most sites want spiders to index them. Indeed, the robots.txt can be used to add info that makes the spider's job easier.
@commentpost907 9 years ago
+? the Platypus I've never thought of the spider as an anthropomorphic computer program. Until now...
@mbalicki 9 years ago ⁺¹
+Troy Adams: Most of the time (if we're not talking about deep web) one wants very much to be well indexed by a crawler. Websites are meant to be visited, read and interacted with by users, and what better way to have them on your website than to let people find you by web-searching?
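For the robots.txt point in this thread, here is a short Python sketch of how a well-behaved spider might check the rules before fetching a page, using the standard library's `urllib.robotparser`. The robots.txt content, the `ExampleBot` name and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt: keep spiders out of /private/, allow everything else,
# and point them at a sitemap to make their job easier.
robots_txt = """\
User-agent: *
Disallow: /private/
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

can_fetch_public = parser.can_fetch("ExampleBot", "https://example.com/horses.html")
can_fetch_private = parser.can_fetch("ExampleBot", "https://example.com/private/notes.html")
sitemaps = parser.site_maps()  # available in Python 3.8+

print(can_fetch_public, can_fetch_private, sitemaps)
```

Note that robots.txt is only a request: polite spiders honour it, but nothing in the file itself stops a bot that chooses to ignore it.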
@GalanDun 9 years ago
Yahoo is powered by Bing.
@tomatensalat7420 9 years ago ⁺²¹
Woho, DuckDuckGo! :)
@SkateTube 9 years ago
Question is about wanting this, but $_Gets this..
@teekanne15 9 years ago
his inability to sit still drives me nuts
@Scy 9 years ago ⁺⁴
+teekanne15 Computer guys aren't the most relaxed when we're trying to eloquently explain something. If I have to debug or refactor something by explaining to myself how it's supposed to work I have to get up and walk around. Maybe the brain is screaming for more oxygen.
@Chr0nalis 9 years ago
I am disappointed that www.pony-horse.horse does not exist.
@Neceros 9 years ago
He bounces when he talks.
@WouterWeggelaar 9 years ago
Good that DuckDuckGo popped up there.
And nice plug for +TomScott
He has done it again according to the BBC ;)
@shivi_chronicles 5 years ago
obsessed with Horses smh.
@kevinfontanari 9 years ago
*A random ordinal number*
@klieu90210 9 years ago
+Kevin Fontanari Six thousand nine hundred forty-third
@MoviMakr 9 years ago
You guys should talk about the deep web.
@mrgeorgejose9132 9 years ago
sharada reddy so to special me
I am your my best friend forevear
@SupLuiKir 9 years ago ⁺²
LAST!
(at the time of posting)
@xylongevity 4 years ago
real hacker
@jayzo 9 years ago
As an SEO, I find this video irritatingly out of date.
@lmotaribeiro 9 years ago
Markov CHAAAAAIN
@ShivKumar-lh3wr 8 years ago
SWARG
@deldrinov 9 years ago ⁺¹
First!
@vishnumangalath 9 years ago ⁺²
+deldrinov ugh...
@deldrinov 9 years ago
+VishnuM I know, but it was just 15s
@ShadowLuchs 9 years ago
+VishnuM if you would stop being annoyed by "first" sayers, they would go away. Just saying...
@tedchirvasiu 9 years ago
ayyyyyyyyyyyyyyyyyy
@DanieleTrapani 9 years ago ⁺²
+Ted Chirvasiu macarena
@ArnoldsKtm 9 years ago ⁺¹
+Ted Chirvasiu lmaoooooo
@-.._.-_...-_.._-..__..._.-.-.- 9 years ago ⁺²
You're all living in a bubble.
@ArnoldsKtm 9 years ago
+David S. Look at this guy. :D
@raynoldcsya8317 9 years ago
Second!
@muhmmadtehseen7109 8 years ago
.
