Although he tries to pump some intuition by talking about random points and an expected value, the actual question is "What is the average distance between points in a unit square?" meaning ALL points within the unit square. No distribution needed since you simply take each point (an infinite number of them) and compare it to every other point.
@@APaleDot And yet if you took a function such as f(t) = 1.5 for t < 1/2 and f(t) = 0.5 for t > 1/2, it is a probability distribution that integrates to 1 over [0,1]. Apply it to x and y, and you still get all points within the unit square. But somehow the points near 0 will appear more frequently, and that's okay since there's an infinity of them, you won't run out. I'm not even going to try to work out what it does to the average distance - I'm definitely not that good at quadruple integrals - but I feel confident the result would be different.
@@emmanuellaurens2132 Sure, the result would be different if you used a probability distribution to sample points in the square. But why are you sampling points at all? To calculate the average distance between points in the square, you just look at every point in the square exactly once. There's no probability there.
@@APaleDot There is probability there, even though it is hidden: When you "look at every point inside the square once", each point inside the square will enter your focus exactly as often as any other point inside the square. In other words, to your focus the points inside the square appear to be uniformly distributed! @Emmanuel Laurens I completely agree - the reader should _never_ have to assume a probability distribution function. Even though it seems to be common one should assume uniform distribution if no pdf was specified...
I'm quite sure this problem is not well defined. I think this is way more difficult to calculate (I didn't do it), but you could imagine randomly generated segments with one point (the middle of the line), one angle, and one length (the upper bound of that length is the difficult part to define). There are 4 parameters, so everything is good. Except I'm quite sure the answer is not the same. The problem is: what is random?
I'm sorry, is y equals x tan theta even a valid substitution? You don't know if y will ever equal x times tangent theta, because x is a point in the square, not some dummy variable.
Isn't it somehow related to the Bertrand paradox? We can't say that every point has the same probability of appearing if we are talking about infinite sets; we need some probability distribution.
Whenever I see a problem that starts with "choose two points at random in this 2D region" I hear bells and whistles going on somewhere in my subconscious.
No, not really: a uniform distribution over the square is really the only natural "random" distribution for a point. Bertrand's paradox is more about situations where the probability distribution is ambiguous. Multivariate continuous distributions exist and there are no issues with doing statistics with them.
It probably should state the distribution, but if it's not stated then I think it's safe to assume it is uniform. HW: solve for all cases of the beta distribution in terms of parameters a and b.
You'd think if he had a non-uniform distribution he'd have said it. The most "natural" thing to do is assume uniformity, since he didn't say (for example) "this corner of the square has more points than the others".
Could you please write down or show me any basic background knowledge, or a video, for learning about this subject? I need to grow a little bit with the integrals that you teach here. I appreciate it. Before this, I have learned many of the things that you teach. Thanks a lot.
For anyone wondering, the numerical value is about 0.5214
Thank you. I was just pulling out the calculator to find that out myself.
Makes sense when the diagonals are considered
That is a good place to stop.
@@Eianex engineering math...
Like when π=e=3
@@Eianex I am an engineer. I guessed at the answer... my guess was 0.5
Love how he is 20 minutes in and then just jumps to the solution. Reminds me of my undergrad math professor for fluid mechanics.
It is interesting how adding more dimensions increases the average distance between points in a unit cube (it's a small exercise to check that for 1-dimensional cube the answer indeed is 1/3).
Here are the results of a small simulation in JavaScript (1 to 10 dimensions, 10^7 samples):
dim=1: 0.3333
dim=2: 0.5215
dim=3: 0.6616
dim=4: 0.7777
dim=5: 0.8786
dim=6: 0.969
dim=7: 1.0515
dim=8: 1.1282
dim=9: 1.1999
dim=10: 1.2674
It feels like this can be attributed to the fact that with increasing dimensions, the outermost part of the unit cube takes more and more relative volume. However, the maximum possible distance √dim (diagonal) is also increasing, so these averages are likely to approach infinity, but there is one interesting thing:
There seems to exist a limit of AvgDist(dim) / √dim, and it is around 0.4082
I have no idea what it means, maybe that is some known constant :D
Upd: 0.4082 is pretty close to 1/√6, if that helps.
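The JavaScript listing itself isn't reproduced in the comment above; here is a rough Python/numpy equivalent (a sketch with a smaller sample count than the 10^7 quoted, so the last digit will wobble), which also prints the AvgDist(dim)/√dim ratio:
----------
import numpy as np

SAMPLES = 10**6  # fewer than the 10^7 above, but enough for ~3 stable digits

for dim in range(1, 11):
    p = np.random.uniform(size=(SAMPLES, dim))
    q = np.random.uniform(size=(SAMPLES, dim))
    avg = np.linalg.norm(p - q, axis=1).mean()
    print(f"dim={dim}: {avg:.4f}   avg/sqrt(dim) = {avg / np.sqrt(dim):.4f}")
----------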
The fact that more dimensions increase the distance is logical, I think (for a cube), because the extra dimension can only increase the distance between points: it would only remain the same if the points have the same coordinate on that extra dimension, and can never decrease.
Evaluating that constant scale factor relative to sqrt(dim) is a nice exercise though. You're essentially determining sqrt(sum(x_i^2)/N), where x_i is the difference between two uniformly distributed variates. This is the square root of the mean of that squared difference. That mean is 1/6, it turns out, so that constant is 1/sqrt(6) indeed.
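A quick symbolic check of that 1/6 (a sketch using sympy, which isn't used elsewhere in this thread):
----------
import sympy as sp

x, y = sp.symbols('x y')
# mean of the squared difference of two independent U(0,1) variates
mean_sq = sp.integrate((x - y)**2, (x, 0, 1), (y, 0, 1))
print(mean_sq)                                     # 1/6
print(sp.sqrt(mean_sq), float(sp.sqrt(mean_sq)))   # sqrt(6)/6 ~ 0.4082
----------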
It means that the average segment length is, for all practical purposes, directly proportional to the length of the longest diagonal: AvgDist(dim) = 0.4082 * √dim, especially for high dimensions. To be more precise, the average segment length is somewhere between roughly 33.33% and 40.82% of the length of the longest diagonal, smaller for low dimensions and higher for high dimensions, but that's still pretty well bounded for a quick estimate. If you don't know the dimension, you can just use the constant and you won't be far off.
It has implications for high-dimensional stochastic processes, where increasing the dimensionality of the representation increases the average distance between randomly generated samples or noise, or the average length of vectors generated from such stochastic data, which means the error/noise grows with dimension. If that is a problem, you can maybe stabilize it by dividing or multiplying the appropriate terms in the equations by the square root of the dimension or its powers (e.g. the second power is just dim), which pretty much stabilizes the average length regardless of the dimensionality of the representation. The constant itself is not the important thing here; the practically linear proportionality between the average segment and the longest diagonal is.
@@popularmisconception1 en.m.wikipedia.org/wiki/Curse_of_dimensionality
prove that the answer is exactly sqrt((n-1/3)/6) where n is the number of dimensions
ok that's probably wrong since it isn't what he gets in the video
One interesting detail here is that if you add a third random point and examine the average area of the triangle with vertices at the points, the problem becomes much simpler. It's even possible to deduce the answer (11/144) without using integration.
But didn’t you need to crosscheck with the general solution?
without integration? how exactly is that possible?
How is possible without integration?
How?
Hahahaha. What a little troll.
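For what it's worth, the 11/144 ≈ 0.0764 figure is easy to sanity-check numerically; a quick Monte Carlo sketch (this is not the integration-free argument hinted at above, just a check of the number):
----------
import numpy as np

N = 10**6
a = np.random.uniform(size=(N, 2))
b = np.random.uniform(size=(N, 2))
c = np.random.uniform(size=(N, 2))
# triangle area = |cross(b - a, c - a)| / 2
cross = (b[:, 0] - a[:, 0]) * (c[:, 1] - a[:, 1]) - (b[:, 1] - a[:, 1]) * (c[:, 0] - a[:, 0])
print(np.abs(cross).mean() / 2)   # ~11/144 = 0.076388...
----------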
It is pretty funny how much more complicated the closed form of the answer is compared to what you'd expect from the layout of the problem. You would think the average distance between two points on a square would simplify down to something pretty rudimentary but instead it's got this weird sum of fractions involving the number 15 and a natural log out of seemingly nowhere.
I agree! It seems like a classic problem where a nice constant or solution would pop out but it's almost more interesting because of how nasty the closed form solution appears!
A slight reframing: we first compute the distribution of X ~ Y ~ U(0,1) - U(0,1) ~ U(0,1) + U(-1,0) (random variable obtained by taking the difference of two independent uniforms on (0,1)) and then compute the expectation of sqrt(X² + Y²)
The distribution of the difference U(0,1) - U(0,1) is the same as that of U(0,1) + U(-1,0) whose pdf is a triangle function (as convolution product of two rectangle functions) which is centered around 0.
This is why we split the integral and get affine-linear terms like (1-x) and (1+x).
The sqrt(X² + Y²) distribution reminds me of the chi-distribution, except that we swapped out the Gaussian random variables for our "triangle distributed" random variables X and Y, so I wonder if there's a connection between those.
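Following that reframing, the expectation can be checked as an ordinary 2D integral against the two triangle densities; a sketch using scipy.integrate.dblquad (the weight (1-|x|)(1-|y|) on [-1,1]² is exactly the triangle pdf described above):
----------
import numpy as np
from scipy.integrate import dblquad

# E[sqrt(X^2 + Y^2)] where X and Y each have the triangle pdf 1 - |t| on [-1, 1]
integrand = lambda y, x: np.sqrt(x**2 + y**2) * (1 - abs(x)) * (1 - abs(y))
value, err = dblquad(integrand, -1, 1, lambda x: -1, lambda x: 1)
print(value)   # ~0.521405
----------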
The channel mindyourdecisions had this problem a couple of years ago. A similar problem, the average distance of two points on a circle, was presented on 3blue1brown IIRC. Interesting problems, the math is over my head but I like visualizing them using geogebra or programming.
It should be Mathemaniac who presented the circle version of the problem, he's also using manim so the style of his video is very similar to 3b1b's.
Here's the link to the video series:
ua-cam.com/play/PLDcSwjT2BF_XRsHboeAQ5esyUd6RdHtgC.html
@@Boringpenguin I found the video: it was Numberphile in collab with 3b1b.
ua-cam.com/video/mZBwsm6B280/v-deo.html
That was a shit video, he glossed over the hard details because he's a hack and then went fully into the hard details. I watched that video so many times and each time I hate it even more. If anyone uses that video as a template to instruct they'll be a horrible professor. He's a great example of someone that just uses the algorithm for profit.
@@thelightningwave Why do you keep watching it, if you hate it?
@ I don't. I mean I kept watching that video to understand what he was talking about, but each time I watched it I just realized that it's a horrible description/explanation of the problem. It wasn't until I had watched it so many times, and Flammable Maths openly mocked him, that I realized that MindYourDecisions is just math clickbait and doesn't really explain things in a concise manner. I watch at most three of his videos a year. I used to gobble his videos up like the end of the world was nigh.
Look at a general (orthogonal) rectangle with horizontal side x and vertical side y inside the unit square. Without loss of generality assume P to be the bottom left corner and Q the top right corner. The required distance is √(x²+ y²). The weight of this value is the wiggle room of the rectangle within the square, which is (1-x) horizontally and (1-y) vertically. The average distance between two arbitrary points becomes (with integration boundaries x,y both in [0,1])
∫∫(1-x)(1-y)√(x²+ y²)dxdy/∫∫(1-x)(1-y)dxdy
because we have to divide by the average of the weight factors.
Which on WolframAlpha gives the same answer!!
Excellent work! This is a better answer
Elegant!
Doesn't anyone agree that the sub he does of y equals x tan theta shouldn't even be allowed? You don't know y will ever equal x times tangent theta; y will only equal that for some points. Seriously, I don't see why anyone would ever think of that. Why not just y equals tan theta and x is sec theta, or cosine and sine, something sensible and more logical!!
A similar approach. If you consider the positive lengths in the whole unit square then you get 4∫∫(1-x)(1-y)√(x²+ y²)dxdy directly with the factor of 4 coming from the 4 cases (+,+), (+,-), (-,+),and (-,-) for the signs of the coord. differences.
When calculating the integral (starting from 13:30), it is really better to go to the polar coordinates.
4*2*∫[0;π/4]∫[0;1/cosφ] (1-r*cosφ)*(1-r*sinφ)*r^2*dr*dφ=
The multiplier 2 compensates for the fact that we integrate up to π/4 and not up to π/2.
=... =∫[0; π/4] [(2/3)*(1/ cosφ^3)-(2/5)*(sinφ/cosφ^4)]dφ= ln(√2 +1)/3+(2+√2)/15.
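A quick numerical check of that last single-variable integral against the stated closed form (a sketch using scipy.integrate.quad):
----------
import numpy as np
from scipy.integrate import quad

f = lambda phi: (2/3) / np.cos(phi)**3 - (2/5) * np.sin(phi) / np.cos(phi)**4
numeric, _ = quad(f, 0, np.pi / 4)
closed = np.log(np.sqrt(2) + 1) / 3 + (2 + np.sqrt(2)) / 15
print(numeric, closed)   # both ~0.521405
----------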
Yeah, don't you agree that substitution was INSANE and should NOT BE ALLOWED? I have NEVER seen a sub where you INCLUDE THE OTHER VARIABLE in the sub. What the heck, y equals x times tan theta? How and why put x in there?
I don't see why or how ANYONE would EVER think of that. Come on, that shouldn't even be allowed. Why do that? Maybe y equals cosine or just tangent of x... why not just that?
The ‘editor’ still didn’t fix the limits properly!
great solution, and a nice little lunch break project!
import math
import random

data = []
lengths = []
trial_number = 10**6

# generate 2 * trial_number random points in the unit square
for n in range(2 * trial_number):
    point = [random.uniform(0, 1), random.uniform(0, 1)]
    data.append(point)

# pair point n with point n + trial_number and record their distance
for n in range(trial_number):
    length = math.sqrt((data[n][0] - data[n + trial_number][0])**2
                       + (data[n][1] - data[n + trial_number][1])**2)
    lengths.append(length)

average = sum(lengths) / len(lengths)
print(average)
If you like these kinds of simulations, I recommend looking into the numpy package and its random features. This simulation can be done with:
----------
import numpy as np
TRIALS = 10**6
# Make 2 arrays with random 2D points. (the default is already between 0 and 1)
points1 = np.random.uniform(size=(TRIALS, 2))
points2 = np.random.uniform(size=(TRIALS, 2))
# For each pair of points, calculate the difference of their coordinates.
# Example: the pair (0.2, 0.8) - (0.3, 0.5) will give (-0.1, 0.3).
# You can think of this as a vector that takes you from point2 to point1.
diff_vectors = points1 - points2
# Now, for each of these differences, calculate the "size" of each of those vectors.
# axis=1 makes it calculate the size of *each* vector, instead of the whole matrix.
distances = np.linalg.norm(diff_vectors, axis=1)
print(distances.mean()) # built-in average
--------
This is shorter, faster to write, more readable, can scale to higher dimensions easily and runs *much* faster. I got 40x the speed.
You won't be able to tell the exact answer from this
@@tmpqtyutmpqty4733 well, yes, of course. Good luck solving the 1000-D case analytically though!
@@lidarman2 You should start your avgdst with None, not 0.0. Zero is not a reasonable value. (And overriding all initial zeroes would even be wrong in general. Your solution can't tell whether avgdst has just not been set or whether all it currently just happens to sum to 0.)
Numerically, it's better to just add all the samples, and divide them once at the end.
@@lidarman2 Who suggested creating the whole list?
Yes, for your quick and dirty exercise, it's fine.
You can keep a running count and a running sum, and at the end you divide them. No need to keep a whole list.
That's simpler than working out the math like you did, and has fewer numerical problems.
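A minimal sketch of that running approach; instead of a running sum one can even keep a running mean directly (mean += (d - mean) / n), so nothing large is ever stored:
----------
import math
import random

mean = 0.0
n = 0
for _ in range(10**6):
    d = math.hypot(random.random() - random.random(),
                   random.random() - random.random())
    n += 1
    mean += (d - mean) / n   # incremental mean update; no list of samples kept
print(mean)   # ~0.5214
----------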
At 19:37 I noticed that log((1+sqrt(1+x^2))/x) = log(1/x+sqrt((1/x)^2+1)) = arcsinh(1/x). It makes integration by parts simpler. After that, I used a tangent substitution.
“So I think you can already see why this very easy to state problem is actually quite gnarly.. Okay good!”
Great calc3 refresher. Touched some points I was thinking about last night for differential forms.
Solving this problem with the Crofton formula works really nicely (easier for the average distance on a disk)
In case you were wondering [spoiler alert] [(2 + sqrt 2) / 15] + [ln(1 + sqrt 2) / 3] = 0.5214 to 4 dp, so if you guessed "a half" you were close
drive.google.com/file/d/1antIBxzFTw5Cfvw6nxMKLn_634byUxDA/view?usp=sharing
I was slightly puzzled by the speedy intro of 4D space at the start, and by the near-absence of any mention of the probability function. We are calculating the "Expectation Value" of the distance, d(P, Q) between points P, Q of a unit-square. The distance function is a function of 4 parameters , the (x,y) co-ords of points P, Q, which can each be chosen independently in the set [0, 1]; hence a 4D probability space. OK, got that.
Call this the 4D unit-cube, 4U. The required EV is then the integral over all points of 4U of prob(P, Q) x d(P, Q), where prob (P, Q) is the probability of choosing the 4 parameters of (P, Q). Since the choices are all distinct and independent, we have prob(P, Q) = 1 / total number of (P, Q)s. But what is the total number of (P, Q)s?
It took me a minute to twig that the needed "counting" function is the function that "clocks" (takes the value) "1" for each choice of (P, Q), but that's just the "characteristic function", χ on 4U, which when integrated over 4U yields the "vol" of 4U (I think the Prof mentioned this)
Thus prob (P, Q) = 1 / vol 4U, which is just a number which can be factored-out of our original integral; and, in fact, it's just the number 1 (which the prof also mentioned), so it, in effect, disappears
Sorry, rather laboured all that
The integral sums all the distances. Should we explicitly divide it by the area of the square (although it is equal to 1) to get the average?
It's definitely there in the definition of the average, you are right. Making it explicit would depend on the audience, I guess.
You should make a video on that integral at the end with the natural log.
Thanks for the upload
Btw, the average "Manhattan distance" between two points on a unit square is 2/3.
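That 2/3 is just twice the 1D average of 1/3, since the coordinates separate; a one-liner to check it (a sketch):
----------
import numpy as np

d = np.abs(np.random.uniform(size=(10**6, 2)) - np.random.uniform(size=(10**6, 2)))
print(d.sum(axis=1).mean())   # ~2/3
----------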
I remember that one other math channel posted same question, but using a little bit different concept about probability density function in the middle.
I think this was your best video so far!
could do a discrete version with an n x n lattice and compute all the distances. then increase n and see if it converges towards your value.
It seems like the answer should have been much cleaner, and that there would be a way to get it just by arguing from symmetry, but unfortunately not.
I was lost by about two minutes in. Why is a four-dimensional hypercube involved?
What does it even mean to integrate the distance formula? Like I understand integrating a function over some interval, and I get (I think?) that the integral(1 dxdydz)= the volume of that bounded region, but.... yeah I'm stuck.
I am going to go learn about Jacobian matrices as I went fully cross-eyed by 4:44.
I really thought there would be a way to do this without integration, using some clever trick.
By the way, you sure of your answer? I put the integral that we had at the start ( the integral of Sqrt[(x2-x1)^2+(y2-y1)^2] ) in Mathematica and it spit out this as the answer: 1/120 (16 + 8√2 + 80π - 70 Log[1 + √2] - 15 Log[2] - 40 Log[3 - 2√2] + 30 Log[2 + √2])
In particular there is a π in the answer 🤔
I put the final answer in a calculator and got 0.52140543316.
I ran one million simulations in excel and got 0.521539
Probably need a lot more simulations...
That's actually amazingly good considering floating point side effects.
Jacobian is there, because an ordinary "substitution" such as u = 3x will "inflate" the region by a factor of 3 .. and so the same quantity (such as 9x^2 + 7, or aka u^2 + 7) placed into a new domain (such as u : 0 to 30, instead of x : 0 to 10) ... well it just aint right. Without some sort of Jacobian factor (such as a 1/3 = dx/du). Otherwise you're getting your quantity accumulated across three times as much territory ...
Some knowledge from probability theory allows the 4-dimensional integral to be transformed into a two-dimensional one more quickly.
A random variable A is the difference between two independent random variables x1 and x2, uniformly distributed over the interval [0,1] : A=x1-x2.
Its distribution density is constructed as a convolution of the distribution densities x1 and x2.
Result: pA(x)=1-∣x∣ if x∈[-1,1] ;pA(x)=0 if x∉[-1,1].
The same result for the distribution density pB(y) of a random variable B=y1-y2.
According to the average calculation rule:
Avg=∫[-1,1]∫[-1,1] sqrt(x^2+y^2)*pA(x)*pB(y)dy*dx=
=4∫[0,1]∫[0,1] sqrt (x^2 +y^2)*(1-x)*(1-y)dy*dx.
It is clear that for an n-dimensional unit cube, calculations can be performed using the formula
Avg=2^n*∫[0,1]∫[0,1]∫[0,1]...... (1- x)*(1-y)*(1-z)......*sqrt (x^2+y^2+z^2......)dx*dy*dz....
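That n-dimensional formula is easy to spot-check by Monte Carlo without evaluating the nested integrals; a sketch (for n = 3 the value should land near the 0.6616 reported in the simulation comment above):
----------
import numpy as np

def avg_dist(n, samples=10**6):
    t = np.random.uniform(size=(samples, n))
    weights = np.prod(1 - t, axis=1)
    return 2**n * np.mean(weights * np.sqrt(np.sum(t**2, axis=1)))

for n in (1, 2, 3):
    print(n, avg_dist(n))   # ~0.3333, ~0.5214, ~0.6617
----------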
do you know how to prove that it diverges for n -> inf? i have been thinking and i can't do it
@@lPlanetarizado Avg(n)= 2^n*∫[0;1]...∫[0;1](1-t(1))*...(1-t(n))*sqrt [t(1)^2+...+t(n)^2]dt(1)*...dt(n). (1)
The inequality about the arithmetic mean and geometric mean gives
[ t(1)^2 +...t(n)^2]/n ≥[t(1)^2*...*t(n)^2]^(1/n)=[t(1)*...*t(n)]^(2/n).
So sqrt [t(1)^2+...+t(n)^2]≥ sqrt(n)*[t(1)*...*t(n)]^(1/n). (2)
Of (1) and (2) Avg(n) ≥(2^n)*sqrt(n)*{∫[0;1] (1- z)*z^(1/n) dz}^n=...
...= (2^n)*sqrt(n)*{1/(1/n +1)- 1/(1/n +2)}^n =
= (2^n)*sqrt(n)*{n^2/[(n+1)*(2n+1)]}^n =
= sqrt(n)/[(1+1/n)^n*(1+1/(2n))^n]
It turns out that for n ≫1 Avg(n)≥sqrt(n)/[e*sqrt(e)]=sqrt(n/e^3).
The estimate is rough, but shows that Avg(n) is proportional to sqrt(n).
@@lPlanetarizado You can use en.wikipedia.org/wiki/HM-GM-AM-QM_inequalities to prove divergence in the original formulation.
You don't need to solve the problem to a closed form to prove divergence.
@ I see... I knew there had to be an easier way. I tried the integral, just for fun, with a spherical coordinate substitution, but you get a complicated expression too.
Not that it matters in this case but your Jacobian is the wrong way round. It should have the new variables in the numerator.
(1) What is the average distance of 2 points on the edges of a unit square? (2) How does the ratio of question (1) to the original average in the video change as dimension grows towards infinity?
Michael's result is about 0.52. Are there bounds we can put on the result before we work out the integrals? Certainly the average is in (0,sqrt(2)).
It's very quick to work out that the result is exactly 1/3 in one dimension, and then using the triangle inequality gives you an upper bound of n/3 in n dimensions.
You can use en.wikipedia.org/wiki/HM-GM-AM-QM_inequalities for some bounds. Or more generally, artofproblemsolving.com/wiki/index.php/Power_Mean_Inequality
min distance = 0, max distance = sqrt(2) ~ 1.4142, the average distance = (2 + sqrt(2))/15 + ln(1 + sqrt(2))/3 ~ 0.521405
why are we using a 4th dimension when a square only lives within 2 axes? kindly explain, anyone
It can be interesting to ask the same question for a square with non-constant density. Depending on the density function there should be parts of the square with "more points" than others, so the average distance should change. However analytically solving the integral may really become challenging if not impossible
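Even when the integral is intractable, the non-uniform case is easy to explore numerically. A sketch, assuming (arbitrarily) that each coordinate is drawn from a Beta(2, 5) distribution, just to watch the average move:
----------
import numpy as np

N = 10**6
# Beta(2, 5) coordinates cluster towards the lower-left corner of the square.
p = np.random.beta(2, 5, size=(N, 2))
q = np.random.beta(2, 5, size=(N, 2))
print(np.linalg.norm(p - q, axis=1).mean())   # noticeably smaller than 0.5214
----------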
Hi,
Seems like the distance d((x1,y1),(x2,y2)) is counted twice, once as d((x1,y1),(x2,y2)) and another time as d((x2,y2),(x1,y1)), no?
I don't think the variable substitutions fix that; one of the factors of 2 has to be cancelled.
To answer that question, maybe we could decompose into 4 squares and calculate the value for two non-overlapping squares (where you avoid this potential double counting).
Yes, there is double counting, in the sense that you can swap the points and get a configuration counted a second time even though it seems to be the same one, but that is how the problem is defined: there is a first point and a second point.
If we insisted on not distinguishing the points, we would have to cut the configuration space in two, but then we would also have to divide by the total mass of the configuration space (the quadruple integral with nothing inside it), which would be 1/2, so the final result is the same.
If we were in the discrete case we would be bothered by the configurations where the two points coincide, which are not double-counted, but here that is an event of probability zero, which doesn't count.
@@3x3-x3x-oXo Oh my bad, you are right: when you calculate the weight with the integral of 1, it's the same thing, you count the points twice, so the factors cancel each other.
(The French flag is there only to apologize for my poor English, but we have to write in English.)
I gave this a try without watching the video and thought this might be the solution:
integral from 0 to 1 integral from 0 to 1 (2-x-y)*(sqrt(x^2+y^2)) dx dy
But that ended up being something that I can't solve and that is equal to around 0.651757 according to WolframAlpha. I don't really know why my idea is wrong but I definitely wouldn't have tried to use a quadruple integral after already being unable to solve a double integral.
being a statistician i'd have done it with probability theory:
- the components of P and Q are IID uniform
- the absolute difference of two uniform RVs is a triangle distribution
- squaring is a change of variables
- summing two such RVs is convolution, probably not tractable so we leave it as is
- law of the unconscious statistician turns this into a double integral, even more ugly but the fact that this video exists at all indicates it's tractable
You'd end up with the exact same computations in the end.
@@3x3-x3x-oXo yes, it's exactly the same in the end; the point of the comment is that probability theory provides conceptual chunks that get you to the final integral faster, e.g., it makes it clear that you can handle the two dimensions separately so you don't have to write down a quadruple integral
with a million samples i get 0.521395184807978, and the exact value is 0.521405433164720678, so the simulation checks out
I came up with a solution to this problem that ended up being incorrect, but I’m still not entirely sure what is fundamentally wrong with it.
It is well known that the average distance between two points on a line of length 1 is 1/3. Therefore the expected values for (x1-x2) and (y1-y2) should also be 1/3. Plugging this into the distance formula gives a result of sqrt2/3
That's because in general avg(F(x,y)) is not equal to F(avg(x),avg(y)). In fact that is a very wrong assumption in most cases. Take for example kill death ratios of a videogame scoreboard. The average kill death ratio is not necessarily 1, but the ratio of average kills and average deaths is always necessarily 1, since kills=deaths.
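A tiny numerical illustration of the difference, using this exact problem (a sketch; dist_of_avgs reproduces the sqrt(2)/3 ≈ 0.4714 from the comment above):
----------
import numpy as np

N = 10**6
dx = np.random.uniform(size=N) - np.random.uniform(size=N)
dy = np.random.uniform(size=N) - np.random.uniform(size=N)

avg_of_dist = np.sqrt(dx**2 + dy**2).mean()                           # ~0.5214
dist_of_avgs = np.sqrt(np.abs(dx).mean()**2 + np.abs(dy).mean()**2)   # ~sqrt(2)/3 = 0.4714
print(avg_of_dist, dist_of_avgs)
----------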
So what is the final answer? Something to compare to the intuitive guesses we all made at the start. Something close to 0.5 I think?
Even by intuition, 2 points on a diagonal will cause the final average to EXCEED 0.5, so intuition should say the avg must be >0.5
maybe there can be some geometrical trick to calculate it easier?
What a brilliant use of mathematics, you'd expect a simpler solution to this problem!
The simple solution is just running a simulation
I saw this on Presh Talwalkar's channel a while back, loved it then, still love it now.
Fun fact: the average distance between 2 points in a unit cube also has a closed form, and yes, you do get 6 nested integrals.
I don't know whether there's a closed form for 4+ dimensions
Please make a video on the derivation of the probability of the distance between two points being less than x.
Why not just write x1 minus x2 as delta x and the difference in y as delta y then it becomes a much easier 2d integral?
That's more or less what he's doing in a roundabout way in the video.
8:54
I'm confused why are the bounds of the third integral sign thingy 0 to -1, shouldn't they be 0 to 1 as those are the values x is ranging between in the purple triangle?
I think you're right. The bounds should be 0 to 1. On the next board Michael corrected the bounds on the second integral and said the editor will fix it in post for the previous board. I think maybe the editor thought both needed correction.
The integral is actually
Int_{-1}^0 dx Int_{-x}^1 dz + Int_0^1 dx Int_0^{1-x} dz = Int_{-1}^0 dx (1+x) + Int_0^1 dx (1-x) = 2*Int_0^1 dx (1-x)
upon change of variables x->-x in the first integral.
jesus this is a hell of a headache of a problem.
i worked it out in my own way using probability density functions and got a somewhat similar answer with nested radicals that i *think* is equivalent to your answer. but it's late and i'm too tired to prove they are the same. but it seems likely.
incidentally, my method really easily extends itself to n-dimensions
> but it's late and i'm too tired to prove they are the same.
I'd plug both of them in a calculator and see whether they agree numerically, and only if they do, start proving.
I think this example shows the difference between simplicity for human understanding and the world of math. Math is made in order to solve many human-like problems more easily, but as always, we can't produce a "perfect" tool. Maybe we should be super surprised by the fact that this easy method is so versatile.
Would there be a problem with finding the average of the square of the distance to simplify the integration and then come back and get the actual average distance?
Hello Mr. Penn. I have a question that has been keeping me up all night these past few days. Hope you can help.
Question: Take any area on earth (a city or a region) whose area we can calculate or know. We have n (a natural number) fixed spots we can put in the area. How can we calculate the minimal distance at which we can place the n fixed spots such that we have the same distance from any point on the edge of the boundary to the nearest fixed spot, or from fixed spot to fixed spot?
For example: if n = 1, the only logical place to put the fixed spot is at the center. From the center (fixed spot) to any point on the boundary it is the same distance.
What is the reasoning in the case n = 2, or n = 3, ... n = inf?
some simpler problems:
= average distance between a point and all points on a line segment
= average distance between points on two line segments
In the easy to ask, but surprisingly tricky category: If rolling a die 100 times, what is the probability of getting five in a row of the same number at least once? Was helping a student with that, and there are some subtle traps to fall into.
Instead solve: what is the probability of never getting five in a row of the same number in 100 rolls?
@@lucasm4299 Yes, the student got that far. The tricky part comes with managing the dependencies between rolls and the windows of five rolls.
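One way to organize that complement calculation is a small dynamic program over the length of the current run (a sketch, not taken from anywhere in the thread; the printed value is approximate):
----------
# Probability of at least one run of 5 equal faces in 100 rolls of a fair die.
# State k = length of the current run of equal faces (1..4), restricted to
# sequences that have not yet produced a run of 5.
ROLLS = 100
alive = {1: 1.0, 2: 0.0, 3: 0.0, 4: 0.0}   # after the first roll the run length is 1
for _ in range(ROLLS - 1):
    nxt = {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}
    for k, p in alive.items():
        nxt[1] += p * 5 / 6            # a different face resets the run
        if k < 4:
            nxt[k + 1] += p * 1 / 6    # the same face extends the run
        # k == 4 and the same face => a run of 5; that mass leaves the "alive" states
    alive = nxt
print(1 - sum(alive.values()))   # roughly 0.06
----------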
It's nice to be able to find a closed form solution for this, but monte carlo sims got me 3 sig fig accuracy very fast
Approx. 0.52 ?? For the 1D problem (a unit line segment) the answer is apparently 1/3 . I like this stuff.
Can confirm it is 1/3. Was fun calculating that!
i wonde what happens if you let n tends to infinity for a "n-integral", the value converge?
@IPlanetarizado I’m sure it converges to 0
@@teeweezeven Huh, I wonder what happens with those values if we generalize to n-dimensional hypercubes. But not enough to calculate it formally, maybe I'll run a little sim.
@@ulrichs.3228 I mean, going from 1 to 2 dimensions already complicates things a whole lot.
All I can confidently say is that as n gets bigger, the distance also gets bigger, probably tending towards infinity.
It's been a while since I did an integral this involved, but when you did the change of variables y = xtanθ, why isn't the differential that results xsec²θdθ+tanθdx? Don't the differentials follow a Leibniz rule?
Seeing the thumbnail I interpreted the question different, but it seems like it might still be interesting... for two specified points what is the average path length between the two...
which I feel like will just be infinite, so maybe add the restrictions that the path must not cross itself and has a certain width.
I think a hyperbolic substitution might be cleaner than the trigonometric substitution
I don't see why anyone would think of x tan theta. Is that even allowed? You don't know that y will ever equal x tan theta, since x is a point in the square. So yeah, thinking about it more now, I realize that is not a valid substitution.
I'm overwhelmed
It could go much more easily using a polar coordinate system instead of a rectangular one, couldn't it?
On one level yes, but your boundaries become really awkward and difficult to work with
Why is the average distance the integral of the distance formula over all possible P and Q, divided by the 4D volume of the unit cube?
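Aside (not from the video): in general the average of f over a region R is (1/vol(R)) ∫_R f. Here the pair (P, Q) corresponds to (x1, y1, x2, y2) ranging over R = [0,1]^4, whose 4D volume is 1, which is why no division shows up explicitly.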
~0.521
Layman computer programmer here. Given two tuples (Cartesian points x1, y1 and x2, y2), what's the algorithm I can implement in code? I didn't follow the simplification and all of the substitutions. The final values do not seem to link back to the original inputs at all.
In code you want to compute the average value, over many random experiments (where x_1, x_2, y_1, y_2 are random floats between 0 and 1), of the value sqrt((x_1 - x_2)^2 + (y_1 - y_2)^2). So it's a simple for loop. What he computes is the expectation, that is, the idealized average where the number of trials becomes infinite.
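A minimal sketch of that loop in Python (mine, not from the video):

import math, random

def average_distance(samples=1_000_000):
    # Average of sqrt((x1 - x2)^2 + (y1 - y2)^2) over random points in the unit square.
    total = 0.0
    for _ in range(samples):
        x1, y1 = random.random(), random.random()
        x2, y2 = random.random(), random.random()
        total += math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)
    return total / samples

print(average_distance())   # should land near 0.5214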
By geometric estimation, I would have guessed the average distance of any two points is somewhere between √2/4 ≈ 0.35355 and 1/2 of 1/2 = 0.25, with less weight on the 0.25 due to geometry. As the answer is 1/3, the weighting of the lesser number 0.2022... 0.00833 (differential), the weighting is about 0.412, or 1 in 2.426.
Does Michael have a calc 3 course playlist?
I'm sorry but I haven't done calc 3 (just 1 and 2), so I don't know where he's getting these conclusions from.
At one point you pose y=x tan(theta), like... you can really do that??? Use another variable as substitute for a variable? 🤯
So the average distance is ~0.5214...
Does that mean the chance of a random distance being > 0.5 is more than 50%?
Based only on the average value (Expected value) of a random variable, it is impossible to draw a conclusion about the distribution of probabilities of its values.
Nevertheless, in this case P[sqrt(x^2+y^2)>1/2] =125/96 - π/4 =0.517...
the distance of 2 points on a diagonal when x,y exceed 0.5
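For reference (my own aside, assuming uniform points): for 0 ≤ d ≤ 1 the CDF of the distance D works out to P[D ≤ d] = π d^2 - (8/3) d^3 + d^4/2, so P[D > 1/2] = 1 - (π/4 - 1/3 + 1/32) = 125/96 - π/4 ≈ 0.5167, matching the value above.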
Something about the universal parabolic constant...good
Thank you! I was trying to remember where I saw ln(1+sqrt(2)) before xD
Should we be integrating over all pairs of points? In which case, the quad integral double counts pairs, and there should be a 1/2 factor out front?
Double counting is something you have to watch out for when calculating a sum; but when calculating an *average,* it doesn't really matter.
@@alexpotts6520 In discrete probability you can get a slightly different average when you remove double counting and when you don't, because the diagonal cases (where the two points are the same) are never doubly counted. But here, in continuous probability, these cases do not contribute to the integral, because the probability that the two points are the same is zero.
Did you multiply by 4? This result looks a bit small...
Can we generalize it for an arbitrary dimension n?
Awesome sir!!!
Tried doing it with the expectation of uniform random variables; got that the distance squared is expected to be 1, which I suspect is also wrong 😅
May I point out that 'randomly chosen' is not a proper mathematical definition ? A uniform distribution along the axes is quite logical and gives a uniform density of points inside the square, but it is not the only way to randomly choose two points inside a square (you could use polar coordinates, just for starters, with a uniform distribution of angles). And different methods will give different answers. Really, the exact random method used should be stated as part of the problem, not left to the solver to pick.
Indeed, I picked the distribution of x = 0, y = 0 with p = 1, which made the solution slightly simpler.
Although he tries to pump some intuition by talking about random points and an expected value, the actual question is "What is the average distance between points in a unit square?" meaning ALL points within the unit square. No distribution needed since you simply take each point (an infinite number of them) and compare it to every other point.
@@APaleDot And yet if you took a function such as f(t) = 1.5 for t < 1/2 and f(t) = 0.5 for t > 1/2, it is a probability distribution that integrates to 1 over [0,1]. Apply it to x and y, and you still get all points within the unit square. But somehow the points near 0 will appear more frequently, and that's okay since there's an infinity of them, you won't run out. I'm not even going to try to work out what it does to the average distance (I'm definitely not that good at quadruple integrals), but I feel confident the result would be different.
@@emmanuellaurens2132
Sure, the result would be different if you used a probability distribution to sample points in the square. But why are you sampling points at all? To calculate the average distance between points in the square, you just look at every point in the square exactly once. There's no probability there.
@@APaleDot There is probability there, even though it is hidden: When you "look at every point inside the square once", each point inside the square will enter your focus exactly as often as any other point inside the square. In other words, to your focus the points inside the square appear to be uniformly distributed!
@Emmanuel Laurens I completely agree - the reader should _never_ have to assume a probability distribution function, even though it seems to be common that one should assume a uniform distribution if no pdf is specified...
What about if the sides of the unit square are connected in a Pac-Man-like manner?
Is it any simpler if the surface were an equilateral triangle? (It is on a unit sphere)
But why 4 dimensions?
I'm quite sure this problem is not well defined. I think this is way more difficult to calculate (I didn't do it), but you could imagine randomly generated segments given by one point (the middle of the segment), one angle, and one length (the upper bound of that length is the difficult part to define). There are 4 parameters, so everything is fine. Except I'm quite sure the answer is not the same. The problem is: what is random?
We're taking random points to mean points whose x and y coordinates are uniformly distributed.
@@bscutajar that's the point. Random is not a well defined concept in this case.
Now the variance please
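Aside (mine): combining the result with E[D^2] = 1/3 noted earlier, the variance is Var(D) = E[D^2] - E[D]^2 = 1/3 - (0.5214...)^2 ≈ 0.0615.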
Using conditional probability makes the problem easy
This is escalating quickly. Nice problem! Predicting sqrt(2), but I'll have to see how it all plays out.
It's not sqrt(2).
0.57 ish? The farthest apart they could be is about 1.41... seems reasonable.
In a cube? In an N-cube?
Gotta love quadruple-nested integrals
Can someone please explain why the expected value has that 4-dimensional volume of the cube in it?
Michael explains it in the video?
Now do this with the Manhattan distance :)
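Aside (mine, assuming uniform points): with the Manhattan metric the coordinates separate, so the answer is just twice the 1D result: E[|x1 - x2| + |y1 - y2|] = 1/3 + 1/3 = 2/3.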
The number at the end is about 0.521405433165
#ruby 2.5.3
# Monte Carlo estimate of the average distance between two random points
# in the unit square (should land near 0.5214)
def dist
  ((rand() - rand())**2.0 + (rand() - rand())**2.0)**0.5
end

size = 1000
v = []
size.times do
  v << dist
end
puts v.sum / size
I'm sorry, but is y = x tan θ even a valid substitution? You don't know if y will ever equal x times tan θ, because x is a point in the square, not some dummy variable.
*the editor corrected the wrong thing*
I'm going to do the simulation for this, just to be sure 😁
Isn't it somehow related to the Bertrand paradox? We can't say that every point has the same probability of appearing if we are talking about infinite sets; we need some probability distribution.
Whenever I see a problem that starts with "choose two points at random in this 2D region" I hear bells and whistles going on somewhere in my subconscious.
Assume uniform distribution
No, not really; a uniform distribution over the square is really the only natural "random" distribution for a point. Bertrand's paradox is more about situations where the probability distribution is ambiguous. Multivariate continuous distributions exist and there are no issues with doing statistics with them.
The problem probably should state the distribution, but if it doesn't, then I think it's safe to assume it is uniform.
HW: solve for all cases of the beta distribution in terms of parameters a and b.
You'd think if he had a non-uniform distribution he'd have said it. The most "natural" thing to do is assume uniformity, since he didn't say (for example) "this corner of the square has more points than the others".
Now what happens when we use different metrics? :D
Do you get the same result in a polar coordinate system? (Random variable with phi,r)
The answer almost looks like the golden ratio
Approximated the answer numerically with about 30 seconds of writing a python script
Really interesting problem, tho it's quite the L that the result is not a cute number like 1/4 or something like that.
Could you please write down or point me to any basic material or video for learning about this subject?
I need to grow a bit more with the integrals that you are teaching now.
I appreciate it.
Before that, I learned many things from what you taught. Thanks a lot.
Look at the other videos on the channel.
Now show how that varies with a p-adic metric