## MMORPG Gamer Community

 ggFTW Forum [Statistics] Why can't |correlation| be above one?


10-30-2010 #1 | is cute + gore | Join Date: Oct 2008 | Location: Canada | Posts: 674

[Statistics] Why can't |correlation| be above one?

We're not exactly expected to know this for our course, and it's not explained in the textbook, but I feel like I can't grasp the true meaning of correlation until I find an answer for this. So... correlation is

$$r = \frac{1}{n-1} \sum \frac{x - \bar{x}}{s_x} \cdot \frac{y - \bar{y}}{s_y}$$

I get how $\sum (x - \bar{x})/s_x$ would just be 0, because it adds up all the z-scores; how multiplying $(x - \bar{x})/s_x$ by $(y - \bar{y})/s_y$ ties in with the least-squares regression idea; and how dividing by $n - 1$ just has to do with this being a sample. However, why $|r| \le 1$, I have no idea. The formula is also really close to $s^2 = \frac{1}{n-1} \sum (x - \bar{x})^2$, but I'm not sure if that's just by chance.

If anyone can prove this algebraically or with logical deduction, I'd really appreciate it.
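For readers who want to experiment with the formula in the question, here is a minimal Python sketch that computes r directly from z-scores. The function names and the data are made up for illustration. It also bears on the resemblance to $s^2$ noted in the question: correlating x with itself gives $\frac{1}{n-1}\sum z_x^2 = s_x^2 / s_x^2$, which is exactly 1.

```python
import statistics

def z_scores(values):
    """Standardize values using the sample standard deviation (divides by n - 1)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def sample_correlation(xs, ys):
    """r = (n - 1)^(-1) * sum of products of paired z-scores."""
    n = len(xs)
    products = (zx * zy for zx, zy in zip(z_scores(xs), z_scores(ys)))
    return sum(products) / (n - 1)

# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 4.0, 7.0, 11.0]
ys = [2.0, 1.0, 5.0, 6.0, 12.0]

print(sum(z_scores(xs)))            # close to 0: z-scores sum to zero
print(sample_correlation(xs, ys))   # somewhere in [-1, 1]
print(sample_correlation(xs, xs))   # close to 1: x correlated with itself
```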
11-04-2010 #2 | ggFTW Lurker | Join Date: Jul 2009 | Posts: 18

The sample correlation coefficient is defined as

$$r = \frac{1}{n-1} \sum_i \frac{x_i - \bar{x}}{s_x} \cdot \frac{y_i - \bar{y}}{s_y}.$$

Using $s_x = \sqrt{\frac{1}{n-1} \sum_j (x_j - \bar{x})^2}$ and $s_y = \sqrt{\frac{1}{n-1} \sum_j (y_j - \bar{y})^2}$ (both nonzero as long as neither variable is constant), the factors of $n - 1$ cancel and we can rewrite r as

$$r = \sum_i \frac{x_i - \bar{x}}{\sqrt{\sum_j (x_j - \bar{x})^2}} \cdot \frac{y_i - \bar{y}}{\sqrt{\sum_j (y_j - \bar{y})^2}} \tag{1}$$

The triangle inequality states that the absolute value of a sum is never greater than the sum of the absolute values; in its simplest form, if a and b are real numbers, then $|a + b| \le |a| + |b|$. In this case, we get

$$|r| = \left| \sum_i \frac{x_i - \bar{x}}{\sqrt{\sum_j (x_j - \bar{x})^2}} \cdot \frac{y_i - \bar{y}}{\sqrt{\sum_j (y_j - \bar{y})^2}} \right| \le \sum_i \left| \frac{x_i - \bar{x}}{\sqrt{\sum_j (x_j - \bar{x})^2}} \right| \left| \frac{y_i - \bar{y}}{\sqrt{\sum_j (y_j - \bar{y})^2}} \right| \tag{2}$$

Note that the terms $\left| \frac{x_i - \bar{x}}{\sqrt{\sum_j (x_j - \bar{x})^2}} \right|$ and $\left| \frac{y_i - \bar{y}}{\sqrt{\sum_j (y_j - \bar{y})^2}} \right|$ are normalized distances; that is,

$$\sum_i \left| \frac{x_i - \bar{x}}{\sqrt{\sum_j (x_j - \bar{x})^2}} \right|^2 = \sum_i \left| \frac{y_i - \bar{y}}{\sqrt{\sum_j (y_j - \bar{y})^2}} \right|^2 = 1 \tag{3}$$

Now, for any real numbers a and b, $(a - b)^2 = a^2 - 2ab + b^2 \ge 0$. Shifting terms around, this tells us that $ab \le (a^2 + b^2)/2$. Hence, for all i, we have the following inequality:

$$\left| \frac{x_i - \bar{x}}{\sqrt{\sum_j (x_j - \bar{x})^2}} \right| \left| \frac{y_i - \bar{y}}{\sqrt{\sum_j (y_j - \bar{y})^2}} \right| \le \frac{1}{2} \left| \frac{x_i - \bar{x}}{\sqrt{\sum_j (x_j - \bar{x})^2}} \right|^2 + \frac{1}{2} \left| \frac{y_i - \bar{y}}{\sqrt{\sum_j (y_j - \bar{y})^2}} \right|^2 \tag{4}$$

And because this is true for each term, it is also true for the sum:

$$\sum_i \left| \frac{x_i - \bar{x}}{\sqrt{\sum_j (x_j - \bar{x})^2}} \right| \left| \frac{y_i - \bar{y}}{\sqrt{\sum_j (y_j - \bar{y})^2}} \right| \le \frac{1}{2} \sum_i \left| \frac{x_i - \bar{x}}{\sqrt{\sum_j (x_j - \bar{x})^2}} \right|^2 + \frac{1}{2} \sum_i \left| \frac{y_i - \bar{y}}{\sqrt{\sum_j (y_j - \bar{y})^2}} \right|^2 \tag{5}$$

But, from (3), each of the sums on the right-hand side of this last inequality is 1, making the right-hand side $\frac{1}{2} \cdot 1 + \frac{1}{2} \cdot 1 = 1$. So we have

$$\sum_i \left| \frac{x_i - \bar{x}}{\sqrt{\sum_j (x_j - \bar{x})^2}} \right| \left| \frac{y_i - \bar{y}}{\sqrt{\sum_j (y_j - \bar{y})^2}} \right| \le 1 \tag{6}$$

Combining (2) and (6), we get

$$|r| = \left| \sum_i \frac{x_i - \bar{x}}{\sqrt{\sum_j (x_j - \bar{x})^2}} \cdot \frac{y_i - \bar{y}}{\sqrt{\sum_j (y_j - \bar{y})^2}} \right| \le \sum_i \left| \frac{x_i - \bar{x}}{\sqrt{\sum_j (x_j - \bar{x})^2}} \right| \left| \frac{y_i - \bar{y}}{\sqrt{\sum_j (y_j - \bar{y})^2}} \right| \le 1 \tag{7}$$
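The chain of inequalities in (2), (5), and (6) can be checked numerically. The Python sketch below is only an illustration, not part of the proof: the helper names and the random data are assumptions, and `u` and `v` stand for the normalized deviations of x and y that appear in every step.

```python
import math
import random

def unit_deviations(values):
    """Deviations from the mean, divided by their Euclidean length."""
    mean = sum(values) / len(values)
    deviations = [v - mean for v in values]
    length = math.sqrt(sum(d * d for d in deviations))
    return [d / length for d in deviations]

def inequality_chain(xs, ys):
    u = unit_deviations(xs)
    v = unit_deviations(ys)
    r = sum(a * b for a, b in zip(u, v))                # r as written in (1)
    abs_sum = sum(abs(a) * abs(b) for a, b in zip(u, v))  # right side of (2)
    half_squares = (0.5 * sum(a * a for a in u)
                    + 0.5 * sum(b * b for b in v))      # right side of (5)
    return r, abs_sum, half_squares

# Hypothetical data: y is a noisy linear function of x.
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(50)]
ys = [0.6 * x + random.gauss(0.0, 1.0) for x in xs]

r, abs_sum, half_squares = inequality_chain(xs, ys)
eps = 1e-12  # tolerance for floating-point rounding
print(abs(r) <= abs_sum + eps)          # step (2)
print(abs_sum <= half_squares + eps)    # step (5)
print(abs(half_squares - 1.0) < 1e-9)   # step (3): right side equals 1
```

Trying other data sets (including perfectly linear ones, where $|r|$ reaches 1) shows the same ordering each time.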
11-04-2010 #3 | is cute + gore | Join Date: Oct 2008 | Location: Canada | Posts: 674

Wow, math genius here. It took me a while to finish reading it. Thanks for explaining it ;3 I didn't think it would get answered at all.

I'm pretty sure I understand everything except this part:

"Note that the terms $\left| \frac{x_i - \bar{x}}{\sqrt{\sum_j (x_j - \bar{x})^2}} \right|$ and $\left| \frac{y_i - \bar{y}}{\sqrt{\sum_j (y_j - \bar{y})^2}} \right|$ are normalized distances; that is, (3) $\sum_i \left| \frac{x_i - \bar{x}}{\sqrt{\sum_j (x_j - \bar{x})^2}} \right|^2 = \sum_i \left| \frac{y_i - \bar{y}}{\sqrt{\sum_j (y_j - \bar{y})^2}} \right|^2 = 1$."

Normalized distances and getting 1 when you square the inside... how does that work? O.o
11-04-2010 #4 | ggFTW Lurker | Join Date: Jul 2009 | Posts: 18

Let's start with a simpler example. Take the standard xy-coordinate system and draw a line segment from the origin to an arbitrary point $(x, y)$ other than the origin. The length of this segment will be $\sqrt{x^2 + y^2}$. Now, suppose that we want the coordinates of the point on this segment that is a distance of 1 from the origin. The coordinates of this second point would have to be

$$\left( \frac{x}{\sqrt{x^2 + y^2}}, \ \frac{y}{\sqrt{x^2 + y^2}} \right).$$

One can verify that this second point does have a distance of 1 from the origin: squaring both coordinates and adding them gives $\frac{x^2 + y^2}{x^2 + y^2} = 1$.

The same principle applies for (3), the difference being that we are operating in n-dimensional space instead of 2-dimensional space. In this case, view $\sqrt{\sum_i (x_i - \bar{x})^2}$ as the distance of the point $(x_1, x_2, \ldots, x_n)$ from the point $(\bar{x}, \bar{x}, \ldots, \bar{x})$, or, if you prefer, the distance of the point $(x_1 - \bar{x}, x_2 - \bar{x}, \ldots, x_n - \bar{x})$ from the origin.
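The normalization idea can be tried out in a few lines of Python. This is a small sketch under assumed names (`normalize`, `squared_length`) and made-up numbers: it scales a 2-dimensional point to distance 1 from the origin, then does the same to the n-dimensional point of deviations from the mean, which is the situation in (3).

```python
import math

def normalize(point):
    """Scale a point (a list of coordinates) so its distance from the origin is 1."""
    length = math.sqrt(sum(c * c for c in point))
    return [c / length for c in point]

def squared_length(point):
    """Sum of squared coordinates, i.e. squared distance from the origin."""
    return sum(c * c for c in point)

# 2-dimensional case: (3, 4) has length 5, so it scales to roughly (0.6, 0.8).
unit_2d = normalize([3.0, 4.0])
print(unit_2d, squared_length(unit_2d))

# n-dimensional case: the point (x_1 - xbar, ..., x_n - xbar) from (3).
xs = [2.0, 5.0, 6.0, 9.0, 13.0]   # hypothetical sample
xbar = sum(xs) / len(xs)
unit_nd = normalize([x - xbar for x in xs])
print(squared_length(unit_nd))    # close to 1, as (3) states
```

The second print is the left-hand sum in (3) evaluated for this sample; any non-constant sample gives a value of 1 up to rounding.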
