[SciPy-User] Mann-Whitney U Test

Steve Ramage
Hello,

I am wondering why the Mann-Whitney U test always returns the smallu value (see the source code: https://github.com/scipy/scipy/blob/v0.14.0/scipy/stats/stats.py#L3943 ). It seems like it should consistently return the U statistic for x or for y.

I might be missing something, but it seems impossible to determine which distribution is the lower. Wikipedia gives an example where going by the lower median is not correct, reproduced below:

#!/usr/bin/env python
import scipy.stats as stats
import scipy.stats.mstats as mstats

Hare = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
Tortoise = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 30, 31, 32, 33, 34, 35, 36, 37, 38]

# Print both medians, then the statistic returned for each argument order.
print("Hare Median: %d, Tortoise Median: %d, U1: %d, U2: %d" % (
    mstats.mquantiles(Hare, [0.5])[0],
    mstats.mquantiles(Tortoise, [0.5])[0],
    stats.mannwhitneyu(Hare, Tortoise)[0],
    stats.mannwhitneyu(Tortoise, Hare)[0]))

Hare Median: 20, Tortoise Median: 19, U1: 100, U2: 100
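
For reference, both U statistics can be recovered from the pairwise definition, which makes the direction visible even when only the smaller value is reported. This is a minimal sketch in plain Python 3; the helper name u_statistic is hypothetical and is not part of SciPy.

```python
def u_statistic(x, y):
    """Count pairs (xi, yj) with xi > yj; a tie counts as one half."""
    wins = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                wins += 1.0
            elif xi == yj:
                wins += 0.5
    return wins


hare = list(range(1, 10)) + list(range(20, 30))       # 1..9 and 20..29
tortoise = list(range(10, 20)) + list(range(30, 39))  # 10..19 and 30..38

u_hare = u_statistic(hare, tortoise)
u_tortoise = u_statistic(tortoise, hare)

# The two statistics always sum to len(x) * len(y), here 19 * 19 = 361.
print(u_hare, u_tortoise, min(u_hare, u_tortoise))  # 100.0 261.0 100.0
```

The smaller of the two values, 100, is the number reported for both argument orders in the output above.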
_______________________________________________
SciPy-User mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/scipy-user

Re: Mann-Whitney U Test

josef.pktd


On Tue, Dec 16, 2014 at 8:32 AM, Steve Ramage <[hidden email]> wrote:
> Hello,
>
> I am wondering why the Mann-Whitney U test always returns the smallu value in the source code, it seems like it should return the U statistic for x or y consistently. https://github.com/scipy/scipy/blob/v0.14.0/scipy/stats/stats.py#L3943 .
>
> I might be missing something, but it seems impossible to determine which distribution is the lower,


smallu is returned because it is the actual test statistic for comparison with tables (or the normal approximation); see Wikipedia.

However, if not for inertia and backwards compatibility, it would have been better to change this.
Also, the function still returns the one-sided p-value, which is inconsistent with returning a two-sided test statistic.
In comparison, t-tests return a one-sided (signed) test statistic and two-sided p-values.
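
The comparison against the normal approximation can be sketched as follows. This is an illustration in plain Python 3 under stated assumptions, not SciPy's implementation: the helper names rank_sum_u and normal_approx_p are hypothetical, and the sketch applies neither a tie correction nor a continuity correction. It shows that the two-sided p-value for the smaller U is simply twice the one-sided one.

```python
import math


def rank_sum_u(x, y):
    """Return (U_x, U_y) from the rank-sum formula, using midranks for ties."""
    combined = sorted(list(x) + list(y))
    midrank = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        # Sorted positions i..j-1 hold ranks i+1..j; tied values share the mean.
        midrank[combined[i]] = (i + 1 + j) / 2.0
        i = j
    n1, n2 = len(x), len(y)
    r1 = sum(midrank[v] for v in x)       # rank sum of the first sample
    u1 = r1 - n1 * (n1 + 1) / 2.0
    return u1, n1 * n2 - u1


def normal_approx_p(u, n1, n2):
    """Return (z, one-sided p, two-sided p) for U under the normal approximation."""
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    one_sided = 0.5 * math.erfc(abs(z) / math.sqrt(2.0))  # P(Z > |z|)
    return z, one_sided, min(1.0, 2.0 * one_sided)


hare = list(range(1, 10)) + list(range(20, 30))
tortoise = list(range(10, 20)) + list(range(30, 39))

u_hare, u_tortoise = rank_sum_u(hare, tortoise)
small_u = min(u_hare, u_tortoise)
z, p_one, p_two = normal_approx_p(small_u, len(hare), len(tortoise))
print(u_hare, u_tortoise, small_u)
print(z, p_one, p_two)
```

Because the normal distribution is symmetric, using the larger U instead gives the same |z| and therefore the same p-values; only the sign of z carries the direction.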


 
> Wikipedia gives an example of where the lower median is not correct, reproduced below:
>
> #!/usr/bin/env python
> import scipy.stats as stats
> import scipy.stats.mstats as mstats
>
> Hare = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
> Tortoise = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 30, 31, 32, 33, 34, 35, 36, 37, 38]
>
> print("Hare Median: %d, Tortoise Median: %d, U1: %d, U2: %d" % (
>     mstats.mquantiles(Hare, [0.5])[0],
>     mstats.mquantiles(Tortoise, [0.5])[0],
>     stats.mannwhitneyu(Hare, Tortoise)[0],
>     stats.mannwhitneyu(Tortoise, Hare)[0]))
>
> Hare Median: 20, Tortoise Median: 19, U1: 100, U2: 100

What's your point?

smallu is not directly related to the medians.

We replicate the formulas and results of Wikipedia, so everything looks right to me.
I don't really understand the example, since I don't remember the details of the Mann-Whitney U test and cannot work them out right now.

Josef
 

Re: Mann-Whitney U Test

Steve Ramage
Thanks,

That makes sense. My point with the example was only to demonstrate that the sample with the lower median is not necessarily the 'lower distribution'.

Cheers,

Steve Ramage
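
The point about medians can be checked numerically. A small sketch in plain Python 3 (the variable names are illustrative): Hare has the higher median, yet Tortoise holds the larger value in most of the cross-sample pairs.

```python
from statistics import median

hare = list(range(1, 10)) + list(range(20, 30))
tortoise = list(range(10, 20)) + list(range(30, 39))

# Compare every Hare value against every Tortoise value.
pairs = [(h, t) for h in hare for t in tortoise]
tortoise_higher = sum(1 for h, t in pairs if t > h)
share = tortoise_higher / len(pairs)

print(median(hare), median(tortoise))   # 20 19
print(tortoise_higher, len(pairs))      # 261 361
print(round(share, 3))                  # 0.723
```

So the sample with the lower median (Tortoise) is the one that tends to produce the larger values, which is why the medians alone do not identify the 'lower distribution'.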




-----Original message-----
From: [hidden email]
Sent: Tue 16-12-2014 16:03
Subject: Re: [SciPy-User] Mann-Whitney U Test
To: SciPy Users List <[hidden email]>;
