Hello,
I am wondering why the Mann-Whitney U Test always returns the smallu value in the source code; it seems like it should return the U statistic for x or y consistently.
https://github.com/scipy/scipy/blob/v0.14.0/scipy/stats/stats.py#L3943

I might be missing something, but it seems impossible to determine which distribution is the lower one. Wikipedia gives an example where the lower median is not correct, reproduced below:

    #!/usr/bin/env python
    import scipy.stats as stats
    import scipy.stats.mstats as mstats

    Hare = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
    Tortoise = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 30, 31, 32, 33, 34, 35, 36, 37, 38]

    # Print both medians and the statistic returned for each argument order.
    print("Hare Median: %d, Tortoise Median: %d, U1: %d, U2: %d" % (
        mstats.mquantiles(Hare, [0.5])[0],
        mstats.mquantiles(Tortoise, [0.5])[0],
        stats.mannwhitneyu(Hare, Tortoise)[0],
        stats.mannwhitneyu(Tortoise, Hare)[0]))

    Hare Median: 20, Tortoise Median: 19, U1: 100, U2: 100

_______________________________________________
SciPy-User mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/scipy-user
On Tue, Dec 16, 2014 at 8:32 AM, Steve Ramage <[hidden email]> wrote:

> Hello,

smallu is returned because it is the actual test statistic for comparison with tables (or the normal approximation) (see Wikipedia).

However, except for inertia and backwards compatibility, it would have been better to change this. Also, the function still returns the one-sided p-value, which is inconsistent with returning a two-sided test statistic. In comparison, ttests return one-sided test statistic and two-sided p-values.

> Wikipedia gives an example of where the lower median is not correct, reproduced below:

What's your point?

smallu is not directly related to the medians.

We replicate the formulas and results of Wikipedia, so everything looks right to me. I don't really understand the example, since I don't remember the details of Mann-Whitney-U or cannot figure them out right now.

Josef
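For readers following the thread, here is a minimal sketch of why the smaller of the two U values carries no direction and why it is not tied to the medians. It is not SciPy's implementation: the helper `u_statistic` and the variable `small_u` are hypothetical names used only for illustration, and the pair-counting convention (U for a sample is the number of pairs in which that sample's value is the larger one, ties counting one half) is an assumption that may be the mirror image of what a given library reports.

```python
from statistics import median


def u_statistic(x, y):
    """Hypothetical helper: U for sample x, counted pair by pair.

    Counts the (xi, yj) pairs with xi > yj; a tie contributes 0.5.
    This is an illustrative convention, not a library API.
    """
    total = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                total += 1.0
            elif xi == yj:
                total += 0.5
    return total


hare = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
tortoise = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 30, 31, 32, 33, 34, 35, 36, 37, 38]

u_hare = u_statistic(hare, tortoise)
u_tortoise = u_statistic(tortoise, hare)

# The two statistics always sum to n1 * n2, so either one determines the other.
assert u_hare + u_tortoise == len(hare) * len(tortoise)

# Taking the minimum is symmetric in the arguments, so the direction is lost.
small_u = min(u_hare, u_tortoise)

print("medians:", median(hare), median(tortoise))
print("U (hare), U (tortoise), smaller U:", u_hare, u_tortoise, small_u)
```

With this data the hare's U is 100 and the tortoise's is 261, so the hare values are the larger of the pair in only 100 of the 361 comparisons, even though the hare median (20) is above the tortoise median (19). Reporting only the minimum gives 100 for either argument order, which matches the output in the original post.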
Thanks,
That makes sense; my point with the example was only to demonstrate that the lower median is not necessarily the 'lower distribution'.

Cheers,
Steve Ramage

-----Original message-----
From: [hidden email]
Sent: Tue 16-12-2014 16:03
Subject: Re: [SciPy-User] Mann-Whitney U Test
To: SciPy Users List <[hidden email]>