# How to find the p-value using mannwhitneyu

I am trying to run a hypothesis test on census data. I want to test whether years of education has an effect on salary. Salary is defined as one of two categories, <=50k or >50k. Can I calculate the p-value like this?

    import numpy as np
    import pandas as pd
    from sklearn.preprocessing import LabelEncoder
    from scipy.stats import mannwhitneyu

    print('Reading datasets...')
    df_trn = pd.read_csv('adult.trn', index_col=False, skipinitialspace=True)

    # Keep only the two columns needed for the test.
    all_cols = set(df_trn.columns)
    wanted_cols = {'years-of-edu', 'salary'}
    del_cols = all_cols - wanted_cols

    # Label-encode the remaining columns; salary becomes 0 or 1.
    new_trn1 = {}
    for column in df_trn.drop(columns=list(del_cols)).columns:
        le = LabelEncoder()
        new_trn1[column] = le.fit_transform(df_trn[column])

    print("length of new_trn1")
    print(len(new_trn1['salary']))

    list1 = np.array(new_trn1['years-of-edu'])
    list2 = np.array(new_trn1['salary'])

    # Split the (encoded) education values by salary class.
    list3 = list1[list2 == 0]
    list4 = list1[list2 == 1]
    print('list3:', np.median(list3))
    print('list4:', np.median(list4))

    # State the alternative explicitly so the result does not depend on the SciPy default.
    pp = mannwhitneyu(list3, list4, alternative='two-sided')
    print(pp)

It returns a p-value of zero.

My data set looks like
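
One possible reading of the zero p-value reported above: with tens of thousands of rows and two clearly different groups, the p-value can be too small for a float to represent, so it is displayed as 0.0. The sketch below illustrates this on synthetic data only. The group sizes, the value ranges, and the names `low`, `high` and `effect` are made up for the example and are not taken from the census file.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)

# Synthetic stand-ins for years of education in the two salary groups
# (hypothetical sizes and ranges, NOT the real census data).
low = rng.integers(1, 10, size=20000)   # values 1..9, plays the role of <=50k
high = rng.integers(8, 17, size=8000)   # values 8..16, plays the role of >50k

# Two-sided Mann-Whitney U test on the two large, well-separated groups.
u_stat, p_value = mannwhitneyu(low, high, alternative='two-sided')
print(f"large groups: U = {u_stat:.0f}, p = {p_value:.3e}")

# U divided by the number of (low, high) pairs always lies in [0, 1].
# In recent SciPy versions U refers to the first sample, so this is roughly
# the share of pairs in which the `low` value exceeds the `high` value.
effect = u_stat / (len(low) * len(high))
print(f"U / (n1 * n2) = {effect:.3f}")

# For contrast: two small samples drawn from the same distribution.
same_a = rng.integers(1, 17, size=30)
same_b = rng.integers(1, 17, size=30)
_, p_same = mannwhitneyu(same_a, same_b, alternative='two-sided')
print(f"small, similar groups: p = {p_same:.3f}")
```

If the real data behaves like the large synthetic groups, a printed zero would mean "smaller than the smallest representable float", not that the test failed. Reporting an effect size such as `U / (n1 * n2)` next to the p-value is one way to show how large the difference is, which the p-value alone does not say.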