[SciPy-User] fromstring and int64

Carlos Medrano
Hello,

  I am having an issue with the fromstring function and long integers
(int64). It seems it cannot read large numbers properly. I use scipy
version 0.14.0 and numpy 1.8.2. I also have the problem if I use numpy
directly.

import scipy
s='0 1445367600061 -35960 39671 79230'
scipy.fromstring(s, sep=' ', dtype=scipy.int64)

I get this

array([ 0, 2147483647, -35960, 39671, 79230], dtype=int64)

which is wrong in the second element.

However, if I write the string s to a file and use loadtxt, I get the
right values:

f=open('dumf.txt','w')
f.write(s)
f.close()

scipy.loadtxt('dumf.txt', dtype=scipy.int64)

I get:
array([0, 1445367600061,  -35960, 39671, 79230], dtype=int64)

I think this may be a bug, but I would like to check on the mailing list
first and see if other people see the same behaviour.

Regards

Carlos Medrano


_______________________________________________
SciPy-User mailing list
[hidden email]
https://mail.scipy.org/mailman/listinfo/scipy-user
Re: fromstring and int64

Chris Barker - NOAA Federal
On Wed, Oct 21, 2015 at 9:23 AM, Carlos Medrano <[hidden email]> wrote:
 I am having an issue with function fromstring and long integers (int64). It seems it cannot read properly long numbers. I use scipy version 0.14.0 and numpy 1.8.2. I have the problem also if I use numpy directly.

import scipy
s='0 1445367600061 -35960 39671 79230'
scipy.fromstring(s, sep=' ', dtype=scipy.int64)

I get this

array([ 0, 2147483647, -35960, 39671, 79230], dtype=int64)

it works for me:
In [1]: %paste
import scipy
s='0 1445367600061 -35960 39671 79230'
scipy.fromstring(s, sep=' ', dtype=scipy.int64)

## -- End pasted text --
Out[1]: 
array([            0, 1445367600061,        -35960,         39671,
               79230])

numpy version 1.9.3 

However, I suspect it's a platform thing -- I'm on 64-bit OS-X, which uses 64-bit integers for a C long -- 32-bit platforms and 64-bit Windows don't.

scipy.fromstring is numpy.fromstring, and numpy.fromstring is kludgy, ugly and pretty broken. It also punts the string parsing to the C atoi(), with a bit of standard python plugged in there. That's why I think it's a C long problem.
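A quick way to test the C long theory on a given machine is sketched below. It is only a diagnostic, not a look at numpy internals: it prints the width of the platform's C long and shows that the wrong value in the original report is exactly the largest 32-bit signed integer, which is what one would expect if the number had been squeezed through a 32-bit long.

```python
import ctypes

import numpy as np

# Width of the platform's C long in bits. Typically 32 on 32-bit systems
# and on 64-bit Windows, 64 on most 64-bit Linux and OS X builds.
c_long_bits = 8 * ctypes.sizeof(ctypes.c_long)
print("C long is", c_long_bits, "bits")

# The wrong value from the original report is the largest 32-bit signed integer.
int32_max = int(np.iinfo(np.int32).max)
print("2147483647 is the int32 maximum:", int32_max == 2147483647)

# The value in the input string overflows 32 bits but fits easily in 64.
value = 1445367600061
print("overflows int32:", value > int32_max)
print("fits in int64:", value <= int(np.iinfo(np.int64).max))
```

If the first line reports 32 bits, the clamped result is consistent with the platform explanation.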

Even if you are running on a "proper" 64-bit platform, it may be broken, and I can guarantee you it will be a pain to fix.

So the short answer is -- don't use it.

scipy.loadtxt('dumf.txt', dtype=scipy.int64)

yes, loadtxt is a lot smarter.

However, fromstring is a lot faster, so it's too bad.
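For strings that are already in memory, two ways of getting the array without fromstring and without a temporary file are sketched below. Both are slower than fromstring, and the variable names are just for illustration. The first goes through Python's own int(), which is arbitrary precision; the second hands loadtxt a file-like object instead of a filename.

```python
import io

import numpy as np

s = '0 1445367600061 -35960 39671 79230'

# Option 1: split on whitespace and let Python's int() do the parsing.
# Python ints are arbitrary precision, so large values are not truncated
# before they are stored as int64.
parsed = np.array([int(tok) for tok in s.split()], dtype=np.int64)
print(parsed)

# Option 2: loadtxt accepts a file-like object, so the string can be
# wrapped in StringIO rather than written to disk first.
loaded = np.loadtxt(io.StringIO(s), dtype=np.int64)
print(loaded)
```

Either way the second element should come back as 1445367600061 regardless of the size of a C long on the platform.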

If you really need very fast reading of numbers from text files, I'd look at pandas' CSV reader -- I hear it's pretty sweet.
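A minimal sketch of the pandas route, assuming pandas is installed, is shown below. It reads the same space-separated string through read_csv with an explicit int64 dtype and then flattens the one-row result into a plain numpy array. No timing claims are made here; it only shows the call.

```python
import io

import numpy as np
import pandas as pd

s = '0 1445367600061 -35960 39671 79230'

# header=None: the data has no header row.
# sep=' ': the values are separated by single spaces.
# dtype=np.int64: ask for 64-bit integers explicitly.
frame = pd.read_csv(io.StringIO(s), sep=' ', header=None, dtype=np.int64)

# One row with five columns; flatten it into a 1-D numpy array.
values = frame.to_numpy().ravel()
print(values)
```

For a real file, the StringIO object would be replaced by the filename.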

Also -- I have some Cython code that's blazingly fast -- only floats right now, but it wouldn't be hard to adapt to integers...

I can send it to you if you want.

-CHB



--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[hidden email]
