Strange memory limits

Chris Weisiger-2
(This is unrelated to my earlier question about 2D data slicing)

We have a 32-bit Windows program with Python bindings; the Python side does most of the program logic, reserving the C++ side for heavy lifting. This program needs to reserve memory buffers to accept incoming image data from our different cameras -- it waits until it has received an image from every active camera, saves those images to disk, and repeats until all images are in. So the Python side uses numpy to allocate a block of memory, then hands it off to the C++ side, where images are written into it and later saved. Ordinarily all of our cameras operate in sync, so the delay between the first and last camera is small and we can keep the memory buffer small. I'm working on a modified data collection mode where each camera runs a lengthy independent sequence, though, which requires me to either rewrite the data saving system or simply increase the buffer size.

Increasing the buffer size works just fine until I try to allocate a 3x735x512x512 array (camera/Z/X/Y) of 16-bit ints, at which point I get a MemoryError. This is only a bit over 1GB worth of memory (out of 12GB on the computer), and according to Windows' Task Manager the program was only using about 100MB before I tried the allocation -- of course, I've no idea how Task Manager's numbers map to how much RAM I've actually requested. So that's a bit strange. I ought to have 4GB worth of space (or at the very least 3GB), which is more than enough for what I need.
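(For reference, the failing allocation boils down to something like this -- a sketch, not the exact code; I'm not sure offhand whether we use signed or unsigned 16-bit, and "buf" is just a placeholder name:)

import numpy as np

# 3 cameras x 735 Z slices x 512 x 512 pixels of 16-bit data
nbytes = 3 * 735 * 512 * 512 * 2
print("%.2f GB requested" % (nbytes / 1e9))           # ~1.16 GB

buf = np.zeros((3, 735, 512, 512), dtype=np.uint16)   # MemoryError raised here on our 32-bit build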

Short of firing up a memory debugger, any suggestions for tracking down big allocations? Numpy *should* be our only major offender here aside from the C++ portion of the program, which is small enough for me to examine by hand. Would it be reasonable to expect to see this problem go away if we rebuilt as a 64-bit program with 64-bit numpy et al?
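(One thing I can script easily is probing for the largest single block numpy will currently hand me, e.g. with a binary search -- just a sketch, and it only measures the symptom rather than telling me who owns the memory:)

import numpy as np

def largest_allocatable_mb(max_mb=4096):
    # binary-search for the largest contiguous block numpy can currently allocate
    lo, hi = 0, max_mb
    while lo < hi:
        mid = (lo + hi + 1) // 2
        try:
            np.empty(mid * 1024 * 1024, dtype=np.uint8)  # freed again immediately
            lo = mid
        except MemoryError:
            hi = mid - 1
    return lo

print("largest single block right now: ~%d MB" % largest_allocatable_mb())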

Thanks for your time.

-Chris


Re: Strange memory limits

David Baddeley
Hi Chris, 

What you're probably running into is a problem with allocating a contiguous block of memory / a memory fragmentation issue. Depending on how Windows has scattered the bits of your program (and how much you've allocated and deleted), you might have lots of small chunks of memory allocated throughout your 3GB address space. When Python asks for a contiguous block, it finds that none of that size is available, even though the total amount of free memory is sufficient. This doesn't just affect Python/numpy - I've had major issues with this in Matlab as well (if anything, Matlab seems worse). I've generally found I've been unable to reliably allocate contiguous blocks over ~1/4 of the total memory size. This also gets worse the longer Windows (and your program) have been running.

Compiling as 64 bit might solve your problem, as, with 12 GB of memory, there will be a larger address space to look for contiguous blocks in, but probably doesn't address the fundamental issue. I suspect you could probably get away with having much smaller contiguous blocks (e.g. have 3 separate arrays for the 3 different cameras) or even a new array for each image.
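Just as a sketch of what I mean (shapes taken from your mail; the 16-bit dtype and the names are my guesses):

import numpy as np

# one buffer per camera instead of a single 1GB+ block -- the allocator
# only has to find three smaller contiguous regions
camera_buffers = [np.empty((735, 512, 512), dtype=np.uint16) for _ in range(3)]

# or, at the other extreme, one small array per incoming image
frame = np.empty((512, 512), dtype=np.uint16)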

cheers,
David



Re: Strange memory limits

Charles R Harris
In reply to this post by Chris Weisiger-2


On Mon, Mar 28, 2011 at 4:22 PM, Chris Weisiger <[hidden email]> wrote:
> I ought to have 4GB worth of space (or at the very least 3GB), which is
> more than enough for what I need.

Windows 32 bit gives you 2GB and keeps the rest for itself.

<snip>

Chuck



Re: Strange memory limits

Chris Barker - NOAA Federal
In reply to this post by David Baddeley
On 3/28/11 4:51 PM, David Baddeley wrote:
> what you're probably running into is a problem with allocating a
> contiguous block of memory / a memory fragmentation issue.

On 3/28/11 8:01 PM, Charles R Harris wrote:

> Windows 32 bit gives you 2GB and keeps the rest for itself.

right -- I've had code that runs with no problems under 32-bit Python on
OS X but crashes out with memory errors on Windows, on similar hardware.

> Compiling as 64 bit might solve your problem, as, with 12 GB of memory,
> there will be a larger address space to look for contiguous blocks in,
> but probably doesn't address the fundamental issue.

Ah, but while you may still only have 12GB of physical memory, with 64-bit
Windows and Python the virtual address space is massive, so I suspect you'll
be fine. Using 1GB memory buffers on 32-bit is certainly pushing it.

> I suspect you could
> probably get away with having much smaller contiguous blocks (eg have 3
> separate arrays for the 3 different cameras) or even a new array for
> each image.

That would make it easier for the OS to manage the memory well.

-Chris


--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[hidden email]

Re: Strange memory limits

Chris Weisiger-2
Thanks for the information, all. Annoyingly, I can't allocate 3 individual 735x512x512 arrays (each ~360MB) -- the third gives a memory error. If I allocate 1024x1024 byte arrays (thus, each 1MB), I can make 1500 before getting a memory error. So I'm definitely running into *some* issue that prevents larger blocks from being allocated, but I'm also hitting a ceiling well before I should be.
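(For concreteness, the 1MB probe was along these lines -- a sketch, not the exact code:)

import numpy as np

# keep grabbing 1MB blocks until the allocator gives up
blocks = []
try:
    while True:
        blocks.append(np.empty((1024, 1024), dtype=np.uint8))
except MemoryError:
    print("%d MB allocated before failure" % len(blocks))  # ~1500 here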

I had thought that Windows allowed 3GB address spaces for 32-bit processes, but apparently (per http://msdn.microsoft.com/en-us/library/aa366778%28v=vs.85%29.aspx#memory_limits ) that only applies if the program has IMAGE_FILE_LARGE_ADDRESS_AWARE set and 4GT enabled... sounds like I'd need a recompile and some system tweaks to set those. The proper, and more work-intensive, solution would be to make a 64-bit build.

My (admittedly limited) understanding of memory fragmentation was that it's a per-process problem. I'm seeing this issue immediately on starting up the program, so the program's virtual memory address space should be pretty clean.

-Chris


Re: Strange memory limits

Christoph Gohlke


On 3/29/2011 10:31 AM, Chris Weisiger wrote:

> Thanks for the information, all. Annoyingly, I can't allocate 3
> individual 735x512x512 arrays (each ~360MB) -- the third gives a memory
> error. If I allocate 1024x1024 byte arrays (thus, each 1MB), I can make
> 1500 before getting a memory error. So I'm definitely running into
> *some* issue that prevents larger blocks from being allocated, but I'm
> also hitting a ceiling well before I should be.

Try VMMap
<http://technet.microsoft.com/en-us/sysinternals/dd535533.aspx>. The
software lists, among other useful information, the sizes of contiguous
blocks of memory available to a process. You'll probably find that 64
bit Python lets you use a much larger contiguous block than 32 bit Python.

It could help to create large numpy arrays early in the program, e.g.
before importing packages or creating other arrays.
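Something along these lines (a sketch; the array shape is taken from the earlier mails):

import numpy as np

# reserve the big acquisition buffer first, while the 32-bit address
# space is still mostly unfragmented ...
camera_buffer = np.empty((3, 735, 512, 512), dtype=np.uint16)

# ... and only then import the rest of the packages, build the GUI, etc.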

Christoph

Re: Strange memory limits

Chris Weisiger-2
On Tue, Mar 29, 2011 at 10:39 AM, Christoph Gohlke <[hidden email]> wrote:

> Try VMMap
> <http://technet.microsoft.com/en-us/sysinternals/dd535533.aspx>. The
> software lists, among other useful information, the sizes of contiguous
> blocks of memory available to a process. You'll probably find that 64
> bit Python lets you use a much larger contiguous block than 32 bit Python.
>
> It could help to create large numpy arrays early in the program, e.g.
> before importing packages or creating other arrays.


Ah, thanks. Looks like there are some very loosely packed "image" allocations at one end of the address space that basically preclude allocating large arrays in that area, without actually using up all that much total memory. I wonder if maybe they're for imported Python modules... well, at least now I have a tool to help me figure out where memory's going. The right answer is probably still to just make a 64-bit version, though.

-Chris
