[SciPy-User] Large Memory usage while doing median filter

[SciPy-User] Large Memory usage while doing median filter

Joe P Ninan
Hi,
I was trying median_filter in scipy.ndimage.filters
on a 1024x1024 array.

What I noticed is that the memory requirement grows really fast as the size of the median filter increases.
On a machine with 6 GB RAM I could only use a filter up to size (150, 150).
Anything above that gives a MemoryError.

On a bigger server I saw it use about 16 GB of RAM with a filter of size (200, 200).

I can understand computation time increasing with the size of the filter, but why does memory usage explode with the size of the median filter?
Is this expected behaviour?
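A minimal sketch of the call in question, scaled down so it completes quickly (the array and filter sizes here are stand-ins for the 1024x1024 image and (150, 150) filter in the report):

```python
import numpy as np
from scipy import ndimage

# Smaller stand-ins for the 1024x1024 image and (150, 150) filter
img = np.random.rand(128, 128).astype(np.float32)
out = ndimage.median_filter(img, size=(15, 15))

# The output has the same shape and dtype as the input
assert out.shape == img.shape
```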

-cheers
joe
--
/---------------------------------------------------------------
"GNU/Linux: because a PC is a terrible thing to waste" -  GNU Generation

************************************************
Joe Philip Ninan      
Research Scholar      
DAA,  TIFR,                          
Mumbai, India.     
Ph: +917738438212 
------------------------------------------------------------
My GnuPG Public Key: www.tifr.res.in/~ninan/JPN_public.key

_______________________________________________
SciPy-User mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/scipy-user

Re: Large Memory usage while doing median filter

Jerome Kieffer
I guess this is because SciPy creates a 1024x1024x(40000) array to do the sort along the last axis.
Maybe not the best from the memory point of view.
Cheers,

--
Jérôme Kieffer
Data analysis unit - ESRF
Reply | Threaded
Open this post in threaded view
|

Re: Large Memory usage while doing median filter

Juan Nunez-Iglesias
If you can cast your image as a uint8 image, try the median filter in scikit-image's filters.rank module. It's very fast and has a minimal memory footprint, but it doesn't work on floats or high-bit-depth ints.



Re: Large Memory usage while doing median filter

Joe P Ninan
Hi Juan,
Thank you for the suggestion, but my data is 32-bit float, and since precision matters, I cannot convert it to uint8.

As Jerome suggested, it might be due to the extra-large array SciPy creates to do faster sorting.
In the astronomy applications I typically encounter, our images are bigger than 1k x 1k, so I wonder whether other tools exist for median filtering.
For a moving-window median, only a few pixels leave and enter the window at each step; if we took advantage of that, the sort time needed to find the median at each window position wouldn't be very high.

Does anybody know of any such fast median filter routines in Python?
Thanking you,
-cheers
joe
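The moving-window idea Joe describes — reusing the previous window's sorted contents as one sample leaves and one enters — can be sketched in pure Python for the 1-D case. This is an illustrative sketch, not an existing library routine:

```python
from bisect import insort, bisect_left

def sliding_median(seq, window):
    """1-D moving median: maintain a sorted copy of the window and update
    it incrementally instead of re-sorting at every position."""
    assert window % 2 == 1, "odd window keeps the median a single element"
    sorted_win = sorted(seq[:window])
    mid = window // 2
    out = [sorted_win[mid]]
    for i in range(window, len(seq)):
        old, new = seq[i - window], seq[i]
        del sorted_win[bisect_left(sorted_win, old)]  # remove leaving sample
        insort(sorted_win, new)                       # insert entering sample
        out.append(sorted_win[mid])
    return out

print(sliding_median([2, 80, 6, 3, 1, 9, 4], 3))  # [6, 6, 3, 3, 4]
```

Each step costs O(window) for the list edits rather than O(window log window) for a fresh sort; the 2-D generalization (as in skimage's rank filters) replaces the sorted list with a histogram.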



Re: Large Memory usage while doing median filter

Moore, Eric (NIH/NIDDK) [F]


Hi Joe,

Would you report this as an issue on GitHub so that it doesn't get lost?

A second thought: a different implementation of the median filter exists in the signal package, as medfilt and medfilt2d. I haven't used these functions myself, but they might be worth a shot.
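A quick sketch of that alternative (medfilt2d works on 2-D arrays, including float32; note that it pads the borders with zeros):

```python
import numpy as np
from scipy.signal import medfilt2d

img = np.random.rand(64, 64).astype(np.float32)
smoothed = medfilt2d(img, kernel_size=5)  # 5x5 median, zero-padded at edges

assert smoothed.shape == img.shape
```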

-Eric

Re: Large Memory usage while doing median filter

Juan Nunez-Iglesias
In reply to this post by Joe P Ninan
Hey Joe,

The moving-window approach is the one used by skimage's rank filters, but it requires maintaining a histogram of the values currently in the window, which is much easier when they are 8-bit ints.
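The histogram trick is easy to see for 8-bit data: a 256-bin count lets you locate the median with a cumulative scan, and sliding the window only increments and decrements a few bins. A schematic, single-window version (the function name is illustrative, not an existing API):

```python
import numpy as np

def hist_median_u8(window):
    """Median of a uint8 window via a 256-bin histogram:
    walk the cumulative counts until half the pixels are covered."""
    counts = np.bincount(window.ravel(), minlength=256)
    half = (window.size + 1) // 2
    return int(np.searchsorted(np.cumsum(counts), half))

w = np.array([[10, 200, 10], [30, 10, 50], [10, 90, 10]], dtype=np.uint8)
print(hist_median_u8(w))  # 10
```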

It occurs to me you can generate a slow but low-memory version using generic_filter:

ndimage.generic_filter(image, np.median, footprint=footprint)

That won't generate a bigger array than the image + footprint, I believe.
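A runnable version of that suggestion, on a small array (it is slow, since np.median is called per pixel, but its memory use is on the order of the image plus footprint):

```python
import numpy as np
from scipy import ndimage

img = np.random.rand(32, 32).astype(np.float32)
footprint = np.ones((5, 5), dtype=bool)  # 5x5 window
out = ndimage.generic_filter(img, np.median, footprint=footprint)

assert out.shape == img.shape
```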

With a bit of work, you might actually be able to create the sliding window approach yourself this way! =)

Juan.


Re: Large Memory usage while doing median filter

Warren Weckesser-2
In reply to this post by Jerome Kieffer


On Mon, May 11, 2015 at 1:25 AM, Jerome Kieffer <[hidden email]> wrote:
> I guess this is because scipy creates a 1024x1024x(40000) array to do the sort along the last axis.
> Maybe not the best from the memory point of view.


Maybe I didn't search hard enough, but I don't see where such an array is allocated. There are several layers of calls, from Python in ndimage/filters.py down to C in ndimage/src/ni_filters.c, so maybe I missed it. Can you point to where such an array is created, or was that really a guess?

Warren







Re: Large Memory usage while doing median filter

Jerome Kieffer
On Mon, 11 May 2015 09:53:35 -0400
Warren Weckesser <[hidden email]> wrote:

> Maybe I didn't search hard enough, but I don't see where such an array is
> allocated.  There are several layers of calls, from python in
> ndimage/filters.py down to C in ndimage/src/ni_filters.c, so maybe I missed
> it.  Can you point to where such an array is created, or was that really a
> guess?

It really is a guess ... I did not look at the source code.

To do such things, a colleague of mine implemented it in CUDA (OpenCL would be much the same), but that is out of scope here.

Cheers.
--
Jérôme Kieffer
tel +33 476 882 445

Re: Large Memory usage while doing median filter

Moore, Eric (NIH/NIDDK) [F]
In reply to this post by Warren Weckesser-2
From: Warren Weckesser
Sent: Monday, May 11, 2015 9:54 AM
To: SciPy Users List
Subject: Re: [SciPy-User] Large Memory usage while doing median filter

> Maybe I didn't search hard enough, but I don't see where such an array is allocated. There are several layers of calls, from Python in ndimage/filters.py down to C in ndimage/src/ni_filters.c, so maybe I missed it. Can you point to where such an array is created, or was that really a guess?

The really large array is allocated in NI_InitFilterOffsets, on line 518 of ni_support.c, which is called from line 726 of ni_filters.c, in NI_RankFilter.

For me, calling ndimage.median_filter(arr, 150), with arr a (1024, 1024) array of doubles or floats, results in an allocation of 4,050,000,000 bytes (~3.77 GB), which seems a little bigger than we would like here.
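Eric's figure is consistent with an offsets table that grows quadratically in the footprint size: a (150, 150) filter has 22,500 footprint points, and — on one plausible reading of NI_InitFilterOffsets — one 8-byte offset is stored per footprint point for each of roughly 22,500 boundary configurations:

```python
footprint_points = 150 * 150            # 22,500 points in the filter window
offset_bytes = 8                        # assumed size of one offset entry
total = footprint_points * footprint_points * offset_bytes

print(total)          # 4050000000 bytes
print(total / 2**30)  # ~3.77 (GiB), matching the reported figure
```

This reading would also explain the original report: memory scales with the fourth power of the filter's linear size, so going from (150, 150) to (200, 200) roughly triples the allocation.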

-Eric


Re: Large Memory usage while doing median filter

Warren Weckesser-2


On Mon, May 11, 2015 at 10:34 AM, Moore, Eric (NIH/NIDDK) [F] <[hidden email]> wrote:

> The really large array is allocated in NI_InitFilterOffsets, on line 518 of ni_support.c, which is called from line 726 of ni_filters.c, in NI_RankFilter.

Thanks Eric.

Warren

 




Re: Large Memory usage while doing median filter

Sturla Molden-3
On 11/05/15 17:16, Warren Weckesser wrote:

>     I guess this is because scipy creates a 1024x1024x(40000) array to
>     do the sort along the last axis.
>     Maybe not the best from the memory point of view.

And when you see this answer, the solution is "Cython" :)

I guess we could change ndimage's median_filter to call introselect on each axis sequentially. But as noted, that will take a little bit of Cython.
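For reference, NumPy already exposes introselect through np.partition, which finds the k-th smallest element (e.g. the median of a window) without fully sorting:

```python
import numpy as np

vals = np.array([7., 1., 5., 3., 9.])
k = vals.size // 2
# After partitioning, the element at index k is the k-th smallest;
# no full sort is performed (O(n) average via introselect).
median = np.partition(vals, k)[k]
print(median)  # 5.0
```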


Sturla




Re: Large Memory usage while doing median filter

ralfgommers


On Mon, May 11, 2015 at 5:44 PM, Sturla Molden <[hidden email]> wrote:
> And when you see this answer, the solution is "Cython" :)
>
> I guess we could change ndimage's median_filter to call introselect on each axis sequentially. But as noted, that will take a little bit of Cython.

Good thing then that a GSoC project on rewriting ndimage in Cython is about to start :)

Ralf




Re: Large Memory usage while doing median filter

Sturla Molden-3
On 11/05/15 18:01, Ralf Gommers wrote:

> Good thing then that there's a GSoC on rewriting ndimage in Cython is
> about to start:)

Yes.

However, the world isn't always perfect, and neither is SciPy. But correctly working code is infinitely better than no code, even if it is hungry on memory. And the cheapest solution to excessive memory use is (almost) always to buy more RAM. That tends to be way cheaper than paying a developer :)


Sturla

