down-sampling an array by averaging - vectorized form?

down-sampling an array by averaging - vectorized form?

Andrew Giessel
Hello all,

I'm looking to down-sample an image by averaging.  I quickly hacked up the following code, which does exactly what I want, but the double loop is slow (the images I'm working with are ~2000x2000 pixels).  Is there a nice way to vectorize this?  A quick profile showed that most of the time is spent averaging; perhaps there is a way to utilize np.sum or np.cumsum, divide the whole array, and then take every so many pixels?

This method of down-sampling (spatial averaging) makes sense for the type of data I'm using and yields good results, but I'm also open to alternatives.  Thanks in advance!

Andrew

######################
import numpy as np

def downsample(array, reduction):
    """example call for 2fold size reduction:  newImage = downsample(image, 2)"""

    # shape must be passed as a tuple; use integer division for the output size
    newArray = np.empty((array.shape[0] // reduction, array.shape[1] // reduction))

    for x in range(newArray.shape[0]):
        for y in range(newArray.shape[1]):
            # slice ends are exclusive, so this averages the full reduction x reduction block
            newArray[x, y] = np.mean(array[x*reduction:(x+1)*reduction,
                                           y*reduction:(y+1)*reduction])

    return newArray
######################
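
A rough sketch of the cumsum idea (a summed-area table, differenced at the block corners and divided by the block area; it trims the image to a multiple of the reduction factor, and the function name is just a placeholder):

######################
import numpy as np

def downsample_cumsum(array, reduction):
    """Block-average via a summed-area table; trims edges that don't fit."""
    r = reduction
    h = (array.shape[0] // r) * r
    w = (array.shape[1] // r) * r
    a = array[:h, :w]
    # summed-area table with a leading row/column of zeros
    S = np.zeros((h + 1, w + 1))
    S[1:, 1:] = a.cumsum(axis=0).cumsum(axis=1)
    # each block sum is a difference of the four corner entries of that block
    block_sums = (S[r::r, r::r] - S[:-r:r, r::r]
                  - S[r::r, :-r:r] + S[:-r:r, :-r:r])
    return block_sums / float(r * r)
######################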

--
Andrew Giessel, PhD

Department of Neurobiology, Harvard Medical School
220 Longwood Ave Boston, MA 02115
ph: 617.432.7971 email: [hidden email]


Re: down-sampling an array by averaging - vectorized form?

jkhilmer@chemistry.montana.edu
Andrew,

This is a very naive response, since I don't have nearly as much
experience as many of the other contributors here.

Since you're working with images, and you might want more complicated
variants of this process in the future, why not convolve your array
and slice the output with a stride/step?  For the sizes you're talking
about, memory shouldn't be a concern.  That should give you a very
flexible procedure that is inherently "vectorized".
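
A minimal sketch of that idea (assuming scipy.ndimage is available: uniform_filter takes the local mean over a reduction x reduction window, then a strided slice keeps one sample per block; the function name is just a placeholder):

import numpy as np
from scipy import ndimage

def downsample_filter(image, reduction):
    # local mean over a reduction x reduction neighbourhood
    smoothed = ndimage.uniform_filter(image, size=reduction)
    # keep one sample near the centre of each block
    offset = reduction // 2
    return smoothed[offset::reduction, offset::reduction]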

Jonathan



Re: down-sampling an array by averaging - vectorized form?

Jerome Kieffer
In reply to this post by Andrew Giessel
Hi,

Your needs are very close to "binning" (just divide the result by the number of input pixels in each output pixel):

import numpy

def binning(inputArray, binsize):
    """
    @param inputArray: input ndarray
    @param binsize: int or 2-tuple representing the size of the binning
    @return: binned input ndarray
    """
    inputSize = inputArray.shape
    outputSize = []
    assert(len(inputSize) == 2)
    if isinstance(binsize, int):
        binsize = (binsize, binsize)
    for i, j in zip(inputSize, binsize):
        assert(i % j == 0)
        outputSize.append(i // j)

    if numpy.array(binsize).prod() < 50:
        out = numpy.zeros(tuple(outputSize))
        for i in xrange(binsize[0]):
            for j in xrange(binsize[1]):
                out += inputArray[i::binsize[0], j::binsize[1]]
    else:
        temp = inputArray.copy()
        temp.shape = (outputSize[0], binsize[0], outputSize[1], binsize[1])
        out = temp.sum(axis=3).sum(axis=1)
    return out


This function implements two methods:
- one, faster for small bin sizes, based on a loop over the bin offsets and a sum of strided sub-arrays
- one for larger bin sizes (8x8 and up) based on a reshape and two sums along two different axes
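
For the average rather than the sum, divide by the number of pixels per bin; for example (a sketch, with `image` standing in for your real 2000x2000 array):

import numpy

image = numpy.random.random((2000, 2000))   # placeholder for the real data
averaged = binning(image, 4) / 16.0          # 4x4 bins, so divide by 4*4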

HTH
--
Jérôme Kieffer
Data analysis unit - ESRF

Re: down-sampling an array by averaging - vectorized form?

Tony Yu-3
In reply to this post by Andrew Giessel


I think `scipy.ndimage.zoom` does what you want. Or actually, it does the opposite: your 2fold size reduction example would be

>>> from scipy import ndimage
>>> small_image = ndimage.zoom(image, 0.5)

-Tony 


Re: down-sampling an array by averaging - vectorized form?

Andrew Giessel
I'd like to thank everyone for their responses; they were really helpful in thinking about the problem.  All the solutions people posted were faster than my original brutish algorithm, but all had subtle differences as well.  None of them are 'vectorized' per se, but all are more clever or more efficient ways of getting at the same problem.  I thought I'd write a couple of quick comments.

Convolution with a kernel is a good idea (one that I should have thought of since I've been working with various types of filtering recently) and works quite quickly.  Depending on the type and size of the kernel, it yields slightly different local averages at each point, which can be sub-sampled to yield a smaller array.  I didn't play with too many types of kernels, but I'd say it's important to consider the nature of the kernel and the relation of the subsampling frequency to the size of the kernel in order to get the best results.

The binning followed by scalar division is wicked fast and yields results that are very close to my original algorithm.  The reshaping seems very clever and I am going to read it more carefully to learn some lessons there, I think.

The ndimage.zoom approach is very general (and roughly as quick as the others).  As far as I can tell, that function uses spline interpolation for zoom factors > 1, and I'm unsure how it deals with zoom factors < 1.  It might do nearest neighbor or something like that; I wasn't able to tell quickly from glancing at the source.  If anyone knows, it would be cool to hear.

I think I'll probably go with binning for now; we'll be dealing with hundreds of wide-field microscopy images (of roughly this size) in every experiment, and speed is a factor.

Thanks again everyone!

Best,

Andrew



--
Andrew Giessel, PhD

Department of Neurobiology, Harvard Medical School
220 Longwood Ave Boston, MA 02115
ph: 617.432.7971 email: [hidden email]


Re: down-sampling an array by averaging - vectorized form?

Zachary Pincus-2
Hi Andrew,

> None of them are 'vectorized' per se, but all are more clever or more efficient ways of getting at the same problem.  I thought I'd write a couple of quick comments.

It depends what you mean by "vectorized" -- none are using SIMD instructions on the chip, but from the Matlab/numpy perspective I think people often mean "vectorized" as "multiple data elements acted on by a single command" such as C = A + B, where A and B are matrices.

In any case, the reshaping approach is "vectorized" according to the latter definition, which obviously really just means "the loops are pushed down into C"...

> The binning followed by scalar division is wicked fast and yields results that are very close to my original algorithm.  The reshaping seems very clever and I am going to read it more carefully to learn some lessons there, I think.

For more information on reshaping to do decimation, see the "avoiding loops when downsampling arrays" thread on the numpy-discussion list from last week:
https://groups.google.com/forum/#!topic/numpy/qyDKJTj5jx4

There's a bit more discussion about how this works, and some memory-layout caveats to be aware of.
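
For reference, the reshape idiom looks roughly like this (a sketch; it assumes the reduction factor divides both image dimensions evenly, and the array here is just a stand-in):

import numpy as np

A = np.random.rand(2000, 2000)   # stand-in for the real image
r = 2                            # reduction factor; must divide both dimensions
B = A.reshape(A.shape[0] // r, r, A.shape[1] // r, r).mean(axis=3).mean(axis=1)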

Also, instead of doing the reshaping, you could see if hard-coding the averaging is faster. Here's how to do it for the 2x2 case:
B = (A[::2,::2] + A[1::2,::2] + A[::2,1::2] + A[1::2,1::2])/4.0

> The ndimage.zoom approach is very general (and roughly as quick as the others).  As far as I can tell, that function uses spline interpolation for zoom factors > 1, and I'm unsure how it deals with zoom factors < 1.  It might do nearest neighbor or something like that; I wasn't able to tell quickly from glancing at the source.  If anyone knows, it would be cool to hear.

I'm pretty certain that the zoom function doesn't do anything smart for image decimation/minification -- it just uses the requested interpolation order to take a point-sample of the image at the calculated coordinates. Lack of good decimation is a limitation of ndimage. I know that there are decimation routines in scipy.signal, but I'm not sure if they're just 1D.

In general, for integer-factor downsampling, I either do it with slicing like the above example, or use convolution (like ndimage.gaussian_filter with the appropriate bandwidth, which is quite fast) to prefilter followed by taking a view such as A[::3,::3] to downsample by a factor of three.
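
For example, a sketch of the prefilter-then-slice variant (the sigma here is an arbitrary choice, loosely matched to the factor-of-three decimation, and the array is a stand-in):

import numpy as np
from scipy import ndimage

A = np.random.rand(2000, 2000)             # stand-in for the real image
# blur away detail finer than the new sampling grid, then take every 3rd pixel
smoothed = ndimage.gaussian_filter(A, sigma=1.0)
small = smoothed[::3, ::3]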

Zach



Re: down-sampling an array by averaging - vectorized form?

Andrew Giessel
Heya Zach-

On Sat, Feb 11, 2012 at 14:56, Zachary Pincus <[hidden email]> wrote:
> Hi Andrew,
>
>> None of them are 'vectorized' per se, but all are more clever or more efficient ways of getting at the same problem.  I thought I'd write a couple of quick comments.
>
> It depends what you mean by "vectorized" -- none are using SIMD instructions on the chip, but from the Matlab/numpy perspective I think people often mean "vectorized" as "multiple data elements acted on by a single command" such as C = A + B, where A and B are matrices.
>
> In any case, the reshaping approach is "vectorized" according to the latter definition, which obviously really just means "the loops are pushed down into C"...

Yes, I guess that's more precisely what I mean; I would call the reshaping approach vectorized as well.

>> The binning followed by scalar division is wicked fast and yields results that are very close to my original algorithm.  The reshaping seems very clever and I am going to read it more carefully to learn some lessons there, I think.
>
> For more information on reshaping to do decimation, see the "avoiding loops when downsampling arrays" thread on the numpy-discussion list from last week:
> https://groups.google.com/forum/#!topic/numpy/qyDKJTj5jx4
>
> There's a bit more discussion about how this works, and some memory-layout caveats to be aware of.
>
> Also, instead of doing the reshaping, you could see if hard-coding the averaging is faster. Here's how to do it for the 2x2 case:
> B = (A[::2,::2] + A[1::2,::2] + A[::2,1::2] + A[1::2,1::2])/4.0

Wow, last week?  Guess that's what I get for not searching archives.  I just joined both numpy-discussion and scipy-user this week.  I will check that thread.

>> The ndimage.zoom approach is very general (and roughly as quick as the others).  As far as I can tell, that function uses spline interpolation for zoom factors > 1, and I'm unsure how it deals with zoom factors < 1.  It might do nearest neighbor or something like that; I wasn't able to tell quickly from glancing at the source.  If anyone knows, it would be cool to hear.
>
> I'm pretty certain that the zoom function doesn't do anything smart for image decimation/minification -- it just uses the requested interpolation order to take a point-sample of the image at the calculated coordinates. Lack of good decimation is a limitation of ndimage. I know that there are decimation routines in scipy.signal, but I'm not sure if they're just 1D.
>
> In general, for integer-factor downsampling, I either do it with slicing like the above example, or use convolution (like ndimage.gaussian_filter with the appropriate bandwidth, which is quite fast) to prefilter followed by taking a view such as A[::3,::3] to downsample by a factor of three.

Cheers, it's great to know what other people do.

ag 

--
Andrew Giessel, PhD

Department of Neurobiology, Harvard Medical School
220 Longwood Ave Boston, MA 02115
ph: 617.432.7971 email: [hidden email]
