Hi,
I have a spectra with multiple gaussian emission lines over a noisy continuum. My primary objective is to find areas under all the gaussian peaks. For that, the following is the algorithm i have in mind. 1) fit the continuum and subtract it. 2) find the peaks 3) do least square fit of gaussian at the peaks to find the area under each gaussian peaks. I am basically stuck at the first step itself. Simple 2nd or 3rd order polynomial fit is not working because the contribution from peaks are significant. Any tool exist to fit continuum ignoring the peaks? For finding peaks, i tried find_peaks_cwt in signal module of scipy. But it seems to be quite sensitive of the width of peak and was picking up non-existing peaks also. The wavelet used was default mexican hat. Is there any better wavelet i should try? Or is there any other module in python/scipy which i should give a try? Thanking you. -cheers joe -- /--------------------------------------------------------------- "GNU/Linux: because a PC is a terrible thing to waste" - GNU Generation _______________________________________________ SciPy-User mailing list [hidden email] http://mail.scipy.org/mailman/listinfo/scipy-user |
I'm not sure about the peak-finding part, but for fitting the continuum by ignoring peaks I often do an iterative fit such as below. It ignores more and more points from the peaks. I use this kind of an algorithm for fitting the continuum on absorption spectra pretty often.
done = False while not done: done = True fit = numpy.poly1d(numpy.polyfit(x,y,order)) residuals = y - fit(x) std = numpy.std(residuals)
badindices = numpy.where(residuals > 5*std)[0] if badindices.size > 0: done = False x = numpy.delete(x, badindices) y = numpy.delete(y, badindices) That solution may not be the best for you if you have a very large array though, because deleting indices from a numpy array is inefficient. Cheers, Kevin Gullikson
On Fri, Sep 28, 2012 at 1:45 PM, Joe Philip Ninan <[hidden email]> wrote: Hi, _______________________________________________ SciPy-User mailing list [hidden email] http://mail.scipy.org/mailman/listinfo/scipy-user |
In reply to this post by Joe P Ninan
On Sat, 29 Sep 2012 00:15:21 +0530
Joe Philip Ninan <[hidden email]> wrote: > 1) fit the continuum and subtract it. > Or is there any other module in python/scipy which i should give a try? > Thanking you. Iteratively apply a Savitsky-Golay filter with a large width(>10) and a low order (2). at the begining you will only smear out the noise then start removing peaks. SG filter are really fast to apply. -- Jérôme Kieffer Data analysis unit - ESRF _______________________________________________ SciPy-User mailing list [hidden email] http://mail.scipy.org/mailman/listinfo/scipy-user |
In reply to this post by Joe P Ninan
> I have a spectra with multiple gaussian emission lines over a noisy
> continuum. > My primary objective is to find areas under all the gaussian peaks. > For that, the following is the algorithm i have in mind. > 1) fit the continuum and subtract it. > 2) find the peaks > 3) do least square fit of gaussian at the peaks to find the area under > each gaussian peaks. > I am basically stuck at the first step itself. Simple 2nd or 3rd order > polynomial fit is not working because the contribution from peaks are > significant. Any tool exist to fit continuum ignoring the peaks? Try to fit all at once and subtract only parts of the model which best describe the background. In fact, doing so, you do not even need to subtract the continuum. If you were using peak-o-mat you could e.g. try a model like CB DEC GA GA GA GA (constant background, exponential decay, gauss) assuming in this case, that the continuum can be described by an exponential function plus a constant offset. Then you would evaluate those parts of the model which are related to the continuum and subtract them from the original one. peak-o-mat however cannot find the peaks on its one, as you said, peak finding can be quite tricky. Tell if that helps or if you need more information. Regards, Christian _______________________________________________ SciPy-User mailing list [hidden email] http://mail.scipy.org/mailman/listinfo/scipy-user |
In reply to this post by kgulliks
On Sat, Sep 29, 2012 at 5:28 PM, Kevin Gullikson
<[hidden email]> wrote: > That solution may not be the best for you if you have a very large array > though, because deleting indices from a numpy array is inefficient. Alternatively, you could use a masked array, and just mask out the peaks. No deletion needed. What polynomial order do you use? _______________________________________________ SciPy-User mailing list [hidden email] http://mail.scipy.org/mailman/listinfo/scipy-user |
In reply to this post by Joe P Ninan
Hi Joe,
On Fri, Sep 28, 2012 at 1:45 PM, Joe Philip Ninan <[hidden email]> wrote: > Hi, > I have a spectra with multiple gaussian emission lines over a noisy > continuum. > My primary objective is to find areas under all the gaussian peaks. > For that, the following is the algorithm i have in mind. > 1) fit the continuum and subtract it. > 2) find the peaks > 3) do least square fit of gaussian at the peaks to find the area under each > gaussian peaks. > I am basically stuck at the first step itself. Simple 2nd or 3rd order > polynomial fit is not working because the contribution from peaks are > significant. Any tool exist to fit continuum ignoring the peaks? > For finding peaks, i tried find_peaks_cwt in signal module of scipy. But it > seems to be quite sensitive of the width of peak and was picking up > non-existing peaks also. > The wavelet used was default mexican hat. Is there any better wavelet i > should try? > > Or is there any other module in python/scipy which i should give a try? > Thanking you. > -cheers > joe I would echo much of the earlier advice. Fitting in stages (first background, then peaks) can be a bit dangerous, but is sometimes justifiable. I think there really isn't a good domain-independent way to model a continuum background, and it can be very useful to have some physical or spectral model for what the form of the continuum should be. That being said, there are a few things you might consider trying, especially since you know that you have positive peaks on a relatively smooth (if noisy) background. First, in the fit objective function, you might consider weighting positive elements of the residuals logarithmically and negative elements by some large scale or even exponentially. That will help to ignore the peaks, and keep the modeled background on the very low end of the spectra. Second, use your knowledge of the peak widths to set the polynomial or spline, or whatever function you're using to model the background. If you know your peaks have some range of widths, you could even consider using a Fourier filtering method to reduce the low-frequency continuum and the high-frequency noise while leaving the frequencies of interest (mostly) in tact. With such an approach, you might fit the background such that it only tried to match the low-frequency components of the spectra. Finally, sometimes, a least-squares fit isn't needed. For example, for x-ray fluorescence spectra there is a simple but pretty effective method by Kajfosz and Kwiatek in Nucl Instrum Meth B22, p78 (1987) "Non-polynomial approximation of background in x-ray spectra". For an implementation of this, see https://github.com/xraypy/tdl/blob/master/modules/xrf/xrf_bgr.py This might not be exactly what you're looking for, but it might help get you started. --Matt _______________________________________________ SciPy-User mailing list [hidden email] http://mail.scipy.org/mailman/listinfo/scipy-user |
Hi Joe;
I don't know what exactly you are working on, but it seems like you could benefit from the astronomical spectrum fitting package Sherpa, which is importable as a python module. You can read more about it here: http://cxc.cfa.harvard.edu/contrib/sherpa/ Python is developed but the Chandra x-ray center but is not astronomy-specific. An introduction to the interactive interface can be found at: http://python4astronomers.github.com/fitting/spectrum.html There is a bug in the current installer which concerns the sherpa.astro.ui module; if you're on a *nix-like system I have written a little how-to on fixing this bug here: http://lusepuster.posterous.com/installing-sherpa-fitting-software-on-ubuntu (The title says Ubuntu but I actually don't think there's anything Ubuntu-specific in it). But even if you have no luck fixing the bug, you can still use the normal sherpa.ui module which is all that is required to follow the above tutorial, all you'll lose is some pretty astronomy-specific convenience functions. As for the actual strategy: if you're only interested in the continuum in order to eliminate it, I think I'd recommend localizing the peaks first, then select a region around them (sherpa has a tool for that), choose a model for the continuum (there are several to choose from, but for local simple models a constant or a power law would often be fine), and then add a simple gaussian to the model and perform the fit as described in the Python4Astronomers link above. If your spectra are particularly well-behaved, you may have luck building a model that describes both continuum and all your peaks of interest with a combination of e.g. a (multiple) power-law or a blackbody spectrum plus some gaussians, but often the reward is not really worth the hassle. Cheers; Emil On 09/30/2012 08:21 PM, Matt Newville wrote: > Hi Joe, > > On Fri, Sep 28, 2012 at 1:45 PM, Joe Philip Ninan <[hidden email]> wrote: >> Hi, >> I have a spectra with multiple gaussian emission lines over a noisy >> continuum. >> My primary objective is to find areas under all the gaussian peaks. >> For that, the following is the algorithm i have in mind. >> 1) fit the continuum and subtract it. >> 2) find the peaks >> 3) do least square fit of gaussian at the peaks to find the area under each >> gaussian peaks. >> I am basically stuck at the first step itself. Simple 2nd or 3rd order >> polynomial fit is not working because the contribution from peaks are >> significant. Any tool exist to fit continuum ignoring the peaks? >> For finding peaks, i tried find_peaks_cwt in signal module of scipy. But it >> seems to be quite sensitive of the width of peak and was picking up >> non-existing peaks also. >> The wavelet used was default mexican hat. Is there any better wavelet i >> should try? >> >> Or is there any other module in python/scipy which i should give a try? >> Thanking you. >> -cheers >> joe > I would echo much of the earlier advice. Fitting in stages (first > background, then peaks) can be a bit dangerous, but is sometimes > justifiable. > > I think there really isn't a good domain-independent way to model a > continuum background, and it can be very useful to have some physical > or spectral model for what the form of the continuum should be. > > That being said, there are a few things you might consider trying, > especially since you know that you have positive peaks on a relatively > smooth (if noisy) background. First, in the fit objective function, > you might consider weighting positive elements of the residuals > logarithmically and negative elements by some large scale or even > exponentially. That will help to ignore the peaks, and keep the > modeled background on the very low end of the spectra. > > Second, use your knowledge of the peak widths to set the polynomial or > spline, or whatever function you're using to model the background. If > you know your peaks have some range of widths, you could even consider > using a Fourier filtering method to reduce the low-frequency continuum > and the high-frequency noise while leaving the frequencies of interest > (mostly) in tact. With such an approach, you might fit the background > such that it only tried to match the low-frequency components of the > spectra. > > Finally, sometimes, a least-squares fit isn't needed. For example, > for x-ray fluorescence spectra there is a simple but pretty effective > method by Kajfosz and Kwiatek in Nucl Instrum Meth B22, p78 (1987) > "Non-polynomial approximation of background in x-ray spectra". For an > implementation of this, see > https://github.com/xraypy/tdl/blob/master/modules/xrf/xrf_bgr.py > > This might not be exactly what you're looking for, but it might help > get you started. > > --Matt > _______________________________________________ > SciPy-User mailing list > [hidden email] > http://mail.scipy.org/mailman/listinfo/scipy-user _______________________________________________ SciPy-User mailing list [hidden email] http://mail.scipy.org/mailman/listinfo/scipy-user |
In reply to this post by Matt Newville
Hi Joe;
Depending on the physical character of your data, I believe the spectral fitting tool Sherpa could be of help to you. http://cxc.cfa.harvard.edu/contrib/sherpa/ An introductory tutorial is given here: http://python4astronomers.github.com/fitting/spectrum.html - the software package is general enough that it is also useful for non-astronomers. It continas some very nice tools to include or ignore certain data regions in your fit. There is a bug in the Sherpa standalone package, the solution of which I descibe here: http://lusepuster.posterous.com/installing-sherpa-fitting-software-on-ubuntu (I believe it should work on any *nix like system with the proper libraries and build tools installed). The bug only affects the sherpa.astro.ui sub-package, If for some reason you have no suces with the bug fix, you can still use all the functionality in sherpa.ui and follow the tutorials etc.; all you lose is some specialized high-level astronomical convenience functions and models. If your continuum isn't particularly well-behaved, and if you are only interested in the continuum in order to eliminate it, I think I'd start with localizing the peaks, then selecting a small region around each of them and model the background with your model of choice - in this case, a constant or a polynomium or power law shouod often work fine - and add the gaussian for the peak to the model, perform the fit and go on to next line. A first-guess to the peak position can often be made with some prior knowledge of the wavelength of the transition you're investigating. Cheers; Emil On 09/30/2012 08:21 PM, Matt Newville wrote: > Hi Joe, > > On Fri, Sep 28, 2012 at 1:45 PM, Joe Philip Ninan <[hidden email]> wrote: >> Hi, >> I have a spectra with multiple gaussian emission lines over a noisy >> continuum. >> My primary objective is to find areas under all the gaussian peaks. >> For that, the following is the algorithm i have in mind. >> 1) fit the continuum and subtract it. >> 2) find the peaks >> 3) do least square fit of gaussian at the peaks to find the area under each >> gaussian peaks. >> I am basically stuck at the first step itself. Simple 2nd or 3rd order >> polynomial fit is not working because the contribution from peaks are >> significant. Any tool exist to fit continuum ignoring the peaks? >> For finding peaks, i tried find_peaks_cwt in signal module of scipy. But it >> seems to be quite sensitive of the width of peak and was picking up >> non-existing peaks also. >> The wavelet used was default mexican hat. Is there any better wavelet i >> should try? >> >> Or is there any other module in python/scipy which i should give a try? >> Thanking you. >> -cheers >> joe > I would echo much of the earlier advice. Fitting in stages (first > background, then peaks) can be a bit dangerous, but is sometimes > justifiable. > > I think there really isn't a good domain-independent way to model a > continuum background, and it can be very useful to have some physical > or spectral model for what the form of the continuum should be. > > That being said, there are a few things you might consider trying, > especially since you know that you have positive peaks on a relatively > smooth (if noisy) background. First, in the fit objective function, > you might consider weighting positive elements of the residuals > logarithmically and negative elements by some large scale or even > exponentially. That will help to ignore the peaks, and keep the > modeled background on the very low end of the spectra. > > Second, use your knowledge of the peak widths to set the polynomial or > spline, or whatever function you're using to model the background. If > you know your peaks have some range of widths, you could even consider > using a Fourier filtering method to reduce the low-frequency continuum > and the high-frequency noise while leaving the frequencies of interest > (mostly) in tact. With such an approach, you might fit the background > such that it only tried to match the low-frequency components of the > spectra. > > Finally, sometimes, a least-squares fit isn't needed. For example, > for x-ray fluorescence spectra there is a simple but pretty effective > method by Kajfosz and Kwiatek in Nucl Instrum Meth B22, p78 (1987) > "Non-polynomial approximation of background in x-ray spectra". For an > implementation of this, see > https://github.com/xraypy/tdl/blob/master/modules/xrf/xrf_bgr.py > > This might not be exactly what you're looking for, but it might help > get you started. > > --Matt > _______________________________________________ > SciPy-User mailing list > [hidden email] > http://mail.scipy.org/mailman/listinfo/scipy-user _______________________________________________ SciPy-User mailing list [hidden email] http://mail.scipy.org/mailman/listinfo/scipy-user |
In reply to this post by Joe P Ninan
Hi Matt, Christian, Jerome, Kevin and David.
Thanks a lot for all the suggestions. First i apologize for my delay in replying. something was wrong with my subscription, and i was only able to read the emails in archive page. I tried to model the continuum as an exponential function and did least square fit. ( at first i was trying out 2nd degree polynomials) With the iterative masking method suggested. it seems to be doing a good job. I haven't tried on all data set yet. Since the width,position nor amplitude of peaks were not same in all data, peak finding was not easy. But the code by sixtenbe in github https://gist.github.com/1178136 helped me find the peaks _almost_ reliably. Thanking you again for all the help. -cheers joe On 29 September 2012 00:15, Joe Philip Ninan <[hidden email]> wrote: Hi, -- /--------------------------------------------------------------- "GNU/Linux: because a PC is a terrible thing to waste" - GNU Generation ************************************************ Joe Philip Ninan http://sites.google.com/site/jpninan/ Research Scholar /________________\ DAA, | Vadakeparambil | TIFR, | Pullad P.O. | Mumbai-05, India. | Kerala, India | Ph: +917738438212 | PIN:689548 | ------------------------------\_______________/-------------- _______________________________________________ SciPy-User mailing list [hidden email] http://mail.scipy.org/mailman/listinfo/scipy-user |
In reply to this post by Jerome Kieffer
Hi,
I know I'm a bit late to the discussion, but I have some experience fitting emission lines. Here's what I've found to work: *) Fit the lines and background together *) Use the simplest reasonable model for the background: constant, linear, etc. --> You could measure the background and construct a model using linear interpolation *) Put the characteristics of your detector in your model: -> Line-width (fwhm) vs energy -> Detector efficiency vs energy *) If you know the possible lines you'll be looking for, put those in your model as well If you don't know what lines to expect, but know the shape of the peaks you're looking for, you might look at using MPOC-MLE, which is described reasonably well in the paper "Pileup Correction Algorithms for Very-High-Count-Rate Gamma-Ray Spectrometry With NaI(Tl) Detectors" by M. Bolic. I've implemented a modified version of the algorithm for counting x-rays from detectors in pulse-mode, and it's the most robust algorithm I've been able to find for that purpose. I hope this helps. David On 09/30/2012 03:54 AM, Jerome Kieffer wrote: > On Sat, 29 Sep 2012 00:15:21 +0530 > Joe Philip Ninan<[hidden email]> wrote: > >> 1) fit the continuum and subtract it. >> Or is there any other module in python/scipy which i should give a try? >> Thanking you. > Iteratively apply a Savitsky-Golay filter with a large width(>10) and a low order (2). > at the begining you will only smear out the noise then start removing peaks. > > SG filter are really fast to apply. > _______________________________________________ SciPy-User mailing list [hidden email] http://mail.scipy.org/mailman/listinfo/scipy-user |
In reply to this post by Joe P Ninan
On Thu, Oct 4, 2012 at 7:20 AM, Joe Philip Ninan <[hidden email]> wrote:
> Hi Matt, Christian, Jerome, Kevin and David. > Thanks a lot for all the suggestions. > First i apologize for my delay in replying. something was wrong with my > subscription, and i was only able to read the emails in archive page. > I tried to model the continuum as an exponential function and did least > square fit. ( at first i was trying out 2nd degree polynomials) > With the iterative masking method suggested. it seems to be doing a good > job. I haven't tried on all data set yet. > Since the width,position nor amplitude of peaks were not same in all data, > peak finding was not easy. > But the code by sixtenbe in github https://gist.github.com/1178136 helped me > find the peaks _almost_ reliably. > Thanking you again for all the help. Is there some sample data of spectra that has this pattern available somewhere? Kevins iterative dropping/masking method is very similar to least trimmed squares. My impression is that identifying peaks and fitting the continuum/baseline is very similar to outlier detection with robust estimation. statsmodels has robust M-estimation, essentially replacing least squares by a robust loss function. I have in preparation for statsmodels, least trimmed squares (which starts with a small subsample and adds observations until only outliers are left), maximum trimmed likelihood (which also works for other models like Poisson) and MM-estimators (which start with least trimmed squares but then switches to M-estimation to get higher efficiency in the normal case.) Caveat: so far only for models that are linear in parameters. With some sample data we could try if any of our robust estimators would help in this case. Josef > -cheers > joe > > > On 29 September 2012 00:15, Joe Philip Ninan <[hidden email]> wrote: >> >> Hi, >> I have a spectra with multiple gaussian emission lines over a noisy >> continuum. >> My primary objective is to find areas under all the gaussian peaks. >> For that, the following is the algorithm i have in mind. >> 1) fit the continuum and subtract it. >> 2) find the peaks >> 3) do least square fit of gaussian at the peaks to find the area under >> each gaussian peaks. >> I am basically stuck at the first step itself. Simple 2nd or 3rd order >> polynomial fit is not working because the contribution from peaks are >> significant. Any tool exist to fit continuum ignoring the peaks? >> For finding peaks, i tried find_peaks_cwt in signal module of scipy. But >> it seems to be quite sensitive of the width of peak and was picking up >> non-existing peaks also. >> The wavelet used was default mexican hat. Is there any better wavelet i >> should try? >> >> Or is there any other module in python/scipy which i should give a try? >> Thanking you. >> -cheers >> joe >> -- >> /--------------------------------------------------------------- >> "GNU/Linux: because a PC is a terrible thing to waste" - GNU Generation >> >> > > > > -- > /--------------------------------------------------------------- > "GNU/Linux: because a PC is a terrible thing to waste" - GNU Generation > > ************************************************ > Joe Philip Ninan http://sites.google.com/site/jpninan/ > Research Scholar /________________\ > DAA, | Vadakeparambil | > TIFR, | Pullad P.O. | > Mumbai-05, India. | Kerala, India | > Ph: +917738438212 | PIN:689548 | > ------------------------------\_______________/-------------- > > > _______________________________________________ > SciPy-User mailing list > [hidden email] > http://mail.scipy.org/mailman/listinfo/scipy-user > SciPy-User mailing list [hidden email] http://mail.scipy.org/mailman/listinfo/scipy-user |
Free forum by Nabble | Edit this page |