Savitzky-Golay smoothing filter for not equally spaced data

16

I have a signal that is measured at 100 Hz and I need to apply the Savitzky-Golay smoothing filter to it. However, on closer inspection my signal is not measured at a perfectly constant rate: the delta between measurements ranges between 9.7 and 10.3 ms.

Is there a way to use the Savitzky-Golay filter on data that is not equally spaced? Are there other methods that I could apply?

filters smoothing

— VLC

A 1991 paper by Gorry covers pretty much this exact topic: datapdf.com/… . But tl;dr, datageist's answer is the right main idea (local least squares). What Gorry observes is that the coefficients depend only on the independent variables and are linear in the dependent variables (as in Savitzky-Golay). He then gives a way to compute them, but if you are not writing an optimized library, any old least-squares fitter will do.

— Dave Pritchard,

5

One approach would be to resample your data so that it is equally spaced; then you can do whatever processing you like. Bandlimited resampling using linear filtering is not going to be a good option, since the data is not uniformly spaced, so you could use some sort of local polynomial interpolation (e.g. cubic splines) to estimate what the underlying signal's values are at "exact" 10-millisecond intervals.

— Jason R

I had this solution in mind as a last resort. I wonder whether, in the end, this approach gives a better result than simply assuming that my signal is measured at a constant rate.

— VLC

I think that even if it is sampled non-uniformly, you can still use sinc() interpolation (or a different highly oversampled low-pass filter). This may give better results than a spline or a pchip.

— Hilmar

1

@Hilmar: You're right. There are a number of ways to resample the data; approximate sinc interpolation would be the "ideal" method for bandlimited resampling.

— Jason R,

15

Because of the way the Savitzky-Golay filter is derived (i.e. as local least-squares polynomial fits), there is a natural generalization to nonuniform sampling; it is just much more computationally expensive.

Savitzky-Golay filters in general

For the standard filter, the idea is to fit a polynomial to a local set of samples [using least squares], then replace the center sample with the value of the polynomial at the center index (i.e. at 0). That means the standard SG filter coefficients can be generated by (pseudo)inverting a Vandermonde matrix of sample indices. For example, to generate a local parabolic fit across five samples (with local indices -2, -1, 0, 1, 2), the system of design equations $Ac = y$, with one equation per sample $y_0 \dots y_4$, would be as follows:

$\begin{bmatrix} (-2)^0 & (-2)^1 & (-2)^2 \\ (-1)^0 & (-1)^1 & (-1)^2 \\ 0^0 & 0^1 & 0^2 \\ 1^0 & 1^1 & 1^2 \\ 2^0 & 2^1 & 2^2 \end{bmatrix} \begin{bmatrix} c_0 \\ c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix}.$

Here the unknowns are the least-squares coefficients $c_0 \dots c_2$ of the polynomial $c_0 + c_1x + c_2x^2$. Since the value of the polynomial at $x = 0$ is just $c_0$, computing the pseudoinverse of the design matrix (i.e. $c = (A^TA)^{-1}A^T y$) yields the SG smoothing coefficients in its top row:

$\begin{bmatrix} c_0 \\ c_1 \\ c_2 \end{bmatrix} = \frac{1}{35} \begin{bmatrix} -3 & 12 & 17 & 12 & -3 \\ -7 & -\tfrac{7}{2} & 0 & \tfrac{7}{2} & 7 \\ 5 & -\tfrac{5}{2} & -5 & -\tfrac{5}{2} & 5 \end{bmatrix} \begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix}.$

Since the derivative of $c_0 + c_1x + c_2x^2$ is $c_1 + 2c_2x$, which equals $c_1$ at $x = 0$, the second row of the matrix (the one that yields $c_1$) is a smoothed-derivative filter.

Nonuniform sampling

When the samples are not equally spaced, the coefficients depend on the local sample spacing, so a design matrix has to be built and (pseudo)inverted for every sample. If the nonuniform sample times are $x_n$, construct local coordinates $t_n$ with the center sample time fixed at $0$:

$\begin{aligned} t_{-2} & = x_{-2} - x_0 \\ t_{-1} & = x_{-1} - x_0 \\ t_0 & = x_0 - x_0 \\ t_1 & = x_1 - x_0 \\ t_2 & = x_2 - x_0 \end{aligned}$

so each design matrix will have the following form:

$A = \begin{bmatrix} t_{-2}^0 & t_{-2}^1 & t_{-2}^2 \\ t_{-1}^0 & t_{-1}^1 & t_{-1}^2 \\ t_0^0 & t_0^1 & t_0^2 \\ t_1^0 & t_1^1 & t_1^2 \\ t_2^0 & t_2^1 & t_2^2 \end{bmatrix} = \begin{bmatrix} 1 & t_{-2} & t_{-2}^2 \\ 1 & t_{-1} & t_{-1}^2 \\ 1 & 0 & 0 \\ 1 & t_1 & t_1^2 \\ 1 & t_2 & t_2^2 \end{bmatrix}.$

The first row of the pseudoinverse of $A$ dotted with the local sample values will yield $c_0$ , the smoothed value at that sample.
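The per-window computation can be sketched as below. This is one possible way to get the first row of the pseudoinverse, via the normal equations and Gauss-Jordan elimination; the name `sg_center_weights` is hypothetical, and a QR-based least-squares solver would be a more robust choice in production code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Weights w such that sum_j w[j] * y[j] is the least-squares polynomial's
// value at t = 0, i.e. the first row of the pseudoinverse of A.
// t: local sample times (center sample at 0); deg: polynomial degree.
std::vector<double> sg_center_weights(const std::vector<double>& t, std::size_t deg)
{
    const std::size_t o = deg + 1;          // number of polynomial coefficients
    assert(t.size() >= o);                  // need at least deg+1 samples

    // Normal matrix N = A^T A, with N[r][c] = sum_j t_j^(r+c), augmented
    // with the right-hand side e_0 = (1, 0, ..., 0) in column o.
    std::vector<std::vector<double>> N(o, std::vector<double>(o + 1, 0.0));
    for (double tj : t) {
        std::vector<double> p(2 * o - 1);   // powers t_j^0 .. t_j^(2*deg)
        p[0] = 1.0;
        for (std::size_t k = 1; k < p.size(); ++k) p[k] = p[k - 1] * tj;
        for (std::size_t r = 0; r < o; ++r)
            for (std::size_t c = 0; c < o; ++c) N[r][c] += p[r + c];
    }
    N[0][o] = 1.0;

    // Solve N z = e_0 by Gauss-Jordan elimination with partial pivoting.
    for (std::size_t col = 0; col < o; ++col) {
        std::size_t piv = col;
        for (std::size_t r = col + 1; r < o; ++r)
            if (std::fabs(N[r][col]) > std::fabs(N[piv][col])) piv = r;
        std::swap(N[col], N[piv]);
        assert(N[col][col] != 0.0);         // singular, e.g. repeated sample times
        for (std::size_t r = 0; r < o; ++r) {
            if (r == col) continue;
            const double f = N[r][col] / N[col][col];
            for (std::size_t c = col; c <= o; ++c) N[r][c] -= f * N[col][c];
        }
    }

    // N is symmetric, so the first row of (A^T A)^{-1} A^T is (A z)^T:
    // w_j = sum_k z_k * t_j^k, with z_k = N[k][o] / N[k][k].
    std::vector<double> w(t.size(), 0.0);
    for (std::size_t j = 0; j < t.size(); ++j) {
        double pw = 1.0;
        for (std::size_t k = 0; k < o; ++k) {
            w[j] += N[k][o] / N[k][k] * pw;
            pw *= t[j];
        }
    }
    return w;
}
```

For equally spaced times $-2, -1, 0, 1, 2$ and `deg = 2` this reproduces the standard weights $(-3, 12, 17, 12, -3)/35$. The normal equations can become ill-conditioned when the times are very small or very large numbers, so expressing $t$ in units of the nominal sample period (about 1 per step) is likely to help.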

— datageist

sounds like it moves from O(log(n)) to O(n^2).

— EngrStudent - Reinstate Monica

Here's a Scala implementation of the method described by datageist above.

— Medium core

1

@Mediumcore You didn't add a link to your original post. Also, I deleted it because it didn't provide an answer to the question. Please try to edit datageist's post to add a link; it'll be moderated in after review.

— Peter K.

4

"As a cheap alternative, one can simply pretend that the data points are equally spaced ...
if the change in $f$ across the full width of the $N$ point window is less than $\sqrt{N/2}$ times the measurement noise on a single point, then the cheap method can be used."
$\qquad -$ Numerical Recipes pp. 771-772

(derivation anyone ?)

("Pretend equally spaced" means:
take the nearest $\pm N/2$ points around each $t$ where you want SavGol( $t$ ),
not snap all $t_i \to i$ . That may be obvious, but got me for a while.)
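The quoted rule of thumb amounts to a one-line comparison. A sketch, with hypothetical names, that assumes "measurement noise" means the standard deviation of the noise on one point and that the caller already has an estimate of the change in $f$ across the window:

```cpp
#include <cmath>
#include <cstddef>

// True if the quoted rule allows treating the window as equally spaced.
// delta_f:  change in the underlying signal across the full window
// n_points: number of points N in the window
// sigma:    noise on a single point (assumed here to be a standard deviation)
bool can_pretend_equally_spaced(double delta_f, std::size_t n_points, double sigma)
{
    return std::fabs(delta_f) < std::sqrt(n_points / 2.0) * sigma;
}
```

For a 5-point window the threshold is $\sqrt{5/2} \approx 1.58$ times the single-point noise.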

— denis

1

I found out that there are two ways to use the Savitzky-Golay algorithm in Matlab: once as a filter and once as a smoothing function, but basically they should do the same thing.

yy = sgolayfilt(y,k,f): Here, the values y=y(x) are assumed to be equally spaced in x.
yy = smooth(x,y,span,'sgolay',degree): Here you can pass x as an extra input and, according to the Matlab help, x does not have to be equally spaced!

— Jochen

0

If it is of any help, I have made a C++ implementation of the method described by datageist. Free to use at your own risk.

#include <cstddef>
#include <iostream>
#include <vector>

// Helper routines used below but not included in the post. These declarations
// are inferred from the calls and are assumptions, not the author's code:
//   matMult(a, b, out, r, k, c): out (r x c) = a (r x k) * b (k x c), row-major
//   invert33, invert44, inverseMatrixLapack: invert a square matrix in place,
//   returning false on failure
void matMult(const double *a, const double *b, double *out, int r, int k, int c);
bool invert33(double *mat);
bool invert44(double *mat);
bool inverseMatrixLapack(int size, double *mat);

/**
 * @brief smooth_nonuniform
 * Implements the method described in  /signals/1676/savitzky-golay-smoothing-filter-for-not-equally-spaced-data
 * free to use at the user's risk
 * @param deg the degree of the local polynomial fit, e.g. deg=2 for a parabolic fit
 * @param n the half size of the smoothing sample, e.g. n=2 for smoothing over 5 points
 * @param x the sample positions (not necessarily equally spaced)
 * @param y the sample values
 * @param ysm output: the smoothed values; the first and last n points are copied unchanged
 */
bool smooth_nonuniform(unsigned int deg, unsigned int n, std::vector<double> const &x, std::vector<double> const &y, std::vector<double> &ysm)
{
    if(x.size()!=y.size()) return false; // don't even try
    if(x.size()<=2*n)      return false; // not enough data to start the smoothing process
    if(2*n+1<deg+1)        return false; // need at least deg+1 points to make the polynomial

    const int m = static_cast<int>(2*n+1); // the size of the filter window
    const int o = static_cast<int>(deg+1); // the smoothing order

    // std::vector zero-initializes its elements, so no memset is needed
    std::vector<double> A(m*o);   // design matrix, m x o, row-major
    std::vector<double> tA(m*o);  // its transpose, o x m
    std::vector<double> tAA(o*o); // normal matrix tA.A, o x o

    std::vector<double> t(m);     // local coordinates, centered on the current sample

    // do not smooth start and end data
    const std::size_t sz = y.size();
    ysm.assign(sz, 0.0);
    for(std::size_t i=0; i<n; i++)
    {
        ysm[i]=y[i];
        ysm[sz-i-1] = y[sz-i-1];
    }

    // start smoothing
    for(std::size_t i=n; i<sz-n; i++)
    {
        // make A and tA
        for(int j=0; j<m; j++)
        {
            t[j] = x[i+j-n] - x[i];
        }
        for(int j=0; j<m; j++)
        {
            double r = 1.0; // running power t[j]^k
            for(int k=0; k<o; k++)
            {
                A[j*o+k] = r;
                tA[k*m+j] = r;
                r *= t[j];
            }
        }

        // make tA.A
        matMult(tA.data(), A.data(), tAA.data(), o, m, o);

        // make (tA.A)-¹ in place
        if (o==3)
        {
            if(!invert33(tAA.data())) return false;
        }
        else if(o==4)
        {
            if(!invert44(tAA.data())) return false;
        }
        else
        {
            if(!inverseMatrixLapack(o, tAA.data())) return false;
        }

        // make (tA.A)-¹.tA
        matMult(tAA.data(), tA.data(), A.data(), o, o, m); // re-uses memory allocated for matrix A

        // compute the polynomial's value at the center of the sample:
        // the first row of (tA.A)-¹.tA holds the weights that yield c_0
        ysm[i] = 0.0;
        for(int j=0; j<m; j++)
        {
            ysm[i] += A[j]*y[i+j-n];
        }
    }

    // debug output: print the input next to the smoothed result
    std::cout << "      x       y       y_smoothed" << std::endl;
    for(std::size_t i=0; i<sz; i++) std::cout << "   " << x[i] << "   " << y[i]  << "   "<< ysm[i] << std::endl;

    return true;
}

— techwinder