Learning from Prof Sergey Fomel’s paper: “Adaptive multiple subtraction using regularized non stationary regression”

stationary process:


Formally, let \left\{X_t\right\} be a stochastic process and let F_{X}(x_{t_1 + \tau}, \ldots, x_{t_k + \tau}) represent the cumulative distribution function of the joint distribution of \left\{X_t\right\} at times t_1 + \tau, \ldots, t_k + \tau. Then, \left\{X_t\right\} is said to be stationary if, for all k, for all \tau, and for all t_1, \ldots, t_k,

 F_{X}(x_{t_1+\tau} ,\ldots, x_{t_k+\tau}) = F_{X}(x_{t_1},\ldots, x_{t_k}).

Since \tau does not affect F_X(\cdot) F_{X} is not a function of time.


In all cases, the estimation target is a function of the independent variables called the regression function. In regression analysis, it is also of interest to characterize the variation of the dependent variable around the regression function which can be described by a probability distribution.

The performance of regression analysis methods in practice depends on the form of the data generating process, and how it relates to the regression approach being used. Since the true form of the data-generating process is generally not known, regression analysis often depends to some extent on making assumptions about this process. These assumptions are sometimes testable if a sufficient quantity of data is available. Regression models for prediction are often useful even when the assumptions are moderately violated, although they may not perform optimally. However, in many applications, especially with small effects or questions of causality based on observational data, regression methods can give misleading results.

match filtering:?

Adaptive subtraction: Standard adaptive subtraction methods use the well-known minimum energy criterion, stating that the total energy after optimal multiple attenuation should be minimal.

Adaptive subtraction

The goal of adaptive subtraction is to estimate the non-stationary filters ${\bf f}$ that minimize the objective function 

g({\bf f})=\Vert{\bf Pf}-{\bf d}\Vert^2
,\end{displaymath} (71)

where ${\bf P}$ represents the non-stationary convolution with the multiple model obtained with SRMP (i.e., Chapter [*]) and ${\bf d}$ are the input data. These filters are estimated in a least-squares sense for one shot gather at a time. Note that in practice, a regularization term is usually added in equation ([*]) to enforce smoothness between filters. This strategy is similar to the one used in Chapter [*]. The residual vector ${\bf Pf}-{\bf d}$ contains the estimated primaries.

Multiple subtraction

In this section, the multiple model computed in the preceding section is subtracted from the data with two techniques. The model is obtained after shot interpolation with the sparseness constraint. The first technique is a pattern-based method introduced in Chapter [*] that separates primaries from multiples according to their multivariate spectra. These spectra are approximated with prediction-error filters. The second technique adaptively subtract the multiple model from the data by estimating non-stationary matching filters (see Chapter [*]).

Shaping regularization: http://www.reproducibility.org/RSF/book/jsg/shape/paper_html/

Least squares sense:

Least Squares Fitting

DOWNLOAD Mathematica Notebook EXPLORE THIS TOPIC IN the MathWorld Classroom Contribute to this entryLeastSquaresFitting

A mathematical procedure for finding the best-fitting curve to a given set of points by minimizing the sum of the squares of the offsets (“the residuals”) of the points from the curve. The sum of the squares of the offsets is used instead of the offset absolute values because this allows the residuals to be treated as a continuous differentiable quantity. However, because squares of the offsets are used, outlying points can have a disproportionate effect on the fit, a property which may or may not be desirable depending on the problem at hand.

Matched filter:

In signal processing, a matched filter (originally known as a North filter[1]) is obtained by correlating a known signal, or template, with an unknown signal to detect the presence of the template in the unknown signal. This is equivalent to convolving the unknown signal with a conjugated time-reversed version of the template. The matched filter is the optimal linear filter for maximizing the signal to noise ratio (SNR) in the presence of additive stochastic noise. Matched filters are commonly used in radar, in which a known signal is sent out, and the reflected signal is examined for common elements of the out-going signal. Pulse compression is an example of matched filtering. It is so called because impulse response is matched to input pulse signals. Two-dimensional matched filters are commonly used in image processing, e.g., to improve SNR for X-ray. Matched filtering is a demodulation technique with LTI filters to maximize SNR.

Derivation of the matched filter impulse response[edit]

The following section derives the matched filter for a discrete-time system. The derivation for a continuous-time system is similar, with summations replaced with integrals.

The matched filter is the linear filter, h, that maximizes the output signal-to-noise ratio.

\ y[n] = \sum_{k=-\infty}^{\infty} h[n-k] x[k].

Though we most often express filters as the impulse response of convolution systems, as above (see LTI system theory), it is easiest to think of the matched filter in the context of the inner product, which we will see shortly.

We can derive the linear filter that maximizes output signal-to-noise ratio by invoking a geometric argument. The intuition behind the matched filter relies on correlating the received signal (a vector) with a filter (another vector) that is parallel with the signal, maximizing the inner product. This enhances the signal. When we consider the additive stochastic noise, we have the additional challenge of minimizing the output due to noise by choosing a filter that is orthogonal to the noise.

Let us formally define the problem. We seek a filter, h, such that we maximize the output signal-to-noise ratio, where the output is the inner product of the filter and the observed signal x.

Our observed signal consists of the desirable signal s and additive noise v:

\ x=s+v.\,

Let us define the covariance matrix of the noise, reminding ourselves that this matrix has Hermitian symmetry, a property that will become useful in the derivation:

\ R_v=E\{ vv^\mathrm{H} \}\,

where v^\mathrm{H} denotes the conjugate transpose of v, and E denotes expectation. Let us call our output, y, the inner product of our filter and the observed signal such that

\ y = \sum_{k=-\infty}^{\infty} h^*[k] x[k] = h^\mathrm{H}x = h^\mathrm{H}s + h^\mathrm{H}v = y_s + y_v.

We now define the signal-to-noise ratio, which is our objective function, to be the ratio of the power of the output due to the desired signal to the power of the output due to the noise:

\mathrm{SNR} = \frac{|y_s|^2}{E\{|y_v|^2\}}.

We rewrite the above:

\mathrm{SNR} = \frac{|h^\mathrm{H}s|^2}{E\{|h^\mathrm{H}v|^2\}}.

We wish to maximize this quantity by choosing h. Expanding the denominator of our objective function, we have

\ E\{ |h^\mathrm{H}v|^2 \} = E\{ (h^\mathrm{H}v){(h^\mathrm{H}v)}^\mathrm{H} \} = h^\mathrm{H} E\{vv^\mathrm{H}\} h = h^\mathrm{H}R_vh.\,

Now, our \mathrm{SNR} becomes

\mathrm{SNR} = \frac{ |h^\mathrm{H}s|^2 }{ h^\mathrm{H}R_vh }.

We will rewrite this expression with some matrix manipulation. The reason for this seemingly counterproductive measure will become evident shortly. Exploiting the Hermitian symmetry of the covariance matrix R_v, we can write

\mathrm{SNR} = \frac{ | {(R_v^{1/2}h)}^\mathrm{H} (R_v^{-1/2}s) |^2 }
                  { {(R_v^{1/2}h)}^\mathrm{H} (R_v^{1/2}h) },

We would like to find an upper bound on this expression. To do so, we first recognize a form of the Cauchy-Schwarz inequality:

\ |a^\mathrm{H}b|^2 \leq (a^\mathrm{H}a)(b^\mathrm{H}b),\,

which is to say that the square of the inner product of two vectors can only be as large as the product of the individual inner products of the vectors. This concept returns to the intuition behind the matched filter: this upper bound is achieved when the two vectors a and b are parallel. We resume our derivation by expressing the upper bound on our \mathrm{SNR} in light of the geometric inequality above:

\mathrm{SNR} = \frac{ | {(R_v^{1/2}h)}^\mathrm{H} (R_v^{-1/2}s) |^2 }
                  { {(R_v^{1/2}h)}^\mathrm{H} (R_v^{1/2}h) }
             \frac{ \left[
             			{(R_v^{1/2}h)}^\mathrm{H} (R_v^{1/2}h)
             			{(R_v^{-1/2}s)}^\mathrm{H} (R_v^{-1/2}s)
             		\right] }
                  { {(R_v^{1/2}h)}^\mathrm{H} (R_v^{1/2}h) }.

Our valiant matrix manipulation has now paid off. We see that the expression for our upper bound can be greatly simplified:

\mathrm{SNR} = \frac{ | {(R_v^{1/2}h)}^\mathrm{H} (R_v^{-1/2}s) |^2 }
                  { {(R_v^{1/2}h)}^\mathrm{H} (R_v^{1/2}h) }
             \leq s^\mathrm{H} R_v^{-1} s.

We can achieve this upper bound if we choose,

\ R_v^{1/2}h = \alpha R_v^{-1/2}s

where \alpha is an arbitrary real number. To verify this, we plug into our expression for the output \mathrm{SNR}:

\mathrm{SNR} = \frac{ | {(R_v^{1/2}h)}^\mathrm{H} (R_v^{-1/2}s) |^2 }
                  { {(R_v^{1/2}h)}^\mathrm{H} (R_v^{1/2}h) }
           = \frac{ \alpha^2 | {(R_v^{-1/2}s)}^\mathrm{H} (R_v^{-1/2}s) |^2 }
                  { \alpha^2  {(R_v^{-1/2}s)}^\mathrm{H} (R_v^{-1/2}s) }
           = \frac{ | s^\mathrm{H} R_v^{-1} s |^2 }
                  { s^\mathrm{H} R_v^{-1} s }
           = s^\mathrm{H} R_v^{-1} s.

Thus, our optimal matched filter is

\ h = \alpha R_v^{-1}s.

We often choose to normalize the expected value of the power of the filter output due to the noise to unity. That is, we constrain

\ E\{ |y_v|^2 \} = 1.\,

This constraint implies a value of \alpha, for which we can solve:

\ E\{ |y_v|^2 \} = \alpha^2 s^\mathrm{H} R_v^{-1} s = 1,


\ \alpha = \frac{1}{\sqrt{s^\mathrm{H} R_v^{-1} s}},

giving us our normalized filter,

\ h = \frac{1}{\sqrt{s^\mathrm{H} R_v^{-1} s}} R_v^{-1}s.

If we care to write the impulse response of the filter for the convolution system, it is simply the complex conjugate time reversal of h.

Though we have derived the matched filter in discrete time, we can extend the concept to continuous-time systems if we replace R_v with the continuous-time autocorrelation function of the noise, assuming a continuous signal s(t), continuous noise v(t), and a continuous filter h(t).


My experiences at Directed Reading Program at Math Department at UT Austin

It has been several weeks from the perfect ending of the great Directed Reading Program sponsored by the Department of Mathematics at UT Austin. I am one of the final 20 presenters on 24/12/2014 who had shown through their presentations that they did make the best and most out of this educational resources.

I conducted my independent mini project under the patient guidance of Sona Akopian about the analysis/theories of wave equations, the numerical algorithms, and also the visualization– making a 2-D wave equation movie in Matlab.

Here are what I output. For further details? Stay tuned!

With CFL conditions. Initial equation as a sine equation

With CFL conditions. Initial equation as a sine equation

with CFL conditions initial equation as a parabola

with CFL conditions initial equation as a parabola

Without CFl conditions. it blows out.

Without CFl conditions. it blows out.

What is Reproducible Research and why it is important.

“A community involvement in actively maintaining reproducibility of previously published results assures that a body of knowledge in a computational field stays alive and can be scientifically extended though continuing contributions.”

–Sergey Fomel

“It is a big chore for one researcher to reproduce the analysis and computational results of another. I discovered that this problem has a simple technological solution: illustrations (figures) in a technical document are made by programs and command scripts that along with required data should be linked to the document itself.”

–Jon Claerbout

I’ve got little research experience particularly with large-scale computationally-based project until I attended the second Madagascar Working Workshop held by “The Rice Inversion Project” this summer in Houston, where I also first knew about an old but also new concept in natural science even recently in social science area– Reproducible Research.

So what is it, how has it been raising researchers’ and professionals’ interest, and why it is of increasing importance?

As the word “Reproducibility” denotes, it is the ability of an entire experiment or study to be reproduced, either by the researcher or by someone else working independently.(Wikipedia-Reproducibility). To put it more vividly, I believe “Reproducibility” is like the recipe you attached to your freshly-made-creative-savoury dish with which people can reproduce the dish at home and even add some new ingredients or spices as time flies. Then Reproducible Research is naturally one where researchers in particularly but not restricted to the field of computational science deliberately “serve” for the scientific community to provide the complete development environment and the complete set of instructions.

How did reproducibility raise professionals concerns? I want to mention a recent event in academia in biology, and also a situation I just run into this afternoon with my mentor Sona in our small computational research project about wave equation.


January 2014, two papers, whose lead author, Haruko Obokata, a young and beautiful Japanese female researcher in her early thirties, about a simple new method for creating stem cells were published in the prestigious journal Nature to much fanfare. It suddenly made a stir in media in Japan and beyond. However, these two papers were retracted in July after another researcher based in UC Berkeley concluding that her studies are dubious with the fact that his team had tried so many times but were still unable to reproduce the results. We can see the paramount importance of reproducibility in disciplines like biology, physics, mathematics whose quality of detailed deriving process determines the persuasion of authors’ achievements. In terms of scientific computation or the field of software development, both of which underline the final results and figures rather than the process, however, the significance of the implementation of reproducible-research frameworks stems from, first of all, the necessity for long-term maintenance of reproducible results. “Simple storage of software codes is useful but insufficient, because typical scientific codes have multiple dependencies( libraries, compliers, operating systems, etc.). With time, different parts of the software environment change and cause the reproducibility of previously published results to break down.” (“Reproducible research as a community effort: Lessons form the Madagascar project”–Sergey Fomel). Further, situations where researchers and publishers’ colleagues question part of their paper and look for backup details from them are too common and familiar to many of the professionals in the field, and their review and tracing back the right set of codes used in that paper and not to mention the right set of parameter they select to output the nice results takes these researchers a fairly big amount of time and thus decrease their research productivity or academic competitiveness. This is so true even for an undergraduate student who is doing her own independent project like me. This afternoon, I was debugging the codes my mentor sent me to refer to which can make a movie in 2-D about a stable moving wave. It’s a matlab code and I tried many times but still could not find what’s going wrong. I decided to check with google one command by one command. After tedious repetition of “copy and paste” on google I finally found out that it is because one of the file “avifile” has been removed from Matlab and I shall use “VideoWriter” class instead. My personal experience exemplifies that inevitable changes in the software environment easily cause breakdown and without dedicated and continuous maintenance, the computational results easily loose their credibility and practicability.

At the end of this blog, I’d like to provide two very useful and detailed website which people interested in the specific tools for implementation of Reproducibility can refer to afterwards:


http://www.ahay.org/wiki/Houston_2014 You’re also welcome to join the Madagascar open community, a great source comprised of three levels: programs, workflow scripts, and papers.