A question about Sigma points in SPKF


7 replies to this topic

#1 Tipou

    New Member

  • Members
  • 3 posts
  • Country: United States

Posted 28 August 2011 - 11:52 PM

Hi,

First, please excuse me if this question has already been answered. I looked through several topics but found nothing about my problem.
I would like to understand why it is necessary to use a set of 2n+1 sigma points (SP).
It seems to me that 2n sigma points would be enough: the central sigma point, which corresponds to the mean of the prior state estimate, can be inferred from the others because the SP are distributed symmetrically around the mean.
Take an example in 1D:
x is the state before propagation and sigma² is its variance. If we have only 2 SP, x1=x-sigma and x2=x+sigma, then the mean x can be recovered as (x1+x2)/2, so there is no need to sample 3 SP as described in the SPKF algorithm. The variance can likewise be recovered as [(x1-x)(x1-x)+(x2-x)(x2-x)]/2. The mean and variance are therefore already described with only 2 SP instead of 3.

Maybe using 2n+1 SP instead of only 2n is easier and faster, and that is why it has been implemented this way. Could someone please confirm this? (Or, on the contrary, explain why I am wrong...)

Thanks

Edited by Tipou, 29 August 2011 - 12:13 AM.
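
The 1D arithmetic in this post can be checked with a few lines of Python. This is only a minimal sketch: the values x = 2.0 and sigma = 0.5 are hypothetical and do not come from the thread, and the helper name `mean_and_variance` is made up for illustration.

```python
def mean_and_variance(points):
    """Equal-weight mean and variance of a list of 1D sigma points."""
    mean = sum(points) / len(points)
    variance = sum((p - mean) ** 2 for p in points) / len(points)
    return mean, variance


# Hypothetical prior: mean x and standard deviation sigma (assumed values).
x, sigma = 2.0, 0.5

# Only two symmetric sigma points, no central point.
two_points = [x - sigma, x + sigma]
mean, variance = mean_and_variance(two_points)

print(mean)      # 2.0, the prior mean
print(variance)  # 0.25, i.e. sigma ** 2
```

Before any nonlinear propagation, the two symmetric points do reproduce the prior mean and variance, which is the observation the question starts from.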


#2 Haldir

    Advanced Member

  • Members
  • 38 posts
  • Country: Germany


Posted 29 August 2011 - 06:41 AM

As far as I know, van der Merwe used the 2n+1 sigma point selection scheme in his Ph.D. thesis because it is the most obvious and the easiest one.
Julier et al. reduced the count to n+2 with the spherical simplex selection scheme, and for a few special cases even n+1 is possible.

#3 noether

    New Member

  • Members
  • 1 post
  • Country: Spain

Posted 29 August 2011 - 01:04 PM

Hello Tipou,

You are 50% right: the sigma points BEFORE the nonlinear process are symmetric. But after the nonlinear function (prediction or update stage) your points are no longer symmetric, and that is where you have to capture the statistics of your variable (i.e. in the filter), NOT before the nonlinear process. Capturing them only beforehand makes no sense, because the nonlinear process is what you are interested in.
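
A small sketch of this point, assuming a hypothetical nonlinear process f(x) = x² and hypothetical values m = 1.0, s = 0.5 (neither is from the thread). Before the transform the central point is the midpoint of the outer two; after the transform it is not, so it can no longer be inferred from them.

```python
def f(x):
    """Hypothetical nonlinear process, chosen only for illustration."""
    return x ** 2


m, s = 1.0, 0.5                    # assumed prior mean and standard deviation
before = [m - s, m, m + s]         # symmetric sigma points
after = [f(p) for p in before]     # the same points after the nonlinear process

midpoint_before = (before[0] + before[2]) / 2
midpoint_after = (after[0] + after[2]) / 2

print(before[1], midpoint_before)  # 1.0 1.0  -> central point equals the midpoint
print(after[1], midpoint_after)    # 1.0 1.25 -> central point is no longer the midpoint
```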

#4 odtu_ee

    Member

  • Members
  • 13 posts
  • Country: Turkey

Posted 29 August 2011 - 06:58 PM

You can use as many points as you want, as long as the mean and covariance of your sigma points are equal to the mean and covariance of the underlying Gaussian distribution.

That is the reason why there are different sigma point generation methods in the literature. These methods are then usually compared on how closely the higher-order statistics of the sigma points match those of the Gaussian distribution.

But this is also problematic, because if the real distribution is not Gaussian, those comparisons do not mean anything.

As a result, use whichever sigma point selection you feel comfortable with. In the INS case, people usually prefer the spherical simplex method simply because it needs only n+2 points.

The real issue with sigma point filters for navigation systems lies in the attitude computation. So far, I have not seen any serious paper analyzing sigma point methods with respect to the attitude parametrization. As you might have already noticed, attitude is not a vector field, which means that its mean and covariance cannot be computed as sums of realizations. Furthermore, attitude cannot be represented as Gaussian. That is why people always make up some ad hoc procedure to apply the sigma point concept to the attitude states.

Considering this arbitrariness in the attitude states, you will realize that analyzing and comparing different sigma point selection methods is a complete waste of time. Just select one of the published methods and stick with it. You won't see any performance improvement from changing your sigma point selection method.
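
The moment-matching condition stated at the start of this post can be sketched in plain Python. The code below builds the symmetric 2n+1 set with the kappa weighting of the basic unscented transform and checks that its weighted mean and covariance reproduce the inputs. The test mean, covariance and kappa values are hypothetical, and the function names are made up for illustration. With kappa = 0 the central point gets zero weight, so the set is effectively 2n points and the first two moments still match.

```python
import math


def cholesky(P):
    """Lower-triangular L with L L^T = P, for symmetric positive-definite P."""
    n = len(P)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(P[i][i] - s)
            else:
                L[i][j] = (P[i][j] - s) / L[j][j]
    return L


def symmetric_sigma_points(mean, P, kappa):
    """Mean plus +/- scaled columns of the Cholesky factor, with weights."""
    n = len(mean)
    L = cholesky(P)
    scale = math.sqrt(n + kappa)
    points = [list(mean)]
    weights = [kappa / (n + kappa)]
    for j in range(n):
        col = [scale * L[i][j] for i in range(n)]
        points.append([m + c for m, c in zip(mean, col)])
        points.append([m - c for m, c in zip(mean, col)])
        weights += [0.5 / (n + kappa)] * 2
    return points, weights


def weighted_moments(points, weights):
    """Weighted mean vector and covariance matrix of a sigma point set."""
    n = len(points[0])
    mean = [sum(w * p[i] for w, p in zip(weights, points)) for i in range(n)]
    cov = [[sum(w * (p[i] - mean[i]) * (p[j] - mean[j])
                for w, p in zip(weights, points))
            for j in range(n)] for i in range(n)]
    return mean, cov


def max_moment_error(mean, P, kappa):
    """Largest absolute mismatch between sigma point moments and (mean, P)."""
    points, weights = symmetric_sigma_points(mean, P, kappa)
    m_hat, P_hat = weighted_moments(points, weights)
    n = len(mean)
    err_mean = max(abs(m_hat[i] - mean[i]) for i in range(n))
    err_cov = max(abs(P_hat[i][j] - P[i][j])
                  for i in range(n) for j in range(n))
    return max(err_mean, err_cov)


# Hypothetical 2D Gaussian (values assumed for the example).
mean = [1.0, -2.0]
P = [[4.0, 1.2], [1.2, 1.0]]
for kappa in (0.0, 1.0):
    print(kappa, max_moment_error(mean, P, kappa))  # errors at rounding level
```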

#5 Haldir

    Advanced Member

  • Members
  • 38 posts
  • Country: Germany


Posted 29 August 2011 - 07:15 PM

As odtu_ee explains, it does not matter that much which sigma point scheme you use. One of the most common is the spherical simplex set, which needs n+2 sigma points. Another capable one is Schmidt-orthogonal, but I don't know of a square-root version of it.
There are a few computational complexity comparisons here: http://www.hindawi.c...pe/2009/507370/

#6 Tipou

    New Member

  • Members
  • 3 posts
  • Country: United States

Posted 30 August 2011 - 11:29 AM

Thank you for your answers, guys!
Sigma point filtering methods are clearer to me now. I had read several papers that introduce a cost function meant to determine the minimal number of sigma points, and I really wanted to understand what it meant. I now realize that this cost function was introduced to improve the match of the higher-order moments, as odtu_ee said.

I noticed that in “Unscented Filtering and Nonlinear Estimation”, Julier and Uhlmann first use 2N symmetric SPs. But this set of SPs does not carry enough information about the higher-order moments, so an extra SP (corresponding to the mean) is added. The new set of 2N+1 SPs allows a scaling parameter to be introduced, which gives a degree of freedom for tuning the filter: it makes it possible to control some aspects of the higher-order moments without modifying the mean and covariance.

Van der Merwe illustrates this with the nonlinear function f(x)=x². A set of 2N+1 SPs lets you match the fourth-order moment (kurtosis), whereas this is impossible with only 2N SPs.
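
The f(x)=x² case can be reproduced numerically. The sketch below assumes a 1D Gaussian prior with hypothetical values m = 1.0 and s = 0.5, and uses the kappa weighting of the basic unscented transform with n + kappa = 3. That choice is an assumption made for this example and is not taken from the thread. For a Gaussian input, the exact mean of x² is m² + s² and the exact variance is 4m²s² + 2s⁴; the 2s⁴ term depends on the fourth moment of x.

```python
import math


def propagate(points, weights, f):
    """Weighted mean and variance of f applied to 1D sigma points."""
    ys = [f(p) for p in points]
    mean = sum(w * y for w, y in zip(weights, ys))
    var = sum(w * (y - mean) ** 2 for w, y in zip(weights, ys))
    return mean, var


def f(x):
    return x ** 2


m, s = 1.0, 0.5                      # assumed prior mean and standard deviation

# 2N set: two symmetric points with equal weights.
pts2 = [m - s, m + s]
w2 = [0.5, 0.5]

# 2N+1 set: central point added, spread and weights set by kappa.
n, kappa = 1, 2.0                    # n + kappa = 3 (assumed tuning)
spread = math.sqrt(n + kappa) * s
pts3 = [m, m - spread, m + spread]
w3 = [kappa / (n + kappa)] + [0.5 / (n + kappa)] * 2

true_mean = m ** 2 + s ** 2                      # 1.25
true_var = 4 * m ** 2 * s ** 2 + 2 * s ** 4      # 1.125

mean2, var2 = propagate(pts2, w2, f)
mean3, var3 = propagate(pts3, w3, f)
print(true_mean, true_var)
print(mean2, var2)   # mean is exact, variance misses the 2*s**4 term
print(mean3, var3)   # both agree with the exact values up to rounding
```

Both sets recover the mean of x², but only the 2N+1 set with the extra tuning parameter also recovers its variance in this example.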

odtu_ee, could you explain what you mean by saying that attitude is not a vector field? I am aware that nonlinearities will occur, leading to a non-Gaussian distribution, but I do not see the link with the vector field. I am currently focused on sigma point filtering, which is totally new to me, and I may find the answer myself later when I face the problem, but I still prefer to ask.

If I have new questions about INS or SPKF, I will be glad to ask you again.

Edited by Tipou, 31 August 2011 - 11:43 AM.


#7 odtu_ee

    Member

  • Members
  • 13 posts
  • Country: Turkey

Posted 31 August 2011 - 07:43 PM

Tipou, on 30 August 2011 - 11:29 AM, said:

odtu_ee, could you explain what you mean by saying that attitude is not a vector field?

Suppose a and b are two unit quaternions representing attitudes. Then c=a+b is in general not a unit quaternion (its norm is generally not 1), so it does not represent an attitude.
Therefore, attitude quaternions do not form a vector space.
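
A quick numerical illustration, assuming the (w, x, y, z) component order (a convention chosen for this sketch, not stated in the thread): the sum of two unit quaternions, and also their plain average, leaves the unit sphere.

```python
import math


def norm(q):
    """Euclidean norm of a quaternion given as a 4-tuple."""
    return math.sqrt(sum(c * c for c in q))


a = (1.0, 0.0, 0.0, 0.0)   # identity rotation
b = (0.0, 1.0, 0.0, 0.0)   # 180 degree rotation about the x axis

c = tuple(ai + bi for ai, bi in zip(a, b))    # plain sum
avg = tuple(ci / 2 for ci in c)               # plain arithmetic mean

print(norm(a), norm(b))    # 1.0 1.0
print(norm(c))             # about 1.414, not a unit quaternion
print(norm(avg))           # about 0.707, not a unit quaternion either
```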

The implication of this fact for the sigma point filters can also be easily noticed. Let x1...xn be n sigma points representing the distribution of the quaternion. How can you compute the mean of these sigma points (quaternions)?

A variety of procedures to remedy this problem are described in the papers. For instance, one method is as follows:
1. assume x1 is the mean (call this mean "m")
2. compute the rotation vectors between x1...xn and "m".
3. compute the algebraic mean of the rotation vectors (call this "rv")
4. compute the new mean as m=q(rv)*m, where q(rv) is the quaternion representation of "rv".
5. go to step 2 if norm(derivative(rv)) > threshold, else quit

This is just one method. I am quite sure that there are much better methods in the literature, such as the ones that use Rodrigues parameters. I am not in a position to advise you on a specific method. There are lots of papers about this and I have read only a few of them. (Also, each paper that I read claims its method is the best without providing a convincing comparison.)
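
The five steps above can be sketched in plain Python. Several things here are assumptions made for this sketch and are not stated in the post: the (w, x, y, z) component order, an unweighted mean of the rotation vectors, and a stopping test on norm(rv) itself (the size of the last correction) standing in for the "derivative" criterion. All function names are made up for illustration.

```python
import math


def q_mul(a, b):
    """Hamilton product of quaternions in (w, x, y, z) order."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def q_conj(q):
    """Conjugate, which is the inverse for a unit quaternion."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def q_to_rotvec(q):
    """Rotation vector (axis times angle) of a unit quaternion."""
    w, x, y, z = q
    if w < 0:                       # q and -q are the same rotation
        w, x, y, z = -w, -x, -y, -z
    v = math.sqrt(x * x + y * y + z * z)
    if v < 1e-12:
        return (0.0, 0.0, 0.0)
    angle = 2.0 * math.atan2(v, w)
    return (x / v * angle, y / v * angle, z / v * angle)


def rotvec_to_q(rv):
    """Unit quaternion q(rv) for a rotation vector rv."""
    angle = math.sqrt(sum(c * c for c in rv))
    if angle < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(angle / 2) / angle
    return (math.cos(angle / 2), rv[0] * s, rv[1] * s, rv[2] * s)


def quaternion_mean(quats, threshold=1e-10, max_iter=50):
    m = quats[0]                                        # step 1
    for _ in range(max_iter):
        # step 2: rotation vectors from m to each sigma point
        errs = [q_to_rotvec(q_mul(q, q_conj(m))) for q in quats]
        # step 3: arithmetic mean of the rotation vectors
        rv = tuple(sum(e[i] for e in errs) / len(errs) for i in range(3))
        # step 4: m = q(rv) * m
        m = q_mul(rotvec_to_q(rv), m)
        # step 5: stop once the correction is small (assumed criterion)
        if math.sqrt(sum(c * c for c in rv)) <= threshold:
            break
    return m


def about_z(deg):
    """Unit quaternion for a rotation of deg degrees about the z axis."""
    half = math.radians(deg) / 2
    return (math.cos(half), 0.0, 0.0, math.sin(half))


# Hypothetical sigma points: rotations of 10, 20 and 30 degrees about z.
mean_q = quaternion_mean([about_z(10), about_z(20), about_z(30)])
print(math.degrees(q_to_rotvec(mean_q)[2]))   # close to 20 degrees
```

For rotations about a common axis, as in this usage example, the iteration settles after two passes; for general attitudes more passes may be needed.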

PS: The pdf of attitude states must have limited support (e.g. a uniform density with support [0, 2π]). A Gaussian density has infinite support. That is why attitude states cannot be assumed Gaussian, regardless of any nonlinearities.

#8 Tipou

    New Member

  • Members
  • 3 posts
  • Country: United States

Posted 15 September 2011 - 11:57 AM

Thank you for these details. It is clear to me now ;)