2.1 Differential corrections as an optimization algorithm

The principle of least squares assumes that a target function Q has to be minimized to find the nominal solution. The target function $Q=\xi\cdot\xi/m$ is formed with the sum of squares of the residuals $\xi$ , with $\xi \in \Re ^m$ . The residuals are normalized, as discussed in Section 3.3, thus Q is dimensionless. In our case, $m=2\cdot N_{obs}$ for $N_{obs}$ astrometric observations of two angular coordinates. The residuals are functions $\xi(X)$ of the estimated parameters $X\in \Re^N$ . In the simplest problems of orbit determination $N=6$ and X is some vector representing the orbital elements at some initial epoch $t_0$ : in this paper $X=(a,h,k,p,q,\lambda)$ are the equinoctial elements as defined in Paper I, Sect. 4.1. Some of the coordinates of the vector X, e.g. in our case the mean longitude $\lambda$ , are not real numbers but angles (defined $mod(2\pi)$ ), and this introduces some complications which will be noted later.
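As an illustration of the definitions above, the following minimal sketch evaluates $Q=\xi\cdot\xi/m$ for a short list of already normalized residuals, and reduces an angle difference to a single turn. The residual values are made up, and the helper names `target_function` and `wrap_angle` are hypothetical; wrapping into $[-\pi,\pi)$ is only one common way to handle an angle variable such as $\lambda$ , not necessarily the treatment adopted in this paper.

```python
import math


def wrap_angle(delta):
    """Reduce an angle difference (radians) to the interval [-pi, pi)."""
    return (delta + math.pi) % (2.0 * math.pi) - math.pi


def target_function(residuals):
    """Return Q = xi . xi / m for a flat list of normalized residuals."""
    m = len(residuals)
    return sum(r * r for r in residuals) / m


# Hypothetical normalized residuals for N_obs = 3 observations,
# two angular coordinates each, hence m = 2 * N_obs = 6.
xi = [0.5, -1.0, 0.25, 0.75, -0.5, 1.0]
Q = target_function(xi)

# A difference of almost one full turn is really a small negative offset.
small_offset = wrap_angle(2.0 * math.pi - 0.1)
```

Because the residuals are normalized, a value of Q of order one means that, in this made-up example, the residuals are on average comparable to the assumed observation errors.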

Thus the target function also depends upon X, and the minimum of Q(X) is obtained by solving the nonlinear equations:

$\begin{displaymath}{\displaystyle \partial Q \over \displaystyle \partial X}={\displaystyle 2 \over \displaystyle m}\; B^T\; \xi=\underline 0 \ \ \ ;\ \ \ B={\displaystyle \partial \xi \over \displaystyle \partial X}(X) \ . \end{displaymath}$

Now let the map between X and $\xi$ be linearized in a neighborhood of some point X^*:

$\begin{displaymath}\xi=B\; \Delta X \ \ \ ;\ \ \ \Delta X = X- X^* \end{displaymath}$

where the target function is approximated by a quadratic form

$\begin{displaymath}Q(X)= {\displaystyle 1 \over \displaystyle m}\; \Delta X^T \; B^T B\; \Delta X \ . \end{displaymath}$

The equations to be solved for the minimum are the normal equations

$\begin{displaymath}B^T B\; \Delta X=-B^T\; \xi \end{displaymath}$

with normal matrix C=B^T B and solution

$\begin{displaymath}\Delta X= -\Gamma\;B^T\;\xi \end{displaymath}$

computed with the covariance matrix $\Gamma=C^{-1}$ ; the inverse exists whenever C is positive-definite, which is generically the case for $m\geq N$ . We shall of course assume that the linearization is performed around the solution X^* such that $B^T\xi=\underline 0$ ; in the standard differential corrections procedure X^* is obtained by iterating the solution of the normal equations until convergence (pseudo-Newton method). For a standard reference on differential corrections, see [Cappellari et al., 1976].
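A single correction step can be sketched as follows: form the normal matrix $C=B^T B$ and the right-hand side $-B^T\xi$ , then solve for $\Delta X$ . The sketch is written in plain Python with a small Gaussian elimination routine; the function names `solve_linear` and `normal_equations_step`, and the straight-line toy problem used to exercise them, are assumptions made for illustration and are not taken from the programs discussed in this paper.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[piv][col] == 0.0:
            raise ValueError("normal matrix is singular")
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def normal_equations_step(B, xi):
    """Return (dX, C) where C = B^T B and C dX = -B^T xi."""
    m, n = len(B), len(B[0])
    C = [[sum(B[k][i] * B[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    rhs = [-sum(B[k][i] * xi[k] for k in range(m)) for i in range(n)]
    return solve_linear(C, rhs), C


# Toy problem (hypothetical): residuals xi_i = (X0 + X1 * t_i) - y_i,
# with data generated from X = (1, 2) and a first guess X = (0, 0).
t = [0.0, 1.0, 2.0, 3.0]
y = [1.0 + 2.0 * ti for ti in t]
X = [0.0, 0.0]
xi = [(X[0] + X[1] * ti) - yi for ti, yi in zip(t, y)]
B = [[1.0, ti] for ti in t]  # partial derivatives of xi with respect to X

dX, C = normal_equations_step(B, xi)
```

Since the toy residuals are linear in X, one step already lands on the generating values; for the nonlinear residuals of orbit determination the step has to be iterated.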

Note that applying a single iteration of differential corrections, or even any fixed number of iterations, is not enough to guarantee convergence; an iterative scheme with tight convergence control needs to be used. As an example, in our programs the convergence is controlled by requiring that the correction norm

$\begin{displaymath}\vert\vert\Delta X\vert\vert= \sqrt{\Delta X \cdot C\, \Delta X /N}\; < \epsilon \qquad (1) \end{displaymath}$

is below a small control value $\epsilon$ before the iterative procedure is stopped; $\epsilon=10^{-5}$ is used in most cases.
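The iteration with the convergence control of eq. (1) can be sketched on a small nonlinear problem. The example below fits a two-parameter exponential decay, so that $N=2$ and the normal equations can be solved in closed form; the model, the data, the first guess and the names `residuals_and_design` and `differential_corrections` are all hypothetical, with unit weights assumed for the residuals. It is a toy illustration of the scheme, not the implementation used in our programs.

```python
import math


def residuals_and_design(X, t, y):
    """Residuals xi_i = a*exp(-b*t_i) - y_i and their partials B."""
    a, b = X
    xi = [a * math.exp(-b * ti) - yi for ti, yi in zip(t, y)]
    B = [[math.exp(-b * ti), -a * ti * math.exp(-b * ti)] for ti in t]
    return xi, B


def differential_corrections(X0, t, y, eps=1e-5, max_iter=20):
    """Iterate the normal equations until the norm of eq. (1) is below eps."""
    X = list(X0)
    N = len(X)
    for iteration in range(1, max_iter + 1):
        xi, B = residuals_and_design(X, t, y)
        # Normal matrix C = B^T B and vector D = B^T xi (2 x 2 case).
        c11 = sum(r[0] * r[0] for r in B)
        c12 = sum(r[0] * r[1] for r in B)
        c22 = sum(r[1] * r[1] for r in B)
        d1 = sum(r[0] * e for r, e in zip(B, xi))
        d2 = sum(r[1] * e for r, e in zip(B, xi))
        det = c11 * c22 - c12 * c12
        if det <= 0.0:
            raise ValueError("normal matrix is not positive-definite")
        # Solve C dX = -D.
        dx1 = (-d1 * c22 + d2 * c12) / det
        dx2 = (-d2 * c11 + d1 * c12) / det
        X[0] += dx1
        X[1] += dx2
        # Correction norm sqrt(dX . C dX / N); guard against rounding.
        quad = dx1 * (c11 * dx1 + c12 * dx2) + dx2 * (c12 * dx1 + c22 * dx2)
        norm = math.sqrt(max(0.0, quad) / N)
        if norm < eps:
            return X, iteration, True
    return X, max_iter, False


# Synthetic, noise-free data generated from a = 2, b = 0.5.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 * math.exp(-0.5 * ti) for ti in t]
X_fit, n_iter, converged = differential_corrections([1.8, 0.45], t, y)
```

The loop stops on the size of the correction rather than after a preset number of steps, and it reports failure explicitly when `max_iter` is exhausted, so that a non-converged result is not mistaken for a nominal solution.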

Andrea Milani
2000-06-21