#LyX 2.1 created this file. For more info see http://www.lyx.org/
\lyxformat 474
\begin_document
\begin_header
\textclass article
\begin_preamble
\usepackage{url}
\usepackage{hyperref}
\end_preamble
\use_default_options true
\maintain_unincluded_children false
\language english
\language_package default
\inputencoding auto
\fontencoding global
\font_roman default
\font_sans default
\font_typewriter default
\font_math auto
\font_default_family default
\use_non_tex_fonts false
\font_sc false
\font_osf false
\font_sf_scale 100
\font_tt_scale 100
\graphics default
\default_output_format default
\output_sync 0
\bibtex_command default
\index_command default
\paperfontsize default
\spacing single
\use_hyperref false
\papersize default
\use_geometry false
\use_package amsmath 1
\use_package amssymb 2
\use_package cancel 1
\use_package esint 1
\use_package mathdots 1
\use_package mathtools 1
\use_package mhchem 1
\use_package stackrel 0
\use_package stmaryrd 1
\use_package undertilde 1
\cite_engine basic
\cite_engine_type default
\biblio_style plain
\use_bibtopic false
\use_indices false
\paperorientation portrait
\suppress_date false
\justification true
\use_refstyle 1
\index Index
\shortcut idx
\color #008000
\end_index
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\paragraph_indentation default
\quotes_language english
\papercolumns 1
\papersides 1
\paperpagestyle default
\tracking_changes false
\output_changes false
\html_math_output 0
\html_css_as_file 0
\html_be_strict false
\end_header
\begin_body
\begin_layout Standard
\begin_inset FormulaMacro
\newcommand{\SE}[1]{\mathbb{SE}\left(#1\right)}
{\mathbb{SE}\left(#1\right)}
\end_inset
\end_layout
\begin_layout Standard
\begin_inset FormulaMacro
\newcommand{\se}[1]{\mathfrak{se}\left(#1\right)}
{\mathfrak{se}\left(#1\right)}
\end_inset
\end_layout
\begin_layout Standard
\begin_inset FormulaMacro
\newcommand{\SO}[1]{\mathbb{SO}\left(#1\right)}
{\mathbb{SO}\left(#1\right)}
\end_inset
\end_layout
\begin_layout Standard
\begin_inset FormulaMacro
\newcommand{\so}[1]{\mathfrak{so}\left(#1\right)}
{\mathfrak{so}\left(#1\right)}
\end_inset
\end_layout
\begin_layout Standard
\begin_inset FormulaMacro
\newcommand{\R}[1]{\mathbb{R}^{#1}}
{\mathbb{R}^{#1}}
\end_inset
\end_layout
\begin_layout Standard
\begin_inset FormulaMacro
\newcommand{\prob}[2]{#1\hspace{0.1em}|\hspace{0.1em}#2}
{#1\hspace{0.1em}|\hspace{0.1em}#2}
\end_inset
\end_layout
\begin_layout Standard
\begin_inset FormulaMacro
\newcommand{\norm}[1]{\left\Vert #1\right\Vert }
{\left\Vert #1\right\Vert }
\end_inset
\end_layout
\begin_layout Standard
\begin_inset FormulaMacro
\newcommand{\t}{\mathsf{T}}
{\mathsf{T}}
\end_inset
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
newcommand{
\backslash
smallequals}{
\backslash
mbox{
\backslash
raisebox{0.2ex}{
\backslash
fontsize{8}{10}
\backslash
selectfont $=$}}}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\begin_inset FormulaMacro
\newcommand{\smeq}{\smallequals}
{{\scriptstyle =}}
\end_inset
\end_layout
\begin_layout Standard
\begin_inset FormulaMacro
\newcommand{\th}[1]{#1^{\mathrm{th}}}
{#1^{\mathrm{th}}}
\end_inset
\end_layout
\begin_layout Standard
\begin_inset FormulaMacro
\newcommand{\defeq}{\stackrel{\Delta}{=}}
{\stackrel{\Delta}{=}}
\end_inset
\end_layout
\begin_layout Standard
\begin_inset FormulaMacro
\newcommand{\im}{\mathcal{I}}
{\mathcal{I}}
\end_inset
\end_layout
\begin_layout Standard
\begin_inset FormulaMacro
\newcommand{\lin}[1]{\overset{{\scriptscriptstyle \circ}}{#1}}
{\overset{{\scriptscriptstyle \circ}}{#1}}
\end_inset
\end_layout
\begin_layout Standard
\begin_inset FormulaMacro
\newcommand{\lins}[3]{\overset{{\scriptscriptstyle \circ}}{#1}\vphantom{#1}_{#3}^{#2}}
{\overset{{\scriptscriptstyle \circ}}{#1}\vphantom{#1}_{#3}^{#2}}
\end_inset
\end_layout
\begin_layout Section
Overview of Trust-region Methods
\end_layout
\begin_layout Standard
For nice figures, see
\begin_inset space ~
\end_inset
\begin_inset CommandInset citation
LatexCommand cite
key "Hauser06lecture"
\end_inset
.
\end_layout
\begin_layout Standard
We deal here with only a small subset of trust-region methods: specifically,
 approximating the cost function as a quadratic using Newton's method, and
 solving the resulting subproblem with the Dogleg method, to be extended
 later with Steihaug's method.
\end_layout
\begin_layout Standard
The overall goal of a nonlinear optimization method is to iteratively find
a local minimum of a nonlinear function
\begin_inset Formula
\[
\hat{x}=\arg\min_{x}f\left(x\right)
\]
\end_inset
where
\begin_inset Formula $f\left(x\right)\in\mathbb{R}$
\end_inset
is a scalar-valued cost function.
In GTSAM, the variables
\begin_inset Formula $x$
\end_inset
could be manifold or Lie group elements, so in this document we only work
with
\emph on
increments
\emph default
\begin_inset Formula $\delta x\in\R n$
\end_inset
in the tangent space.
In this document we specifically deal with
\emph on
trust-region
\emph default
methods, which at every iteration attempt to find a good increment
\begin_inset Formula $\norm{\delta x}\leq\Delta$
\end_inset
within the
\begin_inset Quotes eld
\end_inset
trust radius
\begin_inset Quotes erd
\end_inset
\begin_inset Formula $\Delta$
\end_inset
.
\end_layout
\begin_layout Standard
Further, most nonlinear optimization methods, including trust-region methods,
 work with an approximation of the problem at every iteration.
 Although there are other choices (such as quasi-Newton approximations),
 the Newton's-method approximation is, given an estimate
\begin_inset Formula $x^{\left(k\right)}$
\end_inset
of the variables
\begin_inset Formula $x$
\end_inset
,
\begin_inset Formula
\begin{equation}
f\left(x^{\left(k\right)}\oplus\delta x\right)\approx M^{\left(k\right)}\left(\delta x\right)=f^{\left(k\right)}+g^{\left(k\right)\t}\delta x+\frac{1}{2}\delta x^{\t}G^{\left(k\right)}\delta x\text{,}\label{eq:M-approx}
\end{equation}
\end_inset
where
\begin_inset Formula $f^{\left(k\right)}=f\left(x^{\left(k\right)}\right)$
\end_inset
is the function at
\begin_inset Formula $x^{\left(k\right)}$
\end_inset
,
\begin_inset Formula $g^{\left(k\right)}=\left.\frac{\partial f}{\partial x}\right|_{x^{\left(k\right)}}$
\end_inset
is its gradient, and
\begin_inset Formula $G^{\left(k\right)}=\left.\frac{\partial^{2}f}{\partial x^{2}}\right|_{x^{\left(k\right)}}$
\end_inset
is its Hessian (or an approximation of the Hessian).
\end_layout
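\begin_layout Standard
To make the model concrete, the following is a minimal sketch of evaluating
\begin_inset Formula $M^{\left(k\right)}$
\end_inset
for a given increment, using Eigen dense types (the function name is illustrative,
 not part of GTSAM's API):
\end_layout
\begin_layout LyX-Code
#include <Eigen/Dense>
\end_layout
\begin_layout LyX-Code
\end_layout
\begin_layout LyX-Code
// Evaluate the quadratic model M(dx) = f + g'dx + 0.5*dx'G dx.
\end_layout
\begin_layout LyX-Code
double modelValue(double f, const Eigen::VectorXd& g,
\end_layout
\begin_layout LyX-Code
                  const Eigen::MatrixXd& G, const Eigen::VectorXd& dx) {
\end_layout
\begin_layout LyX-Code
  return f + g.dot(dx) + 0.5 * dx.dot(G * dx);
\end_layout
\begin_layout LyX-Code
}
\end_layout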
\begin_layout Standard
Trust-region methods adaptively adjust the trust radius
\begin_inset Formula $\Delta$
\end_inset
so that within it,
\begin_inset Formula $M$
\end_inset
is a good approximation of
\begin_inset Formula $f$
\end_inset
, and they never step beyond the trust radius in any iteration.
When the true minimum is within the trust region, they converge quadratically
like Newton's method.
At each iteration
\begin_inset Formula $k$
\end_inset
, they solve the
\emph on
trust-region subproblem
\emph default
to find a proposed update
\begin_inset Formula $\delta x$
\end_inset
inside the trust radius
\begin_inset Formula $\Delta$
\end_inset
, which decreases the approximate function
\begin_inset Formula $M^{\left(k\right)}$
\end_inset
as much as possible.
The proposed update is only accepted if the true function
\begin_inset Formula $f$
\end_inset
decreases as well.
If the decrease of
\begin_inset Formula $M$
\end_inset
matches the decrease of
\begin_inset Formula $f$
\end_inset
well, the size of the trust region is increased, while if the match is
not close the trust region size is decreased.
\end_layout
\begin_layout Standard
Minimizing Eq.
\begin_inset space ~
\end_inset
\begin_inset CommandInset ref
LatexCommand ref
reference "eq:M-approx"
\end_inset
within the trust region is itself a nontrivial optimization problem, so
 there are various methods for solving it approximately, including Dogleg
 and Steihaug's method.
\end_layout
\begin_layout Section
Adapting the Trust Region Size
\end_layout
\begin_layout Standard
As mentioned in the previous section, we increase the trust region size
if the decrease in the model function
\begin_inset Formula $M$
\end_inset
matches the decrease in the true cost function
\begin_inset Formula $f$
\end_inset
very closely, and decrease it if they do not match closely.
The closeness of this match is measured with the
\emph on
gain ratio
\emph default
,
\begin_inset Formula
\[
\rho=\frac{f\left(x\right)-f\left(x\oplus\delta x_{d}\right)}{M\left(0\right)-M\left(\delta x_{d}\right)}\text{,}
\]
\end_inset
where
\begin_inset Formula $\delta x_{d}$
\end_inset
is the proposed dogleg step to be introduced next.
The decrease in the model function is always non-negative, and as the decrease
in
\begin_inset Formula $f$
\end_inset
approaches it,
\begin_inset Formula $\rho$
\end_inset
approaches
\begin_inset Formula $1$
\end_inset
.
If the true cost function increases,
\begin_inset Formula $\rho$
\end_inset
will be negative, and if the true cost function decreases even more than
predicted by
\begin_inset Formula $M$
\end_inset
, then
\begin_inset Formula $\rho$
\end_inset
will be greater than
\begin_inset Formula $1$
\end_inset
.
A typical update rule, as per Lec.
7-1.2 of
\begin_inset CommandInset citation
LatexCommand cite
key "Hauser06lecture"
\end_inset
, is:
\begin_inset Formula
\[
\Delta_{k+1}\leftarrow\begin{cases}
\Delta_{k}/4\text{,} & \rho<0.25\\
\min\left(2\Delta_{k},\Delta_{\max}\right)\text{,} & \rho>0.75\\
\Delta_{k}\text{,} & 0.25\leq\rho\leq0.75
\end{cases}
\]
\end_inset
where
\begin_inset Formula $\Delta_{k}\triangleq\norm{\delta x_{d}}$
\end_inset
.
Note that the rule is designed to ensure that
\begin_inset Formula $\Delta_{k}$
\end_inset
never exceeds the maximum trust region size
\begin_inset Formula $\Delta_{\max}$
\end_inset
.
\end_layout
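\begin_layout Standard
As an illustrative sketch (not GTSAM's actual implementation), this update
 rule can be written as:
\end_layout
\begin_layout LyX-Code
#include <algorithm>
\end_layout
\begin_layout LyX-Code
\end_layout
\begin_layout LyX-Code
// Adapt the trust radius given the gain ratio rho and the cap deltaMax.
\end_layout
\begin_layout LyX-Code
double updateTrustRadius(double delta, double rho, double deltaMax) {
\end_layout
\begin_layout LyX-Code
  if (rho < 0.25) return delta / 4.0;  // poor match: shrink
\end_layout
\begin_layout LyX-Code
  if (rho > 0.75) return std::min(2.0 * delta, deltaMax);  // good match: grow
\end_layout
\begin_layout LyX-Code
  return delta;  // acceptable match: keep the radius
\end_layout
\begin_layout LyX-Code
}
\end_layout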
\begin_layout Section
Dogleg
\end_layout
\begin_layout Standard
Dogleg approximately minimizes Eq.
\begin_inset space ~
\end_inset
\begin_inset CommandInset ref
LatexCommand ref
reference "eq:M-approx"
\end_inset
within the trust region by considering three possibilities defined by two
 points: the minimizer
\begin_inset Formula $\delta x_{u}^{\left(k\right)}$
\end_inset
of
\begin_inset Formula $M^{\left(k\right)}$
\end_inset
along the negative gradient direction
\begin_inset Formula $-g^{\left(k\right)}$
\end_inset
, and the overall Newton's method minimizer
\begin_inset Formula $\delta x_{n}^{\left(k\right)}$
\end_inset
of
\begin_inset Formula $M^{\left(k\right)}$
\end_inset
.
When the Hessian
\begin_inset Formula $G^{\left(k\right)}$
\end_inset
is positive definite, the magnitude of
\begin_inset Formula $\delta x_{u}^{\left(k\right)}$
\end_inset
is always less than that of
\begin_inset Formula $\delta x_{n}^{\left(k\right)}$
\end_inset
, meaning that the Newton's method step is
\begin_inset Quotes eld
\end_inset
more adventurous
\begin_inset Quotes erd
\end_inset
.
How much we step towards the Newton's method point depends on the trust
region size:
\end_layout
\begin_layout Enumerate
If the trust radius is smaller than
\begin_inset Formula $\norm{\delta x_{u}^{\left(k\right)}}$
\end_inset
, we step in the negative gradient direction but only by the trust radius.
\end_layout
\begin_layout Enumerate
If the trust region boundary lies between
\begin_inset Formula $\norm{\delta x_{u}^{\left(k\right)}}$
\end_inset
and
\begin_inset Formula $\norm{\delta x_{n}^{\left(k\right)}}$
\end_inset
, we step to the point where the line segment from
\begin_inset Formula $\delta x_{u}^{\left(k\right)}$
\end_inset
to
\begin_inset Formula $\delta x_{n}^{\left(k\right)}$
\end_inset
intersects the trust region boundary.
\end_layout
\begin_layout Enumerate
If the trust radius is larger than
\begin_inset Formula $\norm{\delta x_{n}^{\left(k\right)}}$
\end_inset
, we step to
\begin_inset Formula $\delta x_{n}^{\left(k\right)}$
\end_inset
.
\end_layout
\begin_layout Standard
To find the intersection of the line between
\begin_inset Formula $\delta x_{u}^{\left(k\right)}$
\end_inset
and
\begin_inset Formula $\delta x_{n}^{\left(k\right)}$
\end_inset
with the trust region boundary, we solve for the roots of a quadratic,
\begin_inset Formula
\begin{align*}
\Delta & =\norm{\left(1-\tau\right)\delta x_{u}+\tau\delta x_{n}}\\
\Delta^{2} & =\left(1-\tau\right)^{2}\delta x_{u}^{\t}\delta x_{u}+2\tau\left(1-\tau\right)\delta x_{u}^{\t}\delta x_{n}+\tau^{2}\delta x_{n}^{\t}\delta x_{n}\\
0 & =\delta x_{u}^{\t}\delta x_{u}-2\tau\delta x_{u}^{\t}\delta x_{u}+\tau^{2}\delta x_{u}^{\t}\delta x_{u}+2\tau\delta x_{u}^{\t}\delta x_{n}-2\tau^{2}\delta x_{u}^{\t}\delta x_{n}+\tau^{2}\delta x_{n}^{\t}\delta x_{n}-\Delta^{2}\\
0 & =\left(\delta x_{u}^{\t}\delta x_{u}-2\delta x_{u}^{\t}\delta x_{n}+\delta x_{n}^{\t}\delta x_{n}\right)\tau^{2}+\left(2\delta x_{u}^{\t}\delta x_{n}-2\delta x_{u}^{\t}\delta x_{u}\right)\tau-\Delta^{2}+\delta x_{u}^{\t}\delta x_{u}\\
\tau & =\frac{-\left(2\delta x_{u}^{\t}\delta x_{n}-2\delta x_{u}^{\t}\delta x_{u}\right)\pm\sqrt{\left(2\delta x_{u}^{\t}\delta x_{n}-2\delta x_{u}^{\t}\delta x_{u}\right)^{2}-4\left(\delta x_{u}^{\t}\delta x_{u}-2\delta x_{u}^{\t}\delta x_{n}+\delta x_{n}^{\t}\delta x_{n}\right)\left(\delta x_{u}^{\t}\delta x_{u}-\Delta^{2}\right)}}{2\left(\delta x_{u}^{\t}\delta x_{u}-2\delta x_{u}^{\t}\delta x_{n}+\delta x_{n}^{\t}\delta x_{n}\right)}
\end{align*}
\end_inset
From this we take whichever root
\begin_inset Formula $\tau$
\end_inset
satisfies
\begin_inset Formula $0<\tau<1$
\end_inset
; since
\begin_inset Formula $\norm{\delta x_{u}}<\Delta$
\end_inset
in this regime, the constant coefficient is negative, the roots have opposite
 signs, and the root with the plus sign is the valid one.
\end_layout
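\begin_layout Standard
In code, the valid root follows directly from the two points (a sketch
 with Eigen vectors; names are illustrative):
\end_layout
\begin_layout LyX-Code
#include <cmath>
\end_layout
\begin_layout LyX-Code
#include <Eigen/Dense>
\end_layout
\begin_layout LyX-Code
\end_layout
\begin_layout LyX-Code
// Find tau in (0,1) where (1-tau)*dxU + tau*dxN crosses the trust boundary.
\end_layout
\begin_layout LyX-Code
// Assumes norm(dxU) < delta < norm(dxN), so the '+' root is the one in (0,1).
\end_layout
\begin_layout LyX-Code
double doglegTau(const Eigen::VectorXd& dxU, const Eigen::VectorXd& dxN,
\end_layout
\begin_layout LyX-Code
                 double delta) {
\end_layout
\begin_layout LyX-Code
  const double uu = dxU.dot(dxU), un = dxU.dot(dxN), nn = dxN.dot(dxN);
\end_layout
\begin_layout LyX-Code
  const double a = uu - 2.0 * un + nn;  // squared length of dxN - dxU
\end_layout
\begin_layout LyX-Code
  const double b = 2.0 * (un - uu);
\end_layout
\begin_layout LyX-Code
  const double c = uu - delta * delta;  // negative, since norm(dxU) < delta
\end_layout
\begin_layout LyX-Code
  return (-b + std::sqrt(b * b - 4.0 * a * c)) / (2.0 * a);
\end_layout
\begin_layout LyX-Code
}
\end_layout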
\begin_layout Standard
To find the steepest-descent minimizer
\begin_inset Formula $\delta x_{u}^{\left(k\right)}$
\end_inset
, we perform an exact line search along the negative gradient direction
 of the approximate function
\begin_inset Formula $M$
\end_inset
,
\begin_inset Formula
\begin{equation}
\delta x_{u}^{\left(k\right)}=\frac{-g^{\left(k\right)\t}g^{\left(k\right)}}{g^{\left(k\right)\t}G^{\left(k\right)}g^{\left(k\right)}}g^{\left(k\right)}\label{eq:steepest-descent-point}
\end{equation}
\end_inset
\end_layout
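\begin_layout Standard
A corresponding sketch of this computation (again illustrative Eigen code):
\end_layout
\begin_layout LyX-Code
#include <Eigen/Dense>
\end_layout
\begin_layout LyX-Code
\end_layout
\begin_layout LyX-Code
// Minimizer of M along the negative gradient -g; requires g'Gg > 0.
\end_layout
\begin_layout LyX-Code
Eigen::VectorXd steepestDescentPoint(const Eigen::VectorXd& g,
\end_layout
\begin_layout LyX-Code
                                     const Eigen::MatrixXd& G) {
\end_layout
\begin_layout LyX-Code
  return (-g.dot(g) / g.dot(G * g)) * g;
\end_layout
\begin_layout LyX-Code
}
\end_layout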
\begin_layout Standard
Thus, mathematically, we can write the dogleg update
\begin_inset Formula $\delta x_{d}^{\left(k\right)}$
\end_inset
as
\begin_inset Formula
\[
\delta x_{d}^{\left(k\right)}=\begin{cases}
\frac{\Delta}{\norm{\delta x_{u}^{\left(k\right)}}}\delta x_{u}^{\left(k\right)}\text{,} & \Delta<\norm{\delta x_{u}^{\left(k\right)}}\\
\left(1-\tau^{\left(k\right)}\right)\delta x_{u}^{\left(k\right)}+\tau^{\left(k\right)}\delta x_{n}^{\left(k\right)}\text{,} & \norm{\delta x_{u}^{\left(k\right)}}<\Delta<\norm{\delta x_{n}^{\left(k\right)}}\\
\delta x_{n}^{\left(k\right)}\text{,} & \norm{\delta x_{n}^{\left(k\right)}}<\Delta
\end{cases}
\]
\end_inset
\end_layout
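\begin_layout Standard
Putting the three cases together, a sketch of the dogleg step selection
 (illustrative code; in the middle case
\begin_inset Formula $\tau$
\end_inset
is computed inline as derived above):
\end_layout
\begin_layout LyX-Code
#include <cmath>
\end_layout
\begin_layout LyX-Code
#include <Eigen/Dense>
\end_layout
\begin_layout LyX-Code
\end_layout
\begin_layout LyX-Code
// Dogleg step: truncated gradient step, interpolated step, or Newton step.
\end_layout
\begin_layout LyX-Code
Eigen::VectorXd doglegStep(const Eigen::VectorXd& dxU,
\end_layout
\begin_layout LyX-Code
                           const Eigen::VectorXd& dxN, double delta) {
\end_layout
\begin_layout LyX-Code
  if (delta < dxU.norm())
\end_layout
\begin_layout LyX-Code
    return (delta / dxU.norm()) * dxU;  // truncated steepest-descent step
\end_layout
\begin_layout LyX-Code
  if (delta >= dxN.norm())
\end_layout
\begin_layout LyX-Code
    return dxN;  // full Newton step fits in the trust region
\end_layout
\begin_layout LyX-Code
  // Interpolate: find tau where the segment crosses the trust boundary.
\end_layout
\begin_layout LyX-Code
  const double uu = dxU.dot(dxU), un = dxU.dot(dxN), nn = dxN.dot(dxN);
\end_layout
\begin_layout LyX-Code
  const double a = uu - 2.0 * un + nn, b = 2.0 * (un - uu);
\end_layout
\begin_layout LyX-Code
  const double c = uu - delta * delta;
\end_layout
\begin_layout LyX-Code
  const double tau = (-b + std::sqrt(b * b - 4.0 * a * c)) / (2.0 * a);
\end_layout
\begin_layout LyX-Code
  return (1.0 - tau) * dxU + tau * dxN;
\end_layout
\begin_layout LyX-Code
}
\end_layout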
\begin_layout Section
Working with
\begin_inset Formula $M$
\end_inset
as a Bayes' Net
\end_layout
\begin_layout Standard
When we have already eliminated a factor graph into a Bayes' Net, we have
the same quadratic error function
\begin_inset Formula $M^{\left(k\right)}\left(\delta x\right)$
\end_inset
, but it is in a different form:
\begin_inset Formula
\[
M^{\left(k\right)}\left(\delta x\right)=\frac{1}{2}\norm{R\,\delta x-d}^{2}\text{,}
\]
\end_inset
where
\begin_inset Formula $R$
\end_inset
is an upper-triangular matrix (stored as a set of sparse block Gaussian
conditionals in GTSAM), and
\begin_inset Formula $d$
\end_inset
is the r.h.s.
vector.
The gradient and Hessian of
\begin_inset Formula $M$
\end_inset
are then
\begin_inset Formula
\begin{align*}
g^{\left(k\right)} & =\left.R^{\t}\left(R\,\delta x-d\right)\right|_{\delta x=0}=-R^{\t}d\\
G^{\left(k\right)} & =R^{\t}R
\end{align*}
\end_inset
\end_layout
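\begin_layout Standard
For a dense
\begin_inset Formula $R$
\end_inset
this correspondence is immediate, as in the following check (GTSAM of course
 never forms these matrices densely):
\end_layout
\begin_layout LyX-Code
#include <Eigen/Dense>
\end_layout
\begin_layout LyX-Code
\end_layout
\begin_layout LyX-Code
// Gradient and Hessian of M(dx) = 0.5*norm(R*dx - d)^2 at dx = 0.
\end_layout
\begin_layout LyX-Code
void gradientAndHessian(const Eigen::MatrixXd& R, const Eigen::VectorXd& d,
\end_layout
\begin_layout LyX-Code
                        Eigen::VectorXd& g, Eigen::MatrixXd& G) {
\end_layout
\begin_layout LyX-Code
  g = -R.transpose() * d;  // g = R'(R*0 - d) = -R'd
\end_layout
\begin_layout LyX-Code
  G = R.transpose() * R;   // Gauss-Newton Hessian
\end_layout
\begin_layout LyX-Code
}
\end_layout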
\begin_layout Standard
In GTSAM, because the Bayes' Net is not dense, we evaluate Eq.
\begin_inset space ~
\end_inset
\begin_inset CommandInset ref
LatexCommand ref
reference "eq:steepest-descent-point"
\end_inset
in an efficient way.
We rewrite the denominator (leaving out the
\begin_inset Formula $\left(k\right)$
\end_inset
superscript) as
\begin_inset Formula
\[
g^{\t}Gg=\sum_{i}\left(R_{i}g\right)^{\t}\left(R_{i}g\right)\text{,}
\]
\end_inset
where
\begin_inset Formula $i$
\end_inset
indexes over the conditionals in the Bayes' Net (corresponding to blocks
of rows of
\begin_inset Formula $R$
\end_inset
).
 This exploits the sparse structure of the Bayes' Net, because it is easy
 to include only the variables involved in each
\begin_inset Formula $i^{\text{th}}$
\end_inset
conditional when multiplying them by the corresponding elements of
\begin_inset Formula $g$
\end_inset
.
\end_layout
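\begin_layout Standard
A sketch of this block-sparse evaluation, with a hypothetical row-block
 structure standing in for GTSAM's Gaussian conditionals (each block stores
 the scalar column indices of the variables it involves):
\end_layout
\begin_layout LyX-Code
#include <vector>
\end_layout
\begin_layout LyX-Code
#include <Eigen/Dense>
\end_layout
\begin_layout LyX-Code
\end_layout
\begin_layout LyX-Code
// Hypothetical sparse row block R_i: a dense block over a few variables.
\end_layout
\begin_layout LyX-Code
struct RowBlock {
\end_layout
\begin_layout LyX-Code
  std::vector<int> cols;  // column indices this conditional involves
\end_layout
\begin_layout LyX-Code
  Eigen::MatrixXd A;      // dense block with A.cols() == cols.size()
\end_layout
\begin_layout LyX-Code
};
\end_layout
\begin_layout LyX-Code
\end_layout
\begin_layout LyX-Code
// Compute g'Gg = sum_i norm(R_i g)^2 without ever forming G = R'R.
\end_layout
\begin_layout LyX-Code
double denominator(const std::vector<RowBlock>& blocks,
\end_layout
\begin_layout LyX-Code
                   const Eigen::VectorXd& g) {
\end_layout
\begin_layout LyX-Code
  double sum = 0.0;
\end_layout
\begin_layout LyX-Code
  for (const RowBlock& block : blocks) {
\end_layout
\begin_layout LyX-Code
    // Gather only the entries of g that this conditional touches.
\end_layout
\begin_layout LyX-Code
    Eigen::VectorXd gi(block.cols.size());
\end_layout
\begin_layout LyX-Code
    for (int j = 0; j < gi.size(); ++j) gi(j) = g(block.cols[j]);
\end_layout
\begin_layout LyX-Code
    sum += (block.A * gi).squaredNorm();
\end_layout
\begin_layout LyX-Code
  }
\end_layout
\begin_layout LyX-Code
  return sum;
\end_layout
\begin_layout LyX-Code
}
\end_layout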
\begin_layout Standard
\begin_inset CommandInset bibtex
LatexCommand bibtex
bibfiles "trustregion"
options "plain"
\end_inset
\end_layout
\end_body
\end_document