{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# DynaFu ICP Math\n",
    "## Differentiating and Linearising Rt matrices\n",
    "\n",
    "In dynafu, the warp function looks like the following for each node $i$:\n",
    "\n",
    "\n",
    "$\n",
    "\\begin{equation*}\n",
    "f_i(x_i, V_g) = T_{x_i} * V_g = R(x_i) * V_g + t(x_i)\n",
    "\\end{equation*}\n",
    "$\n",
    "\n",
    "where ${x_i}$ are the transformation parameters for node $i$ and the rotation is performed around the corresponding node (and not a global reference)\n",
    "\n",
    "For linearising a transform around the parameters $\\mathbf{x}$, we need to find the derivative\n",
    "\n",
    "$\n",
    "\\begin{equation*}\n",
    "\\displaystyle\n",
    "\\frac{\\partial f_i(\\mathbf{x} \\circ \\epsilon,   V_g)}{\\partial \\epsilon} |_{\\epsilon = 0}\n",
    "\\end{equation*}\n",
    "$\n",
    "\n",
    "We calculate this as follows:\n",
    "\n",
    "$\n",
    "\\begin{equation*}\n",
    "f_i(\\mathbf{x} \\circ \\epsilon, V_g) = f_i(\\epsilon, V) = T_{inc} * V\n",
    "\\end{equation*}\n",
    "$ where $V = f_i(\\mathbf{x}, V_g)$ and $T_{inc}$ is the infinitesimal transform with parameters $\\epsilon$\n",
    "\n",
    "According to Lie algebra, each Rt matrix can be represented as $A = e^\\xi$ where $\\xi$ are the transform parameters. Therefore,\n",
    "\n",
    "\n",
    "$\n",
    "\\begin{equation*}\n",
    "f_i(\\mathbf{x}, V_g) = e^\\xi V\n",
    "\\end{equation*}\n",
    "$\n",
    "\n",
    "Therefore,\n",
    "\n",
    "$\n",
    "\\begin{equation*}\n",
    "\\displaystyle\n",
    "\\frac{\\partial f_i(\\mathbf{x} \\circ \\xi,   V_g)}{\\partial \\xi} |_{\\xi = 0} =\n",
    "\\frac{\\partial e^\\xi V}{\\partial \\xi} |_{\\xi=0} = \n",
    "\\begin{pmatrix} -[V]_{\\times} & I_{3x3} \\end{pmatrix}_{3 \\times 6}\n",
    "\\end{equation*}\n",
    "$\n",
    "\n",
    "Let us denote $\\begin{pmatrix} -[V]_{\\times} & I_{3x3} \\end{pmatrix}$ as $G(V)$ from now on.\n",
    "\n",
    "This result is mentioned in [this manifold optimisation tutorial](http://ingmec.ual.es/~jlblanco/papers/jlblanco2010geometry3D_techrep.pdf) (equation 10.23).\n",
    "\n",
    "With this result, we can now linearise our transformation around $\\mathbf{x}$:\n",
    "\n",
    "$\n",
    "\\begin{equation*}\n",
    "f_i(x_i, V_g) = G(V) * \\epsilon + V\n",
    "\\end{equation*}\n",
    "$\n",
    "\n",
    "\n",
    "I suppose the following is an equivalent excerpt from the dynafu paper (Section about efficient optimisation) that mentions this way of calculating derivatives:\n",
    "> We formulate compositional updates $\\hat x$ through the exponential map with a per-node twist $ξ_i ∈ se(3)$, requiring 6 variables per node transform, and perform linearisation  around $ξ_i=  0$. \n",
    "\n",
    "As a side note, the derivative $\\large \\frac{\\partial e^\\xi}{\\partial \\xi}|_{\\xi=0}$ is called the tangent (esentially the derivative) to the SE(3) manifold (the space in which Rt matrix $T_{inc}$ exists) at identity ($\\xi = 0$)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Estimating Warp Field Parameters\n",
    "The total energy to be minimised is \n",
    "\n",
    "$\n",
    "E = E_{data} + \\lambda E_{reg}\n",
    "$\n",
    "\n",
    "#### Data term rearrangement \n",
    "$\n",
    "\\displaystyle\n",
    "E_{data} = \\sum_{u \\in \\Omega} \\rho_{Tukey}( (T_u N_g)^T ((T_u V_g) - V_c))\n",
    "$\n",
    "\n",
    "The quadcopter paper tells us that the following expression has the same minimiser, so we can use this instead:\n",
    "\n",
    "$\n",
    "\\displaystyle\n",
    "E_{data} = \\sum_{u \\in \\Omega} w_{Tukey}(r_u) \\cdot (r_u)^2\n",
    "$\n",
    "\n",
    "where $w_{Tukey}(x) = \\rho'(x)/x$ which behaves like a constant term and $r_u = N_g^T (V_g - T_u^{-1}\\cdot V_c)$\n",
    "\n",
    "#### Regularisation term rearrangement\n",
    "$\n",
    "\\begin{equation}\n",
    "\\displaystyle\n",
    "E_{reg} = \\sum_{i = 0}^n \\sum_{j \\in \\varepsilon(i)} \\alpha_{ij} \\rho_{Huber} (T_{i}V_g^j - T_{j}V_g^j)\n",
    "\\end{equation}\n",
    "$\n",
    "\n",
    "This needs to be changed to the form of weighted least squares to be useful. So incorporate the same rearrangement as the data term and sum over edges instead:\n",
    "\n",
    "$\n",
    "\\begin{equation}\n",
    "\\displaystyle\n",
    "E_{reg} = \\sum_{e \\in E} w_{Huber}(r_e) (r_e)^2\n",
    "\\end{equation}\n",
    "$\n",
    "\n",
    "Here $E$ is the set of the directed edges in the regularisation graph between all nodes from current level and the next coarser level. And $w_{Huber}(x) = \\alpha_x \\rho'(x)/x$\n",
    "\n",
    "#### Obtaining normal equation\n",
    "\n",
    "Therefore to solve an iteration, we equate the derivative with 0\n",
    "\n",
    "$\n",
    "\\begin{equation*}\n",
    "\\large\n",
    "\\frac{\\partial E_{data}}{\\partial \\xi} + \\lambda \\frac{\\partial E_{reg}}{\\partial \\xi} = 0\n",
    "\\end{equation*}\n",
    "$\n",
    "\n",
    "which gives us\n",
    "\n",
    "$\n",
    "\\begin{equation*}\n",
    "J_d^T W_d(r_d + J_d\\mathbf{\\hat x}) + \\lambda J_r^T W_r (r_r + J_r\\mathbf{\\hat x}) = 0\n",
    "\\end{equation*}\n",
    "$\n",
    "\n",
    "$\n",
    "(J_d^T W_d J_d + \\lambda J_r^T W_r J_r)\\mathbf{\\hat x} = -(J_d^T W_d r_d + \\lambda J_r^T W_r r_r)\n",
    "$\n",
    "\n",
    "Here $W_d$ and $W_r$ are the weight matrices as described in quadcopter paper. However for $W_r, \\alpha$ is also incorporated in this matrix"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Calculating Data Term Jacobian ($J_d$) \n",
    "\n",
    "We estimate the inverse of Rt matrices instead of the Rt matrices themselves. So firstly we have to write $T_u^{-1} V_g$ in terms of inverse matrices. However, I realised that\n",
    "\n",
    "$$\n",
    "\\begin{equation*}\n",
    "T_u^{-1} V_g \\ne \\sum_{k \\in N(V_u)} \\frac{w_k T_k^{-1}V_g}{w_k}\n",
    "\\end{equation*}\n",
    "$$\n",
    "\n",
    "Unfortunately, I could not find a representation of $T_u^{-1} V_g$ in terms of $T_k^{-1}V_g$ and got stuck here. Below is an approach without estimating the inverse Rt matrices, I think we can use that instead as the math is now fixed.\n",
    "\n",
    "### Alternative calculation for $J_d$\n",
    "The residual $r_u$ in the data term is \n",
    "\n",
    "$$\n",
    "r_u = (T_u N_g)^T (T_u V_g - V_c)\n",
    "$$\n",
    "\n",
    "Let $a, b$ be column vectors such that $a = T_u N_g$ and $b = (T_u V_g - V_c)$. Now we can state the residual as \n",
    "\n",
    "$$\n",
    "r_u = a^Tb\n",
    "$$\n",
    "\n",
    "Each entry in $J_d$ for node paramter $x_j$ associated with node $j$ is:\n",
    "\n",
    "$$\n",
    "(J_d)_{uj} = \\frac{\\partial r_u}{\\partial x_j} = \\frac{\\partial (a^Tb)}{\\partial x_j}\n",
    "$$\n",
    "\n",
    "**Please note that numerator layout is assumed in all the derivatives**\n",
    "\n",
    "Applying chain rule for multiple variables, we get\n",
    "\n",
    "$$\n",
    "\\begin{equation}\\begin{aligned}\n",
    "\\frac{\\partial (a^Tb)}{\\partial x_j} & = \n",
    "\\frac{\\partial (a^Tb)}{\\partial a} \\cdot \\frac{\\partial a}{\\partial x_j} +\n",
    "\\frac{\\partial (a^Tb)}{\\partial b} \\cdot \\frac{\\partial b}{\\partial x_j} \\\\\n",
    "& =\n",
    "\\frac{\\partial (a^Tb)}{\\partial a} \\cdot \\frac{\\partial a}{\\partial x_j} +\n",
    "\\frac{\\partial (b^Ta)}{\\partial b} \\cdot \\frac{\\partial b}{\\partial x_j} && \\text{Since  $a^Tb = b^Ta$} \\\\\n",
    "& =\n",
    "b^T \\cdot \\frac{\\partial a}{\\partial x_j} +\n",
    "a^T \\cdot \\frac{\\partial b}{\\partial x_j} && \\text{Since $\\frac{\\partial x^TA}{\\partial x} = A^T$}\n",
    "\\end{aligned}\\end{equation}\\tag{1}\\label{1}\n",
    "$$\n",
    "\n",
    "The identity $\\frac{\\partial x^TA}{\\partial x} = A^T$ is mentioned in [this wikipedia page](https://en.wikipedia.org/wiki/Matrix_calculus#Vector-by-vector_identities). Now we calculate $\\frac{\\partial a}{\\partial x_j}$ and $\\frac{\\partial b}{\\partial x_j}$ as follows:\n",
    "\n",
    "$$\n",
    "\\begin{equation}\\begin{aligned}\n",
    "\\frac{\\partial a}{\\partial x_j} & = \\frac{\\partial (T_u N_g)}{\\partial x_j} \\\\\n",
    "& = \\sum_{k \\in N(V_u)} \\frac{w_k \\frac{\\partial T_k N_g}{\\partial x_j}}{w_k} \\\\\n",
    "& = \n",
    "\\begin{cases}\n",
    " \\frac{w_j \\frac{\\partial T_j N_g}{\\partial x_j}}{\\sum_{k \\in N(V_u)} w_k} && \\text{if $j \\in N(V_u)$} \\\\\n",
    " 0 && \\text{otherwise}\n",
    "\\end{cases} \\\\\n",
    "& = \n",
    "\\begin{cases}\n",
    " \\frac{w_j \\begin{pmatrix}-[T_j N_g]_\\times & I_{3\\times3}\\end{pmatrix}}{\\sum_{k \\in N(V_u)} w_k} && \\text{if $j \\in N(V_u)$} \\\\\n",
    " 0 && \\text{otherwise}\n",
    "\\end{cases}\n",
    "\\end{aligned}\\end{equation}\\tag{2}\\label{2}\n",
    "$$\n",
    "\n",
    "\n",
    "$$\n",
    "\\begin{equation}\\begin{aligned}\n",
    "\\frac{\\partial b}{\\partial x_j} & = \\frac{\\partial (T_uV_g - V_c)}{\\partial x_j} \\\\\n",
    "& = \\frac{\\partial T_uV_g}{\\partial x_j}\\\\\n",
    "& = \n",
    "\\begin{cases}\n",
    " \\frac{w_j \\begin{pmatrix}-[T_j V_g]_\\times & I_{3\\times3}\\end{pmatrix}}{\\sum_{k \\in N(V_u)} w_k} && \\text{if $j \\in N(V_u)$} \\\\\n",
    " 0 && \\text{otherwise}\n",
    "\\end{cases} && \\text{Calculated similarly to ($\\ref{2}$)}\n",
    "\\end{aligned}\\end{equation}\\tag{3}\\label{3}\n",
    "$$\n",
    "\n",
    "Substituting equations $(\\ref{2})$, $(\\ref{3})$ as well as the values of $a^T$ and $b^T$ in $(\\ref{1})$, we obtain the required result:\n",
    "\n",
    "$$\n",
    "(J_d)_{uj} = \n",
    "\\begin{cases}\n",
    "\\frac{w_j}{\\sum_{k \\in N(V_u)} w_k}\n",
    "\\left(\n",
    "  (T_u V_g - V_c)^T\n",
    "  \\begin{pmatrix}-[T_j N_g] & I_{3\\times3}\\end{pmatrix} +\n",
    "  (T_u N_g)^T \\begin{pmatrix}-[T_j V_g] & I_{3\\times3}\\end{pmatrix}\n",
    "\\right) & \\text{if} j \\in N(V_u) \\\\\n",
    "0 & \\text{otherwise}\n",
    "\\end{cases}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can simplify this expression further:\n",
    "\n",
    "$$\n",
    "\\begin{equation*}\n",
    "\\left(\n",
    "  (T_u V_g - V_c)^T \\begin{pmatrix}-[T_j N_g] & I_{3\\times3}\\end{pmatrix} +\n",
    "  (T_u N_g)^T \\begin{pmatrix}-[T_j V_g] & I_{3\\times3}\\end{pmatrix}\n",
    "\\right) = \\\\\n",
    "\\left [ \n",
    "(T_u V_g - V_c)^T \\left ( -[T_j N_g]_\\otimes \\right ) \\mid (T_u V_g - V_c)^T\n",
    " \\right ]\n",
    "+\n",
    "\\left [ \n",
    "(T_u N_g)^T \\left ( -[T_j V_g]_\\otimes \\right ) \\mid (T_u N_g)^T\n",
    " \\right ] = \\\\\n",
    "\\left [ \n",
    "(T_u V_g - V_c)^T \\left ( -[T_j N_g]_\\otimes \\right ) +\n",
    "(T_u N_g)^T \\left ( -[T_j V_g]_\\otimes \\right )\n",
    "    \\mid (T_u V_g + T_u N_g - V_c )^T\n",
    "\\right ] = \\\\\n",
    "\\left [ \n",
    "(T_u V_g - V_c)^T  ( [T_j N_g]_\\otimes )^T +\n",
    "(T_u N_g)^T ( [T_j V_g]_\\otimes )^T\n",
    "    \\mid (T_u V_g + T_u N_g - V_c )^T\n",
    "\\right ] = \\\\\n",
    "\\begin{bmatrix}\n",
    "( [T_j N_g]_\\otimes )(T_u V_g - V_c) +\n",
    "( [T_j V_g]_\\otimes )(T_u N_g)\n",
    "\\\\ \n",
    "T_u V_g + T_u N_g - V_c\n",
    "\\end{bmatrix}^T = \\\\\n",
    "\\begin{bmatrix}\n",
    "( T_j N_g) \\times (T_u V_g - V_c) +\n",
    "( T_j V_g) \\times (T_u N_g)\n",
    "\\\\ \n",
    "T_u V_g + T_u N_g - V_c\n",
    "\\end{bmatrix}^T = \\\\\n",
    "\\begin{bmatrix}\n",
    "( T_j N_g) \\times (T_u V_g - V_c) +\n",
    "( T_j V_g) \\times (T_u N_g)\n",
    "\\\\ \n",
    "T_u V_g + T_u N_g - V_c \n",
    "\\end{bmatrix}^T\n",
    "\\end{equation*}\n",
    "$$\n",
    "\n",
    "So, we get the final expression as:\n",
    "$$\n",
    "\\begin{equation*}\n",
    "(J_d)_{uj} = \n",
    "\\begin{cases}\n",
    "\\frac{w_j}{\\sum_{k \\in N(V_u)} w_k}\n",
    "\\begin{bmatrix}\n",
    "( T_j N_g) \\times (T_u V_g - V_c) +\n",
    "( T_j V_g) \\times (T_u N_g)\n",
    "\\\\ \n",
    "T_u V_g + T_u N_g - V_c \n",
    "\\end{bmatrix}^T\n",
    "&\n",
    "\\text{if} j \\in N(V_u) \\\\\n",
    "0 & \\text{otherwise}\n",
    "\\end{cases}\n",
    "\\end{equation*}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we have expression for the Jacobian, we can form the normal equation corresponding to the data term of the ICP\n",
    "\n",
    "$$\n",
    "\\begin{equation*}\n",
    "\\\\\n",
    "\\Omega \\text{ pixels, N nodes}\n",
    "\\\\\n",
    "J_d^T W_d J_d \\mathbf{\\hat x} = -J_d^T W_d r_d \\\\\n",
    "\\underbrace{J_d^T}_{6N \\times\\Omega} \\underbrace {W_d}_{\\Omega \\times \\Omega}\n",
    "\\underbrace{J_d}_{\\Omega \\times 6N} \\underbrace{\\mathbf{\\hat x}}_{6N \\times 1} =\n",
    "-J_d^T W_d \\underbrace{r_d}_{\\Omega \\times 1} \\\\\n",
    "\\underbrace{\\mathbf{A}_d}_{6N \\times 6N} \\mathbf{\\hat x} = \\underbrace{\\mathbf{b}_d}_{6N \\times 1} \\\\\n",
    "\\\\\n",
    "\\text {for each block (i, j) which are 6}\\times\\text{6:}\n",
    "\\\\\n",
    "(\\mathbf{A}_d)_{ij} = \\sum_{\\Omega} \\underbrace{w_d(\\Omega)}_{\\text{robust for pix}}\n",
    "\\left(\\frac{\\partial E_d}{\\partial x_i}\\right)^T_\\Omega \\left(\\frac{\\partial E_d}{\\partial x_j}\\right)_\\Omega \\\\\n",
    "\\\\\n",
    "\\text {for each block j which are 6}\\times\\text{1:}\n",
    "\\\\\n",
    "(\\mathbf{b}_d)_{j} = - \\sum_{\\Omega} \\underbrace{w_d(\\Omega)}_{\\text{robust for pix}} r_d(\\Omega)\n",
    "\\left(\\frac{\\partial E_d}{\\partial x_j}\\right)^T_\\Omega \\\\\n",
    "(\\mathbf{b}_d)_{j} = - \\sum_{\\Omega} \\underbrace{w_d(\\Omega)}_{\\text{robust for pix}} ((T_u N_g)^T (T_u V_g - V_c))_{\\Omega}\n",
    "\\left(\n",
    "    \\frac{w_j \\text{ or 0} }{\\sum_{k \\in N(V_u)} w_k}\n",
    "\\begin{bmatrix}\n",
    "( T_j N_g) \\times (T_u V_g - V_c) +\n",
    "( T_j V_g) \\times (T_u N_g)\n",
    "\\\\ \n",
    "T_u V_g + T_u N_g - V_c \n",
    "\\end{bmatrix}\n",
    "    \\right)_\\Omega \\\\\n",
    "(\\mathbf{A}_d)_{ij} = \\sum_{\\Omega} \\underbrace{w_d(\\Omega)}_{\\text{robust for pix}}\n",
    "\\left(\n",
    "\\frac{(w_i\\text{ or 0})(w_j\\text{ or 0})}{(\\sum_{k \\in N(V_u)} w_k)^2}\n",
    "    \\underbrace{(A^T_{i} A_{j})}_{6 \\times 6}\n",
    "\\right)_{\\Omega}\n",
    "\\\\ \\\\\n",
    "A_{i} =\n",
    "\\begin{bmatrix}\n",
    "( T_i N_g) \\times (T_u V_g - V_c) +\n",
    "( T_i V_g) \\times (T_u N_g)\n",
    "\\\\ \n",
    "T_u V_g + T_u N_g - V_c \n",
    "\\end{bmatrix}\n",
    "\\end{equation*}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Calculating Regularisation Term Jacobian ($J_r$)\n",
    "\n",
    "Each row in $J_r$ corresponds to derivative of summand for each edge $e$ in the regularisation graph $\\epsilon$ and column $k$ corresponds to node $k$ with respect to which the derivative is calculated.\n",
    "\n",
    "$$\n",
    "\\begin{equation*}\n",
    "\\displaystyle\n",
    "(J_r)_{ek} = \n",
    "\\sum_{e_{ij} \\in \\epsilon} \\frac{\\partial ( T_iV_g^j - T_jV_g^j)}{\\partial x_k}\n",
    "=\n",
    "\\sum_{e_{ij} \\in \\epsilon} \\begin{cases}\n",
    "\\begin{pmatrix} -[T_iV_g^j] & I_{3x3} \\end{pmatrix} & \\text {if   }  i = k \\\\\n",
    "-\\begin{pmatrix} -[T_jV_g^j] & I_{3x3} \\end{pmatrix} & \\text {if  }  j = k \\\\\n",
    "0 & \\text {otherwise}\n",
    "\\end{cases}\n",
    "\\end{equation*}\n",
    "$$\n",
    "\n",
    "Using this Jacobian we can set up a normal equation corresponding to the regularisation term similarly to the data term as mentioned in the previous section"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}