{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Probably the simplest BIP in the world (the short story)\n", "\n", "\n", "Here we define and solve a very simple Bayesian inverse problem (BIP) using \n", "CUQIpy. The purpose of this notebook is to introduce the basic concepts of\n", "Bayesian inverse problems in a simple way and to use minimal CUQIpy code to\n", "solve it.\n", "\n", "In the next notebooks (the long story), we discuss more details about setting up\n", "this BIP in CUQIpy and provide many exercises to help the reader to explore \n", "using CUQIpy in solving BIPs and think of slightly different BIP modeling \n", "scenarios.\n", "\n", "\n", "## Contents of this notebook: \n", " * [0. Learning objectives](#r-learning-objectives)\n", " * [1. Defining the BIP](#r-defining-the-bip)\n", " * [2. Solving the BIP](#r-solving-the-bip)\n", " * [3. Summary](#r-summary)\n", " * [References](#r-references)\n", "\n", "\n", "## 0. Learning objectives \n", " * Create a simple BIP in CUQIpy\n", " * Create a linear forward model object\n", " * Create distribution objects that represent the prior, the noise, and the data distributions\n", " * Use the `BayesianProblem` class to define the BIP\n", "\n", " * Solve a simple BIP in CUQIpy\n", " * Compute the maximum a posteriori (MAP) estimate\n", " * Sample from a simple posterior distribution and visualize the results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ⚠️ Note: This notebook uses MCMC samplers from the new cuqi.experimental.mcmc module, which are expected to become the default soon. Check out the documentation for more details.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Defining the BIP \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Consider the following inverse problem: given observed data $b$, determine $x_1$, and $x_2$:\n", "\n", "$$b = x_1 + x_2 + e \\;\\;\\mathrm{with}\\;\\; e \\sim \\mathrm{Gaussian}(0, 0.1)$$\n", "\n", "We can also write it in the following matrix form:\n", "\n", "$$b = \\mathbf{A}\\mathbf{x} + e = \\large(1,1\\large)\\binom{x_1}{x_2} + e$$\n", "\n", "\n", "|variable | description |dimension |\n", "|:------------|:------------------------|:--------------|\n", "|$\\mathbf{x}$ |parameter to be inferred |2-dimensional​ |\n", "|$\\mathbf{A}$ |forward model |1-by-2 matrix |\n", "|$b$ |data | 1-dimensional |\n", "|$e$ |noise |1-dimensional ​|\n", "\n", "This problem is:\n", "* Considered a linear inverse problem since the forward model is linear.\n", "* Ill-posed (in the sense of Hadamard [1]) since the solution is not unique, i.e., for some given value of $b$, e.g., $b=3$, all parameter pairs $(x_1, x_2)$ satisfying $x_1 + x_1 = 3$ are solutions to the (noise-free) problem." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us define the BIP components and solve it using CUQIpy." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "hide-cell" ] }, "outputs": [], "source": [ "# Importing the required libraries\n", "from cuqi.distribution import Gaussian\n", "from cuqi.problem import BayesianProblem\n", "from cuqi.model import LinearModel\n", "from cuqi.utilities import plot_1D_density, plot_2D_density\n", "import numpy as np\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.1. The forward model " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us define the forward model $\\mathbf{A}$. We start by specifying the underlying matrix:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "A_matrix = np.array([[1.0, 1.0]])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we wrap `A_matrix` in a `CUQIpy` forward model object, which we name `A`, as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "A = LinearModel(A_matrix)\n", "print(A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.2. The prior \n", "\n", "**Bayesian approach: Use prior to express belief about solution**\n", "\n", "A common choice for simplicity is the zero mean Gaussian prior where the components are independent and identically distributed (i.i.d.):\n", "\n", "$$ \\mathbf{x} \\sim \\mathrm{Gaussian}(\\mathbf{0}, \\delta^2 \\mathbf{I})$$\n", "\n", "The probability density function (PDF) of such a Gaussian prior is expressed as\n", "\n", "$$ \\pi (\\mathbf{x}) = \\frac{1}{\\sqrt{(2 \\pi)^n \\delta^{2n}}} \\mathrm{exp}\\left(-\\frac{||\\mathbf{x}||^2}{2\\delta^2}\\right) $$\n", "\n", "where:\n", "- $n$ is the dimension of the parameter space (which is 2 in this specific case),\n", "- $\\delta$ is the standard deviation of the prior distribution,\n", "- $\\mathbf{I}$ is the identity matrix.\n", "\n", "Let us specify the prior distribution for the parameter $\\mathbf{x}$ in `CUQIpy`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = Gaussian(np.zeros(2), 2.5)\n", "print(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can plot the prior PDF for the parameters $x_1$ and $x_2$, along with samples from the prior distribution:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot PDF\n", "plot_2D_density(x, v1_min=-5, v1_max=5, v2_min=-5, v2_max=5)\n", "\n", "# Sample\n", "x_samples = x.sample(1000)\n", "\n", "# Plot samples\n", "x_samples.plot_pair(ax=plt.gca())\n", "plt.ylim(-5,5);\n", "plt.xlim(-5,5);\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.3. The noise distribution \n", "\n", "- As mentioned earlier, we assume $e \\sim \\mathrm{Gaussain}(0, 0.1)$ \n", "- We can define the noise distribution as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "e = Gaussian(0, 0.1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us also plot the PDF of the noise distribution, using the python function `plot_pdf_1D`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plot_1D_density(e, -1.5, 1.5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.4. The data distribution \n", "\n", "The noise in the measurement data follows $e \\sim \\mathrm{Gaussain}(0, 0.1)$ and due to the relation $b = \\mathbf{A}\\mathbf{x} + e $, we can write\n", "\n", "$$ b | \\mathbf{x} \\sim \\mathrm{Gaussian}(\\mathbf{A}\\mathbf{x}, \\sigma^2\\mathbf{I})$$\n", "\n", "and in this case we specify $\\sigma^2 = 0.1$.\n", "\n", "$$ \\pi (b | \\mathbf{x}) = \\frac{1}{\\sqrt{(2 \\pi)^m \\sigma^{2m}}} \\mathrm{exp}\\left(-\\frac{||\\mathbf{A}\\mathbf{x}- b||^2}{2\\sigma^2}\\right) $$\n", "\n", "- The data distribution is the conditional distribution of $b$ given $\\mathbf{x}$.\n", "- This PDF can only be evaluated for a given $\\mathbf{x}$.\n", "\n", "We create the data distribution object as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b = Gaussian(A@x, 0.1)\n", "print(b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before sampling or evaluating the PDF of `b`, we need to specify the value of the parameters`x`. Let us choose the following value:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "particular_x = np.array([1.5, 1.5])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we condition the data distribution `b` on the given parameter `particular_x`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b_given_particular_x = b(x=particular_x)\n", "print(b_given_particular_x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we have the distribution object `b_given_particular_x` that represents the data distribution given a particular value of the parameters `x`. We can now plot the PDF of this distribution:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plot_1D_density(b_given_particular_x, 1.5, 4.5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can use `b_given_particular_x` to simulate noisy data assuming that the true `x` parameters is `particular_x`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b_obs = b_given_particular_x.sample()\n", "print(b_obs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.5. The likelihood function \n", "\n", "\n", "We obtain the likelihood function by fixing observed data $b^\\mathrm{obs}$ in the data distribution and considering the function of $\\mathbf{x}$:\n", "\n", "$$L (\\mathbf{x} | b^\\mathrm{obs}) \\mathrel{\\vcenter{:}}= \\pi (b^\\mathrm{obs} | \\mathbf{x})$$\n", "\n", "Since we are using a Gaussian noise model, the likelihood can be evaluated as\n", "\n", "$$L (\\mathbf{x} | b=b^\\mathrm{obs}) = \\frac{1}{\\sqrt{(2 \\pi)^m \\sigma^{2m}}} \\mathrm{exp}\\left(-\\frac{||\\mathbf{A}\\mathbf{x}- b^\\mathrm{obs}||^2}{2\\sigma^2}\\right) = \\frac{1}{\\sqrt{2 \\pi \\cdot 0.1}} \\mathrm{exp}\\left(-\\frac{(x_1+x_2- b^\\mathrm{obs})^2}{2\\cdot 0.1}\\right) $$\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In CUQIpy, we can define the likelihood function as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "likelihood = b(b=b_obs)\n", "print(likelihood)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the likelihood function is a density function and is not a distribution. If we try to compute `pdf` for example, we will get an error." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We plot the likelihood function for the observed data `b_obs`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x1_lim = np.array([-5, 5])\n", "x2_lim = np.array([-5, 5])\n", "plot_2D_density(\n", " likelihood,\n", " v1_min=x1_lim[0], v1_max=x1_lim[1],\n", " v2_min=x2_lim[0], v2_max=x2_lim[1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note:** One method of estimating the inverse problem solution is to maximize the likelihood function, which is equivalent to minimizing the negative log-likelihood function. This is known as the maximum likelihood (ML) point estimate.\n", "\n", "For this problem, all the points $(x_1, x_2)$ that satisfy $x_1 + x_2 = b^\\mathrm{obs}$ are solutions to this maximization problem. These solutions are depicted as the red line in the likelihood plot below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot the likelihood\n", "plot_2D_density(\n", " likelihood,\n", " v1_min=x1_lim[0], v1_max=x1_lim[1],\n", " v2_min=x2_lim[0], v2_max=x2_lim[1])\n", "\n", "# Plot the line x2 = b_obs - x1\n", "plt.plot(x1_lim, b_obs-x1_lim, '--r')\n", "plt.ylim(x2_lim);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Combining the likelihood with the prior will give us a unique maximum a posteriori (MAP) point estimate as we will see next." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.6. Putting it all together, the BIP \n", "\n", "#### Posterior definition using the Bayes’ rule\n", "The posterior is proportional to the product of **likelihood** and **prior**\n", "\n", "$$\\pi(\\mathbf{x} | b=b^\\mathrm{obs}) \\propto L (\\mathbf{x} | b=b^\\mathrm{obs}) \\pi(\\mathbf{x})$$\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In CUQIpy, we can use the class `BayesianProblem` to bundle the prior, the data distribution, and the data, then use it to explore the posterior distribution (e.g. point estimate and sampling):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "BP = BayesianProblem(b, x)\n", "print(BP)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we pass the data:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "BP.set_data(b=b_obs)\n", "print(BP)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note the difference in the target of the `BayesianProblem` object before and after passing the data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Solving the BIP " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.1. Maximum a posteriori (MAP) estimate \n", " \n", "Maximizer of the posterior\n", "\n", "$$\\mathbf{x}^* = \\underset{\\mathbf{x}}{\\operatorname{argmax\\;}} \\pi(\\mathbf{x} | b=b^\\mathrm{obs})$$\n", "In the case with Gaussian noise and Gaussian prior, this is the classic Tikhonov solution\n", "\n", "$$\\mathbf{x}^* = \\underset{\\mathbf{x}}{\\operatorname{argmin\\;}} \\frac{1}{2 \\sigma^2} ||\\mathbf{A}\\mathbf{x}- b^\\mathrm{obs}||_2^2 + \\frac{1}{2\\delta^2}||\\mathbf{x} ||^2_2$$\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To compute (and print) the MAP estimate in CUQIpy, we write:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "map_estimate = BP.MAP()\n", "print(map_estimate)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.2. Sampling from the posterior \n", "\n", "The MAP point is a very useful point estimate, but it does not provide information about the uncertainty associated with the estimate. To quantify this uncertainty, we need to draw samples from the posterior distribution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To sample the posterior distribution in CUQIpy, we can use the `sample_posterior` method of the `BayesianProblem` object:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "samples = BP.sample_posterior(1000, experimental=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We visualize the samples:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot the posterior PDF\n", "plot_2D_density(BP.posterior, x1_lim[0], x1_lim[1], x2_lim[0], x2_lim[1])\n", "\n", "# Plot the posterior samples\n", "samples.plot_pair(ax=plt.gca())\n", "plt.plot(map_estimate[0], map_estimate[1], 'ro')\n", "plt.ylim(x2_lim);\n", "plt.xlim(x1_lim);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Summary \n", "In this notebook, we defined a simple Bayesian inverse problem (BIP) of two unknown parameters and solved it using CUQIpy. We defined the forward model, the prior, and the data distribution; and created simulated data. We then combined these components to define the BIP using the `BayesianProblem` class. We computed the maximum a posteriori (MAP) estimate and sampled from the posterior distribution to quantify the uncertainty in the estimate." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References \n", "1. Latz, J. (2020). On the well-posedness of Bayesian inverse problems. SIAM/ASA Journal on Uncertainty Quantification, 8(1), 451-482." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.5" } }, "nbformat": 4, "nbformat_minor": 4 }