{"id":354,"date":"2025-03-30T16:47:37","date_gmt":"2025-03-30T08:47:37","guid":{"rendered":"https:\/\/www.toothlessos.xyz\/?p=354"},"modified":"2025-03-30T16:47:40","modified_gmt":"2025-03-30T08:47:40","slug":"the-optimization-set-qr-factorization","status":"publish","type":"post","link":"https:\/\/www.toothlessos.xyz\/index.php\/2025\/03\/30\/the-optimization-set-qr-factorization\/","title":{"rendered":"The optimization set: QR factorization"},"content":{"rendered":"<div class=\"wp-block-aioseo-table-of-contents\"><ul><li><a class=\"aioseo-toc-item\" href=\"#aioseo-e\">Definition<\/a><\/li><li><a class=\"aioseo-toc-item\" href=\"#aioseo-uses\">Example: Least Square Problems<\/a><ul><li><a class=\"aioseo-toc-item\" href=\"#aioseo-problem-definition\">Problem definition<\/a><\/li><li><a class=\"aioseo-toc-item\" href=\"#aioseo-algebratic-and\">Algebraic Interpretation<\/a><\/li><li><a class=\"aioseo-toc-item\" href=\"#aioseo-geometric-interpretation\">Geometric Interpretation<\/a><\/li><li><a class=\"aioseo-toc-item\" href=\"#aioseo-solving-least-square-problems\">Solving Least Square Problems<\/a><\/li><li><a class=\"aioseo-toc-item\" href=\"#aioseo-simplifying-computation-with-qr-factorization\">Simplifying computation with QR factorization<\/a><\/li><li><a class=\"aioseo-toc-item\" href=\"#aioseo-condition-numbers\">*Condition numbers<\/a><\/li><\/ul><\/li><li><a class=\"aioseo-toc-item\" href=\"#aioseo-gram-s\">Gram-Schmidt Algorithm<\/a><ul><li><a class=\"aioseo-toc-item\" href=\"#aioseo-some-thoughts\">Some thoughts<\/a><\/li><\/ul><\/li><\/ul><\/div>\n\n<p><\/p>\n\n\n<h2 class=\"wp-block-heading\" id=\"aioseo-e\">Definition<\/h2>\n\n\n\n<p>(Taken from Prof. Sun, Peng&#8217;s slides)<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"522\" src=\"http:\/\/38.246.252.17:8080\/wp-content\/uploads\/2025\/03\/image-21-1024x522.png\" alt=\"\" class=\"wp-image-358\" srcset=\"https:\/\/www.toothlessos.xyz\/wp-content\/uploads\/2025\/03\/image-21-1024x522.png 1024w, https:\/\/www.toothlessos.xyz\/wp-content\/uploads\/2025\/03\/image-21-300x153.png 300w, https:\/\/www.toothlessos.xyz\/wp-content\/uploads\/2025\/03\/image-21-768x391.png 768w, https:\/\/www.toothlessos.xyz\/wp-content\/uploads\/2025\/03\/image-21-1536x782.png 1536w, https:\/\/www.toothlessos.xyz\/wp-content\/uploads\/2025\/03\/image-21.png 1606w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"404\" src=\"http:\/\/38.246.252.17:8080\/wp-content\/uploads\/2025\/03\/image-22-1024x404.png\" alt=\"\" class=\"wp-image-361\" srcset=\"https:\/\/www.toothlessos.xyz\/wp-content\/uploads\/2025\/03\/image-22-1024x404.png 1024w, https:\/\/www.toothlessos.xyz\/wp-content\/uploads\/2025\/03\/image-22-300x118.png 300w, https:\/\/www.toothlessos.xyz\/wp-content\/uploads\/2025\/03\/image-22-768x303.png 768w, https:\/\/www.toothlessos.xyz\/wp-content\/uploads\/2025\/03\/image-22-1536x606.png 1536w, https:\/\/www.toothlessos.xyz\/wp-content\/uploads\/2025\/03\/image-22.png 1707w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>The geometric (orthonormal) properties of the columns of the Q matrix is useful, as we will see in the following example of least square problems.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"aioseo-uses\">Example: Least Square Problems<\/h2>\n\n\n\n<p>(Reference: S. Boyd and L. 
<h2>Example: Least Squares Problems</h2>

<p>(Reference: S. Boyd and L. Vandenberghe, <em>Introduction to Applied Linear Algebra – Vectors, Matrices, and Least Squares</em>, Chapters 10 and 12. <a href="https://web.stanford.edu/~boyd/vmls/vmls.pdf">eBook</a>.)</p>

<h3>Problem definition</h3>

<p>Suppose that we have a matrix A of shape \(m \times n\) with \(m > n\), so the system of linear equations \(Ax = b\) (where b is an m-vector) is over-determined. In other words, there are more equations (m) than variables (n).</p>

<p>Over-determined systems are very common in data-driven methods: the number of entries in a dataset (rows, m) is usually much larger than the number of features (columns, n).</p>

<p>For most over-determined systems, there is no n-vector x such that \(Ax = b\).</p>
<p>However, as a compromise, we can look for an <strong>approximate solution</strong> \(\hat{x}\) that makes the residual \(r = Ax - b\) as small as possible. In the <strong>linear least squares problem</strong>, we minimize the squared norm of the residual r, that is:</p>

\[ \min_x \; \|r\|^2 = \|Ax - b\|^2 \]

<p>In other words, any vector \(\hat{x}\) that satisfies \(\|A\hat{x} - b\|^2 \le \|Ax - b\|^2\) for all x is a solution of the least squares problem.</p>

<h3>Algebraic Interpretation</h3>

<figure><img src="https://www.toothlessos.xyz/wp-content/uploads/2025/03/image-23.png" alt="" /></figure>
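<p>As a supplement to the slide (my restatement of the standard algebraic reading, not a transcription): writing \(\tilde{a}_i^T\) for the i-th row of A, the objective is a sum of squared scalar residuals,</p>

\[ \|Ax - b\|^2 = \sum_{i=1}^{m} \left( \tilde{a}_i^T x - b_i \right)^2 , \]

<p>so the least squares solution trades off the m equations, making all the residuals small simultaneously rather than forcing any single one to zero.</p>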
<h3>Geometric Interpretation</h3>

<p>Consider the case m = 3, n = 2, and let A (shape \(m \times n\)) be as follows:</p>

\[ A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} \]

<p>Here, the columns of A span only a 2-dimensional subspace of \(\mathbb{R}^3\) (the xy-plane), while the vector b can lie anywhere in \(\mathbb{R}^3\): for this A, \(Ax = (x_1, x_2, 0)^T\), so \(Ax = b\) has no solution whenever \(b_3 \ne 0\). Therefore, in most cases it is not possible to find an exact solution.</p>
<p>The approximate solution \(\hat{x}\) gives the linear combination of the columns of A (namely \(A\hat{x}\)) that minimizes the norm of the residual vector \(A\hat{x} - b\).</p>

<figure><img src="https://www.toothlessos.xyz/wp-content/uploads/2025/03/image-24.png" alt="" /></figure>

<p>(Visualization of the residual \(A\hat{x} - b\))</p>

<p>It is worth noting that the optimal residual \(A\hat{x} - b\) is orthogonal to the column space of A. Below is the algebraic justification for this:</p>

<figure><img src="https://www.toothlessos.xyz/wp-content/uploads/2025/03/image-30.png" alt="" /></figure>

<h3>Solving Least Squares Problems</h3>

<p>Here, we need to make the assumption that the column vectors of matrix A are <strong>linearly independent</strong>.</p>

<figure><img src="https://www.toothlessos.xyz/wp-content/uploads/2025/03/image-25.png" alt="" /></figure>

<figure><img src="https://www.toothlessos.xyz/wp-content/uploads/2025/03/image-26.png" alt="" /></figure>
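<p>In standard notation (my restatement, not a transcription of the slides), the solution follows from the orthogonality condition noted earlier:</p>

\[ A^T (A\hat{x} - b) = 0 \;\Longrightarrow\; A^T A \, \hat{x} = A^T b \;\Longrightarrow\; \hat{x} = (A^T A)^{-1} A^T b , \]

<p>where \(A^T A\) is invertible precisely because the columns of A are linearly independent.</p>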
<p>The result \(\hat{x} = (A^T A)^{-1} A^T b\) can be written in the <strong>pseudo-inverse</strong> notation \(\hat{x} = A^{\dagger} b\).</p>

<h3>Simplifying computation with QR factorization</h3>

<p>Given that the column vectors of matrix A are <strong>linearly independent</strong>, we can apply QR factorization to A, which gives a simpler way to compute the pseudo-inverse: since \(A^T A = R^T R\), we get \(A^{\dagger} = R^{-1} Q^T\) and therefore \(\hat{x} = R^{-1} Q^T b\). Moreover, empirical results suggest that going through the QR decomposition usually gives much smaller <strong>condition numbers</strong> than directly forming the pseudo-inverse.</p>

<figure><img src="https://www.toothlessos.xyz/wp-content/uploads/2025/03/image-27.png" alt="" /></figure>

<figure><img src="https://www.toothlessos.xyz/wp-content/uploads/2025/03/image-29.png" alt="" /></figure>

<p>For a non-singular (square) matrix A, we have \(A^{-1} = R^{-1} Q^T\).</p>
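<p>A minimal numpy sketch comparing the two routes on the small example from the geometric interpretation (illustrative code, not from the post; the vector b and the variable names are made up):</p>

<pre><code>import numpy as np

# Over-determined example from the geometric interpretation: m = 3, n = 2.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
b = np.array([1.0, 2.0, 3.0])   # b_3 != 0, so Ax = b has no exact solution

# Route 1: normal equations / pseudo-inverse, x_hat = (A^T A)^{-1} A^T b
x_pinv = np.linalg.solve(A.T @ A, A.T @ b)

# Route 2: QR factorization, x_hat = R^{-1} Q^T b
Q, R = np.linalg.qr(A)              # reduced QR: Q is m x n, R is n x n
x_qr = np.linalg.solve(R, Q.T @ b)  # solve R x = Q^T b instead of forming (A^T A)^{-1}

print(x_pinv, x_qr)                 # both print [1. 2.]
residual = A @ x_qr - b
print(residual)                     # [ 0.  0. -3.]
print(A.T @ residual)               # ~0: the residual is orthogonal to the columns of A
</code></pre>

<p>In practice one would usually call a library routine such as np.linalg.lstsq rather than forming \((A^T A)^{-1}\) explicitly.</p>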
size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"159\" src=\"http:\/\/38.246.252.17:8080\/wp-content\/uploads\/2025\/03\/image-28-1024x159.png\" alt=\"\" class=\"wp-image-416\" srcset=\"https:\/\/www.toothlessos.xyz\/wp-content\/uploads\/2025\/03\/image-28-1024x159.png 1024w, https:\/\/www.toothlessos.xyz\/wp-content\/uploads\/2025\/03\/image-28-300x47.png 300w, https:\/\/www.toothlessos.xyz\/wp-content\/uploads\/2025\/03\/image-28-768x119.png 768w, https:\/\/www.toothlessos.xyz\/wp-content\/uploads\/2025\/03\/image-28-1536x238.png 1536w, https:\/\/www.toothlessos.xyz\/wp-content\/uploads\/2025\/03\/image-28.png 1636w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Smaller condition numbers, better numerical stability!<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"aioseo-gram-s\">Gram-Schmidt Algorithm<\/h2>\n\n\n\n<p>Now we look at the Gram-Schmidt Algorithm that we use to actually implement the QR factorization.<\/p>\n\n\n\n<p>Finding the Q matrix in essentially finding orthonormal basis of column space of A. The algorithm below utilizes this orthonormal property well. For <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.toothlessos.xyz\/wp-content\/ql-cache\/quicklatex.com-c2fc519a802d0cd73a2b7a74f802833b_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#117;&#95;&#49;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"16\" style=\"vertical-align: -3px;\"\/>, we just take the unit vector <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.toothlessos.xyz\/wp-content\/ql-cache\/quicklatex.com-1900a484b687cf3bfd85c929f272f63c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#101;&#95;&#49;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"14\" style=\"vertical-align: -3px;\"\/> that points into the same direction with <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.toothlessos.xyz\/wp-content\/ql-cache\/quicklatex.com-460f5abc45b558edf34fe288dc0a9979_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#97;&#95;&#49;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"15\" style=\"vertical-align: -3px;\"\/>. For the following <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.toothlessos.xyz\/wp-content\/ql-cache\/quicklatex.com-71a61e35897b62da24cd85c110c2b047_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#117;&#95;&#105;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"15\" style=\"vertical-align: -3px;\"\/>, we compute them by subtracting <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.toothlessos.xyz\/wp-content\/ql-cache\/quicklatex.com-f91083f3035e5168a6f0b3e6335d6858_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#97;&#95;&#105;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"14\" style=\"vertical-align: -3px;\"\/> by it&#8217;s projections onto all the orthonormal basis we&#8217;ve already computed. In this case, we make sure that the residue is orthogonal to all the existing basis (i.e. the residue cannot be expressed by a linear combination of the existing basis). 
<p>After this, we simply normalize \(u_i\) to get the orthonormal basis vector \(e_i\).</p>

<p>For the R matrix, we note that for orthonormal basis vectors \(e_i\) and \(e_j\):</p>

\[ e_i \cdot e_j = 0 \;\; (i \ne j), \qquad e_i \cdot e_j = 1 \;\; (i = j) \]

<p>Because of this, \(Q^T Q = I\), so \(R = Q^T A\), i.e. \(R_{ij} = e_i \cdot a_j\), which is zero below the diagonal. The results can be easily validated by doing the matrix multiplication!</p>

<figure><img src="https://www.toothlessos.xyz/wp-content/uploads/2025/03/image-31.png" alt="" /></figure>

<figure><img src="https://www.toothlessos.xyz/wp-content/uploads/2025/03/image-32.png" alt="" /></figure>

<p>(A simple visualization)</p>
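<p>Below is a short numpy sketch of classical Gram-Schmidt as described above (illustrative code, not from the post). Note that classical Gram-Schmidt can lose orthogonality in floating point, so library routines such as np.linalg.qr, which use Householder reflections, are preferred in practice.</p>

<pre><code>import numpy as np

def gram_schmidt_qr(A):
    """QR factorization via classical Gram-Schmidt.
    Assumes the columns of A are linearly independent."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for i in range(n):
        u = A[:, i].copy()
        for j in range(i):
            R[j, i] = Q[:, j] @ A[:, i]   # coefficient r_{ji} = e_j . a_i
            u -= R[j, i] * Q[:, j]        # subtract the projection onto e_j
        R[i, i] = np.linalg.norm(u)
        Q[:, i] = u / R[i, i]             # normalize the remainder to get e_i
    return Q, R

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Q, R = gram_schmidt_qr(A)
print(np.allclose(Q.T @ Q, np.eye(2)))    # True: columns of Q are orthonormal
print(np.allclose(Q @ R, A))              # True: A = QR
</code></pre>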
<h3>Some thoughts</h3>

<p>Borrowing an idea from my professor, the Gram-Schmidt algorithm can be seen as a form of incremental learning: instead of computing the whole Q and R matrices in one shot, we build them up step by step by subtracting projections.</p>

<p>This actually reminds me of two machine learning ideas that use the same principle:</p>

<ol>
<li>The <strong>residual connections</strong> in <strong>ResNet</strong>
<ul>
<li>Instead of learning the direct mapping H(x), residual blocks learn the residual F(x) = H(x) - x. This reformulation makes it easier to optimize identity mappings: the network can <strong>simply drive F(x) to zero rather than learning an identity function from scratch</strong>. (Generated by DeepSeek)</li>
</ul>
</li>
<li>The gradual denoising process in <strong>Diffusion Models</strong></li>
</ol>