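Introduction

These are study notes on 『スタンフォード　ベクトル・行列からはじめる最適化数学』.
The aim is to understand the material by filling in the gaps between the equations and by reproducing the results in Python. Please read these notes alongside the book.

This article covers Section 3.3, "Standard deviation".
We review the definition of the standard deviation and derive its properties.

[Previous article]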

www.anarchive-beta.com

[Other articles]

www.anarchive-beta.com

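[In this article]

Introduction
Definition of the standard deviation
Properties of the standard deviation
- Adding a constant
- Scalar multiplication
Relation between the standard deviation, root mean square, and mean
References
Closing remarks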

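Definition of the standard deviation

First, we review the defining formula of the standard deviation and derive an expression for it in terms of vectors.
For the mean, see 「1.4：内積の性質と計算例【『スタンフォード線形代数入門』のノート】 - からっぽのしょこ」; for the norm, see 「3.1-2：ユークリッドノルム・ユークリッド距離の性質と計算例【『スタンフォード線形代数入門』のノート】 - からっぽのしょこ」.

The standard deviation is defined as the root mean square of the deviations (the square root of the variance).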

$\displaystyle \begin{align} \mathrm{std}(\mathbf{x}) &= \sqrt{ \frac{ (x_1 - \mathrm{avg}(\mathbf{x}))^2 + (x_2 - \mathrm{avg}(\mathbf{x}))^2 + \cdots + (x_n - \mathrm{avg}(\mathbf{x}))^2 }{ n } } \\ &= \frac{1}{\sqrt{n}} \sqrt{ \sum_{i=1}^n (x_i - \mathrm{avg}(\mathbf{x}))^2 } \tag{1} \end{align}$

Here $\mathrm{avg}(\mathbf{x})$ is the mean of $\mathbf{x}$ (Section 1.4).
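As a quick check of definition (1), here is a minimal sketch that computes the standard deviation element by element. The function names `avg` and `std_by_definition` and the example vector are assumptions made for illustration; only the Python standard library is used.

```python
import math


def avg(x):
    """Mean of the elements of x."""
    return sum(x) / len(x)


def std_by_definition(x):
    """Equation (1): root mean square of the deviations from the mean."""
    m = avg(x)
    squared_deviations = [(xi - m) ** 2 for xi in x]
    return math.sqrt(sum(squared_deviations) / len(x))


# Hypothetical example vector (not taken from the book).
x = [1.0, 2.0, 3.0, 4.0]

print(avg(x))                # mean of x
print(std_by_definition(x))  # standard deviation of x
```

For this vector the mean is 2.5 and the squared deviations sum to 5, so the result equals $\sqrt{5/4}$.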

Write the difference (deviation) between each element $x_i$ of $\mathbf{x}$ and the mean $\mathrm{avg}(\mathbf{x})$ as $\tilde{x}_i = x_i - \mathrm{avg}(\mathbf{x})$, and collect the $n$ deviations $\tilde{x}_i$ into the de-meaned vector

$\displaystyle \tilde{\mathbf{x}} = \begin{bmatrix} \tilde{x}_1 \\ \tilde{x}_2 \\ \vdots \\ \tilde{x}_n \end{bmatrix} = \begin{bmatrix} x_1 - \mathrm{avg}(\mathbf{x}) \\ x_2 - \mathrm{avg}(\mathbf{x}) \\ \vdots \\ x_n - \mathrm{avg}(\mathbf{x}) \end{bmatrix}$

The sum of squares under the square root in definition (1) is the inner product of $\tilde{\mathbf{x}}$ with itself (Section 1.4), and the square root of that inner product is the Euclidean norm (Section 3.1), so the definition can be rewritten in terms of $\tilde{\mathbf{x}}$.

$\displaystyle \mathrm{std}(\mathbf{x}) = \frac{ \sqrt{ \tilde{\mathbf{x}}^{\top} \tilde{\mathbf{x}} } }{ \sqrt{n} } = \frac{ \|\tilde{\mathbf{x}}\| }{ \sqrt{n} }$

The de-meaned vector $\tilde{\mathbf{x}}$ can also be written using $\mathbf{x}$ and the ones vector $\mathbf{1}$:

$\displaystyle \tilde{\mathbf{x}} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} - \begin{bmatrix} \mathrm{avg}(\mathbf{x}) \\ \mathrm{avg}(\mathbf{x}) \\ \vdots \\ \mathrm{avg}(\mathbf{x}) \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} - \mathrm{avg}(\mathbf{x}) \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} = \mathbf{x} - \mathrm{avg}(\mathbf{x}) \mathbf{1}$

Substituting this into the expression for the standard deviation gives the following.

$\displaystyle \mathrm{std}(\mathbf{x}) = \frac{1}{\sqrt{n}} \sqrt{ \Bigl[ \mathbf{x} - \mathrm{avg}(\mathbf{x}) \mathbf{1} \Bigr]^{\top} \Bigl[ \mathbf{x} - \mathrm{avg}(\mathbf{x}) \mathbf{1} \Bigr] } = \frac{ \|\mathbf{x} - \mathrm{avg}(\mathbf{x}) \mathbf{1}\| }{ \sqrt{n} }$

Furthermore, we substitute $\mathrm{avg}(\mathbf{x}) = \frac{1}{n} \mathbf{1}^{\top} \mathbf{x}$ (Section 1.4).

$\displaystyle \mathrm{std}(\mathbf{x}) = \frac{1}{\sqrt{n}} \sqrt{ \Bigl[ \mathbf{x} - \frac{1}{n} (\mathbf{1}^{\top} \mathbf{x}) \mathbf{1} \Bigr]^{\top} \Bigl[ \mathbf{x} - \frac{1}{n} (\mathbf{1}^{\top} \mathbf{x}) \mathbf{1} \Bigr] } = \frac{ \| \mathbf{x} - \frac{1}{n} (\mathbf{1}^{\top} \mathbf{x}) \mathbf{1} \| }{ \sqrt{n} } \tag{2}$

This gives a formula for the standard deviation in terms of a vector norm.
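The norm formula (2) can be sketched in the same way. The helper names `inner`, `norm`, and `std_by_norm` and the example vector are assumptions for illustration. Plain lists are used so that the sketch runs with the standard library alone, although an array library would be the more natural choice for vector arithmetic. The result is compared with `statistics.pstdev`, which computes the population standard deviation (division by $n$).

```python
import math
import statistics


def inner(a, b):
    """Inner product a^T b of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def norm(a):
    """Euclidean norm: square root of the inner product of a with itself."""
    return math.sqrt(inner(a, a))


def std_by_norm(x):
    """Equation (2): || x - (1/n)(1^T x) 1 || / sqrt(n)."""
    n = len(x)
    ones = [1.0] * n
    mean = inner(ones, x) / n                               # avg(x) = (1/n) 1^T x
    x_tilde = [xi - mean * oi for xi, oi in zip(x, ones)]   # de-meaned vector
    return norm(x_tilde) / math.sqrt(n)


# Hypothetical example vector (not taken from the book).
x = [2.0, -1.0, 4.0, 3.0]

print(std_by_norm(x))        # value from the norm formula (2)
print(statistics.pstdev(x))  # population standard deviation for comparison
```

The two printed values should agree up to floating-point rounding.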

Properties of the standard deviation

Next, we derive properties of the standard deviation.

Adding a constant

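Consider shifting every element of $\mathbf{x}$ by the same constant. To match the equations that follow, the shift is written as subtracting a constant $a$ (which is the same as adding the constant $-a$):

$\displaystyle \mathbf{x} - a \mathbf{1} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} - a \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} = \begin{bmatrix} x_1 - a \\ x_2 - a \\ \vdots \\ x_n - a \end{bmatrix}$

We consider the standard deviation of this vector.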

Replace $\mathbf{x}$ in the norm formula (2) with $\mathbf{x} - a \mathbf{1}$ and simplify.

$\displaystyle \begin{aligned} \mathrm{std}(\mathbf{x} - a \mathbf{1}) &= \frac{1}{\sqrt{n}} \left\| [\mathbf{x} - a \mathbf{1}] - \frac{1}{n} \Bigl( \mathbf{1}^{\top} [\mathbf{x} - a \mathbf{1}] \Bigr) \mathbf{1} \right\| \\ &= \frac{1}{\sqrt{n}} \left\| \mathbf{x} - a \mathbf{1} - \frac{1}{n} \Bigl( \mathbf{1}^{\top} \mathbf{x} - a \mathbf{1}^{\top} \mathbf{1} \Bigr) \mathbf{1} \right\| \\ &= \frac{1}{\sqrt{n}} \left\| \mathbf{x} - a \mathbf{1} - \frac{1}{n} (\mathbf{1}^{\top} \mathbf{x}) \mathbf{1} + \frac{a}{n} (\mathbf{1}^{\top} \mathbf{1}) \mathbf{1} \right\| \end{aligned}$

By the inner-product property $\mathbf{1}^{\top} \mathbf{1} = n$ (Section 1.4), the terms involving $a$ cancel.

$\displaystyle \begin{aligned} \mathrm{std}(\mathbf{x} - a \mathbf{1}) &= \frac{1}{\sqrt{n}} \left\| \mathbf{x} - a \mathbf{1} - \frac{1}{n} (\mathbf{1}^{\top} \mathbf{x}) \mathbf{1} + \frac{a}{n} n \mathbf{1} \right\| \\ &= \frac{1}{\sqrt{n}} \left\| \mathbf{x} - a \mathbf{1} - \frac{1}{n} (\mathbf{1}^{\top} \mathbf{x}) \mathbf{1} + a \mathbf{1} \right\| \\ &= \frac{1}{\sqrt{n}} \left\| \mathbf{x} - \frac{1}{n} (\mathbf{1}^{\top} \mathbf{x}) \mathbf{1} \right\| = \mathrm{std}(\mathbf{x}) \end{aligned}$

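We recover formula (2) for the standard deviation of $\mathbf{x}$.

Hence adding the same constant to (or subtracting it from) every element of a vector does not change the standard deviation.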
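The shift invariance derived above can be checked numerically with a small sketch. The function name `std`, the example vector, and the constant `a` are assumptions chosen for illustration.

```python
import math


def std(x):
    """Population standard deviation: RMS of the deviations from the mean."""
    n = len(x)
    mean = sum(x) / n
    return math.sqrt(sum((xi - mean) ** 2 for xi in x) / n)


# Hypothetical example vector and constant (not taken from the book).
x = [2.0, -1.0, 4.0, 3.0]
a = 10.0

shifted = [xi - a for xi in x]  # the vector x - a 1

print(std(x))                               # std(x)
print(std(shifted))                         # std(x - a 1)
print(math.isclose(std(x), std(shifted)))   # whether the two agree
```

The comparison uses `math.isclose` rather than `==` because floating-point rounding can differ slightly between the two computations.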

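Scalar multiplication

Next, consider the vector obtained by multiplying every element of $\mathbf{x}$ by a constant $a$:

$\displaystyle a \mathbf{x} = a \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} a x_1 \\ a x_2 \\ \vdots \\ a x_n \end{bmatrix}$

We consider the standard deviation of this vector.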

Replace $\mathbf{x}$ in the norm formula (2) with $a \mathbf{x}$ and simplify.

$\displaystyle \begin{aligned} \mathrm{std}(a \mathbf{x}) &= \frac{1}{\sqrt{n}} \left\| [a \mathbf{x}] - \frac{1}{n} \Bigl( \mathbf{1}^{\top} [a \mathbf{x}] \Bigr) \mathbf{1} \right\| \\ &= \frac{1}{\sqrt{n}} \left\| a \mathbf{x} - \frac{a}{n} \Bigl( \mathbf{1}^{\top} \mathbf{x} \Bigr) \mathbf{1} \right\| \end{aligned}$

By the norm property $\|\beta \mathbf{x}\| = |\beta| \|\mathbf{x}\|$ (Section 3.1), the factor $a$ can be moved outside the norm as $|a|$.

$\displaystyle \mathrm{std}(a \mathbf{x}) = \frac{|a|}{\sqrt{n}} \left\| \mathbf{x} - \frac{1}{n} (\mathbf{1}^{\top} \mathbf{x}) \mathbf{1} \right\| = |a| \mathrm{std}(\mathbf{x})$

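The result is formula (2) for the standard deviation of $\mathbf{x}$ multiplied by the absolute value of $a$.

Hence multiplying every element of a vector by the same constant multiplies the standard deviation by the absolute value of that constant.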
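A small sketch can confirm the scaling property, using a negative constant so that the absolute value matters. The function name `std`, the example vector, and the constant `a` are assumptions chosen for illustration.

```python
import math


def std(x):
    """Population standard deviation: RMS of the deviations from the mean."""
    n = len(x)
    mean = sum(x) / n
    return math.sqrt(sum((xi - mean) ** 2 for xi in x) / n)


# Hypothetical example vector and a negative constant (not taken from the book).
x = [2.0, -1.0, 4.0, 3.0]
a = -3.0

scaled = [a * xi for xi in x]  # the vector a x

print(std(scaled))       # std(a x)
print(abs(a) * std(x))   # |a| std(x)
print(math.isclose(std(scaled), abs(a) * std(x)))
```

Without the absolute value the right-hand side would be negative here, while a standard deviation is never negative.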

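Relation between the standard deviation, root mean square, and mean

Finally, we derive the relation between the standard deviation, the root mean square, and the mean.

Consider the square of the standard deviation.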

$\displaystyle \begin{aligned} \mathrm{std}(\mathbf{x})^2 &= \left( \frac{ \|\mathbf{x} - \mathrm{avg}(\mathbf{x}) \mathbf{1}\| }{ \sqrt{n} } \right)^2 \\ &= \frac{ \|\mathbf{x} - \mathrm{avg}(\mathbf{x}) \mathbf{1}\|^2 }{ n } \end{aligned}$

Expanding the squared norm in the numerator gives

$\displaystyle \begin{aligned} \|\mathbf{x} - \mathrm{avg}(\mathbf{x}) \mathbf{1}\|^2 &= [\mathbf{x} - \mathrm{avg}(\mathbf{x}) \mathbf{1}]^{\top} [\mathbf{x} - \mathrm{avg}(\mathbf{x}) \mathbf{1}] \\ &= \mathbf{x}^{\top} \mathbf{x} - \mathbf{x}^{\top} [\mathrm{avg}(\mathbf{x}) \mathbf{1}] - [\mathrm{avg}(\mathbf{x}) \mathbf{1}]^{\top} \mathbf{x} + [\mathrm{avg}(\mathbf{x}) \mathbf{1}]^{\top} [\mathrm{avg}(\mathbf{x}) \mathbf{1}] \\ &= \mathbf{x}^{\top} \mathbf{x} - \mathrm{avg}(\mathbf{x}) \mathbf{x}^{\top} \mathbf{1} - \mathrm{avg}(\mathbf{x}) \mathbf{1}^{\top} \mathbf{x} + \mathrm{avg}(\mathbf{x})^2 \mathbf{1}^{\top} \mathbf{1} \\ &= \mathbf{x}^{\top} \mathbf{x} - 2 \mathrm{avg}(\mathbf{x}) \mathbf{1}^{\top} \mathbf{x} + \mathrm{avg}(\mathbf{x})^2 \mathbf{1}^{\top} \mathbf{1} \end{aligned}$

In the last step, the two cross terms were combined using the inner-product property $\mathbf{a}^{\top} \mathbf{b} = \mathbf{b}^{\top} \mathbf{a}$.
Furthermore, since $\mathbf{1}^{\top} \mathbf{x} = n \mathrm{avg}(\mathbf{x})$ and $\mathbf{1}^{\top} \mathbf{1} = n$ (Section 1.4), we obtain

$\displaystyle \begin{aligned} \|\mathbf{x} - \mathrm{avg}(\mathbf{x}) \mathbf{1}\|^2 &= \mathbf{x}^{\top} \mathbf{x} - 2 n \mathrm{avg}(\mathbf{x})^2 + n \mathrm{avg}(\mathbf{x})^2 \\ &= \mathbf{x}^{\top} \mathbf{x} - n \mathrm{avg}(\mathbf{x})^2 \end{aligned}$

Substitute this result into the original expression for the squared standard deviation.

$\displaystyle \begin{aligned} \mathrm{std}(\mathbf{x})^2 &= \frac{ \mathbf{x}^{\top} \mathbf{x} - n \mathrm{avg}(\mathbf{x})^2 }{ n } \\ &= \frac{\mathbf{x}^{\top} \mathbf{x}}{n} - \mathrm{avg}(\mathbf{x})^2 \end{aligned}$

Replace the first term with the square of the root mean square (the mean square), $\mathrm{rms}(\mathbf{x})^2 = \mathrm{ms}(\mathbf{x}) = \frac{\mathbf{x}^{\top} \mathbf{x}}{n}$.

$\displaystyle \mathrm{std}(\mathbf{x})^2 = \mathrm{rms}(\mathbf{x})^2 - \mathrm{avg}(\mathbf{x})^2$

This gives the relation between the square of the standard deviation, the square of the root mean square, and the square of the mean.
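The relation can also be checked numerically. The sketch below is only an illustration: the function names `avg`, `rms`, and `std` and the example vector are assumptions, and only the standard library is used.

```python
import math


def avg(x):
    """Mean of the elements of x."""
    return sum(x) / len(x)


def rms(x):
    """Root mean square: sqrt(x^T x / n)."""
    return math.sqrt(sum(xi * xi for xi in x) / len(x))


def std(x):
    """Population standard deviation: RMS of the deviations from the mean."""
    m = avg(x)
    return math.sqrt(sum((xi - m) ** 2 for xi in x) / len(x))


# Hypothetical example vector (not taken from the book).
x = [2.0, -1.0, 4.0, 3.0]

lhs = std(x) ** 2                  # std(x)^2
rhs = rms(x) ** 2 - avg(x) ** 2    # rms(x)^2 - avg(x)^2

print(lhs, rhs)
print(math.isclose(lhs, rhs))
```

For this vector $\mathrm{rms}(\mathbf{x})^2 = 30/4 = 7.5$ and $\mathrm{avg}(\mathbf{x})^2 = 4$, so both sides equal 3.5 up to rounding. When the mean is large compared with the spread, the subtraction on the right-hand side can lose precision in floating-point arithmetic, so the de-meaned form is usually the safer one to compute with.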

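In this article, we reviewed the definition and properties of the standard deviation. The next article covers the definition and properties of standardization.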

References

Stephen Boyd・Lieven Vandenberghe (authors), 玉木徹 (translator), 『スタンフォード　ベクトル・行列からはじめる最適化数学』, 講談社サイエンティフィク, 2021.

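Closing remarks

Is this linear algebra? What is linear algebra in the first place? My current understanding is that anything involving vector and matrix computations counts as linear algebra.

In this book, computations (manipulations of formulas) are often arranged so that variables are handled as vectors ($n$ values at a time), making heavy use of inner products and similar tools. This blog follows the book in that respect.
For the standard deviation, I wrote a separate article that uses summations so that the computation for the $i$-th element is visible. If you wonder why anyone would bother rewriting it as a norm, please read that one as well.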

www.anarchive-beta.com

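In this book the standard deviation also appears in the form of standardized vectors, so please read the next article together with this one.

[Next article]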

からっぽのしょこ

3.3：標準偏差の計算式【『スタンフォード線形代数入門』のノート】
