- The Nonparametric Bootstrap
- Resampling Plans
- The Parametric Bootstrap
- Influence Functions and Robust Estimation
- The Percentile Method
- Bias-Corrected Confidence Intervals
- Second-Order Accuracy
- Bootstrap-t Intervals
- Objective Bayes Intervals and the Confidence Distribution
- A family of probability densities
- Log likelihood functions
- Maximum Likelihood Estimate
- Fisher Information
- Confidence Interval
- Permutation and Randomization
A family of probability densities
$\mathcal{F} = \{ f_\mu(x);\ x \in \mathcal{X},\ \mu \in \Omega \}$. Here $x$, the observed data, is a point in the sample space $\mathcal{X}$, while the unobserved parameter $\mu$ is a point in the parameter space $\Omega$.
Log likelihood functions
The log likelihood function is $\ell_x(\mu) = \log f_\mu(x)$, regarded as a function of $\mu$ with the data $x$ held fixed.
Maximum Likelihood Estimate
The maximum likelihood estimate (MLE) is the maximizer of the log likelihood, $\hat{\mu} = \arg\max_{\mu \in \Omega} \ell_x(\mu)$.
Fisher Information
With a one-parameter family of densities $f_\theta(x)$, $\theta \in \Theta \subseteq \mathbb{R}$, write $\ell_x(\theta) = \log f_\theta(x)$.
The score function is the first derivative of $\ell_x(\theta)$ with respect to $\theta$: $\dot{\ell}_x(\theta) = \partial \log f_\theta(x) / \partial \theta$.
Then, the Fisher information is defined to be the variance of the score function, $\mathcal{I}_\theta = \operatorname{Var}_\theta\{\dot{\ell}_x(\theta)\}$ (the score has expectation zero, so $\mathcal{I}_\theta = E_\theta\{\dot{\ell}_x(\theta)^2\}$).
The MLE has approximately normal distribution with mean $\theta$ and variance $1/\mathcal{I}_\theta$: $\hat{\theta} \,\dot\sim\, \mathcal{N}(\theta,\ 1/\mathcal{I}_\theta)$.
Confidence Interval
For an i.i.d. sample $x = (x_1, \dots, x_n)$ the information adds up: the total Fisher information is $n\,\mathcal{I}_\theta$, so $\hat{\theta} \,\dot\sim\, \mathcal{N}(\theta,\ 1/(n\mathcal{I}_\theta))$.
With a one-parameter family of densities, an approximate 95% confidence interval for $\theta$ is $\hat{\theta} \pm 1.96/\sqrt{I(\hat{\theta})}$, where $I(\hat{\theta}) = -\,\partial^2 \ell_x(\theta)/\partial\theta^2\,\big|_{\theta=\hat{\theta}}$ is the observed Fisher information.
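As a quick numerical check, here is a minimal numpy sketch of the recipe above; the Poisson family, sample size, and seed are my own illustrative choices, not from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.poisson(lam=3.0, size=100)   # simulated i.i.d. sample
n = x.size

# Poisson log likelihood: l(theta) = sum(x_i) * log(theta) - n * theta (+ const)
theta_hat = x.mean()                 # MLE (closed form for the Poisson family)

# Observed Fisher information: I(theta_hat) = -l''(theta_hat) = sum(x) / theta_hat**2
obs_info = x.sum() / theta_hat**2

# Approximate 95% confidence interval: theta_hat +/- 1.96 / sqrt(I(theta_hat))
half_width = 1.96 / np.sqrt(obs_info)
print(f"MLE = {theta_hat:.3f}, "
      f"95% CI = ({theta_hat - half_width:.3f}, {theta_hat + half_width:.3f})")
```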
Permutation and Randomization
* Linear hull (span): the set of all linear combinations $\sum_i \lambda_i x^{(i)}$ of points $x^{(i)} \in S$, with $\lambda_i \in \mathbb{R}$.
* Affine hull: $\operatorname{aff}(S) = \{\sum_i \lambda_i x^{(i)} : x^{(i)} \in S,\ \sum_i \lambda_i = 1\}$.
* Convex hull: $\operatorname{conv}(S) = \{\sum_i \lambda_i x^{(i)} : x^{(i)} \in S\}$, where $\lambda_i \ge 0$ and $\sum_i \lambda_i = 1$ (see the sketch after this list).
* Conic hull: $\operatorname{cone}(S) = \{\sum_i \lambda_i x^{(i)} : x^{(i)} \in S,\ \lambda_i \ge 0\}$.
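A minimal scipy sketch of the convex-hull definition (the random points and tolerance are illustrative): any convex combination of points of $S$ must satisfy every facet inequality of $\operatorname{conv}(S)$.

```python
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(1)
pts = rng.standard_normal((30, 2))   # 30 random points in the plane
hull = ConvexHull(pts)

# A convex combination of the points: lambda_i >= 0, summing to 1
lam = rng.random(30)
lam /= lam.sum()
combo = lam @ pts

# A point lies in conv(S) iff it satisfies every facet inequality A x + b <= 0
A, b = hull.equations[:, :-1], hull.equations[:, -1]
print("inside hull:", np.all(A @ combo + b <= 1e-12))
```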
- Optimality conditions
$\nabla f_0(x^*) + \sum_{i \in \mathcal{A}(x^*)} \lambda_i \nabla f_i(x^*) = 0$, and $\lambda_i \ge 0$ for all $i \in \mathcal{A}(x^*)$;
where $\mathcal{A}(x^*) = \{ i : f_i(x^*) = 0 \}$ is the set of active constraints.
Consider a primal problem in standard form: $p^* = \min_x f_0(x)$, s.t. $f_i(x) \le 0$, $i = 1, \dots, m$, $h_j(x) = 0$, $j = 1, \dots, q$.
The Lagrangian is $\mathcal{L}(x, \lambda, \nu) = f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \sum_{j=1}^q \nu_j h_j(x)$, where $\lambda_i, \nu_j$ are called the Lagrange multipliers, or dual variables, of the problem.
Lagrange dual function
The Lagrange dual function is $g(\lambda, \nu) = \min_x \mathcal{L}(x, \lambda, \nu)$.
Lower bound property of the dual function
The dual function is jointly concave in $(\lambda, \nu)$, being a pointwise minimum of functions affine in $(\lambda, \nu)$. Moreover, for any $\lambda \ge 0$ and any $\nu$ it holds that $g(\lambda, \nu) \le p^*$.
Dual optimization problem and weak duality
$d^* = \max_{\lambda, \nu} g(\lambda, \nu)$, s.t. $\lambda \ge 0$.
It is remarkable that the dual problem is always a convex optimization problem, even when the primal problem is not convex. The weak duality property of the dual problem is: $d^* \le p^*$.
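A toy worked example (my own illustration, not from the notes): for $\min_x x^2$ s.t. $1 - x \le 0$, minimizing the Lagrangian over $x$ gives the closed-form dual function $g(\lambda) = \lambda - \lambda^2/4$, and weak duality can be checked numerically.

```python
import numpy as np

# Primal: minimize x**2  s.t.  1 - x <= 0   =>   p* = 1 at x* = 1
p_star = 1.0

# L(x, lam) = x**2 + lam * (1 - x); minimizing over x gives x = lam / 2,
# so the dual function is g(lam) = lam - lam**2 / 4
def g(lam):
    return lam - lam**2 / 4.0

lams = np.linspace(0.0, 5.0, 501)
print("weak duality holds:", np.all(g(lams) <= p_star + 1e-12))
print("d* =", g(lams).max())   # attained at lam = 2; equals p*, so strong duality holds too
```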
Strong duality and Slater's condition for convex programs
Consider a convex primal problem in which the first $k$ constraint functions $f_1, \dots, f_k$ are affine. If there exists a point $\bar{x}$ in the relative interior of the problem domain such that $f_i(\bar{x}) \le 0$ for $i = 1, \dots, k$ and $f_i(\bar{x}) < 0$ for $i = k+1, \dots, m$,
then strong duality holds between the primal and dual problems, that is, $p^* = d^*$.
A primal inequality constraint and the corresponding dual variable cannot both be slack simultaneously.
If $\lambda_i^* > 0$, then it must be that $f_i(x^*) = 0$. If $f_i(x^*) < 0$, then it must be that $\lambda_i^* = 0$.
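Continuing the toy problem above (again my own illustration): at the optimum $x^* = 1$, $\lambda^* = 2$, the constraint is active, so complementary slackness and KKT stationarity both hold.

```python
# Same toy problem: minimize x**2  s.t.  f1(x) = 1 - x <= 0
x_star, lam_star = 1.0, 2.0          # primal and dual optima (see the sketch above)

f1 = 1.0 - x_star                    # active constraint: f1(x*) = 0
# KKT stationarity: d/dx [x**2 + lam * (1 - x)] = 2x - lam = 0 at (x*, lam*)
stationarity = 2 * x_star - lam_star

print("complementary slackness lam* * f1(x*) =", lam_star * f1)   # 0.0
print("stationarity residual =", stationarity)                    # 0.0
```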
- Properties of trace and rank
- Matrix Norm
- Cauchy-Schwarz inequalities and definition of angles
- Range, nullspace, and rank
- Matrix description of subspaces
- Symmetric matrix
- Congruence transformations
- Eigenvalue and eigenvector
- Spectral decomposition (a.k.a. eigendecomposition) for symmetric matrix
- Singular value decomposition
- Cholesky decomposition of p.s.d. and p.d. matrices
- Rayleigh quotient
- Properties of eigenvalues and eigenvectors
Properties of trace and rank
$\operatorname{tr}(\alpha A + B) = \alpha \operatorname{tr}(A) + \operatorname{tr}(B)$ for any scalar $\alpha$ and any $A, B \in \mathbb{R}^{n \times n}$.
Matrix Norm
* Frobenius Norm: $\|A\|_F = \sqrt{\operatorname{tr}(A^\top A)} = \big(\sum_{i,j} a_{ij}^2\big)^{1/2}$
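A small numpy check of both facts (matrices and sizes are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(2)
A, B = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
alpha = 2.5

# Linearity of the trace: tr(alpha*A + B) = alpha*tr(A) + tr(B)
print(np.isclose(np.trace(alpha * A + B), alpha * np.trace(A) + np.trace(B)))

# Frobenius norm two ways: sqrt(tr(A^T A)) and the built-in entrywise formula
print(np.isclose(np.sqrt(np.trace(A.T @ A)), np.linalg.norm(A, "fro")))
```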
Cauchy-Schwarz inequalities and definition of angles
For any $x, y \in \mathbb{R}^n$, $|x^\top y| \le \|x\|_2 \|y\|_2$. We can define the corresponding angle as the $\theta$ such that $\cos\theta = \dfrac{x^\top y}{\|x\|_2\,\|y\|_2}$.
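A minimal numpy sketch (random vectors, purely illustrative) checking the inequality and computing the angle:

```python
import numpy as np

rng = np.random.default_rng(3)
x, y = rng.standard_normal(5), rng.standard_normal(5)

lhs, rhs = abs(x @ y), np.linalg.norm(x) * np.linalg.norm(y)
print("Cauchy-Schwarz holds:", lhs <= rhs)

# Angle between x and y from the normalized inner product
theta = np.arccos((x @ y) / rhs)
print("angle (radians):", theta)
```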
Range, nullspace, and rank
The range of $A \in \mathbb{R}^{m \times n}$ is $\mathcal{R}(A) = \{ Ax : x \in \mathbb{R}^n \}$ and the nullspace is $\mathcal{N}(A) = \{ x \in \mathbb{R}^n : Ax = 0 \}$. The rank of matrix $A$ is the dimension of its range.
Matrix description of subspaces
Symmetric matrix
* Diagonal matrix: a simple special case of a symmetric matrix, with all off-diagonal entries zero.
Any quadratic function $q(x)$ can be written as $q(x) = x^\top A x + b^\top x + c$ with $A = A^\top$ symmetric.
For any matrix $A \in \mathbb{R}^{m \times n}$ it holds that:
* $\operatorname{rank}(A) = \operatorname{rank}(A^\top)$, and $\operatorname{rank}(A) \le \min(m, n)$;
* $\mathcal{N}(A) = \{0\}$ if and only if $A$ is full-column rank, i.e., $\operatorname{rank}(A) = n$;
* $\mathcal{R}(A) = \mathbb{R}^m$ if and only if $A$ is full-row rank, i.e., $\operatorname{rank}(A) = m$.
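A numpy/scipy sketch of these rank facts on a generic tall matrix (size and seed are illustrative):

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 3))      # generic 5x3 matrix: full column rank w.p. 1

print("rank(A) =", np.linalg.matrix_rank(A))           # 3 = n  ->  full column rank
print("rank(A) == rank(A^T):",
      np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T))
print("nullspace dimension:", null_space(A).shape[1])  # 0, consistent with N(A) = {0}
```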
Eigenvalue and eigenvector
- A nonzero vector $v$ with $Av = \lambda v$ is an eigenvector of $A$ with eigenvalue $\lambda$; $\det(A) = \prod_{i=1}^n \lambda_i$. Therefore $\det(A) = 0$ means at least one eigenvalue is 0. Also, $A$ is invertible when all eigenvalues are nonzero.
- Eigenvalues of positive semi-definite matrices are nonnegative. If $A$ is positive semi-definite, then $\lambda_i \ge 0$ for all $i$.
- Eigenvalues of positive definite matrices are positive. If $A$ is positive definite, then $\lambda_i > 0$ for all $i$.
- From the above, a positive semi-definite matrix $A$ is invertible if and only if $A$ is positive definite.
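A short numpy check of these facts on a p.s.d. (here p.d.) matrix built as $BB^\top$; the matrix is an arbitrary illustration:

```python
import numpy as np

rng = np.random.default_rng(5)
B = rng.standard_normal((4, 4))
A = B @ B.T                                 # B B^T is p.s.d. (here p.d. w.p. 1)

eigvals = np.linalg.eigvalsh(A)             # eigvalsh: for symmetric matrices
print("det(A) == product of eigenvalues:",
      np.isclose(np.linalg.det(A), eigvals.prod()))
print("all eigenvalues positive:", np.all(eigvals > 0))   # so A is invertible
```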
Spectral decomposition (a.k.a. eigendecomposition) for symmetric matrix
Any symmetric matrix can be decomposed as a weighted sum of normalized dyads.
If $A = A^\top \in \mathbb{R}^{n \times n}$, then $A$ can be described by the eigenvalues and eigenvectors of $A$: $A = U \Lambda U^\top = \sum_{i=1}^n \lambda_i u_i u_i^\top$, where $U = [u_1, \dots, u_n]$ is orthogonal and $\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$.
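A minimal numpy sketch of the decomposition (random symmetric matrix, illustrative only), rebuilding $A$ from its dyads:

```python
import numpy as np

rng = np.random.default_rng(6)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                            # symmetrize to get A = A^T

lam, U = np.linalg.eigh(A)                   # eigendecomposition of a symmetric matrix
# Rebuild A as a weighted sum of normalized dyads u_i u_i^T
A_rebuilt = sum(lam[i] * np.outer(U[:, i], U[:, i]) for i in range(4))
print("A == U Lambda U^T:", np.allclose(A, A_rebuilt))
```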
Singular value decomposition
In words, the singular value theorem states that any matrix can be written as a sum of simpler matrices (orthogonal dyads).
Let $r = \operatorname{rank}(A)$. Then $A$ can be described by the singular values $\sigma_i$ of $A$ (i.e., the square roots of the eigenvalues of $A^\top A$) and the eigenvectors of $AA^\top$ and $A^\top A$ as follows:
$A = \sum_{i=1}^{r} \sigma_i u_i v_i^\top, \quad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0$
Or, in compact form, $A = U \Sigma V^\top$.
(Because $A^\top A$ and $AA^\top$ are p.s.d., the singular values are real and nonnegative.)
$U$ and $V$ are orthogonal matrices.
* SVD, range, and nullspace
The first $r$ columns of $U$ are an orthonormal basis of the range space (column space) of $A$.
The last $n - r$ columns of $V$ are an orthonormal basis of the nullspace of $A$.
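A numpy sketch of these subspace facts on a rank-2 matrix (the construction is an arbitrary illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
# Rank-2 matrix: product of 4x2 and 2x3 factors
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))

U, s, Vt = np.linalg.svd(A)
r = np.sum(s > 1e-10)                        # numerical rank
print("rank:", r)                            # 2

# First r columns of U span range(A); last n - r columns of V span null(A)
x = rng.standard_normal(3)
y = A @ x                                    # an arbitrary point in range(A)
coeffs = U[:, :r].T @ y
print("y in span(U[:, :r]):", np.allclose(U[:, :r] @ coeffs, y))
print("A v = 0 for null-basis vectors:", np.allclose(A @ Vt[r:].T, 0))
```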
Cholesky decomposition of p.s.d. and p.d. matrices
If there exists a matrix $P$ such that $A = P^\top P$, then $A$ is positive semi-definite.
If $A$ is positive semi-definite, then there exists $P$ such that $A = P^\top P$. That is, any p.s.d. matrix can be written as a product $P^\top P$; $P$ is not unique. If $A$ is positive definite, then we can choose a lower triangular matrix $L$ for the decomposition, $A = L L^\top$, where $L$ is invertible.
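A minimal numpy sketch of the Cholesky factorization (the shifted construction just guarantees positive definiteness for the illustration):

```python
import numpy as np

rng = np.random.default_rng(8)
B = rng.standard_normal((4, 4))
A = B @ B.T + 4 * np.eye(4)                 # shift makes A safely positive definite

L = np.linalg.cholesky(A)                   # lower triangular factor
print("A == L L^T:", np.allclose(A, L @ L.T))
print("L is invertible, det(L) =", np.linalg.det(L))   # nonzero for p.d. A
```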
Example of p.s.d. matrix
The variance-covariance matrix is a notable example of a p.s.d. matrix.
$\Sigma = E\big[(X - \mu)(X - \mu)^\top\big]$, where $\mu = E[X]$. For any vector $v$, $v^\top \Sigma v = \operatorname{Var}(v^\top X) \ge 0$, so $\Sigma$ is p.s.d.
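A quick empirical check (simulated correlated data, illustrative only): the sample covariance matrix has nonnegative eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(9)
X = rng.standard_normal((200, 3)) @ rng.standard_normal((3, 3))   # correlated data

S = np.cov(X, rowvar=False)                 # sample variance-covariance matrix
print("eigenvalues of S:", np.linalg.eigvalsh(S))   # all >= 0: S is p.s.d.
```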
Rayleigh quotient
Given a symmetric matrix $A = A^\top \in \mathbb{R}^{n \times n}$, it holds that $\lambda_{\min}(A) \le \dfrac{x^\top A x}{x^\top x} \le \lambda_{\max}(A)$ for all $x \ne 0$, with equality attained at the corresponding eigenvectors.
Therefore we can solve optimization problems in quadratic form by finding eigenvalues.
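A short numpy sketch of the bracketing property (random symmetric matrix and test vector, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(10)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                            # symmetric matrix

lam = np.linalg.eigvalsh(A)                  # eigenvalues, sorted ascending
x = rng.standard_normal(5)
rayleigh = (x @ A @ x) / (x @ x)
print(lam[0] <= rayleigh <= lam[-1])         # True: bracketed by lam_min and lam_max
```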
Properties of eigenvalues and eigenvectors
| Type of matrix | Eigenvalues | Eigenvectors |
| --- | --- | --- |
| Projection: $P = P^2 = P^\top$ | $\lambda = 1;\ 0$ | column space; nullspace |
| Every matrix: $A = U \Sigma V^\top$ | $\operatorname{rank}(A) = \operatorname{rank}(\Sigma)$ | eigenvectors of $A^\top A$, $AA^\top$ in $V$, $U$ |