Processing math: 100%

Saturday, October 20, 2018

Linear Discriminant Analysis -1

Computing classification means computing class posterior probabilities Pr(G|X) - that is probability that class is G given input X. If fk(x) is the class conditional density of X in class Gk and if πk is prior probability, then Bayes theorem gives, Pr(G=k|X=x)=fk(x)πkKl=1fl(x)πl

Depending on the models for class density, different techniques emerge.
* linear and quadratic discrimnant analysis use Gaussian densities
* Mixture of Gaussians lead to non-linear decision boundaries.
* Navie Bayes models assume that each class density is product of marginal densities.
Derivations:
Suppose that each class density is modelled as multivariate Gaussian. fk(x)=1(2π)p/2|Σk|1/2e12(xμk)TΣ1k(xμk)
Clear that Pr(G=k|X=x)=fk(x)πkKl=1fl(x)πlPr(G=l|X=x)=fl(x)πlKp=1fp(x)πpPr(G=k|X=x)Pr(G=l|X=x)=fk(x)πkfl(x)πllog(Pr(G=k|X=x)Pr(G=l|X=x))=logfk(x)fl(x)+logπk(x)πl(x)
Now expression fk(x)fl(x) can be simplified as follows. Note Σ is assumed to be same for all the classes. log(fk(x)fl(x))=[12(xμk)TΣ1(xμk)+12(xμl)TΣ1(xμl)]
And =12[xTΣ1xxTΣ1μkμTkΣ1x+μTkΣ1μkxTΣ1x+xTΣ1μk+μlΣ1xμTlΣ1μl]
Next =12[xTΣ1x+xTΣ1μk+μTkΣ1xμTkΣ1μk+xTΣ1xxTΣ1μlμTlΣ1x+μTlΣ1μl]

Before simplying, we should note the following:
  • Since Σ is a covariance matrix and since covariance between i and j is same as j and i, then ΣT=Σ=Σ1.
  • The mean matrices μk for class k and μl for class l are p×1 matrix.
  • Terms such as μkΣ1μl have dimensions 1×pp×pp×1=1×1 or scalars aka reals.
Second term and fifth term with leading xT can be simplified as 12xTΣ1(μkμl)
Collecting third and seventh terms x at end. [12(μTkμTl)Σ1x]T=12xTΣ1(μkμl)
The remaining terms are 12μlkΣ1μl+12μTlΣ1μl+12μTkΣ1μl12(μTkΣ1μl)T=12(μk+μl)TΣ1(μkμl)
Combining all those simplification, log(Pr(G=k|X=x)Pr(G=l|X=x))=log(πkπl)12(μk+μl)TΣ1(μkμl)+xTΣ1(μkμl)
At the line that divides class k,l has probabilities Pr(G=k|X=x)=Pr(G=l|X=x) log(1)=0=log(πk)log(πl)12μTkΣ1μk+xTΣ1μk+12μTlΣ1μlxTΣ1μl
From here, the linear discriminant can be written as δk=log(πk)12μTkΣ1μk+xTΣ1μk

No comments:

Post a Comment

Chain complexes on Hilbert spaces

 Chain complexes are mathematical structures used extensively in algebraic topology, homological algebra, and other areas of mathematics. Th...