Subhabrata Posted April 16, 2011

Dear Group,

I am a researcher in Linguistics from India trying to master aspects of machine learning. While doing so, some questions on statistics have come up, along with questions about two models I am trying to read: Naïve Bayes and HMM. If any of the learned members of the group can kindly resolve my queries, I would be grateful.

I. Some Questions on Statistics:

a) Is Θ (theta), the parameter, computed as the probability over the domain of the variable?

b) There are various kinds of distributions, like the binomial, normal, etc. Looking at the data, how do we decide which distribution to fit?

c) A feature vector is generally represented by X, where X = <x1, x2, x3, ..., xn>. If we do POS tagging, how would we represent it? Taking the example "Statistics is a good subject", is it

Statistics[x1] is[x2] good[x3] subject[x4]

or

X = (word="Statistics"[x1], subject="science"[x2], field="study"[x3]) ... ?

(See the first sketch in the P.S. below for how I currently picture this.)

II. Naïve Bayes:

Naïve Bayes is called naïve because of the naïve assumption it makes of class-conditional independence. My question is how the likelihood is computed: is it

P(X1|C1) * P(X2|C2) * ... * P(Xn|Cn)

or as given in Wikipedia? (See the second sketch below.)

III. Hidden Markov Model:

a) Forward Algorithm: The forward probability is given by the recursion

α_{t+1}(j) = [ Σ_i α_t(i) a_ij ] b_j(O_{t+1}), with initialization α_1(i) = π_i b_i(O_1).

Can it also be computed as Σ_S P(V, S), i.e., by summing the joint probability of the observation sequence V and a state sequence S over all state sequences? (See the third sketch below.)

b) Viterbi: Can we compute Viterbi as b * V_i * max(α_i)? (See the fourth sketch below.)

Notations are taken from Rabiner's (1989) paper.

NB: If some notations differ slightly from the paper, my apologies for the same.

Best Regards,
Subhabrata Banerjee.
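P.S. To make my questions concrete, here are a few small Python sketches of how I currently understand things; please correct me if the sketches are wrong. First, for question I.c, my guess is that in POS tagging each token gets its own feature vector, so the sentence becomes a sequence of feature vectors. The feature names here (word, suffix, is_capitalized) are my own invention:

```python
# Sketch for question I.c: one feature dictionary per token, so the
# sentence becomes X = <x1, x2, ..., xn>, one xi per word.
# The particular features chosen are just my own guesses.

def token_features(sentence, i):
    word = sentence[i]
    return {
        "word": word,                       # the token itself
        "suffix": word[-2:],                # last two characters
        "is_capitalized": word[0].isupper()
    }

sentence = "Statistics is a good subject".split()
X = [token_features(sentence, i) for i in range(len(sentence))]
print(X[0])
# -> {'word': 'Statistics', 'suffix': 'cs', 'is_capitalized': True}
```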
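For question II, here is the computation I have in mind: every feature is conditioned on the SAME class C, giving P(x1|C) * P(x2|C) * ... * P(xn|C), and this product is repeated for each candidate class before picking the maximum. The tiny probability tables below are invented purely for illustration:

```python
# Sketch for question II: Naive Bayes likelihood for one class C is
#   P(x1|C) * P(x2|C) * ... * P(xn|C)
# (all features conditioned on the same class), done per class.
import math

# P(feature value | class) -- invented numbers for illustration
cond_prob = {
    ("NOUN", "suffix=cs"): 0.30,
    ("NOUN", "is_capitalized=True"): 0.40,
    ("VERB", "suffix=cs"): 0.01,
    ("VERB", "is_capitalized=True"): 0.05,
}
prior = {"NOUN": 0.6, "VERB": 0.4}

def log_posterior(features, c):
    # log P(C) + sum_i log P(x_i | C); logs avoid numeric underflow
    return math.log(prior[c]) + sum(
        math.log(cond_prob[(c, f)]) for f in features
    )

features = ["suffix=cs", "is_capitalized=True"]
best = max(prior, key=lambda c: log_posterior(features, c))
print(best)  # -> NOUN
```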
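For question III.a, here is the forward recursion as I read it in Rabiner (1989), α_{t+1}(j) = [ Σ_i α_t(i) a_ij ] b_j(O_{t+1}) with α_1(i) = π_i b_i(O_1), together with a brute-force check of my Σ_S P(V, S) question. The two-state model parameters are invented just to exercise the code:

```python
# Sketch for question III.a: forward algorithm in Rabiner's notation.
# States 0..N-1; A[i][j] = a_ij, B[j][o] = b_j(o), pi[i] = initial prob.
# The two-state parameters below are invented for illustration.
import math
from itertools import product

A  = [[0.7, 0.3],
      [0.4, 0.6]]          # transition probabilities a_ij
B  = [[0.9, 0.1],
      [0.2, 0.8]]          # emission probabilities b_j(o)
pi = [0.5, 0.5]            # initial state distribution
O  = [0, 1, 0]             # observation sequence

N, T = len(pi), len(O)

# Initialization: alpha_1(i) = pi_i * b_i(O_1)
alpha = [pi[i] * B[i][O[0]] for i in range(N)]

# Induction: alpha_{t+1}(j) = [ sum_i alpha_t(i) * a_ij ] * b_j(O_{t+1})
for t in range(1, T):
    alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * B[j][O[t]]
             for j in range(N)]

# Termination: P(O | model) = sum_i alpha_T(i)
print(sum(alpha))

# Brute force over ALL state sequences S: sum_S P(O, S), which should
# equal the forward probability above (this is my Sigma_S P(V, S) question).
brute = sum(
    pi[s[0]] * B[s[0]][O[0]] *
    math.prod(A[s[t - 1]][s[t]] * B[s[t]][O[t]] for t in range(1, T))
    for s in product(range(N), repeat=T)
)
print(brute)  # same value as sum(alpha)
```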
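And for question III.b, the Viterbi recursion as I understand it, δ_{t+1}(j) = [ max_i δ_t(i) a_ij ] b_j(O_{t+1}), i.e., the same shape as the forward step with a max in place of the sum, plus backpointers to recover the best path. The same invented two-state model is repeated so the snippet runs on its own:

```python
# Sketch for question III.b: Viterbi replaces the forward algorithm's
# sum over i with a max over i, and keeps backpointers.
# Same invented two-state model as the forward sketch above.

A  = [[0.7, 0.3], [0.4, 0.6]]
B  = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.5, 0.5]
O  = [0, 1, 0]
N, T = len(pi), len(O)

# Initialization: delta_1(i) = pi_i * b_i(O_1)
delta = [pi[i] * B[i][O[0]] for i in range(N)]
back = []

# Induction: delta_{t+1}(j) = [ max_i delta_t(i) * a_ij ] * b_j(O_{t+1})
for t in range(1, T):
    prev = delta
    delta, ptr = [], []
    for j in range(N):
        i_best = max(range(N), key=lambda i: prev[i] * A[i][j])
        delta.append(prev[i_best] * A[i_best][j] * B[j][O[t]])
        ptr.append(i_best)
    back.append(ptr)

# Termination + backtracking: recover the single best state sequence
state = max(range(N), key=lambda s: delta[s])
path = [state]
for ptr in reversed(back):
    state = ptr[state]
    path.append(state)
print(list(reversed(path)))  # most likely state sequence, e.g. [0, 1, 0]
```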