Bayesian learning:
My understanding of the law of total probability is:
P[A|B] = integral{ P[A|C,B] * P[C|B] dC }   (integrating over C)
But expression 8.10 claims that:
P[A|B] = integral{ P[A|C] * P[C|B] dC }   (with A = X_{t+1}, B = X_obs, C = W)
Is 8.10 a special case of the general identity?
If so, my guess is that after training, all the information from X_obs is captured in W, so conditioning on X_obs is redundant once we condition on W. In other words, A is conditionally independent of B given C, so P[A|C,B] = P[A|C], which turns the general identity into 8.10.
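To sanity-check this reasoning, here is a small discrete sketch (all distributions below are made up for illustration). It builds a toy joint where A and B are conditionally independent given C, then verifies that P[A|B] computed directly from the joint matches the 8.10-style shortcut integral{ P[A|C] * P[C|B] dC }:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy discrete model: C ("weights") has 3 states, A and B each have 2 states.
# By construction, A and B are conditionally independent given C:
# P[A,B|C] = P[A|C] * P[B|C].
p_c = np.array([0.2, 0.5, 0.3])              # P[C]
p_a_given_c = rng.dirichlet(np.ones(2), 3)   # row c is P[A|C=c]
p_b_given_c = rng.dirichlet(np.ones(2), 3)   # row c is P[B|C=c]

b = 0  # observed value of B

# Direct route: build the full joint P[C,A,B], marginalize out C,
# then condition on B = b.
joint = p_c[:, None, None] * p_a_given_c[:, :, None] * p_b_given_c[:, None, :]
p_ab = joint.sum(axis=0)                     # P[A,B], indexed [a, b]
p_a_given_b_direct = p_ab[:, b] / p_ab[:, b].sum()

# 8.10-style shortcut: posterior over C given B (Bayes' rule), then
# sum_c P[A|C] * P[C|B] -- no B inside the first factor.
p_c_given_b = p_c * p_b_given_c[:, b]
p_c_given_b /= p_c_given_b.sum()
p_a_given_b_shortcut = p_a_given_c.T @ p_c_given_b

# The two agree because P[A|C,B] = P[A|C] holds in this model.
assert np.allclose(p_a_given_b_direct, p_a_given_b_shortcut)
```

The agreement relies entirely on the conditional-independence assumption baked into the joint; if P[A|C,B] differed from P[A|C], the direct computation and the shortcut would diverge.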
Please help, thanks!