Great article! It’s good to know that my intuition developed over time aligns with Rasa’s best practices. =D Keep posting the good stuff, I’ll keep reading!
Can’t wait for part 3! I’m particularly concerned about datasets that are naturally skewed, for instance when an intent such as “affirm” or “deny” has to compete against an “inform_user_id” intent backed by a lookup table of all your users’ names… (I’m making this example up). I wouldn’t know how to tackle them right now!
I thought the natural way to do it would be that, during gradient descent, each training example’s contribution to the gradient is weighted in inverse proportion to the number of data points in that example’s intent, so the rare intents don’t get drowned out. Or maybe that weight can be learned (who knows). But that’s definitely an interesting problem for me to solve!
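
To make the weighting idea concrete, here is a rough sketch in plain NumPy. It is not how Rasa does it; the linear softmax classifier, the helper names `inverse_frequency_weights` and `weighted_softmax_gradient`, and the toy data are all assumptions made up for illustration. The weight formula `n_samples / (n_intents * count)` is the same heuristic scikit-learn uses for `class_weight="balanced"`, so rare intents get larger weights and frequent ones get smaller weights.

```python
from collections import Counter

import numpy as np


def inverse_frequency_weights(labels):
    """Map each intent to n_samples / (n_intents * count): rare intents weigh more."""
    counts = Counter(labels)
    n_samples, n_intents = len(labels), len(counts)
    return {intent: n_samples / (n_intents * c) for intent, c in counts.items()}


def weighted_softmax_gradient(W, X, y, sample_weights):
    """Gradient w.r.t. W of the weighted mean cross-entropy of a linear softmax model."""
    logits = X @ W
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    probs[np.arange(len(y)), y] -= 1.0           # per-example dLoss/dlogits
    probs *= sample_weights[:, None]             # scale each example's gradient
    return X.T @ probs / sample_weights.sum()


# Made-up skewed dataset: one intent dominates the other two.
labels = ["inform_user_id"] * 8 + ["affirm"] + ["deny"]
weights = inverse_frequency_weights(labels)

intent_names = sorted(weights)
y = np.array([intent_names.index(label) for label in labels])
sample_weights = np.array([weights[label] for label in labels])

rng = np.random.default_rng(0)
X = rng.normal(size=(len(labels), 4))            # stand-in feature vectors
W = np.zeros((4, len(intent_names)))
grad = weighted_softmax_gradient(W, X, y, sample_weights)
```

With these weights every intent contributes the same total weight to the loss, regardless of how many examples it has. Learning the weights instead, as suggested above, would need an extra objective and is not covered by this sketch.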