Google Tech Talks Januaary 29, 2007 ABSTRACT Training on imperfectly representative data inevitably leads to classification errors. Retraining an OCR engine with post-edited data, or even with the imperfect labels assigned by the classifier, reduces both bias and variance. Although the theoretical foundations of decision-directed adaptation are meager, it has proved successful in diverse experiments. When the operational data can be partitioned into isogenous subsets, style-constrained classification is appropriate. Patterns should be recognized in groups rather than in isolation. Shape and language context are complementary. Operator interaction should be rationalized. Only dynamic classifiers...
Get notified about new features and conference additions.