Article

Unsupervised Spelling Correction for Slovak

Daniel Hladek, Jan Stas, Jozef Juhar

DOI: 10.15598/aeee.v11i5.898

Abstract

This paper introduces a method that automatically proposes and selects a correction for an incorrectly written word in a large Slovak text corpus. The task can be described as finding the sequence of correct words that best matches the list of incorrectly spelled words found in the input. The knowledge base of the classification system, which consists of statistics about sequences of correctly typed words and possible corrections for incorrectly typed words, can be described mathematically as a hidden Markov model. The best matching sequence of correct words is found using the Viterbi algorithm. The system will be evaluated on a manually corrected test set.
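
The decoding step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `viterbi`, the `FLOOR` smoothing constant, the dictionary layout, and every probability below are assumptions invented for illustration, and the toy vocabulary is English rather than Slovak. Hidden states are candidate correct words, observations are the typed words, emission scores stand in for P(typed word | correct word), and transition scores stand in for bigram statistics of correctly typed text.

```python
import math

FLOOR = 1e-9  # assumed smoothing value for pairs missing from the toy tables

# Hypothetical candidate corrections for each typed word.
CANDIDATES = {"a": ["a"], "bog": ["bog", "big", "bag"], "dog": ["dog"]}

# Hypothetical emission scores: P(typed word | correct word).
EMISSION = {
    ("a", "a"): 1.0,
    ("bog", "bog"): 0.9,
    ("bog", "big"): 0.05,
    ("bog", "bag"): 0.05,
    ("dog", "dog"): 1.0,
}

# Hypothetical transition scores: P(correct word | previous correct word).
TRANSITION = {
    ("<s>", "a"): 0.5,
    ("a", "bog"): 0.001,
    ("a", "big"): 0.2,
    ("a", "bag"): 0.05,
    ("bog", "dog"): 0.0001,
    ("big", "dog"): 0.1,
    ("bag", "dog"): 0.0001,
}


def viterbi(observed, candidates, emission, transition, start="<s>"):
    """Return the most probable sequence of correct words for `observed`."""
    # best maps a hidden word to (log-probability, path) of the best path ending in it.
    best = {start: (0.0, [])}
    for obs in observed:
        step = {}
        # A typed word without listed candidates is kept as its own candidate.
        for cand in candidates.get(obs, [obs]):
            log_e = math.log(emission.get((obs, cand), FLOOR))
            # Choose the predecessor that maximizes the path score so far.
            score, path = max(
                (s + math.log(transition.get((prev, cand), FLOOR)), p)
                for prev, (s, p) in best.items()
            )
            step[cand] = (score + log_e, path + [cand])
        best = step
    # The highest-scoring final state carries the full best path.
    return max(best.values())[1]


print(viterbi(["a", "bog", "dog"], CANDIDATES, EMISSION, TRANSITION))
# prints ['a', 'big', 'dog']
```

In this toy run the typed word "bog" is itself a valid word with the highest emission score, yet the context supplied by the transition table makes "big" the preferred correction. Working in log space avoids numerical underflow on long word sequences.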