Ponte Academic Journal
Oct 2017, Volume 73, Issue 10

BIGRAMS AND CHUNKING: ADVANTAGES FOR USING IN AUTOMATIC SPELLING CORRECTION IN RUSSIAN AND ENGLISH

Author(s): Vladimir Polyakov, Ivan Anisimov, Elena Makarova

doi: 10.21506/j.ponte.2017.10.10



Abstract:
This paper addresses the problem of automatic spelling correction for Russian and English. The program runs in batch mode and relies on chunking, a model of incomplete (shallow) syntactic analysis. Building on the previous version of the program, with its advantages and shortcomings, we introduced a bigram-based analysis stage into the chunking pipeline, which considerably improved the efficiency of spelling correction. Unlike programs that assume an interactive mode with human intervention, the spelling corrector described here is fully automatic: the program itself chooses the best correction candidate and makes the replacement. The program was tested on two mini-collections (one for Russian, one for English) of one hundred clauses each, collected from Twitter. Although there is still room for improvement, the results indicate that the joint use of bigrams and chunks has great potential.
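
The abstract does not describe the implementation, so the following is only a minimal illustrative sketch of how a bigram stage might pick a correction automatically, without user interaction. Everything in it is an assumption made for illustration: the function names (`edits1`, `train_bigrams`, `correct`), the toy corpus, the single-edit candidate generation, and the scoring rule (sum of left and right bigram counts). The chunking stage is omitted entirely, and the alphabet covers English only.

```python
from collections import Counter

def edits1(word, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """All strings one edit (delete, transpose, replace, insert) away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in alphabet]
    inserts = [l + c + r for l, r in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)

def train_bigrams(sentences):
    """Count word bigrams (with sentence boundary markers) and collect the vocabulary."""
    counts, vocab = Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens[1:-1])
        counts.update(zip(tokens, tokens[1:]))
    return counts, vocab

def correct(tokens, counts, vocab):
    """Replace each out-of-vocabulary token with the candidate best supported by its neighbours."""
    padded = ["<s>"] + tokens + ["</s>"]
    out = []
    for i in range(1, len(padded) - 1):
        word = padded[i]
        candidates = edits1(word) & vocab
        if word in vocab or not candidates:
            out.append(word)  # known word, or nothing to suggest: keep as is
            continue
        left = out[-1] if out else "<s>"  # already-corrected left neighbour
        right = padded[i + 1]             # raw right neighbour
        # Hypothetical scoring rule: sum of left and right bigram counts.
        best = max(sorted(candidates),
                   key=lambda c: counts[(left, c)] + counts[(c, right)])
        out.append(best)
    return out

# Toy corpus, invented for this sketch.
corpus = [
    "i went to the bank",
    "he sat on the bank",
    "the band played loudly",
    "the band played well",
]
counts, vocab = train_bigrams(corpus)
print(correct("the bant played".split(), counts, vocab))  # ['the', 'band', 'played']
print(correct("to the bant".split(), counts, vocab))      # ['to', 'the', 'bank']
```

In this toy setting the misspelling "bant" is one edit away from both "band" and "bank"; the neighbouring words decide between them, which is the kind of context a dictionary lookup alone cannot provide.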