Lightweight diacritics restoration for V4 languages

Publication

Abstract

Diacritics restoration has become a ubiquitous task in the Latin-alphabet-based, English-dominated language environment of the Internet. In this article, we describe a small-footprint approach based on 1D convolutions that operates at the character level.

The model runs locally in a web browser and outperforms similarly sized models. We evaluate it on the languages of the Visegrád Group, with an emphasis on Hungarian.
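
To make the character-level framing concrete, the following is a minimal sketch and not the model described in the publication. It assumes that training pairs can be built by stripping diacritics from correctly accented text, and that the task can be read as predicting one label per input character. The helper names (`strip_diacritics`, `conv1d_same`), the vowel feature, and the kernel weights are hypothetical and chosen only for illustration.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks from text (illustrative helper, not from the paper).

    Letters without a canonical decomposition, such as Polish 'ł',
    are left unchanged and would need an explicit mapping.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def conv1d_same(seq, kernel):
    """Slide an odd-length kernel over seq with zero padding.

    The output has the same length as the input, so every character
    position receives exactly one value computed from its neighbours.
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(seq) + [0.0] * half
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(seq))
    ]


# Build one (input, target) pair from accented Hungarian text.
target = unicodedata.normalize("NFC", "árvíztűrő tükörfúrógép")
source = strip_diacritics(target)

# Per-character labels: 1 where a diacritic has to be restored, 0 elsewhere.
labels = [int(s != t) for s, t in zip(source, target)]

# Toy per-character feature (assumption): 1.0 for plain vowels, 0.0 otherwise.
features = [1.0 if ch in "aeiou" else 0.0 for ch in source]

# Hypothetical fixed kernel; a trained model would learn these weights.
context = conv1d_same(features, [0.25, 0.5, 0.25])

print(source)
print(labels)
print(len(context) == len(source))
```

Because the convolution preserves sequence length, each output position lines up with one input character, which is what allows a prediction per character. A real model would use learned character embeddings, many filters, and a classifier over the possible accented forms of each letter.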