RuthLemm Demo
This is a demonstration of RuthLemm, a transformer (BART-based) lemmatizer for the Old Belarusian (Ruthenian) language. It can process raw text or files in the CoNLL-U format used by Universal Dependencies.
How to Use:
- Lemmatize String: Enter any text in the text box. The tool will tokenize it, lemmatize each word, and return the result. This mode does not use morphological information.
- Lemmatize CoNLL-U: Paste your CoNLL-U data into the text box or upload a .conllu file.
  - You can choose whether to use morphological features to improve accuracy via the "Use Morphology" checkbox.
  - The output will be the same CoNLL-U data with the LEMMA column updated. You can copy the result or download it as a file.
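
To make the CoNLL-U mode concrete, here is a minimal Python sketch of what "the same CoNLL-U data with the LEMMA column updated" means in practice. It is not RuthLemm's actual code or API: the function `update_lemmas`, the `lemmatize` callback, and the `use_morphology` flag are hypothetical names chosen for illustration, and the toy lemmatizer simply lowercases the word form. The only thing taken from the CoNLL-U format itself is the layout of a token line: ten tab-separated fields, with FORM in the second, LEMMA in the third, and FEATS in the sixth.

```python
from typing import Callable, Optional

# Zero-based positions of the fields used here in a 10-column CoNLL-U token line.
FORM, LEMMA, FEATS = 1, 2, 5


def update_lemmas(
    conllu_text: str,
    lemmatize: Callable[[str, Optional[str]], str],
    use_morphology: bool = True,
) -> str:
    """Return conllu_text with the LEMMA field of each token line replaced.

    `lemmatize` is a stand-in for the model call (an assumption, not the
    RuthLemm API). It receives the word form and, if use_morphology is set,
    the FEATS string; otherwise it receives None.
    """
    out = []
    for line in conllu_text.split("\n"):
        cols = line.split("\t")
        # Comments, blank lines, and lines without 10 fields pass through unchanged.
        if line.startswith("#") or len(cols) != 10:
            out.append(line)
            continue
        # Multiword token ranges ("1-2") and empty nodes ("1.1") are left as they are.
        if "-" in cols[0] or "." in cols[0]:
            out.append(line)
            continue
        feats = cols[FEATS] if use_morphology else None
        cols[LEMMA] = lemmatize(cols[FORM], feats)
        out.append("\t".join(cols))
    return "\n".join(out)


def toy_lemmatize(form: str, feats: Optional[str]) -> str:
    """Placeholder lemmatizer: lowercases the form and ignores the features."""
    return form.lower()
```

Calling `update_lemmas(text, toy_lemmatize, use_morphology=False)` mirrors unchecking the "Use Morphology" checkbox in this sketch: the callback then sees only the word form. All other columns, comments, and sentence breaks are preserved, so the output remains valid input for other CoNLL-U tools.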