Conditional random fields in text segmentation by language

Robin Cabeza Ruiz


This work presents the use of conditional random fields for text segmentation by language, treating it as a sequence tagging task. Language changes are assumed to be possible at any point in the text, the observations are the words of the text, and the states are the different languages. The research leads to the conclusion that conditional random fields are a powerful tool for segmenting multilingual text.
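The sequence tagging view described above can be illustrated with a minimal sketch of linear-chain decoding, in which each word is an observation and each language is a state. Everything specific here is an assumption made for illustration and is not taken from the work itself: the two language labels, the tiny `LEXICON` of per-word scores, and the `STAY` and `SWITCH` transition weights are hand-picked, whereas a real conditional random field would learn its feature weights from annotated data.

```python
from itertools import groupby
from typing import Dict, List, Tuple

# Hypothetical label set: the states of the chain are languages.
LANGS: List[str] = ["en", "es"]

# Hypothetical hand-set emission weights (a trained CRF would learn these).
LEXICON: Dict[str, Dict[str, float]] = {
    "the": {"en": 2.0}, "house": {"en": 2.0}, "is": {"en": 2.0}, "red": {"en": 1.5},
    "la": {"es": 2.0}, "casa": {"es": 2.0}, "es": {"es": 2.0}, "roja": {"es": 2.0},
}

# Hypothetical transition weights: a switch is allowed at any word, but penalised.
STAY, SWITCH = 0.5, -0.5


def emission(word: str, lang: str) -> float:
    """Score of tagging `word` with `lang`; unknown words score 0 everywhere."""
    return LEXICON.get(word.lower(), {}).get(lang, 0.0)


def transition(prev: str, cur: str) -> float:
    """Score of moving from language `prev` to language `cur`."""
    return STAY if prev == cur else SWITCH


def viterbi(words: List[str]) -> List[str]:
    """Return the highest-scoring language tag sequence for `words`."""
    if not words:
        return []
    scores = {lang: emission(words[0], lang) for lang in LANGS}
    backpointers: List[Dict[str, str]] = []
    for word in words[1:]:
        new_scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in LANGS:
            best_prev = max(LANGS, key=lambda p: scores[p] + transition(p, cur))
            new_scores[cur] = (
                scores[best_prev] + transition(best_prev, cur) + emission(word, cur)
            )
            pointers[cur] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Follow the back-pointers from the best final state.
    path = [max(LANGS, key=lambda lang: scores[lang])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def segments(words: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group consecutive words that share a tag into (language, text) segments."""
    pairs = zip(tags, words)
    return [
        (lang, " ".join(word for _, word in group))
        for lang, group in groupby(pairs, key=lambda pair: pair[0])
    ]


words = "the house is red la casa es roja".split()
tags = viterbi(words)
print(segments(words, tags))
```

The transition weights are what distinguish this from tagging each word independently: a word with no lexicon entry inherits the language of its neighbours instead of being left undecided.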


Keywords: text segmentation by language; conditional random fields.
