Combining pattern-based CRFs and weighted context-free grammars
Takhanov R. Kolmogorov V.
2022IOS Press BV
Intelligent Data Analysis
2022#26Issue 1257 - 272 pp.
We consider two models for the sequence labeling (tagging) problem. The first one is a Pattern-Based Conditional Random Field (PB), in which the energy of a string (chain labeling) x=x1... xn∈Dn is a sum of terms over intervals [i,j] where each term is non-zero only if the substring xi... xj equals a prespecified word w∈Λ. The second model is a Weighted Context-Free Grammar (WCFG) frequently used for natural language processing. PB and WCFG encode local and non-local interactions respectively, and thus can be viewed as complementary. We propose a Grammatical Pattern-Based CRF model (GPB) that combines the two in a natural way. We argue that it has certain advantages over existing approaches such as the Hybrid model of Benedí and Sanchez that combines N-grams and WCFGs. The focus of this paper is to analyze the complexity of inference tasks in a GPB such as computing MAP. We present a polynomial-time algorithm for general GPBs and a faster version for a special case that we call Interaction Grammars.
conditional random fields , interaction grammars , Pattern-based , weighted context-free grammar
Text of the article Перейти на текст статьи
Mathematics Department, Nazarbayev University, Nur-Sultan, Kazakhstan
Institute of Science and Technology Austria, Klosterneuburg, Austria
Mathematics Department
Institute of Science and Technology Austria
10 лет помогаем публиковать статьи Международный издатель
Книга Публикация научной статьи Волощук 2026 Book Publication of a scientific article 2026