In this paper, we propose a representation of Czech complex predicates with light verbs in a valency lexicon. We demonstrate that if such a representation is to be theoretically adequate and at the same time economical, the information on complex predicates should be divided between two components of the lexicon, a data component and a grammar component.
The data component stores all the information necessary for the generation of well-formed deep and surface syntactic structures of complex predicates, namely valency frames of both light verbs and predicative nouns, links between these frames, mapping of verbal and nominal valency complementations, and mapping of the semantic participant Instigator onto a verbal complementation. The grammar component of the lexicon, representing a part of the overall grammar of the language, contains formal rules.
These rules, which are instantiated by the information stored in the data component, allow users to obtain deep and surface structures of complex predicates. Finally, the proposed model is applied in the annotation of 1,215 Czech complex predicates selected from the Czech National Corpus on the basis of frequency and saliency.
The resulting data form a solid foundation for further research into various semantic and syntactic aspects of Czech complex predicates.