The CorDis television corpus is an XML (eXtensible Mark-up Language), TEI (Text Encoding Initiative)-conformant collection of texts representing a significant portion of the television news discourse on the 2003 Iraqi conflict. It comprises four subcorpora, namely the evening news broadcasts of BBC, CBS, RAI Uno and Canale 5 from 20 March to 18 April (see Introduction, this volume).

The main purpose of this paper is to show the function and importance of mark-up for the retrieval of discourse-specific information in a television news corpus. In order to do so, some preliminary issues have to be addressed: (1) the role of annotation in the creation of a harmonized and consistent corpus, with specific reference to TEI mark-up of spoken discourse, and (2) an overview of the corpus composition and of the relevant categories that have been encoded. The focus will be particularly on the function of mark-up associated with television news discourse, in order to illustrate the way mark-up gives access to meta-linguistic information by telling part of the parallel story constituted by the visual text, thus permitting the recovery of non-verbal data, a fundamental characteristic of the medium (television) and of the genre (television news).

Finally, we will argue that such a homogeneously encoded corpus is a precious resource for research, both because it enhances reliability and favours reusability, making the data easily retrievable, and because it gives access to a whole set of information that would otherwise be lost.
|Title:||Mark-up and the narrative structure of TV news|
|Publication date:||2009|
|Appears in types:||2.1 Contribution in volume (Chapter or Essay)|