Analyzing text genres for lexical complexity with T-Scan

  • Henk L.W. Pander Maat
  • Nick Dekker

T-Scan is a tool for the automatic analysis of Dutch text. This paper presents the first large-scale corpus analysis with T-Scan, focusing on lexical complexity. A collection of nearly 1000 text specimens was assembled, containing ten genres: travel blogs, celebrity news features, novels, textbooks for vocational secondary schools, textbooks for general secondary schools, news reports, opinion pieces, political programs, medical advice texts and research articles. The lexical complexity features in the analysis include morphology, word frequency, various word concreteness indices, personal pronouns, names and verb tense. Systematic genre differences are found, such that a genre detection model comprising 18 T-Scan features correctly identifies 83 percent of the corpus texts. Most lexical features differentiating genres intuitively relate to text topic complexity. A closer analysis is offered of the contrast between the two textbook samples in the corpus, which differ only in the educational levels they cater for. Again, topic variation seems a more important factor than stylistic variation. We demonstrate a new method to examine stylistic variation, which consists of within-genre comparisons using the genre prediction; more specifically, ‘deviant’ texts are compared to ‘typical’ members of their genre.
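
The within-genre comparison described above can be illustrated with a toy sketch. It is not the authors' model: the abstract does not say which classifier was used, so a simple nearest-centroid rule stands in for it here, and the two feature values per text, the genre labels, and all function names are hypothetical. A text counts as 'typical' when the predicted genre matches its actual genre and as 'deviant' otherwise.

```python
from collections import defaultdict
from math import dist


def centroids(samples):
    """Return the mean feature vector per genre.

    samples: list of (genre, feature_vector) pairs.
    """
    grouped = defaultdict(list)
    for genre, vec in samples:
        grouped[genre].append(vec)
    return {
        genre: [sum(col) / len(vecs) for col in zip(*vecs)]
        for genre, vecs in grouped.items()
    }


def predict(cents, vec):
    """Predict the genre whose centroid is closest (Euclidean distance)."""
    return min(cents, key=lambda genre: dist(cents[genre], vec))


def split_typical_deviant(samples):
    """Split text indices into 'typical' (predicted genre equals actual
    genre) and 'deviant' (predicted genre differs)."""
    cents = centroids(samples)
    typical, deviant = [], []
    for i, (genre, vec) in enumerate(samples):
        (typical if predict(cents, vec) == genre else deviant).append(i)
    return typical, deviant


# Toy data with two made-up features per text (assumption: something like
# a word-frequency score and a concreteness score, scaled to 0..1).
samples = [
    ("travel_blog", [0.9, 0.8]),
    ("travel_blog", [0.8, 0.9]),
    ("travel_blog", [0.3, 0.3]),  # a blog whose features resemble an article
    ("research_article", [0.2, 0.1]),
    ("research_article", [0.1, 0.2]),
    ("research_article", [0.2, 0.2]),
]

typical, deviant = split_typical_deviant(samples)
accuracy = len(typical) / len(samples)
print(typical, deviant, round(accuracy, 2))
```

In a real analysis the feature vectors would come from T-Scan output, and the deviant texts would then be read against the typical ones of the same genre to see which stylistic choices set them apart.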
