-
The LSP (Language for Special Purposes) corpus consists of texts from seven selected domains. The DK-CLARIN LSP corpus comprises 11 M tokens from the period 2000-2010,...
- XML
-
En opmærket multimodal samling af samtaler på dansk hvor tolv deltagerpar taler sammen for at lære hinanden at kende. Deltagerne blev filmet mens de stod foran hinanden og talte...
- XML
-
The Leipzig Corpora Collection provides different tools and data for download, which are protected by copyright. For more details please refer to our terms of usage....
- TXT
-
Maskinlæsbar version af dumps fra den danske wikipedia kilder. Se https://foundation.wikimedia.org/wiki/Terms_of_Use
- XML
-
Maskinlæsbar version af dumps fra den danske wikipedias citater. Se https://foundation.wikimedia.org/wiki/Terms_of_Use
- XML
-
A billion-word corpus of Danish text. Split into many sections, and covering many dimensions of variation (spoken/written, formal/informal, modern/old, rigsdansk/dialect, and so...
- TXT
-
MULINCO - MUltiLINgual Corpus of the University of COpenhagen. 7 eventyr af H.C.Andersen, tekster af Edgar Allen Poe, Saxos Danmarks historie og EU-traktater på flere sprog...