-
Private Cybersecurity NER dataset
Our dataset was created by merging the APTNER and CyNER datasets and contains 13,601 sentences, 347,779 tokens, and 37,684 entities. The split ratio was roughly 70% for training and... -
Private Italian Thesaurus for Tourism domain
An Italian thesaurus for the tourism domain, comprising 2,684 concepts organized according to semantic relationships (equivalence, hierarchical, and associative). The... -
Private PoliModal Corpus
The corpus includes the transcripts of 56 TV face-to-face interviews for a total of 14 hours, taken from the Italian political talk show Mezz'ora in più broadcast from 24...