-
Private Cybersecurity NER dataset
Our dataset was created by merging the APTNER and CyNER datasets and contains 13,601 sentences, 347,779 tokens, and 37,684 entities. The split ratio was roughly 70% for training and... -
Spotify Tracks Dataset (full)
The dataset was created using the Spotify API and the track IDs provided by the authors of https://www.kaggle.com/datasets/maharshipandya/-spotify-tracks-dataset.... The... -
Spotify track dataset (small)
The dataset was created using the Spotify API and the track IDs provided by the authors of https://www.kaggle.com/datasets/maharshipandya/-spotify-tracks-dataset.... The...-
ZIP
-
Air Quality Datasets over L'Aquila Region
These datasets have been collected through ESA, CeTEMPS and ARTA. They are a work-in-progress deliverable of a virtual laboratory (VL-Disaster) in the context of SoBigData. -
Private Smart Cities Weather and Pollution conditions
A set of weather and climatic conditions gathered during the Toolsmart PoN project (Open Community PA 2020 – Pon Governance 2014-2020). Data are obtained from IoT-based... -
GiveMeSomeCreditSC
The GiveMeSomeCredit dataset - https://www.kaggle.com/c/GiveMeSomeCredit - contains different features of borrowers. The task is to predict the financial distress of a...-
ZIP
-
Santorini Tweets July-August 2021
This dataset contains 225,501 tweets written by 141,277 users. These tweets are geolocated in Santorini, or they contain the word or the hashtag "santorini" in the text. They...-
ZIP
-
HANSEN: Spoken Text Authorship Analysis
HANSEN encompasses meticulous curation of existing speech datasets accompanied by transcripts, alongside the creation of novel AI-generated spoken text datasets.... -
Yeast
The yeast dataset is a collection of yeast microarray expressions and phylogenetic profiles which can be used to learn the yeast gene functional categories. One row of this...-
arff
-
Medical Dataset
The medical dataset contains a corpus of fully anonymized clinical text. Each document in the corpus is associated with a set of ICD-9 codes which represents the diagnosis...-
ZIP
-
CSV (resource: 'Churn Dataset')
-
German Credit
In the German Credit dataset, each of the 1,000 persons is classified as a good or bad creditor according to attributes like age, sex, checking_account, credit_amount,...-
CSV
-
Compas
The COMPAS dataset contains the features used by the COMPAS algorithm for scoring defendants and their risk (Low, Medium and High), for over 4,000 individuals. We considered...-
CSV
-
Dataset Adult
The Adult dataset includes 48,842 instances with demographic information like age, workclass, marital-status, race, capital-loss, capital-gain, etc. The income attribute...-
CSV
-
Official administrative information of Tuscany
The data contains the spatial partitioning of Tuscany and some statistical information published by the Italian Statistical Bureau.-
LOD
-
ClueWeb09
The ClueWeb09 dataset consists of about 1 billion web pages in ten languages that were collected in January and February 2009. It was created to support research on... -
Physical activity, quality of sleep, and quality of life in Italy: the long t...
From March 2020 to May 2021, several lockdown periods caused by the COVID-19 pandemic limited, with varying degrees of severity, people’s usual activities and mobility in...-
ZIP
-
Global Peace Index data
A dataset of the Global Peace Index (GPI), which ranks 163 independent states and territories according to their level of peacefulness. The GPI covers 99.7 per cent of the... -
NYSE transactions
This dataset contains financial data on the prices of the 250 most liquid assets on the New York Stock Exchange (NYSE) from 2006 to 2014. The dataset contains transactions,... -
FED data
Quarterly data on US banks' holdings from March 2001 to September 2013. The number of financial institutions present in the data is fairly stable across quarters, starting from...