Santorini Tweets July-August 2021
This dataset contains 225,501 tweets written by 141,277 users. These tweets are geolocated in Santorini, or they contain the word or the hashtag "santorini" in the text. They... -
Amazon Network
The network was collected by crawling the Amazon website. It is based on the "Customers Who Bought This Item Also Bought" feature of the Amazon website. If a product i is frequently... -
Wikipedia Word Embeddings
Embeddings were created by applying word2vec skip-gram to a corpus of non-stub Wikipedia articles from a December 2015 English dump with the following parameters: -cbow 0... -
Facebook EuroSys 2009
This dataset contains social and interaction graphs representing two large-scale Facebook regional networks. Social graphs describe Facebook friendships between users... -
Facebook - New Orleans regional network
This dataset contains information about 90,269 users and 3,646,662 friendship links between those users. These users belong to the New Orleans Facebook regional network. The...-
A dataset of gamers on Twitter
This gaming-related dataset consists of 8932 users (labeled as gamers) engaging in game-related conversations. We have collected (June 2018) their timeline (the most recent 3200... -
Learning to quantify: LeQua 2022 datasets
The aim of LeQua 2022 (the 1st edition of the CLEF “Learning to Quantify” lab) is to allow the comparative evaluation of methods for “learning to quantify” in textual... -
Product Reviews for Ordinal Quantification
This data set comprises a labeled training set, validation samples, and testing samples for ordinal quantification. It appears in our research paper "Ordinal Quantification... -
VaxxHesitancy: A Dataset for Studying Hesitancy Towards COVID-19 Vaccination ...
We create a publicly available dataset of over 3,100 COVID-19 vaccine-related tweets labeled as one of four stance categories: pro-vaxx, anti-vaxx, vaxx-hesitant, or... -
Ukraine-related Disinformation Dataset
Ukraine-related disinformation dataset from "Comparative Analysis of Engagement, Themes, and Causality of Ukraine-Related Debunks and Disinformation" (accepted at SocInfo... -
Ego Networks of Words in Twitter
This set of dataframes was used in our last paper: Ollivier K, Boldrini C, Passarella A, Conti M (2022) Structural invariants and semantic fingerprints in the “ego network”... -
Cross-Lingual Dataset of Crisis-Related Social Media
If you use this dataset, please cite the following paper: Fedor Vitiugin, Carlos Castillo: Cross-Lingual Query-Based Summarization of Crisis-Related Social Media: An Abstractive...