SoBigData Services and Products - Organisations

Dataset

Santorini Tweets July-August 2021

This dataset contains 225,501 tweets written by 141,277 users. The tweets are either geolocated in Santorini or contain the word or hashtag "santorini" in their text. They...

Resource: 'tweet_santorini.csv' (ZIP, login required)

Method

CLiQS

CLiQS is a Python software package for summarizing social media texts with a diversified approach.

Resource: 'CLiQS-CM' (login required)

Dataset

Wikipedia Word Embeddings

Embeddings were created by applying word2vec skip-gram to a corpus of non-stub Wikipedia articles from a December 2015 English dump, with the following parameters: -cbow 0...

Resource: 'Embeddings' (login required)

Dataset

Learning to quantify: LeQua 2022 datasets

The aim of LeQua 2022 (the 1st edition of the CLEF “Learning to Quantify” lab) is to allow the comparative evaluation of methods for “learning to quantify” in textual...

Resource: 'Zenodo link' (login required)

Dataset

Product Reviews for Ordinal Quantification

This dataset comprises a labeled training set, validation samples, and testing samples for ordinal quantification. It appears in our research paper "Ordinal Quantification...

Resource: 'Zenodo link' (login required)

Dataset

Cross-Lingual Dataset of Crisis-Related Social Media

If you use this dataset, please cite the following paper: Fedor Vitiugin, Carlos Castillo: Cross-Lingual Query-Based Summarization of Crisis-Related Social Media: An Abstractive...

Resource: 'Cross-Lingual Dataset of ...' (login required)

6 items found
