-
SWH Filenames
A 69 GB dataset with ~2.3 billion strings representing deduplicated names of source code files collected by Software Heritage, the great library of source code...-
ZIP
The resource: 'SWH Filenames' is not accessible as guest user. You must login to access it!
-
ZIP
-
CLiQS
CLiQS is a Python language software package for social media texts summarization with a diversified approach. -
Cross-Lingual Dataset of Crisis-Related Social Media
If you use this dataset, please cite the following paper: Fedor Vitiugin, Carlos Castillo: Cross-Lingual Query-Based Summarization of Crisis-Related Social Media: An Abstractive...