3 items found

Organisations: SoBigData Services and Products
Groups: sobigdata-it
Tags: Web mining

  • Dataset

    Private Cybersecurity NER dataset

    Our dataset is created by merging the APTNER and CyNER datasets and contains 13,601 sentences, 347,779 tokens, and 37,684 entities. The split ratio was roughly 70% for training and...
  • Dataset

    SWH Filenames

    A 69 GB dataset with ~2.3 billion strings representing deduplicated names of source code files collected by Software Heritage, the great library of source code...
    • ZIP
  • Dataset

    FAIR-SWENG: dataset on gender fairness in software engineering academic lands...

    The dataset contains academic performance metrics of Software Engineers worldwide.
You can also access this registry using the API (see API Docs).