Code-Comment-Assessment-Dataset

1 Introduction

This repository contains the datasets for the paper "Deep Code-Comment Understanding and Assessment".

2 Dataset

The public dataset is from this work. It contains the results of a manual assessment of the coherence between comments and implementations for 3636 methods, gathered from three open-source software projects implemented in Java.

Our labeled dataset is drawn from Java projects uploaded to GitHub before October 2018.

3 Data Structure

Each method in our labeled dataset is stored with the following fields:

  • #No
  • #File (the name of the source file)
  • #Comment
  • #Code
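
The four fields above could be read into records with a short script. The following is a minimal sketch, not the repository's own loader: it assumes that each field marker (`#No`, `#File`, `#Comment`, `#Code`) sits alone on its own line, that the field's value follows on the next lines, and that `#No` starts a new method. The names `Record` and `parse_records` are hypothetical, and the real files may be laid out differently.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Field markers taken from the list above. Their placement on separate
# lines is an ASSUMPTION about the file layout, not a documented fact.
FIELDS = ("#No", "#File", "#Comment", "#Code")


@dataclass
class Record:
    """One labeled method (hypothetical container, not part of the dataset)."""
    no: str
    file: str
    comment: str
    code: str


def _build(parts: Dict[str, List[str]]) -> Record:
    # Join the collected lines of each field and trim surrounding whitespace.
    joined = {name: "\n".join(lines).strip() for name, lines in parts.items()}
    return Record(joined["#No"], joined["#File"], joined["#Comment"], joined["#Code"])


def parse_records(text: str) -> List[Record]:
    """Split text into records, assuming '#No' opens each new method."""
    records: List[Record] = []
    current: Optional[Dict[str, List[str]]] = None
    field: Optional[str] = None
    for line in text.splitlines():
        marker = line.strip()
        if marker in FIELDS:
            if marker == "#No" and current is not None:
                # A new '#No' closes the previous record.
                records.append(_build(current))
                current = None
            if current is None:
                current = {name: [] for name in FIELDS}
            field = marker
        elif current is not None and field is not None:
            # Any other line belongs to the field most recently opened.
            current[field].append(line)
    if current is not None:
        records.append(_build(current))
    return records


# Made-up sample in the assumed layout (not taken from the dataset).
sample = """#No
1
#File
Foo.java
#Comment
/** Returns the sum of a and b. */
#Code
public int add(int a, int b) {
    return a + b;
}
#No
2
#File
Bar.java
#Comment
// Clears the cache.
#Code
void clear() { cache.clear(); }
"""

records = parse_records(sample)
for r in records:
    print(r.no, r.file, len(r.code.splitlines()))
```

Keeping each method as a small record makes it straightforward to pair `#Comment` with `#Code` for later analysis. If the actual files use a different delimiter or put the value on the same line as the marker, only `parse_records` would need to change.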