Dataset: ICLR

In this assignment, we work with manuscripts and their reviews from ICLR (International Conference on Learning Representations), a top computer science conference on machine learning.

Each manuscript has 2 to 3 reviews. Each row in training.csv and test_contentonly.csv represents one review of a specific manuscript. The files contain the following columns:

  • id: id of manuscript
  • reviewer_name: name of reviewer for this manuscript
  • title: title of the manuscript
  • abstract: abstract of the manuscript
  • comments: review texts of this manuscript by a specific reviewer
  • decision: final decision (1 if the manuscript was accepted or 0 otherwise).

The decision column is not included in test_contentonly.csv. Instead, it is provided in test_label.csv.
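Because the test labels live in a separate file, they have to be joined back onto the review rows before evaluation. The following is a minimal sketch using only the standard library. It assumes that test_label.csv carries both an id column and a decision column; the handout does not state this, so check the file first. If the label file is instead aligned row by row with test_contentonly.csv, pair the rows by position rather than by id. The helper names read_rows and attach_labels are hypothetical.

```python
import csv


def read_rows(path):
    """Read a CSV file into a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def attach_labels(reviews, labels):
    """Add an integer 'decision' to each review row by matching on manuscript id.

    Assumption: each label row has 'id' and 'decision' columns.
    A manuscript with several reviews gets the same decision on every row.
    """
    decision_by_id = {row["id"]: int(row["decision"]) for row in labels}
    return [{**review, "decision": decision_by_id[review["id"]]} for review in reviews]


# Example usage (file names as given in the assignment):
# test_rows = attach_labels(read_rows("test_contentonly.csv"),
#                           read_rows("test_label.csv"))
```

A review whose id is missing from the label file raises a KeyError here, which makes a mismatch between the two files visible early.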

Grading policy

We will grade based on your code notebook (Python notebook or R Markdown file) on GitHub. Your code should clearly document the process you follow and the decisions you make. Also discuss your results where appropriate (see the problem descriptions below).

1. Supervised methods (60 pts)

Please use Python or R for this assignment.

In this task, you need to predict whether a manuscript is accepted (1) or rejected (0), based on the review texts.

1.1 Dictionary method (20 pts)

Use the dictionary method to predict whether manuscripts in the test data were accepted or rejected.

  • List the dictionaries you used.
  • Discuss how you constructed your dictionary (e.g., by reading and summarizing, using embeddings, etc.).
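One possible shape for a dictionary-based predictor is sketched below. The word lists POSITIVE and NEGATIVE are hypothetical placeholders, not the dictionaries expected by the assignment; a real submission would build and justify its own. The sketch scores each review by the share of positive minus negative words, averages the scores of all reviews for a manuscript, and predicts acceptance when the average exceeds a threshold (0.0 here, also an assumption).

```python
import re

# Hypothetical seed dictionaries, for illustration only.
POSITIVE = {"novel", "strong", "convincing", "clear", "solid", "accept"}
NEGATIVE = {"weak", "unclear", "limited", "incremental", "lacks", "reject"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def dictionary_score(text):
    """Return (positive hits - negative hits) / token count; 0.0 for empty text."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def predict_manuscripts(rows, threshold=0.0):
    """Average review scores per manuscript id; predict 1 if the mean exceeds threshold."""
    scores = {}
    for row in rows:
        scores.setdefault(row["id"], []).append(dictionary_score(row["comments"]))
    return {mid: int(sum(s) / len(s) > threshold) for mid, s in scores.items()}
```

Plain word counting ignores negation ("not novel" still counts as a positive hit), which is worth mentioning when discussing how the dictionary was built and where it fails.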

1.2 Supervised methods (20 pts)

Use supervised learning methods to predict whether manuscripts in the test data were accepted or rejected, using training.csv as the training data.
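As an illustration of what a supervised text classifier does with training.csv, here is a small multinomial Naive Bayes written with the standard library only. It is a sketch to show the mechanics, not the required method: in practice a library such as scikit-learn (Python) would usually be used, and any classifier is acceptable. The class name NaiveBayes and its methods are hypothetical, and labels are assumed to be the integers 0 and 1.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.word_counts = {c: Counter() for c in self.classes}
        class_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set().union(*self.word_counts.values())
        self.totals = {c: sum(self.word_counts[c].values()) for c in self.classes}
        self.log_prior = {c: math.log(class_counts[c] / len(labels)) for c in self.classes}
        return self

    def log_scores(self, text):
        """Unnormalized log posterior of each class for one text."""
        vocab_size = len(self.vocab)
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for tok in tokenize(text):
                if tok in self.vocab:  # skip words never seen in training
                    count = self.word_counts[c][tok]
                    score += math.log((count + 1) / (self.totals[c] + vocab_size))
            scores[c] = score
        return scores

    def predict_proba(self, text):
        """Probability of class 1 (accepted), via a numerically stable softmax."""
        scores = self.log_scores(text)
        top = max(scores.values())
        exp = {c: math.exp(s - top) for c, s in scores.items()}
        return exp.get(1, 0.0) / sum(exp.values())

    def predict(self, text):
        """Hard 0/1 prediction at a 0.5 probability cutoff."""
        return int(self.predict_proba(text) >= 0.5)
```

To get one prediction per manuscript rather than per review, one option is to average predict_proba over all reviews that share an id and threshold the mean.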

1.3 Evaluation (20 pts)

  • Compare the performance of supervised learning with that of the dictionary method on the test data. The correct labels are provided in test_label.csv. Report the following:
    • Precision
    • Recall
    • F1 score
    • AUC (area under the ROC curve).
  • Discuss whether supervised methods or dictionary methods yield better performance. What helped you achieve good prediction performance?
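The four metrics can be computed by hand, which makes their definitions explicit. The sketch below treats 1 (accepted) as the positive class. Precision is TP / (TP + FP), recall is TP / (TP + FN), and F1 is their harmonic mean. AUC is computed through its pairwise interpretation: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as one half. The function names are hypothetical, and the pairwise loop is quadratic, which is fine for a dataset of this size.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Note that AUC needs a continuous score per example (a predicted probability, or the raw dictionary score), not the thresholded 0/1 prediction; feeding it hard predictions collapses the ROC curve to a single operating point.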