Instructor Interface for Plagiarism Detection

Submitty is an open source programming assignment submission system from the Rensselaer Center for Open Source Software (RCOS), launched by the Department of Computer Science at Rensselaer Polytechnic Institute.

My GSoC project involved working on the plagiarism detector (also called “Lichen”) of the Submitty organization. Alongside my GSoC project, I also implemented some crucial features and fixed bugs. Working on various features and bugs throughout Submitty helped me learn even more about software development.

Throughout my GSoC journey, I learned how the plagiarism detector works, and about web technologies, servers, and testing with Travis.

Tasks done as part of my GSoC project

  1. Made Java, Python, and C++ tokenizers for the core plagiarism module.

    Initially I was assigned to make tokenizers using Microsoft Language Servers. These servers are used by various text editors such as Sublime and Atom, and they provide the editor with features like autocomplete suggestions and references. We aimed to send a file to the server and get its tokens back, but because there is no direct client request method for tokens, we couldn’t integrate a Microsoft Language Server. In the end I wrote tokenizers that work similarly to how language servers tokenize internally. This involved exploring the codebases of the Cquery language server, the Palantir Language Server, and the Java Language Server.

    Link to commits (merged)-

    Python and CPP tokenizers (#3)

    Java tokenizers (#8)
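
    To illustrate the kind of output a tokenizer produces, here is a minimal sketch that uses Python's standard `tokenize` module. It is not the tokenizer merged into Lichen; the function name `tokenize_python` and the "type"/"value"/"line" output format are assumptions made for this example.

```python
import io
import json
import token
import tokenize

# Token kinds that only describe layout or comments; this sketch skips them
# (an assumption: they say little about the structure of the code).
SKIPPED = {
    tokenize.COMMENT,
    tokenize.NL,
    tokenize.NEWLINE,
    tokenize.INDENT,
    tokenize.DEDENT,
    tokenize.ENCODING,
    tokenize.ENDMARKER,
}


def tokenize_python(source):
    """Return a list of {"type", "value", "line"} dicts for Python source text.

    Hypothetical helper for illustration only.
    """
    tokens = []
    reader = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(reader):
        if tok.type in SKIPPED:
            continue
        tokens.append({
            "type": token.tok_name[tok.type],  # e.g. NAME, OP, NUMBER
            "value": tok.string,
            "line": tok.start[0],              # 1-based line number
        })
    return tokens


if __name__ == "__main__":
    print(json.dumps(tokenize_python("x = 1  # note\nprint(x)\n"), indent=2))
```

    Keeping the line number with each token is what would later let an interface map a matched token sequence back to a region of the displayed code.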

  2. Worked on Visualization tools for Plagiarism Interface

    Worked on the new interface of the Plagiarism Detector, which lets instructors see plagiarism results directly. Worked on visualization tools that make the results more intuitive and help differentiate plagiarism from coincidental matching.

    There can be different kinds of matches, such as common code, suspicious matches, and matches with instructor-provided code.

    Implemented visualization tools such as:

    a) Used colors in the code boxes where code is displayed to differentiate between the various types of matches. Also added click events on the colored sections to display whom that section matches with.

    Link to commits (merged)-

    Lichen first draft (#2239)

    Lichen color click events (#2343)

    Lichen minor modifications (#2592)

    Lichen main page modifications (#2626)
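
    The idea of coloring regions by match type can be sketched as follows. This is only an illustration: the category names follow the match kinds listed above, but the function `classify_region`, the colors, and the "half of the class" threshold for common code are all assumptions, not what the interface actually uses.

```python
# Hypothetical colors; the real interface may use different ones.
COLORS = {
    "provided": "green",     # matches instructor-provided code
    "common": "gray",        # shared by a large part of the class
    "suspicious": "yellow",  # shared with only a few other students
    "none": "white",         # no match at all
}


def classify_region(matching_users, in_provided_code, total_users,
                    common_threshold=0.5):
    """Classify one region of a submission (hypothetical rule).

    matching_users   -- set of other users whose code matches this region
    in_provided_code -- True if the region also appears in instructor code
    total_users      -- number of users who submitted the gradeable
    common_threshold -- assumed fraction of the class above which a match
                        is treated as common code
    """
    if in_provided_code:
        return "provided"
    if not matching_users:
        return "none"
    if total_users > 0 and len(matching_users) / total_users >= common_threshold:
        return "common"
    return "suspicious"


def color_for_region(matching_users, in_provided_code, total_users):
    """Return the display color for a region."""
    return COLORS[classify_region(matching_users, in_provided_code, total_users)]
```

    Under these assumptions, a region shared with two students out of ten would be shown as suspicious, while a region shared with five or more would be treated as common code.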

  3. Implemented backend for New Plagiarism Detector

    Link to commits (merged)-

    Lichen first draft (#2239)

  4. Automated the new Plagiarism Interface by integrating it with the Submitty Daemon

    Automated the plagiarism detector so that its jobs can be triggered from the interface itself. This includes creating the configuration file for the gradeable on which plagiarism detection is to be run, editing the configuration, rerunning the plagiarism detector for a gradeable, and deleting the plagiarism results for a gradeable.

    Link to commits (merged)-

    Run lichen plagiarism as submitty daemon job (#2423)
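
    A rough sketch of this workflow: the interface saves a configuration file for the gradeable, then drops a small job file into a queue directory that a daemon watches. Everything here is hypothetical (the helper names `save_config` and `queue_job`, the directory layout, the file names, and the JSON keys); it does not reproduce Submitty's actual configuration or job format.

```python
import json
import tempfile
from pathlib import Path


def save_config(config_dir, gradeable, language, sequence_length):
    """Create or overwrite the config file for one gradeable.

    The keys below are assumed for illustration only.
    """
    config_dir = Path(config_dir)
    config_dir.mkdir(parents=True, exist_ok=True)
    path = config_dir / f"{gradeable}.json"
    config = {
        "gradeable": gradeable,
        "language": language,
        "sequence_length": sequence_length,
    }
    path.write_text(json.dumps(config, indent=2))
    return path


def queue_job(queue_dir, gradeable, action):
    """Write a job file for the daemon to pick up (hypothetical format)."""
    if action not in ("run", "delete"):
        raise ValueError(f"unknown action: {action}")
    queue_dir = Path(queue_dir)
    queue_dir.mkdir(parents=True, exist_ok=True)
    path = queue_dir / f"{action}__{gradeable}.json"
    path.write_text(json.dumps({"job": action, "gradeable": gradeable}))
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        save_config(Path(tmp) / "config", "hw1", "python", 5)
        job = queue_job(Path(tmp) / "queue", "hw1", "run")
        print(job.name, job.read_text())
```

    Editing a configuration is just another call to `save_config` followed by a new "run" job, and deleting results is a "delete" job, so the web interface never has to run the detector itself.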

Ongoing task

  1. Initial Test Suite for Plagiarism Detector (no pull request yet)

    I am currently working on testing and debugging the Plagiarism Detector. This involves creating a regression test that checks whether tokenization, hashing of token sequences, and matching of hashes are done correctly. The regression test will then be integrated into Travis.
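
    The hashing and matching steps such a test would cover can be sketched like this. It is a simplified stand-in, not Lichen's code: the window size of 3, the use of SHA-1, and the function names are assumptions chosen for the example.

```python
import hashlib


def hash_sequences(tokens, window=3):
    """Hash every run of `window` consecutive tokens (assumed scheme)."""
    hashes = []
    for i in range(len(tokens) - window + 1):
        # Join with a separator that cannot appear inside a token value.
        chunk = "\x1f".join(tokens[i:i + window])
        hashes.append(hashlib.sha1(chunk.encode("utf-8")).hexdigest())
    return hashes


def matching_positions(hashes_a, hashes_b):
    """Return the indices in hashes_a whose hash also appears in hashes_b."""
    seen = set(hashes_b)
    return [i for i, h in enumerate(hashes_a) if h in seen]


def test_hashing_and_matching():
    """Regression-style check on a tiny, hand-verifiable input."""
    a = ["int", "x", "=", "1", ";"]
    b = ["int", "y", "=", "1", ";"]
    hashes_a = hash_sequences(a)
    hashes_b = hash_sequences(b)
    # Five tokens with a window of three give three hashes each.
    assert len(hashes_a) == 3 and len(hashes_b) == 3
    # Only the last window ("=", "1", ";") is shared by both files.
    assert matching_positions(hashes_a, hashes_b) == [2]
    # Input shorter than the window produces no hashes.
    assert hash_sequences(["x"]) == []


if __name__ == "__main__":
    test_hashing_and_matching()
    print("ok")
```

    Small inputs whose expected matches can be worked out by hand make it easy to tell which stage broke when the test fails.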

New Features Implemented and Bugs Fixed along with GSoC Project

  1. Extended Registration section from numeric to alphanumeric

    Link to commits (merged)-

    Alphanumeric registration section (#2069)

  2. Implemented Delete Gradeable Feature

    This feature lets an instructor delete a gradeable, provided certain constraints are met.

    Link to commits (merged)-

    Delete gradeable feature (#2031)

  3. Implemented Team Export & Import Feature from one gradeable to other

    This feature helps an instructor copy teams from one gradeable to another. It is useful when the instructor wants the same teams across many gradeables in a course.

    Link to commits (merged)-

    Team member export and import feature (#1982)

  4. Implemented Rebuild Assignment feature

    This feature allows an instructor to rebuild a gradeable from the interface itself, rather than logging in to the server and running the rebuild script there.

    Link to commits (merged)-

    Assignments can be rebuild from interface (#2105)

Commit History (including all commits which got merged)

Commits in Submitty/Submitty repo

Commits in Submitty/Lichen repo

Building the project

Building the project requires building the complete Submitty system.

  1. Instructions for building Submitty: Developer/VM Install using Vagrant

  2. To use Submitty and its Plagiarism Detector, follow the instructions at Developer/Installation and Instructor/Plagiarism Detection