Submitty logo

Submitty Student Auto Feed is an optional command-line PHP script that can automatically fill or update classlists on a cron schedule.

Table of Contents

  1. Requirements
  2. Files
  3. Deprecated Files
  4. Course Database Backups
  5. Student CSV
  6. Install on Ubuntu 16.04
  7. Configuration
  8. PAM Authentication and accounts.php

1. Requirements

Back To Table of Contents

2. Files

Latest version of the auto feed script and supplmental files will be checked into the master branch in Submitty/Docs/student_auto_feed/

Back To Table of Contents

3. Deprecated Files

These files remain in the repository but have not been updated. They are incompatible with the latest auto feed script and database schema. It is highly inadvisable to use these files.

Back To Table of Contents

4. Course Database Backups

Please use db_backup.py (located in Submitty/Docs/nightly_db_backup/) on a cron schedule to create nightly backups of course databases.

Back To Table of Contents

5. Before Installing Auto Feed Script

It is important that you can receive a regularly updated data feed of student enrollment. The data should be tabulated (like a spreadsheet), but must be written as a CSV file. You will likely need the cooperation from your university’s data warehouse and/or registrar. The CSV file will need to be delivered or provided somewhere that the auto feed script can access.

Please note where the CSV location is as you will need it later.

IMPORTANT – CSV files are traditionally human readable raw text files and the CSVs required by the auto feed script will contain student enrollment data protected by FERPA (U.S. federal statute 20 U.S.C. § 1232g). Please take appropriate information protection measures. SUBMITTY IS NOT RESPONSIBLE FOR YOUR COURSE’S, DEPARTMENT’S, OR UNIVERSITY’S INFORMATION CONTROL POLICIES OR ACTIVITIES.

Back To Table of Contents

5.1 Student CSV Layout

There are nine required columns/fields, and one optional column/field processed by the submitty auto feed script. The student CSV data file may have additional fields, which will be ignored by, but there may not be less than the nine columns/fields required. The columns/fields may be in any order. config.php will map the requisite (and optional) columns/fields to their respective data points. (q.v. CSV Fields Mapping for additional explanation)

The student CSV file must have fields/columns for these data points:
  1. Student’s legal first name
  2. Student’s legal last name
  3. Student’s campus computer systems user account ID
    • The student CSV file should NEVER contain account passwords.
  4. Student’s email address
  5. Course that student is enrolled in
    • This is actually two fields/columns. One is the course “prefix” (often the dept. code), and the other is the course “number”. Combined, they make up the course code as listed in the course catalog. e.g. Prefix is “CSIS” and number is “101” as in “CSIS 101” in the course catalog.
  6. Student’s registration status
    • A student can drop enrollment during an academic term, and this may be reflected by a code in the student CSV file. Alternatively, the student’s enrollment entry, in the CSV file, can be entirely omitted when a course is dropped. The student auto feed script is intended to work with either case.
  7. Which lab/course section a student is enrolled in.
  8. The “term code” for the current academic term must exist in every data row.
This data point is optional in the student CSV file:
  1. Student’s “preferred” first name
    • This is any name a student prefers to be known by.
    • Submitty permits students to manually set their own “preferred” name, even if the student CSV file contains no record of it.
Other Requirements:

Back To Table of Contents

6. Install On Ubuntu 16.04

As these are PHP scripts, they should run on any computer that has PHP 5.4+ and the appropriate extensions installed. However, these instructions will focus on Ubuntu 16.04 (same OS that is supported for Submitty). Ubuntu 16.04 uses PHP 7.0 by default.

As Ubuntu is part of the Debian Linux family, these instructions are very likely to work with other Debian family distributions with, perhaps, minor adjustments.

  1. If they haven’t already been installed, install PHP and the required extensions.
    sudo apt-get install php php-pgsql php-iconv php-ssh2
    
  2. Ensure the extensions are active.
    sudo phpenmod php-pgsql php-iconv php-ssh2
    
  3. Create a directory on the server to run the scripts and copy submitty_student_auto_feed.php and config.php from the repository to your new directory.
    • submitty_student_auto_feed.php and config.php both should reside in the same directory and both must be accessible by the same user account.
    • root is technically not required to run the auto feed, but the account owning the script files will be responsible to run the auto feed via cron.
  4. It is highly advisable for sysadmins to set user-level only permissions to the auto feed script files.
    • submitty_student_auto_feed.php is intended to be executable.
    • config.php is not intended to be executable.
    • The following sets owner only permissions of “Read/Write/Execute” to submitty_student_auto_feed.php and “Read/Write” (non executable) to config.php:
      sudo chmod 0700 student_submitty_auto_feed.php
      sudo chmod 0600 config.php
      

Back To Table of Contents

7. Configuration

Configuration options exist in config.php as “constants”. The goal, here, is to define each constant to a value reflective of your use of Submitty. The provided defaults, while illustrative, typically will not work.

IMPORTANT – these lines are treated as actual PHP program code. define is a function that requires parentheses. Inside the parentheses are (usually) string-values arguments, comma separated. String values must be enclosed in single or double quotes. However, sometimes the value is a whole number or of the keywords true, false, or null. These are not string values and therefore are not enclosed in single or double quotes. Each line must end with a semicolon. Otherwise, the auto feed will throw a syntax or parse error and won’t run.

Best practice is to follow the format as seen in the examples.


Here is an example option:

define('CSV_FILE', '/path/to/datafile.csv');

This defines the constant CSV_FILE and sets it to the value /path/to/datafile.csv.

Do not change the constant. Only change the constant’s value.

We would need to change the value to reflect where the student data CSV is located (did you note this back in chapter 5?). For example, if your data warehouse delivers the feed CSV to /users/datawarehouse/enrollment.csv – then change the line to read:

define('CSV_FILE', '/users/datawarehouse/enrollment.csv');

There are a couple other options to set besides define: date_default_timezone_set and ini_set.

Following is a list of each constant and what they represent. Summaries are also provided as “code comments” within config.php. Consistent with C and Java styles, PHP code comments either begin with double slashes // or are multiple lines between /* and */. Using a text editor with syntax highlighting will be highly beneficial as code comments will be given a unique text color (text coloring will vary from editor to editor).

Back To Table of Contents

7.1 Configurations

These options are set in config.php. config.php must exist in the same directory and be accessible by the same user account as submitty_student_auto_feed.php.

Back To Table of Contents

Database Connection

define('DB_HOST',     'submitty.cs.myuniversity.edu');
define('DB_LOGIN',    'hsdbu');
define('DB_PASSWORD', 'DB.p4ssw0rd');

These options specify the login to the Submitty database for the hostname of the database, the user login (typically hsdbu), and the password (same as used in Submitty setup).

IMPORTANT – Without this configuration, the auto feed cannot add or update course enrollments.

Note that the database is often on the same server as Submitty, but this is not required. The database can be hosted on a separate server from Submitty.

Back To Table of Contents

Error Logging

define('ERROR_EMAIL',    'sysadmins@lists.myuniversity.edu');
define('ERROR_LOG_FILE', '/location/of/auto_feed_error.log');

When an error occurs, it is written to a raw text logfile. The location of the logfile must be specified and must be accessible by the user account running the auto feed.

Error messages can also be emailed, presumably to a sysadmin or a mailing list monitored by an IT dept (highly recommended). Emailing error messages can be disabled by setting the value to null (without quotes).

IMPORTANT – error-log email may be considered unauthenticated email by many Universities. Your campus may restrict or outright deny delivery of the error-log emails. Consult with your University’s IT department regarding unauthenticated email.

Back To Table of Contents

CSV File Access

define('CSV_AUTH',               'remote_keypair');
define('CSV_FILE',               '/path/to/datafile.csv');
define('CSV_REMOTE_SERVER',      'fileserver.myuniversity.edu');
define('CSV_AUTH_USER',          'remote_user');
define('CSV_AUTH_PASSWORD',      null);
define('CSV_AUTH_PUBKEY',        '/path/to/rsa_key.pub');
define('CSV_AUTH_PRIVKEY',       '/path/to/rsa_key.pfx');
define('CSV_PRIVKEY_PASSPHRASE', 'MySecretPassphrase');

These constants define how the CSV data can be accessed.

Back To Table of Contents

CSV Delimiter

define('CSV_DELIM_CHAR', chr(9));

Every CSV file has a delimiter character that separates each data field, and this delimiter needs to be specified. Usually, the delimiter is a comma, but it can be any character from the standard ASCII table. The delimiter character can be directly quoted (e.g. ',' for comma) or specified by its ASCII value (e.g. chr(44) for comma). In the example above, chr(9) is the tab key.

Here are some example delimiters:

Back To Table of Contents

CSV Validation

define('VALIDATE_MIN_FILESIZE', 65536);
define('VALIDATE_NUM_FIELDS',   10);

These options are used to (loosely) detect a bad CSV file.

Back To Table of Contents

CSV Fields Mapping

define('COLUMN_COURSE_PREFIX', 8);
define('COLUMN_COURSE_NUMBER', 9);
define('COLUMN_REGISTRATION',  7);
define('COLUMN_SECTION',       10);
define('COLUMN_USER_ID',       5);
define('COLUMN_FIRSTNAME',     2);
define('COLUMN_LASTNAME',      1);
define('COLUMN_PREFERREDNAME', 3);
define('COLUMN_EMAIL',         4);
define('COLUMN_TERM_CODE',     0);

Each of these constants represents the data fields that must be read from the student data CSV. Different universities will order the data differently, therefore the auto feed requires these define functions to determine which columns hold the needed data.

IMPORTANT – The integer values actually represent array indices, and as is common convention in programming, array indices start counting at zero. That is, the first column of the CSV is #0, the second column is #1, the third column is #2, and so on.

Back To Table of Contents

Student Registration Codes

define('STUDENT_REGISTERED_CODES', serialize( array(
'RA',
'RW'
)));

This option is a little more complicated to look at, but is actually not any more difficult than the others. This define is creating a list (a.k.a. “array”) of all the registration codes that indicate a student is enrolled in a course. Your student CSV may include students who were enrolled in a course, but that enrollment may change mid-term.

In the above example, the two codes RA and RW indicate a student is enrolled in a course. In this case, RA may mean “Registered by Adviser” and RW may mean “Registered by Web”.

Back To Table of Contents

Expected Term Code

define('EXPECTED_TERM_CODE', '201705');

This check will ensure that a course’s enrollment database does not accidentally get clobbered by a student enrollment list for a different term.

Every term (semester) should be associated with a unique code. This code will have to be updated by a sysadmin, as needed.

Per example, above, 201705 might be a code for the Summer 2017 semester. That is 201705 might be year 2017, starting month 5 (which is May).

The student auto feed will check every row for this code and compare it with this define statement. Rows that do not match the define value will be ignored. It is possible that when one row does not match, all rows will not match.

Back To Table of Contents

Windows Encoding Conversion

define('CONVERT_CP1252', true);

If your student CSV originates from a Windows computer, the auto feed may need to do a text encoding conversion from CP-1252 to UTF-8; especially when the CSV character data is expected to include diacritics. Set CONVERT_CP1252 to true if the student CSV originates from a Windows computer. Otherwise, set to false.

Back To Table of Contents

End of Line Detection

ini_set('auto_detect_line_endings', true);

In summary, this define shouldn’t be changed. It ensures that CSV files exported by Microsoft Excel for Macintosh are correctly processed.

Back To Table of Contents

About “allow_url_fopen”

ini_set("allow_url_fopen", true);

This entry is only relevant when the student CSV file must be remotely accessed
(q.v. CSV File Access). Setting this value to true permits resolving the domain name of another server. Otherwise, the server would have to be accessed via IP address.

This setting is irrelevant when the student CSV can be locally accessed.

Back To Table of Contents

Timezone

date_default_timezone_set('America/New_York');

This option must be set to your timezone. The example, above, is set to Eastern timezone.

Suggested Settings For Timezones in USA
Timezone Suggested Setting
Eastern America/New_York
Central America/Chicago
Mountain America/Denver
Mountain (no daylight savings) America/Phoenix
Pacific America/Los_Angeles
Alaska America/Anchorage
Hawaii America/Adak
Hawaii (no daylight savings) Pacific/Honolulu

For a complete list of timezones: http://php.net/manual/en/timezones.php

Back To Table of Contents

PAM Authentication and accounts.php

The script accounts.php will automate the creation of local accounts used with PAM authentication.

This script is not needed when using database authentication.

accounts.php must exist on the same server as Submitty, and it needs to be run as root. When the script is run, it will look for all existing courses and read all user entries from Submitty’s course databases. Any missing local accounts are automatically generated.

It is recommended that this script is run every hour as a cron job. Should an instructor manually add a student to their course, the student’s access to Submitty will be available “within an hour”.

Back To Table of Contents