Submitty Student Auto Feed is an optional command-line PHP script that can automatically fill or update classlists on a cron schedule.
1. Requirements
- Submitty Student Auto Feed is intended to be managed by a systems administrator or similar IT professional.
- PHP 5.6 or higher with
pgsql
andiconv
extensions.- Although not a necessity, the auto feed script can operate on the same server that Submitty is running on.
- The
ssh2
extension is also required if the data feed CSV resides on a different server than the script is running from (this also includes running the script on the Submitty server).
- A regularly updated CSV data feed of student enrollment.
- Contact your university’s registrar and/or data warehouse for assistance.
2. Files
Latest version of the auto feed script and supplmental files will be checked into the master
branch in SysadminTools/student_auto_feed/
submitty_student_auto_feed.php
– Executable PHP script to read student registration CSV and update Submitty classlist enrollment.config.php
– REQUIRED config file forsubmitty_student_auto_feed.php
accounts.php
– IMPORTANT for Submitty servers utilizing PAM authentication.
3. Course Database Backups
Please use db_backup.py
(located in SysadminTools/nightly_db_backup/
) on a cron schedule to create nightly backups of course databases.
Run db_backup.py -h
to see extended help and argument list.
4. Before Installing Auto Feed Script
It is important that you can receive a regularly updated data feed of student enrollment. The data should be tabulated (like a spreadsheet), but must be written as a CSV file. You will likely need the cooperation from your university’s data warehouse and/or registrar. The CSV file will need to be delivered or provided somewhere that the auto feed script can access.
Please note where the CSV location is as you will need it later.
IMPORTANT – CSV files are traditionally human readable raw text files and the CSVs required by the auto feed script will contain student enrollment data protected by FERPA (U.S. federal statute 20 U.S.C. § 1232g). Please take appropriate information protection measures. SUBMITTY IS NOT RESPONSIBLE FOR YOUR COURSE’S, DEPARTMENT’S, OR UNIVERSITY’S INFORMATION CONTROL POLICIES OR ACTIVITIES.
4.1 Student CSV Layout
There are ten required columns/fields, and one optional column/field processed by the submitty auto feed script.
The student CSV data file may have additional fields, which will be ignored by, but there may not be less than the ten columns/fields required.
The columns/fields may be in any order.
config.php
will map the requisite (and optional) columns/fields to their respective data points.
(q.v. CSV Fields Mapping for additional explanation)
The student CSV file must have fields/columns for these data points:
- Student’s legal first name
- Student’s legal last name
- Student’s campus computer systems user account ID
- The student CSV file should NEVER contain account passwords.
- An alternate numeric ID.
- This is intended to be the student’s campus assign ID number, but could be any alternate alphanumeric ID code assigned to the student by the data feed.
- This ID code can contain delimiter characters, such as dashes.
- Student’s email address
- Course the student is enrolled in
- This is actually two fields/columns. One is the course “prefix” (often the dept. code), and the other is the course “number”. Combined, they make up the course code as listed in the course catalog. e.g. Prefix is “CSIS” and number is “101” as in “CSIS 101” in the course catalog.
- Student’s registration status
- A student can drop enrollment during an academic term, and this may be reflected by a code in the student CSV file. Alternatively, the student’s enrollment entry, in the CSV file, can be entirely omitted when a course is dropped. The student auto feed script is intended to work with either case.
- Which lab/course section a student is enrolled in.
- The “term code” for the current academic term must exist in every data row.
This data point is optional in the student CSV file:
- Student’s “preferred” first name
- This is any name a student prefers to be known by.
- Submitty permits students to manually set their own “preferred” name, even if the student CSV file contains no record of it.
Other Requirements:
- Every column/field must be delimited by a the same character.
- While commas are a common delimiter,
tab
characters are recommended. - Columns/field data should not be enclosed by quotes. Quotation marks may be picked up as part of the data and fail certain validation checks.
5. Install On Ubuntu Server
As these are PHP scripts, they should run on any computer that has PHP 5.6+ and the appropriate extensions installed. However, these instructions will focus on Ubuntu server (same OS that is supported for Submitty). Ubuntu 16.04 uses PHP 7.0 by default, and Ubuntu 18.04 uses PHP 7.2 by default.
As Ubuntu is part of the Debian Linux family, these instructions are very likely to work with other Debian family distributions with, perhaps, minor adjustments.
- If they haven’t already been installed, install PHP and the required extensions.
sudo apt-get install php php-pgsql php-iconv php-ssh2
NOTE:
php-iconv
is not needed in Ubuntu 18.04. It is part of thephp-common
package installed withphp
. - Ensure the extensions are active.
sudo phpenmod php-pgsql php-iconv php-ssh2
- Create a directory on the server to run the scripts and copy
submitty_student_auto_feed.php
andconfig.php
from the repository to your new directory.submitty_student_auto_feed.php
andconfig.php
both should reside in the same directory and both must be accessible by the same user account.root
is technically not required to run the auto feed, but the account owning the script files will be responsible to run the auto feed via cron.
- File permissions:
submitty_student_auto_feed.php
is intended to be executable.config.php
is not intended to be executable.- The following sets owner only permissions of “Read/Write/Execute” to
submitty_student_auto_feed.php
and “Read/Write” (non executable) toconfig.php
:sudo chmod 0700 student_submitty_auto_feed.php sudo chmod 0600 config.php
6. Command Line Arguments
-h --help |
Extended help including usage and argument list. |
-t [term code] |
Manually set the term code. |
7. Configuration
Configuration options exist in config.php
as “constants”.
The goal, here, is to define each constant to a value reflective of your use of Submitty.
The provided defaults, while illustrative, typically will not work.
IMPORTANT – these lines are treated as actual PHP program code.
define
is a function that requires parentheses.
Inside the parentheses are (usually) string-values arguments, comma separated.
String values must be enclosed in single or double quotes.
However, sometimes the value is a whole number or of the keywords true
, false
, or null
.
These are not string values and therefore are not enclosed in single or double quotes.
Each line must end with a semicolon.
Otherwise, the auto feed will throw a syntax or parse error and won’t run.
Best practice is to follow the format as seen in the examples.
Here is an example option:
define('CSV_FILE', '/path/to/datafile.csv');
This defines the constant CSV_FILE
and sets it to the value /path/to/datafile.csv
.
Do not change the constant. Only change the constant’s value.
We would need to change the value to reflect where the student data CSV is located (did you note this back in chapter 5?).
For example, if your data warehouse delivers the feed CSV to /users/datawarehouse/enrollment.csv
– then change the line to read:
define('CSV_FILE', '/users/datawarehouse/enrollment.csv');
There are a couple other options to set besides define
: date_default_timezone_set
and ini_set
.
Following is a list of each constant and what they represent.
Summaries are also provided as “code comments” within config.php
.
Consistent with C and Java styles, PHP code comments either begin with double slashes //
or are multiple lines between /*
and */
.
Using a text editor with syntax highlighting will be highly beneficial as code comments will be given a unique text color (text coloring will vary from editor to editor).
7.1 Configurations
These options are set in config.php
.
config.php
must exist in the same directory and be accessible by the same user account as submitty_student_auto_feed.php
.
Database Connection
define('DB_HOST', 'submitty.cs.myuniversity.edu');
define('DB_LOGIN', 'submitty_dbuser');
define('DB_PASSWORD', 'DB.p4ssw0rd');
These options specify the login to the Submitty database for the hostname of the database, the user login (typically submitty_dbuser
), and the password (same as used in Submitty setup).
IMPORTANT – Without this configuration, the auto feed cannot add or update course enrollments.
Note that the database is often on the same server as Submitty, but this is not required. The database can be hosted on a separate server from Submitty.
Error Logging
define('ERROR_EMAIL', 'sysadmins@lists.myuniversity.edu');
define('ERROR_LOG_FILE', '/location/of/auto_feed_error.log');
When an error occurs, it is written to a raw text logfile. The location of the logfile must be specified and must be accessible by the user account running the auto feed.
Error messages can also be emailed, presumably to a sysadmin or a mailing list monitored by an IT dept (highly recommended).
Emailing error messages can be disabled by setting the value to null
(without quotes).
IMPORTANT – error-log email may be considered unauthenticated email by many Universities. Your campus may restrict or outright deny delivery of the error-log emails. Consult with your University’s IT department regarding unauthenticated email.
CSV File Access
define('CSV_FILE', '/path/to/datafile.csv');
These constants define how the CSV data can be accessed.
CSV_FILE
is the absolute path to the CSV data file.- This script does not currently support network access to the CSV data file.
CSV Delimiter
define('CSV_DELIM_CHAR', chr(9));
Every CSV file has a delimiter character that separates each data field, and this delimiter needs to be specified.
Usually, the delimiter is a comma, but it can be any character from the standard ASCII table.
The delimiter character can be directly quoted (e.g. ','
for comma) or specified by its ASCII value (e.g. chr(44)
for comma).
In the example above, chr(9)
is the tab key.
Here are some example delimiters:
- Comma
define('CSV_DELIM_CHAR', ',');
- Comma by ASCII value
define('CSV_DELIM_CHAR', chr(44));
- Tilde
define('CSV_DELIM_CHAR', '~');
- Semicolon
define('CSV_DELIM_CHAR', ';');
CSV Validation
define('VALIDATE_MIN_FILESIZE', 65536);
define('VALIDATE_NUM_FIELDS', 10);
These options are used to (loosely) detect a bad CSV file.
-
VALIDATE_MIN_FILESIZE
sets the acceptable minimum file size as an integer in bytes. This is useful to detect an egregiously small CSV that could indicate data corruption (such as a file containing end-of-line characters, but no actual data).It is possible to snare a legitimate CSV as a false-positive, so setting this value relatively small, but greater than zero, is advised. Here are a couples tips to help you decide what should be a reasonable validation filesize.
- A CSV with 5,120 end-of-line chars (empty rows) will be 5,120 bytes (5 kilobytes) in size. CP-1252 (MS-Windows) encoded CSVs have two end-of-line chars per row, so 5,120 empty rows will make up a 10 kilobyte CSV.
- As seen in the example above, 65,536 bytes = 64 kilobytes.
-
VALIDATE_NUM_FIELDS
is a check to make sure that an exact number of fields/columns is present in every row of the CSV. Any row that does not have this exact value is expected to have unreliable data and is ignored by the auto feed script. This value includes any extraneous fields/columns that your University’s registrar/data warehouse provides.Even though the auto feed requires ten columns, the CSV being provided may have more. If so, use the number of columns in the CSV to set this option. Otherwise, all columns may be ignored and no enrollment additions or updates will be recorded.
CSV Fields Mapping
define('COLUMN_COURSE_PREFIX', 8);
define('COLUMN_COURSE_NUMBER', 9);
define('COLUMN_REGISTRATION', 7);
define('COLUMN_SECTION', 10);
define('COLUMN_USER_ID', 5);
define('COLUMN_NUMERIC_ID', 6);
define('COLUMN_FIRSTNAME', 2);
define('COLUMN_LASTNAME', 1);
define('COLUMN_PREFERREDNAME', 3);
define('COLUMN_EMAIL', 4);
define('COLUMN_TERM_CODE', 0);
Each of these constants represents the data fields that must be read from the student data CSV.
Different universities will order the data differently, therefore the auto feed requires these define
functions to determine which columns hold the needed data.
IMPORTANT – The integer values actually represent array indices, and as is common convention in programming, array indices start counting at zero. That is, the first column of the CSV is #0, the second column is #1, the third column is #2, and so on.
-
COLUMN_COURSE_PREFIX
represents the course prefix (often the dept. code) as seen in the course catalog. The entire course code would be the prefix and number combined. -
COLUMN_COURSE_NUMBER
represents the course number as seen in the course catalog. The entire course code would be the prefix and number combined.IMPORTANT: The student CSV must have separate columns for the prefix/dept code and the course number. This permits the auto feed script to process enrollment for courses in multiple departments.
For Example Prefix “CSIS” (Computer Science and Information Systems dept.) Number “101” (introductory programming course) Course Listed In Catalog “CSIS 101” or “CSIS-101” -
COLUMN_REGISTRATION
represents the column that has a code to verify that a student is enrolled or not. q.v. Student Registration Codes -
COLUMN_SECTION
is the column representing the lab/course section a student is enrolled in. -
COLUMN_USER_ID
is a student’s computing systems user ID, and is also used to login to Submitty. -
COLUMN_NUMERIC_ID
is intended to represent the student’s campus assigned ID number, but could be any alternate ID number provided by a data feed. Although this column is called “numeric”, any alphanumeric code is permitted with delimiter characters, such as dashes. -
COLUMN_FIRSTNAME
is a student’s legal first name. -
COLUMN_LASTNAME
is a student’s legal last name. -
COLUMN_PREFERREDNAME
is the name a student wishes to be known by. If your university’s student data CSV doesn’t have this column, set this tonull
. -
COLUMN_EMAIL
contains a student’s email address. -
COLUMN_TERM_CODE
contains the code describing the current term. This is checked against the “expected” term code for validation. q.v. Expected Term Code
Student Registration Codes
define('STUDENT_REGISTERED_CODES', array('RA', 'RW'));
This option is a little more complicated to look at, but is actually not any more difficult than the others.
This define
is creating a list (a.k.a. “array”) of all the registration codes that indicate a student is enrolled in a course.
Your student CSV may include students who were enrolled in a course, but that enrollment may change mid-term.
In the above example, the two codes RA
and RW
indicate a student is enrolled in a course.
In this case, RA
may mean “Registered by Adviser” and RW
may mean “Registered by Web”.
- You will need to coordinate with your University’s registrar or data warehouse to determine what all the enrollment codes are.
- You will need to replace/remove/add enrollment codes to this
define
that are found in your student CSV. - This example has two codes, but you may have more codes or only one code.
- Even if there is only one registration code, you must have the
array(
and));
program code. - Don’t forget the commas – just like in an English list, every item (code) is separated by commas.
- Any student not associated with a registration code as listed in this option is assumed to have dropped the course or has otherwise been unregistered for some reason. In which case, an update will occur in Submitty’s database to reflect the student is no longer enrolled in that course.
Expected Term Code
define('EXPECTED_TERM_CODE', '201705');
This check will ensure that a course’s enrollment database does not accidentally get clobbered by a student enrollment list for a different term.
Every term (semester) should be associated with a unique code. This code will have to be updated by a sysadmin, as needed.
Per example, above, 201705
might be a code for the Summer 2017 semester.
That is 201705
might be year 2017, starting month 5 (May).
The student auto feed will check every row for this code and compare it with this define
statement.
Rows that do not match the define
value will be ignored.
It is possible that when one row does not match, all rows will not match.
Windows Encoding Conversion
define('CONVERT_CP1252', true);
If your student CSV originates from a Windows computer, the auto feed may need to do a text encoding conversion from CP-1252 to UTF-8; especially when the CSV character data is expected to include diacritics.
Set CONVERT_CP1252
to true
if the student CSV originates from a Windows computer.
Otherwise, set to false
.
End of Line Detection
ini_set('auto_detect_line_endings', true);
In summary, this define
shouldn’t be changed.
It ensures that CSV files exported by Microsoft Excel for Macintosh are correctly processed.
Timezone
date_default_timezone_set('America/New_York');
This option must be set to your timezone. The example, above, is set to Eastern timezone.
Suggested Settings For Timezones in USA
Timezone | Suggested Setting |
---|---|
Eastern | America/New_York |
Central | America/Chicago |
Mountain | America/Denver |
Mountain (no daylight savings) | America/Phoenix |
Pacific | America/Los_Angeles |
Alaska | America/Anchorage |
Hawaii | America/Adak |
Hawaii (no daylight savings) | Pacific/Honolulu |
For a complete list of timezones: http://php.net/manual/en/timezones.php
8. PAM Authentication and accounts.php
The script accounts.php
will automate the creation of local accounts used with PAM authentication.
This script is not needed when using database authentication.
accounts.php
must exist on the same server as Submitty, and it needs to be run as root
.
This script is intended to read user entries from Submitty’s “master” database to generate any missing local accounts needed for PAM authentication.
Run accounts.php -h
to see extended help and argument list.
It is recommended that this script is run every hour as a cron job. That way, should an instructor manually add a student to their course, the student’s access to Submitty will be available “within an hour”.