Submitty Student Auto Feed is an optional command-line PHP script that can automatically fill or update classlists on a cron schedule.
- Submitty Student Auto Feed is intended to be managed by a systems administrator or similar IT professional.
- PHP 5.6 or higher with
- Although not a necessity, the auto feed script can operate on the same server that Submitty is running on.
ssh2extension is also required if the data feed CSV resides on a different server than the script is running from (this also includes running the script on the Submitty server).
- A regularly updated CSV data feed of student enrollment.
- Contact your university’s registrar and/or data warehouse for assistance.
Latest version of the auto feed script and supplmental files will be checked into the
master branch in
submitty_student_auto_feed.php– Executable PHP script to read student registration CSV and update Submitty classlist enrollment.
config.php– REQUIRED config file for
accounts.php– IMPORTANT for Submitty servers utilizing PAM authentication.
3. Course Database Backups
db_backup.py (located in
SysadminTools/nightly_db_backup/) on a cron schedule to create nightly backups of course databases.
db_backup.py -h to see extended help and argument list.
4. Before Installing Auto Feed Script
It is important that you can receive a regularly updated data feed of student enrollment. The data should be tabulated (like a spreadsheet), but must be written as a CSV file. You will likely need the cooperation from your university’s data warehouse and/or registrar. The CSV file will need to be delivered or provided somewhere that the auto feed script can access.
Please note where the CSV location is as you will need it later.
IMPORTANT – CSV files are traditionally human readable raw text files and the CSVs required by the auto feed script will contain student enrollment data protected by FERPA (U.S. federal statute 20 U.S.C. § 1232g). Please take appropriate information protection measures. SUBMITTY IS NOT RESPONSIBLE FOR YOUR COURSE’S, DEPARTMENT’S, OR UNIVERSITY’S INFORMATION CONTROL POLICIES OR ACTIVITIES.
4.1 Student CSV Layout
There are ten required columns/fields, and one optional column/field processed by the submitty auto feed script.
The student CSV data file may have additional fields, which will be ignored by, but there may not be less than the ten columns/fields required.
The columns/fields may be in any order.
config.php will map the requisite (and optional) columns/fields to their respective data points.
(q.v. CSV Fields Mapping for additional explanation)
The student CSV file must have fields/columns for these data points:
- Student’s legal first name
- Student’s legal last name
- Student’s campus computer systems user account ID
- The student CSV file should NEVER contain account passwords.
- An alternate numeric ID.
- This is intended to be the student’s campus assign ID number, but could be any alternate alphanumeric ID code assigned to the student by the data feed.
- This ID code can contain delimiter characters, such as dashes.
- Student’s email address
- Course the student is enrolled in
- This is actually two fields/columns. One is the course “prefix” (often the dept. code), and the other is the course “number”. Combined, they make up the course code as listed in the course catalog. e.g. Prefix is “CSIS” and number is “101” as in “CSIS 101” in the course catalog.
- Student’s registration status
- A student can drop enrollment during an academic term, and this may be reflected by a code in the student CSV file. Alternatively, the student’s enrollment entry, in the CSV file, can be entirely omitted when a course is dropped. The student auto feed script is intended to work with either case.
- Which lab/course section a student is enrolled in.
- The “term code” for the current academic term must exist in every data row.
This data point is optional in the student CSV file:
- Student’s “preferred” first name
- This is any name a student prefers to be known by.
- Submitty permits students to manually set their own “preferred” name, even if the student CSV file contains no record of it.
- Every column/field must be delimited by a the same character.
- While commas are a common delimiter,
tabcharacters are recommended.
- Columns/field data should not be enclosed by quotes. Quotation marks may be picked up as part of the data and fail certain validation checks.
5. Install On Ubuntu Server
As these are PHP scripts, they should run on any computer that has PHP 5.6+ and the appropriate extensions installed. However, these instructions will focus on Ubuntu server (same OS that is supported for Submitty). Ubuntu 16.04 uses PHP 7.0 by default, and Ubuntu 18.04 uses PHP 7.2 by default.
As Ubuntu is part of the Debian Linux family, these instructions are very likely to work with other Debian family distributions with, perhaps, minor adjustments.
- If they haven’t already been installed, install PHP and the required extensions.
sudo apt-get install php php-pgsql php-iconv php-ssh2
php-iconvis not needed in Ubuntu 18.04. It is part of the
php-commonpackage installed with
- Ensure the extensions are active.
sudo phpenmod php-pgsql php-iconv php-ssh2
- Create a directory on the server to run the scripts and copy
config.phpfrom the repository to your new directory.
config.phpboth should reside in the same directory and both must be accessible by the same user account.
rootis technically not required to run the auto feed, but the account owning the script files will be responsible to run the auto feed via cron.
- File permissions:
submitty_student_auto_feed.phpis intended to be executable.
config.phpis not intended to be executable.
- The following sets owner only permissions of “Read/Write/Execute” to
submitty_student_auto_feed.phpand “Read/Write” (non executable) to
sudo chmod 0700 student_submitty_auto_feed.php sudo chmod 0600 config.php
6. Command Line Arguments
||Extended help including usage and argument list.|
||Manually set the term code.|
Configuration options exist in
config.php as “constants”.
The goal, here, is to define each constant to a value reflective of your use of Submitty.
The provided defaults, while illustrative, typically will not work.
IMPORTANT – these lines are treated as actual PHP program code.
define is a function that requires parentheses.
Inside the parentheses are (usually) string-values arguments, comma separated.
String values must be enclosed in single or double quotes.
However, sometimes the value is a whole number or of the keywords
These are not string values and therefore are not enclosed in single or double quotes.
Each line must end with a semicolon.
Otherwise, the auto feed will throw a syntax or parse error and won’t run.
Best practice is to follow the format as seen in the examples.
Here is an example option:
This defines the constant
CSV_FILE and sets it to the value
Do not change the constant. Only change the constant’s value.
We would need to change the value to reflect where the student data CSV is located (did you note this back in chapter 5?).
For example, if your data warehouse delivers the feed CSV to
/users/datawarehouse/enrollment.csv – then change the line to read:
There are a couple other options to set besides
Following is a list of each constant and what they represent.
Summaries are also provided as “code comments” within
Consistent with C and Java styles, PHP code comments either begin with double slashes
// or are multiple lines between
Using a text editor with syntax highlighting will be highly beneficial as code comments will be given a unique text color (text coloring will vary from editor to editor).
These options are set in
config.php must exist in the same directory and be accessible by the same user account as
define('DB_HOST', 'submitty.cs.myuniversity.edu'); define('DB_LOGIN', 'submitty_dbuser'); define('DB_PASSWORD', 'DB.p4ssw0rd');
These options specify the login to the Submitty database for the hostname of the database, the user login (typically
submitty_dbuser), and the password (same as used in Submitty setup).
IMPORTANT – Without this configuration, the auto feed cannot add or update course enrollments.
Note that the database is often on the same server as Submitty, but this is not required. The database can be hosted on a separate server from Submitty.
define('ERROR_EMAIL', 'firstname.lastname@example.org'); define('ERROR_LOG_FILE', '/location/of/auto_feed_error.log');
When an error occurs, it is written to a raw text logfile. The location of the logfile must be specified and must be accessible by the user account running the auto feed.
Error messages can also be emailed, presumably to a sysadmin or a mailing list monitored by an IT dept (highly recommended).
Emailing error messages can be disabled by setting the value to
null (without quotes).
IMPORTANT – error-log email may be considered unauthenticated email by many Universities. Your campus may restrict or outright deny delivery of the error-log emails. Consult with your University’s IT department regarding unauthenticated email.
CSV File Access
These constants define how the CSV data can be accessed.
CSV_FILEis the absolute path to the CSV data file.
- This script does not currently support network access to the CSV data file.
Every CSV file has a delimiter character that separates each data field, and this delimiter needs to be specified.
Usually, the delimiter is a comma, but it can be any character from the standard ASCII table.
The delimiter character can be directly quoted (e.g.
',' for comma) or specified by its ASCII value (e.g.
chr(44) for comma).
In the example above,
chr(9) is the tab key.
Here are some example delimiters:
- Comma by ASCII value
define('VALIDATE_MIN_FILESIZE', 65536); define('VALIDATE_NUM_FIELDS', 10);
These options are used to (loosely) detect a bad CSV file.
VALIDATE_MIN_FILESIZEsets the acceptable minimum file size as an integer in bytes. This is useful to detect an egregiously small CSV that could indicate data corruption (such as a file containing end-of-line characters, but no actual data).
It is possible to snare a legitimate CSV as a false-positive, so setting this value relatively small, but greater than zero, is advised. Here are a couples tips to help you decide what should be a reasonable validation filesize.
- A CSV with 5,120 end-of-line chars (empty rows) will be 5,120 bytes (5 kilobytes) in size. CP-1252 (MS-Windows) encoded CSVs have two end-of-line chars per row, so 5,120 empty rows will make up a 10 kilobyte CSV.
- As seen in the example above, 65,536 bytes = 64 kilobytes.
VALIDATE_NUM_FIELDSis a check to make sure that an exact number of fields/columns is present in every row of the CSV. Any row that does not have this exact value is expected to have unreliable data and is ignored by the auto feed script. This value includes any extraneous fields/columns that your University’s registrar/data warehouse provides.
Even though the auto feed requires ten columns, the CSV being provided may have more. If so, use the number of columns in the CSV to set this option. Otherwise, all columns may be ignored and no enrollment additions or updates will be recorded.
CSV Fields Mapping
define('COLUMN_COURSE_PREFIX', 8); define('COLUMN_COURSE_NUMBER', 9); define('COLUMN_REGISTRATION', 7); define('COLUMN_SECTION', 10); define('COLUMN_USER_ID', 5); define('COLUMN_NUMERIC_ID', 6); define('COLUMN_FIRSTNAME', 2); define('COLUMN_LASTNAME', 1); define('COLUMN_PREFERREDNAME', 3); define('COLUMN_EMAIL', 4); define('COLUMN_TERM_CODE', 0);
Each of these constants represents the data fields that must be read from the student data CSV.
Different universities will order the data differently, therefore the auto feed requires these
define functions to determine which columns hold the needed data.
IMPORTANT – The integer values actually represent array indices, and as is common convention in programming, array indices start counting at zero. That is, the first column of the CSV is #0, the second column is #1, the third column is #2, and so on.
COLUMN_COURSE_PREFIXrepresents the course prefix (often the dept. code) as seen in the course catalog. The entire course code would be the prefix and number combined.
COLUMN_COURSE_NUMBERrepresents the course number as seen in the course catalog. The entire course code would be the prefix and number combined.
IMPORTANT: The student CSV must have separate columns for the prefix/dept code and the course number. This permits the auto feed script to process enrollment for courses in multiple departments.
For Example Prefix “CSIS” (Computer Science and Information Systems dept.) Number “101” (introductory programming course) Course Listed In Catalog “CSIS 101” or “CSIS-101”
COLUMN_REGISTRATIONrepresents the column that has a code to verify that a student is enrolled or not. q.v. Student Registration Codes
COLUMN_SECTIONis the column representing the lab/course section a student is enrolled in.
COLUMN_USER_IDis a student’s computing systems user ID, and is also used to login to Submitty.
COLUMN_NUMERIC_IDis intended to represent the student’s campus assigned ID number, but could be any alternate ID number provided by a data feed. Although this column is called “numeric”, any alphanumeric code is permitted with delimiter characters, such as dashes.
COLUMN_FIRSTNAMEis a student’s legal first name.
COLUMN_LASTNAMEis a student’s legal last name.
COLUMN_PREFERREDNAMEis the name a student wishes to be known by. If your university’s student data CSV doesn’t have this column, set this to
COLUMN_EMAILcontains a student’s email address.
COLUMN_TERM_CODEcontains the code describing the current term. This is checked against the “expected” term code for validation. q.v. Expected Term Code
Student Registration Codes
define('STUDENT_REGISTERED_CODES', array('RA', 'RW'));
This option is a little more complicated to look at, but is actually not any more difficult than the others.
define is creating a list (a.k.a. “array”) of all the registration codes that indicate a student is enrolled in a course.
Your student CSV may include students who were enrolled in a course, but that enrollment may change mid-term.
In the above example, the two codes
RW indicate a student is enrolled in a course.
In this case,
RA may mean “Registered by Adviser” and
RW may mean “Registered by Web”.
- You will need to coordinate with your University’s registrar or data warehouse to determine what all the enrollment codes are.
- You will need to replace/remove/add enrollment codes to this
definethat are found in your student CSV.
- This example has two codes, but you may have more codes or only one code.
- Even if there is only one registration code, you must have the
- Don’t forget the commas – just like in an English list, every item (code) is separated by commas.
- Any student not associated with a registration code as listed in this option is assumed to have dropped the course or has otherwise been unregistered for some reason. In which case, an update will occur in Submitty’s database to reflect the student is no longer enrolled in that course.
Expected Term Code
This check will ensure that a course’s enrollment database does not accidentally get clobbered by a student enrollment list for a different term.
Every term (semester) should be associated with a unique code. This code will have to be updated by a sysadmin, as needed.
Per example, above,
201705 might be a code for the Summer 2017 semester.
201705 might be year 2017, starting month 5 (May).
The student auto feed will check every row for this code and compare it with this
Rows that do not match the
define value will be ignored.
It is possible that when one row does not match, all rows will not match.
Windows Encoding Conversion
If your student CSV originates from a Windows computer, the auto feed may need to do a text encoding conversion from CP-1252 to UTF-8; especially when the CSV character data is expected to include diacritics.
true if the student CSV originates from a Windows computer.
Otherwise, set to
End of Line Detection
In summary, this
define shouldn’t be changed.
It ensures that CSV files exported by Microsoft Excel for Macintosh are correctly processed.
This option must be set to your timezone. The example, above, is set to Eastern timezone.
Suggested Settings For Timezones in USA
|Mountain (no daylight savings)||
|Hawaii (no daylight savings)||
For a complete list of timezones: http://php.net/manual/en/timezones.php
8. PAM Authentication and
accounts.php will automate the creation of local accounts used with PAM authentication.
This script is not needed when using database authentication.
accounts.php must exist on the same server as Submitty, and it needs to be run as
This script is intended to read user entries from Submitty’s “master” database to generate any missing local accounts needed for PAM authentication.
accounts.php -h to see extended help and argument list.
It is recommended that this script is run every hour as a cron job. That way, should an instructor manually add a student to their course, the student’s access to Submitty will be available “within an hour”.