Difference between revisions of "OPS435 Online Assignment 2"

From CDOT Wiki
[[Category:OPS435-Python]][[Category:rchan]]
 
'''Please note that we have different versions of Assignment 2 for each OPS435 Section.'''
::<b><font color='red'>You will lose a lot of marks if you pick the wrong section.
::Check twice before you click.</font></b>

= Section A =
: If you are in OPS435 Section A for the Fall 2020 Semester, please use [[OPS435 Assignment 2 for Section AB]]
: Professor: <font color='green'>Raymond Chan</font>

= Section B =
: If you are in OPS435 Section B for the Fall 2020 Semester, please use [[OPS435 Assignment 2 for Section AB]]
: Professor: <font color='green'>Raymond Chan</font>

= Section C =
: If you are in OPS435 Section C for the Fall 2020 Semester, please use [[OPS435 Assignment 2 for Section C]]
: Professor: <font color='blue'>Eric Brauer</font>

=Assignment 2 - Usage Report=
'''Weight:''' 10% of the overall grade

'''Due Date:''' Please follow the staged submission schedule:
* Complete the algorithm for this assignment's monthly usage report script by July 27, 2020 and submit it on Blackboard,
* Complete your Python script and push it to Github by August 14, 2020 at 9:00 PM.

==Overview==
Most system administrators would like to know how their systems are being used by their users. On a Linux system, each user's login records are normally stored in the binary file /var/log/wtmp. The login records in this binary file cannot be viewed or edited directly using normal Linux text commands like 'less', 'cat', etc. The 'last' command is often used to display the login records stored in this file in a human-readable form. Please check the man page of the 'last' command for available options. The following is the contents of the file named "usage_data_file", which is a sample output of the 'last' command with the '-Fiw' flags:
<pre>
$ last -Fiw > usage_data_file
 
$ cat usage_data_file
 
rchan    pts/9        10.40.91.236    Tue Feb 13 16:53:42 2018 - Tue Feb 13 16:57:02 2018  (00:03)   
 
cwsmith  pts/10      10.40.105.130    Wed Feb 14 23:09:12 2018 - Thu Feb 15 02:11:23 2018  (03:02)
 
rchan    pts/2        10.40.91.236    Tue Feb 13 16:22:00 2018 - Tue Feb 13 16:45:00 2018  (00:23)   
 
rchan    pts/5        10.40.91.236    Tue Feb 15 16:22:00 2018 - Tue Feb 15 16:55:00 2018  (00:33)   
 
asmith  pts/2        10.43.115.162    Tue Feb 13 16:19:29 2018 - Tue Feb 13 16:22:00 2018  (00:02)   
 
tsliu2  pts/4        10.40.105.130    Tue Feb 13 16:17:21 2018 - Tue Feb 13 16:30:10 2018  (00:12)   
 
cwsmith  pts/13      10.40.91.247    Tue Mar 13 18:08:52 2018 - Tue Mar 13 18:46:52 2018  (00:38)   
 
asmith  pts/11      10.40.105.130    Tue Feb 13 14:07:43 2018 - Tue Feb 13 16:07:43 2018  (02:00)
 
</pre>
 
It is often desirable to have daily, weekly, or monthly usage reports by user or by remote host based on the above information.
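As a rough illustration of how one line of the sample above breaks down into fields, here is a minimal sketch. It assumes a complete record with exactly 15 whitespace-separated fields, as in the sample; the function name <code>parse_record</code> and the dictionary keys are hypothetical and are not part of the assignment template. The weekday name is skipped when parsing, since the month, day, time and year already determine the timestamp.

```python
import time

# Format of the date fields once the weekday name is left out,
# e.g. "Feb 13 16:53:42 2018" (assumption based on the sample data).
TIME_FORMAT = "%b %d %H:%M:%S %Y"


def parse_record(line):
    """Split one complete 'last -Fiw' line into a small dictionary.

    Hypothetical helper: assumes the line has exactly 15 fields.
    """
    fields = line.split()
    # fields[3] and fields[9] are weekday names; they are not needed.
    start = time.mktime(time.strptime(" ".join(fields[4:8]), TIME_FORMAT))
    end = time.mktime(time.strptime(" ".join(fields[10:14]), TIME_FORMAT))
    return {
        "user": fields[0],
        "tty": fields[1],
        "host": fields[2],
        "start": start,  # seconds since the epoch, local time
        "end": end,
        "seconds": int(end - start),
    }


sample = ("rchan    pts/9        10.40.91.236    "
          "Tue Feb 13 16:53:42 2018 - Tue Feb 13 16:57:02 2018  (00:03)")
record = parse_record(sample)
print(record["user"], record["host"], record["seconds"])
```

Only the <b>time</b> module is used here, which keeps the sketch within the allowed modules.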
 
 
 
== Tasks for this assignment ==
 
In this assignment, you should perform the following activities:
 
# Complete a detailed algorithm for producing monthly usage reports by user or by remote host based on the information stored in any given file generated from the 'last' command.
 
# Once you have completed the detailed algorithm, you should then <b>design the structure of your python script</b> by identifying the appropriate python objects, functions and modules to be used for each task in your algorithm and the main control logic. Make sure to identify the following:
 
## input data,
 
## computation tasks, and
 
## outputs.
 
# Implement your computational solution using a single python script. You can use any built-in functions and functions from the python modules listed in the "Allowed Python Modules" section below to implement your solution.
 
# Test and review your working python code to see whether you can improve the interface of each function to facilitate better code re-use (this process is called <b>refactoring</b>).
 
 
 
== Allowed Python Modules ==
 
* the <b>os, sys</b> modules
 
* the <b>argparse</b> module
** [https://docs.python.org/3/howto/argparse.html Argparse Tutorial] - should read this first.
** [https://docs.python.org/3/library/argparse.html Argparse API reference information page]
* the <b>time</b> module
 
 
 
== Instructions ==
 
 
 
Accept Assignment #2 via the link on Blackboard, and clone the Github repository on a Linux machine of your choosing. Rename "a2_template.py" to "a2_<your myseneca username>.py", just as we did in Assignment 1. You may also want to create a symbolic link using <code>ln -s a2_<myseneca_id>.py a2.py</code> to save time.
 
 
 
=== Program Name and valid command line arguments ===
 
Name your Python3 script <code>a2_[student_id].py</code>. Your script must accept one or more file names as its command line parameters, plus the optional parameters shown below. Your python script should produce the following usage text when run with the --help option:
 
<pre>
 
[eric@centos7 a1]$ python3 ./a2.py -h
 
usage: new_template.py [-h] [-l {user,host}] [-r RHOST] [-t {daily,monthly}]
 
                      [-u USER] [-s] [-v]
 
                      F [F ...]
 
 
 
Usage Report based on the last command
 
 
 
positional arguments:
 
  F                    list of files to be processed
 
 
 
optional arguments:
 
  -h, --help            show this help message and exit
 
  -l {user,host}, --list {user,host}
 
                        generate user name or remote host IP from the given
 
                        files
 
  -r RHOST, --rhost RHOST
 
                        usage report for the given remote host IP
 
  -t {daily,monthly}, --type {daily,monthly}
 
                        type of report: daily or weekly
 
  -u USER, --user USER  usage report for the given user name
 
  -s, --seconds        return times in seconds
 
  -v, --verbose        turn on output verbosity
 
 
 
Copyright 2020 - Eric Brauer
 
 
 
</pre>
 
Replace the last line with your own full name.
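The usage text above could be produced by an argparse setup along the following lines. This is only a sketch: the function name <code>build_parser</code>, the <code>dest</code> name <code>files</code> and the placeholder name in the epilog are assumptions, the help strings are copied verbatim from the usage text, and the template in your repository may be organised differently.

```python
import argparse


def build_parser():
    """Build a parser whose options mirror the usage text (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Usage Report based on the last command",
        epilog="Copyright 2020 - Student Name",  # placeholder name
    )
    parser.add_argument("-l", "--list", choices=["user", "host"],
                        help="generate user name or remote host IP from the given files")
    parser.add_argument("-r", "--rhost",
                        help="usage report for the given remote host IP")
    parser.add_argument("-t", "--type", choices=["daily", "monthly"],
                        help="type of report: daily or weekly")
    parser.add_argument("-u", "--user",
                        help="usage report for the given user name")
    parser.add_argument("-s", "--seconds", action="store_true",
                        help="return times in seconds")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="turn on output verbosity")
    # One or more file names; stored as a list in args.files.
    parser.add_argument("files", metavar="F", nargs="+",
                        help="list of files to be processed")
    return parser


args = build_parser().parse_args(["-u", "rchan", "-t", "daily", "usage_data_file", "--seconds"])
print(args.user, args.type, args.files, args.seconds)
```

Because <code>nargs="+"</code> is used, the file names always arrive as a list, even when only one file is given.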
 
 
 
 
 
If only one file name is provided at the command line, read the login/logout records from the contents of the given file. If the file name is "online", get the records from the system on which your script is being executed, using the Linux command "last -iwF". The format of each line in the file should be the same as the output of 'last -Fiw'. Filter out incomplete login/logout records (hint: check the number of fields in each record).
 
 
 
If there is more than one file name provided, merge all the files together with the first one at the top and the last one at the bottom. Read and process the file contents in that order in your program.
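One possible way to read several files in order and drop incomplete records is sketched below. The function names are hypothetical, and the field count of 15 is an assumption taken from the complete lines in the sample "usage_data_file"; lines such as "still logged in" entries or the trailing "wtmp begins" line have fewer fields and are skipped.

```python
def filter_complete(lines, expected_fields=15):
    """Keep only lines with the expected number of whitespace-separated fields."""
    return [line.rstrip() for line in lines if len(line.split()) == expected_fields]


def read_records(file_names):
    """Read the files in the order given and return one merged list of records."""
    records = []
    for name in file_names:
        with open(name) as handle:
            # Records from the first file end up at the top of the list,
            # records from the last file at the bottom.
            records.extend(filter_complete(handle))
    return records
```

A call such as <code>read_records(args.files)</code> would then return a single list that the report functions can work from.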
 
 
 
=== Header ===
 
 
 
All your Python code for this assignment must be placed in a <font color='red'><b><u>single source file</u></b></font>. Please include the following declaration by <b><u>you</u></b> as the <font color='blue'><b>script level docstring</b></font> in your Python source code file (replace [seneca_id] with your Seneca email user name, and "Student Name" with your own name):
 
 
 
<source>OPS435 Assignment 2 - Summer 2020
 
Program: a2_[seneca_id].py
 
Author: "Student Name"
 
The python code in this file a2_[seneca_id].py is original work written by
 
"Student Name". No code in this file is copied from any other source
 
including any person, textbook, or on-line resource except those provided
 
by the course instructor. I have not shared this python file with anyone
 
or anything except for submission for grading. 
 
I understand that the Academic Honesty Policy will be enforced and violators
 
will be reported and appropriate action will be taken.
 
</source>
 
 
 
=== Use of Github ===
 
You will once again be graded partly on <b>correct use of version control</b>, that is use of numerous commits with sensible commit messages. In professional practice, this is critically important for the timely delivery of code. You will be expected to use:
 
<ol>
 
<li><code>git add *.py</code>
 
<li><code>git commit -m "a message that describes the change"</code>
 
<li><code>git push</code>
</ol>

after completing each step. There is no penalty for "too many commits"; there is no such thing!
 
 
 
=== Suggested Process ===
 
<ol>
 
<li> Read the rest of this document and try to understand what is expected.
<li> Use the invite link posted to Blackboard to accept the assignment, and clone the repo to a Linux machine.
<li> Copy a2_template.py into a2_<myseneca_id>.py, replacing <myseneca_id> with your Myseneca username.
<li> Run the script itself. Investigate argparse. Experiment with the various options, particularly -v. Read the docs: which option must you implement? Go ahead and implement it. Test with print() for now. <b>Commit the change.</b>
<li> Investigate the <code>parse_user()</code> function with the <code>usage_data_file</code>. It should take the list of lines from the file and return a list of usernames. <b>Commit the change.</b>
<li> Use argparse with <code>-l user usage_data_file</code> to call the <code>parse_user()</code> function. <b>Commit the change.</b>
<li> Write a function to print the list from <code>parse_for_user()</code>. Now you have input -> processing -> output. <b>Continue committing these changes as you proceed.</b>
<li> Implement the same thing as <code>parse_for_user()</code> for <code>parse_for_hosts()</code>. Output should be sorted.
<li> Compare your output with the sample output below.
<li> Write the <code>parse_for_daily()</code> function using the pseudocode given. It should take the list of lines from your file and return a list of lists with start dates in DD/MM/YYYY format as well as usage in seconds, for example: <code> [['01/01/1980', '1200'], ['02/01/1980', '2400'], ['03/01/1980', '2200']] </code>
<li> Once your <code>parse_for_daily()</code> function works, call it with the argparse options, and display the contents.
<li> Write (or modify) a function to do the same for remote hosts.
<li> Implement the outputting of the duration in HH:MM:SS instead of seconds. It's recommended you write a function that takes in seconds and returns a string. Call this when the <code>-s</code> option is absent. Make sure this is working with remote hosts as well. You should now have x of y tests passing.
<li> Finally, implement the <code>-t monthly</code> report type. Create a new function and get it working. Start with seconds, then duration, and make sure it works with remote hosts as well.
<li> Perform last checks and document your code. Write '''why''' your code is doing what it does, rather than '''what''' it's doing. You should have 100% of tests succeeding.
 
</ol>
 
 
 
=== Sample Outputs ===
 
The following are the reports generated by the usage report script (ur.py) with the "usage_data_file" mentioned in the overview section. You can download the file [https://scs.senecac.on.ca/~raymond.chan/ops435/a2/usage_data_file here] to test your ur.py script.
 
==== User List ====
 
The following is the user list extracted from the usage_data_file created by the command:
 
<pre>
 
[eric@centos7 a2]$ ./a2.py -l user usage_data_file
 
</pre>
 
 
 
<pre>
 
User list for usage_data_file
 
=============================
 
asmith
 
cwsmith
 
rchan
 
tsliu2
 
</pre>
 
 
 
==== Remote Host List ====
 
The following is the remote host list extracted from the usage_data_file created by the command:
 
<pre>
 
[eric@centos7 a2]$ ./a2.py -l host usage_data_file
 
</pre>
 
 
 
<pre>
 
Host list for usage_data_file
 
=============================
 
10.40.105.130
 
10.40.91.236
 
10.40.91.247
 
10.43.115.162
 
</pre>
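The two list reports above can be sketched as follows. The helper names are hypothetical, and the sketch assumes that the user name is the first field and the remote host IP is the third field of each record, as in the sample data. A plain string sort reproduces the ordering shown in the host list.

```python
def unique_sorted(records, field_index):
    """Return the sorted, de-duplicated values of one field across all records."""
    return sorted({line.split()[field_index] for line in records})


def format_list(kind, source_name, items):
    """Build the report text: a title, an underline of equal length, then the items."""
    title = "{} list for {}".format(kind, source_name)
    return "\n".join([title, "=" * len(title)] + list(items))


records = [
    "rchan    pts/9        10.40.91.236    Tue Feb 13 16:53:42 2018 - Tue Feb 13 16:57:02 2018  (00:03)",
    "cwsmith  pts/10      10.40.105.130    Wed Feb 14 23:09:12 2018 - Thu Feb 15 02:11:23 2018  (03:02)",
    "rchan    pts/2        10.40.91.236    Tue Feb 13 16:22:00 2018 - Tue Feb 13 16:45:00 2018  (00:23)",
]
print(format_list("User", "usage_data_file", unique_sorted(records, 0)))
print(format_list("Host", "usage_data_file", unique_sorted(records, 2)))
```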
 
 
 
==== Daily Usage Report by User ====
 
The following are Daily Usage Reports created for user rchan. The output can be displayed either in seconds:
 
<pre>
 
[eric@centos7 a2]$ ./a2.py -u rchan -t daily usage_data_file --seconds
 
</pre>
 
 
 
<pre>
 
Daily Usage Report for rchan
 
============================
 
Date                Usage
 
13/02/2018            200
 
13/02/2018          1380
 
15/02/2018          1980
 
Total                3560
 
</pre>
 
 
 
...or by omitting the <code>--seconds</code> option, in HH:MM:SS format.
 
<pre>
 
[eric@centos a2]$ ./a2.py -u rchan -t daily usage_data_file
 
</pre>
 
 
 
<pre>
 
Daily Usage Report for rchan
 
============================
 
Date                Usage
 
13/02/2018      00:03:20
 
13/02/2018      00:23:00
 
15/02/2018      00:33:00
 
Total            00:59:20
 
</pre>
 
It's recommended you get the seconds working first, then create a function to convert seconds to HH:MM:SS.
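A conversion function of this kind might look like the sketch below. The name <code>seconds_to_hms</code> is hypothetical; the only assumption is that the input is a non-negative number of seconds.

```python
def seconds_to_hms(total_seconds):
    """Convert a non-negative number of seconds to an HH:MM:SS string."""
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    return "{:02d}:{:02d}:{:02d}".format(hours, minutes, seconds)


print(seconds_to_hms(200))   # 00:03:20
print(seconds_to_hms(3560))  # 00:59:20
```

The two values printed correspond to the first row and the total of the rchan reports shown above.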
 
 
 
==== Daily Usage Report by Remote Host====
 
The following is a Daily Usage Report created for the Remote Host 10.40.105.130 by the command:
 
<pre>
 
[eric@centos7 a2]$ ./a2.py -r 10.40.105.130 -t daily usage_data_file -s
 
</pre>
 
 
 
<pre>
 
Daily Usage Report for 10.40.105.130
 
====================================
 
Date          Usage in Seconds
 
2018 02 15        7883
 
2018 02 14        3047
 
2018 02 13        7969
 
Total            18899
 
</pre>
 
 
 
Just as you did with <code>--user</code>, your script should also display the time in HH:MM:SS by omitting the <code>--seconds</code> option.
 
 
 
==== Monthly Usage Report by User ====
 
The following is a Monthly Usage Report created for user rchan by the command:
 
<pre>
 
[eric@centos7 a2]$ ./a2.py -u rchan -t monthly usage_data_file -s
 
</pre>
 
 
 
<pre>
 
Monthly Usage Report for rchan
 
==============================
 
Date                Usage
 
02/2018              3560
 
Total                3560
 
</pre>
 
 
 
<pre>
 
[eric@centos7 a2]$ ./a2.py -u cwsmith -t monthly usage_data_file
 
</pre>
 
 
 
<pre>
 
Monthly Usage Report for cwsmith
 
================================
 
Date                Usage
 
02/2018          03:02:11
 
03/2018          00:38:00
 
Total            03:40:11
 
</pre>
 
 
 
==== Monthly Usage Report by Remote Host ====
 
The following is a Monthly Usage Report created for the remote host 10.40.105.130 by the command:
 
<pre>
 
[eric@centos7 a2]$ ./a2.py -r 10.40.105.130 -t monthly usage_data_file
 
</pre>
 
 
 
<pre>
 
Monthly Usage Report for 10.40.105.130
 
======================================
 
Date                Usage
 
02/2018          05:15:00
 
Total            05:15:00
 
</pre>
 
 
 
As discussed before, this command should also accept the <code>--seconds</code> option.
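If the daily step already yields rows of the form <code>['DD/MM/YYYY', seconds]</code>, a monthly report can be derived by summing the rows that share the same month and year. The following is a sketch under that assumption; the function name <code>monthly_totals</code> is hypothetical.

```python
def monthly_totals(daily_rows):
    """Sum daily usage rows into (MM/YYYY, seconds) pairs, oldest month first."""
    totals = {}
    for date_text, seconds in daily_rows:
        month = date_text[3:]  # "DD/MM/YYYY" -> "MM/YYYY"
        totals[month] = totals.get(month, 0) + int(seconds)
    # Sort by year first, then by month number.
    return sorted(totals.items(), key=lambda item: (item[0][3:], item[0][:2]))


rchan_daily = [["13/02/2018", "200"], ["13/02/2018", "1380"], ["15/02/2018", "1980"]]
print(monthly_totals(rchan_daily))  # [('02/2018', 3560)]
```

The seconds are accepted either as integers or as numeric strings, since the example list in the suggested process stores them as strings.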
 
 
 
==== List Users With Verbose ====
 
Calling any of the previous commands with the <code>--verbose</code> option should cause the script to output more information:
 
<pre>
 
[eric@centos7 a2]$ ./a2.py -l user usage_data_file -v
 
</pre>
 
 
 
<pre>
 
Files to be processed: ['usage_data_file']
 
Type of args for files <class 'list'>
 
User list for usage_data_file
 
=============================
 
asmith
 
cwsmith
 
rchan
 
tsliu2
 
</pre>
 
 
 
<pre>
 
[eric@centos7 a2]$ ./a2.py -r 10.40.105.130 -t monthly usage_data_file -v
 
</pre>
 
 
 
<pre>
 
Files to be processed: ['usage_data_file']
 
Type of args for files <class 'list'>
 
usage report for remote host: 10.40.105.130
 
usage report type: monthly
 
Monthly Usage Report for 10.40.105.130
 
======================================
 
Date                Usage
 
02/2018          05:15:00
 
Total            05:15:00
 
</pre>
 
 
 
==== Daily Report From Online ====
 
Running the script with "online" as a file argument should call a subprocess.Popen object and run the command <code>last -Fiw</code>.
 
<pre>
 
[eric@mtrx-node06pd ~]$ ./a2.py -l user online
 
</pre>
 
 
 
(Example Output from Matrix):
 
<pre>
 
User list for last_output
 
=========================
 
aabbas28
 
aaddae1
 
aali309
 
aaljajah
 
aalves-staffa
 
aanees1
 
aarham
 
aassankanov
 
abalandin
 
abhaseen
 
abholay
 
acamuzcu
 
acchikoti
 
adas20
 
adeel.javed
 
...
 
</pre>
 
 
 
<pre>
 
[eric@mtrx-node06pd ~]$ ./a2.py -u abholay -t daily online
 
</pre>
 
 
 
<pre>
 
Daily Usage Report for abholay
 
==============================
 
Date                Usage
 
16/07/2020      00:13:09
 
17/07/2020      00:08:59
 
Total            00:22:08
 
 
 
</pre>
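For the "online" case, the records come from running <code>last -Fiw</code> instead of reading a file. A minimal sketch using <code>subprocess.Popen</code> is shown below; the function name <code>get_online_records</code> and its <code>command</code> parameter are hypothetical, and the parameter exists only so the function can be tried with a harmless command on machines where <code>last</code> is unavailable.

```python
import subprocess


def get_online_records(command=("last", "-Fiw")):
    """Run a command and return its standard output as a list of lines."""
    process = subprocess.Popen(list(command),
                               stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    output, _errors = process.communicate()  # waits for the command to finish
    return output.decode("utf-8", errors="replace").splitlines()


# Demonstration with a harmless command instead of 'last':
print(get_online_records(("echo", "hello world")))
```

The returned lines still include incomplete records (for example, users who are still logged in), so they need the same filtering as lines read from a file.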
 
 
 
=== Detail Algorithm Document ===
 
Follow the standard computation procedure (input - process - output) when creating the algorithm document for this assignment.
 
==== input ====
 
* get data (command line arguments/options) from the user using the functions provided by the argparse module
 
* according to the arguments/options given at the command line, take appropriate processing action.
 
==== processing ====
 
* based on the file(s) specified, read the contents of each file and use appropriate objects to store it
 
* based on the command line arguments/options, process the data accordingly, which includes
 
** data preprocessing (split a multi-day record into single day record)
 
** record processing (perform required computation)
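The preprocessing step of splitting a multi-day record into single-day records can be sketched with the <b>time</b> module alone. The function name <code>split_by_day</code> is hypothetical, the timestamps are assumed to be given without the weekday name (for example "Feb 14 23:09:12 2018"), and each day is assumed to end at 23:59:59, which is what the Daily Usage Report by Remote Host sample appears to do.

```python
import time

TIME_FORMAT = "%b %d %H:%M:%S %Y"  # e.g. "Feb 14 23:09:12 2018"


def split_by_day(start_text, end_text):
    """Split one login session into (DD/MM/YYYY, seconds) pieces, one per day."""
    start = time.mktime(time.strptime(start_text, TIME_FORMAT))
    end = time.mktime(time.strptime(end_text, TIME_FORMAT))
    pieces = []
    while True:
        day = time.localtime(start)
        label = time.strftime("%d/%m/%Y", day)
        # Midnight at the start of the following day; mktime normalises
        # a day number that runs past the end of the month.
        next_midnight = time.mktime(
            (day.tm_year, day.tm_mon, day.tm_mday + 1, 0, 0, 0, 0, 0, -1))
        if end < next_midnight:
            pieces.append((label, int(end - start)))
            return pieces
        # The session continues past this day: count up to 23:59:59.
        pieces.append((label, int(next_midnight - 1 - start)))
        start = next_midnight


print(split_by_day("Feb 14 23:09:12 2018", "Feb 15 02:11:23 2018"))
```

With this convention, one second is dropped at each midnight boundary, so the sum of the pieces is slightly smaller than the length of the original session.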
 
==== output ====
 
* output the required report based on the processed data
 
==== identify and select appropriate python objects and functions ====
 
The following python functions (to be created; you may have more) are useful in handling these sub-tasks:
 
* read login records from files and filter out unwanted records
 
* convert login records into a proper python object type so that they can be processed using built-in functions as much as possible
 
* generate daily usage reports by user and/or by remote host
 
* generate monthly usage reports by user and/or by remote host
 
 
 
To help you with this assignment, you can use the ur_template.py in the sample ops435-a2 repository as a starting point in designing your own Python Usage Report script.
 
<font color='blue'><b>If you don't have enough time to create all the functions for the data processing steps, you should study the functions in the ur_funcs.py (provided by your teacher), pick and use the one that may help. If you use any of the functions from ur_funcs.py, there will be a cost of 10% to your overall grade. If you create all the functions yourself, you will get a bonus of 10%.</b></font>
 
 
 
=== Python script coding and debugging ===
 
For each function, identify what type of objects should be passed to the function, and what type of objects should be returned to the caller.
 
Once you have finished coding a function, you should start a Python3 interactive shell, import your functions and manually test each function and verify its correctness.
 
=== Final Test===
 
Once you have tested each individual function and confirmed that it works properly, perform the final test with the test data provided by your professor and verify that your script produces the correct results before submitting your python program on Blackboard. Upload all the files for Assignment 2 to your vm in myvmlab and perform the final test.
 
 
 
 
 
== Rubric ==
 
 
 
{| class="wikitable" border="1"
 
! Task !!  Maximum mark !! Actual mark
 
|-
 
| Algorithm Submission || 10 ||
 
|-
 
| Coming soon... ||
 
|-
 
| '''Total''' || 100 ||
 
|}
 
 
 
== Submission ==
 
* Stage 1: Submit your algorithm document file to Blackboard by July 27, 2020.
 
* Stage 2: Use commits to push your python script for this assignment to Github.com. The final state of your repository will be looked at on August 14, 2020 at 9:00 PM.
 

Revision as of 08:20, 17 November 2020
