Changes


OPS435 Assignment 2 for Section A

7,349 bytes removed, 11:28, 12 July 2021
update for summer
[[Category:OPS435-Python]][[Category:rchan]][[Category:ebrauer]]
= Overview: du Improved =
<code>du</code> is a tool for inspecting directories. It returns the contents of a directory along with how much drive space each entry is using. However, it can be hard to parse its output quickly, as it usually returns file sizes as a number of bytes:
<code><b>user@host ~ $ du --max-depth 1 /usr/local/lib</b></code>
<pre>
164028  /usr/local/lib/heroku
11072   /usr/local/lib/python2.7
92608   /usr/local/lib/node_modules
8       /usr/local/lib/python3.8
267720  /usr/local/lib
</pre>
=Assignment 2 - Usage Report=
'''Weight:''' 10% of the overall grade

'''Due Date:''' Please follow the three stages of submission schedule:
* Complete the algorithm document for this assignment script by July 31, 2020 and submit on Blackboard by 9:00 PM,
* Complete your Python script and push to Github by August 14, 2020 at 9:00 PM, and
* Copy your Python script into a Word document and submit to Blackboard by August 14, 2020 at 9:00 PM.
==Overview==
Most system administrators would like to know the utilization of their systems by their users. On a Linux system, each user's login records are normally stored in the binary file /var/log/wtmp. The login records in this binary file cannot be viewed or edited directly using normal Linux text commands like 'less', 'cat', etc. The 'last' command is often used to display the login records stored in this file in a human readable form. Please check the man page of the 'last' command for available options. The following is the contents of the file named "usage_data_file", which is a sample output of the 'last' command with the '-Fiw' flag on:
<pre>
$ last -Fiw > usage_data_file
$ cat usage_data_file
rchan    pts/9   10.40.91.236  Tue Feb 13 16:53:42 2018 - Tue Feb 13 16:57:02 2018  (00:03)
cwsmith  pts/10  10.40.105.130 Wed Feb 14 23:09:12 2018 - Thu Feb 15 02:11:23 2018  (03:02)
rchan    pts/2   10.40.91.236  Tue Feb 13 16:22:00 2018 - Tue Feb 13 16:45:00 2018  (00:23)
rchan    pts/5   10.40.91.236  Tue Feb 15 16:22:00 2018 - Tue Feb 15 16:55:00 2018  (00:33)
asmith   pts/2   10.43.115.162 Tue Feb 13 16:19:29 2018 - Tue Feb 13 16:22:00 2018  (00:02)
tsliu2   pts/4   10.40.105.130 Tue Feb 13 16:17:21 2018 - Tue Feb 13 16:30:10 2018  (00:12)
cwsmith  pts/13  10.40.91.247  Tue Mar 13 18:08:52 2018 - Tue Mar 13 18:46:52 2018  (00:38)
asmith   pts/11  10.40.105.130 Tue Feb 13 14:07:43 2018 - Tue Feb 13 16:07:43 2018  (02:00)
</pre>
You will therefore be creating a tool called <b>duim (du improved)</b>. Your script will call du and return the contents of a specified directory, and generate a bar graph for each subdirectory. The bar graph will represent the drive space as a percent of the total drive space for the specified directory.
An example of the finished output your script might produce is this:

<code><b>user@host ~ $ ./duim.py /usr/local/lib</b></code>
<pre>
 61 % [============        ] 160.2 MiB /usr/local/lib/heroku
  4 % [=                   ] 10.8 MiB /usr/local/lib/python2.7
 34 % [=======             ] 90.4 MiB /usr/local/lib/node_modules
  0 % [                    ] 8.0 kiB /usr/local/lib/python3.8
Total: 261.4 MiB /usr/local/lib
</pre>
It is always desirable to have daily or monthly usage reports by user or by remote host based on the above information.
== Tasks for this assignment ==
In this assignment, you should perform the following activities:
# Complete a detailed algorithm for producing monthly usage reports by user or by remote host based on the information stored in any given files generated from the 'last' command.
# Once you have completed the detailed algorithm, you should then <b>design the structure of your python script</b> by identifying the appropriate python objects, functions and modules to be used for each task in your algorithm and the main control logic. Make sure to identify the following:
## input data,
## computation tasks, and
## outputs.
# Implement your computational solution using a single python script. You can use any built-in functions and functions from the python modules listed in the "Allowed Python Modules" section below to implement your solution.
# Test and review your working python code to see whether you can improve the interface of each function to facilitate better code re-use (this process is called <b>refactoring</b>).
== Allowed Python Modules ==
* the <b>os, sys</b> modules
* the <b>argparse</b> module
* The <b>time</b> module
* The <b>subprocess</b> module
** [https://docs.python.org/3/howto/argparse.html Argparse Tutorial] - should read this first.
** [https://docs.python.org/3/library/argparse.html Argparse API reference information page]
== Instructions ==
Accept the Assignment #2 via the link on Blackboard, and clone the Github repository on a Linux machine of your choosing. Rename "a2_template.py" to "a2_<your myseneca username>.py", just as we did in Assignment 1. You may also want to create a symbolic link using <code>ln -s a2_<myseneca_id>.py a2.py</code> to save time.
=== Program Name and valid command line arguments ===
Name your Python3 script as <code>a2_[student_id].py</code>. Your script must accept one or more "file name" as its command line parameters and other optional parameters as shown below.
Your python script should produce the following usage text when run with the --help option:
<pre>
[eric@centos7 a1]$ python3 a2.py -h
usage: new_template.py [-h] [-l {user,host}] [-r RHOST] [-t {daily,monthly}]
                       [-u USER] [-s] [-v]
                       F [F ...]

Usage Report based on the last command

positional arguments:
  F                     list of files to be processed

optional arguments:
  -h, --help            show this help message and exit
  -l {user,host}, --list {user,host}
                        generate user name or remote host IP from the given
                        files
  -r RHOST, --rhost RHOST
                        usage report for the given remote host IP
  -t {daily,monthly}, --type {daily,monthly}
                        type of report: daily or monthly
  -u USER, --user USER  usage report for the given user name
  -s, --seconds         return times in seconds
  -v, --verbose         turn on output verbosity

Copyright 2020 - Eric Brauer
</pre>
Replace the last line with your own full name.
 
Compare the usage output you have now with the one above. One option is missing; you will need to change the <code>argparse</code> function to implement it.
 
You will notice that there is an 'args' object in a2_template.py. Once the <code>parse_command_args()</code> function is called, it will return an args object. The command line arguments will be stored as attributes of that object. <b>Do not use sys.argv to parse arguments.</b>
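As a rough illustration of how the args object carries the command line arguments as attributes, here is a minimal sketch. It covers only a subset of the options shown in the usage text, and the optional <code>argv</code> parameter is an assumption added so the function can be tried without a real command line; the template's own <code>parse_command_args()</code> may look different.

```python
import argparse


def parse_command_args(argv=None):
    """Return the parsed args object (illustrative subset of the options)."""
    parser = argparse.ArgumentParser(
        description="Usage Report based on the last command")
    parser.add_argument("-l", "--list", choices=["user", "host"],
                        help="generate user name or remote host IP from the given files")
    parser.add_argument("-u", "--user",
                        help="usage report for the given user name")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="turn on output verbosity")
    parser.add_argument("files", metavar="F", nargs="+",
                        help="list of files to be processed")
    # argv=None makes argparse read the real command line; a list can be
    # passed instead for experimenting (an assumption of this sketch).
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_command_args(["-l", "user", "usage_data_file"])
    # Each option becomes an attribute named after its long form.
    print(args.list, args.files, args.verbose)
```

Reading <code>args.list</code> or <code>args.files</code> this way is what replaces indexing into sys.argv.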
 
If there is only one file name provided at the command line, read the login/logout records from the contents of the given file. If the file name is "online", get the records from the system your script is being executed on, using the Linux command "last -iwF". The format of each line in the file should be the same as the output of 'last -Fiw'. Filter out incomplete login/logout records (hint: check the number of fields in each record).
 
If there is more than one file name provided, merge all the files together with the first one at the top and the last one at the bottom. Read and process the file contents in that order in your program.
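One possible way to read several files in order and drop incomplete records is sketched below. The function name <code>read_login_records</code> and the field count of 15 are assumptions: 15 is simply the number of whitespace-separated fields in each complete line of the sample usage_data_file. The "online" case is not handled here.

```python
def read_login_records(filenames, expected_fields=15):
    """Return complete records from the files, first file first."""
    records = []
    for name in filenames:
        with open(name) as handle:
            for line in handle:
                # A complete 'last -Fiw' record in the sample has 15 fields:
                # user, tty, host, 5 login fields, '-', 5 logout fields, duration.
                if len(line.split()) == expected_fields:
                    records.append(line.strip())
    return records
```

Because the files are processed in the order given, the records of the first file end up at the top of the list and those of the last file at the bottom.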
 
=== Header ===
 
All your Python code for this assignment must be placed in a <font color='red'><b><u>single source file</u></b></font>. Please include the following declaration by <b><u>you</u></b> as the <font color='blue'><b>script level docstring</b></font> in your Python source code file (replace [Student_id] with your Seneca email user name, and "Student Name" with your own name):
 
<source>OPS435 Assignment 2 - Summer 2020
Program: a2_[seneca_id].py
Author: "Student Name"
The python code in this file a2_[seneca_id].py is original work written by
"Student Name". No code in this file is copied from any other source
including any person, textbook, or on-line resource except those provided
by the course instructor. I have not shared this python file with anyone
or anything except for submission for grading.
I understand that the Academic Honesty Policy will be enforced and violators
will be reported and appropriate action will be taken.
</source>
=== Use of Github ===
You will once again be graded partly on <b>correct use of version control</b>, that is use of numerous commits with sensible commit messages. In professional practice, this is critically important for the timely delivery of code. You will be expected to use:
<ol><li><code>git add *.py</code>
<li><code>git commit -m "a message that describes the change"</code>
<li><code>git push</code>
</ol>
after completing each step. There is no penalty for "too many commits", there is no such thing!
=== Suggested Process ===
<ol><li> Read the rest of this document, try and understand what is expected.
<li> Use the invite link posted to Blackboard to accept the assignment, and clone the repo to a Linux machine.
<li> Copy a2_template.py into a2_<myseneca_id>.py. Replace with your Myseneca username.
<li> Run the script itself. Investigate argparse. Experiment with the various options, particularly -v. Read the docs, what option must you implement? Go ahead and implement it. Test with print() for now. <b>Commit the change.</b>
<li> Investigate the `parse_user()` function, with the <code>usage_data_file</code>. This should take the list of lines from the file, and instead return a list of usernames. <b>Commit the change.</b>
<li> Use argparse with `-l user` `usage_data_file` to call the `parse_user()` function. <b>Commit the change.</b>
<li> Write a function to print the list from `parse_for_user()`. Now you have input -> processing -> output. <b>Continue committing these changes as you proceed.</b>
<li> Implement the same things as parse_for_user but for `parse_for_hosts`. Output should be sorted.
<li> Compare your output with the output below.
<li> Write the `parse_for_daily()` function using the pseudocode given. This should take the list of lines from your file, and output a dictionary with start dates in DD/MM/YYYY format as the key and usage in seconds as the value.
<li> <code> {'01/01/1980': 1200, '02/01/1980': 2400, '03/01/1980': 2200} </code>
<li> Once your `parse_for_daily()` function works, call it with the argparse options, and display the contents.
<li> Write (or modify) a function to do the same for remote hosts.
<li> Implement the outputting of the duration in HH:MM:SS instead of seconds. It's recommended you write a function to take in seconds and return a string. Call this when the `-s` option is absent. Make sure this is working with remote hosts as well. You should now have x of y tests passing.
<li> Finally, implement the `--monthly` option. Create a new function and get it working. Start with seconds, then duration, and make sure it works with remote as well.
<li> Perform last checks and document your code. Write **why** your code is doing what it does, rather than **what** it's doing. You should have 100% of tests succeeding.</ol>
=== Output Format ===
The format of your log tables should be identical to the sample output below, in order to minimize test check error. The horizontal banner between title and data should be composed of equal signs (=), and be the length of the title string.

List tables should need no extra formatting.

For daily/monthly tables with two columns, the first column should be 10 characters long and be left-aligned. The second column should be 15 characters long and be right-aligned.

The details of the final output will be up to you, but you will be required to fulfill some specific requirements before completing your script. Read on...
= Assignment Requirements =
== Permitted Modules ==
<b><font color='blue'>Your python script is allowed to import only the <u>os, subprocess and sys</u> modules from the standard library.</font></b>
== Required Functions ==
You will need to complete the functions inside the provided file called <code>duim.py</code>. The provided <code>checkA1.py</code> will be used to test these functions.
* <code>call_du_sub()</code> should take the target directory as an argument and return a list of strings returned by the command <b>du -d 1 <target directory></b>.
** Use subprocess.Popen.
** '-d 1' specifies a <i>max depth</i> of 1. Your list shouldn't include files, just a list of subdirectories in the target directory.
** Your list should <u>not</u> contain newline characters.
* <code>percent_to_graph()</code> should take two arguments: percent and the total chars. It should return a 'bar graph' as a string.
** Your function should check that the percent argument is a valid number between 0 and 100. It should fail if it isn't. You can <code>raise ValueError</code> in this case.
** <b>total chars</b> refers to the total number of characters that the bar graph will be composed of. You can use equal signs (<code>=</code>) or any other character that makes sense, but the empty space <b>must be composed of spaces</b>, at least until you have passed the first milestone.
** The string returned by this function should only be composed of these two characters. For example, calling <code>percent_to_graph(50, 10)</code> should return:
 '=====     '
* <code>create_dir_dict</code> should take a list as the argument, and should return a dictionary.
** The list can be the list returned by <code>call_du_sub()</code>.
** The dictionary that you return should have the full directory name as <i>key</i>, and the number of bytes in the directory as the <i>value</i>. This value should be an integer.
For example, using the example of <b>/usr/local/lib</b>, the function would return: {'/usr/local/lib/heroku': 164028, '/usr/local/lib/python2.7': 11072, ... }
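A minimal sketch of <code>create_dir_dict</code> under these requirements follows. It assumes each input string looks like one line of <code>du</code> output, that is a size, some whitespace, then the path; this is one way to do it, not the reference solution.

```python
def create_dir_dict(du_lines):
    """Map each directory name to its size as an integer."""
    dir_dict = {}
    for line in du_lines:
        # Split only on the first run of whitespace so that paths
        # containing spaces stay in one piece.
        size, path = line.split(maxsplit=1)
        dir_dict[path] = int(size)
    return dir_dict


if __name__ == "__main__":
    sample = ["164028\t/usr/local/lib/heroku",
              "11072\t/usr/local/lib/python2.7"]
    print(create_dir_dict(sample))
```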
=== Sample Outputs ===
The following are the reports generated by the usage report script (ur.py) with the "usage_data_file" mentioned in the overview section. You can download the file [https://scs.senecac.on.ca/~raymond.chan/ops435/a2/usage_data_file here] to test your ur.py script.
==== User List ====
The following is the user list extracted from the usage_data_file created by the command:
<pre>[eric@centos7 a2]$ ./a2.py -l user usage_data_file</pre>
<pre>
User list for usage_data_file
=============================
asmith
cwsmith
rchan
tsliu2
</pre>
==== Remote Host List ====
The following is the remote host list extracted from the usage_data_file created by the command:
<pre>[eric@centos7 a2]$ ./a2.py -l host usage_data_file</pre>
<pre>
Host list for usage_data_file
=============================
10.40.105.130
10.40.91.236
10.40.91.247
10.43.115.162
</pre>
==== Daily Usage Report by User ====
The following are Daily Usage Reports created for user rchan. The output can be displayed either in seconds:
<pre>[eric@centos7 a2]$ ./a2.py -u rchan -t daily usage_data_file --seconds</pre>
<pre>
Daily Usage Report for rchan
============================
Date                Usage
13/02/2018           1580
15/02/2018           1980
Total                3560
</pre>
...or by omitting the <code>--seconds</code> option, in HH:MM:SS format.
<pre>[eric@centos a2]$ ./a2.py -u rchan -t daily usage_data_file</pre>
<pre>
Daily Usage Report for rchan
============================
Date                Usage
13/02/2018       00:26:20
15/02/2018       00:33:00
Total            00:59:20
</pre>
It's recommended you get the seconds working first, then create a function to convert seconds to HH:MM:SS.
==== Daily Usage Report by Remote Host ====
The following is a Daily Usage Report created for the Remote Host 10.40.105.130 by the command:
<pre>[eric@centos7 a2]$ ./a2.py -r 10.40.105.130 -t daily usage_data_file -s</pre>
<pre>
Daily Usage Report for 10.40.105.130
====================================
Date                Usage
14/02/2018          10931
13/02/2018           7969
Total               18900
</pre>
Just as you did with <code>--user</code>, your script should also display the time in HH:MM:SS by omitting the <code>--seconds</code> option.
==== Monthly Usage Report by User ====
The following is a Monthly Usage Report created for user rchan by the command:
<pre>[eric@centos7 a2]$ ./a2.py -u rchan -t monthly usage_data_file -s</pre>
<pre>
Monthly Usage Report for rchan
==============================
Date                Usage
02/2018              3560
Total                3560
</pre>
<pre>[eric@centos7 a2]$ ./a2.py -u cwsmith -t monthly usage_data_file</pre>
<pre>
Monthly Usage Report for cwsmith
================================
Date                Usage
02/2018          03:02:11
03/2018          00:38:00
Total            03:40:11
</pre>
== Additional Functions ==
You may create any other functions that you think appropriate, especially when you begin to build additional functionality. Part of your evaluation will be on how "re-usable" your functions are, and sensible use of arguments and return values.
== Use of GitHub ==
You will be graded partly on the quality of your Github commits. You may make as many commits as you wish; it will have no impact on your grade. The only exception to this is <b>assignments with very few commits.</b> These will receive low marks for GitHub use and may be flagged for possible academic integrity violations.
<b><font color='blue'>Assignments that do not adhere to these requirements may not be accepted.</font></b>

Professionals generally follow these guidelines:
* commit their code after every significant change,
* the code <i>should hopefully</i> run without errors after each commit, and
* every commit has a descriptive commit message.
After completing each function, make a commit and push your code.

After fixing a problem, make a commit and push your code.

<b><u>GitHub is your backup and your proof of work.</u></b>

These guidelines are not always possible, but you will be expected to follow these guidelines as much as possible. Break your problem into smaller pieces, and work iteratively to solve each small problem. Test your code after each small change you make, and address errors as soon as they arise. It will make your life easier!
== Coding Standard ==
Your python script must follow the following coding guide:
* [https://www.python.org/dev/peps/pep-0008/ PEP-8 -- Style Guide for writing Python Code]
=== Documentation ===
* Please use python's docstring to document your python script (script level documentation) and each of the functions (function level documentation) you created for this assignment. The docstring should describe 'what' the function does, not 'how' it does it.
* Your script should also include in-line comments to explain anything that isn't immediately obvious to a beginner programmer. <b><u>It is expected that you will be able to explain how each part of your code works in detail.</u></b>
* Refer to the docstring for after() to get an idea of the function docstrings required.
=== Authorship Declaration ===
All your Python code for this assignment must be placed in the provided Python file called <b>assignment1.py</b>. <u>Do not change the name of this file.</u> Please complete the declaration <b><u>as part of the docstring</u></b> in your Python source code file (replace "Student Name" with your own name).
= Submission Guidelines and Process =
== Clone Your Repo (ASAP) ==
The first step will be to clone the Assignment 1 repository. The invite link will be provided to you by your professor. The repo will contain a check script, a README file, and the file where you will enter your code.
== The First Milestone (due February 14) ==
For the first milestone you will have two functions to complete.
* <code>call_du_sub</code> will take one argument and return a list. The argument is a target directory. The function will use <code>subprocess.Popen</code> to run the command <b>du -d 1 <target_directory></b>.
* <code>percent_to_graph</code> will take two arguments and return a string.
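The <code>call_du_sub</code> requirement could be approached along these lines. This is a sketch under assumptions, not the reference solution: it passes the command as an argument list rather than a shell string, ignores anything <code>du</code> writes to stderr, and uses <code>universal_newlines=True</code> so the output arrives as text.

```python
import subprocess


def call_du_sub(target_dir):
    """Run 'du -d 1 <target_dir>' and return its output lines as a list."""
    process = subprocess.Popen(
        ["du", "-d", "1", target_dir],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode bytes to str
    )
    stdout, _stderr = process.communicate()
    # splitlines() removes the trailing newline from every entry.
    return stdout.splitlines()


if __name__ == "__main__":
    for entry in call_du_sub("."):
        print(repr(entry))
```

Passing a list to Popen avoids shell quoting problems when the directory name contains spaces.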
==== Monthly Usage Report by Remote Host ====
The following is a Monthly Usage Report created for the remote host 10.40.105.130 by the command:
<pre>[eric@centos7 a2]$ ./a2.py -r 10.40.105.130 -t monthly usage_data_file</pre>
<pre>
Monthly Usage Report for 10.40.105.130
======================================
Date                Usage
02/2018          05:15:00
Total            05:15:00
</pre>
As discussed before, this command should also accept the <code>--seconds</code> option.

In order to complete <code>percent_to_graph()</code>, it's helpful to know the equation for converting a number from one scale to another.

[[File:Scaling-formula.png]]

In this equation, ``x`` refers to your input value percent and ``y`` will refer to the number of symbols to print. The max of percent is 100 and the min of percent is 0.
Be sure that you are rounding to an integer, and then print that number of symbols to represent the percentage. The number of spaces that you print will be the remainder: total chars minus the number of symbols.
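The scaling step can be sketched as below. The formula image is not reproduced on this page, so the code assumes the usual linear rescaling from the input range 0 to 100 onto the output range 0 to total chars; the variable names are illustrative.

```python
def percent_to_graph(percent, total_chars):
    """Return a bar of '=' and spaces that is total_chars long."""
    if not isinstance(percent, (int, float)) or not 0 <= percent <= 100:
        raise ValueError("percent must be a number between 0 and 100")
    in_min, in_max = 0, 100            # range of the input value (x)
    out_min, out_max = 0, total_chars  # range of the symbol count (y)
    # Linear rescaling of x from the input range onto the output range.
    scaled = (percent - in_min) / (in_max - in_min) * (out_max - out_min) + out_min
    filled = round(scaled)
    return "=" * filled + " " * (total_chars - filled)


if __name__ == "__main__":
    print(repr(percent_to_graph(50, 10)))
```

Note that Python's built-in round() rounds exact halves to the nearest even integer, which only matters when the scaled value lands exactly on .5.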
==== List Users With Verbose ====
Calling any of the previous commands with the <code>--verbose</code> option should cause the script to output more information:
<pre>[eric@centos7 a2]$ ./a2.py -l user usage_data_file -v</pre>
<pre>
Files to be processed: ['usage_data_file']
Type of args for files <class 'list'>
User list for usage_data_file
=============================
asmith
cwsmith
rchan
tsliu2
</pre>
<pre>[eric@centos7 a2]$ ./a2.py -r 10.40.105.130 -t monthly usage_data_file -v</pre>
<pre>
Files to be processed: ['usage_data_file']
Type of args for files <class 'list'>
usage report for remote host: 10.40.105.130
usage report type: monthly
Monthly Usage Report for 10.40.105.130
======================================
Date                Usage
02/2018          05:15:00
Total            05:15:00
</pre>
==== Daily Report From Online ====
Running the script with "online" as a file argument should call a subprocess.Popen object and run the command <code>last -Fiw</code>.
<pre>[eric@mtrx-node06pd ~]$ ./a2.py -l user online</pre>
(Example Output from Matrix)
<pre>
User list for online
====================
aabbas28
aaddae1
aali309
aaljajah
aalves-staffaaanees1
aarham
aassankanov
abalandin
abhaseen
abholay
acamuzcu
acchikoti
adas20
adeel.javed
...
</pre>
<pre>[eric@mtrx-node06pd ~]$ ./a2.py -u adas20 -t daily online</pre>
Test your functions with the Python interpreter. Use <code>python3</code>, then:
 import duim
 duim.percent_to_graph(50, 10)
To test with the check script, run the following:

<code>python3 checkA1.py -f -v TestPercent</code>
== Second Milestone (due February 21) ==
For the second milestone you will have one more function to complete.
* <code>create_dir_dict</code> will take your list from <code>call_du_sub</code> and return a dictionary.
** Every item in your list should create a key in your dictionary.
** Your dictionary values should be a number of bytes. For example: <code>{'/usr/lib/local': 33400}</code>
** Again, test using your Python interpreter or the check script.
To run the check script, enter the following:

<code>python3 checkA1.py -f -v TestDirDict</code>
<pre>
Daily Usage Report for abholay
==============================
Date                Usage
16/07/2020       00:13:09
17/07/2020       00:08:59
Total            00:22:08
</pre>
=== Detail Algorithm Document ===
Follow the standard computation procedure: input - process - output when creating the algorithm document for this assignment.
==== input ====
* get data (command line arguments/options) from the user using the functions provided by the argparse module
* according to the arguments/options given at the command line, take appropriate processing action.
==== processing ====
* based on the file(s) specified, read the contents of each file and use appropriate objects to store it
* based on the command line arguments/options, process the data accordingly, which includes
** data preprocessing (split a multi-day record into single day records)
** record processing (perform required computation)
==== output ====
* output the required report based on the processed data
==== identify and select appropriate python objects and functions ====
The following python functions (to be created, you may have more) are useful in handling the following sub-tasks:
* read login records from files and filter out unwanted records
* convert login records into a proper python object type so that they can be processed using as many built-in functions as possible
* create functions which generate daily usage reports by user and/or by remote host
* create functions which generate monthly usage reports by user and/or by remote host
To help you with this assignment, you should use the a2_template.py in the repository as a starting point in designing your own Python Usage Report script.
== Minimum Viable Product ==
Once you have achieved the Milestones, you will have to do the following to get a minimum viable product:
* In your <code>if __name__ == '__main__'</code> block, you will have to check command line arguments.
** If the user has entered no command line argument, use the current directory.
** If the user has entered more than one argument, or their argument isn't a valid directory, print an error message.
** Otherwise, the argument will be your target directory.
* Call <code>call_du_sub</code> with the target directory.
* Pass the return value from that function to <code>create_dir_dict</code>.
* You may wish to create one or more functions to do the following:
** Use the total size of the target directory to calculate percentage.
** For each subdirectory of target directory, you will need to calculate a percentage, using the total of the target directory.
** Once you've calculated percentage, call <code>percent_to_graph</code> with a max_size of your choice.
** For every subdirectory, print <i>at least</i> the percent, the bar graph, and the name of the subdirectory.
** The target directory <b>should not</b> have a bar graph.
== Additional Features ==
After completing the above, you are expected to add some additional features. Some improvements you could make are:
* Format the output in a way that is easy to read.
* Add colour to the output.
* Add more error checking, and print a usage message to the user.
* Convert bytes to a human-readable format. NOTE: This doesn't have to be 100% accurate to get marks.
* Accept more options from the user.
* Sort the output by percentage, or by filename.
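To make the Minimum Viable Product steps and the human-readable conversion more concrete, here is a hedged sketch. The helper names <code>pick_target</code>, <code>human_readable</code> and <code>report_lines</code> are hypothetical, the bar is built inline where the assignment would call <code>percent_to_graph</code>, and the sizes are assumed to be 1 KiB blocks because that matches the sample (164028 shown as 160.2 MiB). Percentages are truncated with int(), which also matches the sample figures.

```python
import os


def pick_target(argv):
    """Return the target directory for a sys.argv-style list."""
    if len(argv) == 1:
        return os.getcwd()  # no argument: use the current directory
    if len(argv) > 2 or not os.path.isdir(argv[1]):
        raise ValueError("usage: duim.py [target_directory]")
    return argv[1]


def human_readable(size, units=("kiB", "MiB", "GiB", "TiB")):
    """Convert a size in 1 KiB blocks (an assumption) to a short string."""
    value = float(size)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024


def report_lines(dir_dict, target, width=20):
    """Build one line per subdirectory plus a total line for the target."""
    total = dir_dict[target]
    lines = []
    for path, size in dir_dict.items():
        if path == target:
            continue  # the target directory gets no bar graph
        percent = int(size / total * 100) if total else 0
        filled = round(percent / 100 * width)
        bar = "=" * filled + " " * (width - filled)
        lines.append(f"{percent:3d} % [{bar}] {human_readable(size)}\t{path}")
    lines.append(f"Total: {human_readable(total)}\t{target}")
    return lines


if __name__ == "__main__":
    sample = {"/usr/local/lib/heroku": 164028, "/usr/local/lib": 267720}
    print("\n".join(report_lines(sample, "/usr/local/lib")))
```

Keeping the argument check, the conversion and the report building in separate functions makes each one easy to try from the Python interpreter.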
It is expected that the additional features you provide should be useful and non-trivial; they should not require super-user privileges and should not require the installation of additional packages to work. (ie: I shouldn't have to run pip to make your assignment work).
=== Python script coding and debugging ===
For each function, identify what type of objects should be passed to the function, and what type of objects should be returned to the caller. Once you have finished coding a function, you should start a Python3 interactive shell, import your functions and manually test each function and verify its correctness.
=== Final Test ===
Once you have all the individual functions tested and each is working properly, perform the final test with test data provided by your professor and verify that your script produces the correct results before submitting your python program on Blackboard. Upload all the files for this assignment 2 to your vm in myvmlab and perform the final test.
== The Assignment (due March 7, 11:59pm) ==
* Be sure to make your final commit before the deadline.
* Then, copy the contents of your <b>duim.py</b> file into a Word document, and submit it to Blackboard. <i>I will use GitHub to evaluate your deadline, but submitting to Blackboard tells me that you wish to be evaluated.</i>
== Rubric ==
{| class="wikitable" border="1"
! Task !! Maximum mark !! Actual mark
|-
| Program Authorship Declaration || 5 ||
|-
| required functions design || 5 ||
|-
| required functions readability || 5 ||
|-
| main loop design || 10 ||
|-
| main loop readability || 10 ||
|-
| output function design || 5 ||
|-
| output function readability || 5 ||
|-
| additional features implemented || 20 ||
|-
| docstrings and comments || 5 ||
|-
| First Milestone || 10 ||
|-
| Second Milestone || 10 ||
|-
| github.com repository: Commit messages and use || 10 ||
|-
| '''Total''' || 100 ||
|}
{| class="wikitable" border="1"
! Task !! Maximum mark !! Actual mark
|-
| Algorithm Submission || 10 ||
|-
| Check Script Results || 30 ||
|-
| Additional Check: 'online' || 5 ||
|-
| GitHub Use || 15 ||
|-
| List Functions || 5 ||
|-
| Daily/Monthly Functions || 10 ||
|-
| Output Functions || 5 ||
|-
| Other Functions || 5 ||
|-
| Overall Design/Coherence || 10 ||
|-
| Documentation || 5 ||
|-
| '''Total''' || 100 ||
|}
== Due Date and Final Submission requirement ==
Please submit the following files by the due date:
* [ ] your python script, named as 'duim.py', in your repository, and also '''submitted to Blackboard''', by March 7 at 11:59pm.

* Stage 1: Submit your algorithm document file to Blackboard by July 31, 2020.
* Stage 2: Use commits to push your python script for this assignment to Github.com. The final state of your repository will be looked at on August 14, 2020 at 9:00 PM.
* Stage 3: Copy your python script into a Word document and submit to Blackboard by August 14, 2020 at 9:00 PM.
