Difference between revisions of "The Real A Team"

Revision as of 19:26, 21 March 2016


GPU621/DPS921 | Participants | Groups and Projects | Resources | Glossary

Determining Author By Style Of Writing

A Team Members

  1. Adrian Sauvageot, All
  2. ...


Progress

Pre-Assignment

I decided to write a new program to test a theory that a professor shared with me.

She believed that by looking at how two papers were written, she could tell whether they were written by the same author. Further, she believed that a computer could make the same judgment by analyzing how each piece of text was written.

I decided to create a program that would analyze two pieces of text to try to determine whether the same person wrote both.

I decided to look at:

  1. average words/sentence
  2. average word length
  3. average sentences/paragraph
  4. average commas/sentence
  5. average colons/paragraph.


The program then uses these averages to calculate how different the two pieces are from each other. If the two writing styles are within 5% of each other (a threshold I chose), the program suggests that the two pieces were written by the same person; otherwise it suggests that they were written by two different people.
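The exact formula used to combine the averages is not given here, so the sketch below assumes one plausible choice: the mean relative difference across the five averages, compared against the 5% threshold. The names `StyleVector`, `styleDifference`, and `likelySameAuthor` are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>

// The five averages for one text, in the order listed above (hypothetical layout).
using StyleVector = std::array<double, 5>;

// Assumed measure: mean relative difference across the five averages.
// Returns 0.0 for identical styles; larger values mean more different styles.
double styleDifference(const StyleVector& a, const StyleVector& b) {
    double total = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double larger = std::max(std::fabs(a[i]), std::fabs(b[i]));
        if (larger > 0.0) {
            total += std::fabs(a[i] - b[i]) / larger;  // relative gap for this average
        }
    }
    return total / a.size();
}

// Suggests the same author when the difference is within the threshold (5% by default).
bool likelySameAuthor(const StyleVector& a, const StyleVector& b,
                      double threshold = 0.05) {
    return styleDifference(a, b) <= threshold;
}
```

Under this assumed measure, two texts that differ only in words per sentence (20 versus 10) score a difference of 0.1, which is above the 5% threshold, so they would be reported as written by different people.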

To test this, I ran the program on work by Shakespeare, one of my friends, and myself.

The program correctly determined which author wrote each piece of text.


Assignment

The program I wrote relies on a single loop to run through a piece of text. The loop's iterations do not depend on one another, so it can easily be parallelized using the methods discussed in this class.
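As an illustration of how such a loop might be parallelized with OpenMP, here is a minimal sketch, not the original code. It assumes the loop accumulates per-character counts, which are combined across threads with a `reduction` clause; the names `Counts` and `countPunctuation` are hypothetical.

```cpp
#include <string>

// Hypothetical result type holding three of the raw counts.
struct Counts {
    long sentenceEnds;
    long commas;
    long colons;
};

Counts countPunctuation(const std::string& text) {
    long sentenceEnds = 0, commas = 0, colons = 0;
    const long n = static_cast<long>(text.size());

    // Each iteration reads one character and touches only the counters.
    // reduction(+ : ...) gives every thread private counters and sums them
    // when the loop ends, so no iteration depends on another.
    #pragma omp parallel for reduction(+ : sentenceEnds, commas, colons)
    for (long i = 0; i < n; ++i) {
        const char c = text[i];
        if (c == '.' || c == '!' || c == '?') ++sentenceEnds;
        else if (c == ',') ++commas;
        else if (c == ':') ++colons;
    }
    return Counts{sentenceEnds, commas, colons};
}
```

Compiled without OpenMP support, the pragma is ignored and the same function runs serially, which makes it easy to compare the two versions on identical input.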

Timing

To time the program, I used texts of varying lengths from three authors: two Shakespeare works (long: 46,956 words, 250,234 characters), two assignments I completed for school (medium: 1,885 words, 11,336 characters), and two blog posts written by the same author (short: 869 words, 4,997 characters).
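How the measurements were taken is not stated, so the following is only one possible approach: a small helper built on `std::chrono::steady_clock` that reports wall-clock time in milliseconds. The name `elapsedMilliseconds` is hypothetical.

```cpp
#include <chrono>
#include <utility>

// Runs `work` once and returns the elapsed wall-clock time in whole milliseconds.
template <typename Work>
long long elapsedMilliseconds(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();  // the code being timed
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

The analysis of a pair of texts would be passed in as a lambda. In an OpenMP build, `omp_get_wtime()` is an alternative way to read wall-clock time.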

Serial Timing

Time for serial program run

Author           | Character Count | Time (milliseconds)
Shakespeare      | 250,234         | 157
Adrian Sauvageot | 11,336          | 7
Blog Post        | 4,997           | 3


OpenMP Timing

Time for OpenMP parallel program run

Author           | Character Count | Time (milliseconds)
Shakespeare      | 250,234         | 72
Adrian Sauvageot | 11,336          | 17
Blog Post        | 4,997           | 7
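
One way to read the serial and OpenMP timings together is as a speedup ratio: serial time divided by OpenMP time. The worked arithmetic below uses only the reported figures.

```latex
S = \frac{T_{\text{serial}}}{T_{\text{OpenMP}}}, \qquad
S_{\text{Shakespeare}} = \frac{157}{72} \approx 2.18, \qquad
S_{\text{Adrian Sauvageot}} = \frac{7}{17} \approx 0.41, \qquad
S_{\text{Blog Post}} = \frac{3}{7} \approx 0.43
```

A value above 1 means the OpenMP run was faster. By this measure, the OpenMP version was roughly twice as fast on the long text and slower on the two shorter ones. A plausible explanation, which was not measured here, is that the fixed cost of starting and synchronizing threads outweighs the work saved on small inputs.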