Intro to Computer Science
PA11
Code due by Friday, April 27th at classtime
The goals of this project are to 1) gain experience using code written by someone other than yourself, 2) test your new top-down design skills by breaking a large problem down into steps and helper functions, and 3) gain more practice with file I/O, dictionaries, lists, and functions.
Tag clouds (http://en.wikipedia.org/wiki/Tag_cloud) are a common element on web pages these days. A tag cloud is a visual representation of word frequency, where more frequent words appear in a larger font. One can also use colors and placement. We are going to analyze a famous movie and create a tag cloud for each movie character based on the words they used, where the frequency of each word determines its font size in the cloud.
The movie we will analyze is Monty Python and the Holy Grail. While watching the movie prior to starting this assignment is not a requirement, it's highly recommended if you have some extra time!
To help you with this assignment you are provided with the following documents:
Each of these files is explained below:
This function takes a word and wraps it in a font tag with a specific size. It takes the word to be wrapped, the number of times it occurred in the document, and the highest and lowest word counts among the words being processed (the highest and lowest counts we are considering for this tag). It returns a string containing the word at a font size between htmlBig and htmlLittle (two local variables in the function; you can change them to whatever you like).
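To make the size calculation concrete, here is a minimal sketch of one way a word count could be mapped onto a font size between the two limits. It is not the code in htmlFunctions.py: the names scale_font_size and wrap_word, the default sizes 96 and 14, the linear scaling, and the span markup are all assumptions chosen for illustration (Python 3 is assumed).

```python
def scale_font_size(count, high, low, html_big=96, html_little=14):
    """Map a word count onto a font size between html_little and html_big.

    Hypothetical helper: linear scaling is an assumption, not necessarily
    what the provided makeHTMLword does.
    """
    if high == low:
        # Every word occurs equally often, so avoid dividing by zero.
        return html_big
    ratio = (count - low) / (high - low)  # 0.0 for rarest, 1.0 for most common
    return int(html_little + ratio * (html_big - html_little))


def wrap_word(word, count, high, low):
    """Wrap a word in markup carrying its scaled size (markup is assumed)."""
    size = scale_font_size(count, high, low)
    return '<span style="font-size: {}px"> {} </span>'.format(size, word)


print(wrap_word("grail", 6, 10, 2))
```

With these assumed limits, the most frequent word gets size 96, the least frequent gets size 14, and everything else falls proportionally in between.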
This function takes a single string of all the font-wrapped words from makeHTMLword and places them in an html box to be displayed. It returns a string which is the html code for the box.
This function takes the body returned from makeHTMLbox and wraps a standard HTML web page around it. The string title is used in the HTML, and it is also the file name once an ‘.html’ suffix is added.
Play with this file for a few minutes until you see how it works. You do not need to understand all of the details, but you need to understand what each function does and how they work together.
In a file called pa11.py, you will need to write a group of helper functions called by a master function (named main() ) which will read through a movie script file, create a dictionary of the words spoken by a specific character, remove stop words, identify the 40 words most frequently used by that character, and use that information to call the helper functions provided in the file htmlFunctions.py to create the word cloud. Your code should include at least 3 new helper functions (other than main()) to illustrate that you understand how to break large problems into small, manageable steps. Of course, you can create more than 3 new helper functions if you like.
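As an illustration of the counting steps described above, the following sketch builds a word-count dictionary, skips stop words, and picks the most frequent words. The function names count_words and top_words, the punctuation stripping, the lowercasing, and the tie-breaking rule are all assumptions; your own decomposition may reasonably look different.

```python
import string


def count_words(lines, stop_words):
    """Return a dictionary mapping each non-stop word to its count.

    Hypothetical helper: lowercasing and stripping punctuation are
    assumptions about how words should be normalized.
    """
    counts = {}
    for line in lines:
        for raw in line.split():
            word = raw.strip(string.punctuation).lower()
            if word and word not in stop_words:
                counts[word] = counts.get(word, 0) + 1
    return counts


def top_words(counts, n=40):
    """Return the n most frequent (word, count) pairs, highest count first."""
    # Ties are broken alphabetically so the result is repeatable.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:n]


sample = ["The Grail, the grail!", "Seek the Grail"]
print(top_words(count_words(sample, {"the"}), 40))
```

Keeping counting and ranking in separate functions makes each one easy to test on a few hand-written lines before running it on the whole script.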
The master function main() will take in two parameters:
For example, if I invoke:
main("holy-grail.txt","ARTHUR")
my code should produce a file called ARTHUR.html containing the word cloud for the words spoken by that character in the Holy Grail movie. That file should look like this one (ARTHUR.html).
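One piece main() needs is a way to pick out only the lines spoken by the requested character. The sketch below assumes each spoken line in the script starts with the character's name followed by a colon (for example "ARTHUR: ..."). That layout is an assumption, so check the actual file before relying on it. The helper name lines_for_character is also hypothetical.

```python
def lines_for_character(script_lines, character):
    """Return the text of every line that begins with 'CHARACTER:'.

    Assumes a 'NAME: speech' layout; continuation lines without a
    speaker prefix are ignored in this simple sketch.
    """
    prefix = character + ":"
    spoken = []
    for line in script_lines:
        stripped = line.strip()
        if stripped.startswith(prefix):
            # Keep only the speech, not the speaker label.
            spoken.append(stripped[len(prefix):].strip())
    return spoken


script = ["ARTHUR: Whoa there!", "SOLDIER: Halt!", "ARTHUR: It is I."]
print(lines_for_character(script, "ARTHUR"))
```

In a full program, script_lines would come from reading the file named by the first argument to main(), and the result would feed the word-counting step.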
CHECKING YOUR WORK. If you run the commands above on your finished code and the html files look different from mine, then something isn't quite right for one of us. It COULD be me who is wrong, but you should ask questions.
To upload your homework for grading, log on to eLearning, select this class, and navigate to the "Assignment Submissions" area. Click on the "Programming Assignment 11" folder and upload the python file in its designated location.