Back in Session 20, we built a function called multi_find(), which we used on Homework 9. Let's refresh our memory on how it works.
    def multi_find(source, target, start, end):
        result = ''
        pos = start
        while pos < end:
            next_pos = source.find(target, pos, end)    # 1
            if next_pos == -1:
                break
            result += (str(next_pos) + ',')             # 2
            pos = next_pos + 1
        if result == '':
            return result
        return result[:-1]
Trace the code for the call

    multi_find('abcdabccabacdeacbe', 'ab', 2, 12)

Write down the value of pos and next_pos every time you reach #1.
Let's run the code and find out... Now we can see that find() lets the code speed through the source string, focusing on the matches. Try changing the 12 to 100!
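One way to check a hand trace is to instrument the function so that it records both values each time it passes #1. The sketch below is the same code as multi_find() with one hypothetical addition, a trace list, which is not part of the original function.

```python
def multi_find_traced(source, target, start, end):
    """Same logic as multi_find(), but also records (pos, next_pos) at #1."""
    trace = []      # hypothetical addition, used only for tracing
    result = ''
    pos = start
    while pos < end:
        next_pos = source.find(target, pos, end)    # 1
        trace.append((pos, next_pos))
        if next_pos == -1:
            break
        result += (str(next_pos) + ',')             # 2
        pos = next_pos + 1
    if result == '':
        return result, trace
    return result[:-1], trace


result, trace = multi_find_traced('abcdabccabacdeacbe', 'ab', 2, 12)
for pos, next_pos in trace:
    print('pos =', pos, ' next_pos =', next_pos)
print('result =', repr(result))
```

The loop reaches #1 only three times for this call, once per match plus one final failed search, rather than once per character of the source string.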
Several students have told me "I don't really understand what this function does..." What can you do when you find yourself in this position?
All programmers occasionally run into code that baffles them. We all use techniques like this to get out of the dark. You, too, can be the source of your own enlightenment.
When we wrote multi_find(), strings and files were the only collections we knew about, and we had only begun to write our own functions. Now that we know lists and understand functions pretty well, we can make the function more Python-like and have it work more like its inspiration, the string method find().
Return Type. multi_find() returns a string, but that was a product of our limited Python knowledge. A list is a much more useful return type. That's an easy improvement to make, affecting only three points in the code.
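The three points are not listed here, so the version below is a best guess at them: initialize an empty list, append each position, and return the list as is. Treat it as a sketch rather than the exact code from the session.

```python
def multi_find(source, target, start, end):
    result = []                         # change 1: start with an empty list
    pos = start
    while pos < end:
        next_pos = source.find(target, pos, end)
        if next_pos == -1:
            break
        result.append(next_pos)         # change 2: append the position itself
        pos = next_pos + 1
    return result                       # change 3: no trailing comma to trim


print(multi_find('abcdabccabacdeacbe', 'ab', 2, 100))
```

Returning a list also removes the special case for an empty result, because an empty list needs no trimming.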
The result is a straightforward function that's easier for client code to use.
    >>> multi_find('abcdabccabacdeacbe', 'ab', 2, 100)
    [4, 8]
    >>> multi_find('abcdabccabacdeacbe', 'b', 2, 100)
    [5, 9, 16]
    >>> multi_find('abcdabccabacdeacbe', 'a', 0, 100)
    [0, 4, 8, 10, 14]
Optional Arguments. If we want to search all the way to the end of a string, the find() method allows us to leave off the last argument. The default value is the length of the string.
    >>> len('abcdabccabacdeacbe')
    18
    >>> 'abcdabccabacdeacbe'.find('a', 6, 18)
    8
    >>> 'abcdabccabacdeacbe'.find('a', 6)
    8
If we want to search from the beginning of the string, we can even leave off the second argument. The method uses 0 as the default value.
    >>> 'abcdabccabacdeacbe'.find('a', 9)
    10
    >>> 'abcdabccabacdeacbe'.find('a', 0)
    0
    >>> 'abcdabccabacdeacbe'.find('a')
    0
We can do the same thing by giving a parameter a default value in the function header. Here is our current header for multi_find():
def multi_find(source, target, start, end):
Making start default to 0 is as easy as this:
def multi_find(source, target, start=0, end):
We also need to give a default value to end. There are two reasons:
... look at "Check Yourself" on Page 369 of the text.
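One of those reasons is presumably Python's rule that a parameter without a default value may not follow one that has a default. The reasons themselves are not spelled out here, so this is an assumption. The short check below compiles the header with only start defaulted and shows that Python rejects it before any call is made. The names header_source and outcome are made up for this illustration.

```python
# The header from above, with a placeholder body so that it is a complete def.
header_source = "def multi_find(source, target, start=0, end):\n    pass\n"

try:
    compile(header_source, "<example>", "exec")
    outcome = "compiled"
except SyntaxError:
    # Python refuses the definition itself; the function is never created.
    outcome = "SyntaxError"

print(outcome)
```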
What is the default value for end? It is the length of the string to be searched, len(source). But if we try that...
def multi_find(source, target, start=0, end=len(source)):
We get an error:
    Traceback (most recent call last):
      File "/Users/wallingf/home/teaching/cs1510/web/sessions/session28/multi_find_v3.py", line 1, in <module>
        def multi_find(source, target, start=0, end=len(source)):
    NameError: name 'source' is not defined
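Python evaluates default values once, when the def statement itself runs, and at that point the parameter source does not exist yet; it gets a value only when the function is called. (You will learn why when you take CS 3540 Programming Languages and Paradigms.) So we have to write code to handle that case: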
    def multi_find(source, target, start=0, end=None):
        if end is None:
            end = len(source)
This uses the None that we have seen a time or two as a sentinel value. If it is ever the value of end, then we know the user did not pass four values, and the function should search all the way to the end of source.
This works nicely!
    >>> multi_find('abcdabccabacdeacbe', 'ab', 2, 100)
    [4, 8]
    >>> multi_find('abcdabccabacdeacbe', 'ab', 2)
    [4, 8]
    >>> multi_find('abcdabccabacdeacbe', 'ab', 0)
    [0, 4, 8]
    >>> multi_find('abcdabccabacdeacbe', 'ab')
    [0, 4, 8]
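For reference, here is one way the whole function might look with both improvements in place, the list return type and the two default values. It is assembled from the pieces above, so take it as a sketch; the file from the session may differ in its details.

```python
def multi_find(source, target, start=0, end=None):
    """Return a list of the positions in source, from start up to end,
    at which target is found."""
    if end is None:                 # caller left off end: search to the end
        end = len(source)
    result = []
    pos = start
    while pos < end:
        next_pos = source.find(target, pos, end)
        if next_pos == -1:          # no more matches
            break
        result.append(next_pos)
        pos = next_pos + 1          # resume just past the start of this match
    return result


print(multi_find('abcdabccabacdeacbe', 'ab'))
print(multi_find('abcdabccabacdeacbe', 'ab', 2))
```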
I like languages that let me create code that works like the built-in features of the language. Python gives us that freedom on occasion.
In Lab 14, you got your first official experience using dictionaries in a program, using this data file. Let's step through it...
    KEYS                      VALUES
    'Brad Pitt'          →    [ 'Sleepers', ... ]
    'Anthony Hopkins'    →    [ 'Hannibal', ... ]
    ...
    'Bruce Willis'       →    [ 'Die Hard', ... ]
    'Kevin Bacon'        →    [ 'A Few Good Men', ... ]
Standard running total loop, where we build a dictionary instead of a number, string, or list. That is the first step in breaking the task down into manageable steps...
One wrinkle: a space at the front of each movie name. Why? How to fix? (Write a strip_all(lst) function?)
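Here is a rough sketch of that first loop. It assumes each line of the data file looks like "actor, movie, movie, ...", which would explain the wrinkle: splitting on ',' leaves the space after each comma attached to the movie name. The file format, the two sample lines, and the function name build_actor_dict are assumptions for illustration, not the actual lab data or solution.

```python
def strip_all(lst):
    """Return a new list with whitespace stripped from each string in lst."""
    result = []
    for item in lst:
        result.append(item.strip())
    return result


def build_actor_dict(lines):
    """Running-total loop that accumulates a dictionary: actor -> movies."""
    actor_to_movies = {}
    for line in lines:
        fields = strip_all(line.split(','))
        if fields == ['']:          # skip blank lines
            continue
        actor = fields[0]
        actor_to_movies[actor] = fields[1:]
    return actor_to_movies


# Stand-in for the lines of the data file (assumed format, tiny sample).
sample_lines = ['Brad Pitt, Sleepers\n',
                'Kevin Bacon, A Few Good Men, Sleepers\n']
actors = build_actor_dict(sample_lines)
print(actors)
```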
    KEYS                     VALUES
    'Sleepers'          →    [ 'Brad Pitt', ... ]
    'Hannibal'          →    [ 'Anthony Hopkins', ... ]
    ...
    'Die Hard'          →    [ 'Bruce Willis', ... ]
    'A Few Good Men'    →    [ 'Kevin Bacon', ... ]
Standard running total loop to build another dictionary. The main wrinkle is that we encounter movies multiple times in the actor database. So:
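One common way to handle repeated movies is to check whether the movie is already a key: if it is, append the actor to its list; if not, start a new one-element list. The sketch below does that, using a small made-up sample dictionary and a hypothetical function name, build_movie_dict.

```python
def build_movie_dict(actor_to_movies):
    """Invert an actor -> movies dictionary into movie -> actors."""
    movie_to_actors = {}
    for actor in actor_to_movies:
        for movie in actor_to_movies[actor]:
            if movie in movie_to_actors:
                # Seen this movie before: add one more actor to its list.
                movie_to_actors[movie].append(actor)
            else:
                # First time we see this movie: start its list.
                movie_to_actors[movie] = [actor]
    return movie_to_actors


# Tiny sample in the shape of the first dictionary (illustration only).
sample_actors = {'Brad Pitt': ['Sleepers'],
                 'Kevin Bacon': ['A Few Good Men', 'Sleepers']}
movies = build_movie_dict(sample_actors)
print(movies)
```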
Together, the two dictionaries tell us a lot:
With just these two dictionaries, we can begin to answer many interesting questions. Some are just for fun, such as The Six Degrees of Kevin Bacon, a movie watcher's pastime from a few years ago. The Oracle of Bacon implements a simple look-up. Can you use your two dictionaries to implement something similar?
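As a possible starting point, here is a sketch of a one-step look-up: everyone who shares at least one movie with a given actor. It is not a full Six Degrees search. The function name co_stars and the sample dictionaries are made up for illustration.

```python
def co_stars(actor, actor_to_movies, movie_to_actors):
    """Return the actors who appear in at least one movie with actor."""
    result = []
    for movie in actor_to_movies.get(actor, []):    # unknown actor: no movies
        for other in movie_to_actors[movie]:
            if other != actor and other not in result:
                result.append(other)
    return result


# Tiny sample dictionaries in the shape shown above (illustration only).
sample_actors = {'Brad Pitt': ['Sleepers'],
                 'Kevin Bacon': ['A Few Good Men', 'Sleepers']}
sample_movies = {'Sleepers': ['Brad Pitt', 'Kevin Bacon'],
                 'A Few Good Men': ['Kevin Bacon']}

print(co_stars('Kevin Bacon', sample_actors, sample_movies))
```

Repeating the same step from each co-star, while keeping track of who has already been visited, is one way to extend this toward counting degrees of separation.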