Strings and For Loops
Bart Massey 2013-01-31
Review: Values and Expressions
Values can be given by constants:
Expressions can be made by combining values with operators:
1 + 3 * 5.0,
False and (True or False)
Expressions have whatever value they evaluate to; can be used wherever a value is needed
Values can be stored in variables:
x = 3 + 8,
y = False or False,
z = "hello"
Variables can be referenced in expressions:
1 + x,
y and True,
Functions are another kind of operator that combines values to produce a value:
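A quick sketch pulling the review together. It assumes Python 3, and it assumes that sqrt is the one from the standard math module, which these notes do not say explicitly.

```python
from math import sqrt  # assumption: sqrt comes from the math module

# Expressions combine values with operators.
a = 1 + 3 * 5.0                 # * binds tighter than +, so this is 1 + 15.0
b = False and (True or False)   # and/or combine bool values

# Values can be stored in variables and referenced later.
x = 3 + 8
y = False or False
z = "hello"

# A function call is an expression too, usable wherever a value is needed.
r = sqrt(x + 5)

print(a, b, x, y, z, r)
```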
Review: Types and Statements
Values have types: int, float, string, bool
Operators usually require and produce values of specific type:
3 + "hello" doesn't work. This is often true of functions as well:
sqrt("5") fails, for example.
Operators can produce values of different type than they accept:
2 < 3,
Sometimes a value's type is "promoted". Mostly, ints may be promoted to floats:
sqrt(5) works, since it is interpreted as sqrt(5.0).
Control statements require values of specific type
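The type rules above can be checked directly. This is a small sketch, again assuming Python 3 and that sqrt is math.sqrt; the variable names are made up for the illustration.

```python
from math import sqrt  # assumption: sqrt is math.sqrt

# Mixing an int and a string with + is a type error.
try:
    3 + "hello"
    plus_failed = False
except TypeError:
    plus_failed = True

# sqrt() wants a number, not a string.
try:
    sqrt("5")
    sqrt_failed = False
except TypeError:
    sqrt_failed = True

# A comparison takes two ints but produces a bool.
cmp_type = type(2 < 3)

# An int argument is promoted to float, so both calls give the same result.
same_root = sqrt(5) == sqrt(5.0)

# Mixed int/float arithmetic promotes to float as well.
mixed_type = type(1 + 2.0)

print(plus_failed, sqrt_failed, cmp_type, same_root, mixed_type)
```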
The string Type
So far, int, float and bool seem to have lots of operators and functions; string doesn't seem to have so much.
It turns out that + can be a string operator as well, "concatenation":
"hello" + "world" == "helloworld".
Also, * can take a string on the left and an int on the right and produce a string that is a "repetition":
"hello" * 3 == "hellohellohello".
But strings seem to have "structure": they look visually like they are made of characters. Treating them as atomic blobs sometimes isn't what you want.
Actually, strings are "sequences" of characters, and thus structured values. Any one-character string constant is also a character constant: Python has no separate character type.
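A short sketch of the two string operators, plus a check that a "character" is really just a short string. The variable names are only for illustration.

```python
greeting = "hello" + "world"   # concatenation
echo = "hello" * 3             # repetition: string on the left, int on the right

# A one-character constant has the same type as any other string.
c = 'h'
same_type = type(c) == type("hello")

print(greeting, echo, same_type)
```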
strings As Sequences
How do we reference a specific character in a string? With the [] operator:
"hello"[0] == 'h'
- This, by the way, is the main reason why we start counting at 0 in Python.
Can we assign a specific character in a string? Nope.
"hello"[0] = 'j' does not work.
The [] operator is more versatile than it appears:
Negative indices count from the right:
"hello"[-1] == 'o'
"Slices" grab subsequences / substrings via [start:end]:
"hello"[1:3] == "el",
"hello"[1:-2] == "el",
"hello"[1:] == "ello",
"hello"[:2] == "he",
"hello"[:] == "hello"
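The indexing and slicing rules in this section can be tried in one go. A minimal sketch, assuming Python 3; the variable names are invented for the example.

```python
s = "hello"

first = s[0]       # indexing starts at 0
last = s[-1]       # negative indices count from the right
middle = s[1:3]    # from index 1 up to, but not including, index 3
same = s[1:-2]     # the end of a slice can be negative too
tail = s[1:]       # omitted end: run to the end of the string
head = s[:2]       # omitted start: begin at index 0
whole = s[:]       # both omitted: the whole string

# Assigning to a position in a string is not allowed.
try:
    s[0] = 'j'
    assigned = True
except TypeError:
    assigned = False

print(first, last, middle, same, tail, head, whole, assigned)
```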
Chopping Up and Pasting Together Strings
How do we make "jello" from "hello"? We now have enough machinery to do it:
'j' + "hello"[1:] == "jello"
Pretty common to want to process strings "character at a time". For this it's sometimes handy to know the number of characters in the string:
len("hello") == 5
Example: put dots between every character in a string
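s = input("string to dotify? ")
dot_s = s[0]  # start with the first character (assumes s is not empty)
i = 1
while i < len(s):
    # append a dot, then the next character
    dot_s = dot_s + '.' + s[i]
    i += 1
print(dot_s)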
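For experimenting without typing input each time, the same loop can be wrapped in a function. This is only a sketch: dotify_while is a made-up name, and it uses s[:1] rather than s[0] so that an empty string does not cause an error.

```python
def dotify_while(s):
    """Return s with a '.' between each pair of adjacent characters."""
    dot_s = s[:1]      # the first character, or "" if s is empty
    i = 1
    while i < len(s):
        dot_s = dot_s + '.' + s[i]
        i += 1
    return dot_s

print(dotify_while("hello"))   # h.e.l.l.o
```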
Aside: Character Codes
Characters are represented inside the computer using a code called Unicode, which gives a number to every possible character in the world (~100000).
Originally, there was ASCII, which gave a number to every common typewriter character (~100).
Unicode is a superset of ASCII.
Can find out the code of a character with ord(): ord('h') == 104, ord('⁋') == 8267
Can make new characters with chr(): chr(8267) == '⁋'
Unicode is semi-sane: 'a' through 'z', 'A' through 'Z' and '0' through '9' are all together in order.
"Non-printing" characters (also "combining characters" etc.)
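Because letters and digits sit together in order, ordinary arithmetic on character codes does useful things. A small sketch with invented variable names, assuming Python 3:

```python
# Codes for letters and digits are consecutive, so arithmetic on them works.
next_letter = chr(ord('a') + 1)                  # the letter after 'a'
digit_value = ord('7') - ord('0')                # numeric value of the digit '7'
upper_h = chr(ord('h') - ord('a') + ord('A'))    # same position in the upper-case run

# ord() and chr() undo each other.
round_trip = ord(chr(8267))

# ASCII codes are all below 128, and Unicode keeps them unchanged.
is_ascii = ord('h') < 128

print(next_letter, digit_value, upper_h, round_trip, is_ascii)
```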
The "pattern" of the previous program is really common: Set a loop control variable to starting value, then increase by one until it gets to ending value.
- "iteration": many programs do little else.
The "for" loop captures this pattern in an easier to read and more reliable form:
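s = input("string to dotify? ")
dot_s = s[0]  # start with the first character (assumes s is not empty)
for i in range(1, len(s)):
    # i takes the values 1, 2, ..., len(s) - 1
    dot_s = dot_s + '.' + s[i]
print(dot_s)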
- Cannot forget to initialize the loop variable.
- Cannot forget to increment the loop variable.
The syntax is a little weird (and unique to Python)
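To see that the two loop forms really walk through the same values of i, here is a side-by-side sketch. The names seen_while and seen_for are invented for the comparison.

```python
s = "hello"

# while version: the loop variable is managed by hand.
seen_while = ""
i = 1
while i < len(s):
    seen_while = seen_while + s[i]
    i += 1

# for version: range() supplies the same values of i.
seen_for = ""
for i in range(1, len(s)):
    seen_for = seen_for + s[i]

print(seen_while, seen_for)
```

Deleting the i += 1 line from the while version makes it loop forever; the for version has no such line to forget.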
More About the range() Function
range(9) is the same as range(0, 9).
range(0, 9, 2) hits 0, 2, 4, 6 and 8.
range(9, 0, -1) is sometimes quite useful, but watch for the boundary cases.
Note that range starts with the initial value, but finishes just before the final value:
range(0, 3) hits 0, 1 and 2.
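The boundary cases are easiest to see by listing the values. In this sketch list() is used only to make the values of each range visible; the variable names are invented.

```python
up = list(range(9))            # same values as range(0, 9)
evens = list(range(0, 9, 2))   # step of 2
down = list(range(9, 0, -1))   # counts down and stops just before 0
short = list(range(0, 3))
empty = list(range(3, 3))      # start equals end: no values at all

print(up)
print(evens)
print(down)
print(short)
print(empty)
```

Note that the countdown starts at 9 but never reaches 0, the mirror image of counting up from 0 and never reaching 9.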
The range() Function Produces a Sequence (sort of)
Strings are cool because sequences are cool.
It turns out that Python also lets you have sequences of other things; for example, ints.
list(range(1, 5)) == [1, 2, 3, 4]
Why list()? Well, because range() actually produces a wacky object (a "range", which computes its values on demand) that can produce a list. Ow.
All the operators you learned for strings (sequences of characters) work for sequences of ints.
[1, 2, 3, 4][1 : -1] == [2, 3]
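A sketch of the string operators applied to a sequence of ints, assuming Python 3. The last lines show why list() is needed: a range can be indexed, but it does not compare equal to a list until it is converted.

```python
nums = list(range(1, 5))

first = nums[0]           # indexing
last = nums[-1]           # negative index
middle = nums[1:-1]       # slicing
joined = nums + [5, 6]    # concatenation
repeated = [0, 1] * 2     # repetition
count = len(nums)         # length

r = range(1, 5)
r_first = r[0]                        # a range supports indexing
equal_before = (r == nums)            # a range is not a list
equal_after = (list(r) == nums)       # but it can produce one

print(first, last, middle, joined, repeated, count)
print(r_first, equal_before, equal_after)
```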
for and Sequences
The for loop just sets the loop control variable to each element of a sequence in turn. We write
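s = input("string to dotify? ")
dot_s = s[0]  # start with the first character (assumes s is not empty)
for c in s[1:]:
    # c is each remaining character in turn; no index needed
    dot_s = dot_s + '.' + c
print(dot_s)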
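Looping directly over a sequence works the same way for strings and for lists. A sketch with made-up names (dotify_for, total); s[:1] is used so that an empty string is handled too.

```python
def dotify_for(s):
    """Return s with a '.' between each pair of adjacent characters."""
    dot_s = s[:1]          # the first character, or "" if s is empty
    for c in s[1:]:        # c is each remaining character in turn
        dot_s = dot_s + '.' + c
    return dot_s

# The same kind of loop over a sequence of ints.
total = 0
for n in [1, 2, 3, 4]:
    total = total + n

print(dotify_for("hello"), total)
```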
More About Sequences
Python doesn't actually care what type of values you put in its sequences:
[1, 2.5, "hello", [1, 2, "goodbye"]]is a valid sequence.
- However, these kinds of sequences tend to be less useful.
Can we change an element of an arbitrary sequence?
x = ['a', 'b', 'c']
x[1] = '!'
Yes, this works fine, and now x == ['a', '!', 'c'].
So why can't we change our strings? Because strings are magic sequences of characters (cause Python is stupid sometimes): they are "immutable".
- There's a workaround, but it's too ugly to show.
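The difference between the two kinds of sequence shows up as soon as you try an assignment. This sketch is not the workaround mentioned above; the last line just reuses the slicing trick from earlier to build a new string instead of changing the old one.

```python
x = ['a', 'b', 'c']
x[1] = '!'                 # lists are mutable: x is changed in place

s = "abc"
try:
    s[1] = '!'             # strings are immutable: this raises TypeError
    changed = True
except TypeError:
    changed = False

# A new string built from slices of the old one; s itself is untouched.
t = s[:1] + '!' + s[2:]

print(x, s, changed, t)
```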
Tuples are like lists, but immutable. Different syntax: parentheses instead of square brackets.
Singleton tuple is (1,), since (1) was already taken.
Not terribly important yet, but you should know they exist.
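A brief sketch of tuple syntax and behavior, assuming Python 3. The variable names are invented for the example.

```python
t = (1, 2, 3)
single = (1,)        # the trailing comma makes a one-element tuple
not_tuple = (1)      # just the int 1 wrapped in parentheses

rest = t[1:]         # indexing and slicing work as for other sequences

try:
    t[0] = 9         # tuples are immutable: this raises TypeError
    changed = True
except TypeError:
    changed = False

print(type(single), type(not_tuple), rest, changed)
```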
Lots Of Material!
You need to carry these ideas home and try them out. Tonight!
Obfuscation: Input a string and then print that string with every character with an even character code changed to '.' . Is the string still readable?
Input a string and then print all subsequences of three consecutive characters from that string. These are called "trigrams" and are used in cryptography.