
PYTHON DICTIONARIES CHAPTER 11

FROM THINK PYTHON HOW TO THINK LIKE A COMPUTER SCIENTIST

WHAT IS A REAL DICTIONARY?

It’s a book or file that contains the definitions of words.

In particular, there is a pairing between a word and its definition. You look up the word and get the definition. Can you look up the definition and get the word?

de·moc·ra·cy : a system of government by the whole population or all the eligible members of a state, typically through elected representatives.

word : definition is the pairing.

In Python we can define a variable to contain a dictionary.

>>> d1 = {1: 'one', 2: 'two', 3: 'three', 4: 'four'}
>>> print d1[2]    # what is the value that goes with key 2?
two

KEY:VALUE PAIRING

Let's look at an English-to-German translation table:

>>> eng2Ger = {'one': 'eins', 'two': 'zwei', 'three': 'drei', 'four': 'vier'}
>>> print eng2Ger['three']
drei

The keys can be almost anything you want, as long as they are immutable values such as numbers, strings, or tuples. How about:

>>> decToBinary = {0: 0, 1: 1, 2: 10, 3: 11, 4: 100, 5: 101, 6: 110}
>>> print decToBinary[3]
11

NOTE: Dictionaries are mutable!
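To make the note about mutability concrete, here is a minimal sketch that adds, changes, and removes entries in the translation table. The extra entries are made up for illustration, and print is called with parentheses so the snippet runs under Python 2 or 3.

```python
# Translation table from the slide; the changes below are illustrative only.
eng2Ger = {'one': 'eins', 'two': 'zwei', 'three': 'drei', 'four': 'vier'}

eng2Ger['six'] = 'sechs'   # add a new key:value pair
eng2Ger['one'] = 'EINS'    # replace the value stored under an existing key
del eng2Ger['four']        # remove a key together with its value

print(len(eng2Ger))        # 4 pairs remain: one, two, three, six
print(eng2Ger['one'])      # EINS
```

The variable still refers to the same dictionary object after each change; nothing is copied.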

THE VALUES() METHOD

To see whether something appears as a value in a dictionary, you can use the method values(), which returns the values as a list, and then use the in operator:

>>> vals = eng2Ger.values()
>>> print vals
['eins', 'zwei', 'drei', 'vier']

so you can do things like

if 'sieben' in vals:
    do_something

>>> print len(eng2Ger)    # len returns the number of key:value pairs in the dictionary
4

NOTE: If you look up a key that is not there, you get an error!

For example, eng2Ger['sieben'] raises a KeyError exception.
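There are a few common ways to avoid that error. The sketch below shows three of them with the translation table; the missing key 'seven' and the placeholder '???' are arbitrary choices for the example.

```python
eng2Ger = {'one': 'eins', 'two': 'zwei', 'three': 'drei', 'four': 'vier'}

# 1. Test for the key first. On a dictionary, `in` looks at the keys.
if 'seven' in eng2Ger:
    word1 = eng2Ger['seven']
else:
    word1 = 'no entry'

# 2. get() returns a default value instead of raising an error.
word2 = eng2Ger.get('seven', '???')

# 3. Go ahead with the lookup and catch the KeyError.
try:
    word3 = eng2Ger['seven']
except KeyError:
    word3 = None

print(word1)   # no entry
print(word2)   # ???
print(word3)   # None
```

Which one to use is mostly a matter of taste; get() is the shortest when a sensible default exists.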

DICTIONARY ACCESS IS VERY FAST!

Recall that we can use the in operator with lists, sets, and dictionaries.

If we have a dictionary that contains 1,000,000 key:value pairs and a list that has 1,000,000 elements, testing key in dictionary is much faster than testing element in list. This is because dictionaries are implemented as hash tables under the hood, so a lookup takes roughly the same time no matter how many items there are, while a list has to be searched element by element. See Exercise 10.11.
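One way to see the difference for yourself is to time both membership tests. This is a rough sketch using the standard timeit module; the collection size and the repeat count are arbitrary, and the actual numbers will vary from machine to machine.

```python
import timeit

n = 100000                                # smaller than 1,000,000 to keep the run short
big_list = list(range(n))
big_dict = dict.fromkeys(big_list, True)  # same keys; the values do not matter here
missing = -1                              # worst case for the list: the item is not there

# Run each membership test 100 times and report the total time in seconds.
list_time = timeit.timeit(lambda: missing in big_list, number=100)
dict_time = timeit.timeit(lambda: missing in big_dict, number=100)

print('list: %.6f seconds' % list_time)
print('dict: %.6f seconds' % dict_time)
```

The list has to compare against every element before it can answer, while the dictionary goes more or less straight to where the key would be.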

DICTIONARY AS A SET OF COUNTERS

Suppose you are given a string and you want to count how many times each letter appears.

There are several ways you could do it:

1. You could create 26 variables, one for each letter of the alphabet. Then you could traverse the string and, for each character, increment the corresponding counter, probably using a chained conditional.

2. You could create a list with 26 elements. Then you could convert each character to a number (using the built-in function ord), use the number as an index into the list, and increment the appropriate counter.

3. You could create a dictionary with characters as keys and counters as the corresponding values. The first time you see a character, you would add an item to the dictionary. After that, you would increment the value of the existing item.
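For comparison with the dictionary version, option 2 might look something like the sketch below. The function name letter_counts and the decision to skip anything that is not a lowercase letter are assumptions made for this example.

```python
def letter_counts(s):
    """Count the lowercase letters a-z in s using a list of 26 counters."""
    counts = [0] * 26
    for c in s:
        if 'a' <= c <= 'z':                  # assumption: ignore everything else
            counts[ord(c) - ord('a')] += 1   # 'a' -> index 0, 'b' -> index 1, ...
    return counts

counts = letter_counts('brontosaurus')
print(counts[ord('o') - ord('a')])   # 2
print(sum(counts))                   # 12 letters in total
```

This works, but the list always has 26 slots even if only a few letters occur, and it has no place for characters outside a-z.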

LET'S USE DICTIONARIES

def histogram(s):
    d = dict()
    for c in s:
        if c not in d:
            d[c] = 1
        else:
            d[c] += 1
    return d

>>> h = histogram('brontosaurus')
>>> print h
{'a': 1, 'b': 1, 'o': 2, 'n': 1, 's': 2, 'r': 2, 'u': 2, 't': 1}

The histogram indicates that the letters 'a' and 'b' appear once; 'o' appears twice, and so on.

If the key c is not found, add it to the dictionary and set its value to 1. If it is already there, just increment the value.

How about doing this for an entire book, or a DNA string?
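The if/else test can be folded into a single line with the dictionary method get, which returns a default when the key is missing. The sketch below is one possible rewrite; the DNA string is a made-up sample.

```python
def histogram(s):
    d = dict()
    for c in s:
        # get(c, 0) gives the current count, or 0 the first time c is seen
        d[c] = d.get(c, 0) + 1
    return d

h = histogram('GATTACA')   # hypothetical DNA string
print(h['A'])              # 3
print(h['T'])              # 2
```

The same function works unchanged on a whole book read into one string; only the number of distinct keys grows.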

LOOPING OVER A DICTIONARY

def print_hist(h):
    for c in h:
        print c, h[c]

Here’s what the output looks like:

>>> h = histogram('parrot')
>>> print_hist(h)
a 1
p 1
r 2
t 1
o 1

You can format this any way you choose. Dictionaries have a method called keys that returns the keys of the dictionary, in no particular order, as a list.

How would you do this so they were in alphabetical order? Remember you can sort a list. Let's do it in class!

ANSWER

def histogram(s):
    d = dict()
    for c in s:
        if c not in d:
            d[c] = 1
        else:
            d[c] += 1
    return d

def print_hist(h):
    keylist = sorted(h.keys())    # sorted copy of the keys, in alphabetical order
    for c in keylist:
        print c, h[c]

h = histogram('bothriolepus')
print_hist(h)

REVERSE LOOKUP

Given a dictionary d and a key k, it is easy to find the corresponding value v = d[k]. This operation is called a lookup.

But what if you have v and you want to find k? You have two problems. First, there might be more than one key that maps to the value v. Depending on the application, you might be able to pick one, or you might have to make a list that contains all of them. Second, there is no simple syntax for a reverse lookup; you have to search.

SEARCH THE DICT

def reverse_lookup(d, v):
    for k in d:
        if d[k] == v:
            return k
    raise ValueError    # no k found such that k:v exists

This function is yet another example of the search pattern, but it uses a feature we haven’t seen before, raise. The raise statement causes an exception; in this case it causes a ValueError, which generally indicates that there is something wrong with the value of a parameter.

Note: a reverse lookup is much slower than a forward lookup, because it has to search through the keys one by one.
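A caller can handle the exception with try/except. The following sketch repeats the function so that it runs on its own; the error message text and the literal histogram of 'parrot' are additions for this example.

```python
def reverse_lookup(d, v):
    for k in d:
        if d[k] == v:
            return k
    # The message is an addition for this example; a bare ValueError also works.
    raise ValueError('value does not appear in the dictionary')

h = {'p': 1, 'a': 1, 'r': 2, 'o': 1, 't': 1}   # histogram of 'parrot'

print(reverse_lookup(h, 2))   # r

try:
    reverse_lookup(h, 3)      # no letter appears three times
except ValueError as err:
    print('lookup failed: %s' % err)
```

Asking for the value 1 would return just one of the four matching keys, which is the first problem described above.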

RETURN A LIST OF MATCHING CASES

# Returns a list of the keys that give v. If no key gives v, then
# return the empty list [].
def reverse_lookup(d, v):
    r = []
    for k in d:
        if d[k] == v:
            r.append(k)
    return r
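The same idea can be written as a list comprehension. The sketch below uses the hypothetical name reverse_lookup_all to keep it apart from the version that raises an exception, and sorts the result only so that the printed output is predictable.

```python
def reverse_lookup_all(d, v):
    """Return a list of every key in d whose value is v (possibly empty)."""
    return [k for k in d if d[k] == v]

# Histogram of 'brontosaurus', written out literally.
h = {'b': 1, 'r': 2, 'o': 2, 'n': 1, 't': 1, 's': 2, 'a': 1, 'u': 2}

print(sorted(reverse_lookup_all(h, 2)))   # ['o', 'r', 's', 'u']
print(reverse_lookup_all(h, 5))           # []
```

Returning an empty list instead of raising lets the caller treat "no match" like any other result.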

RNA AMINO ACID TRANSLATION TABLE

DNA_codon = {
    'ATA':'I', 'ATC':'I', 'ATT':'I', 'ATG':'M',
    'ACA':'T', 'ACC':'T', 'ACG':'T', 'ACT':'T',
    'AAC':'N', 'AAT':'N', 'AAA':'K', 'AAG':'K',
    'AGC':'S', 'AGT':'S', 'AGA':'R', 'AGG':'R',
    'CTA':'L', 'CTC':'L', 'CTG':'L', 'CTT':'L',
    'CCA':'P', 'CCC':'P', 'CCG':'P', 'CCT':'P',
    'CAC':'H', 'CAT':'H', 'CAA':'Q', 'CAG':'Q',
    'CGA':'R', 'CGC':'R', 'CGG':'R', 'CGT':'R',
    'GTA':'V', 'GTC':'V', 'GTG':'V', 'GTT':'V',
    'GCA':'A', 'GCC':'A', 'GCG':'A', 'GCT':'A',
    'GAC':'D', 'GAT':'D', 'GAA':'E', 'GAG':'E',
    'GGA':'G', 'GGC':'G', 'GGG':'G', 'GGT':'G',
    'TCA':'S', 'TCC':'S', 'TCG':'S', 'TCT':'S',
    'TTC':'F', 'TTT':'F', 'TTA':'L', 'TTG':'L',
    'TAC':'Y', 'TAT':'Y', 'TAA':'_', 'TAG':'_',
    'TGC':'C', 'TGT':'C', 'TGA':'_', 'TGG':'W'
}

# A tricky translation for those of you who love this stuff.
def translate(sequence):
    """Return the translated protein from 'sequence' assuming +1 reading frame"""
    # get() substitutes 'X' for any codon that is not in the table
    return ''.join([DNA_codon.get(sequence[3*i:3*i+3], 'X')
                    for i in range(len(sequence)//3)])

ANOTHER WAY (MORE UNDERSTANDABLE)

def translate(sequence):
    s = ''                           # initialize to empty string
    numcodons = len(sequence)//3
    pos = 0
    for i in range(numcodons):
        s = s + DNA_codon[sequence[pos:pos+3]]
        pos += 3                     # goes to every third char
    return s

pos points at the start of each codon in turn (the codons are spaced out here for readability):

sequence = 'ACT GTA AGC CGT ACA'
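To watch pos step through the sequence, the sketch below runs the loop version on the example sequence and prints each step. To stay self-contained it uses only the five table entries this sequence needs, and the tracing print is an addition for illustration.

```python
# Subset of the codon table: just the entries needed for the example sequence.
DNA_codon = {'ACT': 'T', 'GTA': 'V', 'AGC': 'S', 'CGT': 'R', 'ACA': 'T'}

def translate(sequence):
    s = ''
    numcodons = len(sequence) // 3
    pos = 0
    for i in range(numcodons):
        codon = sequence[pos:pos+3]
        print('pos=%d  codon=%s  amino acid=%s' % (pos, codon, DNA_codon[codon]))
        s = s + DNA_codon[codon]
        pos += 3
    return s

protein = translate('ACTGTAAGCCGTACA')
print(protein)   # TVSRT
```

Unlike the one-line version, which substitutes 'X' through get, this loop raises a KeyError if it meets a codon that is not in the table.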