Back
Close

Computing with Data

elgeish
33.1K views
Previous: Counters Next: Hashable Objects

Dictionaries with Default Values

An alternative for the \texttt{dict} class to use as a sparse vector is defaultdict, which provides default values for nonexistent keys using a factory function that creates said default value:

from collections import defaultdict
# the default value is what int() returns: 0
d = defaultdict(int)
print(d[13])
# example: grouping elements by their respective keys
books = [ # documents to index: a list of (title, text) tuples
('hamlet', 'to be or not to be'),
('twelfth night', 'if music be the food of love'),
]
index = defaultdict(set) # the default value is what set() returns: {}
for title, text in books:
for word in text.split():
# get or create the respective set
# then add the title to it
index[word].add(title)
from pprint import pprint # pretty-printer (recommended for nested data)
pprint(index)
print(index['music']) # query for documents that have the word 'music'
# example: building a trie (aka prefix tree)
words = { 'apple', 'ban', 'banana', 'bar', 'buzz' }
# a factory function/lambda expression that returns
# a defaultdict which uses it as a factory function
factory = lambda: defaultdict(factory)
trie = defaultdict(factory)
DELIMITER = '--END--'
LEAF = dict()
for word in words:
level = trie
for letter in word:
level = level[letter]
level[DELIMITER] = LEAF
def autocomplete(root, prefix, suffix=''):
if DELIMITER in root:
print(prefix + suffix)
for key, value in root.items():
autocomplete(value, prefix, suffix + key)
autocomplete(trie['b']['a'], 'ba') # query for words that start with 'ba'
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Create your playground on Tech.io
This playground was created on Tech.io, our hands-on, knowledge-sharing platform for developers.
Go to tech.io
codingame x discord
Join the CodinGame community on Discord to chat about puzzle contributions, challenges, streams, blog articles - all that good stuff!
JOIN US ON DISCORD
Online Participants