Commit f49a90aa authored by LOUIS TYRRELL OLIPHANT
added lec 17 dictionaries 1

parent 98cc8c28
%% Cell type:markdown id: tags:
## Announcements
### CS 220 Enrichment Activities
Are you interested in working on a real-world data set and learning about the full data processing pipeline?
Voluntary working groups will learn about data management, data wrangling/processing, modeling, and reporting/communication skills.
**When: Thursday, March 6th @ 4pm**
**Where: Computer Science Room 1325**
### Resources To Improve In The Course
* **CS 220 Office Hours** -- As I'm sure you know, you can go to the [course office hours](https://sites.google.com/wisc.edu/cs220-oh-sp25/) to get help with labs and projects.
* **CS Learning Center** -- Offers free [small group tutoring](https://www.cs.wisc.edu/computer-sciences-learning-center-cslc/), not for debugging your programs, but to talk about course concepts.
* **Undergraduate Learning Center** -- Provides tutoring and [academic support](https://engineering.wisc.edu/student-services/undergraduate-learning-center/). They have [drop-in tutoring](https://intranet.engineering.wisc.edu/undergraduate-students/ulc/drop-in-tutoring/).
%% Cell type:markdown id: tags:
## Warmup
Make sure you at least do Warmup 1, 2, and 5 because we will be using the survey data today and need the `cell()` function.
%% Cell type:code id: tags:
``` python
import csv

# Warmup 1: Read in the file 'cs220_survey_data.csv' into a list of lists
# source: Automate the Boring Stuff with Python Ch 12
def process_csv(filename):
    exampleFile = open(filename, encoding="utf-8")
    exampleReader = csv.reader(exampleFile)
    exampleData = list(exampleReader)
    exampleFile.close()
    return exampleData

cs220_csv = process_csv("cs220_survey_data.csv") # change this
#TODO: show the length of this list of lists
```
%% Cell type:code id: tags:
``` python
# Warmup 2: store the first row in a variable called header and the rest in data
cs220_header = ...
print(cs220_header)
cs220_data = ...
```
%% Cell type:code id: tags:
``` python
# Warmup 4: show the last 3 rows of data
```
%% Cell type:code id: tags:
``` python
# Warmup 5: Finish writing the count_col_frequency function
def cell(row_idx, col_name):
    col_idx = cs220_header.index(col_name)
    val = cs220_data[row_idx][col_idx]
    if val == "":
        return None
    elif col_name == "Age":
        # treat non-integer ages such as "19.5" as missing
        if "." in val:
            return None
        return int(val)
    elif col_name == 'Latitude' or col_name == 'Longitude':
        return float(val)
    else:
        return val

def count_col_frequency(value, col_name, indexes=None):
    ''' returns the frequency of value in col_name for the rows in indexes.
    if indexes is None then looks at all rows.'''
    if indexes is None:
        indexes = list(range(len(cs220_data)))
    count = 0
    #TODO: finish writing this function
    return count
```
%% Cell type:code id: tags:
``` python
#test your function
count_col_frequency("pepperoni", "Pizza Topping")
```
%% Cell type:code id: tags:
``` python
# Warmup 6: Think about it: Is there an easy way to count *every* topping frequency?
pepperoni_count = count_col_frequency("pepperoni", "Pizza Topping")
sausage_count = count_col_frequency("sausage", "Pizza Topping")
basil_spinach_count = count_col_frequency("basil/spinach", "Pizza Topping")
print(pepperoni_count)
print(sausage_count)
print(basil_spinach_count)
```
%% Cell type:markdown id: tags:
# Dictionaries
## Reading
- [Downey Ch 11 ("A Dictionary is a Mapping" through "Looping and Dictionaries")](https://greenteapress.com/thinkpython2/html/thinkpython2012.html)
- [Python for Everybody, 10.1 - 10.7](https://runestone.academy/ns/books/published/py4e-int/dictionaries/toctree.html)
## Learning Objectives
After this lecture you will be able to...
- Use correct dictionary syntax to:
- Create a dictionary using either {} or dict()
- Lookup, insert, update, and pop key/value pairs
- Use a for loop, the in operator, and common methods when working with dictionaries.
- Write code that uses a dictionary
- to store frequencies
- to iterate through all key/value pairs
%% Cell type:markdown id: tags:
## Dictionary Data Type
As we are getting more sophisticated in this course, it's time to define...
### Data Structure <br>
A data structure is a collection of data values, the relationships among them, and the functions or operations that can be applied to the data (Wikipedia).
Python contains built-in Data Structures called Collections. Today you will learn how to store data in Dictionaries.
### Dictionary
**A dictionary is like a list, but more general. In a list, the indices have to be integers; in a dictionary they can be any <u>immutable</u> type.**
You can think of a dictionary as a mapping between a set of indices (which are called keys) and a set of values. Each key maps to a value. The association of a key and a value is called a key-value pair or sometimes an item.
(from Think Python, Chapter 11)
The term **dictionary** is a useful one because it is similar to a language dictionary. A language dictionary has terms and their definitions. A Python dictionary has keys and the values associated with the keys.
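To make the "immutable key" rule concrete, here is a minimal sketch. The `points` dictionary and its contents are made up for illustration. It shows a tuple working as a key, while a list (which is mutable) is rejected with a `TypeError`.

```python
# Toy dictionary (made up for illustration): tuple keys map to labels.
points = {(0, 0): "origin", (1, 2): "A"}
print(points[(0, 0)])  # look up a value using a tuple key

# A list is mutable, so Python refuses to use it as a key.
try:
    points[[3, 4]] = "B"
    list_key_error = None
except TypeError as err:
    list_key_error = str(err)

print(list_key_error)  # the error message explains that a list cannot be a key
```

Strings, integers, floats, and tuples of those types are the key types you are most likely to use in this course.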
%% Cell type:code id: tags:
``` python
# a dictionary that stores prices of bakery items
# create a dictionary of key/value pairs
# notice the curly brackets
# notice it can span over more than one line, indenting doesn't matter
price_dict = { 'pie': 3.95,
'ala mode':1.50,
'donut': 1.25,
'muffin': 2.25,
'brownie': 3.15,
'cookie': 0.79, 'milk':1.65, 'loaf': 5.99,
'hot dog': 4.99} # feel free to add some of your own here
price_dict
```
%% Cell type:code id: tags:
``` python
# print the length of the dictionary
print(len(price_dict)) # number of key/value pairs
#get the price for a certain item
print(price_dict['loaf']) # value of dict [ key]
#get the price for donut
print(price_dict['donut'])
# what's wrong with this line?
print(price_dict[1.25]) # No keys are 1.25 -- that is the value
```
%% Cell type:markdown id: tags:
### Dictionaries are Mutable
Just like a list, you can add and remove items from a dictionary and you can change the values that are stored in a dictionary.
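As a quick preview, the sketch below runs the three kinds of changes on a small toy dictionary. The `menu` items and prices are made up and are separate from `price_dict`. Note that the same `[ ]` assignment syntax either inserts or updates, depending on whether the key is already present.

```python
# Toy dictionary; the items and prices are made up for illustration.
menu = {"tea": 2.00, "scone": 3.25}

menu["latte"] = 4.50          # key not present yet: inserts a new pair
menu["tea"] = 2.25            # key already present: replaces the old value
removed = menu.pop("scone")   # removes the pair and returns its value

print(menu)      # the remaining key/value pairs
print(removed)   # the value that was stored under "scone"
```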
%% Cell type:code id: tags:
``` python
# add a new key/value pair using [ ] notation
# (if the key is already in the dictionary, its value is replaced instead)
price_dict['donut'] = 2.50
price_dict
```
%% Cell type:code id: tags:
``` python
# change the value associated with a key....syntax is like add
price_dict['donut'] = 2.25
price_dict
```
%% Cell type:code id: tags:
``` python
# use pop to delete a key/value pair
price_dict.pop('hot dog') # or del(price_dict['hot dog'])
price_dict
```
%% Cell type:code id: tags:
``` python
# delete another key/value pair using the del statement
del price_dict['cookie']
price_dict
```
%% Cell type:code id: tags:
``` python
# try deleting something that is not there
price_dict.pop('pizza')
price_dict
```
%% Cell type:code id: tags:
``` python
# fix this with an if statement
if 'pizza' in price_dict: # safe programming, check before popping. 'in' checks the keys
    price_dict.pop('pizza')
price_dict
```
%% Cell type:markdown id: tags:
## A dictionary's `.keys()` and `.values()` methods
Like other data types, a dictionary has its own set of methods. The `.keys()` and `.values()` methods return just the keys or just the values of the dictionary. You can read more about a dictionary's methods at [W3schools](https://www.w3schools.com/python/python_ref_dictionary.asp).
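Dictionaries also have an `.items()` method, which is one way to iterate through all key/value pairs at once. The sketch below uses a made-up `prices` dictionary rather than `price_dict`, so it can be run on its own.

```python
# Toy dictionary; the items and prices are made up for illustration.
prices = {"tea": 2.00, "scone": 3.25, "latte": 4.50}

# .items() yields (key, value) pairs, which the for loop unpacks into two variables
for item, price in prices.items():
    print("{} costs ${:.2f}".format(item, price))

# .values() can be passed directly to sum() without converting to a list first
total = sum(prices.values())
print(total)
```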
%% Cell type:code id: tags:
``` python
# get all keys and convert to a list
keys = list(price_dict.keys())
print(keys)
sum_of_groc = 0
for key in keys:
    sum_of_groc += price_dict[key]
print(sum_of_groc)
```
%% Cell type:code id: tags:
``` python
# get all values and convert to a list
values = list(price_dict.values())
print(values)
```
%% Cell type:code id: tags:
``` python
price_dict
```
%% Cell type:code id: tags:
``` python
# use 'in' price_dict, price_dict.keys(), price_dict.values()
print(price_dict)
print('donut' in price_dict) # default is to check the keys
print(3.95 in price_dict) # default is NOT values # FALSE
print('muffin' in price_dict.keys()) # can call out the keys
print(3.95 in price_dict.values()) # can check the values
```
%% Cell type:markdown id: tags:
## Common uses of Dictionaries
Dictionaries are a very useful data structure. In Python, you would typically use a dictionary when each piece of data is associated with a unique key and you need to look up values quickly by that key. This is ideal for problems like mapping words to their definitions, storing user data under unique identifiers, or associating items in a game with their properties.
Take a look at the following examples that use dictionaries.
%% Cell type:code id: tags:
``` python
price_dict
```
%% Cell type:code id: tags:
``` python
# Example 1: given a list of items, find the total cost of the order
order = ['pie', 'donut', 'milk', 'cookie', 'tofu'] # add more items to the order
print(order)
total_cost = 0
for item in order:
    if item in price_dict:
        total_cost += price_dict[item]
    else:
        print(item + " is not in our prices!")
# find the total of the items in the order
print("Your total is ${:.2f}".format(total_cost))
```
%% Cell type:code id: tags:
``` python
# Example 2a: find the frequency of characters in a sentence
# start with an empty dictionary
letter_freq = {} # populate this with the letters we find
# letter_freq = dict() # other way
sentence = "A data structure is a collection of data values, the relationships among them, and the functions or operations that can be applied to the data."
for letter in sentence:
    letter = letter.lower()
    if letter not in letter_freq:
        letter_freq[letter] = 1
    else:
        letter_freq[letter] += 1
letter_freq
```
%% Cell type:code id: tags:
``` python
# Example 2b: find the letter that occurred the most
# HINT: Use the dictionary above.
most_used_key = None
max_usage = 0
for current_key in letter_freq:
    # skip spaces and punctuation; only letters compete for the maximum
    if current_key.isalpha() and letter_freq[current_key] > max_usage:
        most_used_key = current_key
        max_usage = letter_freq[current_key]
print("The character '{}' appeared {} times".format(str(most_used_key), max_usage))
```
%% Cell type:code id: tags:
``` python
# think about it... why not use for i in range?
for i in range(len(letter_freq)): # this is NOT how we iterate over a dictionary
    print(i)
```
%% Cell type:code id: tags:
``` python
# Recall: survey data
cs220_data[-1]
```
%% Cell type:markdown id: tags:
## You Try It
Finish the code in the cell below to find the counts of all of the majors
%% Cell type:code id: tags:
``` python
# Example 3a. Same as 2a above (find frequency of letter),
# but use the survey_data to find the frequency of the different majors
major_freq = dict() # another way to make a dictionary
for i in range(len(cs220_data)):
    major = cell(i,'Primary Major')
    ##TODO finish the code here
major_freq
```
%% Cell type:markdown id: tags:
## Make It A Helper Function
Getting a frequency table is so common, let's make it a helper function and then use it to answer some common questions.
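As a warm-up for the helper, the counting pattern from Example 2a can be sketched on a plain list. The function name `count_values` and the toy `toppings` list are made up for illustration. Unlike the survey helper, this sketch takes a list of values directly instead of reading a column through `cell()`.

```python
def count_values(values):
    """Return a dictionary mapping each distinct value to how often it appears."""
    freq = {}
    for val in values:
        if val in freq:
            freq[val] += 1   # seen before: increment the stored count
        else:
            freq[val] = 1    # first time: add the key with a count of 1
    return freq

# Toy data (made up); one pass counts every topping at once.
toppings = ["pepperoni", "sausage", "pepperoni", "basil/spinach", "pepperoni"]
topping_freq = count_values(toppings)
print(topping_freq)
```

One pass over the data fills in every count, so there is no need to call a counting function once per value.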
Finish the function definition in the cell below.
%% Cell type:code id: tags:
``` python
def get_frequency_distribution(col_name,indexes=None):
    """returns a dictionary of values in col_name and their frequencies.
    If indexes is None then looks at all rows otherwise only looks at the rows in indexes
    """
    if indexes is None:
        indexes = list(range(len(cs220_data)))
    ret_value = {}
    for i in indexes:
        val = cell(i,col_name)
        ##TODO if val is a key in ret_value then increment the associated value
        ## otherwise add the val as a key with 1 as the associated value
    return ret_value
```
%% Cell type:markdown id: tags:
### Use The `get_frequency_distribution()` function
Use the `get_frequency_distribution()` function to answer the questions in the cells below.
%% Cell type:code id: tags:
``` python
##TODO: Are there more runners or non-runners who filled out the survey?
```
%% Cell type:code id: tags:
``` python
##TODO: Are there more early birds or night owls who filled out the survey?
```
%% Cell type:code id: tags:
``` python
##TODO: What is the most popular major and how many are in that major who filled out the survey?
```
%% Cell type:code id: tags:
``` python
def filter_match(col_name,col_value,indexes=None):
    # returns the indexes of the rows whose col_name value equals col_value
    if indexes is None:
        indexes = list(range(len(cs220_data)))
    ret_value = []
    for i in indexes:
        val = cell(i,col_name)
        if col_value == val:
            ret_value.append(i)
    return ret_value
```
%% Cell type:markdown id: tags:
Now combine the use of the `get_frequency_distribution()` with the `filter_match()` function which we created last time to answer a few more questions.
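The idea of filtering first and counting second can be sketched on toy data. Everything here is an assumption made for illustration: the column names, the rows, and the helper names `toy_filter` and `toy_frequency`. The real survey columns and values may differ.

```python
# Toy table (made up); each row is [runner, sleep habit].
toy_header = ["Runner", "Sleep Habit"]
toy_data = [
    ["Yes", "night owl"],
    ["No", "early bird"],
    ["Yes", "early bird"],
    ["Yes", "night owl"],
]

def toy_filter(col_name, col_value):
    """Return the indexes of the rows whose col_name entry equals col_value."""
    col_idx = toy_header.index(col_name)
    matches = []
    for i in range(len(toy_data)):
        if toy_data[i][col_idx] == col_value:
            matches.append(i)
    return matches

def toy_frequency(col_name, indexes):
    """Count the values of col_name, looking only at the given row indexes."""
    col_idx = toy_header.index(col_name)
    freq = {}
    for i in indexes:
        val = toy_data[i][col_idx]
        if val in freq:
            freq[val] += 1
        else:
            freq[val] = 1
    return freq

# Step 1: keep only the runners. Step 2: count sleep habits among those rows.
runner_rows = toy_filter("Runner", "Yes")
runner_sleep = toy_frequency("Sleep Habit", runner_rows)
print(runner_sleep)
```

The list of indexes returned by the filter is what gets passed as the `indexes` argument of the frequency function.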
%% Cell type:code id: tags:
``` python
##TODO: Of those who run, what is the distribution of sleep habits?
```
%% Cell type:code id: tags:
``` python
##TODO: Of those who do not run, what is the distribution of sleep habits?
```
%% Cell type:code id: tags:
``` python
##TODO: Are procrastinators more common among dog lovers or cat lovers?
```
%% Cell type:markdown id: tags:
### Summary
We have practiced creating **dictionaries**, adding and removing key-value pairs from the dictionaries, accessing values by using the associated key, and iterating over the keys of a dictionary to find a specific value (e.g. the largest value).