lec 16 list practice

Commit 98cc8c28 authored 1 month ago by LOUIS TYRRELL OLIPHANT
--- a/s25/Louis_Lecture_Notes/16_List_Practice/Lec_16_List_Practice.ipynb
+++ b/s25/Louis_Lecture_Notes/16_List_Practice/Lec_16_List_Practice.ipynb
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Announcements\n",
+    "\n",
+    "### CS 220 Enrichment Activities\n",
+    "Students interested in working on a real-world data set and learning about the full data processing pipeline?\n",
+    "\n",
+    "Voluntary working groups will learn about data management, data wrangling/processing, modeling, and reporting/communication skills.\n",
+    "\n",
+    "**When: Thursday, March 6th @ 4pm**\n",
+    "\n",
+    "**Where: Computer Science Room 1325**\n",
+    "\n",
+    "\n",
+    "### Resources To Improve In The Course\n",
+    "\n",
+    "* **CS 220 Office Hours** -- As I'm sure you know, you can go to the [course office hours](https://sites.google.com/wisc.edu/cs220-oh-sp25/) to get help with labs and projects.\n",
+    "* **CS Learning Center** -- Offers free [small group tutoring](https://www.cs.wisc.edu/computer-sciences-learning-center-cslc/), not for debugging your programs, but to talk about course concepts.\n",
+    "* **Undergraduate Learning Center** -- Provides tutoring and [academic support](https://engineering.wisc.edu/student-services/undergraduate-learning-center/).  They have [drop-in tutoring](https://intranet.engineering.wisc.edu/undergraduate-students/ulc/drop-in-tutoring/)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Warmup"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Warmup 0: Hotkeys\n",
+    "# We move quickly, it's good to know some hotkeys!\n",
+    "\n",
+    "# All-around good-to-knows...\n",
+    "#  Ctrl+A: Select all the text in a cell.\n",
+    "#  Ctrl+C: Copy selected text.\n",
+    "#  Ctrl+X: Cut selected text.\n",
+    "#  Ctrl+V: Paste text from clipboard.\n",
+    "#  Ctrl+S: Save.\n",
+    "\n",
+    "# Jupyter-specific good-to-knows...\n",
+    "#  Ctrl+Enter: Run Cell\n",
+    "#  Ctrl+/: Comment/uncomment sections of code.\n",
+    "#  Esc->A: Insert cell above\n",
+    "#  Esc->B: Insert cell below\n",
+    "#  Esc->Shift+L: Toggle line numbers (not working on my machine, but can look under View->Show Line Numbers).\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Warmup 1: Empty List\n",
+    "\n",
+    "weekend_plans = []  # I have no weekend plans :(\n",
+    "print(weekend_plans)\n",
+    "\n",
+    "# TODO add three things to your weekend plans using .append\n",
+    "\n",
+    "print(weekend_plans)\n",
+    "\n",
+    "# TODO add three things to your weekend using .extend\n",
+    "\n",
+    "print(weekend_plans)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "# Warmup 2: Tic-Tac-Toe\n",
+    "\n",
+    "board = []\n",
+    "\n",
+    "# TODO using .append(), add three lists of row data for tic-tac-toe (Noughts and Crosses)\n",
+    "# make up the placement of X's and O's.\n",
+    "\n",
+    "# TODO now use nested loops to print the board\n",
+    "\n",
+    "# TODO print out the center value using double indexing\n",
+    "print(board[1][1])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "source": [
+    "# List Practice\n",
+    "\n",
+    "**Readings**\n",
+    "\n",
+    "- Optional: [Set Data Type](https://docs.python.org/3.10/library/stdtypes.html#set-types-set-frozenset)\n",
+    "- Optional: [W3Schools on Set Data Type](https://www.w3schools.com/python/python_sets.asp)\n",
+    "\n",
+    "**Objectives**\n",
+    "\n",
+    "- Understand and use the `set` data type for removing duplicates\n",
+    "- Create helper functions for filtering data\n",
+    "- Use the `.sort()` method and the `sorted()` function, and understand the difference between them.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "source": [
+    "## Set Data Type\n",
+    "\n",
+    "Another data type similar to a list is a set.  To create a set you use curly braces instead of square brackets.  Sets cannot contain duplicate items.\n",
+    "\n",
+    "```python\n",
+    "    my_set = {2, 3, 3, 4}\n",
+    "    print(my_set)\n",
+    "```\n",
+    "```\n",
+    "    {2, 3, 4}\n",
+    "```\n",
+    "\n",
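+    "One common use for sets is to remove duplicates from a list.  You can do this by converting the list to a set and then converting it back to a list.  Because a set is unordered, the items may come back in a different order than they started in.\n",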
+    "\n",
+    "```python\n",
+    "    groceries = ['apples','oranges','kiwis','apples']\n",
+    "    groceries = list(set(groceries))\n",
+    "    print(groceries)\n",
+    "```\n",
+    "```\n",
+    "    ['apples', 'oranges', 'kiwis']\n",
+    "```\n",
+    "\n",
+    "You can read more about sets and the methods that you can use with them at [W3Schools](https://www.w3schools.com/python/python_sets.asp).\n",
+    "\n",
+    "### You Try It\n",
+    "\n",
+    "Use the `.add()` method on the set below to add at least 4 weekend plans, one of them a duplicate.  Then use the `.discard()` method to remove one of the weekend plans."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "# A set is similar to a list.  However, it is unordered and its items are unique.\n",
+    "# The method names are also a little different!\n",
+    "weekend_plans = set() # creates an empty set\n",
+    "\n",
+    "## TODO: use .add() to add 4 items to weekend_plans with one being a duplicate\n",
+    "\n",
+    "\n",
+    "print(weekend_plans)\n",
+    "\n",
+    "# TODO: use .discard() to remove one of the items from the set\n",
+    "# Unlike a list's remove, this will not throw an error if DNE (does not exist).\n",
+    "\n",
+    "print(weekend_plans)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "source": [
+    "## Helper Functions\n",
+    "\n",
+    "Last class we created the `cell()` function to help with selecting values from the survey data.  Let's take this a step further today and create a range of helper functions that we can use to answer more challenging questions.  Investing time in good helper functions makes these challenging questions faster to tackle, and the functions are flexible enough that you can reuse them to answer a variety of questions.\n",
+    "\n",
+    "First, let's create our `process_csv()` function to help with loading the data and then split the data into the header and data portions.  Finish the code in the cells below:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "import csv\n",
+    "\n",
+    "# source:  Automate the Boring Stuff with Python Ch 12\n",
+    "def process_csv(filename):\n",
+    "    exampleFile = open(filename, encoding=\"utf-8\")  \n",
+    "    exampleReader = csv.reader(exampleFile) \n",
+    "    exampleData = list(exampleReader)        \n",
+    "    exampleFile.close()  \n",
+    "    return exampleData"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "# TODO: Separate the data into 2 parts...\n",
+    "# a header row, and a list of data rows\n",
+    "cs220_csv = process_csv('cs220_survey_data.csv')\n",
+    "cs220_header = ...\n",
+    "cs220_data = ..."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "source": [
+    "Let's make the `cell()` function again, but this time let's make it a little more robust.  If you recall, we filtered out empty values and returned `None` instead of an empty string.  We also converted the `Age` column's values to `int` (and filtered out any value where a decimal point was found).\n",
+    "\n",
+    "Let's take that a step further and think about every column: which values we would want to filter out and return as `None`, and which data type we would want to return.  The `Latitude` and `Longitude` columns contain values that would best be treated as floats.  Can you think of any other changes to data types or specific values that should be filtered out?\n",
+    "\n",
+    "Make those changes to the cell function below:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "# Remember the improved cell function\n",
+    "\n",
+    "def cell(row_idx, col_name):\n",
+    "    col_idx = cs220_header.index(col_name)\n",
+    "    val = cs220_data[row_idx][col_idx]\n",
+    "    if val == \"\":\n",
+    "        return None\n",
+    "    elif col_name == \"Age\":\n",
+    "        return int(val)\n",
+    "    ##TODO add an elif for converting latitude and longitude to float\n",
+    "    else:\n",
+    "        return val"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "source": [
+    "Let's test our improved `cell()` function and make sure it is working properly.\n",
+    "\n",
+    "Since the `Age`, `Latitude` and `Longitude` should now all be numbers, let's loop through the data and check the return type for these columns.  Finish the code in the cell below."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "for i in range(len(cs220_data)):\n",
+    "    age = cell(i,'Age')\n",
+    "    if age == None or type(age) == int:\n",
+    "        pass\n",
+    "    else:\n",
+    "        print(\"found an age which is the wrong type:\",age)\n",
+    "    ##TODO add checks for latitude and longitude"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "source": [
+    "### Helper Function: Getting smallest value in column\n",
+    "\n",
+    "Now let's think about functions as tools.  What tool would help us answer the questions we might have?  One kind of question involves finding the smallest or largest value in a column.\n",
+    "\n",
+    "And if we think for a moment about the design of the function, we can make it very flexible.  In other words it might help with multiple possible questions.  Some possible questions involving the smallest value in a column might be:\n",
+    "\n",
+    "* What is the age of the youngest person who answered the survey?\n",
+    "* What is the age of the youngest person in your lecture?\n",
+    "* What song did the youngest person enter?\n",
+    "\n",
+    "All of these questions involve a youngest person.  But notice two things:\n",
+    "\n",
+    "* Knowing the index of this person allows us to find out other properties about the person\n",
+    "* Working with a subset of the data (e.g. youngest in your lecture) is a common practice\n",
+    "\n",
+    "To handle these different ways of looking for the youngest, we can create our function like so:\n",
+    "\n",
+    "```python\n",
+    "def get_smallest(col_name, indexes=None):\n",
+    "    \"\"\"Returns the index of the smallest value\n",
+    "    for the column with col_name, looking only\n",
+    "    at the rows in indexes.  If indexes is None\n",
+    "    then looks at all rows.\"\"\"\n",
+    "```\n",
+    "\n",
+    "Notice that the function returns the index, not the value.  The function also has an optional `indexes` parameter which can be used if you already have a subset of the rows you want to work with.\n",
+    "\n",
+    "Finish writing the function below."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "def get_smallest(col_name, indexes=None):\n",
+    "    \"\"\"Returns the index of the smallest value\n",
+    "    for the column with col_name, looking only\n",
+    "    at the rows in indexes.  If indexes is None\n",
+    "    then looks at all rows.\"\"\"\n",
+    "    if indexes == None:\n",
+    "        indexes = list(range(0,len(cs220_data)))\n",
+    "    smallest_value = None\n",
+    "    smallest_index = 0\n",
+    "    for i in indexes:\n",
+    "        val = cell(i,col_name)\n",
+    "        if val == None:\n",
+    "            continue\n",
+    "        ##TODO FINISH Function\n",
+    "    return smallest_index"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "source": [
+    "Okay, if we have written our function well, we can now use it to answer the questions we have.  Use `get_smallest()` to answer the questions in the cell below."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "##TODO What is the age of the youngest person who answered the survey?\n",
+    "\n",
+    "\n",
+    "##TODO What song did the youngest person enter?\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "source": [
+    "### Helper Function: Filter rows by a column that matches a value\n",
+    "Notice we didn't ask one question -- What is the age of the youngest person in your lecture?  To answer this we first need to get a list of just the rows that are for your lecture.\n",
+    "\n",
+    "Let's create a filter function that will return a list of the indexes that meet some matching criteria.  We can still use the optional `indexes` parameter idea.\n",
+    "\n",
+    "```python\n",
+    "def filter_match(col_name,col_value,indexes=None):\n",
+    "    \"\"\"returns a subset of indexes where the \n",
+    "    col_name has a value of col_value.  If indexes\n",
+    "    is None then looks at all rows\"\"\"\n",
+    "```\n",
+    "\n",
+    "Finish the code in the cell below."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "def filter_match(col_name,col_value,indexes=None):\n",
+    "    if indexes == None:\n",
+    "        indexes = list(range(len(cs220_data)))\n",
+    "    ret_value = []\n",
+    "    for i in indexes:\n",
+    "        val = cell(i,col_name)\n",
+    "        ##TODO: Finish function\n",
+    "    return ret_value"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "source": [
+    "Okay, use this `filter_match()` function, perhaps combined with `get_smallest()`, to answer the questions in the cell below."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "## TODO how many people are in your lecture who filled out the survey?\n",
+    "\n",
+    "## TODO What is the age of the youngest person in your lecture?\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "source": [
+    "### Helper Function: Filter rows by a column that contains a value\n",
+    "Exact matching is only one way that we might want to filter our data.  Another possible way is to check whether a column contains a particular value as a portion of what was entered.  For example, how many primary majors are Engineering majors?  Since the field is a string and \"Engineering\" is only a portion of the value, we really want to see if the field contains the value.\n",
+    "\n",
+    "Finish the `filter_contains()` function below."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "def filter_contains(col_name,col_value,indexes=None):\n",
+    "    \"\"\"returns a subset of indexes where the column\n",
+    "    col_name has values that are strings and col_value is\n",
+    "    a portion of the value (case insensitive)\n",
+    "    \"\"\"   \n",
+    "    if indexes == None:\n",
+    "        indexes = list(range(len(cs220_data)))\n",
+    "    ret_value = []\n",
+    "    col_value = col_value.lower()\n",
+    "    for i in indexes:\n",
+    "        val = cell(i,col_name)\n",
+    "        if val == None:\n",
+    "            continue\n",
+    "        val = val.lower()\n",
+    "        ##TODO FINISH FUNCTION\n",
+    "    return ret_value"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "source": [
+    "With our functions we can answer some rather challenging questions."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "## TODO: How many people in your lecture are majoring in Engineering?\n",
+    "\n",
+    "\n",
+    "## TODO: What Bruno Mars songs did people enter?\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "source": [
+    "Come up with your own questions where you can use these functions."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "## TODO: What is your question?  Write the question then write code to answer it.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "source": [
+    "## Sorting\n",
+    "\n",
+    "There are two common ways in Python to sort a list.  One modifies the original list; the other creates a new sorted list and leaves the original unmodified.\n",
+    "\n",
+    "- The `sorted()` function creates a new sorted list and leaves the original list unmodified.\n",
+    "- The `.sort()` method sorts the original list, mutating it.\n",
+    "\n",
+    "```python\n",
+    "x = [2, 4, 1]\n",
+    "y = sorted(x)\n",
+    "print(x)\n",
+    "print(y)\n",
+    "```\n",
+    "```\n",
+    "[2, 4, 1]\n",
+    "[1, 2, 4]\n",
+    "```\n",
+    "\n",
+    "```python\n",
+    "x = [2, 4, 1]\n",
+    "y = x.sort()\n",
+    "print(x)\n",
+    "print(y)\n",
+    "```\n",
+    "```\n",
+    "[1, 2, 4]\n",
+    "None\n",
+    "```\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "editable": true,
+    "slideshow": {
+     "slide_type": ""
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "# TODO Sort using sorted() function\n",
+    "\n",
+    "x = [2, 4, 1]\n",
+    "y = ...\n",
+    "print(x)\n",
+    "print(y)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# TODO Sort using sort() method\n",
+    "x = [2, 4, 1]\n",
+    "y = ...\n",
+    "\n",
+    "print(x)\n",
+    "print(y)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## You Try It\n",
+    "\n",
+    "Use sorting and the functions above to find:\n",
+    "\n",
+    "- A sorted list of songs from those who run and are over 20\n",
+    "- A sorted list of majors of the procrastinators, removing duplicates"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "##TODO: Sorted list of songs from runners over 20\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "##TODO: Sorted list of majors by procrastinators -- no duplicates\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.12"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
+%% Cell type:markdown id: tags:
+## Announcements
+### CS 220 Enrichment Activities
+Students interested in working on a real-world data set and learning about the full data processing pipeline?
+Voluntary working groups will learn about data management, data wrangling/processing, modeling, and reporting/communication skills.
+**When: Thursday, March 6th @ 4pm**
+**Where: Computer Science Room 1325**
+### Resources To Improve In The Course
+* **CS 220 Office Hours** -- As I'm sure you know, you can go to the [course office hours](https://sites.google.com/wisc.edu/cs220-oh-sp25/) to get help with labs and projects.
+* **CS Learning Center** -- Offers free [small group tutoring](https://www.cs.wisc.edu/computer-sciences-learning-center-cslc/), not for debugging your programs, but to talk about course concepts.
+* **Undergraduate Learning Center** -- Provides tutoring and [academic support](https://engineering.wisc.edu/student-services/undergraduate-learning-center/).  They have [drop-in tutoring](https://intranet.engineering.wisc.edu/undergraduate-students/ulc/drop-in-tutoring/).
+%% Cell type:markdown id: tags:
+## Warmup
+%% Cell type:code id: tags:
+``` python
+# Warmup 0: Hotkeys
+# We move quickly, it's good to know some hotkeys!
+# All-around good-to-knows...
+#  Ctrl+A: Select all the text in a cell.
+#  Ctrl+C: Copy selected text.
+#  Ctrl+X: Cut selected text.
+#  Ctrl+V: Paste text from clipboard.
+#  Ctrl+S: Save.
+# Jupyter-specific good-to-knows...
+#  Ctrl+Enter: Run Cell
+#  Ctrl+/: Comment/uncomment sections of code.
+#  Esc->A: Insert cell above
+#  Esc->B: Insert cell below
+#  Esc->Shift+L: Toggle line numbers (not working on my machine, but can look under View->Show Line Numbers).
+```
+%% Cell type:code id: tags:
+``` python
+# Warmup 1: Empty List
+weekend_plans = []  # I have no weekend plans :(
+print(weekend_plans)
+# TODO add three things to your weekend plans using .append
+print(weekend_plans)
+# TODO add three things to your weekend using .extend
+print(weekend_plans)
+```
+%% Cell type:code id: tags:
+``` python
+# Warmup 2: Tic-Tac-Toe
+board = []
+# TODO using .append(), add three lists of row data for tic-tac-toe (Noughts and Crosses)
+# make up the placement of X's and O's.
+# TODO now use nested loops to print the board
+# TODO print out the center value using double indexing
+print(board[1][1])
+```
+%% Cell type:markdown id: tags:
+# List Practice
+**Readings**
+- Optional: [Set Data Type](https://docs.python.org/3.10/library/stdtypes.html#set-types-set-frozenset)
+- Optional: [W3Schools on Set Data Type](https://www.w3schools.com/python/python_sets.asp)
+**Objectives**
+- Understand and use the `set` data type for removing duplicates
+- Create helper functions for filtering data
+- Use the `.sort()` method and the `sorted()` function, and understand the difference between them.
+%% Cell type:markdown id: tags:
+## Set Data Type
+Another data type similar to a list is a set.  To create a set you use curly braces instead of square brackets.  Sets cannot contain duplicate items.
+```python
+    my_set = {2, 3, 3, 4}
+    print(my_set)
+```
+```
+    {2, 3, 4}
+```
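+One common use for sets is to remove duplicates from a list.  You can do this by converting the list to a set and then converting it back to a list.  Because a set is unordered, the items may come back in a different order than they started in.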
+```python
+    groceries = ['apples','oranges','kiwis','apples']
+    groceries = list(set(groceries))
+    print(groceries)
+```
+```
+    ['apples', 'oranges', 'kiwis']
+```
+You can read more about sets and the methods that you can use with them at [W3Schools](https://www.w3schools.com/python/python_sets.asp).
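As a small sketch of the deduplication idea above (the names `unique_items` and `unique_sorted` are made up for illustration): since a set is unordered, one way to get a predictable result after removing duplicates is to pass the set to `sorted()`, which returns a new list.

```python
groceries = ['apples', 'oranges', 'kiwis', 'apples']

# Converting to a set drops the duplicate 'apples'.
unique_items = set(groceries)
print(len(groceries), len(unique_items))  # 4 3

# sorted() accepts any iterable and returns a new list, so the
# result is duplicate-free and in alphabetical order.
unique_sorted = sorted(unique_items)
print(unique_sorted)  # ['apples', 'kiwis', 'oranges']
```

Whether you need the sorted form depends on the question; if order does not matter, `list(set(groceries))` is enough.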
+### You Try It
+Use the `.add()` method on the set below to add at least 4 weekend plans, one of them a duplicate.  Then use the `.discard()` method to remove one of the weekend plans.
+%% Cell type:code id: tags:
+``` python
+# A set is similar to a list.  However, it is unordered and its items are unique.
+# The method names are also a little different!
+weekend_plans = set() # creates an empty set
+## TODO: use .add() to add 4 items to weekend_plans with one being a duplicate
+print(weekend_plans)
+# TODO: use .discard() to remove one of the items from the set
+# Unlike a list's remove, this will not throw an error if DNE (does not exist).
+print(weekend_plans)
+```
+%% Cell type:markdown id: tags:
+## Helper Functions
+Last class we created the `cell()` function to help with selecting values from the survey data.  Let's take this a step further today and create a range of helper functions that we can use to answer more challenging questions.  Investing time in good helper functions makes these challenging questions faster to tackle, and the functions are flexible enough that you can reuse them to answer a variety of questions.
+First, let's create our `process_csv()` function to help with loading the data and then split the data into the header and data portions.  Finish the code in the cells below:
+%% Cell type:code id: tags:
+``` python
+import csv
+# source:  Automate the Boring Stuff with Python Ch 12
+def process_csv(filename):
+    exampleFile = open(filename, encoding="utf-8")
+    exampleReader = csv.reader(exampleFile)
+    exampleData = list(exampleReader)
+    exampleFile.close()
+    return exampleData
+```
+%% Cell type:code id: tags:
+``` python
+# TODO: Separate the data into 2 parts...
+# a header row, and a list of data rows
+cs220_csv = process_csv('cs220_survey_data.csv')
+cs220_header = ...
+cs220_data = ...
+```
+%% Cell type:markdown id: tags:
+Let's make the `cell()` function again, but this time let's make it a little more robust.  If you recall, we filtered out empty values and returned `None` instead of an empty string.  We also converted the `Age` column's values to `int` (and filtered out any value where a decimal point was found).
+Let's take that a step further and think about every column: which values we would want to filter out and return as `None`, and which data type we would want to return.  The `Latitude` and `Longitude` columns contain values that would best be treated as floats.  Can you think of any other changes to data types or specific values that should be filtered out?
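One possible way to think about the per-column conversion is sketched here as a standalone function. The name `convert_value` and the `Song` column used in the calls are assumptions for illustration only; the notebook's own `cell()` function looks values up by row index and may handle more cases than this.

```python
def convert_value(col_name, val):
    """Hypothetical helper: convert one raw CSV string based on its column."""
    if val == "":
        return None              # empty answers become None
    if col_name == "Age":
        if "." in val:
            return None          # ages with a decimal point are filtered out
        return int(val)
    if col_name in ("Latitude", "Longitude"):
        return float(val)
    return val                   # every other column stays a string

print(convert_value("Age", "19"))            # 19
print(convert_value("Age", "19.5"))          # None
print(convert_value("Latitude", "43.0731"))  # 43.0731
print(convert_value("Song", ""))             # None
```

This sketch still raises a `ValueError` on text that is not a number at all (for example an age of `"abc"`), so deciding how to treat such values is part of the design.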
+Make those changes to the cell function below:
+%% Cell type:code id: tags:
+``` python
+# Remember the improved cell function
+def cell(row_idx, col_name):
+    col_idx = cs220_header.index(col_name)
+    val = cs220_data[row_idx][col_idx]
+    if val == "":
+        return None
+    elif col_name == "Age":
+        return int(val)
+    ##TODO add an elif for converting latitude and longitude to float
+    else:
+        return val
+```
+%% Cell type:markdown id: tags:
+Let's test our improved `cell()` function and make sure it is working properly.
+Since the `Age`, `Latitude` and `Longitude` should now all be numbers, let's loop through the data and check the return type for these columns.  Finish the code in the cell below.
+%% Cell type:code id: tags:
+``` python
+for i in range(len(cs220_data)):
+    age = cell(i,'Age')
+    if age == None or type(age) == int:
+        pass
+    else:
+        print("found an age which is the wrong type:",age)
+    ##TODO add checks for latitude and longitude
+```
+%% Cell type:markdown id: tags:
+### Helper Function: Getting smallest value in column
+Now let's think about functions as tools.  What tool would help us answer the questions we might have?  One kind of question involves finding the smallest or largest value in a column.
+And if we think for a moment about the design of the function, we can make it very flexible.  In other words it might help with multiple possible questions.  Some possible questions involving the smallest value in a column might be:
+* What is the age of the youngest person who answered the survey?
+* What is the age of the youngest person in your lecture?
+* What song did the youngest person enter?
+All of these questions involve a youngest person.  But notice two things:
+* Knowing the index of this person allows us to find out other properties about the person
+* Working with a subset of the data (e.g. youngest in your lecture) is a common practice
+To handle these different ways of looking for the youngest, we can create our function like so:
+```python
+def get_smallest(col_name, indexes=None):
+    """Returns the index of the smallest value
+    for the column with col_name, looking only
+    at the rows in indexes.  If indexes is None
+    then looks at all rows."""
+```
+Notice that the function returns the index, not the value.  The function also has an optional `indexes` parameter which can be used if you already have a subset of the rows you want to work with.
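To illustrate why returning an index with an optional `indexes` parameter is flexible, here is a minimal sketch on a plain list. The function name `index_of_smallest` and the toy `ages` list are made up for this example; the notebook's `get_smallest()` takes a column name and reads values through `cell()` instead.

```python
# Toy data standing in for one column of the survey (made up).
ages = [21, None, 18, 19, 18]

def index_of_smallest(values, indexes=None):
    """Return the index of the smallest non-None value in values,
    looking only at the positions in indexes (all positions if None).
    Returns None if there is nothing to compare."""
    if indexes is None:
        indexes = range(len(values))
    smallest_value = None
    smallest_index = None
    for i in indexes:
        val = values[i]
        if val is None:
            continue  # skip missing answers
        if smallest_value is None or val < smallest_value:
            smallest_value = val
            smallest_index = i
    return smallest_index

print(index_of_smallest(ages))          # 2 (the first 18)
print(index_of_smallest(ages, [0, 3]))  # 3 (19 is smaller than 21)
```

Because the result is an index, the same call can be followed by a lookup in any other column for that row.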
+Finish writing the function below.
+%% Cell type:code id: tags:
+``` python
+def get_smallest(col_name, indexes=None):
+    """Returns the index of the smallest value
+    for the column with col_name, looking only
+    at the rows in indexes.  If indexes is None
+    then looks at all rows."""
+    if indexes == None:
+        indexes = list(range(0,len(cs220_data)))
+    smallest_value = None
+    smallest_index = 0
+    for i in indexes:
+        val = cell(i,col_name)
+        if val == None:
+            continue
+        ##TODO FINISH Function
+    return smallest_index
+```
+%% Cell type:markdown id: tags:
+Okay, if we have written our function well, we can now use it to answer the questions we have.  Use `get_smallest()` to answer the questions in the cell below.
+%% Cell type:code id: tags:
+``` python
+##TODO What is the age of the youngest person who answered the survey?
+##TODO What song did the youngest person enter?
+```
+%% Cell type:markdown id: tags:
+### Helper Function: Filter rows by a column that matches a value
+Notice we didn't ask one question -- What is the age of the youngest person in your lecture?  To answer this we first need to get a list of just the rows that are for your lecture.
+Let's create a filter function that returns a list of the indexes that meet some matching criteria.  We can still use the optional `indexes` parameter idea.
+```python
+def filter_match(col_name,col_value,indexes=None):
+    """returns a subset of indexes where the
+    col_name has a value of col_value.  If indexes
+    is None then looks at all rows"""
+```
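As a rough illustration of the filtering pattern, here is a sketch on a toy list. The names `lectures` and `indexes_matching` and the values such as `"LEC001"` are assumptions made up for this example, not taken from the survey data.

```python
# Hypothetical toy column of lecture sections.
lectures = ["LEC001", "LEC002", "LEC001", "LEC003", "LEC001"]

def indexes_matching(values, target, indexes=None):
    """Return the subset of indexes whose value equals target.
    If indexes is None, every position in values is checked."""
    if indexes is None:
        indexes = range(len(values))
    matches = []
    for i in indexes:
        if values[i] == target:
            matches.append(i)
    return matches

in_lec001 = indexes_matching(lectures, "LEC001")
print(in_lec001)       # [0, 2, 4]
print(len(in_lec001))  # 3

# The optional indexes argument narrows an existing subset further.
print(indexes_matching(lectures, "LEC001", indexes=[0, 1, 2]))  # [0, 2]
```

Since the result is itself a list of indexes, it can be passed as the `indexes` argument of another helper to chain filters.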
+Finish the code in the cell below.
+%% Cell type:code id: tags:
+``` python
+def filter_match(col_name,col_value,indexes=None):
+    if indexes is None:
+        indexes = list(range(len(cs220_data)))
+    ret_value = []
+    for i in indexes:
+        val = cell(i,col_name)
+        ##TODO: Finish function
+    return ret_value
+```
+%% Cell type:markdown id: tags:
+Okay, use this `filter_match()` function, perhaps combined with `get_smallest()`, to answer the questions in the cell below.
+%% Cell type:code id: tags:
+``` python
+## TODO how many people are in your lecture who filled out the survey?
+## TODO What is the age of the youngest person in your lecture?
+```
+%% Cell type:markdown id: tags:
+### Helper Function: Filter rows by a column that contains a value
+Exact matching is only one way that we might want to filter our data.  Another way is to check whether a column contains a particular value as a portion of what was entered.  For example, how many primary majors are Engineering majors?  Since the field is a string and "Engineering" is only a portion of the value, we really want to see if the field contains the value.
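A case-insensitive "contains" check can be sketched with the `in` operator on lowercased strings. The `majors` list and the name `indexes_containing` below are hypothetical and exist only for this illustration.

```python
# Hypothetical toy column of majors; None marks a missing answer.
majors = [
    "Mechanical Engineering",
    "Computer Science",
    None,
    "engineering: Industrial",
    "Data Science",
]

def indexes_containing(values, fragment, indexes=None):
    """Return the subset of indexes whose string value contains
    fragment, ignoring case. Missing (None) values are skipped."""
    if indexes is None:
        indexes = range(len(values))
    fragment = fragment.lower()
    matches = []
    for i in indexes:
        if values[i] is None:
            continue
        # Lowercase both sides so "Engineering" matches "engineering".
        if fragment in values[i].lower():
            matches.append(i)
    return matches

print(indexes_containing(majors, "Engineering"))  # [0, 3]
```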
+Finish the `filter_contains()` function below.
+%% Cell type:code id: tags:
+``` python
+def filter_contains(col_name,col_value,indexes=None):
+    """returns a subset of indexes where the column
+    col_name has values that are strings and col_value is
+    a portion of the value (case insensitive)
+    """
+    if indexes is None:
+        indexes = list(range(len(cs220_data)))
+    ret_value = []
+    col_value = col_value.lower()
+    for i in indexes:
+        val = cell(i,col_name)
+        if val is None:
+            continue
+        val = val.lower()
+        ##TODO FINISH FUNCTION
+    return ret_value
+```
+%% Cell type:markdown id: tags:
+With our functions we can answer some rather challenging questions.
+%% Cell type:code id: tags:
+``` python
+## TODO: How many people in your lecture are majoring in Engineering?
+## TODO: What Bruno Mars songs did people enter?
+```
+%% Cell type:markdown id: tags:
+Come up with your own questions where you can use these functions.
+%% Cell type:code id: tags:
+``` python
+## TODO: What is your question?  Write the question then write code to answer it.
+```
+%% Cell type:markdown id: tags:
+## Sorting
+There are two common ways to sort a list in Python.  One modifies the original list; the other creates a new sorted list and leaves the original unmodified.
+- The `sorted()` function creates a new sorted list and leaves the original list unmodified.
+- The `.sort()` method sorts the original list, mutating it.
+```python
+x = [2, 4, 1]
+y = sorted(x)
+print(x)
+print(y)
+```
+```
+[2, 4, 1]
+[1, 2, 4]
+```
+```python
+x = [2, 4, 1]
+y = x.sort()
+print(x)
+print(y)
+```
+```
+[1, 2, 4]
+None
+```
+%% Cell type:code id: tags:
+``` python
+# TODO Sort using sorted() function
+x = [2, 4, 1]
+y = ...
+print(x)
+print(y)
+```
+%% Cell type:code id: tags:
+``` python
+# TODO Sort using sort() method
+x = [2, 4, 1]
+y = ...
+print(x)
+print(y)
+```
+%% Cell type:markdown id: tags:
+## You Try It
+Use sorting and the functions above to find:
+- A sorted list of songs from those who run and are over 20
+- A sorted list of majors of the procrastinators, removing duplicates
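One possible way to get a sorted list without duplicates is to combine `sorted()` with `set()`. This is a sketch on made-up data (the `majors` list is hypothetical) and assumes sets are fair game at this point in the course; a loop that appends only values `not in` a result list would work as well.

```python
# Hypothetical toy list with repeated entries.
majors = ["Math", "CS", "Math", "Biology", "CS"]

# set() drops duplicates; sorted() returns a new sorted list.
unique_sorted = sorted(set(majors))

print(unique_sorted)  # ['Biology', 'CS', 'Math']
print(majors)         # ['Math', 'CS', 'Math', 'Biology', 'CS'] (unchanged)
```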
+%% Cell type:code id: tags:
+``` python
+##TODO: Sorted list of songs from runners over 20
+```
+%% Cell type:code id: tags:
+``` python
+##TODO: Sorted list of majors by procrastinators -- no duplicates
+```
--- a/s25/Louis_Lecture_Notes/16_List_Practice/Lec_16_List_Practice_Solution.ipynb
+++ b/s25/Louis_Lecture_Notes/16_List_Practice/Lec_16_List_Practice_Solution.ipynb
--- a/s25/Louis_Lecture_Notes/16_List_Practice/Lec_16_Worksheet_Solution.ipynb
+++ b/s25/Louis_Lecture_Notes/16_List_Practice/Lec_16_Worksheet_Solution.ipynb
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Lecture 16 worksheet answers\n",
+    "# https://www.msyamkumar.com/cs220/s22/materials/lec-16-worksheet.pdf\n",
+    "# The purpose of this worksheet is to prepare you for exam questions.\n",
+    "# You should do the worksheet by hand, then check your work.\n",
+    "\n",
+    "# If you have questions please make a public post on Piazza and include the Given\n",
+    "# Students, feel free to answer each other's questions"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Problem 1 Given:\n",
+    "nums = [100, 2, 3, 40, 99]\n",
+    "words = [\"three\", \"two\", \"one\"]\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "99\n",
+      "[2, 3]\n",
+      "two\n",
+      "w\n",
+      "www\n",
+      "\n",
+      "1\n",
+      "2\n",
+      "[100, 'three']\n",
+      "three,two,one\n",
+      "e,t\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Problem 1 answers\n",
+    "print(nums[-1])\n",
+    "print(nums[1:3])\n",
+    "print(words[1])\n",
+    "print(words[1][1])\n",
+    "print(words[1][-2] * nums[2])\n",
+    "print()\n",
+    "print(words.index(\"two\"))\n",
+    "print(nums[words.index(\"two\")])\n",
+    "print(nums[:1] + words[:1])\n",
+    "print(\",\".join(words))\n",
+    "print((\",\".join(words))[4:7])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Problem 2 Given:\n",
+    "rows = [[\"x\", \"y\",\"name\"], [3,4,\"Alice\"], [9,1,\"Bob\"], [-3,4,\"Cindy\"]]\n",
+    "header = rows[0]\n",
+    "data = rows[1:]\n",
+    "X = 0\n",
+    "Y = 1\n",
+    "NAME = 2\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "4\n",
+      "3\n",
+      "3\n",
+      "Alice\n",
+      "Bob\n",
+      "\n",
+      "2\n",
+      "Cindy\n",
+      "3.0\n",
+      "5.0\n",
+      "Alice\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Problem 2 answers\n",
+    "print(len(rows))\n",
+    "print(len(data))\n",
+    "print(len(header))\n",
+    "print(rows[1][-1])\n",
+    "print(data[1][-1])\n",
+    "print()\n",
+    "print(header.index(\"name\"))\n",
+    "print(data[-1][header.index(\"name\")])\n",
+    "print((data[0][X] + data[1][X] + data[2][X]) / 3)\n",
+    "print((data[-1][X] ** 2 + data[-1][Y] ** 2) ** 0.5)\n",
+    "print(min(data[0][NAME], data[1][NAME], data[2][NAME]))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Problem 3 Given:\n",
+    "rows = [ [\"Food Science\", \"24000\", \"0.049188446\", \"62000\"],\n",
+    "         [\"CS\", \"783000\", \"0.049518657\", \"78000\"],\n",
+    "         [\"Microbiology\", \"70000\", \"0.050880749\", \"60000\"],\n",
+    "         [\"Math\", \"433000\", \"0.05293608\", \"66000\"] ]\n",
+    "hd = [\"major\", \"students\", \"unemployed\", \"salary\"]\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CS\n",
+      "433000\n",
+      "True\n",
+      "2400070000\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Problem 3 answers\n",
+    "print(rows[1][0])\n",
+    "print(rows[3][hd.index(\"students\")])\n",
+    "print(len(hd) == len(rows[1]))\n",
+    "print(rows[0][1] + rows[2][1])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Problem 4 Given:\n",
+    "rows = [ [\"city\", \"state\", \"y14\", \"y15\"],\n",
+    "         [\"Chicago\", \"Illinois\", \"411\", \"478\"],\n",
+    "         [\"Milwaukee\", \"Wisconsin\", \"90\", \"145\"],\n",
+    "         [\"Detroit\", \"Michigan\", \"298\", \"295\"] ]\n",
+    "hd = rows[0]\n",
+    "rows = rows[1:]  #this removes the header and stores the result in rows\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Chicago\n",
+      "411\n",
+      "False\n",
+      "Detroit, Michigan\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Problem 4 answers:\n",
+    "print(rows[0][hd.index(\"city\")])\n",
+    "print(rows[0][hd.index(\"y14\")])\n",
+    "print(rows[2][hd.index(\"y14\")] < rows[2][hd.index(\"y15\")])\n",
+    "print(\", \".join(rows[-1][:2]))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Problem 5 Given:\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
+%% Cell type:code id: tags:
+``` python
+# Lecture 16 worksheet answers
+# https://www.msyamkumar.com/cs220/s22/materials/lec-16-worksheet.pdf
+# The purpose of this worksheet is to prepare you for exam questions.
+# You should do the worksheet by hand, then check your work.
+# If you have questions please make a public post on Piazza and include the Given
+# Students, feel free to answer each other's questions
+```
+%% Cell type:code id: tags:
+``` python
+# Problem 1 Given:
+nums = [100, 2, 3, 40, 99]
+words = ["three", "two", "one"]
+```
+%% Cell type:code id: tags:
+``` python
+# Problem 1 answers
+print(nums[-1])
+print(nums[1:3])
+print(words[1])
+print(words[1][1])
+print(words[1][-2] * nums[2])
+print()
+print(words.index("two"))
+print(nums[words.index("two")])
+print(nums[:1] + words[:1])
+print(",".join(words))
+print((",".join(words))[4:7])
+```
+%% Output
+    99
+    [2, 3]
+    two
+    w
+    www
+    1
+    2
+    [100, 'three']
+    three,two,one
+    e,t
+%% Cell type:code id: tags:
+``` python
+```
+%% Cell type:code id: tags:
+``` python
+# Problem 2 Given:
+rows = [["x", "y","name"], [3,4,"Alice"], [9,1,"Bob"], [-3,4,"Cindy"]]
+header = rows[0]
+data = rows[1:]
+X = 0
+Y = 1
+NAME = 2
+```
+%% Cell type:code id: tags:
+``` python
+# Problem 2 answers
+print(len(rows))
+print(len(data))
+print(len(header))
+print(rows[1][-1])
+print(data[1][-1])
+print()
+print(header.index("name"))
+print(data[-1][header.index("name")])
+print((data[0][X] + data[1][X] + data[2][X]) / 3)
+print((data[-1][X] ** 2 + data[-1][Y] ** 2) ** 0.5)
+print(min(data[0][NAME], data[1][NAME], data[2][NAME]))
+```
+%% Output
+    4
+    3
+    3
+    Alice
+    Bob
+    2
+    Cindy
+    3.0
+    5.0
+    Alice
+%% Cell type:code id: tags:
+``` python
+```
+%% Cell type:code id: tags:
+``` python
+# Problem 3 Given:
+rows = [ ["Food Science", "24000", "0.049188446", "62000"],
+         ["CS", "783000", "0.049518657", "78000"],
+         ["Microbiology", "70000", "0.050880749", "60000"],
+         ["Math", "433000", "0.05293608", "66000"] ]
+hd = ["major", "students", "unemployed", "salary"]
+```
+%% Cell type:code id: tags:
+``` python
+# Problem 3 answers
+print(rows[1][0])
+print(rows[3][hd.index("students")])
+print(len(hd) == len(rows[1]))
+print(rows[0][1] + rows[2][1])
+```
+%% Output
+    CS
+    433000
+    True
+    2400070000
+%% Cell type:code id: tags:
+``` python
+```
+%% Cell type:code id: tags:
+``` python
+# Problem 4 Given:
+rows = [ ["city", "state", "y14", "y15"],
+         ["Chicago", "Illinois", "411", "478"],
+         ["Milwaukee", "Wisconsin", "90", "145"],
+         ["Detroit", "Michigan", "298", "295"] ]
+hd = rows[0]
+rows = rows[1:]  #this removes the header and stores the result in rows
+```
+%% Cell type:code id: tags:
+``` python
+# Problem 4 answers:
+print(rows[0][hd.index("city")])
+print(rows[0][hd.index("y14")])
+print(rows[2][hd.index("y14")] < rows[2][hd.index("y15")])
+print(", ".join(rows[-1][:2]))
+```
+%% Output
+    Chicago
+    411
+    False
+    Detroit, Michigan
+%% Cell type:code id: tags:
+``` python
+```
+%% Cell type:code id: tags:
+``` python
+# Problem 5 Given:
+```
--- a/s25/Louis_Lecture_Notes/16_List_Practice/cs220_survey_data.csv
+++ b/s25/Louis_Lecture_Notes/16_List_Practice/cs220_survey_data.csv