diff --git a/f23/Gurmail_Lecture_Notes/37N_Advanced_pandas_topics/lec_37_pandas3_data_transformation.ipynb b/f23/Gurmail_Lecture_Notes/37N_Advanced_pandas_topics/lec_37_pandas3_data_transformation.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..01977fb1f4ff63c7c57d52a91272fd60a7f72978
--- /dev/null
+++ b/f23/Gurmail_Lecture_Notes/37N_Advanced_pandas_topics/lec_37_pandas3_data_transformation.ipynb
@@ -0,0 +1,4560 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Announcements - Wednesday, December 6\n",
+    "* Download ALL files for today's lecture\n",
+    "* Q10 Released tonight at 5 pm\n",
+    "* <b>If you have any problem with P8-P11 grades, please send me (Gurmail.Singh@wisc.edu) an email by December 11.</b>\n",
+    "* Late days may not be used on P13\n",
+    "* If you have questions, it is almost always faster to \n",
+    "  * Post on Piazza\n",
+    "  * Go to [office hours](https://sites.google.com/wisc.edu/cs220-oh-f23/home?pli=1) \n",
+    "### Conflict Form\n",
+    "  * [Final - December 19, 7:45 am](https://cs220.cs.wisc.edu/f23/surveys.html)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "RHvDCo4fhXBx"
+   },
+   "source": [
+    "# Lecture 37 Pandas 3: Data Transformation\n",
+    "* Data transformation is the process of changing the format, structure, or values of data. \n",
+    "* Often needed during data cleaning and sometimes during data analysis"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "yoLGptrqhbBo"
+   },
+   "source": [
+    "# Today's Learning Objectives: \n",
+    "\n",
+    "* Set a column as the index of a pandas `DataFrame`\n",
+    "* Identify, drop, or fill missing values (`np.NaN`) using pandas `isna`, `dropna`, and `fillna`\n",
+    "* Apply transformations to a `DataFrame`:\n",
+    "  * Use `apply` on a pandas `Series` to apply a transformation function\n",
+    "  * Use `replace` to replace all target values in pandas `Series` and `DataFrame` rows / columns\n",
+    "* Filter, aggregate, group, and summarize information in a `DataFrame` with `groupby`\n",
+    "* Convert `groupby` examples to SQL\n",
+    "* Solve the same question using SQL and pandas `DataFrame` manipulations:\n",
+    "  * filtering, grouping, and aggregation / summarization"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {
+    "id": "CeWtFirwteFY"
+   },
+   "outputs": [],
+   "source": [
+    "# known import statements\n",
+    "import pandas as pd\n",
+    "import sqlite3 as sql # note that we are renaming to sql\n",
+    "import os\n",
+    "\n",
+    "# new import statement\n",
+    "import numpy as np"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "FgnTeNRIswsm"
+   },
+   "source": [
+    "# The dataset: Spotify songs\n",
+    "Adapted from https://www.kaggle.com/datasets/mrmorj/dataset-of-songs-in-spotify.\n",
+    "\n",
+    "If you are interested in digging deeper into this dataset, here's a [blog post](https://medium.com/@boplantinga/what-do-spotifys-audio-features-tell-us-about-this-year-s-eurovision-song-contest-66ad188e112a) that explains each column in detail."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### WARMUP 1: Establish a connection to the spotify.db database"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 232
+    },
+    "id": "8y9scvgCnTHl",
+    "outputId": "c72388f8-576c-4cf2-ef51-352cd11b6c92"
+   },
+   "outputs": [],
+   "source": [
+    "# open up the spotify database\n",
+    "db_pathname = \"spotify.db\"\n",
+    "assert os.path.exists(db_pathname)\n",
+    "conn = sql.connect(db_pathname)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def qry(query):\n",
+    "    return pd.read_sql(query, conn)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### WARMUP 2: Identify the table name(s) inside the database"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 112
+    },
+    "id": "ybTqbDSOnR2f",
+    "outputId": "8dcc943b-9382-4abb-ef78-6c6d56ad89eb"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>type</th>\n",
+       "      <th>name</th>\n",
+       "      <th>tbl_name</th>\n",
+       "      <th>rootpage</th>\n",
+       "      <th>sql</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>table</td>\n",
+       "      <td>spotify</td>\n",
+       "      <td>spotify</td>\n",
+       "      <td>1527</td>\n",
+       "      <td>CREATE TABLE spotify(\\nid TEXT PRIMARY KEY,\\nt...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>index</td>\n",
+       "      <td>sqlite_autoindex_spotify_1</td>\n",
+       "      <td>spotify</td>\n",
+       "      <td>1528</td>\n",
+       "      <td>None</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "    type                        name tbl_name  rootpage  \\\n",
+       "0  table                     spotify  spotify      1527   \n",
+       "1  index  sqlite_autoindex_spotify_1  spotify      1528   \n",
+       "\n",
+       "                                                 sql  \n",
+       "0  CREATE TABLE spotify(\\nid TEXT PRIMARY KEY,\\nt...  \n",
+       "1                                               None  "
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df = qry(\"SELECT * FROM sqlite_master\")\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### WARMUP 3: Use a pandas lookup expression to extract the \"sql\" column, then use `.iloc` to display the full `CREATE TABLE` statement"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CREATE TABLE spotify(\n",
+      "id TEXT PRIMARY KEY,\n",
+      "title BLOB,\n",
+      "song_name BLOB, \n",
+      "genre TEXT,\n",
+      "duration_ms INTEGER, \n",
+      "key INTEGER, \n",
+      "mode INTEGER, \n",
+      "time_signature INTEGER, \n",
+      "tempo REAL,\n",
+      "acousticness REAL, \n",
+      "danceability REAL, \n",
+      "energy REAL, \n",
+      "instrumentalness REAL, \n",
+      "liveness REAL, \n",
+      "loudness REAL, \n",
+      "speechiness REAL, \n",
+      "valence REAL)\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(df[\"sql\"].iloc[0])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### WARMUP 4: Store the data from the `spotify` table in a variable called `df`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 632
+    },
+    "id": "txAH9OIjnoQv",
+    "outputId": "ac9152ba-32df-4fb2-d4e0-a97f50fe58fb"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>id</th>\n",
+       "      <th>title</th>\n",
+       "      <th>song_name</th>\n",
+       "      <th>genre</th>\n",
+       "      <th>duration_ms</th>\n",
+       "      <th>key</th>\n",
+       "      <th>mode</th>\n",
+       "      <th>time_signature</th>\n",
+       "      <th>tempo</th>\n",
+       "      <th>acousticness</th>\n",
+       "      <th>danceability</th>\n",
+       "      <th>energy</th>\n",
+       "      <th>instrumentalness</th>\n",
+       "      <th>liveness</th>\n",
+       "      <th>loudness</th>\n",
+       "      <th>speechiness</th>\n",
+       "      <th>valence</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>7pgJBLVz5VmnL7uGHmRj6p</td>\n",
+       "      <td></td>\n",
+       "      <td>Pathology</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>224427</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>115.080</td>\n",
+       "      <td>0.401000</td>\n",
+       "      <td>0.719</td>\n",
+       "      <td>0.493</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.1180</td>\n",
+       "      <td>-7.230</td>\n",
+       "      <td>0.0794</td>\n",
+       "      <td>0.1240</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>0vSWgAlfpye0WCGeNmuNhy</td>\n",
+       "      <td></td>\n",
+       "      <td>Symbiote</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>98821</td>\n",
+       "      <td>5</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>218.050</td>\n",
+       "      <td>0.013800</td>\n",
+       "      <td>0.850</td>\n",
+       "      <td>0.893</td>\n",
+       "      <td>0.000004</td>\n",
+       "      <td>0.3720</td>\n",
+       "      <td>-4.783</td>\n",
+       "      <td>0.0623</td>\n",
+       "      <td>0.0391</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>7EL7ifncK2PWFYThJjzR25</td>\n",
+       "      <td></td>\n",
+       "      <td>BRAINFOOD</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>101172</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>189.938</td>\n",
+       "      <td>0.187000</td>\n",
+       "      <td>0.864</td>\n",
+       "      <td>0.365</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.1160</td>\n",
+       "      <td>-10.219</td>\n",
+       "      <td>0.0655</td>\n",
+       "      <td>0.0478</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>1umsRbM7L4ju7rn9aU8Ju6</td>\n",
+       "      <td></td>\n",
+       "      <td>Sacrifice</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>96062</td>\n",
+       "      <td>10</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>139.990</td>\n",
+       "      <td>0.145000</td>\n",
+       "      <td>0.767</td>\n",
+       "      <td>0.576</td>\n",
+       "      <td>0.000003</td>\n",
+       "      <td>0.0968</td>\n",
+       "      <td>-9.683</td>\n",
+       "      <td>0.2560</td>\n",
+       "      <td>0.1870</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>4SKqOHKYU5pgHr5UiVKiQN</td>\n",
+       "      <td></td>\n",
+       "      <td>Backpack</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>135079</td>\n",
+       "      <td>5</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>128.014</td>\n",
+       "      <td>0.007700</td>\n",
+       "      <td>0.765</td>\n",
+       "      <td>0.726</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.6190</td>\n",
+       "      <td>-5.580</td>\n",
+       "      <td>0.1910</td>\n",
+       "      <td>0.2700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>...</th>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35872</th>\n",
+       "      <td>46bXU7Sgj7104ZoXxzz9tM</td>\n",
+       "      <td>Euphoric Hardstyle</td>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>269208</td>\n",
+       "      <td>4</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>150.013</td>\n",
+       "      <td>0.031500</td>\n",
+       "      <td>0.528</td>\n",
+       "      <td>0.693</td>\n",
+       "      <td>0.000345</td>\n",
+       "      <td>0.1210</td>\n",
+       "      <td>-5.148</td>\n",
+       "      <td>0.0304</td>\n",
+       "      <td>0.3940</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35873</th>\n",
+       "      <td>0he2ViGMUO3ajKTxLOfWVT</td>\n",
+       "      <td>Greatest Hardstyle Playlist</td>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>210112</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>149.928</td>\n",
+       "      <td>0.022500</td>\n",
+       "      <td>0.517</td>\n",
+       "      <td>0.768</td>\n",
+       "      <td>0.000018</td>\n",
+       "      <td>0.2050</td>\n",
+       "      <td>-7.922</td>\n",
+       "      <td>0.0479</td>\n",
+       "      <td>0.3830</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35874</th>\n",
+       "      <td>72DAt9Lbpy9EUS29OzQLob</td>\n",
+       "      <td>Best of Hardstyle 2020</td>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>234823</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>154.935</td>\n",
+       "      <td>0.026000</td>\n",
+       "      <td>0.361</td>\n",
+       "      <td>0.821</td>\n",
+       "      <td>0.000242</td>\n",
+       "      <td>0.3850</td>\n",
+       "      <td>-3.102</td>\n",
+       "      <td>0.0505</td>\n",
+       "      <td>0.1240</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35875</th>\n",
+       "      <td>6HXgExFVuE1c3cq9QjFCcU</td>\n",
+       "      <td>Euphoric Hardstyle</td>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>323200</td>\n",
+       "      <td>6</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>150.042</td>\n",
+       "      <td>0.000551</td>\n",
+       "      <td>0.477</td>\n",
+       "      <td>0.921</td>\n",
+       "      <td>0.029600</td>\n",
+       "      <td>0.0575</td>\n",
+       "      <td>-4.777</td>\n",
+       "      <td>0.0392</td>\n",
+       "      <td>0.4880</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35876</th>\n",
+       "      <td>6MAAMZImxcvYhRnxDLTufD</td>\n",
+       "      <td>Best of Hardstyle 2020</td>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>162161</td>\n",
+       "      <td>9</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>155.047</td>\n",
+       "      <td>0.001890</td>\n",
+       "      <td>0.529</td>\n",
+       "      <td>0.945</td>\n",
+       "      <td>0.000055</td>\n",
+       "      <td>0.4140</td>\n",
+       "      <td>-5.862</td>\n",
+       "      <td>0.0615</td>\n",
+       "      <td>0.1340</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "<p>35877 rows × 17 columns</p>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                           id                        title  song_name  \\\n",
+       "0      7pgJBLVz5VmnL7uGHmRj6p                               Pathology   \n",
+       "1      0vSWgAlfpye0WCGeNmuNhy                                Symbiote   \n",
+       "2      7EL7ifncK2PWFYThJjzR25                               BRAINFOOD   \n",
+       "3      1umsRbM7L4ju7rn9aU8Ju6                               Sacrifice   \n",
+       "4      4SKqOHKYU5pgHr5UiVKiQN                                Backpack   \n",
+       "...                       ...                          ...        ...   \n",
+       "35872  46bXU7Sgj7104ZoXxzz9tM           Euphoric Hardstyle              \n",
+       "35873  0he2ViGMUO3ajKTxLOfWVT  Greatest Hardstyle Playlist              \n",
+       "35874  72DAt9Lbpy9EUS29OzQLob       Best of Hardstyle 2020              \n",
+       "35875  6HXgExFVuE1c3cq9QjFCcU           Euphoric Hardstyle              \n",
+       "35876  6MAAMZImxcvYhRnxDLTufD       Best of Hardstyle 2020              \n",
+       "\n",
+       "           genre  duration_ms  key  mode  time_signature    tempo  \\\n",
+       "0      Dark Trap       224427    8     1               4  115.080   \n",
+       "1      Dark Trap        98821    5     1               4  218.050   \n",
+       "2      Dark Trap       101172    8     1               4  189.938   \n",
+       "3      Dark Trap        96062   10     0               4  139.990   \n",
+       "4      Dark Trap       135079    5     1               4  128.014   \n",
+       "...          ...          ...  ...   ...             ...      ...   \n",
+       "35872  hardstyle       269208    4     1               4  150.013   \n",
+       "35873  hardstyle       210112    0     0               4  149.928   \n",
+       "35874  hardstyle       234823    8     1               4  154.935   \n",
+       "35875  hardstyle       323200    6     0               4  150.042   \n",
+       "35876  hardstyle       162161    9     1               4  155.047   \n",
+       "\n",
+       "       acousticness  danceability  energy  instrumentalness  liveness  \\\n",
+       "0          0.401000         0.719   0.493          0.000000    0.1180   \n",
+       "1          0.013800         0.850   0.893          0.000004    0.3720   \n",
+       "2          0.187000         0.864   0.365          0.000000    0.1160   \n",
+       "3          0.145000         0.767   0.576          0.000003    0.0968   \n",
+       "4          0.007700         0.765   0.726          0.000000    0.6190   \n",
+       "...             ...           ...     ...               ...       ...   \n",
+       "35872      0.031500         0.528   0.693          0.000345    0.1210   \n",
+       "35873      0.022500         0.517   0.768          0.000018    0.2050   \n",
+       "35874      0.026000         0.361   0.821          0.000242    0.3850   \n",
+       "35875      0.000551         0.477   0.921          0.029600    0.0575   \n",
+       "35876      0.001890         0.529   0.945          0.000055    0.4140   \n",
+       "\n",
+       "       loudness  speechiness  valence  \n",
+       "0        -7.230       0.0794   0.1240  \n",
+       "1        -4.783       0.0623   0.0391  \n",
+       "2       -10.219       0.0655   0.0478  \n",
+       "3        -9.683       0.2560   0.1870  \n",
+       "4        -5.580       0.1910   0.2700  \n",
+       "...         ...          ...      ...  \n",
+       "35872    -5.148       0.0304   0.3940  \n",
+       "35873    -7.922       0.0479   0.3830  \n",
+       "35874    -3.102       0.0505   0.1240  \n",
+       "35875    -4.777       0.0392   0.4880  \n",
+       "35876    -5.862       0.0615   0.1340  \n",
+       "\n",
+       "[35877 rows x 17 columns]"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df = qry(\"SELECT * FROM spotify\")\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Setting a column as row indices for the `DataFrame`\n",
+    "\n",
+    "- Syntax: `df.set_index(\"<COLUMN>\")`\n",
+    "- Returns a new `DataFrame` rather than modifying the original (by default).\n",
+    "- WARNING: executing the cell below twice raises a `KeyError`. Once a column becomes the row index, it is no longer a column of the `DataFrame`. If this happens, re-run the cell above to reload `df`, then execute the cell below exactly once."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>title</th>\n",
+       "      <th>song_name</th>\n",
+       "      <th>genre</th>\n",
+       "      <th>duration_ms</th>\n",
+       "      <th>key</th>\n",
+       "      <th>mode</th>\n",
+       "      <th>time_signature</th>\n",
+       "      <th>tempo</th>\n",
+       "      <th>acousticness</th>\n",
+       "      <th>danceability</th>\n",
+       "      <th>energy</th>\n",
+       "      <th>instrumentalness</th>\n",
+       "      <th>liveness</th>\n",
+       "      <th>loudness</th>\n",
+       "      <th>speechiness</th>\n",
+       "      <th>valence</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>id</th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>7pgJBLVz5VmnL7uGHmRj6p</th>\n",
+       "      <td></td>\n",
+       "      <td>Pathology</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>224427</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>115.080</td>\n",
+       "      <td>0.401000</td>\n",
+       "      <td>0.719</td>\n",
+       "      <td>0.493</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.1180</td>\n",
+       "      <td>-7.230</td>\n",
+       "      <td>0.0794</td>\n",
+       "      <td>0.1240</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>0vSWgAlfpye0WCGeNmuNhy</th>\n",
+       "      <td></td>\n",
+       "      <td>Symbiote</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>98821</td>\n",
+       "      <td>5</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>218.050</td>\n",
+       "      <td>0.013800</td>\n",
+       "      <td>0.850</td>\n",
+       "      <td>0.893</td>\n",
+       "      <td>0.000004</td>\n",
+       "      <td>0.3720</td>\n",
+       "      <td>-4.783</td>\n",
+       "      <td>0.0623</td>\n",
+       "      <td>0.0391</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>7EL7ifncK2PWFYThJjzR25</th>\n",
+       "      <td></td>\n",
+       "      <td>BRAINFOOD</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>101172</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>189.938</td>\n",
+       "      <td>0.187000</td>\n",
+       "      <td>0.864</td>\n",
+       "      <td>0.365</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.1160</td>\n",
+       "      <td>-10.219</td>\n",
+       "      <td>0.0655</td>\n",
+       "      <td>0.0478</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1umsRbM7L4ju7rn9aU8Ju6</th>\n",
+       "      <td></td>\n",
+       "      <td>Sacrifice</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>96062</td>\n",
+       "      <td>10</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>139.990</td>\n",
+       "      <td>0.145000</td>\n",
+       "      <td>0.767</td>\n",
+       "      <td>0.576</td>\n",
+       "      <td>0.000003</td>\n",
+       "      <td>0.0968</td>\n",
+       "      <td>-9.683</td>\n",
+       "      <td>0.2560</td>\n",
+       "      <td>0.1870</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4SKqOHKYU5pgHr5UiVKiQN</th>\n",
+       "      <td></td>\n",
+       "      <td>Backpack</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>135079</td>\n",
+       "      <td>5</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>128.014</td>\n",
+       "      <td>0.007700</td>\n",
+       "      <td>0.765</td>\n",
+       "      <td>0.726</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.6190</td>\n",
+       "      <td>-5.580</td>\n",
+       "      <td>0.1910</td>\n",
+       "      <td>0.2700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>...</th>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>46bXU7Sgj7104ZoXxzz9tM</th>\n",
+       "      <td>Euphoric Hardstyle</td>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>269208</td>\n",
+       "      <td>4</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>150.013</td>\n",
+       "      <td>0.031500</td>\n",
+       "      <td>0.528</td>\n",
+       "      <td>0.693</td>\n",
+       "      <td>0.000345</td>\n",
+       "      <td>0.1210</td>\n",
+       "      <td>-5.148</td>\n",
+       "      <td>0.0304</td>\n",
+       "      <td>0.3940</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>0he2ViGMUO3ajKTxLOfWVT</th>\n",
+       "      <td>Greatest Hardstyle Playlist</td>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>210112</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>149.928</td>\n",
+       "      <td>0.022500</td>\n",
+       "      <td>0.517</td>\n",
+       "      <td>0.768</td>\n",
+       "      <td>0.000018</td>\n",
+       "      <td>0.2050</td>\n",
+       "      <td>-7.922</td>\n",
+       "      <td>0.0479</td>\n",
+       "      <td>0.3830</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>72DAt9Lbpy9EUS29OzQLob</th>\n",
+       "      <td>Best of Hardstyle 2020</td>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>234823</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>154.935</td>\n",
+       "      <td>0.026000</td>\n",
+       "      <td>0.361</td>\n",
+       "      <td>0.821</td>\n",
+       "      <td>0.000242</td>\n",
+       "      <td>0.3850</td>\n",
+       "      <td>-3.102</td>\n",
+       "      <td>0.0505</td>\n",
+       "      <td>0.1240</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6HXgExFVuE1c3cq9QjFCcU</th>\n",
+       "      <td>Euphoric Hardstyle</td>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>323200</td>\n",
+       "      <td>6</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>150.042</td>\n",
+       "      <td>0.000551</td>\n",
+       "      <td>0.477</td>\n",
+       "      <td>0.921</td>\n",
+       "      <td>0.029600</td>\n",
+       "      <td>0.0575</td>\n",
+       "      <td>-4.777</td>\n",
+       "      <td>0.0392</td>\n",
+       "      <td>0.4880</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6MAAMZImxcvYhRnxDLTufD</th>\n",
+       "      <td>Best of Hardstyle 2020</td>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>162161</td>\n",
+       "      <td>9</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>155.047</td>\n",
+       "      <td>0.001890</td>\n",
+       "      <td>0.529</td>\n",
+       "      <td>0.945</td>\n",
+       "      <td>0.000055</td>\n",
+       "      <td>0.4140</td>\n",
+       "      <td>-5.862</td>\n",
+       "      <td>0.0615</td>\n",
+       "      <td>0.1340</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "<p>35877 rows × 16 columns</p>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                                              title  song_name      genre  \\\n",
+       "id                                                                          \n",
+       "7pgJBLVz5VmnL7uGHmRj6p                               Pathology  Dark Trap   \n",
+       "0vSWgAlfpye0WCGeNmuNhy                                Symbiote  Dark Trap   \n",
+       "7EL7ifncK2PWFYThJjzR25                               BRAINFOOD  Dark Trap   \n",
+       "1umsRbM7L4ju7rn9aU8Ju6                               Sacrifice  Dark Trap   \n",
+       "4SKqOHKYU5pgHr5UiVKiQN                                Backpack  Dark Trap   \n",
+       "...                                             ...        ...        ...   \n",
+       "46bXU7Sgj7104ZoXxzz9tM           Euphoric Hardstyle             hardstyle   \n",
+       "0he2ViGMUO3ajKTxLOfWVT  Greatest Hardstyle Playlist             hardstyle   \n",
+       "72DAt9Lbpy9EUS29OzQLob       Best of Hardstyle 2020             hardstyle   \n",
+       "6HXgExFVuE1c3cq9QjFCcU           Euphoric Hardstyle             hardstyle   \n",
+       "6MAAMZImxcvYhRnxDLTufD       Best of Hardstyle 2020             hardstyle   \n",
+       "\n",
+       "                        duration_ms  key  mode  time_signature    tempo  \\\n",
+       "id                                                                        \n",
+       "7pgJBLVz5VmnL7uGHmRj6p       224427    8     1               4  115.080   \n",
+       "0vSWgAlfpye0WCGeNmuNhy        98821    5     1               4  218.050   \n",
+       "7EL7ifncK2PWFYThJjzR25       101172    8     1               4  189.938   \n",
+       "1umsRbM7L4ju7rn9aU8Ju6        96062   10     0               4  139.990   \n",
+       "4SKqOHKYU5pgHr5UiVKiQN       135079    5     1               4  128.014   \n",
+       "...                             ...  ...   ...             ...      ...   \n",
+       "46bXU7Sgj7104ZoXxzz9tM       269208    4     1               4  150.013   \n",
+       "0he2ViGMUO3ajKTxLOfWVT       210112    0     0               4  149.928   \n",
+       "72DAt9Lbpy9EUS29OzQLob       234823    8     1               4  154.935   \n",
+       "6HXgExFVuE1c3cq9QjFCcU       323200    6     0               4  150.042   \n",
+       "6MAAMZImxcvYhRnxDLTufD       162161    9     1               4  155.047   \n",
+       "\n",
+       "                        acousticness  danceability  energy  instrumentalness  \\\n",
+       "id                                                                             \n",
+       "7pgJBLVz5VmnL7uGHmRj6p      0.401000         0.719   0.493          0.000000   \n",
+       "0vSWgAlfpye0WCGeNmuNhy      0.013800         0.850   0.893          0.000004   \n",
+       "7EL7ifncK2PWFYThJjzR25      0.187000         0.864   0.365          0.000000   \n",
+       "1umsRbM7L4ju7rn9aU8Ju6      0.145000         0.767   0.576          0.000003   \n",
+       "4SKqOHKYU5pgHr5UiVKiQN      0.007700         0.765   0.726          0.000000   \n",
+       "...                              ...           ...     ...               ...   \n",
+       "46bXU7Sgj7104ZoXxzz9tM      0.031500         0.528   0.693          0.000345   \n",
+       "0he2ViGMUO3ajKTxLOfWVT      0.022500         0.517   0.768          0.000018   \n",
+       "72DAt9Lbpy9EUS29OzQLob      0.026000         0.361   0.821          0.000242   \n",
+       "6HXgExFVuE1c3cq9QjFCcU      0.000551         0.477   0.921          0.029600   \n",
+       "6MAAMZImxcvYhRnxDLTufD      0.001890         0.529   0.945          0.000055   \n",
+       "\n",
+       "                        liveness  loudness  speechiness  valence  \n",
+       "id                                                                \n",
+       "7pgJBLVz5VmnL7uGHmRj6p    0.1180    -7.230       0.0794   0.1240  \n",
+       "0vSWgAlfpye0WCGeNmuNhy    0.3720    -4.783       0.0623   0.0391  \n",
+       "7EL7ifncK2PWFYThJjzR25    0.1160   -10.219       0.0655   0.0478  \n",
+       "1umsRbM7L4ju7rn9aU8Ju6    0.0968    -9.683       0.2560   0.1870  \n",
+       "4SKqOHKYU5pgHr5UiVKiQN    0.6190    -5.580       0.1910   0.2700  \n",
+       "...                          ...       ...          ...      ...  \n",
+       "46bXU7Sgj7104ZoXxzz9tM    0.1210    -5.148       0.0304   0.3940  \n",
+       "0he2ViGMUO3ajKTxLOfWVT    0.2050    -7.922       0.0479   0.3830  \n",
+       "72DAt9Lbpy9EUS29OzQLob    0.3850    -3.102       0.0505   0.1240  \n",
+       "6HXgExFVuE1c3cq9QjFCcU    0.0575    -4.777       0.0392   0.4880  \n",
+       "6MAAMZImxcvYhRnxDLTufD    0.4140    -5.862       0.0615   0.1340  \n",
+       "\n",
+       "[35877 rows x 16 columns]"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Set the id column as row indices\n",
+    "df = df.set_index(\"id\")\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Not a Number\n",
+    "\n",
+    "- `np.nan` is the floating-point representation of Not a Number (the older alias `np.NaN` was removed in NumPy 2.0)\n",
+    "- You do not need to learn the details of the `numpy` package\n",
+    "\n",
+    "### Replacing / modifying values within the `DataFrame`\n",
+    "\n",
+    "Syntax: `df.replace(<TARGET>, <REPLACE>)`\n",
+    "- The target can be a `str`, `int`, `float`, or `None` (there are other possibilities, but those are too advanced for this course)\n",
+    "- Returns a reference to a new `DataFrame` object; the original `DataFrame` is not modified\n",
+    "\n",
+    "Let's now replace the missing values (empty strings) with `np.nan`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>title</th>\n",
+       "      <th>song_name</th>\n",
+       "      <th>genre</th>\n",
+       "      <th>duration_ms</th>\n",
+       "      <th>key</th>\n",
+       "      <th>mode</th>\n",
+       "      <th>time_signature</th>\n",
+       "      <th>tempo</th>\n",
+       "      <th>acousticness</th>\n",
+       "      <th>danceability</th>\n",
+       "      <th>energy</th>\n",
+       "      <th>instrumentalness</th>\n",
+       "      <th>liveness</th>\n",
+       "      <th>loudness</th>\n",
+       "      <th>speechiness</th>\n",
+       "      <th>valence</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>id</th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>7pgJBLVz5VmnL7uGHmRj6p</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Pathology</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>224427</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>115.080</td>\n",
+       "      <td>0.4010</td>\n",
+       "      <td>0.719</td>\n",
+       "      <td>0.493</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.1180</td>\n",
+       "      <td>-7.230</td>\n",
+       "      <td>0.0794</td>\n",
+       "      <td>0.1240</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>0vSWgAlfpye0WCGeNmuNhy</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Symbiote</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>98821</td>\n",
+       "      <td>5</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>218.050</td>\n",
+       "      <td>0.0138</td>\n",
+       "      <td>0.850</td>\n",
+       "      <td>0.893</td>\n",
+       "      <td>0.000004</td>\n",
+       "      <td>0.3720</td>\n",
+       "      <td>-4.783</td>\n",
+       "      <td>0.0623</td>\n",
+       "      <td>0.0391</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>7EL7ifncK2PWFYThJjzR25</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>BRAINFOOD</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>101172</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>189.938</td>\n",
+       "      <td>0.1870</td>\n",
+       "      <td>0.864</td>\n",
+       "      <td>0.365</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.1160</td>\n",
+       "      <td>-10.219</td>\n",
+       "      <td>0.0655</td>\n",
+       "      <td>0.0478</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1umsRbM7L4ju7rn9aU8Ju6</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Sacrifice</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>96062</td>\n",
+       "      <td>10</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>139.990</td>\n",
+       "      <td>0.1450</td>\n",
+       "      <td>0.767</td>\n",
+       "      <td>0.576</td>\n",
+       "      <td>0.000003</td>\n",
+       "      <td>0.0968</td>\n",
+       "      <td>-9.683</td>\n",
+       "      <td>0.2560</td>\n",
+       "      <td>0.1870</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4SKqOHKYU5pgHr5UiVKiQN</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Backpack</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>135079</td>\n",
+       "      <td>5</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>128.014</td>\n",
+       "      <td>0.0077</td>\n",
+       "      <td>0.765</td>\n",
+       "      <td>0.726</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.6190</td>\n",
+       "      <td>-5.580</td>\n",
+       "      <td>0.1910</td>\n",
+       "      <td>0.2700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3uE1swbcRp5BrO64UNy6Ex</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>TakingOutTheTrash</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>192833</td>\n",
+       "      <td>11</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>120.004</td>\n",
+       "      <td>0.1720</td>\n",
+       "      <td>0.814</td>\n",
+       "      <td>0.575</td>\n",
+       "      <td>0.000291</td>\n",
+       "      <td>0.1090</td>\n",
+       "      <td>-9.635</td>\n",
+       "      <td>0.0635</td>\n",
+       "      <td>0.2880</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3KJrwOuqiEwHq6QTreZT61</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Io sono qui</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>180880</td>\n",
+       "      <td>10</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>128.066</td>\n",
+       "      <td>0.0987</td>\n",
+       "      <td>0.812</td>\n",
+       "      <td>0.813</td>\n",
+       "      <td>0.000150</td>\n",
+       "      <td>0.0758</td>\n",
+       "      <td>-5.583</td>\n",
+       "      <td>0.0984</td>\n",
+       "      <td>0.3480</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4QhUXx4ON40TIBrZIlnIke</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Murder</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>186261</td>\n",
+       "      <td>0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>114.956</td>\n",
+       "      <td>0.0343</td>\n",
+       "      <td>0.602</td>\n",
+       "      <td>0.578</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.1640</td>\n",
+       "      <td>-5.610</td>\n",
+       "      <td>0.0283</td>\n",
+       "      <td>0.1560</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>09320vyX4qHd4GjHIpy5w0</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>High 'N Mighty</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>124676</td>\n",
+       "      <td>7</td>\n",
+       "      <td>1</td>\n",
+       "      <td>5</td>\n",
+       "      <td>111.958</td>\n",
+       "      <td>0.1120</td>\n",
+       "      <td>0.876</td>\n",
+       "      <td>0.768</td>\n",
+       "      <td>0.000012</td>\n",
+       "      <td>0.2830</td>\n",
+       "      <td>-6.606</td>\n",
+       "      <td>0.2010</td>\n",
+       "      <td>0.7200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6xEnbXM1us9fDJy2LC0lru</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Bang Ya Fucking Head</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>154929</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>125.013</td>\n",
+       "      <td>0.0525</td>\n",
+       "      <td>0.690</td>\n",
+       "      <td>0.760</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.1340</td>\n",
+       "      <td>-5.431</td>\n",
+       "      <td>0.0895</td>\n",
+       "      <td>0.0797</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                       title             song_name      genre  duration_ms  \\\n",
+       "id                                                                           \n",
+       "7pgJBLVz5VmnL7uGHmRj6p   NaN             Pathology  Dark Trap       224427   \n",
+       "0vSWgAlfpye0WCGeNmuNhy   NaN              Symbiote  Dark Trap        98821   \n",
+       "7EL7ifncK2PWFYThJjzR25   NaN             BRAINFOOD  Dark Trap       101172   \n",
+       "1umsRbM7L4ju7rn9aU8Ju6   NaN             Sacrifice  Dark Trap        96062   \n",
+       "4SKqOHKYU5pgHr5UiVKiQN   NaN              Backpack  Dark Trap       135079   \n",
+       "3uE1swbcRp5BrO64UNy6Ex   NaN     TakingOutTheTrash  Dark Trap       192833   \n",
+       "3KJrwOuqiEwHq6QTreZT61   NaN           Io sono qui  Dark Trap       180880   \n",
+       "4QhUXx4ON40TIBrZIlnIke   NaN                Murder  Dark Trap       186261   \n",
+       "09320vyX4qHd4GjHIpy5w0   NaN        High 'N Mighty  Dark Trap       124676   \n",
+       "6xEnbXM1us9fDJy2LC0lru   NaN  Bang Ya Fucking Head  Dark Trap       154929   \n",
+       "\n",
+       "                        key  mode  time_signature    tempo  acousticness  \\\n",
+       "id                                                                         \n",
+       "7pgJBLVz5VmnL7uGHmRj6p    8     1               4  115.080        0.4010   \n",
+       "0vSWgAlfpye0WCGeNmuNhy    5     1               4  218.050        0.0138   \n",
+       "7EL7ifncK2PWFYThJjzR25    8     1               4  189.938        0.1870   \n",
+       "1umsRbM7L4ju7rn9aU8Ju6   10     0               4  139.990        0.1450   \n",
+       "4SKqOHKYU5pgHr5UiVKiQN    5     1               4  128.014        0.0077   \n",
+       "3uE1swbcRp5BrO64UNy6Ex   11     1               4  120.004        0.1720   \n",
+       "3KJrwOuqiEwHq6QTreZT61   10     0               4  128.066        0.0987   \n",
+       "4QhUXx4ON40TIBrZIlnIke    0     1               4  114.956        0.0343   \n",
+       "09320vyX4qHd4GjHIpy5w0    7     1               5  111.958        0.1120   \n",
+       "6xEnbXM1us9fDJy2LC0lru    1     1               4  125.013        0.0525   \n",
+       "\n",
+       "                        danceability  energy  instrumentalness  liveness  \\\n",
+       "id                                                                         \n",
+       "7pgJBLVz5VmnL7uGHmRj6p         0.719   0.493          0.000000    0.1180   \n",
+       "0vSWgAlfpye0WCGeNmuNhy         0.850   0.893          0.000004    0.3720   \n",
+       "7EL7ifncK2PWFYThJjzR25         0.864   0.365          0.000000    0.1160   \n",
+       "1umsRbM7L4ju7rn9aU8Ju6         0.767   0.576          0.000003    0.0968   \n",
+       "4SKqOHKYU5pgHr5UiVKiQN         0.765   0.726          0.000000    0.6190   \n",
+       "3uE1swbcRp5BrO64UNy6Ex         0.814   0.575          0.000291    0.1090   \n",
+       "3KJrwOuqiEwHq6QTreZT61         0.812   0.813          0.000150    0.0758   \n",
+       "4QhUXx4ON40TIBrZIlnIke         0.602   0.578          0.000000    0.1640   \n",
+       "09320vyX4qHd4GjHIpy5w0         0.876   0.768          0.000012    0.2830   \n",
+       "6xEnbXM1us9fDJy2LC0lru         0.690   0.760          0.000000    0.1340   \n",
+       "\n",
+       "                        loudness  speechiness  valence  \n",
+       "id                                                      \n",
+       "7pgJBLVz5VmnL7uGHmRj6p    -7.230       0.0794   0.1240  \n",
+       "0vSWgAlfpye0WCGeNmuNhy    -4.783       0.0623   0.0391  \n",
+       "7EL7ifncK2PWFYThJjzR25   -10.219       0.0655   0.0478  \n",
+       "1umsRbM7L4ju7rn9aU8Ju6    -9.683       0.2560   0.1870  \n",
+       "4SKqOHKYU5pgHr5UiVKiQN    -5.580       0.1910   0.2700  \n",
+       "3uE1swbcRp5BrO64UNy6Ex    -9.635       0.0635   0.2880  \n",
+       "3KJrwOuqiEwHq6QTreZT61    -5.583       0.0984   0.3480  \n",
+       "4QhUXx4ON40TIBrZIlnIke    -5.610       0.0283   0.1560  \n",
+       "09320vyX4qHd4GjHIpy5w0    -6.606       0.2010   0.7200  \n",
+       "6xEnbXM1us9fDJy2LC0lru    -5.431       0.0895   0.0797  "
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
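+    "# np.nan is the spelling that works in every NumPy version (np.NaN was removed in NumPy 2.0)\n",
+    "df = df.replace(\"\", np.nan)\n",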
+    "df.head(10) # title is the album name"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Checking for missing values\n",
+    "\n",
+    "Syntax: `Series.isna()`\n",
+    "- Returns a boolean `Series`: `True` where the value is missing (`NaN`), `False` otherwise\n",
+    "\n",
+    "Let's check whether any of the `song_name` values are missing"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "JqzSwG5PEZRq",
+    "outputId": "05529a3d-4a5c-4654-fe05-d04b2c10ae6c"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "id\n",
+       "7pgJBLVz5VmnL7uGHmRj6p    False\n",
+       "0vSWgAlfpye0WCGeNmuNhy    False\n",
+       "7EL7ifncK2PWFYThJjzR25    False\n",
+       "1umsRbM7L4ju7rn9aU8Ju6    False\n",
+       "4SKqOHKYU5pgHr5UiVKiQN    False\n",
+       "                          ...  \n",
+       "46bXU7Sgj7104ZoXxzz9tM     True\n",
+       "0he2ViGMUO3ajKTxLOfWVT     True\n",
+       "72DAt9Lbpy9EUS29OzQLob     True\n",
+       "6HXgExFVuE1c3cq9QjFCcU     True\n",
+       "6MAAMZImxcvYhRnxDLTufD     True\n",
+       "Name: song_name, Length: 35877, dtype: bool"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df[\"song_name\"].isna()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Review: `pandas.Series.value_counts()`\n",
+    "- Returns a new `Series` with the unique values from the original `Series` as its index and the count of each unique value as its values\n",
+    "- The returned `Series` is sorted in descending order of counts"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "uCLDr8EIGMeJ",
+    "outputId": "241d6181-d525-4019-a8f2-689939b2ab33"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "False    18342\n",
+       "True     17535\n",
+       "Name: song_name, dtype: int64"
+      ]
+     },
+     "execution_count": 10,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# count the number of missing values for song name\n",
+    "df[\"song_name\"].isna().value_counts()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Missing value manipulation\n",
+    "Syntax: `df.fillna(<REPLACE>)`\n",
+    "- Returns a reference to a new `DataFrame` object (or a new `Series`, when called on a single column); the original is not modified"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "pJ2CIqq9HWvN",
+    "outputId": "2895e862-18e5-4742-9750-31b130aae668"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>title</th>\n",
+       "      <th>song_name</th>\n",
+       "      <th>genre</th>\n",
+       "      <th>duration_ms</th>\n",
+       "      <th>key</th>\n",
+       "      <th>mode</th>\n",
+       "      <th>time_signature</th>\n",
+       "      <th>tempo</th>\n",
+       "      <th>acousticness</th>\n",
+       "      <th>danceability</th>\n",
+       "      <th>energy</th>\n",
+       "      <th>instrumentalness</th>\n",
+       "      <th>liveness</th>\n",
+       "      <th>loudness</th>\n",
+       "      <th>speechiness</th>\n",
+       "      <th>valence</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>id</th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>7pgJBLVz5VmnL7uGHmRj6p</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Pathology</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>224427</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>115.080</td>\n",
+       "      <td>0.401000</td>\n",
+       "      <td>0.719</td>\n",
+       "      <td>0.493</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.1180</td>\n",
+       "      <td>-7.230</td>\n",
+       "      <td>0.0794</td>\n",
+       "      <td>0.1240</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>0vSWgAlfpye0WCGeNmuNhy</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Symbiote</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>98821</td>\n",
+       "      <td>5</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>218.050</td>\n",
+       "      <td>0.013800</td>\n",
+       "      <td>0.850</td>\n",
+       "      <td>0.893</td>\n",
+       "      <td>0.000004</td>\n",
+       "      <td>0.3720</td>\n",
+       "      <td>-4.783</td>\n",
+       "      <td>0.0623</td>\n",
+       "      <td>0.0391</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>7EL7ifncK2PWFYThJjzR25</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>BRAINFOOD</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>101172</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>189.938</td>\n",
+       "      <td>0.187000</td>\n",
+       "      <td>0.864</td>\n",
+       "      <td>0.365</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.1160</td>\n",
+       "      <td>-10.219</td>\n",
+       "      <td>0.0655</td>\n",
+       "      <td>0.0478</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1umsRbM7L4ju7rn9aU8Ju6</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Sacrifice</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>96062</td>\n",
+       "      <td>10</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>139.990</td>\n",
+       "      <td>0.145000</td>\n",
+       "      <td>0.767</td>\n",
+       "      <td>0.576</td>\n",
+       "      <td>0.000003</td>\n",
+       "      <td>0.0968</td>\n",
+       "      <td>-9.683</td>\n",
+       "      <td>0.2560</td>\n",
+       "      <td>0.1870</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4SKqOHKYU5pgHr5UiVKiQN</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Backpack</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>135079</td>\n",
+       "      <td>5</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>128.014</td>\n",
+       "      <td>0.007700</td>\n",
+       "      <td>0.765</td>\n",
+       "      <td>0.726</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.6190</td>\n",
+       "      <td>-5.580</td>\n",
+       "      <td>0.1910</td>\n",
+       "      <td>0.2700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>...</th>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>46bXU7Sgj7104ZoXxzz9tM</th>\n",
+       "      <td>Euphoric Hardstyle</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>269208</td>\n",
+       "      <td>4</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>150.013</td>\n",
+       "      <td>0.031500</td>\n",
+       "      <td>0.528</td>\n",
+       "      <td>0.693</td>\n",
+       "      <td>0.000345</td>\n",
+       "      <td>0.1210</td>\n",
+       "      <td>-5.148</td>\n",
+       "      <td>0.0304</td>\n",
+       "      <td>0.3940</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>0he2ViGMUO3ajKTxLOfWVT</th>\n",
+       "      <td>Greatest Hardstyle Playlist</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>210112</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>149.928</td>\n",
+       "      <td>0.022500</td>\n",
+       "      <td>0.517</td>\n",
+       "      <td>0.768</td>\n",
+       "      <td>0.000018</td>\n",
+       "      <td>0.2050</td>\n",
+       "      <td>-7.922</td>\n",
+       "      <td>0.0479</td>\n",
+       "      <td>0.3830</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>72DAt9Lbpy9EUS29OzQLob</th>\n",
+       "      <td>Best of Hardstyle 2020</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>234823</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>154.935</td>\n",
+       "      <td>0.026000</td>\n",
+       "      <td>0.361</td>\n",
+       "      <td>0.821</td>\n",
+       "      <td>0.000242</td>\n",
+       "      <td>0.3850</td>\n",
+       "      <td>-3.102</td>\n",
+       "      <td>0.0505</td>\n",
+       "      <td>0.1240</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6HXgExFVuE1c3cq9QjFCcU</th>\n",
+       "      <td>Euphoric Hardstyle</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>323200</td>\n",
+       "      <td>6</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>150.042</td>\n",
+       "      <td>0.000551</td>\n",
+       "      <td>0.477</td>\n",
+       "      <td>0.921</td>\n",
+       "      <td>0.029600</td>\n",
+       "      <td>0.0575</td>\n",
+       "      <td>-4.777</td>\n",
+       "      <td>0.0392</td>\n",
+       "      <td>0.4880</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6MAAMZImxcvYhRnxDLTufD</th>\n",
+       "      <td>Best of Hardstyle 2020</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>162161</td>\n",
+       "      <td>9</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>155.047</td>\n",
+       "      <td>0.001890</td>\n",
+       "      <td>0.529</td>\n",
+       "      <td>0.945</td>\n",
+       "      <td>0.000055</td>\n",
+       "      <td>0.4140</td>\n",
+       "      <td>-5.862</td>\n",
+       "      <td>0.0615</td>\n",
+       "      <td>0.1340</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "<p>35877 rows × 16 columns</p>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                                              title     song_name      genre  \\\n",
+       "id                                                                             \n",
+       "7pgJBLVz5VmnL7uGHmRj6p                          NaN     Pathology  Dark Trap   \n",
+       "0vSWgAlfpye0WCGeNmuNhy                          NaN      Symbiote  Dark Trap   \n",
+       "7EL7ifncK2PWFYThJjzR25                          NaN     BRAINFOOD  Dark Trap   \n",
+       "1umsRbM7L4ju7rn9aU8Ju6                          NaN     Sacrifice  Dark Trap   \n",
+       "4SKqOHKYU5pgHr5UiVKiQN                          NaN      Backpack  Dark Trap   \n",
+       "...                                             ...           ...        ...   \n",
+       "46bXU7Sgj7104ZoXxzz9tM           Euphoric Hardstyle  No Song Name  hardstyle   \n",
+       "0he2ViGMUO3ajKTxLOfWVT  Greatest Hardstyle Playlist  No Song Name  hardstyle   \n",
+       "72DAt9Lbpy9EUS29OzQLob       Best of Hardstyle 2020  No Song Name  hardstyle   \n",
+       "6HXgExFVuE1c3cq9QjFCcU           Euphoric Hardstyle  No Song Name  hardstyle   \n",
+       "6MAAMZImxcvYhRnxDLTufD       Best of Hardstyle 2020  No Song Name  hardstyle   \n",
+       "\n",
+       "                        duration_ms  key  mode  time_signature    tempo  \\\n",
+       "id                                                                        \n",
+       "7pgJBLVz5VmnL7uGHmRj6p       224427    8     1               4  115.080   \n",
+       "0vSWgAlfpye0WCGeNmuNhy        98821    5     1               4  218.050   \n",
+       "7EL7ifncK2PWFYThJjzR25       101172    8     1               4  189.938   \n",
+       "1umsRbM7L4ju7rn9aU8Ju6        96062   10     0               4  139.990   \n",
+       "4SKqOHKYU5pgHr5UiVKiQN       135079    5     1               4  128.014   \n",
+       "...                             ...  ...   ...             ...      ...   \n",
+       "46bXU7Sgj7104ZoXxzz9tM       269208    4     1               4  150.013   \n",
+       "0he2ViGMUO3ajKTxLOfWVT       210112    0     0               4  149.928   \n",
+       "72DAt9Lbpy9EUS29OzQLob       234823    8     1               4  154.935   \n",
+       "6HXgExFVuE1c3cq9QjFCcU       323200    6     0               4  150.042   \n",
+       "6MAAMZImxcvYhRnxDLTufD       162161    9     1               4  155.047   \n",
+       "\n",
+       "                        acousticness  danceability  energy  instrumentalness  \\\n",
+       "id                                                                             \n",
+       "7pgJBLVz5VmnL7uGHmRj6p      0.401000         0.719   0.493          0.000000   \n",
+       "0vSWgAlfpye0WCGeNmuNhy      0.013800         0.850   0.893          0.000004   \n",
+       "7EL7ifncK2PWFYThJjzR25      0.187000         0.864   0.365          0.000000   \n",
+       "1umsRbM7L4ju7rn9aU8Ju6      0.145000         0.767   0.576          0.000003   \n",
+       "4SKqOHKYU5pgHr5UiVKiQN      0.007700         0.765   0.726          0.000000   \n",
+       "...                              ...           ...     ...               ...   \n",
+       "46bXU7Sgj7104ZoXxzz9tM      0.031500         0.528   0.693          0.000345   \n",
+       "0he2ViGMUO3ajKTxLOfWVT      0.022500         0.517   0.768          0.000018   \n",
+       "72DAt9Lbpy9EUS29OzQLob      0.026000         0.361   0.821          0.000242   \n",
+       "6HXgExFVuE1c3cq9QjFCcU      0.000551         0.477   0.921          0.029600   \n",
+       "6MAAMZImxcvYhRnxDLTufD      0.001890         0.529   0.945          0.000055   \n",
+       "\n",
+       "                        liveness  loudness  speechiness  valence  \n",
+       "id                                                                \n",
+       "7pgJBLVz5VmnL7uGHmRj6p    0.1180    -7.230       0.0794   0.1240  \n",
+       "0vSWgAlfpye0WCGeNmuNhy    0.3720    -4.783       0.0623   0.0391  \n",
+       "7EL7ifncK2PWFYThJjzR25    0.1160   -10.219       0.0655   0.0478  \n",
+       "1umsRbM7L4ju7rn9aU8Ju6    0.0968    -9.683       0.2560   0.1870  \n",
+       "4SKqOHKYU5pgHr5UiVKiQN    0.6190    -5.580       0.1910   0.2700  \n",
+       "...                          ...       ...          ...      ...  \n",
+       "46bXU7Sgj7104ZoXxzz9tM    0.1210    -5.148       0.0304   0.3940  \n",
+       "0he2ViGMUO3ajKTxLOfWVT    0.2050    -7.922       0.0479   0.3830  \n",
+       "72DAt9Lbpy9EUS29OzQLob    0.3850    -3.102       0.0505   0.1240  \n",
+       "6HXgExFVuE1c3cq9QjFCcU    0.0575    -4.777       0.0392   0.4880  \n",
+       "6MAAMZImxcvYhRnxDLTufD    0.4140    -5.862       0.0615   0.1340  \n",
+       "\n",
+       "[35877 rows x 16 columns]"
+      ]
+     },
+     "execution_count": 11,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# .fillna returns a new Series with the missing values replaced; df itself is unchanged\n",
+    "df[\"song_name\"].fillna(\"No Song Name\")\n",
+    "\n",
+    "# to update the original DataFrame's column, assign the returned Series back to it\n",
+    "df[\"song_name\"] = df[\"song_name\"].fillna(\"No Song Name\")\n",
+    "df"
+   ]
+  },
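+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The difference between calling `fillna` and assigning its result can be checked on a small example. This is a minimal sketch on a made-up `toy` DataFrame (the name and values are illustrative assumptions, not part of the lecture dataset):\n",
+    "\n",
+    "```python\n",
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "\n",
+    "# hypothetical toy data standing in for the lecture's df\n",
+    "toy = pd.DataFrame({'song_name': ['Pathology', np.nan, 'Backpack']})\n",
+    "\n",
+    "# fillna returns a new Series; toy is not modified by this call\n",
+    "filled = toy['song_name'].fillna('No Song Name')\n",
+    "missing_before = toy['song_name'].isna().sum()\n",
+    "\n",
+    "# assigning the result back updates the column\n",
+    "toy['song_name'] = filled\n",
+    "missing_after = toy['song_name'].isna().sum()\n",
+    "```\n",
+    "\n",
+    "Comparing `missing_before` with `missing_after` shows that only the assignment changes `toy`."
+   ]
+  },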
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Dropping missing values\n",
+    "Syntax: `df.dropna()`\n",
+    "- Returns a new `DataFrame` object instance reference; the original `df` is not modified."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 145
+    },
+    "id": "O_1ZeHG8N-rB",
+    "outputId": "3b112da2-2b3c-4fb8-c7ae-dc2f2127856d"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>title</th>\n",
+       "      <th>song_name</th>\n",
+       "      <th>genre</th>\n",
+       "      <th>duration_ms</th>\n",
+       "      <th>key</th>\n",
+       "      <th>mode</th>\n",
+       "      <th>time_signature</th>\n",
+       "      <th>tempo</th>\n",
+       "      <th>acousticness</th>\n",
+       "      <th>danceability</th>\n",
+       "      <th>energy</th>\n",
+       "      <th>instrumentalness</th>\n",
+       "      <th>liveness</th>\n",
+       "      <th>loudness</th>\n",
+       "      <th>speechiness</th>\n",
+       "      <th>valence</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>id</th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>5LzAV6KfjN8VhWCedeygfY</th>\n",
+       "      <td>Dirtybird Players</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>techhouse</td>\n",
+       "      <td>197499</td>\n",
+       "      <td>7</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>127.997</td>\n",
+       "      <td>0.000957</td>\n",
+       "      <td>0.806</td>\n",
+       "      <td>0.950</td>\n",
+       "      <td>0.920000</td>\n",
+       "      <td>0.1130</td>\n",
+       "      <td>-6.782</td>\n",
+       "      <td>0.0811</td>\n",
+       "      <td>0.580</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3TsCb6ueD678XBJDiRrvhr</th>\n",
+       "      <td>tech house</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>techhouse</td>\n",
+       "      <td>206000</td>\n",
+       "      <td>10</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>124.994</td>\n",
+       "      <td>0.062300</td>\n",
+       "      <td>0.729</td>\n",
+       "      <td>0.978</td>\n",
+       "      <td>0.908000</td>\n",
+       "      <td>0.0353</td>\n",
+       "      <td>-6.645</td>\n",
+       "      <td>0.0420</td>\n",
+       "      <td>0.778</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6Y0Fy2buEis7bEOlG0QET1</th>\n",
+       "      <td>Tech House Bangerz</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>techhouse</td>\n",
+       "      <td>199839</td>\n",
+       "      <td>4</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>124.006</td>\n",
+       "      <td>0.019100</td>\n",
+       "      <td>0.724</td>\n",
+       "      <td>0.792</td>\n",
+       "      <td>0.812000</td>\n",
+       "      <td>0.1080</td>\n",
+       "      <td>-8.555</td>\n",
+       "      <td>0.0405</td>\n",
+       "      <td>0.346</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4EJI2XGViSQp6WscLKgYDD</th>\n",
+       "      <td>tech house</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>techhouse</td>\n",
+       "      <td>173861</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>125.031</td>\n",
+       "      <td>0.053000</td>\n",
+       "      <td>0.700</td>\n",
+       "      <td>0.898</td>\n",
+       "      <td>0.418000</td>\n",
+       "      <td>0.5740</td>\n",
+       "      <td>-6.099</td>\n",
+       "      <td>0.2570</td>\n",
+       "      <td>0.791</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4x6VzOQTLIrkkCWcDPh5Y0</th>\n",
+       "      <td>blanc | Tech House</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>techhouse</td>\n",
+       "      <td>394960</td>\n",
+       "      <td>8</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>127.029</td>\n",
+       "      <td>0.000301</td>\n",
+       "      <td>0.803</td>\n",
+       "      <td>0.919</td>\n",
+       "      <td>0.926000</td>\n",
+       "      <td>0.1020</td>\n",
+       "      <td>-8.667</td>\n",
+       "      <td>0.0702</td>\n",
+       "      <td>0.754</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>...</th>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>46bXU7Sgj7104ZoXxzz9tM</th>\n",
+       "      <td>Euphoric Hardstyle</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>269208</td>\n",
+       "      <td>4</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>150.013</td>\n",
+       "      <td>0.031500</td>\n",
+       "      <td>0.528</td>\n",
+       "      <td>0.693</td>\n",
+       "      <td>0.000345</td>\n",
+       "      <td>0.1210</td>\n",
+       "      <td>-5.148</td>\n",
+       "      <td>0.0304</td>\n",
+       "      <td>0.394</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>0he2ViGMUO3ajKTxLOfWVT</th>\n",
+       "      <td>Greatest Hardstyle Playlist</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>210112</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>149.928</td>\n",
+       "      <td>0.022500</td>\n",
+       "      <td>0.517</td>\n",
+       "      <td>0.768</td>\n",
+       "      <td>0.000018</td>\n",
+       "      <td>0.2050</td>\n",
+       "      <td>-7.922</td>\n",
+       "      <td>0.0479</td>\n",
+       "      <td>0.383</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>72DAt9Lbpy9EUS29OzQLob</th>\n",
+       "      <td>Best of Hardstyle 2020</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>234823</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>154.935</td>\n",
+       "      <td>0.026000</td>\n",
+       "      <td>0.361</td>\n",
+       "      <td>0.821</td>\n",
+       "      <td>0.000242</td>\n",
+       "      <td>0.3850</td>\n",
+       "      <td>-3.102</td>\n",
+       "      <td>0.0505</td>\n",
+       "      <td>0.124</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6HXgExFVuE1c3cq9QjFCcU</th>\n",
+       "      <td>Euphoric Hardstyle</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>323200</td>\n",
+       "      <td>6</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>150.042</td>\n",
+       "      <td>0.000551</td>\n",
+       "      <td>0.477</td>\n",
+       "      <td>0.921</td>\n",
+       "      <td>0.029600</td>\n",
+       "      <td>0.0575</td>\n",
+       "      <td>-4.777</td>\n",
+       "      <td>0.0392</td>\n",
+       "      <td>0.488</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6MAAMZImxcvYhRnxDLTufD</th>\n",
+       "      <td>Best of Hardstyle 2020</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>162161</td>\n",
+       "      <td>9</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>155.047</td>\n",
+       "      <td>0.001890</td>\n",
+       "      <td>0.529</td>\n",
+       "      <td>0.945</td>\n",
+       "      <td>0.000055</td>\n",
+       "      <td>0.4140</td>\n",
+       "      <td>-5.862</td>\n",
+       "      <td>0.0615</td>\n",
+       "      <td>0.134</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "<p>17529 rows × 16 columns</p>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                                              title     song_name      genre  \\\n",
+       "id                                                                             \n",
+       "5LzAV6KfjN8VhWCedeygfY            Dirtybird Players  No Song Name  techhouse   \n",
+       "3TsCb6ueD678XBJDiRrvhr                   tech house  No Song Name  techhouse   \n",
+       "6Y0Fy2buEis7bEOlG0QET1           Tech House Bangerz  No Song Name  techhouse   \n",
+       "4EJI2XGViSQp6WscLKgYDD                   tech house  No Song Name  techhouse   \n",
+       "4x6VzOQTLIrkkCWcDPh5Y0           blanc | Tech House  No Song Name  techhouse   \n",
+       "...                                             ...           ...        ...   \n",
+       "46bXU7Sgj7104ZoXxzz9tM           Euphoric Hardstyle  No Song Name  hardstyle   \n",
+       "0he2ViGMUO3ajKTxLOfWVT  Greatest Hardstyle Playlist  No Song Name  hardstyle   \n",
+       "72DAt9Lbpy9EUS29OzQLob       Best of Hardstyle 2020  No Song Name  hardstyle   \n",
+       "6HXgExFVuE1c3cq9QjFCcU           Euphoric Hardstyle  No Song Name  hardstyle   \n",
+       "6MAAMZImxcvYhRnxDLTufD       Best of Hardstyle 2020  No Song Name  hardstyle   \n",
+       "\n",
+       "                        duration_ms  key  mode  time_signature    tempo  \\\n",
+       "id                                                                        \n",
+       "5LzAV6KfjN8VhWCedeygfY       197499    7     1               4  127.997   \n",
+       "3TsCb6ueD678XBJDiRrvhr       206000   10     1               4  124.994   \n",
+       "6Y0Fy2buEis7bEOlG0QET1       199839    4     0               4  124.006   \n",
+       "4EJI2XGViSQp6WscLKgYDD       173861    8     1               4  125.031   \n",
+       "4x6VzOQTLIrkkCWcDPh5Y0       394960    8     0               4  127.029   \n",
+       "...                             ...  ...   ...             ...      ...   \n",
+       "46bXU7Sgj7104ZoXxzz9tM       269208    4     1               4  150.013   \n",
+       "0he2ViGMUO3ajKTxLOfWVT       210112    0     0               4  149.928   \n",
+       "72DAt9Lbpy9EUS29OzQLob       234823    8     1               4  154.935   \n",
+       "6HXgExFVuE1c3cq9QjFCcU       323200    6     0               4  150.042   \n",
+       "6MAAMZImxcvYhRnxDLTufD       162161    9     1               4  155.047   \n",
+       "\n",
+       "                        acousticness  danceability  energy  instrumentalness  \\\n",
+       "id                                                                             \n",
+       "5LzAV6KfjN8VhWCedeygfY      0.000957         0.806   0.950          0.920000   \n",
+       "3TsCb6ueD678XBJDiRrvhr      0.062300         0.729   0.978          0.908000   \n",
+       "6Y0Fy2buEis7bEOlG0QET1      0.019100         0.724   0.792          0.812000   \n",
+       "4EJI2XGViSQp6WscLKgYDD      0.053000         0.700   0.898          0.418000   \n",
+       "4x6VzOQTLIrkkCWcDPh5Y0      0.000301         0.803   0.919          0.926000   \n",
+       "...                              ...           ...     ...               ...   \n",
+       "46bXU7Sgj7104ZoXxzz9tM      0.031500         0.528   0.693          0.000345   \n",
+       "0he2ViGMUO3ajKTxLOfWVT      0.022500         0.517   0.768          0.000018   \n",
+       "72DAt9Lbpy9EUS29OzQLob      0.026000         0.361   0.821          0.000242   \n",
+       "6HXgExFVuE1c3cq9QjFCcU      0.000551         0.477   0.921          0.029600   \n",
+       "6MAAMZImxcvYhRnxDLTufD      0.001890         0.529   0.945          0.000055   \n",
+       "\n",
+       "                        liveness  loudness  speechiness  valence  \n",
+       "id                                                                \n",
+       "5LzAV6KfjN8VhWCedeygfY    0.1130    -6.782       0.0811    0.580  \n",
+       "3TsCb6ueD678XBJDiRrvhr    0.0353    -6.645       0.0420    0.778  \n",
+       "6Y0Fy2buEis7bEOlG0QET1    0.1080    -8.555       0.0405    0.346  \n",
+       "4EJI2XGViSQp6WscLKgYDD    0.5740    -6.099       0.2570    0.791  \n",
+       "4x6VzOQTLIrkkCWcDPh5Y0    0.1020    -8.667       0.0702    0.754  \n",
+       "...                          ...       ...          ...      ...  \n",
+       "46bXU7Sgj7104ZoXxzz9tM    0.1210    -5.148       0.0304    0.394  \n",
+       "0he2ViGMUO3ajKTxLOfWVT    0.2050    -7.922       0.0479    0.383  \n",
+       "72DAt9Lbpy9EUS29OzQLob    0.3850    -3.102       0.0505    0.124  \n",
+       "6HXgExFVuE1c3cq9QjFCcU    0.0575    -4.777       0.0392    0.488  \n",
+       "6MAAMZImxcvYhRnxDLTufD    0.4140    -5.862       0.0615    0.134  \n",
+       "\n",
+       "[17529 rows x 16 columns]"
+      ]
+     },
+     "execution_count": 12,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# .dropna drops every row that contains at least one NaN\n",
+    "df.dropna()"
+   ]
+  },
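+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "By default `dropna` removes a row if any of its columns is missing, which can discard many rows. As a hedged sketch, the `subset` parameter of `dropna` restricts the check to chosen columns. The `toy` DataFrame below is a made-up assumption for illustration, not the lecture dataset:\n",
+    "\n",
+    "```python\n",
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "\n",
+    "# hypothetical toy data: every row is missing either title or song_name\n",
+    "toy = pd.DataFrame({\n",
+    "    'title': [np.nan, 'tech house', 'Euphoric Hardstyle'],\n",
+    "    'song_name': ['Pathology', np.nan, np.nan],\n",
+    "    'tempo': [115.08, 124.994, 150.013],\n",
+    "})\n",
+    "\n",
+    "# drops rows that have NaN in any column\n",
+    "all_dropped = toy.dropna()\n",
+    "\n",
+    "# drops only rows that have NaN in the title column\n",
+    "title_only = toy.dropna(subset=['title'])\n",
+    "```\n",
+    "\n",
+    "Both calls return new DataFrames, so `toy` keeps all of its rows."
+   ]
+  },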
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "ggttXEqUbI_E"
+   },
+   "source": [
+    "### Review: `pandas.Series.apply(...)`\n",
+    "Syntax: `Series.apply(<FUNCTION OBJECT REFERENCE>)`\n",
+    "- Applies the input function to every element of the `Series`.\n",
+    "- Returns a new `Series` object instance reference.\n",
+    "\n",
+    "Let's apply a transformation function to the `mode` column `Series`:\n",
+    "- mode = 1 means major modality (sounds happy)\n",
+    "- mode = 0 means minor modality (sounds sad)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def replace_mode(m):\n",
+    "    \"\"\"Map a numeric mode to its name: 1 -> 'major', any other value -> 'minor'.\"\"\"\n",
+    "    if m == 1:\n",
+    "        return \"major\"\n",
+    "    else:\n",
+    "        return \"minor\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "id\n",
+       "7pgJBLVz5VmnL7uGHmRj6p    major\n",
+       "0vSWgAlfpye0WCGeNmuNhy    major\n",
+       "7EL7ifncK2PWFYThJjzR25    major\n",
+       "1umsRbM7L4ju7rn9aU8Ju6    minor\n",
+       "4SKqOHKYU5pgHr5UiVKiQN    major\n",
+       "                          ...  \n",
+       "46bXU7Sgj7104ZoXxzz9tM    major\n",
+       "0he2ViGMUO3ajKTxLOfWVT    minor\n",
+       "72DAt9Lbpy9EUS29OzQLob    major\n",
+       "6HXgExFVuE1c3cq9QjFCcU    minor\n",
+       "6MAAMZImxcvYhRnxDLTufD    major\n",
+       "Name: mode, Length: 35877, dtype: object"
+      ]
+     },
+     "execution_count": 14,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df[\"mode\"].apply(replace_mode)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### `lambda`\n",
+    "\n",
+    "Let's write an equivalent `lambda` function in place of the named `replace_mode` function."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "9AJ3p-_TarnN",
+    "outputId": "a087df5d-2002-417c-e99c-5e6fc8ea9809"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "id\n",
+       "7pgJBLVz5VmnL7uGHmRj6p    major\n",
+       "0vSWgAlfpye0WCGeNmuNhy    major\n",
+       "7EL7ifncK2PWFYThJjzR25    major\n",
+       "1umsRbM7L4ju7rn9aU8Ju6    minor\n",
+       "4SKqOHKYU5pgHr5UiVKiQN    major\n",
+       "                          ...  \n",
+       "46bXU7Sgj7104ZoXxzz9tM    major\n",
+       "0he2ViGMUO3ajKTxLOfWVT    minor\n",
+       "72DAt9Lbpy9EUS29OzQLob    major\n",
+       "6HXgExFVuE1c3cq9QjFCcU    minor\n",
+       "6MAAMZImxcvYhRnxDLTufD    major\n",
+       "Name: mode, Length: 35877, dtype: object"
+      ]
+     },
+     "execution_count": 15,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df[\"mode\"].apply(lambda m: \"major\" if m == 1 else \"minor\")"
+   ]
+  },
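+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To confirm that the named function and the `lambda` behave the same way, here is a minimal sketch on a made-up `modes` Series (an illustrative stand-in for the `mode` column, not the lecture data):\n",
+    "\n",
+    "```python\n",
+    "import pandas as pd\n",
+    "\n",
+    "# hypothetical toy Series standing in for the mode column\n",
+    "modes = pd.Series([1, 0, 1], index=['a', 'b', 'c'], name='mode')\n",
+    "\n",
+    "def replace_mode(m):\n",
+    "    return 'major' if m == 1 else 'minor'\n",
+    "\n",
+    "# apply accepts any function object: a named function or a lambda\n",
+    "named = modes.apply(replace_mode)\n",
+    "anonymous = modes.apply(lambda m: 'major' if m == 1 else 'minor')\n",
+    "\n",
+    "# equals checks that both Series hold the same values and index\n",
+    "same = named.equals(anonymous)\n",
+    "```\n",
+    "\n",
+    "Neither call changes `modes`; each returns a new `Series` with the same index."
+   ]
+  },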
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Typically, transformed columns are added as new columns in the `DataFrame`.\n",
+    "Let's add a new `modified_mode` column."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>title</th>\n",
+       "      <th>song_name</th>\n",
+       "      <th>genre</th>\n",
+       "      <th>duration_ms</th>\n",
+       "      <th>key</th>\n",
+       "      <th>mode</th>\n",
+       "      <th>time_signature</th>\n",
+       "      <th>tempo</th>\n",
+       "      <th>acousticness</th>\n",
+       "      <th>danceability</th>\n",
+       "      <th>energy</th>\n",
+       "      <th>instrumentalness</th>\n",
+       "      <th>liveness</th>\n",
+       "      <th>loudness</th>\n",
+       "      <th>speechiness</th>\n",
+       "      <th>valence</th>\n",
+       "      <th>modified_mode</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>id</th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>7pgJBLVz5VmnL7uGHmRj6p</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Pathology</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>224427</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>115.080</td>\n",
+       "      <td>0.401000</td>\n",
+       "      <td>0.719</td>\n",
+       "      <td>0.493</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.1180</td>\n",
+       "      <td>-7.230</td>\n",
+       "      <td>0.0794</td>\n",
+       "      <td>0.1240</td>\n",
+       "      <td>major</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>0vSWgAlfpye0WCGeNmuNhy</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Symbiote</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>98821</td>\n",
+       "      <td>5</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>218.050</td>\n",
+       "      <td>0.013800</td>\n",
+       "      <td>0.850</td>\n",
+       "      <td>0.893</td>\n",
+       "      <td>0.000004</td>\n",
+       "      <td>0.3720</td>\n",
+       "      <td>-4.783</td>\n",
+       "      <td>0.0623</td>\n",
+       "      <td>0.0391</td>\n",
+       "      <td>major</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>7EL7ifncK2PWFYThJjzR25</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>BRAINFOOD</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>101172</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>189.938</td>\n",
+       "      <td>0.187000</td>\n",
+       "      <td>0.864</td>\n",
+       "      <td>0.365</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.1160</td>\n",
+       "      <td>-10.219</td>\n",
+       "      <td>0.0655</td>\n",
+       "      <td>0.0478</td>\n",
+       "      <td>major</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1umsRbM7L4ju7rn9aU8Ju6</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Sacrifice</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>96062</td>\n",
+       "      <td>10</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>139.990</td>\n",
+       "      <td>0.145000</td>\n",
+       "      <td>0.767</td>\n",
+       "      <td>0.576</td>\n",
+       "      <td>0.000003</td>\n",
+       "      <td>0.0968</td>\n",
+       "      <td>-9.683</td>\n",
+       "      <td>0.2560</td>\n",
+       "      <td>0.1870</td>\n",
+       "      <td>minor</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4SKqOHKYU5pgHr5UiVKiQN</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Backpack</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>135079</td>\n",
+       "      <td>5</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>128.014</td>\n",
+       "      <td>0.007700</td>\n",
+       "      <td>0.765</td>\n",
+       "      <td>0.726</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.6190</td>\n",
+       "      <td>-5.580</td>\n",
+       "      <td>0.1910</td>\n",
+       "      <td>0.2700</td>\n",
+       "      <td>major</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>...</th>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>46bXU7Sgj7104ZoXxzz9tM</th>\n",
+       "      <td>Euphoric Hardstyle</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>269208</td>\n",
+       "      <td>4</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>150.013</td>\n",
+       "      <td>0.031500</td>\n",
+       "      <td>0.528</td>\n",
+       "      <td>0.693</td>\n",
+       "      <td>0.000345</td>\n",
+       "      <td>0.1210</td>\n",
+       "      <td>-5.148</td>\n",
+       "      <td>0.0304</td>\n",
+       "      <td>0.3940</td>\n",
+       "      <td>major</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>0he2ViGMUO3ajKTxLOfWVT</th>\n",
+       "      <td>Greatest Hardstyle Playlist</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>210112</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>149.928</td>\n",
+       "      <td>0.022500</td>\n",
+       "      <td>0.517</td>\n",
+       "      <td>0.768</td>\n",
+       "      <td>0.000018</td>\n",
+       "      <td>0.2050</td>\n",
+       "      <td>-7.922</td>\n",
+       "      <td>0.0479</td>\n",
+       "      <td>0.3830</td>\n",
+       "      <td>minor</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>72DAt9Lbpy9EUS29OzQLob</th>\n",
+       "      <td>Best of Hardstyle 2020</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>234823</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>154.935</td>\n",
+       "      <td>0.026000</td>\n",
+       "      <td>0.361</td>\n",
+       "      <td>0.821</td>\n",
+       "      <td>0.000242</td>\n",
+       "      <td>0.3850</td>\n",
+       "      <td>-3.102</td>\n",
+       "      <td>0.0505</td>\n",
+       "      <td>0.1240</td>\n",
+       "      <td>major</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6HXgExFVuE1c3cq9QjFCcU</th>\n",
+       "      <td>Euphoric Hardstyle</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>323200</td>\n",
+       "      <td>6</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>150.042</td>\n",
+       "      <td>0.000551</td>\n",
+       "      <td>0.477</td>\n",
+       "      <td>0.921</td>\n",
+       "      <td>0.029600</td>\n",
+       "      <td>0.0575</td>\n",
+       "      <td>-4.777</td>\n",
+       "      <td>0.0392</td>\n",
+       "      <td>0.4880</td>\n",
+       "      <td>minor</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6MAAMZImxcvYhRnxDLTufD</th>\n",
+       "      <td>Best of Hardstyle 2020</td>\n",
+       "      <td>No Song Name</td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>162161</td>\n",
+       "      <td>9</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>155.047</td>\n",
+       "      <td>0.001890</td>\n",
+       "      <td>0.529</td>\n",
+       "      <td>0.945</td>\n",
+       "      <td>0.000055</td>\n",
+       "      <td>0.4140</td>\n",
+       "      <td>-5.862</td>\n",
+       "      <td>0.0615</td>\n",
+       "      <td>0.1340</td>\n",
+       "      <td>major</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "<p>35877 rows × 17 columns</p>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                                              title     song_name      genre  \\\n",
+       "id                                                                             \n",
+       "7pgJBLVz5VmnL7uGHmRj6p                          NaN     Pathology  Dark Trap   \n",
+       "0vSWgAlfpye0WCGeNmuNhy                          NaN      Symbiote  Dark Trap   \n",
+       "7EL7ifncK2PWFYThJjzR25                          NaN     BRAINFOOD  Dark Trap   \n",
+       "1umsRbM7L4ju7rn9aU8Ju6                          NaN     Sacrifice  Dark Trap   \n",
+       "4SKqOHKYU5pgHr5UiVKiQN                          NaN      Backpack  Dark Trap   \n",
+       "...                                             ...           ...        ...   \n",
+       "46bXU7Sgj7104ZoXxzz9tM           Euphoric Hardstyle  No Song Name  hardstyle   \n",
+       "0he2ViGMUO3ajKTxLOfWVT  Greatest Hardstyle Playlist  No Song Name  hardstyle   \n",
+       "72DAt9Lbpy9EUS29OzQLob       Best of Hardstyle 2020  No Song Name  hardstyle   \n",
+       "6HXgExFVuE1c3cq9QjFCcU           Euphoric Hardstyle  No Song Name  hardstyle   \n",
+       "6MAAMZImxcvYhRnxDLTufD       Best of Hardstyle 2020  No Song Name  hardstyle   \n",
+       "\n",
+       "                        duration_ms  key  mode  time_signature    tempo  \\\n",
+       "id                                                                        \n",
+       "7pgJBLVz5VmnL7uGHmRj6p       224427    8     1               4  115.080   \n",
+       "0vSWgAlfpye0WCGeNmuNhy        98821    5     1               4  218.050   \n",
+       "7EL7ifncK2PWFYThJjzR25       101172    8     1               4  189.938   \n",
+       "1umsRbM7L4ju7rn9aU8Ju6        96062   10     0               4  139.990   \n",
+       "4SKqOHKYU5pgHr5UiVKiQN       135079    5     1               4  128.014   \n",
+       "...                             ...  ...   ...             ...      ...   \n",
+       "46bXU7Sgj7104ZoXxzz9tM       269208    4     1               4  150.013   \n",
+       "0he2ViGMUO3ajKTxLOfWVT       210112    0     0               4  149.928   \n",
+       "72DAt9Lbpy9EUS29OzQLob       234823    8     1               4  154.935   \n",
+       "6HXgExFVuE1c3cq9QjFCcU       323200    6     0               4  150.042   \n",
+       "6MAAMZImxcvYhRnxDLTufD       162161    9     1               4  155.047   \n",
+       "\n",
+       "                        acousticness  danceability  energy  instrumentalness  \\\n",
+       "id                                                                             \n",
+       "7pgJBLVz5VmnL7uGHmRj6p      0.401000         0.719   0.493          0.000000   \n",
+       "0vSWgAlfpye0WCGeNmuNhy      0.013800         0.850   0.893          0.000004   \n",
+       "7EL7ifncK2PWFYThJjzR25      0.187000         0.864   0.365          0.000000   \n",
+       "1umsRbM7L4ju7rn9aU8Ju6      0.145000         0.767   0.576          0.000003   \n",
+       "4SKqOHKYU5pgHr5UiVKiQN      0.007700         0.765   0.726          0.000000   \n",
+       "...                              ...           ...     ...               ...   \n",
+       "46bXU7Sgj7104ZoXxzz9tM      0.031500         0.528   0.693          0.000345   \n",
+       "0he2ViGMUO3ajKTxLOfWVT      0.022500         0.517   0.768          0.000018   \n",
+       "72DAt9Lbpy9EUS29OzQLob      0.026000         0.361   0.821          0.000242   \n",
+       "6HXgExFVuE1c3cq9QjFCcU      0.000551         0.477   0.921          0.029600   \n",
+       "6MAAMZImxcvYhRnxDLTufD      0.001890         0.529   0.945          0.000055   \n",
+       "\n",
+       "                        liveness  loudness  speechiness  valence modified_mode  \n",
+       "id                                                                              \n",
+       "7pgJBLVz5VmnL7uGHmRj6p    0.1180    -7.230       0.0794   0.1240         major  \n",
+       "0vSWgAlfpye0WCGeNmuNhy    0.3720    -4.783       0.0623   0.0391         major  \n",
+       "7EL7ifncK2PWFYThJjzR25    0.1160   -10.219       0.0655   0.0478         major  \n",
+       "1umsRbM7L4ju7rn9aU8Ju6    0.0968    -9.683       0.2560   0.1870         minor  \n",
+       "4SKqOHKYU5pgHr5UiVKiQN    0.6190    -5.580       0.1910   0.2700         major  \n",
+       "...                          ...       ...          ...      ...           ...  \n",
+       "46bXU7Sgj7104ZoXxzz9tM    0.1210    -5.148       0.0304   0.3940         major  \n",
+       "0he2ViGMUO3ajKTxLOfWVT    0.2050    -7.922       0.0479   0.3830         minor  \n",
+       "72DAt9Lbpy9EUS29OzQLob    0.3850    -3.102       0.0505   0.1240         major  \n",
+       "6HXgExFVuE1c3cq9QjFCcU    0.0575    -4.777       0.0392   0.4880         minor  \n",
+       "6MAAMZImxcvYhRnxDLTufD    0.4140    -5.862       0.0615   0.1340         major  \n",
+       "\n",
+       "[35877 rows x 17 columns]"
+      ]
+     },
+     "execution_count": 16,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df[\"modified_mode\"] = df[\"mode\"].apply(lambda m: \"major\" if m == 1 else \"minor\")\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Let's go back to the original table from the SQL database, without the `modified_mode` column"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "metadata": {
+    "id": "ZoiyUleiyhMg"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>id</th>\n",
+       "      <th>title</th>\n",
+       "      <th>song_name</th>\n",
+       "      <th>genre</th>\n",
+       "      <th>duration_ms</th>\n",
+       "      <th>key</th>\n",
+       "      <th>mode</th>\n",
+       "      <th>time_signature</th>\n",
+       "      <th>tempo</th>\n",
+       "      <th>acousticness</th>\n",
+       "      <th>danceability</th>\n",
+       "      <th>energy</th>\n",
+       "      <th>instrumentalness</th>\n",
+       "      <th>liveness</th>\n",
+       "      <th>loudness</th>\n",
+       "      <th>speechiness</th>\n",
+       "      <th>valence</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>7pgJBLVz5VmnL7uGHmRj6p</td>\n",
+       "      <td></td>\n",
+       "      <td>Pathology</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>224427</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>115.080</td>\n",
+       "      <td>0.401000</td>\n",
+       "      <td>0.719</td>\n",
+       "      <td>0.493</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.1180</td>\n",
+       "      <td>-7.230</td>\n",
+       "      <td>0.0794</td>\n",
+       "      <td>0.1240</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>0vSWgAlfpye0WCGeNmuNhy</td>\n",
+       "      <td></td>\n",
+       "      <td>Symbiote</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>98821</td>\n",
+       "      <td>5</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>218.050</td>\n",
+       "      <td>0.013800</td>\n",
+       "      <td>0.850</td>\n",
+       "      <td>0.893</td>\n",
+       "      <td>0.000004</td>\n",
+       "      <td>0.3720</td>\n",
+       "      <td>-4.783</td>\n",
+       "      <td>0.0623</td>\n",
+       "      <td>0.0391</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>7EL7ifncK2PWFYThJjzR25</td>\n",
+       "      <td></td>\n",
+       "      <td>BRAINFOOD</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>101172</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>189.938</td>\n",
+       "      <td>0.187000</td>\n",
+       "      <td>0.864</td>\n",
+       "      <td>0.365</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.1160</td>\n",
+       "      <td>-10.219</td>\n",
+       "      <td>0.0655</td>\n",
+       "      <td>0.0478</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>1umsRbM7L4ju7rn9aU8Ju6</td>\n",
+       "      <td></td>\n",
+       "      <td>Sacrifice</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>96062</td>\n",
+       "      <td>10</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>139.990</td>\n",
+       "      <td>0.145000</td>\n",
+       "      <td>0.767</td>\n",
+       "      <td>0.576</td>\n",
+       "      <td>0.000003</td>\n",
+       "      <td>0.0968</td>\n",
+       "      <td>-9.683</td>\n",
+       "      <td>0.2560</td>\n",
+       "      <td>0.1870</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>4SKqOHKYU5pgHr5UiVKiQN</td>\n",
+       "      <td></td>\n",
+       "      <td>Backpack</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>135079</td>\n",
+       "      <td>5</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>128.014</td>\n",
+       "      <td>0.007700</td>\n",
+       "      <td>0.765</td>\n",
+       "      <td>0.726</td>\n",
+       "      <td>0.000000</td>\n",
+       "      <td>0.6190</td>\n",
+       "      <td>-5.580</td>\n",
+       "      <td>0.1910</td>\n",
+       "      <td>0.2700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>...</th>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35872</th>\n",
+       "      <td>46bXU7Sgj7104ZoXxzz9tM</td>\n",
+       "      <td>Euphoric Hardstyle</td>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>269208</td>\n",
+       "      <td>4</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>150.013</td>\n",
+       "      <td>0.031500</td>\n",
+       "      <td>0.528</td>\n",
+       "      <td>0.693</td>\n",
+       "      <td>0.000345</td>\n",
+       "      <td>0.1210</td>\n",
+       "      <td>-5.148</td>\n",
+       "      <td>0.0304</td>\n",
+       "      <td>0.3940</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35873</th>\n",
+       "      <td>0he2ViGMUO3ajKTxLOfWVT</td>\n",
+       "      <td>Greatest Hardstyle Playlist</td>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>210112</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>149.928</td>\n",
+       "      <td>0.022500</td>\n",
+       "      <td>0.517</td>\n",
+       "      <td>0.768</td>\n",
+       "      <td>0.000018</td>\n",
+       "      <td>0.2050</td>\n",
+       "      <td>-7.922</td>\n",
+       "      <td>0.0479</td>\n",
+       "      <td>0.3830</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35874</th>\n",
+       "      <td>72DAt9Lbpy9EUS29OzQLob</td>\n",
+       "      <td>Best of Hardstyle 2020</td>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>234823</td>\n",
+       "      <td>8</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>154.935</td>\n",
+       "      <td>0.026000</td>\n",
+       "      <td>0.361</td>\n",
+       "      <td>0.821</td>\n",
+       "      <td>0.000242</td>\n",
+       "      <td>0.3850</td>\n",
+       "      <td>-3.102</td>\n",
+       "      <td>0.0505</td>\n",
+       "      <td>0.1240</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35875</th>\n",
+       "      <td>6HXgExFVuE1c3cq9QjFCcU</td>\n",
+       "      <td>Euphoric Hardstyle</td>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>323200</td>\n",
+       "      <td>6</td>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>150.042</td>\n",
+       "      <td>0.000551</td>\n",
+       "      <td>0.477</td>\n",
+       "      <td>0.921</td>\n",
+       "      <td>0.029600</td>\n",
+       "      <td>0.0575</td>\n",
+       "      <td>-4.777</td>\n",
+       "      <td>0.0392</td>\n",
+       "      <td>0.4880</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35876</th>\n",
+       "      <td>6MAAMZImxcvYhRnxDLTufD</td>\n",
+       "      <td>Best of Hardstyle 2020</td>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>162161</td>\n",
+       "      <td>9</td>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>155.047</td>\n",
+       "      <td>0.001890</td>\n",
+       "      <td>0.529</td>\n",
+       "      <td>0.945</td>\n",
+       "      <td>0.000055</td>\n",
+       "      <td>0.4140</td>\n",
+       "      <td>-5.862</td>\n",
+       "      <td>0.0615</td>\n",
+       "      <td>0.1340</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "<p>35877 rows × 17 columns</p>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                           id                        title  song_name  \\\n",
+       "0      7pgJBLVz5VmnL7uGHmRj6p                               Pathology   \n",
+       "1      0vSWgAlfpye0WCGeNmuNhy                                Symbiote   \n",
+       "2      7EL7ifncK2PWFYThJjzR25                               BRAINFOOD   \n",
+       "3      1umsRbM7L4ju7rn9aU8Ju6                               Sacrifice   \n",
+       "4      4SKqOHKYU5pgHr5UiVKiQN                                Backpack   \n",
+       "...                       ...                          ...        ...   \n",
+       "35872  46bXU7Sgj7104ZoXxzz9tM           Euphoric Hardstyle              \n",
+       "35873  0he2ViGMUO3ajKTxLOfWVT  Greatest Hardstyle Playlist              \n",
+       "35874  72DAt9Lbpy9EUS29OzQLob       Best of Hardstyle 2020              \n",
+       "35875  6HXgExFVuE1c3cq9QjFCcU           Euphoric Hardstyle              \n",
+       "35876  6MAAMZImxcvYhRnxDLTufD       Best of Hardstyle 2020              \n",
+       "\n",
+       "           genre  duration_ms  key  mode  time_signature    tempo  \\\n",
+       "0      Dark Trap       224427    8     1               4  115.080   \n",
+       "1      Dark Trap        98821    5     1               4  218.050   \n",
+       "2      Dark Trap       101172    8     1               4  189.938   \n",
+       "3      Dark Trap        96062   10     0               4  139.990   \n",
+       "4      Dark Trap       135079    5     1               4  128.014   \n",
+       "...          ...          ...  ...   ...             ...      ...   \n",
+       "35872  hardstyle       269208    4     1               4  150.013   \n",
+       "35873  hardstyle       210112    0     0               4  149.928   \n",
+       "35874  hardstyle       234823    8     1               4  154.935   \n",
+       "35875  hardstyle       323200    6     0               4  150.042   \n",
+       "35876  hardstyle       162161    9     1               4  155.047   \n",
+       "\n",
+       "       acousticness  danceability  energy  instrumentalness  liveness  \\\n",
+       "0          0.401000         0.719   0.493          0.000000    0.1180   \n",
+       "1          0.013800         0.850   0.893          0.000004    0.3720   \n",
+       "2          0.187000         0.864   0.365          0.000000    0.1160   \n",
+       "3          0.145000         0.767   0.576          0.000003    0.0968   \n",
+       "4          0.007700         0.765   0.726          0.000000    0.6190   \n",
+       "...             ...           ...     ...               ...       ...   \n",
+       "35872      0.031500         0.528   0.693          0.000345    0.1210   \n",
+       "35873      0.022500         0.517   0.768          0.000018    0.2050   \n",
+       "35874      0.026000         0.361   0.821          0.000242    0.3850   \n",
+       "35875      0.000551         0.477   0.921          0.029600    0.0575   \n",
+       "35876      0.001890         0.529   0.945          0.000055    0.4140   \n",
+       "\n",
+       "       loudness  speechiness  valence  \n",
+       "0        -7.230       0.0794   0.1240  \n",
+       "1        -4.783       0.0623   0.0391  \n",
+       "2       -10.219       0.0655   0.0478  \n",
+       "3        -9.683       0.2560   0.1870  \n",
+       "4        -5.580       0.1910   0.2700  \n",
+       "...         ...          ...      ...  \n",
+       "35872    -5.148       0.0304   0.3940  \n",
+       "35873    -7.922       0.0479   0.3830  \n",
+       "35874    -3.102       0.0505   0.1240  \n",
+       "35875    -4.777       0.0392   0.4880  \n",
+       "35876    -5.862       0.0615   0.1340  \n",
+       "\n",
+       "[35877 rows x 17 columns]"
+      ]
+     },
+     "execution_count": 17,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df = qry(\"SELECT * FROM spotify\")\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Extract just the \"genre\" and \"duration_ms\" columns from `df`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 18,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>genre</th>\n",
+       "      <th>duration_ms</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>224427</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>98821</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>101172</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>96062</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>135079</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>...</th>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35872</th>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>269208</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35873</th>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>210112</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35874</th>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>234823</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35875</th>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>323200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35876</th>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>162161</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "<p>35877 rows × 2 columns</p>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "           genre  duration_ms\n",
+       "0      Dark Trap       224427\n",
+       "1      Dark Trap        98821\n",
+       "2      Dark Trap       101172\n",
+       "3      Dark Trap        96062\n",
+       "4      Dark Trap       135079\n",
+       "...          ...          ...\n",
+       "35872  hardstyle       269208\n",
+       "35873  hardstyle       210112\n",
+       "35874  hardstyle       234823\n",
+       "35875  hardstyle       323200\n",
+       "35876  hardstyle       162161\n",
+       "\n",
+       "[35877 rows x 2 columns]"
+      ]
+     },
+     "execution_count": 18,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df[[\"genre\", \"duration_ms\"]]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### `pandas.DataFrame.groupby(...)`\n",
+    "\n",
+    "Syntax: `DataFrame.groupby(<COLUMN>)`\n",
+    "- Returns a `DataFrameGroupBy` object, which only records how the rows are split into groups\n",
+    "- To get actual values, call an aggregation method such as `mean()` on the returned object"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 19,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 551
+    },
+    "id": "trRMgGMysdkb",
+    "outputId": "d02098c3-7722-4505-c599-5897bb8ace19"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "<pandas.core.groupby.generic.DataFrameGroupBy object at 0x7fbc472bad90>"
+      ]
+     },
+     "execution_count": 19,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df[[\"genre\", \"duration_ms\"]].groupby(\"genre\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### What is the average duration for each genre, in decreasing order of average duration?\n",
+    "#### v1: using `df` (`pandas`) to answer the question"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 20,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>duration_ms</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>genre</th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>Dark Trap</th>\n",
+       "      <td>196059.938997</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Emo</th>\n",
+       "      <td>218370.989519</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Hiphop</th>\n",
+       "      <td>227885.028411</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Pop</th>\n",
+       "      <td>211558.052980</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Rap</th>\n",
+       "      <td>200816.798836</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>RnB</th>\n",
+       "      <td>225628.556955</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Trap Metal</th>\n",
+       "      <td>145940.519467</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Underground Rap</th>\n",
+       "      <td>175506.191224</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>dnb</th>\n",
+       "      <td>288860.138811</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>hardstyle</th>\n",
+       "      <td>232828.626542</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>psytrance</th>\n",
+       "      <td>445770.492075</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>techhouse</th>\n",
+       "      <td>298395.587596</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>techno</th>\n",
+       "      <td>399123.187453</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>trance</th>\n",
+       "      <td>288729.366262</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>trap</th>\n",
+       "      <td>225149.277731</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                   duration_ms\n",
+       "genre                         \n",
+       "Dark Trap        196059.938997\n",
+       "Emo              218370.989519\n",
+       "Hiphop           227885.028411\n",
+       "Pop              211558.052980\n",
+       "Rap              200816.798836\n",
+       "RnB              225628.556955\n",
+       "Trap Metal       145940.519467\n",
+       "Underground Rap  175506.191224\n",
+       "dnb              288860.138811\n",
+       "hardstyle        232828.626542\n",
+       "psytrance        445770.492075\n",
+       "techhouse        298395.587596\n",
+       "techno           399123.187453\n",
+       "trance           288729.366262\n",
+       "trap             225149.277731"
+      ]
+     },
+     "execution_count": 20,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df[[\"genre\", \"duration_ms\"]].groupby(\"genre\").mean()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 21,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>duration_ms</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>genre</th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>psytrance</th>\n",
+       "      <td>445770.492075</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>techno</th>\n",
+       "      <td>399123.187453</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>techhouse</th>\n",
+       "      <td>298395.587596</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>dnb</th>\n",
+       "      <td>288860.138811</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>trance</th>\n",
+       "      <td>288729.366262</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>hardstyle</th>\n",
+       "      <td>232828.626542</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Hiphop</th>\n",
+       "      <td>227885.028411</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>RnB</th>\n",
+       "      <td>225628.556955</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>trap</th>\n",
+       "      <td>225149.277731</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Emo</th>\n",
+       "      <td>218370.989519</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Pop</th>\n",
+       "      <td>211558.052980</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Rap</th>\n",
+       "      <td>200816.798836</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Dark Trap</th>\n",
+       "      <td>196059.938997</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Underground Rap</th>\n",
+       "      <td>175506.191224</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Trap Metal</th>\n",
+       "      <td>145940.519467</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                   duration_ms\n",
+       "genre                         \n",
+       "psytrance        445770.492075\n",
+       "techno           399123.187453\n",
+       "techhouse        298395.587596\n",
+       "dnb              288860.138811\n",
+       "trance           288729.366262\n",
+       "hardstyle        232828.626542\n",
+       "Hiphop           227885.028411\n",
+       "RnB              225628.556955\n",
+       "trap             225149.277731\n",
+       "Emo              218370.989519\n",
+       "Pop              211558.052980\n",
+       "Rap              200816.798836\n",
+       "Dark Trap        196059.938997\n",
+       "Underground Rap  175506.191224\n",
+       "Trap Metal       145940.519467"
+      ]
+     },
+     "execution_count": 21,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df[[\"genre\", \"duration_ms\"]].groupby(\"genre\").mean().sort_values(by=\"duration_ms\", ascending=False)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "One way to sanity-check the groups that `groupby` forms is to call `value_counts` on the same column `Series`: it lists each genre along with the number of rows that fall into that genre's group."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 22,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Underground Rap    4330\n",
+       "Dark Trap          3590\n",
+       "Hiphop             3027\n",
+       "trance             2804\n",
+       "psytrance          2650\n",
+       "techno             2646\n",
+       "dnb                2507\n",
+       "trap               2362\n",
+       "hardstyle          2351\n",
+       "techhouse          2209\n",
+       "RnB                1905\n",
+       "Trap Metal         1875\n",
+       "Emo                1622\n",
+       "Rap                1546\n",
+       "Pop                 453\n",
+       "Name: genre, dtype: int64"
+      ]
+     },
+     "execution_count": 22,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df[\"genre\"].value_counts()"
+   ]
+  },
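+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The check above can be made explicit by comparing group sizes directly. The following is a minimal sketch on a small made-up table (the `toy` `DataFrame` and its values are assumptions for illustration, not part of the spotify data): `groupby(...).size()` and `value_counts()` should report the same number of rows for every genre."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "\n",
+    "# Hypothetical toy data (not the spotify table), small enough to count by hand\n",
+    "toy = pd.DataFrame({\n",
+    "    \"genre\": [\"trap\", \"techno\", \"trap\", \"Pop\", \"techno\", \"trap\"],\n",
+    "    \"duration_ms\": [200000, 400000, 220000, 210000, 380000, 240000],\n",
+    "})\n",
+    "\n",
+    "# Number of rows per genre, computed two different ways\n",
+    "group_sizes = toy.groupby(\"genre\").size().to_dict()\n",
+    "counts = toy[\"genre\"].value_counts().to_dict()\n",
+    "\n",
+    "# The two results are sorted differently (by key vs. by count),\n",
+    "# so compare them as dictionaries; True when every group size agrees\n",
+    "group_sizes == counts"
+   ]
+  },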
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### What is the average duration for each genre, in decreasing order of average duration?\n",
+    "#### v2: using SQL query to answer the question"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 23,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 551
+    },
+    "id": "89hMTXCKxWG8",
+    "outputId": "5737da11-aa8a-4ed0-9b05-cd379b28904b"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>avg_duration</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>genre</th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>psytrance</th>\n",
+       "      <td>445770.492075</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>techno</th>\n",
+       "      <td>399123.187453</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>techhouse</th>\n",
+       "      <td>298395.587596</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>dnb</th>\n",
+       "      <td>288860.138811</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>trance</th>\n",
+       "      <td>288729.366262</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>hardstyle</th>\n",
+       "      <td>232828.626542</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Hiphop</th>\n",
+       "      <td>227885.028411</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>RnB</th>\n",
+       "      <td>225628.556955</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>trap</th>\n",
+       "      <td>225149.277731</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Emo</th>\n",
+       "      <td>218370.989519</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Pop</th>\n",
+       "      <td>211558.052980</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Rap</th>\n",
+       "      <td>200816.798836</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Dark Trap</th>\n",
+       "      <td>196059.938997</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Underground Rap</th>\n",
+       "      <td>175506.191224</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Trap Metal</th>\n",
+       "      <td>145940.519467</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                  avg_duration\n",
+       "genre                         \n",
+       "psytrance        445770.492075\n",
+       "techno           399123.187453\n",
+       "techhouse        298395.587596\n",
+       "dnb              288860.138811\n",
+       "trance           288729.366262\n",
+       "hardstyle        232828.626542\n",
+       "Hiphop           227885.028411\n",
+       "RnB              225628.556955\n",
+       "trap             225149.277731\n",
+       "Emo              218370.989519\n",
+       "Pop              211558.052980\n",
+       "Rap              200816.798836\n",
+       "Dark Trap        196059.938997\n",
+       "Underground Rap  175506.191224\n",
+       "Trap Metal       145940.519467"
+      ]
+     },
+     "execution_count": 23,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# SQL equivalent query of the above Pandas query\n",
+    "avg_duration_per_genre = qry(\"\"\"\n",
+    "SELECT genre, AVG(duration_ms) as avg_duration\n",
+    "FROM spotify \n",
+    "GROUP BY genre\n",
+    "ORDER BY avg_duration DESC\n",
+    "\"\"\")\n",
+    "\n",
+    "# How can we make the SQL query output exactly the same as the df.groupby output?\n",
+    "avg_duration_per_genre = avg_duration_per_genre.set_index(\"genre\")\n",
+    "avg_duration_per_genre"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "12ZdqYoIy_8U"
+   },
+   "source": [
+    "### What is the average speechiness for each mode, time signature pair?\n",
+    "#### v1: pandas"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 24,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 332
+    },
+    "id": "fVejD2KPyveX",
+    "outputId": "fe5c8fda-29a2-4f1a-8ff4-de9ad2a3cde0"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "      <th>speechiness</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>mode</th>\n",
+       "      <th>time_signature</th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th rowspan=\"4\" valign=\"top\">0</th>\n",
+       "      <th>1</th>\n",
+       "      <td>0.181224</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>0.121837</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>0.126688</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>0.204890</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th rowspan=\"4\" valign=\"top\">1</th>\n",
+       "      <th>1</th>\n",
+       "      <td>0.173138</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>0.129512</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>0.139170</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>0.220177</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                     speechiness\n",
+       "mode time_signature             \n",
+       "0    1                  0.181224\n",
+       "     3                  0.121837\n",
+       "     4                  0.126688\n",
+       "     5                  0.204890\n",
+       "1    1                  0.173138\n",
+       "     3                  0.129512\n",
+       "     4                  0.139170\n",
+       "     5                  0.220177"
+      ]
+     },
+     "execution_count": 24,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# use a list to indicate all the columns you want to group by\n",
+    "df[[\"mode\", \"time_signature\", \"speechiness\"]].groupby([\"mode\", \"time_signature\"]).mean()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 25,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 300
+    },
+    "id": "ImYEuOMox-ps",
+    "outputId": "2674dabd-3ff7-4099-fdc3-54e5ba0e2628"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>mode</th>\n",
+       "      <th>time_signature</th>\n",
+       "      <th>avg_speechiness</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0.181224</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>0.121837</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>0</td>\n",
+       "      <td>4</td>\n",
+       "      <td>0.126688</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>0</td>\n",
+       "      <td>5</td>\n",
+       "      <td>0.204890</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0.173138</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>1</td>\n",
+       "      <td>3</td>\n",
+       "      <td>0.129512</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6</th>\n",
+       "      <td>1</td>\n",
+       "      <td>4</td>\n",
+       "      <td>0.139170</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>7</th>\n",
+       "      <td>1</td>\n",
+       "      <td>5</td>\n",
+       "      <td>0.220177</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "   mode  time_signature  avg_speechiness\n",
+       "0     0               1         0.181224\n",
+       "1     0               3         0.121837\n",
+       "2     0               4         0.126688\n",
+       "3     0               5         0.204890\n",
+       "4     1               1         0.173138\n",
+       "5     1               3         0.129512\n",
+       "6     1               4         0.139170\n",
+       "7     1               5         0.220177"
+      ]
+     },
+     "execution_count": 25,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# SQL equivalent query of the above Pandas query\n",
+    "qry(\"\"\"\n",
+    "SELECT mode, time_signature, AVG(speechiness) as avg_speechiness\n",
+    "FROM spotify \n",
+    "GROUP BY mode, time_signature\n",
+    "\"\"\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "sEDc5zGu0bc9"
+   },
+   "source": [
+    "### Self-practice"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Which songs have a tempo greater than 150 and what are their genres?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 26,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>song_name</th>\n",
+       "      <th>genre</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>Symbiote</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>BRAINFOOD</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>18</th>\n",
+       "      <td>FunnyToSeeYouHere</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>19</th>\n",
+       "      <td>Killer</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>20</th>\n",
+       "      <td>608</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>...</th>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35871</th>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35872</th>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35874</th>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35875</th>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>35876</th>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "<p>13753 rows × 2 columns</p>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "               song_name      genre\n",
+       "1               Symbiote  Dark Trap\n",
+       "2              BRAINFOOD  Dark Trap\n",
+       "18     FunnyToSeeYouHere  Dark Trap\n",
+       "19                Killer  Dark Trap\n",
+       "20                   608  Dark Trap\n",
+       "...                  ...        ...\n",
+       "35871                     hardstyle\n",
+       "35872                     hardstyle\n",
+       "35874                     hardstyle\n",
+       "35875                     hardstyle\n",
+       "35876                     hardstyle\n",
+       "\n",
+       "[13753 rows x 2 columns]"
+      ]
+     },
+     "execution_count": 26,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# v1: pandas\n",
+    "fast_songs = df[df[\"tempo\"] > 150]\n",
+    "fast_songs[[\"song_name\", \"genre\"]]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 27,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>song_name</th>\n",
+       "      <th>genre</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>Symbiote</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>BRAINFOOD</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>FunnyToSeeYouHere</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>Killer</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>608</td>\n",
+       "      <td>Dark Trap</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>...</th>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>13748</th>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>13749</th>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>13750</th>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>13751</th>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>13752</th>\n",
+       "      <td></td>\n",
+       "      <td>hardstyle</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "<p>13753 rows × 2 columns</p>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "               song_name      genre\n",
+       "0               Symbiote  Dark Trap\n",
+       "1              BRAINFOOD  Dark Trap\n",
+       "2      FunnyToSeeYouHere  Dark Trap\n",
+       "3                 Killer  Dark Trap\n",
+       "4                    608  Dark Trap\n",
+       "...                  ...        ...\n",
+       "13748                     hardstyle\n",
+       "13749                     hardstyle\n",
+       "13750                     hardstyle\n",
+       "13751                     hardstyle\n",
+       "13752                     hardstyle\n",
+       "\n",
+       "[13753 rows x 2 columns]"
+      ]
+     },
+     "execution_count": 27,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# v2: SQL\n",
+    "\n",
+    "qry(\"\"\"\n",
+    "SELECT song_name, genre\n",
+    "FROM spotify\n",
+    "WHERE tempo > 150\n",
+    "\"\"\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### What is the sum of danceability and liveness for \"Hiphop\" genre songs?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 28,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "15321    0.8416\n",
+       "15322    0.9201\n",
+       "15323    0.8580\n",
+       "15324    0.8240\n",
+       "15325    0.9348\n",
+       "          ...  \n",
+       "18343    0.6690\n",
+       "18344    0.5370\n",
+       "18345    0.8850\n",
+       "18346    0.8770\n",
+       "18347    0.8703\n",
+       "Length: 3027, dtype: float64"
+      ]
+     },
+     "execution_count": 28,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# v1: pandas\n",
+    "hiphop_songs = df[df[\"genre\"] == \"Hiphop\"]\n",
+    "hiphop_songs[\"danceability\"] + hiphop_songs[\"liveness\"]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 29,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "0       0.8416\n",
+       "1       0.9201\n",
+       "2       0.8580\n",
+       "3       0.8240\n",
+       "4       0.9348\n",
+       "         ...  \n",
+       "3022    0.6690\n",
+       "3023    0.5370\n",
+       "3024    0.8850\n",
+       "3025    0.8770\n",
+       "3026    0.8703\n",
+       "Name: song_score, Length: 3027, dtype: float64"
+      ]
+     },
+     "execution_count": 29,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# v2: SQL\n",
+    "hiphop_songs = qry(\"\"\"\n",
+    "SELECT danceability + liveness as song_score\n",
+    "FROM spotify\n",
+    "WHERE genre = 'Hiphop'\n",
+    "\"\"\")\n",
+    "hiphop_songs[\"song_score\"]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Find all song_name values in ascending order of duration_ms. Eliminate songs that don't have a song_name."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 30,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v1: pandas\n",
+    "songs_by_duration = list(df.sort_values(by = \"duration_ms\")[\"song_name\"])\n",
+    "# [song for song in songs_by_duration if song != \"\"] # uncomment to see the output"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 31,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v2: SQL\n",
+    "songs_by_duration = qry(\"\"\"\n",
+    "SELECT song_name\n",
+    "FROM spotify\n",
+    "ORDER BY duration_ms\n",
+    "\"\"\")\n",
+    "songs_by_duration = list(songs_by_duration[\"song_name\"])\n",
+    "# [song for song in songs_by_duration if song != \"\"] # uncomment to see the output"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### How many distinct \"genre\"s are there in the dataset?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 32,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "['trance',\n",
+       " 'techno',\n",
+       " 'dnb',\n",
+       " 'Trap Metal',\n",
+       " 'RnB',\n",
+       " 'Pop',\n",
+       " 'psytrance',\n",
+       " 'techhouse',\n",
+       " 'trap',\n",
+       " 'Dark Trap',\n",
+       " 'Emo',\n",
+       " 'Underground Rap',\n",
+       " 'Rap',\n",
+       " 'Hiphop',\n",
+       " 'hardstyle']"
+      ]
+     },
+     "execution_count": 32,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# v1: pandas\n",
+    "list(set(list(df[\"genre\"])))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 33,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "['Dark Trap',\n",
+       " 'Underground Rap',\n",
+       " 'Trap Metal',\n",
+       " 'Emo',\n",
+       " 'Rap',\n",
+       " 'RnB',\n",
+       " 'Pop',\n",
+       " 'Hiphop',\n",
+       " 'techhouse',\n",
+       " 'techno',\n",
+       " 'trance',\n",
+       " 'psytrance',\n",
+       " 'trap',\n",
+       " 'dnb',\n",
+       " 'hardstyle']"
+      ]
+     },
+     "execution_count": 33,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# v2: SQL\n",
+    "genres = qry(\"\"\"\n",
+    "SELECT DISTINCT genre\n",
+    "FROM spotify\n",
+    "\"\"\")\n",
+    "list(genres[\"genre\"])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Considering only songs with energy greater than 0.5, what is the maximum energy for each \"genre\" with song count greater than 2000?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 34,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "genre\n",
+       "Dark Trap          0.998\n",
+       "Emo                0.995\n",
+       "Hiphop             0.978\n",
+       "Pop                0.977\n",
+       "Rap                0.980\n",
+       "RnB                0.974\n",
+       "Trap Metal         0.999\n",
+       "Underground Rap    0.997\n",
+       "dnb                0.999\n",
+       "hardstyle          0.999\n",
+       "psytrance          0.999\n",
+       "techhouse          0.999\n",
+       "techno             1.000\n",
+       "trance             1.000\n",
+       "trap               1.000\n",
+       "Name: energy, dtype: float64"
+      ]
+     },
+     "execution_count": 34,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# v1: pandas\n",
+    "high_energy_songs = df[df[\"energy\"] > 0.5]\n",
+    "genre_groups = high_energy_songs[[\"genre\", \"energy\"]].groupby(\"genre\")\n",
+    "max_energy = genre_groups.max()\n",
+    "max_energy[\"energy\"]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 35,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>energy</th>\n",
+       "      <th>energy_max</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>genre</th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>Dark Trap</th>\n",
+       "      <td>2757</td>\n",
+       "      <td>0.998</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Hiphop</th>\n",
+       "      <td>2497</td>\n",
+       "      <td>0.978</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Underground Rap</th>\n",
+       "      <td>3420</td>\n",
+       "      <td>0.997</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>dnb</th>\n",
+       "      <td>2496</td>\n",
+       "      <td>0.999</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>hardstyle</th>\n",
+       "      <td>2345</td>\n",
+       "      <td>0.999</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>psytrance</th>\n",
+       "      <td>2642</td>\n",
+       "      <td>0.999</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>techhouse</th>\n",
+       "      <td>2164</td>\n",
+       "      <td>0.999</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>techno</th>\n",
+       "      <td>2534</td>\n",
+       "      <td>1.000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>trance</th>\n",
+       "      <td>2786</td>\n",
+       "      <td>1.000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>trap</th>\n",
+       "      <td>2346</td>\n",
+       "      <td>1.000</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                 energy  energy_max\n",
+       "genre                              \n",
+       "Dark Trap          2757       0.998\n",
+       "Hiphop             2497       0.978\n",
+       "Underground Rap    3420       0.997\n",
+       "dnb                2496       0.999\n",
+       "hardstyle          2345       0.999\n",
+       "psytrance          2642       0.999\n",
+       "techhouse          2164       0.999\n",
+       "techno             2534       1.000\n",
+       "trance             2786       1.000\n",
+       "trap               2346       1.000"
+      ]
+     },
+     "execution_count": 35,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "genre_counts = genre_groups.count()\n",
+    "genre_counts[\"energy_max\"] = max_energy[\"energy\"]\n",
+    "filtered_genre_counts = genre_counts[genre_counts[\"energy\"] > 2000]\n",
+    "filtered_genre_counts"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 36,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>genre</th>\n",
+       "      <th>song_count</th>\n",
+       "      <th>energy_max</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>Dark Trap</td>\n",
+       "      <td>2757</td>\n",
+       "      <td>0.998</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>Hiphop</td>\n",
+       "      <td>2497</td>\n",
+       "      <td>0.978</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>Underground Rap</td>\n",
+       "      <td>3420</td>\n",
+       "      <td>0.997</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>dnb</td>\n",
+       "      <td>2496</td>\n",
+       "      <td>0.999</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>hardstyle</td>\n",
+       "      <td>2345</td>\n",
+       "      <td>0.999</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>psytrance</td>\n",
+       "      <td>2642</td>\n",
+       "      <td>0.999</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6</th>\n",
+       "      <td>techhouse</td>\n",
+       "      <td>2164</td>\n",
+       "      <td>0.999</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>7</th>\n",
+       "      <td>techno</td>\n",
+       "      <td>2534</td>\n",
+       "      <td>1.000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>8</th>\n",
+       "      <td>trance</td>\n",
+       "      <td>2786</td>\n",
+       "      <td>1.000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>9</th>\n",
+       "      <td>trap</td>\n",
+       "      <td>2346</td>\n",
+       "      <td>1.000</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "             genre  song_count  energy_max\n",
+       "0        Dark Trap        2757       0.998\n",
+       "1           Hiphop        2497       0.978\n",
+       "2  Underground Rap        3420       0.997\n",
+       "3              dnb        2496       0.999\n",
+       "4        hardstyle        2345       0.999\n",
+       "5        psytrance        2642       0.999\n",
+       "6        techhouse        2164       0.999\n",
+       "7           techno        2534       1.000\n",
+       "8           trance        2786       1.000\n",
+       "9             trap        2346       1.000"
+      ]
+     },
+     "execution_count": 36,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# v2: SQL\n",
+    "qry(\"\"\"\n",
+    "SELECT genre, COUNT(*) as song_count, MAX(energy) as energy_max\n",
+    "FROM spotify\n",
+    "WHERE energy > 0.5\n",
+    "GROUP BY genre\n",
+    "HAVING song_count > 2000\n",
+    "\"\"\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 37,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Close the database connection here\n",
+    "conn.close()"
+   ]
+  }
+ ],
+ "metadata": {
+  "colab": {
+   "provenance": []
+  },
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 1
+}
diff --git a/f23/Gurmail_Lecture_Notes/37N_Advanced_pandas_topics/lec_37_pandas3_data_transformation_template_Gurmail_lec1.ipynb b/f23/Gurmail_Lecture_Notes/37N_Advanced_pandas_topics/lec_37_pandas3_data_transformation_template_Gurmail_lec1.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..95dcf3a346b7bba48cd85a92613f73ad24c99f0a
--- /dev/null
+++ b/f23/Gurmail_Lecture_Notes/37N_Advanced_pandas_topics/lec_37_pandas3_data_transformation_template_Gurmail_lec1.ipynb
@@ -0,0 +1,810 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Announcements - Wednesday, December 6\n",
+    "* Download ALL files for today's lecture\n",
+    "* Q10 Released tonight at 5 pm\n",
+    "* <b>If you have any problem with P8-P11 grades, please send me (Gurmail.Singh@wisc.edu) an email by December 11.</b>\n",
+    "* Late days may not be used on P13\n",
+    "* If you have questions, it is almost always faster to \n",
+    "  * Post on Piazza\n",
+    "  * Go to [office hours](https://sites.google.com/wisc.edu/cs220-oh-f23/home?pli=1) \n",
+    "### Conflict Form\n",
+    "  * [Final - December 19, 7:45 am](https://cs220.cs.wisc.edu/f23/surveys.html)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "RHvDCo4fhXBx"
+   },
+   "source": [
+    "# Lecture 37 Pandas 3: Data Transformation\n",
+    "* Data transformation is the process of changing the format, structure, or values of data. \n",
+    "* Often needed during data cleaning and sometimes during data analysis"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "yoLGptrqhbBo"
+   },
+   "source": [
+    "# Today's Learning Objectives: \n",
+    "\n",
+    "* Set a column as the index of a pandas `DataFrame`\n",
+    "* Identify, drop, or fill missing values (`np.NaN`) using pandas `isna`, `dropna`, and `fillna`\n",
+    "* Apply transformations to a `DataFrame`:\n",
+    "  * Use `apply` on a pandas `Series` to apply a transformation function\n",
+    "  * Use `replace` to replace all target values in pandas `Series` and `DataFrame` rows / columns\n",
+    "* Filter, aggregate, group, and summarize information in a `DataFrame` with `groupby`\n",
+    "* Convert `groupby` examples to SQL\n",
+    "* Solve the same question using SQL and pandas `DataFrame` manipulations:\n",
+    "  * filtering, grouping, and aggregation / summarization"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "CeWtFirwteFY"
+   },
+   "outputs": [],
+   "source": [
+    "# known import statements\n",
+    "import pandas as pd\n",
+    "import sqlite3 as sql # note that we are renaming to sql\n",
+    "import os\n",
+    "\n",
+    "# new import statement\n",
+    "import numpy as np"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "FgnTeNRIswsm"
+   },
+   "source": [
+    "# The dataset: Spotify songs\n",
+    "Adapted from https://www.kaggle.com/datasets/mrmorj/dataset-of-songs-in-spotify.\n",
+    "\n",
+    "If you are interested in digging deeper into this dataset, here's a [blog post](https://medium.com/@boplantinga/what-do-spotifys-audio-features-tell-us-about-this-year-s-eurovision-song-contest-66ad188e112a) that explains each column in detail."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### WARMUP 1: Establish a connection to the spotify.db database"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 232
+    },
+    "id": "8y9scvgCnTHl",
+    "outputId": "c72388f8-576c-4cf2-ef51-352cd11b6c92"
+   },
+   "outputs": [],
+   "source": [
+    "# open up the spotify database\n",
+    "db_pathname = \"spotify.db\"\n",
+    "assert ???\n",
+    "conn = sql.connect(db_pathname)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def qry(sql):\n",
+    "    return pd.read_sql(sql, conn)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### WARMUP 2: Identify the table name(s) inside the database"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 112
+    },
+    "id": "ybTqbDSOnR2f",
+    "outputId": "8dcc943b-9382-4abb-ef78-6c6d56ad89eb"
+   },
+   "outputs": [],
+   "source": [
+    "df = qry(\"\")\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### WARMUP 3: Use a pandas lookup expression to extract the \"sql\" column and display the full query using an `.iloc` lookup"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### WARMUP 4: Store the data from the `spotify` table in a variable called `df`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 632
+    },
+    "id": "txAH9OIjnoQv",
+    "outputId": "ac9152ba-32df-4fb2-d4e0-a97f50fe58fb"
+   },
+   "outputs": [],
+   "source": [
+    "df = qry(\"\")\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Setting a column as row indices for the `DataFrame`\n",
+    "\n",
+    "- Syntax: `df.set_index(\"<COLUMN>\")`\n",
+    "- Returns a new DataFrame object instance reference.\n",
+    "- WARNING: executing this twice raises a `KeyError`, because once a column becomes the row index it is no longer a column of the `DataFrame`. If that happens, re-run the cell above to reload `df`, then run the cell below exactly once."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Set the id column as row indices\n",
+    "df = \n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Not a Number\n",
+    "\n",
+    "- `np.NaN` (an alias of `np.nan`; NumPy 2.0 removed the alias, so use `np.nan` there) is the floating point representation of Not a Number\n",
+    "- You do not need to know / learn the details about the `numpy` package \n",
+    "\n",
+    "### Replacing / modifying values within the `DataFrame`\n",
+    "\n",
+    "Syntax: `df.replace(<TARGET>, <REPLACE>)`\n",
+    "- Your target can be a `str`, `int`, `float`, or `None` (there are other possibilities, but those are too advanced for this course)\n",
+    "- Returns a new DataFrame object instance reference.\n",
+    "\n",
+    "Let's now replace the missing values (empty strings) with `np.NaN`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df = \n",
+    "df.head(10) # title is the album name"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Checking for missing values\n",
+    "\n",
+    "Syntax: `Series.isna()`\n",
+    "- Returns a boolean Series\n",
+    "\n",
+    "Let's check whether any `song_name` values are missing"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "JqzSwG5PEZRq",
+    "outputId": "05529a3d-4a5c-4654-fe05-d04b2c10ae6c"
+   },
+   "outputs": [],
+   "source": [
+    "df[\"song_name\"]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Review: `pandas.Series.value_counts()`\n",
+    "- Returns a new `Series` with the unique values from the original `Series` as keys and the count of each unique value as values. \n",
+    "- The returned `Series` is sorted in descending order of counts"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "uCLDr8EIGMeJ",
+    "outputId": "241d6181-d525-4019-a8f2-689939b2ab33"
+   },
+   "outputs": [],
+   "source": [
+    "# count the number of missing values for song name\n",
+    "df[\"song_name\"]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Missing value manipulation\n",
+    "Syntax: `df.fillna(<REPLACE>)`\n",
+    "- Returns a new DataFrame object instance reference."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "pJ2CIqq9HWvN",
+    "outputId": "2895e862-18e5-4742-9750-31b130aae668"
+   },
+   "outputs": [],
+   "source": [
+    "# use .fillna to replace missing values\n",
+    "df[\"song_name\"]\n",
+    "\n",
+    "# to replace the original DataFrame's column, you need to explicitly update that object instance\n",
+    "# TODO: uncomment the below lines and update the code\n",
+    "#df[\"song_name\"] = ???\n",
+    "#df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Dropping missing values\n",
+    "Syntax: `df.dropna()`\n",
+    "- Returns a new DataFrame object instance reference."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 145
+    },
+    "id": "O_1ZeHG8N-rB",
+    "outputId": "3b112da2-2b3c-4fb8-c7ae-dc2f2127856d"
+   },
+   "outputs": [],
+   "source": [
+    "# .dropna drops every row that contains at least one NaN\n",
+    "df.dropna()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "ggttXEqUbI_E"
+   },
+   "source": [
+    "### Review: `pandas.Series.apply(...)`\n",
+    "Syntax: `Series.apply(<FUNCTION OBJECT REFERENCE>)`\n",
+    "- Applies the input function to every element of the `Series`.\n",
+    "- Returns a new `Series` object instance reference.\n",
+    "\n",
+    "Let's apply a transformation function to the `mode` column `Series`:\n",
+    "- mode = 1 means major modality (sounds happy)\n",
+    "- mode = 0 means minor modality (sounds sad)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def replace_mode(m): \n",
+    "    if m == 1: \n",
+    "        return \"major\"\n",
+    "    else: \n",
+    "        return \"minor\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df[\"mode\"]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### `lambda`\n",
+    "\n",
+    "Let's write a `lambda` function instead of the `replace_mode` function"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "9AJ3p-_TarnN",
+    "outputId": "a087df5d-2002-417c-e99c-5e6fc8ea9809"
+   },
+   "outputs": [],
+   "source": [
+    "df[\"mode\"].apply(???)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Typically transformed columns are added as new columns within the DataFrame.\n",
+    "Let's add a new `modified_mode` column."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df[\"modified_mode\"] = df[\"mode\"].apply(lambda m: \"major\" if m == 1 else \"minor\")\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Let's go back to the original table from the SQL database"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "ZoiyUleiyhMg"
+   },
+   "outputs": [],
+   "source": [
+    "df = qry(\"SELECT * FROM spotify\")\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Extract just the \"genre\" and \"duration_ms\" columns from `df`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df[???]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### `pandas.DataFrame.groupby(...)`\n",
+    "\n",
+    "Syntax: `DataFrame.groupby(<COLUMN>)`\n",
+    "- Returns a `DataFrameGroupBy` object instance reference\n",
+    "- You need to apply an aggregation method (such as `mean` or `count`) to get results from the return value of `groupby`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 551
+    },
+    "id": "trRMgGMysdkb",
+    "outputId": "d02098c3-7722-4505-c599-5897bb8ace19"
+   },
+   "outputs": [],
+   "source": [
+    "df[[\"genre\", \"duration_ms\"]]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### What is the average duration for each genre, in decreasing order of average duration?\n",
+    "#### v1: using `df` (`pandas`) to answer the question"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df[[\"genre\", \"duration_ms\"]]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df[[\"genre\", \"duration_ms\"]]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "One way to check whether `groupby` works would be to use `value_counts` on the same column `Series`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df[\"genre\"].value_counts()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### What is the average duration for each genre, in decreasing order of average duration?\n",
+    "#### v2: using SQL query to answer the question"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 551
+    },
+    "id": "89hMTXCKxWG8",
+    "outputId": "5737da11-aa8a-4ed0-9b05-cd379b28904b"
+   },
+   "outputs": [],
+   "source": [
+    "# SQL equivalent query of the above Pandas query\n",
+    "avg_duration_per_genre = qry(\"\"\"\n",
+    "\n",
+    "\"\"\")\n",
+    "\n",
+    "# How can we make the SQL query output exactly the same as the df.groupby output?\n",
+    "avg_duration_per_genre = avg_duration_per_genre.set_index(\"genre\")\n",
+    "avg_duration_per_genre"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "12ZdqYoIy_8U"
+   },
+   "source": [
+    "### What is the average speechiness for each mode, time signature pair?\n",
+    "#### v1: pandas"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 332
+    },
+    "id": "fVejD2KPyveX",
+    "outputId": "fe5c8fda-29a2-4f1a-8ff4-de9ad2a3cde0"
+   },
+   "outputs": [],
+   "source": [
+    "# use a list to indicate all the columns you want to group by\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 300
+    },
+    "id": "ImYEuOMox-ps",
+    "outputId": "2674dabd-3ff7-4099-fdc3-54e5ba0e2628"
+   },
+   "outputs": [],
+   "source": [
+    "# SQL equivalent query of the above Pandas query\n",
+    "qry(\"\"\"\n",
+    "\n",
+    "\"\"\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "sEDc5zGu0bc9"
+   },
+   "source": [
+    "### Self-practice"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Which songs have a tempo greater than 150 and what are their genres?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v1: pandas\n",
+    "fast_songs = "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v2: SQL\n",
+    "\n",
+    "qry(\"\"\"\n",
+    "\n",
+    "\"\"\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### What is the sum of danceability and liveness for \"Hiphop\" genre songs?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v1: pandas\n",
+    "hiphop_songs = "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v2: SQL\n",
+    "hiphop_songs = qry(\"\"\"\n",
+    "\n",
+    "\"\"\")\n",
+    "hiphop_songs"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Find all `song_name` values in ascending order of `duration_ms`. Exclude songs that don't have a `song_name`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v1: pandas\n",
+    "songs_by_duration = "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v2: SQL\n",
+    "songs_by_duration = qry(\"\"\"\n",
+    "\n",
+    "\"\"\")\n",
+    "songs_by_duration"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### How many distinct \"genre\"s are there in the dataset?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v1: pandas\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v2: SQL\n",
+    "genres = qry(\"\"\"\n",
+    "\n",
+    "\"\"\")\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Considering only songs with energy greater than 0.5, what is the maximum energy for each \"genre\" with song count greater than 2000?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "genre_groups = "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v1: pandas\n",
+    "high_energy_songs = ???\n",
+    "genre_groups = ???\n",
+    "max_energy = ???\n",
+    "max_energy[\"energy\"]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "genre_counts = ???\n",
+    "genre_counts[\"energy_max\"] = max_energy[\"energy\"]\n",
+    "filtered_genre_counts = ???\n",
+    "filtered_genre_counts"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v2: SQL\n",
+    "qry(\"\"\"\n",
+    "\n",
+    "\"\"\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Close the database connection here\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "colab": {
+   "provenance": []
+  },
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 1
+}
diff --git a/f23/Gurmail_Lecture_Notes/37N_Advanced_pandas_topics/lec_37_pandas3_data_transformation_template_Gurmail_lec2.ipynb b/f23/Gurmail_Lecture_Notes/37N_Advanced_pandas_topics/lec_37_pandas3_data_transformation_template_Gurmail_lec2.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..95dcf3a346b7bba48cd85a92613f73ad24c99f0a
--- /dev/null
+++ b/f23/Gurmail_Lecture_Notes/37N_Advanced_pandas_topics/lec_37_pandas3_data_transformation_template_Gurmail_lec2.ipynb
@@ -0,0 +1,810 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Announcements - Wednesday, December 6\n",
+    "* Download ALL files for today's lecture\n",
+    "* Q10 Released tonight at 5 pm\n",
+    "* <b>If you have any problem with P8-P11 grades, please send me (Gurmail.Singh@wisc.edu) an email by December 11.</b>\n",
+    "* Late days may not be used on P13\n",
+    "* If you have questions, it is almost always faster to \n",
+    "  * Post on Piazza\n",
+    "  * Go to [office hours](https://sites.google.com/wisc.edu/cs220-oh-f23/home?pli=1) \n",
+    "### Conflict Form\n",
+    "  * [Final - December 19, 7:45 am](https://cs220.cs.wisc.edu/f23/surveys.html)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "RHvDCo4fhXBx"
+   },
+   "source": [
+    "# Lecture 37 Pandas 3: Data Transformation\n",
+    "* Data transformation is the process of changing the format, structure, or values of data. \n",
+    "* Often needed during data cleaning and sometimes during data analysis"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "yoLGptrqhbBo"
+   },
+   "source": [
+    "# Today's Learning Objectives: \n",
+    "\n",
+    "* Set a column as the index of a pandas `DataFrame`\n",
+    "* Identify, drop, or fill missing values (`np.NaN`) using pandas `isna`, `dropna`, and `fillna`\n",
+    "* Apply transformations to a `DataFrame`:\n",
+    "  * Use `apply` on a pandas `Series` to apply a transformation function\n",
+    "  * Use `replace` to replace all target values in pandas `Series` and `DataFrame` rows / columns\n",
+    "* Filter, aggregate, group, and summarize information in a `DataFrame` with `groupby`\n",
+    "* Convert `groupby` examples to SQL\n",
+    "* Solve the same question using SQL and pandas `DataFrame` manipulations:\n",
+    "  * filtering, grouping, and aggregation / summarization"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "CeWtFirwteFY"
+   },
+   "outputs": [],
+   "source": [
+    "# known import statements\n",
+    "import pandas as pd\n",
+    "import sqlite3 as sql # note that we are renaming to sql\n",
+    "import os\n",
+    "\n",
+    "# new import statement\n",
+    "import numpy as np"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "FgnTeNRIswsm"
+   },
+   "source": [
+    "# The dataset: Spotify songs\n",
+    "Adapted from https://www.kaggle.com/datasets/mrmorj/dataset-of-songs-in-spotify.\n",
+    "\n",
+    "If you are interested in digging deeper into this dataset, here's a [blog post](https://medium.com/@boplantinga/what-do-spotifys-audio-features-tell-us-about-this-year-s-eurovision-song-contest-66ad188e112a) that explains each column in detail."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### WARMUP 1: Establish a connection to the spotify.db database"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 232
+    },
+    "id": "8y9scvgCnTHl",
+    "outputId": "c72388f8-576c-4cf2-ef51-352cd11b6c92"
+   },
+   "outputs": [],
+   "source": [
+    "# open up the spotify database\n",
+    "db_pathname = \"spotify.db\"\n",
+    "assert ???\n",
+    "conn = sql.connect(db_pathname)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def qry(sql):\n",
+    "    return pd.read_sql(sql, conn)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### WARMUP 2: Identify the table name(s) inside the database"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 112
+    },
+    "id": "ybTqbDSOnR2f",
+    "outputId": "8dcc943b-9382-4abb-ef78-6c6d56ad89eb"
+   },
+   "outputs": [],
+   "source": [
+    "df = qry(\"\")\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### WARMUP 3: Use a pandas lookup expression to extract the \"sql\" column and display the full query using an `.iloc` lookup"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### WARMUP 4: Store the data from the `spotify` table in a variable called `df`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 632
+    },
+    "id": "txAH9OIjnoQv",
+    "outputId": "ac9152ba-32df-4fb2-d4e0-a97f50fe58fb"
+   },
+   "outputs": [],
+   "source": [
+    "df = qry(\"\")\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Setting a column as row indices for the `DataFrame`\n",
+    "\n",
+    "- Syntax: `df.set_index(\"<COLUMN>\")`\n",
+    "- Returns a new DataFrame object instance reference.\n",
+    "- WARNING: executing this twice raises a `KeyError`, because once a column becomes the row index it is no longer a column of the `DataFrame`. If that happens, re-run the cell above to reload `df`, then run the cell below exactly once."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Set the id column as row indices\n",
+    "df = \n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Not a Number\n",
+    "\n",
+    "- `np.NaN` (an alias of `np.nan`; NumPy 2.0 removed the alias, so use `np.nan` there) is the floating point representation of Not a Number\n",
+    "- You do not need to know / learn the details about the `numpy` package \n",
+    "\n",
+    "### Replacing / modifying values within the `DataFrame`\n",
+    "\n",
+    "Syntax: `df.replace(<TARGET>, <REPLACE>)`\n",
+    "- Your target can be a `str`, `int`, `float`, or `None` (there are other possibilities, but those are too advanced for this course)\n",
+    "- Returns a new DataFrame object instance reference.\n",
+    "\n",
+    "Let's now replace the missing values (empty strings) with `np.NaN`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df = \n",
+    "df.head(10) # title is the album name"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Checking for missing values\n",
+    "\n",
+    "Syntax: `Series.isna()`\n",
+    "- Returns a boolean Series\n",
+    "\n",
+    "Let's check whether any `song_name` values are missing"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "JqzSwG5PEZRq",
+    "outputId": "05529a3d-4a5c-4654-fe05-d04b2c10ae6c"
+   },
+   "outputs": [],
+   "source": [
+    "df[\"song_name\"]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Review: `pandas.Series.value_counts()`\n",
+    "- Returns a new `Series` with the unique values from the original `Series` as keys and the count of each unique value as values. \n",
+    "- The returned `Series` is sorted in descending order of counts"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "uCLDr8EIGMeJ",
+    "outputId": "241d6181-d525-4019-a8f2-689939b2ab33"
+   },
+   "outputs": [],
+   "source": [
+    "# count the number of missing values for song name\n",
+    "df[\"song_name\"]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Missing value manipulation\n",
+    "Syntax: `df.fillna(<REPLACE>)`\n",
+    "- Returns a new DataFrame object instance reference."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "pJ2CIqq9HWvN",
+    "outputId": "2895e862-18e5-4742-9750-31b130aae668"
+   },
+   "outputs": [],
+   "source": [
+    "# use .fillna to replace missing values\n",
+    "df[\"song_name\"]\n",
+    "\n",
+    "# to replace the original DataFrame's column, you need to explicitly update that object instance\n",
+    "# TODO: uncomment the below lines and update the code\n",
+    "#df[\"song_name\"] = ???\n",
+    "#df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Dropping missing values\n",
+    "Syntax: `df.dropna()`\n",
+    "- Returns a new DataFrame object instance reference."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 145
+    },
+    "id": "O_1ZeHG8N-rB",
+    "outputId": "3b112da2-2b3c-4fb8-c7ae-dc2f2127856d"
+   },
+   "outputs": [],
+   "source": [
+    "# .dropna drops every row that contains at least one NaN\n",
+    "df.dropna()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "ggttXEqUbI_E"
+   },
+   "source": [
+    "### Review: `pandas.Series.apply(...)`\n",
+    "Syntax: `Series.apply(<FUNCTION OBJECT REFERENCE>)`\n",
+    "- Applies the input function to every element of the `Series`.\n",
+    "- Returns a new `Series` object instance reference.\n",
+    "\n",
+    "Let's apply a transformation function to the `mode` column `Series`:\n",
+    "- mode = 1 means major modality (sounds happy)\n",
+    "- mode = 0 means minor modality (sounds sad)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def replace_mode(m): \n",
+    "    if m == 1: \n",
+    "        return \"major\"\n",
+    "    else: \n",
+    "        return \"minor\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df[\"mode\"]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### `lambda`\n",
+    "\n",
+    "Let's write a `lambda` function instead of the `replace_mode` function"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "9AJ3p-_TarnN",
+    "outputId": "a087df5d-2002-417c-e99c-5e6fc8ea9809"
+   },
+   "outputs": [],
+   "source": [
+    "df[\"mode\"].apply(???)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Typically transformed columns are added as new columns within the DataFrame.\n",
+    "Let's add a new `modified_mode` column."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df[\"modified_mode\"] = df[\"mode\"].apply(lambda m: \"major\" if m == 1 else \"minor\")\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Let's go back to the original table from the SQL database"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "ZoiyUleiyhMg"
+   },
+   "outputs": [],
+   "source": [
+    "df = qry(\"SELECT * FROM spotify\")\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Extract just the \"genre\" and \"duration_ms\" columns from `df`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df[???]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### `pandas.DataFrame.groupby(...)`\n",
+    "\n",
+    "Syntax: `DataFrame.groupby(<COLUMN>)`\n",
+    "- Returns a `DataFrameGroupBy` object instance reference\n",
+    "- You need to apply an aggregation method (such as `mean` or `count`) to get results from the return value of `groupby`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 551
+    },
+    "id": "trRMgGMysdkb",
+    "outputId": "d02098c3-7722-4505-c599-5897bb8ace19"
+   },
+   "outputs": [],
+   "source": [
+    "df[[\"genre\", \"duration_ms\"]]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### What is the average duration for each genre, in decreasing order of average duration?\n",
+    "#### v1: using `df` (`pandas`) to answer the question"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df[[\"genre\", \"duration_ms\"]]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df[[\"genre\", \"duration_ms\"]]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "One way to check whether `groupby` works would be to use `value_counts` on the same column `Series`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df[\"genre\"].value_counts()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### What is the average duration for each genre, in decreasing order of average duration?\n",
+    "#### v2: using SQL query to answer the question"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 551
+    },
+    "id": "89hMTXCKxWG8",
+    "outputId": "5737da11-aa8a-4ed0-9b05-cd379b28904b"
+   },
+   "outputs": [],
+   "source": [
+    "# SQL query equivalent to the pandas query above\n",
+    "avg_duration_per_genre = qry(\"\"\"\n",
+    "\n",
+    "\"\"\")\n",
+    "\n",
+    "# How can we make the SQL query output exactly the same as the df.groupby output?\n",
+    "avg_duration_per_genre = avg_duration_per_genre.set_index(\"genre\")\n",
+    "avg_duration_per_genre"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "12ZdqYoIy_8U"
+   },
+   "source": [
+    "### What is the average speechiness for each mode, time signature pair?\n",
+    "#### v1: pandas"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 332
+    },
+    "id": "fVejD2KPyveX",
+    "outputId": "fe5c8fda-29a2-4f1a-8ff4-de9ad2a3cde0"
+   },
+   "outputs": [],
+   "source": [
+    "# use a list to indicate all the columns you want to groupby \n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 300
+    },
+    "id": "ImYEuOMox-ps",
+    "outputId": "2674dabd-3ff7-4099-fdc3-54e5ba0e2628"
+   },
+   "outputs": [],
+   "source": [
+    "# SQL query equivalent to the pandas query above\n",
+    "qry(\"\"\"\n",
+    "\n",
+    "\"\"\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "sEDc5zGu0bc9"
+   },
+   "source": [
+    "### Self-practice"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Which songs have a tempo greater than 150, and what are their genres?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v1: pandas\n",
+    "fast_songs = "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v2: SQL\n",
+    "\n",
+    "qry(\"\"\"\n",
+    "\n",
+    "\"\"\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### What is the sum of danceability and liveness for \"Hiphop\" genre songs?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v1: pandas\n",
+    "hiphop_songs = "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v2: SQL\n",
+    "hiphop_songs = qry(\"\"\"\n",
+    "\n",
+    "\"\"\")\n",
+    "hiphop_songs"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Find all `song_name` values in ascending order of `duration_ms`. Exclude songs that don't have a `song_name`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v1: pandas\n",
+    "songs_by_duration = "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v2: SQL\n",
+    "songs_by_duration = qry(\"\"\"\n",
+    "\n",
+    "\"\"\")\n",
+    "songs_by_duration"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### How many distinct `genre` values are there in the dataset?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v1: pandas\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v2: SQL\n",
+    "genres = qry(\"\"\"\n",
+    "\n",
+    "\"\"\")\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Considering only songs with energy greater than 0.5, what is the maximum energy for each \"genre\" with song count greater than 2000?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "genre_groups = "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v1: pandas\n",
+    "high_energy_songs = ???\n",
+    "genre_groups = ???\n",
+    "max_energy = ???\n",
+    "max_energy[\"energy\"]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "genre_counts = ???\n",
+    "genre_counts[\"energy_max\"] = max_energy[\"energy\"]\n",
+    "filtered_genre_counts = ???\n",
+    "filtered_genre_counts"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# v2: SQL\n",
+    "qry(\"\"\"\n",
+    "\n",
+    "\"\"\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Close the database connection here\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "colab": {
+   "provenance": []
+  },
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 1
+}
diff --git a/f23/Gurmail_Lecture_Notes/37N_Advanced_pandas_topics/spotify.db b/f23/Gurmail_Lecture_Notes/37N_Advanced_pandas_topics/spotify.db
new file mode 100644
index 0000000000000000000000000000000000000000..a0e53761991a54fc8804d2b98bcc34ac4d99b70f
Binary files /dev/null and b/f23/Gurmail_Lecture_Notes/37N_Advanced_pandas_topics/spotify.db differ