diff --git a/lab-p10/README.md b/lab-p10/README.md new file mode 100644 index 0000000000000000000000000000000000000000..dd115ef6fd5bdd9994e9ba5cb2a9cad7f493f473 --- /dev/null +++ b/lab-p10/README.md @@ -0,0 +1,71 @@ +# Lab-P10: Files and Namedtuples + +In this lab, you'll get practice with files and namedtuples, in preparation for P10. + +----------------------------- +## Corrections/Clarifications + + +**Find any issues?** Please report to us: + +- Ashwin Maran <amaran@wisc.edu> + +------------------------------ +## Learning Objectives + +In this lab, you will practice... +* loading data in json files +* loading data in csv files +* using try/except to handle malformed data + +------------------------------ + +## Note on Academic Misconduct + +You may do these lab exercises only with your project partner; you are not allowed to start +working on Lab-P10 with one person, then do the project with a different partner. Now may be a +good time to review [our course policies](https://cs220.cs.wisc.edu/f23/syllabus.html). + +**Important:** P10 and P11 are two parts of the same data analysis. +You **cannot** switch project partners between these two projects. +If you partner up with someone for P10, you have to sustain that partnership until the end of P11. +**You must acknowledge that you have read this to your lab TA**. + +------------------------------ + +## Segment 1: Setup + +Create a `lab-p10` directory and download the following files into the `lab-p10` directory. + +* `small_data.zip` +* `lab-p10.ipynb` +* `public_tests.py` + +After downloading `small_data.zip`, make sure to extract it (using [Mac directions](http://osxdaily.com/2017/11/05/how-open-zip-file-mac/) or [Windows directions](https://support.microsoft.com/en-us/help/4028088/windows-zip-and-unzip-files)). After extracting, you need to make sure that the project files are stored in the following structure: + +``` ++-- lab-p10.ipynb ++-- public_tests.py ++-- small_data +| +-- .DS_Store +| +-- .ipynb_checkpoints +| +-- mapping_1.json +| +-- mapping_2.json +| +-- mapping_3.json +| +-- planets_1.csv +| +-- planets_2.csv +| +-- planets_3.csv +| +-- stars_1.csv +| +-- stars_2.csv +| +-- stars_3.csv +``` + +Make sure that the files inside `small_data.zip` are inside the `small_data` directory. You may delete `small_data.zip` after extracting these files from it. + + +## Segment 2: +For the remaining segments, detailed instructions are provided in `lab-p10.ipynb`. From the terminal, open a `jupyter notebook` session, open your `lab-p10.ipynb`, and follow the instructions in `lab-p10.ipynb`. + +## Project 10 + +You can now get started with [p10](https://git.doit.wisc.edu/cdis/cs/courses/cs220/cs220-f23-projects/-/tree/main/p10). **You may copy/paste any code created here into P10**. Have fun!
diff --git a/lab-p10/lab-p10.ipynb b/lab-p10/lab-p10.ipynb new file mode 100644 index 0000000000000000000000000000000000000000..6c9ef8eec942f267f6724d5195a7f8ef3d356fb4 --- /dev/null +++ b/lab-p10/lab-p10.ipynb @@ -0,0 +1,3499 @@ +{ + "cells": [ + { + "cell_type": "code", + "execution_count": null, + "id": "cfa86fb2", + "metadata": { + "cell_type": "code", + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "# import and initialize otter\n", + "import otter\n", + "grader = otter.Notebook(\"lab-p10.ipynb\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "894307b9", + "metadata": { + "editable": false, + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.315665Z", + "iopub.status.busy": "2023-11-04T18:31:20.315665Z", + "iopub.status.idle": "2023-11-04T18:31:20.598466Z", + "shell.execute_reply": "2023-11-04T18:31:20.597454Z" + } + }, + "outputs": [], + "source": [ + "import public_tests" + ] + }, + { + "cell_type": "markdown", + "id": "c6057033", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "# Lab-P10: File Handling and Namedtuples" + ] + }, + { + "cell_type": "markdown", + "id": "2ce1ad9b", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "## Learning Objectives:\n", + "\n", + "In this lab, you will practice how to...\n", + "* use the `os` module to handle files,\n", + "* load data in json files,\n", + "* combine data from different files to create data structures,\n", + "* create named tuples,\n", + "* use `try/except` to handle malformed data." + ] + }, + { + "cell_type": "markdown", + "id": "89f6242d", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "<h2 style=\"color:red\">Warning (Note on Academic Misconduct):</h2>\n", + "\n", + "**IMPORTANT**: **P10 and P11 are two parts of the same data analysis.** You **cannot** switch project partners between these two projects. That is if you partner up with someone for P10, you have to sustain that partnership until end of P11. **You must acknowledge to the Lab TA to receive lab attendance credit.**\n", + "Be careful not to work with more than one partner. If you work with a partner on Lab-P10, you are **not** allowed to finish your project with a different partner.\n", + "You may either continue to work with the same partner, or work on P10 and P11 alone. Now may be a good time to review [our course policies](https://cs220.cs.wisc.edu/f23/syllabus.html).\n", + "\n", + "Under any circumstances, **no more than two students are allowed to work together on a project** as mentioned in the course policies. If your code is flagged by our code similarity detection tools, **both partners will be responsible** for sharing/copying the code, even if the code is shared/copied by one of the partners with/from other non-partner student(s). Note that each case of plagiarism will be reported to the Dean of Students with a zero grade on the project. 
**If you think that someone cannot be your project partner then don’t make that student your lab partner.**" + ] + }, + { + "cell_type": "markdown", + "id": "831fd765", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "## Setup:\n", + "\n", + "Before proceeding much further, download `small_data.zip` and extract it to a directory on your\n", + "computer (using [Mac directions](http://osxdaily.com/2017/11/05/how-open-zip-file-mac/) or\n", + "[Windows directions](https://support.microsoft.com/en-us/help/4028088/windows-zip-and-unzip-files)).\n", + "\n", + "**Warning:** You need to make sure that the project files are stored in the following structure:\n", + "\n", + "```\n", + "+-- lab-p10.ipynb\n", + "+-- public_tests.py\n", + "+-- small_data\n", + "| +-- .DS_Store\n", + "| +-- .ipynb_checkpoints\n", + "| +-- mapping_1.json\n", + "| +-- mapping_2.json\n", + "| +-- mapping_3.json\n", + "| +-- planets_1.csv\n", + "| +-- planets_2.csv\n", + "| +-- planets_3.csv\n", + "| +-- stars_1.csv\n", + "| +-- stars_2.csv\n", + "| +-- stars_3.csv\n", + "```\n", + "\n", + "Make sure that the files inside `small_data.zip` are inside the `small_data` directory." + ] + }, + { + "cell_type": "markdown", + "id": "77319e8a", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "## Introduction:\n", + "\n", + "In P10 and P11, we will be studying stars and planets outside our Solar System using this dataset from the [NASA Exoplanet Archive](https://exoplanetarchive.ipac.caltech.edu/cgi-bin/TblView/nph-tblView?app=ExoTbls&config=PSCompPars). We will use Python to ask some interesting questions about the laws of the universe and explore the habitability of other planets in our universe.\n", + "\n", + "In Lab-P10, you will work with a small subset of the full dataset. You can find these files inside `small_data.zip`. The full dataset used in P10 and P11 is stored in the same format, so you can then use this code to parse the dataset in P10 and P11." + ] + }, + { + "cell_type": "markdown", + "id": "98060e97", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "## The Data:\n", + "\n", + "You can open each of the files inside the `small_data` directory using Microsoft Excel or some other Spreadsheet viewing software to see how the data is stored. For example, these are the contents of the file `stars_1.csv`:\n", + "\n", + "|Star Name|Spectral Type|Stellar Effective Temperature [K]|Stellar Radius [Solar Radius]|Stellar Mass [Solar mass]|Stellar Luminosity [log(Solar)]|Stellar Surface Gravity [log10(cm/s**2)]|Stellar Age [Gyr]|\n", + "|----|-------------|---------------------------------|-----------------------------|-------------------------|-------------------------------|----------------------------------------|-----------------|\n", + "|55 Cnc|G8V|5172.00|0.94|0.91|-0.197|4.43|10.200|\n", + "|DMPP-1|F8V|6196.00|1.26|1.21|0.320|4.41|2.010|\n", + "|GJ 876|M2.5V|3271.00|0.30|0.32|-1.907|4.87|1.000|\n", + "\n", + "As you might have already guessed, this file contains data on a number of *stars* outside our solar system along with some important statistics about these stars. 
The columns here are as follows:\n", + "\n", + "- `Star Name`: The **name** given to the star by the *International Astronomical Union*,\n", + "- `Spectral Type`: The **Spectral Classification** of the star as per the *Morgan–Keenan (MK) system*,\n", + "- `Stellar Effective Temperature [K]`: The **temperature** of a *black body* (in units of Kelvin) that would emit the *observed radiation* of the star,\n", + "- `Stellar Radius [Solar Radius]`: The **radius** of the star (in units of the radius of the Sun),\n", + "- `Stellar Mass [Solar mass]`: The **mass** of the star (in units of the mass of the Sun),\n", + "- `Stellar Luminosity [log(Solar)]`: The *total* amount of **energy radiated** by the star **each second** (represented by the logarithm of the energy radiated by the Sun in each second),\n", + "- `Stellar Surface Gravity [log10(cm/s**2)]`: The **acceleration due to the gravity** of the Star at its *surface* (represented by the logarithm of the acceleration measured in centimeter per second squared),\n", + "- `Stellar Age [Gyr]`: The total **age** of the star (in units of Giga years, i.e., billions of years).\n", + "\n", + "The two other files `stars_2.csv`, and `stars_3.csv` also store similar data in the same format. At this stage, it is alright if you do not understand what these columns mean - they will be explained to you when they become necessary (in P10 and P11)." + ] + }, + { + "cell_type": "markdown", + "id": "7214a1ea", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "On the other hand, here are the contents of the file `planets_1.csv`:\n", + "\n", + "|Planet Name|Discovery Method|Discovery Year|Controversial Flag|Orbital Period [days]|Planet Radius [Earth Radius]|Planet Mass [Earth Mass]|Orbit Semi-Major Axis [au]|Eccentricity|Equilibrium Temperature [K]|Insolation Flux [Earth Flux]|\n", + "|-----------|----------------|--------------|------------------|---------------------|----------------------------|------------------------|---------------------------|------------|---------------------------|----------------------------|\n", + "|55 Cnc b|Radial Velocity|1996|0|14.65160000|13.900|263.97850|0.113400|0.000000|700||\n", + "|55 Cnc c|Radial Velocity|2004|0|44.39890000|8.510|54.47380|0.237300|0.030000|||\n", + "|DMPP-1 b|Radial Velocity|2019|0|18.57000000|5.290|24.27000|0.146200|0.083000|877||\n", + "|GJ 876 b|Radial Velocity|1998|0|61.11660000|13.300|723.22350|0.208317|0.032400|||\n", + "|GJ 876 c|Radial Velocity|2000|0|30.08810000|14.000|226.98460|0.129590|0.255910|||\n", + "\n", + "\n", + "This file contains data on a number of *planets* outside our solar system along with some important statistics about these planets. 
The columns here are as follows:\n", + "\n", + "- `Planet Name`: The **name** given to the planet by the *International Astronomical Union*,\n", + "- `Discovery Method`: The **method** by which the planet was *discovered*,\n", + "- `Discovery Year`: The **year** in which the planet was *discovered*,\n", + "- `Controversial Flag`: Indicates whether the status of the discovered object as a planet was **disputed** at the time of discovery, \n", + "- `Orbital Period [days]`: The amount of **time** (in units of days) it takes for the planet to **complete one orbit** around its star,\n", + "- `Planet Radius [Earth Radius]`: The **radius** of the planet (in units of the radius of the Earth),\n", + "- `Planet Mass [Earth Mass]`: The **mass** of the planet (in units of the mass of the Earth),\n", + "- `Orbit Semi-Major Axis [au]`: The **semi-major axis** of the planet's elliptical **orbit** around its host star (in units of Astronomical Units),\n", + "- `Eccentricity`: The **eccentricity** of the planet's orbit around its host star,\n", + "- `Equilibrium Temperature [K]`: The **temperature** of the planet (in units of Kelvin) if it were a *black body* heated only by its host star,\n", + "- `Insolation Flux [Earth Flux]`: The amount of **radiation** the planet received from its host star **per unit of area** (in units of the Insolation Flux of the Earth from the Sun).\n", + "\n", + "The two other files `planets_2.csv`, and `planets_3.csv` also store similar data in the same format." + ] + }, + { + "cell_type": "markdown", + "id": "8bb8727d", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "Finally, if you take a look at `mapping_1.json` (you can open json files using any Text Editor), you will see that the file looks like this:\n", + "\n", + "```python\n", + "{\"55 Cnc b\":\"55 Cnc\",\"55 Cnc c\":\"55 Cnc\",\"DMPP-1 b\":\"DMPP-1\",\"GJ 876 b\":\"GJ 876\",\"GJ 876 c\":\"GJ 876\"}\n", + "```\n", + "\n", + "This file contains a *mapping* from each *planet* in `planets_1.csv` to the *star* in `stars_1.csv` that the planet orbits. Similarly, `mapping_2.json` contains a *mapping* from each *planet* in `planets_2.csv` to the *star* in `stars_2.csv` that the planet orbits, and `mapping_3.json` contains a *mapping* from each *planet* in `planets_3.csv` to the *star* in `stars_3.csv` that the planet orbits." + ] + }, + { + "cell_type": "markdown", + "id": "fb233a29", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "## Questions and Functions:\n", + "\n", + "Let us start by importing all the modules we will need for this project." 
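+ "\n", + "To get a feel for how the pieces fit together, here is a small preview of reading one of the mapping files with the `json` module. This is only an illustration; the lab provides its own `read_json` helper for this later on.\n", + "\n", + "```python\n", + "import os\n", + "import json\n", + "\n", + "# preview only: load the planet -> host-star mapping from mapping_1.json\n", + "mapping_path = os.path.join(\"small_data\", \"mapping_1.json\")\n", + "with open(mapping_path, encoding=\"utf-8\") as f:\n", + "    mapping_demo = json.load(f)\n", + "\n", + "print(mapping_demo[\"55 Cnc b\"])   # prints '55 Cnc', the star this planet orbits\n", + "```"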
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "48449d80", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.604464Z", + "iopub.status.busy": "2023-11-04T18:31:20.604464Z", + "iopub.status.idle": "2023-11-04T18:31:20.612646Z", + "shell.execute_reply": "2023-11-04T18:31:20.611632Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# it is considered a good coding practice to place all import statements at the top of the notebook\n", + "# place all your import statements in this cell if you need to import any more modules for this project\n", + "\n", + "# we have imported these modules for you\n", + "import os\n", + "from collections import namedtuple\n", + "import csv\n", + "import json" + ] + }, + { + "cell_type": "markdown", + "id": "bc993948", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "## Segment 2: File handling with the `os` module\n", + "\n", + "In this segment, you will learn how to use the `os` module effectively." + ] + }, + { + "cell_type": "markdown", + "id": "fc4413e2", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 1**: List **all** the files and directories in the directory `small_data` using the `os.listdir` function.\n", + "\n", + "Your output **must** be a **list** of **strings**. The order does **not** matter." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a2539165", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.617648Z", + "iopub.status.busy": "2023-11-04T18:31:20.617648Z", + "iopub.status.idle": "2023-11-04T18:31:20.629189Z", + "shell.execute_reply": "2023-11-04T18:31:20.628176Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# we have done this one for you\n", + "\n", + "all_files = os.listdir('small_data')\n", + "\n", + "all_files" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "734a79cd", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q1\")" + ] + }, + { + "cell_type": "markdown", + "id": "2b87295e", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Important Warning:** That appeared to work just fine, but you should be **very careful** when using the `os` module. You might have noticed that there are files and directories in the list returned by `os.listdir` that **begin** with the character `\".\"` (specifically in this case, the file `\".DS_Store\"` and the directory `\".ipynb_checkpoints\"`). Such files and directories are used by some operating systems to store metadata. These files are not actually a part of your dataset, and must be **ignored**. \n", + "\n", + "When you are processing the files in any directory, you **must** always **ignore** such files that begin with the character `\".\"`, as they are not actually files in the directory. You **must** do this every time you use `os.listdir`." + ] + }, + { + "cell_type": "markdown", + "id": "288e7746", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 2**: List **all** the files and directories in the directory `small_data` that do **not** **start with** the character`\".\"`.\n", + "\n", + "Your output **must** be a **list** of **strings**. The order does **not** matter." 
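+ "\n", + "If it helps, the general filtering pattern looks like this on a made-up listing (illustration only; your answer should use the real `small_data` listing from `os.listdir`):\n", + "\n", + "```python\n", + "# a hypothetical listing, for illustration only\n", + "entries = [\".DS_Store\", \"stars_1.csv\", \".ipynb_checkpoints\", \"mapping_1.json\"]\n", + "visible = [name for name in entries if not name.startswith(\".\")]\n", + "print(visible)   # ['stars_1.csv', 'mapping_1.json']\n", + "```"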
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9283d54a", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.648992Z", + "iopub.status.busy": "2023-11-04T18:31:20.648992Z", + "iopub.status.idle": "2023-11-04T18:31:20.657926Z", + "shell.execute_reply": "2023-11-04T18:31:20.656914Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'actual_files', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2e7dfe5e", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q2\")" + ] + }, + { + "cell_type": "markdown", + "id": "404b6b33", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Important Warning:** You are not done yet. Look at the order in which the files in the **list** `actual_files` are stored. The **ordering** of the files in the **list** returned by `os.listdir` **depends on the operating system**. This means that if you run this code on a **different OS**, the files might be sorted in a **different order**. This makes `os.listdir` a little dangerous because you could write something like `actual_files[0]` in your code, and it will always work the same way on your computer, but will **behave differently on another computer**. To avoid these issues, you should make sure that you always **explicitly sort** the output of `os.listdir` before you use it. This will ensure that the ordering remains consistent across all operating systems.\n", + "\n", + "When you are processing the files in any directory, you **must** always **explicitly sort** the output of `os.listdir` first. You **must** do this every time you use `os.listdir`." + ] + }, + { + "cell_type": "markdown", + "id": "cfc0f108", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 3**: List **all** the files and directories in the directory `small_data` that do **not** **start with** the character`\".\"`, sorted in **reverse alphabetical order**.\n", + "\n", + "Your output **must** be a **list** of **strings**, sorted in **reverse alphabetical** order." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ce9a6977", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.673488Z", + "iopub.status.busy": "2023-11-04T18:31:20.673488Z", + "iopub.status.idle": "2023-11-04T18:31:20.681456Z", + "shell.execute_reply": "2023-11-04T18:31:20.680441Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'files_in_small_data', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6c4fd974", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q3\")" + ] + }, + { + "cell_type": "markdown", + "id": "fa90b33e", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 4**: What is the **path** of the file `stars_1.csv` in the directory `small_data`.\n", + "\n", + "You are **allowed** to 'hardcode' the strings `'small_data'` and `'stars_1.csv'` to answer this question.\n", + "\n", + "**Warnings:**\n", + "\n", + "1. You **must not** hardcode the **absolute path** of any file in your code. For instance, the **absolute path** of this file `stars_1.csv` could be: `C:\\Users\\mdoescher\\cs220\\lab-p10\\small_data\\stars_1.csv`. 
However, if you hardcode this path in your code, it will **only work on your computer**. In this case, since the notebook `lab-p10.ipynb` is stored in the path `C:\\Users\\mdoescher\\cs220\\lab-p10`, the **relative path** of the file is `small_data\\stars_1.csv`, and this is the path that **must** be used, if you want your code to work on all computers.\n", + "2. You **must not** hardcode either the character `\"\\\"` or the character `\"/\"` in your paths. If you do so, your code will **crash** when it runs on a **different operating system**. You **must** use the `os.path.join` function to create path strings." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f315e5b6", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.705136Z", + "iopub.status.busy": "2023-11-04T18:31:20.704137Z", + "iopub.status.idle": "2023-11-04T18:31:20.715695Z", + "shell.execute_reply": "2023-11-04T18:31:20.714682Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# we have done this one for you\n", + "\n", + "stars_1_path = os.path.join(\"small_data\", \"stars_1.csv\")\n", + "\n", + "stars_1_path" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e31630a3", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q4\")" + ] + }, + { + "cell_type": "markdown", + "id": "15da3e0d", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 5**: List the **paths** of **all** the files in the directory `small_data`.\n", + "\n", + "Your output **must** be a **list** of **strings**. You must **ignore** files that **start with** the character`\".\"`, and your output **must** be sorted in **reverse alphabetical order**.\n", + "\n", + "**Warnings:**\n", + "\n", + "1. You **must not** hardcode the **absolute path** of any file in your code. You must use the **relative path** of the files.\n", + "2. You **must not** hardcode either the character `\"\\\"` or the character `\"/\"` in your paths. You **must** use the `os.path.join` function to create paths." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c5ba072e", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.740598Z", + "iopub.status.busy": "2023-11-04T18:31:20.739600Z", + "iopub.status.idle": "2023-11-04T18:31:20.748239Z", + "shell.execute_reply": "2023-11-04T18:31:20.747227Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'paths_in_small_data', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6e4da4db", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q5\")" + ] + }, + { + "cell_type": "markdown", + "id": "acbf604e", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 6**: List the **paths** of **all** the JSON files in the directory `small_data`.\n", + "\n", + "Your output **must** be a **list** of **strings**. You must **ignore** files that **start with** the character`\".\"`, and your output **must** sorted in **reverse alphabetical order**.\n", + "\n", + "**Hint:** You can identify the JSON files as the files which end with the string `\".json\"`." 
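+ "\n", + "For instance, on a made-up list of file names, combining `str.endswith`, `os.path.join`, and `sorted` could look like this (illustration only):\n", + "\n", + "```python\n", + "import os\n", + "\n", + "# hypothetical file names and directory, for illustration only\n", + "names = [\"stars_1.csv\", \"mapping_2.json\", \"mapping_1.json\"]\n", + "demo_paths = sorted([os.path.join(\"some_dir\", n) for n in names if n.endswith(\".json\")],\n", + "                    reverse=True)\n", + "print(demo_paths)   # the two json paths, in reverse alphabetical order\n", + "```"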
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d270a2e3", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.767956Z", + "iopub.status.busy": "2023-11-04T18:31:20.766957Z", + "iopub.status.idle": "2023-11-04T18:31:20.777083Z", + "shell.execute_reply": "2023-11-04T18:31:20.776074Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'json_paths', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "169ea981", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q6\")" + ] + }, + { + "cell_type": "markdown", + "id": "e504581c", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 7**: List the **paths** of **all** the files in the directory `small_data`, whose filename starts with `\"stars\"`.\n", + "\n", + "Your output **must** be a **list** of **strings**. You must **ignore** files that **start with** the character`\".\"`, and your output **must** sorted in **reverse alphabetical order**." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "07940c2e", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.796490Z", + "iopub.status.busy": "2023-11-04T18:31:20.795489Z", + "iopub.status.idle": "2023-11-04T18:31:20.808026Z", + "shell.execute_reply": "2023-11-04T18:31:20.807009Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'stars_paths', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7cad68a4", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q7\")" + ] + }, + { + "cell_type": "markdown", + "id": "0a754680", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "## Segment 3: Creating Namedtuples\n", + "\n", + "In P10, you will be reading the data in files similar to `stars_1.csv`, `stars_2.csv`, and `stars_3.csv`, and storing the data as a **dictionary** of **named tuples**. Now would be a great time to practice creating similar data structues." + ] + }, + { + "cell_type": "markdown", + "id": "d9a69e4f", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Data Structure 1: namedtuple `Star`\n", + "\n", + "We will now create a new `Star` type (using `namedtuple`). It **must** have the following attributes:\n", + "\n", + "* `spectral_type`,\n", + "* `stellar_effective_temperature`,\n", + "* `stellar_radius`,\n", + "* `stellar_mass`,\n", + "* `stellar_luminosity`,\n", + "* `stellar_surface_gravity`,\n", + "* `stellar_age`." 
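+ "\n", + "If namedtuples are new to you, the idea in a nutshell is that they behave like tuples whose positions also have names. A tiny sketch, unrelated to the lab's data:\n", + "\n", + "```python\n", + "from collections import namedtuple\n", + "\n", + "Point = namedtuple(\"Point\", [\"x\", \"y\"])\n", + "p = Point(3, 4)\n", + "print(p.x, p[1])   # fields can be read by name or by position: prints 3 4\n", + "```"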
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "03da7912", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.825269Z", + "iopub.status.busy": "2023-11-04T18:31:20.825269Z", + "iopub.status.idle": "2023-11-04T18:31:20.832306Z", + "shell.execute_reply": "2023-11-04T18:31:20.831295Z" + } + }, + "outputs": [], + "source": [ + "# we have done this one for you\n", + "\n", + "# define the list of attributes we want in our namedtuple\n", + "star_attributes = ['spectral_type',\n", + " 'stellar_effective_temperature',\n", + " 'stellar_radius',\n", + " 'stellar_mass',\n", + " 'stellar_luminosity',\n", + " 'stellar_surface_gravity',\n", + " 'stellar_age']\n", + "\n", + "# create the namedtuple type 'Star' with the correct attributes\n", + "Star = namedtuple(\"Star\", star_attributes)" + ] + }, + { + "cell_type": "markdown", + "id": "a07ae1f3", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "Let us now test whether we have defined the namedtuple properly by creating a `Star` object." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f561d50b", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.838308Z", + "iopub.status.busy": "2023-11-04T18:31:20.837310Z", + "iopub.status.idle": "2023-11-04T18:31:20.847410Z", + "shell.execute_reply": "2023-11-04T18:31:20.846400Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# run this following cell to initialize and test an example Star object\n", + "\n", + "sun = Star('G2 V', 5780.0, 1.0, 1.0, 0.0, 4.44, 4.6)\n", + "\n", + "sun" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "124f519f", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"Star\")" + ] + }, + { + "cell_type": "markdown", + "id": "c8fe3393", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Segment 3.1: Creating `Star` objects from `stars_1.csv`\n", + "\n", + "Now that we have created the `Star` namedtuple, our next objective will be to read the files `stars_1.csv`, `stars_2.csv`, and `stars_3.csv` and create `Star` objects out of all the stars in there. In order to process the CSV files, you will first need to copy/paste the `process_csv` function you have been using since P6." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8aa239c1", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.863492Z", + "iopub.status.busy": "2023-11-04T18:31:20.863492Z", + "iopub.status.idle": "2023-11-04T18:31:20.870927Z", + "shell.execute_reply": "2023-11-04T18:31:20.869911Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# copy & paste the process_csv file from previous projects here\n" + ] + }, + { + "cell_type": "markdown", + "id": "e517862e", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "You are now ready to read the data in `stars_1.csv` using `process_csv` and convert the data into `Star` objects. In the cell below, you **must** read the data in `stars_1.csv` and extract the **header** and the non-header **rows** of the file." 
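+ "\n", + "In case your own `process_csv` is not at hand, a minimal sketch of what such a function typically looks like (built on the `csv` module) is shown below; if the version from your earlier projects differs, use that one.\n", + "\n", + "```python\n", + "import csv\n", + "\n", + "def process_csv(filename):\n", + "    # read a CSV file and return its contents as a list of lists of strings\n", + "    with open(filename, encoding=\"utf-8\") as f:\n", + "        return list(csv.reader(f))\n", + "```"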
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e913e4ea", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.879929Z", + "iopub.status.busy": "2023-11-04T18:31:20.878929Z", + "iopub.status.idle": "2023-11-04T18:31:20.887166Z", + "shell.execute_reply": "2023-11-04T18:31:20.886155Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# replace the ... with your code\n", + "\n", + "stars_1_csv = process_csv(os.path.join(\"small_data\", \"stars_1.csv\")) # read the data in 'stars_1.csv'\n", + "stars_header = ...\n", + "stars_1_rows = ..." + ] + }, + { + "cell_type": "markdown", + "id": "46b644cd", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "If you wish to **verify** that you have read the file and defined the variables correctly, you can check that `stars_header` has the value:\n", + "\n", + "```python\n", + "['Star Name', 'Spectral Type', 'Stellar Effective Temperature [K]', 'Stellar Radius [Solar Radius]', \n", + " 'Stellar Mass [Solar mass]', 'Stellar Luminosity [log(Solar)]', 'Stellar Surface Gravity [log10(cm/s**2)]',\n", + " 'Stellar Age [Gyr]']\n", + "```\n", + "\n", + "and that `stars_1_rows` has the value:\n", + "\n", + "```python\n", + "[['55 Cnc', 'G8V', '5172.00', '0.94', '0.91', '-0.197', '4.43', '10.200'],\n", + " ['DMPP-1', 'F8V', '6196.00', '1.26', '1.21', '0.320', '4.41', '2.010'],\n", + " ['GJ 876', 'M2.5V', '3271.00', '0.30', '0.32', '-1.907', '4.87', '1.000']]\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "1c093a04", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 8**: Create a `Star` object for the **first** star in `\"stars_1.csv\"`.\n", + "\n", + "The **attribute** of the `Star` namedtuple object, the corresponding **column** of the `stars_1.csv` file where the value should be obtained from, and the correct **data type** for the value are listed in the table below:\n", + "\n", + "|Attribute of `Star` object|Column of `stars_1.csv`|Data Type|\n", + "|---------|------|---------|\n", + "|`spectral_type`|Spectral Type|**string**|\n", + "|`stellar_effective_temperature`|Stellar Effective Temperature [K]|**float**|\n", + "|`stellar_radius`|Stellar Radius [Solar Radius]|**float**|\n", + "|`stellar_mass`|Stellar Mass [Solar mass]|**float**|\n", + "|`stellar_luminosity`|Stellar Luminosity [log(Solar)]|**float**|\n", + "|`stellar_surface_gravity`|Stellar Surface Gravity [log10(cm/s**2)]|**float**|\n", + "|`stellar_age`|Stellar Age [Gyr]|**float**|" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f2ce62c2", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.893167Z", + "iopub.status.busy": "2023-11-04T18:31:20.892166Z", + "iopub.status.idle": "2023-11-04T18:31:20.905692Z", + "shell.execute_reply": "2023-11-04T18:31:20.904681Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# replace the ... 
with your code\n", + "\n", + "row_idx = 0 # the index of the star we want to convert into a Star object\n", + "\n", + "# extract the values from stars_1_rows\n", + "spectral_type = stars_1_rows[row_idx][stars_header.index(...)]\n", + "stellar_effective_temperature = float(stars_1_rows[row_idx][stars_header.index(...)])\n", + "stellar_radius = ...\n", + "stellar_mass = ...\n", + "stellar_luminosity = ...\n", + "stellar_surface_gravity = ...\n", + "stellar_age = ...\n", + "\n", + "# initialize 'first_star'\n", + "first_star = Star(spectral_type, stellar_effective_temperature, stellar_radius, \\\n", + " stellar_mass, stellar_luminosity, \\\n", + " stellar_surface_gravity, stellar_age)\n", + "\n", + "first_star" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "af0985c5", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q8\")" + ] + }, + { + "cell_type": "markdown", + "id": "90fc47c3", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 9**: Create a `Star` object for the **second** star in `\"stars_1.csv\"`.\n", + "\n", + "You **must** create the `Star` object similarly to what you did in the previous question." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5314bd44", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.923080Z", + "iopub.status.busy": "2023-11-04T18:31:20.922079Z", + "iopub.status.idle": "2023-11-04T18:31:20.933977Z", + "shell.execute_reply": "2023-11-04T18:31:20.932967Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'second_star', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3257ccb4", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q9\")" + ] + }, + { + "cell_type": "markdown", + "id": "782b3175", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 10**: What is the `spectral_type` of the **second** star in `\"stars_1.csv\"`?\n", + "\n", + "You **must** answer this question by accessing the correct **attribute** of the `Star` object `second_star`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "93256002", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.949375Z", + "iopub.status.busy": "2023-11-04T18:31:20.948373Z", + "iopub.status.idle": "2023-11-04T18:31:20.955310Z", + "shell.execute_reply": "2023-11-04T18:31:20.955310Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# we have done this one for you\n", + "\n", + "second_star_spectral_type = second_star.spectral_type\n", + "\n", + "second_star_spectral_type" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e399b365", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q10\")" + ] + }, + { + "cell_type": "markdown", + "id": "1e3272b8", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 11**: What is the `stellar_age` of the **first** star in `\"stars_1.csv\"`?\n", + "\n", + "You **must** answer this question by accessing the correct **attribute** of the `Star` object `first_star`." 
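+ "\n", + "As a reminder, namedtuple attributes are read with dot notation and behave like ordinary values, so they can be printed, compared, or combined. For instance, with the example `sun` object created a few cells above (illustration only):\n", + "\n", + "```python\n", + "# 'sun' is the example Star object defined earlier in this notebook\n", + "print(sun.stellar_age)                        # 4.6\n", + "print(sun.stellar_radius / sun.stellar_mass)  # 1.0\n", + "```"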
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8a22b382", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.971477Z", + "iopub.status.busy": "2023-11-04T18:31:20.970477Z", + "iopub.status.idle": "2023-11-04T18:31:20.978275Z", + "shell.execute_reply": "2023-11-04T18:31:20.977263Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'first_star_stellar_age', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c9d471a6", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q11\")" + ] + }, + { + "cell_type": "markdown", + "id": "9e0bd9f2", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 12**: What is the **ratio** of the `stellar_radius` of the **first** star in `\"stars_1.csv\"` to the **second** star in `\"stars_1.csv\"`?\n", + "\n", + "You **must** answer this question by accessing the correct **attribute** of the `Star` objects `first_star` and `second_star`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c67a29c5", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:20.993595Z", + "iopub.status.busy": "2023-11-04T18:31:20.992596Z", + "iopub.status.idle": "2023-11-04T18:31:21.002202Z", + "shell.execute_reply": "2023-11-04T18:31:21.001189Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'stellar_radius_ratio', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1afeab22", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q12\")" + ] + }, + { + "cell_type": "markdown", + "id": "3c85c397", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 13**: Create a **dictionary** mapping the `name` of each star in `\"stars_1.csv\"` to its `Star` object.\n", + "\n", + "Your output **must** look like this:\n", + "```python\n", + "{'55 Cnc': Star(spectral_type='G8V', stellar_effective_temperature=5172.0, stellar_radius=0.94, \n", + " stellar_mass=0.91, stellar_luminosity=-0.197, stellar_surface_gravity=4.43, stellar_age=10.2),\n", + " 'DMPP-1': Star(spectral_type='F8V', stellar_effective_temperature=6196.0, stellar_radius=1.26, \n", + " stellar_mass=1.21, stellar_luminosity=0.32, stellar_surface_gravity=4.41, stellar_age=2.01),\n", + " 'GJ 876': Star(spectral_type='M2.5V', stellar_effective_temperature=3271.0, stellar_radius=0.3, \n", + " stellar_mass=0.32, stellar_luminosity=-1.907, stellar_surface_gravity=4.87, stellar_age=1.0)}\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "134dcc62", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.018575Z", + "iopub.status.busy": "2023-11-04T18:31:21.017573Z", + "iopub.status.idle": "2023-11-04T18:31:21.031533Z", + "shell.execute_reply": "2023-11-04T18:31:21.030521Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# replace the ... with your code\n", + "\n", + "stars_1_dict = {} # initialize empty dictionary to store all stars\n", + "\n", + "for row_idx in range(len(stars_1_rows)):\n", + " star_name = stars_1_rows[row_idx][stars_header.index(...)]\n", + " spectral_type = ...\n", + " stellar_effective_temperature = ...\n", + " # extract the other columns from 'stars_1_rows'\n", + " \n", + " star = ... 
# initialize the 'Star' object using the variables defined above\n", + " stars_1_dict[...] = star\n", + "\n", + "stars_1_dict" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e70bcc38", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q13\")" + ] + }, + { + "cell_type": "markdown", + "id": "4b5cd61d", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 14**: What is the `Star` object of the star (in `stars_1.csv`) named *GJ 876*?\n", + "\n", + "You **must** access the `Star` object in `stars_1_dict` **dictionary** defined above to answer this question." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1a1991bd", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.047647Z", + "iopub.status.busy": "2023-11-04T18:31:21.047647Z", + "iopub.status.idle": "2023-11-04T18:31:21.055344Z", + "shell.execute_reply": "2023-11-04T18:31:21.054332Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'gj_876', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c8548752", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q14\")" + ] + }, + { + "cell_type": "markdown", + "id": "34333560", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 15**: What is the `stellar_luminosity` of the star (in `stars_1.csv`) named *GJ 876*?\n", + "\n", + "You **must** access the `Star` object in `stars_1_dict` **dictionary** defined above to answer this question." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ec4581f3", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.071703Z", + "iopub.status.busy": "2023-11-04T18:31:21.071703Z", + "iopub.status.idle": "2023-11-04T18:31:21.078596Z", + "shell.execute_reply": "2023-11-04T18:31:21.078596Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'gj_876_luminosity', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "89c84497", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q15\")" + ] + }, + { + "cell_type": "markdown", + "id": "08555312", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Segment 3.2: Data Cleaning - missing data\n", + "\n", + "We have already parsed the data in `stars_1.csv`. We are now ready to parse the data in **all** the star files of the `small_data` directory. However, there is one minor inconvenience - there is some missing data in `stars_2.csv` and `stars_3.csv`. For example, this is the **first** row of `stars_2.csv`:\n", + "\n", + "```python\n", + "['HD 158259', 'G0', '5801.89', '1.21', '1.08', '0.212', '4.25', '']\n", + "```\n", + "\n", + "As you can see, the value of the last column (`Stellar Age [Gyr]`) is `''`, which means that the data is missing. 
When the data is missing, we will want the value of the corresponding attribute in the `Star` object to be `None`.\n", + "\n", + "So, for example, if we are to convert the row above to be a `Star` object, it should look like:\n", + "\n", + "```python\n", + "Star(spectral_type='G0', stellar_effective_temperature=5801.89, stellar_radius=1.21, stellar_mass=1.08,\n", + " stellar_luminosity=0.212, stellar_surface_gravity=4.25, stellar_age=None)\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "4e1c86b7", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Function 1: `star_cell(row_idx, col_name, stars_rows, header=stars_header)`\n", + "\n", + "Since we need to clean the values of the **list** of **lists** `stars_rows` before we can create our required data structure (**dictionary** mapping **strings** to `Star` objects), now would be a good time to create a function that takes in a `row_idx`, a `col_name` and a **list** of **lists** `stars_rows` (as well as the optional argument `header`) and returns the value of the column `col_name` at the row `row_idx`.\n", + "\n", + "This function **must** typecast the values it returns based on the `col_name`. If the value in `stars_rows` is missing (i.e., it is `''`), then the value returned **must** be `None`.\n", + "\n", + "Recall that the **column** of `stars_rows` where the value should be obtained from, and the correct **data type** for the value are listed in the table below:\n", + "\n", + "|Column of `stars_rows`|Data Type|\n", + "|------|---------|\n", + "|Star Name|**string**|\n", + "|Spectral Type|**string**|\n", + "|Stellar Effective Temperature [K]|**float**|\n", + "|Stellar Radius [Solar Radius]|**float**|\n", + "|Stellar Mass [Solar mass]|**float**|\n", + "|Stellar Luminosity [log(Solar)]|**float**|\n", + "|Stellar Surface Gravity [log10(cm/s**2)]|**float**|\n", + "|Stellar Age [Gyr]|**float**|\n", + "\n", + "**Hint:** You can use the `cell` function defined in P6 and P7 for inspiration here." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "57e495be", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.095472Z", + "iopub.status.busy": "2023-11-04T18:31:21.095472Z", + "iopub.status.idle": "2023-11-04T18:31:21.103336Z", + "shell.execute_reply": "2023-11-04T18:31:21.102325Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# replace the ... with your code\n", + "\n", + "# the default argument to the parameter 'header' is the global variable 'stars_header' defined above\n", + "def star_cell(row_idx, col_name, stars_rows, header=stars_header):\n", + " col_idx = header.index(...)\n", + " val = stars_rows[row_idx][col_idx]\n", + " # return None if value is missing\n", + " # else typecast 'val' and return it depending on 'col_name'" + ] + }, + { + "cell_type": "markdown", + "id": "763686e9", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 16**: Use the `star_cell` function to find the value of the column `\"Spectral Type\"` of the **first** star in `\"stars_2.csv\"`." 
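+ "\n", + "If the missing-value handling described for Function 1 above feels abstract, here is the same pattern on a made-up two-column table (not the lab's data):\n", + "\n", + "```python\n", + "# hypothetical header and rows, for illustration only\n", + "demo_header = [\"Name\", \"Age [Gyr]\"]\n", + "demo_rows = [[\"55 Cnc\", \"10.200\"], [\"HD 158259\", \"\"]]\n", + "\n", + "def demo_cell(row_idx, col_name, rows, header=demo_header):\n", + "    val = rows[row_idx][header.index(col_name)]\n", + "    if val == '':\n", + "        return None             # missing data becomes None\n", + "    if col_name == \"Age [Gyr]\":\n", + "        return float(val)       # numeric columns are typecast\n", + "    return val                  # everything else stays a string\n", + "\n", + "print(demo_cell(0, \"Age [Gyr]\", demo_rows))   # 10.2\n", + "print(demo_cell(1, \"Age [Gyr]\", demo_rows))   # None\n", + "```"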
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "56b65ea5", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.107335Z", + "iopub.status.busy": "2023-11-04T18:31:21.107335Z", + "iopub.status.idle": "2023-11-04T18:31:21.116140Z", + "shell.execute_reply": "2023-11-04T18:31:21.115124Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# we have done this one for you\n", + "\n", + "# first read the data in 'stars_2.csv' as a list of lists\n", + "stars_2_data = process_csv(os.path.join(\"small_data\", \"stars_2.csv\"))\n", + "stars_2_rows = stars_2_data[1:]\n", + "\n", + "# use the 'star_cell' function to extract the correct value\n", + "first_star_type = star_cell(0, 'Spectral Type', stars_2_rows)\n", + "\n", + "first_star_type" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3241ca1a", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q16\")" + ] + }, + { + "cell_type": "markdown", + "id": "501c9b4d", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 17**: Use the `star_cell` function to find the value of the column `\"Stellar Age [Gyr]\"` of the **second** star in `\"stars_2.csv\"`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0d1d8b72", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.137145Z", + "iopub.status.busy": "2023-11-04T18:31:21.136148Z", + "iopub.status.idle": "2023-11-04T18:31:21.142798Z", + "shell.execute_reply": "2023-11-04T18:31:21.141786Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# we have done this one for you\n", + "# do not worry if there is no output, the variable is expected to hold the value None\n", + "\n", + "# use the 'star_cell' function to extract the correct value\n", + "second_star_age = star_cell(1, 'Stellar Age [Gyr]', stars_2_rows)\n", + "\n", + "second_star_age" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b6b7c9e6", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q17\")" + ] + }, + { + "cell_type": "markdown", + "id": "15d28bae", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 18**: Use the `star_cell` function to find the value of the column `\"Stellar Mass [Solar mass]\"` of the **third** star in `\"stars_2.csv\"`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a0b1d827", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.164416Z", + "iopub.status.busy": "2023-11-04T18:31:21.163415Z", + "iopub.status.idle": "2023-11-04T18:31:21.173854Z", + "shell.execute_reply": "2023-11-04T18:31:21.172840Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# we have done this one for you\n", + "\n", + "# use the 'star_cell' function to extract the correct value\n", + "third_star_mass = star_cell(2, 'Stellar Mass [Solar mass]', stars_2_rows)\n", + "\n", + "third_star_mass" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f163f294", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q18\")" + ] + }, + { + "cell_type": "markdown", + "id": "e43b2f70", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 19**: Create a **dictionary** mapping the `name` of each star in `\"stars_2.csv\"` to its `Star` object.\n", + "\n", + "You **must** use the `star_cell` function to extract data from `stars_2.csv`.\n", + "\n", + "Your output **must** look like this:\n", + "```python\n", + "{'HD 158259': Star(spectral_type='G0', stellar_effective_temperature=5801.89, stellar_radius=1.21, \n", + " stellar_mass=1.08, stellar_luminosity=0.212, stellar_surface_gravity=4.25, stellar_age=None),\n", + " 'K2-187': Star(spectral_type=None, stellar_effective_temperature=5438.0, stellar_radius=0.83, \n", + " stellar_mass=0.97, stellar_luminosity=-0.21, stellar_surface_gravity=4.6, stellar_age=None),\n", + " 'WASP-47': Star(spectral_type=None, stellar_effective_temperature=5552.0, stellar_radius=1.14, \n", + " stellar_mass=1.04, stellar_luminosity=0.032, stellar_surface_gravity=4.34, stellar_age=6.5)}\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "45b7e5c2", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.197085Z", + "iopub.status.busy": "2023-11-04T18:31:21.197085Z", + "iopub.status.idle": "2023-11-04T18:31:21.211162Z", + "shell.execute_reply": "2023-11-04T18:31:21.210151Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# replace the ... with your code\n", + "\n", + "stars_2_dict = {} # initialize empty dictionary to store all stars\n", + "\n", + "for row_idx in range(len(stars_2_rows)):\n", + " star_name = star_cell(row_idx, 'Star Name', stars_2_rows)\n", + " spectral_type = ...\n", + " stellar_effective_temperature = ...\n", + " # extract the other columns from 'stars_2_rows'\n", + " \n", + " star = ... # initialize the 'Star' object using the variables defined above\n", + " stars_2_dict[...] 
= star\n", + "\n", + "stars_2_dict" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8177fa89", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q19\")" + ] + }, + { + "cell_type": "markdown", + "id": "1216c018", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 20**: Create a **dictionary** mapping the `name` of each star in `\"stars_3.csv\"` to its `Star` object.\n", + "\n", + "You **must** use the `star_cell` function to extract data from `stars_3.csv`.\n", + "\n", + "Your output **must** look like this:\n", + "```python\n", + "{'K2-133': Star(spectral_type='M1.5V', stellar_effective_temperature=3655.0, stellar_radius=0.46, \n", + " stellar_mass=0.46, stellar_luminosity=-1.479, stellar_surface_gravity=4.77, stellar_age=None),\n", + " 'K2-138': Star(spectral_type='G8V', stellar_effective_temperature=5356.3, stellar_radius=0.86, \n", + " stellar_mass=0.94, stellar_luminosity=-0.287, stellar_surface_gravity=4.54, stellar_age=2.8),\n", + " 'GJ 667 C': Star(spectral_type='M1.5V', stellar_effective_temperature=3350.0, stellar_radius=None, \n", + " stellar_mass=0.33, stellar_luminosity=-1.863, stellar_surface_gravity=4.69, stellar_age=2.0)}\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "94336d4f", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.228229Z", + "iopub.status.busy": "2023-11-04T18:31:21.227230Z", + "iopub.status.idle": "2023-11-04T18:31:21.240730Z", + "shell.execute_reply": "2023-11-04T18:31:21.239719Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'stars_3_dict', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "58942905", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q20\")" + ] + }, + { + "cell_type": "markdown", + "id": "87d0d814", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 21**: Combine the three **dictionaries** `stars_1_dict`, `stars_2_dict`, and `stars_3_dict` into a single **dictionary** with all the stars in the `small_data` directory." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "290facf3", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.257110Z", + "iopub.status.busy": "2023-11-04T18:31:21.256110Z", + "iopub.status.idle": "2023-11-04T18:31:21.266497Z", + "shell.execute_reply": "2023-11-04T18:31:21.266497Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# replace the ... with your code\n", + "\n", + "stars_dict = ... # initialize an empty dictionary\n", + "stars_dict.update(...) # add stars_1_dict to stars_dict\n", + "# add stars_2_dict and stars_3_dict to stars_dict\n", + "\n", + "stars_dict" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a634804e", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q21\")" + ] + }, + { + "cell_type": "markdown", + "id": "cb80f37a", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Data Structure 2: namedtuple `Planet`\n", + "\n", + "Just as you did with the stars, you will be using named tuples to store the data about the planets in the `planets_1.csv`, `planets_2.csv`, and `planets_3.csv` files. 
Before you start reading these files however, you **must** create a new `Planet` type (using namedtuple). It **must** have the following attributes:\n", + "\n", + "* `planet_name`,\n", + "* `host_name`,\n", + "* `discovery_method`,\n", + "* `discovery_year`,\n", + "* `controversial_flag`,\n", + "* `orbital_period`,\n", + "* `planet_radius`,\n", + "* `planet_mass`,\n", + "* `semi_major_radius`,\n", + "* `eccentricity`,\n", + "* `equilibrium_temperature`\n", + "* `insolation_flux`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f2748264", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.285141Z", + "iopub.status.busy": "2023-11-04T18:31:21.284140Z", + "iopub.status.idle": "2023-11-04T18:31:21.291175Z", + "shell.execute_reply": "2023-11-04T18:31:21.290166Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# define the namedtuple 'Planet' here\n", + "\n", + "planets_attributes = ... # initialize the list of attributes\n", + "\n", + "# define the namedtuple 'Planet'\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dfe08fa0", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.295176Z", + "iopub.status.busy": "2023-11-04T18:31:21.295176Z", + "iopub.status.idle": "2023-11-04T18:31:21.303285Z", + "shell.execute_reply": "2023-11-04T18:31:21.302273Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# run this following cell to initialize and test an example Planet object\n", + "# if this cell fails to execute, you have likely not defined the namedtuple 'Planet' correctly\n", + "jupiter = Planet('Jupiter', 'Sun', 'Imaging', 1610, False, 4333.0, 11.209, 317.828, 5.2038, 0.0489, 110, 0.0345)\n", + "\n", + "jupiter" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4ad341be", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"Planet\")" + ] + }, + { + "cell_type": "markdown", + "id": "ed437cd7", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Segment 3.3: Creating `Planet` objects\n", + "\n", + "We are now ready to read the files in the `small_data` directory and create `Planet` objects. Creating `Planet` objects however, is going to be more difficult than creating `Star` objects, because the data required to create a single `Planet` object is split up into different files.\n", + "\n", + "The `planets_1.csv`, `planets_2.csv`, and `planets_3.csv` files contain all the data required to create `Planet` objects **except** for the `host_name`. The `host_name` for each planet is to be found in the `mapping_1.json`, `mapping_2.json`, and `mapping_3.json` files." + ] + }, + { + "cell_type": "markdown", + "id": "418c826f", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "First, let us read the data in `planets_1.csv`. Since this is a CSV file, you can use the `process_csv` function from above to read this file. In the cell below, you **must** read the data in `planets_1.csv` and extract the **header** and the non-header **rows** of the file." + ] + }, + { + "cell_type": "markdown", + "id": "3b4a9d38", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 22**: Read the contents of `'planets_1.csv'` into a **list** of **lists** using the `process_csv` function, and extract the **header** and the **rows** in the file." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bb5fe960", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.321312Z", + "iopub.status.busy": "2023-11-04T18:31:21.320306Z", + "iopub.status.idle": "2023-11-04T18:31:21.328084Z", + "shell.execute_reply": "2023-11-04T18:31:21.327073Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# replace the ... with your code\n", + "\n", + "planets_1_csv = process_csv(...) # read the data in 'planets_1.csv'\n", + "planets_header = ...\n", + "planets_1_rows = ..." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "821a1f7e", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q22\")" + ] + }, + { + "cell_type": "markdown", + "id": "2049dbfd", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "Now, you are ready to read the data in `mapping_1.json`. Since this is a JSON file, you will need a new function to read this file:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b7ab3301", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.345064Z", + "iopub.status.busy": "2023-11-04T18:31:21.344065Z", + "iopub.status.idle": "2023-11-04T18:31:21.350424Z", + "shell.execute_reply": "2023-11-04T18:31:21.349413Z" + } + }, + "outputs": [], + "source": [ + "# this function uses the 'load' function from the json module (already imported in this notebook) to read files\n", + "def read_json(path):\n", + " with open(path, encoding=\"utf-8\") as f:\n", + " return json.load(f)" + ] + }, + { + "cell_type": "markdown", + "id": "8659adf2", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 23**: Read the contents of `'mapping_1.json'` into a **dictionary** using the `read_json` function." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "147e2924", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.355425Z", + "iopub.status.busy": "2023-11-04T18:31:21.354428Z", + "iopub.status.idle": "2023-11-04T18:31:21.364394Z", + "shell.execute_reply": "2023-11-04T18:31:21.363382Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# we have done this for you\n", + "\n", + "mapping_1_json = read_json(os.path.join(\"small_data\", \"mapping_1.json\"))\n", + "\n", + "mapping_1_json" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6cf69463", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q23\")" + ] + }, + { + "cell_type": "markdown", + "id": "f413cd28", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Segment 3.4: Combining data from CSV and JSON files\n", + "\n", + "We are now ready to combine the data from `planets_1_rows` and `mapping_1_json` to create `Planet` objects. Before we start, it might be useful to create a function similar to `star_cell` for preprocessing the values in the CSV files." + ] + }, + { + "cell_type": "markdown", + "id": "4f6044d2", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Function 2: `planet_cell(row_idx, col_name, planets_rows, header=planets_header)`\n", + "\n", + "Just like the data in `stars_1.csv`, `stars_2.csv`, and `stars_3.csv`, some of the data in `planets_1.csv`, `planets_2.csv`, and `planets_3.csv` is **missing**. 
So, now would be a good time to create a function that takes in a `row_idx`, a `col_name` and a **list** of **lists** `planets_rows` (as well as the optional argument `header`) and returns the value of the column `col_name` at the row `row_idx`.\n", + "\n", + "This function **must** typecast the values it returns based on the `col_name`. If the value in `planets_rows` is missing (i.e., it is `''`), then the value returned **must** be `None`.\n", + "\n", + "The **column** of `planets_rows` where the value should be obtained from, and the correct **data type** for the value are listed in the table below:\n", + "\n", + "|Column of `planets_rows`|Data Type|\n", + "|------|---------|\n", + "|Planet Name|**string**|\n", + "|Discovery Year|**int**|\n", + "|Discovery Method|**string**|\n", + "|Controversial Flag|**bool**|\n", + "|Orbital Period [days]|**float**|\n", + "|Planet Radius [Earth Radius]|**float**|\n", + "|Planet Mass [Earth Mass]|**float**|\n", + "|Orbit Semi-Major Axis [au]|**float**|\n", + "|Eccentricity|**float**|\n", + "|Equilibrium Temperature [K]|**float**|\n", + "|Insolation Flux [Earth Flux]|**float**|\n", + "\n", + "**Important Warning:** Notice that the `Controversial Flag` column has to be converted into a **bool**. The data is stored in `planets_1.csv` (and consequently in `planets_rows`) as `\"0\"/\"1\"` values (with `\"0\"` representing `False` and `\"1\"` representing `True`). However typecasting **strings** to **bools** is not straightforward. Run the following cell and try to figure out what is happening:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dff7b4a9", + "metadata": { + "deletable": false, + "editable": false, + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.382575Z", + "iopub.status.busy": "2023-11-04T18:31:21.382575Z", + "iopub.status.idle": "2023-11-04T18:31:21.391608Z", + "shell.execute_reply": "2023-11-04T18:31:21.389595Z" + } + }, + "outputs": [], + "source": [ + "strings = [\"0\", \"1\", \"\", \" \", \"True\", \"False\"]\n", + "for string in strings:\n", + " print(bool(string))" + ] + }, + { + "cell_type": "markdown", + "id": "1a940371", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "If you want to convert the **strings** into **bools**, you will have to explicitly use `if/else` statements to determine whether the value is `\"0\"` or `\"1\"`, as can be seen in the starter code below:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3069bccd", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.399605Z", + "iopub.status.busy": "2023-11-04T18:31:21.398607Z", + "iopub.status.idle": "2023-11-04T18:31:21.408902Z", + "shell.execute_reply": "2023-11-04T18:31:21.407885Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# replace the ... with your code\n", + "\n", + "def planet_cell(row_idx, col_name, planets_rows, header=planets_header):\n", + " col_idx = ... # extract col_idx from col_name and header\n", + " val = ... 
# extract the value at row_idx and col_idx\n", + " if val == '':\n", + " return None\n", + " if col_name in [\"Controversial Flag\"]:\n", + " if val == \"1\":\n", + " return ...\n", + " else:\n", + " return ...\n", + " # for all other columns typecast 'val' and return it depending on col_name" + ] + }, + { + "cell_type": "markdown", + "id": "c8d81a8a", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 24**: Use the `planet_cell` function to find the value of the column `\"Planet Name\"` of the **first** planet in `\"planets_1.csv\"`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9debe1fa", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.415902Z", + "iopub.status.busy": "2023-11-04T18:31:21.414902Z", + "iopub.status.idle": "2023-11-04T18:31:21.426645Z", + "shell.execute_reply": "2023-11-04T18:31:21.425626Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# we have done this one for you\n", + "\n", + "first_planet_name = planet_cell(0, 'Planet Name', planets_1_rows)\n", + "\n", + "first_planet_name" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e7d83ea2", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q24\")" + ] + }, + { + "cell_type": "markdown", + "id": "b40d6274", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 25**: Use the `planet_cell` function to find the value of the column `\"Insolation Flux [Earth Flux]\"` of the **first** planet in `\"planets_1.csv\"`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fb490a1f", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.444512Z", + "iopub.status.busy": "2023-11-04T18:31:21.443511Z", + "iopub.status.idle": "2023-11-04T18:31:21.451416Z", + "shell.execute_reply": "2023-11-04T18:31:21.450747Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# we have done this one for you\n", + "# do not worry if there is no output, the variable is expected to hold the value None\n", + "\n", + "first_planet_flux = planet_cell(0, 'Insolation Flux [Earth Flux]', planets_1_rows)\n", + "\n", + "first_planet_flux" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cf8585d5", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q25\")" + ] + }, + { + "cell_type": "markdown", + "id": "8b24bcda", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 26**: Use the `planet_cell` function to find the value of the column `\"Controversial Flag\"` of the **second** planet in `\"planets_1.csv\"`." 
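+ ] + }, + { + "cell_type": "markdown", + "id": "1a2b3c4d", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "If the check below complains about the **data type** of your answer, revisit the string-to-bool conversion inside `planet_cell`. As a reminder, here is a minimal sketch on a standalone hypothetical value (not the lab data): any non-empty string is truthy, so the conversion has to compare the actual characters.\n", + "\n", + "```python\n", + "# hypothetical flag text, as it might appear in a CSV cell\n", + "flag_text = '1'\n", + "\n", + "# bool('0') would be True because the string is non-empty,\n", + "# so compare against the expected characters instead\n", + "flag_value = (flag_text == '1')\n", + "\n", + "flag_value  # True\n", + "```"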
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fe557bd6", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.470265Z", + "iopub.status.busy": "2023-11-04T18:31:21.469265Z", + "iopub.status.idle": "2023-11-04T18:31:21.479630Z", + "shell.execute_reply": "2023-11-04T18:31:21.478615Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'second_planet_controversy', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bfe7339d", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q26\")" + ] + }, + { + "cell_type": "markdown", + "id": "87f5ae58", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 27**: Create a `Planet` object for the **first** planet in `\"planets_1.csv\"`.\n", + "\n", + "The **attribute** of the `Planet` namedtuple object, the corresponding **column** of the `planets_1.csv` file where the value should be obtained from, and the correct **data type** for the value are listed in the table below:\n", + "\n", + "|Attribute of `Planet` object|Column of `planets_1.csv`|Data Type|\n", + "|---------|------|---------|\n", + "|`planet_name`|Planet Name|**string**|\n", + "|`host_name`| - |**string**|\n", + "|`discovery_method`|Discovery Method|**string**|\n", + "|`discovery_year`|Discovery Year|**int**|\n", + "|`controversial_flag`|Controversial Flag|**bool**|\n", + "|`orbital_period`|Orbital Period [days]|**float**|\n", + "|`planet_radius`|Planet Radius [Earth Radius]|**float**|\n", + "|`planet_mass`|Planet Mass [Earth Mass]|**float**|\n", + "|`semi_major_radius`|Orbit Semi-Major Axis [au]|**float**|\n", + "|`eccentricity`|Eccentricity|**float**|\n", + "|`equilibrium_temperature`|Equilibrium Temperature [K]|**float**|\n", + "|`insolation_flux`|Insolation Flux [Earth Flux]|**float**|\n", + "\n", + "\n", + "The value of the `host_name` attribute is found in `mapping_1.json`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b66cf1f7", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.499507Z", + "iopub.status.busy": "2023-11-04T18:31:21.498506Z", + "iopub.status.idle": "2023-11-04T18:31:21.514441Z", + "shell.execute_reply": "2023-11-04T18:31:21.513428Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# replace the ...
with your code\n", + "\n", + "row_idx = 0 # the index of the planet we want to convert into a Planet object\n", + "\n", + "# extract the values from planets_1_rows\n", + "planet_name = planet_cell(row_idx, 'Planet Name', planets_1_rows)\n", + "host_name = mapping_1_json[planet_name]\n", + "discovery_method = planet_cell(row_idx, 'Discovery Method', planets_1_rows)\n", + "discovery_year = ...\n", + "controversial_flag = ...\n", + "orbital_period = ...\n", + "planet_radius = ...\n", + "planet_mass = ...\n", + "semi_major_radius = ...\n", + "eccentricity = ...\n", + "equilibrium_temperature = ...\n", + "insolation_flux = ...\n", + "\n", + "# initialize 'first_planet'\n", + "first_planet = Planet(planet_name, host_name, discovery_method, discovery_year,\\\n", + " controversial_flag, orbital_period, planet_radius, planet_mass,\\\n", + " semi_major_radius, eccentricity, equilibrium_temperature, insolation_flux)\n", + "\n", + "first_planet" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "63d00846", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q27\")" + ] + }, + { + "cell_type": "markdown", + "id": "bc475a2e", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 28**: Create a **list** of `Planet` objects of each planet in `\"planets_1.csv\"`.\n", + "\n", + "Your output **must** look like this:\n", + "```python\n", + "[Planet(planet_name='55 Cnc b', host_name='55 Cnc', discovery_method='Radial Velocity', \n", + " discovery_year=1996, controversial_flag=False, orbital_period=14.6516, \n", + " planet_radius=13.9, planet_mass=263.9785, semi_major_radius=0.1134, eccentricity=0.0,\n", + " equilibrium_temperature=700.0, insolation_flux=None),\n", + " Planet(planet_name='55 Cnc c', host_name='55 Cnc', discovery_method='Radial Velocity', \n", + " discovery_year=2004, controversial_flag=False, orbital_period=44.3989, \n", + " planet_radius=8.51, planet_mass=54.4738, semi_major_radius=0.2373, eccentricity=0.03, \n", + " equilibrium_temperature=None, insolation_flux=None),\n", + " Planet(planet_name='DMPP-1 b', host_name='DMPP-1', discovery_method='Radial Velocity', \n", + " discovery_year=2019, controversial_flag=False, orbital_period=18.57, \n", + " planet_radius=5.29, planet_mass=24.27, semi_major_radius=0.1462, eccentricity=0.083, \n", + " equilibrium_temperature=877.0, insolation_flux=None),\n", + " Planet(planet_name='GJ 876 b', host_name='GJ 876', discovery_method='Radial Velocity', \n", + " discovery_year=1998, controversial_flag=False, orbital_period=61.1166, \n", + " planet_radius=13.3, planet_mass=723.2235, semi_major_radius=0.208317, eccentricity=0.0324,\n", + " equilibrium_temperature=None, insolation_flux=None),\n", + " Planet(planet_name='GJ 876 c', host_name='GJ 876', discovery_method='Radial Velocity', \n", + " discovery_year=2000, controversial_flag=False, orbital_period=30.0881, \n", + " planet_radius=14.0, planet_mass=226.9846, semi_major_radius=0.12959, eccentricity=0.25591, \n", + " equilibrium_temperature=None, insolation_flux=None)]\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "af0c689c", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.536073Z", + "iopub.status.busy": "2023-11-04T18:31:21.536073Z", + "iopub.status.idle": "2023-11-04T18:31:21.552781Z", + "shell.execute_reply": "2023-11-04T18:31:21.551771Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the 
answer in the variable 'planets_1_list', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "180671d1", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q28\")" + ] + }, + { + "cell_type": "markdown", + "id": "4a71cb3f", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 29**: What is the **fifth** `Planet` object in `'planets_1.csv'`?\n", + "\n", + "You **must** access from the `planets_1_list` to answer this question." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9658a259", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.577746Z", + "iopub.status.busy": "2023-11-04T18:31:21.576747Z", + "iopub.status.idle": "2023-11-04T18:31:21.588561Z", + "shell.execute_reply": "2023-11-04T18:31:21.587547Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'fifth_planet', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "14a7483d", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q29\")" + ] + }, + { + "cell_type": "markdown", + "id": "a43d30a6", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 30**: What is the `planet_name` of the **fifth** `Planet` in `'planets_1.csv'`?\n", + "\n", + "You **must** access from the `planets_1_list` to answer this question." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "06b05ef5", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.614367Z", + "iopub.status.busy": "2023-11-04T18:31:21.613368Z", + "iopub.status.idle": "2023-11-04T18:31:21.624426Z", + "shell.execute_reply": "2023-11-04T18:31:21.623410Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'fifth_planet_name', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3b61fa54", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q30\")" + ] + }, + { + "cell_type": "markdown", + "id": "dc5c8fbc", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 31**: What is the `controversial_flag` of the **fourth** `Planet` in `'planets_1.csv'`?\n", + "\n", + "You **must** access from the `planets_1_list` to answer this question." 
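+ ] + }, + { + "cell_type": "markdown", + "id": "2b3c4d5e", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "Like Questions 29 and 30 above, this question is a single list access. Here is a minimal sketch on hypothetical data (the `Pet` namedtuple below is made up and unrelated to the lab), mainly as a reminder that Python indexing starts at 0, so the **fourth** element of a list sits at index 3.\n", + "\n", + "```python\n", + "from collections import namedtuple\n", + "\n", + "# hypothetical namedtuple and list, unrelated to the lab data\n", + "Pet = namedtuple('Pet', ['name', 'age'])\n", + "pets = [Pet('Ada', 3), Pet('Bo', 5), Pet('Cy', 1), Pet('Dot', 7)]\n", + "\n", + "fourth_pet = pets[3]          # the fourth element is at index 3\n", + "fourth_pet_age = pets[3].age  # attribute access on a namedtuple\n", + "\n", + "fourth_pet_age  # 7\n", + "```"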
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8691a7d6", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.651046Z", + "iopub.status.busy": "2023-11-04T18:31:21.650046Z", + "iopub.status.idle": "2023-11-04T18:31:21.661414Z", + "shell.execute_reply": "2023-11-04T18:31:21.660400Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'fourth_planet_controversy', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "55fe3856", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q31\")" + ] + }, + { + "cell_type": "markdown", + "id": "56abdbf1", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Segment 3.5: Data Cleaning - broken CSV rows\n", + "\n", + "The code you have written worked well for reading the data in `planets_1.csv` and `mapping_1.json`. However, it will likely **not** work for `planets_2.csv` and `mapping_2.json`. This is because the file `planets_2.csv` is **broken**. For some reason, a few rows in `planets_2.csv` have their data jumbled up. This is what `planets_2.csv` looks like:\n", + "\n", + "|Planet Name|Discovery Method|Discovery Year|Controversial Flag|Orbital Period [days]|Planet Radius [Earth Radius]|Planet Mass [Earth Mass]|Orbit Semi-Major Axis [au]|Eccentricity|Equilibrium Temperature [K]|Insolation Flux [Earth Flux]|\n", + "|-----------|----------------|--------------|------------------|---------------------|----------------------------|------------------------|--------------------------|------------|---------------------------|----------------------------|\n", + "|HD 158259 b|Radial Velocity|2020|0|2.17800000|1.292|2.22000|||1478|794.22|\n", + "|K2-187 b|Transit|2018|0|0.77401000|1.200|1.87000|0.016400||1815||\n", + "|K2-187 c|Transit|2018|0|2.87151200|1.400|2.54000|0.039200||1173||\n", + "|K2-187 d|K2-187|Transit|2018|0|7.14958400|2.400|6.35000|0.072000||865|\n", + "|WASP-47 b|2012|Transit|0|4.15914920|12.640|363.60000|0.052000|0.002800|1275|534.00|\n", + "\n", + "We can see that for some reason, in the **fourth** row, the value under the column `Discovery Method` is the name of the planet's host star. This causes all the other columns in the row to also take meaningless values.\n", + "\n", + "Similarly, in the **fifth** row, we see that the values under the columns `Discovery Method` and `Discovery Year` are swapped.\n", + "\n", + "We will call a **row** in a CSV file where the values do not match the format expected by the columns a **broken row**. While it is sometimes possible to extract useful data from broken rows, in this lab and in P10, we will simply **skip** broken rows.\n", + "\n", + "In order to **skip** broken rows, you should first know how to recognize a **broken row**. Unfortunately, there is no general rule that identifies when a row is broken, because CSV rows can be **broken** in all sorts of different ways. Thankfully, we don't have to write code to catch every weird case. It will suffice to manually **inspect** the file `planets_2.csv` and identify **how** the rows are broken.\n", + "\n", + "The simplest way to recognize a broken row is that you run into **runtime errors** when you execute your code. 
So, one simple way to skip bad rows would be to use `try/except` blocks to avoid processing any rows that cause the code to crash.\n", + "\n", + "**Important Note:** In this dataset, as you might have already noticed, it would be **significantly harder** to detect **broken rows** where some of the numerical values are swapped (for example, `Planet Radius [Earth Radius]` and `Planet Mass [Earth Mass]`). You may **assume** that the numerical values are **not** swapped in **any** row, and that **only the rows** in which the **data types** are not as expected are **broken**." + ] + }, + { + "cell_type": "markdown", + "id": "89325183", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 32**: Create a **list** of `Planet` objects of each planet in `\"planets_2.csv\"`.\n", + "\n", + "You **must** skip any broken rows in the CSV file. Your output **must** look like this:\n", + "```python\n", + "[Planet(planet_name='HD 158259 b', host_name='HD 158259', discovery_method='Radial Velocity', \n", + " discovery_year=2020, controversial_flag=False, orbital_period=2.178, \n", + " planet_radius=1.292, planet_mass=2.22, semi_major_radius=None, eccentricity=None, \n", + " equilibrium_temperature=1478.0, insolation_flux=794.22),\n", + " Planet(planet_name='K2-187 b', host_name='K2-187', discovery_method='Transit', \n", + " discovery_year=2018, controversial_flag=False, orbital_period=0.77401, \n", + " planet_radius=1.2, planet_mass=1.87, semi_major_radius=0.0164, eccentricity=None, \n", + " equilibrium_temperature=1815.0, insolation_flux=None),\n", + " Planet(planet_name='K2-187 c', host_name='K2-187', discovery_method='Transit', \n", + " discovery_year=2018, controversial_flag=False, orbital_period=2.871512, \n", + " planet_radius=1.4, planet_mass=2.54, semi_major_radius=0.0392, eccentricity=None, \n", + " equilibrium_temperature=1173.0, insolation_flux=None)]\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a4c36c6c", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.688594Z", + "iopub.status.busy": "2023-11-04T18:31:21.687592Z", + "iopub.status.idle": "2023-11-04T18:31:21.726918Z", + "shell.execute_reply": "2023-11-04T18:31:21.725903Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# replace the ... with your code\n", + "\n", + "planets_2_data = ... # read planets_2.csv\n", + "planets_2_rows = ... # extract the rows from planets_2_data\n", + "mapping_2_json = ... # read mapping_2.json\n", + "\n", + "planets_2_list = []\n", + "for row_idx in range(len(planets_2_rows)):\n", + " try:\n", + " pass # replace with your code\n", + " # create a Planet object and append to 'planets_2_list'\n", + " except ValueError:\n", + " continue\n", + "\n", + "planets_2_list" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1f6776a3", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q32\")" + ] + }, + { + "cell_type": "markdown", + "id": "ea2a40e6", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Important Warning:** It is considered a bad coding practice to use *bare* `try/except` blocks. 
This means that you should **never** write code like this:\n", + "\n", + "```python\n", + "try:\n", + " # some code\n", + "except:\n", + " # some other code\n", + "```\n", + "\n", + "If you use *bare* `try/except` blocks, your code will seemingly work even if there are bugs in there, and it can get very hard to debug. You should always **explicitly** catch for specific errors like this:\n", + "\n", + "```python\n", + "try:\n", + " # some code\n", + "except ValueError:\n", + " # some other code\n", + "except IndexError:\n", + " # some other code\n", + "```\n", + "\n", + "This way, your code will still crash if there is some other unexpected bug in your code that needs to be fixed, and will only go to the `except` block if it runs into a `ValueError` or an `IndexError`. The starter code above already catches specifically for `ValueError`. You **must** continue this practice in P10 as well." + ] + }, + { + "cell_type": "markdown", + "id": "c7a86bc8", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Segment 3.6: Data Cleaning - broken JSON files\n", + "\n", + "So far, we have written code that can read `planets_1.csv` and `mapping_1.json`, as well as `planets_2.csv` and `mapping_2.json`. However, if you try to read `mapping_3.json`, you are likely to run into some issues. This is because the file `mapping_3.json` is **broken**. Unlike **broken** CSV files, where we only had to skip the **broken rows**, it is much harder to parse **broken JSON files**. When a JSON file is **broken**, we often have no choice but to **skip the file entirely**.\n", + "\n", + "It is also not easy to detect if a JSON file is **broken** using `if` statements. The easiest is to simply try to read the file using the `read_json` function and check if the code crashes." + ] + }, + { + "cell_type": "markdown", + "id": "ce45caff", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 33**: Determine if the `'mapping_3.json'` file is **broken** using a `try/except` block." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9385a807", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.757619Z", + "iopub.status.busy": "2023-11-04T18:31:21.756616Z", + "iopub.status.idle": "2023-11-04T18:31:21.771455Z", + "shell.execute_reply": "2023-11-04T18:31:21.769416Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# we have done this one for you\n", + "\n", + "try:\n", + " mapping_3_json = read_json(os.path.join(\"small_data\", \"mapping_3.json\"))\n", + "except json.JSONDecodeError:\n", + " mapping_3_json = {}\n", + " \n", + "mapping_3_json" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "115bdede", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q33\")" + ] + }, + { + "cell_type": "markdown", + "id": "14baf23e", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "In the above cell, note that in the `try/except` block, we specifically checked for the `json.JSONDecodeError`. This is the error that is thrown when you try to call `json.load` on a **broken** JSON file." + ] + }, + { + "cell_type": "markdown", + "id": "648f3a93", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "## Segment 4: Data Analysis\n", + "\n", + "We have now managed to read all the data in the `small_data` directory. Now is the time to test if our data structures work!" 
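+ ] + }, + { + "cell_type": "markdown", + "id": "3c4d5e6f", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "Several of the questions below chain the two data structures together: a planet's `host_name` attribute is used as a key into the dictionary of stars. The following is only a rough sketch of that lookup pattern on tiny hypothetical structures (the `MiniPlanet` and `MiniStar` types and all the sample values are made up, not the lab data).\n", + "\n", + "```python\n", + "from collections import namedtuple\n", + "\n", + "# tiny hypothetical structures that mirror the shapes used in this lab\n", + "MiniPlanet = namedtuple('MiniPlanet', ['planet_name', 'host_name'])\n", + "MiniStar = namedtuple('MiniStar', ['spectral_type', 'stellar_radius'])\n", + "\n", + "mini_planets = [MiniPlanet('b', 'Alpha'), MiniPlanet('c', 'Beta')]\n", + "mini_stars = {'Alpha': MiniStar('G2 V', 1.0), 'Beta': MiniStar('K1 V', 0.8)}\n", + "\n", + "# the host star of the second planet, then one of its attributes\n", + "host_star = mini_stars[mini_planets[1].host_name]\n", + "host_star.stellar_radius  # 0.8\n", + "```"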
+ ] + }, + { + "cell_type": "markdown", + "id": "bc7ab671", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 34**: What is the `host_name` of the **second** planet in `'planets_2.csv'`?\n", + "\n", + "You **must** skip any broken rows. So, you can directly access from the list `planets_2_list` to answer this question." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8775af92", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.797384Z", + "iopub.status.busy": "2023-11-04T18:31:21.796386Z", + "iopub.status.idle": "2023-11-04T18:31:21.809364Z", + "shell.execute_reply": "2023-11-04T18:31:21.807351Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'second_planet_host', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "680fc9ea", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q34\")" + ] + }, + { + "cell_type": "markdown", + "id": "e557c06b", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 35**: What is the `Star` object of the **third** planet in `'planets_2.csv'`?\n", + "\n", + "You **must** skip any broken rows. So, you can directly access from the list `planets_2_list` to answer this question.\n", + "\n", + "**Hint:** You can use the `stars_dict` **dictionary** defined in Question 21 above to find the `Star` object." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "de95be9c", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.836596Z", + "iopub.status.busy": "2023-11-04T18:31:21.835596Z", + "iopub.status.idle": "2023-11-04T18:31:21.846718Z", + "shell.execute_reply": "2023-11-04T18:31:21.845701Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'third_planet_star', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d5ffad69", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q35\")" + ] + }, + { + "cell_type": "markdown", + "id": "6e1a4aa4", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 36**: What is the `stellar_radius` of the star around which the **first** planet in `'planets_1.csv'` orbits?\n", + "\n", + "You can directly access from the list `planets_1_list` to answer this question." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f03feac1", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-04T18:31:21.876761Z", + "iopub.status.busy": "2023-11-04T18:31:21.875760Z", + "iopub.status.idle": "2023-11-04T18:31:21.886388Z", + "shell.execute_reply": "2023-11-04T18:31:21.884361Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'first_planet_star_radius', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6418cc60", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q36\")" + ] + }, + { + "cell_type": "markdown", + "id": "05ac549f", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "## Great work! You are now ready to start P10."
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.4" + }, + "otter": { + "OK_FORMAT": true, + "tests": { + "Planet": { + "name": "Planet", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('Planet', jupiter)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "Star": { + "name": "Star", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('Star', sun)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q1": { + "name": "q1", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q1', all_files)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q10": { + "name": "q10", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q10', second_star_spectral_type)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q11": { + "name": "q11", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q11', second_star_spectral_type)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q12": { + "name": "q12", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q12', stellar_radius_ratio)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q13": { + "name": "q13", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q13', stars_1_dict)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q14": { + "name": "q14", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q14', gj_876)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q15": { + "name": "q15", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q15', gj_876_luminosity)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q16": { + "name": "q16", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q16', first_star_type)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q17": { + "name": "q17", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q17', second_star_age)\nAll test cases passed!\n", + "hidden": 
false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q18": { + "name": "q18", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q18', third_star_mass)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q19": { + "name": "q19", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q19', stars_2_dict)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q2": { + "name": "q2", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q2', actual_files)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q20": { + "name": "q20", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q20', stars_3_dict)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q21": { + "name": "q21", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q21', stars_dict)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q22": { + "name": "q22", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q22', {'planets_1_csv': planets_1_csv, 'planets_header': planets_header, 'planets_1_rows': planets_1_rows})\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q23": { + "name": "q23", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q23', mapping_1_json)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q24": { + "name": "q24", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q24', first_planet_name)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q25": { + "name": "q25", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q25', first_planet_flux)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q26": { + "name": "q26", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q26', second_planet_controversy)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q27": { + "name": "q27", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q27', first_planet)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q28": { + "name": "q28", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q28', 
planets_1_list)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q29": { + "name": "q29", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q29', fifth_planet)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q3": { + "name": "q3", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q3', files_in_small_data)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q30": { + "name": "q30", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q30', fifth_planet_name)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q31": { + "name": "q31", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q31', fourth_planet_controversy)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q32": { + "name": "q32", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q32', planets_2_list)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q33": { + "name": "q33", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q33', mapping_3_json)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q34": { + "name": "q34", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q34', second_planet_host)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q35": { + "name": "q35", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q35', third_planet_star)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q36": { + "name": "q36", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q36', first_planet_star_radius)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q4": { + "name": "q4", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q4', stars_1_path)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q5": { + "name": "q5", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q5', paths_in_small_data)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q6": { + "name": "q6", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q6', 
json_paths)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q7": { + "name": "q7", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q7', stars_paths)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q8": { + "name": "q8", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q8', first_star)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q9": { + "name": "q9", + "points": 5, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q9', second_star)\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + } + } + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/lab-p10/public_tests.py b/lab-p10/public_tests.py new file mode 100644 index 0000000000000000000000000000000000000000..dadbbba29d635a20b9f93146ddf95b182ef38312 --- /dev/null +++ b/lab-p10/public_tests.py @@ -0,0 +1,1056 @@ +#!/usr/bin/python +# + +import os, json, math, copy +from collections import namedtuple +from bs4 import BeautifulSoup + +HIDDEN_FILE = os.path.join("hidden", "hidden_tests.py") +if os.path.exists(HIDDEN_FILE): + import hidden.hidden_tests as hidn +# - + +MAX_FILE_SIZE = 750 # units - KB +REL_TOL = 6e-04 # relative tolerance for floats +ABS_TOL = 15e-03 # absolute tolerance for floats +TOTAL_SCORE = 100 # total score for the project + +DF_FILE = 'expected_dfs.html' +PLOT_FILE = 'expected_plots.json' + +PASS = "All test cases passed!" + +TEXT_FORMAT = "TEXT_FORMAT" # question type when expected answer is a type, str, int, float, or bool +TEXT_FORMAT_UNORDERED_LIST = "TEXT_FORMAT_UNORDERED_LIST" # question type when the expected answer is a list or a set where the order does *not* matter +TEXT_FORMAT_ORDERED_LIST = "TEXT_FORMAT_ORDERED_LIST" # question type when the expected answer is a list or tuple where the order does matter +TEXT_FORMAT_DICT = "TEXT_FORMAT_DICT" # question type when the expected answer is a dictionary +TEXT_FORMAT_SPECIAL_ORDERED_LIST = "TEXT_FORMAT_SPECIAL_ORDERED_LIST" # question type when the expected answer is a list where order does matter, but with possible ties. 
Elements are ordered according to values in special_ordered_json (with ties allowed) +TEXT_FORMAT_NAMEDTUPLE = "TEXT_FORMAT_NAMEDTUPLE" # question type when expected answer is a namedtuple +PNG_FORMAT_SCATTER = "PNG_FORMAT_SCATTER" # question type when the expected answer is a scatter plot +HTML_FORMAT = "HTML_FORMAT" # question type when the expected answer is a DataFrame +FILE_JSON_FORMAT = "FILE_JSON_FORMAT" # question type when the expected answer is a JSON file +SLASHES = " SLASHES" # question SUFFIX when expected answer contains paths with slashes + +def get_expected_format(): + """get_expected_format() returns a dict mapping each question to the format + of the expected answer.""" + expected_format = {'q1': 'TEXT_FORMAT_UNORDERED_LIST', + 'q2': 'TEXT_FORMAT_UNORDERED_LIST', + 'q3': 'TEXT_FORMAT_ORDERED_LIST', + 'q4': 'TEXT_FORMAT_SLASHES', + 'q5': 'TEXT_FORMAT_ORDERED_LIST_SLASHES', + 'q6': 'TEXT_FORMAT_ORDERED_LIST_SLASHES', + 'q7': 'TEXT_FORMAT_ORDERED_LIST_SLASHES', + 'Star': 'TEXT_FORMAT_NAMEDTUPLE', + 'q8': 'TEXT_FORMAT_NAMEDTUPLE', + 'q9': 'TEXT_FORMAT_NAMEDTUPLE', + 'q10': 'TEXT_FORMAT', + 'q11': 'TEXT_FORMAT', + 'q12': 'TEXT_FORMAT', + 'q13': 'TEXT_FORMAT_DICT', + 'q14': 'TEXT_FORMAT_NAMEDTUPLE', + 'q15': 'TEXT_FORMAT', + 'q16': 'TEXT_FORMAT', + 'q17': 'TEXT_FORMAT', + 'q18': 'TEXT_FORMAT', + 'q19': 'TEXT_FORMAT_DICT', + 'q20': 'TEXT_FORMAT_DICT', + 'q21': 'TEXT_FORMAT_DICT', + 'Planet': 'TEXT_FORMAT_NAMEDTUPLE', + 'q22': 'TEXT_FORMAT_DICT', + 'q23': 'TEXT_FORMAT_DICT', + 'q24': 'TEXT_FORMAT', + 'q25': 'TEXT_FORMAT', + 'q26': 'TEXT_FORMAT', + 'q27': 'TEXT_FORMAT_NAMEDTUPLE', + 'q28': 'TEXT_FORMAT_ORDERED_LIST', + 'q29': 'TEXT_FORMAT_NAMEDTUPLE', + 'q30': 'TEXT_FORMAT', + 'q31': 'TEXT_FORMAT', + 'q32': 'TEXT_FORMAT_ORDERED_LIST', + 'q33': 'TEXT_FORMAT_DICT', + 'q34': 'TEXT_FORMAT', + 'q35': 'TEXT_FORMAT_NAMEDTUPLE', + 'q36': 'TEXT_FORMAT'} + return expected_format + + +def get_expected_json(): + """get_expected_json() returns a dict mapping each question to the expected + answer (if the format permits it).""" + expected_json = {'q1': ['.DS_Store', + '.ipynb_checkpoints', + 'mapping_1.json', + 'mapping_2.json', + 'mapping_3.json', + 'planets_1.csv', + 'planets_2.csv', + 'planets_3.csv', + 'small_data.zip', + 'stars_1.csv', + 'stars_2.csv', + 'stars_3.csv'], + 'q2': ['mapping_1.json', + 'mapping_2.json', + 'mapping_3.json', + 'planets_1.csv', + 'planets_2.csv', + 'planets_3.csv', + 'small_data.zip', + 'stars_1.csv', + 'stars_2.csv', + 'stars_3.csv'], + 'q3': ['stars_3.csv', + 'stars_2.csv', + 'stars_1.csv', + 'small_data.zip', + 'planets_3.csv', + 'planets_2.csv', + 'planets_1.csv', + 'mapping_3.json', + 'mapping_2.json', + 'mapping_1.json'], + 'q4': 'small_data\\stars_1.csv', + 'q5': ['small_data\\stars_3.csv', + 'small_data\\stars_2.csv', + 'small_data\\stars_1.csv', + 'small_data\\small_data.zip', + 'small_data\\planets_3.csv', + 'small_data\\planets_2.csv', + 'small_data\\planets_1.csv', + 'small_data\\mapping_3.json', + 'small_data\\mapping_2.json', + 'small_data\\mapping_1.json'], + 'q6': ['small_data\\mapping_3.json', + 'small_data\\mapping_2.json', + 'small_data\\mapping_1.json'], + 'q7': ['small_data\\stars_3.csv', + 'small_data\\stars_2.csv', + 'small_data\\stars_1.csv'], + 'Star': Star(spectral_type='G2 V', stellar_effective_temperature=5780.0, stellar_radius=1.0, stellar_mass=1.0, stellar_luminosity=0.0, stellar_surface_gravity=4.44, stellar_age=4.6), + 'q8': Star(spectral_type='G8V', stellar_effective_temperature=5172.0, stellar_radius=0.94, stellar_mass=0.91, 
stellar_luminosity=-0.197, stellar_surface_gravity=4.43, stellar_age=10.2), + 'q9': Star(spectral_type='F8V', stellar_effective_temperature=6196.0, stellar_radius=1.26, stellar_mass=1.21, stellar_luminosity=0.32, stellar_surface_gravity=4.41, stellar_age=2.01), + 'q10': 'F8V', + 'q11': 'F8V', + 'q12': 0.7460317460317459, + 'q13': {'55 Cnc': Star(spectral_type='G8V', stellar_effective_temperature=5172.0, stellar_radius=0.94, stellar_mass=0.91, stellar_luminosity=-0.197, stellar_surface_gravity=4.43, stellar_age=10.2), + 'DMPP-1': Star(spectral_type='F8V', stellar_effective_temperature=6196.0, stellar_radius=1.26, stellar_mass=1.21, stellar_luminosity=0.32, stellar_surface_gravity=4.41, stellar_age=2.01), + 'GJ 876': Star(spectral_type='M2.5V', stellar_effective_temperature=3271.0, stellar_radius=0.3, stellar_mass=0.32, stellar_luminosity=-1.907, stellar_surface_gravity=4.87, stellar_age=1.0)}, + 'q14': Star(spectral_type='M2.5V', stellar_effective_temperature=3271.0, stellar_radius=0.3, stellar_mass=0.32, stellar_luminosity=-1.907, stellar_surface_gravity=4.87, stellar_age=1.0), + 'q15': -1.907, + 'q16': 'G0', + 'q17': None, + 'q18': 1.04, + 'q19': {'HD 158259': Star(spectral_type='G0', stellar_effective_temperature=5801.89, stellar_radius=1.21, stellar_mass=1.08, stellar_luminosity=0.212, stellar_surface_gravity=4.25, stellar_age=None), + 'K2-187': Star(spectral_type=None, stellar_effective_temperature=5438.0, stellar_radius=0.83, stellar_mass=0.97, stellar_luminosity=-0.21, stellar_surface_gravity=4.6, stellar_age=None), + 'WASP-47': Star(spectral_type=None, stellar_effective_temperature=5552.0, stellar_radius=1.14, stellar_mass=1.04, stellar_luminosity=0.032, stellar_surface_gravity=4.34, stellar_age=6.5)}, + 'q20': {'K2-133': Star(spectral_type='M1.5V', stellar_effective_temperature=3655.0, stellar_radius=0.46, stellar_mass=0.46, stellar_luminosity=-1.479, stellar_surface_gravity=4.77, stellar_age=None), + 'K2-138': Star(spectral_type='G8V', stellar_effective_temperature=5356.3, stellar_radius=0.86, stellar_mass=0.94, stellar_luminosity=-0.287, stellar_surface_gravity=4.54, stellar_age=2.8), + 'GJ 667 C': Star(spectral_type='M1.5V', stellar_effective_temperature=3350.0, stellar_radius=None, stellar_mass=0.33, stellar_luminosity=-1.863, stellar_surface_gravity=4.69, stellar_age=2.0)}, + 'q21': {'55 Cnc': Star(spectral_type='G8V', stellar_effective_temperature=5172.0, stellar_radius=0.94, stellar_mass=0.91, stellar_luminosity=-0.197, stellar_surface_gravity=4.43, stellar_age=10.2), + 'DMPP-1': Star(spectral_type='F8V', stellar_effective_temperature=6196.0, stellar_radius=1.26, stellar_mass=1.21, stellar_luminosity=0.32, stellar_surface_gravity=4.41, stellar_age=2.01), + 'GJ 876': Star(spectral_type='M2.5V', stellar_effective_temperature=3271.0, stellar_radius=0.3, stellar_mass=0.32, stellar_luminosity=-1.907, stellar_surface_gravity=4.87, stellar_age=1.0), + 'HD 158259': Star(spectral_type='G0', stellar_effective_temperature=5801.89, stellar_radius=1.21, stellar_mass=1.08, stellar_luminosity=0.212, stellar_surface_gravity=4.25, stellar_age=None), + 'K2-187': Star(spectral_type=None, stellar_effective_temperature=5438.0, stellar_radius=0.83, stellar_mass=0.97, stellar_luminosity=-0.21, stellar_surface_gravity=4.6, stellar_age=None), + 'WASP-47': Star(spectral_type=None, stellar_effective_temperature=5552.0, stellar_radius=1.14, stellar_mass=1.04, stellar_luminosity=0.032, stellar_surface_gravity=4.34, stellar_age=6.5), + 'K2-133': Star(spectral_type='M1.5V', 
stellar_effective_temperature=3655.0, stellar_radius=0.46, stellar_mass=0.46, stellar_luminosity=-1.479, stellar_surface_gravity=4.77, stellar_age=None), + 'K2-138': Star(spectral_type='G8V', stellar_effective_temperature=5356.3, stellar_radius=0.86, stellar_mass=0.94, stellar_luminosity=-0.287, stellar_surface_gravity=4.54, stellar_age=2.8), + 'GJ 667 C': Star(spectral_type='M1.5V', stellar_effective_temperature=3350.0, stellar_radius=None, stellar_mass=0.33, stellar_luminosity=-1.863, stellar_surface_gravity=4.69, stellar_age=2.0)}, + 'Planet': Planet(planet_name='Jupiter', host_name='Sun', discovery_method='Imaging', discovery_year=1610, controversial_flag=False, orbital_period=4333.0, planet_radius=11.209, planet_mass=317.828, semi_major_radius=5.2038, eccentricity=0.0489, equilibrium_temperature=110, insolation_flux=0.0345), + 'q22': {'planets_1_csv': [['Planet Name', + 'Discovery Method', + 'Discovery Year', + 'Controversial Flag', + 'Orbital Period [days]', + 'Planet Radius [Earth Radius]', + 'Planet Mass [Earth Mass]', + 'Orbit Semi-Major Axis [au]', + 'Eccentricity', + 'Equilibrium Temperature [K]', + 'Insolation Flux [Earth Flux]'], + ['55 Cnc b', + 'Radial Velocity', + '1996', + '0', + '14.65160000', + '13.900', + '263.97850', + '0.113400', + '0.000000', + '700', + ''], + ['55 Cnc c', + 'Radial Velocity', + '2004', + '0', + '44.39890000', + '8.510', + '54.47380', + '0.237300', + '0.030000', + '', + ''], + ['DMPP-1 b', + 'Radial Velocity', + '2019', + '0', + '18.57000000', + '5.290', + '24.27000', + '0.146200', + '0.083000', + '877', + ''], + ['GJ 876 b', + 'Radial Velocity', + '1998', + '0', + '61.11660000', + '13.300', + '723.22350', + '0.208317', + '0.032400', + '', + ''], + ['GJ 876 c', + 'Radial Velocity', + '2000', + '0', + '30.08810000', + '14.000', + '226.98460', + '0.129590', + '0.255910', + '', + '']], + 'planets_header': ['Planet Name', + 'Discovery Method', + 'Discovery Year', + 'Controversial Flag', + 'Orbital Period [days]', + 'Planet Radius [Earth Radius]', + 'Planet Mass [Earth Mass]', + 'Orbit Semi-Major Axis [au]', + 'Eccentricity', + 'Equilibrium Temperature [K]', + 'Insolation Flux [Earth Flux]'], + 'planets_1_rows': [['55 Cnc b', + 'Radial Velocity', + '1996', + '0', + '14.65160000', + '13.900', + '263.97850', + '0.113400', + '0.000000', + '700', + ''], + ['55 Cnc c', + 'Radial Velocity', + '2004', + '0', + '44.39890000', + '8.510', + '54.47380', + '0.237300', + '0.030000', + '', + ''], + ['DMPP-1 b', + 'Radial Velocity', + '2019', + '0', + '18.57000000', + '5.290', + '24.27000', + '0.146200', + '0.083000', + '877', + ''], + ['GJ 876 b', + 'Radial Velocity', + '1998', + '0', + '61.11660000', + '13.300', + '723.22350', + '0.208317', + '0.032400', + '', + ''], + ['GJ 876 c', + 'Radial Velocity', + '2000', + '0', + '30.08810000', + '14.000', + '226.98460', + '0.129590', + '0.255910', + '', + '']]}, + 'q23': {'55 Cnc b': '55 Cnc', + '55 Cnc c': '55 Cnc', + 'DMPP-1 b': 'DMPP-1', + 'GJ 876 b': 'GJ 876', + 'GJ 876 c': 'GJ 876'}, + 'q24': '55 Cnc b', + 'q25': None, + 'q26': False, + 'q27': Planet(planet_name='55 Cnc b', host_name='55 Cnc', discovery_method='Radial Velocity', discovery_year=1996, controversial_flag=False, orbital_period=14.6516, planet_radius=13.9, planet_mass=263.9785, semi_major_radius=0.1134, eccentricity=0.0, equilibrium_temperature=700.0, insolation_flux=None), + 'q28': [Planet(planet_name='55 Cnc b', host_name='55 Cnc', discovery_method='Radial Velocity', discovery_year=1996, controversial_flag=False, orbital_period=14.6516, planet_radius=13.9, 
planet_mass=263.9785, semi_major_radius=0.1134, eccentricity=0.0, equilibrium_temperature=700.0, insolation_flux=None), + Planet(planet_name='55 Cnc c', host_name='55 Cnc', discovery_method='Radial Velocity', discovery_year=2004, controversial_flag=False, orbital_period=44.3989, planet_radius=8.51, planet_mass=54.4738, semi_major_radius=0.2373, eccentricity=0.03, equilibrium_temperature=None, insolation_flux=None), + Planet(planet_name='DMPP-1 b', host_name='DMPP-1', discovery_method='Radial Velocity', discovery_year=2019, controversial_flag=False, orbital_period=18.57, planet_radius=5.29, planet_mass=24.27, semi_major_radius=0.1462, eccentricity=0.083, equilibrium_temperature=877.0, insolation_flux=None), + Planet(planet_name='GJ 876 b', host_name='GJ 876', discovery_method='Radial Velocity', discovery_year=1998, controversial_flag=False, orbital_period=61.1166, planet_radius=13.3, planet_mass=723.2235, semi_major_radius=0.208317, eccentricity=0.0324, equilibrium_temperature=None, insolation_flux=None), + Planet(planet_name='GJ 876 c', host_name='GJ 876', discovery_method='Radial Velocity', discovery_year=2000, controversial_flag=False, orbital_period=30.0881, planet_radius=14.0, planet_mass=226.9846, semi_major_radius=0.12959, eccentricity=0.25591, equilibrium_temperature=None, insolation_flux=None)], + 'q29': Planet(planet_name='GJ 876 c', host_name='GJ 876', discovery_method='Radial Velocity', discovery_year=2000, controversial_flag=False, orbital_period=30.0881, planet_radius=14.0, planet_mass=226.9846, semi_major_radius=0.12959, eccentricity=0.25591, equilibrium_temperature=None, insolation_flux=None), + 'q30': 'GJ 876 c', + 'q31': False, + 'q32': [Planet(planet_name='HD 158259 b', host_name='HD 158259', discovery_method='Radial Velocity', discovery_year=2020, controversial_flag=False, orbital_period=2.178, planet_radius=1.292, planet_mass=2.22, semi_major_radius=None, eccentricity=None, equilibrium_temperature=1478.0, insolation_flux=794.22), + Planet(planet_name='K2-187 b', host_name='K2-187', discovery_method='Transit', discovery_year=2018, controversial_flag=False, orbital_period=0.77401, planet_radius=1.2, planet_mass=1.87, semi_major_radius=0.0164, eccentricity=None, equilibrium_temperature=1815.0, insolation_flux=None), + Planet(planet_name='K2-187 c', host_name='K2-187', discovery_method='Transit', discovery_year=2018, controversial_flag=False, orbital_period=2.871512, planet_radius=1.4, planet_mass=2.54, semi_major_radius=0.0392, eccentricity=None, equilibrium_temperature=1173.0, insolation_flux=None)], + 'q33': {}, + 'q34': 'K2-187', + 'q35': Star(spectral_type=None, stellar_effective_temperature=5438.0, stellar_radius=0.83, stellar_mass=0.97, stellar_luminosity=-0.21, stellar_surface_gravity=4.6, stellar_age=None), + 'q36': 0.94} + return expected_json + + +def get_special_json(): + """get_special_json() returns a dict mapping each question to the expected + answer stored in a special format as a list of tuples. Each tuple contains + the element expected in the list, and its corresponding value. 
Any two + elements with the same value can appear in any order in the actual list, + but if two elements have different values, then they must appear in the + same order as in the expected list of tuples.""" + special_json = {} + return special_json + + +def compare(expected, actual, q_format=TEXT_FORMAT): + """compare(expected, actual) is used to compare when the format of + the expected answer is known for certain.""" + try: + if q_format == TEXT_FORMAT: + return simple_compare(expected, actual) + elif q_format == TEXT_FORMAT_UNORDERED_LIST: + return list_compare_unordered(expected, actual) + elif q_format == TEXT_FORMAT_ORDERED_LIST: + return list_compare_ordered(expected, actual) + elif q_format == TEXT_FORMAT_DICT: + return dict_compare(expected, actual) + elif q_format == TEXT_FORMAT_SPECIAL_ORDERED_LIST: + return list_compare_special(expected, actual) + elif q_format == TEXT_FORMAT_NAMEDTUPLE: + return namedtuple_compare(expected, actual) + elif q_format == PNG_FORMAT_SCATTER: + return compare_flip_dicts(expected, actual) + elif q_format == HTML_FORMAT: + return compare_cell_html(expected, actual) + elif q_format == FILE_JSON_FORMAT: + return compare_json(expected, actual) + else: + if expected != actual: + return "expected %s but found %s " % (repr(expected), repr(actual)) + except: + if expected != actual: + return "expected %s" % (repr(expected)) + return PASS + + +def print_message(expected, actual, complete_msg=True): + """print_message(expected, actual) displays a simple error message.""" + msg = "expected %s" % (repr(expected)) + if complete_msg: + msg = msg + " but found %s" % (repr(actual)) + return msg + + +def simple_compare(expected, actual, complete_msg=True): + """simple_compare(expected, actual) is used to compare when the expected answer + is a type/Nones/str/int/float/bool. When the expected answer is a float, + the actual answer is allowed to be within the tolerance limit. 
Otherwise, + the values must match exactly, or a very simple error message is displayed.""" + msg = PASS + if 'numpy' in repr(type((actual))): + actual = actual.item() + if isinstance(expected, type): + if expected != actual: + if isinstance(actual, type): + msg = "expected %s but found %s" % (expected.__name__, actual.__name__) + else: + msg = "expected %s but found %s" % (expected.__name__, repr(actual)) + elif not isinstance(actual, type(expected)): + if not (isinstance(expected, (float, int)) and isinstance(actual, (float, int))): + if not is_namedtuple(expected): + msg = "expected to find type %s but found type %s" % (type(expected).__name__, type(actual).__name__) + elif isinstance(expected, float): + if not math.isclose(actual, expected, rel_tol=REL_TOL, abs_tol=ABS_TOL): + msg = print_message(expected, actual, complete_msg) + elif isinstance(expected, (list, tuple)) or is_namedtuple(expected): + new_msg = print_message(expected, actual, complete_msg) + if len(expected) != len(actual): + return new_msg + for i in range(len(expected)): + val = simple_compare(expected[i], actual[i]) + if val != PASS: + return new_msg + elif isinstance(expected, dict): + new_msg = print_message(expected, actual, complete_msg) + if len(expected) != len(actual): + return new_msg + val = simple_compare(sorted(list(expected.keys())), sorted(list(actual.keys()))) + if val != PASS: + return new_msg + for key in expected: + val = simple_compare(expected[key], actual[key]) + if val != PASS: + return new_msg + else: + if expected != actual: + msg = print_message(expected, actual, complete_msg) + return msg + + +def intelligent_compare(expected, actual, obj=None): + """intelligent_compare(expected, actual) is used to compare when the + data type of the expected answer is not known for certain, and default + assumptions need to be made.""" + if obj == None: + obj = type(expected).__name__ + if is_namedtuple(expected): + msg = namedtuple_compare(expected, actual) + elif isinstance(expected, (list, tuple)): + msg = list_compare_ordered(expected, actual, obj) + elif isinstance(expected, set): + msg = list_compare_unordered(expected, actual, obj) + elif isinstance(expected, (dict)): + msg = dict_compare(expected, actual) + else: + msg = simple_compare(expected, actual) + msg = msg.replace("CompDict", "dict").replace("CompSet", "set").replace("NewNone", "None") + return msg + + +def is_namedtuple(obj, init_check=True): + """is_namedtuple(obj) returns True if `obj` is a namedtuple object + defined in the test file.""" + bases = type(obj).__bases__ + if len(bases) != 1 or bases[0] != tuple: + return False + fields = getattr(type(obj), '_fields', None) + if not isinstance(fields, tuple): + return False + if init_check and not type(obj).__name__ in [nt.__name__ for nt in _expected_namedtuples]: + return False + return True + + +def list_compare_ordered(expected, actual, obj=None): + """list_compare_ordered(expected, actual) is used to compare when the + expected answer is a list/tuple, where the order of the elements matters.""" + msg = PASS + if not isinstance(actual, type(expected)): + msg = "expected to find type %s but found type %s" % (type(expected).__name__, type(actual).__name__) + return msg + if obj == None: + obj = type(expected).__name__ + for i in range(len(expected)): + if i >= len(actual): + msg = "at index %d of the %s, expected missing %s" % (i, obj, repr(expected[i])) + break + val = intelligent_compare(expected[i], actual[i], "sub" + obj) + if val != PASS: + msg = "at index %d of the %s, " % (i, obj) + 
val + break + if len(actual) > len(expected) and msg == PASS: + msg = "at index %d of the %s, found unexpected %s" % (len(expected), obj, repr(actual[len(expected)])) + if len(expected) != len(actual): + msg = msg + " (found %d entries in %s, but expected %d)" % (len(actual), obj, len(expected)) + + if len(expected) > 0: + try: + if msg != PASS and list_compare_unordered(expected, actual, obj) == PASS: + msg = msg + " (%s may not be ordered as required)" % (obj) + except: + pass + return msg + + +def list_compare_helper(larger, smaller): + """list_compare_helper(larger, smaller) is a helper function which takes in + two lists of possibly unequal sizes and finds the item that is not present + in the smaller list, if there is such an element.""" + msg = PASS + j = 0 + for i in range(len(larger)): + if i == len(smaller): + msg = "expected %s" % (repr(larger[i])) + break + found = False + while not found: + if j == len(smaller): + val = simple_compare(larger[i], smaller[j - 1], complete_msg=False) + break + val = simple_compare(larger[i], smaller[j], complete_msg=False) + j += 1 + if val == PASS: + found = True + break + if not found: + msg = val + break + return msg + +class NewNone(): + """alternate class in place of None, which allows for comparison with + all other data types.""" + def __str__(self): + return 'None' + def __repr__(self): + return 'None' + def __lt__(self, other): + return True + def __le__(self, other): + return True + def __gt__(self, other): + return False + def __ge__(self, other): + return other == None + def __eq__(self, other): + return other == None + def __ne__(self, other): + return other != None + +class CompDict(dict): + """subclass of dict, which allows for comparison with other dicts.""" + def __init__(self, vals): + super(self.__class__, self).__init__(vals) + if type(vals) == CompDict: + self.val = vals.val + elif isinstance(vals, dict): + self.val = self.get_equiv(vals) + else: + raise TypeError("'%s' object cannot be type casted to CompDict class" % type(vals).__name__) + + def get_equiv(self, vals): + val = [] + for key in sorted(list(vals.keys())): + val.append((key, vals[key])) + return val + + def __str__(self): + return str(dict(self.val)) + def __repr__(self): + return repr(dict(self.val)) + def __lt__(self, other): + return self.val < CompDict(other).val + def __le__(self, other): + return self.val <= CompDict(other).val + def __gt__(self, other): + return self.val > CompDict(other).val + def __ge__(self, other): + return self.val >= CompDict(other).val + def __eq__(self, other): + return self.val == CompDict(other).val + def __ne__(self, other): + return self.val != CompDict(other).val + +class CompSet(set): + """subclass of set, which allows for comparison with other sets.""" + def __init__(self, vals): + super(self.__class__, self).__init__(vals) + if type(vals) == CompSet: + self.val = vals.val + elif isinstance(vals, set): + self.val = self.get_equiv(vals) + else: + raise TypeError("'%s' object cannot be type casted to CompSet class" % type(vals).__name__) + + def get_equiv(self, vals): + return sorted(list(vals)) + + def __str__(self): + return str(set(self.val)) + def __repr__(self): + return repr(set(self.val)) + def __getitem__(self, index): + return self.val[index] + def __lt__(self, other): + return self.val < CompSet(other).val + def __le__(self, other): + return self.val <= CompSet(other).val + def __gt__(self, other): + return self.val > CompSet(other).val + def __ge__(self, other): + return self.val >= CompSet(other).val + def 
__eq__(self, other): + return self.val == CompSet(other).val + def __ne__(self, other): + return self.val != CompSet(other).val + +def make_sortable(item): + """make_sortable(item) replaces all Nones in `item` with an alternate + class that allows for comparison with str/int/float/bool/list/set/tuple/dict. + It also replaces all dicts (and sets) with a subclass that allows for + comparison with other dicts (and sets).""" + if item == None: + return NewNone() + elif isinstance(item, (type, str, int, float, bool)): + return item + elif isinstance(item, (list, set, tuple)): + new_item = [] + for subitem in item: + new_item.append(make_sortable(subitem)) + if is_namedtuple(item): + return type(item)(*new_item) + elif isinstance(item, set): + return CompSet(new_item) + else: + return type(item)(new_item) + elif isinstance(item, dict): + new_item = {} + for key in item: + new_item[key] = make_sortable(item[key]) + return CompDict(new_item) + return item + +def list_compare_unordered(expected, actual, obj=None): + """list_compare_unordered(expected, actual) is used to compare when the + expected answer is a list/set where the order of the elements does not matter.""" + msg = PASS + if not isinstance(actual, type(expected)): + msg = "expected to find type %s but found type %s" % (type(expected).__name__, type(actual).__name__) + return msg + if obj == None: + obj = type(expected).__name__ + + try: + sort_expected = sorted(make_sortable(expected)) + sort_actual = sorted(make_sortable(actual)) + except: + return "unexpected datatype found in %s; expected entries of type %s" % (obj, obj, type(expected[0]).__name__) + + if len(actual) == 0 and len(expected) > 0: + msg = "in the %s, missing " % (obj) + sort_expected[0] + elif len(actual) > 0 and len(expected) > 0: + val = intelligent_compare(sort_expected[0], sort_actual[0]) + if val.startswith("expected to find type"): + msg = "in the %s, " % (obj) + simple_compare(sort_expected[0], sort_actual[0]) + else: + if len(expected) > len(actual): + msg = "in the %s, missing " % (obj) + list_compare_helper(sort_expected, sort_actual) + elif len(expected) < len(actual): + msg = "in the %s, found un" % (obj) + list_compare_helper(sort_actual, sort_expected) + if len(expected) != len(actual): + msg = msg + " (found %d entries in %s, but expected %d)" % (len(actual), obj, len(expected)) + return msg + else: + val = list_compare_helper(sort_expected, sort_actual) + if val != PASS: + msg = "in the %s, missing " % (obj) + val + ", but found un" + list_compare_helper(sort_actual, + sort_expected) + return msg + + +def namedtuple_compare(expected, actual): + """namedtuple_compare(expected, actual) is used to compare when the + expected answer is a namedtuple defined in the test file.""" + msg = PASS + if not is_namedtuple(actual, False): + msg = "expected namedtuple but found %s" % (type(actual).__name__) + return msg + if type(expected).__name__ != type(actual).__name__: + return "expected namedtuple %s but found namedtuple %s" % (type(expected).__name__, type(actual).__name__) + expected_fields = expected._fields + actual_fields = actual._fields + msg = list_compare_ordered(list(expected_fields), list(actual_fields), "namedtuple attributes") + if msg != PASS: + return msg + for field in expected_fields: + val = intelligent_compare(getattr(expected, field), getattr(actual, field)) + if val != PASS: + msg = "at attribute %s of namedtuple %s, " % (field, type(expected).__name__) + val + return msg + return msg + + +def clean_slashes(item): + """clean_slashes()""" + if 
isinstance(item, str): + return item.replace("\\", "/").replace("/", os.path.sep) + elif item == None or isinstance(item, (type, int, float, bool)): + return item + elif isinstance(item, (list, tuple, set)) or is_namedtuple(item): + new_item = [] + for subitem in item: + new_item.append(clean_slashes(subitem)) + if is_namedtuple(item): + return type(item)(*new_item) + else: + return type(item)(new_item) + elif isinstance(item, dict): + new_item = {} + for key in item: + new_item[clean_slashes(key)] = clean_slashes(item[key]) + return new_item + + +def list_compare_special_initialize(special_expected): + """list_compare_special_initialize(special_expected) takes in the special + ordering stored as a sorted list of items, and returns a list of lists + where the ordering among the inner lists does not matter.""" + latest_val = None + clean_special = [] + for row in special_expected: + if latest_val == None or row[1] != latest_val: + clean_special.append([]) + latest_val = row[1] + clean_special[-1].append(row[0]) + return clean_special + + +def list_compare_special(special_expected, actual): + """list_compare_special(special_expected, actual) is used to compare when the + expected answer is a list with special ordering defined in `special_expected`.""" + msg = PASS + expected_list = [] + special_order = list_compare_special_initialize(special_expected) + for expected_item in special_order: + expected_list.extend(expected_item) + val = list_compare_unordered(expected_list, actual) + if val != PASS: + return val + i = 0 + for expected_item in special_order: + j = len(expected_item) + actual_item = actual[i: i + j] + val = list_compare_unordered(expected_item, actual_item) + if val != PASS: + if j == 1: + msg = "at index %d " % (i) + val + else: + msg = "between indices %d and %d " % (i, i + j - 1) + val + msg = msg + " (list may not be ordered as required)" + break + i += j + return msg + + +def dict_compare(expected, actual, obj=None): + """dict_compare(expected, actual) is used to compare when the expected answer + is a dict.""" + msg = PASS + if not isinstance(actual, type(expected)): + msg = "expected to find type %s but found type %s" % (type(expected).__name__, type(actual).__name__) + return msg + if obj == None: + obj = type(expected).__name__ + + expected_keys = list(expected.keys()) + actual_keys = list(actual.keys()) + val = list_compare_unordered(expected_keys, actual_keys, obj) + + if val != PASS: + msg = "bad keys in %s: " % (obj) + val + if msg == PASS: + for key in expected: + new_obj = None + if isinstance(expected[key], (list, tuple, set)): + new_obj = 'value' + elif isinstance(expected[key], dict): + new_obj = 'sub' + obj + val = intelligent_compare(expected[key], actual[key], new_obj) + if val != PASS: + msg = "incorrect value for key %s in %s: " % (repr(key), obj) + val + return msg + + +def is_flippable(item): + """is_flippable(item) determines if the given dict of lists has lists of the + same length and is therefore flippable.""" + item_lens = set(([str(len(item[key])) for key in item])) + if len(item_lens) == 1: + return PASS + else: + return "found lists of lengths %s" % (", ".join(list(item_lens))) + +def flip_dict_of_lists(item): + """flip_dict_of_lists(item) flips a dict of lists into a list of dicts if the + lists are of same length.""" + new_item = [] + length = len(list(item.values())[0]) + for i in range(length): + new_dict = {} + for key in item: + new_dict[key] = item[key][i] + new_item.append(new_dict) + return new_item + +def compare_flip_dicts(expected, actual, 
obj="lists"): + """compare_flip_dicts(expected, actual) flips a dict of lists (or dicts) into + a list of dicts (or dict of dicts) and then compares the list ignoring order.""" + msg = PASS + example_item = list(expected.values())[0] + if isinstance(example_item, (list, tuple)): + val = is_flippable(actual) + if val != PASS: + msg = "expected to find lists of length %d, but " % (len(example_item)) + val + return msg + msg = list_compare_unordered(flip_dict_of_lists(expected), flip_dict_of_lists(actual), "lists") + elif isinstance(example_item, dict): + expected_keys = list(example_item.keys()) + for key in actual: + val = list_compare_unordered(expected_keys, list(actual[key].keys()), "dictionary %s" % key) + if val != PASS: + return val + for cat_key in expected_keys: + expected_category = {} + actual_category = {} + for key in expected: + expected_category[key] = expected[key][cat_key] + actual_category[key] = actual[key][cat_key] + val = list_compare_unordered(flip_dict_of_lists(expected), flip_dict_of_lists(actual), "category " + repr(cat_key)) + if val != PASS: + return val + return msg + + +def get_expected_tables(): + """get_expected_tables() reads the html file with the expected DataFrames + and returns a dict mapping each question to a html table.""" + if not os.path.exists(DF_FILE): + return None + + expected_tables = {} + f = open(DF_FILE, encoding='utf-8') + soup = BeautifulSoup(f.read(), 'html.parser') + f.close() + + tables = soup.find_all('table') + for table in tables: + expected_tables[table.get("data-question")] = table + + return expected_tables + +def parse_df_html_table(table): + """parse_df_html_table(table) takes in a table as a html string and returns + a dict mapping each row and column index to the value at that position.""" + rows = [] + for tr in table.find_all('tr'): + rows.append([]) + for cell in tr.find_all(['td', 'th']): + rows[-1].append(cell.get_text().strip("\n ")) + + cells = {} + for r in range(1, len(rows)): + for c in range(1, len(rows[0])): + rname = rows[r][0] + cname = rows[0][c] + cells[(rname,cname)] = rows[r][c] + return cells + + +def get_expected_namedtuples(): + """get_expected_namedtuples() defines the required namedtuple objects + globally. It also returns a tuple of the classes.""" + expected_namedtuples = [] + + global Star + star_attributes = ['spectral_type', 'stellar_effective_temperature', 'stellar_radius', 'stellar_mass', 'stellar_luminosity', 'stellar_surface_gravity', 'stellar_age'] + Star = namedtuple('Star', star_attributes) + expected_namedtuples.append(Star) + global Planet + """# BEGIN PROMPT + planets_attributes = ... 
# initialize the list of attributes + + # define the namedtuple 'Planet' + """ + planets_attributes = ['planet_name', 'host_name', 'discovery_method', 'discovery_year', 'controversial_flag', 'orbital_period', 'planet_radius', 'planet_mass', 'semi_major_radius', 'eccentricity', 'equilibrium_temperature', 'insolation_flux'] + Planet = namedtuple('Planet', planets_attributes) + expected_namedtuples.append(Planet) + return tuple(expected_namedtuples) + +_expected_namedtuples = get_expected_namedtuples() + + +def compare_cell_html(expected, actual): + """compare_cell_html(expected, actual) is used to compare when the + expected answer is a DataFrame stored in the `expected_dfs` html file.""" + expected_cells = parse_df_html_table(expected) + try: + actual_cells = parse_df_html_table(BeautifulSoup(actual, 'html.parser').find('table')) + except Exception as e: + return "expected to find type DataFrame but found type %s instead" % type(actual).__name__ + + expected_cols = list(set(["column %s" % (loc[1]) for loc in expected_cells])) + actual_cols = list(set(["column %s" % (loc[1]) for loc in actual_cells])) + msg = list_compare_unordered(expected_cols, actual_cols, "DataFrame") + if msg != PASS: + return msg + + expected_rows = list(set(["row index %s" % (loc[0]) for loc in expected_cells])) + actual_rows = list(set(["row index %s" % (loc[0]) for loc in actual_cells])) + msg = list_compare_unordered(expected_rows, actual_rows, "DataFrame") + if msg != PASS: + return msg + + for location, expected in expected_cells.items(): + location_name = "column {} at index {}".format(location[1], location[0]) + actual = actual_cells.get(location, None) + if actual == None: + return "in %s, expected to find %s" % (location_name, repr(expected)) + try: + actual_ans = float(actual) + expected_ans = float(expected) + if math.isnan(actual_ans) and math.isnan(expected_ans): + continue + except Exception as e: + actual_ans, expected_ans = actual, expected + msg = simple_compare(expected_ans, actual_ans) + if msg != PASS: + return "in %s, " % location_name + msg + return PASS + + +def get_expected_plots(): + """get_expected_plots() reads the json file with the expected plot data + and returns a dict mapping each question to a dictionary with the plots data.""" + if not os.path.exists(PLOT_FILE): + return None + + f = open(PLOT_FILE, encoding='utf-8') + expected_plots = json.load(f) + f.close() + return expected_plots + + +def compare_file_json(expected, actual): + """compare_file_json(expected, actual) is used to compare when the + expected answer is a JSON file.""" + msg = PASS + if not os.path.isfile(expected): + return "file %s not found; make sure it is downloaded and stored in the correct directory" % (expected) + elif not os.path.isfile(actual): + return "file %s not found; make sure that you have created the file with the correct name" % (actual) + try: + e = open(expected, encoding='utf-8') + expected_data = json.load(e) + e.close() + except json.JSONDecodeError: + return "file %s is broken and cannot be parsed; please delete and redownload the file correctly" % (expected) + try: + a = open(actual, encoding='utf-8') + actual_data = json.load(a) + a.close() + except json.JSONDecodeError: + return "file %s is broken and cannot be parsed" % (actual) + if type(expected_data) == list: + msg = list_compare_ordered(expected_data, actual_data, 'file ' + actual) + elif type(expected_data) == dict: + msg = dict_compare(expected_data, actual_data) + return msg + + +_expected_json = get_expected_json() +_special_json = 
get_special_json() +_expected_plots = get_expected_plots() +_expected_tables = get_expected_tables() +_expected_format = get_expected_format() + +def check(qnum, actual): + """check(qnum, actual) is used to check if the answer in the notebook is + the correct answer, and provide useful feedback if the answer is incorrect.""" + msg = PASS + error_msg = "<b style='color: red;'>ERROR:</b> " + q_format = _expected_format[qnum] + + if q_format == TEXT_FORMAT_SPECIAL_ORDERED_LIST: + expected = _special_json[qnum] + elif q_format == PNG_FORMAT_SCATTER: + if _expected_plots == None: + msg = error_msg + "file %s not parsed; make sure it is downloaded and stored in the correct directory" % (PLOT_FILE) + else: + expected = _expected_plots[qnum] + elif q_format == HTML_FORMAT: + if _expected_tables == None: + msg = error_msg + "file %s not parsed; make sure it is downloaded and stored in the correct directory" % (DF_FILE) + else: + expected = _expected_tables[qnum] + else: + expected = _expected_json[qnum] + + if SLASHES in q_format: + q_format = q_format.replace(SLASHES, "").strip("_ ") + expected = clean_slashes(expected) + actual = clean_slashes(actual) + + if msg != PASS: + print(msg) + else: + msg = compare(expected, actual, q_format) + if msg != PASS: + msg = error_msg + msg + print(msg) + + +def check_file_size(path): + """check_file_size(path) throws an error if the file is too big to display + on Gradescope.""" + size = os.path.getsize(path) + assert size < MAX_FILE_SIZE * 10**3, "Your file is too big to be displayed by Gradescope; please delete unnecessary output cells so your file size is < %s KB" % MAX_FILE_SIZE + + +def reset_hidden_tests(): + """reset_hidden_tests() resets all hidden tests on the Gradescope autograder where the hidden test file exists""" + if not os.path.exists(HIDDEN_FILE): + return + hidn.reset_hidden_tests() + +def rubric_check(rubric_point, ignore_past_errors=True): + """rubric_check(rubric_point) uses the hidden test file on the Gradescope autograder to grade the `rubric_point`""" + if not os.path.exists(HIDDEN_FILE): + print(PASS) + return + error_msg_1 = "ERROR: " + error_msg_2 = "TEST DETAILS: " + try: + msg = hidn.rubric_check(rubric_point, ignore_past_errors) + except: + msg = "hidden tests crashed before execution" + if msg != PASS: + hidn.make_deductions(rubric_point) + if msg == "public tests failed": + comment = "The public tests have failed, so you will not receive any points for this question." + comment += "\nPlease confirm that the public tests pass locally before submitting." + elif msg == "answer is hardcoded": + comment = "In the datasets for testing hardcoding, all numbers are replaced with random values." + comment += "\nIf the answer is the same as in the original dataset for all these datasets" + comment += "\ndespite this, that implies that the answer in the notebook is hardcoded." + comment += "\nYou will not receive any points for this question." 
+ else: + comment = hidn.get_comment(rubric_point) + msg = error_msg_1 + msg + if comment != "": + msg = msg + "\n" + error_msg_2 + comment + print(msg) + +def get_summary(): + """get_summary() returns the summary of the notebook using the hidden test file on the Gradescope autograder""" + if not os.path.exists(HIDDEN_FILE): + print("Total Score: %d/%d" % (TOTAL_SCORE, TOTAL_SCORE)) + return + score = min(TOTAL_SCORE, hidn.get_score(TOTAL_SCORE)) + display_msg = "Total Score: %d/%d" % (score, TOTAL_SCORE) + if score != TOTAL_SCORE: + display_msg += "\n" + hidn.get_deduction_string() + print(display_msg) + +def get_score_digit(digit): + """get_score_digit(digit) returns the `digit` of the score using the hidden test file on the Gradescope autograder""" + if not os.path.exists(HIDDEN_FILE): + score = TOTAL_SCORE + else: + score = hidn.get_score(TOTAL_SCORE) + digits = bin(score)[2:] + digits = "0"*(7 - len(digits)) + digits + return int(digits[6 - digit]) diff --git a/lab-p10/small_data.zip b/lab-p10/small_data.zip new file mode 100644 index 0000000000000000000000000000000000000000..1002b336b815897990880ebe72ce305438750cfa Binary files /dev/null and b/lab-p10/small_data.zip differ diff --git a/p10/README.md b/p10/README.md new file mode 100644 index 0000000000000000000000000000000000000000..d7822a1a58745ffb5ac1562fba13fe1f60411fc6 --- /dev/null +++ b/p10/README.md @@ -0,0 +1,68 @@ +# Project 10 (P10): Looking at Stars and Planets + + +## Corrections and clarifications: + +* None yet. + +**Find any issues?** Report to us: + +- Divyam Anshumaan <anshumaan@wisc.edu> +- Samuel Guo <sguo258@wisc.edu> +- Cathy Cao <ccao35@wisc.edu> +- John Balis <balis@wisc.edu> +- David Parra <deparra@wisc.edu> +- Muhammad Musa <mmusa2@wisc.edu> + + +## Instructions: + +This project will focus on **file handling**, **namedtuples**, **data cleaning** and **error handling**. To start, download [`p10.ipynb`](https://git.doit.wisc.edu/cdis/cs/courses/cs220/cs220-f23-projects/-/tree/main/p10/p10.ipynb), [`public_tests.py`](https://git.doit.wisc.edu/cdis/cs/courses/cs220/cs220-f23-projects/-/tree/main/p10/public_tests.py), and [`data.zip`](https://git.doit.wisc.edu/cdis/cs/courses/cs220/cs220-f23-projects/-/tree/main/p10/data.zip). + +After downloading `data.zip`, make sure to extract it (using [Mac directions](http://osxdaily.com/2017/11/05/how-open-zip-file-mac/) or [Windows directions](https://support.microsoft.com/en-us/help/4028088/windows-zip-and-unzip-files)). After extracting, you should see a folder called `data`, which has the following files in it: + +* `.DS_Store` +* `.ipynb_checkpoints` +* `mapping_1.json` +* `mapping_2.json` +* `mapping_3.json` +* `mapping_4.json` +* `mapping_5.json` +* `stars_1.csv` +* `stars_2.csv` +* `stars_3.csv` +* `stars_4.csv` +* `stars_5.csv` +* `planets_1.csv` +* `planets_2.csv` +* `planets_3.csv` +* `planets_4.csv` +* `planets_5.csv` + +You may delete `data.zip` after extracting these files from it. + +You will work on `p10.ipynb` and hand it in. You should follow the provided directions for each question. Questions have **specific** directions on what **to do** and what **not to do**. + +**Important warning:** Since P10 deals heavily with file handling, whether or not the test cases pass depends greatly on the operating system being used. Even if your code passes all tests locally, it may **fail on Gradescope** if it uses a different operating system. 
To avoid these issues, **follow the instructions** in P10 carefully, and **after submission, check to see that all your test cases pass on the Gradescope autograder**. + +------------------------------ + +## IMPORTANT Submission instructions: +- Review the [Grading Rubric](https://git.doit.wisc.edu/cdis/cs/courses/cs220/cs220-f23-projects/-/tree/main/p10/rubric.md), to ensure that you don't lose points during code review. +- Login to [Gradescope](https://www.gradescope.com/) and upload the zip file into the P10 assignment. +- If you completed the project with a **partner**, make sure to **add their name** by clicking "Add Group Member" +in Gradescope when uploading the P10 zip file. + + <img src="images/add_group_member.png" width="400"> + + **Warning:** You will have to add your partner on Gradescope even if you have filled out this information in your `p10.ipynb` notebook. + +- It is **your responsibility** to make sure that your project clears auto-grader tests on the Gradescope test system. Otter test results should be available within forty minutes after your submission (usually within ten minutes). **Ignore** the `-/100.00` that is displayed to the right. You should be able to see both PASS / FAIL results for the 20 test cases, which is accessible via Gradescope Dashboard (as in the image below): + + <img src="images/gradescope.png" width="400"> + +- You can view your **final score** at the **end of the page**. If you pass all tests, then you will receive **full points** for the project. Otherwise, you can see your final score in the **summary** section of the test results (as in the image below): + + <img src="images/summary.png" width="400"> + + If you want more details on why you lost points on a particular test, you can scroll up to find more details about the test. diff --git a/p10/data.zip b/p10/data.zip new file mode 100644 index 0000000000000000000000000000000000000000..2ae59b171672e16cfd367d00b96d484c1268154d Binary files /dev/null and b/p10/data.zip differ diff --git a/p10/images/README.md b/p10/images/README.md new file mode 100644 index 0000000000000000000000000000000000000000..535f9c403813009844c0fa2466cbc76032b57dad --- /dev/null +++ b/p10/images/README.md @@ -0,0 +1,7 @@ +# Images + +Images from p10 are stored here. 
+ +```python + +``` diff --git a/p10/images/add_group_member.png b/p10/images/add_group_member.png new file mode 100644 index 0000000000000000000000000000000000000000..402e5962e3e54ce8349f60ccfe4ce2b60840dd3b Binary files /dev/null and b/p10/images/add_group_member.png differ diff --git a/p10/images/gradescope.png b/p10/images/gradescope.png new file mode 100644 index 0000000000000000000000000000000000000000..7441faae41d8eb98bfceeb78855b67896b1ff911 Binary files /dev/null and b/p10/images/gradescope.png differ diff --git a/p10/images/summary.png b/p10/images/summary.png new file mode 100644 index 0000000000000000000000000000000000000000..4a63e32ff1a29903584746aa4873373855558e7b Binary files /dev/null and b/p10/images/summary.png differ diff --git a/p10/p10.ipynb b/p10/p10.ipynb new file mode 100644 index 0000000000000000000000000000000000000000..50ba848fdb8e6cf1a8b181e36ddb4db01f8c494f --- /dev/null +++ b/p10/p10.ipynb @@ -0,0 +1,3421 @@ +{ + "cells": [ + { + "cell_type": "code", + "execution_count": null, + "id": "8b450944", + "metadata": { + "cell_type": "code", + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "# import and initialize otter\n", + "import otter\n", + "grader = otter.Notebook(\"p10.ipynb\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9ceb233f", + "metadata": { + "editable": false, + "execution": { + "iopub.execute_input": "2023-11-08T00:40:24.192731Z", + "iopub.status.busy": "2023-11-08T00:40:24.191733Z", + "iopub.status.idle": "2023-11-08T00:40:26.523134Z", + "shell.execute_reply": "2023-11-08T00:40:26.522142Z" + } + }, + "outputs": [], + "source": [ + "import public_tests" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5b932bef", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:26.529138Z", + "iopub.status.busy": "2023-11-08T00:40:26.529138Z", + "iopub.status.idle": "2023-11-08T00:40:26.535135Z", + "shell.execute_reply": "2023-11-08T00:40:26.534137Z" + } + }, + "outputs": [], + "source": [ + "# PLEASE FILL IN THE DETAILS\n", + "# Enter none if you don't have a project partner\n", + "# You will have to add your partner as a group member on Gradescope even after you fill this\n", + "\n", + "# project: p10\n", + "# submitter: NETID1\n", + "# partner: NETID2" + ] + }, + { + "cell_type": "markdown", + "id": "0fe8fc95", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "# Project 10: Stars and Planets" + ] + }, + { + "cell_type": "markdown", + "id": "b84d0bbd", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "## Learning Objectives:\n", + "\n", + "In this project, you will demonstrate how to:\n", + "\n", + "* use `os` module to get information of files in a directory,\n", + "* use `os` module to get paths of files,\n", + "* look up data between JSON and CSV files using unique keys,\n", + "* read JSON and CSV files to store data to `namedTuple` objects,\n", + "* clean up missing values and handle cases when the file is too corrupt to parse,\n", + "\n", + "Please go through [Lab-P10](https://git.doit.wisc.edu/cdis/cs/courses/cs220/cs220-f23-projects/-/tree/main/lab-p10) before working on this project. The lab introduces some useful techniques related to this project." 
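The learning objectives above mention reading JSON files and coping with files that are too corrupt to parse. As a minimal, illustrative sketch only (the helper name `read_json` and the fallback value are assumptions, not part of the project specification), one of the `mapping_*.json` files could be loaded like this, catching a specific exception rather than using a bare `try/except`:

```python
import os
import json

def read_json(path):
    # hypothetical helper: return the contents of one mapping_*.json file
    # as a dict mapping each planet name to the name of its host star
    with open(path, encoding="utf-8") as f:
        return json.load(f)

try:
    mapping_1 = read_json(os.path.join("data", "mapping_1.json"))
except json.JSONDecodeError:
    # a file that is too corrupt to parse can be skipped entirely,
    # as the project description later suggests for broken JSON files
    mapping_1 = None
```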
+ ] + }, + { + "cell_type": "markdown", + "id": "e0df7cca", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "<h2 style=\"color:red\">Warning (Note on Academic Misconduct):</h2>\n", + "\n", + "**IMPORTANT**: **P10 and P11 are two parts of the same data analysis.** You **cannot** switch project partners between these two projects. That is if you partner up with someone for P10, you have to sustain that partnership until the end of P11. Now may be a good time to review [our course policies](https://cs220.cs.wisc.edu/f23/syllabus.html).\n", + "\n", + "Under any circumstances, **no more than two students are allowed to work together on a project** as mentioned in the course policies. If your code is flagged by our code similarity detection tools, **both partners will be responsible** for sharing/copying the code, even if the code is shared/copied by one of the partners with/from other non-partner student(s). Note that each case of plagiarism will be reported to the Dean of Students with a zero grade on the project. **If you think that someone cannot be your project partner then don’t make that student your lab partner.**\n", + "\n", + "**<font color = \"red\">Project partners must submit only one copy of their project on Gradescope, but they must include the names of both partners.</font>**" + ] + }, + { + "cell_type": "markdown", + "id": "e495f863", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "## Testing your code:\n", + "\n", + "Along with this notebook, you must have downloaded the file `public_tests.py`. If you are curious about how we test your code, you can explore this file, and specifically the output of the function `get_expected_json`, to understand the expected answers to the questions." + ] + }, + { + "cell_type": "markdown", + "id": "4bda5cf6", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "## Setup:\n", + "\n", + "Before proceeding much further, download `data.zip` and extract it to a directory on your\n", + "computer (using [Mac directions](http://osxdaily.com/2017/11/05/how-open-zip-file-mac/) or\n", + "[Windows directions](https://support.microsoft.com/en-us/help/4028088/windows-zip-and-unzip-files)).\n", + "\n", + "You need to make sure that the project files are stored in the following structure:\n", + "\n", + "```\n", + "+-- p10.ipynb\n", + "+-- public_tests.py\n", + "+-- data\n", + "| +-- .DS_Store\n", + "| +-- .ipynb_checkpoints\n", + "| +-- mapping_1.json\n", + "| +-- mapping_2.json\n", + "| +-- mapping_3.json\n", + "| +-- mapping_4.json\n", + "| +-- mapping_5.json\n", + "| +-- planets_1.csv\n", + "| +-- planets_2.csv\n", + "| +-- planets_3.csv\n", + "| +-- planets_4.csv\n", + "| +-- planets_5.csv\n", + "| +-- stars_1.csv\n", + "| +-- stars_2.csv\n", + "| +-- stars_3.csv\n", + "| +-- stars_4.csv\n", + "| +-- stars_5.csv\n", + "```\n", + "\n", + "Make sure that the files inside `data.zip` are inside the `data` directory. If you place your files inside some other directory, then your code will **fail on Gradescope** even after passing local tests." + ] + }, + { + "cell_type": "markdown", + "id": "be737015", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "## Project Description:\n", + "\n", + "Cleaning data is an important part of a data scientist's work cycle. As you have already seen, the data we will be analyzing in P10 and P11 has been split up into 15 different files of different formats. 
Even worse, as you shall see later in this project, some of these files have been corrupted, and lots of data is missing. Unfortunately, in the real world, a lot of data that you will come across will be in rough shape, and it is your job to clean it up before you can analyze it. In P10, you will combine the data in these different files to create a few manageable data structures, which can be easily analyzed. In the process, you will also have to deal with broken CSV files (by skipping rows with broken data), and broken JSON files (by skipping the files entirely).\n", + "\n", + "After you create these data structures, in P11, you will dive deeper by analyzing this data and arrive at some exciting conclusions about various planets and stars outside our Solar System." + ] + }, + { + "cell_type": "markdown", + "id": "1989d5d8", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "## The Data:\n", + "\n", + "In P10 and P11, you will be studying stars and planets outside our Solar System using this dataset from the [NASA Exoplanet Archive](https://exoplanetarchive.ipac.caltech.edu/cgi-bin/TblView/nph-tblView?app=ExoTbls&config=PSCompPars). You will use Python to ask some interesting questions about the laws of the universe and explore the habitability of other planets in our universe. The raw data from the [NASA Exoplanet Archive](https://exoplanetarchive.ipac.caltech.edu/cgi-bin/TblView/nph-tblView?app=ExoTbls&config=PSCompPars) has been parsed and stored in multiple different files of different formats. You can find these files inside `data.zip`." + ] + }, + { + "cell_type": "markdown", + "id": "0fa6cb36", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "You can open each of these files using Microsoft Excel or some other Spreadsheet viewing software to see how the data is stored. For example, these are the first three rows of the file `stars_1.csv`:\n", + "\n", + "Star Name|Spectral Type|Stellar Effective Temperature [K]|Stellar Radius [Solar Radius]|Stellar Mass [Solar mass]|Stellar Luminosity [log(Solar)]|Stellar Surface Gravity [log10(cm/s**2)]|Stellar Age [Gyr]\n", + "---|---|---|---|---|---|---|---\n", + "11 Com|G8III|4874.00|13.76|2.09|1.978|2.45|\n", + "11 UMi|K4III|4213.00|29.79|2.78|2.430|1.93|1.560\n", + "14 And|K0III|4888.00|11.55|1.78|1.840|2.55|4.500\n", + "\n", + "As you might have already guessed, this file contains data on a number of *stars* outside our solar system along with some important statistics about these stars. 
The columns here are as follows:\n", + "\n", + "- `Star Name`: The **name** given to the star by the *International Astronomical Union*,\n", + "- `Spectral Type`: The **Spectral Classification** of the star as per the *Morgan–Keenan (MK) system*,\n", + "- `Stellar Effective Temperature [K]`: The **temperature** of a *black body* (in units of Kelvin) that would emit the *observed radiation* of the star,\n", + "- `Stellar Radius [Solar Radius]`: The **radius** of the star (in units of the radius of the Sun),\n", + "- `Stellar Mass [Solar mass]`: The **mass** of the star (in units of the mass of the Sun),\n", + "- `Stellar Luminosity [log(Solar)]`: The *total* amount of **energy radiated** by the star **each second** (represented by the logarithm of the energy radiated by the Sun in each second),\n", + "- `Stellar Surface Gravity [log10(cm/s**2)]`: The **acceleration due to the gravity** of the Star at its *surface* (represented by the logarithm of the acceleration measured in centimeter per second squared),\n", + "- `Stellar Age [Gyr]`: The total **age** of the star (in units of Giga years, i.e., billions of years).\n", + "\n", + "The four other files `stars_2.csv`, `stars_3.csv`, `stars_4.csv`, and `stars_5.csv` also store similar data in the same format. At this stage of the project, it is alright if you do not understand what these columns mean - they will be explained to you when they become necessary (in P11)." + ] + }, + { + "cell_type": "markdown", + "id": "8f15162c", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "On the other hand, here are the first three rows of the file `planets_1.csv`:\n", + "\n", + "Planet Name|Discovery Method|Discovery Year|Controversial Flag|Orbital Period [days]|Planet Radius [Earth Radius]|Planet Mass [Earth Mass]|Orbit Semi-Major Axis [au]|Eccentricity|Equilibrium Temperature [K]|Insolation Flux [Earth Flux]\n", + "---|---|---|---|---|---|---|---|---|---|---\n", + "11 Com b|Radial Velocity|2007|0|323.21000000|12.200|4914.89849|1.178000|0.238000||\n", + "11 UMi b|Radial Velocity|2009|0|516.21997000|12.300|4684.81420|1.530000|0.080000||\n", + "14 And b|Radial Velocity|2008|0|186.76000000|13.100|1131.15130|0.775000|0.000000||\n", + "\n", + "This file contains data on a number of *planets* outside our solar system along with some important statistics about these planets. 
The columns here are as follows:\n", + "\n", + "- `Planet Name`: The **name** given to the planet by the *International Astronomical Union*,\n", + "- `Discovery Method`: The **method** by which the planet was *discovered*,\n", + "- `Discovery Year`: The **year** in which the planet was *discovered*,\n", + "- `Controversial Flag`: Indicates whether the status of the discovered object as a planet was **disputed** at the time of discovery, \n", + "- `Orbital Period [days]`: The amount of **time** (in units of days) it takes for the planet to **complete one orbit** around its star,\n", + "- `Planet Radius [Earth Radius]`: The **radius** of the planet (in units of the radius of the Earth),\n", + "- `Planet Mass [Earth Mass]`: The **mass** of the planet (in units of the mass of the Earth),\n", + "- `Orbit Semi-Major Axis [au]`: The **semi-major axis** of the planet's elliptical **orbit** around its host star (in units of Astronomical Units),\n", + "- `Eccentricity`: The **eccentricity** of the planet's orbit around its host star,\n", + "- `Equilibrium Temperature [K]`: The **temperature** of the planet (in units of Kelvin) if it were a *black body* heated only by its host star,\n", + "- `Insolation Flux [Earth Flux]`: The amount of **radiation** the planet received from its host star **per unit of area** (in units of the Insolation Flux of the Earth from the Sun).\n", + "\n", + "The four other files `planets_2.csv`, `planets_3.csv`, `planets_4.csv`, and `planets_5.csv` also store similar data in the same format. At this stage of the project, it is alright if you do not understand what these columns mean - they will be explained to you when they become necessary (in P11)." + ] + }, + { + "cell_type": "markdown", + "id": "993dc38e", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "Finally, if you take a look at `mapping_1.json` (you can open json files using any Text Editor), you will see that the first three entries look like this:\n", + "\n", + "```python\n", + "{\"11 Com b\":\"11 Com\",\"11 UMi b\":\"11 UMi\",\"14 And b\":\"14 And\", ...}\n", + "```\n", + "\n", + "This file contains a *mapping* from each *planet* in `planets_1.csv` to the *star* in `stars_1.csv` that the planet orbits. Similarly, `mapping_2.json` contains a *mapping* from each *planet* in `planets_2.csv` to the *star* in `stars_2.csv` that the planet orbits. The pattern also holds true for `mapping_3.json`, `mapping_4.json`, and `mapping_5.json`." + ] + }, + { + "cell_type": "markdown", + "id": "dc1c070e", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "## Project Requirements:\n", + "\n", + "You **may not** hardcode indices in your code, unless the question explicitly says to. If you open your `.csv` files with Excel, manually count through the rows and use this number to loop through the dataset, this is also considered as hardcoding. If any instances of hardcoding are found during code review, the Gradescope autograder will **deduct** points from your public score.\n", + "\n", + "**Store** your final answer for each question in the **variable specified for each question**. This step is important because Otter grades your work by comparing the value of this variable against the correct answer.\n", + "\n", + "For some of the questions, we'll ask you to write (then use) a function to compute the answer. 
If you compute the answer **without** creating the function we ask you to write, the Gradescope autograder will **deduct** points from your public score, even if the way you did it produced the correct answer.\n", + "\n", + "#### Required Functions:\n", + "- `star_cell`\n", + "- `get_stars`\n", + "- `planet_cell`\n", + "- `get_planets`\n", + "\n", + "In this project, you will also be required to define certain **data structures**. If you do not create these data structures exactly as specified, the Gradescope autograder will **deduct** points from your public score, even if the way you did it produced the correct answer.\n", + "\n", + "#### Required Data Structures:\n", + "- `Star` (**namedtuple**)\n", + "- `stars_dict` (**dictionary** mapping **strings** to `Star` objects)\n", + "- `Planet` (**namedtuple**)\n", + "- `planets_list` (**list** of `Planet` objects)\n", + "\n", + "In addition, you are also **required** to follow the requirements below:\n", + "\n", + "* You **must** never use the output of the `os.listdir` function directly. You **must** always first remove all files and directories that start with `\".\"`, and **explicitly sort** the list before doing anything with it.\n", + "* You are **not** allowed to use **bare** `try/except` blocks. In other words, you can **not** use `try/except` without explicitly specifying the type of exceptions that you want to catch.\n", + "* You are **only** allowed to use Python commands and concepts that have been taught in the course prior to the release of P10. In particular, this means that you are **not** allowed to use **modules** like `pandas` to answer the questions in this project.\n", + "* Please do not display `stars_dict` or `planets_list` anywhere in the notebook. Please **remove** such statements before submission.\n", + "\n", + "Otherwise, the Gradescope autograder will **deduct** points from your public score.\n", + "\n", + "For more details on what will cause you to lose points during code review and specific requirements, please take a look at the [Grading rubric](https://git.doit.wisc.edu/cdis/cs/courses/cs220/cs220-f23-projects/-/blob/main/p10/rubric.md)." + ] + }, + { + "cell_type": "markdown", + "id": "f9ab5264", + "metadata": { + "deletable": false, + "editable": false, + "lines_to_next_cell": 0 + }, + "source": [ + "## Questions and Functions:\n", + "\n", + "Let us start by importing all the modules we will need for this project." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "642e0e3f", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:26.542135Z", + "iopub.status.busy": "2023-11-08T00:40:26.542135Z", + "iopub.status.idle": "2023-11-08T00:40:26.549457Z", + "shell.execute_reply": "2023-11-08T00:40:26.548457Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# it is considered a good coding practice to place all import statements at the top of the notebook\n", + "# please place all your import statements in this cell if you need to import any more modules for this project\n" + ] + }, + { + "cell_type": "markdown", + "id": "c3b886f6", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### File handling:\n", + "\n", + "In the next questions, you will be using functions in the `os` module to make **lists** of files and paths in the `data` directory. 
All your **lists** **must** satisfy the following conditions:\n", + "\n", + "* Any files with names beginning with `\".\"` **must** be **excluded**.\n", + "* The list **must** be in **reverse-alphabetical** order." + ] + }, + { + "cell_type": "markdown", + "id": "8231ac15", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 1:** What are the **names** of the files present in the `data` directory\n", + "\n", + "Your output **must** be a **list** of **strings** representing the **names** of the files." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b4fdd3cd", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:26.555460Z", + "iopub.status.busy": "2023-11-08T00:40:26.554460Z", + "iopub.status.idle": "2023-11-08T00:40:26.569113Z", + "shell.execute_reply": "2023-11-08T00:40:26.568116Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'files_in_data', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "91285c22", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q1\")" + ] + }, + { + "cell_type": "markdown", + "id": "e655661c", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 2:** What are the **paths** of all the files in the `data` directory?\n", + "\n", + "Your output **must** be a **list** of **strings** representing the **paths** of the files. You **must** use the `files_in_data` variable created in the previous question to answer this.\n", + "\n", + "**Warning:** Please **do not hardcode** `\"/\"` or `\"\\\"` in your path because doing so will cause your function to **fail** on a computer that's not using the same operating system as yours. This may result in your code failing on Gradescope." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ce9c72a6", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:26.625763Z", + "iopub.status.busy": "2023-11-08T00:40:26.624766Z", + "iopub.status.idle": "2023-11-08T00:40:26.635822Z", + "shell.execute_reply": "2023-11-08T00:40:26.634827Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'file_paths', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "147e69c2", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q2\")" + ] + }, + { + "cell_type": "markdown", + "id": "2dd99aea", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 3:** What are the **paths** of all the **CSV files** present in `data` directory?\n", + "\n", + "Your output **must** be filtered to **only** include files ending in `'.csv'`. You **must** use either the `files_in_data` or `file_paths` variables created in the previous questions to answer this.\n", + "\n", + "**Warning:** Please **do not hardcode** `\"/\"` or `\"\\\"` in your path because doing so will cause your function to **fail** on a computer that's not using the same operating system as yours. This may result in your code failing on Gradescope." 
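To make the warning above concrete, here is one possible sketch (an illustration only; the graded questions may impose additional requirements) that never hardcodes the path separator: it removes the dot-files from the `os.listdir` output, sorts in reverse-alphabetical order as this section requires, and builds every path with `os.path.join`:

```python
import os

# list the files in the data directory, drop anything starting with ".",
# and sort in reverse-alphabetical order as required above
files_in_data = sorted([name for name in os.listdir("data") if not name.startswith(".")],
                       reverse=True)

# os.path.join inserts the correct separator for the current operating system,
# so no separator is ever hardcoded
file_paths = [os.path.join("data", name) for name in files_in_data]

# the same idea filters down to just the CSV files
csv_file_paths = [path for path in file_paths if path.endswith(".csv")]
```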
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "99fd56f7", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:26.706074Z", + "iopub.status.busy": "2023-11-08T00:40:26.705074Z", + "iopub.status.idle": "2023-11-08T00:40:26.716793Z", + "shell.execute_reply": "2023-11-08T00:40:26.715800Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'csv_file_paths', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e411d77f", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q3\")" + ] + }, + { + "cell_type": "markdown", + "id": "999aa8d2", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 4:** What are the **paths** of all the files present in the `data` directory that **begin** with the phrase `'stars'`?\n", + "\n", + "Your output **must** be filtered to **only** include files that start with `'stars'`. You **must** use either the `files_in_data` or `file_paths` variables created in the previous questions to answer this.\n", + "\n", + "**Warning:** Please **do not hardcode** `\"/\"` or `\"\\\"` in your path because doing so will cause your function to **fail** on a computer that's not using the same operating system as yours. This may result in your code failing on Gradescope." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fba07316", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:26.793042Z", + "iopub.status.busy": "2023-11-08T00:40:26.793042Z", + "iopub.status.idle": "2023-11-08T00:40:26.802270Z", + "shell.execute_reply": "2023-11-08T00:40:26.801277Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'stars_paths', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1c033025", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q4\")" + ] + }, + { + "cell_type": "markdown", + "id": "549ef1f9", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Data Structure 1: namedtuple `Star`\n", + "\n", + "You will be using named tuples to store the data in the `stars_1.csv`, ..., `stars_5.csv` files. Before you start reading these files, however, you **must** create a new `Star` type (using namedtuple). It **must** have the following attributes:\n", + "\n", + "* `spectral_type`,\n", + "* `stellar_effective_temperature`,\n", + "* `stellar_radius`,\n", + "* `stellar_mass`,\n", + "* `stellar_luminosity`,\n", + "* `stellar_surface_gravity`,\n", + "* `stellar_age`."
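A minimal sketch of this definition uses `collections.namedtuple` with the attribute names listed above; the example `Star` object below mirrors the notebook's own test cell that builds the Sun:

```python
from collections import namedtuple

# attribute names exactly as listed above
star_attributes = ['spectral_type', 'stellar_effective_temperature', 'stellar_radius',
                   'stellar_mass', 'stellar_luminosity', 'stellar_surface_gravity',
                   'stellar_age']
Star = namedtuple('Star', star_attributes)

# the notebook's check cell creates the Sun the same way
sun = Star('G2 V', 5780.0, 1.0, 1.0, 0.0, 4.44, 4.6)
print(sun.spectral_type)  # prints: G2 V
```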
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1b4d2a53", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:26.881127Z", + "iopub.status.busy": "2023-11-08T00:40:26.880127Z", + "iopub.status.idle": "2023-11-08T00:40:26.887847Z", + "shell.execute_reply": "2023-11-08T00:40:26.886853Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# define the namedtuple 'Star' here\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b18fa0d1", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:26.892849Z", + "iopub.status.busy": "2023-11-08T00:40:26.892849Z", + "iopub.status.idle": "2023-11-08T00:40:26.902087Z", + "shell.execute_reply": "2023-11-08T00:40:26.901101Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# run this following cell to initialize and test an example Star object\n", + "# if this cell fails to execute, you have likely not defined the namedtuple 'Star' correctly\n", + "\n", + "sun = Star('G2 V', 5780.0, 1.0, 1.0, 0.0, 4.44, 4.6)\n", + "\n", + "sun" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "71991699", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"Star\")" + ] + }, + { + "cell_type": "markdown", + "id": "10544951", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Creating `Star` objects\n", + "\n", + "Now that we have created the `Star` namedtuple, our next objective will be to read the files `stars_1.csv`, ..., `stars_5.csv` and create `Star` objects out of all the stars in there. In order to process the CSV files, you will first need to copy/paste the `process_csv` function you have been using since P6." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "75e8b09f", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:26.947045Z", + "iopub.status.busy": "2023-11-08T00:40:26.947045Z", + "iopub.status.idle": "2023-11-08T00:40:26.954045Z", + "shell.execute_reply": "2023-11-08T00:40:26.953043Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# copy & paste the process_csv file from previous projects here\n" + ] + }, + { + "cell_type": "markdown", + "id": "5dfab81c", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "You are now ready to read the data in `stars_1.csv` using `process_csv` and convert the data into `Star` objects. In the cell below, you **must** read the data in `stars_1.csv` and extract the **header** and the non-header **rows** of the file." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0322bf85", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:26.959042Z", + "iopub.status.busy": "2023-11-08T00:40:26.958042Z", + "iopub.status.idle": "2023-11-08T00:40:26.969595Z", + "shell.execute_reply": "2023-11-08T00:40:26.968600Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# replace the ... with your code\n", + "\n", + "stars_1_csv = ... # read the data in 'stars_1.csv'\n", + "stars_header = ...\n", + "stars_1_rows = ..." 
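If you no longer have `process_csv` handy, the sketch below shows one common `csv`-module-based version of it, together with the header/rows split the cell above asks for. This is an assumption about what your copy of `process_csv` looks like; prefer the exact function you used in previous projects:

```python
import csv
import os

def process_csv(filename):
    # read the CSV file and return it as a list of lists (one inner list per row);
    # this is a stand-in for the process_csv helper used since P6
    with open(filename, encoding="utf-8") as f:
        return list(csv.reader(f))

stars_1_csv = process_csv(os.path.join("data", "stars_1.csv"))  # read 'stars_1.csv'
stars_header = stars_1_csv[0]   # the first row is the header
stars_1_rows = stars_1_csv[1:]  # every remaining row describes one star
```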
+ ] + }, + { + "cell_type": "markdown", + "id": "3ecbd66c", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "If you wish to **verify** that you have read the file and defined the variables correctly, you can check that `stars_header` has the value:\n", + "\n", + "```python\n", + "['Star Name', 'Spectral Type', 'Stellar Effective Temperature [K]', 'Stellar Radius [Solar Radius]',\n", + " 'Stellar Mass [Solar mass]', 'Stellar Luminosity [log(Solar)]', \n", + " 'Stellar Surface Gravity [log10(cm/s**2)]', 'Stellar Age [Gyr]']\n", + "```\n", + "\n", + "and that `stars_1_rows` has **1595** rows of which the **first three** are:\n", + "\n", + "```python\n", + "[['11 Com', 'G8III', '4874.00', '13.76', '2.09', '1.978', '2.45', ''],\n", + " ['11 UMi', 'K4III', '4213.00', '29.79', '2.78', '2.430', '1.93', '1.560'],\n", + " ['14 And', 'K0III', '4888.00', '11.55', '1.78', '1.840', '2.55', '4.500']]\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "1b5d0bcf", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Function 1: `star_cell(row_idx, col_name, stars_rows, header=stars_header)`\n", + "\n", + "This function **must** read the **list** of **lists** `stars_rows`, and extract the value at **row** index `row_idx` from the **column** whose header (in `header`) is `col_name`. The function **must** typecast the value based on `col_name`. If the value in `stars_rows` is **missing** (i.e., it is `''`), then the value returned **must** be `None`.\n", + "\n", + "The **column** of `stars_rows` where the value should be obtained from, and the correct **data type** for the value are listed in the table below:\n", + "\n", + "|Column of `stars_rows`|Data Type|\n", + "|------|---------|\n", + "|Star Name|**string**|\n", + "|Spectral Type|**string**|\n", + "|Stellar Effective Temperature [K]|**float**|\n", + "|Stellar Radius [Solar Radius]|**float**|\n", + "|Stellar Mass [Solar mass]|**float**|\n", + "|Stellar Luminosity [log(Solar)]|**float**|\n", + "|Stellar Surface Gravity [log10(cm/s**2)]|**float**|\n", + "|Stellar Age [Gyr]|**float**|\n", + "\n", + "You are **allowed** to copy/paste this function from Lab-P10." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "061ef7fd", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:26.975601Z", + "iopub.status.busy": "2023-11-08T00:40:26.974597Z", + "iopub.status.idle": "2023-11-08T00:40:26.989939Z", + "shell.execute_reply": "2023-11-08T00:40:26.988940Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# define the 'star_cell' function here\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4372c4e9", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"star_cell\")" + ] + }, + { + "cell_type": "markdown", + "id": "b020cfb8", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 5:** Create a `Star` object for the **third** star in `\"stars_1.csv\"`.\n", + "\n", + "You **must** access the values in `stars_1.csv` using the `star_cell` function. 
Note that the third star would be at **index** 2.\n", + "\n", + "The **attribute** of the `Star` namedtuple object, the corresponding **column** of the `stars_1.csv` file where the value should be obtained from, and the correct **data type** for the value are listed in the table below:\n", + "\n", + "|Attribute of `Star` object|Column of `stars_1.csv`|Data Type|\n", + "|---------|------|---------|\n", + "|`spectral_type`|Spectral Type|**string**|\n", + "|`stellar_effective_temperature`|Stellar Effective Temperature [K]|**float**|\n", + "|`stellar_radius`|Stellar Radius [Solar Radius]|**float**|\n", + "|`stellar_mass`|Stellar Mass [Solar mass]|**float**|\n", + "|`stellar_luminosity`|Stellar Luminosity [log(Solar)]|**float**|\n", + "|`stellar_surface_gravity`|Stellar Surface Gravity [log10(cm/s**2)]|**float**|\n", + "|`stellar_age`|Stellar Age [Gyr]|**float**|" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "01a1bea0", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:27.047893Z", + "iopub.status.busy": "2023-11-08T00:40:27.046892Z", + "iopub.status.idle": "2023-11-08T00:40:27.060487Z", + "shell.execute_reply": "2023-11-08T00:40:27.059488Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'third_star', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2f7717a8", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q5\")" + ] + }, + { + "cell_type": "markdown", + "id": "bf286c95", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Function 2: `get_stars(star_file)`\n", + "\n", + "This function **must** take in as its input the path of a CSV file `star_file` which contains data on stars in the same format as `stars_1.csv`. It **must** return a **dictionary** mapping the `Star Name` of each star in `star_file` to a `Star` object containing all the other details of the star.\n", + "\n", + "You **must** access the values in `star_file` using the `star_cell` function.\n", + "\n", + "You **must not** hardcode the name of the directory `data` into the `get_stars` function. 
Instead, you must pass it as a part of the argument `star_file`, by including it in the **path** `star_file`.\n", + "\n", + "Once again, as a reminder, the attributes of the `Star` objects should be obtained from the **rows** of `star_file` and stored as follows:\n", + "\n", + "|Attribute of `Star` object|Column of `star_file`|Data Type|\n", + "|---------|------|---------|\n", + "|`spectral_type`|Spectral Type|**string**|\n", + "|`stellar_effective_temperature`|Stellar Effective Temperature [K]|**float**|\n", + "|`stellar_radius`|Stellar Radius [Solar Radius]|**float**|\n", + "|`stellar_mass`|Stellar Mass [Solar mass]|**float**|\n", + "|`stellar_luminosity`|Stellar Luminosity [log(Solar)]|**float**|\n", + "|`stellar_surface_gravity`|Stellar Surface Gravity [log10(cm/s**2)]|**float**|\n", + "|`stellar_age`|Stellar Age [Gyr]|**float**|\n", + "\n", + "In case any data in `star_file` is **missing**, the corresponding value should be `None`.\n", + "\n", + "For example, when this function is called with the file `stars_1.csv` as the input, the **dictionary** returned should look like:\n", + "\n", + "```python\n", + "{'11 Com': Star(spectral_type='G8III', stellar_effective_temperature=4874.0, \n", + " stellar_radius=13.76, stellar_mass=2.09, stellar_luminosity=1.978, \n", + " stellar_surface_gravity=2.45, stellar_age=None),\n", + " '11 UMi': Star(spectral_type='K4III', stellar_effective_temperature=4213.0, \n", + " stellar_radius=29.79, stellar_mass=2.78, stellar_luminosity=2.43, \n", + " stellar_surface_gravity=1.93, stellar_age=1.56),\n", + " '14 And': Star(spectral_type='K0III', stellar_effective_temperature=4888.0, \n", + " stellar_radius=11.55, stellar_mass=1.78, stellar_luminosity=1.84, \n", + " stellar_surface_gravity=2.55, stellar_age=4.5),\n", + " ...\n", + "}\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "790d24b9", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:27.117359Z", + "iopub.status.busy": "2023-11-08T00:40:27.116361Z", + "iopub.status.idle": "2023-11-08T00:40:27.125880Z", + "shell.execute_reply": "2023-11-08T00:40:27.124887Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# define the function 'get_stars' here\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ac87e0b3", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:27.131886Z", + "iopub.status.busy": "2023-11-08T00:40:27.131886Z", + "iopub.status.idle": "2023-11-08T00:40:27.160604Z", + "shell.execute_reply": "2023-11-08T00:40:27.159607Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# you can now use 'get_stars' to read the data in 'stars_1.csv'\n", + "\n", + "stars_1_dict = ..." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "45d3b8b5", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"get_stars\")" + ] + }, + { + "cell_type": "markdown", + "id": "a85665e7", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 6:** What is the `Star` object of the star (in `stars_1.csv`) named *DP Leo*?\n", + "\n", + "You **must** access the `Star` object in the `stars_1_dict` **dictionary** defined above to answer this question.\n",
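+ "\n", + "As a small illustration of the kind of lookup needed here (using the `sun` object from earlier and a **hypothetical** dictionary, not the real data):\n", + "\n", + "```python\n", + "# sketch: looking up a Star object in a dictionary and reading one of its attributes\n", + "example_dict = {'Sun': sun}\n", + "\n", + "example_dict['Sun']              # the whole Star object\n", + "example_dict['Sun'].stellar_age  # a single attribute of it (4.6 for the sun example)\n", + "```"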
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2b4e1b03", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:27.229789Z", + "iopub.status.busy": "2023-11-08T00:40:27.228786Z", + "iopub.status.idle": "2023-11-08T00:40:27.237757Z", + "shell.execute_reply": "2023-11-08T00:40:27.236758Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'dp_leo', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5a9c1aea", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q6\")" + ] + }, + { + "cell_type": "markdown", + "id": "e7445786", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 7:** What's the **average** `stellar_luminosity` of **all** the stars in the `stars_1.csv` file?\n", + "\n", + "You **must** use the `stars_1_dict` **dictionary** defined above to answer this question.\n", + "\n", + "To find the average, you **must** first **add** up the `stellar_luminosity` value of all the stars and **divide** by the total **number** of stars. You **must skip** stars which don't have the `stellar_luminosity` data. Such stars should not contribute to either the sum of `stellar_luminosity` or to the number of stars." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3037793e", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:27.270482Z", + "iopub.status.busy": "2023-11-08T00:40:27.270482Z", + "iopub.status.idle": "2023-11-08T00:40:27.281432Z", + "shell.execute_reply": "2023-11-08T00:40:27.280437Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'avg_lum_stars_1', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8552446f", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q7\")" + ] + }, + { + "cell_type": "markdown", + "id": "dfb9bb43", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 8:** What is the **average** `stellar_age` of **all** the stars in the `stars_2.csv` file?\n", + "\n", + "You **must** use the function `get_stars` to read the data in `stars_2.csv`. Your output **must** be a **float** representing the **average** `stellar_age` in units of *gigayears*. You **must** skip stars which have missing `stellar_age` data.\n",
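+ "\n", + "The skip-missing-data pattern used in Questions 7 and 8 looks roughly like the following sketch (shown on a tiny dictionary of `Star` objects with **made-up** values, not the real files):\n", + "\n", + "```python\n", + "# sketch: average an attribute across a dictionary of Star objects, skipping missing (None) values\n", + "example_stars = {'Sun': Star('G2 V', 5780.0, 1.0, 1.0, 0.0, 4.44, 4.6),\n", + "                 'Example Star': Star('K0III', 4888.0, 11.55, 1.78, 1.84, 2.55, None)}\n", + "\n", + "total_age = 0\n", + "num_stars = 0\n", + "for name in example_stars:\n", + "    age = example_stars[name].stellar_age\n", + "    if age is None:        # missing data contributes to neither the total nor the count\n", + "        continue\n", + "    total_age += age\n", + "    num_stars += 1\n", + "\n", + "avg_age = total_age / num_stars    # 4.6 here, because only the first star has an age\n", + "```"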
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f72168eb", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:27.338156Z", + "iopub.status.busy": "2023-11-08T00:40:27.338156Z", + "iopub.status.idle": "2023-11-08T00:40:27.364161Z", + "shell.execute_reply": "2023-11-08T00:40:27.363160Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'avg_age_stars_2', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "62d0a888", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q8\")" + ] + }, + { + "cell_type": "markdown", + "id": "9c119a33", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Data Structure 2: `stars_dict`\n", + "\n", + "You are now ready to read all the data about the stars stored in the `data` directory. You **must** now create a **dictionary** mapping the `Name` of each star in the `data` directory (inside the files `stars_1.csv`, ..., `stars_5.csv`) to the `Star` object containing all the other details about the star.\n", + "\n", + "You **must not** hardcode the files/paths of the files `stars_1.csv`, ..., `stars_5.csv` to answer this question. Instead, you **must** use the `stars_paths` variable defined earlier in Question 4 to get the list of paths needed for this question. You can use the `update` dictionary **method** to combine two **dictionaries**.\n", + "\n", + "You must use this dictionary to answer the next 3 questions." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5163f940", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:27.419123Z", + "iopub.status.busy": "2023-11-08T00:40:27.419123Z", + "iopub.status.idle": "2023-11-08T00:40:27.485650Z", + "shell.execute_reply": "2023-11-08T00:40:27.484740Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# define the variable 'stars_dict' here,\n", + "# but do NOT display the variable at the end\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "54e0c241", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"stars_dict\")" + ] + }, + { + "cell_type": "markdown", + "id": "9c7d67a9", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "If you wish to **verify** that you have read the files and defined `stars_dict` correctly, you can check that `stars_dict` has **4125** key/value pairs in it." + ] + }, + { + "cell_type": "markdown", + "id": "65b3495f", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 9:** What is the `stellar_effective_temperature` of the star *Kepler-220*?\n", + "\n", + "You **must** access the correct `Star` object in the `stars_dict` **dictionary** defined above to answer this question." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d3c864f9", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:27.530823Z", + "iopub.status.busy": "2023-11-08T00:40:27.529824Z", + "iopub.status.idle": "2023-11-08T00:40:27.539550Z", + "shell.execute_reply": "2023-11-08T00:40:27.538554Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'kepler_220_temp', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6ac67a10", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q9\")" + ] + }, + { + "cell_type": "markdown", + "id": "18eeb1f7", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 10:** Find the **name** of the **largest** star (in terms of `stellar_radius`) in the `data` directory.\n", + "\n", + "Your output **must** be a **string**. You do **not** need to worry about any ties. You **must** skip any stars with **missing** `stellar_radius` data." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e78df115", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:27.576572Z", + "iopub.status.busy": "2023-11-08T00:40:27.575574Z", + "iopub.status.idle": "2023-11-08T00:40:27.593114Z", + "shell.execute_reply": "2023-11-08T00:40:27.592117Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'biggest_star', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0701f05c", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q10\")" + ] + }, + { + "cell_type": "markdown", + "id": "c3b412e1", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 11:** What is the **average** `stellar_age` (in gigayears) of **all** the stars in the `data` directory whose names **start with** `\"Kepler\"`?\n", + "\n", + "Your output **must** be a **float**. You **must** skip all stars with **missing** `stellar_age` data. Such stars should not contribute to either the sum of `stellar_age` or to the number of stars." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e6ff1322", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:27.659767Z", + "iopub.status.busy": "2023-11-08T00:40:27.658768Z", + "iopub.status.idle": "2023-11-08T00:40:27.678765Z", + "shell.execute_reply": "2023-11-08T00:40:27.677766Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'avg_age_kepler', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "776dd2ec", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q11\")" + ] + }, + { + "cell_type": "markdown", + "id": "228d1c3a", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Data Structure 3: namedtuple `Planet`\n", + "\n", + "Just as you did with the stars, you will be using named tuples to store the data about the planets in the `planets_1.csv`, ..., `planets_5.csv` files. Before you start reading these files however, you **must** create a new `Planet` type (using namedtuple). 
It **must** have the following attributes:\n", + "\n", + "* `planet_name`,\n", + "* `host_name`,\n", + "* `discovery_method`,\n", + "* `discovery_year`,\n", + "* `controversial_flag`,\n", + "* `orbital_period`,\n", + "* `planet_radius`,\n", + "* `planet_mass`,\n", + "* `semi_major_radius`,\n", + "* `eccentricity`,\n", + "* `equilibrium_temperature`\n", + "* `insolation_flux`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "773ac3c0", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:27.750830Z", + "iopub.status.busy": "2023-11-08T00:40:27.750830Z", + "iopub.status.idle": "2023-11-08T00:40:27.759039Z", + "shell.execute_reply": "2023-11-08T00:40:27.758043Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# define the namedtuple 'Planet' here\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c627718c", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:27.766036Z", + "iopub.status.busy": "2023-11-08T00:40:27.765038Z", + "iopub.status.idle": "2023-11-08T00:40:27.777391Z", + "shell.execute_reply": "2023-11-08T00:40:27.776395Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# run this following cell to initialize and test an example Planet object\n", + "# if this cell fails to execute, you have likely not defined the namedtuple 'Planet' correctly\n", + "jupiter = Planet('Jupiter', 'Sun', 'Imaging', 1610, False, 4333.0, 11.209, 317.828, 5.2038, 0.0489, 110, 0.0345)\n", + "\n", + "jupiter" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "39bb10d2", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"Planet\")" + ] + }, + { + "cell_type": "markdown", + "id": "5fd25e40", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Creating `Planet` objects\n", + "\n", + "We are now ready to read the files in the `data` directory and create `Planet` objects. Creating `Planet` objects however, is going to be more difficult than creating `Star` objects, because the data required to create a single `Planet` object is split up into different files.\n", + "\n", + "The `planets_1.csv`, ..., `planets_5.csv` files contain all the data required to create `Planet` objects **except** for the `host_name`. The `host_name` for each planet is to be found in the `mapping_1.json`, ..., `mapping_5.json` files." + ] + }, + { + "cell_type": "markdown", + "id": "a6efc51a", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "First, let us read the data in `planets_1.csv`. Since this is a CSV file, you can use the `process_csv` function from above to read this file. In the cell below, you **must** read the data in `planets_1.csv` and extract the **header** and the non-header **rows** of the file." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bf3dfea9", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:27.827079Z", + "iopub.status.busy": "2023-11-08T00:40:27.826079Z", + "iopub.status.idle": "2023-11-08T00:40:27.839724Z", + "shell.execute_reply": "2023-11-08T00:40:27.838726Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# replace the ... with your code\n", + "\n", + "planets_1_csv = process_csv(...) # read the data in 'planets_1.csv'\n", + "planets_header = ...\n", + "planets_1_rows = ..." 
+ ] + }, + { + "cell_type": "markdown", + "id": "dd3228eb", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "If you wish to **verify** that you have read the file and defined the variables correctly, you can check that `planets_header` has the value:\n", + "\n", + "```python\n", + "['Planet Name', 'Discovery Method', 'Discovery Year', 'Controversial Flag',\n", + " 'Orbital Period [days]', 'Planet Radius [Earth Radius]', 'Planet Mass [Earth Mass]',\n", + " 'Orbit Semi-Major Axis [au]', 'Eccentricity', 'Equilibrium Temperature [K]',\n", + " 'Insolation Flux [Earth Flux]']\n", + "```\n", + "\n", + "and that `planets_1_rows` has **1595** rows of which the **first three** are:\n", + "\n", + "```python\n", + "[['11 Com b', 'Radial Velocity', '2007', '0', '323.21000000', '12.200', '4914.89849', '1.178000', '0.238000', '', ''],\n", + " ['11 UMi b', 'Radial Velocity', '2009', '0', '516.21997000', '12.300', '4684.81420', '1.530000', '0.080000', '', ''],\n", + " ['14 And b', 'Radial Velocity', '2008', '0', '186.76000000', '13.100', '1131.15130', '0.775000', '0.000000', '', '']]\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "82a7a93e", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "Now, you are ready to read the data in `mapping_1.json`. Since this is a JSON file, you will need to copy/paste the `read_json` function from Lab-P10, and use it to read the file." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "20f97e3e", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:27.846718Z", + "iopub.status.busy": "2023-11-08T00:40:27.845719Z", + "iopub.status.idle": "2023-11-08T00:40:27.853722Z", + "shell.execute_reply": "2023-11-08T00:40:27.852720Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# copy & paste the read_json function from Lab-P10\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "da23f5bd", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:27.859721Z", + "iopub.status.busy": "2023-11-08T00:40:27.858723Z", + "iopub.status.idle": "2023-11-08T00:40:27.868608Z", + "shell.execute_reply": "2023-11-08T00:40:27.867612Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# now use the read_json function to read 'mapping_1.json'\n", + "\n", + "mapping_1_json = ..." + ] + }, + { + "cell_type": "markdown", + "id": "104f741a", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "If you wish to **verify** that you have read the file correctly, you can check that `mapping_1_json` has the value:\n", + "\n", + "```python\n", + "{'11 Com b': '11 Com',\n", + " '11 UMi b': '11 UMi',\n", + " '14 And b': '14 And',\n", + " ...\n", + " }\n", + "```\n", + "\n", + "Now that we have read `planets_1.csv` and `mapping_1.json`, we are ready to combine these two files to create `Planet` objects." + ] + }, + { + "cell_type": "markdown", + "id": "1d011f1b", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Function 3: `planet_cell(row_idx, col_name, planets_rows, header=planets_header)`\n", + "\n", + "This function **must** read the **list** of **lists** `planets_rows`, and extract the value at **row** index `row_idx` from the **column** whose header (in `header`) is `col_name`. The function **must** typecast the value based on `col_name`. 
If the value in `planets_rows` is **missing** (i.e., it is `''`), then the value returned **must** be `None`.\n", + "\n", + "The **column** of `planets_rows` where the value should be obtained from, and the correct **data type** for the value are listed in the table below:\n", + "\n", + "|Column of `planets_rows`|Data Type|\n", + "|------|---------|\n", + "|Planet Name|**string**|\n", + "|Discovery Year|**int**|\n", + "|Discovery Method|**string**|\n", + "|Controversial Flag|**bool**|\n", + "|Orbital Period [days]|**float**|\n", + "|Planet Radius [Earth Radius]|**float**|\n", + "|Planet Mass [Earth Mass]|**float**|\n", + "|Orbit Semi-Major Axis [au]|**float**|\n", + "|Eccentricity|**float**|\n", + "|Equilibrium Temperature [K]|**float**|\n", + "|Insolation Flux [Earth Flux]|**float**|\n", + "\n", + "**Important Hint:** While computing the value of the attribute `controversial_flag`, note that the `Controversial Flag` column of `planets_1.csv` represents `True` with `'1'` and `False` with `'0'`. You **must** be careful with typecasting **strings** to **booleans**." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fb509f13", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:27.875610Z", + "iopub.status.busy": "2023-11-08T00:40:27.874611Z", + "iopub.status.idle": "2023-11-08T00:40:27.898015Z", + "shell.execute_reply": "2023-11-08T00:40:27.897021Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# define the function 'planet_cell' here\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d21d187c", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"planet_cell\")" + ] + }, + { + "cell_type": "markdown", + "id": "10cdb778", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 12:** Create a `Planet` object for the **fifth** planet in the `planets_1.csv` file.\n", + "\n", + "You **must** access the values in `planets_1.csv` using the `planet_cell` function. Note that the fifth planet would be at **index** 4.\n", + "\n", + "The **attribute** of the `Planet` namedtuple object, the corresponding **column** of the `planets_1.csv` file where the value should be obtained from, and the correct **data type** for the value are listed in the table below:\n", + "\n", + "|Attribute of `Planet` object|Column of `planets_1.csv`|Data Type|\n", + "|---------|------|---------|\n", + "|`planet_name`|Planet Name|**string**|\n", + "|`host_name`| - |**string**|\n", + "|`discovery_method`|Discovery Method|**string**|\n", + "|`discovery_year`|Discovery Year|**int**|\n", + "|`controversial_flag`|Controversial Flag|**bool**|\n", + "|`orbital_period`|Orbital Period [days]|**float**|\n", + "|`planet_radius`|Planet Radius [Earth Radius]|**float**|\n", + "|`planet_mass`|Planet Mass [Earth Mass]|**float**|\n", + "|`semi_major_radius`|Orbit Semi-Major Axis [au]|**float**|\n", + "|`eccentricity`|Eccentricity|**float**|\n", + "|`equilibrium_temperature`|Equilibrium Temperature [K]|**float**|\n", + "|`insolation_flux`|Insolation Flux [Earth Flux]|**float**|\n", + "\n", + "\n", + "The value of the `host_name` attribute is found in `mapping_1.json`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "747a47bf", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:27.966668Z", + "iopub.status.busy": "2023-11-08T00:40:27.966668Z", + "iopub.status.idle": "2023-11-08T00:40:27.982081Z", + "shell.execute_reply": "2023-11-08T00:40:27.981089Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'fifth_planet', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3913c49a", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q12\")" + ] + }, + { + "cell_type": "markdown", + "id": "d2e1e4c4", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Function 4: `get_planets(planet_file, mapping_file)`\n", + "\n", + "This function **must** take in as its inputs a CSV file `planet_file` which contains data on planets in the same format as `planets_1.csv`, as well as a JSON file `mapping_file` which maps planets in `planet_file` to their host star in the same format as `mapping_1.json`. This function **must** return a **list** of `Planet` objects by combining the data in these two files. The `Planet` objects **must** appear in the same order as they do in `planet_file`.\n", + "\n", + "You **must** access the values in `planet_file` using the `planet_cell` function.\n", + "\n", + "You **must not** hardcode the name of the directory `data` into the `get_planets` function. Instead, you must pass it as a part of the arguments `planet_file` and `mapping_file`.\n", + "\n", + "Once again, as a reminder, the attributes of the `Planet` objects should be obtained from the **rows** of `planet_file` and from `mapping_file` and stored as follows:\n", + "\n", + "|Attribute of `Planet` object|Column of `planets_1.csv`|Data Type|\n", + "|---------|------|---------|\n", + "|`planet_name`|Planet Name|**string**|\n", + "|`host_name`| - |**string**|\n", + "|`discovery_method`|Discovery Method|**string**|\n", + "|`discovery_year`|Discovery Year|**int**|\n", + "|`controversial_flag`|Controversial Flag|**bool**|\n", + "|`orbital_period`|Orbital Period [days]|**float**|\n", + "|`planet_radius`|Planet Radius [Earth Radius]|**float**|\n", + "|`planet_mass`|Planet Mass [Earth Mass]|**float**|\n", + "|`semi_major_radius`|Orbit Semi-Major Axis [au]|**float**|\n", + "|`eccentricity`|Eccentricity|**float**|\n", + "|`equilibrium_temperature`|Equilibrium Temperature [K]|**float**|\n", + "|`insolation_flux`|Insolation Flux [Earth Flux]|**float**|\n", + "\n", + "The value of the `host_name` attribute is found in `mapping_file`.\n", + "\n", + "In case any data in `planet_file` is **missing**, the corresponding value should be `None`.\n", + "\n", + "For example, when this function is called with the file `planets_1.csv` and `mapping_1.json` as the input, the **list** returned should look like:\n", + "\n", + "```python\n", + "[ Planet(planet_name='11 Com b', host_name='11 Com', discovery_method='Radial Velocity', discovery_year=2007, controversial_flag=False, orbital_period=323.21, planet_radius=12.2, planet_mass=4914.89849, semi_major_radius=1.178, eccentricity=0.238, equilibrium_temperature=None, insolation_flux=None),\n", + " Planet(planet_name='11 UMi b', host_name='11 UMi', discovery_method='Radial Velocity', discovery_year=2009, controversial_flag=False, orbital_period=516.21997, planet_radius=12.3, planet_mass=4684.8142, semi_major_radius=1.53, 
eccentricity=0.08, equilibrium_temperature=None, insolation_flux=None),\n", + " ...]\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dfcc3c07", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:28.048786Z", + "iopub.status.busy": "2023-11-08T00:40:28.047786Z", + "iopub.status.idle": "2023-11-08T00:40:28.061560Z", + "shell.execute_reply": "2023-11-08T00:40:28.060557Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "def get_planets(planet_file, mapping_file):\n", + " # TODO: read planet_file to a list of lists\n", + " # TODO: extract the header and rows from planet_file\n", + " # TODO: read mapping_file to a dictionary\n", + " # TODO: loop through each row in planet_file with indices\n", + " # TODO: create a Planet object (namedtuple) for each row\n", + " # TODO: add each Planet object to a list\n", + " # TODO: return the list after the end of the loop\n", + " pass # replace with your code" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "73bbe253", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"get_planets\")" + ] + }, + { + "cell_type": "markdown", + "id": "3b1bc76c", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 13:** What are the **last five** `Planet` objects in the **list** returned by `get_planets` when `planet_file` is `planets_1.csv` and `mapping_file` is `mapping_1.json`?\n", + "\n", + "Your output **must** be a **list** of `Planet` objects.\n", + "\n", + "**Hint:** First, you **must** use the `get_planets` function to parse the data in `planets_1.csv` and `mapping_1.json` and create a **list** of `Planet` objects. Then, you may slice this **list** to get the last five `Planet` objects." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2eb1f3e2", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:28.126931Z", + "iopub.status.busy": "2023-11-08T00:40:28.125924Z", + "iopub.status.idle": "2023-11-08T00:40:28.168019Z", + "shell.execute_reply": "2023-11-08T00:40:28.167014Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'last_five_planets_1', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5f15a9e1", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q13\")" + ] + }, + { + "cell_type": "markdown", + "id": "4044de21", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 14:** What are the `Planet` objects whose `controversial_flag` attribute is `True` in the **list** returned by `get_planets` when `planet_file` is `planets_2.csv` and `mapping_file` is `mapping_2.json`?\n", + "\n", + "Your output **must** be a **list** of `Planet` objects.\n",
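+ "\n", + "If it helps, the filtering pattern needed here looks roughly like the following sketch, shown on a tiny **made-up** list that reuses the `jupiter` object from earlier (not the real data):\n", + "\n", + "```python\n", + "# sketch: keep only the Planet objects whose controversial_flag attribute is True\n", + "example_planets = [jupiter]        # jupiter was created above with controversial_flag=False\n", + "flagged = []\n", + "for planet in example_planets:\n", + "    if planet.controversial_flag:  # the attribute is already a bool, so no comparison is needed\n", + "        flagged.append(planet)\n", + "flagged                            # [] here, because jupiter is not flagged as controversial\n", + "```"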
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5f21af9a", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:28.222608Z", + "iopub.status.busy": "2023-11-08T00:40:28.221608Z", + "iopub.status.idle": "2023-11-08T00:40:28.259628Z", + "shell.execute_reply": "2023-11-08T00:40:28.258636Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'controversial_planets', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6e60a562", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q14\")" + ] + }, + { + "cell_type": "markdown", + "id": "aa3b6fe3", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Data Cleaning 1: Broken CSV rows\n", + "\n", + "Our function `get_planets` works very well so far. However, it is likely that it will not work on all the files in the `data` directory. For example, if you use the function `get_planets` to read the data in `planets_4.csv` and `mapping_4.json`, you will most likely run into an error. **Try it yourself to verify!**\n", + "\n", + "The reason your code likely crashed is that the file `planets_4.csv` is **broken**. For some reason, several rows in `planets_4.csv` have their data jumbled up. For example, in the **seventh** row of `planets_4.csv`, we come across this row:\n", + "\n", + "|Planet Name|Discovery Method|Discovery Year|Controversial Flag|Orbital Period [days]|Planet Radius [Earth Radius]|Planet Mass [Earth Mass]|Orbit Semi-Major Axis [au]|Eccentricity|Equilibrium Temperature [K]|Insolation Flux [Earth Flux]|\n", + "|-----------|----------------|--------------|------------------|---------------------|----------------------------|------------------------|---------------------------|------------|---------------------------|----------------------------|\n", + "|123.01000000|Radial Velocity|2009|0|61 Vir d|5.110|22.90000|0.476000|0.350000|||\n", + "\n", + "We can see that for some reason, the value under the column `Planet Name` is a **number** while the value under the column `Orbital Period [days]` is a **string**. It is possible that these two columns of data got *swapped* here, but we cannot be sure about this.\n", + "\n", + "We will call such a **row** in a CSV file, where the values under a column do not match the expected format, a **broken row**. While it is possible to sometimes extract useful data from broken rows, in this project, we will simply **skip** broken rows.\n", + "\n", + "You **must** now go back to your definition of `get_planets` and edit it, so that any **broken rows get skipped**.\n", + "\n", + "**Hints:**\n", + "\n", + "1. The simplest way to recognize that a row is broken is that it causes a **runtime error** (for example, a `ValueError` when a value that is not a number is typecast to a **float**) when you call the `get_planets` function. So, one simple way to skip bad rows would be to use `try/except` blocks to avoid processing any rows that cause the code to crash; remember **not** to use *bare* `try/except` blocks.\n", + "2. There are only **10** broken rows in `planets_4.csv`, and they are all **bunched up** at the very beginning and the very end of the dataset. You can manually **inspect** the **first 10 and last 10** rows, and figure out which of these rows are broken and why.\n", + "\n", + "**Important Warning:** You are **not** allowed to **hardcode** the indices of the broken rows. 
You may inspect `planets_4.csv` to identify how to tell a **broken row** apart. Therefore, to use the example of the **broken row** above, you **may not** hardcode to skip the **seventh** row of `planets_4.csv`. However, it is **acceptable** to make your function **skip** any row for which the value under the `Planet Name` column is numeric, by observing that this is the reason why the row is broken." + ] + }, + { + "cell_type": "markdown", + "id": "cb894a51", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 15:** What are the **last five** `Planet` objects produced by `get_planets` when `planet_file` is `planets_4.csv` and `mapping_file` is `mapping_4.json`?\n", + "\n", + "Your output **must** be a **list** of `Planet` objects." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "026ee35d", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:28.322148Z", + "iopub.status.busy": "2023-11-08T00:40:28.321149Z", + "iopub.status.idle": "2023-11-08T00:40:28.341311Z", + "shell.execute_reply": "2023-11-08T00:40:28.340368Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'last_five_planets_4', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "34314589", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q15\")" + ] + }, + { + "cell_type": "markdown", + "id": "9f551f15", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Data Cleaning 2: Broken JSON files\n", + "\n", + "We are now ready to read **all** the files in the `data` directory and create a **list** of `Planet` objects for all the planets in the directory. However, if you try to use `get_planets` on all the planet CSV files and mapping JSON files, you will likely run into another error. **Try it for yourself by calling `get_planets` on all the files in `data`!**\n", + "\n", + "It is likely that your code crashed when you tried to read the data in `planets_5.csv` and `mapping_5.json`. This is because the file `mapping_5.json` is **broken**. Unlike **broken** CSV files, where we only had to skip the **broken rows**, it is much harder to parse **broken JSON files**. When a JSON file is **broken**, we often have no choice but to **skip the file entirely**.\n", + "\n", + "You **must** now go back to your definition of `get_planets` and edit it, so that if the JSON file is **broken**, then the file is completely skipped, and only an **empty list** is returned.\n", + "\n", + "**Important Warning:** You are **not** allowed to **hardcode** the name of the files to be skipped. You **must** use `try/except` blocks to determine whether the JSON file is **broken** and skip the file if it is.\n", + "\n", + "**Hint:** You might also want to review the **Project Requirements** at the start of this project before you use `try/except`." + ] + }, + { + "cell_type": "markdown", + "id": "611e1794", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "### Data Structure 4: `planets_list`\n", + "\n", + "You are now ready to read all the data about the planets stored in the `data` directory. 
You **must** now create a **list** containing `Planet` objects by parsing the data inside the files `planets_1.csv`, ..., `planets_5.csv` and `mapping_1.json`, ..., `mapping_5.json`.\n", + "\n", + "You **must** skip any **broken rows** in the CSV files, and also completely skip any **broken JSON files**. However, you are **not** allowed to **hardcode** the file you need to skip. You **must** call `get_planets` on **all** 5 pairs of files to answer this question.\n", + "\n", + "You **must** use the `get_planets` function on each of the five pairs of files in the `data` directory to create `planets_list`.\n", + "\n", + "**Warning:** Recall that the ordering of the files returned by the `os.listdir` function **depends on the operating system**. So, you need to be careful if your code relies on the ordering of the **list** returned by this function. One way to avoid any issues here would be to **sort** the **list** first, so that the ordering is identical across all operating systems." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9921ff82", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:28.393560Z", + "iopub.status.busy": "2023-11-08T00:40:28.392559Z", + "iopub.status.idle": "2023-11-08T00:40:28.484082Z", + "shell.execute_reply": "2023-11-08T00:40:28.483080Z" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "# define the variable 'planets_list' here,\n", + "# but do NOT display the variable at the end\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "25c0cad8", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"planets_list\")" + ] + }, + { + "cell_type": "markdown", + "id": "19bbe569", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "If you wish to **verify** that you have read the files and defined `planets_list` correctly, you can check that `planets_list` has **5026** `Planet` objects in it. If it contains fewer or more planets than this, it is possible that you have accidentally parsed a broken CSV row in `planets_4.csv`, or accidentally parsed data from the broken JSON file `mapping_5.json`." + ] + }, + { + "cell_type": "markdown", + "id": "235e87f6", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 16:** What is the output of `planets_list[5020:5025]`?\n", + "\n", + "Your output **must** be a **list** of `Planet` objects.\n", + "\n", + "**Hint:** If you did not get the right answer here, it is possible that you did not read the files in the correct **order**. In `planets_list`, the planets from `planets_1.csv` should appear first (in the order that they appear in the dataset), followed by the planets from `planets_2.csv`, `planets_3.csv`, `planets_4.csv`, and `planets_5.csv`.\n",
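+ "\n", + "As a reminder of the `os` pattern that the warning above refers to, here is a small sketch (it assumes the `data` directory used throughout this project and simply prints the paths it builds; in the real notebook the `import` belongs in the import cell at the top):\n", + "\n", + "```python\n", + "# sketch: build OS-independent paths in a predictable (sorted) order\n", + "import os\n", + "\n", + "for filename in sorted(os.listdir('data')):    # sort so every operating system sees the same order\n", + "    if filename.startswith('planets'):         # e.g., pick out only the planet CSV files\n", + "        print(os.path.join('data', filename))  # never build paths by hardcoding a separator\n", + "```"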
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f8506324", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:28.543085Z", + "iopub.status.busy": "2023-11-08T00:40:28.542088Z", + "iopub.status.idle": "2023-11-08T00:40:28.552172Z", + "shell.execute_reply": "2023-11-08T00:40:28.551171Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'planets_5020_5025', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2fdfabce", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q16\")" + ] + }, + { + "cell_type": "markdown", + "id": "3b712e38", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 17:** How many planets in `planets_list` were discovered in the year *2023*?\n", + "\n", + "Your output **must** be an **integer**." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "22b89d22", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:28.585187Z", + "iopub.status.busy": "2023-11-08T00:40:28.584191Z", + "iopub.status.idle": "2023-11-08T00:40:28.596436Z", + "shell.execute_reply": "2023-11-08T00:40:28.595445Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'planets_disc_2023', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0d14be92", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q17\")" + ] + }, + { + "cell_type": "markdown", + "id": "f5b74c46", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 18:** Find the `Star` object around which the `Planet` named *TOI-2202 c* orbits.\n", + "\n", + "Your output **must** be a `Star` object.\n", + "\n", + "**Hint:** You **must** first find the `Planet` object with the `planet_name` *TOI-2202 c* and then use the `host_name` attribute to identify the name of the star around which the planet orbits. Then, you can get the `Star` object using the `stars_dict` **dictionary** defined above.\n", + "\n", + "You **must** exit the loop once you find the first planet with the target name." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9beed08b", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:28.663418Z", + "iopub.status.busy": "2023-11-08T00:40:28.662422Z", + "iopub.status.idle": "2023-11-08T00:40:28.675419Z", + "shell.execute_reply": "2023-11-08T00:40:28.674419Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'toi_2022_c_star', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1419f258", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q18\")" + ] + }, + { + "cell_type": "markdown", + "id": "d97da08c", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 19:** Find the **average** `planet_radius` (in units of the radius of the Earth) of the planets that orbit stars with `stellar_radius` more than *10* (i.e. more than *10* times the radius of the Sun).\n", + "\n", + "Your output **must** be a **float**. 
You **must** skip any `Planet` objects with **missing** `planet_radius` data and any `Star` objects with **missing** `stellar_radius` data." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ea5c9c38", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:28.739431Z", + "iopub.status.busy": "2023-11-08T00:40:28.738428Z", + "iopub.status.idle": "2023-11-08T00:40:28.758509Z", + "shell.execute_reply": "2023-11-08T00:40:28.757519Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'avg_planet_radius_big_stars', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4ce93cf3", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q19\")" + ] + }, + { + "cell_type": "markdown", + "id": "3960389a", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "**Question 20:** Find all the `Planet` objects that orbit the **youngest** `Star` object.\n", + "\n", + "Your output **must** be a **list** of `Planet` objects (even if there is **only one** `Planet` in the list). The age of a `Star` can be found from its `stellar_age` column. You do **not** have to worry about any ties. There is a **unique** `Star` in the dataset which is the youngest star." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "47751dc8", + "metadata": { + "execution": { + "iopub.execute_input": "2023-11-08T00:40:28.829590Z", + "iopub.status.busy": "2023-11-08T00:40:28.829590Z", + "iopub.status.idle": "2023-11-08T00:40:28.845621Z", + "shell.execute_reply": "2023-11-08T00:40:28.844622Z" + }, + "lines_to_next_cell": 0, + "tags": [] + }, + "outputs": [], + "source": [ + "# compute and store the answer in the variable 'youngest_star_planets', then display it\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "39afe6e0", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"q20\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d510b9ea", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"general_deductions\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "980793ba", + "metadata": { + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "grader.check(\"summary\")" + ] + }, + { + "cell_type": "markdown", + "id": "0c39321c", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + "## Submission\n", + "It is recommended that at this stage, you Restart and Run all Cells in your notebook.\n", + "That will automatically save your work and generate a zip file for you to submit.\n", + "\n", + "**SUBMISSION INSTRUCTIONS**:\n", + "1. **Upload** the zipfile to Gradescope.\n", + "2. If you completed the project with a **partner**, make sure to **add their name** by clicking \"Add Group Member\"\n", + "in Gradescope when uploading the zip file.\n", + "3. Check **Gradescope** results as soon as the auto-grader execution gets completed.\n", + "4. Your **final score** for this project is the score that you see on **Gradescope**.\n", + "5. You are **allowed** to resubmit on Gradescope as many times as you want to.\n", + "6. **Contact** a TA/PM if you lose any points on Gradescope for any **unclear reasons**." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cb955e7c", + "metadata": { + "cell_type": "code", + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "# running this cell will create a new save checkpoint for your notebook\n", + "from IPython.display import display, Javascript\n", + "display(Javascript('IPython.notebook.save_checkpoint();'))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2d31d149", + "metadata": { + "cell_type": "code", + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "!jupytext --to py p10.ipynb" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8b3d7b23", + "metadata": { + "cell_type": "code", + "deletable": false, + "editable": false + }, + "outputs": [], + "source": [ + "public_tests.check_file_size(\"p10.ipynb\")\n", + "grader.export(pdf=False, run_tests=False, files=[\"p10.py\"])" + ] + }, + { + "cell_type": "markdown", + "id": "57741345", + "metadata": { + "deletable": false, + "editable": false + }, + "source": [ + " " + ] + } + ], + "metadata": { + "jupytext": { + "cell_metadata_filter": "-all", + "encoding": "# coding: utf-8", + "executable": "/usr/bin/env python", + "notebook_metadata_filter": "-all" + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.4" + }, + "otter": { + "OK_FORMAT": true, + "tests": { + "Planet": { + "name": "Planet", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('Planet', jupiter)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('Planet: data structure is defined more than once')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'data structure is defined more than once (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('Planet: data structure is defined incorrectly')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'data structure is defined incorrectly (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "Star": { + "name": "Star", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('Star', sun)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('Star: data structure is defined more than once')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'data structure is defined more than once (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." 
+ }, + { + "code": ">>> \n>>> public_tests.rubric_check('Star: data structure is defined incorrectly')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'data structure is defined incorrectly (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "general_deductions": { + "name": "general_deductions", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> \n>>> public_tests.rubric_check('general_deductions: Outputs not visible/did not save the notebook file prior to running the cell containing \"export\". We cannot see your output if you do not save before generating the zip file.')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'Outputs not visible/did not save the notebook file prior to running the cell containing \"export\". We cannot see your output if you do not save before generating the zip file. (-3)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('general_deductions: Used concepts/modules such as csv.DictReader and pandas not covered in class yet. Note that built-in functions that you have been introduced to can be used.')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'Used concepts/modules such as csv.DictReader and pandas not covered in class yet. Note that built-in functions that you have been introduced to can be used. (-3)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('general_deductions: Used bare try/except blocks without explicitly specifying the type of exceptions that need to be caught')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'Used bare try/except blocks without explicitly specifying the type of exceptions that need to be caught (-3)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('general_deductions: Large outputs such as stars_dict or planets_list are displayed in the notebook.')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'Large outputs such as stars_dict or planets_list are displayed in the notebook. (-3)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." 
+ }, + { + "code": ">>> \n>>> public_tests.rubric_check('general_deductions: Import statements are not mentioned in the required cell at the top of the notebook.')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'Import statements are not mentioned in the required cell at the top of the notebook. (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "get_planets": { + "name": "get_planets", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> \n>>> public_tests.rubric_check('get_planets: function logic is incorrect')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'function logic is incorrect (-2)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('get_planets: hardcoded the name of directory inside the function instead of passing it as a part of the input argument')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'hardcoded the name of directory inside the function instead of passing it as a part of the input argument (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('get_planets: function is called more than twice with the same dataset')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'function is called more than twice with the same dataset (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('get_planets: `planet_cell` function is not used', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`planet_cell` function is not used (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('get_planets: function is defined more than once')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'function is defined more than once (-3)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." 
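The `get_planets` rubric above deducts for hardcoding the data directory inside the function, for re-reading the same dataset more than twice, and for not reusing `planet_cell`. The sketch below shows the intended call pattern; the two-argument signature and the specific file names are assumptions made only for illustration.

```python
import os

# build the paths outside the function with os.path.join (no hardcoded slashes)
planets_path = os.path.join("data", "planets_1.csv")
mapping_path = os.path.join("data", "mapping_1.json")

# call the loader once per dataset and reuse the stored result afterwards, e.g.
# planets_1 = get_planets(planets_path, mapping_path)   # assumed signature
# first_planet, last_five = planets_1[0], planets_1[-5:]
```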
+ } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "get_stars": { + "name": "get_stars", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> \n>>> public_tests.rubric_check('get_stars: function logic is incorrect')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'function logic is incorrect (-2)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('get_stars: hardcoded the name of directory inside the function instead of passing it as a part of the input argument')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'hardcoded the name of directory inside the function instead of passing it as a part of the input argument (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('get_stars: function is called more than twice with the same dataset')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'function is called more than twice with the same dataset (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('get_stars: `star_cell` function is not used', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`star_cell` function is not used (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('get_stars: function is defined more than once')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'function is defined more than once (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "planet_cell": { + "name": "planet_cell", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> \n>>> public_tests.rubric_check('planet_cell: function does not typecast values based on columns')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'function does not typecast values based on columns (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." 
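The `planet_cell` entries here expect each raw CSV string to be typecast according to its column, with columns looked up by name rather than by hardcoded index. The sketch below illustrates that idea; the argument list and the `"0"`/`"1"` encoding of the flag column are assumptions, not the project's required signature.

```python
def planet_cell_sketch(col_name, value):
    if value == "":                        # assumption: blank cells mean missing data
        return None
    if col_name == "discovery_year":       # integer column
        return int(value)
    if col_name == "controversial_flag":   # assumption: flag stored as "0"/"1"
        return value == "1"
    float_columns = ["orbital_period", "planet_radius", "planet_mass",
                     "semi_major_radius", "eccentricity",
                     "equilibrium_temperature", "insolation_flux"]
    if col_name in float_columns:          # floating-point columns
        return float(value)
    return value                           # names and discovery methods stay as strings

planet_cell_sketch("planet_radius", "11.209")   # -> 11.209
```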
+ }, + { + "code": ">>> \n>>> public_tests.rubric_check('planet_cell: column indices are hardcoded instead of using column names')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'column indices are hardcoded instead of using column names (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('planet_cell: boolean values are not typecasted correctly')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'boolean values are not typecasted correctly (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('planet_cell: function logic is incorrect')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'function logic is incorrect (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('planet_cell: function is defined more than once')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'function is defined more than once (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "planets_list": { + "name": "planets_list", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> \n>>> public_tests.rubric_check('planets_list: data structure is defined incorrectly')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'data structure is defined incorrectly (-2)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('planets_list: `get_planets` function is not used', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`get_planets` function is not used (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('planets_list: paths are hardcoded using slashes')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'paths are hardcoded using slashes (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." 
+ }, + { + "code": ">>> public_tests.rubric_check('planets_list: public tests')\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q1": { + "name": "q1", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q1', files_in_data)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q1: answer is not sorted explicitly')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'answer is not sorted explicitly (-2)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q1: answer does not remove all files and directories that start with `.`')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'answer does not remove all files and directories that start with `.` (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q10": { + "name": "q10", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q10', biggest_star)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q10: incorrect logic is used to answer')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'incorrect logic is used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q10: `stars_dict` data structure is not used to answer', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`stars_dict` data structure is not used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> public_tests.rubric_check('q10: public tests')\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q11": { + "name": "q11", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q11', avg_age_kepler)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q11: answer does not check for only stars that start with `Kepler`')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'answer does not check for only stars that start with `Kepler` (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." 
+ }, + { + "code": ">>> \n>>> public_tests.rubric_check('q11: incorrect logic is used to answer')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'incorrect logic is used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q11: `stars_dict` data structure is not used to answer', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`stars_dict` data structure is not used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q12": { + "name": "q12", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q12', fifth_planet)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q12: `planet_cell` function is not used to answer', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`planet_cell` function is not used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q12: `mapping_1_json` data structure is not used to answer', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`mapping_1_json` data structure is not used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q12: answer unnecessarily iterates over the entire dataset')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'answer unnecessarily iterates over the entire dataset (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> public_tests.rubric_check('q12: public tests')\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q13": { + "name": "q13", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q13', last_five_planets_1)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q13: `get_planets` function is not used to answer', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`get_planets` function is not used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." 
+ }, + { + "code": ">>> \n>>> public_tests.rubric_check('q13: paths are hardcoded using slashes')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'paths are hardcoded using slashes (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> public_tests.rubric_check('q13: public tests')\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q14": { + "name": "q14", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q14', controversial_planets)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q14: incorrect logic is used to answer')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'incorrect logic is used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q14: `get_planets` function is not used to answer', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`get_planets` function is not used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q14: paths are hardcoded using slashes')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'paths are hardcoded using slashes (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> public_tests.rubric_check('q14: public tests')\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q15": { + "name": "q15", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q15', last_five_planets_4)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q15: `get_planets` function is not used to answer', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`get_planets` function is not used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q15: paths are hardcoded using slashes')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'paths are hardcoded using slashes (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." 
+ }, + { + "code": ">>> public_tests.rubric_check('q15: public tests')\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q16": { + "name": "q16", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q16', planets_5020_5025)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q16: `planets_list` data structure is not used to answer', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`planets_list` data structure is not used to answer (-2)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q17": { + "name": "q17", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q17', planets_disc_2023)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q17: incorrect comparison operator is used')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'incorrect comparison operator is used (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q17: incorrect logic is used to answer')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'incorrect logic is used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q17: `planets_list` data structure is not used to answer', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`planets_list` data structure is not used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q18": { + "name": "q18", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q18', toi_2022_c_star)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q18: `planets_list` and `stars_dict` data structures are not used to answer', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`planets_list` and `stars_dict` data structures are not used to answer (-2)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." 
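The `q18` entries here expect the answer to come from `planets_list` and `stars_dict` together, and (as the next rubric point notes) the loop should stop as soon as the matching planet is found. A hedged sketch of that pattern follows, using reduced stand-in data; the planet name spelling and the shortened field lists are assumptions for illustration only.

```python
from collections import namedtuple

# reduced stand-ins with only the fields this sketch touches
Planet = namedtuple("Planet", ["planet_name", "host_name"])
Star = namedtuple("Star", ["spectral_type"])
planets_list = [Planet("Kepler-452 b", "Kepler-452"), Planet("TOI-2022 c", "TOI-2022")]
stars_dict = {"Kepler-452": Star("G2 V"), "TOI-2022": Star("K8V")}

toi_2022_c_star = None
for planet in planets_list:
    if planet.planet_name == "TOI-2022 c":           # hypothetical spelling of the target
        toi_2022_c_star = stars_dict[planet.host_name]
        break                                        # exit the loop once the answer is found
```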
+ }, + { + "code": ">>> \n>>> public_tests.rubric_check('q18: did not exit loop and instead iterated further after finding the answer')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'did not exit loop and instead iterated further after finding the answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> public_tests.rubric_check('q18: public tests')\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q19": { + "name": "q19", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q19', avg_planet_radius_big_stars)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q19: incorrect comparison operator is used')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'incorrect comparison operator is used (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q19: incorrect logic is used to answer')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'incorrect logic is used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q19: `planets_list` and `stars_dict` data structures are not used to answer', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`planets_list` and `stars_dict` data structures are not used to answer (-2)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> public_tests.rubric_check('q19: public tests')\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q2": { + "name": "q2", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q2', file_paths)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q2: recomputed variable defined in Question 1, or the answer is not sorted explicitly')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'recomputed variable defined in Question 1, or the answer is not sorted explicitly (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." 
+ }, + { + "code": ">>> \n>>> public_tests.rubric_check('q2: answer does not remove all files and directories that start with `.`')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'answer does not remove all files and directories that start with `.` (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q2: paths are hardcoded using slashes')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'paths are hardcoded using slashes (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> public_tests.rubric_check('q2: public tests')\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q20": { + "name": "q20", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q20', youngest_star_planets)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q20: answer does not include all Planets that orbit the Star')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'answer does not include all Planets that orbit the Star (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q20: incorrect logic is used to answer')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'incorrect logic is used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q20: `planets_list` and `stars_dict` data structures are not used to answer', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`planets_list` and `stars_dict` data structures are not used to answer (-2)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." 
+ }, + { + "code": ">>> public_tests.rubric_check('q20: public tests')\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q3": { + "name": "q3", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q3', csv_file_paths)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q3: recomputed variable defined in Question 1 or Question 2, or the answer is not sorted explicitly')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'recomputed variable defined in Question 1 or Question 2, or the answer is not sorted explicitly (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q3: answer does not remove all files and directories that start with `.`')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'answer does not remove all files and directories that start with `.` (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q3: answer does not check only for files that end with `.csv`')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'answer does not check only for files that end with `.csv` (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q3: paths are hardcoded using slashes')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'paths are hardcoded using slashes (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> public_tests.rubric_check('q3: public tests')\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q4": { + "name": "q4", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q4', stars_paths)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q4: recomputed variable defined in Question 1 or Question 2, or the answer is not sorted explicitly')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'recomputed variable defined in Question 1 or Question 2, or the answer is not sorted explicitly (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." 
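The rubric points for `q1`–`q4` all reward the same file-listing pattern: list the directory, sort explicitly, drop anything that starts with `.`, filter by prefix or suffix, and join paths with `os.path.join` rather than hardcoded slashes. A minimal sketch of that pipeline is below; the `reverse=True` ordering is inferred from the expected answers listed later in `public_tests.py`.

```python
import os

files_in_data = sorted(os.listdir("data"), reverse=True)              # q1: sort explicitly
files_in_data = [f for f in files_in_data if not f.startswith(".")]   # drop hidden files

file_paths = [os.path.join("data", f) for f in files_in_data]         # q2: no hardcoded slashes
csv_file_paths = [p for p in file_paths if p.endswith(".csv")]        # q3: only CSV files
stars_paths = [p for p in csv_file_paths
               if os.path.basename(p).startswith("stars")]            # q4: only stars_*.csv
```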
+ }, + { + "code": ">>> \n>>> public_tests.rubric_check('q4: answer does not remove all files and directories that start with `.`')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'answer does not remove all files and directories that start with `.` (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q4: answer does not check for only files that start with `stars`')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'answer does not check for only files that start with `stars` (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q4: paths are hardcoded using slashes')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'paths are hardcoded using slashes (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> public_tests.rubric_check('q4: public tests')\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q5": { + "name": "q5", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q5', third_star)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q5: `star_cell` function is not used to answer', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`star_cell` function is not used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q5: answer unnecessarily iterates over the entire dataset')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'answer unnecessarily iterates over the entire dataset (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." 
+ }, + { + "code": ">>> public_tests.rubric_check('q5: public tests')\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q6": { + "name": "q6", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q6', dp_leo)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q6: `stars_1_dict` data structure is not used to answer', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`stars_1_dict` data structure is not used to answer (-2)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q7": { + "name": "q7", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q7', avg_lum_stars_1)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q7: incorrect logic is used to answer')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'incorrect logic is used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q7: `stars_1_dict` data structure is not used to answer', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`stars_1_dict` data structure is not used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> public_tests.rubric_check('q7: public tests')\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q8": { + "name": "q8", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q8', avg_age_stars_2)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q8: incorrect logic is used to answer')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'incorrect logic is used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q8: `get_stars` function is not used to answer', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`get_stars` function is not used to answer (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." 
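The `q7`/`q8` entries here expect the averages to be computed from the loaded star dictionaries rather than from the raw files. Below is a hedged sketch of the q7-style average over `stars_1_dict`, with reduced stand-in data; skipping missing values is stated as an assumption, not a requirement from the source.

```python
from collections import namedtuple

# reduced stand-in for stars_1_dict; the two luminosities mirror values in public_tests.py
Star = namedtuple("Star", ["stellar_luminosity"])
stars_1_dict = {"DP Leo": Star(-2.4), "Sun": Star(0.0)}

total_lum = 0.0
num_stars = 0
for name in stars_1_dict:
    lum = stars_1_dict[name].stellar_luminosity
    if lum is not None:                   # assumption: ignore missing luminosities
        total_lum += lum
        num_stars += 1
avg_lum_stars_1 = total_lum / num_stars   # -> -1.2 for this stand-in data
```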
+ }, + { + "code": ">>> public_tests.rubric_check('q8: public tests')\nAll test cases passed!\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "q9": { + "name": "q9", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.check('q9', kepler_220_temp)\nAll test cases passed!\n", + "hidden": false, + "locked": false + }, + { + "code": ">>> \n>>> public_tests.rubric_check('q9: `stars_dict` data structure is not used to answer', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`stars_dict` data structure is not used to answer (-2)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "star_cell": { + "name": "star_cell", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> \n>>> public_tests.rubric_check('star_cell: function does not typecast values based on columns')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'function does not typecast values based on columns (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('star_cell: column indices are hardcoded instead of using column names')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'column indices are hardcoded instead of using column names (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('star_cell: function logic is incorrect')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'function logic is incorrect (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('star_cell: function is defined more than once')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'function is defined more than once (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "stars_dict": { + "name": "stars_dict", + "points": 0, + "suites": [ + { + "cases": [ + { + "code": ">>> \n>>> public_tests.rubric_check('stars_dict: data structure is defined incorrectly')\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - 'data structure is defined incorrectly (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." 
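The `stars_dict` entries that follow expect the dictionary to be assembled with `get_stars` over the paths in `stars_paths`, rather than by re-parsing files by hand. A hedged sketch of that merge is below, with a stand-in `get_stars` and a single-path signature assumed purely for illustration.

```python
import os

def get_stars(path):
    # stand-in: the notebook's real get_stars parses the CSV at `path` and
    # returns a dict mapping each star's name to its Star namedtuple
    return {}

stars_paths = [os.path.join("data", "stars_1.csv")]   # normally the q4 result

stars_dict = {}
for path in stars_paths:
    stars_dict.update(get_stars(path))   # merge every file's stars into one dict
```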
+ }, + { + "code": ">>> \n>>> public_tests.rubric_check('stars_dict: `get_stars` function is not used', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`get_stars` function is not used (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + }, + { + "code": ">>> \n>>> public_tests.rubric_check('stars_dict: `stars_paths` is not used to find paths of necessary files', False)\nAll test cases passed!\n", + "hidden": false, + "locked": false, + "success_message": "Note that the Gradescope autograder will deduct points if your code fails the following rubric point - '`stars_paths` is not used to find paths of necessary files (-1)'. The public tests cannot determine if your code satisfies these requirements. Verify your code manually." + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + }, + "summary": { + "name": "summary", + "points": 127, + "suites": [ + { + "cases": [ + { + "code": ">>> public_tests.get_summary()\nTotal Score: 100/100\n", + "hidden": false, + "locked": false + } + ], + "scored": true, + "setup": "", + "teardown": "", + "type": "doctest" + } + ] + } + } + }, + "vscode": { + "interpreter": { + "hash": "14da47bd65ade4b245ff2b2e979dfbf7dfc83de2bd52e30e561cd592f0ba2dfc" + } + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/p10/public_tests.py b/p10/public_tests.py new file mode 100644 index 0000000000000000000000000000000000000000..06d00521e3691d902925c009d0082880296ec542 --- /dev/null +++ b/p10/public_tests.py @@ -0,0 +1,878 @@ +#!/usr/bin/python +# + +import os, json, math, copy +from collections import namedtuple +from bs4 import BeautifulSoup + +HIDDEN_FILE = os.path.join("hidden", "hidden_tests.py") +if os.path.exists(HIDDEN_FILE): + import hidden.hidden_tests as hidn +# - + +MAX_FILE_SIZE = 750 # units - KB +REL_TOL = 6e-04 # relative tolerance for floats +ABS_TOL = 15e-03 # absolute tolerance for floats +TOTAL_SCORE = 100 # total score for the project + +DF_FILE = 'expected_dfs.html' +PLOT_FILE = 'expected_plots.json' + +PASS = "All test cases passed!" + +TEXT_FORMAT = "TEXT_FORMAT" # question type when expected answer is a type, str, int, float, or bool +TEXT_FORMAT_UNORDERED_LIST = "TEXT_FORMAT_UNORDERED_LIST" # question type when the expected answer is a list or a set where the order does *not* matter +TEXT_FORMAT_ORDERED_LIST = "TEXT_FORMAT_ORDERED_LIST" # question type when the expected answer is a list or tuple where the order does matter +TEXT_FORMAT_DICT = "TEXT_FORMAT_DICT" # question type when the expected answer is a dictionary +TEXT_FORMAT_SPECIAL_ORDERED_LIST = "TEXT_FORMAT_SPECIAL_ORDERED_LIST" # question type when the expected answer is a list where order does matter, but with possible ties. 
Elements are ordered according to values in special_ordered_json (with ties allowed) +TEXT_FORMAT_NAMEDTUPLE = "TEXT_FORMAT_NAMEDTUPLE" # question type when expected answer is a namedtuple +PNG_FORMAT_SCATTER = "PNG_FORMAT_SCATTER" # question type when the expected answer is a scatter plot +HTML_FORMAT = "HTML_FORMAT" # question type when the expected answer is a DataFrame +FILE_JSON_FORMAT = "FILE_JSON_FORMAT" # question type when the expected answer is a JSON file +SLASHES = "SLASHES" # question SUFFIX when expected answer contains paths with slashes + +def get_expected_format(): + """get_expected_format() returns a dict mapping each question to the format + of the expected answer.""" + expected_format = {'q1': 'TEXT_FORMAT_ORDERED_LIST', + 'q2': 'TEXT_FORMAT_ORDERED_LIST_SLASHES', + 'q3': 'TEXT_FORMAT_ORDERED_LIST_SLASHES', + 'q4': 'TEXT_FORMAT_ORDERED_LIST_SLASHES', + 'Star': 'TEXT_FORMAT_NAMEDTUPLE', + 'q5': 'TEXT_FORMAT_NAMEDTUPLE', + 'q6': 'TEXT_FORMAT_NAMEDTUPLE', + 'q7': 'TEXT_FORMAT', + 'q8': 'TEXT_FORMAT', + 'q9': 'TEXT_FORMAT', + 'q10': 'TEXT_FORMAT', + 'q11': 'TEXT_FORMAT', + 'Planet': 'TEXT_FORMAT_NAMEDTUPLE', + 'q12': 'TEXT_FORMAT_NAMEDTUPLE', + 'q13': 'TEXT_FORMAT_ORDERED_LIST', + 'q14': 'TEXT_FORMAT_ORDERED_LIST', + 'q15': 'TEXT_FORMAT_ORDERED_LIST', + 'q16': 'TEXT_FORMAT_ORDERED_LIST', + 'q17': 'TEXT_FORMAT', + 'q18': 'TEXT_FORMAT_NAMEDTUPLE', + 'q19': 'TEXT_FORMAT', + 'q20': 'TEXT_FORMAT_UNORDERED_LIST'} + return expected_format + + +def get_expected_json(): + """get_expected_json() returns a dict mapping each question to the expected + answer (if the format permits it).""" + expected_json = {'q1': ['stars_5.csv', + 'stars_4.csv', + 'stars_3.csv', + 'stars_2.csv', + 'stars_1.csv', + 'planets_5.csv', + 'planets_4.csv', + 'planets_3.csv', + 'planets_2.csv', + 'planets_1.csv', + 'mapping_5.json', + 'mapping_4.json', + 'mapping_3.json', + 'mapping_2.json', + 'mapping_1.json'], + 'q2': ['data\\stars_5.csv', + 'data\\stars_4.csv', + 'data\\stars_3.csv', + 'data\\stars_2.csv', + 'data\\stars_1.csv', + 'data\\planets_5.csv', + 'data\\planets_4.csv', + 'data\\planets_3.csv', + 'data\\planets_2.csv', + 'data\\planets_1.csv', + 'data\\mapping_5.json', + 'data\\mapping_4.json', + 'data\\mapping_3.json', + 'data\\mapping_2.json', + 'data\\mapping_1.json'], + 'q3': ['data\\stars_5.csv', + 'data\\stars_4.csv', + 'data\\stars_3.csv', + 'data\\stars_2.csv', + 'data\\stars_1.csv', + 'data\\planets_5.csv', + 'data\\planets_4.csv', + 'data\\planets_3.csv', + 'data\\planets_2.csv', + 'data\\planets_1.csv'], + 'q4': ['data\\stars_5.csv', + 'data\\stars_4.csv', + 'data\\stars_3.csv', + 'data\\stars_2.csv', + 'data\\stars_1.csv'], + 'Star': Star(spectral_type='G2 V', stellar_effective_temperature=5780.0, stellar_radius=1.0, stellar_mass=1.0, stellar_luminosity=0.0, stellar_surface_gravity=4.44, stellar_age=4.6), + 'q5': Star(spectral_type='K0III', stellar_effective_temperature=4888.0, stellar_radius=11.55, stellar_mass=1.78, stellar_luminosity=1.84, stellar_surface_gravity=2.55, stellar_age=4.5), + 'q6': Star(spectral_type=None, stellar_effective_temperature=13500.0, stellar_radius=0.01, stellar_mass=0.69, stellar_luminosity=-2.4, stellar_surface_gravity=None, stellar_age=None), + 'q7': 0.01624010554089703, + 'q8': 4.3255604996096775, + 'q9': 4632.0, + 'q10': 'HD 81817', + 'q11': 4.245366288492731, + 'Planet': Planet(planet_name='Jupiter', host_name='Sun', discovery_method='Imaging', discovery_year=1610, controversial_flag=False, orbital_period=4333.0, planet_radius=11.209, planet_mass=317.828, 
semi_major_radius=5.2038, eccentricity=0.0489, equilibrium_temperature=110, insolation_flux=0.0345), + 'q12': Planet(planet_name='17 Sco b', host_name='17 Sco', discovery_method='Radial Velocity', discovery_year=2020, controversial_flag=False, orbital_period=578.38, planet_radius=12.9, planet_mass=1373.01872, semi_major_radius=1.45, eccentricity=0.06, equilibrium_temperature=None, insolation_flux=None), + 'q13': [Planet(planet_name='Kepler-1494 b', host_name='Kepler-1494', discovery_method='Transit', discovery_year=2016, controversial_flag=False, orbital_period=91.080482, planet_radius=3.07, planet_mass=9.64, semi_major_radius=0.3982, eccentricity=0.0, equilibrium_temperature=415.0, insolation_flux=9.11), + Planet(planet_name='Kepler-1495 b', host_name='Kepler-1495', discovery_method='Transit', discovery_year=2016, controversial_flag=False, orbital_period=85.273256, planet_radius=2.94, planet_mass=8.96, semi_major_radius=0.3677, eccentricity=0.0, equilibrium_temperature=443.0, insolation_flux=9.1), + Planet(planet_name='Kepler-1496 b', host_name='Kepler-1496', discovery_method='Transit', discovery_year=2016, controversial_flag=False, orbital_period=64.6588017, planet_radius=2.22, planet_mass=5.56, semi_major_radius=0.3211, eccentricity=0.0, equilibrium_temperature=535.0, insolation_flux=18.38), + Planet(planet_name='Kepler-1497 b', host_name='Kepler-1497', discovery_method='Transit', discovery_year=2016, controversial_flag=False, orbital_period=8.74199772, planet_radius=1.66, planet_mass=3.39, semi_major_radius=0.0817, eccentricity=0.0, equilibrium_temperature=924.0, insolation_flux=172.38), + Planet(planet_name='TOI-784 b', host_name='HD 307842', discovery_method='Transit', discovery_year=2023, controversial_flag=False, orbital_period=2.7970365, planet_radius=1.93, planet_mass=9.67, semi_major_radius=0.038, eccentricity=0.0, equilibrium_temperature=1256.0, insolation_flux=413.89)], + 'q14': [Planet(planet_name='Kepler-452 b', host_name='Kepler-452', discovery_method='Transit', discovery_year=2015, controversial_flag=True, orbital_period=384.843, planet_radius=1.63, planet_mass=3.29, semi_major_radius=1.046, eccentricity=0.0, equilibrium_temperature=265.0, insolation_flux=1.1), + Planet(planet_name='Kepler-747 b', host_name='Kepler-747', discovery_method='Transit', discovery_year=2016, controversial_flag=True, orbital_period=35.61760587, planet_radius=5.27, planet_mass=24.1, semi_major_radius=0.1916, eccentricity=0.0, equilibrium_temperature=456.0, insolation_flux=10.19), + Planet(planet_name='V830 Tau b', host_name='V830 Tau', discovery_method='Radial Velocity', discovery_year=2016, controversial_flag=True, orbital_period=4.927, planet_radius=14.0, planet_mass=222.481, semi_major_radius=0.057, eccentricity=0.0, equilibrium_temperature=None, insolation_flux=None), + Planet(planet_name='nu Oct A b', host_name='nu Oct A', discovery_method='Radial Velocity', discovery_year=2016, controversial_flag=True, orbital_period=417.0, planet_radius=13.3, planet_mass=762.78818, semi_major_radius=1.25, eccentricity=0.11, equilibrium_temperature=None, insolation_flux=None)], + 'q15': [Planet(planet_name='Wolf 1061 b', host_name='Wolf 1061', discovery_method='Radial Velocity', discovery_year=2015, controversial_flag=False, orbital_period=4.8869, planet_radius=1.21, planet_mass=1.91, semi_major_radius=0.0375, eccentricity=0.15, equilibrium_temperature=None, insolation_flux=7.34), + Planet(planet_name='Wolf 1061 c', host_name='Wolf 1061', discovery_method='Radial Velocity', discovery_year=2015, 
controversial_flag=False, orbital_period=17.8719, planet_radius=1.66, planet_mass=3.41, semi_major_radius=0.089, eccentricity=0.11, equilibrium_temperature=None, insolation_flux=1.3), + Planet(planet_name='YZ Cet b', host_name='YZ Cet', discovery_method='Radial Velocity', discovery_year=2017, controversial_flag=False, orbital_period=2.02087, planet_radius=0.913, planet_mass=0.7, semi_major_radius=0.01634, eccentricity=0.06, equilibrium_temperature=471.0, insolation_flux=8.21), + Planet(planet_name='ups And b', host_name='ups And', discovery_method='Radial Velocity', discovery_year=1996, controversial_flag=False, orbital_period=4.617033, planet_radius=14.0, planet_mass=218.531, semi_major_radius=0.059222, eccentricity=0.0215, equilibrium_temperature=None, insolation_flux=None), + Planet(planet_name='ups And d', host_name='ups And', discovery_method='Radial Velocity', discovery_year=1999, controversial_flag=False, orbital_period=1276.46, planet_radius=12.5, planet_mass=3257.74117, semi_major_radius=2.51329, eccentricity=0.2987, equilibrium_temperature=None, insolation_flux=None)], + 'q16': [Planet(planet_name='TOI-712 d', host_name='TOI-712', discovery_method='Transit', discovery_year=2022, controversial_flag=False, orbital_period=84.8396, planet_radius=2.474, planet_mass=6.68, semi_major_radius=0.3405, eccentricity=0.073, equilibrium_temperature=314.0, insolation_flux=1.6), + Planet(planet_name='Wolf 1061 b', host_name='Wolf 1061', discovery_method='Radial Velocity', discovery_year=2015, controversial_flag=False, orbital_period=4.8869, planet_radius=1.21, planet_mass=1.91, semi_major_radius=0.0375, eccentricity=0.15, equilibrium_temperature=None, insolation_flux=7.34), + Planet(planet_name='Wolf 1061 c', host_name='Wolf 1061', discovery_method='Radial Velocity', discovery_year=2015, controversial_flag=False, orbital_period=17.8719, planet_radius=1.66, planet_mass=3.41, semi_major_radius=0.089, eccentricity=0.11, equilibrium_temperature=None, insolation_flux=1.3), + Planet(planet_name='YZ Cet b', host_name='YZ Cet', discovery_method='Radial Velocity', discovery_year=2017, controversial_flag=False, orbital_period=2.02087, planet_radius=0.913, planet_mass=0.7, semi_major_radius=0.01634, eccentricity=0.06, equilibrium_temperature=471.0, insolation_flux=8.21), + Planet(planet_name='ups And b', host_name='ups And', discovery_method='Radial Velocity', discovery_year=1996, controversial_flag=False, orbital_period=4.617033, planet_radius=14.0, planet_mass=218.531, semi_major_radius=0.059222, eccentricity=0.0215, equilibrium_temperature=None, insolation_flux=None)], + 'q17': 256, + 'q18': Star(spectral_type='K8V', stellar_effective_temperature=5144.0, stellar_radius=0.79, stellar_mass=0.82, stellar_luminosity=-0.401, stellar_surface_gravity=4.55, stellar_age=7.48), + 'q19': 12.888118811881188, + 'q20': [Planet(planet_name='Kepler-1663 b', host_name='Kepler-1663', discovery_method='Transit', discovery_year=2020, controversial_flag=False, orbital_period=17.6046, planet_radius=3.304, planet_mass=10.9, semi_major_radius=0.1072, eccentricity=0.0, equilibrium_temperature=362.0, insolation_flux=4.07)]} + return expected_json + + +def get_special_json(): + """get_special_json() returns a dict mapping each question to the expected + answer stored in a special format as a list of tuples. Each tuple contains + the element expected in the list, and its corresponding value. 
Any two + elements with the same value can appear in any order in the actual list, + but if two elements have different values, then they must appear in the + same order as in the expected list of tuples.""" + special_json = {} + return special_json + + +def compare(expected, actual, q_format=TEXT_FORMAT): + """compare(expected, actual) is used to compare when the format of + the expected answer is known for certain.""" + try: + if q_format == TEXT_FORMAT: + return simple_compare(expected, actual) + elif q_format == TEXT_FORMAT_UNORDERED_LIST: + return list_compare_unordered(expected, actual) + elif q_format == TEXT_FORMAT_ORDERED_LIST: + return list_compare_ordered(expected, actual) + elif q_format == TEXT_FORMAT_DICT: + return dict_compare(expected, actual) + elif q_format == TEXT_FORMAT_SPECIAL_ORDERED_LIST: + return list_compare_special(expected, actual) + elif q_format == TEXT_FORMAT_NAMEDTUPLE: + return namedtuple_compare(expected, actual) + elif q_format == PNG_FORMAT_SCATTER: + return compare_flip_dicts(expected, actual) + elif q_format == HTML_FORMAT: + return compare_cell_html(expected, actual) + elif q_format == FILE_JSON_FORMAT: + return compare_json(expected, actual) + else: + if expected != actual: + return "expected %s but found %s " % (repr(expected), repr(actual)) + except: + if expected != actual: + return "expected %s" % (repr(expected)) + return PASS + + +def print_message(expected, actual, complete_msg=True): + """print_message(expected, actual) displays a simple error message.""" + msg = "expected %s" % (repr(expected)) + if complete_msg: + msg = msg + " but found %s" % (repr(actual)) + return msg + + +def simple_compare(expected, actual, complete_msg=True): + """simple_compare(expected, actual) is used to compare when the expected answer + is a type/Nones/str/int/float/bool. When the expected answer is a float, + the actual answer is allowed to be within the tolerance limit. 
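As the docstring above notes, float answers only need to fall within the module-level tolerances; `simple_compare` enforces this with `math.isclose`. A small worked example using the `REL_TOL`/`ABS_TOL` values defined at the top of this file and the expected `q7` answer listed earlier:

```python
import math

REL_TOL = 6e-04    # relative tolerance, as defined above
ABS_TOL = 15e-03   # absolute tolerance, as defined above

expected = 0.01624010554089703   # expected q7 answer from get_expected_json()
actual = 0.0162                  # a slightly different, but acceptable, answer
print(math.isclose(actual, expected, rel_tol=REL_TOL, abs_tol=ABS_TOL))   # True
```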
Otherwise, + the values must match exactly, or a very simple error message is displayed.""" + msg = PASS + if 'numpy' in repr(type((actual))): + actual = actual.item() + if isinstance(expected, type): + if expected != actual: + if isinstance(actual, type): + msg = "expected %s but found %s" % (expected.__name__, actual.__name__) + else: + msg = "expected %s but found %s" % (expected.__name__, repr(actual)) + elif not isinstance(actual, type(expected)): + if not (isinstance(expected, (float, int)) and isinstance(actual, (float, int))): + if not is_namedtuple(expected): + msg = "expected to find type %s but found type %s" % (type(expected).__name__, type(actual).__name__) + elif isinstance(expected, float): + if not math.isclose(actual, expected, rel_tol=REL_TOL, abs_tol=ABS_TOL): + msg = print_message(expected, actual, complete_msg) + elif isinstance(expected, (list, tuple)) or is_namedtuple(expected): + new_msg = print_message(expected, actual, complete_msg) + if len(expected) != len(actual): + return new_msg + for i in range(len(expected)): + val = simple_compare(expected[i], actual[i]) + if val != PASS: + return new_msg + elif isinstance(expected, dict): + new_msg = print_message(expected, actual, complete_msg) + if len(expected) != len(actual): + return new_msg + val = simple_compare(sorted(list(expected.keys())), sorted(list(actual.keys()))) + if val != PASS: + return new_msg + for key in expected: + val = simple_compare(expected[key], actual[key]) + if val != PASS: + return new_msg + else: + if expected != actual: + msg = print_message(expected, actual, complete_msg) + return msg + + +def intelligent_compare(expected, actual, obj=None): + """intelligent_compare(expected, actual) is used to compare when the + data type of the expected answer is not known for certain, and default + assumptions need to be made.""" + if obj == None: + obj = type(expected).__name__ + if is_namedtuple(expected): + msg = namedtuple_compare(expected, actual) + elif isinstance(expected, (list, tuple)): + msg = list_compare_ordered(expected, actual, obj) + elif isinstance(expected, set): + msg = list_compare_unordered(expected, actual, obj) + elif isinstance(expected, (dict)): + msg = dict_compare(expected, actual) + else: + msg = simple_compare(expected, actual) + msg = msg.replace("CompDict", "dict").replace("CompSet", "set").replace("NewNone", "None") + return msg + + +def is_namedtuple(obj, init_check=True): + """is_namedtuple(obj) returns True if `obj` is a namedtuple object + defined in the test file.""" + bases = type(obj).__bases__ + if len(bases) != 1 or bases[0] != tuple: + return False + fields = getattr(type(obj), '_fields', None) + if not isinstance(fields, tuple): + return False + if init_check and not type(obj).__name__ in [nt.__name__ for nt in _expected_namedtuples]: + return False + return True + + +def list_compare_ordered(expected, actual, obj=None): + """list_compare_ordered(expected, actual) is used to compare when the + expected answer is a list/tuple, where the order of the elements matters.""" + msg = PASS + if not isinstance(actual, type(expected)): + msg = "expected to find type %s but found type %s" % (type(expected).__name__, type(actual).__name__) + return msg + if obj == None: + obj = type(expected).__name__ + for i in range(len(expected)): + if i >= len(actual): + msg = "at index %d of the %s, expected missing %s" % (i, obj, repr(expected[i])) + break + val = intelligent_compare(expected[i], actual[i], "sub" + obj) + if val != PASS: + msg = "at index %d of the %s, " % (i, obj) + 
val + break + if len(actual) > len(expected) and msg == PASS: + msg = "at index %d of the %s, found unexpected %s" % (len(expected), obj, repr(actual[len(expected)])) + if len(expected) != len(actual): + msg = msg + " (found %d entries in %s, but expected %d)" % (len(actual), obj, len(expected)) + + if len(expected) > 0: + try: + if msg != PASS and list_compare_unordered(expected, actual, obj) == PASS: + msg = msg + " (%s may not be ordered as required)" % (obj) + except: + pass + return msg + + +def list_compare_helper(larger, smaller): + """list_compare_helper(larger, smaller) is a helper function which takes in + two lists of possibly unequal sizes and finds the item that is not present + in the smaller list, if there is such an element.""" + msg = PASS + j = 0 + for i in range(len(larger)): + if i == len(smaller): + msg = "expected %s" % (repr(larger[i])) + break + found = False + while not found: + if j == len(smaller): + val = simple_compare(larger[i], smaller[j - 1], complete_msg=False) + break + val = simple_compare(larger[i], smaller[j], complete_msg=False) + j += 1 + if val == PASS: + found = True + break + if not found: + msg = val + break + return msg + +class NewNone(): + """alternate class in place of None, which allows for comparison with + all other data types.""" + def __str__(self): + return 'None' + def __repr__(self): + return 'None' + def __lt__(self, other): + return True + def __le__(self, other): + return True + def __gt__(self, other): + return False + def __ge__(self, other): + return other == None + def __eq__(self, other): + return other == None + def __ne__(self, other): + return other != None + +class CompDict(dict): + """subclass of dict, which allows for comparison with other dicts.""" + def __init__(self, vals): + super(self.__class__, self).__init__(vals) + if type(vals) == CompDict: + self.val = vals.val + elif isinstance(vals, dict): + self.val = self.get_equiv(vals) + else: + raise TypeError("'%s' object cannot be type casted to CompDict class" % type(vals).__name__) + + def get_equiv(self, vals): + val = [] + for key in sorted(list(vals.keys())): + val.append((key, vals[key])) + return val + + def __str__(self): + return str(dict(self.val)) + def __repr__(self): + return repr(dict(self.val)) + def __lt__(self, other): + return self.val < CompDict(other).val + def __le__(self, other): + return self.val <= CompDict(other).val + def __gt__(self, other): + return self.val > CompDict(other).val + def __ge__(self, other): + return self.val >= CompDict(other).val + def __eq__(self, other): + return self.val == CompDict(other).val + def __ne__(self, other): + return self.val != CompDict(other).val + +class CompSet(set): + """subclass of set, which allows for comparison with other sets.""" + def __init__(self, vals): + super(self.__class__, self).__init__(vals) + if type(vals) == CompSet: + self.val = vals.val + elif isinstance(vals, set): + self.val = self.get_equiv(vals) + else: + raise TypeError("'%s' object cannot be type casted to CompSet class" % type(vals).__name__) + + def get_equiv(self, vals): + return sorted(list(vals)) + + def __str__(self): + return str(set(self.val)) + def __repr__(self): + return repr(set(self.val)) + def __getitem__(self, index): + return self.val[index] + def __lt__(self, other): + return self.val < CompSet(other).val + def __le__(self, other): + return self.val <= CompSet(other).val + def __gt__(self, other): + return self.val > CompSet(other).val + def __ge__(self, other): + return self.val >= CompSet(other).val + def 
__eq__(self, other): + return self.val == CompSet(other).val + def __ne__(self, other): + return self.val != CompSet(other).val + +def make_sortable(item): + """make_sortable(item) replaces all Nones in `item` with an alternate + class that allows for comparison with str/int/float/bool/list/set/tuple/dict. + It also replaces all dicts (and sets) with a subclass that allows for + comparison with other dicts (and sets).""" + if item == None: + return NewNone() + elif isinstance(item, (type, str, int, float, bool)): + return item + elif isinstance(item, (list, set, tuple)): + new_item = [] + for subitem in item: + new_item.append(make_sortable(subitem)) + if is_namedtuple(item): + return type(item)(*new_item) + elif isinstance(item, set): + return CompSet(new_item) + else: + return type(item)(new_item) + elif isinstance(item, dict): + new_item = {} + for key in item: + new_item[key] = make_sortable(item[key]) + return CompDict(new_item) + return item + +def list_compare_unordered(expected, actual, obj=None): + """list_compare_unordered(expected, actual) is used to compare when the + expected answer is a list/set where the order of the elements does not matter.""" + msg = PASS + if not isinstance(actual, type(expected)): + msg = "expected to find type %s but found type %s" % (type(expected).__name__, type(actual).__name__) + return msg + if obj == None: + obj = type(expected).__name__ + + try: + sort_expected = sorted(make_sortable(expected)) + sort_actual = sorted(make_sortable(actual)) + except: + return "unexpected datatype found in %s; expected entries of type %s" % (obj, type(expected[0]).__name__) + + if len(actual) == 0 and len(expected) > 0: + msg = "in the %s, missing " % (obj) + sort_expected[0] + elif len(actual) > 0 and len(expected) > 0: + val = intelligent_compare(sort_expected[0], sort_actual[0]) + if val.startswith("expected to find type"): + msg = "in the %s, " % (obj) + simple_compare(sort_expected[0], sort_actual[0]) + else: + if len(expected) > len(actual): + msg = "in the %s, missing " % (obj) + list_compare_helper(sort_expected, sort_actual) + elif len(expected) < len(actual): + msg = "in the %s, found un" % (obj) + list_compare_helper(sort_actual, sort_expected) + if len(expected) != len(actual): + msg = msg + " (found %d entries in %s, but expected %d)" % (len(actual), obj, len(expected)) + return msg + else: + val = list_compare_helper(sort_expected, sort_actual) + if val != PASS: + msg = "in the %s, missing " % (obj) + val + ", but found un" + list_compare_helper(sort_actual, + sort_expected) + return msg + + +def namedtuple_compare(expected, actual): + """namedtuple_compare(expected, actual) is used to compare when the + expected answer is a namedtuple defined in the test file.""" + msg = PASS + if not is_namedtuple(actual, False): + msg = "expected namedtuple but found %s" % (type(actual).__name__) + return msg + if type(expected).__name__ != type(actual).__name__: + return "expected namedtuple %s but found namedtuple %s" % (type(expected).__name__, type(actual).__name__) + expected_fields = expected._fields + actual_fields = actual._fields + msg = list_compare_ordered(list(expected_fields), list(actual_fields), "namedtuple attributes") + if msg != PASS: + return msg + for field in expected_fields: + val = intelligent_compare(getattr(expected, field), getattr(actual, field)) + if val != PASS: + msg = "at attribute %s of namedtuple %s, " % (field, type(expected).__name__) + val + return msg + return msg + + +def clean_slashes(item): + """clean_slashes(item) recursively replaces the slashes in all strings inside `item` with the OS-specific path separator (os.path.sep).""" + if 
isinstance(item, str): + return item.replace("\\", "/").replace("/", os.path.sep) + elif item == None or isinstance(item, (type, int, float, bool)): + return item + elif isinstance(item, (list, tuple, set)) or is_namedtuple(item): + new_item = [] + for subitem in item: + new_item.append(clean_slashes(subitem)) + if is_namedtuple(item): + return type(item)(*new_item) + else: + return type(item)(new_item) + elif isinstance(item, dict): + new_item = {} + for key in item: + new_item[clean_slashes(key)] = clean_slashes(item[key]) + return new_item + + +def list_compare_special_initialize(special_expected): + """list_compare_special_initialize(special_expected) takes in the special + ordering stored as a sorted list of items, and returns a list of lists + where the ordering among the inner lists does not matter.""" + latest_val = None + clean_special = [] + for row in special_expected: + if latest_val == None or row[1] != latest_val: + clean_special.append([]) + latest_val = row[1] + clean_special[-1].append(row[0]) + return clean_special + + +def list_compare_special(special_expected, actual): + """list_compare_special(special_expected, actual) is used to compare when the + expected answer is a list with special ordering defined in `special_expected`.""" + msg = PASS + expected_list = [] + special_order = list_compare_special_initialize(special_expected) + for expected_item in special_order: + expected_list.extend(expected_item) + val = list_compare_unordered(expected_list, actual) + if val != PASS: + return val + i = 0 + for expected_item in special_order: + j = len(expected_item) + actual_item = actual[i: i + j] + val = list_compare_unordered(expected_item, actual_item) + if val != PASS: + if j == 1: + msg = "at index %d " % (i) + val + else: + msg = "between indices %d and %d " % (i, i + j - 1) + val + msg = msg + " (list may not be ordered as required)" + break + i += j + return msg + + +def dict_compare(expected, actual, obj=None): + """dict_compare(expected, actual) is used to compare when the expected answer + is a dict.""" + msg = PASS + if not isinstance(actual, type(expected)): + msg = "expected to find type %s but found type %s" % (type(expected).__name__, type(actual).__name__) + return msg + if obj == None: + obj = type(expected).__name__ + + expected_keys = list(expected.keys()) + actual_keys = list(actual.keys()) + val = list_compare_unordered(expected_keys, actual_keys, obj) + + if val != PASS: + msg = "bad keys in %s: " % (obj) + val + if msg == PASS: + for key in expected: + new_obj = None + if isinstance(expected[key], (list, tuple, set)): + new_obj = 'value' + elif isinstance(expected[key], dict): + new_obj = 'sub' + obj + val = intelligent_compare(expected[key], actual[key], new_obj) + if val != PASS: + msg = "incorrect value for key %s in %s: " % (repr(key), obj) + val + return msg + + +def is_flippable(item): + """is_flippable(item) determines if the given dict of lists has lists of the + same length and is therefore flippable.""" + item_lens = set(([str(len(item[key])) for key in item])) + if len(item_lens) == 1: + return PASS + else: + return "found lists of lengths %s" % (", ".join(list(item_lens))) + +def flip_dict_of_lists(item): + """flip_dict_of_lists(item) flips a dict of lists into a list of dicts if the + lists are of same length.""" + new_item = [] + length = len(list(item.values())[0]) + for i in range(length): + new_dict = {} + for key in item: + new_dict[key] = item[key][i] + new_item.append(new_dict) + return new_item + +def compare_flip_dicts(expected, actual, 
obj="lists"): + """compare_flip_dicts(expected, actual) flips a dict of lists (or dicts) into + a list of dicts (or dict of dicts) and then compares the list ignoring order.""" + msg = PASS + example_item = list(expected.values())[0] + if isinstance(example_item, (list, tuple)): + val = is_flippable(actual) + if val != PASS: + msg = "expected to find lists of length %d, but " % (len(example_item)) + val + return msg + msg = list_compare_unordered(flip_dict_of_lists(expected), flip_dict_of_lists(actual), "lists") + elif isinstance(example_item, dict): + expected_keys = list(example_item.keys()) + for key in actual: + val = list_compare_unordered(expected_keys, list(actual[key].keys()), "dictionary %s" % key) + if val != PASS: + return val + for cat_key in expected_keys: + expected_category = {} + actual_category = {} + for key in expected: + expected_category[key] = expected[key][cat_key] + actual_category[key] = actual[key][cat_key] + val = list_compare_unordered(flip_dict_of_lists(expected), flip_dict_of_lists(actual), "category " + repr(cat_key)) + if val != PASS: + return val + return msg + + +def get_expected_tables(): + """get_expected_tables() reads the html file with the expected DataFrames + and returns a dict mapping each question to a html table.""" + if not os.path.exists(DF_FILE): + return None + + expected_tables = {} + f = open(DF_FILE, encoding='utf-8') + soup = BeautifulSoup(f.read(), 'html.parser') + f.close() + + tables = soup.find_all('table') + for table in tables: + expected_tables[table.get("data-question")] = table + + return expected_tables + +def parse_df_html_table(table): + """parse_df_html_table(table) takes in a table as a html string and returns + a dict mapping each row and column index to the value at that position.""" + rows = [] + for tr in table.find_all('tr'): + rows.append([]) + for cell in tr.find_all(['td', 'th']): + rows[-1].append(cell.get_text().strip("\n ")) + + cells = {} + for r in range(1, len(rows)): + for c in range(1, len(rows[0])): + rname = rows[r][0] + cname = rows[0][c] + cells[(rname,cname)] = rows[r][c] + return cells + + +def get_expected_namedtuples(): + """get_expected_namedtuples() defines the required namedtuple objects + globally. 
It also returns a tuple of the classes.""" + expected_namedtuples = [] + + global Star + star_attributes = ['spectral_type', 'stellar_effective_temperature', 'stellar_radius', 'stellar_mass', 'stellar_luminosity', 'stellar_surface_gravity', 'stellar_age'] + Star = namedtuple('Star', star_attributes) + expected_namedtuples.append(Star) + global Planet + planets_attributes = ['planet_name', 'host_name', 'discovery_method', 'discovery_year', 'controversial_flag', 'orbital_period', 'planet_radius', 'planet_mass', 'semi_major_radius', 'eccentricity', 'equilibrium_temperature', 'insolation_flux'] + Planet = namedtuple('Planet', planets_attributes) + expected_namedtuples.append(Planet) + return tuple(expected_namedtuples) + +_expected_namedtuples = get_expected_namedtuples() + + +def compare_cell_html(expected, actual): + """compare_cell_html(expected, actual) is used to compare when the + expected answer is a DataFrame stored in the `expected_dfs` html file.""" + expected_cells = parse_df_html_table(expected) + try: + actual_cells = parse_df_html_table(BeautifulSoup(actual, 'html.parser').find('table')) + except Exception as e: + return "expected to find type DataFrame but found type %s instead" % type(actual).__name__ + + expected_cols = list(set(["column %s" % (loc[1]) for loc in expected_cells])) + actual_cols = list(set(["column %s" % (loc[1]) for loc in actual_cells])) + msg = list_compare_unordered(expected_cols, actual_cols, "DataFrame") + if msg != PASS: + return msg + + expected_rows = list(set(["row index %s" % (loc[0]) for loc in expected_cells])) + actual_rows = list(set(["row index %s" % (loc[0]) for loc in actual_cells])) + msg = list_compare_unordered(expected_rows, actual_rows, "DataFrame") + if msg != PASS: + return msg + + for location, expected in expected_cells.items(): + location_name = "column {} at index {}".format(location[1], location[0]) + actual = actual_cells.get(location, None) + if actual == None: + return "in %s, expected to find %s" % (location_name, repr(expected)) + try: + actual_ans = float(actual) + expected_ans = float(expected) + if math.isnan(actual_ans) and math.isnan(expected_ans): + continue + except Exception as e: + actual_ans, expected_ans = actual, expected + msg = simple_compare(expected_ans, actual_ans) + if msg != PASS: + return "in %s, " % location_name + msg + return PASS + + +def get_expected_plots(): + """get_expected_plots() reads the json file with the expected plot data + and returns a dict mapping each question to a dictionary with the plots data.""" + if not os.path.exists(PLOT_FILE): + return None + + f = open(PLOT_FILE, encoding='utf-8') + expected_plots = json.load(f) + f.close() + return expected_plots + + +def compare_file_json(expected, actual): + """compare_file_json(expected, actual) is used to compare when the + expected answer is a JSON file.""" + msg = PASS + if not os.path.isfile(expected): + return "file %s not found; make sure it is downloaded and stored in the correct directory" % (expected) + elif not os.path.isfile(actual): + return "file %s not found; make sure that you have created the file with the correct name" % (actual) + try: + e = open(expected, encoding='utf-8') + expected_data = json.load(e) + e.close() + except json.JSONDecodeError: + return "file %s is broken and cannot be parsed; please delete and redownload the file correctly" % (expected) + try: + a = open(actual, encoding='utf-8') + actual_data = json.load(a) + a.close() + except json.JSONDecodeError: + return "file %s is broken and cannot be parsed" % 
(actual) + if type(expected_data) == list: + msg = list_compare_ordered(expected_data, actual_data, 'file ' + actual) + elif type(expected_data) == dict: + msg = dict_compare(expected_data, actual_data) + return msg + + +_expected_json = get_expected_json() +_special_json = get_special_json() +_expected_plots = get_expected_plots() +_expected_tables = get_expected_tables() +_expected_format = get_expected_format() + +def check(qnum, actual): + """check(qnum, actual) is used to check if the answer in the notebook is + the correct answer, and provide useful feedback if the answer is incorrect.""" + msg = PASS + error_msg = "<b style='color: red;'>ERROR:</b> " + q_format = _expected_format[qnum] + + if q_format == TEXT_FORMAT_SPECIAL_ORDERED_LIST: + expected = _special_json[qnum] + elif q_format == PNG_FORMAT_SCATTER: + if _expected_plots == None: + msg = error_msg + "file %s not parsed; make sure it is downloaded and stored in the correct directory" % (PLOT_FILE) + else: + expected = _expected_plots[qnum] + elif q_format == HTML_FORMAT: + if _expected_tables == None: + msg = error_msg + "file %s not parsed; make sure it is downloaded and stored in the correct directory" % (DF_FILE) + else: + expected = _expected_tables[qnum] + else: + expected = _expected_json[qnum] + + if SLASHES in q_format: + q_format = q_format.replace(SLASHES, "").strip("_ ") + expected = clean_slashes(expected) + actual = clean_slashes(actual) + + if msg != PASS: + print(msg) + else: + msg = compare(expected, actual, q_format) + if msg != PASS: + msg = error_msg + msg + print(msg) + + +def check_file_size(path): + """check_file_size(path) throws an error if the file is too big to display + on Gradescope.""" + size = os.path.getsize(path) + assert size < MAX_FILE_SIZE * 10**3, "Your file is too big to be displayed by Gradescope; please delete unnecessary output cells so your file size is < %s KB" % MAX_FILE_SIZE + + +def reset_hidden_tests(): + """reset_hidden_tests() resets all hidden tests on the Gradescope autograder where the hidden test file exists""" + if not os.path.exists(HIDDEN_FILE): + return + hidn.reset_hidden_tests() + +def rubric_check(rubric_point, ignore_past_errors=True): + """rubric_check(rubric_point) uses the hidden test file on the Gradescope autograder to grade the `rubric_point`""" + if not os.path.exists(HIDDEN_FILE): + print(PASS) + return + error_msg_1 = "ERROR: " + error_msg_2 = "TEST DETAILS: " + try: + msg = hidn.rubric_check(rubric_point, ignore_past_errors) + except: + msg = "hidden tests crashed before execution" + if msg != PASS: + hidn.make_deductions(rubric_point) + if msg == "public tests failed": + comment = "The public tests have failed, so you will not receive any points for this question." + comment += "\nPlease confirm that the public tests pass locally before submitting." + elif msg == "answer is hardcoded": + comment = "In the datasets for testing hardcoding, all numbers are replaced with random values." + comment += "\nIf the answer is the same as in the original dataset for all these datasets" + comment += "\ndespite this, that implies that the answer in the notebook is hardcoded." + comment += "\nYou will not receive any points for this question." 
+ else: + comment = hidn.get_comment(rubric_point) + msg = error_msg_1 + msg + if comment != "": + msg = msg + "\n" + error_msg_2 + comment + print(msg) + +def get_summary(): + """get_summary() returns the summary of the notebook using the hidden test file on the Gradescope autograder""" + if not os.path.exists(HIDDEN_FILE): + print("Total Score: %d/%d" % (TOTAL_SCORE, TOTAL_SCORE)) + return + score = min(TOTAL_SCORE, hidn.get_score(TOTAL_SCORE)) + display_msg = "Total Score: %d/%d" % (score, TOTAL_SCORE) + if score != TOTAL_SCORE: + display_msg += "\n" + hidn.get_deduction_string() + print(display_msg) + +def get_score_digit(digit): + """get_score_digit(digit) returns the `digit` of the score using the hidden test file on the Gradescope autograder""" + if not os.path.exists(HIDDEN_FILE): + score = TOTAL_SCORE + else: + score = hidn.get_score(TOTAL_SCORE) + digits = bin(score)[2:] + digits = "0"*(7 - len(digits)) + digits + return int(digits[6 - digit]) diff --git a/p10/rubric.md b/p10/rubric.md new file mode 100644 index 0000000000000000000000000000000000000000..106b674f2e2a9d7541589b5b75f516f59857e353 --- /dev/null +++ b/p10/rubric.md @@ -0,0 +1,151 @@ +# Project 10 (P10) grading rubric + +## Code reviews + +- The Gradescope autograder will make deductions based on the rubric provided below. +- To ensure that you don't lose any points, you must review the rubric and make sure that you have followed the instructions provided in the project correctly. + +## Rubric + +### General guidelines: + +- Outputs not visible/did not save the notebook file prior to running the cell containing "export". We cannot see your output if you do not save before generating the zip file. (-3) +- Used concepts/modules such as `csv.DictReader` and `pandas` not covered in class yet. Note that built-in functions that you have been introduced to can be used. (-3) +- Used bare try/except blocks without explicitly specifying the type of exceptions that need to be caught (-3) +- Large outputs such as `stars_dict` or `planets_list` are displayed in the notebook. (-3) +- Import statements are not mentioned in the required cell at the top of the notebook. 
(-1) + +### Question specific guidelines: + +- q1 (3) + - answer is not sorted explicitly (-2) + - answer does not remove all files and directories that start with `.` (-1) + +- q2 (3) + - recomputed variable defined in Question 1, or the answer is not sorted explicitly (-1) + - answer does not remove all files and directories that start with `.` (-1) + - paths are hardcoded using slashes (-1) + +- q3 (4) + - recomputed variable defined in Question 1 or Question 2, or the answer is not sorted explicitly (-1) + - answer does not remove all files and directories that start with `.` (-1) + - answer does not check only for files that end with `.csv` (-1) + - paths are hardcoded using slashes (-1) + +- q4 (4) + - recomputed variable defined in Question 1 or Question 2, or the answer is not sorted explicitly (-1) + - answer does not remove all files and directories that start with `.` (-1) + - answer does not check for only files that start with `stars` (-1) + - paths are hardcoded using slashes (-1) + +- `Star` (2) + - data structure is defined more than once (-1) + - data structure is defined incorrectly (-1) + +- `star_cell` (4) + - function does not typecast values based on columns (-1) + - column indices are hardcoded instead of using column names (-1) + - function logic is incorrect (-1) + - function is defined more than once (-1) + +- q5 (4) + - `star_cell` function is not used to answer (-1) + - answer unnecessarily iterates over the entire dataset (-1) + +- `get_stars` (6) + - function logic is incorrect (-2) + - hardcoded the name of directory inside the function instead of passing it as a part of the input argument (-1) + - function is called more than twice with the same dataset (-1) + - `star_cell` function is not used (-1) + - function is defined more than once (-1) + +- q6 (2) + - `stars_1_dict` data structure is not used to answer (-2) + +- q7 (3) + - incorrect logic is used to answer (-1) + - `stars_1_dict` data structure is not used to answer (-1) + +- q8 (3) + - incorrect logic is used to answer (-1) + - `get_stars` function is not used to answer (-1) + +- `stars_dict` (3) + - data structure is defined incorrectly (-1) + - `get_stars` function is not used (-1) + - `stars_paths` is not used to find paths of necessary files (-1) + +- q9 (2) + - `stars_dict` data structure is not used to answer (-2) + +- q10 (3) + - incorrect logic is used to answer (-1) + - `stars_dict` data structure is not used to answer (-1) + +- q11 (3) + - answer does not check for only stars that start with `Kepler` (-1) + - incorrect logic is used to answer (-1) + - `stars_dict` data structure is not used to answer (-1) + +- `Planet` (2) + - data structure is defined more than once (-1) + - data structure is defined incorrectly (-1) + +- `planet_cell` (5) + - function does not typecast values based on columns (-1) + - column indices are hardcoded instead of using column names (-1) + - boolean values are not typecasted correctly (-1) + - function logic is incorrect (-1) + - function is defined more than once (-1) + +- q12 (4) + - `planet_cell` function is not used to answer (-1) + - `mapping_1_json` data structure is not used to answer (-1) + - answer unnecessarily iterates over the entire dataset (-1) + +- `get_planets` (8) + - function logic is incorrect (-2) + - hardcoded the name of directory inside the function instead of passing it as a part of the input argument (-1) + - function is called more than twice with the same dataset (-1) + - `planet_cell` function is not used (-1) + - function is defined 
more than once (-3) + +- q13 (3) + - `get_planets` function is not used to answer (-1) + - paths are hardcoded using slashes (-1) + +- q14 (3) + - incorrect logic is used to answer (-1) + - `get_planets` function is not used to answer (-1) + - paths are hardcoded using slashes (-1) + +- q15 (3) + - `get_planets` function is not used to answer (-1) + - paths are hardcoded using slashes (-1) + +- `planets_list` (4) + - data structure is defined incorrectly (-2) + - `get_planets` function is not used (-1) + - paths are hardcoded using slashes (-1) + +- q16 (2) + - `planets_list` data structure is not used to answer (-2) + +- q17 (3) + - incorrect comparison operator is used (-1) + - incorrect logic is used to answer (-1) + - `planets_list` data structure is not used to answer (-1) + +- q18 (4) + - `planets_list` and `stars_dict` data structures are not used to answer (-2) + - did not exit loop and instead iterated further after finding the answer (-1) + +- q19 (5) + - incorrect comparison operator is used (-1) + - incorrect logic is used to answer (-1) + - `planets_list` and `stars_dict` data structures are not used to answer (-2) + +- q20 (5) + - answer does not include all Planets that orbit the Star (-1) + - incorrect logic is used to answer (-1) + - `planets_list` and `stars_dict` data structures are not used to answer (-2)
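+
+### Illustrative sketch (not the required solution):
+
+To clarify the general guidelines above about bare `try`/`except` blocks and hardcoded column indices, here is a minimal sketch of the style the rubric expects. The helper name `process_star_cell` and the column name `'Spectral Type'` are hypothetical and only illustrate catching a specific exception type and looking up cells by column name; the `Star` attribute names match the namedtuple defined in `public_tests.py`.
+
+```python
+from collections import namedtuple
+
+# attribute names taken from the Star namedtuple in public_tests.py
+star_attributes = ['spectral_type', 'stellar_effective_temperature', 'stellar_radius',
+                   'stellar_mass', 'stellar_luminosity', 'stellar_surface_gravity', 'stellar_age']
+Star = namedtuple('Star', star_attributes)
+
+def process_star_cell(row, header, col_name):
+    """Hypothetical helper: look up a cell by column *name* (not a hardcoded index)
+    and typecast it, catching only the specific exception raised by malformed data."""
+    val = row[header.index(col_name)]
+    if val == '':
+        return None
+    if col_name == 'Spectral Type':   # text column stays a string
+        return str(val)
+    try:
+        return float(val)             # numeric columns are typecast
+    except ValueError:                # specific exception, never a bare `except:`
+        return None
+```
+
+How malformed values should actually be handled in P10 is specified in the project instructions; this sketch only shows the coding style that avoids the deductions listed under the general guidelines.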