{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\\rightarrow$Run All).\n",
    "\n",
    "Make sure you fill in any place that says `YOUR CODE HERE` or \"YOUR ANSWER HERE\", as well as your name below.\n",
    "\n",
    "Rename this problem sheet as follows:\n",
    "\n",
    "    ps{number of lab}_{your user name}_problem{number of problem sheet in this lab}\n",
    "    \n",
    "for example\n",
    "    \n",
    "    ps2_blja_problem1\n",
    "\n",
    "Submit your homework within one week until next Monday, 9 a.m."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "NAME = \"\"\n",
    "EMAIL = \"\"\n",
    "USERNAME = \"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "5a2f5cf99bb357cbee40379d9332d423",
     "grade": false,
     "grade_id": "cell-e687ff80d998f2a2",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "# Introduction to Data Science\n",
    "## Lab 7: Logistic regression\n",
    "### Part B: Logistic regression in practice"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "255fa011d42dd0025eaef1adb3e2061b",
     "grade": false,
     "grade_id": "cell-f6e31ae2c0391934",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "In this lab, we want to investigate the `Default` data set known from the lecture.\n",
    "It contains the predictors\n",
    "- `student` status, either `'Yes'` or `'No'`\n",
    "- `balance`, i.e., monthly credit card balance\n",
    "- yearly `income`\n",
    "and the response\n",
    "- `default`, which is either `'Yes'` or `'No'`\n",
    "\n",
    "We first load the necessary modules.\n",
    "\n",
    "By the way, the command\n",
    "    \n",
    "    plt.rcParams['figure.figsize'] = [13, 5]\n",
    "    \n",
    "changes the default size of figures (in inches)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "2311e42a5d76699e2bf57003896ae6b9",
     "grade": false,
     "grade_id": "cell-8b73e58f90854ece",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "plt.rcParams['figure.figsize'] = [13, 5]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "7d1450918eaf52d4f70eb73958b1a15e",
     "grade": false,
     "grade_id": "cell-4242deff452c95aa",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "**Task**: Download the file `Default.csv` from the webpage.\n",
    "Read it using the `pandas` function `read_csv` and store the `pandas DataFrame` in the variable `D`.\n",
    "Make sure that:\n",
    "- the index column is recognized appropriately.\n",
    "- the column titles are correct"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "2544741ee27e14c1f78dec9b7438ff67",
     "grade": false,
     "grade_id": "cell-b1d3abed11dc76fb",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "b4597b968a6a4aa30cf6f360add620c8",
     "grade": true,
     "grade_id": "cell-d8859f83966b0cca",
     "locked": true,
     "points": 1,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "assert 'D' in locals()\n",
    "assert isinstance(D.index, pd.Int64Index)\n",
    "assert isinstance(D.columns, pd.Index)\n",
    "assert D.columns[1] == 'student'\n",
    "assert D.shape == (10000, 4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "0a49123b6ef5ba172e829646d4e05513",
     "grade": false,
     "grade_id": "cell-801cc9fa0e0cb812",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "**Task**: Inspect the data using the methods you've learned so far, e.g., `describe`, `hist`, `head`, etc."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "b332eb77bdb44d90e1bdd5fb4c00b17e",
     "grade": true,
     "grade_id": "cell-a3ce086647901439",
     "locked": false,
     "points": 1,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# Task: Apply here at least two different methods to inspect the data set D\n",
    "\n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "ada5249aaecfa59eb28e833a04df94fa",
     "grade": false,
     "grade_id": "cell-5dbbb956a2779a87",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "You should observe that the method `describe` only contains the predictors `balance` and `income`, but not  `default` and `student`.\n",
    "\n",
    "This is due to the fact that these values were read in by the `read_csv` function as **strings**.\n",
    "We know from the lecture that these predictors are categorical (in particular binary).\n",
    "\n",
    "In order to process these values we convert them to the data type `boolean`, i.e., we replace the `String` objects in the columns `default` and `student` by `Boolean`'s.\n",
    "There are a lot of ways to accomplish this task; the easiest might be\n",
    "\n",
    "    D.replace(to_replace='No', value=False, inplace=True)\n",
    "    \n",
    "**Task**: Replace every 'No' and 'Yes' in the `DataFrame` by the values `False` and `True`, resp."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "77c74175982a243cbbd3b3ce4bda6044",
     "grade": false,
     "grade_id": "cell-cc9a0df3e56d8dc7",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "67cf572e1f1706bf7255b97e1f0e29f4",
     "grade": true,
     "grade_id": "cell-e19e911aa04de5e9",
     "locked": true,
     "points": 1,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "assert D.student.dtype == 'bool'\n",
    "assert D.default.dtype == 'bool'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "a759d48acc9d54f65819e49eb805e197",
     "grade": false,
     "grade_id": "cell-f8af7eaffb575a5a",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "### Answer the following questions!\n",
    "\n",
    "Store your answers in the given variables"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "7d56d0d80b61fe8a536843a0a600fbee",
     "grade": false,
     "grade_id": "cell-a1372f6119e62431",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "**Question A**: How many students belong to the data set?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "e43c9192833d5f03f61b0f71d04bd820",
     "grade": false,
     "grade_id": "cell-45a0fd6410a6a3f1",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# answer_A = \n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "61cb004a7a2ee59500eea4dae89afd72",
     "grade": true,
     "grade_id": "cell-3aa830790df4955d",
     "locked": true,
     "points": 1,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "assert 'answer_A' in locals()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "14af312131a6fa3698295da7a100820e",
     "grade": false,
     "grade_id": "cell-890875141b50d740",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "**Question B**: What is the mean **balance** of all samples?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "3211e1574f8fb4e9dabe61dc16d86236",
     "grade": false,
     "grade_id": "cell-adbccfcb7c0c4c64",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# answer_B = \n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "2c67e6a1d57fc824d225af937ada6f38",
     "grade": true,
     "grade_id": "cell-8f8fa0dd0ba12a6b",
     "locked": true,
     "points": 1,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "assert 'answer_B' in locals()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "228cdc9bcc22c82cca8d1674c837fe7e",
     "grade": false,
     "grade_id": "cell-c18aac1aeab0ffa4",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "**Question C**: What is the mean income of the **students**?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "abc532e425a0222b15b878a63a8d3271",
     "grade": false,
     "grade_id": "cell-304f5e89cf258b70",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# answer_C =\n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "c439bc7a4697b3252927bf4ff5b97c7d",
     "grade": true,
     "grade_id": "cell-88cc5b64c139ad51",
     "locked": true,
     "points": 1,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "assert 'answer_C' in locals()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "4939e75a459ad0fa97e1a4cc66b12f8d",
     "grade": false,
     "grade_id": "cell-1357da426fb6d118",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "**Question D**: How many **students** obtain an **income** of more than 20.000."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "454c38fa0ba92a2271827a2b09978749",
     "grade": false,
     "grade_id": "cell-055cd883f43fd6a8",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# answer_D = \n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "1539a723dffe322eb90973f9186d985e",
     "grade": true,
     "grade_id": "cell-ad443f5f82e71c2d",
     "locked": true,
     "points": 1,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "assert 'answer_D' in locals()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "ee5b20461877dfa47b41a874ec043935",
     "grade": false,
     "grade_id": "cell-c18373f10d5ad01e",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "**Question E**: What is the 25% quantile of the predictor **balance**?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "da2b48d1384b41a1f67cd21ed4cc6d09",
     "grade": false,
     "grade_id": "cell-d71bb4c3124f7e98",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# answer_E = \n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "f308d4d65798c31701fa89ed92d2882b",
     "grade": true,
     "grade_id": "cell-65d1394f282d0881",
     "locked": true,
     "points": 1,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "assert 'answer_E' in locals()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "949450ab555248dad31e350687bd8494",
     "grade": false,
     "grade_id": "cell-8001a7451b2d18e5",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "### Plotting the data set\n",
    "\n",
    "Next, we want to plot both, the `income` and `balance` predictors as boxplots as a function of the `default` status.\n",
    "\n",
    "**Task**: Complete the plotting command in the following cell. What do you observe?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "777e2f4976fd1cdb75f19233e873e463",
     "grade": true,
     "grade_id": "cell-368b0f91e82b2f93",
     "locked": false,
     "points": 1,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots(1,2)\n",
    "D.boxplot(column='balance',by='default', ax=ax[0]);\n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "99c52c64b6e36bcd492de7d4b19fd6b3",
     "grade": false,
     "grade_id": "cell-e444abacd60aac95",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "You should observe that it seems that the credit card balance has a large effect on the default status, while the income seems not to predict the default status very well.\n",
    "\n",
    "Finally, the following cell let's you plot the `default`'s vs. the non-`default`'s of the data set. No task here!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "43f74bf01e0f469a951a6280b66db47e",
     "grade": false,
     "grade_id": "cell-c4452c32d77031f5",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "D.plot(y='income', x='balance', kind='scatter',c = D.default, cmap = 'coolwarm', marker='.');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "54c36aef3917d21b1b3f3868424f5fe0",
     "grade": false,
     "grade_id": "cell-d86bcf96591539f6",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "### Fitting a logistic regression model\n",
    "Next, we want to fit a logistic regression model to our data.\n",
    "Use the `LogisticRegression` function in the module `sklearn.linear_model`.\n",
    "The behaviour is similar to a `LinearRegression` fit.\n",
    "\n",
    "You can find the documentation of this function [here](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html).\n",
    "There are a lot of optional arguments, the most important might be the unimpressive looking parameter `C`, which determines the strength of regularization used in the algorithm that solves the maximum likelihood problem.\n",
    "\n",
    "We will discuss regularization later in the lecture as well as in the labs. For now, it suffices if you keep the following in mind:\n",
    "\n",
    "**The larger you choose `C`, the less the problem will be regularized.**\n",
    "\n",
    "**Task**: Fit a logistic regression model that predicts the probability of `default` using `balance` as predictor. You should obtain the following values: $\\beta_0: -10.6513$, $\\beta_\\text{balance}: 0.0055$.\n",
    "\n",
    "Choose the following optional parameters:\n",
    "* set the regularization parameter `C = 1e10` (which is the scientific notation of $C = 10^{10}$, and thus very large)\n",
    "* set the error tolerance to `tol=1e-10`\n",
    "* set the solver to `solver = 'liblinear'`\n",
    "\n",
    "in this and the upcoming problems."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "13e476d4c0cc9e47a77a3529ff2ca6fc",
     "grade": true,
     "grade_id": "cell-22819ddc03810ae9",
     "locked": false,
     "points": 2,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "from sklearn.linear_model import LogisticRegression\n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "360f7abc62cea2dedd024d7fe638ecdd",
     "grade": false,
     "grade_id": "cell-69b312391fc62d46",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "**Task**: Store the intercept of the model in a variable `intercept0` and the regression coefficient in the variable `reg_coef0`.\n",
    "These quantities represent the coeffcients for a linear regression model predicting the log-odds, i.e.\n",
    "\n",
    "$$\n",
    "\\log \\left( \\frac{p(x)}{1-p(x)} \\right) = \\beta_0 + \\beta_1 \\, x\n",
    "$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "2c9c6b0c10801c38421bb04fbcee878e",
     "grade": false,
     "grade_id": "cell-f5497700d04802be",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "8542c338e7188d525d2afec8bebf043d",
     "grade": true,
     "grade_id": "cell-6628981a047b598c",
     "locked": true,
     "points": 1,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "assert 'intercept0' in locals()\n",
    "assert 'reg_coef0' in locals()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "e6cb3f7cd999f952c773175295951711",
     "grade": false,
     "grade_id": "cell-bb51b2f348605b46",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "**Task**:\n",
    "Predict the probability of `default` for a `balance` value of $\\$ 1.000 $ and $\\$ 2.000 $ and store your answers is the variables `pod_1000` and `pod_2000`, resp.\n",
    "\n",
    "Use the method `predict_proba` of a `LogisticRegression` model.\n",
    "\n",
    "**Note**: The model assumes that your data has the same format as your original training data. Therefore, you might have to reshape the input into the correct format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "5f1d18ed10c683d426e0ede84aea60a6",
     "grade": false,
     "grade_id": "cell-f089240b8df50cd4",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "e4cfc9c1050ddc314034a90e56396516",
     "grade": true,
     "grade_id": "cell-2c1bb68da7976c58",
     "locked": true,
     "points": 1,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "assert 'pod_1000' in locals()\n",
    "assert 'pod_2000' in locals()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "534d0119a1b9ab837676eba549002175",
     "grade": false,
     "grade_id": "cell-23819e4234a18a17",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "You should observe, that the probality of default of an individual with a credit card balance of $\\$ 1.000 $ is approximately 0.57\\%.\n",
    "The probality of default of an individual with a credit card balance of $\\$  2.000 $ is approximately 58.6\\%."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "920c1545e68013cb41a805360da662bf",
     "grade": false,
     "grade_id": "cell-033cb0ae7b7573b9",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "Now, we want to incorporate the predictors `income` and `student` status as well. This can be done easily using the same methods.\n",
    "\n",
    "Execute the following code cell to train a new logistic regression model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "910a0191b641f23e69e507c82301d756",
     "grade": false,
     "grade_id": "cell-283d443517d30846",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "lr2= LogisticRegression(solver='liblinear', tol=1e-10, C=1e10)\n",
    "X = D.loc[:,['balance','income','student']]\n",
    "y = D.loc[:,'default']\n",
    "reg2 = lr2.fit(X,y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "75224e0abab10151a3cf005c85db696b",
     "grade": false,
     "grade_id": "cell-ee4444430413621d",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "**Task**: Store the intercept of the new model in the variable `intercept_full` as well as the coefficients in variables `beta_balance`, `beta_income`, `beta_student`, resp."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "fac7d158deec297a668ab69fe5d0d368",
     "grade": false,
     "grade_id": "cell-2e01d7749939ac81",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "cb971603be8dfdc19d5662c50745a734",
     "grade": true,
     "grade_id": "cell-97b6efdfefdf0bd8",
     "locked": true,
     "points": 1,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "assert 'intercept_full' in locals()\n",
    "assert 'beta_balance' in locals()\n",
    "assert 'beta_income' in locals()\n",
    "assert 'beta_student' in locals()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "831e9c17302cde555cac12ba0609b93d",
     "grade": false,
     "grade_id": "cell-ae38f1628ef2f1e0",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "**Task**:\n",
    "What is the default probability of a student and a non-student with a credit card balance of $\\$ 1500$, an income of $\\$40,000$?\n",
    "Store your answers in the variables `pod_student` and `pod_nonStudent`, resp."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "e7f85e10939c9d260600ee26d9347c00",
     "grade": false,
     "grade_id": "cell-1916a01129726b28",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "7bdd2938df903d76ee48de0110afe1b8",
     "grade": true,
     "grade_id": "cell-0b93cc82b179752b",
     "locked": true,
     "points": 1,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "assert 'pod_student' in locals()\n",
    "assert 'pod_nonStudent' in locals()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "6efa1bb290e035715a0572bea5c6a771",
     "grade": false,
     "grade_id": "cell-c076bea9f400249f",
     "locked": true,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "source": [
    "You should observe that a student with a credit card balance of $\\$1.500\\$$ and an income of $\\$40,000$ has an estimated probability of default of 5.8\\%, while an non-student with the same balance and income has a probability of default of 10.5\\%."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}