{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\\rightarrow$Run All).\n", "\n", "Make sure you fill in any place that says `YOUR CODE HERE` or \"YOUR ANSWER HERE\", as well as your name below.\n", "\n", "Rename this problem sheet as follows:\n", "\n", " ps{number of lab}_{your user name}_problem{number of problem sheet in this lab}\n", " \n", "for example\n", " \n", " ps2_blja_problem1\n", "\n", "Submit your homework within one week until next Monday, 9 a.m." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "NAME = \"\"\n", "EMAIL = \"\"\n", "USERNAME = \"\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to Data Science\n", "## Lab 10: Bootstrap\n", "As you now from the lecture, the bootstrap is a widely applicable and powerful statistical tool for quantifying the uncertainty associated with an estimate or statistical learning method.\n", "\n", "Here, we want to recapitulate the investment problem from the lecture, slide 241ff.\n", "Remember, the goal is to invest a fixed sum of money in 2 financial assets with random returns\n", "$X$ and $Y$ in a risk minimal manner, i.e., we have to determine $\\alpha$ such that the quantity\n", "\n", "$$\n", "\\text{Var}(\\alpha X + (1 - \\alpha) Y)\n", "$$\n", "\n", "is minimal.\n", "One can show, that the optimal fraction $\\alpha^*$ is given by\n", "\n", "$$\n", "\\alpha^* = \\frac{\\sigma_Y^2 - \\sigma_{XY}}{\\sigma_X^2 + \\sigma_Y^2 - 2 \\sigma_{XY}}.\n", "$$\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we define the setting for our problem:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, 
"editable": false, "nbgrader": { "cell_type": "code", "checksum": "4d6e806094f8ce33dd6ef0715e2dc760", "grade": false, "grade_id": "cell-3daceefe408899b5", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "import numpy as np\n", "\n", "sx = 1.\n", "sy = np.sqrt(1.25)\n", "sxy = 0.5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task (1 point)**: Implement the function alphaopt to determine the optimal investment fraction $\\alpha^*$." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "9e8df50b1ecfd983bcccee81970e770c", "grade": false, "grade_id": "cell-f77a41b0d2ea1ec5", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "def alphaopt(sx, sy, sxy):\n", " \"\"\"Function to determine the optimal\n", " investment fraction in the problem from\n", " slide 241.\"\"\"\n", " # YOUR CODE HERE\n", " raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "60bdd9070924b31c5b97a558c069ae0c", "grade": true, "grade_id": "cell-b2142d1eba56b5ea", "locked": true, "points": 1, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "assert abs(alphaopt(0,1,.2) - 1.3333333333333335) < 1e-8\n", "assert abs(alphaopt(0.1,0.2,0.12)+ alphaopt(0.2,0.1,0.12) - 1) < 1e-8" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task (1 point)**: Determine the optimal investment fraction in for our data and store it in the variable `alpha1`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "9c5f5412ac3dd7de45139dceb65ce37d", "grade": false, "grade_id": "cell-94325ace46d13327", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "412bf100729d67c67d9aefa987faf2a4", "grade": true, "grade_id": "cell-48863699f34c6938", "locked": true, "points": 1, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "assert abs(alpha1 - 0.6) < 1e-8" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following function serves as our black box to draw samples from the *unknown* distribution." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def drawXY(sx, sy, sxy, n=100):\n", "    \"\"\"Draw n samples from a bivariate normal distribution\n", "    with zero mean and variance-covariance matrix\n", "        [[sx**2, sxy  ],\n", "         [sxy,   sy**2]].\n", "    Returns an array of shape (n, 2).\n", "    \"\"\"\n", "    # Set up the variance-covariance matrix\n", "    Sigma = np.array([[sx**2, sxy], [sxy, sy**2]])\n", "    # Each row of XY is one (X, Y) observation\n", "    XY = np.random.multivariate_normal(np.zeros(2), Sigma, size=n)\n", "    return XY" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task (2 points)**: Repeat the following experiment 1000 times: simulate 100 observations of $(X,Y)$ and estimate $\\alpha$ from them.\n", "Store the estimated values of $\\alpha$ in the prepared array `Alpha`.\n", "Finally, determine the mean of `Alpha` and store it in the variable `alphaest`.\n", "\n", "As an [estimate for the covariance matrix](https://en.wikipedia.org/wiki/Estimation_of_covariance_matrices) of the observations $x_i = (X_i, Y_i)^T$ with sample mean $\\bar x$, you can use\n", "\n", "$$\n", "Q = \\frac{1}{n-1} \\sum_{i=1}^{n} (x_i - \\bar x) (x_i - \\bar x)^T.\n", 
"$$\n", "\n", "*Note*: Depending on the way you've defined and stored the samples of $(X,Y)$, you might have shift the transposition to the other term.\n", "\n", "You should be able to approach the true value of $\\alpha^*$ within an accuracy of $0.01$, you can set a random seed prior to your loop to ensure the next test is passed." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "8e73eb3f6e4a15ce2562b07c7a052287", "grade": false, "grade_id": "cell-e3a000acdb1c69da", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "num_exp = 1000\n", "n = 100\n", "Alpha = np.zeros(num_exp)\n", "np.random.seed(0)\n", "\n", "for i in range(num_exp):\n", " # Implement the i'th experiment and set Alpha[i]\n", " # YOUR CODE HERE\n", " raise NotImplementedError()\n", "\n", "# Determine the mean of Alpha\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "5de9cd3994a56508307be7521257601e", "grade": true, "grade_id": "cell-93597bbfd5aa40e3", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "np.abs(alphaest - 0.6) < 1e-2" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" } }, "nbformat": 4, "nbformat_minor": 2 }