{ "cells": [ { "cell_type": "markdown", "metadata": { "deletable": false }, "source": [ "# [Introduction to Data Science](http://datascience-intro.github.io/1MS041-2023/) \n", "## 1MS041, 2023 \n", "©2023 Raazesh Sainudiin, Benny Avelin. [Attribution 4.0 International (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Other measurements of performance\n", "\n", "Recall that in the logistic regression case our function $G(x) \\in [0,1]$ represents the probability of the label being $1$. We then used the following rule to construct a decision function $g$ from this $G$: \n", "$$\n", " g(x) = \n", " \\begin{cases}\n", " 1, & \\text{if } G(x) > 1/2 \\\\\n", " 0, & \\text{otherwise.}\n", " \\end{cases}\n", "$$\n", "The threshold $1/2$ can be changed in order to trade off precision against recall.\n", "\n", "Let's consider the function\n", "$$\n", " g_\\alpha(x) = \n", " \\begin{cases}\n", " 1, & \\text{if } G(x) > \\alpha \\\\\n", " 0, & \\text{otherwise,}\n", " \\end{cases}\n", "$$\n", "where $\\alpha \\in [0,1]$. For each such $\\alpha$ we get a precision and a recall, i.e.\n", "$$\n", " \\begin{aligned}\n", " \\text{Precision:} \\quad \\text{Pr}(\\alpha) = P(Y = 1 \\mid g_\\alpha(X) = 1) \\\\\n", " \\text{Recall:} \\quad \\text{Re} (\\alpha) = P(g_\\alpha(X) = 1 \\mid Y = 1).\n", " \\end{aligned}\n", "$$\n", "\n", "Both quantities can be plotted as functions of $\\alpha$, as shown below" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
SVC(kernel='linear', probability=True)
SVC(probability=True)