{
|
||
"cells": [
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 1,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
"# Import the libraries\n",
|
||
"import numpy as np\n",
|
||
"import matplotlib.pyplot as plt"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"We want to estimate the parameters of a simple linear regression ($y_i = w x_i + b$) both in closed form and by gradient descent (GD).\n",
|
||
"\n",
"The loss function is defined on a data point by $l(\\hat{y}_i, y_i) = (\\hat{y}_i - y_i)^2$, where the prediction is given by $\\hat{y}_i = \\hat{w} x_i + \\hat{b}$. The optimization problem we want to solve is $\\min_{w,b} \\sum_{i=1}^{N} l(\\hat{y}_i, y_i)$.\n",
"\n",
"We assume the following data generation process:\n",
"$Y \\sim wX + b + \\eta$, where $X \\sim U[20, 40]$ and $\\eta \\sim N(0, 1)$.\n",
|
||
"\n",
|
||
"# 1\n",
|
||
"\n",
"Generate $N$ data points $(x_i, y_i)$ using the data generation process given above and store them in x and y. Note: $N = 100$; start by generating $\\{x_i\\}_{i=1}^{N}$ and then $\\{y_i\\}_{i=1}^{N}$. Do not forget to add noise!\n",
|
||
"\n",
"First, we generate the x values."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 20,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
"# Generate the x values: N = 100 draws from U[20, 40]\n",
"list_x = np.random.uniform(20, 40, 100).tolist()\n",
"\n",
"# Then the y values\n",
"\n",
"def y(x):\n",
"    # True parameters of the data generation process\n",
"    w = 1.5\n",
"    b = 5\n",
"    # Add standard normal noise eta ~ N(0, 1)\n",
"    return w * x + b + np.random.normal(0, 1)\n",
"\n",
"list_y = [y(x) for x in list_x]\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# 2\n",
|
||
"Plot the data points x vs y."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 3,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"<matplotlib.collections.PathCollection at 0x7ff1544421d0>"
|
||
]
|
||
},
|
||
"execution_count": 3,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
},
|
||
{
|
||
"data": {
|
||
"image/png": "\n",
|
||
"text/plain": [
|
||
"<Figure size 432x288 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {
|
||
"needs_background": "light"
|
||
},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"plt.scatter(list_x, list_y, s=6)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# 3\n",
"Estimate the parameters of the simple linear regression in closed form. Note: store these values in w_cf and b_cf."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 8,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
"# First, arrange the data as a design matrix so it can be used in the matrix computation.\n",
"# To do so, generate a column of ones that is then stacked next to the x values.\n",
"\n",
"list_1 = np.ones(len(list_x))\n",
"\n",
"# Then stack the two columns: X has shape (N, 2), with rows [1, x_i]\n",
"\n",
"X = np.stack((list_1, list_x), axis=1)\n",
"\n",
"# Finally, solve the normal equations: beta = (X^T X)^{-1} X^T y\n",
"\n",
"def B(y):\n",
"    X_T = np.transpose(X)\n",
"    prod = np.dot(X_T, X)\n",
"    inv = np.linalg.inv(prod)\n",
"    return np.dot(inv, np.dot(X_T, y))\n",
"\n",
"Y = np.asarray(list_y)\n",
"# The first coefficient multiplies the column of ones, so it is the intercept\n",
"b_cf, w_cf = B(Y)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# 8\n",
"Create a function loss(x_i, y_i, w, b) that computes the regression loss over the full dataset.\n",
"What is the value of loss([1], [3], 1, 2)?"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 9,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0"
|
||
]
|
||
},
|
||
"execution_count": 9,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Define the loss function: sum of squared errors over the full dataset\n",
"\n",
"def loss(x_i, y_i, w, b):\n",
"    return sum((w * x + b - y) ** 2 for x, y in zip(x_i, y_i))\n",
"\n",
"loss([1], [3], 1, 2)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 10,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"8"
|
||
]
|
||
},
|
||
"execution_count": 10,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Partial derivative of the single-point loss (yi - (w * xi + b))**2 with respect to w\n",
"def dl_dw(xi, yi, w, b):\n",
"    return -2 * xi * (yi - (w * xi + b))\n",
|
||
"\n",
|
||
"dl_dw(4, -1, 0, 0)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 11,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"2"
|
||
]
|
||
},
|
||
"execution_count": 11,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Partial derivative of the single-point loss (yi - (w * xi + b))**2 with respect to b\n",
"def dl_db(xi, yi, w, b):\n",
"    return -2 * (yi - (w * xi + b))\n",
|
||
"\n",
|
||
"dl_db(4, -1, 0, 0)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# 11\n",
"Implement a function update_w_and_b(x_i, y_i, w, b, lr) that updates w and b according to the gradient computed over all data points. What is the output of update_w_and_b([0], [3], 5, 3, 0.1), and why?"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 12,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"(5.0, 3.0)"
|
||
]
|
||
},
|
||
"execution_count": 12,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"def update_w_and_b(x, y, w, b, lr):\n",
"    grad_w = 0\n",
"    grad_b = 0\n",
"\n",
"    N = len(y)\n",
"\n",
"    # Accumulate the gradients over all data points\n",
"    for t in range(N):\n",
"        grad_w += dl_dw(x[t], y[t], w, b)\n",
"        grad_b += dl_db(x[t], y[t], w, b)\n",
"\n",
"    # Update with the gradient averaged over the N points\n",
"    w -= (1 / float(N)) * grad_w * lr\n",
"    b -= (1 / float(N)) * grad_b * lr\n",
"\n",
"    return w, b\n",
"\n",
"update_w_and_b([0], [3], 5, 3, 0.1)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# 12 \n",
"Estimate the parameters of the simple linear regression by gradient descent from a random initialization of\n",
"w and b and with a learning rate equal to 0.001. Note: store these values in w_gd and b_gd."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 50,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"w_gd, b_gd = 2., 4.\n",
|
||
"\n",
|
||
"for i in range(5000):\n",
|
||
" # update w and b\n",
|
||
" w_gd, b_gd = update_w_and_b(list_x, list_y, w_gd, b_gd, 0.001)\n",
|
||
" "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 61,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "\n",
|
||
"text/plain": [
|
||
"<Figure size 432x288 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {
|
||
"needs_background": "light"
|
||
},
|
||
"output_type": "display_data"
|
||
},
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"<Figure size 432x288 with 0 Axes>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
"# Plot the results of both techniques\n",
|
||
"\n",
|
||
"min_x, max_x = 20, 40\n",
|
||
"list_x = np.random.uniform(min_x, max_x, 100).tolist()\n",
|
||
"list_x.sort()\n",
|
||
"f = lambda x_i: 1.5 * x_i + 5\n",
|
||
"SCALE = 5\n",
|
||
"list_y = [f(x) + np.random.normal(0, SCALE, 1)[0] for x in list_x]\n",
|
||
"\n",
|
||
"plt.scatter(list_x, list_y, s=6, label='original data')\n",
|
||
"plt.plot([min_x, max_x], [b_cf + w_cf * min_x, b_cf + w_cf * max_x],\n",
|
||
" '-', markersize=3, color='blue', label='Closed-form')\n",
|
||
"plt.plot([min_x, max_x], [b_gd + w_gd * min_x, b_gd + w_gd * max_x],\n",
|
||
"'-', markersize=3, color='orange', label='Gradient Descent')\n",
|
||
"plt.xlabel('x')\n",
|
||
"plt.ylabel('y')\n",
|
||
"plt.legend(loc=0)\n",
|
||
"plt.show()\n",
|
||
"plt.clf()\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 62,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
"# Now let's see how to do this in 3 lines\n",
|
||
"\n",
|
||
"from sklearn.linear_model import LinearRegression\n",
|
||
"\n",
|
||
"\n",
|
||
"model = LinearRegression().fit(np.asarray(list_x).reshape(-1, 1), list_y)\n",
|
||
"w_sklearn, b_sklearn = model.coef_[0], model.intercept_\n",
|
||
"\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 63,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "\n",
|
||
"text/plain": [
|
||
"<Figure size 432x288 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {
|
||
"needs_background": "light"
|
||
},
|
||
"output_type": "display_data"
|
||
},
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"<Figure size 432x288 with 0 Axes>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
"# Plot the results of all three techniques\n",
|
||
"\n",
|
||
"min_x, max_x = 20, 40\n",
|
||
"list_x = np.random.uniform(min_x, max_x, 100).tolist()\n",
|
||
"list_x.sort()\n",
|
||
"f = lambda x_i: 1.5 * x_i + 5\n",
|
||
"SCALE = 5\n",
|
||
"list_y = [f(x) + np.random.normal(0, SCALE, 1)[0] for x in list_x]\n",
|
||
"\n",
|
||
"plt.scatter(list_x, list_y, s=6, label='original data')\n",
|
||
"plt.plot([min_x, max_x], [b_cf + w_cf * min_x, b_cf + w_cf * max_x],\n",
|
||
" '-', markersize=3, color='blue', label='Closed-form')\n",
|
||
"plt.plot([min_x, max_x], [b_gd + w_gd * min_x, b_gd + w_gd * max_x],\n",
|
||
"'-', markersize=3, color='orange', label='Gradient Descent')\n",
|
||
"plt.plot([min_x, max_x], [b_sklearn + w_sklearn * min_x, b_sklearn + w_sklearn * max_x],\n",
|
||
"'-', markersize=3, color='yellow', label='Sklearn')\n",
|
||
"plt.xlabel('x')\n",
|
||
"plt.ylabel('y')\n",
|
||
"plt.legend(loc=0)\n",
|
||
"plt.show()\n",
|
||
"plt.clf()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": []
|
||
}
|
||
],
|
||
"metadata": {
|
||
"kernelspec": {
|
||
"display_name": "Python 3",
|
||
"language": "python",
|
||
"name": "python3"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 3
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython3",
|
||
"version": "3.7.0"
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 2
|
||
}
|