diff --git a/Archis Applications/Exercices/Exercice 1.ipynb b/Archis Applications/Exercices/TP1/.ipynb_checkpoints/Exercice 1-checkpoint.ipynb similarity index 100% rename from Archis Applications/Exercices/Exercice 1.ipynb rename to Archis Applications/Exercices/TP1/.ipynb_checkpoints/Exercice 1-checkpoint.ipynb diff --git a/Archis Applications/Exercices/Exercice 2.ipynb b/Archis Applications/Exercices/TP1/.ipynb_checkpoints/Exercice 2-checkpoint.ipynb similarity index 100% rename from Archis Applications/Exercices/Exercice 2.ipynb rename to Archis Applications/Exercices/TP1/.ipynb_checkpoints/Exercice 2-checkpoint.ipynb diff --git a/Archis Applications/Exercices/TP1/Exercice 1.ipynb b/Archis Applications/Exercices/TP1/Exercice 1.ipynb new file mode 100644 index 0000000..2b98d49 --- /dev/null +++ b/Archis Applications/Exercices/TP1/Exercice 1.ipynb @@ -0,0 +1,359 @@ +{ + "cells": [ + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [], + "source": [ + "# Import the libraries\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 1 - Create a function J(theta) that implements J(theta). What is the value of J(4)?" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "25" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "def J(theta_i):\n", + "    return (theta_i + 1)**2\n", + "\n", + "J(4)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 2" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [], + "source": [ + "# Generate the grid of theta values\n", + "theta = np.arange(-10.00, 10.00, 0.01)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 3\n", + "Plot theta vs J(theta). When do we reach the minimum?" 
+ ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "# plot de J(theta)\n", + "plt.plot(theta,J(theta), label='Fitted line - closed form')\n", + "plt.xlabel('x')\n", + "plt.ylabel('J(x)')\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "-1.0" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "J_theta = [J(x) for x in theta]\n", + "# Valeur de theta à laquelle on obtien le plus petit J(theta)\n", + "round(theta[J_theta.index(min(J_theta))], 3)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 4\n", + "Create a function dJ_dtheta(x_i) which is computing the gradient\n", + "?" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# On calcule la fonction dérivée. D'après J(theta). Merci dcode.fr\n", + "def dJ_theta(x_i):\n", + " return 2 * (x_i + 1)\n", + "# Affichage de la valeur à 0\n", + "dJ_theta(-1)" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[]" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "plt.plot(theta, dJ_theta(theta))\n", + "# dJ_t = [dJ_theta(x) for x in theta]\n", + "# plt.plot(theta, dJ_t)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 5\n", + "Create a function gradient_descent(theta_0, lr, nb_iters, df_dx, f) which returns the solution\n", + "of argmin f (θ). \n", + "x_0, lr, nb_iters, df_dx, f correspond respectively to the initial value of θ, the\n", + "θ\n", + "learning rate, the number of iterations allowed for solving the problem, the gradient of θ and the\n", + "function J(.).\n", + "\n", + "What is the solution θ̂ found by gradient descent to our problem? Note: assume that\n", + "x_0, lr, nb_iters = -7, 0.1, 100 while debugging" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [], + "source": [ + "def gradient_descent(theta_0, lr, nb_iters, dJ_t, J_theta):\n", + " # puis on reprend les étapes vues en cours\n", + " theta = theta_0\n", + " \n", + " for t in range(nb_iters): \n", + " # calcul de theta\n", + " theta = theta - lr * dJ_t(theta)\n", + "\n", + " return theta\n", + " " + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "-1.0000000100978041" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# On appelle la fonction \n", + "gradient_descent(-7, 0.01, 1000, dJ_theta, J)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 6\n", + "Update your function gradient_descent for printing the optimization path (i.e. print the line between\n", + "θ t and θ t+1 and saving the figure at the end of the optimization process)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "-1.0000000100978041" + ] + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "def gradient_descent(theta_0, lr, nb_iters, dJ_t, J_theta):\n", + " # puis on reprend les étapes vues en cours\n", + " theta = theta_0\n", + " \n", + " # on initialise les séries pour le plot\n", + " x = [theta]\n", + " y = [J_theta(theta)]\n", + " \n", + " for t in range(nb_iters): \n", + " # calcul de theta\n", + " theta = theta - lr * dJ_t(theta)\n", + " # ajout de la valeur au plot\n", + " x.append(theta)\n", + " y.append(J_theta(theta))\n", + " \n", + " # affichage du plot\n", + " plt.plot(x,y)\n", + " \n", + " return theta\n", + "\n", + "# On appelle la fonction \n", + "gradient_descent(-7, 0.01, 1000, dJ_theta, J)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 7\n", + "Assuming that you are setting nb_iters equals to 100 what is the solution θ̂ when varying nb_iters in\n", + "[-8, -1, 7] and lr in [0.1, 0.01, 0.001, -0.01, 0.8, 1.01]. Does the estimated solution always\n", + "the same? What are pros and cons about these hyperparameters and GD in general?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 8\n", + "Now assume the function J(θ) = sin(2θ + 1), what is the solution θ̂ given by GD?" 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/Archis Applications/Exercices/TP1/Exercice 2.ipynb b/Archis Applications/Exercices/TP1/Exercice 2.ipynb new file mode 100644 index 0000000..4344cce --- /dev/null +++ b/Archis Applications/Exercices/TP1/Exercice 2.ipynb @@ -0,0 +1,426 @@ +{ + "cells": [ + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "# Import the libraries\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We want to estimate the parameters of a simple linear regression ($y_i = w x_i + b$) by closed form and by GD. \n", + "\n", + "The loss function is defined on a data point by $l(\\hat{y}_i, y_i) = (\\hat{y}_i - y_i)^2$, where the prediction is given by $\\hat{y}_i = \\hat{w} x_i + \\hat{b}$.\n", + "\n", + "The optimization problem we want to solve is $\\min_{w,b} \\sum_{i=1}^{N} l(\\hat{y}_i, y_i)$.\n", + "\n", + "We assume the following data generation process:\n", + "\n", + "$Y \\sim wX + b + \\eta$, where $X \\sim U[20, 40]$ and $\\eta \\sim N(0, 1)$.\n", + "\n", + "# 1\n", + "\n", + "Generate $N$ data points $(x_i, y_i)$ using the data generation process given above and store them into x and y.\n", + "\n", + "Note: $N = 100$; start by first generating $\\{x_i\\}_{i=1}^{N}$\n", + "and then $\\{y_i\\}_{i=1}^{N}$. 
Do not forget to add noise!\n", + "\n", + "First, we generate the x values" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [], + "source": [ + "# Generate the x values\n", + "list_x = np.random.uniform(20, 40, 100).tolist()\n", + "\n", + "# Then the y values\n", + "\n", + "def y(x):\n", + "    w = 1.5\n", + "    b = 5\n", + "    return w * x + b + np.random.normal(0,1)\n", + "\n", + "list_y = [y(x) for x in list_x ]\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 2\n", + "Plot the data points x vs y." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "plt.scatter(list_x, list_y, s=6)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 3\n", + "Estimate parameters of the simple linear regression by closed-form. Note: store these values into w_cf\n", + "and b_cf." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [], + "source": [ + "# D'abord, on doit transposer nos donner pour pouvoir les utiliser dans le calcul matriciel\n", + "# Donc on génère une liste de 1 qu'on va nsuit coller à notre liste de x\n", + "\n", + "list_1 = np.ones(len(list_x))\n", + "\n", + "\n", + "# Pis on colle les deux\n", + "\n", + "X = np.stack((list_1, list_x), axis=1)\n", + "\n", + "# Enfin, on fait notre calcul de ouf\n", + "\n", + "def B(y): \n", + " X_T = np.transpose(X)\n", + " prod = np.dot(X_T, X)\n", + " inv = np.linalg.inv(prod)\n", + " return np.dot(inv, np.dot(X_T,y))\n", + "\n", + "Y = np.asarray(list_y)\n", + "b_cf, w_cf = B(Y)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 8\n", + "Create a function loss(x_i, y_i, w, b) which is computing the regression loss on the full dataset.\n", + "What is the value of loss([1], [3], 1, 2) ?" 
+ ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# On détermine la fonction de coût\n", + "\n", + "def loss(x_i, y_i, w, b):\n", + " return ((w * x_i) + b - y_i)**2\n", + "\n", + "loss(1, 3, 1, 2)" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "8" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "def dl_dw(xi, yi, w, b):\n", + " return -2 * xi * ( yi - (w * xi + b))\n", + "\n", + "dl_dw(4, -1, 0, 0)" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "2" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "def dl_db(xi, yi, w, b):\n", + " return -2 * ( yi - (w * xi + b))\n", + "\n", + "dl_db(4, -1, 0, 0)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 11\n", + "Implement a function update_w_and_b(x_i, y_i, w, b, lr) which is updating w and b according to the gradient compute on the full data points. What is the output of the following command of update_w_and_b([0], [3], 5, 3, 0.1) and why?" 
+ ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(5.0, 3.0)" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "def update_w_and_b(x, y, w, b, lr):\n", + "    grad_w = 0\n", + "    grad_b = 0\n", + "    \n", + "    N = len(y)\n", + "    \n", + "    for t in range(N):\n", + "        grad_w += dl_dw(x[t], y[t], w, b)\n", + "        grad_b += dl_db(x[t], y[t], w, b)\n", + "    \n", + "    # update\n", + "    w -= (1 / float(N)) * grad_w * lr\n", + "    b -= (1 / float(N)) * grad_b * lr\n", + "    \n", + "    return w, b\n", + "    \n", + "update_w_and_b([0], [3], 5, 3, 0.1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 12 \n", + "Estimate the parameters of the simple linear regression by gradient descent from a random initialization of\n", + "w and b and with a learning rate equal to 0.001. Note: store these values into w_gd and b_gd." + ] + }, + { + "cell_type": "code", + "execution_count": 50, + "metadata": {}, + "outputs": [], + "source": [ + "w_gd, b_gd = 2., 4.\n", + "\n", + "for i in range(5000):\n", + "    # update w and b\n", + "    w_gd, b_gd = update_w_and_b(list_x, list_y, w_gd, b_gd, 0.001)\n", + "    " + ] + }, + { + "cell_type": "code", + "execution_count": 61, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "# Affichage des résultats des deux techniques\n", + "\n", + "min_x, max_x = 20, 40\n", + "list_x = np.random.uniform(min_x, max_x, 100).tolist()\n", + "list_x.sort()\n", + "f = lambda x_i: 1.5 * x_i + 5\n", + "SCALE = 5\n", + "list_y = [f(x) + np.random.normal(0, SCALE, 1)[0] for x in list_x]\n", + "\n", + "plt.scatter(list_x, list_y, s=6, label='original data')\n", + "plt.plot([min_x, max_x], [b_cf + w_cf * min_x, b_cf + w_cf * max_x],\n", + " '-', markersize=3, color='blue', label='Closed-form')\n", + "plt.plot([min_x, max_x], [b_gd + w_gd * min_x, b_gd + w_gd * max_x],\n", + "'-', markersize=3, color='orange', label='Gradient Descent')\n", + "plt.xlabel('x')\n", + "plt.ylabel('y')\n", + "plt.legend(loc=0)\n", + "plt.show()\n", + "plt.clf()\n" + ] + }, + { + "cell_type": "code", + "execution_count": 62, + "metadata": {}, + "outputs": [], + "source": [ + "# Maintenant on voit comment régler ça en 3 lignes\n", + "\n", + "from sklearn.linear_model import LinearRegression\n", + "\n", + "\n", + "model = LinearRegression().fit(np.asarray(list_x).reshape(-1, 1), list_y)\n", + "w_sklearn, b_sklearn = model.coef_[0], model.intercept_\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 63, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "# Affichage des résultats des trois techniques\n", + "\n", + "min_x, max_x = 20, 40\n", + "list_x = np.random.uniform(min_x, max_x, 100).tolist()\n", + "list_x.sort()\n", + "f = lambda x_i: 1.5 * x_i + 5\n", + "SCALE = 5\n", + "list_y = [f(x) + np.random.normal(0, SCALE, 1)[0] for x in list_x]\n", + "\n", + "plt.scatter(list_x, list_y, s=6, label='original data')\n", + "plt.plot([min_x, max_x], [b_cf + w_cf * min_x, b_cf + w_cf * max_x],\n", + " '-', markersize=3, color='blue', label='Closed-form')\n", + "plt.plot([min_x, max_x], [b_gd + w_gd * min_x, b_gd + w_gd * max_x],\n", + "'-', markersize=3, color='orange', label='Gradient Descent')\n", + "plt.plot([min_x, max_x], [b_sklearn + w_sklearn * min_x, b_sklearn + w_sklearn * max_x],\n", + "'-', markersize=3, color='yellow', label='Sklearn')\n", + "plt.xlabel('x')\n", + "plt.ylabel('y')\n", + "plt.legend(loc=0)\n", + "plt.show()\n", + "plt.clf()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/Archis Applications/Exercices/TP2/.ipynb_checkpoints/Exercice 1-checkpoint.ipynb b/Archis Applications/Exercices/TP2/.ipynb_checkpoints/Exercice 1-checkpoint.ipynb new file mode 100644 index 0000000..57a4822 --- /dev/null +++ b/Archis Applications/Exercices/TP2/.ipynb_checkpoints/Exercice 1-checkpoint.ipynb @@ -0,0 +1,326 @@ +{ + "cells": [ + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "data": { + 
"text/plain": [ + "{'data': array([[6.3200e-03, 1.8000e+01, 2.3100e+00, ..., 1.5300e+01, 3.9690e+02,\n", + " 4.9800e+00],\n", + " [2.7310e-02, 0.0000e+00, 7.0700e+00, ..., 1.7800e+01, 3.9690e+02,\n", + " 9.1400e+00],\n", + " [2.7290e-02, 0.0000e+00, 7.0700e+00, ..., 1.7800e+01, 3.9283e+02,\n", + " 4.0300e+00],\n", + " ...,\n", + " [6.0760e-02, 0.0000e+00, 1.1930e+01, ..., 2.1000e+01, 3.9690e+02,\n", + " 5.6400e+00],\n", + " [1.0959e-01, 0.0000e+00, 1.1930e+01, ..., 2.1000e+01, 3.9345e+02,\n", + " 6.4800e+00],\n", + " [4.7410e-02, 0.0000e+00, 1.1930e+01, ..., 2.1000e+01, 3.9690e+02,\n", + " 7.8800e+00]]),\n", + " 'target': array([24. , 21.6, 34.7, 33.4, 36.2, 28.7, 22.9, 27.1, 16.5, 18.9, 15. ,\n", + " 18.9, 21.7, 20.4, 18.2, 19.9, 23.1, 17.5, 20.2, 18.2, 13.6, 19.6,\n", + " 15.2, 14.5, 15.6, 13.9, 16.6, 14.8, 18.4, 21. , 12.7, 14.5, 13.2,\n", + " 13.1, 13.5, 18.9, 20. , 21. , 24.7, 30.8, 34.9, 26.6, 25.3, 24.7,\n", + " 21.2, 19.3, 20. , 16.6, 14.4, 19.4, 19.7, 20.5, 25. , 23.4, 18.9,\n", + " 35.4, 24.7, 31.6, 23.3, 19.6, 18.7, 16. , 22.2, 25. , 33. , 23.5,\n", + " 19.4, 22. , 17.4, 20.9, 24.2, 21.7, 22.8, 23.4, 24.1, 21.4, 20. ,\n", + " 20.8, 21.2, 20.3, 28. , 23.9, 24.8, 22.9, 23.9, 26.6, 22.5, 22.2,\n", + " 23.6, 28.7, 22.6, 22. , 22.9, 25. , 20.6, 28.4, 21.4, 38.7, 43.8,\n", + " 33.2, 27.5, 26.5, 18.6, 19.3, 20.1, 19.5, 19.5, 20.4, 19.8, 19.4,\n", + " 21.7, 22.8, 18.8, 18.7, 18.5, 18.3, 21.2, 19.2, 20.4, 19.3, 22. ,\n", + " 20.3, 20.5, 17.3, 18.8, 21.4, 15.7, 16.2, 18. , 14.3, 19.2, 19.6,\n", + " 23. , 18.4, 15.6, 18.1, 17.4, 17.1, 13.3, 17.8, 14. , 14.4, 13.4,\n", + " 15.6, 11.8, 13.8, 15.6, 14.6, 17.8, 15.4, 21.5, 19.6, 15.3, 19.4,\n", + " 17. , 15.6, 13.1, 41.3, 24.3, 23.3, 27. , 50. , 50. , 50. , 22.7,\n", + " 25. , 50. , 23.8, 23.8, 22.3, 17.4, 19.1, 23.1, 23.6, 22.6, 29.4,\n", + " 23.2, 24.6, 29.9, 37.2, 39.8, 36.2, 37.9, 32.5, 26.4, 29.6, 50. ,\n", + " 32. , 29.8, 34.9, 37. , 30.5, 36.4, 31.1, 29.1, 50. 
, 33.3, 30.3,\n", + " 34.6, 34.9, 32.9, 24.1, 42.3, 48.5, 50. , 22.6, 24.4, 22.5, 24.4,\n", + " 20. , 21.7, 19.3, 22.4, 28.1, 23.7, 25. , 23.3, 28.7, 21.5, 23. ,\n", + " 26.7, 21.7, 27.5, 30.1, 44.8, 50. , 37.6, 31.6, 46.7, 31.5, 24.3,\n", + " 31.7, 41.7, 48.3, 29. , 24. , 25.1, 31.5, 23.7, 23.3, 22. , 20.1,\n", + " 22.2, 23.7, 17.6, 18.5, 24.3, 20.5, 24.5, 26.2, 24.4, 24.8, 29.6,\n", + " 42.8, 21.9, 20.9, 44. , 50. , 36. , 30.1, 33.8, 43.1, 48.8, 31. ,\n", + " 36.5, 22.8, 30.7, 50. , 43.5, 20.7, 21.1, 25.2, 24.4, 35.2, 32.4,\n", + " 32. , 33.2, 33.1, 29.1, 35.1, 45.4, 35.4, 46. , 50. , 32.2, 22. ,\n", + " 20.1, 23.2, 22.3, 24.8, 28.5, 37.3, 27.9, 23.9, 21.7, 28.6, 27.1,\n", + " 20.3, 22.5, 29. , 24.8, 22. , 26.4, 33.1, 36.1, 28.4, 33.4, 28.2,\n", + " 22.8, 20.3, 16.1, 22.1, 19.4, 21.6, 23.8, 16.2, 17.8, 19.8, 23.1,\n", + " 21. , 23.8, 23.1, 20.4, 18.5, 25. , 24.6, 23. , 22.2, 19.3, 22.6,\n", + " 19.8, 17.1, 19.4, 22.2, 20.7, 21.1, 19.5, 18.5, 20.6, 19. , 18.7,\n", + " 32.7, 16.5, 23.9, 31.2, 17.5, 17.2, 23.1, 24.5, 26.6, 22.9, 24.1,\n", + " 18.6, 30.1, 18.2, 20.6, 17.8, 21.7, 22.7, 22.6, 25. , 19.9, 20.8,\n", + " 16.8, 21.9, 27.5, 21.9, 23.1, 50. , 50. , 50. , 50. , 50. , 13.8,\n", + " 13.8, 15. , 13.9, 13.3, 13.1, 10.2, 10.4, 10.9, 11.3, 12.3, 8.8,\n", + " 7.2, 10.5, 7.4, 10.2, 11.5, 15.1, 23.2, 9.7, 13.8, 12.7, 13.1,\n", + " 12.5, 8.5, 5. , 6.3, 5.6, 7.2, 12.1, 8.3, 8.5, 5. , 11.9,\n", + " 27.9, 17.2, 27.5, 15. , 17.2, 17.9, 16.3, 7. , 7.2, 7.5, 10.4,\n", + " 8.8, 8.4, 16.7, 14.2, 20.8, 13.4, 11.7, 8.3, 10.2, 10.9, 11. ,\n", + " 9.5, 14.5, 14.1, 16.1, 14.3, 11.7, 13.4, 9.6, 8.7, 8.4, 12.8,\n", + " 10.5, 17.1, 18.4, 15.4, 10.8, 11.8, 14.9, 12.6, 14.1, 13. , 13.4,\n", + " 15.2, 16.1, 17.8, 14.9, 14.1, 12.7, 13.5, 14.9, 20. , 16.4, 17.7,\n", + " 19.5, 20.2, 21.4, 19.9, 19. , 19.1, 19.1, 20.1, 19.9, 19.6, 23.2,\n", + " 29.8, 13.8, 13.3, 16.7, 12. , 14.6, 21.4, 23. , 23.7, 25. , 21.8,\n", + " 20.6, 21.2, 19.1, 20.6, 15.2, 7. 
, 8.1, 13.6, 20.1, 21.8, 24.5,\n", + " 23.1, 19.7, 18.3, 21.2, 17.5, 16.8, 22.4, 20.6, 23.9, 22. , 11.9]),\n", + " 'feature_names': array(['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD',\n", + " 'TAX', 'PTRATIO', 'B', 'LSTAT'], dtype='\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# plot target VS CRIM\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mplt\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mscatter\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mY\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mplt\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshow\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m~/applications/anaconda3/lib/python3.7/site-packages/matplotlib/pyplot.py\u001b[0m in \u001b[0;36mshow\u001b[0;34m(*args, **kw)\u001b[0m\n\u001b[1;32m 251\u001b[0m \"\"\"\n\u001b[1;32m 252\u001b[0m \u001b[0;32mglobal\u001b[0m \u001b[0m_show\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 253\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0m_show\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkw\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 254\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 255\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m~/applications/anaconda3/lib/python3.7/site-packages/matplotlib/backend_bases.py\u001b[0m in \u001b[0;36mshow\u001b[0;34m(cls, block)\u001b[0m\n\u001b[1;32m 206\u001b[0m \u001b[0mblock\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mTrue\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 207\u001b[0m \u001b[0;32mif\u001b[0m 
\u001b[0mblock\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 208\u001b[0;31m \u001b[0mcls\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmainloop\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 209\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 210\u001b[0m \u001b[0;31m# This method is the one actually exporting the required methods.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m~/applications/anaconda3/lib/python3.7/site-packages/matplotlib/backends/_backend_tk.py\u001b[0m in \u001b[0;36mmainloop\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1073\u001b[0m \u001b[0;34m@\u001b[0m\u001b[0mstaticmethod\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1074\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mmainloop\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1075\u001b[0;31m \u001b[0mTk\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmainloop\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m~/applications/anaconda3/lib/python3.7/tkinter/__init__.py\u001b[0m in \u001b[0;36mmainloop\u001b[0;34m(n)\u001b[0m\n\u001b[1;32m 555\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mmainloop\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 556\u001b[0m \u001b[0;34m\"\"\"Run the main loop of Tcl.\"\"\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 557\u001b[0;31m \u001b[0m_default_root\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtk\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmainloop\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 558\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 559\u001b[0m \u001b[0mgetint\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0mint\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mKeyboardInterrupt\u001b[0m: " + ] + } + ], + "source": [ + "# plot target VS CRIM\n", + "plt.scatter(X[:,0], Y)\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "metadata": {}, + "outputs": [ + { + "ename": "KeyboardInterrupt", + "evalue": "", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mKeyboardInterrupt\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# c'est inexploitable, on log\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mplt\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mscatter\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlog\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mY\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mplt\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshow\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m~/applications/anaconda3/lib/python3.7/site-packages/matplotlib/pyplot.py\u001b[0m in \u001b[0;36mshow\u001b[0;34m(*args, **kw)\u001b[0m\n\u001b[1;32m 251\u001b[0m \"\"\"\n\u001b[1;32m 252\u001b[0m \u001b[0;32mglobal\u001b[0m \u001b[0m_show\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 253\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0m_show\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkw\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 254\u001b[0m 
\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 255\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m~/applications/anaconda3/lib/python3.7/site-packages/matplotlib/backend_bases.py\u001b[0m in \u001b[0;36mshow\u001b[0;34m(cls, block)\u001b[0m\n\u001b[1;32m 206\u001b[0m \u001b[0mblock\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mTrue\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 207\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mblock\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 208\u001b[0;31m \u001b[0mcls\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmainloop\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 209\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 210\u001b[0m \u001b[0;31m# This method is the one actually exporting the required methods.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m~/applications/anaconda3/lib/python3.7/site-packages/matplotlib/backends/_backend_tk.py\u001b[0m in \u001b[0;36mmainloop\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1073\u001b[0m \u001b[0;34m@\u001b[0m\u001b[0mstaticmethod\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1074\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mmainloop\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1075\u001b[0;31m \u001b[0mTk\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmainloop\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m~/applications/anaconda3/lib/python3.7/tkinter/__init__.py\u001b[0m in \u001b[0;36mmainloop\u001b[0;34m(n)\u001b[0m\n\u001b[1;32m 555\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mmainloop\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 556\u001b[0m \u001b[0;34m\"\"\"Run the main loop of 
Tcl.\"\"\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 557\u001b[0;31m \u001b[0m_default_root\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtk\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmainloop\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 558\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 559\u001b[0m \u001b[0mgetint\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mint\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mKeyboardInterrupt\u001b[0m: " + ] + } + ], + "source": [ + "# c'est inexploitable, on log \n", + "plt.scatter(np.log(X[:,0]), Y)\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/Archis Applications/Exercices/TP2/Exercice 1.py b/Archis Applications/Exercices/TP2/Exercice 1.py new file mode 100644 index 0000000..a2359f8 --- /dev/null +++ b/Archis Applications/Exercices/TP2/Exercice 1.py @@ -0,0 +1,138 @@ +# -*- coding: utf-8 -*- +""" +Exercice 1 du cours 2 de machine learning avec F.Baradel +""" +import matplotlib + +matplotlib.use("TkAgg") + +import matplotlib.pyplot as plt +import numpy as np + +from sklearn.datasets import load_boston +from sklearn.linear_model import LinearRegression + +boston = load_boston() + +boston + +X = boston['data'] +Y = boston['target'] +X.shape + +n_train = 300 + +crim = X[:, 0].copy() + +""" 4. Right now we will be using only the first features called CRIM for + modelling the target variable. 
Plot CRIM vs target with and without normalizing your data. + What do you observe? +""" +# plot target vs CRIM +plt.scatter(crim, Y) +plt.show() + +# the raw plot is unusable, so we take the log of CRIM +plt.scatter(np.log(crim), Y) +plt.show() + +# That's better +# define the min-max normalization function + +def normalize(y): +    return (y - np.min(y)) / (np.max(y) - np.min(y)) + +"""5. Use LinearRegression() for modelling target with CRIM on the training set and compute the predicted +values for the validation set.""" + + + +model = LinearRegression().fit(crim[:n_train].reshape(-1, 1), Y[:n_train]) + + +w, b = model.coef_, model.intercept_ + +print(w, b) + + +""" 6. Plot the predictions and the actual ground-truth for the training and the validation set.""" +print("training") +preds = model.predict(crim[:n_train].reshape(-1, 1)) +plt.scatter(crim[:n_train], preds, color="green") +plt.scatter(crim[:n_train], Y[:n_train], color="red") +plt.show() + +n_valid = 400 + +print("validation") +valids = model.predict(crim[n_train:n_valid].reshape(-1,1)) +plt.scatter(crim[n_train:n_valid], valids, color="green") +plt.scatter(crim[n_train:n_valid], Y[n_train:n_valid], color="orange") +plt.show() + +# We now use the log. I got it wrong and thought this was the normalization, hence the misleading names. 
+# Do not forget to apply the exponential (the inverse of log) to bring the predictions back to the original scale + +print("with log") +normY = np.log(Y) + +model = LinearRegression().fit(crim[:n_train].reshape(-1, 1), normY[:n_train]) +print("training") +normPreds = np.exp(model.predict(crim[:n_train].reshape(-1, 1))) +plt.scatter(crim[:n_train], normPreds, color="green") +plt.scatter(crim[:n_train], np.exp(normY[:n_train]), color="red") +plt.show() + +print("validation") +normValids = np.exp(model.predict(crim[n_train:n_valid].reshape(-1,1))) +plt.scatter(crim[n_train:n_valid], normValids, color="green") +plt.scatter(crim[n_train:n_valid], np.exp(normY[n_train:n_valid]), color="orange") +plt.show() + + +""" +7. Implement a function which is computing the Root-Mean-Square-Error (RMSE). What is the RMSE on +the training set and on the validation set? + + We want a score that tells us whether the model is good or not: + we define a function that takes the predictions and the true values and returns the mean squared error, + i.e. the mean of (ŷi - yi)², + to be computed on the training set and on the validation set. + + Then model log(target) = w.CRIM + b and compute the train/validation MSE +""" + +def mse(preds, vals): + return np.mean((preds - vals)**2) + +print(mse(preds, Y[:n_train])) +print(mse(valids, Y[n_train:n_valid])) + +# Same thing with the log model +print(mse(normPreds, Y[:n_train])) +print(mse(normValids, Y[n_train:n_valid])) + + +# Next, do the same thing with a second variable from X: ZN +crimZn = X[:, :2] + +model = LinearRegression().fit(crimZn[:n_train].reshape(-1, 2), Y[:n_train]) + + +w, b = model.coef_, model.intercept_ + +print(w, b) + +print("training") +# The model was fitted on two features, so it must predict from crimZn, not crim +preds = model.predict(crimZn[:n_train]) +# Plot against CRIM only: scatter needs x and y of the same size +plt.scatter(crim[:n_train], preds, color="green") +plt.scatter(crim[:n_train], Y[:n_train], color="red") +plt.show() + +n_valid = 400 + +print("validation") +valids = model.predict(crimZn[n_train:n_valid]) +plt.scatter(crim[n_train:n_valid], valids, color="green") +plt.scatter(crim[n_train:n_valid], 
Y[n_train:n_valid], color="orange") +plt.show() diff --git a/Archis Applications/Exercices/TP2/tp_2.pdf b/Archis Applications/Exercices/TP2/tp_2.pdf new file mode 100644 index 0000000..84edeb0 Binary files /dev/null and b/Archis Applications/Exercices/TP2/tp_2.pdf differ diff --git a/Archis Applications/Introduction au Machine Learning.md b/Archis Applications/Introduction au Machine Learning.md index 3a039cd..bebc8ca 100644 --- a/Archis Applications/Introduction au Machine Learning.md +++ b/Archis Applications/Introduction au Machine Learning.md @@ -145,7 +145,7 @@ Comme dans la réalité on a beaucoup trop de données pour s'appuyer sur toutes ### Example -Nous sommes de charmants vendeurs de gales ambulants. On se pose la question : "Quand température est de $x$, combien je vais vendre de glace ?". On s'appuie sur plusieurs expériences de cas réels où on a vendu $y$ glaces alors que la température était de $x$. +We are charming travelling ice-cream vendors. We ask ourselves: "When the temperature is $x$, how many ice creams will I sell?" We rely on several real-world observations where we sold $y$ ice creams while the temperature was $x$. We solve the problem by assuming that we will sell $wx + b$ ice creams, where $x$ is the temperature. If $x = 0$, we will sell $b$ ice creams. $b$ is called the intercept, and $w$ the slope. @@ -160,4 +160,49 @@ Le chapeau indique la prédiction. Pour trouver le coût moyen (de tous les poin At the end of the exercises we look at a graph comparing three methods: closed-form, gradient descent and stochastic gradient descent. We notice that the three are very close. -On voit ensuite comment faire une descente de gradient en 3 lignes de code : scikit-learn \ No newline at end of file +We then see how to do a gradient descent in 3 lines of code: scikit-learn + +### Multivariate linear regression + +Same principle as simple linear regression, but with *p* variables, so the inputs and weights are handled as vectors. 
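A minimal sketch of the multivariate case, assuming synthetic data (the names `fit_linear`, `w_true` and `b_true` are hypothetical and only serve the illustration). With *p* variables the prediction becomes ŷ = X·w + b, where w is a vector of *p* weights; here the least-squares solution is computed with NumPy rather than scikit-learn.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit of y ≈ X @ w + b; returns (w, b)."""
    # Append a column of ones so the intercept b is learned like any other weight
    X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
    return coef[:-1], coef[-1]

# Hypothetical noise-free data with p = 3 variables
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([2.0, -1.0, 0.5])
b_true = 4.0
y = X @ w_true + b_true

w, b = fit_linear(X, y)
print(w, b)
```

Because the data is generated without noise, the fitted `w` and `b` should match `w_true` and `b_true` up to floating-point error; on real data they would only be the best linear approximation.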
+ +## Overfitting + +The core problem in ML is generalization. We humans can generalize and extrapolate quickly (after seeing two cats, we can quickly tell whether the next thing we see is a cat or not). ML needs far more data. + +To summarize generalization: + +We split the dataset in 3: training, validation and test data (something like 70%/15%/15%). We train the model only on the training data. Once it has learned enough, we check its predictions on the validation set (we give it x and ask for ŷ; we know y, so we can compare). We then rework the algorithm on the training data and validate again. After a while, we move on to the test: the model computes all the ŷ and we look at the overall success rate, without having access to the individual ŷ. + +The goal is to avoid both underfitting and overfitting. Overfitting means sticking too closely to the data we have; underfitting means staying too far from it. So the solution consists in alternating between the two to get closer to an optimal model: a "good fit". + +To do that, we start from complex functions and try to simplify them. + +### Tips and tricks + +Regularization: in a wx + b model, we try to keep w small. There is L2 regularization and L1 regularization. The L2 penalty is smooth and strictly convex. The L1 penalty is also convex, but it is not differentiable at zero, which tends to push some weights to exactly zero. So L1 is mostly useful for variable selection. + +If we add both L1 and L2, we get the "Elastic Net". Whether we add L1, L2 or L1+L2 to the cost function, we get different results. The regularization hyperparameter C must also be tuned so that it is neither too small nor too large. + +#### Numerical and Categorical variable + +The idea is to group the values of a variable, or to change its type by assigning numbers to it. 
On a continuous variable, for example (from 0 to n), we can decide that from x to x' the value is encoded as [1,0,0], from x'' to x''' as [0,1,0] and everything else as [0,0,1]. + +#### Normalization, standardization + +The idea is to avoid giving more importance to a parameter whose range is wider than the others (e.g. age from 1 to 100 would weigh more than height from 0 to 2): we bring everything back to "from 0 to 1". + +Normalization formula for the value x of feature j: $\bar x^j = \frac{x^j - min^j}{max^j - min^j}$ + +The Z-score normalizes with the mean and the standard deviation instead of the min and max, which is more robust: $\hat x^j = \frac{x^j - \mu^j}{\sigma^j}$ + +#### Target transformation + +The idea, for variables that are hard to explain, is to end up with data that is "nicer to look at" and easier for us to exploit. In the example in the PDF, we simply applied the log function to the data. + +#### Interaction between variables + +Sometimes it is not enough to take the parameters on their own; we also need to look at how they relate to each other: this is called an interaction. An interaction adds a term (parameter) that is the product of the two parameters. + +$y = b + w_1 \cdot age + w_2 \cdot height + w_3 \cdot age \cdot height$ +
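The interaction formula can be sketched as follows, assuming synthetic data for age and height (`taille`); the helper `add_interaction` and the weights `w_true`, `b_true` are hypothetical names chosen for the illustration. The product column is simply added as a third feature, after which the model is an ordinary linear regression.

```python
import numpy as np

def add_interaction(age, height):
    """Build the feature matrix [age, height, age * height]."""
    return np.column_stack([age, height, age * height])

# Hypothetical noise-free data generated from known weights
rng = np.random.default_rng(0)
age = rng.uniform(1, 100, size=200)
height = rng.uniform(0.5, 2.0, size=200)
b_true = 1.0
w_true = np.array([0.3, 2.0, -0.05])  # w_1, w_2, w_3
features = add_interaction(age, height)
y = b_true + features @ w_true

# Least-squares fit, with an extra column of ones for the intercept b
design = np.column_stack([features, np.ones_like(age)])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
w, b = coef[:3], coef[3]
print(w, b)
```

The third weight, `w[2]`, is the one attached to the product term: it measures how much the effect of age on y changes with height.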