From 0b5ed62a5c968fa13b9f4028955f39bc2a7b30b9 Mon Sep 17 00:00:00 2001
From: Produpro <136916627+Produpro@users.noreply.github.com>
Date: Sun, 15 Oct 2023 16:15:56 +0000
Subject: [PATCH] solved

---
 notebook/problems.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/notebook/problems.ipynb b/notebook/problems.ipynb
index 94a8dd29..e8a5ce4f 100644
--- a/notebook/problems.ipynb
+++ b/notebook/problems.ipynb
@@ -1 +1 @@
-{"cells":[{"cell_type":"markdown","id":"5dbe7b9e","metadata":{},"source":["# Calculus and Algebra problems"]},{"cell_type":"markdown","id":"519c4b12","metadata":{},"source":["## Calculus\n","\n","Calculus is not obscure. It is the language for modeling behaviors. Calculus enables us to find the rate of changes in order to optimise a function. Without calculus, we would not be able to fully understand techniques such as:\n","\n","Backpropagation in neural networks\n","\n","Regression using optimal least square\n","\n","Expectation maximization in fitting probability models"]},{"cell_type":"markdown","id":"b7e2e87a","metadata":{},"source":["### Exercise 1\n","\n","Let’s say in my office, it takes me 10 seconds (time) to travel 25 meters (distance) to the coffee machine.\n","If we want to express the above situation as a function, then it would be:\n","\n","distance = speed * time\n","\n","So for this case, speed is the first derivative of the distance function above. 
As speed describes the rate of change of distance over time, when people say taking the first derivative of a certain function, they mean finding out the rate of change of a function.\n","\n","**Find the speed and build the linear function on distance $(d)$ over time $(t)$, when $(t ∈ [0,10])$.**"]},{"cell_type":"code","execution_count":null,"id":"bb3e954e","metadata":{},"outputs":[],"source":["#import libraries\n","\n","\n","#define the distance function"]},{"cell_type":"code","execution_count":null,"id":"dbc4c780","metadata":{},"outputs":[],"source":["#plot the distance function on domain(t)"]},{"cell_type":"code","execution_count":null,"id":"4c4d4f20","metadata":{},"outputs":[],"source":["# create a dataframe"]},{"cell_type":"markdown","id":"1144168d","metadata":{},"source":["### Exercise 2\n","\n","It turned out that I wasn't walking a constant speed towards getting my coffee but I was accelerating (my speed increased over time). If initial speed = 0, it still took me 10 seconds to travel from my seat to my coffee but I was walking faster and faster.\n","\n","$V_o$ = initial speed = $0$\n","\n","t = time\n","\n","a = acceleration\n","\n","**distance** = $V_o * t + 0.5 * a * (t^2)$\n","\n","**speed** = $V_o + a * t$\n","\n","The first derivative of the speed function is acceleration. I realise that the speed function is closely related to the distance function.\n","\n","**Find the acceleration value and build the quadratic function $(t ∈ [0,10])$. 
Also create a graph and a table.**"]},{"cell_type":"code","execution_count":null,"id":"ec1f8bd7","metadata":{},"outputs":[],"source":["#define and plot the quadratic funtion"]},{"cell_type":"code","execution_count":null,"id":"ba5c497b","metadata":{},"outputs":[],"source":["#create a dataframe"]},{"cell_type":"markdown","id":"66d4cc18","metadata":{},"source":["Before exercise 3, we'll make a brief introduction to Gradient Descent algorithm, which will have a larger explanation in future modules of the bootcamp.\n","\n","Gradient Descent algorithm is the hero behind the family of deep learning algorithms. When an algorithm in this family runs, it tries to minimize the error between the training input and predicted output. This minimization is done by optimization algorithms and gradient descent is the most popular one.\n","\n","Let’s say you have these input & output pairs:\n","\n","```py\n","input:\n","[\n"," [1,2],\n"," [3,4]\n","]\n","\n","output:\n","\n","[\n"," [50],\n"," [110]\n","]\n","```\n","We can estimate that if we multiply the input values by [10, 20], we can have the output as shown above.\n","```py\n","1(10) + 2(20) = 50\n","\n","3(10) + 4(20) = 110\n","```\n","When a machine learning algorithm starts running, it assigns random values and makes a prediction. \n","Let’s say it assigned [1,2] values:\n","```py\n","1(1) + 2(2) = 5\n","\n","3(1) + 4(2) = 11\n","```\n","Once it has the predictions, it calculates the error: the difference between the real data and the predicted data. There are many ways to calculate the error and they are called loss functions. \n","Once we have this value, the optimization algorithm starts showing itself and it sets new values which replace the initial random values. \n","\n","And, the loop continues until a condition is met. 
That condition can be to loop n times, or to loop until error is smaller than a value."]},{"cell_type":"markdown","id":"85ef2f0b","metadata":{},"source":["It can be hard to understand **gradient descent** without understanding **gradient**. So, let’s focus on what gradient is. The gradient shows the direction of the greatest change of a scalar function. The gradient calculation is done with derivatives, so let’s start with a simple example. To calculate the gradient, we just need to remember some linear algebra calculations from high school because we need to calculate derivatives.\n","\n","Let’s say we want to find the minimum point of $f(x) = x^2$. The derivative of that function is $df(x)=2^x$. \n","\n","The gradient of $f(x)$ at point $x=-10$\n","\n","is \n","\n","$df(-10)=-20$.\n","\n","The gradient of $f(x)$ at point $x=1$\n","\n","is \n","\n","$df(1)=2$.\n","\n","Now let’s visualize $f(x)$ and those $x=-10$ and $x=1$ points."]},{"cell_type":"code","execution_count":22,"id":"4ff7e11a","metadata":{},"outputs":[],"source":["import numpy as np\n","import seaborn as sns\n","\n","def f(x):\n"," return x**2\n","\n","def df(x):\n"," return 2*x\n","\n","def visualize(f, x=None):\n"," \n"," xArray = np.linspace(-10, 10, 100) \n"," yArray = f(xArray)\n"," sns.lineplot(x=xArray, y=yArray)\n"," \n"," if x is not None:\n"," assert type(x) in [np.ndarray, list] # x should be numpy array or list\n"," if type(x) is list: # if it is a list, convert to numpy array\n"," x = np.array(x)\n","\n"," \n"," y = f(x)\n"," sns.scatterplot(x=x, y=y, color='red')"]},{"cell_type":"code","execution_count":23,"id":"633a54fd","metadata":{},"outputs":[{"data":{"image/png":"","text/plain":["
"]},"metadata":{"needs_background":"light"},"output_type":"display_data"}],"source":["visualize(f, x=[-10, 1])"]},{"cell_type":"markdown","id":"9c187ad7","metadata":{},"source":["The red dot at x=-10 does not know the surface it stands on and it only knows the coordinates of where it stands and that the gradient of itself which is -20. And the other red dot at x=1 does not know the surface it stands on and it only knows the coordinates of where it stands and that the gradient of itself which is 2.\n","\n","By having only this information: we can say that the red dot at x=-10 should make a bigger jump than x=1 because it has a bigger absolute gradient value. The sign shows the direction. - shows that the red dot at x=-10 should move to the right and the other one should move to the left.\n","\n","In summary; the red dot at x=-10 (gradient: -20) should make a bigger jump to the right and the red dot at x=1 (gradient: 2) should make a smaller jump to the left. \n","\n","We know that the jump length should be proportional to the gradient, but what is that value exactly? We don’t know. So, let’s just say that red points should move with the length of alpha*gradient where alpha is just a parameter.\n","\n","We can say that the new location of the red dot should be calculated with the following formula:\n","\n","x = x - gradient * alpha"]},{"cell_type":"markdown","id":"0a7f5c3f","metadata":{},"source":["Now let's implement this with **Numpy**. Let’s start with visualizing $f(x)=x^2$ function and $x=-10$ point."]},{"cell_type":"code","execution_count":24,"id":"e26dbdf0","metadata":{},"outputs":[{"data":{"image/png":"","text/plain":["
"]},"metadata":{"needs_background":"light"},"output_type":"display_data"}],"source":["visualize(f, x=[-10])"]},{"cell_type":"markdown","id":"6e752e19","metadata":{},"source":["The following code implements the whole logic explained before:"]},{"cell_type":"code","execution_count":25,"id":"2bdd54f1","metadata":{},"outputs":[],"source":["def gradient_descent(x, nsteps=1):\n"," \n"," \n"," #collectXs is an array to store how x changed in each iteration, \n"," #so we can visualize it later\n"," \n"," collectXs = [x]\n"," \n"," #learning_rate is the value that we mentioned as alpha in previous section\n"," \n"," learning_rate = 1e-01\n"," \n"," for _ in range(nsteps):\n"," \n"," #the following one line does the real magic\n"," #the next value of x is calculated by subtracting the gradient*learning_rate by itself\n"," #the intuation behind this line is in the previous section\n"," \n"," x -= df(x) * learning_rate \n"," collectXs.append(x)\n"," \n"," #we return a tuple that contains\n"," #x -> recent x after nsteps \n"," #collectXs -> all the x values that was calculated so far\n"," \n"," return x, collectXs"]},{"cell_type":"markdown","id":"aea74a65","metadata":{},"source":["Before running gradient descent with 1000 steps, let’s just run it twice, one step at a time to see how x evolves. \n","We start with x=-10 and it evolves to x=-8. We know that when x=0 that is the **minimum point**, so, yes it is evolving in the correct direction."]},{"cell_type":"code","execution_count":26,"id":"0350981e","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["-8.0\n"]}],"source":["x=-10\n","x, collectedXs = gradient_descent(x, nsteps=1)\n","print(x)"]},{"cell_type":"code","execution_count":27,"id":"f8e01e2d","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["-6.4\n"]}],"source":["#The next step will start at x=-8. 
Let's run gradient for 1 step\n","\n","x, collectedXs = gradient_descent(x, nsteps=1)\n","print(x)"]},{"cell_type":"markdown","id":"93f13b32","metadata":{},"source":["It goes to x=-6.4. Excelent. Now let's run it 1000 times"]},{"cell_type":"code","execution_count":28,"id":"b699d1fb","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["-7.873484301831169e-97\n"]}],"source":["x, collectedXs = gradient_descent(x, nsteps=1000)\n","print(x)"]},{"cell_type":"code","execution_count":29,"id":"0b76ee22","metadata":{},"outputs":[{"data":{"image/png":"","text/plain":["
"]},"metadata":{"needs_background":"light"},"output_type":"display_data"}],"source":["visualize(f, x=collectedXs)"]},{"cell_type":"markdown","id":"d00d2fbb","metadata":{},"source":["### Exercise 3\n","\n","When I arrive to the coffee machine, I hear my colleague talking about the per-unit costs of producing 'product B' for the company. As the company produces more units, the per-unit costs continue to decrease until a point where it starts to increase.\n","\n","To optimise the per-unit production cost at its minimal to optimise efficiency, the company would need to find the number of units to be produced where the per-unit production costs begin to change from decreasing to increasing.\n","\n","**Build a quadratic function $f(x)=0.1(x)^2−9x +4500$ on $x∈[0,100]$ to create the per-unit cost function, and make a conclusion.**"]},{"cell_type":"code","execution_count":null,"id":"7c67d8b7","metadata":{},"outputs":[],"source":["#define and plot the funtion"]},{"cell_type":"markdown","id":"fbe54895","metadata":{},"source":["We saw with Gradient Descent how the red dot navigates in an environment it does not know about. It only knows the coordinates of where it is and its gradient. 
The red dot could find the minimum point by using only this knowledge and gradient descent algorithm.\n","\n","**Optional:**\n","\n","Implement all the previous steps to create a gradient descent algorithm to see how the per-unit cost evolves, with a starting point of 0 units of production."]},{"cell_type":"markdown","id":"aabad82c","metadata":{},"source":["## Linear Algebra"]},{"cell_type":"markdown","id":"6753636d","metadata":{},"source":["### Exercise 1 : Sum of two matrices\n","\n","Suppose we have two matrices A and B.\n","```py\n","A = [[1,2],[3,4]]\n","B = [[4,5],[6,7]]\n","\n","then we get\n","A+B = [[5,7],[9,11]]\n","A-B = [[-3,-3],[-3,-3]]\n","```\n","\n","Make the sum of two matrices using Python with Numpy"]},{"cell_type":"code","execution_count":null,"id":"9e200c32","metadata":{},"outputs":[],"source":["# importing numpy as np\n","\n"," \n"," \n","# creating first matrix\n","\n"," \n","# creating second matrix\n","\n"," \n","#print elements\n","\n","\n"," \n","# adding two matrix\n"]},{"cell_type":"markdown","id":"93bfb6cc","metadata":{},"source":["### Exercise 2: Sum of two lists\n","\n","There will be many situations in which we'll have to find index wise summation of two different lists. This can have possible applications in day-to-day programming. 
In this exercise we will solve the same problem with various ways in which this task can be performed.\n","\n","We have the following two lists:\n","```py\n","list1 = [2, 5, 4, 7, 3]\n","list2 = [1, 4, 6, 9, 10]\n","```\n","\n","Now let's use Python code to demonstrate addition of two lists.\n"]},{"cell_type":"code","execution_count":null,"id":"867b70fc","metadata":{},"outputs":[],"source":["# Naive method\n","\n","# initializing lists\n","list1 = [2, 5, 4, 7, 3]\n","list2 = [1, 4, 6, 9, 10]\n"," \n","# printing original lists\n","print (\"Original list 1 : \" + str(list1))\n","print (\"Original list 2 : \" + str(list2))\n"," \n","# using naive method to \n","# add two list \n","res_list = []\n","for i in range(0, len(list1)):\n"," res_list.append(list1[i] + list2[i])\n"," \n","# printing resultant list \n","print (\"Resultant list is : \" + str(res_list))"]},{"cell_type":"markdown","id":"7a063d7f","metadata":{},"source":["Now use the following three different methods to make the same calculation: sum of two lists"]},{"cell_type":"code","execution_count":null,"id":"681930a3","metadata":{},"outputs":[],"source":["# Use list comprehensions to perform addition of the two lists:\n","\n","# initializing lists\n","\n"," \n","# printing original lists\n","\n"," \n","# using list comprehension to add two list \n","\n"," \n","# printing resultant list \n"]},{"cell_type":"code","execution_count":null,"id":"a3a8a425","metadata":{},"outputs":[],"source":["# Use map() + add():\n","\n","# initializing lists\n","\n"," \n","# printing original lists\n","\n"," \n","# using map() + add() to add two list \n","\n"," \n","# printing resultant list "]},{"cell_type":"code","execution_count":null,"id":"1708d7ee","metadata":{},"outputs":[],"source":["# Use zip() + sum():\n","\n","# initializing lists\n","\n"," \n","# printing original lists\n","\n"," \n","# Using zip() + sum() to add two list \n","\n"," \n","# printing resultant list 
"]},{"cell_type":"markdown","id":"1aef1bd2","metadata":{},"source":["### Exercise 3 : Dot multiplication\n","\n","We have two matrices:\n","```py\n","matrix1 = [[1,7,3],\n"," [ 4,5,2],\n"," [ 3,6,1]]\n","matrix2 = [[5,4,1],\n"," [ 1,2,3],\n"," [ 4,5,2]]\n","```\n","\n","A simple technique but expensive method for larger input datasets is using for loops. In this exercise we will first use nested for loops to iterate through each row and column of the matrices, and then we will perform the same multiplication using Numpy."]},{"cell_type":"code","execution_count":null,"id":"840e7d0e","metadata":{},"outputs":[],"source":["#Using a for loop input two matrices of size n x m\n","matrix1 = [[1,7,3],\n"," [ 4,5,2],\n"," [ 3,6,1]]\n","matrix2 = [[5,4,1],\n"," [ 1,2,3],\n"," [ 4,5,2]]\n"," \n","res = [[0 for x in range(3)] for y in range(3)]\n"," \n","# explicit for loops\n","for i in range(len(matrix1)):\n"," for j in range(len(matrix2[0])):\n"," for k in range(len(matrix2)):\n"," \n"," # resulted matrix\n"," res[i][j] += matrix1[i][k] * matrix2[k][j]\n"," \n","print (res)"]},{"cell_type":"code","execution_count":null,"id":"db6c3355","metadata":{},"outputs":[],"source":["# Import libraries\n","\n"," \n","# input two matrices\n","\n"," \n","# This will return dot product\n","\n"," \n","# print resulted matrix\n"]},{"cell_type":"markdown","id":"785f6c30","metadata":{},"source":["\n","https://www.youtube.com/channel/UCXq-PLvYAX-EufF5RAPihVg\n","\n","https://www.geeksforgeeks.org/\n","\n","https://medium.com/@seehleung/basic-calculus-explained-for-machine-learning-c7f642e7ced3\n","\n","https://blog.demir.io/understanding-gradient-descent-266fc3dcf02f"]}],"metadata":{"interpreter":{"hash":"d3463682613d55fcbb64853e38cc3520a7f67bdf8d6940e781ddcdc423122719"},"kernelspec":{"display_name":"Python 3.9.12 
('calculus-project')","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.9.12"}},"nbformat":4,"nbformat_minor":5}
+{"cells":[{"cell_type":"markdown","id":"5dbe7b9e","metadata":{},"source":["# Calculus and Algebra problems"]},{"cell_type":"markdown","id":"519c4b12","metadata":{},"source":["## Calculus\n","\n","Calculus is not obscure. It is the language for modeling behaviors. Calculus enables us to find the rate of changes in order to optimise a function. Without calculus, we would not be able to fully understand techniques such as:\n","\n","Backpropagation in neural networks\n","\n","Regression using optimal least square\n","\n","Expectation maximization in fitting probability models"]},{"cell_type":"markdown","id":"b7e2e87a","metadata":{},"source":["### Exercise 1\n","\n","Let’s say in my office, it takes me 10 seconds (time) to travel 25 meters (distance) to the coffee machine.\n","If we want to express the above situation as a function, then it would be:\n","\n","distance = speed * time\n","\n","So for this case, speed is the first derivative of the distance function above. 
As speed describes the rate of change of distance over time, when people say taking the first derivative of a certain function, they mean finding out the rate of change of a function.\n","\n","**Find the speed and build the linear function on distance $(d)$ over time $(t)$, when $(t ∈ [0,10])$.**"]},{"cell_type":"code","execution_count":1,"id":"bb3e954e","metadata":{},"outputs":[],"source":["#import libraries\n","import numpy as np\n","import matplotlib.pyplot as plt\n","import pandas as pd\n","#define the distance function\n","# speed = distance / time = 25 m / 10 s = 2.5 m/s\n","\n","x = np.linspace(0,10)\n","def f(x): return 2.5*x"]},{"cell_type":"code","execution_count":2,"id":"dbc4c780","metadata":{},"outputs":[{"data":{"text/plain":["[]"]},"execution_count":2,"metadata":{},"output_type":"execute_result"},{"data":{"image/png":"","text/plain":["
"]},"metadata":{},"output_type":"display_data"}],"source":["#plot the distance function on domain(t)\n","plt.plot(x, f(x), label='f')\n"]},{"cell_type":"code","execution_count":3,"id":"4c4d4f20","metadata":{},"outputs":[{"data":{"text/html":["
<table>\n","<tr><th></th><th>x</th><th>f(x)</th></tr>\n","<tr><th>0</th><td>0.000000</td><td>0.000000</td></tr>\n","<tr><th>1</th><td>0.204082</td><td>0.510204</td></tr>\n","<tr><th>2</th><td>0.408163</td><td>1.020408</td></tr>\n","<tr><th>3</th><td>0.612245</td><td>1.530612</td></tr>\n","<tr><th>4</th><td>0.816327</td><td>2.040816</td></tr>\n","</table>
"],"text/plain":[" x f(x)\n","0 0.000000 0.000000\n","1 0.204082 0.510204\n","2 0.408163 1.020408\n","3 0.612245 1.530612\n","4 0.816327 2.040816"]},"execution_count":3,"metadata":{},"output_type":"execute_result"}],"source":["# create a dataframe\n","df = pd.DataFrame({'x': x,'f(x)': f(x)})\n","df.head()\n"]},{"cell_type":"markdown","id":"1144168d","metadata":{},"source":["### Exercise 2\n","\n","It turned out that I wasn't walking a constant speed towards getting my coffee but I was accelerating (my speed increased over time). If initial speed = 0, it still took me 10 seconds to travel from my seat to my coffee but I was walking faster and faster.\n","\n","$V_o$ = initial speed = $0$\n","\n","t = time\n","\n","a = acceleration\n","\n","**distance** = $V_o * t + 0.5 * a * (t^2)$\n","\n","**speed** = $V_o + a * t$\n","\n","The first derivative of the speed function is acceleration. I realise that the speed function is closely related to the distance function.\n","\n","**Find the acceleration value and build the quadratic function $(t ∈ [0,10])$. Also create a graph and a table.**"]},{"cell_type":"code","execution_count":4,"id":"ec1f8bd7","metadata":{},"outputs":[{"data":{"text/plain":["[]"]},"execution_count":4,"metadata":{},"output_type":"execute_result"},{"data":{"image/png":"","text/plain":["
"]},"metadata":{},"output_type":"display_data"}],"source":["#define and plot the quadratic funtion\n","import numpy as np\n","import matplotlib.pyplot as plt\n","import pandas as pd\n","\n","\n","def f(x): return 0.5*0.5*(x**2)\n","x = np.linspace(0,10)\n","plt.plot(x, f(x))\n","\n"]},{"cell_type":"code","execution_count":5,"id":"ba5c497b","metadata":{},"outputs":[{"data":{"text/html":["
<table>\n","<tr><th></th><th>x</th><th>f(x)</th></tr>\n","<tr><th>0</th><td>0.000000</td><td>0.000000</td></tr>\n","<tr><th>1</th><td>0.204082</td><td>0.010412</td></tr>\n","<tr><th>2</th><td>0.408163</td><td>0.041649</td></tr>\n","<tr><th>3</th><td>0.612245</td><td>0.093711</td></tr>\n","<tr><th>4</th><td>0.816327</td><td>0.166597</td></tr>\n","</table>
"],"text/plain":[" x f(x)\n","0 0.000000 0.000000\n","1 0.204082 0.010412\n","2 0.408163 0.041649\n","3 0.612245 0.093711\n","4 0.816327 0.166597"]},"execution_count":5,"metadata":{},"output_type":"execute_result"}],"source":["#create a dataframe\n","df2 = pd.DataFrame({'x':x, 'f(x)': f(x)})\n","df2.head()"]},{"cell_type":"markdown","id":"66d4cc18","metadata":{},"source":["Before exercise 3, we'll make a brief introduction to Gradient Descent algorithm, which will have a larger explanation in future modules of the bootcamp.\n","\n","Gradient Descent algorithm is the hero behind the family of deep learning algorithms. When an algorithm in this family runs, it tries to minimize the error between the training input and predicted output. This minimization is done by optimization algorithms and gradient descent is the most popular one.\n","\n","Let’s say you have these input & output pairs:\n","\n","```py\n","input:\n","[\n"," [1,2],\n"," [3,4]\n","]\n","\n","output:\n","\n","[\n"," [50],\n"," [110]\n","]\n","```\n","We can estimate that if we multiply the input values by [10, 20], we can have the output as shown above.\n","```py\n","1(10) + 2(20) = 50\n","\n","3(10) + 4(20) = 110\n","```\n","When a machine learning algorithm starts running, it assigns random values and makes a prediction. \n","Let’s say it assigned [1,2] values:\n","```py\n","1(1) + 2(2) = 5\n","\n","3(1) + 4(2) = 11\n","```\n","Once it has the predictions, it calculates the error: the difference between the real data and the predicted data. There are many ways to calculate the error and they are called loss functions. \n","Once we have this value, the optimization algorithm starts showing itself and it sets new values which replace the initial random values. \n","\n","And, the loop continues until a condition is met. 
That condition can be to loop n times, or to loop until the error is smaller than a threshold value."]},{"cell_type":"markdown","id":"85ef2f0b","metadata":{},"source":["It can be hard to understand **gradient descent** without understanding **gradient**. So, let’s focus on what gradient is. The gradient shows the direction of the greatest change of a scalar function. The gradient calculation is done with derivatives, so let’s start with a simple example. To calculate the gradient, we just need to remember some calculus from high school because we need to calculate derivatives.\n","\n","Let’s say we want to find the minimum point of $f(x) = x^2$. The derivative of that function is $df(x)=2x$. \n","\n","The gradient of $f(x)$ at point $x=-10$\n","\n","is \n","\n","$df(-10)=-20$.\n","\n","The gradient of $f(x)$ at point $x=1$\n","\n","is \n","\n","$df(1)=2$.\n","\n","Now let’s visualize $f(x)$ and those $x=-10$ and $x=1$ points."]},{"cell_type":"code","execution_count":4,"id":"4ff7e11a","metadata":{},"outputs":[],"source":["import numpy as np\n","import seaborn as sns\n","\n","def f(x):\n","    return x**2\n","\n","def df(x):\n","    return 2*x\n","\n","def visualize(f, x=None):\n","    \n","    xArray = np.linspace(-10, 10, 100) \n","    yArray = f(xArray)\n","    sns.lineplot(x=xArray, y=yArray)\n","    \n","    if x is not None:\n","        assert type(x) in [np.ndarray, list] # x should be numpy array or list\n","        if type(x) is list: # if it is a list, convert to numpy array\n","            x = np.array(x)\n","\n","        \n","        y = f(x)\n","        sns.scatterplot(x=x, y=y, color='red')"]},{"cell_type":"code","execution_count":5,"id":"633a54fd","metadata":{},"outputs":[{"name":"stderr","output_type":"stream","text":["/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. 
Use isinstance(dtype, CategoricalDtype) instead\n"," if pd.api.types.is_categorical_dtype(vector):\n","/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead\n"," if pd.api.types.is_categorical_dtype(vector):\n","/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1119: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.\n"," with pd.option_context('mode.use_inf_as_na', True):\n","/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1119: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.\n"," with pd.option_context('mode.use_inf_as_na', True):\n","/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead\n"," if pd.api.types.is_categorical_dtype(vector):\n","/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead\n"," if pd.api.types.is_categorical_dtype(vector):\n"]},{"data":{"image/png":"","text/plain":["
"]},"metadata":{},"output_type":"display_data"}],"source":["visualize(f, x=[-10, 1])"]},{"cell_type":"markdown","id":"9c187ad7","metadata":{},"source":["The red dot at x=-10 does not know the surface it stands on and it only knows the coordinates of where it stands and that the gradient of itself which is -20. And the other red dot at x=1 does not know the surface it stands on and it only knows the coordinates of where it stands and that the gradient of itself which is 2.\n","\n","By having only this information: we can say that the red dot at x=-10 should make a bigger jump than x=1 because it has a bigger absolute gradient value. The sign shows the direction. - shows that the red dot at x=-10 should move to the right and the other one should move to the left.\n","\n","In summary; the red dot at x=-10 (gradient: -20) should make a bigger jump to the right and the red dot at x=1 (gradient: 2) should make a smaller jump to the left. \n","\n","We know that the jump length should be proportional to the gradient, but what is that value exactly? We don’t know. So, let’s just say that red points should move with the length of alpha*gradient where alpha is just a parameter.\n","\n","We can say that the new location of the red dot should be calculated with the following formula:\n","\n","x = x - gradient * alpha"]},{"cell_type":"markdown","id":"0a7f5c3f","metadata":{},"source":["Now let's implement this with **Numpy**. Let’s start with visualizing $f(x)=x^2$ function and $x=-10$ point."]},{"cell_type":"code","execution_count":6,"id":"e26dbdf0","metadata":{},"outputs":[{"name":"stderr","output_type":"stream","text":["/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. 
Use isinstance(dtype, CategoricalDtype) instead\n"," if pd.api.types.is_categorical_dtype(vector):\n","/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead\n"," if pd.api.types.is_categorical_dtype(vector):\n","/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1119: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.\n"," with pd.option_context('mode.use_inf_as_na', True):\n","/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1119: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.\n"," with pd.option_context('mode.use_inf_as_na', True):\n","/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead\n"," if pd.api.types.is_categorical_dtype(vector):\n","/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead\n"," if pd.api.types.is_categorical_dtype(vector):\n"]},{"data":{"image/png":"","text/plain":["
"]},"metadata":{},"output_type":"display_data"}],"source":["visualize(f, x=[-10])"]},{"cell_type":"markdown","id":"6e752e19","metadata":{},"source":["The following code implements the whole logic explained before:"]},{"cell_type":"code","execution_count":7,"id":"2bdd54f1","metadata":{},"outputs":[],"source":["def gradient_descent(x, nsteps=1):\n"," \n"," \n"," #collectXs is an array to store how x changed in each iteration, \n"," #so we can visualize it later\n"," \n"," collectXs = [x]\n"," \n"," #learning_rate is the value that we mentioned as alpha in previous section\n"," \n"," learning_rate = 1e-01\n"," \n"," for _ in range(nsteps):\n"," \n"," #the following one line does the real magic\n"," #the next value of x is calculated by subtracting the gradient*learning_rate by itself\n"," #the intuation behind this line is in the previous section\n"," \n"," x -= df(x) * learning_rate \n"," collectXs.append(x)\n"," \n"," #we return a tuple that contains\n"," #x -> recent x after nsteps \n"," #collectXs -> all the x values that was calculated so far\n"," \n"," return x, collectXs"]},{"cell_type":"markdown","id":"aea74a65","metadata":{},"source":["Before running gradient descent with 1000 steps, let’s just run it twice, one step at a time to see how x evolves. \n","We start with x=-10 and it evolves to x=-8. We know that when x=0 that is the **minimum point**, so, yes it is evolving in the correct direction."]},{"cell_type":"code","execution_count":8,"id":"0350981e","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["-8.0\n"]}],"source":["x=-10\n","x, collectedXs = gradient_descent(x, nsteps=1)\n","print(x)"]},{"cell_type":"code","execution_count":9,"id":"f8e01e2d","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["-6.4\n"]}],"source":["#The next step will start at x=-8. 
Let's run gradient for 1 step\n","\n","x, collectedXs = gradient_descent(x, nsteps=1)\n","print(x)"]},{"cell_type":"markdown","id":"93f13b32","metadata":{},"source":["It goes to x=-6.4. Excellent. Now let's run it 1000 times."]},{"cell_type":"code","execution_count":10,"id":"b699d1fb","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["-7.873484301831169e-97\n"]}],"source":["x, collectedXs = gradient_descent(x, nsteps=1000)\n","print(x)"]},{"cell_type":"code","execution_count":11,"id":"0b76ee22","metadata":{},"outputs":[{"name":"stderr","output_type":"stream","text":["/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead\n"," if pd.api.types.is_categorical_dtype(vector):\n","/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead\n"," if pd.api.types.is_categorical_dtype(vector):\n","/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1119: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.\n"," with pd.option_context('mode.use_inf_as_na', True):\n","/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1119: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.\n"," with pd.option_context('mode.use_inf_as_na', True):\n","/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. 
Use isinstance(dtype, CategoricalDtype) instead\n"," if pd.api.types.is_categorical_dtype(vector):\n","/home/vscode/.local/lib/python3.11/site-packages/seaborn/_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead\n"," if pd.api.types.is_categorical_dtype(vector):\n"]},{"data":{"image/png":"","text/plain":["
"]},"metadata":{},"output_type":"display_data"}],"source":["visualize(f, x=collectedXs)"]},{"cell_type":"markdown","id":"d00d2fbb","metadata":{},"source":["### Exercise 3\n","\n","When I arrive to the coffee machine, I hear my colleague talking about the per-unit costs of producing 'product B' for the company. As the company produces more units, the per-unit costs continue to decrease until a point where it starts to increase.\n","\n","To optimise the per-unit production cost at its minimal to optimise efficiency, the company would need to find the number of units to be produced where the per-unit production costs begin to change from decreasing to increasing.\n","\n","**Build a quadratic function $f(x)=0.1(x)^2−9x +4500$ on $x∈[0,100]$ to create the per-unit cost function, and make a conclusion.**"]},{"cell_type":"code","execution_count":12,"id":"7c67d8b7","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["Punto mas bajo de la curva 4297.501041232819\n"]},{"data":{"image/png":"","text/plain":["
"]},"metadata":{},"output_type":"display_data"}],"source":["#define and plot the funtion\n","import numpy as np\n","import matplotlib.pyplot as plt\n","import pandas as pd\n","\n","x = np.linspace(0, 100)\n","def f(x): return 0.1*x**2 -9*x + 4500\n","plt.plot(x,f(x))\n","print(f'Punto mas bajo de la curva {f(x).min()}')\n","\n","\n","\n"]},{"cell_type":"markdown","id":"fbe54895","metadata":{},"source":["We saw with Gradient Descent how the red dot navigates in an environment it does not know about. It only knows the coordinates of where it is and its gradient. The red dot could find the minimum point by using only this knowledge and gradient descent algorithm.\n","\n","**Optional:**\n","\n","Implement all the previous steps to create a gradient descent algorithm to see how the per-unit cost evolves, with a starting point of 0 units of production."]},{"cell_type":"markdown","id":"aabad82c","metadata":{},"source":["## Linear Algebra"]},{"cell_type":"markdown","id":"6753636d","metadata":{},"source":["### Exercise 1 : Sum of two matrices\n","\n","Suppose we have two matrices A and B.\n","```py\n","A = [[1,2],[3,4]]\n","B = [[4,5],[6,7]]\n","\n","then we get\n","A+B = [[5,7],[9,11]]\n","A-B = [[-3,-3],[-3,-3]]\n","```\n","\n","Make the sum of two matrices using Python with Numpy"]},{"cell_type":"code","execution_count":2,"id":"9e200c32","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["[[1 2]\n"," [3 4]]\n","[[4 5]\n"," [6 7]]\n","Union de las 2 matrices\n","[[ 5 7]\n"," [ 9 11]]\n"]}],"source":["# importing numpy as np\n","import numpy as np\n"," \n"," \n","# creating first matrix\n","A = np.array([[1,2],[3,4]])\n","\n","\n"," \n","# creating second matrix\n","\n","B = np.array([[4,5],[6,7]])\n","\n"," \n","#print elements\n","print(A)\n","print(B)\n"," \n","# adding two matrix\n","print('Union de las 2 matrices')\n","print(np.add(A,B))\n"]},{"cell_type":"markdown","id":"93bfb6cc","metadata":{},"source":["### Exercise 2: Sum of two 
lists\n","\n","There are many situations in which we need the index-wise sum of two lists, and it comes up often in day-to-day programming. In this exercise we will solve the same problem in several different ways.\n","\n","We have the following two lists:\n","```py\n","list1 = [2, 5, 4, 7, 3]\n","list2 = [1, 4, 6, 9, 10]\n","```\n","\n","Now let's use Python code to demonstrate the addition of two lists.\n"]},{"cell_type":"code","execution_count":1,"id":"867b70fc","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["Original list 1 : [2, 5, 4, 7, 3]\n","Original list 2 : [1, 4, 6, 9, 10]\n","Resultant list is : [3, 9, 10, 16, 13]\n"]}],"source":["# Naive method\n","\n","# initializing lists\n","list1 = [2, 5, 4, 7, 3]\n","list2 = [1, 4, 6, 9, 10]\n","\n","# printing original lists\n","print(\"Original list 1 : \" + str(list1))\n","print(\"Original list 2 : \" + str(list2))\n","\n","# using the naive method to\n","# add the two lists index by index\n","res_list = []\n","for i in range(0, len(list1)):\n","    res_list.append(list1[i] + list2[i])\n","\n","# printing resultant list\n","print(\"Resultant list is : \" + str(res_list))"]},{"cell_type":"markdown","id":"7a063d7f","metadata":{},"source":["Now use the following three methods to perform the same calculation, the sum of two lists:"]},{"cell_type":"code","execution_count":9,"id":"681930a3","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["[3, 9, 10, 16, 13]\n"]}],"source":["# Use list comprehensions to perform addition of the two lists:\n","\n","res_list = [list1[i] + list2[i] for i in range(len(list1))]\n","print(res_list)\n"]},{"cell_type":"code","execution_count":8,"id":"a3a8a425","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["[3, 9, 10, 16, 13]\n"]}],"source":["# Use map() + add():\n","from operator import add\n","res_list = list(map(add, list1, 
list2))\n","print(res_list)\n"]},{"cell_type":"code","execution_count":10,"id":"1708d7ee","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["[3, 9, 10, 16, 13]\n"]}],"source":["# Use zip() + sum():\n","\n","res_list = [sum(i) for i in zip(list1, list2)]\n","print(res_list)\n"]},{"cell_type":"markdown","id":"1aef1bd2","metadata":{},"source":["### Exercise 3 : Dot multiplication\n","\n","We have two matrices:\n","```py\n","matrix1 = [[1,7,3],\n"," [ 4,5,2],\n"," [ 3,6,1]]\n","matrix2 = [[5,4,1],\n"," [ 1,2,3],\n"," [ 4,5,2]]\n","```\n","\n","A simple but expensive technique for larger inputs is to use for loops. In this exercise we will first use nested for loops to iterate through each row and column of the matrices, and then we will perform the same multiplication using NumPy."]},{"cell_type":"code","execution_count":15,"id":"840e7d0e","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["[[24, 33, 28], [33, 36, 23], [25, 29, 23]]\n"]}],"source":["# Multiply two 3 x 3 matrices using explicit for loops\n","matrix1 = [[1,7,3],\n","           [4,5,2],\n","           [3,6,1]]\n","matrix2 = [[5,4,1],\n","           [1,2,3],\n","           [4,5,2]]\n","\n","# result matrix, initialised with zeros\n","res = [[0 for x in range(3)] for y in range(3)]\n","\n","# explicit for loops\n","for i in range(len(matrix1)):\n","    for j in range(len(matrix2[0])):\n","        for k in range(len(matrix2)):\n","            # accumulate the dot product of row i and column j\n","            res[i][j] += matrix1[i][k] * matrix2[k][j]\n","\n","print(res)"]},{"cell_type":"code","execution_count":17,"id":"db6c3355","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["[[24 33 28]\n"," [33 36 23]\n"," [25 29 23]]\n"]}],"source":["matrix1 = [[1,7,3],\n"," [ 
4,5,2],\n"," [ 3,6,1]]\n","matrix2 = [[5,4,1],\n"," [ 1,2,3],\n"," [ 4,5,2]]\n","\n","res = np.dot(matrix1,matrix2)\n","print(res)"]},{"cell_type":"markdown","id":"785f6c30","metadata":{},"source":["\n","https://www.youtube.com/channel/UCXq-PLvYAX-EufF5RAPihVg\n","\n","https://www.geeksforgeeks.org/\n","\n","https://medium.com/@seehleung/basic-calculus-explained-for-machine-learning-c7f642e7ced3\n","\n","https://blog.demir.io/understanding-gradient-descent-266fc3dcf02f"]}],"metadata":{"interpreter":{"hash":"d3463682613d55fcbb64853e38cc3520a7f67bdf8d6940e781ddcdc423122719"},"kernelspec":{"display_name":"Python 3.9.12 ('calculus-project')","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.11.4"}},"nbformat":4,"nbformat_minor":5}