MIT Courses #2 – Calculus

Welcome back to the second entry on my path to figuring out what I missed by not taking the traditional computer science route. Looking at MIT’s requirements for their Computer Science Degree program, I’ve compiled a list of 12 courses from the departmental program and their prerequisites which I plan to go through via MIT OpenCourseWare to see what I can learn from them, and what fundamentals I’m lacking after 4 or 5 years of my own learning process. For the first course, I started with Intro to Computer Science with Python which was all content I’d come across before, though a good refresher and a good introduction to programming overall. As I said last time, I suspect that’s the only course that my knowledge will overlap with. This time, I’ll be tackling something I’ve been dreading but also curious about for a while, Calculus.

I was relieved to avoid this in college, since a theatre major didn’t require it and I thought of myself as a “creative” type who wouldn’t want any “technical” work. I was hoping that Mathematics for Computer Science, which I’m planning on doing next, would teach me any math that might be relevant to my field in a condensed form so that I could focus more on computers. Unfortunately, that course lists Single Variable Calculus as a prerequisite, so here I am, having spent the majority of the last two weeks trying to gain some understanding of Calculus so I can move forward with this project.

Between the last course and this one, I realized that some courses on MIT OpenCourseWare have a version called a Scholar’s course, which has additional material designed for independent learners who might miss things by just watching the lectures. It adds a set of recitation videos that go deeper into topics from the lectures, and the 35 lectures are split into 101 different sessions based on topics. Problem sets and exams along with detailed solutions were also included and overall this is a great resource. When something from the lecture itself was confusing, the recitation videos would often go into more detail and explain things that might not have been clear in the lecture.

Single Variable Calculus: 18.01SC – Fall 2006 by Prof. David Jerison

I made a point to go through all of the lectures and all but a few of the recitations, admittedly slacking the most with the problem sets. Initially I was tackling every problem given but after a couple of days and not much progress through the lectures along with knowing how much else I wanted to learn about, this seemed like the right time to make a trade-off. My goal is to figure out what I’m missing and to understand conceptually what the course covered and spending months going through every problem set does not seem like the most efficient use of time right now. Unlike the last course, my experience with Calculus will be a lot less than anyone actually taking the class, given that they would spend months practicing while I spent two weeks mostly watching videos and taking notes.

What I hope to gain through this experience and write up is a solid enough understanding of it to follow the upcoming courses, understand its relevance to computer science and to know where to go back to, should I misassess my knowledge. I can see that there are areas such as machine learning and data science where increased mathematical literacy could help me see the models that TensorFlow is running in a way I currently can’t. Additionally, the concept of derivatives and calculating the rate of change seems similar to the way we measure order of growth in a program, but for the most part my understanding of the connection is still foggy. This course alone likely won’t answer a lot of my questions, but hopefully it will give me the basis to understand future courses which will.

Lastly, a couple of warnings. I’ve divided the sections up by the lecture number, so some material in a section may be from the previous or following lecture, given how the OCW course was divided up. Additionally, I’m likely posting this in a form that could probably use some diagrams and a better way of representing formulas in HTML (they are lazily written in text), and it likely includes some mistakes that I may have caught if I were actually graded and had spent more time on the problem sets.

Unit 1: Differentiation

Lecture 1 – Derivatives, slope, velocity, rate of change

The first unit in Single Variable Calculus is all about differentiation and answering two questions. First, what is a derivative in its geometric and physical interpretation? Second, how can you differentiate any function? The geometric interpretation of the derivative is the problem of finding the tangent line of a function f(x) at a specific point (x₀,y₀). The formula for a straight line is y-y₀=m(x – x₀), which can be used to get the tangent line for a point on the curve if x₀ and y₀ are known, with y₀ being f(x₀). To calculate the slope of the tangent line, the equation is m=f’(x₀) or the derivative of “f” at x₀. This derivative is the slope of the tangent line to the function y=f(x) at point P (x₀, y₀).

In comparison to the tangent line which is the rate of change at a point on the curve, the secant line on a graph joins two points on a curve. The tangent line is equal to the limit of the secant line for points PQ as Q -> P, which is the fixed point and the way of getting the closest calculation for a slope at a point on a graph. If the two points are close enough, then the slope of the secant line is close to the slope of the curve at a point. The goal is to find the slope of the tangent line and curve at a point and secant lines can help accomplish this.

The slope of the secant line is the change in f over the change in x, or delta f/delta x, and the slope of the tangent line is the limit of the slopes of the secant lines, represented by m = lim Q->P delta f/delta x = lim delta x->0 delta f/delta x. The most important formula in the first lecture is the one for the slope of the tangent line and the derivative of function f at x₀, which is m = f’(x₀) = lim delta x->0 (f(x₀+delta x)-f(x₀))/delta x. f’(x₀) is the derivative of f at x₀ and the fraction at the end is the difference quotient, which measures how much the output changes relative to the change in the input.
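To see the difference quotient settle toward a slope, here is a small Python sketch. It isn’t from the course: the function f(x) = x² and the helper name difference_quotient are my own picks for illustration.

```python
def difference_quotient(f, x0, dx):
    """Slope of the secant line through (x0, f(x0)) and (x0 + dx, f(x0 + dx))."""
    return (f(x0 + dx) - f(x0)) / dx


def f(x):
    # Illustrative function chosen for this sketch: f(x) = x^2, so f'(3) = 6.
    return x * x


# Shrinking delta x moves the secant slope toward the tangent slope at x0 = 3.
for dx in (1.0, 0.1, 0.01, 0.001):
    print(dx, difference_quotient(f, 3.0, dx))
```

The printed slopes start at 7 for delta x = 1 and drift toward 6, which is the slope of the tangent line at x₀ = 3.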

Lots of examples of calculating derivatives of functions and graphing the derivatives compared to regular functions are included in the first and second sessions and related recitations, showing the need for rearranging functions so that their denominator isn’t zero and how a slope is zero where the graph is horizontal. An increasing function means a positive derivative; I guess I’ve been away from math long enough to forget about these.

Lecture 2 – Limits, continuity, trigonometric limits

Lecture 2 introduces the definition of the derivative as a rate of change at a specific point in time, instead of strictly the geometric interpretation as the slope of a tangent line. Delta y/delta x can be thought of as the average change and the notation dy/dx can be thought of as the instantaneous rate. Examples of derivatives that appear in the real world are charge and current in physics, where “q” could represent charge and dq/dt would be the current. If “T” is temperature, then dT/dx is the temperature gradient, which causes airflows and change. If “s” is distance travelled, then ds/dt is its rate of change, speed.

An example talked about is MIT’s “pumpkin drop” tradition where they drop pumpkins off a building around Halloween. The height of the pumpkin is 80-5t², with t being the time elapsed. When time is 0, the height is 80 meters and when time is 4 seconds, the pumpkin’s height is 0 and it has hit the ground. The average speed over the drop is found with the equation delta h/delta t = (0-80)/(4-0) = -20 m/sec. The instantaneous speed is the derivative, found with dh/dt = -10t, since the derivative of a constant (80) is 0 and the derivative of a variable raised to a power is found by multiplying by the current power and decreasing the exponent by 1. So, 80-5t² becomes -10t from 0-5*2t. “t” can be plugged in to find the rate of change at a certain point, with height decreasing by 40 meters per second when t is 4.
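The pumpkin numbers are easy to check in a few lines of Python. This is just my own sketch of the arithmetic above; the function names height and velocity are made up for the example.

```python
def height(t):
    """Height of the pumpkin in meters after t seconds: 80 - 5t^2."""
    return 80 - 5 * t**2


def velocity(t):
    """Derivative of the height, dh/dt = -10t, in meters per second."""
    return -10 * t


# Average rate of change over the whole drop, from t = 0 to t = 4.
average_velocity = (height(4) - height(0)) / (4 - 0)

# Instantaneous rate of change at the moment of impact.
impact_velocity = velocity(4)

print(average_velocity, impact_velocity)
```

The average comes out to -20 m/sec while the instantaneous value at t = 4 is -40 m/sec, so the pumpkin is falling twice as fast at the end as it does on average.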

Another example given is communication between GPS and a satellite to compute your distance from a point. “h” is the distance between a person and the satellite, “L” is the distance from the person to the point on the earth directly under the satellite and “s” is the distance from that point to the satellite. h is measured by radios and clocks and L is deduced from h. Delta L is estimated by delta L/delta h = dL/dh and this calculation is used in the real world for tasks such as landing airplanes.

The lecture talks about the topics of limits and continuity which will be used throughout the course. The limit as variable “x” approaches the value 0 is written as lim x -> 0. In the easiest case you can plug the value into the equation as x and simply solve, but with derivatives it’s always harder. When you take a derivative, you’re taking the limit as x goes to x₀ of the difference quotient (f(x)-f(x₀))/(x-x₀), and plugging in x = x₀ gives 0/0, which doesn’t work and requires cancellation to be able to make sense of it.

Two types of limits are right and left hand limits, denoted as x -> x₀⁺ and x -> x₀⁻, meaning that x approaches x₀ from the right and left sides respectively. If the limits from both sides exist, are the same, and equal f(x₀), then the function is continuous at x₀. However, many functions are discontinuous, with each side approaching different values or behaving differently when it gets there. Types of discontinuity include jump, removable, infinite and other discontinuities. Jump discontinuities exist when there is a limit from each side, but they aren’t equal and the function jumps when crossing the point. Removable discontinuities exist when the limits are equal but there is a hole at the point. (sin x)/x and (1-cos x)/x at x = 0 are examples of this and have limits of 1 and 0 respectively. Infinite discontinuities exist where the limits go to infinity or negative infinity. An example of this is y=1/x, where the right limit at 0 is infinity and the left is negative infinity. Other discontinuities can also exist, such as the function y=sin(1/x) as x goes to 0, which is a rapidly oscillating wave with no limit from either side. A theorem states that if a function is differentiable at x₀, then it is continuous at x₀ as well.
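A quick way to get a feel for the two removable discontinuities is to plug in smaller and smaller values of x. This is only a numeric sketch of my own (the function names are invented for it), not a proof of the limits.

```python
import math


def sin_ratio(x):
    """(sin x)/x, undefined at x = 0 itself."""
    return math.sin(x) / x


def cos_ratio(x):
    """(1 - cos x)/x, also undefined at x = 0 itself."""
    return (1 - math.cos(x)) / x


# Approach 0 from the right and from the left.
for x in (0.1, 0.01, 0.001, -0.001, -0.01, -0.1):
    print(x, sin_ratio(x), cos_ratio(x))
```

From both sides the first ratio heads toward 1 and the second toward 0, which is why filling in those values at x = 0 would “remove” the hole.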

Lecture 3 – Derivatives of products, quotients, sine, cosine

This lecture talks about different ways of calculating derivatives. Some examples are given of computing f’(x) and finding the derivative of a specific function, while other general formulas are shown to find derivatives for different relationships. Examples of these are the addition formula, (u+v)’=u’+v’, which shows that the derivative of two functions added together is equal to the result of adding the derivatives of the two functions separately. The constant multiple rule, (cu)’ = cu’, shows that taking the derivative of a constant multiplied by a function is the same as multiplying the constant by the derivative of the function. Rules like these are important for knowing how to take derivatives when multiple functions are involved or to be able to substitute in separate functions to make differentiating easier. When working with polynomials, both rules are needed.

Some rules around sine and cosine and their derivatives are that the derivative of the sine is the cosine and the derivative of the cosine is the negative of the sine. The angle sum formula, cos(a+b) = cos a cos b - sin a sin b, is a rule for taking the cosine of a sum of two values. Similarly, sin(a+b) = sin a cos b + cos a sin b. The only way to describe sine and cosine is geometrically, so the limits involved need a geometric proof. An example here shows a circle with a mostly triangular slice with one rounded edge on the edge of the circle. Theta represents the angle of the slice at the corner in the center of the circle, the arc length is the length of the curved edge (equal to theta in radians on a unit circle), and the sine of theta is the vertical height of the outer edge if the curved part were removed. As the angle gets smaller, the curved part gets closer to a straight line and to the sine of theta. The general idea here is that short pieces of curves are nearly straight, similar to processes with curved graphs. The lecture also includes a proof that the derivative of the sine of theta is theta’s cosine and a brief intro to the product and quotient rules covered in more detail in the next lecture.

Lecture 4 – Chain rule, higher derivatives

This section covers general rules for finding the derivatives of functions multiplied or divided by one another. First, the product rule is (uv)’ = u’v + uv’, stating that the derivative of functions “u” and “v” multiplied together is equal to the derivative of function “u” multiplied by function “v” plus function “u” multiplied by the derivative of function “v”. You can expand to three or more functions by multiplying each function’s derivative with the other functions’ original forms and adding the products together. The quotient rule states that (u/v)’ = (u’v - uv’)/v² as long as v isn’t zero. The derivative of constant “c” times function “u” is c * u’.

Examples of these rules being used are shown in the lecture, including one in recitation that uses the fact that the tangent of x is sine of x divided by cosine of x. The next rule shown is the chain rule, an important composition rule for finding derivatives of nested functions by separating them out and differentiating each separately. This is the most powerful technique for extending the set of functions that can be differentiated. An example is the function y = (sin t)¹⁰ or y=sin¹⁰(t). This can be split up into two functions, x=sin(t) and y=x¹⁰. The chain rule states that dy/dt = dy/dx * dx/dt, so the derivative becomes 10(sin(t))⁹ * cos(t).
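Since it’s easy to drop an exponent when applying the chain rule by hand, here is a small numeric check of my own. It compares the chain-rule answer 10(sin t)⁹ cos t against a finite-difference estimate; the helper numeric_derivative and the test point t = 1 are assumptions made just for this sketch.

```python
import math


def y(t):
    """The nested function y = (sin t)^10."""
    return math.sin(t) ** 10


def dy_dt(t):
    """Chain rule: dy/dx * dx/dt with x = sin t and y = x^10."""
    return 10 * math.sin(t) ** 9 * math.cos(t)


def numeric_derivative(f, t, h=1e-6):
    """Central difference estimate of f'(t); an approximation, not exact."""
    return (f(t + h) - f(t - h)) / (2 * h)


t = 1.0  # arbitrary test point
print(dy_dt(t), numeric_derivative(y, t))
```

The two printed values agree to many decimal places, which is a decent sanity check that the outer power dropped to 9 and the inner derivative cos(t) was included.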

This lecture also touches on taking higher derivatives beyond the first. For instance, u’’ is the derivative of u’, and u’’’ is the derivative of u’’. For the fourth and beyond, derivatives can be written as u⁽⁴⁾. One interesting example given is that the fourth derivative of sin x comes all the way back around to sin x. Its second derivative is -sin x and this pattern is shared with cosine as well. Another example shows that repeatedly differentiating xⁿ with the rule Dxⁿ=nx^(n-1) gives n factorial (n!) after n derivatives, and the derivative of any constant is zero. Another way of writing the derivative of a function shown here is Du, and a second derivative could be written D²u.

Lecture 5 – Implicit differentiation, inverses

This lecture talks about implicit differentiation, which is finding a derivative without necessarily solving the equation for y first. For instance, y = x^(m/n) can be rewritten as yⁿ=x^m, relating one variable to the other without a full calculation. Another example is y’ = -x/y for the circle x² + y² = 1, which shows what y’ is but still needs y filled in. In explicit differentiation, y’ is calculated in terms of only x, while implicit differentiation can leave both x and y in the expression for y’. The explicit solution is needed to complete a problem, but implicit differentiation can hide a degree of complexity to show a relationship, such as y’s derivative, without solving for y itself. One important use of implicit differentiation is finding derivatives of inverse functions, which basically swap the x and y axes. Implicit differentiation can show the derivative of an inverse if the derivative of the original is known, with the inverse tangent, or arctangent, shown as an example.

Lecture 6 – Exponential and log, logarithmic differentiation; hyperbolic functions

This lecture talks about exponentials and logarithms, the last standard functions that need to be connected to calculus. For exponents, a base of “a” to the power of 0 is 1, to power 1 is a, power 2 is a², power 3 is a³, etc. a^(m/n) is the nth root of a^m and a^x is defined for all values of x. For logarithms, one rule is that ln(x₁x₂) = ln x₁ + ln x₂. The natural log of 1 is 0 and the natural log of e is 1. If e^x has the point (0, 1) then the log is the reverse and has the point (1, 0). Logs and exponentials are inverses of each other and the derivative of the log can be found using implicit differentiation. To differentiate an exponential (d/dx a^x), convert to base e, a^x = e^{x ln(a)}. D a^x = D e^{x ln(a)} = (ln a)e^{x ln(a)}, or D a^x = (ln a)a^x. So to differentiate an exponential function with any base, you take the natural log of the base and multiply by the original function. D 2^x = (ln 2)2^x and D 10^x = (ln 10)10^x. Sometimes it can be easier to differentiate the natural log of a function than the original function, and examples are shown of converting to natural logs to help differentiate via the chain rule.
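The rule D a^x = (ln a)a^x can be sanity-checked numerically. This is my own sketch, with a made-up helper numeric_derivative and an arbitrary test point x = 1.5.

```python
import math


def exponential_derivative(a, x):
    """The rule from the lecture: D a^x = (ln a) * a^x."""
    return math.log(a) * a**x


def numeric_derivative(f, x, h=1e-6):
    """Central difference estimate of f'(x); an approximation, not exact."""
    return (f(x + h) - f(x - h)) / (2 * h)


x = 1.5  # arbitrary test point
for a in (2, 10, math.e):
    estimate = numeric_derivative(lambda v: a**v, x)
    print(a, exponential_derivative(a, x), estimate)
```

For base e the factor ln(e) is 1, so e^x comes out as its own derivative, while bases 2 and 10 pick up the extra ln 2 and ln 10 factors.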

Lecture 7 – Exam 1 review

This lecture has a lot of review of the material covered so far with some additional rules and guidelines given. Some more rules of logs given are that ln(MN) = ln M + ln N, as long as the base is the same for both and is positive. Similarly, ln(M/N) = ln M - ln N with the same base requirements. ln(M^k) = k ln M, not (ln M)^k, and log_b M = (ln M)/(ln b). A couple of hyperbolic trig functions given here are cosh(x) = (e^x+e^-x)/2, whose derivative is sinh(x), and sinh(x) = (e^x-e^-x)/2, whose derivative is cosh(x). The review section gives the general formulas covered so far, including implicit differentiation, the definition of the derivative and examples of trig functions, and touches on all material so far in preparation for the exam that would take place at this point. The exam appears to be assigned a lecture number in the course, so the next lecture in the series is lecture 9.

Unit 2: Applications of Differentiation

Lecture 9 – Linear and quadratic approximations

While the first unit talked about the rules of differentiating, the second focuses more on techniques for using differentiation. The first topic talked about in this section is linear approximation, which is described with the formula f(x) approximately equals f(x₀)+f’(x₀)(x-x₀). The idea is to use the derivative at x₀ to replace the function with its tangent line near x₀, which makes the function easier to estimate. Linear approximations are often taken near zero to make functions easier to work with. Examples of linear approximations near x = 0 are that sin(x)’s approximation is x, cos(x)’s is 1 and e^x’s is 1+x.

Linear approximations are useful for breaking down functions into simpler depictions. For instance, if ln(1+x) is approximately x, then ln(1.1) is approximately 0.1. The left half of an equation or f(1.1) looks difficult, but the answer is much simpler. If an approximation is good, then progress can be made toward finding a relationship or getting an answer. This process also removes quadratic and higher order terms to capture just the linear features of an equation. An example of a linear approximation in the world is GPS time dilation, where the difference between a satellite’s and a human’s watch time can affect a GPS device’s apparent position. Engineers have to account for this and offset using a calculation, t’ = t(1-v²/c²)^(-1/2), which approximately equals t(1+½*v²/c²), with delta t being t’-t. The error fraction (number of significant digits) is proportional to v²/c² with a factor of one half.
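Here is a small Python sketch of how close the ln(1+x) approximation actually is. The helper name linear_approximation and the sample x values are my own choices for illustration.

```python
import math


def linear_approximation(f_at_x0, slope_at_x0, x0, x):
    """f(x) is approximately f(x0) + f'(x0) * (x - x0) for x near x0."""
    return f_at_x0 + slope_at_x0 * (x - x0)


# For f(x) = ln(1 + x) at x0 = 0: f(0) = 0 and f'(0) = 1, so f(x) is about x.
for x in (0.5, 0.1, 0.01):
    approx = linear_approximation(0.0, 1.0, 0.0, x)
    exact = math.log(1 + x)
    print(x, approx, exact, abs(approx - exact))
```

The error shrinks quickly as x gets closer to 0, which matches the idea that the tangent line is only a good stand-in near the base point.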

Lecture 10 – Curve sketching

Similar to linear approximation, quadratic approximation adds another term to approximate equations in a quadratic form. This time, the function is f(x) approximately equals f(x₀)+f’(x₀)(x-x₀)+((f’’(x₀))/2)(x-x₀)², with “+((f’’(x₀))/2)(x-x₀)²” added to the linear approximation. These can be used when a linear approximation isn’t enough to describe a function. In economics, log linear and log quadratic functions are used, with log quadratic used for most modeling. Quadratic approximation adds nothing to the approximation of sine, and makes slight changes to the cosine and e^x approximations. Geometrically, a quadratic can show a curve better fit to the original and can potentially add needed detail for a better approximation. Some examples of quadratic approximations near x = 0 are that sin(x) is approximately x, cos(x) is approximately 1-½x², e^x is approximately 1+x+½x², ln(1+x) is approximately x-½x², and (1+x)^r is approximately 1+rx+(r(r-1)/2)x². Using the quadratic approximation formula can give all of these equations.
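To compare the quadratic approximations against the real functions, here is a short sketch of my own. The helper quadratic_approximation and the sample point x = 0.1 are assumptions for the example; the values of f, f’ and f’’ at 0 are filled in by hand.

```python
import math


def quadratic_approximation(f0, df0, d2f0, x0, x):
    """f(x0) + f'(x0)(x - x0) + (f''(x0)/2)(x - x0)^2."""
    dx = x - x0
    return f0 + df0 * dx + 0.5 * d2f0 * dx**2


x = 0.1  # sample point near the base point x0 = 0

# cos: f(0) = 1, f'(0) = 0, f''(0) = -1, giving 1 - x^2/2
cos_quad = quadratic_approximation(1.0, 0.0, -1.0, 0.0, x)
# e^x: f(0) = f'(0) = f''(0) = 1, giving 1 + x + x^2/2
exp_quad = quadratic_approximation(1.0, 1.0, 1.0, 0.0, x)
# ln(1 + x): f(0) = 0, f'(0) = 1, f''(0) = -1, giving x - x^2/2
log_quad = quadratic_approximation(0.0, 1.0, -1.0, 0.0, x)

print(cos_quad, math.cos(x))
print(exp_quad, math.exp(x))
print(log_quad, math.log(1 + x))
```

At x = 0.1 each quadratic estimate agrees with the true value to roughly three or more decimal places, noticeably better than the linear versions.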

General advice given for approximation is to try linear first and use quadratic if forced to. The higher order terms that are dropped in this case are fractions which get smaller and smaller, and throwing them away doesn’t take away too much from the understanding of the functions. An example shown in the recitation is taking a quadratic approximation of a product of two functions by taking the approximation of each individual function and multiplying them together. One of the problem sets goes into an example of finding 3rd degree and higher approximations as well, so those do exist even if they’re less common.

Lecture 11 – Max-min problems

This lecture talks more about sketching curves which is encouraged to make sure that graphing skills and common sense are in line with the solutions you’re trying to find. Generally, if a derivative of a function is greater than zero, then the original function is increasing. If the derivative is less than zero, then the original is decreasing. If a second derivative is greater than zero, then the derivative is increasing and the original function’s shape is concave up with the slope getting larger. On the other hand, if the second derivative is less than zero, the derivative is decreasing and the original function is concave down with its slope getting smaller. Turning points on a graph are where the derivatives change their sign and direction, and the value of the derivative will be zero at the turning point. For instance, if f’(x₀) equals zero, then x₀ is known as a “critical point” and f(x₀) is considered a “critical value”, whether or not the line actually changes direction at the point. It’s mentioned that on a sharp corner it won’t be called a critical point and that will be discussed later. If a derivative function can’t equal zero then it has no critical points and the original function can’t change direction.

Some general strategies for sketching graphs are to plot discontinuities (especially infinite ones), endpoints (or the behavior as x goes to infinity if the function has no end), and easy points such as axis crossings. Solve f’(x) = 0 and plot the critical points and values. Then, decide if the derivative is positive or negative between critical points and discontinuities, as directions can change at either. This helps to double check the previous steps. If you have to, check if the second derivative is positive or negative, though you’re better off if you can get away without it. If f’’(x₀) equals 0, then x₀ is known as an inflection point. Lastly, this lecture touches on finding the minimum and maximum points of a function, which are easy to see on a graph, but having to sketch a graph can be a lot of work.

Lecture 12 – Related rates

This lecture continues the problem of finding minimum and maximum points by looking only at the critical points, endpoints and points of discontinuity. As the name suggests, you’re looking for the points with the lowest or highest value. Examples of these problems are finding the minimum distance between the origin and a point on a curve or finding dimensions of a triangle to minimize area with certain constraints. Evaluating the function at the critical points and endpoints and comparing the results gives the minimum and maximum, so all of these points should be collected before deciding. Other examples are finding the dimensions of a box with the least surface area for a fixed volume and minimizing the surface area of a cylinder. More complicated problems can be solved more easily with implicit differentiation, though they would still need a thorough solution later on.

This lecture also introduces related rates with a problem showing a car and police officer on the road forming a right triangle and figuring out if the car is speeding based on the change in the car’s position on the road with respect to time using the chain rule. Here dx/dt is the change in distance over time, which is another use of a derivative. Another problem calculates how fast radius and surface area are growing as air is blown into a balloon.

Lecture 13 – Newton’s method and other applications

The first half of this lecture talks about the “ring on a string” problem, which is a min/max problem where depending on how the string is held the ring settles differently. The length of the string is the constant and the goal is to find the minimum height of the ring, which is also at the point where the function’s derivative is zero.

Afterwards, the lecture introduces Newton’s Method, which is described as one of the greatest applications of calculus, and the related formula is x_(n+1) = x_n - f(x_n)/f’(x_n). The method here is to start with a guess of what the x intercept is and use the formula to get closer and closer answers to the actual x value. It works well if the derivative isn’t small, the second derivative isn’t too big and x₀ starts nearby. The method can fail if the guess is too far away in a quadratic function as you could end up going to the wrong root. For instance, if you guess the bottom or top where the tangent slope is zero, then it won’t work. The point where the derivative is zero can mess up the method as x₁ can be undefined. A parabola can also lead to a cycle where you end up jumping between two nearby zeroes. Ultimately, it can be good for finding zeroes, but be careful when using it, because if you have no idea where the zero is, you could end up jumping around forever. An example problem shows Newton’s method never finding the exact x value where x³=0, but getting progressively closer when any non-zero value is guessed.
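Newton’s Method translates almost directly into code, so here is a minimal sketch. The function name newton, the choice to raise an error on a zero slope, and the x² - 2 example are my own assumptions; only the x³ case comes from the lecture notes above.

```python
def newton(f, df, x0, steps):
    """Apply x_(n+1) = x_n - f(x_n)/f'(x_n) and return every guess made."""
    x = x0
    guesses = [x]
    for _ in range(steps):
        slope = df(x)
        if slope == 0:
            # A horizontal tangent never crosses the x axis, so the next guess is undefined.
            raise ValueError("derivative is zero at x = %r" % x)
        x = x - f(x) / slope
        guesses.append(x)
    return guesses


# Illustrative example (my own pick): the positive root of x^2 - 2 is sqrt(2).
sqrt2_guesses = newton(lambda x: x * x - 2, lambda x: 2 * x, 1.0, 6)
print(sqrt2_guesses)

# The x^3 = 0 case: each step only multiplies the guess by 2/3.
cube_guesses = newton(lambda x: x**3, lambda x: 3 * x**2, 1.0, 10)
print(cube_guesses)
```

The first run locks onto the root within a handful of steps, while the x³ run just keeps shrinking by a factor of 2/3 and never lands exactly on 0, which matches the behavior described above.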

Lecture 14 – Mean value theorem, Inequalities

Lecture 14 introduces the mean value theorem (MVT), which states that (f(b) - f(a))/(b-a) = f’(c) for some c between a and b. This is provided that the function “f” is differentiable for values of x between a and b and continuous where x is greater than or equal to a and less than or equal to b. An example of the theorem states that if you’re travelling 3000 miles in six hours, then at some point, you’ll be going the average speed of 500 miles per hour.

Compared to linear approximation, the former says that (delta f)/(delta x) is approximately f’(a) as b approaches a, while MVT says that (delta f)/(delta x) equals f’(c), for a < c < b. MVT tells you for sure that it’s equal to a value less than a maximum value for a derivative and bigger than a minimum on the same interval. For example, the average speed has to be between the minimum and maximum speeds. A couple examples are shown of MVT being used including one showing that if two functions have the same derivative, they differ by a constant, which makes sense but I hadn’t thought of before. An anti-derivative is also mentioned here for the first time.

Lecture 15 – Differentials, antiderivatives

This is the first introduction to integration and talks more about antiderivatives or integrals. Differentials are closely related to the derivative of a function, and examples are shown of writing dy/dx = f’(x) as dy = f’(x)dx for an easier notation. The opposite of differentiating is integrating, and the integral of a function has that function as its derivative. Integrals are noted by a capital function variable, for instance G(x) is the indefinite integral and antiderivative of g(x)dx. The integral sign looks like a vertically stretched out S. Since constants disappear when taking the derivative, the constant value in an antiderivative is ambiguous. For example, the integral of sin(x)dx = -cos(x) + c. Many functions have the same derivative, so an exact function can’t be solved for. Similar to derivatives, the sum of antiderivatives and the antiderivative of a sum are the same. When taking the antiderivative of a power, the exponent goes up by 1 instead of down, and it’s easy to check your work by taking the derivative of your solution and making sure it’s the original function. Rules such as the sum and constant multiple rules apply for antiderivatives as well.

Generally, integration is much harder than differentiation and can be helped with more substitution and approximation techniques. The method of substitution is tailor made for differential notation. For this, you define a new variable as a function of x and take its differential, then substitute the new variable in, integrate, and put the original expression back at the end. Advanced guessing is another technique, using existing knowledge to guess at an answer and test it out by differentiating. Guessing and approximation are certainly common themes throughout.

Lecture 16 – Differential equations, separation of variables

This is the introduction to differential equations, which MIT also has a whole course focused around in 18.03. The simplest kind of differential equation is the standard, dy/dx = f(x) and y is the integral of f(x)dx. For this lecture, this is viewed as solved and substitution and advanced guessing are mentioned again as good techniques to find antiderivatives. Differential equations can come in other forms, a trickier example being (d/dx + x)y=0. d/dx + x is called an “annihilation operator” in quantum mechanics and is studied because it has simple solutions that can be written out. This function can be rearranged as dy/dx = -x*y and dy/y=-x*dx to organize variables. Separation of variables is important and the goal is often to get x and y to their own side.
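Carrying the separated form one step further (integrating both sides of dy/y = -x*dx gives ln|y| = -x²/2 + c), the solutions should look like y = A*e^(-x²/2) for a constant A. That last step is my own working rather than something quoted from the lecture, so here is a numeric sketch that checks it against dy/dx = -x*y. The helper names and sample points are assumptions for the example.

```python
import math


def y(x, a=1.0):
    """Candidate solution y = A * e^(-x^2 / 2), with A = a."""
    return a * math.exp(-(x**2) / 2)


def numeric_derivative(f, x, h=1e-6):
    """Central difference estimate of f'(x); an approximation, not exact."""
    return (f(x + h) - f(x - h)) / (2 * h)


# If y solves dy/dx = -x*y, then dy/dx + x*y should be (nearly) zero everywhere.
residuals = [numeric_derivative(y, x) + x * y(x) for x in (-1.0, 0.0, 0.5, 2.0)]
print(residuals)
```

All of the residuals come out within rounding error of zero, which is what you’d expect if the candidate really satisfies the equation.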

Unit 3: Intro to Integration

Lecture 18 – Definite integrals

The third unit of the course focuses mostly on integrals. The geometric point of view on integrals is that they are the area under a curve. Another concept touched on that will be described later is a cumulative sum. So far, the course has shown indefinite integrals, which are antiderivatives of a function with no limits specified. On the other hand, definite integrals specify start and end values. For instance, the integral from a to b of f(x)dx can be calculated by dividing up the area into rectangles to estimate the area under the curve and adding up the areas of these rectangles. The smaller the rectangles, the less space lost and the closer the estimated solution.

Rectangles are easy to find the area of as you just need to multiply the base by the height, making them a helpful tool for estimation. By using equal base intervals on the stretch from 0 to b, the length of the whole base (b) is divided by the number of intervals (n). The height of the ith rectangle is calculated with f(i*b/n). The equation for the sum of the areas of the rectangles is written (b/n)f(b/n) + (b/n)f(2b/n)+…+(b/n)f(nb/n). This is basically the bases times the heights. This can be stated as summation i=1 to n of (b/n)f(ib/n). The summation notation is written with a sigma with the starting value underneath and the end on top. Summations can also be subtracted and added when the functions are the same with different starts and ends.

Riemann Sums are introduced in this lecture and are named after a mathematician with the same name from the 1800s. They are a general procedure for calculating definite integrals. Using equal intervals of delta x, with n intervals total, delta x = (b-a)/n for the integral from a to b of a function. To start, pick any height of f in each interval and take the summation i=1 to n of f(c_i) delta x, which approaches the integral from a to b of f(x)dx as delta x goes to 0. c_i is the x value chosen in the ith interval, and this can be used to estimate the area under a curve by choosing a single point in each interval to calculate the height at. This is another approximation method and can be used on 3d paraboloids as well, as shown in a recitation video.
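A Riemann sum is only a loop, so here is a minimal Python sketch of it. The function name riemann_sum, the position parameter for choosing c_i inside each interval, and the x² example are my own assumptions.

```python
def riemann_sum(f, a, b, n, position=1.0):
    """Sum of f(c_i) * delta x over n equal intervals of [a, b].

    position picks c_i inside each interval: 0.0 is the left end,
    0.5 the midpoint and 1.0 the right end.
    """
    dx = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        left_edge = a + (i - 1) * dx
        c_i = left_edge + position * dx
        total += f(c_i) * dx
    return total


def square(x):
    return x * x


# Area under x^2 from 0 to 1; the exact value is 1/3.
for n in (4, 100, 10000):
    print(n, riemann_sum(square, 0.0, 1.0, n))
```

With only 4 rectangles the right-endpoint estimate overshoots noticeably, and as n grows the sum closes in on 1/3, which is the “smaller rectangles, less space lost” idea in action.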

Lecture 19 – First fundamental theorem of calculus

This lecture introduces the Fundamental Theorem of Calculus, which is described as the most important thing in the course. There are two versions, with FTC1 being used mostly throughout the class. The theorem states that if F’(x)=f(x), then the integral from a to b of f(x)dx = F(b)-F(a). F(b) - F(a) can also be stated as F(x) evaluated from a to b and is shown with a pipe with the starting value underneath and the end on top. This process will also be reversed and integrals will be used to solve problems such as finding volumes, but it’s important to keep a connection between old and new processes. Instead of computing Riemann sums, you can use an antiderivative to calculate a definite integral exactly by evaluating it at the two endpoints and subtracting.
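To see the theorem and a Riemann sum agree, here is a small sketch. The example f(x) = 3x² with antiderivative F(x) = x³ on the interval from 1 to 2, and the helper midpoint_sum, are my own choices for illustration.

```python
def midpoint_sum(f, a, b, n=1000):
    """Riemann sum using the midpoint of each of n equal intervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


def f(x):
    return 3 * x**2


def F(x):
    # An antiderivative of f, since the derivative of x^3 is 3x^2.
    return x**3


a, b = 1.0, 2.0
exact = F(b) - F(a)              # FTC1: just two evaluations
estimate = midpoint_sum(f, a, b)  # a thousand rectangles

print(exact, estimate)
```

The antiderivative gives 7 from two evaluations, and the thousand-rectangle sum lands within a tiny fraction of that, which is a nice reminder of how much work the theorem saves.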

An example of the fundamental theorem in use is distance travelled. If x(t) is the position at time t, then x’(t) is the velocity v(t). The integral a to b of v(t)dt = x(b) - x(a), the ending position minus the starting position. The right hand side of the equation is the net distance travelled, while the left hand side adds up the velocity over time. In the “ith” second, for instance, you’d travel approximately v(t_i)(delta t), and the integral adds these up to give the total. When v changes sign, the integral gives the net change in position, so you can integrate the positive and negative parts of a function separately if the problem requires it, or split the integral at discontinuities where the function changes.

Some properties of integrals are that the integral of a to a of a function is zero since there would be no horizontal movement. Integral a to b of cf(x)dx = c * integral a to b of f(x)dx, if c is a constant. You can add integrals with the sum rule, for instance integral a to b of (f(x) + g(x))dx = integral a to b of f(x)dx + integral a to b of g(x)dx. You can also add different definite integrals of the same function together. For instance, integral a to b of f(x)dx + integral b to c of f(x)dx = integral a to c of f(x)dx. Another property is that F(b) – F(a) = -(F(a)-F(b)). Also if f(x) is less than g(x), then the integral of f(x) will be less than or equal to the integral of g(x) from a to b, assuming a is less than b.

The topic of substitution is brought up again. The integral of g(u)du can represent the integral of g(u(x))*u’(x)dx by substituting u for u(x) and du for u’(x)dx. For definite integrals, the limits change as well: u₁ = u(x₁) and u₂ = u(x₂). If u’ changes sign, the integral should be broken into pieces on which u’ has a single sign. When substituting in a definite integral, changing the bounds of integration means you won’t have to switch back from u to x. When computing an antiderivative instead, you’ll have to back-substitute and put everything in terms of x before evaluating the definite integral.

Lecture 20 – Second fundamental theorem

Last lecture, the fundamental theorem was used to evaluate integrals. This time, that point of view is reversed to use the derivative of the integral to get information about the integral itself. Version one of the fundamental theorem says that if F’ = f, then integral a to b of f(x)dx = F(b) - F(a). The lecture compares this to the mean value theorem. The fundamental theorem says that delta F = average(F’) * delta x, while the mean value theorem says that delta F = F’(c) * delta x for some c between a and b. The former is a much stronger and less vague statement, since it says exactly which value of F’ to use. The average is never larger than the maximum or smaller than the minimum of F’ on the interval.

The fundamental theorem of calculus’s second version, FTC2 states that if f is continuous and G(x) = integral a to x of f(t)dt, where a <= t <= x, then G’(x) = f(x). G(x) solves the differential equation, y’ = f with the condition that y(a) = 0. This basically states that the derivative of an integral of a function is the original function. Both fundamental theorems are proved in this lecture and examples of the theorems are shown in the lectures and recitation.

Getting back to the theme of using a function to understand its integral, an example is the differential equation y’ = e^{-x^2} with y(0) = 0, whose solution is F(x) = integral 0 to x of e^{-t^2}dt. The graph of e^{-x^2} is a shape known as a bell curve. F(x) describes the area under this curve, which in this case can’t be described by previously seen functions. Pi is considered a construction of a “new” number in the same way that this function is considered a “new” function, and more “new” functions will be talked about.

Lecture 21 – Applications to logarithms and geometry

The second fundamental theorem is talked more about here and is used to solve problems. First off, y’ = 1/x is solved to get L(x) = integral 1 to x of dt/t. This formula can be a good starting place to derive all the properties of the log function. L’(x) is 1/x, and L(1) is zero since integral 1 to 1 of dt/t is zero. The second derivative is -1/x², which is easy to get from this formula and shows that the graph is concave down. L(x) is increasing, with slope 1 at x = 1. When x is less than 1, L(x) is less than zero because integral 1 to x of dt/t = -(integral x to 1 of dt/t), and the integral from x to 1 is greater than zero. Another manipulation shown uses the addition rule for definite integrals. To show L(ab) = L(a) + L(b), the integral 1 to ab of dt/t is split into integral 1 to a of dt/t + integral a to ab of dt/t, and the substitution t = au turns the second piece into integral 1 to b of du/u = L(b).

This lecture also talks about more “new” functions which aren’t describable via typical methods. F(x) = integral 0 to x of e^{-t^2} dt is one such function, and it satisfies F’(x) = e^{-x^2}, F(0) = 0 and F”(x) = -2x*e^{-x^2}. The second derivative is negative when x is positive and positive when x is negative. Since F(-x) = -F(x), F is an odd function, which fits because it is the antiderivative of an even function: F’(x) is even, with a bell shaped graph that is symmetrical about zero. Another fact stated is that the limit as x -> inf of F(x) = sqrt(pi)/2, and the limit as x -> negative inf is -sqrt(pi)/2. The standard error function, erf(x) = 2/sqrt(pi) integral 0 to x of e^{-t^2} dt = 2/sqrt(pi) F(x), is famous because it is related to the standard normal distribution.

C(x) = integral 0 to x of cos(t²)dt and S(x) = integral 0 to x of sin(t²)dt are the Fresnel integrals. Other examples are H(x) = integral 0 to x of sin(t)/t dt and Li(x) = integral 2 to x of dt/ln(t). Li(x) is approximately the number of primes less than x, and the Riemann hypothesis is a precise statement about how good that approximation is.

After going through these new functions, a student in the video asks if this is stuff we’re supposed to understand, to which the teacher responds that students should have a passing understanding and will be expected to calculate derivatives. The homework for this section shows a Poisson process in a probability problem, which is a situation in which a phenomenon occurs at a constant average rate.

Integrals are important for finding cumulative sums and are the answer to these problems. Finding areas between curves can be done with a method similar to Riemann sums, by dividing the region into rectangles and adding them together. Every problem in the following lectures will focus on identifying two things: the function to integrate, known as the integrand, and the limits of integration. It’s good to start by drawing the functions on a graph, and the professor states that you have no hope of solving a problem without graphing. Depending on the function, the region may need to be broken into pieces that are added together. A second slicing method divides the region into horizontal slices of thickness dy and integrates from the lowest to the highest y value. In that case, the length of each slice is the right curve minus the left curve. x = y² is shown here as a problem where a single x value can have two y values. The problem sets and recitations also show examples of calculating the area bounded between two curves, such as two curves that form a circle on the graph.

Lecture 22 – Volumes by disks and shells

This lecture talks about how to calculate volumes by slicing, similar to finding areas. For example, how do you calculate the volume of a loaf of bread from its slices? If the volume of a slice, delta V, is approximately the cross-sectional area times delta x, then the volume of the loaf is integral A(x)dx. A technique for creating 3D shapes from a graph is swinging a section of the graph around the x axis to form a solid of revolution. Each slice of this solid is a disk with volume dV = (pi y²)dx, and adding up the disks gives the total volume. An example problem shown in the lecture finds the volume of a ball with radius a by dividing it into disks this way. This first method is known as the method of disks. The second method is the method of shells, which is used when rotating around the y axis. An example is finding how much liquid is in a cauldron. A parabola like y = x² can be rotated to create a cauldron shape, and a shell of this has a thickness of dx and a height of a - y, which is the total height minus the function (x²). If you unwrapped the shell, it would look like a slab with a length equal to the circumference, so the volume of a shell is dV = (2pi*x)(a-y)dx. More examples of finding volumes are shown in recitation, with one finding the volume of a paraboloid with shells and another with disks.
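
As a sanity check on the two formulas, here is a small Python sketch that adds up thin disks for the ball and thin shells for the cauldron. The helper integrate_midpoint and the function names are my own, and the midpoint rule is just a convenient way to add the slices numerically, not the method used in the lecture.

```python
import math


def integrate_midpoint(f, a, b, n=10_000):
    """Add up n thin slices of f between a and b (midpoint Riemann sum)."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


def ball_volume_by_disks(radius):
    # A disk at position x has radius y where y^2 = radius^2 - x^2,
    # so dV = pi * y^2 * dx, added up from -radius to radius.
    return integrate_midpoint(
        lambda x: math.pi * (radius ** 2 - x ** 2), -radius, radius
    )


def cauldron_volume_by_shells(a):
    # The cauldron is y = x^2 rotated around the y axis and filled to height a,
    # so x runs from 0 to sqrt(a) and dV = (2*pi*x) * (a - x^2) * dx.
    return integrate_midpoint(
        lambda x: 2 * math.pi * x * (a - x * x), 0.0, math.sqrt(a)
    )


print(ball_volume_by_disks(1.0), 4 / 3 * math.pi)   # compare with (4/3) pi a^3
print(cauldron_volume_by_shells(1.0), math.pi / 2)  # compare with pi a^2 / 2
```

The two comparison values come from working the integrals by hand, so the numeric and exact columns should agree to several decimal places.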

Limits and integrands are equally important for setting up these integrals, and while problems can usually be set up either vertically or horizontally, one will generally be easier to solve. Occasionally it’s impossible to solve the other way. It’s important to consider the units used as well, as using meters where centimeters are expected will create a problem. For instance, with y = x², x = 1 meter gives a y value of 1 meter. Measured in centimeters, x = 10 centimeters gives y = 100 centimeters, which is the same 1 meter, and x = 100 centimeters gives y = 10,000 centimeters, or 100 meters. The same equation describes a differently shaped curve depending on the units.

Lecture 23 – Work, average value, probability

This lecture talks about integrals and averages. Average values are one of the most important applications of integrals. 1/(b-a) integral a to b of f(x)dx is known as the continuous average of f. The average of a constant is the constant itself, so the average value of f(x) = c is c. In real life problems, weighted averages tend to be more common. A weighted average is the integral a to b of f(x)w(x)dx divided by the integral a to b of w(x)dx. The function is multiplied by a weight w(x), and dividing by the integral of the weight ensures that the weighted average of a constant is still that constant. An example of this is the problem of heating up a cauldron and finding the average temperature, as the heat won’t be the same throughout. This is solved using horizontal slices, each at a consistent temperature, with the method of disks.

Lecture 24 – Numerical integration

Numerical integration is important because many functions don’t have easy to describe antiderivatives, so their integrals must be computed numerically. Different techniques for computing estimates include Riemann sums, the Trapezoidal rule and Simpson’s rule. The Riemann sums shown before are inefficient. They add up the y values at different points and multiply by the width, with the calculation (y₀ + y₁ + … + y_{n-1})(delta x). The Trapezoidal rule is more reasonable but still inefficient. The area is divided into sections with diagonal lines on top for a closer fit, and the sections are added up. The formula is “area = delta x (½y₀ + y₁ + y₂ + … + y_{n-1} + ½y_n)”. This is equivalent to averaging the left and right hand Riemann sums. The last method, Simpson’s rule, is the best so far and pretty accurate. Provided that n is an even number, this method works with pairs of boxes and fits a parabola over each pair. The area under the parabolas is calculated with area = (2 delta x/6)((y₀+4y₁+y₂) + (y₂+4y₃+y₄) + … + (y_{n-2}+4y_{n-1}+y_n)). The smaller delta x is, the more accurate each method gets.
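
Since this is the part of the course closest to programming, here is a minimal Python sketch of the Trapezoidal rule and Simpson’s rule as I understand them. The function names and the test integral (1/x from 1 to 2, which should equal ln 2) are my own choices for illustration.

```python
import math


def trapezoid(f, a, b, n):
    """Trapezoidal rule: dx * (y0/2 + y1 + ... + y_{n-1} + yn/2)."""
    dx = (b - a) / n
    ys = [f(a + i * dx) for i in range(n + 1)]
    return dx * (ys[0] / 2 + sum(ys[1:-1]) + ys[-1] / 2)


def simpson(f, a, b, n):
    """Simpson's rule over n intervals, taken in pairs (n must be even)."""
    if n % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of intervals")
    dx = (b - a) / n
    ys = [f(a + i * dx) for i in range(n + 1)]
    # Grouping the (y0 + 4y1 + y2) + (y2 + 4y3 + y4) + ... pattern:
    # endpoints count once, odd indices four times, interior even indices twice.
    total = ys[0] + ys[-1]
    total += 4 * sum(ys[1:-1:2])
    total += 2 * sum(ys[2:-1:2])
    return dx / 3 * total   # dx/3 is the same as 2*dx/6


def reciprocal(x):
    return 1.0 / x


exact = math.log(2)
for n in (2, 10, 100):
    print(n, trapezoid(reciprocal, 1.0, 2.0, n) - exact,
          simpson(reciprocal, 1.0, 2.0, n) - exact)
```

Printing the errors side by side should show Simpson’s rule closing in on ln 2 much faster than the Trapezoidal rule for the same n.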

Lecture 25 – Exam 3 review

Continuing the topic of numerical integration, examples of the Trapezoidal and Simpson’s rules are shown here. For Simpson’s rule, the difference between the estimate and the exact answer is proportional to (delta x)⁴. Simpson’s rule matches a parabola, which naturally follows a curve better than straight lines do, and the rule is derived so that it gives the exact answer for all polynomials of degree 2 and lower. It turns out to give the exact answer for cubic functions as well. Also, it’s important to watch out for functions like 1/x when x is near zero.

Next the lecture covers the bell curve more and finds the volume under the surface formed by rotating the curve e^{-r^2} around the vertical axis. Using shells, Volume = integral 0 to inf of 2pi*r*e^{-r^2}dr = -pi e^{-r^2} evaluated from zero to infinity, which is pi. The quantity Q is the area under the bell curve, Q = integral minus inf to inf of e^{-t^2}dt. The same volume can also be computed by slices, which gives V = Q². Adding a third axis to a graph shows a three dimensional space and helps find the volume via parallel slices, with the volume being the integral negative to positive infinity of A(y)dy. If y is fixed at a value b, the area of that slice is A(b) = e^{-b^2} Q, and plugging A(y) into the volume integral gives V = Q². Since V = pi, this means Q = sqrt(pi). Each slice treats one variable as constant, though in the Calculus 2 class both variables will change, complicating things a bit. Lastly, it’s mentioned that with volumes of revolution, you’ll always work back to a 2D diagram.

Unit 4 – Techniques of Integration

Lecture 26 – Trigonometric integrals and substitution

The topic of this section is diving into techniques for integrating. Starting off, some basic trigonometry identities and substitutions are introduced.

(i) sin²(theta) + cos²(theta) = 1
(ii) cos(2 theta) = cos²(theta) - sin²(theta)
(iii) sin(2 theta) = 2 sin(theta) cos(theta)
(iv) cos²(theta) = (1 + cos(2 theta))/2
(v) sin²(theta) = (1 - cos(2 theta))/2
cos(a+b) = cos(a)cos(b) - sin(a)sin(b)
d sin(x) = cos(x) dx, so integral of cos(x) dx = sin(x) + c
d cos(x) = -sin(x) dx, so integral of sin(x) dx = -cos(x) + c

If a new variable is introduced, it’s important to put the answer back in terms of the original variable to solve a problem. Additionally, when sine or cosine appears with an odd power, it’s helpful to break off a single factor so the rest can be rewritten with the identities above. A harder case is when there are only even exponents, and the half angle formulas (iv and v) can help to get rid of even powers. This lecture also introduces polar coordinates, which come out from the center of a graph in a circular pattern, unlike the standard x and y axes.

Lecture 28 – Integration by inverse substitution; completing the square

Some more rules and integrals about trigonometric functions are introduced here, this time talking about secant, cosecant and cotangent. Secant is 1/cos, cosecant is 1/sin, tangent is sin/cos and cotangent is cos/sin. When you put “co” in front of one of these functions, sine and cosine’s roles in the equation are exchanged. Another important identity is that 1 can be written as cos²x + sin²x, which helps when substituting into trig identities. Some more equations for trig functions are:

sec²x = (cos²x + sin²x)/cos²x = 1 + tan²x
d/dx tan x = d/dx (sin x/cos x) = (cos²x - (sin x)(-sin x))/cos²x = 1/cos²x = sec²x
d/dx sec x = d/dx (1/cos x) = -(-sin x)/cos²x = sec(x)tan(x)
integral of tan x dx = integral of (sin x/cos x) dx = integral of -du/u = -ln(u) + c = -ln(cos(x)) + c, using u = cos x, du = -sin x dx
integral of sec(x) dx = ln(sec x + tan x) + c

The last integral comes from d/dx(sec x + tan x) = sec x tan x + sec²x = sec x (sec x + tan x). With u = sec x + tan x, this says u’ = u*sec(x), so sec(x) = u’/u = d/dx ln(u), which is known as a logarithmic derivative.

The lecture shows examples of making these substitutions and undoing them. Trig substitutions can be useful in a lot of situations. For instance, if an integrand has the square root of a²-x², you can substitute x = a cos(theta) or x = a sin(theta), and the square root becomes a sin(theta) or a cos(theta) respectively. If an integrand contains the square root of a²+x², substitute x = a tan(theta) to get a sec(theta). If the integrand contains the square root of x²-a², substitute x = a sec(theta) to get a tan(theta). These three tools can help to compute an integral, though you will need to remove theta afterwards and return to x. Hyperbolic trig substitutions are briefly touched on here.

The last technique mentioned here is completing the square, which helps when the quadratic under a square root has a middle term. For instance, take the problem of finding the integral of dx/sqrt(x²+4x). The quadratic can be rewritten in the form (x+a)²+c, here x²+4x = (x+2)² - 4. Directly substituting u = x+2 and du = dx, you can rewrite the problem as the integral of du/sqrt(u²-4). A trig substitution sets u = 2sec(theta) and du = 2 sec(theta) tan(theta) d theta, which reduces the integral to ln(sec(theta)+tan(theta)) + c. Undoing the trig substitution and putting u back as x+2 gives integral of dx/sqrt(x²+4x) = ln((x+2)/2 + sqrt(x²+4x)/2) + c.
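
One way I can check a result like this without redoing the algebra is to differentiate the answer numerically and compare it with the integrand. The sketch below does that in Python; the function names and the central difference helper are my own, and this is only a spot check at a few x values, not a proof.

```python
import math


def integrand(x):
    # The function being integrated: 1 / sqrt(x^2 + 4x), valid for x > 0.
    return 1.0 / math.sqrt(x * x + 4 * x)


def antiderivative(x):
    # The claimed answer (with c = 0): ln((x+2)/2 + sqrt(x^2 + 4x)/2).
    return math.log((x + 2) / 2 + math.sqrt(x * x + 4 * x) / 2)


def central_difference(F, x, h=1e-5):
    # Numerical estimate of F'(x).
    return (F(x + h) - F(x - h)) / (2 * h)


for x in (0.5, 1.0, 3.0, 10.0):
    print(x, integrand(x), central_difference(antiderivative, x))
```

If the antiderivative is right, the two printed columns should match to many decimal places at every x.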

Lecture 29 – Partial fractions

This lecture introduces partial fractions. A rational function is something like P(x)/Q(x), a ratio of two polynomials P(x) and Q(x). Partial fractions can help split P/Q into easier pieces. For instance, the integral of (1/(x-1) + 3/(x+2))dx = ln |x-1| + 3 ln |x+2| + c. The integrand can be combined into a single fraction, (4x-1)/(x²+x-2). Taking the integral was easy in the original form, but after combining, the easy pieces are disguised. An algebraic technique called the cover up method can help unwind the disguise and detect the “easy” pieces inside.

Step 1 of the cover up method is to write down the function you want to integrate and factor the denominator to undo the damage, for instance (4x-1)/((x-1)(x+2)). Step 2 is to set up the form you expect the answer to take, A/(x-1) + B/(x+2). Step 3 is to solve for A and B. A can be solved by multiplying by (x-1), getting the equation (4x-1)/(x+2) = A + (B/(x+2))(x-1), and then plugging in x = 1 to get A = 1. B can be solved by multiplying by (x+2) and plugging in x = -2 to get B = 3. To summarize, the cover up method steps are to factor the denominator, set up the target form with its unknowns, and cover up and solve for the unknown coefficients.
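
The cover up step is mechanical enough to write as a few lines of Python. This is a minimal sketch assuming the denominator has already been factored into distinct linear factors (x - r) and that the numerator has lower degree; the function name cover_up and the way the inputs are passed are my own.

```python
from fractions import Fraction


def cover_up(numerator, roots):
    """Coefficients A_k in P(x) / prod(x - r_k) = sum of A_k / (x - r_k).

    `numerator` is a function computing P(x); `roots` are the distinct r_k.
    For each root, "cover up" its own factor and plug the root into the rest.
    """
    coefficients = []
    for k, r in enumerate(roots):
        r = Fraction(r)
        remaining = Fraction(1)
        for j, other in enumerate(roots):
            if j != k:
                remaining *= r - Fraction(other)   # factors left uncovered
        coefficients.append(Fraction(numerator(r)) / remaining)
    return coefficients


# (4x - 1) / ((x - 1)(x + 2)): the roots of the denominator are 1 and -2.
A, B = cover_up(lambda x: 4 * x - 1, [1, -2])
print(A, B)
```

Using Fraction keeps the arithmetic exact, so the coefficients come out as whole numbers or fractions rather than rounded floats.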

This method works if Q(x) has distinct linear factors and the degree of P is less than the degree of Q. On the other hand, if the degree of P is greater than or equal to the degree of Q, you have an improper fraction, but can use long division to convert it to a proper fraction. Polynomial long division separates the function into a polynomial plus a proper fraction, making this easier. Partial fractions always work, but sometimes require a lot of help. For instance, long division could be a required step zero to ensure that the degree of P is less than the degree of Q. Step 1 could require machine calculation if the denominator is difficult to factor, and the setup for higher degree denominators can require an unknown for each degree, leaving many pieces to integrate.

Lecture 30 – Integration by parts, reduction formulae

Integration by parts is a combination of the Fundamental Theorem of Calculus and the product rule, (uv)’ = u’v + uv’. The formula for integration by parts is integral of uv’dx = uv - integral of u’v dx. With definite integrals, the formula is integral a to b of uv’ dx = uv evaluated from a to b minus integral a to b of u’v dx. Another useful tool for simplifying problems is a reduction formula, which expresses an integral in terms of a simpler integral of the same kind. This could require n steps. An example of a reduction formula is F_n(x) = x(ln x)ⁿ - n F_{n-1}(x), where F_n(x) is the integral of (ln x)ⁿ dx, which breaks the integral down into smaller parts. An important consideration is determining which function should be u and which should be v’. Advanced guessing can also help to break down a problem. Examples are shown of using reduction formulas and integration by parts to find the volume of a wine glass with horizontal and vertical slices.
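
A reduction formula reads a lot like a recursive function, so here is a minimal Python sketch of one. It assumes the reduction formula for (ln x)ⁿ is F_n(x) = x(ln x)ⁿ - n F_{n-1}(x) with the base case F_0(x) = x, which is what integration by parts gives; the function name is my own.

```python
import math


def integral_ln_power(n, x):
    """An antiderivative F_n(x) of (ln x)^n, built with the reduction formula.

    F_n(x) = x (ln x)^n - n * F_{n-1}(x), with F_0(x) = x.
    """
    if n == 0:
        return x
    return x * math.log(x) ** n - n * integral_ln_power(n - 1, x)


# Definite integral of (ln x)^2 from 1 to e, using F(b) - F(a).
value = integral_ln_power(2, math.e) - integral_ln_power(2, 1.0)
print(value, math.e - 2)   # working it by hand gives e - 2
```

Each call peels off one power of ln x, so computing F_n takes n steps, matching the note above that a reduction formula could require n steps.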

Lecture 31 – Parametric equations, arclength, surface area

This lecture talks about computing the length of a curve and more about the surface area of graphs. For computing the length of a curve, or arc length, the given formula is ds² = dx² + dy², so ds = sqrt(dx² + dy²), with dx and dy making up two sides of a right triangle and ds being the hypotenuse that approximates a small piece of the curve. Arc length itself is the total length S_n - S₀, which is the integral of ds, or integral a to b of sqrt(1+f’(x)²)dx. As the change gets smaller, the curve becomes closer to a straight line and dx, dy and ds become closer to making a triangle. “The idea of calculus is being able to apply techniques for linear functions for any curve by breaking them into smaller bits.”
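
The “break it into smaller bits” idea translates directly into code: chop the curve into short straight segments and add up sqrt(dx² + dy²) for each one. This is a minimal Python sketch with my own function name, and the parabola y = x² from 0 to 1 is just an example I picked because its exact length can be worked out by hand.

```python
import math


def arc_length(f, a, b, n=10_000):
    """Approximate the length of y = f(x) from a to b with n straight segments."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x0 = a + i * dx
        x1 = x0 + dx
        dy = f(x1) - f(x0)
        total += math.sqrt(dx * dx + dy * dy)   # ds = sqrt(dx^2 + dy^2)
    return total


parabola_length = arc_length(lambda x: x * x, 0.0, 1.0)
# Exact value of the integral 0 to 1 of sqrt(1 + 4x^2) dx, worked by hand.
exact = math.sqrt(5) / 2 + math.asinh(2) / 4
print(parabola_length, exact)
```

The two numbers should agree closely, and using fewer segments should make the approximation slightly shorter than the true curve, since straight lines cut corners.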

Surface area is talked about a lot more in Multivariable Calculus, but if you can handle it for linear functions then you can figure out the rest. An example of finding the surface area of a rotation is y=x² rotated around the x axis, where dA = (2pi y)ds. Lower case “s” represents arc length and dS or dA is surface area. The surface area of a hemisphere of radius a, from 0 to a, is 2pi a², and the area of the full sphere, from -a to a, is 4pi a². Lastly, this lecture includes an example of a parametric curve, where x and y are separate variables that both depend on a single parameter t. For example, x = a cos(t) and y = a sin(t) give x²+y² = a²cos²(t)+a²sin²(t) = a², which forms the shape of a circle.

Lecture 32 – Polar coordinates; area in polar coordinates

This lecture talks about parametric curves and polar coordinates, which were briefly mentioned earlier. An ellipsoid is formed by rotating an ellipse around the y-axis, and a problem of calculating its surface area is shown in this lecture. Half of finding an integral is setting up the integrand and the other half is finding the limits. The rest of the lecture talks about polar coordinates, which involve a circular geometry, creating a graph that resembles a dartboard as opposed to the rectangular coordinate system we’re used to. This way of describing a point in a plane uses r, the distance to the origin, and theta, the angle between the horizontal axis and the ray from the origin to the point. The coordinates are related by x = r cos(theta) and y = r sin(theta), with r being the positive or negative sqrt(x²+y²) and theta being tan^-1(y/x). Since tan^-1(y/x) is the same as tan^-1((-y)/(-x)), the quadrant of the point has to be checked to pick the correct angle. Unlike the rectangular coordinate system, where a point is plotted from its x and y values, points in polar coordinates are plotted with the above equations, and they are used a lot in physics.
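
Converting between the two coordinate systems is a place where the formulas map straight onto Python’s math module. This is a minimal sketch with my own function names; it uses math.atan2 rather than a plain inverse tangent of y/x, and it always returns a non-negative r, which is one convention among the ones the lecture allows.

```python
import math


def to_polar(x, y):
    """Rectangular (x, y) to polar (r, theta), with r >= 0 and theta in radians."""
    r = math.hypot(x, y)       # sqrt(x^2 + y^2)
    theta = math.atan2(y, x)   # angle from the positive x axis, quadrant-aware
    return r, theta


def to_rectangular(r, theta):
    """Polar (r, theta) to rectangular (x, y)."""
    return r * math.cos(theta), r * math.sin(theta)


# (1, 1) and (-1, -1) have the same y/x, but sit in opposite quadrants.
print(to_polar(1.0, 1.0))
print(to_polar(-1.0, -1.0))
print(to_rectangular(*to_polar(-1.0, -1.0)))
```

The first two lines should show the same r with angles that differ by pi, which is the quadrant issue that a bare tan^-1(y/x) can’t see.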

Lecture 33 – Exam 4 review

This lecture talks more about polar coordinates and using them for graphing. First, area can be calculated with polar coordinates, initially of a circular shape with the formula for a circle’s area pi r², and an example of a non circular curved shape being solved by breaking it into pieces similar to the Riemann sums process. For polar coordinates, the origin is the center and the rays come out at angles based around pi. You can always use x and y variables based on theta, but it’s good to understand geometrically.

Some other given facts and equations are stated here as well. Kepler’s law states that dA/dt is a constant. Conservation of angular momentum states that r² d theta/dt is constant, which is apparently an important breakthrough and is commonly used in physics. The rest of the lecture is a review of the previous lectures.

Unit 5 – Exploring the Infinite

Lecture 35 – L’Hospital’s rule

This final unit helps to clear up loose ends from the course and talks about techniques for dealing with problems involving infinity. One rule that can help with this is L’Hospital’s rule, which is a convenient way to calculate limits, including new ones. The limits of x ln(x), xe^-x and ln(x)/x can be calculated, and the rule can be understood by looking at linear approximations of the functions f and g near a value. An easy version of this rule handles the case where f(a) = g(a) = 0, so that plugging in a gives 0/0: lim x -> a of f(x)/g(x) = lim x -> a of f’(x)/g’(x), provided the right hand limit exists. The limit of the ratio is the same as the limit of the ratio of the derivatives, so for calculating these limits, you can use the derivatives in place of the original functions.

Differentiating the numerator and denominator separately can help solve problems, and this rule can be applied multiple times. If one limit exists, then the preceding limit exists, by the rule that if the right side of the equation exists, the left does as well. The approximation method works well for simple problems and often gets the same answers, but L’Hospital’s rule can work better for handling more exotic limits. The rule applies in the 0/0 or inf/inf cases, which are indeterminate forms. Other indeterminate forms are 0⁰, inf⁰, 1^inf and inf - inf. Products can also be converted to quotients so that the rule can be applied to them. When one factor goes to zero and the other goes to infinity, it’s important to compare the speeds at which they change to understand which one wins. In the case of x^x as x goes to zero from the right, the function can be handled by rewriting it as x^x = e^{x ln(x)}. Writing x ln(x) = ln(x)/(1/x), L’Hospital’s rule gives the ratio (1/x)/(-1/x²) = -x, which goes to 0 as x goes to zero, so the limit of x^x is e⁰ = 1. While the rule can be a helpful tool, it’s important to think before using it and not to use it as a crutch. Some notation is given for comparing rates of growth. f(x) << g(x) means that f(x)/g(x) goes to 0 as x goes to infinity, with << read as “a lot less than”. For example, x << x^c << e^x << e^{x^2} as x goes to infinity, for c > 1.
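
A quick numerical look can back up a limit like this before trusting the algebra. This Python sketch just evaluates x ln(x) and x^x for shrinking positive x; the function name and the sample values are my own, and watching numbers get close is only a sanity check, not a proof of the limit.

```python
import math


def x_to_the_x(x):
    # Rewrite x^x as e^(x ln x), the same trick used with L'Hospital's rule.
    return math.exp(x * math.log(x))


for x in (0.1, 0.01, 0.001, 1e-6):
    print(x, x * math.log(x), x_to_the_x(x))
```

The middle column should shrink toward 0 and the last column should creep up toward 1 as x gets smaller.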

Lecture 36 – Improper integrals

Improper integrals and techniques for comparing integrals are talked about in this lecture. Improper integrals are ones where the region under a curve stretches to infinity. For instance, integral a to infinity of f(x)dx is defined as the limit as N -> inf of integral a to N of f(x)dx. The integral converges if the limit exists and the area is finite. Otherwise, it diverges and the area is infinite. A useful tool given for solving problems is that 1/x can be written as x^-1, making it easy to convert between quotients and powers.

Limit comparison is useful when you don’t know the number, but can compare to something whose convergence properties are understood. If f(x)/g(x) goes to 1 as x -> infinity, then integral a to infinity f(x)dx and integral a to infinity of g(x)dx either both converge or diverge. f(x)/g(x) goes to 1 can be written as f(x) ~ g(x).

Examples of a second type of improper integral are integral 0 to 1 of dx/sqrt(x), integral 0 to 1 of dx/x and integral 0 to 1 of dx/x², where the function has a singularity at an endpoint. You can plug in zero and potentially get the right answer, but it’s possible to fool yourself, for instance by getting a negative answer for an integral that can only be positive because it actually diverges. If f(x) has a singularity at zero, as f(x) = 1/x does, then integral 0 to 1 of f(x)dx = lim as a -> 0⁺ of integral a to 1 of f(x)dx. If the limit exists, the integral converges, otherwise it diverges. In general, integral 0 to 1 of dx/x^p = 1/(1-p) if p < 1 and diverges if not. An integral can be split into pieces, but if one piece diverges, then the whole integral does.

Lecture 37 – Infinite series and convergence tests

This lecture introduces infinite series, which is the second to last topic. The most important series is the geometric series. One example is 1 + ½ + ¼ + ⅛ + … = 2. Adding up more and more terms gets a bigger and bigger number that never quite reaches two, but the sum of the whole series is defined as the limit of these partial sums, which is exactly 2. Geometric series come in the form 1 + a + a² + a³ + … = 1/(1-a), which works while a is less than 1 and greater than negative 1. In the general notation, S_N means the summation n=0 to N of a_n and is called a partial sum. S is the summation n=0 to infinity of a_n = limit as N goes to infinity of S_N. If the limit exists, the series converges, otherwise it diverges.
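
Partial sums are easy to watch in code. Here is a minimal Python sketch, with my own function name, that computes S_N for the geometric series and compares it with 1/(1-a).

```python
def geometric_partial_sum(a, N):
    """S_N = 1 + a + a^2 + ... + a^N."""
    return sum(a ** n for n in range(N + 1))


# With a = 1/2 the partial sums climb toward 1 / (1 - 1/2) = 2.
for N in (1, 3, 10, 50):
    print(N, geometric_partial_sum(0.5, N))

# With a = 2 (outside -1 < a < 1) the partial sums just keep growing.
print(geometric_partial_sum(2, 10))
```

The first loop shows the sums approaching 2 from below, while the last line shows what divergence looks like when a is outside the allowed range.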

Series can be compared to integrals using Riemann sums. The integral comparison says that if f(x) is decreasing and positive, then the sum n=1 to infinity of f(n) minus the integral 1 to infinity of f(x)dx is less than f(1), and the sum n=1 to infinity of f(n) and the integral 1 to infinity of f(x)dx either converge or diverge together. For limit comparison, if f(n) ~ g(n) and g(n) > 0, then sum(f(n)) and sum(g(n)) either both converge or both diverge.

Lecture 38 – Taylor’s Series

This lecture talks more about series and starts off with the problem of seeing how far you can stack blocks so that each hangs partially off of the one underneath before they fall. Is there a limit or not? The trick is to build from the top down and push each block as far left as you can. This is a greedy algorithm, which is often not the best approach in computer science. The overhang grows slowly, but it can keep growing without limit.

The last subject in the course is power series. The geometric series from the previous lecture is one such example, and it’s easy to diagnose whether it converges or not. The general setup of a power series is a₀ + a₁x + a₂x² + a₃x³ + … or sum n=0 to inf of a_nxⁿ. The series converges when the absolute value of x is less than R, the radius of convergence. If |x| > R, the sum of a_nxⁿ diverges, and if |x| = R it’s a borderline case that won’t be used. |a_nxⁿ| -> 0 exponentially fast when |x| < R and doesn’t go to zero at all when |x| > R.

Power series can be read in two ways, the left side as a formula for the right or the right side as a formula for the left. Rules for convergent power series are mostly the same as rules for polynomials. You can add, multiply, divide, substitute, and more for both. Differentiating and integrating power series are the most interesting calculations with them. The derivative of a power series is calculated with d/dx(a₀ + a₁x + a₂x² + a₃x³ + …) = a₁ + 2a₂x + 3a₃x² + … and the integral with integral (a₀ + a₁x + a₂x² + a₃x³ + …)dx = c + a₀x + (a₁x²)/2 + (a₂x³)/3 + …. Another formula related to power series is Taylor’s formula, f(x) = summation n=0 to infinity of (f⁽ⁿ⁾(0)/n!)xⁿ, which gives the coefficients in “f(x) = a₀ + a₁x + a₂x² + a₃x³ + …”. The first derivative is “f’(x) = a₁ + 2a₂x + 3a₃x² + …”, the second derivative is f’’(x) = 2a₂ + 3*2a₃x + …, and the third is f’’’(x) = 3*2a₃ + …. Evaluated at 0, the third derivative is 3*2a₃, so a₃ = f’’’(0)/3!. For these formulas, 0! is defined as 1 so that the n = 0 term works.

Lecture 39 – Final review

The main difference between power series and polynomials is that there’s a number R known as the radius of convergence. This is between zero and infinity, so when |x| < R the series converges and when |x| > R it diverges. Taylor’s formula, for |x| < R, has a_n = f⁽ⁿ⁾(0)/n!, so “f(x) = f(0) + f’(0)x + f”(0)/2! * x² + …”. With e^x, all derivatives are e^x, and evaluated at 0, all are 1. “e^x = 1 + x + x²/2! + x³/3! + …” when plugged into Taylor’s formula. R is infinity in this case. Any function with a reasonable expression, such as cosine or sine, can be written as a power series. Power series are good but can hide certain properties, such as the role of pi in sin(x).

To multiply power series together, such as x sin(x), you would multiply x by every term in sin(x)’s power series to get “x sin(x) = x² - x⁴/3! + x⁶/5! - x⁸/7! + …”. Since these are both odd functions, the product is even. The derivative of sin(x)’s power series is “sin’(x) = 1 - 3x²/3! + 5x⁴/5! - …”, which simplifies to the power series for cos(x). You can derive this with Taylor’s formula as well, and the same radius of convergence is kept. An example of integrating a power series is integral 0 to x of (1 - t + t² - t³ + …)dt, which is “x - x²/2 + x³/3 - x⁴/4 + …”. An example of a power series substitution is taking the power series for e^x and setting x = -t² to get “e^{-t^2} = 1 + (-t²) + (-t²)²/2! + (-t²)³/3! + …”. The last equation given is the error function, erf(x) = 2/sqrt(pi) integral 0 to x of e^{-t^2} dt. As x goes to infinity, erf(x) goes to 1. A power series expansion of the error function is 2/sqrt(pi) (x - x³/3 + x⁵/(5*2!) - x⁷/(7*3!) + …).
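
Python’s standard library already has math.erf, which makes it easy to test the power series for the error function. This is a minimal sketch with my own function name; the general term (-1)ⁿ x^(2n+1) / ((2n+1) n!) is what you get by integrating the e^{-t^2} series term by term, and the cutoff of 30 terms is an arbitrary choice that works for small x.

```python
import math


def erf_series(x, terms=30):
    """Error function from its power series, truncated after `terms` terms.

    erf(x) = 2/sqrt(pi) * (x - x^3/3 + x^5/(5*2!) - x^7/(7*3!) + ...)
    """
    total = 0.0
    for n in range(terms):
        total += (-1) ** n * x ** (2 * n + 1) / ((2 * n + 1) * math.factorial(n))
    return 2 / math.sqrt(math.pi) * total


for x in (0.1, 0.5, 1.0, 2.0):
    print(x, erf_series(x), math.erf(x))
```

The two columns should match closely for these values, and the output also shows erf(x) heading toward 1 as x grows. For large x a truncated series like this loses accuracy, so it isn’t a replacement for the library function.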

Summary

Overall, this didn’t leave me feeling as confident as the previous course. While I definitely am familiar with a lot more concepts of calculus, I don’t yet see the connection to programming outside of some similarity with evaluating order of growth for functions. There’s still a good amount of work that could be useful in studying my notes, getting more practice solving these problems, and familiarizing myself further with the concepts and formulas introduced. Hopefully, I didn’t mess too much up in translating from course to notes to paper, though given the uncertainty I still have, it’s likely that some mistakes were made. At some point, I’d like to clean up this post, potentially add some diagrams, and figure out how to represent the formulas in a more appropriate way. For now though, I want to move on to the next course in the series and come back to this as needed to hopefully get a better understanding of how math can help me in programming. The next course is called Mathematics for Computer Science, so if I get an answer anywhere, that should be it.