Introduction to this series

Getting into the worlds of software development while a theatre major in Chicago, I’ve had anything but a traditional path. My journey started with online tutorials on websites like Codecadamy and Team Treehouse, to get the feeling for writing code which was something I’d always been mildly curious about but never bothered to pursue until taking a gap year from school and feeling a surprising void with the lack of classes. This was interesting enough to dig further and start going through a couple of Coursera courses including Python for Everybody by University of Michigan and attending my first PyCon. I ended up going into my senior year with the goal of being a programmer of some sort after graduation.

I started going to local coding meetups back in Chicago and did a mentorship program with ChiPy, the Chicago Python User Group learning the basics of Python building out a heavily simplified version of Magic: the Gathering. After failing to get work at first, I signed up for Hack Reactor which was a (well beyond) full time coding bootcamp focused on teaching full stack JavaScript applications with a good mix of classroom learning and working in groups to develop full stack mobile and web apps. After this I was able to start working initially in app development while later focused more in the worlds of cloud and AWS.

At this point, more than 5 years since I started learning programming, I’m fairly confident I’ll be spending the rest of my life learning about programs even if the specifics change. At the same time, programming never really caught my interest prior to college. Likely the rise of AI in the news and a year spent travelling abroad made me start seriously looking for a way I could work remotely, luckily around the same time that a lot of new e-learning platforms started popping up giving me a way to learn like I couldn’t before.

Something I’ve wondered the whole time is what I missed by not going the traditional route. The idea that I could go back to school and get a computer science degree has crossed my mind a lot, but as a theater major at a college which would accept anyone, it’s hard to believe that I could get into a top tier tech school or that it would impact my career much. Having gone through college myself, I also know that the quality of a class is generally based on the quality of the teacher with high levels of variances and costs that are impossible to justify especially after making the switch to jobs where I get paid to learn. While these have led me to gain a negative view of traditional education, I also know that some colleges provide a lot to the field of technology and host meetups that I’ve attended to help my learning. University of Michigan’s Python for Everybody was an amazing introduction to Python and programming. Stanford has one of the most famous Machine Learning courses and Harvard’s CS50 feels more like a party than a course.

Most importantly for this article, MIT’s OpenCourseWare provides many of MIT’s lectures and course materials for free to the public, which is amazing and is what I hope to use to answer my question. After a couple years of having this idea floating around in my head, it’s time to start it. The goal of this series is to find out what knowledge I’m lacking in that MIT would teach but that I haven’t come across in my coding journey yet. For this series, I plan to go through each course in MIT’s Computer Science degree program and their prerequisites (not including gen-eds) and summarize them and talk about what I’ve learned. This is going to be a long process, but if I’d started it when I first had the idea, I might know the answer to the question that’s been bothering me for awhile now.

While I’m glad that MIT is providing these resources for free, this obviously won’t be a full match to an actual MIT experience as I won’t have the rigor of having to study for tests and won’t have access to any office hours with teachers. Part of the variance in class quality is likely that I’d miss the opportunity to interact with amazing teachers. I’d also imagine there would be a lot of opportunities for hackathons and learning opportunities outside of class that would be more valuable than I can imagine. Nevertheless, I feel that I have a lot to learn from one of the US’s top technology schools and am excited to find out what that is.

MIT Courses #1 – Introduction to Computer Science and Programming in Python

6.0001 – Fall 2016 by Dr. Ana Bell, Prof. Eric Grimon and Prof. John Guttag

I had to start somewhere and an introduction to Computer Science and programming seemed like the right place. I should probably be relieved that there wasn’t anything new here but this served as a good refresher to some basics as focusing so much on AWS had taken my mind away from basic programming. Overall, this course covers a basic overview of the Python programming language and how to write programs along with touching a bit on algorithms and how to measure them. I’ll break down the course and what it teaches similarly to the Builder’s Library posts in a way that summarizes their key information in a way I can stay consistent with across courses, even though this course is a lot of repetition from when I was starting.

Lecture 1 – What is Computation

Breaking the course down by lectures, the first class touches on how computers work, what programming languages are as well as introducing basic data types and functions. Honestly, the inner workings of computers and their hardware is something I hope to learn more about in the future, but this is the only point it’s touched upon in this course and not in much detail. The general idea here is just that computers perform calculations and can remember results. Computers only do what you tell them to do and programming languages can be used for that.

Computers are split into two types, fixed program computers, which can only do already set tasks (ex. calculators), and stored program computers which can store and execute instructions given (ex. PCs). You can use languages to tell them facts as well as sequences of steps to run. While in math you typically solve for an unknown, in programming you list all steps to get to a result. The basic infrastructure for a computer is memory, a control unit, arithmetic logic unit and input and output. When you enter an input, the control unit gets the program steps from memory to figure out what it needs to do, then contacts the arithmetic logic unit which can get data from memory and the control unit manipulates the program counter to the appropriate next step when done.

The stored program counter stores a sequence of instructions built from primitives such as arithmetic and logic, simple tests and storing data. The interpreter program executes each instruction in order and can change flow of control while moving through sequence. Current programming languages have more convenient sets of primitives than Turing’s six (move left, right, read, write, scan and do nothing) and anything computable with one language is computable with another. Expressions in programming are legal combinations of these primitives.

Programming languages are compared to English in that nouns and verbs are comparable to data types and operators and that correct ordering is required for syntactically valid structures. Static semantics errors occur when a string is valid syntactically but won’t work due to specifics or do something different than you intended. Python is the focus of this course and a program is made up of definitions and commands and executable via shell or another program.

Everything in Python is an object with varying types based on thow they can be used. Some objects are scalar such as int and others such as lists and dictionaries have an internal structure that can be accessed. The type function can be used to see the type of an object and various object types are described here. Expressions are formed by combining operators and objects and typically follow the syntax, “object operator object.” Various operators exist to work with integers and floats such as basic mathematical symbols and comparison operators. The equal sign is used to assign a value to a variable name, with the value being stored in computer memory and being bound to the name for reusability throughout a program. Variables can be assigned to a new value with a new assignment and the previous value will be stored in memory until cleaned up by garbage collection.

Lecture 2 – Branching, Iteration

This lecture goes more into detail on different aspects of Python such as working with strings and how they can be added together or multiplied to repeat values. Python’s input command can get input from a user’s input given a certain prompt and stores it as a string which can be converted if needed. Comparing strings based on greater than, less than or equality or inequality can be done with strings and numeric values. With strings, it compares alphabetical order. “Not” and “or” exist for booleans as well.

If/Else statements are touched on here, evaluating conditions as booleans and going down the appropriate path, with “elif” used to set a second backup condition. Python forces indentation and uses spacing to determine which code is part of a block. For loops and while loops are used to run code a certain number of times or while a condition is true respectively. The range function can go once for each item passed to it or be customized with start, step and stop inputs. While for loops will normally stop automatically, the condition a while loop is set on will need to change it’s condition manually for it to stop. The break statement can also be added to exit the innermost loop skipping remaining steps. While requiring more added control to avoid an infinite loops, certain situations aren’t appropriate for “for loops” such as the example of Legend of Zelda’s lost woods which will display the same screen as long as the user goes right and could require infinite for loops while a while loop could end when the user goes left.

Lecture 3 – String Manipulation

The third lecture talks about manipulating strings and talks about some algorithms for solving problems. Strings are a series of case sensitive characters that can be sliced and indexed at specific positions are reversed, with the condition that the string itself can’t be modified. You can set characters in a string to a new variable with creative indexing which takes a start, stop and step similar to range. Calling “example[n]” can give the nth element of the example string, “example[::-1]” will reverse the string since it’s passing a default start and stop value along with a custom step of negative one, which will cause it to go backwards. Strings can also be looped over with “for char in example.”

The rest of the lecture details a couple types of algorithms. A guess and check method can be used if the answer’s correctness can be checked and there’s a realistic boundary for guessing. You update the guess at an interval and guess until close enough to correct. An example includes guessing cube roots until guessing a number that’s greater than or equal to the cube, with the limitation that it can only tell for a perfect cube or gives an approximate solution. Approximate solutions can sometimes be good enough depending on the need, trading off speed with accuracy. A second example shows a problem where it attempts to find a solution in a range, with an interval that skips the range entirely and warns to be careful when choosing acceptable ranges and intervals to avoid infinite loops.

Lastly, this lecture introduces the bisection search. This search algorithm looks for a number in a sorted list and repeatedly guesses the middle number, splitting the list in half each time. The search space splits in half every time and has decreased complexity increases as the input gets larger. This is touched again later once time complexity is talked about.

Lecture 4 – Decomposition, Abstraction and Functions

The fourth lecture talks about ways to break down functions to make them more readable. This talks about how more code isn’t necessarily better and functionality is more important. Adding functions and modules are introduced here as well as some basic testing concepts. The example of a black box is given where you shouldn’t need to know how a function works to use and test it. It should be documented with an input, description of what it does and an output. Functions and modules should be designed this way for better reusability and testing. Decomposition refers to splitting up the code and abstraction refers to the practice of separating the usage from the code itself with function specifications and docstrings. Functions are a reusable piece of code that are defined and not called until they are invoked with optional params. A return statement inside can return a value after the function is called with a default of None. Since everything in Python is an object, functions can be passed and called in other functions. Docstrings can be defined with triple quotes and is a multiline comment.

This lecture also introduces the concept of scope referring to where in the code a variable is recognized. Inside a function call, a new scope is used with input and output being used to send values between scopes. When a program starts, a global scope is used, with a function definition and other variables defined there being set in that scope. The function call itself starts a new scope setting input parameters to their named values. Inside a scope a variable from outside can be called if a local variable doesn’t exist but a value from outside can’t be modified. A variable with the same name as one from the parent scope can be defined in the new scope and not affect the parent. It’s possible to get around these scope limitations with global variables, but not recommended.

Lecture 5 – Tuples, Lists, aliases, mutability and cloning

Lecture 5 talks about compound data types and how their interaction with a system’s memory requires different ways of referencing them. Tuples and lists are detailed here for the first time, with tuples being an immutable (unalterable) ordered sequence of items represented by being enclosed by parenthesis, such as ( 1, 2, 3 ). Lists are represented with brackets and act similar but are changeable and denoted with square brackets instead, such as [1, 2, 3]. Both can be referenced by index and contain items of different data types. You can set a list’s specific index to another value but not a tuple’s, and it won’t change the list object in memory. Tuples are shown to be a convenient way of swapping variables around via (x, y) = (y, x) since a temporary value would need to be added otherwise. Adding two tuples or list will combine each’s elements into a new tuple or list. Both tuples and lists can be iterated over. As a note, I’m not sure I knew that “//” meant division ignoring anything returned after the decimal.

Lists have built in methods such as append which can add a new item to the list. They are a property of the list object itself and are referenced via list.method(parameter). The extend method can add a second list’s elements to the first without creating a new list. You can also remove items from lists with functions such as calling “del” on a lists index, using the list’s pop method to remove and return the last element or its remove method to remove a specific element. Lists can be converted to strings and back. The list function or a string object’s built in list method can give a list of characters and a string’s split method can split the string on a specified character. A string’s join method can join items in a list passed to it on the calling string. For instance, “_”.join(list) would join the list items on an underscore. The last list methods talked about are sort, sorted and reverse. Sort sorts the list in place, while sorted leaves the calling list alone and returns a new sorted list. Reverse reverses the existing list.

Lists have an interesting property with how they’re stored. The value is tied to an object in memory and isn’t stored the same way as int or scalar value. Because of this setting another variable to the list’s variable will point to the same object in memory and changing one will change the other. The same behavior with an integer wouldn’t have the same effect and each variable could be changed separately. In order to avoid this, you can create a copy of a list via indexing, such as “list2 = list1[:]”. Lastly, it’s best to avoid mutating a list while iterating over it as the counter a loop is iterated over is calculated before it is run, so changing the number of elements in a list during a loop will cause it to be ignored. A suggested way around this is to make a copy of the list that is altered and used at the end.

Lecture 6 – Recursion and Dictionaries

This lecture talks about recursive algorithms and Python’s dictionary object for the first time. Recursion is a method of divide and conquer which reduces a problem to a smaller version of the same problem. Recursive functions will include a call to themselves inside of their function definition. To avoid infinitely calling themselves, there should be a base case that the function handles differently. Solving for a number’s factorial is a common example of this where the goal is to multiply a number by all positive integers smaller than it. In the function’s base case, where the input, “n” is 1, it simply returns 1. Otherwise, it returns “n” multiplied the result of the function called again with “n” minus 1 as the input to the next call. Eventually, this causes “n” to be multiplied by all integers lower than it.

Going back to scope, each call creates its own scope with the other scopes knowing the values passed and and returned. The same variable name is used in each with different values that don’t interfere with each other. The previous scope will be entered again once one function call returns its value. A problem will work recursively if it has a base case it knows it can reach. To prove the function works for all values of “n”, prove it’s true for the smallest value, then an arbitrary value and then that plus 1.

The next recursive algorithm introduced here is the Towers of Hanoi. The board is set up with three spikes and a stack of disks stacked with from largest to smallest on one spike. The goal is to recreate the same order on another spike moving oneat a time and never covering a smaller disk with a larger one. The given algorithm has a move function which takes a “from” and “to” value. The main function takes “n”, the number of disks, and the “from,” “to,” and “spare” towers. The base case is where there’s a single disk and it can simply move from one disk to another. Otherwise, the algorithm is solved recursively three times by calling itself on “n” minus 1 on each arrangement of towers. Eventually one recursive path will reach the base case and make the move to solve the puzzle.

Fibonacci numbers were another given example solvable with recursion. Starting with a newborn pair of rabbits, every month each pair matures if not mature or makes a new pair of rabbits. Assuming they keep breeding forever, how many pairs are there after a given number of months. In the base cases of 0 and 1 months, 1 is returned. Otherwise two iterations of the function are called and added together, one on the number minus one and the other with it minus two. Recursion is shown in non-numeric cases such as checking if a string is a palindrome by comparing its first and last letters and calling the function on the string without them. The base case occurs when the length of the string it was called on is 1 or 0.

After a good amount of recursion examples, dictionaries are shown for the first time in the course, defined by { “key”: “value” }, with pairs separable by commas. This data structure makes it easy to set a named key to a value allowing lookups by key. They’re mutable and a key can be set to a new value and referenced using a list’s bracket notation or with dict.key. They can also be deleted with the “del” function. Dict.keys and dict.values will return an iterable list of keys and values acting as a tuple, without a specific order as dictionaries don’t maintain order. Dictionary values can be any type and can be duplicates, while keys must be immutable types and unique. Dictionaries can be used to store previously done calculations on an input value to save time on calculations. This process is known as memoization.

Lecture 7 – Testing, debugging, exceptions and assertions

This lecture talks about defensive programming and writing tests and handling issues yourself before the program does. Testing and validation should be done to compare input and output to expected values to see if it’s working. If not, you can debug by looking at the events that led up to the problem. It’s advised to design code to help with this from the start, modularize code and test separately and document input and output constraints along with assumptions. Tests that can be run on code that’s expected to work include unit tests, validating separate pieces and functions, regression tests, added as bugs are found and catching previously fixed errors, and integration testing, which makes sure the whole program works. Test the program using intuition about its natural boundaries. Random testing can help find unforeseen errors, but black box and glass box testing are better.

Black box tests are ones designed without looking at code and can be done by someone other than the implementer to avoid bias. This testing should be reusable even if the implementation of the function changes. These should be built based on natural partitions and test expected boundary or extreme values and all types of inputs. Glass box tests on the other hand are test cases based on code, making sure all code is tested and loops are entered varying amounts of times. Access all code paths and still still test boundary cases.

Debugging is an important skill to learn as writing bug free code is important. Multiple tools can help with this such as editors and print statements to test expected values of variables and inputs and outputs of code. Placing prints halfway through code help split the potential space the error could occur using a bisection method. It’s important to study the program code and ask how an unexpected result came about and a more scientific method of forming a hypothesis and testing with the simplest input can be used as well. Error methods can help make debugging easier, with different messages appearing based on the problem. Index errors appear when accessing an index not in a list, type error means an inappropriate type was used, name error occurs when referencing a non existent variable and syntax error means the code can’t be run due to a typing issue such as forgetting to close parenthesis. If not obvious, it’s can be good to figure out what’s happening and potentially talk to a “rubber ducky.” Avoiding errors is one reason to write, test and debug function by function making sure code works along the way and making debugging easier. It’s also important to backup code before changing so you can revert to an older version.

When a process hits an unexpected error, it will give a named exception based on the type of problem. One way of handling different error types proactively is to use try/except blocks which will run code in a try block unless there’s an error, where the except block is run. Multiple except blocks can be specified with different error types (ex. “except ValueError:”) allowing for more finely tuned feedback or handling of different error types. Additionally, the else block can be run after try completes without exceptions and a finally block will always run regardless of exceptions.

When receiving an error, it’s important to choose how to handle it. Some tips are not to return special values and check for them and to raise an exception when unable to produce a sufficient result. Python’s ‘raise ValueError(“custom message”)’ can be used to raise a custom error of your choosing. In some cases, using a default value in case of an error may be acceptable, but it all depends on your use case and the specific issue.

Lastly, assert statements can be used to make sure your expectations of functions and inputs are correct. You can put an assert function such as ‘assert flag, “custom message”’ to raise an AssertionError if the specified assert function evaluates to false (in this case if flag is false). Use this to ensure that inputs and outputs are as expected and it can help spot bugs as a supplement to testing.

Lecture 8 – Object Oriented Programming

Everything in python is an object and an instance of a type and an internal data representation (primary or composite) and a procedure set for interacting with it. For example, “hello” would be an instance of type string with properties and methods given to it by its type. The object itself is a data abstraction from the internal representation giving you special ways of interacting with it and hiding how it’s internally built. For instance, a list object is represented internally as a linked list of cells. One value in the list has a pointer to the next item which has a pointer to the one after that. Still, you can manipulate a list or get it’s length without knowing how python makes it work and the correct behavior could be compromised if you tried messing with its internal representation directly.

Some advantages of object oriented programming are that data can be bundled into packages with procedures to work on them with well defined interfaces. They help divide and conquer development by implementing and testing class behavior separately and modularizing helps reduce complexity. Classes are a way of creating your own types and make it easy to reuse code. Each class has its own variable environment meaning no collision on function names. Subclasses can gain properties of the class they’re based on via inheritance along with gaining functionality of their own for further customization.

To use a class, you first create it by defining its class names and attributes and then instantiate an instance of the class to use its methods. The class keyword can be used to create a new class object with the parent object passed in. For instance, “class Coordinate(object):” create a Coordinate class based on a basic Python object. The “(object)” part of a base class seems to be outdated from the example, with “class Coordinate:” being more common. In defining this, you can add custom data attributes, x and y values for a coordinate and methods which are functions that define how you interact with the object. A special method called __init__ is defined for adding instance variables to the object.

    class Coordinate:
      def __init__(self, x, y):
        self.x = x
        self.y = y
  

Self refers to an instance of the class and is always the first parameter. You can create an instance of the class (instantiating an object) with “c = Coordinate(3,4) which will give it an x and y value of 3 and 4 respectively. You can define other methods on a class similar to init, saving double underscores for methods you only want to call internally. For instance, if “c” had a calculate method, you’d type “c.calculate(param)” to call it. Data attributes are accessed in the same format, ex “c.x” and “c.y.” You can use double underscores similar to init to define methods to be used internally and not meant to be called directly. __str__ can tell your class what to do when printed.

Overall, object oriented programming is a useful way of bundling objects together with common attributes and procedures to operate. You can use abstraction to separate knowledge of an object’s implementation from how it’s used via internal methods and properties. The next lecture also touches on object inheritance which can reuse classes and further customize them.

Lecture 9 – Python classes and inheritance

Two additional methods of abstraction that are commonly used in classes and getters and setters which return and set data attributes respectively, allowing external data access to remain the same even if the class’s internal representation changes. This is recommended over dot notation for data attributes. Unfortunately python makes information hiding more difficult as it allows you to access, update and create new data attributes from outside class methods, all of which is bad style. This section also shows that default arguments can be set in a function with ‘def func(x=”default”):,’ which would set “x” to default if no argument is passed to func.

Classes can have parent or superclasses and child or subclasses. A child class inherits all data and behaviors from its parent and can add additional information and add or override behavior. To create a subclass, pass its superclass as an argument when creating a class, “class subclass(superclass):,” and the new class will have all properties and methods of the superclass without an init needed, though providing one can be used to add new data points. If you do this and override init, be sure to call the superclass’s init function in the init function to set properties from the parent, ex “superclass.__init__(self, param)” and add new variables in the following lines. When accessing data attributes or methods, the program will check for a local definition on the current class and then check the parent or grandparent, using the first definition found.

Class variables can be set on a class and shared among all instances, by defining them in the class, but outside of init. A given example is of a rabbit class where each rabbit has a tag number and a running total of rabbits is kept on the rabbit class. In addition to __str__, special methods such as __add__ can be redefined to tell Python how to add two class object together with your own implementation. Adding a new __eq__ method can tell Python how to compare objects. Lastly, some classes such as random can be imported as Python libraries as well.

Lecture 10 – Understanding Program Efficiency – part 1

This lecture focuses on reasoning about an algorithm’s efficiency in comparison to the growth of its input. Computers may be fast, but today’s datasets can be so large that simple solutions won’t be able to scale. There’s a trade off between time and space efficiency of a program as sometimes, pre-computed results can be stored and looked up via memoization. This lecture will focus solely on time efficiency. Some challenges in understanding a solution are that a task can be implemented in different ways, you can usually solve a problem with few algorithms, and would like to separate algorithm types and implementation choices.

A few ways to evaluate algorithms are measuring time with a timer, counting the number of operations done and an abstract notation of order of growth referred to as Big O that will the focus of this section. The timer method can be used by importing Python’s time module and starting and stopping it before and after a program run respectively. It’s major drawbacks are both that it varies by implementation of an algorithm and that it varies based on computer speed. It’s going to be hard to express a relationship between input and time with a timer method. Counting the number of operations is better but has some drawbacks still. First, it assumes that each step takes a constant amount of time. It is able to compare input size to time in a sense, based on the algorithm and regardless of computer speed, but it still varies with each implementation and there isn’t a clear way to determine how to value different operations.

An example of a problem that could be analyzed is a program that loops through a list until it finds a specific item. In the best case, the first element has the item. In the worst case the item isn’t in the list, and on average, it’s halfway through the list. Generally the worst case scenario is the most useful to focus on improving and the goal is to put a tight upper bound on the growth of the function’s run time as the input size increases. Using Big O notation to give values to different algorithms based on how they grow isn’t exact but it’s a good way of measuring the upper bound on asymptotic growth, or order of growth. Here, we’re describing the worst case which occurs often and is an algorithm’s bottleneck.

This evaluates the overall algorithm caring less about the implementation and ignores the machine entirely. This course also avoids additive and multiplicative constants, for instance, an evaluation of 5n + 1 would become O(n) meaning the algorithm grows linearly in comparison to the input size’s growth and focuses on the most rapidly growing part of the algorithm. What’s important is which value will be bottleneck later which will be the section that grows the fastest. A constant, O(1) doesn’t change, logarithmic, O(log n) grows slower as input size gets bigger, linear, O(n) has a steady rate, log linear, O(n log n) a combination of the previous two is slightly worse than linear, quadratic, O(n^c), grows quicker, and exponential, O(c^n) grows rapidly and is the worst case scenario. If a program has a part that matches two of these cases, the one that grows quicker will represent the function. A few examples are given of adding and multiplying orders from different parts of the function, though they will end up boiling down to one of the above cases. In general, loops are typically linear while nested loops are typically quadratic. Some list access functions can be varying degrees of constant time.

Lecture 11 – Understanding program efficiency – part 2

Continuing on the previous lecture, this lecture talks about analyzing the order of growth of algorithms and representing them with Big O. While constant time, which doesn’t care about input size, is technically the best, it’s rare to have a useful algorithm with that complexity. Typically the best case algorithm is a logarithmic algorithm, such as bisection search which is able to split its input in half multiple times to narrow it down. One type of this is binary search which divides a sorted list in half and asks if the target is larger and smaller than that value, searching the left or right half based on that. A recursive implementation of this is used though it’s important to note that keeping track of pointers is better as slicing copies a list and leads to a linear result. While the goal is not to care about implementation, this is a case where an implementation can change a complexity. Another logarithmic function is a function that converts an integer to a string by dividing the integer by 10 and adding the remainder to the front of a string.

Iterative and recursive factorial functions are both examples of linear algorithms as they require the same process to be run again for each value the input grows by. Polynomial and quadratic are common when nested loops exist and exponential occurs when multiple recursive calls occur in a function such as the Towers of Hanoi having three recursive calls each iteration. Another exponential example is that of a power set algorithm which gets all subsets of elements in a list. The problem can be solved by solving the smallest case and adding with a duplicate list with the next element in all its answers.

Finally, the lecture talks about some complexities of common list and dictionary functions. Lists’ indexing, storing, checking length, and appending functions are all constant while removing, checking equality, copying, reversing, checking for an item in a list and iterating through a list are all linear. Dictionaries have a worst case for indexing, storing, getting length, deleting and iterating of linear with an average case of for all but checking length that is constant. This is due to the inner workings of dictionaries and the magic of hashing.

Lecture 12 – Searching and sorting

The last lecture in this course focuses on searching and sorting algorithms and their efficiencies. Search algorithms are a method of finding an item or group of items with specific properties in a collection of items. The collection can be implicit, such as finding a square root, or explicit, such as a student’s record in a collection of records. Linear search is a brute force algorithm that doesn’t require a sorted list but will have to check every item in a list in the worst case and has a linear order of growth. Bisection/binary search require the list to be sorted but can provide a logarithmic order of growth which is much better. Sorting requires work of its own but can make the search algorithm more efficient, and can be useful when searching multiple times on sorted data, making the sort inconsequential.

Getting into sorting algorithms, monkey sort or bogo sort is a first example and a terrible one. Also known as stupid and slow sort, this requires randomly shuffling the items until they are properly sorted, which becomes a near zero percent chance with more items though can be linear extremely rare best case can occur if its sorted first try. Bubble sort is a better one which goes through a list comparing consecutive pairs, swapping them so that the smaller is first. It goes through the list again until no swaps need to be made. The largest item will always be at the end of the array after each pass, so at most “n” passes are required. In the worst case, nested loops make this quadratic. Selection sort works by finding the minimum element, putting it at the first position and then repeating the function with the remainder of the list, keeping the left side sorter. This is quadratic as well due to multiple loops.

Finally, we get to merge sort, the best “worst case” algorithm. Merge sort works by recursively splitting a list into smaller lists until separate items and finally merging them back together in the correct order. It compares the smallest elements in each list, building out a result list until one of the two lists is empty. The number of comparisons is at most the number of elements in each list. The time complexity of merge sort is O(n log(n)) where n is the length of the list. Dividing the list in half with each recursive call is logarithmic, but each recursion level is linear.

Summary

Overall, this course covers quite a few topics relating to programming, data structures writing and testing functions, and analyzing different algorithms briefly. While nothing is new or particularly in depth, it’s definitely a good introductory programming course. Looking back at some of my earlier posts, I had one on time complexity, talking about big O, and one on solving the N-Queens problem with recursion from my time at Hack Reactor. I remember learning about memoization and trying out assigning a list variable to a new variable and trying to change it for the first time. More recently, I wrote a blog post comparing sorting algorithms after starting MIT’s Introduction to Algorithms course, which I’m looking forward to going through in full.

While it may be all review, this is still the first course I’ve completed in full and marks the shift from this project being an idea I’ve had in my head to a project I’m committing to and invested in. I’m looking forward to revisiting Calculus and Introduction to Algorithms, which I’ve started in the past but never finished and making a point to finish what I start this time. I’m also looking forward to all of the MIT courses I haven’t started as I’m suspecting this will be the only one to be fully review.

Leave a Reply

Trending

Discover more from NikCreate

Subscribe now to keep reading and get access to the full archive.

Continue reading