Friday, April 4, 2014

Sorting and Efficiency

Recently, we’ve been discussing sorting and efficiency in lectures. Sorting algorithms that appeared in CSC108 and CSC165 have made a comeback in CSC148. In CSC108, we were told that there is more than one way to sort a list and that the different sorting methods have different efficiencies.
Truthfully, the only thing I remember from CSC108 on this topic is: “Bubble sort is a pretty bad sorting algorithm compared to the alternatives.” The alternatives here are other sorting algorithms: merge sort, insertion sort, and quicksort.
With regards to CSC165, big-oh notation makes a comeback. From what I remember, big-oh lets us model how the runtime of a function/algorithm grows as the input size increases. Big-oh gives us an upper bound on the worst-case scenario (e.g. for some sorting algorithms, like bubble sort, the worst case is when the elements of the list are in reverse order). If the worst case has an upper bound, then logically, every other case is bounded by that same upper bound.

We use n to denote the number of items in a list. As n grows, the runtime of a sorting algorithm should obviously grow too (unless n is 0 or 1). The problem is that we need to determine exactly HOW the runtime grows; it may not grow linearly. We can actually trace a sorting algorithm to determine how its worst-case runtime grows relative to some familiar function.
For example, with bubble sort, we compare each element of the list with its immediate neighbour, swap the two if necessary, and then move on to the next pair. When the “index” reaches the end of the list, it resets to the beginning and the entire process repeats. Only when no swaps are made during an entire pass is the algorithm finished. On the first pass, n – 1 comparisons are made. On the second pass, n – 2 comparisons are made. This continues until the (n – 1)th pass, in which only 1 comparison is made. Thus, the total number of comparisons is:
1 + 2 + 3 + … + (n – 2) + (n – 1) = 0.5n^2 – 0.5n.
The above expression is in big-oh of n^2. However, as mentioned earlier, there is the possibility of a BETTER or perhaps even a BEST case. With bubble sort, this is when all the items are already properly ordered: only n – 1 comparisons are made, which is associated with big-oh of n. Although this is true, we need to keep in mind that very rarely will the best case happen; it’s likely that the elements of a list aren’t already in order. That’s why I think it’s better to associate bubble sort with big-oh of n^2. Thus, as n grows, the bubble sort runtime will grow quadratically.
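The pass-by-pass description above can be sketched in Python (a minimal version of bubble sort; the function name and details are my own, not code from lecture):

```python
def bubble_sort(lst):
    """Sort lst in place by repeatedly swapping adjacent out-of-order pairs,
    stopping once a full pass makes no swaps."""
    end = len(lst) - 1       # comparisons per pass: n - 1, then n - 2, ...
    swapped = True
    while swapped and end > 0:
        swapped = False
        for i in range(end):
            if lst[i] > lst[i + 1]:
                lst[i], lst[i + 1] = lst[i + 1], lst[i]
                swapped = True
        end -= 1             # the largest remaining item has bubbled into place
    return lst
```

On an already-sorted list, the first pass makes no swaps, so only n – 1 comparisons happen (the best case); on a reversed list, every pass swaps, which gives the quadratic worst case.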
Using the same idea, we can trace the code for other sorting algorithms to better understand how their runtime grows as the input size increases. With more efficient algorithms, log (base 2) of n shows up in the bound because the algorithm repeatedly splits the list into two halves.
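As a sketch of that splitting idea (a standard merge sort, not code from lecture): each call halves the list, giving roughly log (base 2) n levels of splitting, with a linear amount of merging at each level.

```python
def merge_sort(lst):
    """Return a sorted copy of lst by splitting it in half, sorting each
    half recursively, and merging the two sorted halves."""
    if len(lst) <= 1:                # a list of 0 or 1 items is already sorted
        return lst
    mid = len(lst) // 2
    left = merge_sort(lst[:mid])
    right = merge_sort(lst[mid:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]   # append whichever half has leftovers
```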

On a different note, I recall that during one lecture, Professor Heap talked about a sorting algorithm that could sort a deck of cards like this:
1)   Toss the deck of cards in the air and let the cards fall.
2)   Pick up the cards and stack them together.
3)   Check to see if they’re properly sorted.
4)   If not sorted, repeat steps 1 -3.
At first, I thought that Professor Heap was joking. This sorting algorithm is hilariously atrocious (runtime involving factorials!). This sparked my interest, and I found that this sorting algorithm does indeed exist and is called “robsort”.
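The card-tossing steps above can be sketched as a shuffle-until-sorted loop (my own minimal version; this style of algorithm is also widely known as bogosort):

```python
import random

def robsort(lst):
    """Shuffle lst until it happens to be sorted.
    Expected runtime is on the order of n * n! -- hilariously atrocious."""
    while any(lst[i] > lst[i + 1] for i in range(len(lst) - 1)):
        random.shuffle(lst)   # toss the cards in the air and pick them up
    return lst
```

Only tiny inputs finish in a reasonable time, which is rather the point.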

The website says, “it is claimed to be the world’s least efficient sorting algorithm.” The questions I have now are: “Why would someone create this?” and “How would this work in real-life situations?”

Monday, February 24, 2014

Week 7: Recursion

Now in CSC148, recursion has been introduced to us. Essentially, recursion, at least in the context of CSC148, involves functions that call upon themselves to solve a given problem. This seems useful, as some problems are recursive in nature; for those, recursion is the obvious solution. Prior to CSC148, I’d experienced a type of “recursion” in math. In MAT137, we were taught a method of proof called Proof by Induction: we show that the base case is true, assume the statement holds for an arbitrary natural number n (the inductive hypothesis), and then prove that this implies the statement holds for the next natural number, n + 1.
At first, I was slightly confused about how exactly a function can call upon itself. It didn’t make sense to me because it seemed like we would get an infinite loop. Upon looking at more recursive functions, I realized that the Base Case is the reason we don’t get an infinite loop: at some point, the function has to return something without calling itself again. This is actually the simplest case. For example, consider the function sum_list(L), whose implementation is:

def sum_list(L):
    return sum([sum_list(x) if isinstance(x, list) else x for x in L])
 
The purpose of this function is to sum up all the values in a potentially nested list. The base case here is the part after the “else”: if an element of L is not a list (and is therefore not a nested list), it can only be an int or a float, so we simply place the value itself in the new list and move on. If the element is a list, we call sum_list again, since it could be nested many layers deep. However, we know that eventually we’ll reach the base case, as the nesting has to bottom out at numbers. The built-in sum() function then turns the list into a single number. Therefore, the list comprehension at the outermost layer effectively produces a regular non-nested list whose elements are just ints (or floats).
On a less academic note, I am grateful for my lab partner this week, as he essentially went through recursive functions with me step by step. I didn’t really have a solid understanding, and that became clear during the lab in which we were required to write recursive functions. The biggest problem for me was picturing all the different layers when a function called upon itself; it was difficult to keep track. However, my lab partner drew pictures involving boxes and arrows, which gave me a better grasp of recursion. To those who are struggling with recursion, I recommend two things:
1)   Continue to practice the approach taught in class. That is, go over the simplest or Base Case. This allows you to get a sense of what happens at the deepest of all layers. After working out what happens to the Base Case, work on a slightly harder situation. You’ll see that the base case comes into play here. Afterwards, work on an even harder case. You’ll see both previous situations come into play.

2)   Draw pictures. Draw boxes to indicate different layers and arrows that point to what’s happening, depending on what the recursive function is doing.
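As a sketch of what that layer-by-layer tracing looks like for a nested-list sum (using a function like the sum_list discussed above), starting from the base case and working outward:

```python
def sum_list(L):
    """Sum all the numbers in a possibly nested list."""
    return sum([sum_list(x) if isinstance(x, list) else x for x in L])

# Tracing from the simplest case outward:
# sum_list([4])           -> 4                           (base case only)
# sum_list([1, [4]])      -> 1 + sum_list([4])      -> 5 (one layer deep)
# sum_list([1, [2, [3]]]) -> 1 + sum_list([2, [3]]) -> 1 + (2 + 3) -> 6
```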

Thursday, January 23, 2014

Week 3: Object-Oriented Programming

I can't believe that it's already been 3 weeks into CSC148. Those weeks have absolutely flown by. As a first impression of the course, it definitely feels like there will be many tears shed down the road. Because I took CSC108 years ago and had no programming background before it, it has been difficult to quickly learn the new material taught in class.

In terms of the actual structure of lectures, I am thoroughly pleased with how Danny introduces new concepts. Although I felt that the lectures in which Classes were discussed were a bit hard to understand, I'm glad that students have access to materials online. Another positive feature of CSC148 is the sense of community that the structure of the course fosters: peers are encouraged to lend a helping hand, when possible, by posting and answering questions on Piazza. I find this especially useful since I took CSC108 quite a while ago.

Pertinent to Object-Oriented Programming, I had completely forgotten what a Class was before e1 was released. Upon reading part 2 of e1, I let out a huge sigh, as I didn't understand most of the key words, such as "instance", "constructor", and "__init__". However, I was thankful that Danny provided us with a link that explained Classes and their various components. In the end, thanks to said link, I was able to trudge through the exercise without TOO much difficulty.

It was helpful, in understanding the concept of Classes, to relate it to concepts I was already familiar with. For example, I knew from CSC108 that str is a type in Python and contains many useful methods for altering strings. One can think of str as being a Class, and the functions defined under this str Class are the methods we've grown to use and love, such as 'startswith()', 'splitlines()', etc. The most difficult aspect of classes for me personally is the notation; it takes some getting used to to write lines of code in the self.name format.
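To make the str analogy concrete, here is a toy Class of my own (the names are made up for illustration, not from lecture), showing __init__, an instance, and the self.name notation:

```python
class Card:
    """A playing card with a rank and a suit."""

    def __init__(self, rank, suit):   # the constructor, run when an instance is made
        self.rank = rank              # each instance stores its own data
        self.suit = suit              # via the self.attribute notation

    def description(self):            # a method, just like str's startswith()
        return self.rank + ' of ' + self.suit

c = Card('Queen', 'Hearts')           # c is an instance of the Card Class
print(c.description())
```

Just as 'abc'.startswith('a') calls a method on a str instance, c.description() calls a method on a Card instance, and self refers to that particular instance.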