Saturday 9 November 2013

Week 9

Wow, term test 2 is coming up in 5 days already! This past week we talked some more about sorting algorithms (accompanied by some demonstrations by Danny) as well as their performance. We compared the speed of a number of sorts as the input size increased, and surprisingly mergesort was only 3rd, losing out to O(n) sorts like Count Sort and Timsort. Something new I learned was that big O notation was an upper bound for number of calculations as the input increases (rather than just a rough estimate of how it grows), which meant that it had more of a hierarchical relationship: anything that is O(1) is also O(logn), which is also O(n), and so on... This makes more sense after learning about big theta and big omega in CSC165.

Another thing worth mentioning is the lab this week. I must say, they are getting more challenging now! This week the first task consisted of writing a function which took a root node of a BST, and returned the first and last nodes of a linked list consisting of all of the nodes of the BST according to an inorder traversal. My first approach was straightforward: by recycling the code Danny posted on inorder traversal to work on binary trees, you could get a list of all of the values in the BST according to inorder traversal. Then, iterate through the list and to create a linked list with all of the values. Finally, return the first and last nodes of the linked list - simple enough. Alas, after writing this out in code I realized that I had missed a crucial detail - the function must create and link the linked list nodes as it visits each node inorder in the tree, making things a bit more complicated.

Using a hint from the TA ("think about joining two linked lists that are already in the correct order"), I came up with a new approach: take inorder linked list of the left subtree, join it to the linked list node instance of the root node, and join that to the inorder linked list of the right subtree. However, the more difficult part was actually writing the code. Although I eventually managed, all of the conditionals along with the helper function ended up being around 20 lines, much more than the dozen that Danny suggested would be optimal. The second part of the lab consisted of writing a function which would return the head of a linked list containing the nodes from the longest path in the binary tree. My current approach is to have the function work recursively by having the linked list node instance of the root point to the max of the longest linked list for the left subtree and the longest linked list of the right subtree. Easier said than done! It was difficult to think about recursion with linked lists since one can only modify a linked list by changing the pointer (in this case an attribute of the node) of the tail of the linked list, and then recursively calling the head of the linked list. I found it helped to write out the function in pseudocode first, using pen and paper and diagrams if needed, and then gradually working the necessary bits of the function part by part.

I didn't have the opportunity to finish the second part of the lab, but I'll keep working on it this weekend, and ask at the CS help centre if I get stuck, and hopefully come up with a nice solution for the lab.



Sunday 3 November 2013

Week 8

This week, we continued on and talked about binary search trees and touched on big o notation/some running time analysis (which we are learning concurrently in CSC165) as well. I'll be discussing mainly BSTs and the assignment which is due next week.

In essence a binary search tree is a binary tree where the left subtree of a node only contains values smaller than the value of the node, and the right subtree of an doe only contains values greater than the value of the node. This allows for a quicker search (roughly O(logn)) for most cases. Earlier in the course we encountered a data structure very similar to BSTs (I forget if it was a lab or an exercise) which was essentially implemented as a nested list instead of a tree. In both cases they essentially provided a different kind of structure to perform binary search (as opposed to the standard way on a sorted list). Instead of finding the midpoint in a list, and changing the upper and lower bound every iteration, in a BST the root is in a way the midpoint, and you traverse to the left subtree or right subtree instead of rerunning the algorithm on the left half or right half of the list. It isn't clear to me the benefit of implementing a binary search tree instead of sorting a bunch of values in a list, especially since binary search on a list will always be O(logn) whereas the BST could have a worst case runtime of O(n) if every node only has a right child. Perhaps the different variations of BSTs that Danny mentioned in class (AVL, red-black, splay, tango trees) have a better performance in more specific situations.

While going over BSTs, we also covered some of the common methods of a BST, like inserting, deleting, and the __contains__ method. They are both simpler than I expected, and involve finding the location of the node in question (or where it should be), and updating the children of the nodes around it to either delete or insert a node (I'm oversimplifying it a little). The contains method was the easiest of all, but is an extremely useful method to have, especially in a class like a BST. In the last few classes I learned a lot about useful built in methods like contains, len, repr, and str. One thing that I didn't realize was that with the insert method, the initialization of a BST with a list of values was made much simpler by just calling the insert method on each value.

A last thing that we did this week was finish up on assignment 2. I had never thought of organizing regular expressions into a binary tree, but it did seem an extremely natural thing to do. The assignment was a great way to practice recursion and manipulating/using binary trees while learning about what a regular expression is. It also provided some review on classes and inheritance. After this assignment I realized that so far the assignments have been very helpful in allowing the opportunity for the application of the topics learned in class on a larger scale and in a more challenging way than what is done in exercises and labs.

Speaking of which, if anyone was able to finish the iterative function of count_less from the lab let me know how you did it!