Programming , Algorithms and Random Stuffs: April 2010

Tuesday, April 20, 2010

Young Tableau

Young tableau consists of rows and columns of numbers in the form of a matrix. The special property of young tableau is that its rows and columns are sorted. Young tableau can also function as a min heap. There are different interesting algorithms relating to young tableau.

Example of young tableau.

1 2 3
4 6 7
5 7 12

It's clear from the example that each row and column is sorted.

Here is one interesting problem on Young Tableau: How to find an element in an Young Tableau?

Naive search on the table would require O(mn) time. Where m is the number of rows and n is the number of columns. Can we do bit better? Turns out we can. The trick is to start from the top right corner> if the element we are searching for is greater we move down in the column. If the element we are searching for is smaller we move left in the row. We repeat the procedure until we find the element or we end up in a row or column beyond the table. This would translate to a simple code below:

The example assumes 3 X 3 matrix for simplicity. row is the number of rows and col is the number of columns in the matrix and n is the element being looked for.


int find_n(int x[][3],int row, int col, int n) {

 int r =0 , c = col -1;
 while( r < row && col >=0) {
  if(x[r][c] == n)
   return n;
  else if ( x[r][c] > n)
   c--;
  else
   r++;
 }
 return -1;
}

Another problem in Young Tableau is extracting minimum element. The minimum element is the element in (0,0) position. The main problem is replacing the spot (0,0) after minimum element has been removed. The code is given below (sorry that i have it in Java because I did these two things separately but still i believe it is simple enough so i thought it was not necessary to put both of these in same language)



/*
a is young tableau, i and j are current position in the tableau that we are filling
m and n are number of rows and column
*/
void youngify(int[][] a, int i, int j, int m, int n) {
  if (i > m || j > n)
   return;

  int x = -1, y = -1;

  if (i + 1 < m && a[i + 1][j] < a[i][j]) {
   a[i][j] = a[i + 1][j];
   x = i + 1;
   y = j;

  }
  if (j + 1 < n && a[i][j + 1] < a[i][j]) {
   a[i][j] = a[i][j + 1];
   x = i;
   y = j + 1;
  }

  if (x != -1) {
   //the empty space is filled with marked
   //by maximum value a integer can hold
   a[x][y] = Integer.MAX_VALUE;
   youngify(a, x, y, m, n);
  }

 }

Inserting an element into non full young tableau follows the similar process as extracting minimum element. The idea is to put the element into last row and column and move it upto the position where is should be.

Sorting can also be implemented for n X n young tableau in O(n ^3) time by using the operation by which we extract minimum element. The idea is to extract minimum element which takes O(n+n) = O(n) time, IF we repeat this for n^2 element it will take O (n ^3)time.

Finding overflows during arithmetic operations

We know that data in computers have limited size. For example, integer may be limited to 4 bytes. So, while writing programs sometimes we might need to check for overflows to make sure that the resulting value fits the size. But finding overflows are easy.

Finding overflow during addition

Lets say you are adding number a and b. Lets say maximum value that the types of a and b can hold be MAX. Then we can check for overflow using following condition:

if ( (MAX - a) < b then the result overflows else not.

Finding overflow during multiplication

One example of its use is while computing factorial. Lets say you want to compute n!. But you also want to make sure that appropriate error is thrown when it overflows. This is also quite simple. Lets say again that the maximum value a data type can hold be MAX and you are multiplying a and b. The result overflows if MAX/a < b.

Sunday, April 18, 2010

Finding median or Kth element in Binary Search Tree (BST)

Finding kth element in BST is quite simple. We know that all the elements in left subtree is smaller than all the elements in right subtree. If we count the number of elements in left and right subtree then we know in which subtree the required element exists and then we move to this subtree and repeat the same steps again. For example starting from root we are looking for 11th element. Lets say left subtree has 7 elements and right subtree has 6 elements.Then as the number 11 is greater than 7 we know that the 11th median exists in the right subtree. Now in the right subtree we look for (11 - 7 -1 = 3)rd element (Why?? because we have discarded left subtree). We proceed similarly in the new subtree until we get the kth median as the root.

The example described above can very well be described in recursive form. The code below is a complete program in recursive form.

  
 //find kth element in a BST recursively
Node * find_kth_element(Node *tree, int &k) {

 if(tree == 0)
  return 0;

 int count = 0;

 //count the nodes in left subtree
 count_nodes(tree->left_, count);

 //check if the root is median
 if( k == count + 1)
  return tree;

 //check if median falls on left subtree
 else if (count >= k) {

  return find_kth_element(tree->left_,k);
 }
 //the median falls on right subtree
 else {
  k = k - (count + 1);
  return find_kth_element(tree->right_, k);
 }
}

Iterative Solution

We know that recursive solutions are not that efficient and it's always good to avoid it if it's possible. The problem however is equally simple in iterative form as well. The code below translates recursive form into iterative form.

  
 

//find kth element iteratively
Node * find_kth_element_iterative(Node *tree, int &k) {
 if(tree == 0)
  return 0;

 Node *node = tree;
 int count = 0;
 int pos = k;


 while(node != 0) {
  count = 0;

  //count nodes on the left subtree
  count_nodes(node->left_, count);

  //check if root is the median
  if( pos == count + 1)
   return node;

  //check if median falls on left subtree
  else if (count >= pos)
   node = node->left_;

  //median falls on right subtree
  else {
   pos -= (count + 1);
   node = node->right_;
  }

 }
 return 0;
}

Complexity

What's the complexity of above algorithms? While finding median You are moving along a path in the tree, so in the worst case it can be height of the tree (h) and also you are counting the number of nodes in left subtree at each step. Which requires O(n) time in the worst case. Thus the overall algorithm is O(nh).

Is there a way to improve it? Yes, of course ( at least one technique I know of). If you need to keep on computing the kth median on a single tree many times, then you can construct a tree with a count of elements in its subtree. If you do this the counting of number of nodes in left subtree takes O(1) time. Effectively the algorithm will reduce to O(h) algorithm. In case tree is balanced this is just O(log n). Not bad :).

Counting the number of ways BST can be formed

The problem is posed as follows: Given numbers from 1 to n, In how many ways can you form a BST. For example if number given is 1 to 3. How many ways can we proceed? We can break this down to problem where we count number of possibilities when root is 1,2 and 3 respectively. And then we sum all those values to give total number of possibilities. Now when 1 is root, we can proceed to count possibilities when 2 is the root for the right subtree of 1 and when 3 is the root of the right subtree of 1. This gives us the fair idea that we can proceed recursively and count the possibilities. For left and right subtrees we multiply each other to give total number of possibilities. This translates into simple code given below:

  
//compute how many BST can be formed for given number of inorder, no duplicates
int comb_count(int begin, int end) {

 if(begin >= end)
  return 1;
 int count = 0;
 for(int i = begin ; i <= end; i++) {
  //take this as root and find how the 
//nodes can be arranged on the left or right
  count+= comb_count(begin, i - 1) *  comb_count(i + 1, end);
 }
 return count;
}

Saturday, April 17, 2010

Inserting a value in sorted circular linked list

Inserting an item in sorted circular linked list is tricky. If the elements in the list are unique then there are 3 cases to consider:

Case I: The new element to be inserted falls between 2 elements in the list.

Case II: The new element is larger than all the elements in the list.

Case III: The new element is smaller than all the elements in the list.

For cases where list is empty or list has only 1 element, it doesn't matter where you insert the new node.

In the solution below I assume ascendingly sorted list and no duplicates are considered. It would be easy to extend the program for that but it would grow a bit complex. So I have only presented code for the simple case. I might post another version with all cases handled later.

//linked list node
struct LNode {
LNode(int val_):val(val_),next(0){}
int val;
LNode *next;
};
LNode* insert(LNode *list, int val) {

//construct new node for this value
LNode *newNode = new LNode(val);

//check if list has no item
if(list == 0){
newNode->next = newNode;
return newNode;
}
//check if list contains only one item
else if (list->next == list) {
list->next = newNode;
newNode->next = list;
return newNode;
}
LNode  *temp, *node = list;
node = node->next;
while(list != node) {

//Case I: val falls between 2 values in the list
if(val > node->val && val <>next->val){
break;

}
//case II: val is greater than all the values in the list
else if(val > node->val && node->next->val <>val) {
break;

}
//case III: val is smaller than all the values in the list
else if( val <>next->val && node->next->val <>val) {
break;
}
node = node->next;
}

temp = node->next;
node->next = newNode;
newNode->next = temp;

return newNode;
}

Handling duplicate values

Capability to handle duplicate values can easily be added to the above code by replacing greater than and less than in value comparisons by greater than equal to and less than equal to. The new conditions will be:

  //Case I: val falls between 2 values in the list
if(val >= node->val && val <>next->val){
break;

}
//case II: val is greater than all the values in the list
else if(val >= node->val && node->next->val <>val) {
break;

}
//case III: val is smaller than all the values in the list
else if( val <= node->next->val && node->next->val <>val) {
break;
}

When how the list is sorted is not known

When how the list is sorted is not known you can add similar conditions for descending sorted list. But first you need to determine how the list is sorted. The sorting order can be determined by traversing elements until we find 2 increasing(ascending) or decreasing (descending) numbers in sequence. Why two? Because with one you never know if it is at the end of the list where the element order reverses. But when list has 2 elements, you never know how the list is sorted. Consider for example list 1, 2 as circular (1->2->1), then you have no way of telling if it is sorted in ascending order or descending order. Same is the case when list has more elements but each of them have same value except one (e.g 1,1,1,1,1,1,1,3).

Friday, April 16, 2010

Serializing and De-serializing Binary Tree

Serializing a binary tree is easy. Only thing we need to do is to serialize null pointers as well. So a preorder traversal of the tree with a special symbol indicating null left or right child of the tree will be sufficient to reconstruct tree. In the example code below we indicate null by ~0 (which is the number with all 1 binary digit. For example if you have a tree like A->B->C (where B is a right child of A and C is a right child of B), the serialization will produce A NULL B NULL C NULL NULL. During reconstruction you start from the first element and proceed creating left child until you hit NULL. After we hit NULL, we drop it and look for nother element. If another element is not NULL we create right child. The process is straight forward and can easily be understood from the code.

  

std::vector<int> queue;

void serialize_bst(Node * node) {

 if(node == 0 )
 {
  queue.push_back(~0);
  return;
 }

 queue.push_back(node->value_);
 serialize_bst(node->left_);
 serialize_bst(node->right_);
}

  
Node* deserialize_bst(std::vector<int> &queue) {
 if(queue.empty())
  return 0;
 if(queue[0] == ~0) {
  queue.erase(queue.begin());
  return 0;
 }

 //creates node with null left and right child
 //and given value
 Node *node = new Node(0,0,queue[0]);
 queue.erase(queue.begin());
 node->left_ = deserialize_bst(queue);
 node->right_ = deserialize_bst(queue);
 return node;
}

Conversion from Roman Number to Decimal

Conversion from roman number to decimal is quite straightforward if you look at the pattern of the number. You can start from the rightmost letter of roman number and keep on adding the decimal number that it corresponds to. But when the letter encountered while moving left is smaller than the maximum value of the letter encountered previously we need to subtract it. For example take XLV, rightmost letter is V which is 5 , the second one is L which is 50 and the third one is X which is 10 (this is less than the max value encountered, 50, so we need to subtract it. Thus the total is 5+50-10 = 45.

  
/*I - 1
V - 5
X -10
L - 50
C - 100
D -500*/

int get_val(char roman) {

 switch (roman) {
  case 'I':
   return 1;
  case 'V':
   return 5;
  case 'X':
   return 10;
  case 'L':
   return 50;
  case 'C':
   return 100;
  case 'D':
   return 500;
  case 'M':
   return 1000;
  default:
   return 0;
 }
 return 0;
}

int roman_to_int (const std::string &roman) {
 int max = 0;
 int num = 0;
 int len = roman.length() - 1;
 while(len >= 0) {
  int val = get_val(roman[len]);

  if(val < max)
   num -= val;
  else {
   num += val;
   max = val;
  }
  len--;
 }
 return num;
}

Greatest Common Divisor(GCD) and Least Common Multiple (LCM)

GCD a number that divides the given numbers without any remainder. Naive idea to compute GCD is to try every number from 1 to smallest of the numbers as divisor and taking the maximum one which divides the numbers without any remainders. This technique is too slow. One clever method for this is to use technique given below. If you have numbers say 42 and 18 then first divide 42 by 18 and take the remainder which is 6. Next divide 18 by 6, no remainder, so GCD is 6. This technique is mentioned in programming pearls book by John Bentley.

  
int GCD(int a, int b){

 while( b != 0){
  a = a%b;
  std::swap(a,b);
 }
 return a;

}

After being able to compute GCD, computing LCM is pretty easy. You juts multiply 2 numbers and divide it by the GCD. The code for calculating LCM is given below.

  
//calculate lcm
int lcm(int a, int b) {

 return a * b/ GCD(a,b);
}

Thursday, April 15, 2010

Iterative Inorder Traversal

Iterative inorder traversal is a little complicated than Preorder traversal. As for Preorder we need to use stack for Inorder traversal as well but the nodes can't be removed from stack until it's children are pushed to the stack and the node has been visited. For all nodes we need to push their children on the stack first before visiting it. We mark whether the children have been put into stack or not for a node my using another stack call marks. When adding children to the stack we set mark to true on the marks stack. Now when we pop the node again we check corresponding mark and if it is set to true we know that it's children has already been processed. Thus we visit the node (print the value in the code below). In Inorder traversal left child, parent and right child are processed respectively. So while pushing on the stack we do it on the reverse order so that Inorder requirements are maintained.

  
void inorder_traversal(Node *root) {
 if(root == 0)
  return;

 std::vector<bool> marks;
 std::vector<Node*>nodes;
 nodes.push_back(root);
 marks.push_back(false);

 while(!nodes.empty()){

//read the top mark
  bool mark = marks.back();
  marks.pop_back();

 //read the top node
  Node *node = nodes.back();
  nodes.pop_back();
 
 if(mark) {
 //if mark is set then the node is visited
   std::cout << node->value_ << std::endl;
  }
  else {
 //put the righ child on stack
   if(node->right_ != NULL) {
    nodes.push_back(node->right_);
    marks.push_back(false);
   }

 //set mark to true
 //put the node in the stack
   marks.push_back(true);
   nodes.push_back(node);

 //put left child on the stack
   if(node->left_ != NULL) {
    nodes.push_back(node->left_);
    marks.push_back(false);
   }
  }
 }
}

Iterative Preorder Traversal

Preorder traversal is easy when you consider recursive approach but it's quite tricky without recursion. The idea is to simulate recursion using stack. Start from the root and put the root's right and left child on the stack respectively. Then pop the topmost element from the stack, visit it and then again put it's right and left child on the stack if they exists. Continue this till the stack is empty.

  
void preorder_traversal(Node *root) {
 if(root == 0)
  return;

 std::vector stack;
 stack.push_back(root);
 Node *current;
 
 while ( !stack.empty()) {
  current = stack.back();
  stack.pop_back();
  
  std::cout << current->value_ << std::endl;
  
  if(current->right_ != NULL)
   stack.push_back(current->right_);
  if(current->left_ != NULL)
   stack.push_back(current->left_);
 }
}

Random Shuffle

Random shuffling has application in areas like card games. It's highly important that the shuffling is uniform. i.e to say that every possible permutation of the cards should be equally likely. This can be achieved by using uniform random number between 0 an 1. The algorithm (thanks to CLRS) works as follows: at first each 52 cards is equally likely so you select one card and put it in the last position, again of the 51 cards remaining you select another card and so on. At each step each card remaining should be equally likely to be selected. The code below implements the concept.


//random shuffle x elements
void random_shuffle(int x[], int size) {
 float r =0;
 int new_pos;
 for (int j = size -1; j >= 0; j--) {
  r = rand()/ (RAND_MAX + 1.0);
  new_pos = j * r;
  std::swap(x[new_pos], x[j]);
 }

//displaying the shuffled elements
 std::ostream_iterator iterator(std::cout,"\n");
 std::copy(x, x+size, iterator);
}

Dutch National Flag Problem

Dutch National Flag problem is posed as follows: Given an array with elements of 3 colors Red, Green and Blue, how would you sort them in Red, Green and Blue Order. One can definitelycount each elements and rewrite the array but another technique is to go through the array keeping 3 indexes, one representing Red (0), the other representing Green(1) and the third representing Blue(2). The variable are lo, mid and hi. lo represents first position where the value is not 0, hi represents first position from the end where value is not 2. mid represents past lo where the element is not 1. At anytime this invariant is maintained by swapping the elements in a loop. So when mid overtakes hi the algorithm terminates.


//solution 2 dutch flag problem
//as only 3 values exists it can also be solved using counting sort
void threecolors(int x[], int size){

 int lo = 0, mid = 0, hi = size-1;

//move lo to a position where value is not 0
 while ( lo < size && x[lo] ==0)
  lo++;

//move hi to a position where value is not 2
 while(hi >=0 && x[hi] == 2)
  hi--;

//set mid to lo
 mid = lo;
 while (mid <= hi)
 {

 //swap with lo if mid points to 0
 //swap with hi if mid points to 1
 //increment mid if it points to 1
  if ( x[mid] == 0) {
   std::swap(x[lo],x[mid]);
   lo++;mid++;
  }
  else if (x[mid] == 1)
   mid++;
  else
  {

   std::swap(x[mid],x[hi]);
   hi--;
  }
 }
}

Binary tree reconstruction from IN and PRE or POST order traversals

Reconstructing binary tree from Inorder and Preorder traversal is pretty straighforward. The idea is to number the Inorder sequence in increasing order and use that number as a key to insert into Binary Search Tree. The insertion into BST is carried out in the order given in Preorder traversal.

Example:
Inorder: DBEAFCG
Preorder: ABDECFG

Now number the inorder traversal:DBEAFCG (1234567). Thus the preorder traversal will be ABDECFG (4213657)

Insert the items picking them in preorder into BST. So first you will insert 4(A), then 2(B), 1(D) and so on. This will give you the original tree from which traversals are generated.

Code for it is given below



//Tree node
class Node {

public Node left;
public Node right;

//key
public int key;

//label of the node
public char label;

public Node(Node left, Node right, int key, char label){
this.left = left;
this.right = right;
this.value = val;
this.key= key;
}


public Node(int key, char label) {
this(null, null, key,label);
}


}

//Binary Search Tree
class Tree {
public Node root;

public Tree() {
this(null);
}
public Tree(Node root) {
this.root = root;
}


public void insert(int key, char label) {

Node node = new Node(key, label);
if(root == null) {
root = node;
return;
}

Node temp = root;
Node parent = root;

while(temp != null) {
parent = temp;
if ( key < temp =" temp.left;" temp ="=" left =" node;" temp =" temp.right;" temp ="=" right =" node;" table =" new" count =" 0;" bst =" new">

Reconstruction from Inorder and Postorder

Steps are pretty much the same for Postorder too. But while inserting into BST we start from last element of the traversal and continue on to first.

Computing derivatives

One simple technique that I know of for computing derivative is assuming a small change in x axis (like 0.0001, the smaller the better) and computing corresponding change in y. The derivative then is dy / dx, which is just a slope at that point. With this idea, lets say were given a function y = f(x) and we want to compute derivative at x = a. Consider a small value h (the smaller the value h, the more accurate is the derivative) then we will consider interval a and a + h. For the given interval we can calculate change in y as f(a + h) - f(a). Now we already know that derivative at a point is the slope at that point, which is given by:

m = [f(a+h) - f(a)]/ h

This formula is sufficient to compute derivative. The pseudo code for this would be :


double compute_derivative(double a) {
 double h = 0.0000001;
 double faph = compute the function for x = a+h;
 double fa = compute the function for x =a;
  return (faph - fa) /h;
}

Derivatives are useful for many numerical algorithms. So knowing this can be handy for programmers.

Computing square / cube roots

We are very much used to using the square root functions provided in the library. But many of us don't have a clue how it can be done ourselves ( at least some of my friends don't have clue because we were discussing about it sometime back). So how can it be implemented if you need one? I am obviously at advantage because I had previlege to take Numerical Methods course during my undergrad. But even those of you who didn't take Numerical methods course know a searching technique which can be used to compute square roots. What is the technique? Of Course, binary search. How can you use binary search to compute square root of say 10. The idea is simple: consider the interval 0 - 10. Find the midpoint which is 5. Check if 5*5 is less than the given number. If it is less then the root greater than 5 else it is less than 5. Keep on proceeding like this until u find the actual number. This would translate to code as follows:


double find_root(double number) {
 double lo = 0;
 double hi =number;
 double mid;
 double epsilon = 0.000001; //precision
 while ( lo <>
  mid = (lo + hi)/2;
  if (abs((mid * mid) - number) <= epsilon)
   break;
  else if (mid * mid < number)
   lo = mid;
  else
   hi = mid;
  
 }
 return mid;
}

This works but takes time in converging. For those of you familiar with numerical algorithms you know that there is a better way of doing this. One most widely used method for root finding is Newton Raphson method which exploits concept of derivative. In this method a root is guessed and then the guess is converged to the actual root. For example your initial guess is X0 then the next value closer to the root would be:

X1 = X0 - f(X0) / f`(X0)

Similarly X2 = X1 - f(X1) / f`(X1) and so on until you get the precision that you want. Generally, it can be written as:

X = X - f(X) / f`(X)

The idea is that at each iteration it takes you closer to the root. We know that root is a poin in XY plane where the curve meets X- axis. The above formula is taking you close to that X-axis whenever you apply it. Now lets express finding square root as a root finding problem. Say you want to find square root of c. Let the square root be x then we can write the equation as:

x^2 = c or x^2 - c = 0

We require computing derivative for this at each new value of x that we obtain iteratively. This cane be done by referring to the post: http://codinggeeks.blogspot.com/2010/04/computing-derivatives.html


double newton_raphson(double number) {
 double fx , fdashx;
 double x = 3.00; //make an smart guess here
 double epsilon = 0.00001; //choose your precision
 do {
   fx = computer fx at x;
   fdashx = compute_derivative(x);
   x -= fx / fdashx;

 } while ( abs (x * x - c) > epsilon);
 
 return x;
}

You can choose your own precision. Note that this method can be uesd to compute any root though I illustrated code for square root computation. For computing any root replace the while condition with your function.

Tuesday, April 13, 2010

Efficient method to find if a number is power of 2

Finding if a number is power of 2 is pretty straight forward when you think in terms of bits. One way is to count the number of ones in binary representation. If number of ones is one then the number is power of 2. To illustrate this consider powers of 2 like 1,2,4,8,16 and so on. how are they represented in binary? 1, 10, 100, 1000, 10000 and so on. What do they have in common? All of them contain only single bit that has value 1 and rest are 0. This method is effective but still it requires O(n) time in the worst case (where n is the maximum number of bits in the number).

Lets think of it in a different way. What sort of numbers are power of 2? If we look at the bit pattern we can say that the numbers which have a single 1 bit and the left and right bits from the 1 bit are all zeros. So what will be the number that represents the right bits? If you deduct 1 from given number N (i.e. N-1), then it should give number with all right bits set if the number is power of 2. What this means is that if we AND the given number N with N-1 and the result is 0 then the number is power of 2 otherwise not. If a number is not power of 2, it will have atleast single 1 on the right of it and the AND operation would result a number greater then 0. The check for power of 2 can be written in terms of code as:



bool is_power(unsigned int n) {

 return n !=0 && (n & n-1) == 0;
}

Converting BST to sorted Doubly Linked List

Today, one of my friend asked how to convert a Binary Search Tree to sorted Doubly Linked List. He came across this problem while preparing for interview. I then began to dig what I could remember about BST. I had learnt from my Data Structures course in college that most tree problems boil down to tree traversal problems. So I started thinking along that line and suddenly came out with a solution. The idea is to traverse the tree In Order and adjust the pointers accordingly. The code for this is pretty straight forward.


Node *last_left = 0;

void traverse(Node * node)
{
 if(node == 0)
   return;
 //traverse left tree
 traverse(node->left_);

 //adjust the left and right pointers (left points to previous
//and right points to next)
 //in terms of doubly linked list
 if(last_left != 0){
   last_left->right_ = node;
   node ->left_ = last_left;
   last_left = node;
 }
 else
   last_left = node;

 //traverse right subtree
 traverse(node->right_);
}

Programming , Algorithms and Random Stuffs