Sunday, October 24, 2010

Java Language Concepts

What's the difference between abstract classes and interfaces?

An interface can only contain method signatures; it cannot provide implementations (method bodies), while an abstract class can. A class can be declared abstract even if none of its methods are abstract, and abstract classes cannot be instantiated. If every method in a class is abstract, it behaves much like an interface. So why do we need interfaces at all? The key limitation of abstract classes is that they are classes, which means a class cannot extend more than one of them, whereas it can implement any number of interfaces.
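To make the distinction concrete, here is a minimal sketch (the Drawable, Shape, and Circle names are made up for illustration): an interface with only a signature, and an abstract class mixing a concrete method with an abstract one.

```java
// An interface declares only signatures; an abstract class may mix
// abstract and concrete methods.
interface Drawable {
    void draw(); // no body allowed here
}

abstract class Shape implements Drawable {
    private final String name;

    protected Shape(String name) { this.name = name; }

    // concrete method: abstract classes can carry implementations
    public String describe() { return "shape: " + name; }

    // abstract method: subclasses must provide the body
    public abstract double area();
}

class Circle extends Shape {
    private final double r;

    Circle(double r) { super("circle"); this.r = r; }

    public double area() { return Math.PI * r * r; }

    public void draw() { /* render the circle somehow */ }
}
```

Note that Circle can extend only one class (Shape) but could implement any number of interfaces alongside Drawable.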

What is the difference between static and non-static methods?

Static methods are accessed through the class name, while non-static methods require an instance of the class. Both kinds exist as a single copy in the application, so what is the difference under the hood? One important distinction is that static methods cannot be virtual. Another is that static methods have no access to the `this` reference: under the hood, a non-static method receives the object it is called on as a hidden argument. Static and non-static methods look like this:



class X {

    public static void staticMethod() {
        //do sth
    }

    public void nonStaticMethod() {
        //do sth
    }
}

Under the hood, the compiler conceptually turns these into methods like this:

public void staticMethod() {
    //do sth
}

public void nonStaticMethod(X this) {
    //do sth
}



What is the difference between Comparable and Comparator?

Comparable can be implemented by any class that requires an ordering. The compareTo method of the Comparable interface takes another object against which the current object is compared. For example, you might have a Person class that implements Comparable to order people by first and last name. But later your needs change and you have to order Person objects by address instead. With Comparable alone you would have to change the implementation, which you may not want to do because other places may still need sorting by name. In that case you can use the Comparator interface: the sort method of the Collections class can take a Comparator as an argument, which lets you sort objects differently based on your needs. With Comparable you are stuck with a single ordering.
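As a sketch of the difference (this Person class and its fields are hypothetical), a class can define its natural ordering via Comparable and still offer alternative orderings as Comparator objects:

```java
import java.util.Comparator;

// Natural ordering by name via Comparable; a separate Comparator
// supplies an alternative ordering by address.
class Person implements Comparable<Person> {
    final String name;
    final String address;

    Person(String name, String address) {
        this.name = name;
        this.address = address;
    }

    // natural ordering: by name
    public int compareTo(Person other) {
        return name.compareTo(other.name);
    }

    // alternative ordering, supplied externally: by address
    static final Comparator<Person> BY_ADDRESS = new Comparator<Person>() {
        public int compare(Person a, Person b) {
            return a.address.compareTo(b.address);
        }
    };
}
```

Collections.sort(list) then uses the natural ordering, while Collections.sort(list, Person.BY_ADDRESS) sorts by address without touching compareTo.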

When do you have to override the hashCode method?

Whenever you will be storing your objects in a hashed container, you need to override hashCode, and whenever you override equals it is good practice to override hashCode too. The default hashCode provided by the Object class is typically derived from the object's identity (for example, its memory address), which is not what we want for value-like objects. Consider the following example:


class Person {

    private String name;
    private String dob;

    public Person(String name, String dob) {
        this.name = name;
        this.dob = dob;
    }

    public boolean equals(Object obj) {
        //implementation
    }

    //..

    public static void main(String[] args) {

        Set<Person> persons = new HashSet<Person>();
        Person p1 = new Person("Person1", "12/12/2008");
        persons.add(p1);

        Person p2 = new Person("Person1", "12/12/2008");
        if(persons.contains(p2)) {
            //do sth
        }
    }
}



I am assuming here that two persons with the same name and date of birth are equal. Do you think the condition will be true? Not necessarily, because the two Person objects may be mapped to different buckets in the HashSet. Unless two equal objects return the same hash code, there is no guarantee that they can be located in a hashed container. So keep in mind: whenever you override equals, override hashCode as well, and make sure two equal objects always return the same hash code.
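One possible way to fix the Person example, assuming equality means same name and date of birth, is to derive hashCode from exactly the fields used in equals (the multiplier 31 is just the conventional choice):

```java
import java.util.HashSet;
import java.util.Set;

// Person with consistent equals/hashCode, so two equal objects
// always land in the same HashSet bucket.
class Person {
    private final String name;
    private final String dob;

    Person(String name, String dob) {
        this.name = name;
        this.dob = dob;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Person)) return false;
        Person other = (Person) obj;
        return name.equals(other.name) && dob.equals(other.dob);
    }

    @Override
    public int hashCode() {
        // combine the hash codes of the fields used in equals
        return 31 * name.hashCode() + dob.hashCode();
    }
}
```

With this in place, persons.contains(p2) from the example above is guaranteed to find the equal object.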

Thursday, October 21, 2010

finally in C/C++

If you have programmed in C or C++, you know there is no such thing as finally there. If you have used Java, you know how convenient finally is for handling resources in exception-prone code. There is a fairly clean way to simulate finally in C++ (using destructors), but C has no equally clean mechanism. I am not sure the technique below is a recommended approach, since it relies on the goto keyword. Since I started programming I have been told not to use goto, and the books I read routinely warn against it. But here is a technique in C that simulates finally using goto.




void do_sth() {

    int *arr1 = 0, *arr2 = 0, *arr3 = 0;

    arr1 = (int *)malloc(100 * sizeof(int));
    if(!arr1)
        goto finally;

    arr2 = (int *)malloc(100 * sizeof(int));
    if(!arr2)
        goto finally; //without finally you would free(arr1) here

    arr3 = (int *)malloc(100 * sizeof(int));
    if(!arr3)
        goto finally; //without finally you would free(arr1) and free(arr2) here

    //do sth here

finally:
    //free(NULL) is a no-op, so it is safe to free all three unconditionally
    free(arr1);
    free(arr2);
    free(arr3);
}



This is a simple example showing how you can handle failures in memory allocation. It makes sure that everything is freed when the function exits and makes the code a lot cleaner. The resource allocation used here was simple; suppose instead you were opening three files. How would the code look? You would probably check in the finally block whether each file handle is valid, and close it if so. The main advantage of this approach is that your resource-handling code is localized: it is not spread all over the function, and you also end up writing less code.

Now how would you do the same thing in C++? C++ has destructors, which can do the job for you: when an object goes out of scope, its resource is cleaned up. There are lots of examples in the C++ standard library. For instance, instead of the dynamic arrays shown in the C code above, you could use std::vector, which frees its storage itself. There is also std::auto_ptr, which can be used in exception-prone code, although auto_ptr does not support arrays; for arrays you can use the equivalent of boost's scoped_array.

Producer Consumer Problem

Well, this is one of the frequently asked questions; it's a good way to gauge how well you understand multithreading. The solution is pretty simple, but it's always good to understand the subtleties. The code would look something like this:

 
public void produce() throws InterruptedException {

    synchronized(queue) {

        while(queue.isFull()) {
            queue.wait();
        }

        //produce
        queue.enqueue(someObj);

        queue.notifyAll();
    }
}




public void consume() throws InterruptedException {

    synchronized(queue) {

        while(queue.isEmpty()) {
            queue.wait();
        }

        //consume
        someObj = queue.dequeue();
        //do something with the object

        queue.notifyAll();
    }
}



The code is pretty simple, but there are still things people get confused by. First of all, the producer and consumer must synchronize on the same object; beginners tend to forget that. Also notice the use of the while loop: threads are made to wait while the required condition does not hold. A while loop (rather than an if) is needed because we are using notifyAll here. notifyAll wakes up all the threads waiting on the object, but only one of them will obtain the lock, so only one thread is allowed to proceed and the others must recheck the condition and go back to waiting. Another question that arises in this context is the placement of notifyAll: does it have to be the last statement in the synchronized block? The answer is no. It does not matter where notifyAll appears as long as it is inside the synchronized block, because the lock is only released when the synchronized block is exited.

Recommendation

Never implement the producer-consumer problem this way in production code. Always prefer the concurrent collections and facilities provided by the Java platform. For example, you could use a BlockingQueue rather than hand-coding the program above.
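For illustration, here is a minimal sketch of that recommendation (the class name BlockingQueueDemo and the capacity are made up): ArrayBlockingQueue does the waiting and notification internally, with put() blocking while the queue is full and take() blocking while it is empty.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Producer-consumer built on ArrayBlockingQueue: no explicit
// synchronized/wait/notifyAll is needed.
class BlockingQueueDemo {

    static int demo() {
        // small capacity so the producer actually blocks at some point
        final BlockingQueue<Integer> queue = new ArrayBlockingQueue<Integer>(2);

        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 0; i < 5; i++) queue.put(i); // blocks if full
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        producer.start();

        int sum = 0;
        try {
            for (int i = 0; i < 5; i++) sum += queue.take(); // blocks if empty
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum;
    }
}
```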

Monday, June 7, 2010

Conversion from infix to postfix

Converting an infix expression to postfix is useful when evaluating numerical expressions. Given an expression like A+B*C, the problem is to convert it into ABC*+. This is quite a simple problem that uses a stack. The idea is to append every operand encountered to the postfix string, while a stack holds the operators. Whenever an operator is encountered, operators are popped off the stack and appended to the postfix string as long as the operator on top of the stack takes precedence over the current one; the popping stops when the stack is empty or the top no longer takes precedence, and then the current operator is pushed. The code is given below:



//returns the precedence rank of an operator
//considers only 4 operators for now
int prec(char op) {
    switch(op) {
        case '*': case '/': return 2;
        case '+': case '-': return 1;
        default:  return 0;
    }
}

//returns true if the operator on top of the stack (stk) should be
//popped before pushing the incoming operator; equal precedence pops
//too, which gives the usual left associativity
bool precedes(char stk, char infix) {
    return prec(stk) >= prec(infix);
}

//convert to postfix
std::string postfix(const std::string &infix) {

    //operator stack
    std::stack<char> op_stack;
    std::string post_fix;

    for(std::size_t i = 0; i < infix.length(); i++) {

        //if it is a digit, add it to the postfix string
        if(isdigit(infix[i]))
            post_fix.append(1, infix[i]);
        else {
            //while the operator on top of the stack precedes the
            //current operator, pop it and add it to the postfix string
            while(!op_stack.empty() && precedes(op_stack.top(), infix[i])) {
                post_fix.append(1, op_stack.top());
                op_stack.pop();
            }

            //push the current operator
            op_stack.push(infix[i]);
        }
    }

    //pop the remaining operators into the postfix string
    while(!op_stack.empty()) {
        post_fix.append(1, op_stack.top());
        op_stack.pop();
    }
    return post_fix;
}


Wednesday, May 26, 2010

Number base conversion and arithmetic

Perhaps most of us came across base conversions in high school. It's pretty easy to convert numbers to and from decimal if we remember two simple rules. To convert between base b and decimal, we can use the following rules:

Conversion from decimal to base b

Divide the given decimal number by b and record the remainder; keep dividing the quotient until it becomes zero. If num is the given decimal number, then

digit[i] = (num / b^i) % b

and the resulting number in base b, written most significant digit first, is digit[n-1] digit[n-2] ... digit[0].

Conversion from base b to decimal

Let num be the given number in base b and n be the number of digits. Then

decimal = digit[n-1] * b^(n-1) + ... + digit[i] * b^i + ... + digit[1] * b^1 + digit[0] * b^0


These two techniques are sufficient to convert between any bases. Say you want to convert a number from base x to base y: convert from base x to decimal, then from decimal to base y. Both techniques are easily turned into programs, but they might not be the best choice when the bases are powers of 2. In that case bit-manipulation techniques can be applied instead. For example, when converting from binary to hexadecimal it is much faster to take 4 bits at a time and map them directly to a hex digit. If the given number is 35 (00100011 in binary), take the 4 rightmost bits (0011), which is 3, then the next 4 bits (0010), which is 2; the hex number is therefore 23. This is likely to be faster than repeated division.
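The two rules above can be sketched as follows (the BaseConvert class and its digit alphabet are made up for illustration; bases up to 16 are assumed):

```java
// Repeated division for decimal -> base b, and positional expansion
// for base b -> decimal. Digits beyond 9 use letters a-f.
class BaseConvert {
    static final String DIGITS = "0123456789abcdef";

    // decimal -> base b: collect remainders, least significant first,
    // then reverse to get most significant digit first
    static String toBase(int num, int b) {
        if (num == 0) return "0";
        StringBuilder sb = new StringBuilder();
        while (num > 0) {
            sb.append(DIGITS.charAt(num % b));
            num /= b;
        }
        return sb.reverse().toString();
    }

    // base b -> decimal: accumulate digit[i] * b^i via Horner's rule
    static int toDecimal(String s, int b) {
        int value = 0;
        for (int i = 0; i < s.length(); i++) {
            value = value * b + DIGITS.indexOf(s.charAt(i));
        }
        return value;
    }
}
```

For instance, 35 in decimal comes out as "23" in hexadecimal, matching the bit-grouping example above.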

Arithmetic

How would you add, subtract, or perform other arithmetic operations on two numbers in base b? The simple idea is to convert the numbers to decimal, perform the operation, and convert the result back to base b. The operation can also be implemented directly in the given base, but that requires significantly more effort than the conversion technique.
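A minimal sketch of the convert-compute-convert idea, leaning on the JDK's built-in radix support (the class name BaseArith is made up; bases must lie within Character.MIN_RADIX..Character.MAX_RADIX):

```java
// Add two numbers given as strings in base b by converting to
// decimal, adding, and converting back.
class BaseArith {
    static String add(String x, String y, int b) {
        int sum = Integer.parseInt(x, b) + Integer.parseInt(y, b);
        return Integer.toString(sum, b);
    }
}
```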

Tuesday, May 25, 2010

Web Application Performance Improvement

When a web application is accessed by only a few users, performance may not be a huge problem, but as the number of users grows it can bring your application to a halt if it was not designed with scalability in mind. It's always good to start by profiling the application; profiling gives you insight into which pieces of code are expensive. However, an expensive piece of code is not necessarily the main culprit: always look for code that is called extensively. For example, you might have one method that takes 2 seconds to run and another that takes 200 ms. The first intuition might be to optimize the first method, but if your application calls it only once or twice while calling the second one thousands of times, it is the second method you should optimize.

Connection Pooling

If your application accesses the database a lot, make sure you are using connection pooling. Also consider using stored procedures if you need to filter results or make multiple requests to the database in the same code path.

Caching

If results can be reused across multiple requests, it's always a good idea to employ cache.

Compression

Make sure the server supports compression and uses it when sending results. This decreases communication time.


Clustering

Use a server cluster for your application.

Resource contention

Minimize resource contention: if many requests compete for the same resource, they degrade performance and hinder scalability. For example, a servlet is a shared resource, and having synchronized methods in servlets may cause scalability problems.

Memory Usage

Watch the memory usage of your application and try to reduce unnecessary allocation. Using a large amount of memory can trigger the garbage collector frequently, consuming precious CPU time that could have been spent serving users.

Monday, May 24, 2010

Designing Cache

A cache is a very important concept in computer science and a design technique that is likely to help when your system is thwarted by performance problems. Given this, it is highly likely that at some point in your life as a software engineer you will have to design one. A cache generally employs multiple data structures for efficiency. Since a cache has limited space, you have to choose a replacement policy appropriate to your application. Ideally you would evict the element that will go unused the longest to make room for new elements, but our world is not perfect and we have no way of knowing which element that will be. The best we can do is guess based on the application, or perhaps log usage statistics and fine-tune the cache. Based on the replacement policy, caches can be designed in different ways.

LRU Cache

An LRU cache is the simple cache where the least recently used item is thrown out to make room for a new one. Since a cache requires fast lookup, we can rely on a hash table, but how can we find the least recently used item in a hash table? We can't, so we need another data structure that does it efficiently. Why not a queue? Whenever an element is accessed, put it at the tail; the front of the queue is then the least recently used item. The catch is that a plain queue only supports removal at the front and insertion at the back, while we need to move an arbitrary element to the back on access, so a linked list is the right fit. Here is the simple design: create a Node class that holds the cache key and data. Each node is both a node of the linked list and, keyed by the cache key, an entry in the hash table; both structures point at the same node. A lookup goes to the hash table. If the data is absent and the cache is full, remove the first element of the list, fetch the new element, and put it at the back of the list. If the data is present, simply move its node to the back of the list. The operations are quite simple.
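In Java, the hash table plus linked list design described above is essentially what java.util.LinkedHashMap already provides when constructed in access order; a minimal sketch (the capacity handling here is made up for illustration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LinkedHashMap in access order keeps the least recently used entry
// at the head; removeEldestEntry evicts it when capacity is exceeded.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // true = access order, not insertion order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

Every get and put moves the touched entry to the tail of the internal list, so the eldest entry is always the least recently used one.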

MRU Cache
The need for this kind of cache may not arise that often, but it can be implemented much like the LRU cache: remove the tail of the list to evict the most recently used item.

LFU cache
Least frequently used cache. This is harder to design because we need to keep a count of the accesses of each entry in the cache. Here is my idea: use a binary search tree keyed by access frequency, plus a hash table for fast lookup. On a lookup hit, take the entry's node, delete it from the BST, increase its frequency, and reinsert it into the BST. When replacing an entry, choose the leftmost (lowest-frequency) element.
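A sketch of this idea in Java, using TreeMap as the search tree keyed by frequency (all names here are made up; eviction takes a key from the lowest-frequency bucket, which corresponds to the leftmost BST element):

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.TreeMap;

// LFU cache: hash maps for O(1) value/frequency lookup, plus a
// TreeMap from frequency to the set of keys at that frequency.
class LfuCache<K, V> {
    private final int capacity;
    private final Map<K, V> values = new HashMap<K, V>();
    private final Map<K, Integer> freq = new HashMap<K, Integer>();
    private final TreeMap<Integer, LinkedHashSet<K>> buckets =
            new TreeMap<Integer, LinkedHashSet<K>>();

    LfuCache(int capacity) { this.capacity = capacity; }

    V get(K key) {
        if (!values.containsKey(key)) return null;
        touch(key);
        return values.get(key);
    }

    void put(K key, V value) {
        if (values.containsKey(key)) {
            values.put(key, value);
            touch(key);
            return;
        }
        if (values.size() >= capacity) {
            // evict a least frequently used key: leftmost bucket
            Map.Entry<Integer, LinkedHashSet<K>> least = buckets.firstEntry();
            K victim = least.getValue().iterator().next();
            removeFromBucket(victim, least.getKey());
            values.remove(victim);
            freq.remove(victim);
        }
        values.put(key, value);
        freq.put(key, 1);
        addToBucket(key, 1);
    }

    // move the key from its current frequency bucket to the next one up
    private void touch(K key) {
        int f = freq.get(key);
        removeFromBucket(key, f);
        freq.put(key, f + 1);
        addToBucket(key, f + 1);
    }

    private void addToBucket(K key, int f) {
        LinkedHashSet<K> set = buckets.get(f);
        if (set == null) {
            set = new LinkedHashSet<K>();
            buckets.put(f, set);
        }
        set.add(key);
    }

    private void removeFromBucket(K key, int f) {
        LinkedHashSet<K> set = buckets.get(f);
        set.remove(key);
        if (set.isEmpty()) buckets.remove(f);
    }
}
```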

There are various other schemes, and I would like to hear from readers about them and their solutions.

Monday, May 10, 2010

Reversing a Linked List

I hear this is one of the most frequently asked interview questions, especially in phone screens, maybe because it's simple and quickly shows whether you have a grasp of pointers and of a basic data structure (the linked list, of course). There are two ways to approach the problem. The first is recursive, but you might not want to offer a recursive solution in an interview unless the interviewer explicitly directs you to: recursion is slower and can use a lot of stack memory if the list is long. The second solution is also obvious: walk through each node in the list and adjust the pointers. Before proceeding it might be wise to ask the interviewer whether it's a singly or doubly linked list; here we assume it's singly linked. The code is given below.



Node* reverse_list(Node *head) {
    if(head == 0)
        return 0;

    Node *current = head, *temp, *new_list = 0;

    while(current != 0) {
        temp = current->next;
        current->next = new_list;
        new_list = current;
        current = temp;
    }

    //return new head
    return new_list;
}

Tuesday, April 20, 2010

Young Tableau

A Young tableau consists of rows and columns of numbers in the form of a matrix, with the special property that every row and every column is sorted. A Young tableau can also function as a min-heap. There are several interesting algorithms relating to Young tableaux.

An example of a Young tableau:

1 2 3
4 6 7
5 7 12

It's clear from the example that each row and column is sorted.

Here is one interesting problem on Young tableaux: how do you find an element in a Young tableau?

A naive search of the table would require O(mn) time, where m is the number of rows and n the number of columns. Can we do a bit better? It turns out we can. The trick is to start from the top right corner: if the element we are searching for is greater than the current element, we move down the column; if it is smaller, we move left along the row. We repeat the procedure until we find the element or we step outside the table. This translates to the simple code below:

The example assumes 3 X 3 matrix for simplicity. row is the number of rows and col is the number of columns in the matrix and n is the element being looked for.


int find_n(int x[][3], int row, int col, int n) {

    int r = 0, c = col - 1;
    while(r < row && c >= 0) {
        if(x[r][c] == n)
            return n;
        else if(x[r][c] > n)
            c--;
        else
            r++;
    }
    return -1;
}



Another problem on Young tableaux is extracting the minimum element, which sits at position (0,0). The main work is re-filling the (0,0) spot after the minimum has been removed. The code is given below (apologies that it is in Java; I did these two exercises separately, but it is simple enough that rewriting both in one language seemed unnecessary).



/*
a is young tableau, i and j are current position in the tableau that we are filling
m and n are number of rows and column
*/
void youngify(int[][] a, int i, int j, int m, int n) {
if (i > m || j > n)
return;

int x = -1, y = -1;

if (i + 1 < m && a[i + 1][j] < a[i][j]) {
a[i][j] = a[i + 1][j];
x = i + 1;
y = j;

}
if (j + 1 < n && a[i][j + 1] < a[i][j]) {
a[i][j] = a[i][j + 1];
x = i;
y = j + 1;
}

if (x != -1) {
//the empty space is filled with marked
//by maximum value a integer can hold
a[x][y] = Integer.MAX_VALUE;
youngify(a, x, y, m, n);
}

}


Inserting an element into a non-full Young tableau follows a process similar to extracting the minimum: put the element in the last row and column and move it up to the position where it should be.

Sorting an n x n Young tableau can be implemented in O(n^3) time using the extract-minimum operation: each extraction takes O(n + n) = O(n) time, and repeating it for all n^2 elements takes O(n^3) time.

Finding overflows during arithmetic operations

We know that data types in computers have limited size; for example, an integer may be limited to 4 bytes. So while writing programs we sometimes need to check for overflow to make sure the result fits. Fortunately, detecting overflow is easy.

Finding overflow during addition

Let's say you are adding numbers a and b, and let MAX be the maximum value their type can hold. Then we can check for overflow using the following condition:

if (MAX - a) < b, the result overflows; otherwise it does not.

Finding overflow during multiplication

One example of its use is computing factorials. Say you want to compute n! but also make sure an appropriate error is raised when it overflows. This is also quite simple: again let MAX be the maximum value the data type can hold, and suppose you are multiplying a and b (with a > 0). The result overflows if MAX / a < b.
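The two conditions can be sketched like this (assuming non-negative operands of Java's int type, whose maximum is Integer.MAX_VALUE; the class name is made up):

```java
// Overflow checks expressed without performing the overflowing
// operation itself.
class OverflowCheck {

    // addition a + b overflows when (MAX - a) < b
    static boolean addOverflows(int a, int b) {
        return Integer.MAX_VALUE - a < b;
    }

    // multiplication a * b overflows when MAX / a < b (for a > 0)
    static boolean mulOverflows(int a, int b) {
        return a != 0 && Integer.MAX_VALUE / a < b;
    }
}
```

In a factorial loop you would test mulOverflows(result, i) before multiplying and raise the error there.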

Sunday, April 18, 2010

Finding median or Kth element in Binary Search Tree (BST)

Finding the kth element in a BST is quite simple. All elements in a node's left subtree are smaller than the node, and all elements in its right subtree are larger. If we count the number of elements in the left subtree, we know which subtree contains the required element; we then move into that subtree and repeat the same steps. For example, suppose we are looking for the 11th element starting from the root, and the left subtree has 7 elements and the right subtree 6. Since 11 is greater than 7 + 1, the 11th element lies in the right subtree, where we now look for the (11 - 7 - 1 = 3)rd element (why? because we have discarded the left subtree and the root). We proceed similarly in the new subtree until the required element is the root of the current subtree.

The example described above can very well be described in recursive form. The code below is a complete program in recursive form.

  
//find kth element in a BST recursively
Node* find_kth_element(Node *tree, int &k) {

    if(tree == 0)
        return 0;

    int count = 0;

    //count the nodes in the left subtree
    count_nodes(tree->left_, count);

    //check if the root is the kth element
    if(k == count + 1)
        return tree;

    //check if the kth element falls in the left subtree
    else if(count >= k)
        return find_kth_element(tree->left_, k);

    //the kth element falls in the right subtree
    else {
        k = k - (count + 1);
        return find_kth_element(tree->right_, k);
    }
}




Iterative Solution

Recursive solutions can be less efficient, and it is often good to avoid them when possible. The problem is equally simple in iterative form; the code below translates the recursive version into an iterative one.


  


//find kth element iteratively
Node* find_kth_element_iterative(Node *tree, int &k) {

    if(tree == 0)
        return 0;

    Node *node = tree;
    int count = 0;
    int pos = k;

    while(node != 0) {
        count = 0;

        //count nodes in the left subtree
        count_nodes(node->left_, count);

        //check if the current root is the kth element
        if(pos == count + 1)
            return node;

        //check if the kth element falls in the left subtree
        else if(count >= pos)
            node = node->left_;

        //the kth element falls in the right subtree
        else {
            pos -= (count + 1);
            node = node->right_;
        }
    }
    return 0;
}



Complexity

What is the complexity of the above algorithms? You are moving along a single path in the tree, which in the worst case has the height of the tree, h, and at each step you count the nodes in a left subtree, which takes O(n) time in the worst case. Thus the overall algorithm is O(nh).

Is there a way to improve it? Yes, of course (at least one technique I know of). If you need to keep computing the kth element on a single tree many times, you can augment each node with the count of elements in its subtree. Counting the nodes of a left subtree then takes O(1) time, and the algorithm effectively reduces to O(h); if the tree is balanced, that is just O(log n). Not bad :).

Counting the number of ways BST can be formed

The problem is posed as follows: given the numbers 1 to n, in how many ways can you form a BST? For example, with the numbers 1 to 3, how many ways are there? We can break this down into counting the possibilities when the root is 1, 2, and 3 respectively, and then summing those counts. When 1 is the root, the numbers 2 and 3 form its right subtree, and we count the possibilities with 2 as that subtree's root and with 3 as its root. This gives us a fair idea that we can proceed recursively: for each choice of root, the counts for the left and right subtrees are multiplied to give the total number of possibilities for that root. This translates into the simple code given below:

  
//count how many BSTs can be formed from the numbers begin..end, no duplicates
int comb_count(int begin, int end) {

    if(begin >= end)
        return 1;

    int count = 0;
    for(int i = begin; i <= end; i++) {
        //take i as the root and count how the remaining
        //numbers can be arranged on the left and right
        count += comb_count(begin, i - 1) * comb_count(i + 1, end);
    }
    return count;
}

Saturday, April 17, 2010

Inserting a value in sorted circular linked list

Inserting an item into a sorted circular linked list is tricky. If the elements in the list are unique, there are 3 cases to consider:

Case I: The new element to be inserted falls between 2 elements in the list.

Case II: The new element is larger than all the elements in the list.

Case III: The new element is smaller than all the elements in the list.

For the cases where the list is empty or has only 1 element, it doesn't matter where you insert the new node.

In the solution below I assume an ascending sorted list and do not consider duplicates. It would be easy to extend the program to handle them, but it would grow a bit complex, so I have only presented code for the simple case. I might post another version with all cases handled later.

//linked list node
struct LNode {
    LNode(int val_) : val(val_), next(0) {}
    int val;
    LNode *next;
};

LNode* insert(LNode *list, int val) {

    //construct a new node for this value
    LNode *newNode = new LNode(val);

    //check if the list has no items
    if(list == 0) {
        newNode->next = newNode;
        return newNode;
    }
    //check if the list contains only one item
    else if(list->next == list) {
        list->next = newNode;
        newNode->next = list;
        return newNode;
    }

    LNode *temp, *node = list->next;
    while(list != node) {

        //Case I: val falls between 2 values in the list
        if(val > node->val && val < node->next->val)
            break;

        //Case II: val is greater than all the values in the list
        //(node is the largest value, where the order wraps around)
        else if(val > node->val && node->next->val < node->val)
            break;

        //Case III: val is smaller than all the values in the list
        //(again detected at the wrap-around point)
        else if(val < node->next->val && node->next->val < node->val)
            break;

        node = node->next;
    }

    temp = node->next;
    node->next = newNode;
    newNode->next = temp;

    return newNode;
}
Handling duplicate values

The ability to handle duplicate values can easily be added to the code above by changing the strict value comparisons to greater-than-or-equal and less-than-or-equal. The new conditions will be:

//Case I: val falls between 2 values in the list
if(val >= node->val && val <= node->next->val)
    break;

//Case II: val is greater than all the values in the list
else if(val >= node->val && node->next->val < node->val)
    break;

//Case III: val is smaller than all the values in the list
else if(val <= node->next->val && node->next->val < node->val)
    break;

When how the list is sorted is not known

When you don't know how the list is sorted, you can add analogous conditions for a descending list, but first you need to determine the sort order. The order can be determined by traversing elements until you find two consecutive increasing (ascending) or decreasing (descending) steps. Why two? Because with one step you never know whether you are at the end of the list, where the order reverses. Note that when the list has only 2 elements you cannot tell how it is sorted: for the circular list 1->2->1 there is no way to say whether it is ascending or descending. The same holds when the list has more elements that are all equal except one (e.g. 1,1,1,1,1,1,1,3).

Friday, April 16, 2010

Serializing and De-serializing Binary Tree

Serializing a binary tree is easy; the only thing we need to do is serialize the null pointers as well. A preorder traversal of the tree, with a special symbol indicating a null left or right child, is sufficient to reconstruct the tree. In the example code below the null marker is ~0 (the number with all bits set). For example, if you have a tree A->B->C (where B is the right child of A and C is the right child of B), serialization produces A NULL B NULL C NULL NULL. During reconstruction you start from the first element and keep creating left children until you hit a NULL; then you drop the NULL and look at the next element, and if it is not NULL you create a right child. The process is straightforward and can easily be understood from the code.

  

std::vector<int> queue;

void serialize_bst(Node * node) {

 if(node == 0 )
 {
  queue.push_back(~0);
  return;
 }

 queue.push_back(node->value_);
 serialize_bst(node->left_);
 serialize_bst(node->right_);
}
 


  
Node* deserialize_bst(std::vector<int> &queue) {
 if(queue.empty())
  return 0;
 if(queue[0] == ~0) {
  queue.erase(queue.begin());
  return 0;
 }

 //creates node with null left and right child
 //and given value
 Node *node = new Node(0,0,queue[0]);
 queue.erase(queue.begin());
 node->left_ = deserialize_bst(queue);
 node->right_ = deserialize_bst(queue);
 return node;
}
 

Conversion from Roman Number to Decimal

Conversion from a Roman numeral to decimal is quite straightforward once you see the pattern. Start from the rightmost letter of the Roman numeral and keep adding the decimal value each letter corresponds to; but when a letter encountered while moving left is smaller than the maximum value seen so far, subtract it instead. For example, take XLV: the rightmost letter is V, which is 5; the next is L, which is 50; the next is X, which is 10 and less than the maximum encountered (50), so we subtract it. The total is 5 + 50 - 10 = 45.

  
/* I - 1, V - 5, X - 10, L - 50, C - 100, D - 500, M - 1000 */

int get_val(char roman) {
    switch(roman) {
        case 'I': return 1;
        case 'V': return 5;
        case 'X': return 10;
        case 'L': return 50;
        case 'C': return 100;
        case 'D': return 500;
        case 'M': return 1000;
        default:  return 0;
    }
}

int roman_to_int(const std::string &roman) {
    int max = 0;
    int num = 0;
    int len = roman.length() - 1;

    while(len >= 0) {
        int val = get_val(roman[len]);

        if(val < max)
            num -= val;
        else {
            num += val;
            max = val;
        }
        len--;
    }
    return num;
}

Greatest Common Divisor(GCD) and Least Common Multiple (LCM)

The GCD is the largest number that divides the given numbers without any remainder. The naive idea is to try every number from 1 up to the smaller of the two as a divisor and take the largest one that divides both without a remainder, but this technique is too slow. One clever method is the technique given below. If you have, say, 42 and 18, first divide 42 by 18 and take the remainder, which is 6. Next divide 18 by 6: no remainder, so the GCD is 6. This technique is mentioned in the book Programming Pearls by Jon Bentley.

  
int GCD(int a, int b) {

    while(b != 0) {
        a = a % b;
        std::swap(a, b);
    }
    return a;
}



Once you can compute the GCD, computing the LCM is pretty easy: just multiply the two numbers and divide by their GCD. The code for calculating the LCM is given below.

  
//calculate lcm; divide before multiplying to avoid overflow
int lcm(int a, int b) {
    return a / GCD(a, b) * b;
}

Thursday, April 15, 2010

Iterative Inorder Traversal

Iterative inorder traversal is a little more complicated than preorder traversal. As for preorder we need a stack, but a node can't be visited until its left subtree has been processed. We track whether a node's children have already been pushed by using a parallel stack called marks. When pushing a node's children we set its mark to true on the marks stack. When we later pop a node whose mark is true, we know its children have already been handled, so we visit it (print the value in the code below). Inorder traversal processes the left child, the parent, and the right child in that order, so we push them onto the stack in reverse order (right, parent, left) to get them back off in the right sequence.

  
void inorder_traversal(Node *root) {
    if (root == 0)
        return;

    std::vector<bool> marks;
    std::vector<Node*> nodes;
    nodes.push_back(root);
    marks.push_back(false);

    while (!nodes.empty()) {
        //read the top mark
        bool mark = marks.back();
        marks.pop_back();

        //read the top node
        Node *node = nodes.back();
        nodes.pop_back();

        if (mark) {
            //if mark is set then the node is visited
            std::cout << node->value_ << std::endl;
        }
        else {
            //put the right child on the stack
            if (node->right_ != NULL) {
                nodes.push_back(node->right_);
                marks.push_back(false);
            }

            //set mark to true
            //put the node back on the stack
            marks.push_back(true);
            nodes.push_back(node);

            //put the left child on the stack
            if (node->left_ != NULL) {
                nodes.push_back(node->left_);
                marks.push_back(false);
            }
        }
    }
}

Iterative Preorder Traversal

Preorder traversal is easy when you consider the recursive approach, but it's quite tricky without recursion. The idea is to simulate recursion using a stack. Start from the root and push its right and then left child onto the stack. Then pop the topmost element, visit it, and again push its right and left children if they exist. Continue until the stack is empty.


  
void preorder_traversal(Node *root) {
    if (root == 0)
        return;

    std::vector<Node*> stack;
    stack.push_back(root);
    Node *current;

    while (!stack.empty()) {
        current = stack.back();
        stack.pop_back();

        std::cout << current->value_ << std::endl;

        if (current->right_ != NULL)
            stack.push_back(current->right_);
        if (current->left_ != NULL)
            stack.push_back(current->left_);
    }
}

Random Shuffle

Random shuffling has applications in areas like card games. It's highly important that the shuffling is uniform, which is to say that every possible permutation of the cards should be equally likely. This can be achieved using a uniform random number between 0 and 1. The algorithm (thanks to CLRS) works as follows: at first each of the 52 cards is equally likely, so you select one card and put it in the last position; of the 51 cards remaining you select another card, and so on. At each step every remaining card should be equally likely to be selected. The code below implements the concept.


//random shuffle of x's elements (Fisher-Yates)
void random_shuffle(int x[], int size) {
    double r = 0;
    int new_pos;
    for (int j = size - 1; j > 0; j--) {
        r = rand() / (RAND_MAX + 1.0);  //uniform in [0, 1)
        //(j + 1) * r gives a uniform position in 0..j inclusive;
        //j * r would exclude position j and bias the shuffle
        new_pos = (j + 1) * r;
        std::swap(x[new_pos], x[j]);
    }

    //displaying the shuffled elements
    std::ostream_iterator<int> iterator(std::cout, "\n");
    std::copy(x, x + size, iterator);
}

Dutch National Flag Problem

The Dutch National Flag problem is posed as follows: given an array with elements of 3 colors Red, Green and Blue, how would you sort them into Red, Green and Blue order? One can definitely count each element and rewrite the array, but another technique is to go through the array keeping 3 indexes, one for Red (0), one for Green (1) and one for Blue (2). The variables are lo, mid and hi: lo is the first position whose value is not 0, hi is the first position from the end whose value is not 2, and mid scans forward from lo. This invariant is maintained by swapping elements in a loop, and when mid passes hi the algorithm terminates.



//solution 2: Dutch flag problem
//as only 3 values exist it can also be solved using counting sort
void threecolors(int x[], int size) {
    int lo = 0, mid = 0, hi = size - 1;

    //move lo to the first position whose value is not 0
    while (lo < size && x[lo] == 0)
        lo++;

    //move hi to the first position from the end whose value is not 2
    while (hi >= 0 && x[hi] == 2)
        hi--;

    //start mid at lo
    mid = lo;
    while (mid <= hi) {
        //swap with lo if mid points to 0
        //increment mid if it points to 1
        //swap with hi if mid points to 2
        if (x[mid] == 0) {
            std::swap(x[lo], x[mid]);
            lo++; mid++;
        }
        else if (x[mid] == 1)
            mid++;
        else {
            std::swap(x[mid], x[hi]);
            hi--;
        }
    }
}

Binary tree reconstruction from IN and PRE or POST order traversals

Reconstructing a binary tree from its Inorder and Preorder traversals is pretty straightforward. The idea is to number the Inorder sequence in increasing order and use those numbers as keys to insert into a Binary Search Tree. The insertions into the BST are carried out in the order given by the Preorder traversal.

Example:
Inorder: DBEAFCG
Preorder: ABDECFG

Now number the inorder traversal:DBEAFCG (1234567). Thus the preorder traversal will be ABDECFG (4213657)

Insert the items, picking them in preorder, into the BST. So first you insert 4 (A), then 2 (B), then 1 (D), and so on. This gives you the original tree from which the traversals were generated.

Code for it is given below



//Tree node
class Node {

    public Node left;
    public Node right;

    //key
    public int key;

    //label of the node
    public char label;

    public Node(Node left, Node right, int key, char label) {
        this.left = left;
        this.right = right;
        this.key = key;
        this.label = label;
    }

    public Node(int key, char label) {
        this(null, null, key, label);
    }
}

//Binary Search Tree
class Tree {
    public Node root;

    public Tree() {
        this(null);
    }

    public Tree(Node root) {
        this.root = root;
    }

    public void insert(int key, char label) {
        Node node = new Node(key, label);
        if (root == null) {
            root = node;
            return;
        }

        Node temp = root, parent = root;
        while (temp != null) {
            parent = temp;
            temp = (key < temp.key) ? temp.left : temp.right;
        }

        if (key < parent.key)
            parent.left = node;
        else
            parent.right = node;
    }
}

//Driver (reconstructed from a garbled listing; the names table, count and
//bst follow the fragments that survived)
class Reconstruct {
    public static void main(String[] args) {
        String inorder = "DBEAFCG", preorder = "ABDECFG";

        //number the inorder letters in increasing order
        java.util.Map<Character, Integer> table = new java.util.HashMap<Character, Integer>();
        int count = 0;
        for (char c : inorder.toCharArray())
            table.put(c, ++count);

        //insert in preorder order, keyed by inorder position
        Tree bst = new Tree();
        for (char c : preorder.toCharArray())
            bst.insert(table.get(c), c);
    }
}

Reconstruction from Inorder and Postorder
The steps are pretty much the same for Postorder too, but when inserting into the BST we start from the last element of the traversal and move toward the first.

Computing derivatives

One simple technique that I know of for computing a derivative is assuming a small change along the x axis (like 0.0001; the smaller the better) and computing the corresponding change in y. The derivative then is dy / dx, which is just the slope at that point. With this idea, let's say we're given a function y = f(x) and we want to compute the derivative at x = a. Consider a small value h (the smaller the value of h, the more accurate the derivative) and the interval from a to a + h. For this interval the change in y is f(a + h) - f(a). We already know that the derivative at a point is the slope at that point, which is given by:

m = [f(a+h) - f(a)]/ h

This formula is sufficient to compute the derivative. The code for this would be:


//the function whose derivative we want; f(x) = x^2 is used as an example
double f(double x) {
    return x * x;
}

double compute_derivative(double a) {
    double h = 0.0000001;
    double faph = f(a + h);  //the function evaluated at a + h
    double fa = f(a);        //the function evaluated at a
    return (faph - fa) / h;
}

Derivatives are useful for many numerical algorithms, so knowing this can be handy for programmers.

Computing square / cube roots

We are very much used to using the square root functions provided in the library, but many of us don't have a clue how it can be done ourselves (at least some of my friends didn't when we were discussing it some time back). So how can it be implemented if you need one? I am obviously at an advantage because I had the privilege of taking a Numerical Methods course during my undergrad. But even those of you who didn't take such a course know a searching technique that can be used to compute square roots. What is the technique? Of course, binary search. How can you use binary search to compute the square root of, say, 10? The idea is simple: consider the interval 0 to 10 and find the midpoint, which is 5. Check whether 5*5 is less than the given number. If it is less, the root is greater than 5; otherwise it is less than 5. Keep proceeding like this until you find the actual number. This translates to code as follows:



double find_root(double number) {
    double lo = 0;
    double hi = number; //assumes number >= 1 so the root lies in [0, number]
    double mid = 0;
    double epsilon = 0.000001; //precision
    while (lo < hi) {
        mid = (lo + hi) / 2;
        if (std::fabs((mid * mid) - number) <= epsilon)
            break;
        else if (mid * mid < number)
            lo = mid;
        else
            hi = mid;
    }
    return mid;
}


This works but takes time to converge. Those of you familiar with numerical algorithms know there is a better way. The most widely used root-finding method is the Newton-Raphson method, which exploits the concept of the derivative. A root is guessed, and then the guess is iteratively refined toward the actual root. For example, if your initial guess is X0 then the next value closer to the root is:
X1 = X0 - f(X0) / f'(X0)
Similarly X2 = X1 - f(X1) / f'(X1), and so on until you reach the precision you want. Generally, it can be written as:

X = X - f(X) / f'(X)

The idea is that each iteration takes you closer to the root. We know that a root is a point in the XY plane where the curve meets the X-axis, and the formula above takes you closer to that point each time you apply it. Now let's express finding a square root as a root-finding problem. Say you want the square root of c. Let the square root be x; then we can write the equation as:

x^2 = c or x^2 - c = 0

We require computing the derivative at each new value of x that we obtain iteratively. This can be done by referring to the post: http://codinggeeks.blogspot.com/2010/04/computing-derivatives.html


double newton_raphson(double number) {
    double fx, fdashx;
    double x = 3.00; //make a smart guess here
    double epsilon = 0.00001; //choose your precision
    do {
        fx = x * x - number;            //f(x) = x^2 - number
        fdashx = compute_derivative(x); //the constant term vanishes, so the
                                        //derivative of x^2 works here
        x -= fx / fdashx;
    } while (std::fabs(x * x - number) > epsilon);

    return x;
}


You can choose your own precision. Note that this method can be used to compute any root, though I illustrated the code for square root computation. For other roots, replace the function and the while condition accordingly.

Tuesday, April 13, 2010

Efficient method to find if a number is power of 2

Finding if a number is a power of 2 is pretty straightforward when you think in terms of bits. One way is to count the number of ones in the binary representation: if the count is one, the number is a power of 2. To illustrate this, consider powers of 2 like 1, 2, 4, 8, 16 and so on. How are they represented in binary? 1, 10, 100, 1000, 10000 and so on. What do they have in common? All of them contain a single bit with value 1 and the rest are 0. This method is effective but still requires O(n) time in the worst case (where n is the maximum number of bits in the number).

Let's think of it in a different way. What sort of numbers are powers of 2? Looking at the bit pattern, they are the numbers with a single 1 bit, with zeros everywhere else. So what number do the bits to the right of that 1 represent? If you deduct 1 from the given number N (i.e. N-1), you get a number with all those right bits set, provided N is a power of 2. This means that if we AND the given number N with N-1 and the result is 0, then the number is a power of 2; otherwise it is not. If a number is not a power of 2, N and N-1 share at least one set bit, and the AND operation results in a number greater than 0. The check for a power of 2 can be written in code as:




bool is_power(unsigned int n) {
    return n != 0 && (n & (n - 1)) == 0;
}

Converting BST to sorted Doubly Linked List

Today, one of my friends asked how to convert a Binary Search Tree to a sorted Doubly Linked List. He came across this problem while preparing for an interview. I then began to dig up what I could remember about BSTs. I had learnt from my Data Structures course in college that most tree problems boil down to tree traversal problems. So I started thinking along that line and came up with a solution. The idea is to traverse the tree inorder and adjust the pointers accordingly. The code for this is pretty straightforward.


Node *last_left = 0;

void traverse(Node *node)
{
    if (node == 0)
        return;

    //traverse left subtree
    traverse(node->left_);

    //adjust the left and right pointers (left points to previous
    //and right points to next) in terms of a doubly linked list
    if (last_left != 0) {
        last_left->right_ = node;
        node->left_ = last_left;
    }
    last_left = node;

    //traverse right subtree
    traverse(node->right_);
}