Astrologers in our processors

Ya we have an astrologer in our modern micro processors he is the “Branch Predictor”. I have been surfing in the web then I have gone crazy to find out what is the question i.e most viewed and most voted question on stack overflow then I came across a question which is viewed around 500k times, The Question has 8886 up votes,  8946 stars and the answer had 13457 up votes. Let’s come to the point

Branch Predictor predicts which way the branch ( say if-else) goes and continues with the instructions in that branch before actually evaluating the result of the boolean expression if the prediction comes out to be true it continues to execute that part of the code else it discards the instructions that got evaluated and goes to the other branch. This technique is used by modern pipelined micro processor with x86 architectures for better performance.

Branch predictors mostly predicts the value of the boolean expression(i.e which way to go) from the previously calculated outputs of the boolean outputs i.e it tries to recognize a pattern.

Now let’s see the effect of branch prediction on our codes:

Say you have an N sized array named “A” (say a) and you would like to find all the elements less than an element ‘a’ (say) in the array now using general procedure checking every element in the array

CASE 1: Array Is Not Sorted

Now it checks with each element if it is less than a then it executes that part of the code that increments code else don’t.  If the array is not sorted branch predictor as usually tries to predict whether the number is less than ‘a'(in this case) or not but as the array is not sorted most of the times there is a possibility that the predictor  is wrong and it has to discard the instructions it had already fetched along the other branch.

CASE 2:Array Is Sorted

When the array is sorted we have all the numbers less than ‘a’ in the beginning of the N-sized array and all the numbers greater than ‘a’ will be towards the end and branch predicts the correct value most of the times and have a better performance as each times it saves the time for execution of the boolean execution which decides the branch;

Amazing right?????

Related links:

Stack Overflow link is here

wiki link is here

Advertisements

This “bug” is really “buggy”

Did you ever ran a loop with counter as unsigned int then you must have faced this problem consider the following for loop which is expected to print numbers from 16 to 0

unsigned int i ;

for(i = 16;i>=0;i–){

printf(“%d “,i);

}

This loop doesn’t give output as 16 15 ……… 0 but runs on for ever.   Can you guess the reason(once give it a try!!!). Here it goes  once the value of i becomes 0 when i — is done the value changes from 0 to max value of unsigned int(depends upon the architecture of your system) and hence the value of 1 can never go below zero. Hence it’s an infinite loop 🙂

“EXTERN” keyword in “C”

First I would like to write What exactly does a declaration or definition of a variable mean:

Declartion and Definition:

In c declaration tells the compiler that you are going to use a variable name var(say) and it is of type int(say). This is what exactly a declaration does. Remember that just declaration doesn’t allocate any memory to the variable. What definition does is It allocates the memory block to the variable and assigns some value if specified. Most of the times you write int var you are actually declaring as well as defining it i.e you are telling the compiler that you are going use a variable name var which is used to refer to an integer and it allocate the sufficient memory if you give int var =0 it also assigns it some value. You are allowed to declare a variable as many times as you wish but you can define it only once. Once you define a variable you cannot redefine again.

So how do we declare a variable without defining it. Here comes our “extern” key word. extern int var just declares the variable without defining it you can later declare and define or even assign using int var =0;

Just to add, in c only uninitialized global variables are all made to zero but not local variables. The reason can be killing of performance if it is to initialise all the local variables each time corresponding function is called. But global variables are initialised only once and it’s done by CRT( C Run Time) 🙂

The “C” Preprocessor

Some of the important changes will be done to the source code before it is handed over to the “C” compiler by the “C” Preprocessor.   We use Pre processor directives to get work done using Pre processor.   #include is a preprocessor directive which is used to include the elements of the corresponding file mentioned; Say you have used #include<stdio.h> This asks the pre processor to copy all the elements(definitions here) in the stdio.h file into the present file in place of the Pre processor directive(if angular brackets(<>) for a file then It is meant to search for the file in the standard compiler include paths and if double quotes (“”) are used it searches in the path mentioned in between the quotes. Say you use #define VARIABLE VALUE and say you used that variable you defined with the #define directive at different parts of your code the Pre processor replaces all the places where there is presence of VARIABLE with VALUE.

Conditional Compilation is an important feature of Pre processor Directives;  For this you use

#if boolean
//execute this code
#elif boolean
//execute this code
#else
//execute this code
//endif

//There boolean may be some statement which finally gets evaluated to a boolean

You can also have conditional Compilation as below

#ifdef MYDEF
//execute the code if MYDEF is defined i.e #define MYDEF value is done somewhere
#endif

#ifndef MYLIB_H
//usually this is used when you include your own .h file and making sure that type one is not included before i.e you do as below
#include “mylib.h”

#endif

Adding Two Numbers Without Using Arithmetic Operator “+”

Ever Wondered how to do math on a computer without having arithmetic operators; Oh God!!! Luckily we have them;  Here’s a post which is an alternative for adding two numbers without using Arithmetic Operators;  The strategy that we follow is to go back to the basic level of bits and bytes and what happens to binary state of numbers when we add them and then establish a paradigm to obtain the relation between the final pattern of the bits and the initial bit patterns of the two numbers;  Here’s the algorithm

Say you have to add two numbers a,b; The strategy is when you & two bit patterns you get one’s only the places where you get a carry 1 and a xor b will just give the correct sum of the individual bits without considering any carry’s i.e i.e you add 1 1 then you get zero in xor and the carry is taken care from a & b bit pattern;

int a,b,carry;

while(b!=0){

carry=a&b; // This bit pattern is for the carry i.e you are checking for the bit pattern of the carry

a=a^b;        // Now to get the sum of the bit pattern without considering any carry’s

b=carry<<1;//making a left shift of the carry as the carry bit goes left on each computation

}

Now printing a will give the sum of the initial b with a

Arrays and Pointers

Arrays and pointers are really intimidating at the first go. Here are some of the intricacies with the Arrays and Pointers
Arrays and Pointers are equivalent but not equal It’s the array indexing and pointer arithmetic that is similar. Arrays are the constant pointers i.e always an array reference(or variable) points to the first element of the contiguous memory allocated like if arr is the array then a reference variable always points to first element of the array i.e &a[0] or &*(arr+0).  One cannot assign to some other location like arr=&b or arr=arr+1 i.e one cannot change the location to which the reference variable or the name of the array is pointing..,

Coming to the 2D arrays:
In 2D arrays say you have an array say:

char ttt[3][3] = {{'x', 'x', 'o'},
                  {'o', 'o', 'x'},
                  {'x', 'o', ' '}
                 };

Now the memory allocation would be like

2D array memory representation
and the whole 2D array would like the above figure with contiguous memory allocation

Lets say you have int arr[10];

int *a = arr;

Here when you write arr[3] it actually means at the location represented by arr move by 3 places and read the value, when you write a[3] it means get the location value stored in a and move by 3 places and read the value. The value returned by both of them would be the same. This similarity in accessing the elements by the pointers and arrays is exploited in using pointers as function parameters instead of arrays.

&arr will give a pointer to an array of 10 ints whereas &a will give a pointer to an integer.  Though the value returned by both are same i.e address of the first element of the array but the type of the pointers are different.

An L-value  of type array-of-T which appears in an expression decays into a pointer to the first element of the array with the following 3 exceptions:

  1. As an argument to sizeof() function, :sizeof(arr) will give size of array wheras sizeof(a) will give size of a pointer on your machine i.e 4 bytes or 8 bytes.
  2. & operator: &arr will return the address of the first element in the array i.e array’s address but &a will return address of the pointer
  3. Literal string initialiser for a character array: char a[ ] = “hello”; char *a=”hello”; In the first case a doesn’t decay into a pointer but retains it’s behaviour as an array of int’s for the initialisation to take place.

Here a is a pointer to an int or integer. Now let’s go to 2D arrays say we have int arr[rows][Ncolumns] Now arr is pointer to array of Ncolumns ints and hence arr+1 would take you to the next column in Ncolumns. (Got It???)

So How do you usually declare a pointer to an array:

int *ip;       //This declares pointer to an integer

int *ip[3];   //This declares an array of pointers(here pointers of integers) where the array size is 3

int (*ip)[3]    //This declares pointer to an integer array of length 3

Hence It’s clear now that the reference variable or the name of the two dimensional array(of ints here) used is an array to No. of columns sized integer array(here);

Hence If you would like to have an iterator(or would like to pass it as an argument in a function) over an one dimensional array use int *p and If it’s two dimensional array then use int (*ip)[Ncolumns];

src:http://www.fredosaurus.com/notes-cpp/arrayptr/23two-dim-array-memory-layout.html

Performance Optimisation in using two dimensional arrays

Here’s the thing we always write a double for loop for iteration over two two dimensional arrays Yes! we did it many a times as we have to do it.  Here’s a way to optimize the way you write that nested for loop

Usually we can write the nested for loops in the following two ways both does exactly same but when it comes to optimization in execution time First Method wins over the Second Method

First Method:

for (int i = 0 ; i < width; i++) 
{
    for (int j = 0; j < height ; j++) 
    {
        testArray[i][j] = i * j;
    }
}

Second Method:

for (int j = 0; j < height ; ++j)
{
    for (int i = 0 ; i < width; ++i) 
    {
        testArray[i][j] = i * j;
    }
}

The reason is:

In the First Method  you iterate over element by element and once you finish a row you go to the next row hence in this case mostly you use a contigous memory location and you jump over for the next element only when you have to change a row after iterating over elements in that row and in this cache is in you support and when you fetch data from a memory location cache does a house keeping of storing data in immediately next memory locations and this help for a faster iteration.

In the Second Method you jump over the entire row for each element and as the element in the next row is usually not immediately the next memory location you aren’t just using your cache and this reduces the performance of your code.   I was really impressed when I read this article Though it looks obvious for many people realizing even the simplest things is sometimes non-trivial:)

src: http://blog.drorhelper.com/2009/04/performance-optimization-tip-understand.html

Let’s talk about “this”

Ya I mean “this” a keyword in object oriented programming..,
Look at the Example 1 in the below gist.

In the temp class you have two reference variables named x,y and two local variables x,y inside special method.

Now you have another class say temp1 as shown above when run produces the following output:

Now the output will be
1
2
3
4
when you are write “this” in a method you are actually referring to the object of the current class created during the run time. Hence when you write this.x it refers to the reference variable but not the local variable but when you just mention x in special method you are referring to local variables.
When you are using this(args) you are actually referring the constructor of this class. Look at the example 3 for the usage of “this” to call a constructor.

Bit wise operators

Most of us rarely use bit wise operators but if we know where and when to use them they are really going to be awesome.., Let’s check out an application of bit wise operators right now:)

Bit Vectors(series of bits) can be used to represent sets(gotta remember from school days mathematics).

Lets say we have to perform union operation of two sets It becomes increasingly easy if you use bit wise operators

Ex.:

lets say the set we have is {0,3,5,6}

then represent it as follows i.e 7654321 and corresponding status 01101001 where 1 represents the presence of the number in the set and viceversa

Let’s take another set {0,2,4,6}

which is represented as 7654321 and corresponding status 01010101

Now union of two sets is bit wise union( | ) of two statuses and intersection is bit wise intersection( & ) of two statuses and Now Symmetric difference is similarly with bit wise operator ^(exclusive or) and for compliment operator we can use ~ operator on statuses


A smarter Application of bitwise operators

Suppose If you would like to represent if-else conditional in a smarter way Check
out here..,
if(x)
a=y;
else
a=z;
Way one is this a=x?y:z
The other interesting way is:
a= ((x<<31)>>31)&y+(((!x)<<31)>>31)&z
Here we assumed that the boolean x is either 1 or zero

Another Application is here:

Suppose you are asked to swap two numbers without using a third variable here bit wise operators comet to our rescue i.e An important concept here is if you xor a n bit binary number with any arbitary number twice you get that back the original number; Say you have to swap x and y:
int x=53,y=43;
x=x^y;
y=x^y;
x=x^y; // “^” operator is associative