# March 2010

Warning! Some information on this page is older than 5 years now. I keep it for reference, but it probably doesn't reflect my current knowledge and beliefs.

# Introduction to Endianness

20:06 Wed 31 Mar 2010

For those of my readers who don't know this subject yet, here is my brief introduction to what endianness is. It's simply the order in which numbers that span multiple bytes are stored in computer memory, in a file or on the network. There are two possibilities: Big-Endian and Little-Endian. It is (or at least should be) always clearly stated which endianness is in use in a particular hardware platform, file format or network protocol.

But let's start with something simple. As we all know, a byte has 8 bits and we usually imagine it as a sequence of zeros and ones ordered from the most significant bit to the least significant bit. So for example, if we have an 8-bit unsigned integer number, we could write it this way:

201 = 0b11001001

BTW, the new Java version 7 will be able to understand such binary literals starting with 0b. I wish the creators of C++ had allowed this instead of the totally useless and misleading octal numbers starting with just 0 :(

Back to the point... As a byte is the smallest piece of data that can be addressed and processed, we don't care in what order the memory or file stores the bits inside of it. But when a number spans multiple bytes, things get a little more complicated. Let's say we have a 32-bit unsigned integer number:

20100331 = 0x0132B4EB

In all the notations we as humans and programmers use to write numbers, whether in decimal, hexadecimal, octal or binary radix, we start from the most significant digit. So it would be natural and intuitive to think that a computer stores such a 32-bit number as these four consecutive bytes (shown as hex numbers):

01 32 B4 EB (Big-Endian)

But it's often not the case. The order in which the most significant byte of a multibyte value comes first is called Big-Endian and it's used in some file formats, network protocols and as the native way of keeping numbers in memory on some hardware platforms (like the Xbox 360). But our PCs natively use Little-Endian, which means they store the bytes of multibyte values in the opposite order, where the least significant byte comes first:

EB B4 32 01 (Little-Endian)

So if you write a piece of C++ code that assigns unsigned x = 0x0132B4EB;, saves this variable to a file: fwrite(&x, sizeof(unsigned), 1, myFile); and then inspect this file in some hex editor, you will see the bytes stored in the order opposite to the natural one - EB B4 32 01 - because the PC is a Little-Endian platform. It looks the same way in RAM when you look at the address where the variable x is kept.

The good news is that you don't always have to remember about all this endianness stuff. As long as you just use your multibyte numbers (short, int, __int64, float, double) in memory, store them in files and transfer them across the network, and your application is compiled only for platforms with the same endianness (like the PC, no matter whether it's Windows or Linux), you don't even have to think about the byte order of these values.

But there are cases when you do have to think about endianness, like when you address memory or a file byte after byte while trying to interpret its data. Sometimes you even have to convert the endianness. For example, you may prepare some data to be read on the Xbox 360, or you may code support for a file format or a network protocol that uses Big-Endian for its numbers.

How to do such a conversion? The multiplatform sockets API provides functions that convert 16-bit "short" and 32-bit "long" numbers between the so-called "host" byte order (the order used on the current platform) and "network" byte order (which is Big-Endian, used by IP, TCP, UDP etc.). They are: htons, ntohs, htonl, ntohl.

I've also included some functions for endianness conversion in my CommonLib library. First, here is how I swap bytes in 16-bit, 32-bit and 64-bit numbers. It's quite simple and based just on bit shifts and masks.

inline void SwapEndian16(void *p)
{
    uint2 &u = *(uint2*)p;
    u = (u << 8) | (u >> 8);
}

inline void SwapEndian32(void *p)
{
    uint4 &u = *(uint4*)p;
    u = (u << 24)
        | ((u & 0x0000ff00u) << 8)
        | ((u & 0x00ff0000u) >> 8)
        | (u >> 24);
}

inline void SwapEndian64(void *p)
{
    uint8 &u = *(uint8*)p;
    u = (u << 56) | (u >> 56)
        | ((u & 0x000000000000ff00ull) << 40)
        | ((u & 0x00ff000000000000ull) >> 40)
        | ((u & 0x0000000000ff0000ull) << 24)
        | ((u & 0x0000ff0000000000ull) >> 24)
        | ((u & 0x00000000ff000000ull) <<  8)
        | ((u & 0x000000ff00000000ull) >>  8);
}

I also wanted to have functions that swap the endianness of a whole array of numbers in a single call:

void SwapEndian16_Array(void *p, uint count);
void SwapEndian32_Array(void *p, uint count);
void SwapEndian64_Array(void *p, uint count);

void SwapEndian16_Data(void *p, uint count, int stepBytes);
void SwapEndian32_Data(void *p, uint count, int stepBytes);
void SwapEndian64_Data(void *p, uint count, int stepBytes);

And finally I've defined more type-aware, overloaded functions that swap the byte order regardless of the type of the parameter passed. This way I can easily define new versions of this function for some of my custom types while the semantics stay the same, so I can use it in templates. The functions for single-byte types like bool and char are here for completeness and, as you can see, do nothing :)

inline void SwapEndian(bool             &v) { }
inline void SwapEndian(unsigned char    &v) { }
inline void SwapEndian(signed char      &v) { }
inline void SwapEndian(unsigned short   &v) { SwapEndian16(&v); }
inline void SwapEndian(short            &v) { SwapEndian16(&v); }
inline void SwapEndian(unsigned         &v) { SwapEndian32(&v); }
inline void SwapEndian(int              &v) { SwapEndian32(&v); }
inline void SwapEndian(unsigned long    &v) { SwapEndian32(&v); }
inline void SwapEndian(long             &v) { SwapEndian32(&v); }
inline void SwapEndian(unsigned __int64 &v) { SwapEndian64(&v); }
inline void SwapEndian(__int64          &v) { SwapEndian64(&v); }
inline void SwapEndian(float            &v) { SwapEndian32(&v); }
inline void SwapEndian(double           &v) { SwapEndian64(&v); }

# My Algorithms for Sorted Arrays

16:47 Sun 21 Mar 2010

Yesterday I wrote about The Beauty of Sorted Arrays. I believe the C++ STL is not perfect, so I've written some additional code that I needed for handling such a data structure. It can be found in my CommonLib library.

I once wanted to have a sorting algorithm, just like std::sort, but using insertion sort. std::sort is of course great, has average O(n log n) complexity and AFAIK internally uses the introsort algorithm (an improved version of quicksort). But if we know that the collection is already almost sorted, insertion sort will work faster, approaching O(n) for nearly sorted input. So here is my code:

// Basic version that uses operator <
template<class Iterator>
void InsertionSort(Iterator b, Iterator e)
{
    if (b == e) return;
    for (Iterator j = b + 1; j < e; j++)
    {
        typename std::iterator_traits<Iterator>::value_type key;
        key = *j;
        Iterator i = j;
        while (i > b && key < *(i-1))
        {
            *i = *(i-1);
            i--;
        }
        *i = key;
    }
}

// Overloaded version with explicit comparator
template<class Iterator, typename Compare>
void InsertionSort(Iterator b, Iterator e, Compare Comp)
{
    if (b == e) return;
    for (Iterator j = b + 1; j < e; j++)
    {
        typename std::iterator_traits<Iterator>::value_type key;
        key = *j;
        Iterator i = j;
        while (i > b && Comp(key, *(i-1)))
        {
            *i = *(i-1);
            i--;
        }
        *i = key;
    }
}

Another problem is finding an item in an ordered array by some key. I can see two solutions in C++, both unsatisfactory. The first one is to write your own comparator or overload operator < for your item structure to compare items by key, like this:

struct Item
{
    int Id;
    DWORD Color;
    std::string Name;

    bool operator < (const Item &rhs) const { return Id < rhs.Id; }
};

The problem is that a C++ comparator always compares two full items, so to use this comparison for finding an item in the array, we must construct and pass a full Item object, even though only the Id field is actually used.

Item item;
item.Id = 1;
bool contains = std::binary_search(vec.begin(), vec.end(), item);

The second solution is to use the std::map container. It has separate key and value types, so you can search by key:

std::map<int, Item>::iterator foundIt = map.find(1);
if (foundIt != map.end())
    std::cout << "Found ID=" << foundIt->first
        << ", Name=" << foundIt->second.Name << std::endl;

But here the problem is that the key is kept separately from the value type, so the Id field should not be included in the Item structure anymore, because it would be redundant. The actual type of the items contained in the map is std::pair<const int, Item>, which is aliased to std::map<int, Item>::value_type.

I wanted a more convenient way to quickly search an ordered array by key, so I've coded these two function templates:

template <typename IterT, typename KeyT, typename CmpT>
IterT FirstNotLessIndex(IterT beg, IterT end, const KeyT &key, CmpT cmp)
{
    unsigned down = 0, up = (end - beg);
    while(down < up)
    {
        unsigned mid = (down + up) / 2;
        int res = cmp(key, *(beg + mid));
        if(res > 0)
            down = mid + 1;
        else
            up = mid;
    }
    return beg + down;
}

template <typename IterT, typename KeyT, typename CmpT>
IterT BinarySearch(IterT beg, IterT end, const KeyT &key, CmpT cmp)
{
    IterT it = FirstNotLessIndex<IterT, KeyT, CmpT>(beg, end, key, cmp);
    if (it == end || cmp(key, *it) != 0)
        return end;
    return it;
}

The first one performs a binary search to find the iterator for the first element not less than (that is, greater than or equal to) the specified key. It's perfect for determining the place to insert a new item into the array. The second function uses the first one to check whether an item with an exactly matching key exists in the array and returns an iterator pointing to it, or the end iterator if not found.

As you can see, both functions take the type of the key as a template parameter, which may be different from the full item type pointed to by the iterator. They also take a comparator functor which is always called with parameters (key, item) and should return an int. The comparator should return a number less than, equal to or greater than zero according to the result of the comparison - just like strcmp, memcmp and other functions from the C standard library.

Now my example can be written like this:

struct Item
{
    int Id;
    DWORD Color;
    std::string Name;

    static int Cmp(int id, const Item &rhs) { return id - rhs.Id; }
};

bool contains =
common::BinarySearch(vec.begin(), vec.end(), 1, &Item::Cmp) != vec.end();

Having such a universal comparison function, it's easy to derive all the comparison operators from it:

bool operator <  (const Item &rhs) const { return Cmp(Id, rhs) <  0; }
bool operator >  (const Item &rhs) const { return Cmp(Id, rhs) >  0; }
bool operator <= (const Item &rhs) const { return Cmp(Id, rhs) <= 0; }
bool operator >= (const Item &rhs) const { return Cmp(Id, rhs) >= 0; }
bool operator == (const Item &rhs) const { return Cmp(Id, rhs) == 0; }
bool operator != (const Item &rhs) const { return Cmp(Id, rhs) != 0; }

The opposite conversion is also possible. If you have a type with an accessible operator < (like float or your own type with an overloaded operator), you can adapt it to a comparator returning int with this function:

template <typename T>
inline int UniversalCmp(const T &a, const T &b)
{
    if (a < b) return -1;
    if (b < a) return  1;
    return 0;
}

Comments | #commonlib #stl #c++ #algorithms

# The Beauty of Sorted Arrays

16:48 Sat 20 Mar 2010

A one-dimensional array (also called a vector) is a very simple data structure. We were taught that inserting or deleting elements in the middle of an array is slow because we have to move (by rewriting) all the following elements. That's why books about C++ teach that for sorted collections - those that provide a fast search operation - it's better to use containers like std::set or std::map.

But those who code games in C++ and care about performance know that today's processors (CPUs) are very fast at doing calculations and processing lots of data, while access to RAM becomes more and more of a bottleneck. According to the great presentation Pitfalls of Object Oriented Programming by Tony Albrecht from Sony, a cache miss (an access to data that is not currently in the CPU cache) costs about 400 cycles! Dynamic memory allocation (like operator new in C++) is also a slow operation. That's why data structures built from separate nodes linked with pointers (like std::list, std::set, std::map) are slower than std::vector, and it's often better to keep a sorted collection of elements in a vector, even if you have hundreds or thousands of elements and must sometimes insert or delete them in the middle.

Unfortunately, the C++ STL doesn't provide a "std::ordered_vector" container that would keep items sorted while allocating a single chunk of contiguous memory. We have to do it ourselves, like this:

First, we have to choose a way to compare elements. There are multiple solutions (read about comparators in C++), but the simplest one is to just overload operator < in your element structure.

#include <vector>
#include <string>
#include <algorithm>

struct Item
{
    int Id;
    DWORD Color;
    std::string Name;

    Item() { }
    Item(int id, DWORD color, const std::string &name)
        : Id(id), Color(color), Name(name) { }

    bool operator < (const Item &rhs) const { return Id < rhs.Id; }
};

typedef std::vector<Item> ItemVector;

ItemVector vec;
Item item;

Here is how we insert an item into sorted vector:

item = Item(1, 0xFF0000, "Red");
ItemVector::iterator itToInsert =
    std::lower_bound(vec.begin(), vec.end(), item);
vec.insert(itToInsert, item);

Unfortunately we have to deal with iterators instead of simple indices.

To check if the vector contains element with particular Id:

item = Item(1, 0, std::string());
bool contains = std::binary_search(vec.begin(), vec.end(), item);

Here comes another nuisance. We have to create a full Item object just to use its Id field for comparison. I'll show tomorrow how to overcome this problem.

It's also annoying that the binary_search algorithm returns bool instead of an iterator that would show us where the found item is. To find an item and determine its iterator (and from that its index) we have to use the lower_bound algorithm. This algorithm is very universal but also a bit tricky. It quickly searches a sorted collection and returns the position of the first element whose value is not less than (i.e. greater than or equal to) the specified value. That's perfect for determining the place to insert a new item, but to find an existing element, we have to combine it with an if:

item = Item(1, 0, std::string());
ItemVector::iterator findIt = std::lower_bound(vec.begin(), vec.end(), item);
if (findIt != vec.end() && findIt->Id == item.Id)
{
    // Found
    size_t foundIndex = std::distance(vec.begin(), findIt);
}
else
{
    // Not found
}

And finally, to remove an item from the vector while having index, we do this trick:

size_t indexToDelete = 0;
vec.erase(vec.begin() + indexToDelete);

Well, the C++ STL is very general, universal, powerful and... very complicated. Standard libraries of other programming languages also have some curiosities in their array classes.

For example, ActionScript 3 has a single method for both inserting and removing items:

function splice(startIndex:int, deleteCount:uint, ...  values):Array

To delete some items, you pass a non-zero deleteCount. To insert some items, you pass them as the values parameter.

In .NET, the System.Collections.Generic.List&lt;T&gt; class (which is actually an array, so the name is a misnomer) has a very smart BinarySearch method. It returns the "zero-based index of item (...), if item is found; otherwise, a negative number that is the bitwise complement of the index of the next element that is larger than item or, if there is no larger element, the bitwise complement of Count." It can be used both to find an existing item in the array and to determine the index for a new item, like this:

int index = myList->BinarySearch("abc");
if (index < 0)
    myList->Insert(~index, "abc");

Comments | #algorithms #c++ #stl #.net #flash

# How to Disable Redrawing of a Control

18:43 Thu 18 Mar 2010

When coding a Windows application with a GUI, there is an issue with how long it takes to add lots of items (like hundreds or thousands) to a list view or tree view control. It is caused by redrawing the control after each operation. I've seen this annoying effect in many programs, including some serious, commercial ones. Apparently many programmers don't know there is a simple solution. But first a bit of background...

When coding games, we constantly spin inside a big loop and redraw the whole screen every frame. Calculations are separated from rendering, so we can, for example, build a list with thousands of items in the computation function, and the visible part of the list will be rendered starting from the first rendering call after that. When coding windowed applications, nothing new is drawn onto the screen unless needed. We have to do it manually and we can call the redrawing function at any time. So how should a list control be refreshed when we add an item to it? It is done automatically after each insert, which is a good solution... unless we want to add hundreds of them.

That's why GUI library developers provide functionality to temporarily disable redrawing of a control.

• In pure WinAPI it can be done by sending the WM_SETREDRAW message to the control window. I'm sure it works with the List View and Tree View controls, but this message is described in MSDN as something general in the GDI section, so it should work with other control types too.
• The MFC wrapper over this message is the SetRedraw method.
• In .NET many controls, like ListBox, ComboBox, ListView and TreeView, have BeginUpdate and EndUpdate methods.
• In the wxWidgets library, the wxWindow class provides the Freeze and Thaw methods.

Comments | #.net #mfc #wxwidgets #winapi #gui

# The Code Bubbles Idea

18:03 Wed 17 Mar 2010

The Code Bubbles Project is an interesting scientific project that tries "Rethinking the User Interface Paradigm of Integrated Development Environments". The core idea may seem controversial, but the video is definitely worth watching. We also have a discussion about it on our forum: Code bubbles - IDE przyszłości? ("Code Bubbles - the IDE of the future?") [pl].

As for my opinion, I'm enthusiastic about it - at least about half of this project. For a very long time I have dreamed of opening and editing single classes or function definitions instead of navigating through smaller or larger source files. I think that grouping such definitions into source code files as an intermediate step between the whole project and a single class or function is just unnecessary. Of course those who like to extensively use preprocessor macros, Find&Replace or grep would not agree, but I believe the future lies in automatic code refactoring like "Rename symbol" and intelligent code navigation like tracking the call graph.

But on the other hand, I don't like the idea of having bubbles spread over a 2D workspace that can be panned and zoomed. It reminds me of so-called mind maps, which I truly hate. I always prefer to write down my ideas, knowledge, designs and everything else as pure text in tree-like nested lists, so for me it would be more convenient if Code Bubbles just had a tree of classes and methods which could be opened in separate tabs, floating windows or something like that.