More pages: 1 2 3 4
Framework4 update
Thursday, November 5, 2009 | Permalink
I haven't mentioned the new framework I'm working on in a while, mostly because progress has been slow and I've been occupied with other things. Just wanted to give a brief mention of a neat trick I've implemented recently. In my framework I try to strike some kind of balance between ease of use and readability versus performance. Most of the time my demos are not very CPU heavy at all, so the CPU-side performance part isn't all that relevant, but it's always nice to have low overhead, just because it feels good. In Framework4 I've taken some steps to keep the overhead down. Something I usually do for readability is to set textures and other resources by name using strings, like so:
context->SetTexture("Tex", m_Texture);
This makes the code very readable and it's easy to see the mapping from the code to the shader. Unfortunately, lookups by a bunch of strcmp() is not the fastest way to find the corresponding texture unit index, even though it doesn't have to be terribly slow. A faster way is to use a precomputed hash value and do a lookup with that instead. I still want to use the same syntax though. What I ended up doing was changing
const char * to
StringHash, which is a class containing only a uint32 value, so for all practical purposes, it's a uint32 when compiled with optimizations. StringHash has loads of constructors. One for each length of the string I need, like so:
StringHash(const char (&str)[2]);
StringHash(const char (&str)[3]);
StringHash(const char (&str)[4]);
StringHash(const char (&str)[5]);
...
StringHash(const char (&str)[21]);
This covers all strings up to 20 characters, which should be enough for resource names. Each constructor contains the unrolled code for generating the hash, for instance:
StringHash(const char (&str)[4])
{
m_Hash = 0;
m_Hash = m_Hash * 65599 + str[0];
m_Hash = m_Hash * 65599 + str[1];
m_Hash = m_Hash * 65599 + str[2];
}
The hash function itself is not very important, I chose this one because it came up early on a google search. So when I make a SetTexture() call, I can call it with a constant string directly, which in C++ means it'll look for a way to convert the const char array to a StringHash, and finding a matching constructor it will use that to construct the StringHash object for me. Given the compile-time constant string and the simple unrolled code, any decent compiler will boil all this down to the resulting hash code when compiled with optimizations, and the runtime cost is essentially the same as if I would have written it like this:
context->SetTexture(0x29C22FA7, m_Texture);
A sneak peek at the assembly code we find the resulting hash inlined as an intermediate constant:
mov dword ptr [eax], 29C22FA7h
call Context::SetTexture
It will of course still be faster to just call SetTexture() with an index directly instead of the name, but this should narrow the gap somewhat.
cDc
Saturday, April 9, 2011
Hi all,
Thanks Aslan for that smart implementation. I just played a bit with it on MSVC, here are some conclusions:
- VC2010 inline strings of max 254 chars (the link takes 18s on AMD X4 945 3GHz using only one core); bigger strings are split in an inline hash for the last 254 chars and a function call for the rest.
- VC2005 SP2 and VC2008 SP1 behave exactly the same (note: /O2 has to be specifically selected, even if it is activated by default in release; /O1 and /Ox are also good). Expanding manually one step the _Hash() function, decreases the linking time down to only 6s.
- g++ on Ubuntu 10.04 x86 doesn't seem able to inline the final value even with -O3 (though it expands the iteration).
There seems to be only one problem: the template can not be used two or more times (the hash value is not inlined anymore). Any idea why and how to force the linker to do its job?
The code I used:
class StringHash
{
private:
uint32_t m_val;
template<size_t N> inline uint32_t _Hash(const char (&str)[N]) const
{
typedef const char (&truncated_str)[N-1];
return str[N-1] + 65599 * _Hash((truncated_str)str);
}
inline uint32_t _Hash(const char (&str)[2]) const { return str[1] + 65599 * str[0]; }
public:
template <size_t N> StringHash(const char (&str)[N]) { m_val = _Hash(str); }
inline operator uint32_t() const { return m_val; }
};
Makenshi
Saturday, April 16, 2011
Hey,
I really like that template implementation, but it doesn't work if the parameter passed is not "like_this" but rather StringHash(str.c_str());
I've been trying to get it working for a long time but with no success.
Aslan Dzodzikov
Friday, May 20, 2011
Hi all.
cDc, __forceinline makes the thing. As for build times, you can unroll the "inlining loop" further, just like here:
class StringHash
{
unsigned m_val;
template<size_t N> __forceinline unsigned _Hash(const char (&str)[N])
{
typedef const char (&truncated_str)[N-4];
return str[N-1] + 65599*(str[N-2] + 65599*(str[N-3] + 65599 * (str[N-4] + 65599*_Hash((truncated_str)str))));
}
__forceinline unsigned _Hash(const char (&str)[4])
{
typedef const char (&truncated_str)[3];
return str[3] + 65599 * _Hash((truncated_str)str);
}
__forceinline unsigned _Hash(const char (&str)[3])
{
typedef const char (&truncated_str)[2];
return str[2] + 65599 * _Hash((truncated_str)str);
}
__forceinline unsigned _Hash(const char (&str)[2]) { return str[1] + 65599 * str[0]; }
__forceinline unsigned _Hash(const char (&str)[1]) { return str[0]; }
public:
template <size_t N> __forceinline StringHash(const char (&str)[N]) { m_val = _Hash(str); }
operator unsigned() { return m_val; }
};
int _tmain(int argc, _TCHAR* argv[])
{
#define STRING_262 "The a value 1, The a value 2,...etc
printf("a hash = %x\n", StringHash(STRING_262) );
printf("b hash = %x\n", StringHash("The b value"
);
printf("a hash 2 = %x\n", StringHash(STRING_262) );
return 0;
}
Here STRING_262 is a 262 char string. This test program builds in 2 sec., and all hash values are computed at build time.
Makenshi,
the StringHash is a compile time tool, an exact value of its argument must be known by compiler.
More pages: 1 2 3 4