Sep 19, 2012

Fastest memcpy

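An optimized memory copy routine built on SSE2 loads and non-temporal stores. It is reported to be approximately 30-70% faster than memcpy in Microsoft Visual Studio 2005. The routine assumes that src and dest are 16-byte aligned (movdqa and movntdq fault on unaligned addresses), and it copies only whole 128-byte blocks, so any remaining size % 128 bytes must be copied separately.

Code: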

void memcpy_sse2(void* dest, const void* src, const unsigned long size)
{
  // 32-bit MSVC inline assembly. src and dest must be 16-byte aligned.
  // Copies size / 128 whole blocks; the caller must copy any remaining size % 128 bytes.
  __asm
  {
    mov esi, src;    //src pointer
    mov edi, dest;   //dest pointer

    mov ebx, size;   //ebx is our counter
    shr ebx, 7;      //divide by 128 (8 * 128bit registers)
    jz loop_copy_end; //fewer than 128 bytes: no whole block to copy

  loop_copy:
    prefetchnta 128[ESI]; //prefetch the next block with a non-temporal hint
    prefetchnta 160[ESI];
    prefetchnta 192[ESI];
    prefetchnta 224[ESI];

    movdqa xmm0, 0[ESI]; //move data from src to registers
    movdqa xmm1, 16[ESI];
    movdqa xmm2, 32[ESI];
    movdqa xmm3, 48[ESI];
    movdqa xmm4, 64[ESI];
    movdqa xmm5, 80[ESI];
    movdqa xmm6, 96[ESI];
    movdqa xmm7, 112[ESI];

    movntdq 0[EDI], xmm0; //move data from registers to dest, bypassing the cache
    movntdq 16[EDI], xmm1;
    movntdq 32[EDI], xmm2;
    movntdq 48[EDI], xmm3;
    movntdq 64[EDI], xmm4;
    movntdq 80[EDI], xmm5;
    movntdq 96[EDI], xmm6;
    movntdq 112[EDI], xmm7;

    add esi, 128;
    add edi, 128;
    dec ebx;

    jnz loop_copy; //loop please
    sfence;        //make the non-temporal stores visible before returning
  loop_copy_end:
  }
}

Courtesy of William Chan
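
The inline assembly above only builds with a 32-bit Microsoft compiler. As a rough sketch of the same idea in portable C++, the version below uses the SSE2 intrinsics from <emmintrin.h>. The function name memcpy_sse2_intrin and the HAVE_SSE2 macro are made up for this illustration and are not part of the original routine. Unlike the assembly, it falls back to std::memcpy when the pointers are not 16-byte aligned, for the trailing size % 128 bytes, and on targets without SSE2. Whether it beats the library memcpy on a given machine is something to measure, not assume.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>  // SSE2 intrinsics; also provides _mm_prefetch and _mm_sfence
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

// Hypothetical helper for illustration only.
void* memcpy_sse2_intrin(void* dest, const void* src, std::size_t size)
{
    unsigned char* d = static_cast<unsigned char*>(dest);
    const unsigned char* s = static_cast<const unsigned char*>(src);
#if HAVE_SSE2
    // Aligned loads and streaming stores require 16-byte aligned addresses.
    const bool aligned =
        (reinterpret_cast<std::uintptr_t>(d) % 16 == 0) &&
        (reinterpret_cast<std::uintptr_t>(s) % 16 == 0);
    if (aligned) {
        for (std::size_t blocks = size / 128; blocks != 0; --blocks) {
            // Hint that the next block will be read once (prefetch never faults).
            _mm_prefetch(reinterpret_cast<const char*>(s) + 128, _MM_HINT_NTA);
            for (int i = 0; i < 8; ++i) {
                __m128i v = _mm_load_si128(reinterpret_cast<const __m128i*>(s) + i);
                _mm_stream_si128(reinterpret_cast<__m128i*>(d) + i, v);
            }
            s += 128;
            d += 128;
        }
        _mm_sfence();  // order the non-temporal stores before returning
        size %= 128;   // only the tail is left
    }
#endif
    std::memcpy(d, s, size);  // tail bytes, unaligned input, or no SSE2 at all
    return dest;
}
```

It can be called exactly like memcpy, for example memcpy_sse2_intrin(dst, src, n) with two non-overlapping buffers.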

Aug 29, 2012

Calling Convention – Part IV (__thiscall)

Make sure you have read “Calling Convention – Part I”, “Calling Convention – Part II” & “Calling Convention – Part III” of this article.

This calling convention ( __thiscall )

__thiscall is the default calling convention for calling member functions of C++ classes (except for those with a variable number of arguments).

The main characteristics of __thiscall calling convention are:

  1. Arguments are passed from right to left, and placed on the stack. The this pointer is passed in ECX.
  2. Stack cleanup is performed by the called function.

C++ Name Decoration/Mangling For thiscall

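C++ name decoration is more involved than the C schemes shown in the earlier parts and is not covered in detail here; the decorated name of the example function, ?Add@CSum@@QAEHHH@Z, appears in the call below.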

The example for this calling convention had to be a little different. First, the code is compiled as C++, and not C. Second, we have a class/struct with a member function, instead of a global function.

class CSum
{
public:
      int Add ( int nValue1, int nValue2)
      {
           return nValue1+nValue2;
      }
};

The assembly code for the function call looks like this:

push 3
push 2
lea ecx,[sumObj]                 ; Object of CSum (this pointer)
call ?Add@CSum@@QAEHHH@Z         ; CSum::Add
mov DWORD PTR [nResult],eax

The function itself is given below:

; // function prolog
push ebp
mov ebp, esp
push ebx
push esi
push edi
; // return nValue1 + nValue2;
mov eax, DWORD PTR [nValue1]
add eax, DWORD PTR [nValue2]
; // function epilog (registers are restored in the reverse order of the pushes)
pop edi
pop esi
pop ebx
mov esp, ebp
pop ebp
;//Stack cleanup and return
ret 8
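
To make the role of the this pointer more concrete, a member function can be thought of as an ordinary function that receives the object's address as a hidden extra argument. The sketch below is only an illustration of that idea: CSumWithBase, Add_ExplicitThis and CallBoth are hypothetical names, and the class is a variant of CSum with one data member so that this is actually used.

```cpp
// Hypothetical variant of CSum with a data member, so the object pointer matters.
class CSumWithBase
{
public:
    explicit CSumWithBase(int nBase) : m_nBase(nBase) {}

    // The compiler passes the address of the object implicitly (in ECX under __thiscall).
    int Add(int nValue1, int nValue2) const
    {
        return m_nBase + nValue1 + nValue2;
    }

    int m_nBase;
};

// Conceptual equivalent written as a free function: the object pointer is an explicit parameter.
int Add_ExplicitThis(const CSumWithBase* pThis, int nValue1, int nValue2)
{
    return pThis->m_nBase + nValue1 + nValue2;
}

// Calls both forms on the same object and returns the shared result, or -1 if they differ.
int CallBoth()
{
    CSumWithBase sumObj(10);
    int nMember = sumObj.Add(2, 3);               // &sumObj is passed for us
    int nFree = Add_ExplicitThis(&sumObj, 2, 3);  // &sumObj is passed by hand
    return (nMember == nFree) ? nMember : -1;
}
```

Both calls compute the same value; the only difference is who supplies the object's address and where it travels.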

Now, what happens if we have a member function with a variable number of arguments? In that case, __cdecl is used, and this is pushed onto the stack last.

Conclusion

__thiscall calling convention is the default calling convention used by C++ member functions that do not use variable arguments.

Aug 11, 2012

Large Ukraine-based BitTorrent site Demonoid shut down

Ukrainian authorities have taken down Demonoid.com, one of the world’s largest torrent file-sharing sites.

Investigators from the country’s Ministry of Internal Affairs raided the data centre that was hosting the website’s servers.

Torrents allow users to download music, video and other internet content by downloading small bits of files from others’ computers at the same time.

The shutdown is the latest news in a campaign against file-sharing sites.

It follows the US’s closure of Megaupload, and several European ISPs (internet service providers) being ordered to block access to The Pirate Bay.

Demonoid was listed alongside both of these sites in The Notorious Markets List - a document drawn up by the US government at the end of last year highlighting services that “merit further investigation for possible intellectual property rights infringements”.

It noted that Demonoid “recently ranked among the top 600 websites in global traffic and the top 300 in US traffic”.

Back online?

Users first became aware of the action on 26 July, when attempts to access Demonoid’s site yielded a “server busy” message.

The Torrentfreak news site reported that Ukraine’s Division of Economic Crimes acted after receiving a request from the international police organisation Interpol.

It said the local authorities then contacted Demonoid’s ISP, Colocall, which decided to pull its service, and allowed investigators to copy data off its servers.

“Demonoid is known for its links to relatively rare content which may be harder to come by now,” Torrentfreak’s editor Ernesto Van Der Sar told the BBC.

“However, it’s not going to stop the majority of people from sharing files as the most popular items are available through hundreds of other BitTorrent sites.”

The action follows the arrest of one of Demonoid’s administrators in Mexico last October. But despite the setbacks Mr Van Der Sar suggested it was too soon to consign the site to history.

“In 2006 The Pirate Bay came back online three days after it was raided, and in the years that followed it grew out to become the largest BitTorrent site,” he said.

The BPI, which represents the UK music industry, and the MPAA (Motion Picture Association of America) – which have both campaigned against online copyright infringement – declined to comment when approached by the BBC.

Original Link: BBC News

Jul 28, 2012

Calling Convention – Part III (__fastcall)

Make sure you have read “Calling Convention – Part I” & “Calling Convention – Part II” of this article.

Fast calling convention ( __fastcall )

The fast calling convention indicates that the first two arguments should be placed in registers (ECX and EDX) and the rest pushed onto the stack. This reduces the cost of a function call, because operations with registers are faster than with the stack.

We can explicitly declare a function to use the __fastcall convention as shown:

int __fastcall Add( int nValue1, int nValue2 );

The main characteristics of __fastcall calling convention are:

  1. The first two function arguments that require 32 bits or less are placed into registers ECX and EDX. The rest of them are pushed on the stack from right to left.
  2. Arguments are popped from the stack by the called function.
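
As a small illustration of the two rules above, the sketch below declares functions with more than two arguments and with an argument wider than 32 bits. DEMO_FASTCALL, Add3 and AddWide are hypothetical names; the macro expands to __fastcall only for a 32-bit Microsoft-style build and to nothing elsewhere, so the register comments describe what the rules predict for that build and are not verified output.

```cpp
// Hypothetical portability macro: __fastcall only applies to 32-bit x86 builds.
#if defined(_MSC_VER) && defined(_M_IX86)
#define DEMO_FASTCALL __fastcall
#else
#define DEMO_FASTCALL
#endif

// Expected under __fastcall: nValue1 -> ECX, nValue2 -> EDX, nValue3 -> stack.
int DEMO_FASTCALL Add3(int nValue1, int nValue2, int nValue3)
{
    return nValue1 + nValue2 + nValue3;
}

// nWide needs more than 32 bits, so by rule 1 it is not a register candidate and
// goes on the stack; the two ints are then the first two arguments that qualify.
long long DEMO_FASTCALL AddWide(long long nWide, int nValue1, int nValue2)
{
    return nWide + nValue1 + nValue2;
}
```

Calls look the same as for any other function, for example Add3(1, 2, 3); the convention only changes the generated code.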

Function Name Decoration For fastcall

The function name is decorated by prefixing a ‘@’ character and appending a ‘@’ followed by the number of bytes taken by the arguments, including those passed in registers.

@Add@8 //@ before the function name, then @ and the number of bytes taken by the arguments
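
The decoration rules from this part and the two previous parts can be summarised in a few lines of code. The helper below is a hypothetical illustration (DecorateC and Convention are not real compiler APIs); it assumes 32-bit x86 C decoration and that each argument is counted as a multiple of 4 bytes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum class Convention { Cdecl, Stdcall, Fastcall };

// Hypothetical helper: builds the decorated C name from the sizes of the arguments.
std::string DecorateC(const std::string& name, Convention conv,
                      const std::vector<std::size_t>& argSizes)
{
    std::size_t bytes = 0;
    for (std::size_t size : argSizes)
        bytes += (size + 3) / 4 * 4;  // assumption: each argument rounds up to 4 bytes

    switch (conv) {
    case Convention::Cdecl:    return "_" + name;                                // _Add
    case Convention::Stdcall:  return "_" + name + "@" + std::to_string(bytes);  // _Add@8
    case Convention::Fastcall: return "@" + name + "@" + std::to_string(bytes);  // @Add@8
    }
    return name;  // not reached for valid enum values
}
```

For the Add example, DecorateC("Add", Convention::Fastcall, {4, 4}) yields @Add@8.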

Note: Microsoft has reserved the right to change the registers used for passing arguments in future compiler versions.

Here is an example:

; // put the arguments in the registers EDX and ECX
mov         edx,3
mov         ecx,2
; // call the function
call @Add@8
; // copy the return value from EAX to a local variable (int nResult)
mov DWORD PTR [nResult], eax

The called function is shown below:

; // function prolog
push ebp
mov ebp, esp
push ebx
push esi
push edi
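; // return nValue1 + nValue2; (nValue1 arrives in ECX, nValue2 in EDX)
mov eax, ecx
add eax, edx
; // function epilog (registers are restored in the reverse order of the pushes)
pop edi
pop esi
pop ebx
mov esp, ebp
pop ebp
; // both arguments were passed in registers, so there is nothing to remove from the stack
ret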

Conclusion

Advantage

The advantage of the __fastcall calling convention is that it passes the first arguments in registers rather than on the stack, which can make function calls faster.

Disadvantage

The disadvantage of the __fastcall calling convention is that functions with a variable number of arguments (like printf()) can’t use it. They must use __cdecl instead, because only the caller knows how many arguments were passed in each call, so only the caller can perform the stack cleanup.

Final Part: Cont. Calling Convention – Part IV

Jul 27, 2012

Calling Convention – Part II (__stdcall)

Make sure you have read “Calling Convention – Part I” of this article.

Standard calling convention ( __stdcall )

This convention is usually used to call Win32 API functions.

Note: WINAPI is nothing but another name for __stdcall:

#define WINAPI __stdcall

We can explicitly declare a function to use the __stdcall convention:

int __stdcall Add( int nValue1, int nValue2 );

The main characteristics of __stdcall calling convention are:

  1. Arguments are passed from right to left, and placed on the stack.
  2. Stack cleanup is performed by the called function.
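
Because the caller and the called function must agree on who cleans the stack, the calling convention is also part of a function pointer's type. The sketch below illustrates this with hypothetical names (DEMO_STDCALL, AddStd, AddFn, CallThrough); the macro expands to __stdcall only for a 32-bit Microsoft-style build and to nothing elsewhere.

```cpp
// Hypothetical portability macro: __stdcall only applies to 32-bit x86 builds.
#if defined(_MSC_VER) && defined(_M_IX86)
#define DEMO_STDCALL __stdcall
#else
#define DEMO_STDCALL
#endif

// A function that cleans up its own arguments when __stdcall is in effect.
int DEMO_STDCALL AddStd(int nValue1, int nValue2)
{
    return nValue1 + nValue2;
}

// The convention appears inside the pointer type, so a call through the pointer
// is generated with the same cleanup rules as a direct call.
typedef int (DEMO_STDCALL *AddFn)(int, int);

int CallThrough(AddFn pfn, int nValue1, int nValue2)
{
    return pfn(nValue1, nValue2);
}
```

A call such as CallThrough(&AddStd, 2, 3) then behaves like a direct call to AddStd(2, 3).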

Function Name Decoration For stdcall

Function name is decorated by prefixing an underscore character ‘_’ and postfixing a ‘@’ character and number of bytes of stack space required by the arguments at end of the function name.

_Add@8 //underscore before function name & @ and number of bytes space required on stack

Now, take a look at an example of a __stdcall call:

; // push arguments to the stack, from right to left
push 3
push 2
; // call the function
call _Add@8
; // copy the return value from EAX to a local variable (int nResult)
mov DWORD PTR [nResult], eax

The called function is shown below:

; // function prolog
push ebp
mov ebp, esp
push ebx
push esi
push edi
; // return nValue1 + nValue2;
mov eax, DWORD PTR [nValue1]
add eax, DWORD PTR [nValue2]
; // function epilog (registers are restored in the reverse order of the pushes)
pop edi
pop esi
pop ebx
mov esp, ebp
pop ebp
;//Stack cleanup and return
ret 8

Conclusion

__stdcall is the default calling convention for Win32 API functions.

Advantage

The advantage of the __stdcall calling convention is that it creates smaller executables than __cdecl: the stack cleanup code exists once, inside the called function, instead of being repeated after every call.

Disadvantage

The disadvantage of the __stdcall calling convention is that functions with a variable number of arguments (like printf()) can’t use it. They must use __cdecl instead, because only the caller knows how many arguments were passed in each call, so only the caller can perform the stack cleanup.

Cont. Calling Convention – Part III

Jul 26, 2012

Calling Convention – Part I (__cdecl)

Before reading, note that I assume you have sufficient knowledge of C/C++ and assembly.

Introduction

While learning C++ programming for Windows, you must have come across strange specifiers that sometimes appear in front of function declarations, like __cdecl, __stdcall (WINAPI, CALLBACK), __fastcall, etc. These specifiers are called “calling conventions”.

What are the calling conventions?

When a function is called, the arguments are typically passed to it, and the return value is retrieved. A calling convention describes how the arguments are passed and values returned by functions. It also specifies how the function names are decorated.

Is it really necessary to understand the calling conventions to write good C programs?

No, not at all. However, it may be helpful with debugging. Also, it is necessary for linking C/C++ with assembly code.

How does it work?

No matter which calling convention is used, the following things will happen:

  1. All arguments are typically saved (pushed) on the stack, but may also be passed in registers (__fastcall will be discussed in a later post).
  2. Program execution jumps to the address of the called function.
  3. Inside the function, registers ESI, EDI, EBX, and EBP are saved on the stack. The part of the code that performs these operations is called the ‘function prolog’ and is usually generated by the compiler.
  4. The function-specific code is executed, and the return value is placed into the EAX register.
  5. Registers ESI, EDI, EBX, and EBP are restored from the stack. The piece of code that does this is called the ‘function epilog’, and in most cases the compiler generates it.
  6. Arguments are removed (popped) from the stack. This operation is called stack cleanup and may be performed either inside the called function or by the caller, depending on the calling convention used.
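
The steps above can be sketched as a toy model in C++. This is only an illustration: a std::vector stands in for the hardware stack, the return address and saved registers are left out, and the names ToyStack, ToyAdd and ToyCall are made up for this sketch. It shows the one step that differs between conventions, namely who performs the stack cleanup.

```cpp
#include <cstddef>
#include <vector>

// Toy model: each element of the vector is one 4-byte stack slot.
struct ToyStack {
    std::vector<int> slots;
    void Push(int value) { slots.push_back(value); }
    void Drop(std::size_t count) { slots.resize(slots.size() - count); }
};

// Called function: reads its two arguments relative to the top of the stack.
// If bCalleeCleans is true it also removes them (callee cleanup);
// otherwise it leaves them in place for the caller.
int ToyAdd(ToyStack& stack, bool bCalleeCleans)
{
    std::size_t top = stack.slots.size();
    int nValue1 = stack.slots[top - 1];  // pushed last
    int nValue2 = stack.slots[top - 2];  // pushed first
    int nResult = nValue1 + nValue2;     // a real function would return this in EAX
    if (bCalleeCleans)
        stack.Drop(2);
    return nResult;
}

// Caller: pushes the arguments from right to left, calls the function,
// and cleans up afterwards if the called function did not.
int ToyCall(ToyStack& stack, bool bCalleeCleans)
{
    stack.Push(3);  // second argument is pushed first
    stack.Push(2);  // first argument is pushed last
    int nResult = ToyAdd(stack, bCalleeCleans);
    if (!bCalleeCleans)
        stack.Drop(2);  // caller cleanup
    return nResult;
}
```

In both modes the stack ends up with the same number of slots it started with; only the place where the two argument slots are dropped changes.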

As an example for the calling conventions (except for __thiscall), we are going to use a simple function:

int Add( int nValue1, int nValue2 )
{
    return nValue1 + nValue2;
}

The call to this function will look like this:

int nResult = Add( 2, 3 );

Note: Remember to compile this example code as C. If you are compiling it as C++, use the code below to avoid C++ name decoration (which is beyond the scope of this post and will be discussed in later posts). This post explains C name decoration only.

#ifdef __cplusplus
extern "C" {
#endif
int Add( int nValue1, int nValue2 )
{
    return nValue1 + nValue2;
}
#ifdef __cplusplus
}
#endif

C calling convention ( __cdecl )

This convention is the default for C/C++ programs. If a project is set to use some other calling convention, we can still declare a function to use __cdecl:

int __cdecl Add( int nValue1, int nValue2 );

The main characteristics of __cdecl calling convention are:

  1. Arguments are passed from right to left, and placed on the stack.
  2. Stack cleanup is performed by the caller.

Function Name Decoration For cdecl 
Function name is decorated by prefixing it with an underscore character ‘_’.

_Add //underscore before function name

Now, take a look at an example of a __cdecl call:

; // push arguments to the stack, from right to left
push 3
push 2
; // call the function
call _Add
; // cleanup the stack by adding the size of the arguments to ESP register
add esp, 8
; // copy the return value from EAX to a local variable (int nResult)
mov DWORD PTR [nResult], eax

The called function is shown below:

; // function prolog
push ebp
mov ebp, esp
push ebx
push esi
push edi
; // return nValue1 + nValue2;
mov eax, DWORD PTR [nValue1]
add eax, DWORD PTR [nValue2]
; // function epilog (registers are restored in the reverse order of the pushes)
pop edi
pop esi
pop ebx
mov esp, ebp
pop ebp
ret

Conclusion

__cdecl is the default calling convention for C and C++ programs.

Advantage

The advantage of this calling convention is that it allows functions with a variable number of arguments (e.g. printf) to be used, because the caller, which knows how many arguments it pushed, performs the stack cleanup.
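
A minimal sketch of such a function is shown below, using the standard <cstdarg> macros. SumAll is a hypothetical example, not a library function. Note that the function only learns how many values to read from its first parameter; nothing tells it how many bytes the caller actually pushed, which is why it cannot safely clean the stack itself.

```cpp
#include <cstdarg>

// Hypothetical variadic function: adds up nCount int arguments.
int SumAll(int nCount, ...)
{
    va_list args;
    va_start(args, nCount);          // start after the last named parameter
    int nTotal = 0;
    for (int i = 0; i < nCount; ++i)
        nTotal += va_arg(args, int); // read the next int argument
    va_end(args);
    return nTotal;
}
```

For example, SumAll(2, 2, 3) returns 5 and SumAll(4, 1, 2, 3, 4) returns 10; each call pushes a different number of arguments onto the stack.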

Disadvantage

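The disadvantage is that it creates larger executables, because the stack cleanup code is repeated after every call instead of appearing once inside the called function.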
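
A rough back-of-the-envelope calculation gives a feel for the size difference. The sketch below uses the encoded sizes of the relevant 32-bit x86 instructions; the function names are hypothetical, and the model assumes the compiler emits one add esp after every call, which an optimizing compiler may avoid by merging the cleanup of several calls.

```cpp
#include <cstddef>

// Encoded instruction sizes on 32-bit x86:
//   add esp, 8  -> 3 bytes (83 C4 08)
//   ret         -> 1 byte  (C3)
//   ret 8       -> 3 bytes (C2 08 00)
const std::size_t kAddEspBytes = 3;
const std::size_t kRetBytes = 1;
const std::size_t kRetImmBytes = 3;

// Caller cleanup: a plain ret in the function plus one add esp per call site.
std::size_t CleanupBytesCallerCleans(std::size_t nCallSites)
{
    return kRetBytes + nCallSites * kAddEspBytes;
}

// Callee cleanup: a single ret 8 in the function, whatever the number of call sites.
std::size_t CleanupBytesCalleeCleans(std::size_t /*nCallSites*/)
{
    return kRetImmBytes;
}
```

Under this simple model, a function called from 100 places costs 301 bytes of return and cleanup code with caller cleanup and 3 bytes with callee cleanup.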

Cont. Calling Convention – Part II
