I really like co-routines and find them very useful for programming game AI and scripting. Since Visual Studio dropped support for inline assembly when compiling x64 code, I thought I would be out of luck trying to implement co-routines for this target. Fortunately, Visual Studio can still compile assembly files using MASM in x64 mode. In order to implement co-routines I had to familiarize myself with the x64 calling convention, which has changed quite a bit from x86. The first four arguments are passed in registers (RCX, RDX, R8 and R9) rather than on the stack, there are many more callee-saved registers, and, more obviously, all registers and pointers have been extended to 64 bits.
Below is a small demo showing how to implement co-routines in Visual Studio for x64 targets.
#include <stdio.h>

typedef unsigned long long u64;
typedef void * user_t;
typedef void *stack_t;
typedef void (*cofunc_t)( stack_t *token, user_t arg );

/* coroutine functions */
extern "C" void yield_( stack_t *token );
extern "C" void enter_( stack_t *token, user_t arg );

/* artificial stack */
const int nSize = 1024 * 1024;
static char stack[ nSize ] = { 0 };

/* prepare a coroutine stack */
void prepare( stack_t *token, void *stack, u64 size, cofunc_t func )
{
    u64 *s64 = (u64*)( (char*)stack + size);
    s64 -= 10;                      // 10 items exist on stack
    s64[0] = 0;                     // R15
    s64[1] = 0;                     // R14
    s64[2] = 0;                     // R13
    s64[3] = 0;                     // R12
    s64[4] = 0;                     // RSI
    s64[5] = 0;                     // RDI
    s64[6] = (u64) s64 + 64;        // RBP
    s64[7] = 0;                     // RBX
    s64[8] = (u64) func;            // return address
    s64[9] = (u64) yield_;          // coroutine return address
    *token = (stack_t) s64;         // save the stack for yield
}

/* coroutine function */
void threadFunc( stack_t *token, user_t arg )
{
    for ( int i=0; i<10; i++ )
    {
        printf( " coroutine %d\n", i );
        yield_( token );
    }
}

/* program entry point */
int main( )
{
    stack_t token = nullptr;

    /* prepare the stack */
    prepare( &token, stack, nSize, threadFunc );

    /* enter the coroutine */
    enter_( &token, (void*)0x12345678 );

    /* simple test loop */
    for ( int i=0; i<10; i++ )
    {
        printf( "main thread %d\n", i );
        yield_( &token );
    }

    /* program done */
    printf( "program exit\n" );
    getchar( );
}
Coroutine.asm
.code

;---- ---- ---- ---- ---- ---- ---- ----
; coroutine yield function
;
; : void yield_( void * token );
;
; 'token' -> RCX
;
yield_ proc
    push RBX
    push RBP
    push RDI
    push RSI
    push R12
    push R13
    push R14
    push R15
    mov RAX , RSP
    mov RSP , [RCX]
    mov [RCX], RAX
    pop R15
    pop R14
    pop R13
    pop R12
    pop RSI
    pop RDI
    pop RBP
    pop RBX
    ret
yield_ endp

;---- ---- ---- ---- ---- ---- ---- ----
; enter a co-routine
;
; : void enter_( void * token, void * arg1, ... );
;
; 'token' -> RCX
; 'arg1, ...' -> RDX, R8, and R9
;
enter_ proc
    jmp yield_
enter_ endp

end
I found a bug in the code. prepare() should sub 10 more u64s from the stack. The stack is 9 or 10 items larger than your code says it is. enter_ pushes 9 items on the stack, so you need to adjust for that.