Bindings to programming languages
Binding the dyncall library into a scripting environment gives the scripting language system programming capabilities to a certain degree. The dyncall library provides bindings to Erlang[1], Java[2], Lua[3], Python[4], R[5], Ruby[6], Go[7] and the shell/command line.
However, please note that some of these bindings are work in progress and not automatically tested, so additional effort might be required to get them working.
Common Architecture
The binding interfaces of the dyncall library to various scripting languages share a common set of functionality to invoke a function call.
Dynamic loading of code
The helper library dynload, which accompanies the dyncall library, provides an abstract interface to operating-system-specific mechanisms for loading and accessing executable code from, but not limited to, shared libraries.
Functions
All bindings follow a common interface convention that provides the following four functions (exact spelling depends on the binding’s scripting environment):
- load: load a module of compiled code
- free: unload a module of compiled code
- find: find a function pointer by symbolic name
- call: invoke a function call
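The load/find/call/free pattern can be sketched with Python's standard ctypes module. ctypes is used here purely as a stand-in because it follows the same steps; it is not the API of any dyncall binding, and every name in the sketch is ctypes' own. The sketch assumes a POSIX system where the C library's abs symbol is reachable from the running process.

```python
import ctypes

# load: open a module of compiled code.
# Passing None asks the POSIX loader for the symbols already linked into
# this process, which includes the C library (assumption: POSIX system).
module = ctypes.CDLL(None)

# find: look up a function pointer by its symbolic name.
c_abs = getattr(module, "abs")

# Declare the C argument and return types by hand.
c_abs.argtypes = [ctypes.c_int]
c_abs.restype = ctypes.c_int

# call: invoke the function.
result = c_abs(-5)
print(result)

# free: ctypes has no public unload function, so dropping the references
# is the closest equivalent in this stand-in.
del c_abs, module
```

The dyncall bindings expose each of these steps as its own function, which is why an explicit free step exists there.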
Signatures
A signature is a character string that represents a function’s argument and return value types. The scripting
language bindings use it in their call functions to convert the language’s types automatically to the
low-level C/C++ data types. This is an essential part of mapping the more flexible and often abstract
data types provided in scripting languages to the strict machine-level data types used by C libraries. The
high-level C interface functions dcCallF(), dcVCallF(), dcArgF() and dcVArgF() of the dyncall library
also make use of this signature string format.
The format of a dyncall signature string is as depicted below:
dyncall signature string format
<input parameter type signature character>* ’)’ <return type signature character>
The <input parameter type signature character> sequence to the left of the ’)’ lists the types in the
left-to-right order of the corresponding C function parameter list.
The special <return type signature character> ’v’ specifies that the function does not return a value and
corresponds to void functions in C.
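As a minimal illustration of this format, the following Python sketch splits a signature at the ’)’ into its parameter characters and its return character. The function name split_signature is hypothetical and not part of dyncall or any binding; the sketch handles only plain type characters.

```python
def split_signature(signature):
    """Split a signature into (parameter characters, return character)."""
    params, sep, ret = signature.partition(")")
    # Exactly one return type character must follow the ')'.
    if sep != ")" or len(ret) != 1:
        raise ValueError(f"malformed signature: {signature!r}")
    return list(params), ret


print(split_signature("ii)i"))   # (['i', 'i'], 'i')
print(split_signature(")v"))     # ([], 'v')
```

The parameter list keeps the left-to-right order of the C parameter list, and an empty list corresponds to a function without parameters.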
Signature character | C/C++ data type |
’v’ | void |
’B’ | _Bool, bool |
’c’ | char |
’C’ | unsigned char |
’s’ | short |
’S’ | unsigned short |
’i’ | int |
’I’ | unsigned int |
’j’ | long |
’J’ | unsigned long |
’l’ | long long, int64_t |
’L’ | unsigned long long, uint64_t |
’f’ | float |
’d’ | double |
’p’ | void* |
’Z’ | const char* (pointing to C string) |
’A’ | aggregate (struct, union) by-value |
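To show how the table is read, this Python sketch renders a plain signature as a C-style prototype string. The dictionary is transcribed from the table above; the names C_TYPES and describe, and the placeholder function name fn, are hypothetical and only serve the illustration.

```python
# Signature characters and C types, transcribed from the table above.
C_TYPES = {
    "v": "void", "B": "bool", "c": "char", "C": "unsigned char",
    "s": "short", "S": "unsigned short", "i": "int", "I": "unsigned int",
    "j": "long", "J": "unsigned long", "l": "long long",
    "L": "unsigned long long", "f": "float", "d": "double",
    "p": "void*", "Z": "const char*", "A": "aggregate",
}


def describe(signature, name="fn"):
    """Render a plain signature as a C-style prototype string."""
    # Everything before the ')' is the parameter list, the rest is the
    # return type character.
    params, _, ret = signature.rpartition(")")
    args = ", ".join(C_TYPES[ch] for ch in params)
    return f"{C_TYPES[ret]} {name}({args})"


print(describe("ii)i"))      # int fn(int, int)
print(describe("iBcdZ)d"))   # double fn(int, bool, char, double, const char*)
```

The ’A’ entry is rendered only as the word "aggregate", because the signature character alone does not name a particular struct or union.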
Please note that using a ’(’ at the beginning of a signature string is possible, although not required. The character doesn’t have any meaning and will simply be ignored. However, using it prevents annoying syntax highlighting problems with some code editors.
Calling convention modes can be switched using the signature string, as well. A ’_’ in the signature string is followed by a character specifying which calling convention to use, as this affects how arguments are passed. This only makes sense if multiple calling conventions coexist on a single platform. Usually, this is done at the beginning of the string, except in special cases, like specifying where the varargs part of a variadic function begins. The following signature characters exist:
Signature character | Calling Convention |
’:’ | platform’s default calling convention |
’*’ | platform’s default C++/thiscall calling convention |
’e’ | vararg function |
’.’ | vararg function’s variadic/ellipsis part (...), to be specified before first vararg |
’c’ | only on x86: cdecl |
’s’ | only on x86: stdcall |
’F’ | only on x86: fastcall (MS) |
’f’ | only on x86: fastcall (GNU) |
’+’ | only on x86: thiscall (MS) |
’#’ | only on x86: thiscall (GNU) |
’A’ | only on ARM: ARM mode |
’a’ | only on ARM: THUMB mode |
’$’ | syscall |
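The interplay of mode switches and type characters can be made explicit with a small tokenizer. The Python sketch below is an illustration only: the function name tokenize and the token kinds "mode", "type" and "return" are hypothetical, and the code does not check whether a mode character is valid on a given platform. It skips an optional leading ’(’ and treats the character after each ’_’ as a calling convention mode.

```python
def tokenize(signature):
    """Return a list of (kind, character) tokens for a signature string."""
    if signature.startswith("("):
        signature = signature[1:]      # optional '(' carries no meaning
    tokens = []
    kind = "type"
    i = 0
    while i < len(signature):
        ch = signature[i]
        if ch == "_":
            # '_' introduces a calling convention mode character.
            if i + 1 >= len(signature):
                raise ValueError("'_' must be followed by a mode character")
            tokens.append(("mode", signature[i + 1]))
            i += 2
            continue
        if ch == ")":
            kind = "return"            # what follows is the return type
        else:
            tokens.append((kind, ch))
        i += 1
    return tokens


# Variadic function taking (short, long long, ...), called with a
# double and an int as variadic arguments.
for token in tokenize("_esl_.di)v"):
    print(token)
```

For this signature the token list starts with the mode ’e’ (vararg function), and the mode ’.’ appears directly before the first variadic argument.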
C function prototype | dyncall signature |
void f1(); | ”)v” |
int f2(int, int); | ”ii)i” |
long long f3(void*); | ”p)l” |
void f3(int**); | ”p)v” |
double f4(int, bool, char, double, const char*); | ”iBcdZ)d” |
void f5(short, long long, ...); | ”_esl_.di)v” (for (promoted) varargs: double, int) |
struct A f6(int, union B); | ”iA)A” |
short Cls::f(unsigned char, ...); | ”_*p_eC_.i)s” (C++: this-ptr as 1st arg, int as vararg) |
Erlang language bindings
The OTP library application erldc implements the Erlang language bindings.
Signature character | accepted Erlang data types |
’v’ | no return type |
’B’ | atoms ’true’ and ’false’ converted to bool |
’c’, ’C’ | integer cast to (unsigned) char |
’s’, ’S’ | integer cast to (unsigned) short |
’i’, ’I’ | integer cast to (unsigned) int |
’j’, ’J’ | integer cast to (unsigned) long |
’l’, ’L’ | integer cast to (unsigned) long long |
’f’ | decimal cast to float |
’d’ | decimal cast to double |
’p’ | binary (previously returned from call_ptr or callf) cast to void* |
’Z’ | string cast to void* |
Go language bindings
A Go binding is provided through the godc package. Since Go’s type system is basically a superset of C’s, the type mapping from Go to C is straightforward.
Note that a Go string cannot be passed directly to a C function expecting a pointer. However, the binding comes with two helper functions, AllocCString(value string) unsafe.Pointer and FreeCString(value unsafe.Pointer). AllocCString converts a string to an unsafe.Pointer, which can then be passed to ArgPointer(value unsafe.Pointer). Once you are done with this temporary string, free it using FreeCString(value unsafe.Pointer).
Python language bindings
The Python module pydc implements the Python language bindings, namely load, find, free, call, new_callback, free_callback.
Signature character | accepted Python 2 types | accepted Python 3 types |
’v’ | no return type | no return type |
’B’ | bool | bool |
’c’, ’C’ | int, string (with single char) | int, string (with single char) |
’s’, ’S’ | int | int |
’i’, ’I’ | int | int |
’j’, ’J’ | int | int |
’l’, ’L’ | int, long | int |
’f’ | float | float |
’d’ | float | float |
’p’ | bytearray, int, long, None, (PyCObject, PyCapsule) | bytearray, int, None, (PyCObject, PyCapsule) |
’Z’ | string, unicode, bytearray | string, bytes, bytearray |
This is a very brief description that omits many details. For more, refer to the README.txt file of the binding.
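The Python 3 column of the table can be turned into a simple pre-flight check. The sketch below only mirrors the table; it is not pydc's conversion code, and the names ACCEPTED and check_args are hypothetical. PyCObject and PyCapsule are left out to keep the example short.

```python
# Accepted Python 3 types per signature character, transcribed from the
# table above (PyCObject/PyCapsule omitted for brevity).
ACCEPTED = {
    "B": (bool,),
    "c": (int, str), "C": (int, str),
    "s": (int,), "S": (int,),
    "i": (int,), "I": (int,),
    "j": (int,), "J": (int,),
    "l": (int,), "L": (int,),
    "f": (float,), "d": (float,),
    "p": (bytearray, int, type(None)),
    "Z": (str, bytes, bytearray),
}


def check_args(signature, *args):
    """Raise TypeError if an argument does not fit its signature character."""
    params = signature.partition(")")[0]
    if len(params) != len(args):
        raise TypeError(f"expected {len(params)} arguments, got {len(args)}")
    for ch, arg in zip(params, args):
        # Note: bool is a subclass of int in Python, so True/False also
        # pass this check for the integer characters.
        if not isinstance(arg, ACCEPTED[ch]):
            raise TypeError(f"{arg!r} is not accepted for '{ch}'")
        if ch in "cC" and isinstance(arg, str) and len(arg) != 1:
            raise TypeError(f"'{ch}' needs a single-character string")


check_args("iZ)v", 7, "text")        # passes silently
try:
    check_args("d)d", 1)             # int where the table lists float
except TypeError as err:
    print("rejected:", err)
```

A check like this could run before handing arguments to a call, so that a mismatch surfaces as a Python exception with a readable message.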
R language bindings
The R package rdyncall implements the R language bindings, providing the function .dyncall().
Signature character | accepted R data types |
’v’ | no return type |
’B’ | coerced to logical vector, first item |
’c’ | coerced to integer vector, first item truncated to char |
’C’ | coerced to integer vector, first item truncated to unsigned char |
’s’ | coerced to integer vector, first item truncated to short |
’S’ | coerced to integer vector, first item truncated to unsigned short |
’i’ | coerced to integer vector, first item |
’I’ | coerced to integer vector, first item cast to unsigned int |
’j’ | coerced to integer vector, first item |
’J’ | coerced to integer vector, first item cast to unsigned long |
’l’ | coerced to numeric, first item cast to long long |
’L’ | coerced to numeric, first item cast to unsigned long long |
’f’ | coerced to numeric, first item cast to float |
’d’ | coerced to numeric, first item |
’p’ | external pointer or coerced to string vector, first item |
’Z’ | coerced to string vector, first item |
Some notes on the R Binding:
- Unsigned 32-bit integers are represented as signed integers in R.
- R has no 64-bit integer types, so double floats are used to represent 64-bit integers (using only the 52-bit mantissa part).
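Both notes describe numeric effects that are independent of R, so they can be illustrated in Python. The sketch below is not rdyncall code; it only shows what happens to the values. It relies on the fact that a double's 52-bit mantissa field plus its implicit leading bit represents every integer exactly only up to 2**53.

```python
import ctypes

# An unsigned 32-bit value stored in a signed 32-bit integer keeps its bit
# pattern but reads back as a negative number once the top bit is set.
as_signed = ctypes.c_int32(0xFFFFFFFF).value
print(as_signed)                                # -1

# A double represents every integer exactly only up to 2**53; beyond
# that, neighbouring 64-bit integers collapse onto the same double.
exact = float(2**53) == 2**53
collides = float(2**53 + 1) == float(2**53)
print(exact, collides)                          # True True
```

In practice this means that large unsigned 32-bit results appear negative on the R side, and 64-bit values above 2**53 can lose their low-order bits.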
Ruby language bindings
The Ruby gem rbdc implements the Ruby language bindings.
Signature character | accepted Ruby data types |
’v’ | no return type |
’B’ | TrueClass, FalseClass, NilClass, Fixnum cast to bool |
’c’, ’C’ | Fixnum cast to (unsigned) char |
’s’, ’S’ | Fixnum cast to (unsigned) short |
’i’, ’I’ | Fixnum cast to (unsigned) int |
’j’, ’J’ | Fixnum cast to (unsigned) long |
’l’, ’L’ | Fixnum cast to (unsigned) long long |
’f’ | Float cast to float |
’d’ | Float cast to double |
’p’, ’Z’ | String cast to void* |