Overview of the Pyrex Language

This document informally describes the extensions to the Python language made by Pyrex. Some day there will be a reference manual covering everything in more detail.

Python functions vs. C functions

There are two kinds of function definition in Pyrex:

Python functions are defined using the def statement, as in Python. They take Python objects as parameters and return Python objects.

C functions are defined using the new cdef statement. They take either Python objects or C values as parameters, and can return either Python objects or C values.

Within a Pyrex module, Python functions and C functions can call each other freely, but only Python functions can be called from outside the module by interpreted Python code. So, any functions that you want to "export" from your Pyrex module must be declared as Python functions.

Parameters of either type of function can be declared to have C data types, using normal C declaration syntax. For example,

def spam(int i, char *s):
    ...

cdef int eggs(unsigned long l, float f):
    ...

When a parameter of a Python function is declared to have a C data type, it is passed in as a Python object and automatically converted to a C value, if possible. Automatic conversion is currently only possible for numeric types and string types; attempting to use any other type for the parameter of a Python function will result in a compile-time error.

C functions, on the other hand, can have parameters of any type, since they're passed in directly using a normal C function call.

If no type is specified for a parameter or return value, it is assumed to be a Python object. (Note that this is different from the C convention, where it would default to int.) For example, the following defines a C function that takes two Python objects as parameters and returns a Python object:

cdef spamobjs(x, y):
    ...

Reference counting for these objects is performed automatically according to the standard Python/C API rules (i.e. borrowed references are taken as parameters and a new reference is returned).

The name object can also be used to explicitly declare something as a Python object. This can be useful if the name being declared would otherwise be taken as the name of a type, for example,

cdef ftang(object int):
    ...

declares a parameter called int which is a Python object.

C variable and type definitions

The cdef statement is also used to declare C variables, either local or module-level:

cdef int i, j, k
cdef float f, g[42], *h

and C struct, union or enum types:

cdef struct Grail:
    int age
    float volume

cdef union Food:
    char *spam
    float *eggs

cdef enum CheeseType:
    cheddar, edam,
    camembert

cdef enum CheeseState:
    hard = 1
    soft = 2
    runny = 3

Note that the words struct, union and enum are used only when defining a type, not when referring to it. For example, to declare a variable pointing to a Grail you would write

cdef Grail *gp

and not

cdef struct Grail *gp # WRONG

External declarations

By default, C functions and variables declared at the module level are local to the module (i.e. they have the C static storage class). They can also be declared extern to specify that they are defined elsewhere, for example:

cdef extern int spam_counter
cdef extern void order_spam(int tons)

At some time in the future it is planned that Pyrex will be able to read C header files. In the meantime, to create interfaces to existing C code you will have to include extern definitions for all the outside C functions and variables that you use.

Scope rules

Pyrex determines whether a variable belongs to a local scope, the module scope, or the built-in scope completely statically. As with Python, assigning to a variable which is not otherwise declared implicitly declares it to be a Python variable residing in the scope where it is assigned. Unlike Python, however, a name which is referred to but not declared or assigned is assumed to reside in the built-in scope, not the module scope. Names added to the module dictionary at run time will not shadow such names.
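The difference from Python is easiest to see by running the equivalent code under the ordinary interpreter. The following is a sketch in plain Python, not Pyrex, and the function name measure is made up for illustration. Plain Python looks up len at call time, so a name added to the module dictionary at run time does shadow the built-in. By the rule above, a Pyrex-compiled version of measure would be expected to keep calling the built-in len in all three calls.

```python
def measure(items):
    # `len` is neither assigned nor declared in this module, so plain Python
    # resolves it at call time: module globals first, then built-ins.
    return len(items)

before = measure([1, 2, 3])            # resolves to the built-in len

# Add a name called `len` to the module dictionary at run time.
globals()["len"] = lambda items: -1
after = measure([1, 2, 3])             # plain Python now finds the new name

# Remove it again so the built-in becomes visible once more.
del globals()["len"]
restored = measure([1, 2, 3])
```

Under plain Python, before and restored come from the built-in, while after comes from the run-time replacement.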

Statements and expressions

Control structures and expressions follow Python syntax for the most part. When applied to Python objects, they have the same semantics as in Python (unless otherwise noted). Most of the Python operators can also be applied to C values, with the obvious semantics.

If Python objects and C values are mixed in an expression, conversions are performed automatically between Python objects and C numeric or string types.

Reference counts are maintained automatically for all Python objects, and all Python operations are automatically checked for errors, with appropriate action taken.

C operations which do not have direct Python equivalents are handled as follows:

  • There is no -> operator in Pyrex. Instead of p->x, use p.x

  • There is no unary * (dereference) operator in Pyrex. Instead of *p, use p[0]

  • There will be an & operator [not implemented yet]

  • Type casts are written <type>value, for example:

    cdef char *p
    cdef float *q
    p = <char*>q
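The p[0] spelling for dereferencing is the same idiom that Python's standard ctypes module uses for its pointer objects, so it can be tried out without a compiler. The sketch below is plain Python using ctypes, offered only as an analogy and not as Pyrex code. The Grail class here is a hypothetical stand-in for the struct declared earlier.

```python
import ctypes

# Hypothetical ctypes counterpart of the Grail struct (illustration only).
class Grail(ctypes.Structure):
    _fields_ = [("age", ctypes.c_int), ("volume", ctypes.c_float)]

counter = ctypes.c_int(42)
ip = ctypes.pointer(counter)

first = ip[0]            # read through the pointer, like *ip in C
ip[0] = 7                # write through the pointer, like *ip = 7 in C
updated = counter.value  # the pointed-to object sees the new value

grail = Grail(3, 1.5)
gp = ctypes.pointer(grail)
age = gp[0].age          # like gp->age in C; Pyrex would spell this gp.age
```

Unlike Pyrex, ctypes does not allow gp.age directly on a pointer; the member has to be reached through gp[0] or gp.contents.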

Limitations

Pyrex is not quite a full superset of Python. The following restrictions apply:

  • Function definitions (whether using def or cdef) cannot be nested within other function definitions.

  • import * is not supported under any circumstances.

  • Generators are not supported.

There are also some temporary limitations which will be lifted soon:

  • Only def and cdef statements can appear at the top level of a module

  • In-place operators (+=, etc) are not yet supported

  • Default argument values, keyword arguments, * and ** arguments, and tuple unpacking in argument lists are not yet supported

  • List comprehensions are not yet supported

  • Definition of Python classes is not yet supported

There are probably also some other gaps which I can't think of at the moment.
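As a rough illustration of working within these limits, the sketch below is written in plain Python that avoids every feature listed above. It has only def statements at the top level, with no nested functions, generators, in-place operators, default or keyword arguments, list comprehensions or classes. The function names are hypothetical, and it is an assumption that the remaining constructs (for and while loops, method calls) are accepted by the compiler.

```python
def square(x):
    # A module-level helper instead of a nested function.
    return x * x

def squares_up_to(n):
    # An explicit loop that builds a list, instead of a list
    # comprehension or a generator.
    result = []
    i = 0
    while i < n:
        result.append(square(i))
        i = i + 1            # spelled out instead of i += 1
    return result

def total(values):
    # Plain positional parameter; no default or keyword arguments.
    acc = 0
    for v in values:
        acc = acc + v        # spelled out instead of acc += v
    return acc
```

Called from interpreted Python, squares_up_to(4) builds the list of the first four squares and total adds up whatever sequence it is given.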