An open proposal. This is still relevant. 20080904
New C code generation?
Issues
- There are several issues with the current way C code is generated:
  * Ops cannot declare their own persistent variables.
  * Reliance on weave, but most of weave’s features go unused.
  * There could easily be conflicts between support code from different Ops/Results.
  * It is currently impossible to specialize support code based on the Op instance (self).
  * Caching of the generated code for graphs is greatly suboptimal.
Structure
Currently, the general structure of the generated C code is approximately as follows:
    <imports>
    <weave type converters>
    <op/result support code>

    struct my_computation {
        <input/output storage>
        <persistent fields>
        init(<input/output storage>) { <initialize persistent fields> }
        cleanup { <clean up persistent fields> }
        run { <run the computation> }
    };

    <runner for the struct>

    PyObject* instantiate(PyObject* args) {
        <weave stuff>
        <make up a CObject out of the runner and a my_computation instance>
        <weave stuff>
    }

    <python exports for instantiate>
The module produced via that method then has to be used as such:
    obj = module.instantiate(error_storage, input_storage, output_storage, orphan_storage)
    cutils.run_cthunk(obj)
We would like to get rid of weave dependencies, avoid name conflicts with the support code and have a nicer user interface for the produced module. The proposed new structure is as follows:
    <imports>

    struct op1 {
        <persistent variables>
        <support code>
        init() { <initialize persistent fields> }
        cleanup() { <clean up persistent fields> }
        run(<inputs>) { <run the computation for op1> }
    };

    struct op2 { <same> };
    ...
    struct opN { <ditto> };

    struct driver {
        op1 o1; op2 o2; ... opN oN;
        <input storage>
        <output storage>
        init(<storage>) { <initialize ops, storage> }
        cleanup() { <free storage?> }
        run() {
            <extract inputs>
            o1.run(input1, input2);
            o2.run(o1.output1);
            ...
            oN.run(...);
            <sync outputs>
        }
    };

    PyObject* <name>(PyObject* inputs) {
        <init driver, input/output storage>
        <put inputs in input storage>
        driver.run();
        <free input storage>
        <return output storage>
    }

    PyObject* <name>_driver(PyObject* storage) {
        <init driver with storage>
        <return driver>
    }

    <export <name> and <name>_driver>
- Gains:
  * support code can be put inside a struct and become private to the Op
  * we can export several functions that can be used directly, e.g. z = module.add(1, 2)
    * this won’t do filtering like Result.filter, so the usefulness is limited by that
  * the sequence of operations might be clearer to read
  * we can use more descriptive names in each Op struct representing its input names (if we can find them using the inspect module) without worrying about name conflicts
- Losses:
  * maybe gcc can’t optimize it as well?
    * make functions static and inline as much as possible
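As a rough illustration of the first gain, the sketch below renders one struct per Op from a small template, so that each Op's support code ends up inside its own struct. The function render_op_struct, its parameters, and the helper named clip are hypothetical and are not part of the existing code generator; the emitted text only loosely follows the proposed layout.

```python
def render_op_struct(name, persistent, support_code, init, cleanup, run_args, run_body):
    """Render one per-Op struct in the spirit of the proposed layout.

    Hypothetical helper: every argument is a plain string of C/C++ source.
    """
    def indent(text, n=4):
        pad = " " * n
        return "\n".join(pad + line if line else line for line in text.splitlines())

    parts = [
        "struct %s {" % name,
        indent(persistent),
        indent(support_code),
        indent("void init() { %s }" % init),
        indent("void cleanup() { %s }" % cleanup),
        indent("void run(%s) { %s }" % (run_args, run_body)),
        "};",
    ]
    return "\n".join(parts)


# Two Ops that both ship a helper called `clip`. Each copy is a static
# member of its own struct, so the two definitions do not collide.
helper = "static inline double clip(double x) { return x < 0 ? 0 : x; }"
op1 = render_op_struct("op1", "double out;", helper, "out = 0;", "",
                       "double a, double b", "out = clip(a + b);")
op2 = render_op_struct("op2", "double out;", helper, "out = 0;", "",
                       "double a", "out = clip(a * 2);")
generated = op1 + "\n\n" + op2
print(generated)
```

Marking the helper static inline also follows the suggestion under Losses, although whether gcc then optimizes as well as before would still need to be measured.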
Caching
- The cache is currently keyed on a hash of the generated code. That is inefficient because the code has to be generated each time, which might be a costly process. Furthermore, the use of hashing in sets makes it difficult to ensure a consistent ordering of Ops in graphs where several orderings are valid, so the generated C code is potentially different each time. Here is a proposal for a better way to compute the hash:
Result_hash = Result version + Result desc
Op_hash = Op version + Op desc + input/output hashes
FunctionGraph_hash = FunctionGraph version + combination of the Op hashes and their traversal order wrt a consistent traversal method
The version could be set explicitly via a __version__ field or it could simply be equal to the file’s last modification date. We could also have a __nocache__ field indicating that code produced by the Op or Result cannot be cached.
It should also be easier to bypass the cache (e.g. an option to CLinker to regenerate the code).
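A bypass option might look something like the sketch below. The function get_or_compile and its regenerate flag are hypothetical names chosen for illustration and do not describe CLinker's actual interface; the cache is a plain dictionary keyed by the graph hash.

```python
def get_or_compile(key, compile_fn, cache, regenerate=False):
    """Return the cached module for key.

    compile_fn is called on a cache miss, or always when regenerate is True.
    """
    if regenerate or key not in cache:
        cache[key] = compile_fn()
    return cache[key]


calls = []


def fake_compile():
    # Stands in for the expensive code generation and compilation step.
    calls.append(1)
    return "module-%d" % len(calls)


cache = {}
first = get_or_compile("abc", fake_compile, cache)
second = get_or_compile("abc", fake_compile, cache)                   # cache hit
third = get_or_compile("abc", fake_compile, cache, regenerate=True)   # bypass
print(first, second, third, len(calls))
```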