Ravi Mohan's Tech Blog. To read my non technical blog, click here

Monday, May 18, 2009

Some guidelines on porting

This is in response to some difficulties encountered by a friend in porting a library from Common Lisp to Scheme. I am very short of time so this is less polished than usual. Apologies in advance.

I generally port something from one language to another when
(1) I want to learn a new language. I learned Ruby by porting the FIT library from Java. It ended up in my never using FIT and not seeing enough of a reason to shift my choice of scripting language from Python, but it was still an interesting exerscise.
(2) My clients need to work with or on the *code* I deliver. If a scientist has to look at or modify code, for example then the final deliverable has to be in a mainstream language like Python or Java vs a more powerful language like Erlang or Haskell or even Scheme.(though I can sometimes get away with Scheme). So I prototype in whatever language seems appropriate to the domain and then port it to Java or Python or C.

After doing this a few times, I've found a few "shortcuts".

(1) There are three variables - the source language, the target language, the domain. Make sure you know at *least* two of them at the expert level. If you don't, fix that first. Struggling with two things at the same time(say the domain and the target language) is a recipe for suboptimal outcomes.

(2) Do NOT port line by line from source to target. Even when the source and target languages are roughly similair, say Python to Ruby, or Scheme to Common Lisp, doing this often misses the idioms of the target language and ends up looking stilted. Eg. "meta programming" in Ruby (vs Python) and iteration in Scheme (vs Common Lisp).

(3) DO use an intermediate form. This is, in my experience, *the* key to a succesful port. The intermediate form could be English, ("here the foo is converted to baz via a fold left accumulator combo") Logic, pseudo code, whatever is appropriate. Translate from the source code to the intermediate language (at multiple levels - function, module, design and architecture) and then from the intermediate language to the target language. In the latter step write *idiomatic* target language. Beware the idioms of the source language leaking into the target language.

(4) When the source and target languages have different paradigms (say Prolog (logic)to Java ( a kind of bastardised OO) ) make sure you are fluent in both *paradigms* before you undertake the translation. Otherwise, you often end up with ugly code. A variant of this occurs when you are moving up or down the "blub ladder", even within the same paradigm say smalltalk to Java or C to Pascal.

Moving down is tedious but often conceptually straightforward. Moving up is harder in a different way because the lesser language often hides abstractions that are directly expressible in the higher (a complicated nest of for next lops. ifs and assignments may (in the end) decompose cleanly into sequence ops in haskell but it is sometimes difficult to see and/or extract those).

No comments: