Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • Is the set of all input fixed? If you have a fixed set of files to
    convert, your job is much easier because the set of language
    construct combinations is fixed. For example, building a general
    Pascal to Java translator is much harder than building a translator
    for a set of 50 existing Pascal files.
  • Forward or external references? I.e., multiple passes needed?
    Pascal has a "forward" reference to handle intra-file procedure
    references, but references to procedures in other files via the
    USES clauses etc... require special handling.
  • Is input order of sentences close to output order? Are there
    multiple files to generate from a single input file or vice versa?
  • Context sensitive lexer? You can't decide what vocabulay symbol
    to match unless you know what kind of sentence you are parsing.

...

  • Are the source statements really similar; declarations vs
    expressions in C++?
  • Column sensitive input? E.g., are newlines significant like lines
    in a log file and does the position of an item change its meaning?

...

  • Well defined language or no manual; hacked for ages like gnucc by
    non-language designers? Is your language VisualBasic-like?
  • How fast does your translator have to be? It is often the case
    that building lots of translator phases simplifies your problem,
    but it can slow down the translation.
  • Does your input have comments as you do in programming languages
    that can occur anywhere in the input and need to go into the output
    in a sane location?
  • How much semantic information do you need to do the translation?
    For example, do you need to simply know that something is a type
    name or do you need to know that it is, say, an array whose indices
    are a set like (day,week,month) and contains records? Sometimes
    syntax alone is enough to do translation.
  • Equivalent syntaxes?  In C there are many different ways to
    dereference pointers.  You can normalize the language to a
    standard representation, but you might loose the original representation.
    The choice usually hinges on whether the output will be human-edited or
    not.  Designing the right tree structure has to incorporate
    decisions like this.
  • Jurgen Pfundt points out: The considered language might be small, but mapping is targeted for the conversion of huge files and this is really a challenge. An input file with a size of several megabytes restricts the usage of tree parsers or any other kind of memory consumpting features. The transformation should be done in one single pass due to performance requirements and an extremly good and comfortable error reporting and error recovery is a must.