PWF Template system

Let's sort template systems (TS) into the following classes:

For thin clients it is good to start without a TS and add one only when it is obvious that it will reduce coding effort. Then it is good to use native language constructs until it is obvious that they need simplifying. At that point you will see which features you need and which you do not, and you can pick the best-fitting TS accordingly. After that, it takes the least effort to pick some transpiler until the performance needs to be improved. But frankly, I haven't met a website where a server-based TS needed a performance boost.

Funny thing: most TS brag about being "lightweight", "fast" and "secure", and none of these features is relevant to a TS. Those calling transpilers lightweight have obviously never met C++. A web load latency of 20 ms is the threshold at which sensitive users can notice anything, and all TS are at least an order of magnitude faster than that. Micro-optimization on this level is not needed and brings unwanted complexity for no benefit. Security can't be solved in the TS realm, which I describe in detail on the Latte example here.

The path from a native TS to a transpiler TS, based on the PHP language (the most popular server-side web language), follows.

Native PHP based TS

Plain PHP can serve as a TS when the templates follow these rules:

  1. short_open_tag is set to On, to make the PHP opening mark short enough to be usable in templates
  2. only for, foreach, while, if and their end- closing constructs are used in templates
  3. no echo, just <?= instead
  4. opening and closing PHP marks are on the same line
  5. every line is plain PHP or plain HTML, but not both, except for <?=

The flood of angle brackets is hard to read. Based on the use cases below, let's try to find the optimal way to mix HTML syntax and TS syntax. Either the TS syntax should be easy to spot and separate (many TS opt for curly braces), or it should integrate into HTML the way XSL does.

1. Autoremove empty elements

    <?if(!empty($items)):?>
    <ul>
    <?foreach($items as $item):?>
      <li> <?=$item?>
    <?endforeach?>
    </ul>
    <?endif?>

It is usually undesirable to have empty HTML containers on the web, but there are exceptions. The autoremoval of such containers should be the default, with an option to keep empty containers. Plain PHP syntax needs an if-foreach pair; a well-designed TS should offer the functionality in a single expression.

To achieve that on PHP level, we can postprocess the generated output:

  1. Place a keep-empty indicator inside the element (like §keepempty§ in the snippet below)
  2. Find and remove empty elements using regular expressions (those containing the keep-empty indicator are not empty)
  3. Remove the keep-empty indicator

    <ul>§keepempty§
    <?foreach($items as $item):?>
      <li> <?=$item?>
    <?endforeach?>
    </ul>

    // postprocessing
    $removeEmptyTags = fn($html) => preg_replace('~<([^\s!>]+)[^>]*>\s*</\1>~s','',$html);
    $data = $removeEmptyTags($data);
    $data = str_replace("§keepempty§","",$data);
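A single pass of the regular expression leaves a wrapper behind when it only becomes empty after its empty child is removed. The following is a minimal sketch of one way to handle that, assuming it is acceptable to repeat the replacement until nothing changes. The function name removeEmptyElements and the KEEP_EMPTY constant are hypothetical; the pattern is the one from the snippet above.

```php
<?php
// Hypothetical names; only the regular expression comes from the text above.
const KEEP_EMPTY = '§keepempty§';

function removeEmptyElements(string $html): string
{
    $pattern = '~<([^\s!>]+)[^>]*>\s*</\1>~s';
    // Repeat until no element is removed, so that a wrapper which becomes
    // empty after its empty child disappears is removed as well.
    do {
        $html = preg_replace($pattern, '', $html, -1, $count);
    } while ($count > 0);
    // The marker has protected its element; now it can be dropped.
    return str_replace(KEEP_EMPTY, '', $html);
}

$html = '<div><ul> </ul></div><ul>' . KEEP_EMPTY . '</ul><p>text</p>';
echo removeEmptyElements($html); // <ul></ul><p>text</p>
```

The nested empty div and ul disappear, while the ul carrying the marker survives as an intentionally empty container.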

2. Add/Autoremove attributes

    1) using if
    <input name="phone" <?if($cond):?>required<?endif?>>
    <input name="phone" <?if($placeholder):?>placeholder="<?=$placeholder?>"<?endif?>>

    2) using operators
    <input name="phone" <?=$required??""?>>
    <input name="phone" <?=$placeholder?"placeholder=\"$placeholder\"":""?>>

HTML5 boolean attributes are set if present, even in confusing forms like empty values (required="") or string false values (required="false"), which act the same way as value-free attributes (required). Therefore attributes with empty values should always be autoremoved, and only the value-free variant should be used for boolean attributes. This automation is not possible in an XML-compatible format, where value-free attributes are not valid, but we will focus on HTML5. Neither variant in the PHP snippet above gives an easy-to-read syntax for adding a boolean attribute. Both also leave a redundant space in the tag context, which is only an aesthetic detail that does not affect how the code is interpreted.

The syntax above can be made readable using postprocessing, as with the removal of empty tags.

    <input name="phone" <?=implode(" ",$boolAtts)?>>
    <input name="phone" placeholder="<?=$placeholder?>">

    // postprocessing
    $removeEmptyAtts = fn($html) => preg_replace('/(\s*\w+="\s*")/','',$html);
    $data = $removeEmptyAtts($data);

The replacement should be made only in tag context. Either we can ignore this, assuming that attr="" does not occur in text context, or we can use the HTML entity &#61; for the equals sign in text context.
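A minimal sketch of restricting the replacement to tag context, assuming a two-step approach that is not part of the original one-liner: an outer pattern finds opening tags and an inner pattern strips empty attributes only inside them. The function name removeEmptyAttributes is hypothetical, and the [\w:-] character class (which also covers hyphenated names such as data-x) is an assumption. Empty-looking attributes inside another attribute's quoted value are not protected by this sketch.

```php
<?php
// Hypothetical helper: removes attr="" only inside opening tags.
function removeEmptyAttributes(string $html): string
{
    // Outer pattern: one opening tag; quoted values may contain ">".
    $tag = '~<[a-zA-Z][^\s/>]*(?:"[^"]*"|\'[^\']*\'|[^>"\'])*>~';

    return preg_replace_callback($tag, function (array $m): string {
        // Inner pattern: an attribute whose double-quoted value is empty
        // or contains whitespace only.
        return preg_replace('~\s+[\w:-]+="\s*"~', '', $m[0]);
    }, $html);
}

echo removeEmptyAttributes('<input name="phone" placeholder=""> <p>say x="" here</p>');
// <input name="phone"> <p>say x="" here</p>
```

The x="" in the paragraph text stays untouched, because the inner replacement never sees anything outside an opening tag.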

3. Autoescape output

<?=htmlspecialchars($output)?>

The output should be autoescaped by default. The TS should provide a way not to escape, though. Considering HTML5 output, to handle untrusted data securely:

  1. In text context, < needs to be replaced with &lt;
  2. In tag attribute context and script string context, the value delimiter (" or ') needs to be replaced with an HTML entity; note that attributes which can invoke scripts (on-, href, action and src) need to be avoided
  3. In script context and tag context, no untrusted data can be safely placed, as the exploit possibilities are so vast here that it is virtually unfeasible to secure it by mere syntax analysis.

In cases 1 and 2, htmlspecialchars should be enough to protect from XSS. Keep in mind that the accesskey attribute can trigger the onclick event on most elements (a good key for an XSS attack is V, as it is oftentimes pressed by mistake when trying to paste with Ctrl+V or to insert @ with Ctrl+Alt+V).

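To avoid calling htmlspecialchars multiple times, untrusted data can be preprocessed as a whole array. Note that array_walk is not suitable here: it passes the key as the second argument and does not store the return value, so array_map is used instead:

    // preprocessing
    $untrusted = array_map("htmlspecialchars", $untrusted);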
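When the untrusted data is nested, a flat one-liner no longer reaches every value. The sketch below assumes that only string leaves need escaping; the function name escapeAll is hypothetical. The flags are passed explicitly because the default of htmlspecialchars escapes single quotes only since PHP 8.1.

```php
<?php
// Hypothetical helper: escapes every string leaf of a nested array.
function escapeAll(array $untrusted): array
{
    array_walk_recursive($untrusted, function (&$value): void {
        if (is_string($value)) {
            // ENT_QUOTES covers both " and ' as attribute value delimiters.
            $value = htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
        }
    });
    return $untrusted;
}

$untrusted = escapeAll([
    'name' => '<b>"Joe"</b>',
    'tags' => ["it's", 'a&b'],
]);
echo $untrusted['name']; // &lt;b&gt;&quot;Joe&quot;&lt;/b&gt;
```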

4. Build class attribute

<input name="phone" class="<?=implode(" ",$classData)?>">

The PHP syntax is readable here. The removal of an empty class attribute was solved above.

5. Conditional output

    <?if($cond):?>
      <div>oh yea!</div>
    <?else:?>
      <div>hell no!</div>
    <?endif?>

The PHP syntax is decent here, and a TS could gracefully use it without adding its own. XSL is the ugly example here: two levels of nesting for a simple if-else construct is just too much (four levels for a condition inside a condition).

    <xsl:choose>
      <xsl:when test="$cond">
        <div>oh yea!</div>
      </xsl:when>
      <xsl:otherwise>
        <div>hell no!</div>
      </xsl:otherwise>
    </xsl:choose>

6. Iteration

    <?foreach($items as $item):?>
      <span class="flash"><?=$item?></span>
    <?endforeach?>

Again: the PHP syntax works well, and there is no shame in a TS delegating iteration to PHP. It could come in handy to add some more information to the iteration: how many items there are, whether the item is the first or the last, and what the position of the item in the list is.
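One way to provide that extra information without leaving plain PHP is a small generator wrapped around the list. This is only a sketch; the function name withLoopInfo and the keys item, index, first, last and count are hypothetical choices.

```php
<?php
// Hypothetical helper: yields each item together with its loop metadata.
function withLoopInfo(array $items): Generator
{
    $count = count($items);
    $index = 0;
    foreach ($items as $item) {
        yield [
            'item'  => $item,
            'index' => $index,                 // zero-based position
            'first' => $index === 0,
            'last'  => $index === $count - 1,
            'count' => $count,
        ];
        $index++;
    }
}

foreach (withLoopInfo(['a', 'b', 'c']) as $loop) {
    echo $loop['item'], $loop['last'] ? '' : ', ';
}
// a, b, c
```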

7. Reusable parametrized blocks

    <?ob_start()?>
    <h2><?=$title?></h2>
    <?$h2=ob_get_clean()?>

PHP with its output control handles this task in surprisingly readable syntax. TS can make the parameters local.
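The captured $h2 above is rendered once, with whatever $title held at that moment. A sketch of how the parameters could be made local, assuming the block is wrapped in a closure and rendered on demand; the helper name block is hypothetical, and the parameters are assumed to be escaped already.

```php
<?php
// Hypothetical helper: turns a template closure into a reusable block
// that returns its output as a string.
function block(callable $template): callable
{
    return function (array $params = []) use ($template): string {
        ob_start();
        $template($params);
        return ob_get_clean();
    };
}

// $p is local to the block; nothing leaks from the surrounding template.
$h2 = block(function (array $p): void {
    ?><h2><?=$p['title']?></h2><?php
});

echo $h2(['title' => 'News']);    // <h2>News</h2>
echo $h2(['title' => 'Contact']); // <h2>Contact</h2>
```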

8. Includes

<?include "somefile"?>

In one way or another, the TS will use the PHP functionality here. Here, too, the TS can pass some local variables to the include.
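In plain PHP, local variables for an include can be approximated by doing the include inside a function, since the included file inherits the scope of the calling line. A minimal sketch, with the hypothetical name renderFile; the double-underscore parameter names and EXTR_SKIP are there so that template variables cannot overwrite the function's own arguments.

```php
<?php
// Hypothetical helper: renders a template file with its own local variables.
function renderFile(string $__file, array $__vars = []): string
{
    extract($__vars, EXTR_SKIP); // EXTR_SKIP: never overwrite $__file / $__vars
    ob_start();
    include $__file;
    return ob_get_clean();
}

// Demonstration with a temporary template file.
$file = tempnam(sys_get_temp_dir(), 'tpl');
file_put_contents($file, '<h2><?=$title?></h2>');
echo renderFile($file, ['title' => 'News']); // <h2>News</h2>
unlink($file);
```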

Conclusion

For the observed use-cases, native PHP TS with readable syntax can be achieved with preprocessing and postprocessing one-liners:

    // preprocessing
    $untrusted = array_map("htmlspecialchars", $untrusted);

    // postprocessing
    $template = preg_replace(['~<([^\s!>]+)[^>]*>\s*</\1>~s', '/(\s*\w+="\s*")/', '~§keepempty§~'], '', $template);

The separation of $untrusted data from trusted $data increases security: the client coder knows which kind of data he is working with at any point. With a single call of preg_replace the code is much faster than any TS can dream of, although speed is of little concern here. The drawback is the need to prepare the data in suitable array structures. The framework of such a template could fit in a single page with the following structure of 8 steps:

    // input: $input array
    // create suitable $untrusted and $data arrays from $input
    // preprocessing
    ob_start();
    // template code here
    $template = ob_get_clean();
    // postprocessing
    echo $template;
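The steps above can be sketched as one runnable page. The function name renderPage, the input keys title, placeholder and items, and the split between trusted and untrusted values are assumptions made for illustration; the sketch uses the long <?php opening mark so that it does not depend on short_open_tag.

```php
<?php
// Hypothetical single-page framework following the 8 steps above.
function renderPage(array $input): string
{
    // Steps 1-2: split the input into untrusted and trusted data.
    $untrusted = [
        'title'       => (string) ($input['title'] ?? ''),
        'placeholder' => (string) ($input['placeholder'] ?? ''),
    ];
    $data = ['items' => $input['items'] ?? []];

    // Step 3: preprocessing, escape all untrusted values at once.
    $untrusted = array_map('htmlspecialchars', $untrusted);

    // Steps 4-6: run the template and capture its output.
    ob_start();
    ?>
<h1><?=$untrusted['title']?></h1>
<input name="phone" placeholder="<?=$untrusted['placeholder']?>">
<ul>
<?php foreach ($data['items'] as $item): ?>
  <li><?=$item?></li>
<?php endforeach ?>
</ul>
<?php
    $template = ob_get_clean();

    // Step 7: postprocessing, drop empty elements, empty attributes
    // and the keep-empty marker.
    return preg_replace(
        ['~<([^\s!>]+)[^>]*>\s*</\1>~s', '/(\s*\w+="\s*")/', '~§keepempty§~'],
        '',
        $template
    );
}

// Step 8: output.
echo renderPage(['title' => '<Phone> list', 'placeholder' => '', 'items' => []]);
```

With an empty items list and an empty placeholder, both the ul element and the placeholder attribute are removed from the output, and the title is escaped.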

XSL based TS

XSL is not a well-designed language for an HTML TS: its output can be any text-based format, usually an XML-based one, and an HTML TS does not need that generality. XSL also generally suffers from too many levels of tag nesting. Let's design the best interface for the tasks described above:

1. Autoremove empty elements

Autoremoval would be the default behavior, and the ts-keep-empty boolean attribute would revert it.

<ul ts-keep-empty> ...some autogenerated contents... </ul>

2. Add/Autoremove attributes

Autoremoval would also be the default behavior here, and ts-attr-if would conditionally add a boolean attribute. Its value is colon-separated data attr:cond, where attr is the attribute name and cond is a boolean variable; attr is placed iff cond is true.

    <input placeholder="$placeholder">   <- is removed if $placeholder is empty
    <input type="text" ts-attr-if="required:$form->name->required">

3. Autoescape output

A dollar-prefixed name should be replaced with the corresponding escaped PHP variable, and either left as is or silently removed if no such variable is defined. (Syntax highlighting can be adjusted for that.) To guarantee a literal dollar sign, an HTML entity needs to be used. For access to object properties, the -> operator is unfortunate because of the > sign, which is oftentimes escaped. There also needs to be a comfortable way to produce unescaped output.

To solve these two issues there are many syntactical ways, like . operator for member access (text joining is not needed) and double dollar sign for unescaped output.

4. Build class attribute

The same attr:cond logic can be used in the ts-class attribute, with a comma as the separator:

<div ts-class="marked:$marked, bold:$number>1000">...</div>
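On the PHP side, such a ts-class attribute could compile down to a map of class names to conditions. This is only a sketch of that idea, not a definition of how the TS works; the helper name buildClass is hypothetical.

```php
<?php
// Hypothetical helper: keeps the class names whose condition is truthy.
function buildClass(array $conditions): string
{
    return implode(' ', array_keys(array_filter($conditions)));
}

$marked = true;
$number = 250;
echo buildClass(['marked' => $marked, 'bold' => $number > 1000]); // marked
```

When no condition holds, the result is an empty string, so the empty class attribute falls under the autoremoval of empty attributes.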

5. Conditional output

To implement features that are not attributes, a <ts> tag is created. It disappears after parsing, but can apply its arguments to its contents. Conditional output can be done with the if attribute, or with ts-if on a regular element:

    <ts if="$cond">
      ...conditional contents...
    </ts>

    <ul ts-if="$cond" ts-loop="$data:$i">
      <li>$i
    </ul>

6. Iteration

Iterations are useful in two ways: to repeat some contents, and to repeat the contents of some tag:

    <ts loop="$data:$i">
      <span>$i</span>
    </ts>

    <ul ts-loop="$data:$i">
      <li>$i
    </ul>

7. Reusable parametrized blocks

The apply attribute is borrowed from XSL. It has a name:data signature: name refers to the contents of the element tagged with ts-block, and data defines the local variables for the block. In case of the apply attribute, the <ts> element is not paired.

    <span ts-block="myBlock">...</span>
    <ts block="myBlock">...</ts>
    <ts apply="myBlock:$data">Default value if myBlock is not set</ts>

Table example

This should serve as a quick test of TS features. It should handle a nested loop, produce no error if $data is not provided, and place the $cell contents in table cells.

    <table ts-if="$data->type=table" ts-loop="$data->items:$row">
      <tr ts-loop="$row:$cell">
        <td>$cell
      </tr>
    </table>

Curly braces based TS

This type of TS maps the contents of its curly braces to PHP code, which means the template contents have to be parsed. Such an approach should be easier to parse and to syntax-highlight with custom rules, at the cost of slightly more code.
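As a rough illustration of such a mapping, the sketch below transpiles only two constructs: {=$expr} into an escaped PHP output tag and {keepempty} into the postprocessing marker. The function name transpile, the restriction to expressions starting with a dollar sign, and the choice of generated PHP are all assumptions; filters such as {=if:...} are left untouched.

```php
<?php
// Hypothetical transpiler for two curly-brace constructs.
function transpile(string $template): string
{
    // {keepempty} becomes the marker understood by the postprocessing step.
    $template = str_replace('{keepempty}', '§keepempty§', $template);

    // {=$expr} becomes an autoescaped PHP output tag.
    return preg_replace(
        '~\{=(\$[^{}]+)\}~',
        '<?=htmlspecialchars((string)($1))?>',
        $template
    );
}

echo transpile('<input placeholder="{=$placeholder}">');
// <input placeholder="<?=htmlspecialchars((string)($placeholder))?>">
```

The result is ordinary native PHP template code, so it can be cached to a file and included like any other template.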

1. Autoremove empty elements

<ul>{keepempty} ...some autogenerated contents... </ul>

2. Add/Autoremove attributes

    <input placeholder="{=$placeholder}">   <- is removed if $placeholder is empty
    <input type="text" {=if:$form->name->required,"required"}>   <- filter output

4. Build class attribute

    <div class="{=implode(" ",$classData)}">...</div>

Implementation

PWF should be compatible with any template system. Most of them are designed in and for specific ecosystems, as original ideas from scratch: good practices taken from other TS are rarely mentioned or admitted.

To design a good TS, some popular TS are examined below. As a result, a pure XSL based Ins TS and a pure curly braces based Zen TS are produced. The effort is to combine the pros and eliminate the cons of other TS, emphasizing the desired features mentioned above and dropping features that are not essential to a TS, to make it as lean and extensible as possible.

Latte

Cons: Pros:

Twig

Pros: