hypertag

HTML templates with Python-like concise syntax, code reuse & modularity. The Pythonic way to web templating.

Introduction

Hypertag is a modern language for front-end development that allows writing markup documents in a way similar to writing Python scripts, where indentation determines relationships between nested elements and removes the need for explicit closing tags. Hypertag provides:

If you are new to Hypertag, see the Quick Start for a brief introduction. The source code is available on GitHub and PyPI.

Authored by Marcin Wojnarski from Paperity.

Setup

Install in Python 3:

pip install hypertag-lang               # note the package name: "hypertag-lang"

Run:

from hypertag import HyperHTML
html = HyperHTML().render(script)       # rendering of a Hypertag `script` to HTML

Acknowledgements

Hypertag was partially modeled on Python’s syntax, and was inspired by indentation-based templating languages: Slim, Plim, Shpaml, Haml.

Dedication

I dedicate this project to my sons, Franciszek Józef and Brunon Piotr.    – Marcin Wojnarski



Language Reference

Blocks

A Hypertag script consists of a list of blocks. Some of them may have tags, and/or nested blocks inside:

ul
    li 
        | This is the first item of a "ul" list.
        | Pipe (|) marks a plain-text block. HTML is auto-escaped: & < >
    li
        / This is the second item. 
          Slash (/) marks a <b>markup block</b> (no HTML escaping).
          Text blocks may consist of multiple lines, like here.

output:

<ul>
    <li>
        This is the first item of a "ul" list.
        Pipe (|) marks a plain-text block. HTML is auto-escaped: &amp; &lt; &gt;
    </li>
    <li>
        This is the second item.
        Slash (/) marks a <b>markup block</b> (no HTML escaping).
        Text blocks may consist of multiple lines, like here.
    </li>
</ul>

During parsing, blocks are first translated into Hypertag’s native Document Object Model (DOM), and then the DOM undergoes rendering to generate a document string in a target language. Typically, the target language is HTML, although any other language can be supported if an appropriate runtime is implemented to provide language-specific built-ins and configuration. For HTML5 generation, the HyperHTML standard runtime can be used, like in this Python 3 code:

from hypertag import HyperHTML

script = \
"""
    ul
        li 
            | This is the first item of a "ul" list.
        li
            | This is the second item. 
"""

html = HyperHTML().render(script)
print(html)

The script above will be rendered to:

<ul>
    <li>
        This is the first item of a "ul" list.
    </li>
    <li>
        This is the second item.
    </li>
</ul>

See the Runtime section for details about runtimes and script execution.

Text blocks

The most elementary type of block is a text block. It comes in three variants: plain-text (|), markup (/), and verbatim (!).

They differ in how embedded expressions and raw HTML are handled:

| Plain-text block may contain {'em'+'bedded'} expressions & its output is HTML-escaped.
/ Markup block may contain expressions; output is not escaped, so <b>raw tags</b> can be used.
! In a verbatim $block$ {expressions} are left unparsed, no <escaping> is done.

output:

Plain-text block may contain embedded expressions &amp; its output is HTML-escaped.
Markup block may contain expressions; output is not escaped, so <b>raw tags</b> can be used.
In a verbatim $block$ {expressions} are left unparsed, no <escaping> is done.

Plain-text and markup blocks may contain embedded expressions, like $x or {a+b}, which are evaluated and replaced with their corresponding values during translation. Additionally, the output of a plain-text block is converted to the target language (escaped) before insertion to the DOM. A runtime-specific escape function is used for this purpose. For example, hypertag.HyperHTML runtime performs HTML-escaping: it replaces <, >, & characters with HTML entities (&lt; &gt; &amp;). For a different target language, the escape function could perform any other operation that is necessary to convert plain text to a valid string in this language.
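As a rough illustration of what such an escape function does for HTML, the sketch below uses Python's standard html.escape. That HyperHTML behaves the same way for these three characters is an assumption; the runtime's actual escape function may be implemented differently. The name escape_plain_text is hypothetical.

```python
import html


def escape_plain_text(text: str) -> str:
    # Hypothetical stand-in for a runtime's escape function (an assumption,
    # not Hypertag's code): replace &, <, > with HTML entities.
    # quote=False leaves single and double quotes untouched.
    return html.escape(text, quote=False)


escaped = escape_plain_text("HTML is auto-escaped: & < >")
print(escaped)
```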

Comments

Blocks starting with double dash (--) or hash (#) are treated as comments: their content is left unparsed (like in a “verbatim” block) and is excluded from the output. They must follow general rules of block alignment: have the same indentation as sibling blocks and deeper indentation than a parent block:

div
  p | First paragraph
  #   Comment...
  p | Second paragraph

A block comment behaves similarly to text blocks and, like them, can span multiple lines, as long as proper indentation of subsequent lines is kept:

-- this is a long ... 
    multiline ...
  block comment

A comment can also be put at the end of a block headline (inline comment), but not together with inline contents of this block. Comments cannot be used inside text blocks.

Layout

All top-level blocks in a document (or sub-blocks at any given depth) must have the same indentation. Spaces ( ) and tabs (\t) can be used for indenting, although we recommend using only spaces to avoid confusion: two indentation strings are considered the same if and only if they are equal in the Python sense, which means that a space in one line cannot be replaced with a tab in another equally-indented line. These rules are similar to Python's.
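The equality rule for indentation can be sketched in plain Python. This is only an illustration of the string comparison described above, not the parser's code. The helper same_indentation is a hypothetical name.

```python
def same_indentation(line_a: str, line_b: str) -> bool:
    # Take the leading run of spaces/tabs from each line and compare the
    # two strings exactly, character by character (Python's == on str).
    indent_a = line_a[: len(line_a) - len(line_a.lstrip(" \t"))]
    indent_b = line_b[: len(line_b) - len(line_b.lstrip(" \t"))]
    return indent_a == indent_b


print(same_indentation("    p | one", "    p | two"))  # both use four spaces
print(same_indentation("\tp | one", "    p | two"))    # tab vs. spaces
```

Under this comparison, a tab never equals any number of spaces, which is why mixing the two between sibling blocks is treated as a different indentation.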

All the rules of text layout (inline text, multiline text etc.) that will be discussed later on hold equally for all types of text blocks (plain-text, markup, verbatim). Spaces after special characters: |/!:$% - are never obligatory, and in some cases (inside expressions) may be disallowed. A single leading space right after the text-block marker (|/!) is interpreted as a marker-content separator and gets removed from the output, if present; an additional space should be inserted by the programmer if a space character is still desired in this place in the output.

Modifiers

Hypertag defines layout modifiers: special symbols that can be put at the beginning of a block’s headline to change the block’s indentation and/or position relative to the previous block. There are two types of modifiers: append (...) and dedent (<). Modifiers cannot be mixed, and there can be at most one modifier for a given block.

The append modifier (...) marks that a block is a continuation of the previous block and should be appended to it without a newline:

i   | word1
... | word2
... b | word3

output:

<i>word1</i>word2<b>word3</b>

There should be no empty lines between the two blocks in the code, otherwise a newline will still be inserted. Indentation of the appended block is preserved and applied to any newlines that may occur within this block’s own body:

p
    i  | When appending blocks...
    ...|  the indentation in
         a multiline block
         is still preserved!

output:

<p>
    <i>When appending blocks...</i> the indentation in
    a multiline block
    is still preserved!
</p>

The “append” modifier can also be used to convert an outline sub-block into an inline one:

p
    ... | Using "..." modifier, a block with no predecessors
    ... |  can be inlined into its parent.

output:

<p>Using "..." modifier, a block with no predecessors can be inlined into its parent.</p>

The dedent modifier (<) decreases the output indentation of a block by one level (makes the indentation equal to the parent’s). It can be used with all types of blocks, including tagged and control blocks:

div
    < p
        < i | This line's output indentation is equal to the parent's and grandparent's.

output:

<div>
<p>
<i>This line's output indentation is equal to the parent's and grandparent's.</i>
</p>
</div>

There is also a built-in tag called dedent. When used without parameters, or with full=True, this tag removes all (multi-level) output indentation of the inner blocks, up to its own indentation:

div
  dedent
    div
      p
        | Everything inside "dedent" is de-indented up to the level
          of "dedent" block itself.

output:

<div>
  <div>
  <p>
  Everything inside "dedent" is de-indented up to the level
  of "dedent" block itself.
  </p>
  </div>
</div>

When used with full=False, the “dedent” tag only removes the top-most indentation of its inner blocks. Note that the “dedent” tag can be combined with dedent/append modifiers.

The pass keyword

Hypertag has a special keyword, pass, that can be used instead of a block, as an “empty block” placeholder. This quasi-block generates no output, similarly to the pass keyword in Python. The use of pass is never enforced by the syntax: empty body is always a valid alternative and can be used inside parent blocks of all types. In some cases, though, the use of explicit pass may be preferred due to aesthetic considerations.

Structural blocks

Anatomy of a block

Some types of blocks (structural blocks) may contain nested blocks inside. A list of such nested blocks is called a body. The initial part of a block that precedes the body is called a header - it always fits on a single line (the headline). For all types of structural blocks (control blocks included!), body is not mandatory and can be omitted. Moreover, the “if” and “try” control blocks may consist of multiple branches (clauses), each branch having its own body.

The most common type of structural block is a tagged block whose header consists of a name of tag, optionally followed by a space-separated list of attributes and a body.

Content of a structural block can be arranged as inline, outline (short for “out of the line”), or mixed inline+outline. Inline content starts right after the header in the headline and is usually rendered to a more compact form than outline content (without surrounding newlines):

h1 | This is inline text, no surrounding newlines are printed in the output.
p
   / These are sub-blocks of an outline content...
   | ...of the parent paragraph block.

output:

<h1>This is inline text, no surrounding newlines are printed in the output.</h1>
<p>
   These are sub-blocks of an outline content...
   ...of the parent paragraph block.
</p>

Mixed inline+outline content is allowed if a colon : is additionally put in the headline:

div: | This inline text is followed by a sub-block "p".
  p
    i | Line 1
    b | Line 2

output:

<div>This inline text is followed by a sub-block "p".
  <p>
    <i>Line 1</i>
    <b>Line 2</b>
  </p></div>

Without a colon, all content is interpreted as multiline text (fulltext body):

div |
  Line 1
  Line 2

output:

<div>
Line 1
Line 2
</div>

If no inline content is present, a colon can optionally be put at the end of the block’s headline. The two forms, with and without a trailing colon, are equivalent:

p
    i | text
p:
    i | text

output:

<p>
    <i>text</i>
</p>
<p>
    <i>text</i>
</p>

Null tag

A special null tag (.) can be used to better align tagged and untagged blocks in the code:

p
  i | This line is in italics ...
  . | ... and this one is not, but both are vertically aligned in the script.
  . | The null tag helps with code alignment when a tag is missing.

output:

<p>
  <i>This line is in italics ...</i>
  ... and this one is not, but both are vertically aligned in the script.
  The null tag helps with code alignment when a tag is missing.
</p>

Tagged blocks

The most common type of structural block is a tagged block, whose header consists of a name of tag, optionally followed by a space-separated list of attributes and a body:

div class='main-content' width='1000px'
    | sub-block 1
    | sub-block 2

Tags can be chained together in a single block using a colon :, like here:

h1 : b : a href='#' :
    | This is a bold heading with an anchor.

output:

<h1><b><a href="#">
    This is a bold heading with an anchor.
</a></b></h1>

Each tag in a chain can have its own list of attributes. Shortcuts are available for the two most common HTML attributes: .CLASS is equivalent to class=CLASS, and #ID means id=ID.

p #main-content .wide-paragraph | text...

output:

<p id="main-content" class="wide-paragraph">text...</p>

The same (keyword) attribute may occur multiple times on the list. In such a case, all the values get space-concatenated. This semantics simplifies the use of .CLASS shortcuts with multiple classes, as in:

p .wide .left .green-text

which is equivalent to a repeated use of the class=... attribute, and renders:

<p class="wide left green-text"></p>
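The space-concatenation of repeated attributes can be pictured with a small Python sketch. It only models the rule stated above; the function merge_attributes and the list-of-pairs input are assumptions made for illustration, not part of Hypertag's API.

```python
def merge_attributes(pairs):
    """Join the values of repeated attribute names with single spaces."""
    merged = {}
    for name, value in pairs:
        if name in merged:
            # A repeated name appends to the existing value.
            merged[name] = merged[name] + " " + value
        else:
            merged[name] = value
    return merged


# ".wide .left .green-text" read as three class=... attributes
attrs = merge_attributes(
    [("class", "wide"), ("class", "left"), ("class", "green-text")]
)
print(attrs)
```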

Expressions

A Hypertag script may define variables to be used subsequently in expressions inside plain-text and markup blocks, or inside attribute lists. A variable is created by an assignment block ($). Expressions are embedded in text blocks using {...} or $... syntax - the latter can only be used for expressions that consist of a variable with (optionally) some tail operators (. [] ()):

$ k = 3
$ name = "Ala"
| The name repeated $k times is: {name * k}
| The third character of the name is: "$name[2]"

output:

The name repeated 3 times is: AlaAlaAla
The third character of the name is: "a"

Expressions may contain primitives: literal strings, numbers, booleans, collections, None. Escape strings: {{, }}, $$ can be used inside text blocks and strings to produce {, }, $, respectively.

Assignment blocks support unpacking assignments, where multiple variables are assigned at once:

$ a, (b, c) = [1, (2, 3)]

There are also in-place (augmented) assignments:

$ x += 5
$ y *= 2
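For comparison, the same assignments written in plain Python look as follows. This assumes Hypertag mirrors Python's semantics here, which the text suggests but this sketch does not verify. The starting values of x and y are made up for the example.

```python
# Nested unpacking: the right-hand side structure is matched element-wise.
a, (b, c) = [1, (2, 3)]

# Hypothetical starting values, so the in-place operators have something to update.
x, y = 10, 10
x += 5   # same as x = x + 5
y *= 2   # same as y = y * 2

print(a, b, c, x, y)
```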

Operators

Each Hypertag variable points to a Python object and can be used with all the standard operators known from Python. The list is ordered by decreasing operator priority:

. [] ()                     - tail operators (member access, indexing, function call)
** * / // %                 - arithmetic 
+ - unary minus             - arithmetic
<< >>                       - bitwise
& ^ |                       - bitwise
A:B:C                       - slice operator inside [...] indexing
== != >= <= < >             - comparison
in is "not in" "is not"     - membership & identity
not and or                  - logical
X if TEST else Y            - logical

Inside the ternary if-else operator, the else clause is optional and defaults to else None. Hypertag also defines a number of custom operators: binary, prefix unary, and postfix unary - they are described below.

Non-standard binary operators

Hypertag’s custom binary operators include:

The concatenation operator is an extension of the Python syntax for joining literal strings, like in 'Hypertag ' "is" ' cool' which is converted by Python parser to a single string: 'Hypertag is cool'. In Python, this works for literals only, while in Hypertag, all types of expressions can be joined in this way. The concatenation operator has a lower priority than binary “or” (|) and a pipeline (:); and higher than comparisons.

Note that inside dictionaries {...} and array slices [a:b:c], operators other than arithmetic and bitwise must be enclosed in parentheses to avoid ambiguity of the colon :, which in Hypertag serves as a pipeline operator, but in dictionaries and slices plays a role of a field separator.

Non-standard prefix operators

There are two unary prefix operators that identify variables and tags inside expressions:

A tag embedding constitutes an ordinary expression, and as such, it can occur as a part of a larger expression, be used as an attribute value, be inserted into a text block, etc. For example, the following markup block:

/ { %div('plain text').upper() }

is rendered to:

<DIV>PLAIN TEXT</DIV>

Non-standard postfix operators

Hypertag defines two postfix operators called qualifiers:

Qualifiers can be put at the end of atomic expressions (X?, X!, no space), or right after expression embeddings ({...}?, {...}!, $X?, $X!). Qualifiers are often combined with “try-else” blocks. Examples of use can be found in the Qualifiers section.

Filters

Hypertag defines a new operator not present in Python, the pipeline (:), for use in expressions. It is applied much like pipes | in templating languages: it passes the result of an expression (a feed) to a function (a filter) as its first argument, without putting the entire expression inside the function-call parentheses, as would normally be required. A typical example of filters in a templating language:

title | truncate(50) | upper

This takes a title string, truncates it to no more than 50 characters, and then converts it to upper case. In Hypertag, this expression looks the same, except that the pipes are replaced with colons:

title : truncate(50) : upper

Templating languages, like Jinja or Django’s templates, require that functions are explicitly declared as filters before they can be used in template code. In Hypertag, there are no such restrictions. Rather, all callables (functions, methods, class constructors etc.) can be used in pipelines with no special preparation. A pipeline is just another syntax for a function call, so every expression of the form:

EXPR : FUN(*args, **kwargs)

gets translated internally to:

FUN(EXPR, *args, **kwargs)

Obviously, pipeline operators can be chained together, such that EXPR:FUN1:FUN2 is equivalent to FUN2(FUN1(EXPR)). A filter can be specified using a compound expression, like obj.fun or similar constructs (an atom followed by any number of “member access” or “indexing” tail operators). For example, the standard str.upper method can be used directly, instead of implementing a custom upper(). Below, a pipeline is put inside a text block:

| {'Hypertag' : str.upper : list : sorted(reverse=True)}

that renders:

['Y', 'T', 'R', 'P', 'H', 'G', 'E', 'A']
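The translation rule EXPR : FUN(*args, **kwargs) → FUN(EXPR, *args, **kwargs) can be mimicked in plain Python. The helper pipeline below is a hypothetical sketch of that rule, not Hypertag's implementation. Filters with extra arguments are passed as (callable, args, kwargs) tuples purely for this illustration.

```python
def pipeline(feed, *filters):
    """Apply filters left to right, passing the running value as first argument.

    Each filter is either a bare callable or a (callable, args, kwargs) tuple.
    """
    value = feed
    for f in filters:
        if callable(f):
            value = f(value)                      # EXPR : FUN
        else:
            fun, args, kwargs = f
            value = fun(value, *args, **kwargs)   # EXPR : FUN(*args, **kwargs)
    return value


# Mirrors: 'Hypertag' : str.upper : list : sorted(reverse=True)
result = pipeline("Hypertag", str.upper, list, (sorted, (), {"reverse": True}))
print(result)
```

Written as ordinary nested calls, the same computation is sorted(list(str.upper('Hypertag')), reverse=True).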

If a filter function only takes one argument (the feed), it can be used with or without parentheses in a pipeline. These two forms are equivalent:

EXPR : FUN
EXPR : FUN()

Django filters

Hypertag seamlessly integrates all of Django’s template filters. They can be imported from hypertag.django.filters and either called as regular functions or used inside pipelines. The extra filters from django.contrib.humanize (the “human touch” to data) are also available. Django must be installed on the system.

from hypertag.django.filters import $slugify, $upper
from hypertag.django.filters import $truncatechars, $floatformat
from hypertag.django.filters import $apnumber, $ordinal

| { 'Hypertag rocks' : slugify : upper }
| { 'Hypertag rocks' : truncatechars(6) }
| { '123.45' : floatformat(4) }

# from django.contrib.humanize:
| "5" spelled out is "{ 5:apnumber }"
| example ordinals {1:ordinal}, {2:ordinal}, {5:ordinal}

output:

HYPERTAG-ROCKS
Hyper…
123.4500

"5" spelled out is "five"
example ordinals 1st, 2nd, 5th

Symbols

All identifiers in Hypertag - variables, tags, attributes - are case-sensitive.

Names of tags and variables must match the following regular expression: [a-zA-Z_][a-zA-Z0-9_]* - every name that matches this pattern is called regular.
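A quick way to check whether a name is regular is to test it against this pattern with Python's re module. The helper is_regular is a hypothetical name used only for this sketch.

```python
import re

# The pattern quoted in the text above.
REGULAR_NAME = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def is_regular(name: str) -> bool:
    # fullmatch requires the whole name to match, not just a prefix.
    return REGULAR_NAME.fullmatch(name) is not None


for candidate in ["tableRow", "_x1", "1x", "main-content"]:
    print(candidate, is_regular(candidate))
```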

Inside names of tag attributes in a tagged block (but not in a hypertag definition), a broader set of characters is allowed. Basically, Hypertag supports the same format as defined for attributes in XML (see the Name production in the XML grammar), with the restriction that a colon : must not occur as the first or the last character. For example, the following code is valid:

div ąłę_źó:1-x = ''
div 更車-賈滑 = ''

output:

<div ąłę_źó:1-x=""></div>
<div 更車-賈滑=""></div>

A name that satisfies the broader XML-like rule of naming attributes, but not the previous one for regular identifiers, is called irregular.

In hypertag definition blocks, names of formal attributes must be regular. Otherwise, it would be impossible to refer to and make use of such attributes inside hypertag’s definition body.

However, when a tag is implemented in Python as an external tag, it can accept the extended set of attribute names, including irregular ones.

Namespaces

There are two separate namespaces in Hypertag: for tags and variables. Thanks to the separation, there is no risk of a name collision between local variables and predefined tags: “a”, “b”, “i” etc. For instance, it is possible to define $i as a loop variable, while referring to %i (an HTML tag) inside the loop at the same time:

for i in [1,2,3]:
    i | number $i

output:

<i>number 1</i>
<i>number 2</i>
<i>number 3</i>

By convention, to avoid confusion and clearly indicate what namespace a given symbol belongs to, its name can be prepended in the documentation with $ or %, to denote a variable ($i) or a tag (%i).

Name scoping

The two global namespaces are internally arranged in a hierarchy that follows the structure of the document (hierarchical name scoping). Every tagged block, as well as a hypertag definition block, creates a new branch in the namespace: new symbols are only added to this branch and are visible to sibling blocks at the same depth and to their sub-blocks, but not to other blocks in the outer scope. For example:

p
    $x = 1
    # "x" can be accessed inside the paragraph:
    | $x

# "x" cannot be accessed outside the paragraph

Obviously, symbols defined at a higher level can be temporarily overwritten in a narrower scope down the document tree:

$ x = 1
p:
    $ x = 2
    | "x" inside the paragraph equals $x
| "x" outside the paragraph equals $x

output:

<p>
    "x" inside the paragraph equals 2
</p>
"x" outside the paragraph equals 1
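Hierarchical scoping of this kind resembles a chain of dictionaries, where a lookup falls back to outer scopes and an assignment only affects the innermost one. The sketch below models that idea with collections.ChainMap. It is an analogy under that assumption, not a description of how Hypertag stores its namespaces.

```python
from collections import ChainMap

# Top-level scope of the script:  $ x = 1
document_scope = ChainMap({"x": 1})

# Entering the "p" block opens a child scope; writes go to the child only.
paragraph_scope = document_scope.new_child()
paragraph_scope["x"] = 2   # shadows the outer x
paragraph_scope["y"] = 5   # exists only inside the paragraph

inside = paragraph_scope["x"]
outside = document_scope["x"]
print(inside, outside)
```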

Note that unlike tagged and definition blocks, control blocks do not create new branches in namespaces by themselves. Therefore, it is correct to assign a variable inside an if/try/for/while block and still access its value in sibling blocks:

if True:
    $x = 1
else:
    $x = 2
| x=$x is accessible in a sibling of a control block

output:

x=1 is accessible in a sibling of a control block

Primitives

Hypertag supports the following literal expressions:

Literal strings can be created with the '...' or "..." syntax, both are equivalent. This creates formatted strings (f-strings, analogue of Python’s f-strings), which may contain embedded expressions of both the $... and {...} form. If you want to create raw strings instead, such that $, {, } are treated as regular characters, the r'...' and r"..." syntax should be used:

| { "this is a formatted string with an embedded expression: {2+3}" }
| {r"this is a raw string, so the expression is left unparsed: {2+3}" }

output:

this is a formatted string with an embedded expression: 5
this is a raw string, so the expression is left unparsed: {2+3}

Inside formatted strings (but not in raw strings), Python’s escape sequences (\n, \t, \xNN, \uNNNN, \\ etc.) are recognized and converted to corresponding characters.
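The distinction is close to the one between f-strings and raw strings in Python, shown below for comparison. Note two differences: Python has no $... embedding form, and a Python string must be explicitly prefixed with f to be formatted.

```python
formatted = f"embedded expression: {2+3}"   # expression is evaluated
raw = r"left unparsed: {2+3}"               # braces are ordinary characters

tab_escape = "a\tb"    # \t becomes a single tab character
tab_raw = r"a\tb"      # backslash and "t" are kept as two characters

print(formatted)
print(raw)
print(len(tab_escape), len(tab_raw))
```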

Hypertag syntax allows for creation of standard Python collections: lists, tuples, sets, dictionaries. When creating sets and dicts, keep a space between the braces of a collection and the surrounding embedding, otherwise the double braces may be interpreted as escape strings.

| this is a list:   { [1,2,3] }
| this is a tuple:  { (1,2,3) }
| this is a set:    { {1,2,1,2} }
| this is a dict:   { {'a': 1, 'b': 2} }

output:

this is a list:   [1, 2, 3]
this is a tuple:  (1, 2, 3)
this is a set:    {1, 2}
this is a dict:   {'a': 1, 'b': 2}

Imports

Variables can be imported from other Hypertag scripts and Python modules using an import block. Objects of any type can be imported in this way, including functions and classes. Symbols can optionally be renamed using the as syntax:

from python_module import $x, $y as z, $fun as my_function, $T as MyClass
from hypertag_script import $name

| fun(x) is equal to $my_function(x)
$ obj = MyClass(z)

The HyperHTML standard runtime recognizes the same package.module syntax of import paths as Python. The “dotted” path syntax can be applied to Python and Hypertag files alike: the latter must have the “.hy” extension in order to be detected. The interpretation of import paths is runtime-specific, so some other (custom) runtime classes could parse these paths differently, for instance, to enable the import of scripts from a DB instead of files, or from remote locations etc. Wildcard import is supported: from PATH import *.

Importing an entire module (import PATH) is currently not supported.

Tags can be imported in a similar way as variables. Due to separation of namespaces (variables vs. tags), all symbols must be prepended with either $ (to denote a variable), or % (a tag):

from my.utils import $variable
from my.utils import %tag

When importing external tags from a Python module, the tag name is looked up in a special module-level dictionary __tags__, which must be present and contain a given tag name for the import to succeed. The value of each entry should be an instance of hypertag.Tag.

HyperHTML supports relative (.PACKAGE.MODULE) and absolute (PACKAGE.MODULE) import paths. In some cases, when relative paths are used, it may be necessary to pass the value of __file__ or __package__ of the current module as a context variable to the render() method, for the path resolution to work correctly:

html = HyperHTML().render(script, __file__ = __file__, __package__ = __package__)

It is allowed that import paths refer to folders that are not valid Python packages (no __init__.py inside).

Context

Hypertag provides a special type of import block for declaring “context” variables and tags, which can (and should) be passed by the caller to runtime.translate() or runtime.render() (see Runtime) when executing the script. These variables/tags constitute a dynamic context of script execution, for example:

context $width          # width [px] of the page
context $height         # height [px] of the page

| Page dimensions imported from context are $width x $height

This script can be rendered in the following way:

html = HyperHTML().render(script, width = 500, height = 1000)
print(html)

and the output is:

Page dimensions imported from context are 500 x 1000

The context block behaves similarly to an import block, in that the declared variables/tags get automatically introduced to the script’s namespace at the point of the block’s occurrence. Also, like in import blocks, a context block may declare multiple symbols at once, comma-separated, and each symbol on the list can optionally be renamed using the as syntax:

context $width as W, $height as H

Importantly, unlike import blocks, a context block can only occur at the beginning of a script, i.e., it can only be preceded by comments or other context blocks (there can be multiple context blocks in a script).

Context blocks constitute a public interface of the script: all the variables and tags declared in these blocks are obligatory and must be present in the call to translate() or render(), otherwise an exception is raised. Any extra variables/tags passed by the caller are ignored.
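The contract described here (declared symbols are required, extras are dropped) can be sketched as a small Python function. The name bind_context and the choice of KeyError are assumptions for illustration. The source does not say which exception type Hypertag raises.

```python
def bind_context(declared, supplied):
    """Return only the declared symbols; fail if any of them is missing."""
    missing = [name for name in declared if name not in supplied]
    if missing:
        # Hypothetical error type; the real runtime may raise something else.
        raise KeyError("missing context symbols: " + ", ".join(missing))
    # Extra entries in `supplied` are silently ignored.
    return {name: supplied[name] for name in declared}


bound = bind_context(
    ["width", "height"],
    {"width": 500, "height": 1000, "theme": "dark"},
)
print(bound)
```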

Note that the presence of context blocks in a script implies that this script can no longer be imported by other scripts, and it can only be used at the top level of a script execution hierarchy (!), otherwise there would be no way to supply a context.

Custom tags

Hypertag allows programmers to define custom tags, either directly in Hypertag code (native tags), or as Python classes (external tags). Both cases are described below.

Native tags

One of the most distinctive features of Hypertag is the support for custom tag definitions right inside a Hypertag script. This type of custom tag is called a native tag or hypertag, and is created with a hypertag definition block (%):

% tableRow name price='UNKNOWN'
    tr        
        td | $name
        td | $price

Here, tableRow is a newly defined tag that wraps up plain-text contents of table cells with appropriate tr & td tags to produce a listing of products. A hypertag may accept attributes, possibly with default values, similar to Python functions. In places of occurrence, hypertags accept positional (unnamed) and/or keyword (named) attributes:

table
    tableRow 'Porsche'  '200,000'
    tableRow 'Jaguar'   '150,000'
    tableRow 'Maserati' '300,000'
    tableRow name='Cybertruck'

output:

<table>
    <tr>
        <td>Porsche</td>
        <td>200,000</td>
    </tr>
    <tr>
        <td>Jaguar</td>
        <td>150,000</td>
    </tr>
    <tr>
        <td>Maserati</td>
        <td>300,000</td>
    </tr>
    <tr>
        <td>Cybertruck</td>
        <td>UNKNOWN</td>
    </tr>
</table>

Custom native tags constitute a powerful instrument of code abstraction and deduplication. They enhance modularity and maintainability of presentation code, and let programmers fully adhere to the DRY principle.
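The way a hypertag binds positional and keyword attributes, with defaults for the ones left out, is comparable to a Python function signature. The sketch below is only an analogy for that binding. The function table_row and its output layout are hypothetical, and it does not use the Hypertag library.

```python
from html import escape


def table_row(name, price="UNKNOWN"):
    # Analogous to:  % tableRow name price='UNKNOWN'
    return (
        "<tr>\n"
        f"    <td>{escape(name, quote=False)}</td>\n"
        f"    <td>{escape(price, quote=False)}</td>\n"
        "</tr>"
    )


rows = [
    table_row("Porsche", "200,000"),   # positional attributes
    table_row(name="Cybertruck"),      # keyword attribute, default price
]
print("\n".join(rows))
```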

Imagine that, in the example above, we wanted to add a CSS class to all cells of the price column. In HTML, we would have to walk through all the cells and manually modify every single occurrence (HTML is notorious for code duplication), taking care not to modify <td> cells of another column accidentally. With hypertags, it is enough to add .style-price in a tag definition, and voilà:

% tableRow name price='UNKNOWN'
    tr        
        td | $name
        td .style-price | $price

This definition can be moved out to a separate “utility” script and loaded with Python-like imports, or stay in the same file where it is being used, for easy maintenance - the programmer can choose whatever location is best in a given case. In traditional templating languages, there are not so many choices: often the best we can do is separate out duplicated HTML code into a Python function (like a custom tag in Django), introducing code fragmentation along the way and spreading presentation code over different types of files (views vs. models) and languages (HTML vs. Python) - a very unclean and confusing approach.

Not surprisingly, hypertags can refer to other hypertags. Moreover, hypertag definitions can be nested: a hypertag can be defined inside another one, such that it can only be used locally within the scope of the outer definition, like the %row inside %products, below:

% products items=[] maxlen=20
    % row name price
        tr        
            td | $name[:maxlen]
            td | $price
    table        
        for item in items:
            row item.name item.price

Notice that the local tag %row, which is being used in a loop in the last line, can internally access attributes of the outer hypertag (here, maxlen).
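This visibility rule is similar to a closure in Python, where an inner function can read the arguments of the function that encloses it. The sketch below shows that analogy only. The names products and row mirror the example above, while the tuple-based items and the compact output are assumptions.

```python
def products(items, maxlen=20):
    def row(name, price):
        # `maxlen` is read from the enclosing call, like an attribute of
        # the outer hypertag; `row` is not visible outside `products`.
        return f"<tr><td>{name[:maxlen]}</td><td>{price}</td></tr>"

    return "<table>" + "".join(row(n, p) for n, p in items) + "</table>"


table = products([("Maserati Quattroporte", "300,000")], maxlen=8)
print(table)
```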

“Body” attribute

A crucial element of the hypertag syntax is the body attribute. Imagine that in the example above, we wanted to add another column containing formatted (rich-text) information about a car model: pictures, funny quotes etc. Passing this as a regular attribute is inconvenient, as we would have to somehow encode the entire HTML structure of the description: paragraphs, styles, images. Instead, we can add a body attribute (@) to the hypertag definition:

% tableRow @info name price='UNKNOWN'
    tr
        td | $name
        td | $price
        td
           @ info           # inline form can be used as well:  td @ info

This special attribute will hold the actual body of a hypertag’s occurrence, represented as a hierarchy of nodes of Hypertag’s native Document Object Model (DOM), so that all rich contents and formatting are preserved:

table
    tableRow 'Porsche' '200,000'
        img src="porsche.jpg"
        / If you insist on <s>air conditioning</s>, 🤔
        / you can always hit the track and roll down the window at <u>160 mph</u>. 😎 
    tableRow 'Jaguar' '150,000'
        img src="jaguar.jpg"
        b | Money may not buy happiness, but I'd rather cry in a Jaguar than on a bus.
    tableRow 'Cybertruck'
        | If you liked Minecraft you will like this one, too.
        / (Honestly, I did it for the memes. <i>Elon Musk</i>)

output:

<table>
    <tr>
        <td>Porsche</td>
        <td>200,000</td>
        <td>
           <img src="porsche.jpg" />
           If you insist on <s>air conditioning</s>, 🤔
           you can always hit the track and roll down the window at <u>160 mph</u>. 😎
        </td>
    </tr>
    <tr>
        <td>Jaguar</td>
        <td>150,000</td>
        <td>
           <img src="jaguar.jpg" />
           <b>Money may not buy happiness, but I'd rather cry in a Jaguar than on a bus.</b>
        </td>
    </tr>
    <tr>
        <td>Cybertruck</td>
        <td>UNKNOWN</td>
        <td>
           If you liked Minecraft you will like this one, too.
           (Honestly, I did it for the memes. <i>Elon Musk</i>)
        </td>
    </tr>
</table>

A hypertag can have at most one body attribute, and it must come first on the attribute list. The body attribute can be omitted, in which case the tag is void and its occurrences must have an empty body. The attribute can have an arbitrary name; we suggest @body if there is no more meaningful alternative.

Together with the body attribute, Hypertag provides a new type of block: the DOM insertion block (@) that allows embedding of a DOM represented by a body attribute into a formal body of a hypertag. This type of block was used above when defining %tableRow:

td
   @ info

This code inserts external information about a car (info) represented by a DOM into the <td> tag within the hypertag’s output. Note that, similar to text blocks, DOM insertion blocks can also be used as inline body of a structural block, so the fragment above can be rewritten as:

td @ info

The inline and outline forms may differ with respect to output indentation and whitespace; otherwise they are, in most cases, equivalent (although this may depend on a specific tag implementation).

Most often, a DOM insertion block contains just a name of the body attribute. However, in general, any expression is allowed, so the DOM can be preprocessed before insertion:

@ info.select('img')[:1]

The above code searches for img nodes and inserts the first one, if present. Note that within insertion blocks, everything is treated as an expression, so the usual embedding markers, $... and {...}, are not needed here.

Although each body attribute is a regular variable, which can be used in all the same places as other variables (in all types of expressions), embedding it through another type of block (other than @) will not work as expected. For example, although the following code is valid:

/ $info

it will produce a Python representation string of the DOM instance, instead of merging the DOM into the hypertag’s body:

<hypertag.core.dom.DOM object at 0x7f5de33bcd60>

In some rare cases, you might want to render the DOM straight away and embed it as a string:

/ $info.render()

This is a valid approach, although it prevents any further manipulation of this part of the DOM upstream in the script, and for this reason it is not recommended.

External tags

Custom tags can take the form of Python objects of the hypertag.Tag class or its subclass. Every new tag should have a name (tag.name) and expose the expand method:

def expand(self, body, attrs, kwattrs):
    ...

This method is called during DOM rendering in order to convert a body (an instance of DOM, see DOM structure) tagged by the tag to a string in a target language. Typically, body.render() is called inside expand() to first convert the body to a string, and only then a tag-specific surrounding text is added; this can be customized, however. In general, the tag class is free to do whatever it wants with the input DOM, including any transformation and manipulation, before an output text is produced. The tag expansion can be influenced by positional (attrs) and keyword attributes (kwattrs). Note that names of keyword attributes may follow the more general XML rules of attribute naming, so they may not be valid Python identifiers.
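
As a rough illustration of the expand() contract described above, the sketch below mimics a tag in plain Python. It deliberately does not subclass hypertag.Tag, so that it runs without Hypertag installed. FakeBody and BoxTag are hypothetical names, and the way attributes are formatted is an assumption made for this example, not Hypertag's actual behavior:

```python
class FakeBody:
    """Hypothetical stand-in for a DOM body: it only provides render()."""

    def __init__(self, text):
        self.text = text

    def render(self):
        return self.text


class BoxTag:
    """Illustrative tag; a real one would subclass hypertag.Tag."""

    name = 'box'

    def expand(self, body, attrs, kwattrs):
        # First convert the body to a string, then add the surrounding text.
        inner = body.render()
        # Keyword attribute names may follow XML rules (e.g. "data-id"),
        # so they are treated as plain dictionary keys, not Python identifiers.
        extra = ''.join(f' {key}="{value}"' for key, value in kwattrs.items())
        return f'<div class="box"{extra}>{inner}</div>'


html_out = BoxTag().expand(FakeBody('Hello'), [], {'data-id': '7'})
print(html_out)
```

Rendering the body first and wrapping it afterwards is only the typical pattern; a tag is free to inspect or transform the body before producing its output.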

If you implement a tag that behaves similarly to HTML markup tags, in that it only adds surrounding plain-text tags <...>, possibly with attributes, it may be helpful to use the standard hypertag.Markup class, which can be subclassed or directly instantiated. This approach can be taken, for example, to implement custom XML tags, not present on the list of standard HTML5 tags.

Hypertag also provides a convenient wrapper, hypertag.TagFunction, for converting functions into tags. The wrapper can be applied to any function of the form:

def fun(body, *attrs, **kwattrs):
    ...

The function is called in place of an expand() method; it should accept at least one positional argument (the body) and return a string. Importantly, the body here is passed as an already-rendered string (!), rather than a DOM, so that existing text-processing functions can be used as they are with the wrapper. If you need to manipulate the DOM during expansion, you should subclass hypertag.Tag instead.
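
For illustration, here is a minimal function of the required form. The name shout and its optional suffix attribute are hypothetical. The wrapping step itself is not shown, because the exact arguments accepted by hypertag.TagFunction are not covered in this section:

```python
def shout(body, *attrs, **kwattrs):
    """Upper-case an already-rendered body (a plain string).

    The first positional attribute, if present, is appended as a suffix.
    Keyword attributes are accepted but ignored in this sketch.
    """
    suffix = attrs[0] if attrs else ''
    return body.upper() + suffix


print(shout('hello world', '!'))
```

Because the body arrives as a string, any existing text-processing function with a compatible signature could be used in the same way.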

After a new tag is implemented, it should be added to the special module-level dictionary, __tags__, where it can be found by import blocks.

Control blocks

Control blocks of multiple types are available in Hypertag to help you manipulate input data directly in a document without going back and forth between Python and presentation code. The blocks are:

“if”, “for”, “while” blocks

The syntax of “if”, “for”, “while” blocks is analogous to Python’s. Both inline and outline bodies are supported, although the former comes with a restriction: the leading expression (a condition in “if/while”, a collection in “for”) may need to be enclosed in (...) or {...}, because the special symbols |/! can be interpreted both as operators inside expressions and as markers of an inline body. Trailing colons in clause headlines are optional.

An example “if” block with an outline body looks like this:

$size = 5
if size > 10      
    | large size
elif size > 3:
    | medium size
else
    | small size

output:

medium size

The same code as above, but with inline body (notice the parentheses around expressions):

$size = 5
if (size > 10)    | large size
elif (size > 3)   | medium size
else              | small size

Examples of loops:

for i in [1,2,3]  | $i

for i in [1,2,3]:
    li | item no. $i

$s = 'abc'
while len(s) > 0               -- Python built-ins ("len") can be used
    | letter "$s[0]"
    $s = s[1:]                 -- assignments can occur inside loops

output:

123

<li>item no. 1</li>
<li>item no. 2</li>
<li>item no. 3</li>

letter "a"
letter "b"
letter "c"

Like in Python, loops can be followed by an optional else clause. However, its semantics differs considerably from Python’s. Hypertag’s else was inspired by Django’s empty clause of the for ... empty ... endfor construct: it provides a fallback when no iteration of the loop was executed, either because the collection is empty (in for blocks), or because the condition is false right from the beginning (in while loops).

This differs from Python’s semantics of else, which is based on the loop’s termination cause: the clause executes only after the loop completes normally (no break was encountered), even if the loop body did run. Python’s semantics seems confusing to many and is rarely used, so Hypertag adopts a different, more intuitive and probably more useful approach. Example:

    for item in []:
        p | $item
    else:
        | No item found.

output:

No item found.

Note that Hypertag does not provide equivalents for Python’s loop control keywords: break and continue.
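
To make the difference concrete, the following plain-Python sketch contrasts the two behaviors. The function names are hypothetical, and hypertag_style_loop only models the fallback rule described above; it is not Hypertag's implementation:

```python
def hypertag_style_loop(items):
    """Model of Hypertag's for-else: the fallback runs only if no iteration ran."""
    lines = []
    executed = False
    for item in items:
        executed = True
        lines.append(f'<p>{item}</p>')
    if not executed:
        lines.append('No item found.')
    return '\n'.join(lines)


def python_for_else(items):
    """Python's for-else: the else branch runs whenever the loop ends without break."""
    lines = []
    for item in items:
        lines.append(f'<p>{item}</p>')
    else:
        lines.append('No item found.')
    return '\n'.join(lines)


print(hypertag_style_loop([]))      # fallback only
print(hypertag_style_loop(['a']))   # item only
print(python_for_else(['a']))       # item AND the else text
```

For an empty collection both variants produce the fallback text; they diverge as soon as at least one iteration is executed.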

“try” block

The “try” block differs from the same-named Python statement. It consists of a single “try” clause plus any number (possibly none) of “else” clauses. The clauses are tried in order, and the first one that does not raise an exception is rendered. All exceptions that inherit from Python’s Exception are caught. An empty string is rendered if all clauses fail.

Exceptions are checked only after semantic analysis, so any syntax or name resolution errors (e.g., an undefined variable in a clause) are still raised. Also, note that the semantics of “else” is opposite to Python’s, where the “else” clause of a “try-else” statement only gets executed if no exception has occurred.

Example:

$cars = {'ford': 60000, 'audi': 80000}
try
    | Price of Opel is $cars['opel'].
else
    | Price of Opel is unknown.

output:

Price of Opel is unknown.

Similar code as above, but with inline body:

$cars = {'ford': 60000, 'audi': 80000}

try  | Price of Opel is $cars['opel'].
else | Price of Opel is not available, but how about Seat: $cars['seat'].
else | Neither Opel nor Seat is available.
       Let's stick with a Ford: $cars['ford'].

output:

Neither Opel nor Seat is available.
Let's stick with a Ford: 60000.
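
The clause-selection rule can be modeled in a few lines of plain Python. This is only a sketch of the semantics described above, under the assumption that each clause is represented by a zero-argument callable. The helper name first_successful is hypothetical:

```python
def first_successful(*clauses):
    """Return the result of the first clause that does not raise an Exception.

    An empty string is returned if every clause fails.
    """
    for clause in clauses:
        try:
            return clause()
        except Exception:
            continue
    return ''


cars = {'ford': 60000, 'audi': 80000}

text = first_successful(
    lambda: f"Price of Opel is {cars['opel']}.",
    lambda: f"Price of Opel is not available, but how about Seat: {cars['seat']}.",
    lambda: f"Neither Opel nor Seat is available.\nLet's stick with a Ford: {cars['ford']}.",
)
print(text)
```

The first two clauses raise KeyError, so the third one is used, matching the output shown above.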

There is a shortcut version “?” of the “try” syntax which can only be used without “else” clauses, in order to suppress exceptions:

? | Price of Opel is $cars['opel'].

Importantly, the shortcut “?” can be used as a prefix (on the same line) with a tagged block, which is not possible with the basic syntax. The code below renders an empty string instead of raising an exception:

? b : a href=$cars.url | the "a" tag fails because "cars" has no "url"

Qualifiers

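The “try” block is particularly useful when combined with expression qualifiers: “optional” (?) and “obligatory” (!). They are placed at the end of (sub)expressions to mark that a given piece of calculation either may fail, in which case the exception is caught and an empty value is used instead (?), or must produce a non-empty (true) value, otherwise an exception is raised (!).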

Together, these language constructs enable fine-grained control over data post-processing, sanitization and display. They can be used to verify the availability of particular elements of input data (keys in dictionaries, attributes of objects) and to easily create alternative paths of calculation that will handle multiple edge cases at once:

| Price of Opel is {cars['opel']? or cars['audi'] * 0.8}

In the above code, the price of Opel is not present in the dictionary, but thanks to the “optional” qualifier ?, a KeyError is caught early, and a fallback is used to approximate the price from another entry. The output is:

Price of Opel is 64000.0
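
A plain-Python model of this expression is sketched below. It assumes that a failed “optional” expression evaluates to an empty string, which is false in a boolean context, so the or operator moves on to the fallback. The helper name optional is hypothetical:

```python
def optional(compute):
    """Model of the "?" qualifier: evaluate, and return '' on any Exception."""
    try:
        return compute()
    except Exception:
        return ''


cars = {'ford': 60000, 'audi': 80000}

# Equivalent in spirit to:  cars['opel']? or cars['audi'] * 0.8
price = optional(lambda: cars['opel']) or cars['audi'] * 0.8
print(f'Price of Opel is {price}')
```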

With the “obligatory” qualifier ! one can verify that a variable has a non-default (non-empty) value, and adapt the displayed message accordingly, with no need for more verbose if-else tests:

%display name='' price=0
    try  | Product "$name!" costs {price}!.
    else | Product "$name!" is available, but the price is unknown yet.
    else | There is a product priced at {price!}.
    else | Sorry, we're closed.

display 'Pen' 100
display 'Pencil'
display price=25

output:

Product "Pen" costs 100.
Product "Pencil" is available, but the price is unknown yet.
There is a product priced at 25.
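
The same behavior can be modeled in plain Python, assuming that the “obligatory” qualifier raises an exception for any false value ('' or 0 here) and that clauses are tried in order. The helper obligatory and the exception type it raises are assumptions made for this sketch:

```python
def obligatory(value):
    """Model of the "!" qualifier: pass a true value through, reject a false one."""
    if not value:
        raise ValueError('obligatory value is empty')
    return value


def display(name='', price=0):
    """Try each message in turn; return the first one whose values are all present."""
    clauses = [
        lambda: f'Product "{obligatory(name)}" costs {obligatory(price)}.',
        lambda: f'Product "{obligatory(name)}" is available, but the price is unknown yet.',
        lambda: f'There is a product priced at {obligatory(price)}.',
        lambda: "Sorry, we're closed.",
    ]
    for clause in clauses:
        try:
            return clause()
        except Exception:
            continue
    return ''


print(display('Pen', 100))
print(display('Pencil'))
print(display(price=25))
```

Calling display() with no arguments falls through to the last clause, which has no qualifiers and therefore always succeeds.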

Qualifiers can also be used in loops to test for non-emptiness of the collections to be iterated over:

try
    for p in products!
        | $p.name costs $p.price
else
    | No products currently available.

When passed $products=[], the above code outputs:

No products currently available.

Qualifiers can be placed after any atomic expression or embedding; no space is allowed before the qualifier. More details can be found in the Operators section.

DOM

The execution of a Hypertag script consists of multiple phases. Before the final document is generated, the script is first translated to a native Document Object Model (DOM), where every tagged block is mapped to a node in the DOM tree. During translation:

Importantly, the translation is performed incrementally, going from the bottom to the top of the script’s syntax tree (AST). Whenever a hypertag declares a body attribute (@), this attribute’s value (an actual body from the place of occurrence) is passed to the tag as an already-translated DOM of a particular subtree. This gives hypertags an exceptional capability to actively manipulate (introspect, truncate, rearrange) the provided subtree before it gets merged into the formal body of the hypertag. Possible applications include:

All these can be done directly in Hypertag, without falling back to Python code.

The details of the DOM structure and manipulation are discussed in the next subsections. We also show how to generate a Table of Contents in just a few lines of Hypertag code.

DOM structure

The DOM is built of instances of the following classes:

The DOM class can be imported from Hypertag’s root package:

from hypertag import DOM

During expansion of custom tags, both native and external ones, the body attribute is passed as an instance of DOM. For the entire document, the result of script translation is additionally wrapped up in Root and returned as an instance of this class.

If you want to check what DOM is being produced by a given script, you may call runtime.translate() instead of runtime.render(), and then tree() of the returned DOM to get its textual representation:

dom = runtime.translate(script)
print(dom.tree())

For example, the following script:

ul .short-list
    li : i | Item

for i in [1,2,3]:
    b class="row$i" | Row no. $i

is translated to the DOM:

<Root>
  <Text>
  ul class=short-list
    li
      i
        <Text>
  <Text>
  b class=row1
    <Text>
  b class=row2
    <Text>
  b class=row3
    <Text>
  <Text>

As you can notice, all text blocks are converted to Text nodes. Vertical whitespace surrounding or separating the blocks is also encoded as Text. Control blocks (for) are replaced with the result of their execution. Expressions are replaced with their values (row1 etc.). Tags and their attributes are preserved. All positional attributes of hypertags (but not of external tags) are converted to keyword attributes. Chained tags (li : i) are mapped onto separate parent-child nodes in the DOM.

In places where Node instances occur, the tree() method prints names of tags instead of <Node>, for brevity.

DOM manipulation

There is one fundamental reason why Hypertag employs an intermediate DOM representation and performs the AST-to-DOM translation as a separate phase, instead of rendering the entire script to a string at once: to allow document manipulation inside hypertags before the final document gets rendered. This way, hypertags can assume an active role in document generation and communicate more efficiently with other parts of the code. In a typical scenario, the incoming DOM is passively transferred to the output of a hypertag. However, with DOM manipulation routines, the DOM can be freely modified along the way, and can also be used as a means of communication between hypertags, or as a carrier of internal data that control hypertags’ expansion.

The DOM base classes, DOM and Node, provide two general-purpose methods for traversing a DOM:

The DOM class also provides higher-level methods that can be used to apply constraints to a DOM hierarchy and select subsets of nodes. Each of these methods returns the nodes wrapped up in a newly created instance of DOM, with the original DOM left unmodified:

Constraints for select and skip are defined through function arguments, which restrict what nodes get selected. The conditions below can be combined:

Other methods may be added in the future to handle more generic classes of selector expressions (XPath, CSS).

Additionally, the DOM and Node classes override the indexing operator [...] to provide shortcuts for accessing top-level nodes and attributes, and for node selection:

Example: ToC generation

The DOM manipulation routines can be utilized to automatically generate the Table of Contents (ToC) of an arbitrary Hypertag document. Below, we assume that h2 is the tag that marks occurrences of top-level headings and therefore should be detected and its contents put in the ToC. A hypertag that performs this detection and generates a ToC takes as few as four lines of code:

%toc @document
    for heading in document['h2']
        $ id = heading.get('id','')
        li : a href="#{id}" @ heading.body

Here:

With %toc hypertag in place, we can define one more tag, %with_toc, to add an introductory text and print the full document together with the ToC:

%with_toc @document
    | Table of Contents:
    ol
        toc @document

    | The document:
    @document

Now, to add a ToC to an arbitrary document, it is enough to tag it with with_toc, like here:

with_toc
    h2 #first  | First heading
    p  | text...
    h2 #second | Second heading
    p  | text...
    h2 #third  | Third heading
    p  | text...

The output is:

Table of Contents:
<ol>
    <li><a href="#first">First heading</a></li>
    <li><a href="#second">Second heading</a></li>
    <li><a href="#third">Third heading</a></li>
</ol>

The document:
<h2 id="first">First heading</h2>
<p>text...</p>
<h2 id="second">Second heading</h2>
<p>text...</p>
<h2 id="third">Third heading</h2>
<p>text...</p>
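
For readers more familiar with Python, the logic of %toc can be approximated with a toy model in which the document is a flat list of dictionaries. This representation is purely hypothetical and much simpler than Hypertag's DOM classes; it only illustrates the select-headings-then-link idea:

```python
# Toy stand-in for a DOM: each node has a tag name, attributes and a text body.
document = [
    {'tag': 'h2', 'attrs': {'id': 'first'},  'body': 'First heading'},
    {'tag': 'p',  'attrs': {},               'body': 'text...'},
    {'tag': 'h2', 'attrs': {'id': 'second'}, 'body': 'Second heading'},
    {'tag': 'p',  'attrs': {},               'body': 'text...'},
    {'tag': 'h2', 'attrs': {'id': 'third'},  'body': 'Third heading'},
    {'tag': 'p',  'attrs': {},               'body': 'text...'},
]


def toc(nodes, heading_tag='h2'):
    """Build one <li> entry per heading node, linking to the heading's id."""
    items = []
    for node in nodes:
        if node['tag'] != heading_tag:
            continue
        anchor = node['attrs'].get('id', '')   # same default as heading.get('id','')
        body = node['body']
        items.append(f'<li><a href="#{anchor}">{body}</a></li>')
    return items


toc_lines = toc(document)
print('\n'.join(toc_lines))
```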

Runtime

Execution of a Hypertag script is performed by a runtime: an instance of the hypertag.Runtime class. The execution consists of three phases:

  1. parsing of the script to an Abstract Syntax Tree (AST); the syntactic and semantic analysis of the script is performed;
  2. translation of the AST to a native Document Object Model (DOM), where tagged and textual blocks are mapped to nodes of a DOM tree; during translation, all expressions are evaluated, native hypertags get expanded, control blocks are executed;
  3. rendering of the DOM to a final document (a string) in a target language.

Typically, client code calls runtime’s render() to perform all the above steps at once. If a client wants to obtain a structured representation of the document - the DOM - rather than a flat string, the runtime’s method translate() should be called instead, followed by a call to render() on the DOM tree. Between the calls, the DOM can be manipulated and modified according to the caller’s needs.

The runtime specifies what target language the scripts will be rendered to, and defines a list of built-in symbols (tags and/or variables, see Runtime.DEFAULT) that will be automatically imported at the beginning of script execution.

Additionally, the runtime specifies an escape function (Runtime.escape) that will be applied to all outputs of plain-text blocks in order to convert them to a target language. This function can perform simple character encoding, like entity encoding in the case of HTML, but it can also do any other more complex operation that is necessary to convert plain text to a valid string in the target language.

HyperHTML

The Hypertag implementation provides a standard runtime, hypertag.HyperHTML, for generating HTML5 documents. This runtime implements HTML-specific tags and an escape function.

The escape function performs character encoding: the special characters (<, >, &) are replaced with corresponding HTML entities (&lt; &gt; &amp;).
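
An escape function with this behavior can be sketched with Python's standard library. This is not necessarily how HyperHTML implements it; the name escape_html is hypothetical:

```python
import html


def escape_html(text):
    """Replace &, < and > with HTML entities, leaving quotes untouched."""
    # quote=False restricts html.escape() to the three characters listed above.
    return html.escape(text, quote=False)


print(escape_html('HTML is auto-escaped: & < >'))
```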

The symbols imported by HyperHTML as built-ins upon startup include:

  1. Python built-ins.
  2. General-purpose tags & functions.
  3. HTML-specific tags.

See the Standard library section for details.

Standard library

Hypertag comes with a number of predefined tags and functions that can be used in scripts. Some of them are declared as built-ins and automatically imported by the standard runtime (HyperHTML), while others can be imported manually using import blocks.

Importantly, all predefined tags are implemented as external tags, which means they get rewritten into the DOM nodes during translation (rather than expanded right away), and can subsequently be used in selectors during DOM manipulation inside other (native) tags or outside the parser code.

Python built-ins

First and foremost, the HyperHTML’s list of built-in symbols includes all of Python’s built-ins (builtins.*), therefore all the commonly used types and functions: list, set, dict, int, min, max, enumerate, sorted etc., are available to a Hypertag script.

| $len('cat'), $list('cat')
| $int('123'), $min(4,5,6)
for i, c in enumerate(sorted('cat')):
    | $i, $c  

output:

3, ['c', 'a', 't']
123, 4
0, a
1, c
2, t

Python built-ins can also be imported explicitly from the usual path. Remember to prepend every name with the variable marker ($):

from builtins import $sorted, $list as LIST
| $sorted(LIST((3,2,1)))

output:

[1, 2, 3]

Hypertag built-ins

Hypertag defines a number of its own general-purpose tags and functions (filters) that can be used with different runtimes and target languages. Each of the names below refers both to a tag (e.g., %dedent) and to a same-named function ($dedent):

In HyperHTML, all the above symbols are declared as built-ins and imported automatically:

unique
    inline |
        Hypertag
        rocks !!!
    | { upper('  hyperTAGS   rock   ') : lower : inline }
    | Hypertag rocks !!!

output:

Hypertag rocks !!!
hypertags rock

The following symbols are only available as functions. They are declared as built-ins in HyperHTML:

If needed, all the symbols listed above can be imported explicitly from the hypertag.builtins path:

from hypertag.builtins import %inline, $inline, $cycle

Foreign symbols

If Django is installed, you can use all of its template filters inside Hypertag, either as standalone functions, or filters in pipeline expressions. The details are described in the Filters section.

HTML-specific symbols

For every standard HTML5 tag, HyperHTML provides two corresponding Hypertag tags: written in lower case and upper case. For example, for the HTML tag <div>, there are %div and %DIV hypertags available. Their output differs by letter case of the HTML tag name produced, otherwise the behavior is the same. It is up to the programmer to decide what variant to use:

div class='search'
    span | text

DIV class='search'
    SPAN | text

output:

<div class="search">
    <span>text</span>
</div>

<DIV class="search">
    <SPAN>text</SPAN>
</DIV>

In addition to standard HTML tags, HyperHTML also provides the comment tag, which inserts an HTML comment into the output. Typically, this tag is used with a verbatim body (!):

comment ! This is an HTML comment

output:

<!--This is an HTML comment-->

Whenever the HyperHTML runtime is used, all built-in HTML tags are automatically imported into a script. They can also be imported explicitly from the hypertag.html module:

from hypertag.html import %div, %DIV

An explicit import can be used, for example, when a script is being rendered with a non-HTML runtime and the target document is mostly written in a different language, but some HTML markup still needs to be inserted.

Django connector

Hypertag fully integrates with Django. As described earlier, all of Django’s template filters are available for Hypertag scripts out of the box. There is also a Django-Hypertag backend class, hypertag.django.backend.Hypertag. When put in settings.py of a Django project, this class allows Hypertag scripts to be found by the template discovery mechanism, so that they can be loaded and rendered just like standard Django or Jinja2 templates.

The Hypertag backend configuration should be put on the TEMPLATES list inside Django project’s settings.py:

TEMPLATES = [
    {
        'BACKEND': 'hypertag.django.backend.Hypertag',
        'DIRS': [],
        'APP_DIRS': True,
        'OPTIONS': {},
    },
    # ... other engines here ...
]

By default, Hypertag scripts are looked for in the hypertag subfolder of the Django project directory. Their file names must have .hy extension. For example, to load a Hypertag script, my_script.hy, into a Django view function, my_view(), and render it through the Hypertag backend while setting a value of a context variable, title, you can use the following code:

from django.template.loader import get_template

def my_view(request):
    template = get_template('my_script.hy')
    context = {'title': 'Hypertag sample template'}
    return template.render(context, request)

Note that when a Hypertag script imports other scripts, Hypertag’s own loading mechanism is used, not Django’s. Therefore, there is no need to tweak global Django settings to import scripts from outside the hypertag folder: the related scripts (imported by the top-level one) can be located in any folder or package that is accessible through a dotted path, absolute or relative. It is also possible to import scripts from subfolders of hypertag by using a relative path. For example, the following import block, when placed in a top-level Hypertag script:

from .users.profile import %photo_editor

will load a photo_editor tag from a profile.hy script located in hypertag/users subfolder.

In some cases, when relative import paths are used, it may be necessary to pass __file__ or __package__ of the current module to the render() method as a context variable, for the import path resolution to work correctly:

    # (...)
    context = {'title': 'Hypertag sample template', '__file__': __file__, '__package__': __package__}
    return template.render(context, request)