[Source](http://www.schnada.de/grapt/eriknaggum-enamel.html "Permalink to Erik Naggum on attributes in SGML/XML, Enamel (NML), Lisp")
# Erik Naggum on attributes in SGML/XML, Enamel (NML), Lisp
Newsgroups: [comp.lang.lisp][2]
Subject: Re: XML and lisp
From: [Erik Naggum][3] <e...@naggum.net>
Message-ID: <3207626455633924@naggum.net>
Organization: Naggum Software, Oslo, Norway
User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/20.7
Date: Fri, 24 Aug 2001 07:21:01 GMT
NNTP-Posting-Date: Fri, 24 Aug 2001 09:21:01 MET DST

* Tim Bradshaw <t...@tfeb.org>
> ((:reply :title "Lisp is not just a programming language")
>  (:body
>   (:p "It is also a text-markup language,
>        and many other things, as you can see here"
>       "For instance with a suitable (small) macro, this is quite legal
>        Lisp syntax, which is compiled to *ML. I have written significantly-sized
>        documents in this notation."))
>  (:signature "--tim"))

As long as we think aloud in alternative syntaxes, I actually prefer to
break the _incredibly_ stupid syntactic-only separation of elements and
attribute values. SGML and its descendants have made a crucial mistake:
For every level of container (there are about 7 of them), there is a new
syntax for _two_ properties of the container: (1) the contents is wrapped
in one syntax, but (2) the "writing on the box" is in quite another.
This means that information and meta-information are massively different
concepts, and this artificial separation runs through the whole SGML
design. Each level offers a new way to write the two differently. This
is what makes it so goddamn hard to reason about SGML documents and to do
reasonably intelligent transformations on them without working your butt
off specifying all sorts of irrelevant stuff that does _nothing_ but get
in your way.

I have come to _loathe_ the half-assed hybrid that some XML-in-Lisp tools
use and produce, because it makes XML just as evil in Lisp as it was in
XML to begin with, and we have gained absolutely nothing in either power
of processing or in abstraction, which is so very un-Lisp-like.

    <foo bar="zot">quux</foo>

should be read as

    (foo (bar "zot") "quux")

and most definitely _NOT_ as ((:foo :bar "zot") "quux"), which turns this
fairly reasonable structure into a morass of complexity worse than it was
to begin with. And it does _NOT_ help to represent empty elements only
with a keyword. Using three different levels of nesting to represent a
single concept is Just Plain Wrong. Also, using keywords is not a good
idea because there needs to be a lot of related information associated
with elements and attributes, in different contexts, not to mention all
the things they do with their funny "namespaces" these days.

Whether something is an attribute or element is _completely_ arbitrary.
It is based on some arbitrary choices in the design process that reveal
absolutely no inherent qualities. For purely pragmatic reasons, SGML
folks will use attributes for some things and elements for others because
their tools can deal with some things in attributes and some things in
elements. The faulty idea that attributes say something "about" the
element and sub-elements somehow constitute their contents is the same
premature structuring that premature optimization of code suffers from.
The whole language is incredibly misdesigned in making that distinction.

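As a minimal sketch (not part of the original post) of what the uniform reading buys, the Common Lisp below looks up a named part of an element without caring whether XML spelled it as an attribute or as a sub-element. The helper names `part` and `text-of` are hypothetical, and the sketch assumes elements are plain lists of the form `(name part...)`.

```lisp
;; Hypothetical helpers, assuming an element is a list (name part...)
;; in which each part is either a string or another element.

(defun part (name element)
  "Return the first sub-form of ELEMENT whose head is NAME, or NIL."
  (find name (remove-if-not #'consp (rest element)) :key #'first))

(defun text-of (element)
  "Return the strings that appear directly inside ELEMENT."
  (remove-if-not #'stringp (rest element)))

;; <foo bar="zot">quux</foo> and <foo><bar>zot</bar>quux</foo> both read
;; as the same form, so the same call serves both.
(part 'bar '(foo (bar "zot") "quux"))   ; => (BAR "zot")
(text-of '(foo (bar "zot") "quux"))     ; => ("quux")
```
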
As for writing SGML/XML/HTML/whatever, I have a simple way to get rid of
the annoying verbosity of these stupid languages while _retaining_ that
mistake between attribute values and elements, because it is quite hard
to make simple regular expression-based conversions retain enough data
about an element to decide what should be attribute and element. An
element has the form <name [attributes] | [contents]>. Attributes have
the form <name | value>. Internal whitespace is only for readability.

    XML                           Enamel (NML)            CL
    <foo/>                        <foo>                   (foo)
    <foo bar="zot"/>              <foo <bar|zot>>         (foo (bar "zot"))
    <foo>zot</foo>                <foo|zot>               (foo "zot")
    <foo bar="zot">quux</foo>     <foo <bar|zot> |quux>   (foo (bar "zot") "quux")
    <foo>Hey, &quux;!</foo>       <foo|Hey, [quux]!>      (foo "Hey, " quux "!")
    <foo>AT&amp;T you will</foo>  <foo|AT&T you will>     (foo "AT&T you will")
    <foo><bar>zot</bar></foo>     <foo|<bar|zot>>         (foo (bar "zot"))

So I have almost none of the annoying and arbitrary quote/escape mania in
attribute values or contents alike, either. Entities I write as [name],
and they end up in the Lisp version as symbols if not the character they
represent purely for syntactic reasons. Writing "code" in this language
is actually amazingly painless compared to the produced noise. Besides,
with a few simple modify-syntax-entry calls in Emacs, I get < and > to
match and blink and I can move up and down the structure very easily.

For processing this stuff in Common Lisp, it is _sometimes_ neat to
convert the single | attribute/content marker into the zero-length
symbol, ||, so pathological cases like

    <foo bar="zot"><bar>"zot"</bar></foo>

which could have been written like this to show how arbitrary the
syntactic distinction in SGML/XML is

    <foo <bar|zot>|<bar|zot>>

come out as

    (foo (bar "zot") || (bar "zot"))

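A form that carries the `||` marker can be taken apart again with very little code. The following is a minimal sketch, not taken from the original post; the name `split-element` is hypothetical, and it assumes that a form without the marker has attributes only, mirroring the Enamel rule that contents appear only after `|`.

```lisp
(defun split-element (form)
  "Return three values: the name, the attributes, and the contents of FORM.
Assumes the symbol || separates attributes from contents; a form without
the marker is taken to have attributes only."
  (let* ((parts (rest form))
         (marker (position '|| parts)))
    (values (first form)
            (subseq parts 0 marker)          ; everything before the marker
            (if marker
                (subseq parts (1+ marker))   ; everything after the marker
                '()))))

(split-element '(foo (bar "zot") || (bar "zot")))
;; => FOO, ((BAR "zot")), ((BAR "zot"))
```
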
The really interesting thing is that writing in Enamel and producing XML
is so easy that a simple Perl or Lisp function that takes an Enamel
string as argument and produces XML is quite simple and straightforward.
This makes for some interesting-looking "scripting" that blows
the mind of the miserable little wrecks that think they have to type the
endtag, the quotes and all the other user-inimical features of SGML/XML.

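The post does not show such a function, so what follows is only a sketch of how one might look in Common Lisp, under stated assumptions: the input is a single well-formed Enamel element, every attribute has the form `<name|value>`, and `&` and `"` are the only characters that need escaping on the way out. All function names here are hypothetical.

```lisp
(defun write-escaped (char out)
  "Write CHAR to OUT, escaping the characters XML treats specially."
  (case char
    (#\& (write-string "&amp;" out))
    (#\" (write-string "&quot;" out))
    (t (write-char char out))))

(defun read-name (in)
  "Collect characters up to whitespace or one of < | >."
  (with-output-to-string (name)
    (loop for c = (peek-char nil in)
          until (find c '(#\Space #\Tab #\Newline #\< #\| #\>))
          do (write-char (read-char in) name))))

(defun translate-contents (in out)
  "Copy contents up to the closing >, expanding [entities] and nested elements."
  (loop for c = (peek-char nil in)
        do (case c
             (#\> (read-char in) (return))
             (#\< (translate-element in out))
             (#\[ (read-char in)
                  (write-char #\& out)
                  (loop for e = (read-char in)
                        until (char= e #\])
                        do (write-char e out))
                  (write-char #\; out))
             (t (write-escaped (read-char in) out)))))

(defun translate-element (in out)
  "Translate one element; IN is positioned at its opening <."
  (read-char in)                            ; consume <
  (let ((name (read-name in)))
    (format out "<~A" name)
    ;; Attributes: nested <name|value> forms that precede the | marker.
    (loop while (char= (peek-char t in) #\<)
          do (read-char in)                 ; consume <
             (format out " ~A=\"" (read-name in))
             (peek-char t in)               ; skip whitespace before |
             (read-char in)                 ; consume |
             (loop for c = (read-char in)
                   until (char= c #\>)
                   do (write-escaped c out))
             (write-char #\" out))
    ;; Either the element ends here, or | introduces its contents.
    (ecase (read-char in)
      (#\> (write-string "/>" out))
      (#\| (write-char #\> out)
           (translate-contents in out)
           (format out "</~A>" name)))))

(defun enamel-to-xml (string)
  "Return the XML text for the single Enamel element in STRING."
  (with-input-from-string (in string)
    (with-output-to-string (out)
      (translate-element in out))))

(enamel-to-xml "<foo <bar|zot> |quux>")   ; => "<foo bar=\"zot\">quux</foo>"
(enamel-to-xml "<foo|Hey, [quux]!>")      ; => "<foo>Hey, &quux;!</foo>"
```

Malformed input simply runs into an end-of-file error; a real tool would want better diagnostics.
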
In my personal view, Lisp "markup" has the disadvantage of needing lots
of quotes, while Enamel has the strong advantage that in <xxx|yyy>, xxx
is always symbolic and yyy is always a string of characters subject to
interpretation by whatever the symbolic part instructs in context.

Since the key feature of markup languages is the separation of text from
markup, the simple idea in Enamel should carry enough force to make this
a fully realizable goal without making an artificial syntactic separation
between information and meta-information at any level. If the syntax is
good enough for the information, it should be good enough for the
meta-information, and I think Enamel is. Fortunately, I do not have to create
a whole new international following and engage in godawful politics to
use a better syntax for XML and the like, since XML and the like are only
used as interchange syntaxes these days. Nobody in their right mind
actually writes anything by hand in such stupid languages that require so
much attention to incredibly insignificant details and incomprehensibly
irrelevant redundancy, anyway, do they? :)

Finally, note that in Enamel, a complete element is enclosed in <...> and
that means it can be subject to a nice little Common Lisp reader macro,
and it can be taught to recognize other stuff, as well, such as the neat
concept of interpolating expression values where {expression} occurs.

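The post gives no code for such a reader macro, so the following is only a rough sketch of the idea. It assumes well-formed input, upcases names before interning them so they match ordinary Lisp symbols, and leaves out the {expression} interpolation entirely. The names `read-symbol-until`, `read-enamel-element`, and `*enamel-readtable*` are hypothetical; a separate readtable is used so that `<` keeps its usual meaning everywhere else.

```lisp
(defun read-symbol-until (stream terminators)
  "Read characters up to (not including) one of TERMINATORS; return a symbol."
  (intern
   (string-upcase
    (with-output-to-string (name)
      (loop for c = (peek-char nil stream t nil t)
            until (find c terminators)
            do (write-char (read-char stream t nil t) name))))))

(defun read-enamel-element (stream char)
  "Reader-macro function for <: read one Enamel element as a list."
  (declare (ignore char))
  (let ((name (read-symbol-until
               stream '(#\Space #\Tab #\Newline #\< #\| #\>)))
        (parts '())
        (text (make-string-output-stream)))
    (flet ((flush-text ()
             ;; Move any pending run of text into PARTS as a string.
             (let ((chunk (get-output-stream-string text)))
               (when (plusp (length chunk))
                 (push chunk parts)))))
      ;; Attributes: nested elements that precede the | marker.
      (loop while (char= (peek-char t stream t nil t) #\<)
            do (push (read-enamel-element stream (read-char stream t nil t))
                     parts))
      ;; Contents, if the next character is | rather than the closing >.
      (when (char= (read-char stream t nil t) #\|)
        (loop for c = (read-char stream t nil t)
              until (char= c #\>)
              do (case c
                   (#\< (flush-text)
                        (push (read-enamel-element stream c) parts))
                   (#\[ (flush-text)
                        (push (read-symbol-until stream '(#\])) parts)
                        (read-char stream t nil t))   ; consume ]
                   (t (write-char c text))))
        (flush-text)))
    (cons name (nreverse parts))))

(defparameter *enamel-readtable*
  (let ((table (copy-readtable nil)))
    (set-macro-character #\< #'read-enamel-element nil table)
    table))

(let ((*readtable* *enamel-readtable*))
  (read-from-string "<foo <bar|zot> |quux>"))
;; => (FOO (BAR "zot") "quux")
```

The result is plain list data, so it has to be quoted or passed to a processing function rather than evaluated as code.
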
Still at "internal use" stage, I plan to publish some stuff about Enamel
not too far into the future.

///

maintained by [Mr Schnada][4] <webmaster at schnada de> zorglub 2017-06-02 00:52Z

[1]: http://www.schnada.de/impressum.html "legal notice / disclosure"
[2]: http://groups.google.com/group/comp.lang.lisp/msg/4917ba734ce860c4
[3]: http://naggum.no/
[4]: http://www.schnada.de/contact.html
[5]: http://www.schnada.de/quotes/contempt.html
[6]: http://www.schnada.de/index.html
[7]: http://www.schnada.de/index.html#conts
[8]: http://www.schnada.de/hylin/colsa.html#indoors
[9]: http://www.schnada.de/bilders/scan/baalpfig.html
[10]: http://www.schnada.de/impressum.html
[11]: http://validator.w3.org/check?uri=referer
[12]: http://www.htmlhelp.org/cgi-bin/validate.cgi?url=http%3A%2F%2Fwww.schnada.de%2F&warnings=yes&spider=yes
[13]: http://validator.w3.org/checklink?uri=https%3A%2F%2Fwww.schnada.de%2Fgrapt%2Feriknaggum-enamel.html&hide_type=all&depth=&check=Check
[14]: http://www.htmlhelp.org/tools/valet/linktest.cgi?url=http%3A%2F%2Fwww.schnada.de%2F&date=2006-01-01&type=Full
[15]: http://jigsaw.w3.org/css-validator/validator?uri=http%3A%2F%2Fwww.schnada.de%2Fsty%2Fstengel.css&warning=2&profile=css2