
Re: OT Palm readers Re: [dvd-discuss] CTEA ProtectsWhatCopyrights?





"Arnold G. Reinhold" wrote:
> I'd argue for a simple subset of HTML. That would give you the best
> of both worlds. The source would still be in plain ASCII.  Some
> simple "pretty printing" (lines wrapped to a standard length,
> paragraph tags on separate lines, headers on their own line, etc.)
> can make the HTML source <I>reasonably</I> readable as is. And, of
> course, it would be understood by existing Web browsers.

Twiki/email-isms could do the simple <I>, <em>, and <b> stuff.  All I
really need though is two rules.

(1) What regexp constitutes a heading for the TOC (e.g. Chapter)
(2) What regexp constitutes a <p>

and define these in a "META" information section in PG's "small
print".

META HEADING1="^\s+Chapter\s+\d+"
META PARAGRAPH="^$"
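
A rough sketch of how a conversion tool might consume those two rules,
written in Python rather than Perl.  The META line syntax is taken from
the examples above; the function names (read_meta, split_text) and the
sample text are made up for illustration, and this is one possible
reading of the idea, not a PG tool.

```python
import re

# Matches lines such as: META HEADING1="^\s+Chapter\s+\d+"
META_LINE = re.compile(r'^META\s+(\w+)\s*=\s*"(.*)"\s*$')

def read_meta(lines):
    """Separate META rule lines from the body; compile each rule's regexp."""
    meta, body = {}, []
    for line in lines:
        m = META_LINE.match(line)
        if m:
            meta[m.group(1)] = re.compile(m.group(2))
        else:
            body.append(line)
    return meta, body

def split_text(lines):
    """Return (toc, paragraphs) using the HEADING1 and PARAGRAPH rules."""
    meta, body = read_meta(lines)
    heading, para_break = meta["HEADING1"], meta["PARAGRAPH"]
    toc, paragraphs, current = [], [], []

    def flush():
        # Close the paragraph being collected, if any.
        if current:
            paragraphs.append(" ".join(current))
            current.clear()

    for line in body:
        if heading.search(line):
            flush()
            toc.append(line.strip())
        elif para_break.search(line):
            flush()
        else:
            current.append(line.strip())
    flush()
    return toc, paragraphs

sample = [
    r'META HEADING1="^\s+Chapter\s+\d+"',
    'META PARAGRAPH="^$"',
    "",
    "   Chapter 1",
    "",
    "It was a dark",
    "and stormy night.",
    "",
    "The rain fell.",
]
toc, paragraphs = split_text(sample)
print(toc)
print(paragraphs)
```

The heading rule is tested before the paragraph rule, so a heading line
always ends the current paragraph and goes to the TOC instead of the
body.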

Actually, one could implement other Twiki/email-ism transformations as
s/pat/repl/.  Here is the regexp

s{(?<!\w)\*(\S+)\*(?!\w)}{<I>$1</I>}g

that would map *italic* to <I>italic</I> (if I haven't fumble-fingered
it).  This uses a regexp subset to allow flexible plain-text encoding of
simple tagging without imposing a coding standard on the PG volunteers --
and without embedded tags in the content.  A standard encoding set COULD
be defined, used, and declared

META TAG_ENCODING="PG_meta_v1.0"

or

META TAG_ENCODING="Twiki_v2.1"

allowing for any number of implementations for conversion tools to
non-plain text ebook, HTML, RTF, whatever, without encumbering the
"small print" section with inscrutable (and fragile) regexp lists.
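
To make the named-encoding idea concrete, here is a small Python sketch.
The encoding name "PG_meta_v1.0" comes from the example above, but the
contents of its rule set are an assumption: I have given it only the
*italic* rule, and the names ENCODINGS, encoding_name, and convert are
hypothetical.

```python
import re

# Hypothetical rule sets, keyed by the declared TAG_ENCODING name.
# Each rule is (compiled pattern, replacement); only *italic* is defined.
ENCODINGS = {
    "PG_meta_v1.0": [
        (re.compile(r"(?<!\w)\*(\S+)\*(?!\w)"), r"<I>\1</I>"),
    ],
}

# Tolerates optional spaces around "=", as in the META examples.
TAG_ENCODING = re.compile(r'^META\s+TAG_ENCODING\s*=\s*"([^"]*)"')

def encoding_name(lines):
    """Return the declared TAG_ENCODING value, or None if there is none."""
    for line in lines:
        m = TAG_ENCODING.match(line)
        if m:
            return m.group(1)
    return None

def convert(text, name):
    """Apply every substitution rule of the named encoding to text."""
    if name not in ENCODINGS:
        raise ValueError("unknown TAG_ENCODING: %r" % name)
    for pattern, replacement in ENCODINGS[name]:
        text = pattern.sub(replacement, text)
    return text

header = ['META TAG_ENCODING= "PG_meta_v1.0"']
name = encoding_name(header)
print(convert("This is *italic* text, but 2*3*4 is not.", name))
```

The lookarounds keep an asterisk that sits between word characters (as
in 2*3*4) from being treated as markup.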

Finally, one could do a CDDB-like metatag database for PG files instead
of embedding the encoding information (depending on PG's willingness to
support such a thing).
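
A toy Python sketch of what such a lookup could look like.  Keying the
records on a SHA-1 hash of the whitespace-normalized text is my own
assumption (CDDB itself derives its IDs from the disc's track layout),
and MetaDB and text_id are hypothetical names.

```python
import hashlib

def text_id(text):
    """Hash the text after stripping trailing whitespace and line-ending
    differences, so trivially different copies get the same ID."""
    normalized = "\n".join(line.rstrip() for line in text.splitlines())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()

class MetaDB:
    """In-memory stand-in for a shared metatag database."""

    def __init__(self):
        self._records = {}

    def register(self, text, meta):
        self._records[text_id(text)] = dict(meta)

    def lookup(self, text):
        # Returns None when nothing is registered for this text.
        return self._records.get(text_id(text))

db = MetaDB()
db.register("   Chapter 1\n\nIt was a dark night.\n",
            {"HEADING1": r"^\s+Chapter\s+\d+", "PARAGRAPH": "^$"})

# The same text with DOS line endings and trailing blanks still matches.
print(db.lookup("   Chapter 1\r\n\r\nIt was a dark night.  \r\n"))
```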

.002