Pandoc Plug-in
This is a DITA-OT plug-in used to extend the available input formats for
DITA-OT. Non-DITA input sources can be pre-processed using Pandoc to
create valid DITA source. Files written in multiple input formats can be
added directly to a DITA <map> and processed as if they
had been written in DITA.
Background
Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library. It can convert from the following formats:
- Markdown: commonmark (CommonMark Markdown), gfm (GitHub-Flavored Markdown), markdown (Pandoc’s Markdown), markdown_mmd (MultiMarkdown), markdown_phpextra (PHP Markdown Extra), markdown_strict (original unextended Markdown)
- Wiki Formats: dokuwiki (DokuWiki markup), mediawiki (MediaWiki markup), muse (Muse), tikiwiki (TikiWiki markup), twiki (TWiki markup), vimwiki (Vimwiki)
- Other Formats: creole (Creole 1.0), docbook (DocBook), docx (Word docx), epub (EPUB), fb2 (FictionBook2 e-book), haddock (Haddock markup), html (HTML), ipynb (Jupyter notebook), jats (JATS XML), json (JSON version of native AST), latex (LaTeX), man (roff man), native (native Haskell), odt (ODT), opml (OPML), org (Emacs Org mode), rst (reStructuredText), t2t (txt2tags), textile (Textile)
More information about Pandoc can be found at pandoc.org.
This plug-in contains a Lua template which extends the output formats supported by Pandoc to include DITA. The output consists of a single DITA topic for each input file added to the ditamap.
Unlike the standard Markdown Plug-in, this plug-in does not
fail if the <h1...h6> headers are incorrectly
incremented. This is because the Lua template has been designed to adjust
header levels so that they increment by at most one level at a time. The
downside is that the output may be unexpected.
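The effect of limiting headers to one level of increment at a time can be sketched in Python. This is an illustration only, not the plug-in's Lua code: the function name `normalize_header_levels` and the rule that the first header is treated as level 1 are assumptions made for the example.

```python
def normalize_header_levels(levels):
    """Clamp each header level to at most one deeper than the previous output level.

    Assumption: nothing precedes the first header, so it is capped at level 1.
    """
    normalized = []
    previous = 0  # no header seen yet
    for level in levels:
        # A header may stay level, go shallower, or go exactly one level deeper.
        adjusted = min(level, previous + 1)
        normalized.append(adjusted)
        previous = adjusted
    return normalized


if __name__ == "__main__":
    # An h1 followed directly by an h3 is pulled up to an h2.
    print(normalize_header_levels([1, 3, 4, 2, 5]))
```

Under these assumptions a skipped level is silently pulled up rather than rejected, which is why the resulting structure can differ from what the author intended.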
Install
This plug-in requires Pandoc to be installed on the user's machine to function correctly. It also requires the xmltask jar to edit XML files as part of the Ant build. Installation therefore involves a series of commands to install the relevant plug-in dependencies, followed by installing Pandoc itself.
Run the plug-in installation commands:
dita install https://github.com/doctales/org.doctales.xmltask/archive/master.zip
dita install https://github.com/jason-fox/fox.jason.passthrough/archive/master.zip
dita install https://github.com/jason-fox/fox.jason.passthrough.pandoc/archive/master.zip
The dita command line tool requires no additional configuration.
Installing Pandoc
To download a copy of Pandoc, follow the instructions on the Pandoc Install page.
Usage
To mark a file to be passed through for Pandoc processing, label
its <topicref> with @format="pandoc" within the <map>.
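As a hedged illustration, a map using this attribute might look like the one embedded in the Python sketch below. The map title and file names are invented for the example, and the helper `pandoc_topicrefs` is hypothetical: it only lists the entries marked for Pandoc and is not part of the plug-in.

```python
import xml.etree.ElementTree as ET

# Hypothetical map: the title and all file names are made up for illustration.
DITAMAP = """<map>
  <title>Sample map</title>
  <topicref href="intro.dita"/>
  <topicref format="pandoc" href="release_notes.textile"/>
  <topicref format="pandoc" href="user_guide.rst"/>
</map>"""


def pandoc_topicrefs(ditamap_xml):
    """Return the href of every <topicref> labelled with format="pandoc"."""
    root = ET.fromstring(ditamap_xml)
    return [
        ref.get("href")
        for ref in root.iter("topicref")
        if ref.get("format") == "pandoc"
    ]


if __name__ == "__main__":
    for href in pandoc_topicrefs(DITAMAP):
        print(href)
```

Entries without the attribute, such as `intro.dita` here, are ordinary DITA topics and are not selected.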
The additional file will be run against the Pandoc XXX-to-DITA
Lua filter to be converted to a *.dita file and will be added
to the build job without further processing. The @navtitle of
the included <topic> will be the same as the root name of
the file. Any underscores in the filename will be replaced by spaces in the title.
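The naming rule can be sketched in a few lines of Python. The function name `navtitle_from_href` and the sample paths are assumptions for illustration; the plug-in itself performs this step as part of its build.

```python
from pathlib import PurePosixPath


def navtitle_from_href(href):
    """Derive a title from a file reference: root name with underscores as spaces."""
    # .stem drops any directory part and the final extension.
    root_name = PurePosixPath(href).stem
    return root_name.replace("_", " ")


if __name__ == "__main__":
    print(navtitle_from_href("topics/release_notes.rst"))
```

Because the title comes straight from the filename, descriptive filenames give more readable navigation titles.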
Annotating files
The annotations described below use Markdown as the passthrough format; other
formats need to provide equivalent annotations to obtain full functionality.
Where possible, annotation aligns with the Markdown DITA syntax reference based
on CommonMark.
The chapter <title> is taken from the first header
found. Thereafter the document is processed as expected.
Ideally, input files should contain only a single # header.
Pandoc header_attributes (for example, {#my-id .my-class} appended to a header) can be used to define @id
or @outputclass attributes.
The following class values in header_attributes have a special meaning on header levels.
- section
- example
They are used to generate <section> and <example> elements.
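How these header annotations might be read can be sketched in Python. This is a regex-based approximation written for illustration, not Pandoc's own parser or the plug-in's Lua code: `parse_header` is a hypothetical helper, key=value attributes are ignored, and headers without a special class simply report no element.

```python
import re

# Class values with a special meaning, mapped to the DITA element they generate.
SPECIAL_CLASSES = {"section": "section", "example": "example"}

# Matches an ATX header with an optional trailing {...} attribute block.
HEADER = re.compile(r"^(#+)\s+(.*?)\s*(?:\{([^}]*)\})?\s*$")


def parse_header(line):
    """Split a Markdown header into level, title, id, classes and special element."""
    match = HEADER.match(line)
    if match is None:
        return None
    hashes, title, attrs = match.groups()
    tokens = (attrs or "").split()
    ids = [t[1:] for t in tokens if t.startswith("#")]
    classes = [t[1:] for t in tokens if t.startswith(".")]
    element = next(
        (SPECIAL_CLASSES[c] for c in classes if c in SPECIAL_CLASSES), None
    )
    return {
        "level": len(hashes),
        "title": title,
        "id": ids[0] if ids else None,
        "classes": classes,
        "element": element,
    }


if __name__ == "__main__":
    print(parse_header("## Worked example {#ex1 .example}"))
```

In this sketch the `#ex1` token corresponds to the @id value, and the `.example` class is what selects the <example> element.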
Metadata
A YAML metadata block, as defined in Pandoc pandoc_metadata_block, can be used to specify different metadata elements. The supported elements are:
- author
- source
- publisher
- permissions
- audience
- category
- keyword
- resourceid
- shortdesc
Unrecognized keys are output using the <data> element.
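The split between supported and unrecognized keys can be sketched as follows. The metadata is assumed to have been parsed from YAML into a Python dict already; `split_metadata` is a hypothetical helper, and the `reviewer` key is invented to show an unrecognized entry. How the plug-in serializes each <data> element is not shown here.

```python
# The supported metadata elements listed above.
SUPPORTED_KEYS = (
    "author", "source", "publisher", "permissions", "audience",
    "category", "keyword", "resourceid", "shortdesc",
)


def split_metadata(metadata):
    """Separate supported keys from those that would fall back to <data>."""
    recognized = {k: v for k, v in metadata.items() if k in SUPPORTED_KEYS}
    as_data = {k: v for k, v in metadata.items() if k not in SUPPORTED_KEYS}
    return recognized, as_data


if __name__ == "__main__":
    sample = {
        "author": "Jane Doe",
        "keyword": ["pandoc", "dita"],
        "reviewer": "John Smith",  # made-up key, not in the supported list
    }
    recognized, as_data = split_metadata(sample)
    print(recognized)
    print(as_data)
```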
Ditamap <topicmeta> processing is also supported.
This allows topic metadata to be added to files in formats other than Markdown.