Parsing custom styles in docbook #3657

Open
marcban opened this issue May 10, 2017 · 6 comments

@marcban

marcban commented May 10, 2017

DocBook has a way of specifying custom styles with the role attribute on the phrase element, but Pandoc doesn't parse them.

So Pandoc is able to generate custom styles in docx, odt, html... formats, but only from Markdown or an HTML <span> element.

From the code of the DocBook parser, I can imagine that it would take just a line or two to generate a Span element with a custom-style attribute, but I'm not a Haskell developer...

I think this would also solve #1235.

@mb21
Collaborator

mb21 commented May 10, 2017

See also the issue of extending custom styles to ODT/ICML: #2106...

@jgm
Owner

jgm commented May 10, 2017

The code changes would indeed be simple. It's a matter of deciding on bigger architectural issues. Do we assume that role attributes should be treated as style names in docx and odt output? What problems might that cause? How should they be handled in other output formats? What attribute keyword should be used? (role, data-role, custom-style?)

Another option would be to treat role as a class. This wouldn't have an automatic interpretation in docx or odt, but at least it would be possible for a filter to do something with it.

@marcban
Author

marcban commented May 10, 2017

Some answers:

  • It seems in docbook the role attribute is specifically designed for custom styles.
    At http://www.sagehill.net/docbookxsl/CustomInlines.html we can read "A phrase with role attribute is an easy way to add specialized elements to your content, without having to customize the DTD".

  • Pandoc already reads the role docbook attribute on the emphasis element in order to distinguish between bold and italic.

  • The internal Pandoc AST already supports a custom-style attribute on Span elements (used in the docx writer, as stated at http://pandoc.org/MANUAL.html#custom-styles-in-docx-output). In JSON format, the output is in this form: {"t":"Span","c":[["",[],[["custom-style","mystyle"]]]....

Ideally, the parser would also add a custom-style attribute on specific DocBook elements such as guibutton and filename, so that it would be easy to style them in the output...
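
For illustration, here is a minimal Python sketch of the Span shape in pandoc's JSON representation. The helper name `styled_span` and the sample text are hypothetical; only the attribute triple (identifier, classes, key-value pairs) follows the JSON fragment quoted above.

```python
import json


def styled_span(style, word):
    """Build a pandoc JSON Span carrying a custom-style attribute.

    `styled_span` is a hypothetical helper for illustration only.
    """
    # Attr triple: [identifier, classes, key-value pairs]
    attr = ["", [], [["custom-style", style]]]
    # A single word maps to a single Str inline.
    inlines = [{"t": "Str", "c": word}]
    return {"t": "Span", "c": [attr, inlines]}


span = styled_span("mystyle", "hello")
print(json.dumps(span))
```

A DocBook reader that mapped role to custom-style would presumably emit Spans of this shape; whether it should do so is the open design question in this thread.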

@marcban
Author

marcban commented May 10, 2017

A clarification: my DocBook files are generated by the AsciiDoc Python tool, which already generates phrases with a role attribute for quoted text; see http://www.methods.co.nz/asciidoc/userguide.html#X51.

@thomas-ferchau

I also miss this.

  • Converting Asciidoc files to HTML and PDF preserves custom roles / styles (if defined in stylesheet.css for HTML and pdf-theme.yaml for PDF).
  • When converting to DocBook, the .xml file contains roles (in simpara and phrase elements). These custom styles are ignored even if the reference Word document contains the styles:
<simpara role="green">green</simpara>
<simpara><phrase role="green">another green</phrase></simpara>
<simpara><phrase role="red">red</phrase></simpara>
<simpara><emphasis role="marked">marked</emphasis></simpara>

@jgm
Owner

jgm commented Oct 11, 2024

The role on phrase is parsed as a class, currently:

% pandoc -f docbook -t native
<simpara role="green">green</simpara>
<simpara><phrase role="green">another green</phrase></simpara>
<simpara><phrase role="red">red</phrase></simpara>
<simpara><emphasis role="marked">marked</emphasis></simpara>
[ Para [ Str "green" ]
, Para
    [ Span
        ( "" , [ "green" ] , [] )
        [ Str "another" , Space , Str "green" ]
    ]
, Para [ Span ( "" , [ "red" ] , [] ) [ Str "red" ] ]
, Para [ Emph [ Str "marked" ] ]
]

You could use a Lua filter that converts classes on Span into custom-style attributes.

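function Span(el)
  -- Use the first class (taken from the DocBook role) as the custom style name.
  if el.classes[1] then
    el.attributes['custom-style'] = el.classes[1]
    -- Return the modified element so the change is kept.
    return el
  end
end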

(untested)

The role on simpara currently doesn't do anything; there is no slot for attributes in a pandoc Para element.
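
As a rough alternative to the Lua filter, the same idea can be sketched in Python against pandoc's JSON AST. The function name `add_custom_style` is hypothetical, and the check that skips Spans already carrying a custom-style attribute is an addition not present in the Lua version. The sample input mirrors the "another green" Span from the native output above.

```python
def add_custom_style(node):
    """Walk a pandoc JSON AST in place and copy each Span's first class
    into a custom-style attribute (hypothetical helper, not pandoc API)."""
    if isinstance(node, list):
        for item in node:
            add_custom_style(item)
    elif isinstance(node, dict):
        if node.get("t") == "Span":
            # Span content is [attr, inlines]; attr is [id, classes, key-values].
            (_ident, classes, keyvals), _inlines = node["c"]
            has_style = any(key == "custom-style" for key, _ in keyvals)
            if classes and not has_style:
                keyvals.append(["custom-style", classes[0]])
        # Recurse into every value so nested Spans are reached as well.
        for value in node.values():
            add_custom_style(value)
    return node


# The second paragraph of the example, written as pandoc JSON.
sample = {
    "t": "Para",
    "c": [
        {
            "t": "Span",
            "c": [
                ["", ["green"], []],
                [{"t": "Str", "c": "another"}, {"t": "Space"}, {"t": "Str", "c": "green"}],
            ],
        }
    ],
}
add_custom_style(sample)
```

To use this as a JSON filter, the script would read the document with `json.load(sys.stdin)`, pass it through `add_custom_style`, and write the result with `json.dump(..., sys.stdout)`; pandoc's `--filter` option pipes the AST through such a script. Like the Lua version, this only affects Span elements, so the role on simpara is still dropped.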
