Allow complex metadata fields to be defined on the command line (e.g. using JSON) #3732

Closed
tajmone opened this issue Jun 11, 2017 · 24 comments

Comments

@tajmone
Contributor

tajmone commented Jun 11, 2017

I need to define a template variable with multiple values from the command line, via the --variable option, but couldn't find any reference to this in pandoc's documentation.

I'm posting this here, as an issue, because I propose that the documentation mention this (whether or not it's possible). And, if it's not possible, I'd like to propose it as a feature request.

Here is the working scenario in detail: I'm using pandoc in a custom CMS workflow to convert markdown documents (with YAML front matter) to html. The CMS engine also autogenerates some template vars and passes them to pandoc. Among the autogenerated template vars I wanted to include a breadcrumbs variable holding a list of all the folder names that make up the path of the current document, and possibly also a description string for each (to be shown as a mouse-over popup on each subfolder name/link, for easier navigation).

Pandoc's documentation offers an example of a similar multi-value template variable, using a YAML metadata block:

Variables can contain arbitrary YAML structures, but the template must match this structure. The author variable in the default templates expects a simple list or string, but can be changed to support more complicated structures. The following combination, for example, would add an affiliation to the author if one is given:

---
title: The document title
author:
- name: Author One
  affiliation: University of Somewhere
- name: Author Two
  affiliation: University of Nowhere
...

To use the structured authors in the example above, you would need a custom template:

$for(author)$
$if(author.name)$
$author.name$$if(author.affiliation)$ ($author.affiliation$)$endif$
$else$
$author$
$endif$
$endfor$

... but there is no mention of whether this could be achieved via the command line options -V and -M (neither here nor in the documentation for these options).

So currently I'm using a YAML settings file in each folder to manually handle the breadcrumbs, but the CMS engine could do a better job at autogenerating these vars and pass them via -V and -M — creating an autogenerated temporary YAML file could be a solution, but a rather inelegant one.

Ideally, -V and -M could accept a JSON string to allow defining complex variables. Is it currently possible? And, if not, could it be implemented?

Furthermore, the above YAML example states further on:

A dot can be used to select a field of a variable that takes an object as its value. So, for example:

 $author.name$ ($author.affiliation$)

Again, the documentation for -M KEY[=VAL] doesn't mention whether KEY could be a dot-separated field name, or how to define multiple values at once.

I think that handling complex variables via the command line is a powerful feature in scripted automation (especially since variables defined via CLI options have higher priority), and it should offer the same features as YAML blocks.

@jgm
Owner

jgm commented Jun 11, 2017

You can specify -M key=val1 -M key=val2 and key will end up as a metadata list with two values.
You can't currently specify complex objects using -M.
I agree, all this could be better.
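For a script that drives pandoc, the repeated-flag approach described above could be generated along these lines. This is only a sketch: the key name `crumb`, the values, and the file name `doc.md` are placeholders, and the command is built but not executed.

```python
from typing import Iterable, List


def repeated_metadata_args(key: str, values: Iterable[str]) -> List[str]:
    """Return one `-M key=value` pair per value, preserving order."""
    args: List[str] = []
    for value in values:
        args.extend(["-M", f"{key}={value}"])
    return args


# Placeholder values; "crumb" is only an example key name.
crumb_args = repeated_metadata_args("crumb", ["Home", "aaa", "bbb"])

# Argument list suitable for subprocess.run (no shell quoting needed).
command = ["pandoc", "--standalone", "doc.md", *crumb_args]
print(command)
```

Each value becomes its own `-M` flag, so `key` should end up as a list in the order given.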

I like the idea of allowing JSON to be used to specify complex values. Of course, there's a question how pandoc would know JSON is being used, but I guess the heuristic could be: if the string starts with " or {. So,

-M author='[{"name": "Fred", "institution": "State"}, {"name":"Jones", "institution": "Private"}]'
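To make the proposed heuristic concrete, here is a rough Python sketch of how such a value might be interpreted. It is not pandoc's behavior, only an illustration of the idea. Since the example value above starts with `[`, the sketch assumes that `[` is one of the trigger characters, along with `{` and `"`; the function name is made up.

```python
import json
from typing import Any


def parse_metadata_value(raw: str) -> Any:
    """Interpret a command-line value as JSON when it looks like JSON."""
    # Assumption: '[', '{' and '"' are the characters that trigger JSON parsing.
    if raw[:1] in ("[", "{", '"'):
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            # It only looked like JSON: fall back to the plain string.
            return raw
    return raw


authors = parse_metadata_value(
    '[{"name": "Fred", "institution": "State"}, {"name": "Jones", "institution": "Private"}]'
)
print(authors[0]["name"])
print(parse_metadata_value("plain value"))
```

Falling back to the raw string on a parse error keeps ordinary values that merely start with one of those characters working as before.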

@jgm jgm changed the title Defining Multi-value Vars from Command Line Arguments Allow complex metadata fields to be defined on the command line Jun 11, 2017
@mb21
Collaborator

mb21 commented Jun 12, 2017

there's a question how pandoc would know JSON is being used

or use a different parameter, e.g. -J --json-metadata

-J '{"author": [{"name": "Fred", "institution": "State"}, {"name":"Jones", "institution": "Private"}]}'

@tajmone
Contributor Author

tajmone commented Jun 12, 2017

Thanks @jgm, the tip on defining the same key multiple times via -M is very useful (I thought that the -M and -V options followed YAML's left-biased union rules, so I didn't think it could be achieved).

I like @mb21's proposal, and I suggest a further distinction between JSON definitions of metadata and of template vars by using capital and lowercase options, respectively:

-J --json-metadata
-j --json-variable

@tajmone
Contributor Author

tajmone commented Jun 17, 2017

I've tried implementing the multiple vars definition approach to build breadcrumbs for navigation, but stumbled on a limitation of $for$ loops when iterating over vars. Here's the scenario:

Each breadcrumb needs a text to display and the actual link it points to, so I've set two independent vars for the task:

-V crumb=Home -V crumb=aaa -V crumb=bbb 
-V crumb-link=../../index.html -V crumb-link=../index.html -V crumb-link=index.html 

... then in the HTML pandoc template, I iterate through $crumb$:

$for(crumb)$
   <li><a href="$crumb-link$">$crumb$</a></li>
$endfor$

... the problem is that while $crumb$ var does iterate through the list of values, $crumb-link$ doesn't — it always emits the same value.

I don't see a way to produce these links with two separate $for$ cycles, so my guess is that I'd need to use a single var with dot separated subvars (instead of two independent vars) so that $crumb$ becomes an object holding sub-vars $crumb.name$ and $crumb.link$, and that the template might work with something like:

$for(crumb)$
   <li><a href="$crumb.link$">$crumb.name$</a></li>
$endfor$

NOTE: I'm asking this (despite the previous answer) because I'm not sure of the degree of complexity that can currently be achieved via CLI vars definition.

Is there currently a way to achieve this via CLI options, or otherwise, besides resorting to a YAML block? I.e., does pandoc accept dot-separated definitions like -V crumb.name=Home?

If it doesn't, I envisage I could resort to either

  1. generating a temporary YAML file, or
  2. creating a var with a raw-HTML string holding the full list of breadcrumb elements and links.
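For reference, the first option could look roughly like the Python sketch below. It is untested against pandoc and makes several assumptions: the key name `crumb`, the file names, and the idea of passing the generated file as an extra input file are illustrative only. The metadata value is written as JSON, which YAML accepts as flow syntax, so no hand-written YAML quoting is needed.

```python
import json
import os
import tempfile
from typing import Dict, List


def metadata_block(key: str, items: List[Dict[str, str]]) -> str:
    """Render items as a YAML metadata block whose value is JSON (valid YAML flow syntax)."""
    return f"---\n{key}: {json.dumps(items)}\n---\n"


def write_metadata_file(text: str) -> str:
    """Write the block to a temporary file and return its path (the caller deletes it)."""
    handle = tempfile.NamedTemporaryFile(
        "w", suffix=".yaml", delete=False, encoding="utf-8"
    )
    with handle:
        handle.write(text)
    return handle.name


crumbs = [
    {"name": "Home", "link": "../../index.html"},
    {"name": "aaa", "link": "../index.html"},
    {"name": "bbb", "link": "index.html"},
]
block = metadata_block("crumb", crumbs)
path = write_metadata_file(block)

# Hypothetical file names; the temporary file is passed as an additional input.
command = ["pandoc", "doc.md", path, "--template", "page.html", "-o", "out.html"]
print(block)

os.remove(path)  # clean up once pandoc has run
```

With metadata shaped like this, a template loop over `$crumb.name$` and `$crumb.link$` would have one object per breadcrumb to work with.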

@jgm
Owner

jgm commented Jun 18, 2017 via email

@mb21
Collaborator

mb21 commented Jun 18, 2017

@tajmone you might also want to look into jekyll (or some other static site generator) to implement more complex layouts...

@jgm jgm added this to the pandoc 2.0 milestone Jun 19, 2017
@tajmone
Contributor Author

tajmone commented Jun 22, 2017

@jgm: yes, currently I've managed to use a template var ($breadcrumbs$, defined via -V) that holds the autogenerated raw-html block for the breadcrumbs — I've even managed to insert tabs and newlines in the block, so the final html source looks clean.

Once complex vars are implemented, I'll change the CMS code so I can use a $for$ loop in the template, which is more elegant and provides a better separation between the CMS and the actual pandoc template ... In the meantime I'm glad I've managed to make it work fine without resorting to temporary files.
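The actual CMS is written in PureBASIC, so the following Python sketch only illustrates the general idea of the raw-HTML workaround; the function name, the `<li>` markup and the file names are assumptions, not the real implementation.

```python
import html
from typing import List, Tuple


def breadcrumbs_html(crumbs: List[Tuple[str, str]], indent: str = "\t") -> str:
    """Return one <li> per (label, link) pair, HTML-escaped and newline-separated."""
    lines = [
        f'{indent}<li><a href="{html.escape(link, quote=True)}">'
        f"{html.escape(label)}</a></li>"
        for label, link in crumbs
    ]
    return "\n".join(lines)


markup = breadcrumbs_html(
    [
        ("Home", "../../index.html"),
        ("aaa", "../index.html"),
        ("bbb", "index.html"),
    ]
)

# Passing an argument list (no shell) avoids having to quote tabs and newlines.
command = ["pandoc", "doc.md", "--template", "page.html", "-V", f"breadcrumbs={markup}"]
print(markup)
```

The template then only needs a single `$breadcrumbs$` placeholder, at the cost of moving markup out of the template and into the generator.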

@mb21:

you might also want to look into jekyll (or some other static site generator) to implement more complex layouts...

... surely Jekyll, pandocomatic or metalsmith would offer more powerful features, but this particular project is about the PureBASIC language, so I thought of making a custom CMS in that language for the sake of coherence. So it's more of a challenge for the language at hand than a technical choice.

It's turning out quite well actually, because I'm tailoring it to my specific needs. Also, it relies heavily on PP macros to handle extended features (task lists, tables with spanning, etc.) and integration with external tools (highlighters, Asciidoc, etc.). Currently, none of the static CMSs I've looked into offers out-of-the-box support for PP (pandocomatic supports preprocessors, though), and writing a plugin for a third party CMS might take more time than writing a simple CMS from scratch in a language I'm familiar with.

If I were to create a general purpose static CMS, extensible and customizable, I'd probably be looking into metalsmith, which already has a pandoc plugin, and start from there. It's completely customizable since it's fully plugin based, and I'd only need to write a PP preprocessing plugin ... I'm actually going to look into it and give it a try in the future.

@jgm jgm removed this from the pandoc 2.0 milestone Jun 27, 2017
@mb21 mb21 changed the title Allow complex metadata fields to be defined on the command line Allow complex metadata fields to be defined on the command line (e.g. using JSON) Jan 15, 2018
@tajmone
Contributor Author

tajmone commented Mar 25, 2018

Just wanted to get an update on this.

Is this feature planned to be implemented sometime in the future?

@mb21
Collaborator

mb21 commented Mar 25, 2018

I'm guessing at some point we'll have to take a long look at all the open issues surrounding variables/metadata/commandline/yaml-block handling and come up with a simple and robust system to replace the current handling... until that time, I wouldn't hold my breath... (but I cannot speak for jgm, of course).

@jgm
Owner

jgm commented Mar 25, 2018 via email

@mb21
Collaborator

mb21 commented May 4, 2018

Closing in favour of #1960

@mb21 mb21 closed this as completed May 4, 2018
@tajmone
Contributor Author

tajmone commented May 4, 2018

Closing in favour of #1960

Proposal #1960 is an excellent enhancement but, being a file-based solution, it doesn't really address the issue of being able to specify complex metadata/vars via CLI options.

I still think that an inline solution should be available — and for this JSON might be a reasonable workaround.

@mb21 mb21 reopened this May 4, 2018
@mb21
Collaborator

mb21 commented May 4, 2018

Okay... sounds just a bit like a headache, escaping things properly... and overlong lines on the commandline...

@nichtich
Contributor

nichtich commented Feb 7, 2019

As of now, it's not possible to specify Markdown elements via the command line; e.g. this would be useful:

pandoc -M 'nocite:@*' --bibliography=references.bib document.md

@brainchild0

Does #5790 resolve this issue or is it still needed?

@nichtich
Copy link
Contributor

I think defaults files solve this issue better.

@brainchild0

brainchild0 commented May 21, 2020

@nichtich: Yes, it is the same. Issue #5790 relates to defaults files. (Project files was a proposed name for a more comprehensive feature. I'll change the name of the issue for clarity.)

@jgm
Owner

jgm commented May 21, 2020

I'd be content to close this. Any objections?

@tajmone
Contributor Author

tajmone commented May 22, 2020

I'd be content to close this. Any objections?

If possible, it would be nice to keep this Issue open, maybe labeling it something like ideas, pending decision, or some other label indicating this feature is still eligible for evaluation (if it gets closed, it would just end up in oblivion, whereas keeping it open might inspire some contributors to have a go at it).

@tarleb
Collaborator

tarleb commented May 22, 2020

FWIW, here's a Lua filter which evaluates specially marked metadata that has been passed through the command line. It allows the use of either JSON or Lua.

--- file: eval-metadata.lua

local yaml_template = "---\nDUMMY: %s\n---"

function evaluate (str)
  local tag, value = str:match('^:(%a*):(.*)')
  if tag == 'lua' then
    return load(
      'return ' .. value,
      "evaluating " .. value
    )()
  elseif tag == 'json' then
    return pandoc.read(yaml_template:format(value)).meta['DUMMY']
  end
  return str
end

function Meta (meta)
  for key, value in pairs(meta) do
    if type(value) == 'string' then
      meta[key] = evaluate(value)
    end
  end
  return meta
end

Example:

pandoc --lua-filter=eval-metadata.lua --to=markdown -s \
    --metadata lua-test=':lua:{one="Hello",two={pandoc.Strong"World!"}}' \
    --metadata json-test=':json:["*emphasized*", "`code`"]' \
    <<< test

Result:

---
json-test:
- '*emphasized*'
- '`code`'
lua-test:
  one: Hello
  two: '**World!**'
---

test
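When the command line is assembled by a script rather than typed by hand, the `:json:` argument for the filter above could be produced from a data structure, which sidesteps most manual escaping. The helper below is a hypothetical sketch; only the `:json:` prefix and the `eval-metadata.lua` file name come from the filter as posted.

```python
import json
from typing import Any, List


def tagged_json_metadata(key: str, value: Any) -> List[str]:
    """Return `--metadata key=:json:<JSON>` for the eval-metadata.lua filter."""
    return ["--metadata", f"{key}=:json:{json.dumps(value)}"]


args = tagged_json_metadata("json-test", ["*emphasized*", "`code`"])

# Argument list for subprocess.run; no shell is involved, so no quoting is needed.
command = ["pandoc", "--lua-filter=eval-metadata.lua", "--to=markdown", "-s", *args]
print(args[1])
```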

@tajmone
Contributor Author

tajmone commented May 22, 2020

@tarleb:

FWIW, here's a Lua filter which evaluates specially marked metadata that has been passed through the command line.

Thanks! that's useful indeed.

@brainchild0

@tajmone: With the recent addition of defaults files, the application now processes simple data from the command line, with the further capability to process more detailed and comprehensive data from a file. This design is rather typical, and use of JSON or similar structured data languages in command-line parameters, except for applications built specifically for filtering or transforming structured data, is not standard in my experience. Reading your request, I am wondering whether using standard input would be a more suitable alternative.

@tajmone
Contributor Author

tajmone commented May 23, 2020

@brainchild0:

except for applications built specifically for filtering or transforming structured data, is not standard in my experience.

I see your point. Maybe I should just close it then.

Reading your request, I am wondering whether using standard input would be a more suitable alternative.

That's a good idea. Also, in the future it would be nice to see a JSON-RPC interface for pandoc, allowing communication via standard I/O, HTTP or WebSockets (an interface protocol that is becoming more popular every day, e.g. in the LSP protocol, or even in apps like Aria2).

@tajmone tajmone closed this as completed May 23, 2020
@brainchild0

@tajmone: It seems to me unlikely that support for full network protocols would align with the objectives, as a standalone application. However, more comprehensive support for interaction through streaming structured JSON data would facilitate integration into a network application, as well as many other automated workflows.

The defaults file evolves in this direction.

See also #6269.
