feature request: new command csv2rst #137

Closed
4 tasks done
peterjc opened this issue Apr 7, 2021 · 3 comments

peterjc commented Apr 7, 2021

Prerequisites

  • make sure you are using the latest version (check with csvtk version)
  • read the usage

Describe your issue

  • describe the problem
  • provide a reproducible example

Description

I would like to be able to turn CSV or TSV files etc into reStructuredText (RST) tables.
See https://docs.anaconda.com/restructuredtext/detailed/#tables or https://docutils.sourceforge.io/docs/user/rst/quickref.html#tables

Example

Consider a simple input table (I personally prefer tabs):

$ csvtk tab2csv names.tsv 
id,first_name,last_name,username
11,Rob,Pike,rob
2,Ken,Thompson,ken
4,Robert,Griesemer,gri
1,Robert,Thompson,abc
NA,Robert,Abel,123

You already support markdown output:

$ csvtk csv2md -t names.tsv 
id |first_name|last_name|username
:--|:---------|:--------|:-------
11 |Rob       |Pike     |rob
2  |Ken       |Thompson |ken
4  |Robert    |Griesemer|gri
1  |Robert    |Thompson |abc
NA |Robert    |Abel     |123

Rendered here by GitHub:

id first_name last_name username
11 Rob Pike rob
2 Ken Thompson ken
4 Robert Griesemer gri
1 Robert Thompson abc
NA Robert Abel 123

My desired functionality is just the simplest RST table markup:

$ csvtk csv2rst -t names.tsv 
== ========== ========= ========
id first_name last_name username
== ========== ========= ========
11 Rob        Pike      rob
2  Ken        Thompson  ken
4  Robert     Griesemer gri
1  Robert     Thompson  abc
NA Robert     Abel      123
== ========== ========= ========
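To make the requested layout concrete, here is a minimal Python sketch of how rows could be turned into an RST simple table. It is not csvtk's implementation; the function name `rows_to_simple_rst` is hypothetical, and the sketch assumes ASCII cells, a header in the first row, and no empty cells in the first column (RST simple tables treat a blank first column as a continuation line).

```python
import csv
import io


def rows_to_simple_rst(rows):
    """Render rows (first row = header) as an RST simple table."""
    # Column width = longest cell in the column (character count, ASCII assumed).
    widths = [max(len(cell) for cell in col) for col in zip(*rows)]
    border = " ".join("=" * w for w in widths)

    def fmt(row):
        # Left-justify every cell; drop trailing padding after the last column.
        return " ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()

    header, *body = rows
    lines = [border, fmt(header), border]
    lines += [fmt(row) for row in body]
    lines.append(border)
    return "\n".join(lines)


# Example input taken from the names table in this issue.
data = """id,first_name,last_name,username
11,Rob,Pike,rob
2,Ken,Thompson,ken
NA,Robert,Abel,123
"""
rows = list(csv.reader(io.StringIO(data)))
print(rows_to_simple_rst(rows))
```

The same borders are reused above the header, below the header, and at the end, which is all the simple-table syntax needs.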

That is close to the existing pretty output:

$ csvtk pretty -t names.tsv -s " "
...

I have not looked at the rest of csvtk closely enough to say whether an RST grid table would also be worthwhile. It is more verbose, but allows richer cell content:

+----+------------+-----------+----------+
| id | first_name | last_name | username |
+====+============+===========+==========+
| 11 | Rob        | Pike      | rob      |
+----+------------+-----------+----------+
| 2  | Ken        | Thompson  | ken      |
+----+------------+-----------+----------+
| 4  | Robert     | Griesemer | gri      |
+----+------------+-----------+----------+
| 1  | Robert     | Thompson  | abc      |
+----+------------+-----------+----------+
| NA | Robert     | Abel      | 123      |
+----+------------+-----------+----------+
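The grid layout above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not csvtk's code: `rows_to_grid_rst` is a hypothetical name, cells are assumed to be single-line ASCII strings, and row or column spans are not handled.

```python
def rows_to_grid_rst(rows, header=True):
    """Render rows as an RST grid table; the first row is the header if header=True."""
    # Column width = longest cell in the column (character count, ASCII assumed).
    widths = [max(len(cell) for cell in col) for col in zip(*rows)]

    def rule(ch):
        # A horizontal rule: each column is its width plus one space of padding per side.
        return "+" + "+".join(ch * (w + 2) for w in widths) + "+"

    def fmt(row):
        return "| " + " | ".join(c.ljust(w) for c, w in zip(row, widths)) + " |"

    lines = [rule("-")]
    for i, row in enumerate(rows):
        lines.append(fmt(row))
        # Only the rule directly under the header row uses "=".
        lines.append(rule("=" if header and i == 0 else "-"))
    return "\n".join(lines)


print(rows_to_grid_rst([["id", "first_name"], ["11", "Rob"], ["2", "Ken"]]))
```

Because every row is closed by its own rule, this form also works for a table with a single row or without a header.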
shenwei356 added a commit that referenced this issue Apr 8, 2021
@shenwei356
Owner

Here it is.

But it can't handle Unicode (UTF-8) well, tested at http://rst.aaroniles.net/

Usage

convert CSV to readable aligned table

Attention:

  1. row span is not supported.

Usage:
  csvtk csv2rst [flags]

Flags:
  -k, --cross string               character of cross (default "+")
  -s, --header string              character of separator between header row and data rows (default "=")
  -h, --help                       help for csv2rst
  -b, --horizontal-border string   character of horizontal border (default "-")
  -p, --padding string             character of padding (default " ")
  -B, --vertical-border string     character of vertical border (default "|")

Example

  1. With header row

     $ csvtk csv2rst testdata/names.csv 
     +----+------------+-----------+----------+
     | id | first_name | last_name | username |
     +====+============+===========+==========+
     | 11 | Rob        | Pike      | rob      |
     +----+------------+-----------+----------+
     | 2  | Ken        | Thompson  | ken      |
     +----+------------+-----------+----------+
     | 4  | Robert     | Griesemer | gri      |
     +----+------------+-----------+----------+
     | 1  | Robert     | Thompson  | abc      |
     +----+------------+-----------+----------+
     | NA | Robert     | Abel      | 123      |
     +----+------------+-----------+----------+
    
  2. No header row

     $ csvtk csv2rst -H -t  testdata/digitals.tsv 
     +---+-------+---+
     | 4 | 5     | 6 |
     +---+-------+---+
     | 1 | 2     | 3 |
     +---+-------+---+
     | 7 | 8     | 0 |
     +---+-------+---+
     | 8 | 1,000 | 4 |
     +---+-------+---+
    
  3. Misc

     $ cat testdata/names.csv | head -n 1 | csvtk csv2rst 
     +----+------------+-----------+----------+
     | id | first_name | last_name | username |
     +====+============+===========+==========+
     
     $ cat testdata/names.csv | head -n 1 | csvtk csv2rst -H
     +----+------------+-----------+----------+
     | id | first_name | last_name | username |
     +----+------------+-----------+----------+
     
     $ echo | csvtk csv2rst -H
     [ERRO] xopen: no content
     
     $ echo "a" | csvtk csv2rst -H
     +---+
     | a |
     +---+
     
     # some online RST renderers report an error for this
     $ echo "沈伟" | csvtk csv2rst -H
     +--------+
     | 沈伟 |
     +--------+
    

peterjc commented Apr 8, 2021

Wow - that was quick. Thank you.

The "rich" RST table syntax does seem more flexible, and will cope with single rows etc. So I understand why you might prefer that.

I was not aware of the RST table issue with Unicode. It also happens on https://livesphinx.herokuapp.com/, another online test system for RST that uses Sphinx. Sometimes it works, e.g. with an emoji or a kanji:

+------------+
| Hello.     |
+----+-------+
| 😀 | Smile |
+----+-------+

+-----------+
| Hello.    |
+----+------+
| 金 | Gold |
+----+------+

I tried some Japanese hiragana and, as with your example, could not get it to work regardless of the number of spaces (in case it was a length issue). I think this is probably a bug in the docutils library, perhaps the same thing this Sphinx user was asking about: sphinx-doc/sphinx#6702

In other words, the Unicode issue is out of your hands.

@shenwei356
Owner

The unicode issue is fixed thanks to the go-runewidth package.

$ csvtk csv2rst testdata/unicode.csv 
+-------+---------+
| value | name    |
+=======+=========+
| 1     | 沈伟    |
+-------+---------+
| 2     | 沈伟b   |
+-------+---------+
| 3     | 沈小伟  |
+-------+---------+
| 4     | 沈小伟b |
+-------+---------+
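The idea behind the fix is to pad cells by display width rather than by byte or character count. The thread only says that go-runewidth is used, so the following Python sketch is an illustration of the principle with the standard library, not a port of csvtk or go-runewidth. The helper names `display_width` and `pad` are hypothetical, and the width rule is simplified: it ignores ambiguous-width characters and multi-code-point emoji sequences.

```python
import unicodedata


def display_width(text):
    """Approximate the number of terminal columns a string occupies."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn on the previous character.
            continue
        # Wide (W) and Fullwidth (F) characters take two columns, others one.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def pad(text, width, fill=" "):
    """Right-pad text so that it occupies `width` display columns."""
    return text + fill * max(0, width - display_width(text))


name = "沈伟"
print(len(name.encode("utf-8")))  # number of UTF-8 bytes
print(len(name))                  # number of code points
print(display_width(name))        # number of display columns

# Pad to the width of the longest cell in the example column.
column = ["name", "沈伟", "沈伟b", "沈小伟", "沈小伟b"]
target = max(display_width(cell) for cell in column)
for cell in column:
    print("| " + pad(cell, target) + " |")
```

The three measures differ for CJK text, which is why a border sized from the byte count comes out too long. The earlier misaligned output is consistent with that, although the thread does not state how the width was originally computed.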
