converting pairwise values into a matrix or semi-matrix #91

avilella · 2019-10-11T14:29:32Z

There may already be a way to do this by combining summary, gather, or collapse, but here is a feature I find myself wanting every once in a while:

https://stackoverflow.com/questions/22492767/converting-pairwise-distances-into-a-distance-matrix-in-r

Take a list of pairwise distances and convert it into a matrix:

A1  A1  0.90
A1  B1  0.85
A1  C1  0.45
A1  D1  0.96
B1  B1  0.90
B1  C1  0.85
B1  D1  0.56
C1  C1  0.55
C1  D1  0.45
D1  D1  0.90

For example, the list above would become:

       A1      B1      C1      D1
A1    0.90    0.85    0.45    0.96
B1            0.90    0.85    0.56
C1                    0.55    0.45
D1                            0.90

avilella · 2019-10-11T15:02:00Z

This is how datamash, a tool similar to csvtk, currently generates this kind of contingency table:

cat data.tsv | datamash crosstab 2,1 unique 3

shenwei356 · 2019-10-11T15:18:06Z

I think this case is too specialized; probably very few people need it. And since datamash already provides this function, there's no need to reinvent the wheel right now.

cwarden · 2022-12-30T12:33:42Z

I don't think the use case is too unusual. tidyr provides spread as the inverse of gather (now pivot_wider and pivot_longer, respectively). The pivot documentation (https://tidyr.tidyverse.org/articles/pivot.html) provides example use cases.

avilella · 2022-12-30T14:31:02Z

I have had use cases for this as well. I'm currently using datamash instead, but it would be nice to have it in csvtk for easier portability.

shenwei356 · 2023-08-11T08:57:02Z

Implemented:

Your data

$ csvtk spread -Ht -k 2 -v 3 data.tsv \
    |  csvtk pretty -t -S bold

┏━━━━┳━━━━━━┳━━━━━━┳━━━━━━┳━━━━━━┓
┃    ┃ A1   ┃ B1   ┃ C1   ┃ D1   ┃
┣━━━━╋━━━━━━╋━━━━━━╋━━━━━━╋━━━━━━┫
┃ A1 ┃ 0.90 ┃ 0.85 ┃ 0.45 ┃ 0.96 ┃
┣━━━━╋━━━━━━╋━━━━━━╋━━━━━━╋━━━━━━┫
┃ B1 ┃      ┃ 0.90 ┃ 0.85 ┃ 0.56 ┃
┣━━━━╋━━━━━━╋━━━━━━╋━━━━━━╋━━━━━━┫
┃ C1 ┃      ┃      ┃ 0.55 ┃ 0.45 ┃
┣━━━━╋━━━━━━╋━━━━━━╋━━━━━━╋━━━━━━┫
┃ D1 ┃      ┃      ┃      ┃ 0.90 ┃
┗━━━━┻━━━━━━┻━━━━━━┻━━━━━━┻━━━━━━┛
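
For readers who want to see the reshaping logic spelled out, here is a minimal Python sketch of a long-to-wide pivot on the pairwise data from this thread. It is an illustration under assumptions, not csvtk's actual implementation: the function name `spread` and its parameters `key_idx`, `value_idx`, and `fill` are made up for this example, and it simply lets the last value win if a row/column pair repeats.

```python
# Illustrative sketch only; NOT csvtk's implementation.
# All names here (spread, key_idx, value_idx, fill) are hypothetical.

def spread(records, key_idx, value_idx, fill=""):
    """Pivot long-format records into a wide table.

    The field at key_idx supplies column names, the field at value_idx
    supplies cell values, and the remaining fields identify the row.
    Rows and columns keep their order of first appearance.
    """
    row_ids = {}    # dicts preserve insertion order, so they act as ordered sets
    col_names = {}
    cells = {}      # (row_id, column name) -> value
    for rec in records:
        row_id = tuple(f for i, f in enumerate(rec)
                       if i not in (key_idx, value_idx))
        key = rec[key_idx]
        row_ids.setdefault(row_id)
        col_names.setdefault(key)
        cells[(row_id, key)] = rec[value_idx]  # last value wins on duplicates
    columns = list(col_names)
    matrix = [list(r) + [cells.get((r, c), fill) for c in columns]
              for r in row_ids]
    return columns, matrix


PAIRS = """\
A1 A1 0.90
A1 B1 0.85
A1 C1 0.45
A1 D1 0.96
B1 B1 0.90
B1 C1 0.85
B1 D1 0.56
C1 C1 0.55
C1 D1 0.45
D1 D1 0.90
"""

records = [line.split() for line in PAIRS.splitlines()]
# Second field (index 1) becomes the column names, third field the values.
columns, matrix = spread(records, key_idx=1, value_idx=2)

print("\t".join([""] + columns))
for row in matrix:
    print("\t".join(row))
```

Missing pairs such as (B1, A1) are left as empty cells, which gives the upper-triangular "semi-matrix" shape shown in the table above.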

Another example: Shuffled columns:

$ csvtk cut -f 1,4,2,3 testdata/names.csv \
  | csvtk pretty -S simple
----------------------------------------
id   username   first_name   last_name
----------------------------------------
11   rob        Rob          Pike
2    ken        Ken          Thompson
4    gri        Robert       Griesemer
1    abc        Robert       Thompson
NA   123        Robert       Abel
----------------------------------------

data -> gather/longer -> spread/wider. Note that the order of both rows and columns is preserved :)

$ csvtk cut -f 1,4,2,3 testdata/names.csv \
    | csvtk gather -k item -v value -f -1 \
    | csvtk spread -k item -v value \
    | csvtk pretty -S simple
----------------------------------------
id   username   first_name   last_name
----------------------------------------
11   rob        Rob          Pike
2    ken        Ken          Thompson
4    gri        Robert       Griesemer
1    abc        Robert       Thompson
NA   123        Robert       Abel
----------------------------------------
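
The round trip above can be sketched in Python to show why the order survives: if both steps record rows and columns in order of first appearance, going wide -> long -> wide returns the original table. This is a hedged illustration, not csvtk's code; the helper names `gather` and `spread` and their parameters (`id_cols`, `id_names`) are hypothetical, and the data is copied from the table printed above.

```python
# Illustrative sketch only; NOT csvtk's implementation.
# gather/spread and their parameters are hypothetical names for this example.

def gather(header, rows, id_cols):
    """Wide -> long: emit one [ids..., item, value] record per non-id cell."""
    long_records = []
    for row in rows:
        ids = [row[i] for i in id_cols]
        for i, name in enumerate(header):
            if i not in id_cols:
                long_records.append(ids + [name, row[i]])
    return long_records


def spread(long_records, id_names, fill=""):
    """Long -> wide: the inverse of gather, keeping first-appearance order."""
    n = len(id_names)
    row_ids, col_names, cells = {}, {}, {}
    for rec in long_records:
        row_id, item, value = tuple(rec[:n]), rec[n], rec[n + 1]
        row_ids.setdefault(row_id)     # ordered set of row identifiers
        col_names.setdefault(item)     # ordered set of column names
        cells[(row_id, item)] = value
    wide_header = list(id_names) + list(col_names)
    wide_rows = [list(r) + [cells.get((r, c), fill) for c in col_names]
                 for r in row_ids]
    return wide_header, wide_rows


header = ["id", "username", "first_name", "last_name"]
rows = [
    ["11", "rob", "Rob", "Pike"],
    ["2", "ken", "Ken", "Thompson"],
    ["4", "gri", "Robert", "Griesemer"],
    ["1", "abc", "Robert", "Thompson"],
    ["NA", "123", "Robert", "Abel"],
]

# Gather every column except the first (like `-f -1`), then spread it back.
long_records = gather(header, rows, id_cols=[0])
wide_header, wide_rows = spread(long_records, id_names=["id"])

print(wide_header == header and wide_rows == rows)
```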

shenwei356 closed this as completed Jul 28, 2022

shenwei356 added the new command label Aug 9, 2023

This was referenced Aug 9, 2023

transpose help #239

Closed

Can you create a comand like spread that can many long line into group short lines with group as column names #236

Closed

shenwei356 added a commit that referenced this issue Aug 11, 2023

new command: spread. #91, #236, #239

d57cc9a

This was referenced Aug 11, 2023

align-center and align-right for specific columns #240

Closed

Update csvtk to v0.27.0 bioconda/bioconda-recipes#42490

Merged

chenrui333 mentioned this issue Aug 15, 2023

csvtk 0.27.0 Homebrew/homebrew-core#139561

Merged
