pyarrow.lib.ArrowNotImplementedError: Unsupported cast from list<item: struct<text: string, label: string>> to struct using function cast_struct
The non-sequence case (dict[str, str] instead of list[dict[str, str]]) works as expected, which makes me believe that the sequence case is a bug and that the behavior I propose should be the expected one.
ain-soph changed the title from "dataset_info sequence format unexpected behavior in README.md YAML" to "[BUG] dataset_info sequence unexpected behavior in README.md YAML" on Sep 9, 2024
Describe the bug
When working on the dataset_info YAML, I find that my data column with the format list[dict[str, str]] cannot be encoded correctly.
My data looks like
My dataset_info in README.md is:
Error log:
Potential Reason
After some analysis, it turns out that my YAML config requires dict[str, list[str]] instead of list[dict[str, str]]. It would work if I change my data to
The following 2 different dataset_info are actually equivalent.
Steps to reproduce the bug
Expected behavior
It should work with the following data format:
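As an illustration only, here is a minimal plain-Python sketch of the two layouts discussed in this report: the row-oriented list[dict[str, str]] that is expected to work, and the column-oriented dict[str, list[str]] that the config currently requires. The field names text and label come from the error message; the sample values and the helper names rows_to_columns and columns_to_rows are hypothetical and are not part of datasets or pyarrow.

```python
from typing import Dict, List

# Hypothetical sample values; only the field names "text" and "label"
# are taken from the error message in this report.
rows: List[Dict[str, str]] = [
    {"text": "hello", "label": "greeting"},
    {"text": "bye", "label": "farewell"},
]


def rows_to_columns(items: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Turn list[dict[str, str]] into dict[str, list[str]].

    Assumes every item carries the same keys.
    """
    columns: Dict[str, List[str]] = {}
    for item in items:
        for key, value in item.items():
            columns.setdefault(key, []).append(value)
    return columns


def columns_to_rows(columns: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Turn dict[str, list[str]] back into list[dict[str, str]]."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*(columns[k] for k in keys))]


# Column-oriented form of the same data, as a possible workaround.
columns = rows_to_columns(rows)
print(columns)
# The conversion is lossless when all rows share the same keys.
print(columns_to_rows(columns) == rows)
```

Converting the data this way only works around the problem; it does not change how the sequence entry in dataset_info is interpreted.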
Environment info
datasets version: 2.21.0
huggingface_hub version: 0.24.5
fsspec version: 2024.6.1