Describe the bug
I'm not sure if this is a bug.
FeatureType objects such as Value and ClassLabel expose a dtype attribute, but that attribute is missing when the feature is a Sequence or a List.
When I try to run a multilabel classification with this example script from the transformers library:
https://github.com/huggingface/transformers/blob/main/examples/pytorch/text-classification/run_classification.py#L442
I get this error on the linked line:
AttributeError: 'List' object has no attribute 'dtype'. Did you mean: '_type'?
Looking at the check the script is trying to perform, could we perhaps add a self.dtype="list" attribute to the list-like FeatureTypes (Sequence, List, etc.)?
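In the meantime, a caller-side workaround might look like the sketch below. It assumes the script only needs to know whether the label feature is list-like, and it treats a missing dtype as "list". The helper name feature_dtype and the FakeValue / FakeList classes are hypothetical stand-ins (not part of datasets), used so the snippet runs without the library installed.

```python
from dataclasses import dataclass
from typing import Any


def feature_dtype(feature: Any, default: str = "list") -> str:
    """Return feature.dtype when it exists, otherwise the fallback string."""
    return getattr(feature, "dtype", default)


# Hypothetical stand-ins that mimic the behavior reported above:
# a Value-like feature has `dtype`, a List-like feature does not.
@dataclass
class FakeValue:
    dtype: str


@dataclass
class FakeList:
    feature: Any  # deliberately no `dtype` attribute


features = {"text": FakeValue("string"), "label": FakeList(FakeValue("int64"))}

# The kind of check a multilabel script could perform without raising AttributeError.
is_multi_label = feature_dtype(features["label"]) == "list"

print(feature_dtype(features["text"]))
print(is_multi_label)
```

Note that this fallback reports "list" for any feature lacking dtype, not only Sequence or List, so a proper fix inside datasets would still be preferable.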
Steps to reproduce the bug
For example, this code works for me:
from datasets import ClassLabel, Features, Sequence, Value
features = {'text': Value('string'), 'label': ClassLabel(names=['No', 'Yes'])}
print(features["text"].dtype)
print(features["label"].dtype)
and this code does not work for me:
from datasets import ClassLabel, Features, Sequence, Value
features = {'text': Value('string'), 'label': Sequence(ClassLabel(names=['No', 'Yes']))}
print(features["label"].dtype) # it could be equal to "list"?
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'List' object has no attribute 'dtype'. Did you mean: '_type'?
Expected behavior
The dtype attribute should be equal to "list" for features created with Sequence.
from datasets import ClassLabel, Features, Sequence, Value
features = {'text': Value('string'), 'label': Sequence(ClassLabel(names=['No', 'Yes']))}
print(features["label"].dtype)  # expected to print: list
Environment info
I have installed datasets==4.5.0.