Are there recommended ways of storing dtype information alongside BIDS tabular files?

I am preparing a BIDS dataset and would like to make it easy to convert the tabular files into a binary format later, which in particular means recording dtype information. Are there recommended ways or existing examples of encoding data type information in the JSON sidecars, for instance Int64 vs Int32 vs Float32? For date[times] this is straightforward, because only one format is allowed. Categorical columns are also straightforward: the levels can be listed, and the Description field can say whether the levels are ordered. For the other types, I was about to put the information into the Description field in a semi-structured way (example below). I would be curious to hear about other approaches.

{
    "num_vert": {
        "LongName": "NumVert",
        "Description": "Number of Vertices. dtype: int32."
    }
}
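
A convention like this can at least be read back mechanically. The following is a minimal sketch, assuming the `dtype: <name>` suffix shown above is used consistently; the helper name `dtypes_from_sidecar` and the pattern are hypothetical and not part of BIDS.

```python
import json
import re

# Assumed convention (not part of BIDS): "dtype: <name>" inside Description.
DTYPE_PATTERN = re.compile(r"dtype:\s*([A-Za-z0-9_]+)")


def dtypes_from_sidecar(sidecar):
    """Map each column name to the dtype string found in its Description."""
    dtypes = {}
    for column, definition in sidecar.items():
        match = DTYPE_PATTERN.search(definition.get("Description", ""))
        if match:
            dtypes[column] = match.group(1)
    return dtypes


sidecar = json.loads("""
{
    "num_vert": {
        "LongName": "NumVert",
        "Description": "Number of Vertices. dtype: int32."
    }
}
""")
print(dtypes_from_sidecar(sidecar))
```

The resulting mapping could then be handed to a reader that accepts per-column dtypes, for example the `dtype` argument of `pandas.read_csv`.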

Nope. But I think it would be reasonable to propose.

Right now, the way we’re thinking about validating these is (with json_def being the sidecar entry for one column):

if "Delimiter" in json_def:
    # Delimiter indicates the value must be parsed. For BIDS purposes,
    # this is a string, even if the parsed array is of numbers.
    typename = "string"
elif "Levels" in json_def:
    # JSON keys are always strings.
    typename = "string"
elif "Units" in json_def:
    # Values with units are always (any exceptions?) numbers.
    typename = "number"
else:
    typename = "string or number"

But the validation is just “Does the value match the regular expression for this typename?” We don’t actually load the values.
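
As a rough illustration of that kind of check, here is a minimal sketch. The regular expressions are placeholders, not the validator's actual patterns, and the function names `infer_typename` and `column_matches` are hypothetical.

```python
import re

# Illustrative patterns only; the real validator's expressions may differ.
NUMBER = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"
PATTERNS = {
    "string": re.compile(r".*"),
    "number": re.compile(NUMBER),
    "string or number": re.compile(r".*"),
}


def infer_typename(json_def):
    """Pick a typename from the sidecar entry, following the rules above."""
    if "Delimiter" in json_def:
        return "string"
    if "Levels" in json_def:
        return "string"
    if "Units" in json_def:
        return "number"
    return "string or number"


def column_matches(values, json_def):
    """Check raw TSV cell strings against the pattern; values are never parsed."""
    pattern = PATTERNS[infer_typename(json_def)]
    # "n/a" is the BIDS marker for missing values, so it is always accepted.
    return all(v == "n/a" or pattern.fullmatch(v) for v in values)


print(column_matches(["1", "2.5", "n/a"], {"Units": "mm"}))
print(column_matches(["1", "abc"], {"Units": "mm"}))
```

Because only the text of each cell is matched, a check like this cannot distinguish Int32 from Int64 or Float32; that would need an explicit dtype field.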

We can argue over the specific values, but I think it’d be worth opening an issue on the spec to see if anybody has objections. In the meantime, obviously, do what’s useful for you.
