Get index from iterables

Johannes · September 11, 2017, 3:35pm

Hi,
— possibly this is a trivial/idiotic question —
but I have the following problem; I would like to use the index instead of the value to name my output files.

I have a node with iterable,
bet = pe.Node(fsl.BET()…
bet.iterables = ("frac", np.linspace(0.1, 0.9, 9))
and I want to connect it to a datasink, e.g.
workflow.connect(bet, 'out_file', sink, '@in_file.@final')

so… is there a way that the index of the iterator (e.g. 0 to 8 in this case) is used for the output sink directory names/file names (instead of the VALUE)?

In other words, instead of output folders like
_frac_0.100000000000000000001
_frac_0.200000000000000000001

I would like to get
frac_0
frac_1

this is just an example, but I feel it would be really handy for many things!

Thanks!
Johannes

djarecka · September 11, 2017, 4:25pm

I guess you can use substitution for this.

Johannes · September 12, 2017, 8:07am

thanks! I played around with regex and substitutions before, the best I could achieve was to substitute the iterable NAME, e.g. instead of
_frac_0.100000000000000000001
I could have
blabla_0.100000000000000000001

however I’d love to have
blabla_{iterable_index}
where {iterable_index} is an element of [0,1,2,3…].

any ideas welcome

djarecka · September 12, 2017, 5:35pm

If you have a list, let’s say ll = [100, 101, 102], you can write the substitutions like this:

sub = [('_frac_{}'.format(el), '_frac_{}'.format(i)) for i, el in enumerate(ll)]

and

datasink.inputs.substitutions = sub

In your case, I’m not sure if this will work due to the precision of the numbers that are in your names. If you know the precision that is used (or you’re willing to set one), something like this might work:

sub = [('_frac_{:10.9f}'.format(el), '_frac_{}'.format(i)) for i, el in enumerate(ll)]
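To check what each pattern actually looks like as a string, here is a quick sketch that prints both variants. The value list `ll` is hypothetical, chosen to resemble the fractions in the question. A plain substitution only helps if the pattern matches the folder name character for character, so the fixed-precision form is useful only when the folder names carry the same number of decimals.

```python
# Hypothetical iterable values resembling the fractions from the question.
ll = [0.1, 0.2, 0.35]

# Default formatting: uses Python's shortest repr of each float.
plain = ['_frac_{}'.format(el) for el in ll]

# Fixed precision: always nine digits after the decimal point.
fixed = ['_frac_{:10.9f}'.format(el) for el in ll]

for p, f in zip(plain, fixed):
    print(p, '|', f)
```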

If this also doesn’t work, there is always regexp_substitutions. I’m not really a regex master, but I believe this, or something similar, should work in your case:

sub_reg = [(r'_frac_{:2.1f}\d+'.format(el), '_frac_{}'.format(i)) for i, el in enumerate(ll)]
datasink.inputs.regexp_substitutions = sub_reg
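The regex idea can be tried outside of a workflow. The sketch below mimics it with `re.sub`, assuming the sink applies each (pattern, replacement) pair in order. The folder names and the helper `apply_regexp_subs` are made up for illustration. Compared with the pattern above, it escapes the decimal point and uses `\d*` so that a name without trailing digits also matches.

```python
import re

# Hypothetical iterable values and the folder names they might produce.
values = [0.1, 0.2, 0.3]
folders = ['_frac_0.10000000000000001', '_frac_0.2', '_frac_0.30000000000000004']

# One regex per value: first decimal fixed, any further digits swallowed.
regexp_subs = [
    (r'_frac_' + re.escape('{:.1f}'.format(v)) + r'\d*', '_frac_{}'.format(i))
    for i, v in enumerate(values)
]

def apply_regexp_subs(name, subs):
    """Apply each (pattern, replacement) pair in order (illustrative helper)."""
    for pattern, replacement in subs:
        name = re.sub(pattern, replacement, name)
    return name

renamed = [apply_regexp_subs(f, regexp_subs) for f in folders]
print(renamed)
```

Because only the first decimal is fixed, this pattern cannot tell 0.30 from 0.35; both would be caught by the pattern built for 0.3.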

Let me know if this solves your problem!

Johannes · September 13, 2017, 10:10am

Hi Dorota,

that’s a clever trick, thanks! I wasn’t aware that lists are a possible substitution!

It works perfectly fine for the following scenario:

iter_frac = [0.10,0.20,0.30,0.40]
sub = [('_frac_{}'.format(el), '_test_{}'.format(i)) for i, el in enumerate(iter_frac)]

resulting in four folders:

 _test_0  _test_1  _test_2  _test_3

which is great

however, interesting things happen if two values share the same first decimal, e.g. if I change the last one:

iter_frac = [0.10,0.20,0.30,0.35]

which results in

_test_0  _test_1  _test_2  _test_25

the last one is sad

this being said our substitution list sub actually appears healthy:

[('_frac_0.1', '_test_0'),
 ('_frac_0.2', '_test_1'),
 ('_frac_0.3', '_test_2'),
 ('_frac_0.35', '_test_3')]

If I use np.linspace() to specify the iterable instead, I’m bound to be running into float precision issues, as 0.1 is slightly off (0.1000000000000001). np.around() also doesn’t seem to be helping here. Thus, the original values are off and a simple substitution as before doesn’t do the trick… playing with the precision here did not help unfortunately.

Regex comes a bit closer, however some names are not substituted, for instance 0.35 may have become 0.34999999999999998 in the world of floats, which can only be rescued if I sacrifice precision (and not being able to differentiate say 0.30 and 0.35). Or I am still missing something… I’ll look deeper into regex later.

Anyhow, thanks a lot for sharing your python wisdom, I am sure I will use lists as substitutions at some point in the future!

djarecka · September 20, 2017, 12:57pm

The first issue can be solved by changing the precision in the string formatting (so that it checks two digits after the decimal point); you can read more here.
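The `_test_25` folder is what one would expect if the pairs are applied in order as plain string replacements: `_frac_0.3` is a prefix of `_frac_0.35`, so the earlier pair fires first and leaves the trailing `5` behind. The sketch below reproduces this with `str.replace`, which is an assumption about how the sink behaves, and the helper `apply_subs` is made up for illustration. It also shows a different workaround from the precision change suggested above: sorting the pairs so that longer patterns are tried first.

```python
iter_frac = [0.10, 0.20, 0.30, 0.35]
sub = [('_frac_{}'.format(el), '_test_{}'.format(i)) for i, el in enumerate(iter_frac)]

def apply_subs(name, subs):
    """Apply each (old, new) pair in order as a plain string replacement."""
    for old, new in subs:
        name = name.replace(old, new)
    return name

# Original order: '_frac_0.3' matches inside '_frac_0.35' before the full pattern is tried.
collided = apply_subs('_frac_0.35', sub)

# Longest pattern first: '_frac_0.35' is replaced as a whole.
longest_first = sorted(sub, key=lambda pair: len(pair[0]), reverse=True)
fixed = apply_subs('_frac_0.35', longest_first)

print(collided, fixed)
```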

The second problem (i.e. 0.34999999999999998 vs 0.3500000000000001) I’m not sure how to handle easily in one line using regex. I believe it might be easier to force lower precision in the iterables.

Johannes · October 4, 2017, 1:11pm

thanks! meanwhile I’ve adopted a more lazy approach, which works fine.

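import numpy as np

nmb_iter = 50
x = np.linspace(0.1, 0.9, nmb_iter)
iter_frac = []
for i in range(nmb_iter):
    # round-trip through a 3-decimal string to drop the float noise
    # (np.float was removed in NumPy 1.24; the built-in float does the same)
    iter_frac.append(float('%0.3f' % x[i]))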

the float madness seems to be resolved with this approach!
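Putting the two ideas from this thread together, a minimal sketch could round the values first and then build the index substitutions from the rounded list. To keep the example self-contained, the evenly spaced values are computed in plain Python as a stand-in for `np.linspace`, and `round(v, 3)` plays the role of the `'%0.3f'` round-trip above. Assigning the results to `bet.iterables` and to the sink's `substitutions` is left out here.

```python
n = 9

# Evenly spaced values from 0.1 to 0.9 (stand-in for np.linspace(0.1, 0.9, 9)).
raw = [0.1 + i * (0.9 - 0.1) / (n - 1) for i in range(n)]

# Rounding to three decimals removes the float noise from the values.
iter_frac = [round(v, 3) for v in raw]

# Map each value-based folder name to its index.
sub = [('_frac_{}'.format(v), '_frac_{}'.format(i)) for i, v in enumerate(iter_frac)]

print(raw[2], '->', iter_frac[2])
print(sub[:3])
```

Since the folder names are derived from the same rounded values that are passed as iterables, the plain substitutions should line up with them. If two values share a prefix (such as 0.3 and 0.35), the ordering of the pairs still matters.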