Read-only file system. joblib will operate in serial mode (after updating singularity)

#1

Previously, we were running fMRIPrep 1.3.2 with Singularity 2.4, but it had an odd behavior: several hours after logging out of our remote connection to the server, the processes seemingly paused. Upon logging back in, the terminal showed continued (resumed) activity. We checked our resource/activity logs to verify, and that is indeed what appeared to be happening.
Thinking it had something to do with the terminal, we started a job from a terminal, suspended it with Ctrl+Z, resumed it in the background with bg, used jobs to get the job number, and then disowned the job from the terminal. Doing this seemed to side-step the issue for a couple of subjects we ran, but we still searched for a better solution.
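In shell terms, the detach sequence was roughly the following. Ctrl+Z and bg only work interactively, so this sketch uses nohup as the scriptable equivalent, with a stand-in echo command in place of the real fmriprep invocation:

```shell
# Stand-in for the real fmriprep command; nohup keeps it running after logout
# by detaching it from the terminal's hangup signal.
nohup sh -c 'echo fmriprep-stand-in-finished' > job.log 2>&1 &

# Equivalent of the Ctrl+Z / bg / jobs / disown dance: disown removes the job
# from the shell's job table so it receives no SIGHUP when the shell exits.
# (No-op guard for shells without the disown builtin.)
disown 2>/dev/null || true

sleep 1                 # give the stand-in job time to finish
cat job.log             # output survives even if the terminal is closed
```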

We saw suggestions to update our version of Singularity to remedy the problem.
Upon updating to Singularity 3.2.0, we are now greeted with a new issue (and we aren't certain that the original problem has been solved either).
Shortly after starting a job, we now see these errors:

/usr/local/miniconda/lib/python3.7/site-packages/sklearn/externals/joblib/_multiprocessing_helpers.py:28: UserWarning: [Errno 30] Read-only file system. joblib will operate in serial mode
warnings.warn('%s. joblib will operate in serial mode' % (e,))
/usr/local/miniconda/lib/python3.7/importlib/_bootstrap.py:219: ImportWarning: can't resolve package from __spec__ or __package__, falling back on __name__ and __path__
return f(*args, **kwds)
/usr/local/miniconda/lib/python3.7/importlib/_bootstrap.py:219: ImportWarning: can't resolve package from __spec__ or __package__, falling back on __name__ and __path__
return f(*args, **kwds)
/usr/local/miniconda/lib/python3.7/importlib/_bootstrap.py:219: ImportWarning: can't resolve package from __spec__ or __package__, falling back on __name__ and __path__
return f(*args, **kwds)
/usr/local/miniconda/lib/python3.7/importlib/_bootstrap.py:219: ImportWarning: can't resolve package from __spec__ or __package__, falling back on __name__ and __path__
return f(*args, **kwds)
/usr/local/miniconda/lib/python3.7/site-packages/nilearn/datasets/neurovault.py:16: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
from collections import Container
/usr/local/miniconda/lib/python3.7/site-packages/skimage/__init__.py:80: ResourceWarning: unclosed file <_io.TextIOWrapper name='/usr/local/miniconda/lib/python3.7/site-packages/pytest.py' mode='r' encoding='utf-8'>
imp.find_module('pytest')

There are additional errors; I can post more output, but there is a lot of it.
Does anyone have any ideas or suggestions?
Is there currently a known optimal combination of Singularity + fMRIPrep, or known issues with the combination we have?
Thanks,
Nick

#2

Hi Nick,

after several hours from logging out of our remote connection to the server, the processes seemingly paused. Upon logging back in, the terminal showed continued (resumed) activity.

Could you elaborate on your process here? What indicated that the process paused, and how are you re-connecting to the terminal? Sorry if I’m being a bit obtuse here, but I don’t see enough to assess what might be going on. Some processes take a long time (e.g., FreeSurfer can take multiple hours), during which there will be very little terminal output. I can’t immediately think of a situation that would cause a change based on being logged in.

As to all of the printed warnings, unfortunately, some of our dependencies seem to interfere with the warnings filters we place to try to reduce them. They’re annoying but not indicative of a problem. Are you seeing actual failures?

Best,
Chris

#3

Hey Chris,

I'm working with Nick, and we are running these processes on a server where I log system usage. When we close out a session, we can see the system usage drop to zero; when we log back in, it ramps back up to what we normally see while running fMRIPrep.
Under normal circumstances our data takes about 15 hours to go through fMRIPrep, but when we try to run jobs over the weekend, the job won't complete until the user resumes the session, at which point it picks up where it left off.

We think this is more of an issue with Singularity 2.x.x than with fMRIPrep.
We've found suggestions that upgrading Singularity to a 3.x.x version would help with this issue, but after upgrading we see the 'Read-only' error messages above.
The process then stops within about a minute after throwing the following:

Captured warning (<class 'UserWarning'>): [Errno 30] Read-only file system.  joblib will operate in serial mode
Captured warning (<class 'ImportWarning'>): can't resolve package from __spec__ or __package__, falling back on __name__ and __path__
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \k
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \s
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \ 
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \.
Captured warning (<class 'ImportWarning'>): can't resolve package from __spec__ or __package__, falling back on __name__ and __path__
Captured warning (<class 'ImportWarning'>): can't resolve package from __spec__ or __package__, falling back on __name__ and __path__
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \#
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \c
Captured warning (<class 'DeprecationWarning'>): Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working

Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \s
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \g
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \.
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \d
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \{
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \%
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \S
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \|
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \[
Captured warning (<class 'ResourceWarning'>): unclosed file <_io.TextIOWrapper name='/usr/local/miniconda/lib/python3.7/site-packages/pytest.py' mode='r' encoding='utf-8'>
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \A
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \h
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \s
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \*
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \m
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \-
Captured warning (<class 'DeprecationWarning'>): invalid escape sequence \D
Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/miniconda/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/miniconda/lib/python3.7/multiprocessing/managers.py", line 577, in _run_server
    server.serve_forever()
  File "/usr/local/miniconda/lib/python3.7/multiprocessing/managers.py", line 173, in serve_forever
    sys.exit(0)
SystemExit: 0
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/multiprocessing/util.py", line 265, in _run_finalizers
    finalizer()
  File "/usr/local/miniconda/lib/python3.7/multiprocessing/util.py", line 189, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/usr/local/miniconda/lib/python3.7/shutil.py", line 485, in rmtree
    _rmtree_safe_fd(fd, path, onerror)
  File "/usr/local/miniconda/lib/python3.7/shutil.py", line 443, in _rmtree_safe_fd
    onerror(os.unlink, fullname, sys.exc_info())
  File "/usr/local/miniconda/lib/python3.7/shutil.py", line 441, in _rmtree_safe_fd
    os.unlink(entry.name, dir_fd=topfd)
OSError: [Errno 16] Device or resource busy: '.nfs000000000cb6003f00003f66'
Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/plugins/multiproc.py", line 136, in __init__
    mp_context = mp.context.get_context(
AttributeError: module 'multiprocessing.context' has no attribute 'get_context'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/miniconda/bin/fmriprep", line 11, in <module>
    load_entry_point('fmriprep==1.3.2', 'console_scripts', 'fmriprep')()
  File "/usr/local/miniconda/lib/python3.7/site-packages/fmriprep/cli/run.py", line 436, in main
    fmriprep_wf.run(**plugin_settings)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/workflows.py", line 583, in run
    runner = plugin_mod(plugin_args=plugin_args)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/plugins/multiproc.py", line 144, in __init__
    self.pool = ProcessPoolExecutor(max_workers=self.processors)
  File "/usr/local/miniconda/lib/python3.7/concurrent/futures/process.py", line 542, in __init__
    pending_work_items=self._pending_work_items)
  File "/usr/local/miniconda/lib/python3.7/concurrent/futures/process.py", line 158, in __init__
    super().__init__(max_size, ctx=ctx)
  File "/usr/local/miniconda/lib/python3.7/multiprocessing/queues.py", line 42, in __init__
    self._rlock = ctx.Lock()
  File "/usr/local/miniconda/lib/python3.7/multiprocessing/context.py", line 67, in Lock
    return Lock(ctx=self.get_context())
  File "/usr/local/miniconda/lib/python3.7/multiprocessing/synchronize.py", line 162, in __init__
    SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)
  File "/usr/local/miniconda/lib/python3.7/multiprocessing/synchronize.py", line 59, in __init__
    unlink_now)

OSError: [Errno 30] Read-only file system
Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/multiprocessing/util.py", line 265, in _run_finalizers
    finalizer()
  File "/usr/local/miniconda/lib/python3.7/multiprocessing/util.py", line 189, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/usr/local/miniconda/lib/python3.7/shutil.py", line 489, in rmtree
    onerror(os.rmdir, path, sys.exc_info())
  File "/usr/local/miniconda/lib/python3.7/shutil.py", line 487, in rmtree
    os.rmdir(path)
OSError: [Errno 39] Directory not empty: '/tmp/pymp-us46vzab'

I guess our main question is: are there any known compatibility issues between fMRIPrep 1.3.2 and Singularity 3.2.0 that could be causing these errors?

Thanks,
Mitch

#4

Can you try including -B /tmp:/tmp in your singularity run call? I suspect that /tmp does not get bound automatically, and the container's default /tmp is not writable.
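A quick sanity check for this is to test whether a mount point is actually writable. The sketch below is plain shell; to run it inside the container you would wrap it with `singularity exec <your-image> sh -c '…'` (the helper name is just for illustration):

```shell
# Hypothetical helper: try to create and remove a file in the given directory.
check_writable() {
  dir="$1"
  if touch "$dir/.writetest" 2>/dev/null; then
    rm -f "$dir/.writetest"
    echo "$dir is writable"
  else
    echo "$dir is read-only"
  fi
}

# On the host this should normally report writable; inside a container whose
# /tmp was not bound, it would report read-only.
check_writable /tmp
```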

#5

Sorry, we should have included the original fMRIPrep call:
The /tmp directory is just where we bound our local directory. We've tried binding to /mnt as well, with no luck.

singularity run --cleanenv \
-B /scratch1/bloomn/ECDM/DOWNLOADS/1002_121018:/tmp \
/data/nil-bluearc/ccp-hcp/SingularityImages/fmriprep-1.3.2.simg \
--fs-license-file /tmp/.license/freesurfer/license.txt \
-w /tmp /tmp/BIDS /tmp/derivatives participant \
--participant_label 1002 \
--output-space template fsaverage5 \
--template MNI152NLin2009cAsym \
--template-resampling-grid native \
--n-cpus 16 --omp-nthreads 4 --mem-mb 64000 -v
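Following the suggestion above, one variant we could try next keeps the container's /tmp writable by binding the host /tmp, and binds the data elsewhere instead of over /tmp (a sketch only; the /mnt mount point and work-directory layout are assumptions):

```
singularity run --cleanenv \
-B /tmp \
-B /scratch1/bloomn/ECDM/DOWNLOADS/1002_121018:/mnt \
/data/nil-bluearc/ccp-hcp/SingularityImages/fmriprep-1.3.2.simg \
--fs-license-file /mnt/.license/freesurfer/license.txt \
-w /mnt/work /mnt/BIDS /mnt/derivatives participant \
--participant_label 1002 \
--output-space template fsaverage5 \
--template MNI152NLin2009cAsym \
--template-resampling-grid native \
--n-cpus 16 --omp-nthreads 4 --mem-mb 64000 -v
```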