The User-friendly Deep Learning architecture consists of three components:
The backend provides the REST API for managing datasets, job templates, jobs (and their output), and the system configuration in general.
By default, the backend stores data files on disk. Because storage is accessed through an abstraction layer, the storage system can easily be swapped out.
All other system-related information (job templates, etc.) is stored in a database. Currently, SQLite and MySQL are supported.
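To illustrate how a swappable storage layer of this kind might look, here is a minimal Python sketch. The names `Storage`, `DiskStorage`, `InMemoryStorage`, and `roundtrip` are hypothetical and chosen only for illustration; they are not taken from the project's code, and the real interface may differ.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class Storage(ABC):
    """Hypothetical storage interface (an assumption, not the project's API)."""

    @abstractmethod
    def save(self, name: str, data: bytes) -> None:
        """Store `data` under `name`."""

    @abstractmethod
    def load(self, name: str) -> bytes:
        """Return the bytes previously stored under `name`."""


class DiskStorage(Storage):
    """Stores each file below a root directory on local disk."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, name: str, data: bytes) -> None:
        (self.root / name).write_bytes(data)

    def load(self, name: str) -> bytes:
        return (self.root / name).read_bytes()


class InMemoryStorage(Storage):
    """Alternative implementation that keeps files in a dictionary."""

    def __init__(self) -> None:
        self._files: dict[str, bytes] = {}

    def save(self, name: str, data: bytes) -> None:
        self._files[name] = data

    def load(self, name: str) -> bytes:
        return self._files[name]


def roundtrip(storage: Storage, name: str, data: bytes) -> bytes:
    """Calling code depends only on the interface, not on the implementation."""
    storage.save(name, data)
    return storage.load(name)


if __name__ == "__main__":
    # Either implementation can be passed without changing the caller.
    print(roundtrip(DiskStorage(Path(tempfile.mkdtemp())), "train.csv", b"1,2,3"))
    print(roundtrip(InMemoryStorage(), "train.csv", b"1,2,3"))
```

Because callers only see the abstract interface, replacing disk storage with another system would, under these assumptions, mean writing one more subclass rather than touching the rest of the backend.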
Nodes can be added dynamically to the pool of available workers simply by starting the job-launcher on them. A worker node polls for available jobs using its hardware environment (whether it has a GPU, which hardware generation, etc.), so that it only attempts to execute jobs that its hardware can run.
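The hardware-based job selection described above can be sketched as follows. This is an illustrative model only: the `WorkerEnv` and `Job` fields, and the `can_run` and `next_job` functions, are assumptions made for this example. In the real system the worker would query the backend's REST API, whose endpoints are not specified here, so the sketch filters a plain in-process list instead.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class WorkerEnv:
    """Hypothetical description of a worker's hardware."""
    has_gpu: bool
    gpu_generation: int = 0  # 0 when the node has no GPU


@dataclass(frozen=True)
class Job:
    """Hypothetical job record with its hardware requirements."""
    job_id: str
    needs_gpu: bool = False
    min_gpu_generation: int = 0


def can_run(env: WorkerEnv, job: Job) -> bool:
    """Return True if the worker's hardware satisfies the job's requirements."""
    if not job.needs_gpu:
        return True
    return env.has_gpu and env.gpu_generation >= job.min_gpu_generation


def next_job(env: WorkerEnv, queue: Iterable[Job]) -> Optional[Job]:
    """Pick the first queued job this worker can run, or None if there is none."""
    for job in queue:
        if can_run(env, job):
            return job
    return None


if __name__ == "__main__":
    queue = [Job("train-cnn", needs_gpu=True, min_gpu_generation=3), Job("preprocess")]
    print(next_job(WorkerEnv(has_gpu=False), queue))
    print(next_job(WorkerEnv(has_gpu=True, gpu_generation=3), queue))
```

A CPU-only worker skips the GPU job and takes the next one it can handle, while a worker with a sufficiently recent GPU picks up the GPU job first.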
Each job execution produces a log file that is stored in the backend. This log can be used to diagnose the cause of errors when a job terminates unexpectedly.