3. CCP4 Cloud Configuration

3.1. Configuration file structure

The CCP4 Cloud configuration file is a JSON document with the following components:

{
  "FrontEnd" : { ........... },
  "FEProxy"  : { ........... },
  "NumberCrunchers" : [
      { .............. },
      { .............. },
      .....
      { .............. }
  ],
  "Emailer" : { ............ }
}

Not all components are required for all servers.

Front End Server requires "FrontEnd", all "NumberCrunchers" and "Emailer".

Front End Proxy Server requires "FrontEnd", "FEProxy" and one item of the "NumberCrunchers" list, which relates to the (optional) client server.

Number Cruncher Server requires only one item of the "NumberCrunchers" list, which relates to that server.

A single configuration file describing all Cloud components may be used for all servers in the setup. However, maintenance of individual number crunchers (in particular, client-side servers, or CCP4 Cloud Clients) may be more convenient if their configuration files do not contain items related to other components.
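
The requirements listed above can be checked mechanically before a server is started. The following is a minimal sketch in Python, not part of CCP4 Cloud; the role names "fe", "fe_proxy" and "nc" and the function missing_sections are hypothetical, and the check only looks at top-level keys (it does not verify that the right "NumberCrunchers" item is present).

```python
# Hypothetical role names mapped to the top-level sections each server
# needs, following the list in the text above.
REQUIRED_SECTIONS = {
    "fe": ("FrontEnd", "NumberCrunchers", "Emailer"),
    "fe_proxy": ("FrontEnd", "FEProxy", "NumberCrunchers"),
    "nc": ("NumberCrunchers",),
}


def missing_sections(config: dict, role: str) -> list:
    """Return the required top-level sections that are absent from config."""
    return [name for name in REQUIRED_SECTIONS[role] if name not in config]
```

For example, a file holding only "FrontEnd" and "NumberCrunchers" would be reported as lacking "Emailer" for the Front End role, while being sufficient for a Number Cruncher.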

3.2. Configuration of the Front End Server

Front End Server configuration has the following structure:

"FrontEnd" : {
  "description"      : {
    "id"   : "ccp4",
    "name" : "CCP4-Harwell",
    "icon" : "images_com/ccp4-harwell.png"
  },
  "protocol"         : "http",
  "host"             : "localhost",
  "port"             : 8081,
  "externalURL"      : "http://localhost:8081",
  "reportURL"        : "https://cloud.ccp4.ac.uk",  // used only for reporting
  "exclusive"        : true,
  "stoppable"        : false,
  "rejectUnauthorized" : true, // optional; use only for debugging, see below
  "exclude_tasks"    : [],
  "fsmount"          : "/",
  "localSetup"       : true,  // optional, overrides automatic definition
  "update_rcode"     : 212, // optional
  "userDataPath"     : "./cofe-users",
  "storage"          : "./cofe-projects",  // for logs, stats, pids, tmp etc.
  "projectsPath"     : "./cofe-projects",  // old version; in this case, "storage" may be omitted
  "projectsPath"     : {   // new version; in this case, "storage" must be given
      "***"   : { "path" : "./cofe-projects", // equivalent to "projectsPath"
                  "type" : "volume",          //    given by single string
                      // type "volume" means an ordinary file system
                      // type "home" places projects in path/login/[dirName]
                      //      if path/login already exists and is writable
                  "diskReserve" : 10000,  // new user will not be registered if disk
                                          // has space less than 'diskReserve' (in MB)
                                          // less of already committed space for user
                                          // accounts
                  "dirName" : "ccp4cloud_projects"  // used only for "home" volumes
                },
      "nameN" : { "path" : "pathN",   // any number of any names and any paths
                  "type" : "typeN",
                  "diskReserve" : 10000
                }
  },
  "jobs_safe" : {  // should point on jobs_safe directory as in NCs
      "path"     : "./cofe-nc-storage/jobs_safe",
      "capacity" : 10
  },
  "facilitiesPath"   : "./cofe-facilities",
  "ICAT_wdsl"        : "https://icat02.diamond.ac.uk/ICATService/ICAT?wsdl",
  "ICAT_ids"         : "https://ids01.diamond.ac.uk/ids",
  "auth_software"    : {  // optional item, may be null or missing
    "arpwarp" : {
      "desc_software" : "Arp/wArp Model Building Software from EMBL-Hamburg",
      "icon_software" : "task_arpwarp",
      "desc_provider" : "EMBL Outstation in Hamburg",
      "icon_provider" : "org_emblhamburg",
      "auth_url"      : "https://arpwarp.embl-hamburg.de/api/maketoken/?reqid=$reqid&addr=1.2.3.4&cburl=$cburl"
    },
    "gphl-buster"  : {
      "desc_software" : "Global Phasing Limited Software Suite",
      "icon_software" : "task_buster",
      "desc_provider" : "Global Phasing Limited",
      "icon_provider" : "org_gphl",
      "auth_url"      : "https://arpwarp.embl-hamburg.de/api/maketoken/?reqid=$reqid&addr=1.2.3.4&cburl=$cburl"
    }
  },
  "bootstrapHTML"    : "jscofe.html",
  "maxRestarts"      : 100,
  "fileCapSize"      : 500000,
  "regMode"          : "admin",  // if 'email':  registration by user;
                                 // 'admin': all users are registered by admin
  "sessionCheckPeriod" : 2000,
  "ration"           : {
      "storage"   : 3000,
      "cpu_day"   : 24,
      "cpu_month" : 240
  },
  "cloud_mounts"     : {  // optional item
    "My Computer"    : "/",
    "Home"           : ["$HOME","$USERPROFILE"],
    "CCP4 examples"  : "$CCP4/share/ccp4i2/demo_data",
    "Demo projects"  : "./demo-projects",
    "My files"       : "./$LOGIN/files"
  },
  "logflow" : {   // optional item
    "chunk_length" : 10000,     // number of jobs to advance log file counters
    "log_file" : "/path/to/node_fe"    // full path less of '.log' and '.err' extensions
  }
}
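
Note that the annotated listing above is not strict JSON as printed: it carries // comments and shows "projectsPath" twice (old and new forms side by side). A strict parser such as Python's json module rejects // comments. The sketch below assumes the annotations are meant for the reader only and strips // comments outside string values, so that an annotated fragment can be parsed for inspection. The helper strip_line_comments is hypothetical and is not a CCP4 Cloud function.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that occur outside double-quoted JSON strings."""
    cleaned = []
    for line in text.splitlines():
        in_string = False
        escaped = False
        cut = len(line)
        for i, ch in enumerate(line):
            if escaped:
                escaped = False          # character after a backslash
            elif ch == "\\" and in_string:
                escaped = True
            elif ch == '"':
                in_string = not in_string
            elif not in_string and line.startswith("//", i):
                cut = i                  # comment starts here
                break
        cleaned.append(line[:cut].rstrip())
    return "\n".join(cleaned)


annotated = '''{
  "externalURL" : "http://localhost:8081",
  "localSetup"  : true,  // optional, overrides automatic definition
  "update_rcode" : 212 // optional
}'''
settings = json.loads(strip_line_comments(annotated))
```

Tracking whether the scanner is inside a string matters here, because values such as "http://localhost:8081" contain // themselves.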

3.2.1. FE Server Configuration Items

“description”

This is an optional object with the following items:

"description" : {
    "id"   : "ccp4",
    "name" : "CCP4-Harwell",
    "icon" : "images_com/ccp4-harwell.png"
}

which gives global identification to a particular CCP4 Cloud setup (instance). Parameters:

“id”
Unix-type id, which must be globally unique. This id allows for discrimination between CCP4 Cloud setups in different geographic locations. The id is used internally and is not exposed to end user.
“name”
Arbitrary name of the CCP4 Cloud instance, which is used for reporting to end users. The name may contain HTML formatting.
“icon”
Path to the icon associated with the CCP4 Cloud instance. Both absolute and relative paths may be used. A relative path originates from the jscofe directory (the current directory for the server’s node process).

“protocol”, “host”, “port”

These are mandatory items. Typically, all CCP4 Cloud Servers listen to localhost ports, which may receive external requests through redirection from Apache or any other web-server of choice. Therefore values other than:

"protocol" : "http"
"host"     : "localhost"

should not be used other than for development purposes.

"port" : 8081

should be chosen separately for each configured server from the appropriate range of available port numbers.

“externalURL”

This is the URL, under which the server is visible to the outside world. If left blank, its value will be calculated as protocol://host:port. However, when chosen port receives redirection from Apache, “externalURL” should be set to address used for redirection, e.g.,

"externalURL" : "https://my.web.server.com/ccp4cloud"
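
The fallback rule for “externalURL” can be sketched as follows. This is an illustration in Python rather than the server's own code; effective_external_url is a hypothetical name, and the sketch treats both a missing and an empty value as "left blank".

```python
def effective_external_url(server: dict) -> str:
    """Return "externalURL" if given, otherwise protocol://host:port."""
    url = server.get("externalURL")
    if url:
        return url
    return f'{server["protocol"]}://{server["host"]}:{server["port"]}'
```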

“reportURL”

This optional item specifies the server’s URL for reporting to the end user in the login page and a few other places. Its purpose is to indicate the real Front End URL to the user when the FE Server is accessed via the Front End Proxy. The FE Proxy runs on the local host; therefore, the web address in the user’s browser will always be "http://localhost:port", even if data is sent to a remote server. In such situations, specifying “reportURL” makes it possible to inform the user of the actual connection address:

"reportURL" : "https://my.web.server.com/ccp4cloud"

“exclusive”

Specifies whether “port” is exclusively allocated to the server or not. There are few reasons to use value other than

"exclusive" : true

“stoppable”

Specifies whether the server may be stopped by a user by opening the server’s URL with /stop appended in a browser. This is a developer’s option, and there is no reason to use a value other than

"stoppable" : false

in production systems.

“rejectUnauthorized”

Switches the verification of SSL certificates on (true) or off (false). This should be set to true by default. Switching the verification off may be useful for debugging, or when the site must be kept running, at some risk, while SSL certificates are being sorted out. Note that this option applies to the individual server, not to the whole Cloud setup.

“exclude_tasks”

Optional list of tasks that can be excluded from the list of available tasks. The tasks are specified by the names of the corresponding JavaScript task classes, for example,

"exclude_tasks" : ["TaskAmple","TaskBalbes"]

Task classes are collected in js-common/tasks source code directory.

“fsmount”

File system mount common for communicating servers (Front End and Number Cruncher). Configuration:

"fsmount" : null

causes all data to be transmitted solely via http(s), even within localhost. If a common file mount is specified, e.g.,

"fsmount" : "/"

then data are transmitted using direct file copy, and only metadata is exchanged via http(s).

“localSetup”

This optional configuration overrides the automatic identification of local and remote setups. A local setup is one with all servers running on the same hardware host, which is identified by the use of localhost for communication. In a few particular cases, such identification may fail, which may be corrected by using

"localSetup" : false

“update_rcode”

Optional code (1-255), which is returned when an administrator requests an update of servers by pressing the “Update” button in the Administrator Page, tab Nodes. See the launcher script for further details. Example:

"update_rcode" : 212

“userDataPath”

Path to directory, which will keep user metadata. Usually, no more than a few KBytes of data per user should be stored. Both absolute and relative paths can be used. Relative path is calculated from directory containing jscofe codes: js-client, js-common, js-server, pycofe and others. The directory must exist before starting server for first time, or server will not start.

“storage”

Path to directory to keep various data files such as logs, usage stats etc. The directory must exist before starting server for first time, or server will not start.

“projectsPath”

Description of file system area allocated for users’ projects and data. In simplest case, it may be just a path to a directory on a file system with sufficient disk space (estimated as number of users times user’s disk quota, see “ration” configuration below). Also in this case, “storage” configuration may be omitted. However, this type of description is deprecated and should not be used.

In all new setups, “projectsPath” should be configured as the following object:

"projectsPath" : {
    "fs01" : { "path"        : "/path01",
               "type"        : "volume",
               "diskReserve" : 10000,
               "dirName"     : "ccp4cloud_projects"
             },
    "fs02" : { "path"        : "/path02",
               "type"        : "volume",
               "diskReserve" : 10000,
               "dirName"     : "ccp4cloud_projects"
             },
    . . . . . . . .
    "fsNN" : { "path"        : "/pathNN",
               "type"        : "volume",
               "diskReserve" : 10000,
               "dirName"     : "ccp4cloud_projects"
             }
}

This configuration allows user projects to be placed on several file systems (disks), which is useful if users’ data does not fit on a single file system.

Here,

“fsXX”
is an arbitrarily chosen identifier for the file system; at least one file system should be configured in the object.
“path”
is path to projects directory on related file system
“type”
is the file system type. Two options are available:

“volume”: an ordinary file system; projects are placed in the given path

“home”: projects are placed in users’ home directories; in this case, the actual directory path is calculated as /path/login/dirName (must exist), where login is the user’s login name and dirName is given below

“diskReserve”
(MBytes) specifies free disk space threshold; if free disk space falls below this value, new user projects will not be allocated on the related file system
“dirName”
specifies custom directory name for “home” type of file systems (ignored for “volume” type)
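
The two volume types and the “diskReserve” threshold described above can be illustrated with a short Python sketch. It is not the CCP4 Cloud implementation: the functions project_root and volumes_with_room are hypothetical, free space is passed in as a plain dictionary (in MBytes) rather than measured, and nothing is assumed about how the server chooses among several eligible file systems.

```python
import posixpath


def project_root(volume: dict, login: str) -> str:
    """Directory holding a user's projects on the given file system."""
    if volume["type"] == "home":
        # "home": /path/login/dirName
        return posixpath.join(volume["path"], login, volume["dirName"])
    # "volume": projects are placed in the given path
    return volume["path"]


def volumes_with_room(projects_path: dict, free_mb: dict) -> list:
    """Names of file systems whose free space is not below diskReserve."""
    return [
        name
        for name, volume in projects_path.items()
        if free_mb.get(name, 0) >= volume["diskReserve"]
    ]
```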

“jobs_safe”

Description of a safe for failed jobs. Users may choose a type of feedback agreement, including retention of failed jobs, along with input data, in the CCP4 Cloud safe for further investigation by developers. The safe is a directory in a file system shared between the Front End and all Number Crunchers. The safe description is the following object:

"jobs_safe" : {
    "path" : "/path/to/jobs/safe",
    "capacity" : 10
}

Here,

“path”
should point to safe directory as it is seen on the Front End server
“capacity”
is the maximum number of failed jobs of particular type that is retained in the safe (so that only last capacity jobs will be available for investigation)
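
The effect of “capacity” can be modelled in a few lines of Python. This is only an in-memory sketch of the retention rule (the real safe is a directory on disk); the class JobsSafe and the job identifiers used with it are hypothetical.

```python
from collections import defaultdict, deque


class JobsSafe:
    """Keeps at most `capacity` most recent failed jobs per task type."""

    def __init__(self, capacity: int):
        # A bounded deque silently drops the oldest entry when full.
        self._jobs = defaultdict(lambda: deque(maxlen=capacity))

    def add(self, task_type: str, job_id: str) -> None:
        self._jobs[task_type].append(job_id)

    def retained(self, task_type: str) -> list:
        return list(self._jobs[task_type])
```

With "capacity" set to 2, storing three failed jobs of the same type leaves only the last two available for investigation.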

“facilitiesPath”

Path to directory to keep cache of files obtained from facilities like external file servers, synchrotrons and similar. The directory must exist before starting server for first time, or server will not start.

“ICAT_wdsl”

URL of the WSDL service of the STFC/SCD iCAT facility:

"https://icat02.diamond.ac.uk/ICATService/ICAT?wsdl"

“ICAT_ids”

URL of the IDS service of STFC/SCD iCAT facility:

"https://ids01.diamond.ac.uk/ids"

“auth_software”

This is an optional item, which may be null or missing if not used. Otherwise, the item provides description of 3rd-party software, use of which requires user authorisation. The item represents an object with the following structure:

"auth_software"    : {
    "arpwarp" : {
        "desc_software" : "Arp/wArp Model Building Software from EMBL-Hamburg",
        "icon_software" : "task_arpwarp",
        "desc_provider" : "EMBL Outstation in Hamburg",
        "icon_provider" : "org_emblhamburg",
        "auth_url"      : "https://arpwarp.embl-hamburg.de/api/maketoken/?reqid=$reqid&addr=1.2.3.4&cburl=$cburl"
    },
    "gphl-buster"  : {
        "desc_software" : "Global Phasing Limited Software Suite",
        "icon_software" : "task_buster",
        "desc_provider" : "Global Phasing Limited",
        "icon_provider" : "org_gphl",
        "auth_url"      : "https://arpwarp.embl-hamburg.de/api/maketoken/?reqid=$reqid&addr=1.2.3.4&cburl=$cburl"
    }
}

In the above example, the keys “arpwarp” and “gphl-buster” are authorisation IDs, defined in the corresponding task classes, and the “auth_url” items are URLs of authorisation servers. The parameter part of these URLs, which follows the question mark, is chosen in agreement with the developers of the respective authorisation servers.

Configuration of authorisation software results in the appearance of the corresponding entry in user account settings, available in the “My Account” page. In that section, a user can request authorisation to use software from the software provider. Upon request, the user is redirected to the authorisation server’s web page with the license agreement and other information collected by the provider. After completion of the authorisation request, the user is redirected back to CCP4 Cloud, and the corresponding authorisation record is added to the user’s account automatically. This record enables tasks that use the authorised software for the user.

“bootstrapHTML”

Path (relative or absolute) to HTML bootstrap file:

"bootstrap/jscofe.html"

Alternative bootstrap files can be used for debug purposes or quick change of source code directories.

“maxRestarts”

CCP4 Cloud Front End can automatically restart after certain type of errors. This parameter limits the maximal number of restarts in order to avoid infinite restart loops.

“fileCapSize”

Size limit, in bytes, for log files of running jobs delivered to the client device. If a log file exceeds this limit, the excess content is removed from the middle of the file. Log files of finished jobs are delivered to the client device in full, without modification, irrespective of their size.
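
Removing content from the middle keeps both the start of a log (job setup) and its most recent lines. A minimal Python sketch of such a cap is shown below. It is an illustration under stated assumptions, not the server's code: the function cap_log and the marker text are hypothetical, the head and tail are made roughly equal, and lengths are counted in characters rather than bytes.

```python
def cap_log(text: str, cap_size: int,
            marker: str = "\n[... content removed ...]\n") -> str:
    """Return text unchanged if short enough, else drop its middle part."""
    if len(text) <= cap_size:
        return text
    keep = max(cap_size - len(marker), 0)   # room left for real content
    tail = keep // 2
    head = keep - tail
    # text[-0:] would return the whole string, hence the explicit check.
    return text[:head] + marker + (text[-tail:] if tail else "")
```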

“regMode”

User registration mode. Configuration

"regMode" : "admin"

specifies administrative registration mode, when a new user can be registered only by server admin (a user with administrative privileges). Configuration

"regMode" : "email"

allows users to register themselves, in addition to registration by admin. In all cases, users are sent temporary password by e-mail.

“sessionCheckPeriod”

Time period, in milliseconds, for checking session status and connection between user device and the Front-End server. This period should be reasonably short such that duplicate sessions are forced to close before a user performs any action (CCP4 Cloud does not support multiple logins); 2000 ms is a reasonable value.

“ration”

This configuration object sets default quotas for new users:

"ration" : {
    "storage"   : 10000,
    "cpu_day"   : 24,
    "cpu_month" : 240
}

where

“storage”
is disk quota, in MBytes
“cpu_day”
is CPU quota per day, in hours
“cpu_month”
is CPU quota per month, in hours

A user is able to submit new jobs if their projects occupy no more than the given storage quota, provided that they have spent no more than cpu_day CPU-hours in the last 24-hour period and no more than cpu_month CPU-hours in the last 30-day period.

User quotas may be adjusted individually by users with administrative privileges via CCP4 Cloud Administration Facility.
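
The submission rule described above amounts to three comparisons. The Python sketch below restates it; the function may_submit and its argument names are hypothetical, and usage figures are assumed to be already measured in the same units as the quotas (MBytes and CPU-hours).

```python
def may_submit(ration: dict, used_storage_mb: float,
               cpu_hours_last_24h: float, cpu_hours_last_30d: float) -> bool:
    """True if the user is within all three quotas."""
    return (
        used_storage_mb <= ration["storage"]
        and cpu_hours_last_24h <= ration["cpu_day"]
        and cpu_hours_last_30d <= ration["cpu_month"]
    )
```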

“cloud_mounts”

This optional configuration object specifies directories on the Front End server that are available to users for reading, for example:

"cloud_mounts" : {
    "My Computer"    : "/",
    "Home"           : ["$HOME","$USERPROFILE"],
    "CCP4 examples"  : "$CCP4/share/ccp4i2/demo_data",
    "Demo projects"  : "./demo-projects",
    "My files"       : "./$LOGIN/files"
}

Mount “Demo projects” is reserved for keeping CCP4 Cloud Demo Projects, which can be imported by using the Demo projects button found above user’s List of Projects. Other mounts can be used in dedicated import task (Cloud Import) and in few tasks that use direct access to local and Cloud-based directories (e.g. Xia-2 may process images from a directory mounted in this way). Cloud mount names may be chosen arbitrarily; the appearance of words “My”, “Home”, “CCP4” in mount name will decorate the mount with the corresponding icon in Cloud File Browser.

Directory paths may be both absolute and relative, and they may contain environmental variables as in the above example. Variable $LOGIN will be replaced with user’s Cloud login name, which may be used for cloud storage mapping on per-user basis. When a list of paths is given (cf. above example), the first existing path from the list will be mounted.
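
The substitution and first-existing-path rules can be sketched in Python as follows. This is an illustration only: resolve_mount is a hypothetical function, the environment and the existence test are passed in as arguments so that the example is self-contained, and returning None when no candidate exists is an assumption rather than documented behaviour.

```python
from string import Template


def resolve_mount(spec, login: str, env: dict, exists):
    """Expand variables in a mount path (or list of paths) and return
    the first candidate for which exists(path) is true."""
    candidates = spec if isinstance(spec, list) else [spec]
    variables = dict(env, LOGIN=login)      # $LOGIN is the Cloud login name
    for raw in candidates:
        # Unknown variables are left untouched by safe_substitute.
        path = Template(raw).safe_substitute(variables)
        if exists(path):
            return path
    return None
```

In a real setting, env would typically be os.environ and exists would be os.path.isdir.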

“logflow”

This is an optional item, which controls automatic renumbering of FE log files. The FE server writes stderr and stdout log files, with names given in the FE launch script. These files may grow uncomfortably large and, ideally, should be split into manageable chunks for long-term bookkeeping. This may be achieved using the following configuration object:

"logflow" : {
    "chunk_length" : 10000,
    "log_file"     : "/path/to/node_fe"
}

where

“log_file”
is the template path to log files. For example, template /path/to/node_fe means that the launch script defines log files with paths /path/to/node_fe.log and /path/to/node_fe.err
“chunk_length”
is the number of job launch records in a chunk. When specified number of jobs is launched, current log files are renamed as /path/to/node_fe.NN.log and /path/to/node_fe.NN.err, where NN is chunk number, and new log files with template names are initialised
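
The naming scheme for renumbered log files can be illustrated with a short Python sketch. The functions rotated_names and rotation_due are hypothetical; in particular, writing the chunk number NN without zero padding and triggering rotation on exact multiples of “chunk_length” are assumptions, not documented behaviour.

```python
def rotated_names(log_file: str, chunk_number: int) -> tuple:
    """Paths that the current .log and .err files are renamed to."""
    return (
        f"{log_file}.{chunk_number}.log",
        f"{log_file}.{chunk_number}.err",
    )


def rotation_due(jobs_launched: int, chunk_length: int) -> bool:
    """True when a full chunk of job launch records has been written."""
    return jobs_launched > 0 and jobs_launched % chunk_length == 0
```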

3.3. Configuration of the Client Proxy Server

Client Proxy Server configuration has the following structure:

"FEProxy" : {  // optional proxy configuration
  "protocol"         : "http",
  "host"             : "localhost",
  "port"             : 8082,
  "externalURL"      : "http://localhost:8082",
  "exclusive"        : true,
  "stoppable"        : false,
  "rejectUnauthorized" : true, // optional; use only for debugging, see below
  "localisation"     : 1  // 0: all files are taken from remote server
                          // 1: images are taken from local setup
                          // 2: images and js libraries are taken from local setup
                          // 3: images and all js codes are taken from local setup
}

3.3.1. Client Proxy Server Configuration Items

“protocol”, “host”, “port”

These are mandatory items. Client Proxy server always runs on localhost, therefore, values other than:

"protocol" : "http"
"host"     : "localhost"

should not be used other than for development purposes.

"port" : 8082

should be chosen separately from the appropriate range of available port numbers.

“externalURL”

This is the URL, under which the Client Proxy server is visible to the browser and Client Server (local Number Cruncher). If left blank, its value will be calculated as protocol://host:port. Values other than this should not be used other than for development purposes:

"externalURL" : "http://localhost:8082"

“exclusive”

Specifies whether “port” is exclusively allocated to the server or not. There are few reasons to use value other than

"exclusive" : true

“stoppable”

Specifies whether the server may be stopped by a user by opening the server’s URL with /stop appended in a browser. This is a developer’s option, and there is no reason to use a value other than

"stoppable" : false

in production systems.

“localisation”

Client Proxy Server may be configured to read certain CCP4 Cloud source files from local CCP4 setup, rather than load them from remote server. This makes the system more responsive, but could potentially lead to visualisation defects in case of version mismatch between local and remote CCP4 setups.

“localisation” : 0
all source files are obtained from remote server
“localisation” : 1
image files, icons and documentation, are obtained from local CCP4 setup. This is the recommended option
“localisation” : 2
image files, icons, documentation, and standard JavaScript libraries are obtained from local CCP4 setup
“localisation” : 3
all CCP4 Cloud source files are obtained from local CCP4 setup
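
The four levels are cumulative, which can be summarised as a lookup table. The Python sketch below is purely illustrative: the resource category names ("images", "js_libraries", "js_code") and the function served_locally are hypothetical labels for the groups described above, with "images" standing for image files, icons and documentation.

```python
# Resource groups taken from the local CCP4 setup at each level.
LOCAL_RESOURCES = {
    0: set(),
    1: {"images"},
    2: {"images", "js_libraries"},
    3: {"images", "js_libraries", "js_code"},
}


def served_locally(localisation: int, kind: str) -> bool:
    """True if resources of the given kind come from the local setup."""
    return kind in LOCAL_RESOURCES[localisation]
```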

“rejectUnauthorized”

Switches the verification of SSL certificates on (true) or off (false). This should be set to true by default. Switching the verification off may be useful for debugging, or when the site must be kept running, at some risk, while SSL certificates are being sorted out. Note that this option applies to the individual server, not to the whole Cloud setup.