Minimal required:
```yaml
backplane:
  redisUri: "redis://localhost:6379"
  queues:
    - name: "cpu"
      properties:
        - name: "min-cores"
          value: "*"
        - name: "max-cores"
          value: "*"
worker:
  publicName: "localhost:8981"
```
Configuration files also support includes via the `!include` tag:

```yaml
backplane: !include "backplane.yml"
server: !include "server.yml"
worker: !include "worker.yml"
```
The configuration can be provided to the server and worker as a CLI argument or through the environment variable CONFIG_PATH. For an example configuration containing all of the configuration values, see examples/config.yml.
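As a sketch of both options (the binary names here are illustrative; substitute your actual server and worker launchers):

```shell
# config as a CLI argument (hypothetical launcher name)
buildfarm-server /path/to/config.yml

# config via the CONFIG_PATH environment variable
CONFIG_PATH=/path/to/config.yml buildfarm-worker
```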
All Configurations
Common
| Configuration | Accepted and Default Values | Command Line Argument | Description |
|---|---|---|---|
| digestFunction | SHA256, SHA1 | | Digest function for this implementation |
| defaultActionTimeout | Integer, 600 | | Default timeout value for an action (seconds) |
| maximumActionTimeout | Integer, 3600 | | Maximum allowed action timeout (seconds) |
| maxEntrySizeBytes | Long, 2147483648 | | Maximum size of a single blob accepted (bytes) |
| prometheusPort | Integer, 9090 | --prometheus_port | Listening port of the Prometheus metrics endpoint |
| allowSymlinkTargetAbsolute | boolean, false | | Permit inputs to contain symlinks with absolute path targets |
Example:
```yaml
digestFunction: SHA1
defaultActionTimeout: 1800
maximumActionTimeout: 1800
prometheusPort: 9090
server:
  ...
worker:
  ...
```
Server
| Configuration | Accepted and Default Values | Environment Var | Description |
|---|---|---|---|
| instanceType | SHARD | | Type of implementation (SHARD is the only one supported) |
| name | String, shard | | Implementation name |
| publicName | String, DERIVED:port | INSTANCE_NAME | Host:port of the GRPC server, required to be accessible by all servers |
| actionCacheReadOnly | boolean, false | | Allow/Deny writing to action cache |
| port | Integer, 8980 | | Listening port of the GRPC server |
| bindAddress | String | | Listening address of the GRPC server, default for Java Grpc (all interface addresses) if unspecified |
| maxInboundMessageSizeBytes | Integer, 0 | | Byte size limit of GRPC messages, default for Java Grpc if unspecified or 0 |
| maxInboundMetadataSize | Integer, 0 | | Byte size limit of GRPC metadata, default for Java Grpc if unspecified or 0 |
| casWriteTimeout | Integer, 3600 | | CAS write timeout (seconds) |
| bytestreamTimeout | Integer, 3600 | | Byte Stream write timeout (seconds) |
| sslCertificatePath | String, null | | Absolute path of the SSL certificate (if TLS used) |
| sslPrivateKeyPath | String, null | | Absolute path of the SSL private key (if TLS used) |
| runDispatchedMonitor | boolean, true | | Enable an agent to monitor the operation store to ensure that dispatched operations with expired worker leases are requeued |
| dispatchedMonitorIntervalSeconds | Integer, 1 | | Dispatched monitor's lease expiration check interval (seconds) |
| runOperationQueuer | boolean, true | | Acquire execute request entries cooperatively from an arrival queue on the backplane |
| ensureOutputsPresent | boolean, true | | Ensure ActionResult outputs are present in the CAS on ActionCache::GetActionResults. If any outputs are missing, a cache miss is returned |
| maxCpu | Integer, 0 | | Maximum number of CPU cores that any min/max-cores property may request (0 = unlimited) |
| maxRequeueAttempts | Integer, 5 | | Maximum number of requeue attempts for an operation |
| useDenyList | boolean, true | | Allow usage of a deny list when looking up actions and invocations (for cache-only use it is recommended to disable this check) |
| mergeExecutions | boolean, true | | Merge executions with matching action_digest by default (overridable with the MERGE_EXECUTIONS correlatedInvocationsId URL param) |
| grpcTimeout | Integer, 3600 | | GRPC request timeout (seconds) |
| executeKeepaliveAfterSeconds | Integer, 60 | | Execute keep alive (seconds) |
| recordBesEvents | boolean, false | | Allow recording of BES events |
| clusterId | String, local | | Buildfarm cluster ID |
| cloudRegion | String, us-east_1 | | Deployment region in the cloud |
| gracefulShutdownSeconds | Integer, 0 | | Time in seconds to allow for connections in flight to finish when a shutdown signal is received |
Example:
```yaml
server:
  instanceType: SHARD
  name: shard
  actionCacheReadOnly: true
  recordBesEvents: true
```
GRPC Metrics
| Configuration | Accepted and Default Values | Description |
|---|---|---|
| enabled | boolean, false | Publish basic GRPC metrics to a Prometheus endpoint |
| provideLatencyHistograms | boolean, false | Publish detailed, more expensive to calculate, metrics |
| labelsToReport | List of Strings, [] | Include custom metrics labels in Prometheus metrics |
Example:
```yaml
server:
  grpcMetrics:
    enabled: false
    provideLatencyHistograms: false
    labelsToReport: []
```
Server Caches
| Configuration | Accepted and Default Values | Description |
|---|---|---|
| directoryCacheMaxEntries | Long, 64 * 1024 | The max number of entries that the directory cache will hold. |
| commandCacheMaxEntries | Long, 64 * 1024 | The max number of entries that the command cache will hold. |
| digestToActionCacheMaxEntries | Long, 64 * 1024 | The max number of entries that the digest-to-action cache will hold. |
| recentServedExecutionsCacheMaxEntries | Long, 64 * 1024 | The max number of entries that the executions cache will hold. |
Example:
```yaml
server:
  caches:
    directoryCacheMaxEntries: 10000
    commandCacheMaxEntries: 10000
    digestToActionCacheMaxEntries: 10000
    recentServedExecutionsCacheMaxEntries: 10000
```
Admin
| Configuration | Accepted and Default Values | Description |
|---|---|---|
| deploymentEnvironment | String, AWS, GCP | Specify deployment environment in the cloud |
| clusterEndpoint | String, grpc://localhost | Buildfarm cluster endpoint for Admin use (this is a full buildfarm endpoint) |
Example:
```yaml
server:
  admin:
    deploymentEnvironment: AWS
    clusterEndpoint: "grpc://localhost"
```
Metrics
| Configuration | Accepted and Default Values | Description |
|---|---|---|
| publisher | String, aws, gcp, log | Specify publisher type for sending metadata |
| logLevel | String, INFO, FINEST | Specify log level (“log” publisher only, all Java util logging levels are allowed here) |
| topic | String, test | Specify SNS topic name for cloud publishing (“aws” publisher only) |
| topicMaxConnections | Integer, 1000 | Specify maximum number of connections allowed for cloud publishing (“aws” publisher only) |
| secretName | String, test | Specify secret name to pull SNS permissions from (“aws” publisher only) |
Example:
```yaml
server:
  metrics:
    publisher: log
    logLevel: INFO
```

```yaml
server:
  metrics:
    publisher: aws
    topic: buildfarm-metadata-test
    topicMaxConnections: 1000
    secretName: buildfarm-secret
```
Correlated Invocations Index Scopes
A set of names used to extract per-correlatedInvocationsId fields from a valid URI, creating indices which will contain the fragment uuid.
| Entry | URI Component |
|---|---|
| host | authority:host |
| username | userinfo:username |
| * (any key) | query[key] (one index per specified pair) |
Example:
```yaml
server:
  correlatedInvocationsIndexScopes: !!set
    ? host
    ? username
```
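For the wildcard query-key form, a hedged sketch (the key `user` here is a hypothetical query parameter name): naming a query key as a scope creates one index per key/value pair found in the correlatedInvocationsId URI:

```yaml
server:
  correlatedInvocationsIndexScopes: !!set
    ? host
    ? user   # hypothetical query parameter key; indexed as query[user]
```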
Redis Backplane
| Configuration | Accepted and Default Values | Environment Var | Command Line Argument | Description |
|---|---|---|---|---|
| type | SHARD | | | Type of backplane. Currently, the only implementation is SHARD utilizing Redis |
| redisUri | String, redis://localhost:6379 | REDIS_URI | --redis_uri | Redis cluster endpoint. This must be a single URI. It can embed a username/password per RFC-3986 Section 3.2.1, which takes precedence over redisPassword and redisPasswordFile |
| redisUsername | String, null | | | Redis username, if applicable |
| redisPassword | String, null | | | Redis password, if applicable |
| redisPasswordFile | String, null | | | File to read for a Redis password. If specified, this takes precedence over redisPassword |
| redisNodes | List of Strings, null | | | List of individual Redis nodes, if applicable |
| redisCertificateAuthorityFile | String, null | | | File to read for the Redis connection SSL certificate authority. If specified, this is the exclusive certificate authority bundle for Redis connections. Must be a PEM file |
| redisAuthWithGoogleCredentials | boolean, false | | | If true, use the GOOGLE_DEFAULT_CREDENTIALS environment variable to find a service account to authenticate to Redis; useful for Google Memorystore Redis Cluster |
| jedisPoolMaxTotal | Integer, 4000 | | | The size of the Redis connection pool |
| jedisPoolMaxIdle | Integer, 8 | | | The maximum number of idle connections to Redis |
| jedisPoolMinIdle | Integer, 0 | | | The number of guaranteed idle connections to Redis |
| jedisTimeBetweenEvictionRunsMillis | Long, 30000 | | | The period for detecting idle connections to Redis. Set to -1 to disable detection |
| workersHashName | String, Workers | | | Redis key used to store a hash of registered workers |
| workerChannel | String, WorkerChannel | | | Redis pubsub channel key where changes of the cluster membership are announced |
| actionCachePrefix | String, ActionCache | | | Redis key prefix for all ActionCache entries |
| actionCacheExpire | Integer, 2419200 | | | The TTL maintained for ActionCache entries, refreshed on getActionResult hit |
| actionBlocklistPrefix | String, ActionBlocklist | | | Redis key prefix for all blocklisted actions, which are rejected |
| actionBlocklistExpire | Integer, 3600 | | | The TTL maintained for action blocklist entries |
| invocationBlocklistPrefix | String, InvocationBlocklist | | | Redis key prefix for blocklisted invocations, suffixed with a tool invocation ID |
| operationPrefix | String, Operation | | | Redis key prefix for all operations, suffixed with the operation's name |
| operationExpire | Integer, 604800 | | | The TTL maintained for all executions, updated on each modification |
| actionExecutionExpire | Integer, 21600 | | | The TTL maintained for all action -> execution mappings for mergeExecutions |
| preQueuedOperationsListName | String, {Arrival}:PreQueuedOperations | | | Redis key used to store a list of ExecuteEntry awaiting transformation into QueueEntry |
| processingListName | String, {Arrival}:ProcessingOperations | | | Redis key of a list used to ensure reliable processing of arrival queue entries with operation watch monitoring |
| processingPrefix | String, Processing | | | Redis key prefix for operations which are being dequeued from the arrival queue |
| processingTimeoutMillis | Integer, 20000 | | | Delay (in ms) used to populate processing operation entries |
| queuedOperationsListName | String, {Execution}:QueuedOperations | | | Redis key used to store a list of QueueEntry awaiting execution by workers |
| dispatchingPrefix | String, Dispatching | | | Redis key prefix for operations which are being dequeued from the ready-to-run queue |
| dispatchingTimeoutMillis | Integer, 10000 | | | Delay (in ms) used to populate dispatching operation entries |
| dispatchedOperationsHashName | String, DispatchedOperations | | | Redis key of a hash of operation names to the worker lease for its execution, which are monitored by the dispatched monitor |
| operationChannelPrefix | String, OperationChannel | | | Redis pubsub channel prefix suffixed by an operation name |
| casPrefix | String, ContentAddressableStorage | | | Redis key prefix suffixed with a blob digest that maps to a set of workers with that blob's availability |
| casExpire | Integer, 604800 | | | The TTL maintained for CAS entries, which is refreshed on any read access of the blob |
| subscribeToBackplane | boolean, true | | | Enable an agent of the backplane client which subscribes to worker channel and operation channel events. If disabled, responsiveness of watchers and CAS is reduced |
| runFailsafeOperation | boolean, true | | | Enable an agent in the backplane client which monitors watched operations and ensures they are in a known maintained, or expirable, state |
| maxQueueDepth | Integer, 100000 | | | Maximum length that the ready-to-run queue is allowed to reach, to control an arrival flow for execution |
| maxPreQueueDepth | Integer, 1000000 | | | Maximum length that the arrival queue is allowed to reach, to control load on the Redis cluster |
| priorityQueue | boolean, false | | | Priority queue type allows prioritizing operations based on Bazel's --remote_execution_priority |
| timeout | Integer, 10000 | | | Default timeout |
| maxInvocationIdTimeout | Integer, 604800 | | | Maximum TTL (time-to-live, in seconds) of invocationId keys in RedisBackplane |
| maxAttempts | Integer, 20 | | | Maximum number of execution attempts |
| connectionValidatedOnBorrow | boolean, false | | | Whether to validate Redis connections when borrowing from the pool |
Example:
```yaml
backplane:
  type: SHARD
  redisUri: "redis://localhost:6379"
  priorityQueue: true
```
Execution Queues
| Configuration | Accepted and Default Values | Description |
|---|---|---|
| name | String | Name of the execution queue (ex: cpu, gpu) |
| allowUnmatched | boolean, true | |
| properties | List of name/value pairs | Any specification of min/max-cores will be allowed to support CPU controls and worker resource delegation |
Example:
```yaml
backplane:
  type: SHARD
  redisUri: "redis://localhost:6379"
  queues:
    - name: "cpu"
      allowUnmatched: true
      properties:
        - name: "min-cores"
          value: "*"
        - name: "max-cores"
          value: "*"
```
Worker
| Configuration | Accepted and Default Values | Environment Var | Description |
|---|---|---|---|
| port | Integer, 8981 | | Listening port of the worker |
| publicName | String, DERIVED:port | INSTANCE_NAME | Host:port of the GRPC server, required to be accessible by all servers |
| root | String, /tmp/worker | | Path for all operation content storage |
| inlineContentLimit | Integer, 1048567 | | Total size in bytes of inline content for action results, output files, stdout, stderr content |
| operationPollPeriod | Integer, 1 | | Period between poll operations at any stage |
| executeStageWidth | Integer, 0 | EXECUTION_STAGE_WIDTH | Number of CPU cores available for execution (0 = system available cores) |
| executeStageWidthOffset | Integer, 0 | | Offset number of CPU cores available for execution (to allow for use by other processes) |
| inputFetchStageWidth | Integer, 0 | | Number of concurrently available slots to fetch inputs (0 = system calculated based on CPU cores) |
| inputFetchDeadline | Integer, 60 | | Limit on time (seconds) for input fetch stage to fetch inputs |
| reportResultStageWidth | Integer, 1 | | Number of concurrently available slots to write results and clean up execution directories |
| linkExecFileSystem | boolean, true | | Use hard links instead of file copies to populate execution directories. Disable on Windows to compensate for shared hard-link deletion semantics for running executables |
| linkInputDirectories | boolean, true | | Use an input directory creation strategy which creates a single directory tree at the highest level containing no output paths of any kind, and symlinks that directory into an action's execroot, saving large amounts of time spent manufacturing the same read-only input hierarchy over multiple actions' executions |
| execOwner | String, null | | Create exec trees containing directories that are owned by this user |
| execOwners | List of Strings | | Create exec trees containing directories that are owned by these users, unique to each execution. Must contain at least the sum of {inputFetchStageWidth, executeStageWidth, reportResultStageWidth} owners. If this option is specified and not empty, execOwner is ignored |
| hexBucketLevels | Integer, 0 | | Number of levels to create for directory storage by leading byte of the hash |
| defaultMaxCores | Integer, 0 | | Constrain all executions to this logical core count unless otherwise specified via min/max-cores (0 = no limit) |
| limitGlobalExecution | boolean, false | | Constrain all executions to a pool of logical cores specified in executeStageWidth |
| onlyMulticoreTests | boolean, false | | Only permit tests to exceed the default cores value with their min/max-cores range specification (only works with non-zero defaultMaxCores) |
| allowBringYourOwnContainer | boolean, false | | Enable execution in a custom Docker container |
| errorOperationRemainingResources | boolean, false | | |
| errorOperationOutputSizeExceeded | boolean, false | | Operations which produce single output files which exceed maxEntrySizeBytes will fail with a violation type which implies a user error. When disabled, the violation will indicate a transient error, with the action blocklisted |
| linkedInputDirectories | List of Strings, ^(?!external$).*$ | | A list of regular expressions matching input directories which will be subject to the effects of the linkInputDirectories setting |
| gracefulShutdownSeconds | Integer, 0 | | Time in seconds to allow for operations in flight to finish when a shutdown signal is received |
| createSymlinkOutputs | boolean, false | | Creates SymlinkNodes for symbolic links discovered in output paths for actions. No verification of the symlink target path occurs. BuildStream, for example, requires this |
| zstdBufferPoolSize | Integer, 2048 | | Specifies the maximum number of zstd data buffers that may be in use concurrently by the filesystem CAS. Increase to improve compressed blob throughput, decrease to reduce memory usage |
| persistentWorkerActionMnemonicAllowlist | Set | | Controls which Action Mnemonics can run with a persistent worker. Use * for wildcard |
Example:

```yaml
worker:
  port: 8981
  publicName: "localhost:8981"
  linkedInputDirectories:
    - "^path/to/common/directory"
```
Capabilities
| Configuration | Accepted and Default Values | Description |
|---|---|---|
| cas | boolean, true | Enables worker to be a shard of the CAS |
| execution | boolean, true | Enables worker to participate in execution pool |
Example:
```yaml
worker:
  capabilities:
    cas: true
    execution: true
```
Sandbox Settings
Use of the sandbox is configurable by the client via exec_properties. However, it is sometimes preferable to enable it via the buildfarm config to prevent users from running actions outside the sandbox.
| Configuration | Accepted and Default Values | Description |
|---|---|---|
| alwaysUseSandbox | boolean, false | Whether or not to always use the sandbox when running actions. It may be preferred to enforce sandbox usage rather than rely on client selection. |
| alwaysUseAsNobody | boolean, false | Whether or not to always use the as-nobody wrapper when running actions. It may be preferred to enforce this wrapper instead of relying on client selection. |
| alwaysUseCgroups | boolean, true | Whether or not to use cgroups when sandboxing actions. It may be preferred to enforce cgroup usage. |
| alwaysUseTmpFs | boolean, false | Whether or not to always use tmpfs when using the sandbox. It may be preferred to enforce sandbox usage rather than rely on client selection. |
| additionalWritePaths | List of Strings, [] | Additional paths the sandbox is allowed to write to. Suggestions may include: /tmp, /dev/shm |
| tmpFsPaths | List of Strings, [] | Additional paths the sandbox uses for tmpfs. Suggestions may include: /tmp |
| selectForBlockNetwork | boolean, false | If the action requires "block network", use the sandbox to fulfill this request. Otherwise, there may be no alternative solution and the "block network" request will be ignored / implemented differently. |
| selectForTmpFs | boolean, false | If the action requires "tmpfs", use the sandbox to fulfill this request. Otherwise, there may be no alternative solution and the "tmpfs" request will be ignored / implemented differently. |
Example:
```yaml
worker:
  sandboxSettings:
    alwaysUseSandbox: true
    alwaysUseAsNobody: false
    alwaysUseCgroups: true
    alwaysUseTmpFs: true
    additionalWritePaths: []
    tmpFsPaths: []
    selectForBlockNetwork: false
    selectForTmpFs: false
```
Note: In order for these settings to take effect, you must also configure limitGlobalExecution: true.
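The note above can be sketched as a combined worker config (values illustrative):

```yaml
worker:
  limitGlobalExecution: true   # required for sandbox settings to take effect
  sandboxSettings:
    alwaysUseSandbox: true
    alwaysUseCgroups: true
```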
Dequeue Match
| Configuration | Accepted and Default Values | Description |
|---|---|---|
| allowUnmatched | boolean, false | |
| properties | List of name/value pairs | Pairs of provisions available to match against action properties |
Example:
```yaml
worker:
  dequeueMatchSettings:
    allowUnmatched: false
    properties:
      - name: "gpu"
        value: "nvidia RTX 2090"
```
Resources
A list of limited resources that are available to the worker to be depleted by actions which execute containing a "resource:<name>" platform property.
The default resource type is SEMAPHORE. The expected use case is that a resource is internally allocated and managed by an execution, and the exhaustion prevents executions from starting which would block or fault if they could not consume the resource. Common examples include licensed software with tokens, and this mechanism can work for singleton resources like one gpu on a worker.
The POOL resource type is intended for externally defined resource allotment. The Integer range [0,amount) will be created in a pool of free ids provided to executions. The common use case here is multiple gpus on a system, where concurrent execution on all gpus can take place, but each execution completion frees a different gpu that must be targeted with, for example, the CUDA_VISIBLE_DEVICES env var. The pool takes a resource out of a queue of available ids, and replaces it in the queue when the execution is complete. Since externally defined resources need to be injected into an execution, the depletion of a pool resource applies an automatic execution policy, named 'pool-<name>'.
| Configuration | Accepted and Default Values | Description |
|---|---|---|
| name | string | Resource identifier present on worker |
| type | SEMAPHORE, POOL | Type of resource (default SEMAPHORE) |
| amount | Integer | Resource count depleted by actions |
Example:
```yaml
worker:
  dequeueMatchSettings:
    properties:
      - name: "resource:special-compiler-license"
        value: "1" # only actions which request one compiler license at a time will be accepted
  resources:
    - name: "special-compiler-license"
      amount: 3
```
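A hedged sketch of a POOL resource for the multi-gpu case described above (the name and amount are illustrative): ids 0 through 3 are handed out to executions and returned to the pool on completion:

```yaml
worker:
  resources:
    - name: "gpu"
      type: POOL
      amount: 4   # pool of free ids [0,4), targeted via e.g. CUDA_VISIBLE_DEVICES
```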
Worker CAS
Unless specified, options are only relevant for FILESYSTEM type
| Configuration | Accepted and Default Values | Description |
|---|---|---|
| type | FILESYSTEM, GRPC | Type of CAS used |
| path | String, cache | Local cache location relative to the ‘root’, or absolute |
| maxSizeBytes | Integer, 0 | Limit for contents of files retained from CAS in the cache, value of 0 means to auto-configure to 90% of root/path underlying filesystem space |
| fileDirectoriesIndexInMemory | boolean, false | Determines if the file directories bidirectional mapping should be stored in memory or in sqlite |
| skipLoad | boolean, false | Determines if transient data on the worker should be loaded into CAS on worker startup (affects startup time) |
| target | String, null | For GRPC CAS type, target for external CAS endpoint |
Example:
This definition will create a filesystem-based CAS file cache at the path "cache" relative to the worker root:
```yaml
worker:
  storages:
    - type: FILESYSTEM
      path: "cache"
      maxSizeBytes: 2147483648 # 2 * 1024 * 1024 * 1024
```
This definition elides FILESYSTEM configuration with ‘…’, will read-through an external GRPC CAS supporting the REAPI CAS Services into its storage, and will attempt to write expiring entries into the GRPC CAS (i.e. pushing new entries into the head of a worker LRU list will drop the entries from the tail into the GRPC CAS).
```yaml
worker:
  storages:
    - type: FILESYSTEM
      ...
    - type: GRPC
      target: "cas.external.com:1234"
```
Execution Policies
| Configuration | Accepted and Default Values | Description |
|---|---|---|
| name | String | Execution policy name |
| prioritized | Boolean, false | If true, policy will run before built-in policies |
| executionWrapper | Execution wrapper, containing a path and list of arguments | Execution wrapper, its path and a list of arguments for the wrapper |
Example:
```yaml
worker:
  executionPolicies:
    - name: as-nobody
      prioritized: true
      executionWrapper:
        path: /app/build_buildfarm/as-nobody
        arguments:
          - "-u"
          - "<exec-owner>"
    - name: unshare
      executionWrapper:
        path: /usr/bin/unshare
        arguments:
          - "-n"
          - "-r"
    - name: linux-sandbox
      executionWrapper:
        path: /app/build_buildfarm/linux-sandbox
        arguments:
          # use "--" to signal the end of linux-sandbox args. "--" should always be last!
          - "--"
    - name: test
      executionWrapper:
        path: /YOUR/WRAPPER
        arguments:
          - arg1
          - arg2
          - "<platform-property-name>"
```
arg1 and arg2 are interpreted literally. <platform-property-name> will be substituted with the value of a property named "platform-property-name" from a Command's Platform, or from the requested pool resources for the execution. If a matching property or pool resource is not found for a specified name, the entire wrapper will be discarded and have no effect on the execution.
<exec-owner> is an automatically provided pool resource when execOwner or execOwners is specified, and will contain the value of the execution’s owner selected for exec tree creation.
An execution with as-nobody, unshare, and linux-sandbox execution policies enabled would produce a command line like:
```
/app/build_buildfarm/as-nobody -u <exec-owner> /usr/bin/unshare -n -r /app/build_buildfarm/linux-sandbox -- /YOUR/WRAPPER arg1 arg2 <platform-property-name> ACTION
```
where ACTION is the Command from the remote execution action.