Minimal required:

backplane:
  redisUri: "redis://localhost:6379"
  queues:
    - name: "cpu"
      properties:
        - name: "min-cores"
          value: "*"
        - name: "max-cores"
          value: "*"
worker:
  publicName: "localhost:8981"

Configuration files also support includes via the !include tag:

backplane: !include "backplane.yml"
server: !include "server.yml"
worker: !include "worker.yml"

The configuration can be provided to the server and worker as a CLI argument or through the environment variable CONFIG_PATH. For an example configuration containing all of the configuration values, see examples/config.yml.

All Configurations

Common

| Configuration | Accepted and Default Values | Command Line Argument | Description |
|---|---|---|---|
| digestFunction | SHA256, SHA1 | | Digest function for this implementation |
| defaultActionTimeout | Integer, 600 | | Default timeout value for an action (seconds) |
| maximumActionTimeout | Integer, 3600 | | Maximum allowed action timeout (seconds) |
| maxEntrySizeBytes | Long, 2147483648 | | Maximum size of a single blob accepted (bytes) |
| prometheusPort | Integer, 9090 | --prometheus_port | Listening port of the Prometheus metrics endpoint |
| allowSymlinkTargetAbsolute | boolean, false | | Permit inputs to contain symlinks with absolute path targets |

Example:

digestFunction: SHA1
defaultActionTimeout: 1800
maximumActionTimeout: 1800
prometheusPort: 9090
server:
  ...
worker:
  ...

Server

| Configuration | Accepted and Default Values | Environment Var | Description |
|---|---|---|---|
| instanceType | SHARD | | Type of implementation (SHARD is the only one supported) |
| name | String, shard | | Implementation name |
| publicName | String, DERIVED:port | INSTANCE_NAME | Host:port of the GRPC server, required to be accessible by all servers |
| actionCacheReadOnly | boolean, false | | Allow/Deny writing to action cache |
| port | Integer, 8980 | | Listening port of the GRPC server |
| bindAddress | String | | Listening address of the GRPC server, default for Java Grpc (all interface addresses) if unspecified |
| maxInboundMessageSizeBytes | Integer, 0 | | Byte size limit of GRPC messages, default for Java Grpc if unspecified or 0 |
| maxInboundMetadataSize | Integer, 0 | | Byte size limit of GRPC metadata, default for Java Grpc if unspecified or 0 |
| casWriteTimeout | Integer, 3600 | | CAS write timeout (seconds) |
| bytestreamTimeout | Integer, 3600 | | Byte Stream write timeout (seconds) |
| sslCertificatePath | String, null | | Absolute path of the SSL certificate (if TLS used) |
| sslPrivateKeyPath | String, null | | Absolute path of the SSL private key (if TLS used) |
| runDispatchedMonitor | boolean, true | | Enable an agent to monitor the operation store to ensure that dispatched operations with expired worker leases are requeued |
| dispatchedMonitorIntervalSeconds | Integer, 1 | | Dispatched monitor’s lease expiration check interval (seconds) |
| runOperationQueuer | boolean, true | | Acquire execute request entries cooperatively from an arrival queue on the backplane |
| ensureOutputsPresent | boolean, true | | Ensure ActionResult outputs are present in the CAS on ActionCache::GetActionResults. If any outputs are missing a cache miss is returned |
| maxCpu | Integer, 0 | | Maximum number of CPU cores that any min/max-cores property may request (0 = unlimited) |
| maxRequeueAttempts | Integer, 5 | | Maximum number of requeue attempts for an operation |
| useDenyList | boolean, true | | Allow usage of a deny list when looking up actions and invocations (for cache only it is recommended to disable this check) |
| mergeExecutions | boolean, true | | Merge executions with matching action_digest by default (overridable with MERGE_EXECUTIONS correlatedInvocationsId URL param) |
| grpcTimeout | Integer, 3600 | | GRPC request timeout (seconds) |
| executeKeepaliveAfterSeconds | Integer, 60 | | Execute keep alive (seconds) |
| recordBesEvents | boolean, false | | Allow recording of BES events |
| clusterId | String, local | | Buildfarm cluster ID |
| cloudRegion | String, us-east_1 | | Deployment region in the cloud |
| gracefulShutdownSeconds | Integer, 0 | | Time in seconds to allow for connections in flight to finish when shutdown signal is received |

Example:

server:
  instanceType: SHARD
  name: shard
  actionCacheReadOnly: true
  recordBesEvents: true
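
As a further sketch, the listener and TLS options from the Server table can be combined as shown here. The bind address, certificate path, and key path are hypothetical placeholders chosen for illustration, not values from this document; substitute your own.

```yaml
server:
  port: 8980
  bindAddress: "0.0.0.0"        # assumption: listen on all IPv4 interfaces
  # Hypothetical absolute paths; only relevant if TLS is used.
  sslCertificatePath: "/etc/buildfarm/tls/server.crt"
  sslPrivateKeyPath: "/etc/buildfarm/tls/server.key"
  gracefulShutdownSeconds: 30   # give in-flight connections 30 seconds to finish
```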

GRPC Metrics

| Configuration | Accepted and Default Values | Description |
|---|---|---|
| enabled | boolean, false | Publish basic GRPC metrics to a Prometheus endpoint |
| provideLatencyHistograms | boolean, false | Publish detailed, more expensive to calculate, metrics |
| labelsToReport | List of Strings, [] | Include custom metrics labels in Prometheus metrics |

Example:

server:
  grpcMetrics:
    enabled: false
    provideLatencyHistograms: false
    labelsToReport: []

Server Caches

| Configuration | Accepted and Default Values | Description |
|---|---|---|
| directoryCacheMaxEntries | Long, 64 * 1024 | The max number of entries that the directory cache will hold. |
| commandCacheMaxEntries | Long, 64 * 1024 | The max number of entries that the command cache will hold. |
| digestToActionCacheMaxEntries | Long, 64 * 1024 | The max number of entries that the digest-to-action cache will hold. |
| recentServedExecutionsCacheMaxEntries | Long, 64 * 1024 | The max number of entries that the executions cache will hold. |

Example:

server:
  caches:
    directoryCacheMaxEntries: 10000
    commandCacheMaxEntries: 10000
    digestToActionCacheMaxEntries: 10000
    recentServedExecutionsCacheMaxEntries: 10000

Admin

| Configuration | Accepted and Default Values | Description |
|---|---|---|
| deploymentEnvironment | String, AWS, GCP | Specify deployment environment in the cloud |
| clusterEndpoint | String, grpc://localhost | Buildfarm cluster endpoint for Admin use (this is a full buildfarm endpoint) |

Example:

server:
  admin:
    deploymentEnvironment: AWS
    clusterEndpoint: "grpc://localhost"

Metrics

| Configuration | Accepted and Default Values | Description |
|---|---|---|
| publisher | String, aws, gcp, log | Specify publisher type for sending metadata |
| logLevel | String, INFO, FINEST | Specify log level (“log” publisher only, all Java util logging levels are allowed here) |
| topic | String, test | Specify SNS topic name for cloud publishing (“aws” publisher only) |
| topicMaxConnections | Integer, 1000 | Specify maximum number of connections allowed for cloud publishing (“aws” publisher only) |
| secretName | String, test | Specify secret name to pull SNS permissions from (“aws” publisher only) |

Example:

server:
  metrics:
    publisher: log
    logLevel: INFO

server:
  metrics:
    publisher: aws
    topic: buildfarm-metadata-test
    topicMaxConnections: 1000
    secretName: buildfarm-secret

Correlated Invocations Index Scopes

A set of names used to extract fields from each correlatedInvocationsId, which must be a valid URI. Each extracted field creates an index which will contain the fragment uuid.

| Entry | URI Component |
|---|---|
| host | authority:host |
| username | userinfo:username |
| * (any key) | query[key] (one index per specified pair) |

Example:

server:
  correlatedInvocationsIndexScopes: !!set
    ? host
    ? username

Redis Backplane

| Configuration | Accepted and Default Values | Environment Var | Command Line Argument | Description |
|---|---|---|---|---|
| type | SHARD | | | Type of backplane. Currently, the only implementation is SHARD utilizing Redis |
| redisUri | String, redis://localhost:6379 | REDIS_URI | --redis_uri | Redis cluster endpoint. This must be a single URI. This can embed a username/password per RFC-3986 Section 3.2.1 and this will take precedence over redisPassword and redisPasswordFile. |
| redisUsername | String, null | | | Redis username, if applicable |
| redisPassword | String, null | | | Redis password, if applicable |
| redisPasswordFile | String, null | | | File to read for a Redis password. If specified, this takes precedence over redisPassword |
| redisNodes | List of Strings, null | | | List of individual Redis nodes, if applicable |
| redisCertificateAuthorityFile | String, null | | | File to read for Redis connection SSL certificate authority. If specified, this is the exclusive certificate authority bundle for Redis connections. Must be a PEM file. |
| redisAuthWithGoogleCredentials | boolean, false | | | If true, use GOOGLE_DEFAULT_CREDENTIALS environment variable to find a service account to authenticate to Redis. Useful for Google Memorystore Redis Cluster |
| jedisPoolMaxTotal | Integer, 4000 | | | The size of the Redis connection pool |
| jedisPoolMaxIdle | Integer, 8 | | | The maximum number of idle connections to Redis |
| jedisPoolMinIdle | Integer, 0 | | | The guaranteed number of idle connections to Redis |
| jedisTimeBetweenEvictionRunsMillis | Long, 30000 | | | The period for detecting idle connections to Redis. If you set -1, detection will be disabled. |
| workersHashName | String, Workers | | | Redis key used to store a hash of registered workers |
| workerChannel | String, WorkerChannel | | | Redis pubsub channel key where changes of the cluster membership are announced |
| actionCachePrefix | String, ActionCache | | | Redis key prefix for all ActionCache entries |
| actionCacheExpire | Integer, 2419200 | | | The TTL maintained for ActionCache entries, refreshed on getActionResult hit |
| actionBlocklistPrefix | String, ActionBlocklist | | | Redis key prefix for all blocklisted actions, which are rejected |
| actionBlocklistExpire | Integer, 3600 | | | The TTL maintained for action blocklist entries |
| invocationBlocklistPrefix | String, InvocationBlocklist | | | Redis key prefix for blocklisted invocations, suffixed with a tool invocation ID |
| operationPrefix | String, Operation | | | Redis key prefix for all operations, suffixed with the operation’s name |
| operationExpire | Integer, 604800 | | | The TTL maintained for all executions, updated on each modification |
| actionExecutionExpire | Integer, 21600 | | | The TTL maintained for all action -> execution mappings for mergeExecutions |
| preQueuedOperationsListName | String, {Arrival}:PreQueuedOperations | | | Redis key used to store a list of ExecuteEntry awaiting transformation into QueueEntry |
| processingListName | String, {Arrival}:ProcessingOperations | | | Redis key of a list used to ensure reliable processing of arrival queue entries with operation watch monitoring |
| processingPrefix | String, Processing | | | Redis key prefix for operations which are being dequeued from the arrival queue |
| processingTimeoutMillis | Integer, 20000 | | | Delay (in ms) used to populate processing operation entries |
| queuedOperationsListName | String, {Execution}:QueuedOperations | | | Redis key used to store a list of QueueEntry awaiting execution by workers |
| dispatchingPrefix | String, Dispatching | | | Redis key prefix for operations which are being dequeued from the ready to run queue |
| dispatchingTimeoutMillis | Integer, 10000 | | | Delay (in ms) used to populate dispatching operation entries |
| dispatchedOperationsHashName | String, DispatchedOperations | | | Redis key of a hash of operation names to the worker lease for its execution, which are monitored by the dispatched monitor |
| operationChannelPrefix | String, OperationChannel | | | Redis pubsub channel prefix suffixed by an operation name |
| casPrefix | String, ContentAddressableStorage | | | Redis key prefix suffixed with a blob digest that maps to a set of workers with that blob’s availability |
| casExpire | Integer, 604800 | | | The TTL maintained for CAS entries, which is refreshed on any read access of the blob |
| subscribeToBackplane | boolean, true | | | Enable an agent of the backplane client which subscribes to worker channel and operation channel events. If disabled, responsiveness of watchers and CAS are reduced |
| runFailsafeOperation | boolean, true | | | Enable an agent in the backplane client which monitors watched operations and ensures they are in a known maintained, or expirable state |
| maxQueueDepth | Integer, 100000 | | | Maximum length that the ready to run queue is allowed to reach to control an arrival flow for execution |
| maxPreQueueDepth | Integer, 1000000 | | | Maximum length that the arrival queue is allowed to reach to control load on the Redis cluster |
| priorityQueue | boolean, false | | | Priority queue type allows prioritizing operations based on Bazel’s --remote_execution_priority= flag |
| timeout | Integer, 10000 | | | Default timeout |
| maxInvocationIdTimeout | Integer, 604800 | | | Maximum TTL (time-to-live, in seconds) of invocationId keys in RedisBackplane |
| maxAttempts | Integer, 20 | | | Maximum number of execution attempts |
| connectionValidatedOnBorrow | boolean, false | | | Whether to validate Redis connections when borrowing from the pool |

Example:

backplane:
  type: SHARD
  redisUri: "redis://localhost:6379"
  priorityQueue: true
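
A hedged sketch of an authenticated Redis connection follows, using only keys from the Redis Backplane table. The host name, username, and password file path are hypothetical. Credentials embedded in redisUri would take precedence over the separate password settings, so this sketch keeps them out of the URI.

```yaml
backplane:
  type: SHARD
  redisUri: "redis://redis.example.internal:6379"   # hypothetical host
  redisUsername: "buildfarm"                         # hypothetical user
  # Takes precedence over redisPassword if both are specified.
  redisPasswordFile: "/run/secrets/redis-password"   # hypothetical path
  jedisPoolMaxTotal: 4000                            # documented default pool size
  connectionValidatedOnBorrow: true                  # validate connections taken from the pool
```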

Execution Queues

| Configuration | Accepted and Default Values | Description |
|---|---|---|
| name | String | Name of the execution queue (ex: cpu, gpu) |
| allowUnmatched | boolean, true | |
| properties | List of name/value pairs | Any specification of min/max-cores will be allowed to support CPU controls and worker resource delegation |

Example:

backplane:
  type: SHARD
  redisUri: "redis://localhost:6379"
  queues:
    - name: "cpu"
      allowUnmatched: true
      properties:
        - name: "min-cores"
          value: "*"
        - name: "max-cores"
          value: "*"

Worker

| Configuration | Accepted and Default Values | Environment Var | Description |
|---|---|---|---|
| port | Integer, 8981 | | Listening port of the worker |
| publicName | String, DERIVED:port | INSTANCE_NAME | Host:port of the GRPC server, required to be accessible by all servers |
| root | String, /tmp/worker | | Path for all operation content storage |
| inlineContentLimit | Integer, 1048567 | | Total size in bytes of inline content for action results, output files, stdout, stderr content |
| operationPollPeriod | Integer, 1 | | Period between poll operations at any stage |
| executeStageWidth | Integer, 0 | EXECUTION_STAGE_WIDTH | Number of CPU cores available for execution (0 = system available cores) |
| executeStageWidthOffset | Integer, 0 | | Offset number of CPU cores available for execution (to allow for use by other processes) |
| inputFetchStageWidth | Integer, 0 | | Number of concurrently available slots to fetch inputs (0 = system calculated based on CPU cores) |
| inputFetchDeadline | Integer, 60 | | Limit on time (seconds) for input fetch stage to fetch inputs |
| reportResultStageWidth | Integer, 1 | | Number of concurrently available slots to write results and clean up execution directories |
| linkExecFileSystem | boolean, true | | Use hard links instead of file copies to populate execution directories. Disable on Windows to compensate for shared hard-link deletion semantics for running executables. |
| linkInputDirectories | boolean, true | | Use an input directory creation strategy which creates a single directory tree at the highest level containing no output paths of any kind, and symlinks that directory into an action’s execroot, saving large amounts of time spent manufacturing the same read-only input hierarchy over multiple actions’ executions |
| execOwner | String, null | | Create exec trees containing directories that are owned by this user |
| execOwners | List of Strings | | Create exec trees containing directories that are owned by these users, unique to each execution. Must contain at least the sum of {inputFetchStageWidth, executeStageWidth, reportResultStageWidth} owners. If this option is specified and not empty, execOwner is ignored |
| hexBucketLevels | Integer, 0 | | Number of levels to create for directory storage by leading byte of the hash |
| defaultMaxCores | Integer, 0 | | Constrain all executions to this logical core count unless otherwise specified via min/max-cores (0 = no limit) |
| limitGlobalExecution | boolean, false | | Constrain all executions to a pool of logical cores specified in executeStageWidth |
| onlyMulticoreTests | boolean, false | | Only permit tests to exceed the default cores value for their min/max-cores range specification (only works with non-zero defaultMaxCores) |
| allowBringYourOwnContainer | boolean, false | | Enable execution in a custom Docker container |
| errorOperationRemainingResources | boolean, false | | |
| errorOperationOutputSizeExceeded | boolean, false | | Operations which produce single output files which exceed maxEntrySizeBytes will fail with a violation type which implies a user error. When disabled, the violation will indicate a transient error, with the action blocklisted. |
| linkedInputDirectories | List of Strings, ^(?!external$).*$ | | A list of regular expressions matching input directories which will be subject to the effects of linkInputDirectories setting |
| gracefulShutdownSeconds | Integer, 0 | | Time in seconds to allow for operations in flight to finish when shutdown signal is received |
| createSymlinkOutputs | boolean, false | | Creates SymlinkNodes for symbolic links discovered in output paths for actions. No verification of the symlink target path occurs. Buildstream, for example, requires this. |
| zstdBufferPoolSize | Integer, 2048 | | Specifies the maximum number of zstd data buffers that may be in use concurrently by the filesystem CAS. Increase to improve compressed blob throughput, decrease to reduce memory usage. |
| persistentWorkerActionMnemonicAllowlist | Set, "*" | | Controls which Action Mnemonics can run with a persistent worker. Use * for wildcard. |

Example:

worker:
  port: 8981
  publicName: "localhost:8981"
  linkedInputDirectories:
    - "^path/to/common/directory"
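
The execOwners description requires at least as many owners as the sum of the three stage widths. A minimal worked sketch, assuming explicit stage widths of 2, 4 and 1 and hypothetical account names:

```yaml
worker:
  inputFetchStageWidth: 2
  executeStageWidth: 4
  reportResultStageWidth: 1
  # 2 + 4 + 1 = 7, so at least seven owners are needed.
  # The account names below are hypothetical.
  execOwners:
    - "exec-user-1"
    - "exec-user-2"
    - "exec-user-3"
    - "exec-user-4"
    - "exec-user-5"
    - "exec-user-6"
    - "exec-user-7"
```

Because execOwners is non-empty here, any execOwner value would be ignored.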

Capabilities

| Configuration | Accepted and Default Values | Description |
|---|---|---|
| cas | boolean, true | Enables worker to be a shard of the CAS |
| execution | boolean, true | Enables worker to participate in execution pool |

Example:

worker:
  capabilities:
    cas: true
    execution: true

Sandbox Settings

Sandbox use can be configured by the client via exec_properties. However, it is sometimes preferable to enable it via the buildfarm config to prevent users from running actions outside the sandbox.

| Configuration | Accepted and Default Values | Description |
|---|---|---|
| alwaysUseSandbox | boolean, false | Whether or not to always use the sandbox when running actions. It may be preferred to enforce sandbox usage rather than rely on client selection. |
| alwaysUseAsNobody | boolean, false | Whether or not to always use the as-nobody wrapper when running actions. It may be preferred to enforce this wrapper instead of relying on client selection. |
| alwaysUseCgroups | boolean, true | Whether or not to use cgroups when sandboxing actions. It may be preferred to enforce cgroup usage. |
| alwaysUseTmpFs | boolean, false | Whether or not to always use tmpfs when using the sandbox. It may be preferred to enforce sandbox usage rather than rely on client selection. |
| additionalWritePaths | List of Strings, [] | Additional paths the sandbox is allowed to write to. Suggestions may include: /tmp, /dev/shm |
| tmpFsPaths | List of Strings, [] | Additional paths the sandbox uses for tmpfs. Suggestions may include: /tmp |
| selectForBlockNetwork | boolean, false | If the action requires “block network” use the sandbox to fulfill this request. Otherwise, there may be no alternative solution and the “block network” request will be ignored / implemented differently. |
| selectForTmpFs | boolean, false | If the action requires “tmpfs” use the sandbox to fulfill this request. Otherwise, there may be no alternative solution and the “tmpfs” request will be ignored / implemented differently. |

Example:

worker:
  sandboxSettings:
    alwaysUseSandbox: true
    alwaysUseAsNobody: false
    alwaysUseCgroups: true
    alwaysUseTmpFs: true
    additionalWritePaths: []
    tmpFsPaths: []
    selectForBlockNetwork: false
    selectForTmpFs: false

Note: In order for these settings to take effect, you must also configure limitGlobalExecution: true.
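
Putting the note into practice, a minimal sketch that enables limitGlobalExecution alongside the sandbox settings could look like this. The write and tmpfs paths are the suggestions listed in the Sandbox Settings table; whether they suit a given deployment is an assumption.

```yaml
worker:
  limitGlobalExecution: true   # required for sandboxSettings to take effect
  sandboxSettings:
    alwaysUseSandbox: true
    alwaysUseTmpFs: true
    additionalWritePaths:
      - "/tmp"
      - "/dev/shm"
    tmpFsPaths:
      - "/tmp"
```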

Dequeue Match

| Configuration | Accepted and Default Values | Description |
|---|---|---|
| allowUnmatched | boolean, false | |
| properties | List of name/value pairs | Pairs of provisions available to match against action properties |

Example:

worker:
  dequeueMatchSettings:
    allowUnmatched: false
    properties:
      - name: "gpu"
        value: "nvidia RTX 2090"

Resources

A list of limited resources that are available to the worker to be depleted by actions which execute containing a "resource:<name>": "N" property. The dequeueMatchSettings may also further limit executions to contain "resource:<name>" in properties, with a specific limited resource count as the only accepted value for the action property. The use case here is one where executions are not allowed to request any value other than the one specified. There are currently no operators for asserting that an execution requests 'less than' a particular number of resources.

The default resource type is SEMAPHORE. The expected use case is that a resource is internally allocated and managed by an execution, and the exhaustion prevents executions from starting which would block or fault if they could not consume the resource. Common examples include licensed software with tokens, and this mechanism can work for singleton resources like one gpu on a worker.

The POOL resource type is intended for externally defined resource allotment. The Integer range [0,amount) will be created in a pool of free ids. The common use case here is multiple gpus on a system, where the concurrent execution on all gpus can take place, but each execution completion frees a different gpu that must be targeted with, for example, the CUDA_VISIBLE_DEVICES env var. The pool takes a resource out of a queue of available ids, and replaces it in the queue when the execution is complete. Since externally defined resources need to be injected into an execution, the depletion of a pool resource applies an automatic execution policy, named 'pool-<name>', and supplies transformations for both '<name>', expanded to all ids in multiple arguments, and '<name-N>' for 0-N addressable ids, up to the amount requested. Subsequent literal arguments can provide a delimiter for the wrapper.

| Configuration | Accepted and Default Values | Description |
|---|---|---|
| name | String | Resource identifier present on worker |
| type | SEMAPHORE, POOL | Resource type (SEMAPHORE is the default) |
| amount | Integer | Resource count depleted by actions |

Example:

worker:
  dequeueMatchSettings:
    properties:
      - name: "resource:special-compiler-license"
        value: "1" # only actions which request one compiler license at a time will be accepted
  resources:
    - name: "special-compiler-license"
      amount: 3
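
For the POOL type, the following is a rough sketch only, built from the description of pool resources given earlier in this section. The resource name gpu, the amount, and the wrapper path are hypothetical, and defining the pool-prefixed policy under executionPolicies is an assumption about how the automatic policy is supplied rather than confirmed behavior.

```yaml
worker:
  resources:
    - name: "gpu"          # hypothetical resource name
      type: POOL
      amount: 4            # pool of free ids 0, 1, 2, 3
  executionPolicies:
    - name: pool-gpu       # assumption: policy applied when the gpu pool is depleted
      executionWrapper:
        path: /YOUR/GPU_WRAPPER   # hypothetical wrapper, e.g. one that sets CUDA_VISIBLE_DEVICES
        arguments:
          - "<gpu>"        # expected to expand to the ids allotted to this execution
```

An action would then request ids from the pool with a "resource:gpu" property carrying the count it needs.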

Worker CAS

Unless specified, options are only relevant for FILESYSTEM type

| Configuration | Accepted and Default Values | Description |
|---|---|---|
| type | FILESYSTEM, GRPC | Type of CAS used |
| path | String, cache | Local cache location relative to the ‘root’, or absolute |
| maxSizeBytes | Integer, 0 | Limit for contents of files retained from CAS in the cache, value of 0 means to auto-configure to 90% of root/path underlying filesystem space |
| fileDirectoriesIndexInMemory | boolean, false | Determines if the file directories bidirectional mapping should be stored in memory or in sqlite |
| skipLoad | boolean, false | Determines if transient data on the worker should be loaded into CAS on worker startup (affects startup time) |
| target | String, null | For GRPC CAS type, target for external CAS endpoint |

Example:

This definition will create a filesystem-based CAS file cache at the path "cache" (relative to the worker root) that will reject entries over 2GiB in size, and will expire LRU blobs when the aggregate size of all blobs exceeds 2GiB in order to insert additional entries.

worker:
  storages:
    - type: FILESYSTEM
      path: "cache"
      maxSizeBytes: 2147483648 # 2 * 1024 * 1024 * 1024

This definition elides FILESYSTEM configuration with ‘…’, will read-through an external GRPC CAS supporting the REAPI CAS Services into its storage, and will attempt to write expiring entries into the GRPC CAS (i.e. pushing new entries into the head of a worker LRU list will drop the entries from the tail into the GRPC CAS).

worker:
  storages:
    - type: FILESYSTEM
      ...
    - type: GRPC
      target: "cas.external.com:1234"

Execution Policies

| Configuration | Accepted and Default Values | Description |
|---|---|---|
| name | String | Execution policy name |
| prioritized | Boolean, false | If true, policy will run before built-in policies |
| executionWrapper | Execution wrapper, containing a path and list of arguments | Execution wrapper, its path and a list of arguments for the wrapper |

Example:

worker:
  executionPolicies:
    - name: as-nobody
      prioritized: true
      executionWrapper:
        path: /app/build_buildfarm/as-nobody
        arguments:
          - "-u"
          - "<exec-owner>"
    - name: unshare
      executionWrapper:
        path: /usr/bin/unshare
        arguments:
          - "-n"
          - "-r"
    - name: linux-sandbox
      executionWrapper:
        path: /app/build_buildfarm/linux-sandbox
        arguments:
          # use "--" to signal the end of linux-sandbox args. "--" should always be last!
          - "--"
    - name: test
      executionWrapper:
        path: /YOUR/WRAPPER
        arguments:
          - arg1
          - arg2
          - "<platform-property-name>"

arg1 and arg2 are interpreted literally. <platform-property-name> will be substituted with the value of a property named "platform-property-name" from a Command’s Platform or the requested pool resources for the execution. If a matching property or pool resource is not found for a specified name, the entire wrapper will be discarded and have no effect on the execution.

<exec-owner> is an automatically provided pool resource when execOwner or execOwners is specified, and will contain the value of the execution’s owner selected for exec tree creation.

An execution with the as-nobody, unshare, linux-sandbox, and test execution policies enabled would produce a command line like:

/app/build_buildfarm/as-nobody -u <exec-owner> /usr/bin/unshare -n -r /app/build_buildfarm/linux-sandbox -- /YOUR/WRAPPER arg1 arg2 <platform-property-name> ACTION

where ACTION is the Command from the remote execution action.
