
High-Throughput Optimizations

Enough games. It's time to get serious and prepare your Assemblyline deployment to scan many millions of files.

The following details the configuration required to deploy our current largest production environment, which is the largest Assemblyline 4 deployment we are aware of. We will also discuss the rationale behind these decisions.

Nodes

  • Don't use nodes that are too small because Elastic/Redis can use a lot of resources
  • Minimum: 8 cores / 32 GB
  • What we use: 16 cores / 64 GB
  • The minimum number of nodes required by your cluster is the number of Elastic pods you have
  • We have 12 Elastic pods, so our deployment auto-scales from 12 nodes to 72 nodes

Figure: Nodepool size relative to an average weekday

Ingestion

  • For high-volume ingestion, do not use /api/v4/submit/
  • Use /api/v4/ingest/ instead. This API is built for rate-limiting: if Assemblyline can't keep up, submissions are queued and processed later.
  • If ingestion slows down the UI, the ingestion rate is too high
  • Set separateIngestAPI: true in your values.yml file to spin up dedicated pods for ingestion (see the example below)
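
For illustration, here is a minimal values.yml sketch combining the two ingestion knobs mentioned on this page (separateIngestAPI above, and the ingestAPIInstancesMax cap we reuse in the Auto-Scalers section):

# Spin up dedicated ingest API pods so heavy ingestion does not slow down the UI
separateIngestAPI: true

# Upper bound for the ingest API auto-scaler (see the Auto-Scalers section)
ingestAPIInstancesMax: 50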

File Storage

  • Do not use the provided minio container for file storage
  • It's not that minio is bad; we simply haven't spent any effort making the Helm chart deploy minio correctly
  • Use Azure Blob Storage if you are on AKS, or Amazon S3 if you are on AWS
  • Alternatively, deploy your own minio with redundancy, or any other well-supported S3-compatible file store
  • Don't put your file storage secrets in your values.yml file. Use Kubernetes secrets instead (a sketch follows the example below).

Example:

internalFilestore: false
configuration:
  filestore:
    storage:
      - "azure://<blob_store_name>.blob.core.windows.net/storage?access_key=${FILESTORE_PASSWORD}"
    cache:
      - "azure://<blob_store_name>.blob.core.windows.net/cache?access_key=${FILESTORE_PASSWORD}"

Redis

  • All messages passed to services, as well as the shared memory space of the Dispatcher and Ingester, are stored in Redis
  • Redis is our only component that cannot be scaled
  • You should tweak the RAM / CPU / thread requirements to fit your needs
  • We use the following values in values.yml:
redisVolatileIOThreads: 5
redisVolatileReqCPU: 4
redisVolatileLimCPU: 4
redisVolatileReqRam: 4Gi

redisPersistentIOThreads: 3
redisPersistentReqCPU: 2
redisPersistentLimCPU: 2
redisPersistentReqRam: 8Gi
redisPersistentLimRam: 32Gi

Dispatcher

  • You can change the number of threads the Dispatcher uses
  • Make sure the Dispatcher has a full core reserved and enough RAM
  • NOTE: It's a Python process, so don't give it more than one core
  • We use the following values in values.yml:
dispatcherShutdownGrace: 1800
dispatcherResultThreads: 8
dispatcherFinalizeThreads: 8
dispatcherReqCPU: 1
dispatcherLimCPU: 1
dispatcherReqRam: 2Gi
dispatcherLimRam: 4Gi

Expiry

  • With big data input comes big data deletion
  • We give Expiry more cores and more workers to be able to expire all that data
  • We use the following values in values.yml:
expiryReqCPU: 2
expiryLimCPU: 4
configuration:
  core:
    expiry:
      workers: 50
      delete_workers: 5

Scaling

  • Use cpu_overallocation to make sure the cloud node auto-scaler works. Use a value between 1.05 and 1.10 (105% to 110%).
  • overallocation_node_limit determines the maximum number of nodes your deployment will scale to
  • min_instances determines the minimum number of pods loaded for each service. We use 2 so our reaction time is faster, but that costs more money.
  • cpu_reservation is the fraction of a service's maximum required CPU that Kubernetes will actually reserve for it. The higher the value, the less the services fight for CPU time since their CPU is reserved, but that comes at a higher cost!
  • We use the following values in values.yml:
configuration:
  core:
    scaler:
      cpu_overallocation: 1.05
      overallocation_node_limit: 72
      service_defaults:
        min_instances: 2
  services:
    cpu_reservation: 0.7

Auto-Scalers

  • The Scaler component is dedicated to managing services
  • To make sure you have enough core components to handle the service load, you can adjust the maximum number of instances for each core component in your values.yml file
  • We use the following values in values.yml:
dispatcherInstancesMax: 25
ingestAPIInstancesMax: 50
serviceServerInstancesMax: 50
dispatcherTargetUsage: 40

Datastore

  • Since you'll have more data, you'll need more Elastic pods (replicas)
  • To make the most of those pods, they will need more CPU. Match the CPU request and limit so Elastic does not fight with services for CPU time.
  • Since the indices will be larger, Elastic will need more RAM to process queries
  • With more Elastic nodes, you will also need more shards so that every node gets enough work and you take full advantage of distributed computing
  • If you've deployed your cluster before adjusting the shard counts, you'll have to use the fix_shards CLI command to edit the shard count on the affected indices
  • Our biggest production system has 4.7 TB of indices holding 1.8 billion documents
  • We use the following values in values.yml:
elasticEmptyResultShards: 16
elasticFileShards: 16
elasticResultShards: 36
elasticSubmissionShards: 24
datastore:
  replicas: 12
  resources:
    requests:
      cpu: 4
      memory: 12Gi
    limits:
      cpu: 4
      memory: 20Gi