Audience and purpose

This document targets customer administrators and infrastructure engineers designing a scalable and highly available IceWarp deployment.

Goals:

Explain the basic principles of HA and load balancing for IceWarp.
Provide a minimal example suitable for lab testing and clarity.
Provide a full HA reference example (production-oriented) that demonstrates where HA must be implemented outside IceWarp.
Provide a concrete frontend load balancer example for all required protocols/ports, using Keepalived + Linux IPVS (LVS) with fwmarks, preserving real client source IP visibility on backend IceWarp servers for logging and filtering purposes.

1. Core principles: What IceWarp clustering is (and is not)

1.1 IceWarp clustering model: 1 + N shared-data application nodes

IceWarp clustering is supported by running multiple IceWarp application nodes that all share the same backend data and state:

Shared configuration
Shared datadir
Shared archivedir
Shared SQL databases (all nodes connect to the same database layer)
Shared key-value store for PHP sessions (so session state is accessible regardless of which node receives the request)

This is a “shared everything” design: you scale the IceWarp application tier horizontally by adding nodes, while keeping data consistent by sharing the backends.

1.2 Master vs slave nodes (recommended traffic distribution)

An IceWarp cluster uses:

one Master node
one or more Slave nodes

Important operational note:

The Master node can process user requests, but it also performs administrative and potentially blocking operations, such as domain/user creation and deletion or directory cache updates.
Recommended policy:
- send normal user requests to slave nodes
- keep the master as a “last resort” node (fallback capacity)

1.3 What IceWarp does not provide

IceWarp itself does not provide:

data sharding
data replication
automatic HA for shared filesystem
automatic HA for SQL databases
automatic HA for the key-value/session store

Therefore, production HA depends on designing reliable backend layers:

filesystem storage (maildir/archivedir/config)
SQL database backend
key-value/session store backend
frontend load balancing with redundancy

2. Minimal example (lab / clarity; no backend HA)

2.1 Purpose

This example is intentionally minimal to illustrate the core mechanism (shared backends + multiple IceWarp nodes). It is suitable for labs, demos, and understanding request distribution.

2.2 Topology

IceWarp

iw-master (Master)
iw-slave-1 (Slave)

Backends (single-instance; no HA)

storage-1 (NFS)
db-1 (MySQL)
kv-1 (KeyDB/Redis-compatible key-value store for PHP sessions)

Frontend

lb-1 (single load balancer)

2.3 Behavior

Both IceWarp nodes mount the same NFS share at the same path(s).
Both IceWarp nodes connect to the same SQL database and session store.
The load balancer distributes incoming connections so users may land on either node.

2.4 Limitations (not production)

Storage, DB, KV, and the load balancer are single points of failure.
No automated failover.

3. Full HA reference example (recommended production architecture)

3.1 IceWarp application tier

iw-master (Master; kept out of primary pool or configured as low-weight fallback)
iw-app-1 (primary)
iw-app-2 (primary)
iw-app-3 (primary)
optional additional application servers

All nodes share:

the same filesystem content
the same SQL databases (via a stable HA endpoint)
the same session store (via a stable HA endpoint)

3.2 Storage tier: shared filesystem + HA outside IceWarp

IceWarp nodes require a shared filesystem for configuration/mail/archive. In production, ensure the filesystem layer has its own HA plan (replication and failover), because IceWarp does not replicate data.

(Implementation details vary. One example approach, ZFS replication plus NFS failover, is described in Appendix B.)

3.3 SQL tier: 3 MySQL servers + 2 ProxySQL load balancers + Keepalived VIP

Any MySQL-compatible cloud service or on-premises MySQL HA cluster may be used. Example design:

MySQL:
- mysql-1 (primary/write source)
- mysql-2 (replica)
- mysql-3 (replica)
ProxySQL:
- proxysql-1 (master)
- proxysql-2 (backup)
Keepalived provides a stable SQL endpoint:
- mysql-vip4 (IPv4 VIP)
- mysql-vip6 (IPv6 VIP)

All IceWarp nodes connect to the ProxySQL VIP(s), not directly to individual MySQL nodes.

3.4 Key-value/session tier: 2 KeyDB multi-master + Keepalived VIP

Any Redis-compatible cloud service or on-premises HA key-value store may be used. Example design:

KeyDB:
- keydb-1 (multi-master)
- keydb-2 (multi-master)
Keepalived provides:
- kv-vip4 (IPv4 VIP)
- kv-vip6 (IPv6 VIP)

All IceWarp nodes connect to the KeyDB VIP(s) so sessions are shared and node-independent.

4. Frontend load balancing options (L4) and required protocols/ports

4.1 Services/ports that are typically load balanced

The frontend usually supports these TCP services:

SMTP: 25
SMTP submission: 587
SMTPS: 465
POP3: 110
IMAP: 143
POP3S: 995
IMAPS: 993
XMPP: 5222
XMPPS: 5223
HTTP: 80
HTTPS: 443

4.2 Recommended frontend approach in this document

This document uses:

2 load balancers: lb-1 and lb-2
Keepalived providing:
- public-vip4 (IPv4)
- public-vip6 (IPv6)
Linux IPVS (LVS) load balancing using fwmarks
Health checks per service port

Traffic policy:

Slaves receive the bulk of user traffic (high weight).
The master node remains a fallback (low weight).
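
As a rough illustration of what "high weight" and "low weight" mean in practice, the sketch below computes each node's expected share of new connections under weighted round robin. The weights (1 for the master, 640 per slave) are the ones used in Appendix A.4; the node names and the two-slave pool are assumptions for the example.

```python
def expected_share(weights):
    """Return each server's long-run share of new connections under weighted round robin."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one server needs a positive weight")
    return {name: weight / total for name, weight in weights.items()}


# Weights from Appendix A.4: master = 1, each slave = 640 (node names are illustrative).
pool = {"iw-master": 1, "iw-app-1": 640, "iw-app-2": 640}
shares = expected_share(pool)
for name, share in shares.items():
    print(f"{name}: {share:.4%}")

# If every slave fails its health check, only the master remains in the pool.
fallback = expected_share({"iw-master": 1})
print(fallback)
```

With these weights the master receives roughly 1 in 1281 new connections (about 0.08%) while all slaves are healthy, and all of them once the slaves drop out of the pool.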

4.3 Real client source IP visibility

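This design preserves real client source IP visibility at the backend IceWarp nodes: with LVS/NAT the load balancer rewrites only the destination address of inbound packets, so the source address stays that of the client.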

Important note:

Outbound SNAT may be present on the load balancers to provide internet egress for private servers (updates, repos, etc.).
That outbound SNAT must be scoped so it does not SNAT inbound VIP-forwarded connections.

See Appendix A for configuration patterns and a validation procedure.

Appendix A — Frontend load balancer configuration (Keepalived + IPVS fwmarks, dual-stack)

This appendix provides a concrete and reproducible configuration approach for:

dual-stack VIPs
packet marking rules
Keepalived/IPVS fwmark virtual servers
required ports
validation of real source IP visibility

A.1 Port / fwmark mapping

Rule: the Keepalived/IPVS fwmark is the decimal port number; the iptables MARK value is the same number written in hexadecimal.

Service	Port	Keepalived fwmark (decimal)	iptables MARK value (hex)
SMTP	25	25	0x19
SMTP submission	587	587	0x24b
SMTPS	465	465	0x1d1
POP3	110	110	0x6e
IMAP	143	143	0x8f
POP3S	995	995	0x3e3
IMAPS	993	993	0x3e1
XMPP	5222	5222	0x1466
XMPPS	5223	5223	0x1467
HTTP	80	80	0x50
HTTPS	443	443	0x1bb
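
The mapping rule can be checked mechanically. The following sketch recomputes the table from the port list; the function name fwmark_pair is made up for this example.

```python
SERVICE_PORTS = {
    "SMTP": 25, "SMTP submission": 587, "SMTPS": 465, "POP3": 110, "IMAP": 143,
    "POP3S": 995, "IMAPS": 993, "XMPP": 5222, "XMPPS": 5223, "HTTP": 80, "HTTPS": 443,
}


def fwmark_pair(port):
    """Return (Keepalived fwmark in decimal, iptables MARK value in hex) for a TCP port."""
    if not 0 < port < 65536:
        raise ValueError(f"not a TCP port: {port}")
    return port, hex(port)


# Print the same four columns as the table above, tab-separated.
for service, port in SERVICE_PORTS.items():
    decimal_mark, hex_mark = fwmark_pair(port)
    print(f"{service}\t{port}\t{decimal_mark}\t{hex_mark}")
```

Adding a service later then only requires one new entry in the port list; both mark notations follow from it.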

A.2 Packet marking rules (IPv4)

Apply on both load balancers.

Assumptions:

<PUBLIC_VIP4> is your public IPv4 VIP address (e.g., 203.0.113.10)

# SMTP
iptables -t mangle -A PREROUTING -d <PUBLIC_VIP4>/32 -p tcp --dport 25   -j MARK --set-xmark 0x19/0xffffffff

# SMTP submission / SMTPS
iptables -t mangle -A PREROUTING -d <PUBLIC_VIP4>/32 -p tcp --dport 587  -j MARK --set-xmark 0x24b/0xffffffff
iptables -t mangle -A PREROUTING -d <PUBLIC_VIP4>/32 -p tcp --dport 465  -j MARK --set-xmark 0x1d1/0xffffffff

# POP3 / IMAP
iptables -t mangle -A PREROUTING -d <PUBLIC_VIP4>/32 -p tcp --dport 110  -j MARK --set-xmark 0x6e/0xffffffff
iptables -t mangle -A PREROUTING -d <PUBLIC_VIP4>/32 -p tcp --dport 143  -j MARK --set-xmark 0x8f/0xffffffff

# POP3S / IMAPS
iptables -t mangle -A PREROUTING -d <PUBLIC_VIP4>/32 -p tcp --dport 995  -j MARK --set-xmark 0x3e3/0xffffffff
iptables -t mangle -A PREROUTING -d <PUBLIC_VIP4>/32 -p tcp --dport 993  -j MARK --set-xmark 0x3e1/0xffffffff

# XMPP / XMPPS
iptables -t mangle -A PREROUTING -d <PUBLIC_VIP4>/32 -p tcp --dport 5222 -j MARK --set-xmark 0x1466/0xffffffff
iptables -t mangle -A PREROUTING -d <PUBLIC_VIP4>/32 -p tcp --dport 5223 -j MARK --set-xmark 0x1467/0xffffffff

# HTTP / HTTPS
iptables -t mangle -A PREROUTING -d <PUBLIC_VIP4>/32 -p tcp --dport 80   -j MARK --set-xmark 0x50/0xffffffff
iptables -t mangle -A PREROUTING -d <PUBLIC_VIP4>/32 -p tcp --dport 443  -j MARK --set-xmark 0x1bb/0xffffffff

A.3 Packet marking rules (IPv6)

Apply on both load balancers.

Assumptions:

<PUBLIC_VIP6> is your public IPv6 VIP address (e.g., 2001:db8::10)

# SMTP
ip6tables -t mangle -A PREROUTING -d <PUBLIC_VIP6>/128 -p tcp --dport 25   -j MARK --set-xmark 0x19/0xffffffff

# SMTP submission / SMTPS
ip6tables -t mangle -A PREROUTING -d <PUBLIC_VIP6>/128 -p tcp --dport 587  -j MARK --set-xmark 0x24b/0xffffffff
ip6tables -t mangle -A PREROUTING -d <PUBLIC_VIP6>/128 -p tcp --dport 465  -j MARK --set-xmark 0x1d1/0xffffffff

# POP3 / IMAP
ip6tables -t mangle -A PREROUTING -d <PUBLIC_VIP6>/128 -p tcp --dport 110  -j MARK --set-xmark 0x6e/0xffffffff
ip6tables -t mangle -A PREROUTING -d <PUBLIC_VIP6>/128 -p tcp --dport 143  -j MARK --set-xmark 0x8f/0xffffffff

# POP3S / IMAPS
ip6tables -t mangle -A PREROUTING -d <PUBLIC_VIP6>/128 -p tcp --dport 995  -j MARK --set-xmark 0x3e3/0xffffffff
ip6tables -t mangle -A PREROUTING -d <PUBLIC_VIP6>/128 -p tcp --dport 993  -j MARK --set-xmark 0x3e1/0xffffffff

# XMPP / XMPPS
ip6tables -t mangle -A PREROUTING -d <PUBLIC_VIP6>/128 -p tcp --dport 5222 -j MARK --set-xmark 0x1466/0xffffffff
ip6tables -t mangle -A PREROUTING -d <PUBLIC_VIP6>/128 -p tcp --dport 5223 -j MARK --set-xmark 0x1467/0xffffffff

# HTTP / HTTPS
ip6tables -t mangle -A PREROUTING -d <PUBLIC_VIP6>/128 -p tcp --dport 80   -j MARK --set-xmark 0x50/0xffffffff
ip6tables -t mangle -A PREROUTING -d <PUBLIC_VIP6>/128 -p tcp --dport 443  -j MARK --set-xmark 0x1bb/0xffffffff
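
Because the IPv4 and IPv6 rule sets differ only in the address family, they can be generated from one port list rather than maintained by hand. The sketch below is one way to do that. It emits full iptables/ip6tables command lines, uses the documentation-range example VIPs from A.2 and A.3, and the function name mark_rules is an assumption of this example.

```python
import ipaddress

PORTS = [25, 587, 465, 110, 143, 995, 993, 5222, 5223, 80, 443]


def mark_rules(vip, ports=PORTS):
    """Build one mangle/PREROUTING MARK command per port for an IPv4 or IPv6 VIP."""
    addr = ipaddress.ip_address(vip)
    binary = "iptables" if addr.version == 4 else "ip6tables"
    prefix = 32 if addr.version == 4 else 128
    return [
        f"{binary} -t mangle -A PREROUTING -d {addr}/{prefix} -p tcp "
        f"--dport {port} -j MARK --set-xmark {hex(port)}/0xffffffff"
        for port in ports
    ]


# Example VIPs from the assumptions in A.2 and A.3; replace with your own.
rules_v4 = mark_rules("203.0.113.10")
rules_v6 = mark_rules("2001:db8::10")
for rule in rules_v4 + rules_v6:
    print(rule)
```

The output is meant to be reviewed and then applied on both load balancers, so that the marks are identical whichever node currently holds the VIPs.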

A.4 Keepalived IPVS configuration template (per port)

Pattern:

lb_kind NAT
lb_algo wrr
master as fallback (weight 1)
slaves as primary (weight 640)
TCP health checks per service port

virtual_server fwmark <PORT_DECIMAL> {
    delay_loop 20
    lb_kind NAT
    lb_algo wrr
    #protocol TCP

    # Master (fallback / last resort)
    real_server <MASTER_PRIVATE_IP> <PORT_DECIMAL> {
        weight 1
        TCP_CHECK {
            connect_port <PORT_DECIMAL>
            connect_timeout 5
            delay_before_retry 5
        }
    }

    # Slaves (primary)
    real_server <SLAVE1_PRIVATE_IP> <PORT_DECIMAL> {
        weight 640
        TCP_CHECK {
            connect_port <PORT_DECIMAL>
            connect_timeout 5
            delay_before_retry 5
        }
    }

    real_server <SLAVE2_PRIVATE_IP> <PORT_DECIMAL> {
        weight 640
        TCP_CHECK {
            connect_port <PORT_DECIMAL>
            connect_timeout 5
            delay_before_retry 5
        }
    }
}

Create virtual_server fwmark blocks for each required port: 25, 587, 465, 110, 143, 995, 993, 5222, 5223, 80, 443.
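
Writing eleven nearly identical blocks by hand invites copy-and-paste mistakes, so the template can be expanded programmatically. The sketch below is a minimal generator that reproduces the template above for every port. The private IP addresses are placeholders borrowed from the path.dat example later in this document, and the helper names are assumptions of this example.

```python
PORTS = [25, 587, 465, 110, 143, 995, 993, 5222, 5223, 80, 443]


def tcp_real_server(ip, port, weight):
    """Render one real_server entry with a TCP health check on the service port."""
    return (
        f"    real_server {ip} {port} {{\n"
        f"        weight {weight}\n"
        f"        TCP_CHECK {{\n"
        f"            connect_port {port}\n"
        f"            connect_timeout 5\n"
        f"            delay_before_retry 5\n"
        f"        }}\n"
        f"    }}\n"
    )


def virtual_server(port, master_ip, slave_ips):
    """Render a fwmark virtual_server: master at weight 1, slaves at weight 640."""
    servers = [tcp_real_server(master_ip, port, 1)]
    servers += [tcp_real_server(ip, port, 640) for ip in slave_ips]
    header = (
        f"virtual_server fwmark {port} {{\n"
        "    delay_loop 20\n"
        "    lb_kind NAT\n"
        "    lb_algo wrr\n"
    )
    return header + "".join(servers) + "}\n"


# Placeholder addresses: master 192.168.0.2, slaves 192.168.0.3 and 192.168.0.4.
config = "\n".join(
    virtual_server(port, "192.168.0.2", ["192.168.0.3", "192.168.0.4"]) for port in PORTS
)
print(config)
```

The generated text is intended for inclusion in keepalived.conf on both load balancers; review it against your Keepalived version before use.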

A.5 Routing requirement on IceWarp nodes (NAT design)

For LVS/NAT, real servers must return traffic via the load balancer path.

Typical requirement:

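IceWarp nodes have their default route set to the load balancer's internal address (DIP). With two load balancers, point the route at an internal VIP that fails over together with the public VIPs, so return traffic always passes through the active balancer.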

A.6 Outbound SNAT is for egress only (not inbound VIP load balancing)

It is common to configure SNAT on the load balancers so private servers can reach the internet (package updates, repositories, etc.).

Example (conceptual):

iptables -t nat -A POSTROUTING -s <NET_PRIVATE_NETWORK> -o <PUBLIC_INTERFACE> -j SNAT --to-source <LB_PUBLIC_IP_OR_PUBLIC_VIP>

This SNAT is intended for outbound traffic initiated from the private network. It must be scoped so that inbound VIP-forwarded connections are not SNATed, preserving real client source IP visibility on the IceWarp nodes.

A.7 Validation: confirm real client source IP visibility

On an IceWarp node:

sudo tcpdump -ni any 'tcp port 25' -vv

From an external client with a known public IP:

telnet <mail.example.com> 25

Expected: the TCP SYN source IP on the IceWarp node is the client IP (not the load balancer).

Appendix B — HA storage (ZFS replication + automatic NFS failover with Keepalived, dual-stack)

This appendix describes a production-oriented approach for IceWarp shared storage using:

Two storage servers (storage-a, storage-b)
ZFS datasets for IceWarp data
ZFS replication (periodic zfs send|recv) from active → standby
NFS service active on only one node at a time
Keepalived-managed IPv4/IPv6 VIPs for the NFS endpoint
Automatic failover using Keepalived notify_* scripts

IceWarp nodes mount the NFS share via the VIP(s), so the NFS endpoint remains stable during failover.

B.1 What this provides (and what it does not)

Provides

A stable NFS endpoint (VIP) for:
- IceWarp configuration
- maildir
- archivedir
Automatic storage failover (active/passive)
Data replication via ZFS snapshots

Does not provide

True active/active shared-write filesystem semantics
Zero RPO (replication is asynchronous, so the RPO depends on the replication interval)
Zero-downtime failover (NFS clients may see a brief stall/retry during failover)

B.2 Reference topology

Nodes

storage-a (normally active)
storage-b (normally standby)

VIPs (example names)

nfs-vip4 — IPv4 VIP
nfs-vip6 — IPv6 VIP

Clients

IceWarp nodes (master + slaves) mount NFS from nfs-vip4/nfs-vip6.

B.3 Dataset layout (recommended)

Use one ZFS pool per server, with matching dataset names on both servers.

Example:

Pool: tank
Dataset: tank/icewarp

Within the mounted dataset, store the shared directories required by IceWarp, for example:

config/
mail/ (maildir)
archive/ (archivedir)

Key requirement: every IceWarp node mounts the NFS export to the same mount path(s) expected by IceWarp.

B.4 Replication strategy (active → standby)

Replication model

storage-a is the replication source while active
storage-b receives replicated snapshots

Replication typically consists of:

Creating periodic snapshots on the source
Sending incremental snapshot streams to the standby
Keeping a retention policy (e.g., hourly + daily)
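
One replication cycle can be sketched as two commands: a snapshot on the source, then an incremental send piped to a receive on the standby. The example below only builds the command strings and does not run them. Using ssh as the transport, the snapshot names, and the -F flag on the receiving side are assumptions of this sketch, not requirements of the design.

```python
import shlex


def replication_commands(dataset, prev_snap, new_snap, standby_host):
    """Return (snapshot command, incremental send/receive pipeline) as shell strings."""
    snapshot = ["zfs", "snapshot", f"{dataset}@{new_snap}"]
    # -i sends only the changes between the previous and the new snapshot.
    send = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    # -F rolls the standby dataset back to its latest snapshot before receiving.
    receive = ["ssh", standby_host, "zfs", "receive", "-F", dataset]
    return shlex.join(snapshot), f"{shlex.join(send)} | {shlex.join(receive)}"


snapshot_cmd, pipeline_cmd = replication_commands(
    "tank/icewarp", "repl-0001", "repl-0002", "storage-b"
)
print(snapshot_cmd)
print(pipeline_cmd)
```

A scheduler (cron or a systemd timer, for example) would run such a cycle at the chosen interval and prune old snapshots according to the retention policy.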

RPO guidance

Lower replication interval ⇒ lower RPO but higher overhead
Typical starting point: every 1–5 minutes for low RPO, or 15 minutes for moderate RPO

Important operational note

During normal operation, the standby dataset should be treated as read-only (or at least not exported via NFS). Only the active node should export the data.

B.5 NFS export design (single active exporter)

Only one of storage-a or storage-b runs as the active NFS exporter at any time.

NFS endpoint

The VIP(s) (nfs-vip4, nfs-vip6) are assigned only to the active exporter.
IceWarp nodes mount NFS using VIP(s) and should not mount directly from the physical node addresses.

Mount options (client-side)

Use mount options appropriate for your environment; common principles:

Prefer resilient options (timeouts/retry behavior) consistent with failover expectations
Ensure mountpoints are present and consistent across all IceWarp nodes

(Exact mount options differ by distribution and operational preference; validate in your lab.)

B.6 Keepalived failover orchestration (automatic)

High-level state machine

When a node becomes MASTER (owns VIP): it must ensure storage + NFS are ready before advertising service.
When a node becomes BACKUP (loses VIP): it must stop NFS service and ensure the dataset is not actively exported.

Keepalived responsibilities

Move nfs-vip4 and nfs-vip6 between the nodes
Call notify scripts on role transitions:
- notify_master: promote and start serving NFS
- notify_backup / notify_stop / notify_fault: stop serving NFS safely

Recommended `notify_master` actions (conceptual)

On the node that becomes MASTER:

Ensure the peer is not exporting NFS (best-effort; do not rely on this alone)
Import and mount ZFS pool/dataset if needed
Set dataset properties suitable for serving (e.g., ensure correct mountpoint)
Start required services:
- NFS server service(s)
- rpcbind/statd/etc. as required by your distro
Export NFS shares
(Optional) Run a lightweight readiness check (e.g., showmount locally)
Allow VIP to be active for client mounts

Recommended `notify_backup` / `notify_fault` actions (conceptual)

On the node that loses MASTER role:

Stop NFS exports / stop NFS server service
Ensure dataset is no longer served
(Optional) Either keep the ZFS pool imported without exporting it over NFS, or release it entirely with zpool export (policy choice; a pool that is not imported cannot receive replication streams)
Resume receiving ZFS replication streams if applicable

Implementation note: exact service names and ordering differ by OS, so this appendix describes the sequence and intent rather than distro-specific commands.
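
The state handling above can be kept in one place by mapping each Keepalived target state to an ordered list of actions. The sketch below shows only that mapping: the steps are descriptive labels taken from the lists above, not commands, and the function name is an assumption of this example. A real notify script would receive the target state from Keepalived (its generic notify hook passes the type, the instance name, the target state and the priority as arguments) and execute the distro-specific equivalent of each step.

```python
PROMOTE_STEPS = [
    "check that the peer is not exporting NFS (best effort)",
    "import and mount the ZFS pool/dataset if needed",
    "start the NFS server and its helper services",
    "export the NFS shares",
    "run a local readiness check",
]

DEMOTE_STEPS = [
    "stop NFS exports and the NFS server",
    "confirm the dataset is no longer served",
    "resume receiving ZFS replication streams",
]


def steps_for(state):
    """Map a Keepalived target state to the ordered actions for this node."""
    state = state.upper()
    if state == "MASTER":
        return PROMOTE_STEPS
    if state in ("BACKUP", "FAULT", "STOP"):
        return DEMOTE_STEPS
    raise ValueError(f"unexpected Keepalived state: {state}")


for number, step in enumerate(steps_for("MASTER"), start=1):
    print(f"{number}. {step}")
```

Treating BACKUP, FAULT and STOP identically keeps the safe direction simple: any transition away from MASTER stops serving NFS.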

B.7 Dual-stack behavior (IPv4 + IPv6)

Keepalived should manage both VIPs as part of the same failover decision to avoid split-brain service endpoints (v4 on one node, v6 on the other).
Confirm that:
- NFS is reachable over IPv4 and IPv6 via the VIPs
- firewall rules allow NFS-related ports on both stacks
- IceWarp nodes can mount using either stack (or both, depending on your policy)

B.8 NFS failover considerations for IceWarp workloads

Expected client behavior during failover

During VIP move and NFS restart, clients may see:

brief pauses
retries
temporary I/O errors depending on mount options

Plan maintenance windows and test behavior in your environment.

Split-brain avoidance

Because this is active/passive storage:

ensure only one node can export/write the active dataset at a time
ensure replication direction is well-defined (normally active → standby)

Testing checklist (recommended)

Before production:

Mount from multiple IceWarp nodes via VIP
Perform file create/read/write tests within the shared IceWarp directories
Trigger Keepalived failover (move VIP) and confirm:
- NFS service becomes available on the new active node
- clients recover
- data is consistent with expected RPO
Confirm replication continues correctly after failover (direction may need manual adjustment depending on your replication design)

Appendix C — Example HA database layer (Percona Server for MySQL 8.0: 1 primary + 2 replicas, Orchestrator auto-failover, ProxySQL pair with Keepalived VIP)

This appendix provides a generic Linux reference design for an IceWarp database backend:

Percona Server for MySQL 8.0
- mysql-1 — primary (writer)
- mysql-2 — replica
- mysql-3 — replica
Orchestrator (Openark) installed on all 3 DB nodes (high availability for the orchestration plane)
- monitors replication topology
- performs automatic failover by promoting a replica when the primary fails
ProxySQL deployed as a redundant pair (proxysql-1, proxysql-2)
Keepalived VIP in front of ProxySQL so IceWarp nodes connect to a single stable DB endpoint

Key behavior in this design:

Orchestrator promotion sets the newly promoted primary to read_only=0
Non-primaries remain read_only=1
ProxySQL detects read_only state via its monitoring and automatically routes writes to the only backend that is writable.
Therefore Orchestrator→ProxySQL promotion hooks are not required.

C.1 Goals and assumptions

Goals

Stable SQL endpoint for IceWarp: mysql-vip4 / mysql-vip6 (VIP in front of ProxySQL)
ProxySQL node redundancy (VIP failover)
Automatic MySQL primary promotion on failure (Orchestrator)

Assumptions

All MySQL nodes have reliable time sync (NTP) and stable hostnames.
GTID replication is used (recommended for reliable failover).
Replicas are configured to start in read-only mode.
ProxySQL is configured for read_only-based writer selection (replication hostgroups defined in mysql_replication_hostgroups), where:
- only nodes with read_only=0 are eligible as a writer
- nodes with read_only=1 are treated as replicas/readers

C.2 Reference topology

DB nodes (Percona + Orchestrator)

mysql-1, mysql-2, mysql-3

Proxy layer

proxysql-1, proxysql-2 (Keepalived VIP in front)

VIP

mysql-vip4 (IPv4)
mysql-vip6 (IPv6, optional)

C.3 MySQL configuration model (primary + replicas)

C.3.1 Replicas start as read-only (important)

To make failover behavior deterministic and safe:

configure MySQL instances to start with read_only=1 (and ideally super_read_only=1) as the baseline
Orchestrator will explicitly set read_only=0 on the promoted node during promotion

Operational note:

Your initial primary (mysql-1) must be set to read_only=0 in steady state (either via bootstrap procedure or Orchestrator-managed state).

C.3.2 GTID replication (recommended)

Use GTID mode consistently across all nodes to support reliable promotion and re-pointing.

C.4 Orchestrator auto-failover behavior

What Orchestrator does

When the primary fails:

Detect failure
Select the best replica candidate
Promote it:
- set it to writable (read_only=0)
Ensure the rest of the topology is safe:
- other nodes remain read-only (read_only=1)
- replication is reconfigured so remaining replicas follow the new primary

This read_only transition is the key signal consumed by ProxySQL.

C.5 ProxySQL routing without orchestrator hooks (read_only-based writer selection)

C.5.1 Why no hooks are needed

With ProxySQL configured to rely on MySQL monitoring and read_only state:

Orchestrator changes read_only during promotion
ProxySQL detects the state change automatically
ProxySQL routes writes to the node that is currently writable (read_only=0)

This removes the need for custom Orchestrator recovery hooks to “push” a new writer into ProxySQL.

C.5.2 Required ProxySQL monitoring prerequisites

To make this work reliably, ensure:

ProxySQL monitoring user has permission to check server variables/status
monitoring is enabled and health check intervals are appropriate
only one node is writable at a time (Orchestrator must enforce this)
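
The single-writer requirement is easy to verify from the outside. The sketch below models the check, not ProxySQL's internal logic: given the read_only value reported by each node (for example from SELECT @@global.read_only), it returns the one writable node or fails loudly. How the values are collected is left out, and the function name is an assumption of this example.

```python
def select_writer(read_only_by_node):
    """Return the single node reporting read_only=0, or raise if there is not exactly one."""
    writers = sorted(node for node, ro in read_only_by_node.items() if ro == 0)
    if len(writers) != 1:
        raise RuntimeError(f"expected exactly one writable node, found {writers}")
    return writers[0]


# Steady state: mysql-1 is the primary.
before = {"mysql-1": 0, "mysql-2": 1, "mysql-3": 1}
# After mysql-1 fails and Orchestrator promotes mysql-2.
after_failover = {"mysql-2": 0, "mysql-3": 1}

print(select_writer(before))
print(select_writer(after_failover))
```

Run as a periodic check, this kind of test catches both failure modes that break the design: no writable node at all, or two nodes writable at once.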

C.6 Keepalived VIP in front of ProxySQL

Keepalived provides a stable endpoint:

IceWarp connects to mysql-vip4 / mysql-vip6
VIP floats between proxysql-1 and proxysql-2
Use track_script so VIP is held only when ProxySQL is healthy

C.7 Failure scenarios (expected behavior)

ProxySQL failure

VIP moves to the surviving ProxySQL node
IceWarp reconnects to the same VIP

Primary MySQL failure

Orchestrator promotes a replica and sets read_only=0 on it
ProxySQL detects the new writable node and routes writes accordingly
IceWarp continues using the same ProxySQL VIP endpoint

C.8 Validation checklist

Confirm only the intended primary has read_only=0.
Confirm ProxySQL routes writes to the writable node.
Confirm every MySQL server starts with read_only=1 (set it in my.cnf).
Simulate primary failure in a test environment:
- verify Orchestrator promotion
- verify ProxySQL routing converges automatically
- verify IceWarp continues normal operation through the VIP

Appendix D — HA session store (2-node KeyDB multi-master + Keepalived VIP, dual-stack)

This appendix describes a simple, production-oriented HA design for the IceWarp shared key-value/session store:

2 KeyDB nodes in multi-master (active/active) replication
Keepalived providing a stable VIP (optional dual-stack)
IceWarp nodes connect to the KeyDB VIP so PHP sessions are accessible from any IceWarp application node

KeyDB is Redis-compatible, but HA behavior and operational details should be validated against your KeyDB version and licensing/support policy.

D.1 Goals and assumptions

Goals

Stable endpoint for IceWarp session storage: kv-vip4 / kv-vip6
Survive failure of one KeyDB node without changing IceWarp configuration
Keep session state consistent and accessible to all IceWarp nodes

Assumptions

Low latency and reliable connectivity between KeyDB nodes (same DC / LAN recommended)
Firewall allows required KeyDB ports between:
- IceWarp nodes ↔ VIP (client traffic)
- keydb-1 ↔ keydb-2 (replication traffic)
You have defined a persistence policy appropriate for session storage (RDB/AOF) and understand the write durability implications.

D.2 Reference topology

KeyDB nodes

keydb-1
keydb-2

VIP via Keepalived (on KeyDB nodes)

kv-vip4 (IPv4 VIP)
kv-vip6 (IPv6 VIP, optional)

Clients

IceWarp master + all IceWarp slaves connect to kv-vip4 / kv-vip6

D.3 KeyDB multi-master basics (conceptual)

D.3.1 What “multi-master” means here

In a 2-node multi-master setup:

both keydb-1 and keydb-2 accept writes
both replicate to each other

This is typically chosen for:

simplicity (either node can serve)
fast failover (no explicit promotion step required)

D.3.2 Consistency and conflict notes (important)

Multi-master replication can involve edge cases:

concurrent writes to the same keys from different masters
conflict resolution rules and last-write-wins behavior (implementation-specific)

For PHP session workloads, this is often acceptable because:

session keys are typically written by the node that handles the user’s request
session updates are frequent but usually not simultaneous across different masters for the same session key
the application can tolerate occasional overwrites within session TTL constraints

Still, decide and document for your deployment:

expected behavior under network partitions
whether you require “prefer a single writer” operational policy (optional) even in multi-master mode

D.4 Keepalived VIP for KeyDB (active/passive VIP, active/active backends)

D.4.1 Why a VIP in front of an active/active pair

Even though KeyDB is multi-master, a VIP provides:

a single stable endpoint for IceWarp nodes
simple configuration (one host/port)
fast failover if the currently selected VIP-holder fails

D.4.2 How VIP failover works

Both keydb-1 and keydb-2 run Keepalived.
One node holds kv-vip4/kv-vip6 at a time.
IceWarp connects to the VIP.
If the VIP-holder fails, the VIP moves to the other node.

Important: Keepalived VIP failover does not eliminate the need for KeyDB replication; it only provides a stable address.

D.5 Health checking (recommended)

Configure Keepalived to hold the VIP only when KeyDB is healthy.

A typical track_script should validate at least:

KeyDB process is running
TCP port is accepting connections
(optional) a lightweight ping/INFO command succeeds locally

This prevents the VIP from being advertised by a node that is up at the OS level but not actually serving KeyDB.
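
A health check along these lines can be written with nothing but a TCP socket, because the Redis protocol accepts a plain PING line and answers +PONG. The sketch below assumes KeyDB listens on 127.0.0.1:6379 without authentication; with a password configured the server answers with an error instead, and the check would need to authenticate first. The function names are assumptions of this example.

```python
import socket


def keydb_is_healthy(host="127.0.0.1", port=6379, timeout=2.0):
    """Return True if a Redis-protocol PING on host:port is answered with +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            reply = conn.recv(64)
    except OSError:
        # Connection refused, timeout, or reset: treat all of them as unhealthy.
        return False
    return reply.startswith(b"+PONG")


def exit_code(healthy):
    """Translate the result into a script exit status: 0 = healthy, 1 = unhealthy."""
    return 0 if healthy else 1
```

A small wrapper script that exits with exit_code(keydb_is_healthy()) could then be referenced from a Keepalived track_script, which treats a non-zero exit status as a failed check.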

D.6 Dual-stack considerations (IPv4 + IPv6)

If you support dual-stack:

manage both kv-vip4 and kv-vip6 in the same failover decision to avoid split service (v4 on one node, v6 on the other)
ensure firewall rules are correct for both stacks
decide whether IceWarp clients should prefer IPv4 or IPv6 (document DNS policy accordingly)

D.7 Recommended operational policies (what to document)

D.7.1 Persistence policy

For session workloads, define and document:

whether AOF is enabled
whether RDB snapshots are used
expected behavior on restart (session loss tolerance)

D.7.2 Maintenance and failover testing

Test Keepalived failover by stopping KeyDB or Keepalived on the VIP-holder and verifying:
- VIP moves
- IceWarp continues functioning
Test KeyDB replication health by:
- writing a test key on keydb-1 and verifying it appears on keydb-2
- repeating in the opposite direction

D.7.3 Partition handling (important)

Document your expected behavior during a network partition between keydb-1 and keydb-2:

whether you prefer to keep serving from both sides
or enforce a conservative policy (e.g., force VIP to stay with a preferred node)

D.8 Validation checklist (recommended)

Replication
- Confirm both nodes replicate to each other.
VIP
- Confirm IceWarp nodes can connect to kv-vip4/kv-vip6.
Failover
- Stop KeyDB on current VIP-holder and confirm VIP moves and IceWarp continues to authenticate/maintain sessions.
Session continuity
- Confirm IceWarp session continuity when a request lands on different IceWarp nodes.

E.1 path.dat

path.dat is a config file in the root of the IceWarp installation path (default: /opt/icewarp/path.dat). It contains path overrides for various IceWarp directories (config directory, SMTP queue directories, license file path), the IPs of the cluster nodes, and the node API URL for WebDocument service callbacks. The file is parsed by line number, so keep the line order intact and do not add or remove lines without adjusting the rest of the file accordingly. The commented example below explains each line. Lines whose comment starts with * are mandatory for multi-node setups; other lines can be left empty if you are not sure you need them, in which case the server uses its default paths and settings:

Line #1:  /mnt/nfs-data/icewarp/config/                                                   // * ConfigPath - full path to the config directory - mandatory for multi-node setups, all nodes must point to the same shared config directory
Line #2:  /mnt/nfs-data/icewarp/html/                                                     // HTMLPath - full path to the HTML directory - not mandatory, leave empty if you're not sure if you need it or not, the server will use the default path if this line is left empty  
Line #3:                                                                                  // Empty line - reserved for future use, do not remove or add lines without adjusting the rest of the file accordingly
Line #4:  IWSV1                                                                           // * Node ID - specifies the prefix for all message files and log files 
Line #5:  \\server\mail;user;password | \\server\logs;user;password | ...                 // RemoteLogonPath - specifies remote paths and their username and password to logon with - leave empty if you're not sure if you need it or not
Line #6:  /mnt/nfs-data/icewarp/spam/                                                     // SpamPath - full path to the spam directory - deprecated, leave empty and use individual local paths
Line #7:  /mnt/nfs-data/icewarp/calendar/                                                 // GroupwarePath - full path to the groupware directory - deprecated, leave empty and use individual local paths
Line #8:  /mnt/nfs-data/icewarp/mail/_outgoing/                                           // OutgoingPath - full path to the mail outgoing directory - leave empty and use individual local paths in most cases, but if you want to share the outgoing queue between nodes, you can specify a shared path here
Line #9:  /mnt/nfs-data/icewarp/mail/_outgoing/retry/                                     // OutgoingRetryPath - full path to the mail retry directory - leave empty and use individual local paths in most cases, but if you want to share the outgoing queue between nodes, you can specify a shared path here
Line #10: 127.0.0.1;192.168.0.1                                                           // IPBinding - the IPs you want the server services to bind to - not mandatory, leave empty if you're not sure if you need it or not
Line #11: mail.domain.com                                                                 // HostName - the hostname you want the server to use in communication with other servers - not mandatory, leave empty if you're not sure if you need it or not
Line #12: /opt/icewarp/config/license.key                                                 // * LicenseFile - full path to a local license file - do not set this to a shared path, each node should have its own local license file
Line #13: /mnt/nfs-data/icewarp/mail/_incoming/                                           // IncomingPath - full path to the mail incoming directory 
Line #14: 0 or 1                                                                          // * SlaveServer - specifies if the server has a role of slave. 0 means the server is a master, 1 means the server is a slave. In a multi-node setup, you should have exactly one master and at least one slave.
Line #15: 192.168.0.2                                                                     // * MasterHost - specifies the master host IP/hostname. 
Line #16: 192.168.0.3;192.168.0.4                                                         // * SlaveHosts - specifies the list of slave hosts IPs/hostnames separated by semicolons. 
Line #17: https://185.30.21.21:55991/teamchatapi/                                         // * TeamChatAPI URL - publicly reachable HTTPS URL (IP:port) on the load balancer that is forwarded to this node's port 443 - used for WebDocument service callbacks

This is an example of a valid path.dat file on the master node (192.168.0.2) of a 3-node cluster with shared config; the slaves are 192.168.0.3 and 192.168.0.4:

/mnt/nfs-data/icewarp/config/


IWSV1







/opt/icewarp/config/license.key

0
192.168.0.2
192.168.0.3;192.168.0.4
https://185.30.21.21:55991/teamchatapi/

This is an example of a valid path.dat file on the slave node 192.168.0.3 of the same 3-node cluster with shared config. The master node has the IP 192.168.0.2 and the other slave has the IP 192.168.0.4:

/mnt/nfs-data/icewarp/config/

IWSV2








/opt/icewarp/config/license.key

1
192.168.0.2
192.168.0.3;192.168.0.4
https://185.30.21.21:55992/teamchatapi/
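
The line-number layout described above lends itself to a quick sanity check before a node is started. The following is a minimal sketch, not an IceWarp tool: the function and class names are hypothetical, only the fields on lines 12 and 14 to 17 are checked, and the shared-path prefix /mnt/nfs-data/ is an assumption taken from the examples in this appendix.

```python
from dataclasses import dataclass

# 1-based line numbers, as given in the field listing above.
LICENSE_LINE = 12
SLAVE_FLAG_LINE = 14
MASTER_HOST_LINE = 15
SLAVE_HOSTS_LINE = 16
TEAMCHAT_URL_LINE = 17


@dataclass
class ClusterSettings:
    license_file: str
    is_slave: bool
    master_host: str
    slave_hosts: list
    teamchat_api_url: str


def parse_path_dat(text, shared_prefix="/mnt/nfs-data/"):
    """Read the cluster-related lines of path.dat and sanity-check them.

    shared_prefix is an assumption: the mount point of the shared storage.
    """
    lines = text.splitlines()
    if len(lines) < TEAMCHAT_URL_LINE:
        raise ValueError(
            f"expected at least {TEAMCHAT_URL_LINE} lines, got {len(lines)}"
        )

    def field(line_no):
        return lines[line_no - 1].strip()

    license_file = field(LICENSE_LINE)
    if not license_file or license_file.startswith(shared_prefix):
        raise ValueError("LicenseFile must be a local, non-shared path")

    flag = field(SLAVE_FLAG_LINE)
    if flag not in ("0", "1"):
        raise ValueError("SlaveServer must be 0 (master) or 1 (slave)")

    master_host = field(MASTER_HOST_LINE)
    slave_hosts = [h for h in field(SLAVE_HOSTS_LINE).split(";") if h]
    if not master_host or not slave_hosts:
        raise ValueError("MasterHost and SlaveHosts must both be set")

    return ClusterSettings(
        license_file=license_file,
        is_slave=(flag == "1"),
        master_host=master_host,
        slave_hosts=slave_hosts,
        teamchat_api_url=field(TEAMCHAT_URL_LINE),
    )
```

On a node, the function could be called with the contents of that node's path.dat; a ValueError points at the first line that does not match the layout.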

E.2.1 IceWarp API folder path settings

The following paths are set via the IceWarp administration console, WebAdmin, or the IceWarp API and must point to the same shared path on all nodes in the cluster:

Mailbox path
- the path to the mail directory for each node, must be set to a shared path if you want to share the mail directory between nodes
- API variable name: C_System_Storage_Dir_MailPath
- example setting via API: /opt/icewarp/tool.sh set system C_System_Storage_Dir_MailPath /mnt/nfs-data/icewarp/mail/
AutoArchive path
- the path to the email auto archive directory for each node, must be set to a shared path if you want to share the archive between nodes
- API variable name: C_System_Tools_AutoArchive_Path
- example setting via API: /opt/icewarp/tool.sh set system C_System_Tools_AutoArchive_Path /mnt/nfs-data/icewarp/archive/

E.2.2 IceWarp API database settings

The following database connection strings are set via the IceWarp administration console, WebAdmin, or the IceWarp API and must point to the same shared database on all nodes in the cluster:

Accounts database
- storing information about user accounts, domains, aliases, distribution lists, etc.
- API variable name: c_system_storage_accounts_odbcconnstring
- example setting via API: /opt/icewarp/tool.sh set system c_system_storage_accounts_odbcconnstring '<DB_NAME_ACCOUNTS>;<DB_ICEWARP_USER>;<DB_ICEWARP_USER_PASSWORD>;<DBLB_IP_VIP>:<DBLB_PORT_VIP>;3;2'
Antispam database
- storing information about spam filtering, blacklists, whitelists, etc.
- API variable name: c_as_challenge_connectionstring
- example setting via API: /opt/icewarp/tool.sh set system c_as_challenge_connectionstring '<DB_NAME_ANTISPAM>;<DB_ICEWARP_USER>;<DB_ICEWARP_USER_PASSWORD>;<DBLB_IP_VIP>:<DBLB_PORT_VIP>;3;2'
Groupware database
- storing information about calendar, contacts, tasks, notes, etc.
- API variable name: c_gw_connectionstring
- example setting via API: /opt/icewarp/tool.sh set system c_gw_connectionstring '<DB_NAME_GROUPWARE>;<DB_ICEWARP_USER>;<DB_ICEWARP_USER_PASSWORD>;<DBLB_IP_VIP>:<DBLB_PORT_VIP>;3;2'
ActiveSync database
- storing information about mobile device synchronization, policies, etc.
- API variable name: C_ActiveSync_DBConnection
- example setting via API part 1: /opt/icewarp/tool.sh set system C_ActiveSync_DBConnection 'mysql:host=<DBLB_IP_VIP>;port=<DBLB_PORT_VIP>;dbname=<DB_NAME_EAS>'
- example setting via API part 2: /opt/icewarp/tool.sh set system C_ActiveSync_DBUser '<DB_ICEWARP_USER>'
- example setting via API part 3: /opt/icewarp/tool.sh set system C_ActiveSync_DBPass '<DB_ICEWARP_USER_PASSWORD>'
Directory cache database
- storing cached information about filesystem structure and metadata for faster access and to save IOps on the shared storage
- API variable name: c_accounts_global_accounts_directorycacheconnectionstring
- example setting via API: /opt/icewarp/tool.sh set system c_accounts_global_accounts_directorycacheconnectionstring '<DB_NAME_DIRCACHE>;<DB_ICEWARP_USER>;<DB_ICEWARP_USER_PASSWORD>;<DBLB_IP_VIP>:<DBLB_PORT_VIP>;3;2'

WebClient cache database
- storing cached information for the WebClient for faster access and to save IOps on the shared storage
- API variable name: not available; must be set in the config/_webmail/server.xml config file, which is in the shared config path
- example config/_webmail/server.xml:

<global_settings>
  <item>
    <dbconn>mysql:host=<DBLB_IP_VIP>;port=<DBLB_PORT_VIP>;dbname=<DB_NAME_WEBCLIENT></dbconn>
    <dbuser><DB_ICEWARP_USER></dbuser>
    <dbpass><DB_ICEWARP_USER_PASSWORD></dbpass>
    <dbsyntax>mysql</dbsyntax>
    ... other settings, leave as is ...
  </item>
</global_settings>
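
Because the accounts, antispam, groupware and directory cache settings above all use the same semicolon-separated format, the strings can be generated from one set of values to avoid copy-and-paste drift. The sketch below is illustrative only: the helper names are hypothetical, the trailing "3;2" fields are copied verbatim from the examples above without interpreting them, and the sketch rejects values containing ";" rather than guessing at an escaping rule.

```python
def odbc_conn_string(db_name, user, password, host, port, tail=("3", "2")):
    """Build '<DB>;<USER>;<PASSWORD>;<HOST>:<PORT>;3;2' as shown in the examples."""
    for value in (db_name, user, password, host):
        if ";" in value:
            # ';' separates the fields; no escaping rule is assumed here.
            raise ValueError("';' cannot appear inside a connection string field")
    return ";".join([db_name, user, password, f"{host}:{port}", *tail])


def tool_set_command(variable, value, tool="/opt/icewarp/tool.sh"):
    """Return the argv list for 'tool.sh set system <variable> <value>'."""
    return [tool, "set", "system", variable, value]


# Variable names from this section mapped to hypothetical database names.
DATABASES = {
    "c_system_storage_accounts_odbcconnstring": "iw_accounts",
    "c_as_challenge_connectionstring": "iw_antispam",
    "c_gw_connectionstring": "iw_groupware",
    "c_accounts_global_accounts_directorycacheconnectionstring": "iw_dircache",
}


def all_set_commands(user, password, host, port):
    """One 'set system' command per database, all pointing at the same DB VIP."""
    return [
        tool_set_command(var, odbc_conn_string(db, user, password, host, port))
        for var, db in DATABASES.items()
    ]
```

Each returned list can be passed to subprocess.run on a node; passing the value as a single argv element avoids shell quoting problems with passwords.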

E.3 license

Each node in the cluster must have its own local license file, and line 12 of that node's path.dat file must contain the full path to it. Do not set this to a shared path: place the license file on each node's local filesystem and update the path in each node's path.dat accordingly.

E.4 default route to LB DIP

In a typical LVS/NAT design, IceWarp nodes have their default route set to the load balancer’s internal address (DIP), or an internal VIP if you use one. This ensures that return traffic from the IceWarp nodes goes back through the load balancer, which is necessary for proper routing and source IP visibility in a NAT setup.
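
Whether a node actually routes through the load balancer can be checked from the output of `ip route show default`. The following sketch only parses that output; the function names are hypothetical and the DIP value is whatever address your load balancer uses internally.

```python
import ipaddress


def default_gateway(ip_route_output):
    """Return the gateway from `ip route show default` output, or None."""
    for line in ip_route_output.splitlines():
        parts = line.split()
        if parts[:1] == ["default"] and "via" in parts:
            idx = parts.index("via")
            if idx + 1 < len(parts):
                return ipaddress.ip_address(parts[idx + 1])
    return None


def routes_via_lb(ip_route_output, lb_dip):
    """True if the default route points at the load balancer's internal address."""
    return default_gateway(ip_route_output) == ipaddress.ip_address(lb_dip)
```

The input can be captured with subprocess.run(["ip", "route", "show", "default"], capture_output=True, text=True).stdout; for IPv6, use the output of `ip -6 route show default` instead.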

E.5 php.ini redis example

If you are using Redis for session storage, your php.ini should have the following settings to point to the Redis VIP:

extension=igbinary
extension=msgpack
extension=redis
session.save_handler = redis
session.save_path = "tcp://<REDIS_CLUSTER_VIP_IP>:<REDIS_CLUSTER_VIP_PORT>?auth=<REDIS_AUTH_PASSWORD>&prefix=IWCPHPSESS:"
; REDIS SESSION LOCKING
; Should the locking be enabled? Defaults to: 0.
redis.session.locking_enabled = 1
; How long should the lock live (in seconds)? Defaults to: value of max_execution_time.
redis.session.lock_expire = 60
; How long to wait between attempts to acquire lock, in microseconds (µs). Defaults to: 2000
; redis.session.lock_wait_time = 50000
; Maximum number of times to retry (-1 means infinite). Defaults to: 10
redis.session.lock_retries = -1

Append this to your existing /opt/icewarp/php/php.ini and create a separate /opt/icewarp/php/php.user.ini file with the same content to ensure that these settings are preserved during IceWarp updates.
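
Since the same settings live in two files, they can drift apart over time. A minimal sketch of a comparison follows; the function names are hypothetical, and the parser is deliberately simple: it ignores comments and section headers and only looks at the session.save_* and redis.session.* keys used in this appendix.

```python
def redis_session_settings(ini_text):
    """Extract session.save_* and redis.session.* settings from php.ini text."""
    settings = {}
    for raw in ini_text.splitlines():
        line = raw.strip()
        # Skip blanks, ';' comments and lines without a key=value pair.
        if not line or line.startswith(";") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.startswith(("session.save_", "redis.session.")):
            settings[key] = value.strip('"')
    return settings


def settings_drift(php_ini_text, php_user_ini_text):
    """Return {key: (php.ini value, php.user.ini value)} for every mismatch."""
    a = redis_session_settings(php_ini_text)
    b = redis_session_settings(php_user_ini_text)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }
```

An empty result means both files carry the same Redis session settings; any entry shows the key and the two differing values (None when the key is missing from one file).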

Appendix F - FIREWALL CONFIG - REQUIRED PORTS

This is a general guideline for the firewall configuration and required ports of a clustered IceWarp setup with shared storage and database. Review and adjust the rules to match your use cases, workloads, and security policies.

Protocol	Type	Source	Source Port	Destination	Destination Port	Notes
TCP+UDP	Inbound	Any	Any	Any	25	SMTP
TCP+UDP	Inbound	Any	Any	Any	587	SMTP client submission
TCP+UDP	Inbound	Any	Any	Any	465	SMTP client TLS
TCP+UDP	Inbound	Any	Any	Any	21	FTP
TCP+UDP	Inbound	Any	Any	Any	990	FTP SSL
TCP+UDP	Inbound	Any	Any	Any	80	HTTP
TCP+UDP	Inbound	Any	Any	Any	32000	HTTP alternative port
TCP+UDP	Inbound	Any	Any	Any	32001	HTTPS alternative port
TCP+UDP	Inbound	Any	Any	Any	443	HTTPS
TCP+UDP	Inbound	Any	Any	Any	110	POP3
TCP+UDP	Inbound	Any	Any	Any	995	POP3 SSL
TCP+UDP	Inbound	Any	Any	Any	143	IMAP
TCP+UDP	Inbound	Any	Any	Any	993	IMAP SSL
TCP+UDP	Inbound	Any	Any	Any	5222	XMPP
TCP+UDP	Inbound	Any	Any	Any	5223	XMPP SSL
TCP+UDP	Inbound	Any	Any	Any	5229	XMPP BOSH
TCP+UDP	Inbound	Any	Any	Any	5060	SIP
TCP+UDP	Inbound	Any	Any	Any	5061	SIP TLS
UDP	Inbound	Any	Any	Any	10000-10255	RTP
TCP+UDP	Inbound	Any	Any	Any	1080	SOCKS (proxy)
TCP+UDP	Inbound	Any	Any	Any	161	SNMP
TCP+UDP	Inbound	Any	Any	Any	389	LDAP
TCP+UDP	Inbound	Any	Any	Any	636	LDAPS
TCP+UDP	Inbound	Any	Any	Any	4069	MINGER
TCP+UDP	Inbound	Any	Any	Any	4070	MINGER SSL
TCP+UDP	Inbound	Any	Any	Any	13	TIMESYNC
TCP+UDP	Inbound	Any	Any	Any	53	DNS
TCP+UDP	Inbound	Any	Any	Any	3306	MYSQL
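
After the firewall rules are in place, TCP reachability of the listed ports can be spot-checked from a client network. The following is a minimal sketch using only the Python standard library; the function names and the port subset are illustrative, and UDP services (for example RTP) are not covered because a plain connect test does not apply to them.

```python
import socket

# Illustrative subset of the TCP ports from the table above.
MAIL_AND_WEB_PORTS = (25, 587, 465, 80, 443, 110, 995, 143, 993)


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def closed_ports(host, ports=MAIL_AND_WEB_PORTS, timeout=2.0):
    """Return the ports from the list that did not accept a connection."""
    return [port for port in ports if not tcp_port_open(host, port, timeout)]
```

Calling closed_ports with the frontend VIP as host returns an empty list when every port in the subset accepts connections. A successful connect only shows that something is listening behind the VIP, not that the service itself is healthy.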

Appendix G - CACHE INVALIDATION

In multi-node setups, IceWarp must invalidate caches on all nodes in the cluster when certain events happen (e.g. account change, maildir update) so that every node has consistent and up-to-date information. IceWarp's internal cache invalidation mechanism does this and relies on the following UDP ports being open between all nodes in the cluster:

25/udp
110/udp
80/udp
5222/udp
5229/udp
32002/udp
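
The node-to-node rules for the ports listed above can be generated per node rather than typed by hand. This sketch assumes iptables is the host firewall in use and that rules are appended to the INPUT chain; the function name is hypothetical, and the node addresses in the usage note are the example IPs from Appendix E. Adapt the output if you use nftables or firewalld.

```python
# UDP ports used for cache invalidation, as listed above.
CACHE_INVALIDATION_UDP_PORTS = (25, 110, 80, 5222, 5229, 32002)


def udp_accept_rules(local_ip, node_ips, ports=CACHE_INVALIDATION_UDP_PORTS):
    """Return iptables commands that accept cache invalidation UDP from peer nodes."""
    peers = [ip for ip in node_ips if ip != local_ip]
    return [
        f"iptables -A INPUT -p udp -s {peer} --dport {port} -j ACCEPT"
        for peer in peers
        for port in ports
    ]
```

For the 3-node example cluster, udp_accept_rules("192.168.0.2", ["192.168.0.2", "192.168.0.3", "192.168.0.4"]) yields twelve rules for the master: six ports for each of the two slaves.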
