
Vault HA MySQL backend: cannot assign requested address


We were trying to turn a single-node Vault deployment into an HA cluster. This Vault node used MySQL as its storage backend and pointed to a single MySQL node, although the Percona MySQL cluster itself had three nodes.

So we decided to add two more Vault nodes and make all Vault nodes use an HA MySQL connection. Since Vault does not allow configuring all three nodes in the connection string, we decided to put HAProxy in front of MySQL and configure it with one MySQL node as active and the other two as backups.
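An active/backup setup like this can be sketched in HAProxy as follows (node names and addresses here are placeholders, not taken from our actual setup):

```
# Hypothetical HAProxy listener for MySQL: one active node, two backups.
# HAProxy only sends traffic to the backup servers when the active one
# fails its health check.
listen mysql
    bind *:3306
    mode tcp
    option tcpka
    server mysql1 10.10.0.10:3306 check
    server mysql2 10.10.0.11:3306 check backup
    server mysql3 10.10.0.12:3306 check backup
```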

When we configured Vault to talk to MySQL through HAProxy, it worked for a few minutes and then failed with the following error:

[ERROR] core: failed to renew releases: error="failed to persist accessor index entry: dial tcp 10.10.0.126:3306: connect: cannot assign requested address"

At first it looked like we had a wrong MySQL configuration or MySQL was running out of available connections, but we checked both and found that was not the case. Next we looked at how HAProxy was handling these connections and whether an upper limit on sessions might be preventing Vault from creating new ones.

We looked at the Vault documentation for the MySQL backend at https://www.vaultproject.io/docs/v1.4.x/configuration/storage/mysql (note: this link is for version 1.4; check the docs for your Vault version).

We decided to set max_idle_connections to 10, hoping it would keep that many connections ready in an idle state for Vault to reuse, but that did not resolve the issue. In a desperate attempt to solve it we also tried increasing the value to 20 and then 30, with no success.

Searching further on the internet, we found that this error means the system is running out of available source ports for new connections. But what could be exhausting them? Every connection that gets torn down leaves a socket in TIME_WAIT, holding its source port for a while. We stumbled on this thread https://github.com/hashicorp/vault/issues/11936, which suggested that this happens due to the connection pooling, which is too small by default (unset?).

If we set max_idle_connections > max_parallel, the connections are not torn down and there is no churn. The obvious downside is keeping many connections open, but maybe max_parallel can be lowered too.
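The churn can be sketched with a toy pool model (a deliberate simplification, not Vault's actual pooling code): each round of parallel requests reuses idle connections where possible, and returned connections are kept only up to max_idle, with the rest closed. When max_idle is smaller than the parallelism, every round closes (and then redials) connections, burning through source ports.

```python
# Toy model of connection-pool churn (NOT Vault's real implementation).
# Each round: `parallel` requests take connections from the idle pool
# (or open new ones), then return them; the pool keeps at most max_idle
# and closes the rest. Every close eventually ties up a source port
# in TIME_WAIT.

def simulate(parallel, rounds, max_idle):
    idle = 0     # connections sitting in the idle pool
    opened = 0   # total dials (roughly: source ports consumed)
    closed = 0   # total teardowns (roughly: TIME_WAIT sockets created)
    for _ in range(rounds):
        reused = min(idle, parallel)
        idle -= reused
        opened += parallel - reused          # the rest are new dials
        keep = min(max_idle - idle, parallel)  # pool keeps up to max_idle
        idle += keep
        closed += parallel - keep            # overflow gets torn down
    return opened, closed

# max_idle < parallel: connections churn every single round.
churn_opened, churn_closed = simulate(parallel=128, rounds=100, max_idle=10)

# max_idle >= parallel: only the first round dials; zero churn after that.
calm_opened, calm_closed = simulate(parallel=128, rounds=100, max_idle=256)

print(churn_opened, churn_closed)  # thousands of dials and closes
print(calm_opened, calm_closed)    # 128 dials, 0 closes
```

The model makes the fix obvious: with max_idle_connections above max_parallel, the pool never has to discard a connection, so no ports are wasted.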

According to the issue, the following settings worked flawlessly:

max_idle_connections = 256
max_parallel = 128

This solved the issue for us as well. We set max_parallel to 130 and max_idle_connections to 135, and the error was gone.
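Put together, our storage stanza looked roughly like this (the address is the HAProxy endpoint in front of MySQL; the username, password, and database names are placeholders, not our real values):

```hcl
storage "mysql" {
  address              = "10.10.0.126:3306"  # HAProxy in front of MySQL
  username             = "vault"             # placeholder
  password             = "changeme"          # placeholder
  database             = "vault"             # placeholder
  ha_enabled           = "true"
  max_parallel         = 130
  max_idle_connections = 135                 # must exceed max_parallel
}
```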