Redis
The Heroku Data for Redis add-on is attached to our nextstrain-server, nextstrain-canary, and nextstrain-dev apps. Redis is used to persistently store login sessions after authentication via AWS Cognito. A persistent data store is important for preserving sessions across deploys and regular dyno restarts.
Note
This amounts to using Redis as a database rather than the more common approach of using it as a cache.
Maintenance
Heroku will automatically perform minor maintenance tasks such as patching the operating system or required libraries. Email notifications will be sent for these, typically with a subject line of Maintenance required on your Redis add-on (REDIS on nextstrain-server). These are expected to just work. In case any issues arise, the scheduled maintenance window (Friday at 22:00 UTC to Saturday at 02:00 UTC) tries to optimize for being outside/on the fringes of business hours in relevant places around the world while being in US/Pacific business hours so the Seattle team can respond.
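To see where the maintenance window falls in a given location, its boundaries can be converted with date. This is a minimal sketch that assumes GNU date (for the -d option) and an installed time zone database; the helper name window_in_tz and the example date are illustrative only.

```shell
# Hypothetical helper: print a UTC timestamp in another time zone.
# Assumes GNU date (-d) and an installed tz database.
window_in_tz() {
    LC_ALL=C TZ="$1" date -d "$2" '+%a %H:%M %Z'
}

# 2024-01-05 is used here only as an example Friday.
window_in_tz America/Los_Angeles '2024-01-05 22:00 UTC'  # window start
window_in_tz America/Los_Angeles '2024-01-06 02:00 UTC'  # window end
```

Swapping in another zone name shows the same window for other locations.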
Upgrading the add-on version
Heroku occasionally releases a new major version of the add-on. When this happens, the oldest supported version is deprecated. Based on previous upgrades and projected end-of-life dates, this happens about once every 1.5-2 years.
When this happens, Heroku begins sending deprecation notices via email.
Steps to ensure a smooth upgrade are detailed below. They have been adapted from @tsibley's notes on the 5 → 6 upgrade, which are based on the --fork upgrade method described in Heroku's own documentation.
Warning
Instead of entering maintenance mode for the whole site (as suggested by Heroku's docs), we'll put it into a slightly degraded state by removing (read/write) access to Redis.
This won't affect access to public resources, but will affect anyone with an existing login session or establishing a new login session during the very brief switchover window:
Any group member change made via the RESTful API ({PUT, DELETE} /groups/{name}/settings/roles/{role}/members/{username}) during the period will lose the "userStaleBefore" mark for the changed member. Users will have to manually log out then back in or wait up to an hour for those changes to take effect.
Existing login sessions will be temporarily "forgotten". They'll be "remembered" again after the upgrade.
New login sessions established during the upgrade will be permanently forgotten after the upgrade. Anyone unfortunate enough to encounter this will need to log in again, although based on current usage, it can be expected to affect approximately zero people.
The manual steps come with the benefit of allowing the majority of the site to remain usable.
Warning
During each step that causes a restart of the app, web requests may need to wait ~30 seconds for a response. There is an open issue tracking this behavior.
Note
Heroku provides an in-place upgrade method that is much simpler than the steps below. However, there is no option to roll back in case anything unexpected happens. If we find ourselves going through this process more often without any failures and this becomes too tedious, it may be worth switching to use the in-place upgrade method.
See previous discussion for tradeoffs between the approaches.
Gather information.
# This assumes there is only one instance.
old_instance=$(heroku addons --app nextstrain-server --json | jq -r '.[] | select(.addon_service.name == "heroku-redis") | .name')
heroku redis:info "$old_instance" -a nextstrain-server | tee redis-info
heroku addons:info "$old_instance" | tee redis-addon-info
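Because the lookup above assumes a single Redis add-on, it may be worth checking that assumption before continuing. A minimal sketch follows; the function name is hypothetical, and it only counts the non-empty lines in the value it is given.

```shell
# Hypothetical guard: succeed only if exactly one add-on name was found.
assert_single_instance() {
    # grep -c prints the number of non-empty lines; it exits non-zero
    # when that number is 0, hence the `|| true`.
    count=$(printf '%s\n' "$1" | grep -c . || true)
    if [ "$count" -ne 1 ]; then
        echo "expected exactly 1 Redis add-on, found $count" >&2
        return 1
    fi
}

# Usage, once old_instance has been set:
#   assert_single_instance "$old_instance" || exit 1
```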
Start a watch session on logs for nextstrain-server:
heroku logs --app nextstrain-server --tail
Tip
These logs show console output from the server as well as every incoming HTTP request. This is inherently noisy, but any messages with status=5xx should be investigated.
Keep the logs streaming until the update is completed, since heroku logs is only able to provide the last 1500 messages retrospectively.
It can also be useful to refresh https://nextstrain.org after every step that updates the app to ensure the app is still running as expected.
Log in to one of the instances (dev, canary, server) if you are not already logged in.
Disable the Redis requirement across apps:
for app in nextstrain-{dev,canary,server}; do
    heroku config:set REDIS_REQUIRED=false -a "$app"
done
Disable writes to Redis by changing its attachment from REDIS to OLD_REDIS on the apps:
for app in nextstrain-{dev,canary,server}; do
    heroku addons:attach --as OLD_REDIS "$old_instance" -a "$app"
    heroku addons:detach REDIS -a "$app"
done
Warning
This step causes the site to enter the degraded state described earlier. If the need arises, you can roll back to the old instance:
for app in nextstrain-{dev,canary,server}; do
    heroku addons:attach --as REDIS "$old_instance" -a "$app"
    heroku addons:detach OLD_REDIS -a "$app"
done
Create the new, upgraded Redis instance on nextstrain-server as a fork (snapshot copy) of the old:
heroku addons:create heroku-redis:premium-0 \
    --as NEW_REDIS \
    -a nextstrain-server \
    --fork "$(heroku config:get OLD_REDIS_URL -a nextstrain-server)"
Set a variable for the new instance name to be used in subsequent steps:
# Replace value with name from output of previous step
new_instance="redis-X-N"
Wait for it to be ready:
heroku addons:info "$new_instance"
Its State will change from creating to created.
Check that the fork is done:
heroku redis:info "$new_instance" -a nextstrain-server
This starts at fork in progress and is supposed to change once completed (forks start as replicas and then switch to primaries), but it may appear stuck in that state. If that happens, it should be safe to continue as long as all data looks to be transferred. Check this by entering the Redis CLI (heroku redis:cli) on both instances and comparing the output of:
Compare settings to the previous instance and adjust as necessary:
heroku redis:info "$new_instance" -a nextstrain-server | tee redis-new-info
git diff redis{,-new}-info
# make adjustments with other `heroku redis:…` commands
These adjustments have been necessary during previous upgrades (data:maintenances:window:update requires the Data Maintenance CLI Plugin):
heroku redis:maxmemory "$new_instance" -a nextstrain-server -p volatile-ttl
heroku data:maintenances:window:update "$new_instance" Friday 22:00 -a nextstrain-server
Use the new Redis instance across apps:
heroku redis:promote "$new_instance" -a nextstrain-server  # attaches as REDIS
heroku addons:detach NEW_REDIS -a nextstrain-server  # removes old NEW_REDIS attachment
for app in nextstrain-{dev,canary}; do
    heroku addons:attach --as REDIS "$new_instance" -a "$app"
done
Test that the new instance works:
Load the website and check that your login session is now "remembered" again.
Check that you can successfully log out and log back in.
Check that you can remove/add a member from a group.
Remove the old Redis instance:
for app in nextstrain-{dev,canary,server}; do
    heroku addons:detach OLD_REDIS -a "$app"
done
heroku addons:destroy "$old_instance"
Reinstate the Redis requirement across apps:
for app in nextstrain-{dev,canary,server}; do
    heroku config:unset REDIS_REQUIRED -a "$app"
done
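Several of the steps above repeat the same loop over all three apps. A small wrapper can reduce copy-paste mistakes and allows a dry run first. This is only a sketch: the name for_each_app is hypothetical, and it suits only the steps that apply identically to all three apps.

```shell
# Hypothetical helper: run a command once per app, appending `-a <app>`.
# Stops at the first failure so a partial change is noticed immediately.
for_each_app() {
    for app in nextstrain-dev nextstrain-canary nextstrain-server; do
        "$@" -a "$app" || return 1
    done
}

# Dry run: prefix with echo to print the commands instead of running them.
for_each_app echo heroku config:unset REDIS_REQUIRED
```

Dropping the echo runs the commands for real.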
Limitations
If our Redis instance reaches its maximum memory limit, existing keys will be evicted using the volatile-ttl policy to make space for new keys. This should preserve the most active logged-in sessions and avoid throwing errors if we hit the limit. If we regularly start hitting the memory limit, we should bump up to the next add-on plan, but I don't expect this to happen anytime soon with current usage.
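To gauge how close an instance is to its limit, the used_memory and maxmemory fields reported by the Redis INFO memory command can be compared. This sketch assumes the INFO output is supplied on standard input (for example, saved from a Redis CLI session); the function name and the sample numbers are hypothetical.

```shell
# Hypothetical helper: read `INFO memory` output on stdin and print the
# percentage of maxmemory in use. Redis terminates INFO lines with CRLF,
# so carriage returns are stripped first.
memory_used_percent() {
    tr -d '\r' | awk -F: '
        $1 == "used_memory" { used = $2 }
        $1 == "maxmemory"   { max = $2 }
        END {
            if (max > 0) printf "%.1f\n", used / max * 100
            else print "no maxmemory limit reported"
        }'
}

# Example with made-up numbers (25 MiB used of a 50 MiB limit):
printf 'used_memory:26214400\r\nmaxmemory:52428800\r\n' | memory_used_percent
```

A value that stays near 100 would be a sign that evictions are likely and that the next plan may be needed.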