Upgrading an Elasticsearch cluster from 2.x to 5.0.0

Our sister project Sisyphe is running an Elasticsearch cluster. We wrote two Briks to help us manage this cluster: client::elasticsearch and client::elasticsearch::query. The first one offers some raw mappings to the Elasticsearch API and is mainly built on the Search::Elasticsearch Perl module. The second one serves as a quick way to perform basic searches on your indices. In this post, we will describe how to use these modules to perform an Elasticsearch major version upgrade.

Requirements

You will of course have to install Metabrik on a host with access to all nodes of your cluster. Once done, you have to load and configure the required Briks:

use brik::tool
run brik::tool install client::elasticsearch
run brik::tool install client::elasticsearch::query

Once installed, you do not need to call these Commands anymore (except to update dependencies). You should do it now, though, in case you have an older version of the Search::Elasticsearch Perl module: without version 5.x, this procedure will fail.

use client::elasticsearch
use client::elasticsearch::query
my $nodes = [ 'http://node1:9200', 'http://node2:9200', 'http://node3:9200' ]
set client::elasticsearch nodes $nodes
set client::elasticsearch::query nodes $nodes
run client::elasticsearch open

You can put these lines in your ~/.metabrik_rc file and start metabrik.
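
For the curious, the Brik is essentially a wrapper around a Search::Elasticsearch client. A minimal sketch of the equivalent raw Perl (illustrative only, not the Brik's actual code) looks like this:

use Search::Elasticsearch;

# Same node list as configured in the Brik
my $es = Search::Elasticsearch->new(
   nodes => [ 'http://node1:9200', 'http://node2:9200', 'http://node3:9200' ],
);

# Quick connectivity check
my $info = $es->info;
print "Running Elasticsearch ".$info->{version}{number}."\n";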

Backup all your indices

Performing such a major upgrade is a risky thing. We urge you not to skip this step. To make things easy, we developed an export_as_csv Command. It is as simple as executing it for all your indices like:

run client::elasticsearch list_indices
for (@$RUN) { $CON->run('client::elasticsearch', 'export_as_csv', $_) }

All your indices will be saved in the current directory as CSV files, one per combination of index and type.
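
As an illustration, such an export boils down to a scrolled search written out as CSV rows. Here is a rough sketch using Search::Elasticsearch and Text::CSV directly; the column names and file layout are assumptions for the example, not necessarily what export_as_csv produces:

use Search::Elasticsearch;
use Text::CSV;

my $es  = Search::Elasticsearch->new(nodes => [ 'http://node1:9200' ]);
my $csv = Text::CSV->new({ binary => 1, eol => "\n" });

# Scroll through every document of one index
my $scroll = $es->scroll_helper(
   index => 'www-2016-02-06',
   size  => 1000,
   body  => { query => { match_all => {} } },
);

open(my $fh, '>', 'www-2016-02-06.csv') or die $!;
while (my $doc = $scroll->next) {
   my $s = $doc->{_source};
   # Column choice is an assumption for the example
   $csv->print($fh, [ map { $s->{$_} } qw(host uri status) ]);
}
close($fh);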

First steps with client::elasticsearch Brik

You have plenty of Commands available in this Brik. We will not describe all of them here, but you can start by typing help at the prompt and try the info Command:

help client::elasticsearch
run client::elasticsearch info

You may also try the get_cluster_health Command or the list_indices one:

run client::elasticsearch get_cluster_health
run client::elasticsearch list_indices

Now that you get the picture, try playing with some other Commands, like the www_search Command:

run client::elasticsearch www_search * www-2016-02-06
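
For reference, these exploratory Commands roughly map to the following Search::Elasticsearch calls ($es being the client from the first sketch; this is an illustration, not the Brik's actual implementation):

# Cluster and index overview (like get_cluster_health and list_indices)
my $health  = $es->cluster->health;
my $indices = $es->cat->indices(v => 1);
print "Cluster status: $health->{status}\n";

# Simple query-string search on one index, like www_search
my $results = $es->search(index => 'www-2016-02-06', q => '*', size => 10);
print "Hits: ".$results->{hits}{total}."\n";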

Time to upgrade your cluster

Elasticsearch 5.0.0 is out, and we wanted to give it a try. We had to update the client::elasticsearch Brik to make it compatible with this new version, but more importantly, we had to upgrade our cluster from 2.4.x to 5.0.0. Here is the procedure you may apply; it is based on the official documentation for a cluster upgrade.

Stop your indexation tasks

Of course, we will have to stop all ES (Elasticsearch) instances, so the first step is to stop all of your indexation tasks. We are mainly using Logstash, so we have to stop these processes on all instances. The specific command to run on your servers depends on the operating system. For us, it is a matter of shutting it down like:

sudo service logstash stop

Once done, you have to perform a synced flush and disable shard allocation as described in the documentation. There is a Command for each, plus another one to verify the setting has been applied:

run client::elasticsearch flush_synced
run client::elasticsearch disable_shard_allocation
run client::elasticsearch get_cluster_settings

Then perform another synced flush to make recovery faster after cluster restart. It may take some time.

run client::elasticsearch flush_synced
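
These Commands roughly correspond to a cluster settings update and a synced flush. A minimal Search::Elasticsearch sketch, reusing the $es client from earlier (illustrative only, not the Brik's code):

# Disable shard allocation before shutting the nodes down
# (persistent so it survives the full cluster restart)
$es->cluster->put_settings(body => {
   persistent => { 'cluster.routing.allocation.enable' => 'none' },
});

# Synced flush to speed up shard recovery after the restart
$es->indices->flush_synced;

# Verify the setting has been applied
my $settings = $es->cluster->get_settings;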

Backup required indices

You will have to backup your indices. You should have already done it in the first step. Some of them were probably created with an older version of ES (before 2.0.0); the backups will be used to restore them in the new index format for ES 5.0.0. After the backup is done, you must delete the old indices so they will not interfere with the startup of the ES 5.0.0 process. Our upgrade process failed the first time because of those old indices, so we had to export them from the ES 2.x cluster and import them back after the ES 5.x upgrade. The typical error message is:

"The index [[index-2016-02-03/tIhwAIL3R6G4zTUG6ucf6g]] was created before v2.0.0.beta1. It should be reindexed in Elasticsearch 2.x before upgrading to 5.0.0."
run client::elasticsearch list_indices_version *

If you have indices older than version “2020199” (this value means 2.2.1), you should consider reindexing instead (see the previously mentioned CSV import/export method). If the backup task completed successfully, you can now safely delete your backed-up indices:

run client::elasticsearch delete_index index-1,index-2,other-*
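
If you prefer to check index creation versions and delete old indices by hand, a rough Search::Elasticsearch equivalent would be (a sketch, not the Brik's implementation):

# index.version.created holds the numeric version an index was created with
my $settings = $es->indices->get_settings(index => '_all');
for my $index (sort keys %$settings) {
   print "$index: ".$settings->{$index}{settings}{index}{version}{created}."\n";
}

# Delete indices that have already been backed up
$es->indices->delete(index => 'index-1,index-2,other-*');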

Alternative backup option with snapshotting

You may also consider using the snapshotting feature available in ES to backup your indices. This feature will unfortunately not upgrade them, and you will still have to export/import some of them as CSV for reindexing. But for future management of your indices, it is a feature worth knowing.

To take advantage of it, you first have to have a shared filesystem available to all your nodes. We configured a shared NFS server for that, and updated the elasticsearch.yml configuration file to add the following shared path:

path.repo: ["/nfs/backup"]

Create the snapshot repository and backup

Once NFS is set up and running and all nodes can read/write to it, you have to create a snapshot repository:

run client::elasticsearch create_shared_fs_snapshot_repository /nfs/backup/es

Verify it has worked:

run client::elasticsearch get_snapshot_repositories

Then perform the backup as either a full backup or as a selected backup of specific indices:

run client::elasticsearch create_snapshot
run client::elasticsearch create_snapshot_for_indices "[ qw(index1 index2 other-*) ]"

Wait for it to be done and look at its progress with:

do { $RUN = ! $CON->run('client::elasticsearch', 'is_snapshot_finished'); print "Is running: $RUN\n"; sleep(5) } while ($RUN)

Alternatively, look at its status:

run client::elasticsearch get_snapshot_status
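
For reference, these snapshot Commands are thin wrappers around the snapshot API. A Search::Elasticsearch sketch, where the repository and snapshot names are only examples:

# Register a shared filesystem repository (the path must be listed in path.repo)
$es->snapshot->create_repository(
   repository => 'es',
   body       => { type => 'fs', settings => { location => '/nfs/backup/es' } },
);

# Snapshot everything, or only selected indices
$es->snapshot->create(
   repository => 'es',
   snapshot   => 'before-5.0.0',
   body       => { indices => 'index1,index2,other-*' },
);

# Check progress
my $status = $es->snapshot->status(repository => 'es', snapshot => 'before-5.0.0');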

Restore snapshotted indices

Later on, if you want to restore indices:

run client::elasticsearch restore_snapshot snapshot repository

Note: you may still be unable to restore ancient indices. If you have to restore only specific indices, you can do it by using the restore_snapshot_for_indices Command:

run client::elasticsearch restore_snapshot_for_indices "[ qw(index-2016-05-*) ]" snapshot repository

And to see progress:

do { $RUN = $CON->run('client::elasticsearch', 'count_yellow_shards'); print "Remaining: $RUN\n"; sleep(60) } while ($RUN)
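
Behind the scenes this maps to the snapshot restore API. A Search::Elasticsearch sketch (names are examples):

# Restore only some indices from a snapshot
$es->snapshot->restore(
   repository => 'es',
   snapshot   => 'before-5.0.0',
   body       => { indices => 'index-2016-05-*' },
);

# Block until the cluster reaches at least yellow status
$es->cluster->health(wait_for_status => 'yellow');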

Shutdown and upgrade all nodes

Now stop all your elasticsearch processes.

sudo service elasticsearch stop

The software upgrade process depends on your operating system, so we will not describe it here. You also have to consider upgrading any installed plugins. After the software upgrade, you will have to change some configuration directives which are either new or obsolete. For instance, we had to remove:

index.number_of_replicas
discovery.zen.ping.multicast.enabled
path.work
path.plugins

And we had to create a new directory:

mkdir /usr/local/etc/elasticsearch/scripts

A list of other breaking changes can be found here. It is also a good idea to set the minimum master nodes parameter, as described here:

discovery.zen.minimum_master_nodes: 2

Time to start and pray

Before restarting, we rename the old log file so we can easily see the new process starting up and spot potential errors. We then restart all our nodes and pray for a good and fast recovery.

Note: for FreeBSD, we had to modify the rc.d script to enforce Java heap sizes:

ES_JAVA_OPTS="-Xms8g -Xmx8g"
export ES_JAVA_OPTS

Without these settings, the typical error message is:

[2016-11-13T07:43:25,786][ERROR][o.e.b.Bootstrap ] [node1] node validation exception
 bootstrap checks failed
 initial heap size [536870912] not equal to maximum heap size [8558477312]; this can cause resize pauses and prevents mlockall from locking the entire heap

Restore all indices and reenable indexation

Your upgrade should be complete now. You can restart your Elasticsearch and Logstash processes on all your nodes with:

sudo service elasticsearch start
sudo service logstash start

You can then re-enable shard allocation and import your saved CSV backups:

run client::elasticsearch enable_shard_allocation
run shell::command capture ls *.csv
for (@$RUN) { $CON->run('client::elasticsearch', 'import_from_csv', $_) }
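
Importing a CSV back is essentially a bulk indexing job. A hedged sketch using the bulk helper from Search::Elasticsearch and Text::CSV, with the same assumed columns as in the export sketch above (not necessarily what import_from_csv expects):

use Text::CSV;

my $csv  = Text::CSV->new({ binary => 1 });
my $bulk = $es->bulk_helper(index => 'www-2016-02-06', type => 'www');

open(my $fh, '<', 'www-2016-02-06.csv') or die $!;
while (my $row = $csv->getline($fh)) {
   my ($host, $uri, $status) = @$row;
   $bulk->index({ source => { host => $host, uri => $uri, status => $status } });
}
close($fh);

$bulk->flush;   # send any remaining buffered documents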

Upgrade of Logstash and Kibana

Finally, you have to upgrade Logstash and Kibana to version 5.0.0. Fortunately, it worked perfectly for us. We hope your upgrade will go smoothly thanks to this guide; please let us know of any success or failure 🙂

 
