Elasticsearch util to copy/reindex index(es)

August 30th, 2015

Elasticsearch (and the entire ELK stack) is a pretty useful piece of open source software for analyzing large datasets. I manage a fairly large ELK infrastructure at work: 90+ ES clusters and 300+ TB of data. One of the things I’ve found myself having to do is copy and/or reindex one or more indices, sometimes within the same ES cluster, sometimes moving them to another cluster.

Either way, it is something I do often, yet always in an ad hoc manner. It’s not worth setting up Logstash configs to do this and then tearing them down each time.

Here is an example Logstash config to do something like this:

input {
  elasticsearch {
    hosts => [ "host1", "host2", ..., "hostN" ]
    index => "index"
  }
}
filter {
  ......
}
output {
  elasticsearch {
    .....
  }
}

This gets old fast when there are many indices, so I wrote a tool in Go to do it. I used olivere's elastic Go library (https://github.com/olivere/elastic).

I call it espipe and put it in my GitHub repo: https://github.com/TinLe/tools.

You will need to download it and make sure you have a Go build environment set up. Then change into the directory where espipe.go is located and run go build.

If you don’t have a Go build environment set up and just want the binary, you can download espipe (built for Linux x86_64).

 

Simple usage:

$ ./espipe -h
Usage of ./espipe:
  -bulksize int
    	Number of docs to send to ES per chunk (default to 500) (default 500)
  -dst string
    	Destination ES cluster (default to http://localhost:9200) (default "http://localhost:9200")
  -sidx string
    	Source index(es) to copy (default to all '*') (default "logstash*")
  -src string
    	Source ES cluster (default to http://localhost:9200) (default "http://localhost:9200")
  -tidx string
    	Target index to copy (default to 'copyidx') (default "copyidx")

# The following copies all nginx-access-YYYY.MM.DD indices from anothercluster to
# the local cluster, consolidating them all into one index
$ ./espipe -dst http://localhost:9200 -src http://anothercluster:9200 -sidx 'nginx-access*' -tidx 'nginx-consolidated' -bulksize 1000
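For anyone curious how a command line like this maps onto Go, here is a minimal sketch of declaring the same five flags with the standard flag package. The flag names and defaults are taken from the help output above; the Options struct, the parseOptions helper, and the check that bulksize is positive are my own assumptions for illustration and are not necessarily how espipe does it.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// Options holds the settings listed in the help output.
// The struct itself is a hypothetical grouping for this sketch.
type Options struct {
	Src         string
	Dst         string
	SrcIndex    string
	TargetIndex string
	BulkSize    int
}

// parseOptions parses args (without the program name) into Options.
func parseOptions(args []string) (Options, error) {
	var o Options
	fs := flag.NewFlagSet("espipe", flag.ContinueOnError)
	fs.StringVar(&o.Src, "src", "http://localhost:9200", "Source ES cluster")
	fs.StringVar(&o.Dst, "dst", "http://localhost:9200", "Destination ES cluster")
	fs.StringVar(&o.SrcIndex, "sidx", "logstash*", "Source index(es) to copy")
	fs.StringVar(&o.TargetIndex, "tidx", "copyidx", "Target index to copy")
	fs.IntVar(&o.BulkSize, "bulksize", 500, "Number of docs to send to ES per chunk")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	// Assumed sanity check: a chunk size of zero or less makes no sense.
	if o.BulkSize <= 0 {
		return o, fmt.Errorf("bulksize must be positive, got %d", o.BulkSize)
	}
	return o, nil
}

func main() {
	opts, err := parseOptions(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("%+v\n", opts)
}
```

Using a FlagSet instead of the package-level flag functions keeps the parsing in a plain function that takes a slice of arguments, which makes it easy to exercise without touching os.Args.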