How to move Elasticsearch data from one server to another

Question 1

How do I move Elasticsearch data from one server to another?

I have server A running Elasticsearch 1.1.1 on one local node with multiple indices. I would like to copy that data to server B, which runs Elasticsearch 1.3.4.

Procedure so far

Shut down ES on both servers and
scp all the data to the correct data directory on the new server (the data seems to be located at /var/lib/elasticsearch/ on my Debian boxes)
change permissions and ownership to elasticsearch:elasticsearch
start up the new ES server

When I look at the cluster with the ES head plugin, no indices appear.

It seems that the data is not being loaded. Am I missing something?

Question 2

The selected answer makes it sound slightly more complex than it is; the following is all you need (install npm on your system first).

npm install -g elasticdump
elasticdump --input=http://mysrc.com:9200/my_index --output=http://mydest.com:9200/my_index --type=mapping
elasticdump --input=http://mysrc.com:9200/my_index --output=http://mydest.com:9200/my_index --type=data

You can skip the first elasticdump command for subsequent copies if the mappings remain constant.

I have just done a migration from AWS to Qbox.io with the above without any problems.
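If you have several indices, the two commands above can be wrapped in a small helper. This is only a sketch: the `copy_index` function, the `SRC`/`DEST`/`ELASTICDUMP` variables and the index names in the usage comment are my own placeholders, not part of elasticdump.

```shell
# Assumed endpoints (placeholders); override them in the environment.
SRC=${SRC:-http://mysrc.com:9200}
DEST=${DEST:-http://mydest.com:9200}

# copy_index INDEX
# Copies the mapping first and, only if that succeeds, the documents.
# ELASTICDUMP can point at a non-global install of the binary.
copy_index() {
  "${ELASTICDUMP:-elasticdump}" --input="$SRC/$1" --output="$DEST/$1" --type=mapping &&
  "${ELASTICDUMP:-elasticdump}" --input="$SRC/$1" --output="$DEST/$1" --type=data
}

# Usage (index names are placeholders):
#   for idx in my_index my_other_index; do copy_index "$idx" || break; done
```

Stopping at the first failure avoids pushing documents into an index whose mapping was never created.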

More details at:

https://www.npmjs.com/package/elasticdump

Help page (as of February 2016) included for completeness:

elasticdump: Import and export tools for elasticsearch

Usage: elasticdump --input SOURCE --output DESTINATION [OPTIONS]

--input
                    Source location (required)
--input-index
                    Source index and type
                    (default: all, example: index/type)
--output
                    Destination location (required)
--output-index
                    Destination index and type
                    (default: all, example: index/type)
--limit
                    How many objects to move in bulk per operation
                    limit is approximate for file streams
                    (default: 100)
--debug
                    Display the elasticsearch commands being used
                    (default: false)
--type
                    What are we exporting?
                    (default: data, options: [data, mapping])
--delete
                    Delete documents one-by-one from the input as they are
                    moved.  Will not delete the source index
                    (default: false)
--searchBody
                    Perform a partial extract based on search results
                    (when ES is the input,
                    (default: '{"query": { "match_all": {} } }'))
--sourceOnly
                    Output only the json contained within the document _source
                    Normal: {"_index":"","_type":"","_id":"", "_source":{SOURCE}}
                    sourceOnly: {SOURCE}
                    (default: false)
--all
                    Load/store documents from ALL indexes
                    (default: false)
--bulk
                    Leverage elasticsearch Bulk API when writing documents
                    (default: false)
--ignore-errors
                    Will continue the read/write loop on write error
                    (default: false)
--scrollTime
                    Time the nodes will hold the requested search in order.
                    (default: 10m)
--maxSockets
                    How many simultaneous HTTP requests can we process make?
                    (default:
                      5 [node <= v0.10.x] /
                      Infinity [node >= v0.11.x] )
--bulk-mode
                    The mode can be index, delete or update.
                    'index': Add or replace documents on the destination index.
                    'delete': Delete documents on destination index.
                    'update': Use 'doc_as_upsert' option with bulk update API to do partial update.
                    (default: index)
--bulk-use-output-index-name
                    Force use of destination index name (the actual output URL)
                    as destination while bulk writing to ES. Allows
                    leveraging Bulk API copying data inside the same
                    elasticsearch instance.
                    (default: false)
--timeout
                    Integer containing the number of milliseconds to wait for
                    a request to respond before aborting the request. Passed
                    directly to the request library. If used in bulk writing,
                    it will result in the entire batch not being written.
                    Mostly used when you don't care too much if you lose some
                    data when importing but rather have speed.
--skip
                    Integer containing the number of rows you wish to skip
                    ahead from the input transport.  When importing a large
                    index, things can go wrong, be it connectivity, crashes,
                    someone forgetting to `screen`, etc.  This allows you
                    to start the dump again from the last known line written
                    (as logged by the `offset` in the output).  Please be
                    advised that since no sorting is specified when the
                    dump is initially created, there's no real way to
                    guarantee that the skipped rows have already been
                    written/parsed.  This is more of an option for when
                    you want to get most data as possible in the index
                    without concern for losing some rows in the process,
                    similar to the `timeout` option.
--inputTransport
                    Provide a custom js file to use as the input transport
--outputTransport
                    Provide a custom js file to use as the output transport
--toLog
                    When using a custom outputTransport, should log lines
                    be appended to the output stream?
                    (default: true, except for `$`)
--help
                    This page

Examples:

# Copy an index from production to staging with mappings:
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=http://staging.es.com:9200/my_index \
  --type=mapping
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=http://staging.es.com:9200/my_index \
  --type=data

# Backup index data to a file:
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=/data/my_index_mapping.json \
  --type=mapping
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=/data/my_index.json \
  --type=data

# Backup and index to a gzip using stdout:
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=$ \
  | gzip > /data/my_index.json.gz

# Backup ALL indices, then use Bulk API to populate another ES cluster:
elasticdump \
  --all=true \
  --input=http://production-a.es.com:9200/ \
  --output=/data/production.json
elasticdump \
  --bulk=true \
  --input=/data/production.json \
  --output=http://production-b.es.com:9200/

# Backup the results of a query to a file
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=query.json \
  --searchBody '{"query":{"term":{"username": "admin"}}}'

------------------------------------------------------------------------------
Learn more @ https://github.com/taskrabbit/elasticsearch-dump

Question 3

Use ElasticDump

1) yum install epel-release

2) yum install nodejs

3) yum install npm

4) npm install elasticdump

5) cd node_modules/elasticdump/bin

6)

./elasticdump \
  --input=http://192.168.1.1:9200/original \
  --output=http://192.168.1.2:9200/newCopy \
  --type=data

Question 4

You can use the snapshot/restore feature available in Elasticsearch for this. Once you have set up a filesystem-based snapshot repository, you can move it between clusters and restore it on a different cluster.
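A rough sketch of that flow with curl, in case it helps. The cluster URLs, the repository name `my_backup`, the snapshot name `snapshot_1` and the helper function names are all assumptions of mine. The repository location has to be a directory both clusters can read (shared, or copied across), and more recent Elasticsearch versions also require it to be listed under `path.repo` in elasticsearch.yml.

```shell
# Assumed cluster URLs and names (placeholders).
ES_A=${ES_A:-http://server-a:9200}   # source cluster
ES_B=${ES_B:-http://server-b:9200}   # destination cluster
REPO=my_backup
SNAP=snapshot_1

# JSON body that registers a shared-filesystem repository at the given path.
repo_body() {
  printf '{"type":"fs","settings":{"location":"%s"}}' "$1"
}

# snapshot_on_a PATH: register the repository on A and snapshot all indices.
snapshot_on_a() {
  curl -fsS -X PUT "$ES_A/_snapshot/$REPO" \
    -H 'Content-Type: application/json' -d "$(repo_body "$1")" &&
  curl -fsS -X PUT "$ES_A/_snapshot/$REPO/$SNAP?wait_for_completion=true"
}

# restore_on_b PATH: register the same repository on B and restore from it.
restore_on_b() {
  curl -fsS -X PUT "$ES_B/_snapshot/$REPO" \
    -H 'Content-Type: application/json' -d "$(repo_body "$1")" &&
  curl -fsS -X POST "$ES_B/_snapshot/$REPO/$SNAP/_restore"
}

# Usage (the path is a placeholder):
#   snapshot_on_a /mnt/es_backups
#   ... copy /mnt/es_backups to server B if it is not shared ...
#   restore_on_b /mnt/es_backups
```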

Question 5

I tried this on Ubuntu to move data from ELK 2.4.3 to ELK 5.1.1.

The steps are as follows:

$ sudo apt-get update

$ sudo apt-get install -y python-software-properties python g++ make

$ sudo add-apt-repository ppa:chris-lea/node.js

$ sudo apt-get update

$ sudo apt-get install npm

$ sudo apt-get install nodejs

$ npm install colors

$ npm install nomnom

$ npm install elasticdump

In your home directory, go to:

$ cd node_modules/elasticdump/

Then run the command.

If you need basic HTTP authentication, you can use it like this:

--input=http://name:password@localhost:9200/my_index

Copy an index from production:

$ ./bin/elasticdump --input="http://Source:9200/Sourceindex" --output="http://username:password@Destination:9200/Destination_index"  --type=data

Question 6

There is also the _reindex option.

From the documentation:

Through the Elasticsearch reindex API, available in version 5.x and later, you can connect your new Elasticsearch Service deployment remotely to your old Elasticsearch cluster. This pulls the data from the old cluster and indexes it into the new one. Reindexing essentially rebuilds the index from scratch, and it can be more resource intensive to run.

POST _reindex
{
  "source": {
    "remote": {
      "host": "https://REMOTE_ELASTICSEARCH_ENDPOINT:PORT",
      "username": "USER",
      "password": "PASSWORD"
    },
    "index": "INDEX_NAME",
    "query": {
      "match_all": {}
    }
  },
  "dest": {
    "index": "INDEX_NAME"
  }
}
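The same request can be sent from a shell with curl. A minimal sketch, assuming the new cluster is reachable at the `DEST` URL below and the old one needs no authentication (the function names and URLs are mine). As far as I know, the old cluster's host:port must also be listed under `reindex.remote.whitelist` in the new cluster's elasticsearch.yml, otherwise the request is rejected.

```shell
# Assumed URL of the NEW cluster, which runs the reindex (placeholder).
DEST=${DEST:-http://localhost:9200}

# reindex_body REMOTE_URL INDEX
# Builds the request body; source and destination index share the same name.
reindex_body() {
  printf '{"source":{"remote":{"host":"%s"},"index":"%s"},"dest":{"index":"%s"}}' \
    "$1" "$2" "$2"
}

# reindex_from_remote REMOTE_URL INDEX
# Asks the new cluster to pull INDEX from the old cluster.
reindex_from_remote() {
  curl -fsS -X POST "$DEST/_reindex" \
    -H 'Content-Type: application/json' -d "$(reindex_body "$1" "$2")"
}

# Usage (host and index are placeholders):
#   reindex_from_remote http://old-server:9200 my_index
```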

Question 7

If you can add the second server to the cluster, you can do this:

Add server B to the cluster with server A
Increase the number of replicas for the indices
ES will automatically copy the indices to server B
Shut down server A
Decrease the number of replicas for the indices

This will only work if the number of replicas is equal to the number of nodes.
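The replica steps can be done through the index settings API. A sketch, assuming the cluster answers at the `ES` URL below (the variable and function names are mine); it changes the replica count for every index in the cluster, so narrow the URL to a single index if that is not what you want.

```shell
# Assumed address of any node in the cluster (placeholder).
ES=${ES:-http://localhost:9200}

# JSON body that sets the replica count.
replicas_body() {
  printf '{"index":{"number_of_replicas":%d}}' "$1"
}

# set_replicas N: apply the replica count to all indices.
set_replicas() {
  curl -fsS -X PUT "$ES/_settings" \
    -H 'Content-Type: application/json' -d "$(replicas_body "$1")"
}

# wait_green: block (up to 10 minutes) until all replicas are allocated.
wait_green() {
  curl -fsS "$ES/_cluster/health?wait_for_status=green&timeout=600s"
}

# Usage:
#   set_replicas 1 && wait_green   # after B has joined; then shut down A
#   set_replicas 0                 # afterwards, pointing ES at server B
```

Waiting for green status before shutting down server A matters: until then the copies on server B may be incomplete.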

Question 8

If anyone runs into the same issue when trying to dump from Elasticsearch < 2.0 to > 2.0, you need to do:

elasticdump --input=http://localhost:9200/$SRC_IND --output=http://$TARGET_IP:9200/$TGT_IND --type=analyzer
elasticdump --input=http://localhost:9200/$SRC_IND --output=http://$TARGET_IP:9200/$TGT_IND --type=mapping
elasticdump --input=http://localhost:9200/$SRC_IND --output=http://$TARGET_IP:9200/$TGT_IND --type=data --transform "delete doc._source['_id']"

Question 9

I have always had success simply copying the index directory/folder over to the new server and restarting it. You can find the index ID by running GET /_cat/indices, and the folder corresponding to this ID is located in data\nodes\0\indices (usually inside your elasticsearch folder unless you moved it).
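To map index names to folder names, something like the following may help. It is a sketch under assumptions: a Linux install with the data directory at /var/lib/elasticsearch (as in the question), a node at the `ES` URL, and a version recent enough (5.x or later, as far as I know) for `_cat/indices` to expose the `uuid` column. The function names are mine.

```shell
# Assumed node address and data directory (placeholders).
ES=${ES:-http://localhost:9200}
DATA_DIR=${DATA_DIR:-/var/lib/elasticsearch}

# list_index_ids: print "index uuid" pairs, one per line.
list_index_ids() {
  curl -fsS "$ES/_cat/indices?h=index,uuid"
}

# index_dir DATA_DIR UUID: path of the on-disk folder for that index.
index_dir() {
  printf '%s/nodes/0/indices/%s' "$1" "$2"
}

# Usage (server name is a placeholder; stop Elasticsearch on both sides first):
#   list_index_ids
#   scp -r "$(index_dir "$DATA_DIR" <uuid>)" server-b:"$DATA_DIR/nodes/0/indices/"
```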

Question 10

We can use elasticdump or multielasticdump to take a backup and restore it, which lets us move data from one server/cluster to another server/cluster.

Please see the detailed answer I have provided here.

Question 11

If you simply need to transfer data from one Elasticsearch server to another, you can also use elasticsearch-document-transfer.

Steps:

Open a directory in your terminal and run
$ npm install elasticsearch-document-transfer
Create a file config.js
Add the connection details of both Elasticsearch servers in config.js
Set the appropriate values in options.js
Run in the terminal
$ node index.js

Question 12

You can take a snapshot of the complete state of your cluster (including all data indices) and restore it (using the restore API) in the new cluster or server.