Online backup

From the Neo4j Wiki

The online-backup component referred to on this page is obsolete as of 2011-02-08 (Neo4j version 1.3.M01, "Abisko Lampa"). A manual for the new backup component will arrive soon.
This page will teach you how to use the Neo4j online backup component.


Summary

In summary, this is what you have to do:
  • Preparation:
    • shut down the database and copy all its files to the backup location
    • configure the database to keep its logical log in the future
  • Performing backup:
    • instantiate Neo4jBackup according to your scenario (location/running, single/multiple data sources)
    • configure file output and log level of the backup log
    • off you go, doBackup()!

Online backup basics

Note: If you don't really need to back up from a running Neo4j instance, simply shut down Neo4j and copy the store files to a different location.

The online backup utility can be used to synchronize a destination Neo4j database from a source Neo4j database. The source database is a running EmbeddedGraphDatabase instance, which can continue to run as usual during the backup.

The destination is either a running EmbeddedGraphDatabase or a file system location with a Neo4j database in it. The destination database has to start out as a copy of the files of the original data store.

All completed transactions for all included data sources will be copied to the backup. Transactions that are still open don't affect the backup, and are of course not included in the backup.

The component information is located at: http://components.neo4j.org/neo4j-online-backup/

Adding online-backup as a Maven dependency is done like this (assuming version 0.6-SNAPSHOT):

    <dependency>
    	<groupId>org.neo4j</groupId>
    	<artifactId>neo4j-online-backup</artifactId>
    	<version>0.6-SNAPSHOT</version>
    </dependency>


If you want to download the component as a jar file, it's found here: http://m2.neo4j.org/org/neo4j/neo4j-online-backup/

Database configuration

Note: All data sources included in a backup have to be set to keep their logical logs.

The backup relies on using the logical logs, so the original (source) database has to be configured to keep the logs.

With the latest SNAPSHOT version this can be done using the keep_logical_logs configuration setting. The value is either a comma-separated list of data sources, or true to keep logs for all registered data sources. If you use Neo4j together with the index component, this setting would apply:

keep_logical_logs=nioneodb,lucene

for data sources nioneodb and lucene. Or:

keep_logical_logs=true

for all registered data sources.
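To make the two forms of the value concrete, here is a minimal sketch of how such a setting could be interpreted. It is not Neo4j's own parsing code: the class KeepLogicalLogsSetting and its appliesTo method are hypothetical names made up for this illustration, and treating "true" case-insensitively is an assumption.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, NOT part of Neo4j: interprets a keep_logical_logs
// value as described above (either "true" or a comma-separated list).
final class KeepLogicalLogsSetting {
    private final boolean all;
    private final Set<String> names = new LinkedHashSet<String>();

    KeepLogicalLogsSetting(String value) {
        String trimmed = value == null ? "" : value.trim();
        // Assumption: "true" is matched case-insensitively.
        this.all = trimmed.equalsIgnoreCase("true");
        if (!all) {
            for (String name : trimmed.split(",")) {
                String cleaned = name.trim();
                if (!cleaned.isEmpty()) {
                    names.add(cleaned);
                }
            }
        }
    }

    /** Returns true if the named data source would keep its logical logs. */
    boolean appliesTo(String dataSourceName) {
        return all || names.contains(dataSourceName);
    }
}
```

With the value nioneodb,lucene, appliesTo("lucene-fulltext") returns false in this sketch, which is a reminder to list the fulltext data source explicitly if you use it.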

To apply the same configuration programmatically:

        EmbeddedGraphDatabase graphDb = new EmbeddedGraphDatabase( STORE_LOCATION_DIR );
        XaDataSourceManager xaDsMgr = graphDb.getConfig().getTxModule().getXaDataSourceManager();
        XaDataSource dataSource = xaDsMgr.getXaDataSource( "nioneodb" );
        dataSource.keepLogicalLogs( true );
        dataSource = xaDsMgr.getXaDataSource( "lucene" );
        dataSource.keepLogicalLogs( true );

If you are using fulltext indexing, add the lucene-fulltext data source as well.

There are also settings for auto-rotating the logs. These are the corresponding methods, using the default values for the settings:

        dataSource.setAutoRotate( true );
        dataSource.setLogicalLogTargetSize( 10 * 1024 * 1024 ); // 10 MB

How to perform backup

Note: The very first backup has to be performed by shutting down the Neo4j database and copying its files to the backup location. All subsequent backups can then be performed online using the online backup utility to keep the backup in sync with the live database.
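The initial offline copy can be done with any file copy tool. As one possible sketch in Java (assuming Java 7 or later for java.nio.file, and assuming the database has already been shut down), the following copies a store directory recursively. The names InitialStoreCopy and copyStore are hypothetical and not part of the online-backup component.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper for the initial (offline) copy of the store files.
final class InitialStoreCopy {
    /** Recursively copies sourceDir into targetDir, replacing existing files. */
    static void copyStore(final Path sourceDir, final Path targetDir) throws IOException {
        Files.walkFileTree(sourceDir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
                    throws IOException {
                // Recreate each source directory under the target.
                Files.createDirectories(targetDir.resolve(sourceDir.relativize(dir)));
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.copy(file, targetDir.resolve(sourceDir.relativize(file)),
                        StandardCopyOption.REPLACE_EXISTING);
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

A call such as InitialStoreCopy.copyStore( Paths.get( "/var/db" ), Paths.get( "/var/backup/neo4j-db" ) ) would then prepare the destination; the paths here are only examples.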

The backup method can differ in two ways:

  1. destination is a running EmbeddedGraphDatabase instance vs. only the location of a neo4j database is given
  2. there is just a single data source (e.g. neo4j) vs. multiple data sources (e.g. neo4j + lucene)

We will walk you through the different alternatives below.

Single data source; backup to file system location

        EmbeddedGraphDatabase graphDb = getTheGraphDbFromApp();
        String location = "/var/backup/neo4j-db";
        Backup backup = Neo4jBackup.neo4jDataSource( graphDb, location );
        // Previous versions have a constructor instead, like new Neo4jBackup( graphDb, location );
        backup.doBackup();

That's it.

Note: If there is a problem writing to the file system location, Backup.doBackup() will throw an IOException.
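One way to deal with that exception is to wrap the call and report the failure instead of letting it propagate. The sketch below is self-contained and therefore uses a hypothetical stand-in interface, BackupTask, that models only the doBackup() call; BackupRunner and runQuietly are likewise made-up names, not part of the component.

```java
import java.io.IOException;

// Hypothetical wrapper; BackupTask is a stand-in for the component's
// Backup type and models only the doBackup() call.
final class BackupRunner {
    interface BackupTask {
        void doBackup() throws IOException;
    }

    /** Runs the backup once; logs and returns false if an IOException occurs. */
    static boolean runQuietly(BackupTask task) {
        try {
            task.doBackup();
            return true;
        } catch (IOException e) {
            System.err.println("Backup failed: " + e.getMessage());
            return false;
        }
    }
}
```

In application code the task body would simply delegate to backup.doBackup(), and the boolean result could drive a retry or an alert.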

Single data source; backup to running backup database

        EmbeddedGraphDatabase graphDb = getTheGraphDbFromApp();
        String location = "/var/backup/neo4j-db";
        EmbeddedGraphDatabase backupGraphDb = new EmbeddedGraphDatabase( location );
        Backup backup = Neo4jBackup.neo4jDataSource( graphDb, backupGraphDb );
        // Previous versions have a constructor instead, like new Neo4jBackup( graphDb, backupGraphDb );
        backup.doBackup();
        backupGraphDb.shutdown();

Not much to say here. Feed both databases to Neo4jBackup and you should be fine.

Multiple data sources; backup to file system location

This variation will look up all registered data sources and include them in the backup.

        EmbeddedGraphDatabase graphDb = getTheGraphDbFromApp(); // assume lucene is hooked into this instance
        String location = "/var/backup/neo4j-db";
        Backup backup = Neo4jBackup.allDataSources( graphDb, location );
        // Previous versions have a constructor instead, like new Neo4jBackup( graphDb, location, Arrays.asList( "nioneodb", "lucene" ) );
        backup.doBackup();

Multiple data sources; backup to running data sources

This variation will look up all registered data sources and include them in the backup.

(TODO: missing info: how to wrap your data source to be used together with neo4j)

        EmbeddedGraphDatabase graphDb = getTheGraphDbFromApp();
        String location = "/var/backup/neo4j-db";
        EmbeddedGraphDatabase backupGraphDb = new EmbeddedGraphDatabase( location );
        IndexService backupIndexService = new LuceneIndexService( backupGraphDb );
        Backup backup = Neo4jBackup.allDataSources( graphDb, backupGraphDb );
        // Previous versions have a constructor instead, like new Neo4jBackup( graphDb, backupGraphDb, Arrays.asList( "nioneodb", "lucene" ) );
        // (in which case you had to know the actual data sources)
        backup.doBackup();
        backupIndexService.shutdown();
        backupGraphDb.shutdown();

Custom data sources

This variation should be considered an expert mode. Here you can choose exactly which data sources should be included in the backup.

        EmbeddedGraphDatabase graphDb = getTheGraphDbFromApp();
        String location = "/var/backup/neo4j-db";
        EmbeddedGraphDatabase backupGraphDb = new EmbeddedGraphDatabase( location );
        IndexService backupIndexService = new LuceneIndexService( backupGraphDb );
        Backup backup = Neo4jBackup.customDataSources( graphDb, backupGraphDb, "nioneodb", "lucene" );
        // Previous versions have a constructor instead, like new Neo4jBackup( graphDb, backupGraphDb, Arrays.asList( "nioneodb", "lucene" ) );
        backup.doBackup();
        backupIndexService.shutdown();
        backupGraphDb.shutdown();

To back up to a destination directory instead, use:

        String location = "/var/backup/neo4j-db";
        Backup backup = Neo4jBackup.customDataSources( graphDb, location, "nioneodb", "lucene" );
        backup.doBackup();

Manually transferring and applying logical logs

If you have a running Neo4j graph database which is set up to keep its logical logs, you can manually copy or move rotated logical logs from the server and have a client apply them to a destination database. The first step is still to start with a copy of the source database; from there you can apply new logs incrementally whenever you like.

This is done by starting a new JVM and running the org.neo4j.onlinebackup.ApplyNewLogs main class with the path to the destination database, where you have put the logical logs copied or moved from the source database (keeping the directory structure of the source database). It will then apply those logs to the destination database. Example, assuming a running source database in /var/db and a destination database (originating from the source database at some point) in /var/backup-db:

mv /var/db/*log.v* /var/backup-db/
java -cp $CLASSPATH_INCLUDING_ONLINE_BACKUP_AND_ITS_DEPENDENCIES \
            org.neo4j.onlinebackup.ApplyNewLogs /var/backup-db

If you're using LuceneIndexService/LuceneFulltextIndexService as well, you'll also have to move or copy their logs. The script can be extended like this:

mv /var/db/*log.v* /var/backup-db/
mv /var/db/lucene/*log.v* /var/backup-db/lucene/
mv /var/db/lucene-fulltext/*log.v* /var/backup-db/lucene-fulltext/
java -cp $CLASSPATH_INCLUDING_ONLINE_BACKUP_AND_INDEX_AND_THEIR_DEPENDENCIES \
            org.neo4j.onlinebackup.ApplyNewLogs /var/backup-db
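If you would rather do the move step from Java than from a shell script, the following sketch mirrors one mv line above, using the same *log.v* glob. It assumes Java 7 or later; RotatedLogMover and moveRotatedLogs are hypothetical names. It only moves files, so ApplyNewLogs still has to be run afterwards as shown above.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper mirroring: mv <sourceDir>/*log.v* <targetDir>/
final class RotatedLogMover {
    /** Moves files matching *log.v* from sourceDir to targetDir; returns the count. */
    static int moveRotatedLogs(Path sourceDir, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        // Collect the matches first so the directory is not modified while listing it.
        List<Path> logs = new ArrayList<Path>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(sourceDir, "*log.v*")) {
            for (Path log : stream) {
                logs.add(log);
            }
        }
        for (Path log : logs) {
            Files.move(log, targetDir.resolve(log.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
        }
        return logs.size();
    }
}
```

For the index data sources, the same method would be called once per subdirectory (for example lucene and lucene-fulltext), matching the extended script.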

Backup logs

By default, backup logs are sent to standard error (usually the console). If you want, you can also enable logging to a file (off by default) using the following method call:

        backup.enableFileLogger();

The log file will be named backup.log and created or appended to in the current working directory.

From version 0.6 (only snapshots available as of 2010-06-17) you can specify the backup log location like this:

        backup.enableFileLogger("/path/to/location/neo4j-backup.log");

Changed your mind? Then call:

        backup.disableFileLogger();

There are three different log levels to choose from:

        backup.setLogLevelNormal(); // default, few lines of output
        backup.setLogLevelDebug();  // detailed output
        backup.setLogLevelOff();    // no output at all

This setting affects both console and file log output.

Inspecting a backup

To inspect a backup nodespace that will be used again as the destination for online-backup, you have to connect to it in read-only mode. The Shell tool supports this; Neoclipse currently does not.

From code, make sure to use EmbeddedReadOnlyGraphDatabase when inspecting a backup nodespace. This will leave the database in a correct state for use with online-backup again.
