libfoedus-core
FOEDUS Core Library
Snapshot Manager

Detailed Description

Snapshot Manager, which manages snapshot files of the database.

This package contains classes to handle snapshot files.

Snapshot

One snapshot consists of a set of snapshot files and a snapshot metadata file (snapshot.xml). Every snapshot is tagged with a base epoch and a valid-until epoch. The snapshot logically contains all information in the database up to the valid-until epoch: that is, the previous snapshot (whose valid-until epoch should equal the base epoch of this snapshot) plus all log entries in the transactional log from the base epoch until the valid-until epoch.

Snapshot Files

However, a snapshot does not necessarily physically contain all information from the previous snapshot, because that would make taking each snapshot too expensive. One common approach is the LSM-Tree (Log-Structured Merge Tree), but it is not a good fit for serializable transaction processing. Even a trivial primary key constraint would be too expensive on top of an LSM-Tree.

Instead, snapshot files in FOEDUS are overlays of the database image. Each snapshot file contains new versions of data pages that overwrite the required portion of the storages. They are not incremental new-tuple/tombstone data as in LSM. They are a complete representation of the storages, but they might contain pointers to old snapshot files where the pages underneath had no changes.

In the worst case, transactions in one epoch update just one tuple in every page, resulting in a snapshot that physically contains all data. However, this is rare, and such a workload is fundamentally expensive if the data does not fit in DRAM (if it does, this approach is also fine).

Making a new Snapshot

Snapshot Manager occasionally creates a new set of snapshot files together with its metadata file. The frequency is a tuning knob. The mechanism that creates snapshot files is called Log-Gleaner (foedus::snapshot::LogGleaner). See its documentation below.

Log Gleaner Overview

LogGleaner is the main class that manages most mechanisms to construct a new set of snapshot files. The snapshot procedure constructs and calls this object during snapshotting. For log gleaning, it receives the partitioning policy per storage (which snapshot partition each range of keys is sent to) and the beginning/ending epochs of the logs to glean.

Log gleaning consists of two components: the mapper (foedus::snapshot::LogMapper) and the reducer (foedus::snapshot::LogReducer), named after the well-known map-reduce concepts.

Mapper

LogGleaner launches a set of mapper threads (foedus::snapshot::LogMapper) to read log files. Each LogMapper corresponds to a foedus::log::Logger, the NUMA-local log writer that simply writes out log entries produced by local worker threads. Thus, the log files contain log entries that might be sent to any partition. LogMapper maps each log entry to a partition and sends it to the reducer corresponding to that partition. For more details, see foedus::snapshot::LogMapper.
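
The mapping step can be sketched as bucketing log entries by partition. This assumes a simple range partitioning on integer keys; LogEntrySketch, partition_of, and map_to_partitions are hypothetical names, and the actual partitioning policy in FOEDUS is defined per storage.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified log entry.
struct LogEntrySketch {
  uint32_t storage_id;
  uint64_t key;
  uint32_t ordinal;
};

// Assumed policy: partition p owns keys [p * keys_per_partition, (p + 1) * keys_per_partition),
// and the last partition also takes every larger key.
inline uint16_t partition_of(uint64_t key, uint64_t keys_per_partition,
                             uint16_t partition_count) {
  assert(keys_per_partition > 0 && partition_count > 0);
  const uint64_t p = key / keys_per_partition;
  return static_cast<uint16_t>(p < partition_count ? p : partition_count - 1);
}

// Distributes the entries read from one log file into one bucket per partition.
// Each bucket would then be sent to the reducer that owns the partition.
inline std::vector<std::vector<LogEntrySketch>> map_to_partitions(
    const std::vector<LogEntrySketch>& logs, uint64_t keys_per_partition,
    uint16_t partition_count) {
  std::vector<std::vector<LogEntrySketch>> buckets(partition_count);
  for (const LogEntrySketch& log : logs) {
    buckets[partition_of(log.key, keys_per_partition, partition_count)].push_back(log);
  }
  return buckets;
}
```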

Reducer

LogGleaner also launches a set of reducer threads (foedus::snapshot::LogReducer), one for each NUMA node. LogReducer sorts the log entries sent from LogMapper. The log entries are sorted by key and ordinal (*), then processed just like the usual APPLY at the end of a transaction, but on top of snapshot files.

(*) Otherwise, a correct result is not guaranteed. For example, if two log entries modify the same key, the entry with ordinal 1 must be processed before the entry with ordinal 2.

Synchronization

LogGleaner coordinates the synchronization between mappers and reducers during snapshotting. At the beginning of snapshotting, the gleaner wakes up the reducers and mappers. Mappers go to sleep once they have processed all logs. When all mappers have gone to sleep, reducers also go to sleep once they have processed all logs they received. When all of them are done, the gleaner initiates the final wrap-up phase. Additionally, LogGleaner is in charge of receiving a termination request from the engine if the user invokes Engine::uninitialize(), and of requesting the reducers/mappers to stop.

Reducers/mappers check whether they have been requested to stop when they become idle or complete all work. They perform this check at least once every so often, so that the latency to stop is never catastrophic.

Constructing Root Pages

After all mappers and reducers complete, the last phase of log gleaning is to construct root pages for the storages modified in this snapshot. The gleaner collects root-page-info from each reducer and combines them to create the root page(s). When everything is set, the gleaner produces a map from storage ID to new root page ID. The snapshot manager writes this map out in the snapshot metadata file.

Note
This is a private implementation detail of Snapshot Manager, thus the file name ends with _impl. Do not include this header from a client program. A client program never needs to access this internal class.

Files

file  fwd.hpp
 Forward declarations of classes in snapshot manager package.
 
file  snapshot_id.hpp
 Typedefs of ID types used in snapshot package.
 

Classes

struct  foedus::snapshot::LogBuffer
 Packages handling of 4-byte representation of position in log buffers.
 
class  foedus::snapshot::SortedBuffer
 Represents one input stream of sorted log entries.
 
class  foedus::snapshot::InMemorySortedBuffer
 Implementation of SortedBuffer that is backed by fully in-memory buffer.
 
class  foedus::snapshot::DumpFileSortedBuffer
 Implementation of SortedBuffer that is backed by a dumped file.
 
class  foedus::snapshot::LogGleaner
 A log-gleaner, which constructs a new set of snapshot files during snapshotting.
 
class  foedus::snapshot::LogGleanerRef
 A remote view of LogGleaner from all engines.
 
struct  foedus::snapshot::LogGleanerResource::PerNodeResource
 These buffers are used to read intermediate results from each reducer to compose the root page or other kinds of pages that weren't composed in each reducer (eg.
 
struct  foedus::snapshot::LogGleanerResource
 Local resource for the log gleaner, which runs only in the master node.
 
class  foedus::snapshot::LogMapper
 A log mapper, which reads log files from one logger and sends them to corresponding log reducers.
 
struct  foedus::snapshot::ReducerBufferStatus::Components
 
union  foedus::snapshot::ReducerBufferStatus
 Compactly represents important status information of a reducer buffer.
 
struct  foedus::snapshot::BlockHeaderBase
 All log blocks in mapper/reducers start with this header.
 
struct  foedus::snapshot::FullBlockHeader
 All blocks that have content start with this header.
 
struct  foedus::snapshot::LogReducerControlBlock
 Shared data for LogReducer.
 
class  foedus::snapshot::LogReducer
 A log reducer, which receives log entries sent from mappers and applies them to construct new snapshot files.
 
class  foedus::snapshot::LogReducerRef
 A remote view of LogReducer from all engines.
 
class  foedus::snapshot::MapReduceBase
 Base class for LogMapper and LogReducer to share common code.
 
struct  foedus::snapshot::MergeSort::SortEntry
 Entries we actually sort.
 
struct  foedus::snapshot::MergeSort::PositionEntry
 Provides additional information for each entry we are sorting.
 
struct  foedus::snapshot::MergeSort::InputStatus
 Current status of each input.
 
struct  foedus::snapshot::MergeSort::AdjustComparatorMasstree
 Used in batch_sort_adjust_sort if the storage is a masstree storage.
 
struct  foedus::snapshot::MergeSort::GroupifyResult
 Represents a group of consecutive logs in the current batch.
 
class  foedus::snapshot::MergeSort
 Receives an arbitrary number of sorted buffers and emits one fully sorted stream of logs.
 
struct  foedus::snapshot::Snapshot
 Represents one snapshot that converts all logs from base epoch to valid_until epoch into snapshot file(s).
 
class  foedus::snapshot::SnapshotManager
 Snapshot manager that atomically and durably writes out a snapshot file.
 
class  foedus::snapshot::SnapshotManagerPimpl
 Pimpl object of SnapshotManager.
 
struct  foedus::snapshot::SnapshotMetadata
 Represents the data in one snapshot metadata file.
 
struct  foedus::snapshot::SnapshotOptions
 Set of options for snapshot manager.
 
class  foedus::snapshot::SnapshotWriter
 Writes out one snapshot file for all data pages in one reducer.
 
struct  foedus::storage::Composer::ComposeArguments
 Arguments for compose()
 
struct  foedus::storage::Composer::ConstructRootArguments
 Arguments for construct_root()
 
struct  foedus::storage::Composer::DropVolatilesArguments
 Arguments for drop_volatiles()
 
struct  foedus::storage::Composer::DropResult
 Return value of drop_volatiles()
 
class  foedus::storage::Composer
 Represents a logic to compose a new version of data pages for one storage.
 

Typedefs

typedef uint16_t foedus::snapshot::SnapshotId
 Unique ID of Snapshot.
 
typedef uint32_t foedus::snapshot::BufferPosition
 Represents a position in some buffer.
 

Functions

SnapshotId foedus::snapshot::increment (SnapshotId id)
 Increment SnapshotId.
 

Class Documentation

struct foedus::snapshot::ReducerBufferStatus::Components

Definition at line 71 of file log_reducer_impl.hpp.

Class Members
uint16_t active_writers_
uint16_t flags_
BufferPosition tail_position_
struct foedus::storage::Composer::ComposeArguments

Arguments for compose()

Definition at line 95 of file composer.hpp.

Class Members
Epoch base_epoch_ All log entries in these inputs are assured to be after this epoch. Also, they are assured to be within 2^16 epochs of this epoch.
SortedBuffer *const * log_streams_ Sorted runs.
uint32_t log_streams_count_ Number of sorted runs.
SnapshotFileSet * previous_snapshot_files_ To read existing snapshots.
Page * root_info_page_ [OUT] Returns pointers and related information required to construct the root page. The data format depends on the composer. In all implementations, the information must fit in one page (it should; otherwise we can't have a root page).
SnapshotWriter * snapshot_writer_ Writes out composed pages.
AlignedMemory * work_memory_ Working memory to be used in this method. Automatically expanded if needed.

struct foedus::storage::Composer::ConstructRootArguments

Arguments for construct_root()

Definition at line 124 of file composer.hpp.

Class Members
LogGleanerResource * gleaner_resource_ All pre-allocated resources to help run construct_root(), such as memory buffers.
SnapshotPagePointer * new_root_page_pointer_ [OUT] Returns pointer to the new root snapshot page.
SnapshotFileSet * previous_snapshot_files_ To read existing snapshots.
const Page *const * root_info_pages_ Root info pages output by compose()
uint32_t root_info_pages_count_ Number of root info pages.
SnapshotWriter * snapshot_writer_ Writes out composed pages.

Typedef Documentation

typedef uint32_t foedus::snapshot::BufferPosition

Represents a position in some buffer.

As log is always 8-byte aligned, we divide the original byte position by 8. Thus, this can represent up to 8 * 2^32 bytes = 32 GB, which is the maximum value of log_mapper_io_buffer_mb_.

See also
to_buffer_position
from_buffer_position

Definition at line 72 of file snapshot_id.hpp.
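
The divide-by-8 encoding can be sketched as below. The functions carry a _sketch suffix because they are illustrative stand-ins for to_buffer_position and from_buffer_position, whose actual definitions may differ; kMaxAddressableBytesSketch is likewise a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

using BufferPositionSketch = uint32_t;

// Largest byte range addressable with a 32-bit position in 8-byte units:
// 8 * 2^32 bytes = 2^35 bytes = 32 GB.
constexpr uint64_t kMaxAddressableBytesSketch = uint64_t{8} << 32;

// Converts an 8-byte-aligned byte position into a compact 32-bit position.
inline BufferPositionSketch to_buffer_position_sketch(uint64_t byte_position) {
  assert(byte_position % 8 == 0);                      // logs are 8-byte aligned
  assert(byte_position < kMaxAddressableBytesSketch);  // must fit in 32 bits after division
  return static_cast<BufferPositionSketch>(byte_position >> 3);
}

// Converts a compact position back into a byte position.
inline uint64_t from_buffer_position_sketch(BufferPositionSketch position) {
  return static_cast<uint64_t>(position) << 3;
}
```

For example, byte position 64 maps to buffer position 8, and the largest 32-bit position maps back to 8 bytes below the 32 GB limit.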

typedef uint16_t foedus::snapshot::SnapshotId

Unique ID of Snapshot.

Snapshot ID is a 16-bit integer. As we periodically merge all snapshots, we never have 2^16 snapshots at one time. The ID wraps around, but this causes no issue because we never compare snapshot IDs with greater-than/less-than. All snapshots contain base and valid_until epochs, so we compare those instead.

ID-0 is a special value that means NULL. Use the following method to increment a snapshot ID to preserve this invariant.

Definition at line 43 of file snapshot_id.hpp.

Function Documentation

inline SnapshotId foedus::snapshot::increment (SnapshotId id)

Increment SnapshotId.

Invariant
id != kNullSnapshotId

Definition at line 52 of file snapshot_id.hpp.

References ASSERT_ND.

Referenced by foedus::snapshot::SnapshotManagerPimpl::handle_snapshot_triggered(), and foedus::snapshot::SnapshotManagerPimpl::issue_next_snapshot_id().

inline SnapshotId increment(SnapshotId id) {
  ASSERT_ND(id != kNullSnapshotId);  // invariant: never increment the NULL ID
  ++id;
  if (id == kNullSnapshotId) {
    return 1;  // wrap around, and skip 0.
  } else {
    return id;
  }
}