libfoedus-core
FOEDUS Core Library
Snapshot Manager, which manages snapshot files of the database.
This package contains classes to handle snapshot files.
One snapshot consists of a set of snapshot files and a snapshot metadata file (snapshot.xml). Every snapshot is tagged with a base epoch and a valid-until epoch. The snapshot logically contains all information of the database up to the valid-until epoch: that is, the previous snapshot (whose valid-until epoch should equal the base epoch of this snapshot) plus all log entries in the transactional log from the base epoch until the valid-until epoch.
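The epoch tagging described above can be sketched as a simple contiguity check. This is a minimal illustration only: `SnapshotTag` and `is_contiguous_chain` are hypothetical names, not part of FOEDUS, and epochs are modeled as plain integers without the wrap-around handling a real epoch type would need.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a snapshot's epoch tags (not the FOEDUS Snapshot struct).
struct SnapshotTag {
  uint32_t base_epoch;         // logs after this epoch are covered by the snapshot
  uint32_t valid_until_epoch;  // logs up to this epoch are covered by the snapshot
};

// Returns true if every snapshot's base epoch equals the previous snapshot's
// valid-until epoch, so the chain covers the log without gaps or overlaps.
bool is_contiguous_chain(const std::vector<SnapshotTag>& chain) {
  for (std::size_t i = 1; i < chain.size(); ++i) {
    if (chain[i].base_epoch != chain[i - 1].valid_until_epoch) {
      return false;
    }
  }
  return true;
}
```

An empty chain or a single snapshot is trivially contiguous under this check.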
However, the snapshot does not necessarily contain all information from the previous snapshot physically, because that would make each snapshotting too expensive. One common approach is the LSM-Tree (Log Structured Merge Tree), but it is not a good fit for serializable transactional processing. Even a trivial primary key constraint would be too expensive on top of an LSM-Tree.
Instead, snapshot files in FOEDUS are overlays of the database image. Each snapshot file contains new versions of data pages that overwrite a required portion of the storages. They are not incremental new-tuples/tombstones data as in LSM. They are a complete representation of the storages, but they might contain pointers to old snapshot files where the pages underneath had no changes.
In the worst case, transactions in one epoch update just one tuple in every page, resulting in a snapshot that physically contains all data. However, this is rare, and such a workload is fundamentally expensive if the data size does not fit in DRAM (if it does, this approach is also fine).
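The overlay idea can be illustrated with a small sketch of how a new root might be assembled. Everything here is an assumption made for illustration: `PagePointer` and `compose_root` are hypothetical, a storage is reduced to a single root with child pointers, and new pages are simply numbered sequentially. Modified children get freshly written pages in the new snapshot, while unmodified children keep pointing into older snapshot files.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical pointer to a page inside some snapshot file.
struct PagePointer {
  uint16_t snapshot_id;  // which snapshot physically holds the page
  uint32_t page_index;   // position of the page within that snapshot
};

// Builds the child pointers of a new root page. Children listed in
// modified_children are rewritten into the new snapshot; all others reuse
// the pointer from the old root, so their pages are not copied.
std::vector<PagePointer> compose_root(
    const std::vector<PagePointer>& old_root,
    const std::set<std::size_t>& modified_children,
    uint16_t new_snapshot_id) {
  std::vector<PagePointer> new_root;
  new_root.reserve(old_root.size());
  uint32_t next_page_index = 0;
  for (std::size_t i = 0; i < old_root.size(); ++i) {
    if (modified_children.count(i) != 0) {
      new_root.push_back({new_snapshot_id, next_page_index++});
    } else {
      new_root.push_back(old_root[i]);
    }
  }
  return new_root;
}
```

The resulting root is a complete representation of the storage, yet the new snapshot physically holds only the pages that changed. In the worst case described above, every child is modified and nothing is shared.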
Snapshot Manager occasionally creates a new set of snapshot files along with their metadata file. The frequency is a tuning knob. The mechanism that creates snapshot files is called Log-Gleaner (foedus::snapshot::LogGleaner). See its documentation below.
LogGleaner is the main class that manages most of the mechanisms to construct a new set of snapshot files. The snapshot procedure constructs and calls this object during a snapshot. It receives a partitioning policy per storage (which snapshot partition each range of keys is sent to) and the beginning/ending epochs of the logs to glean.
Log-gleaning consists of two components: mapper (foedus::snapshot::LogMapper) and reducer (foedus::snapshot::LogReducer), named after the well-known map-reduce concepts.
LogGleaner launches a set of mapper threads (foedus::snapshot::LogMapper) to read log files. Each LogMapper corresponds to a foedus::log::Logger, the NUMA-local log writer that simply writes out log entries produced by local worker threads. Thus, each log file contains log entries that might belong to any partition. LogMapper maps each log entry to a partition and sends it to the reducer corresponding to that partition. For more details, see foedus::snapshot::LogMapper.
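The mapping step can be sketched as bucketing log entries by partition. This is a simplified illustration, not the LogMapper implementation: `LogEntry`, `partition_of`, and `map_logs` are hypothetical, and the range-based policy (fixed-width key ranges, with the last partition taking all remaining keys) is an assumption standing in for the per-storage partitioning policy.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified log entry.
struct LogEntry {
  uint64_t key;
  uint64_t ordinal;
};

// Assumed policy: partition p owns keys in [p * keys_per_partition,
// (p + 1) * keys_per_partition); the last partition owns everything beyond.
// Both keys_per_partition and partition_count must be non-zero.
uint16_t partition_of(uint64_t key, uint64_t keys_per_partition,
                      uint16_t partition_count) {
  uint64_t p = key / keys_per_partition;
  uint64_t last = static_cast<uint64_t>(partition_count) - 1;
  return static_cast<uint16_t>(p < last ? p : last);
}

// Distributes log entries read from one log file into one bucket per
// partition, standing in for "send it to the corresponding reducer".
std::vector<std::vector<LogEntry>> map_logs(const std::vector<LogEntry>& logs,
                                            uint64_t keys_per_partition,
                                            uint16_t partition_count) {
  std::vector<std::vector<LogEntry>> buckets(partition_count);
  for (const LogEntry& entry : logs) {
    buckets[partition_of(entry.key, keys_per_partition, partition_count)]
        .push_back(entry);
  }
  return buckets;
}
```

Because a single log file mixes entries for all partitions, one mapper typically feeds every bucket.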
LogGleaner also launches a set of reducer threads (foedus::snapshot::LogReducer), one for each NUMA node. LogReducer sorts log entries sent from LogMapper. The log entries are sorted by key and ordinal (*), then processed just like usual APPLY at the end of transaction, but on top of snapshot files.
(*) Otherwise, a correct result is not guaranteed. For example, imagine two log entries that modify the same key, one with ordinal 1 and the other with ordinal 2.
Ordinal 1 must be processed before ordinal 2.
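The ordering requirement can be shown with a minimal sketch that sorts by (key, ordinal) and then applies entries in that order. The types and functions (`KeyedLog`, `sort_logs`, `apply_logs`) are hypothetical, and "apply" is reduced to overwriting a value in a map rather than composing snapshot pages.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, simplified log entry: overwrite `key` with `value`.
struct KeyedLog {
  uint64_t key;
  uint64_t ordinal;
  std::string value;
};

// Sorts by key first, then by ordinal, so that all entries for one key are
// adjacent and appear in the order in which they were originally applied.
void sort_logs(std::vector<KeyedLog>& logs) {
  std::sort(logs.begin(), logs.end(),
            [](const KeyedLog& a, const KeyedLog& b) {
              return std::tie(a.key, a.ordinal) < std::tie(b.key, b.ordinal);
            });
}

// Applies the sorted entries; for each key the entry with the highest
// ordinal is applied last and therefore determines the final value.
std::map<uint64_t, std::string> apply_logs(std::vector<KeyedLog> logs) {
  sort_logs(logs);
  std::map<uint64_t, std::string> records;
  for (const KeyedLog& log : logs) {
    records[log.key] = log.value;
  }
  return records;
}
```

If two entries for the same key arrive in the opposite order and are applied unsorted, the older value would overwrite the newer one, which is the incorrect result the ordering rule prevents.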
LogGleaner coordinates the synchronization between mappers and reducers during snapshotting. At the beginning of snapshotting, the gleaner wakes up reducers and mappers. Mappers go to sleep when they have processed all logs. Once all mappers have gone to sleep, reducers also go to sleep when they have processed all logs they received. When all of them are done, the gleaner initiates the last wrap-up phase. Additionally, LogGleaner is in charge of receiving a termination request from the engine if the user invokes Engine::uninitialize() and requesting reducers/mappers to stop.
Reducers/mappers check whether they have been requested to stop when they become idle or complete all work. They perform this check periodically so that the latency to stop cannot become catastrophic.
After all mappers and reducers complete, the last phase of log gleaning is to construct root pages for the storages modified in this snapshotting. The gleaner collects root-page-info from each reducer and combines them to create the root page(s). When everything is set, the gleaner produces a map from storage ID to new root page ID. This will be written out in the snapshot metadata file by the snapshot manager.
Files | ||
---|---|---|
file | fwd.hpp | Forward declarations of classes in snapshot manager package. |
file | snapshot_id.hpp | Typedefs of ID types used in snapshot package. |
Classes | ||
---|---|---|
struct | foedus::snapshot::LogBuffer | Packages handling of 4-byte representation of position in log buffers. |
class | foedus::snapshot::SortedBuffer | Represents one input stream of sorted log entries. |
class | foedus::snapshot::InMemorySortedBuffer | Implementation of SortedBuffer that is backed by fully in-memory buffer. |
class | foedus::snapshot::DumpFileSortedBuffer | Implementation of SortedBuffer that is backed by a dumped file. |
class | foedus::snapshot::LogGleaner | A log-gleaner, which constructs a new set of snapshot files during snapshotting. |
class | foedus::snapshot::LogGleanerRef | A remote view of LogGleaner from all engines. |
struct | foedus::snapshot::LogGleanerResource::PerNodeResource | These buffers are used to read intermediate results from each reducer to compose the root page or other kinds of pages that weren't composed in each reducer (eg. ...) |
struct | foedus::snapshot::LogGleanerResource | Local resource for the log gleaner, which runs only in the master node. |
class | foedus::snapshot::LogMapper | A log mapper, which reads log files from one logger and sends them to corresponding log reducers. |
struct | foedus::snapshot::ReducerBufferStatus::Components | |
union | foedus::snapshot::ReducerBufferStatus | Compactly represents important status information of a reducer buffer. |
struct | foedus::snapshot::BlockHeaderBase | All log blocks in mapper/reducers start with this header. |
struct | foedus::snapshot::FullBlockHeader | All blocks that have content start with this header. |
struct | foedus::snapshot::LogReducerControlBlock | Shared data for LogReducer. |
class | foedus::snapshot::LogReducer | A log reducer, which receives log entries sent from mappers and applies them to construct new snapshot files. |
class | foedus::snapshot::LogReducerRef | A remote view of LogReducer from all engines. |
class | foedus::snapshot::MapReduceBase | Base class for LogMapper and LogReducer to share common code. |
struct | foedus::snapshot::MergeSort::SortEntry | Entries we actually sort. |
struct | foedus::snapshot::MergeSort::PositionEntry | Provides additional information for each entry we are sorting. |
struct | foedus::snapshot::MergeSort::InputStatus | Current status of each input. |
struct | foedus::snapshot::MergeSort::AdjustComparatorMasstree | Used in batch_sort_adjust_sort if the storage is a masstree storage. |
struct | foedus::snapshot::MergeSort::GroupifyResult | Represents a group of consecutive logs in the current batch. |
class | foedus::snapshot::MergeSort | Receives an arbitrary number of sorted buffers and emits one fully sorted stream of logs. |
struct | foedus::snapshot::Snapshot | Represents one snapshot that converts all logs from base epoch to valid_until epoch into snapshot file(s). |
class | foedus::snapshot::SnapshotManager | Snapshot manager that atomically and durably writes out a snapshot file. |
class | foedus::snapshot::SnapshotManagerPimpl | Pimpl object of SnapshotManager. |
struct | foedus::snapshot::SnapshotMetadata | Represents the data in one snapshot metadata file. |
struct | foedus::snapshot::SnapshotOptions | Set of options for snapshot manager. |
class | foedus::snapshot::SnapshotWriter | Writes out one snapshot file for all data pages in one reducer. |
struct | foedus::storage::Composer::ComposeArguments | Arguments for compose() |
struct | foedus::storage::Composer::ConstructRootArguments | Arguments for construct_root() |
struct | foedus::storage::Composer::DropVolatilesArguments | Arguments for drop_volatiles() |
struct | foedus::storage::Composer::DropResult | Return value of drop_volatiles() |
class | foedus::storage::Composer | Represents a logic to compose a new version of data pages for one storage. |
Typedefs | ||
---|---|---|
typedef uint16_t | foedus::snapshot::SnapshotId | Unique ID of Snapshot. |
typedef uint32_t | foedus::snapshot::BufferPosition | Represents a position in some buffer. |
Functions | ||
---|---|---|
SnapshotId | foedus::snapshot::increment (SnapshotId id) | Increment SnapshotId. |
struct foedus::snapshot::ReducerBufferStatus::Components |
Definition at line 71 of file log_reducer_impl.hpp.
Class Members | ||
---|---|---|
uint16_t | active_writers_ | |
uint16_t | flags_ | |
BufferPosition | tail_position_ |
struct foedus::storage::Composer::ComposeArguments |
Arguments for compose()
Definition at line 95 of file composer.hpp.
Class Members | ||
---|---|---|
Epoch | base_epoch_ | All log entries in these inputs are assured to be after this epoch. Also, they are assured to be within 2^16 from this epoch. |
SortedBuffer *const * | log_streams_ | Sorted runs. |
uint32_t | log_streams_count_ | Number of sorted runs. |
SnapshotFileSet * | previous_snapshot_files_ | To read existing snapshots. |
Page * | root_info_page_ | [OUT] Returns pointers and related information that is required to construct the root page. The data format depends on the composer. In all implementations, the information must fit in one page (should be, otherwise we can't have a root page) |
SnapshotWriter * | snapshot_writer_ | Writes out composed pages. |
AlignedMemory * | work_memory_ | Working memory to be used in this method. Automatically expanded if needed. |
struct foedus::storage::Composer::ConstructRootArguments |
Arguments for construct_root()
Definition at line 124 of file composer.hpp.
Class Members | ||
---|---|---|
LogGleanerResource * | gleaner_resource_ | All pre-allocated resources to help run construct_root(), such as memory buffers. |
SnapshotPagePointer * | new_root_page_pointer_ | [OUT] Returns pointer to new root snapshot page. |
SnapshotFileSet * | previous_snapshot_files_ | To read existing snapshots. |
const Page *const * | root_info_pages_ | Root info pages output by compose() |
uint32_t | root_info_pages_count_ | Number of root info pages. |
SnapshotWriter * | snapshot_writer_ | Writes out composed pages. |
typedef uint32_t foedus::snapshot::BufferPosition |
Represents a position in some buffer.
As logs are always 8-byte aligned, we divide the original byte position by 8. Thus, this can represent up to 8 * 2^32 bytes = 32 GB, which is the maximum value of log_mapper_io_buffer_mb_.
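The divide-by-8 encoding can be sketched as a pair of conversion helpers. The helper names below are assumptions for this illustration and are not claimed to match the library's actual functions; only the `BufferPosition` typedef is taken from the documentation above.

```cpp
#include <cassert>
#include <cstdint>

typedef uint32_t BufferPosition;

// Converts a byte offset into a BufferPosition. The offset must be 8-byte
// aligned and small enough that the result fits in 32 bits.
inline BufferPosition to_buffer_position(uint64_t byte_position) {
  assert(byte_position % 8 == 0);
  assert((byte_position >> 3) <= UINT32_MAX);
  return static_cast<BufferPosition>(byte_position >> 3);
}

// Converts a BufferPosition back into the byte offset it represents.
inline uint64_t from_buffer_position(BufferPosition position) {
  return static_cast<uint64_t>(position) << 3;
}
```

With this encoding the largest representable byte offset is 8 * (2^32 - 1), just under 32 GB, at the cost of requiring every position to be 8-byte aligned.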
Definition at line 72 of file snapshot_id.hpp.
typedef uint16_t foedus::snapshot::SnapshotId |
Unique ID of Snapshot.
Snapshot ID is a 16-bit integer. As we periodically merge all snapshots, we won't have 2^16 snapshots at one time. This ID wraps around, but it causes no issue as we never compare greater-than/less-than between snapshot IDs. All snapshots contain base and valid_until epochs, so we just compare them.
ID-0 is a special value that means NULL. Use the following method to increment a snapshot ID to preserve this invariant.
Definition at line 43 of file snapshot_id.hpp.
SnapshotId foedus::snapshot::increment (SnapshotId id) | inline |
Increment SnapshotId.
Definition at line 52 of file snapshot_id.hpp.
References ASSERT_ND.
Referenced by foedus::snapshot::SnapshotManagerPimpl::handle_snapshot_triggered(), and foedus::snapshot::SnapshotManagerPimpl::issue_next_snapshot_id().
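The wrap-around behavior that keeps ID 0 reserved as NULL can be sketched as follows. This is an illustration under stated assumptions rather than the library's source: the function name `increment_snapshot_id` and the constant `kNullSnapshotId` are hypothetical, and a plain `assert` stands in for the `ASSERT_ND` macro referenced above.

```cpp
#include <cassert>
#include <cstdint>

typedef uint16_t SnapshotId;

// Assumed name for the special "NULL" snapshot ID (value 0).
const SnapshotId kNullSnapshotId = 0;

// Returns the next snapshot ID, skipping 0 when the 16-bit value wraps.
inline SnapshotId increment_snapshot_id(SnapshotId id) {
  assert(id != kNullSnapshotId);  // the NULL ID itself is never incremented
  ++id;                           // 65535 wraps around to 0 here
  return id == kNullSnapshotId ? static_cast<SnapshotId>(1) : id;
}
```

Because snapshot IDs are only compared for equality, the jump from 65535 back to 1 does not affect ordering, which is determined by the epochs.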