libfoedus-core
FOEDUS Core Library
|
Snapshot Manager, which manages snapshot files of the database. More...
Snapshot Manager, which manages snapshot files of the database.
This package contains classes to handle snapshot files.
One snapshot consists of a set of snapshot files and a snapshot metadata file (snapshot.xml). Every snapshot is tagged with base epoch and valid-until epoch. The snapshot logically contains all information of the database upto valid-until epoch, meaning the previous snapshot (whose valid-until should be equal to base epoch of the snapshot) as well as all log entries in the transactional log from a base epoch until valid-until epoch.
However, the snapshot does not necessarily physically contains all information from the previous snapshot because it makes each snapshot-ing too expensive. One common approach is LSM-Tree (Log Structured Merge Tree), but it is not a good fit for serializable transactional processing. Even a trivial primary key constraint would be too expensive on top of LSM-Tree.
Instead, snapshot files in FOEDUS are overlays of the database image. Each snapshot file contains new version of data pages that overwrite a required portion of storages. They are not incremental new-tuples/tombstones data as in LSM. They are a complete representation of the storages, but it might contain pointers to old snapshot files if pages under it had no change.
In the worst case, transactions in one epoch updates just one tuple in every page, resulting in a snapshot that physically contains all data. However, it is rare and such a workload is fundamentally expensive if data size does not fit DRAM (if it does, this approach is also fine).
Snapshot Manager creates a new set of snapshot files as well as its metadata file occasionally. The frequency is a tuning knob. The mechanism to create snapshot files is called Log-Gleaner (foedus::snapshot::LogGleaner). See its documentation below.
LogGleaner is the main class that manages most mechanisms to construct a new set of snapshot files. Snapshot procedure constructs and calls this object during snapshot. It receives partitioning policy (which snapshot partitions to send ranges of keys) per storage and beginning/ending epoch of logs to glean while log-gleaning.
Log-gleaning consists of two components; mapper (foedus::snapshot::LogMapper) and reducer (foedus::snapshot::LogReducer), obviously named after the well-known map-reduce concepts.
LogGleaner launches a set of mapper threads (foedus::snapshot::LogMapper) to read log files. Each LogMapper corresponds to foedus::log::Logger, the NUMA-local log writer which simply writes out log entries produced by local worker threads. Thus, the log files contain log entries that might be sent to any partitions. LogMapper maps each log entry to some partition and send it to a reducer corresponding to the partition. For more details, see foedus::snapshot::LogMapper.
LogGleaner also launches a set of reducer threads (foedus::snapshot::LogReducer), one for each NUMA node. LogReducer sorts log entries sent from LogMapper. The log entries are sorted by key and ordinal (*), then processed just like usual APPLY at the end of transaction, but on top of snapshot files.
(*) otherwise correct result is not guaranteed. For example, imagine the following case:
Ordinal-1 must be processed before ordinal 2.
LogGleaner coordinates the synchronization between mappers and reducers during snapshotting. At the beginning of snapshotting, gleaner wakes up reducers and mappers. Mappers go in to sleep when they process all logs. When all mappers went to sleep, reducers start to also go into sleep when they process all logs they receive. When all of them are done, gleaner initiates the last wrap-up phase. Additionally, LogGleaner is in charge of receiving termination request from the engine if the user invokes Engine::uninitialize() and requesting reducers/mappers to stop.
Reducers/mappers occasionally check if they are requested to stop when they get idle or complete all work. They do the check at least once for a while so that the latency to stop can not be catastrophic.
After all mappers and reducers complete, the last phase of log gleaning is to construct root pages for the storages modified in this snapshotting. Gleaner collects root-page-info from each reducer and combines them to create the root page(s). When all set, gleaner produces maps from storage ID to a new root page ID. This will be written out in a snapshot metadata file by snapshot manager.
Classes | |
struct | BlockHeaderBase |
All log blocks in mapper/reducers start with this header. More... | |
class | DumpFileSortedBuffer |
Implementation of SortedBuffer that is backed by a dumped file. More... | |
struct | FillerBlockHeader |
A header for a dummy storage block that fills the gap between the end of previous storage block and the beginning of next storage block. More... | |
struct | FullBlockHeader |
All blocks that have content start with this header. More... | |
class | InMemorySortedBuffer |
Implementation of SortedBuffer that is backed by fully in-memory buffer. More... | |
struct | LogBuffer |
Packages handling of 4-bytes representation of position in log buffers. More... | |
class | LogGleaner |
A log-gleaner, which constructs a new set of snapshot files during snapshotting. More... | |
struct | LogGleanerControlBlock |
Shared data for LogGleaner. More... | |
class | LogGleanerRef |
A remote view of LogGleaner from all engines. More... | |
struct | LogGleanerResource |
Local resource for the log gleaner, which runs only in the master node. More... | |
class | LogMapper |
A log mapper, which reads log files from one logger and sends them to corresponding log reducers. More... | |
class | LogReducer |
A log reducer, which receives log entries sent from mappers and applies them to construct new snapshot files. More... | |
struct | LogReducerControlBlock |
Shared data for LogReducer. More... | |
class | LogReducerRef |
A remote view of LogReducer from all engines. More... | |
class | MapReduceBase |
Base class for LogMapper and LogReducer to share common code. More... | |
class | MergeSort |
Receives an arbitrary number of sorted buffers and emits one fully sorted stream of logs. More... | |
union | ReducerBufferStatus |
Compactly represents important status informations of a reducer buffer. More... | |
struct | Snapshot |
Represents one snapshot that converts all logs from base epoch to valid_until epoch into snapshot file(s). More... | |
class | SnapshotManager |
Snapshot manager that atomically and durably writes out a snapshot file. More... | |
struct | SnapshotManagerControlBlock |
Shared data in SnapshotManagerPimpl. More... | |
class | SnapshotManagerPimpl |
Pimpl object of SnapshotManager. More... | |
struct | SnapshotMetadata |
Represents the data in one snapshot metadata file. More... | |
struct | SnapshotOptions |
Set of options for snapshot manager. More... | |
class | SnapshotWriter |
Writes out one snapshot file for all data pages in one reducer. More... | |
class | SortedBuffer |
Represents one input stream of sorted log entries. More... | |
Typedefs | |
typedef uint16_t | SnapshotId |
Unique ID of Snapshot. More... | |
typedef uint32_t | BufferPosition |
Represents a position in some buffer. More... | |
Enumerations | |
enum | ReducerConstants { kFlagNoMoreWriters = 0x0001 } |
Functions | |
bool | is_array_log_type (uint16_t log_type) |
bool | is_hash_log_type (uint16_t log_type) |
bool | is_masstree_log_type (uint16_t log_type) |
SnapshotId | increment (SnapshotId id) |
Increment SnapshotId. More... | |
BufferPosition | to_buffer_position (uint64_t byte_position) |
uint64_t | from_buffer_position (BufferPosition buffer_position) |
std::ostream & | operator<< (std::ostream &o, const SortedBuffer &v) |
std::ostream & | operator<< (std::ostream &o, const LogGleaner &v) |
uint16_t | calculate_logger_id (Engine *engine, uint16_t local_ordinal) |
Unique ID of this log mapper. More... | |
uint64_t | align_io_floor (uint64_t offset) |
uint64_t | align_io_ceil (uint64_t offset) |
void | update_key_lengthes (const log::LogHeader *header, storage::StorageType storage_type, uint32_t *shortest_key_length, uint32_t *longest_key_length) |
std::ostream & | operator<< (std::ostream &o, const LogMapper &v) |
std::ostream & | operator<< (std::ostream &o, const LogReducer &v) |
std::ostream & | operator<< (std::ostream &o, const BlockHeaderBase &v) |
uint16_t | extract_shortest_key_length (SortedBuffer *const *inputs, uint16_t inputs_count) |
uint16_t | extract_longest_key_length (SortedBuffer *const *inputs, uint16_t inputs_count) |
template<typename T > | |
int | compare_logs_as (const log::RecordLogType *lhs, const log::RecordLogType *rhs) |
std::ostream & | operator<< (std::ostream &o, const Snapshot &v) |
std::ostream & | operator<< (std::ostream &o, const SnapshotWriter &v) |
Variables | |
const SnapshotId | kNullSnapshotId = 0 |
const uint64_t | kIoAlignment = 0x1000 |
const MergeSort::InputIndex | kInvalidInput = static_cast<MergeSort::InputIndex>(-1U) |
Represents null. More... | |
const float | kWindowMoveThreshold = 0.95 |
Also, when the input consumed more than this fraction of current window, we move the window. More... | |
const char * | kStoragesTagName = "storages" |
Enumerator | |
---|---|
kFlagNoMoreWriters |
A bit-wise flag in ReducerBufferStatus's flags_. If this bit is on, no more mappers can enter the buffer as a new writer. |
Definition at line 55 of file log_reducer_impl.hpp.
uint64_t foedus::snapshot::align_io_ceil | ( | uint64_t | offset | ) |
Definition at line 138 of file log_mapper_impl.cpp.
References align_io_floor().
Referenced by foedus::snapshot::LogMapper::handle_process().
uint64_t foedus::snapshot::align_io_floor | ( | uint64_t | offset | ) |
Definition at line 137 of file log_mapper_impl.cpp.
References kIoAlignment.
Referenced by align_io_ceil(), and foedus::snapshot::LogMapper::handle_process().
uint16_t foedus::snapshot::calculate_logger_id | ( | Engine * | engine, |
uint16_t | local_ordinal | ||
) |
Unique ID of this log mapper.
One log mapper corresponds to one logger, so this ID is also the corresponding logger's ID (log::LoggerId).
Definition at line 54 of file log_mapper_impl.cpp.
References foedus::Engine::get_options(), foedus::Engine::get_soc_id(), foedus::EngineOptions::log_, and foedus::log::LogOptions::loggers_per_node_.
int foedus::snapshot::compare_logs_as | ( | const log::RecordLogType * | lhs, |
const log::RecordLogType * | rhs | ||
) |
Definition at line 548 of file merge_sort.cpp.
uint16_t foedus::snapshot::extract_longest_key_length | ( | SortedBuffer *const * | inputs, |
uint16_t | inputs_count | ||
) |
Definition at line 55 of file merge_sort.cpp.
References foedus::snapshot::SortedBuffer::get_cur_block_longest_key_length().
uint16_t foedus::snapshot::extract_shortest_key_length | ( | SortedBuffer *const * | inputs, |
uint16_t | inputs_count | ||
) |
Definition at line 48 of file merge_sort.cpp.
References foedus::snapshot::SortedBuffer::get_cur_block_shortest_key_length().
|
inline |
Definition at line 78 of file snapshot_id.hpp.
Referenced by foedus::snapshot::LogReducerRef::append_log_chunk(), foedus::snapshot::ReducerBufferStatus::get_tail_bytes(), and foedus::snapshot::LogBuffer::resolve().
|
inline |
Definition at line 516 of file merge_sort.hpp.
References foedus::log::kLogCodeArrayIncrement, and foedus::log::kLogCodeArrayOverwrite.
|
inline |
Definition at line 519 of file merge_sort.hpp.
References foedus::log::kLogCodeHashDelete, foedus::log::kLogCodeHashInsert, foedus::log::kLogCodeHashOverwrite, and foedus::log::kLogCodeHashUpdate.
|
inline |
Definition at line 526 of file merge_sort.hpp.
References foedus::log::kLogCodeMasstreeDelete, foedus::log::kLogCodeMasstreeInsert, foedus::log::kLogCodeMasstreeOverwrite, and foedus::log::kLogCodeMasstreeUpdate.
std::ostream& foedus::snapshot::operator<< | ( | std::ostream & | o, |
const Snapshot & | v | ||
) |
Definition at line 24 of file snapshot.cpp.
References foedus::snapshot::Snapshot::base_epoch_, foedus::snapshot::Snapshot::id_, foedus::snapshot::Snapshot::max_storage_id_, and foedus::snapshot::Snapshot::valid_until_epoch_.
std::ostream& foedus::snapshot::operator<< | ( | std::ostream & | o, |
const SortedBuffer & | v | ||
) |
Definition at line 32 of file log_buffer.cpp.
References foedus::snapshot::SortedBuffer::describe().
std::ostream& foedus::snapshot::operator<< | ( | std::ostream & | o, |
const SnapshotWriter & | v | ||
) |
Definition at line 174 of file snapshot_writer_impl.cpp.
std::ostream& foedus::snapshot::operator<< | ( | std::ostream & | o, |
const LogGleaner & | v | ||
) |
Definition at line 312 of file log_gleaner_impl.cpp.
References foedus::snapshot::LogGleanerControlBlock::completed_count_, foedus::snapshot::LogGleanerControlBlock::completed_mapper_count_, foedus::Attachable< CONTROL_BLOCK >::control_block_, foedus::snapshot::LogGleanerControlBlock::error_count_, and foedus::snapshot::LogGleanerControlBlock::exit_count_.
std::ostream& foedus::snapshot::operator<< | ( | std::ostream & | o, |
const LogMapper & | v | ||
) |
Definition at line 700 of file log_mapper_impl.cpp.
References foedus::snapshot::MapReduceBase::id_, and foedus::snapshot::MapReduceBase::numa_node_.
std::ostream& foedus::snapshot::operator<< | ( | std::ostream & | o, |
const LogReducer & | v | ||
) |
Definition at line 912 of file log_reducer_impl.cpp.
References foedus::snapshot::LogReducerControlBlock::current_buffer_, foedus::snapshot::MapReduceBase::get_id(), foedus::snapshot::MapReduceBase::get_numa_node(), and foedus::snapshot::LogReducerControlBlock::total_storage_count_.
std::ostream& foedus::snapshot::operator<< | ( | std::ostream & | o, |
const BlockHeaderBase & | v | ||
) |
Definition at line 925 of file log_reducer_impl.cpp.
References foedus::snapshot::BlockHeaderBase::block_length_, and foedus::snapshot::BlockHeaderBase::magic_word_.
|
inline |
Definition at line 74 of file snapshot_id.hpp.
References ASSERT_ND.
Referenced by foedus::snapshot::LogReducerRef::append_log_chunk(), and foedus::snapshot::LogBuffer::compact().
|
inline |
Definition at line 540 of file log_mapper_impl.cpp.
References ASSERT_ND, foedus::storage::kArrayStorage, foedus::storage::hash::HashCommonLogType::key_length_, foedus::storage::masstree::MasstreeCommonLogType::key_length_, foedus::storage::kHashStorage, foedus::storage::kMasstreeStorage, and foedus::storage::kSequentialStorage.
const MergeSort::InputIndex foedus::snapshot::kInvalidInput = static_cast<MergeSort::InputIndex>(-1U) |
Represents null.
Definition at line 40 of file merge_sort.cpp.
const uint64_t foedus::snapshot::kIoAlignment = 0x1000 |
Definition at line 136 of file log_mapper_impl.cpp.
Referenced by align_io_floor().
const SnapshotId foedus::snapshot::kNullSnapshotId = 0 |
Definition at line 45 of file snapshot_id.hpp.
Referenced by foedus::snapshot::SnapshotMetadata::clear(), foedus::storage::masstree::from_this_snapshot(), foedus::snapshot::SnapshotManagerPimpl::handle_snapshot_triggered(), foedus::savepoint::SavepointManagerControlBlock::initialize(), foedus::storage::StorageManagerPimpl::initialize_read_latest_snapshot(), foedus::snapshot::SnapshotManagerPimpl::issue_next_snapshot_id(), foedus::savepoint::Savepoint::populate_empty(), foedus::savepoint::SavepointManagerPimpl::savepoint_main(), and foedus::snapshot::LogReducerRef::verify_log_chunk().
const char* foedus::snapshot::kStoragesTagName = "storages" |
Definition at line 32 of file snapshot_metadata.cpp.
const float foedus::snapshot::kWindowMoveThreshold = 0.95 |
Also, when the input consumed more than this fraction of current window, we move the window.
This means we have to memmove 5% everytime, but instead we can avoid many-small batches.
Definition at line 45 of file merge_sort.cpp.