libfoedus-core
FOEDUS Core Library
foedus::storage::masstree::MasstreeComposer Class Referencefinal

Composer for an masstree storage. More...

Detailed Description

Composer for an masstree storage.

This is the most complex composer out of the four storage types.

Differences from volatile pages
There are a few fundamental differences from volatile world.
  • Very high fill factor. Almost 100%.
  • No foster children.
  • All keys are completely sorted by keys in boundary pages, which speeds up cursor.
Constructing pages that match volatile pages well
One of the differences from the simpler array storage is that page boundaries depend on the data. If the snapshot pages have all different page boundaries from volatile pages, the only snapshot pointer we can install after snapshotting is the root pointer. We can only drop the entire volatile pages in that case. In the current version, this is the case. We drop everything and we don't consider where's the page bounary in the volatile world. This is one of major todos. However, it should be just an implementation issue. "Peeking" the volatile world's page boundary is not a difficult thing especially because we don't care 100% accuracy; if the page boundary doesn't match occasionally, that's fine. We just use it to loosely guide the choice of page boundaries in snapshot pages.
Note
This is a private implementation-details of Masstree Storage, thus file name ends with _impl. Do not include this header from a client program. There is no case client program needs to access this internal class.

Definition at line 69 of file masstree_composer_impl.hpp.

#include <masstree_composer_impl.hpp>

Public Member Functions

 MasstreeComposer (Composer *parent)
 MasstreeComposer methods. More...
 
std::string to_string () const
 
ErrorStack compose (const Composer::ComposeArguments &args)
 
ErrorStack construct_root (const Composer::ConstructRootArguments &args)
 
Composer::DropResult drop_volatiles (const Composer::DropVolatilesArguments &args)
 drop_volatiles and related methods More...
 
void drop_root_volatile (const Composer::DropVolatilesArguments &args)
 

Constructor & Destructor Documentation

foedus::storage::masstree::MasstreeComposer::MasstreeComposer ( Composer parent)
explicit

MasstreeComposer methods.

Definition at line 71 of file masstree_composer_impl.cpp.

References ASSERT_ND, and foedus::storage::Storage< CONTROL_BLOCK >::exists().

72  : engine_(parent->get_engine()),
73  storage_id_(parent->get_storage_id()),
74  storage_(engine_, storage_id_) {
75  ASSERT_ND(storage_.exists());
76 }
bool exists() const
Returns whether this storage is already created.
Definition: storage.hpp:169
#define ASSERT_ND(x)
A warning-free wrapper macro of assert() that has no performance effect in release mode even when 'x'...
Definition: assert_nd.hpp:72

Here is the call graph for this function:

Member Function Documentation

ErrorStack foedus::storage::masstree::MasstreeComposer::compose ( const Composer::ComposeArguments args)

Definition at line 78 of file masstree_composer_impl.cpp.

References foedus::storage::Composer::ComposeArguments::base_epoch_, CHECK_ERROR, foedus::debugging::StopWatch::elapsed_ms(), foedus::DefaultInitializable::initialize(), foedus::storage::kMasstreeStorage, foedus::storage::masstree::MasstreeComposeContext::kMaxLevels, foedus::kRetOk, foedus::storage::Composer::ComposeArguments::log_streams_, foedus::storage::Composer::ComposeArguments::log_streams_count_, foedus::debugging::StopWatch::stop(), to_string(), foedus::DefaultInitializable::uninitialize(), and foedus::storage::Composer::ComposeArguments::work_memory_.

Referenced by foedus::storage::Composer::compose().

78  {
79  VLOG(0) << to_string() << " composing with " << args.log_streams_count_ << " streams.";
80  debugging::StopWatch stop_watch;
81 
82  snapshot::MergeSort merge_sort(
83  storage_id_,
85  args.base_epoch_,
86  args.log_streams_,
87  args.log_streams_count_,
89  args.work_memory_);
90  CHECK_ERROR(merge_sort.initialize());
91 
92  MasstreeComposeContext context(engine_, &merge_sort, args);
93  CHECK_ERROR(context.execute());
94 
95  CHECK_ERROR(merge_sort.uninitialize()); // no need for scoped release. its destructor is safe.
96 
97  stop_watch.stop();
98  LOG(INFO) << to_string() << " done in " << stop_watch.elapsed_ms() << "ms.";
99  return kRetOk;
100 }
#define CHECK_ERROR(x)
This macro calls x and checks its returned value.
const ErrorStack kRetOk
Normal return value for no-error case.

Here is the call graph for this function:

Here is the caller graph for this function:

ErrorStack foedus::storage::masstree::MasstreeComposer::construct_root ( const Composer::ConstructRootArguments args)

Definition at line 111 of file masstree_composer_impl.cpp.

References ASSERT_ND, CHECK_ERROR, foedus::snapshot::SnapshotWriter::dump_pages(), foedus::debugging::StopWatch::elapsed_ms(), foedus::storage::masstree::MasstreeIntermediatePage::extract_separators_snapshot(), foedus::storage::masstree::from_this_snapshot(), foedus::storage::masstree::MasstreePage::get_btree_level(), foedus::Attachable< CONTROL_BLOCK >::get_control_block(), foedus::storage::Page::get_header(), foedus::storage::masstree::MasstreePage::get_key_count(), foedus::storage::masstree::MasstreeIntermediatePage::get_minipage(), foedus::snapshot::SnapshotWriter::get_next_page_id(), foedus::snapshot::SnapshotWriter::get_page_base(), foedus::storage::PageHeader::get_page_type(), foedus::storage::masstree::MasstreePage::header(), foedus::storage::masstree::MasstreeBorderPage::initialize_snapshot_page(), foedus::storage::masstree::MasstreePage::is_border(), foedus::storage::masstree::MasstreeIntermediatePointerIterator::is_valid(), foedus::storage::masstree::MasstreeIntermediatePage::MiniPage::key_count_, foedus::storage::kMasstreeIntermediatePageType, foedus::storage::kPageSize, foedus::kRetOk, foedus::storage::PageHeader::masstree_in_layer_level_, foedus::storage::Composer::ConstructRootArguments::new_root_page_pointer_, foedus::storage::PageHeader::page_id_, foedus::storage::masstree::MasstreeIntermediatePage::MiniPage::pointers_, foedus::storage::Composer::ConstructRootArguments::root_info_pages_, foedus::storage::Composer::ConstructRootArguments::root_info_pages_count_, foedus::storage::PageHeader::snapshot_, foedus::storage::DualPagePointer::snapshot_pointer_, foedus::storage::Composer::ConstructRootArguments::snapshot_writer_, foedus::debugging::StopWatch::stop(), foedus::storage::PageHeader::storage_id_, to_string(), and WRAP_ERROR_CODE.

Referenced by foedus::storage::Composer::construct_root().

111  {
112  ASSERT_ND(args.root_info_pages_count_ > 0);
113  VLOG(0) << to_string() << " combining " << args.root_info_pages_count_ << " root page info.";
114  debugging::StopWatch stop_watch;
115  SnapshotPagePointer new_root_id = args.snapshot_writer_->get_next_page_id();
116  Page* new_root = args.snapshot_writer_->get_page_base();
117  std::memcpy(new_root, args.root_info_pages_[0], kPageSize); // use the first one as base
118  if (args.root_info_pages_count_ == 1U) {
119  VLOG(0) << to_string() << " only 1 root info page, so we just use it as the new root.";
120  } else {
121  // otherwise, we merge inputs from each reducer. because of how we design partitions,
122  // the inputs are always intermediate pages with same partitioning. we just place new
123  // pointers to children
124  CHECK_ERROR(check_buddies(args));
125 
126  // because of how we partition, B-tree levels might be unbalanced at the root of first layer.
127  // thus, we take the max of btree-level here.
128  MasstreeIntermediatePage* base = reinterpret_cast<MasstreeIntermediatePage*>(new_root);
129  ASSERT_ND(!base->is_border());
130  for (uint16_t buddy = 1; buddy < args.root_info_pages_count_; ++buddy) {
131  const MasstreeIntermediatePage* page
132  = reinterpret_cast<const MasstreeIntermediatePage*>(args.root_info_pages_[buddy]);
133  if (page->get_btree_level() > base->get_btree_level()) {
134  LOG(INFO) << "MasstreeStorage-" << storage_id_ << " has an unbalanced depth in sub-tree";
135  base->header().masstree_in_layer_level_ = page->get_btree_level();
136  }
137  }
138 
139  uint16_t updated_count = 0;
140  for (SlotIndex i = 0; i <= base->get_key_count(); ++i) {
141  MasstreeIntermediatePage::MiniPage& base_mini = base->get_minipage(i);
142  for (SlotIndex j = 0; j <= base_mini.key_count_; ++j) {
143  SnapshotPagePointer base_pointer = base_mini.pointers_[j].snapshot_pointer_;
144  if (from_this_snapshot(base_pointer, args)) {
145  ++updated_count;
146  continue; // base has updated it
147  }
148 
149  for (uint16_t buddy = 1; buddy < args.root_info_pages_count_; ++buddy) {
150  const MasstreeIntermediatePage* page
151  = reinterpret_cast<const MasstreeIntermediatePage*>(args.root_info_pages_[buddy]);
152  const MasstreeIntermediatePage::MiniPage& mini = page->get_minipage(i);
153  SnapshotPagePointer pointer = mini.pointers_[j].snapshot_pointer_;
154  if (from_this_snapshot(pointer, args)) {
155  base_mini.pointers_[j].snapshot_pointer_ = pointer; // this buddy has updated it
156  ++updated_count;
157  break;
158  }
159  }
160  }
161  }
162  VLOG(0) << to_string() << " has " << updated_count << " new pointers from root.";
163  }
164 
165  // In either case, some pointer might be null if that partition had no log record
166  // and if this is an initial snapshot. Create a dummy page for them.
167  MasstreeIntermediatePage* merged = reinterpret_cast<MasstreeIntermediatePage*>(new_root);
168  ASSERT_ND(!merged->is_border());
169  uint16_t dummy_count = 0;
170  for (MasstreeIntermediatePointerIterator it(merged); it.is_valid(); it.next()) {
171  DualPagePointer& pointer = merged->get_minipage(it.index_).pointers_[it.index_mini_];
172  if (pointer.snapshot_pointer_ == 0) {
173  // this should happen only while an initial snapshot for this storage.
174  ASSERT_ND(storage_.get_control_block()->root_page_pointer_.snapshot_pointer_ == 0);
175  KeySlice low, high;
176  merged->extract_separators_snapshot(it.index_, it.index_mini_, &low, &high);
177  uint32_t offset = 1U + dummy_count; // +1 for new_root
178  SnapshotPagePointer page_id = args.snapshot_writer_->get_next_page_id() + offset;
179  MasstreeBorderPage* dummy = reinterpret_cast<MasstreeBorderPage*>(
180  args.snapshot_writer_->get_page_base() + offset);
181  dummy->initialize_snapshot_page(storage_id_, page_id, 0, low, high);
182  pointer.snapshot_pointer_ = page_id;
183  ++dummy_count;
184  }
185  }
186  VLOG(0) << to_string() << " has added " << dummy_count << " dummy pointers in root.";
187 
188  ASSERT_ND(new_root->get_header().snapshot_);
189  ASSERT_ND(new_root->get_header().storage_id_ == storage_id_);
190  ASSERT_ND(new_root->get_header().get_page_type() == kMasstreeIntermediatePageType);
191 
192  new_root->get_header().page_id_ = new_root_id;
193  *args.new_root_page_pointer_ = new_root_id;
194  WRAP_ERROR_CODE(args.snapshot_writer_->dump_pages(0, 1 + dummy_count));
195 
196  // AFTER writing out the root page, install the pointer to new root page
197  storage_.get_control_block()->root_page_pointer_.snapshot_pointer_ = new_root_id;
198  storage_.get_control_block()->meta_.root_snapshot_page_id_ = new_root_id;
199 
200  stop_watch.stop();
201  VLOG(0) << to_string() << " done in " << stop_watch.elapsed_ms() << "ms.";
202  return kRetOk;
203 }
uint16_t SlotIndex
Index of a record in a (border) page.
uint64_t KeySlice
Each key slice is an 8-byte integer.
uint64_t SnapshotPagePointer
Page ID of a snapshot page.
Definition: storage_id.hpp:79
bool from_this_snapshot(SnapshotPagePointer pointer, const Composer::ConstructRootArguments &args)
#define CHECK_ERROR(x)
This macro calls x and checks its returned value.
const ErrorStack kRetOk
Normal return value for no-error case.
#define ASSERT_ND(x)
A warning-free wrapper macro of assert() that has no performance effect in release mode even when 'x'...
Definition: assert_nd.hpp:72
#define WRAP_ERROR_CODE(x)
Same as CHECK_ERROR(x) except it receives only an error code, thus more efficient.
CONTROL_BLOCK * get_control_block() const
Definition: attachable.hpp:97
const uint16_t kPageSize
A constant defining the page size (in bytes) of both snapshot pages and volatile pages.
Definition: storage_id.hpp:45

Here is the call graph for this function:

Here is the caller graph for this function:

void foedus::storage::masstree::MasstreeComposer::drop_root_volatile ( const Composer::DropVolatilesArguments args)

Definition at line 2622 of file masstree_composer_impl.cpp.

References foedus::storage::VolatilePagePointer::clear(), foedus::storage::masstree::MasstreePage::get_btree_level(), foedus::Attachable< CONTROL_BLOCK >::get_control_block(), foedus::storage::masstree::MasstreeStorage::get_masstree_metadata(), foedus::storage::Storage< CONTROL_BLOCK >::get_name(), foedus::storage::Metadata::keeps_all_volatile_pages(), and foedus::storage::DualPagePointer::volatile_pointer_.

Referenced by foedus::storage::Composer::drop_root_volatile().

2622  {
2623  if (storage_.get_masstree_metadata()->keeps_all_volatile_pages()) {
2624  LOG(INFO) << "Oh, but keep-all-volatile is on. Storage-" << storage_.get_name()
2625  << " is configured to keep all volatile pages.";
2626  return;
2627  }
2628  DualPagePointer* root_pointer = &storage_.get_control_block()->root_page_pointer_;
2629  MasstreeIntermediatePage* volatile_page = reinterpret_cast<MasstreeIntermediatePage*>(
2630  resolve_volatile(root_pointer->volatile_pointer_));
2631  if (volatile_page == nullptr) {
2632  LOG(INFO) << "Oh, but root volatile page already null";
2633  return;
2634  }
2635 
2636  if (is_to_keep_volatile(0, volatile_page->get_btree_level())) {
2637  LOG(INFO) << "Oh, but " << storage_ << " is configured to keep the root page.";
2638  return;
2639  }
2640 
2641  // yes, we can drop ALL volatile pages!
2642  LOG(INFO) << "Okay, drop em all!!";
2643  drop_all_recurse(args, root_pointer);
2644  root_pointer->volatile_pointer_.clear();
2645 }
bool keeps_all_volatile_pages() const
Definition: metadata.hpp:98
const StorageName & get_name() const
Returns the unique name of this storage.
Definition: storage.hpp:155
CONTROL_BLOCK * get_control_block() const
Definition: attachable.hpp:97
const MasstreeMetadata * get_masstree_metadata() const

Here is the call graph for this function:

Here is the caller graph for this function:

Composer::DropResult foedus::storage::masstree::MasstreeComposer::drop_volatiles ( const Composer::DropVolatilesArguments args)

drop_volatiles and related methods

Definition at line 2558 of file masstree_composer_impl.cpp.

References ASSERT_ND, foedus::storage::Composer::DropResult::combine(), foedus::storage::Composer::DropResult::dropped_all_, foedus::Attachable< CONTROL_BLOCK >::get_control_block(), foedus::storage::masstree::MasstreeStorage::get_masstree_metadata(), foedus::storage::PartitionerMetadata::get_metadata(), foedus::storage::masstree::MasstreeIntermediatePage::get_minipage(), foedus::storage::Storage< CONTROL_BLOCK >::get_name(), foedus::storage::masstree::MasstreeIntermediatePointerIterator::is_valid(), foedus::storage::Metadata::keeps_all_volatile_pages(), foedus::storage::masstree::kMaxIntermediatePointers, foedus::storage::masstree::kSupremumSlice, foedus::storage::PartitionerMetadata::locate_data(), foedus::storage::masstree::MasstreePartitionerData::low_keys_, foedus::storage::Composer::DropVolatilesArguments::my_partition_, foedus::storage::masstree::MasstreePartitionerData::partition_count_, foedus::storage::Composer::DropVolatilesArguments::partitioned_drop_, foedus::storage::masstree::MasstreePartitionerData::partitions_, foedus::storage::masstree::MasstreeIntermediatePage::MiniPage::pointers_, foedus::storage::PartitionerMetadata::valid_, and foedus::storage::DualPagePointer::volatile_pointer_.

Referenced by foedus::storage::Composer::drop_volatiles().

2559  {
2560  Composer::DropResult result(args);
2561  if (storage_.get_masstree_metadata()->keeps_all_volatile_pages()) {
2562  LOG(INFO) << "Keep-all-volatile: Storage-" << storage_.get_name()
2563  << " is configured to keep all volatile pages.";
2564  result.dropped_all_ = false;
2565  return result;
2566  }
2567 
2568  DualPagePointer* root_pointer = &storage_.get_control_block()->root_page_pointer_;
2569  MasstreeIntermediatePage* volatile_page = reinterpret_cast<MasstreeIntermediatePage*>(
2570  resolve_volatile(root_pointer->volatile_pointer_));
2571  if (volatile_page == nullptr) {
2572  LOG(INFO) << "No volatile root page. Probably while restart";
2573  return result;
2574  }
2575 
2576  // Unlike array-storage, each thread decides partititions to follow a bit differently
2577  // because the snapshot pointer might be not installed.
2578  // We check the partitioning assignment.
2579  PartitionerMetadata* metadata = PartitionerMetadata::get_metadata(engine_, storage_id_);
2580  ASSERT_ND(metadata->valid_);
2581  MasstreePartitionerData* data
2582  = reinterpret_cast<MasstreePartitionerData*>(metadata->locate_data(engine_));
2583  ASSERT_ND(data->partition_count_ > 0);
2584  ASSERT_ND(data->partition_count_ < kMaxIntermediatePointers);
2585  uint16_t cur = 0;
2586  for (MasstreeIntermediatePointerIterator it(volatile_page); it.is_valid(); it.next()) {
2587  DualPagePointer* pointer = &volatile_page->get_minipage(it.index_).pointers_[it.index_mini_];
2588  if (!args.partitioned_drop_) {
2589  result.combine(drop_volatiles_recurse(args, pointer));
2590  continue;
2591  }
2592  // the page boundaries as of partitioning might be different from the volatile page's,
2593  // but at least it's more coarse. Each pointer belongs to just one partition.
2594  // same trick as install_pointers.
2595  KeySlice low = it.get_low_key();
2596  KeySlice high = it.get_high_key();
2597  while (true) {
2598  KeySlice partition_high;
2599  if (cur + 1U == data->partition_count_) {
2600  partition_high = kSupremumSlice;
2601  } else {
2602  partition_high = data->low_keys_[cur + 1U];
2603  }
2604  if (high <= partition_high) {
2605  break;
2606  }
2607  ++cur;
2608  }
2609  ASSERT_ND(cur < data->partition_count_);
2610  if (data->partitions_[cur] == args.my_partition_) {
2611  if (data->low_keys_[cur] != low) {
2612  VLOG(0) << "Not exactly matching page boundary."; // but not a big issue.
2613  }
2614  result.combine(drop_volatiles_recurse(args, pointer));
2615  }
2616  }
2617 
2618  // we so far always keep the volatile root of a masstree storage.
2619  return result;
2620 }
bool keeps_all_volatile_pages() const
Definition: metadata.hpp:98
uint64_t KeySlice
Each key slice is an 8-byte integer.
const StorageName & get_name() const
Returns the unique name of this storage.
Definition: storage.hpp:155
const uint32_t kMaxIntermediatePointers
Max number of pointers (if completely filled) stored in an intermediate pages.
const KeySlice kSupremumSlice
#define ASSERT_ND(x)
A warning-free wrapper macro of assert() that has no performance effect in release mode even when 'x'...
Definition: assert_nd.hpp:72
CONTROL_BLOCK * get_control_block() const
Definition: attachable.hpp:97
const MasstreeMetadata * get_masstree_metadata() const
static PartitionerMetadata * get_metadata(Engine *engine, StorageId id)
Returns the shared memory for the given storage ID.
Definition: partitioner.cpp:38

Here is the call graph for this function:

Here is the caller graph for this function:

std::string foedus::storage::masstree::MasstreeComposer::to_string ( ) const

Definition at line 282 of file masstree_composer_impl.cpp.

Referenced by compose(), and construct_root().

282  {
283  return std::string("MasstreeComposer-") + std::to_string(storage_id_);
284 }

Here is the caller graph for this function:


The documentation for this class was generated from the following files: