libfoedus-core
FOEDUS Core Library
foedus::storage::hash::HashComposer Class Referencefinal

Composer for a hash storage. More...

Detailed Description

Composer for a hash storage.

Composer for hash is much easier than for B-trees, but somewhat trickier than array. Unlike B-trees, we don't have to worry about page-splits, finding where to insert, etc etc. We re-construct each hash bin always as a whole. Huuuuge simplification. Unlike array, we have a quite granular partitioning, so constructing intermediate pages is trickier. We do 2-path construction to parallelize this part without losing partitioning benefit.

HashRootInfoPage as the results

The format is exactly same as HashIntermediatePage, but typedef-ed for clarity. The only differences are:

  • Snapshot pointer points to HashComposedBinsPage instead of HashDataPage/HashIntermediatePage.
  • Volatile pointer (which of course is never used in snapshot page) instead represents the number of contiguously-written HashComposedBinsPage in the pointed sub-tree. This allows the combiner to read the entire linked-list in one-shot.
Linked list of HashComposedBinsPage
In hash storage, what compose() returns are just a list of bins. There essentially is no structure of them. The only reason we have this "root" page is to allow parallel combination by the gleaner. Each child-pointer from the root page is assigned to one thread on its own node, then written out in that node. So, after following the first child pointer, we just need a linked list, which is HashComposedBinsPage.
2-path composition
Each reducer returns a linked-list of HashComposedBinsPage for every direct-child of the root page. Then, the gleaner combines them in a parallel fashion, resulting in snapshot interemediate pages that will be installed to the root page. The second phase consists of reading the HashComposedBinsPage from each composer (reducer), collecting updated bins on each page, and creating a new interemediate page hierarchically. We have several classes decoupled from the composer class itself as listed below:
1-level hashtable
It works the same way even when the hash table has only one level (root directly points to data pages). Even in this case, HashRootInfoPage still points to HashComposedBinsPage, which contains only one entry per page. Kind of wasteful, but it has negligible overheads compared to the entire mapper/reducer. Unlike array package, we don't do special optimization for this case in order to keep the code simpler, and because the storage could benefit from parallelization even in this case (the root page still contains hundreds of hash bins).
Note
This is a private implementation-details of Hashtable Storage, thus file name ends with _impl. Do not include this header from a client program. There is no case client program needs to access this internal class.

Definition at line 62 of file hash_composer_impl.hpp.

#include <hash_composer_impl.hpp>

Public Member Functions

 HashComposer (Composer *parent)
 HashComposer methods. More...
 
std::string to_string () const
 
ErrorStack compose (const Composer::ComposeArguments &args)
 
ErrorStack construct_root (const Composer::ConstructRootArguments &args)
 
Composer::DropResult drop_volatiles (const Composer::DropVolatilesArguments &args)
 drop_volatiles and related methods More...
 
void drop_root_volatile (const Composer::DropVolatilesArguments &args)
 

Static Public Member Functions

static void launch_construct_root_multi_level (HashComposer *pointer, const Composer::ConstructRootArguments *args, uint16_t numa_node, HashIntermediatePage *root_page, ErrorStack *out_error)
 launched on its own thread. More...
 

Constructor & Destructor Documentation

foedus::storage::hash::HashComposer::HashComposer ( Composer parent)
explicit

HashComposer methods.

Definition at line 89 of file hash_composer_impl.cpp.

References ASSERT_ND, and foedus::storage::Storage< CONTROL_BLOCK >::exists().

90  : engine_(parent->get_engine()),
91  storage_id_(parent->get_storage_id()),
92  storage_(engine_, storage_id_),
93  volatile_resolver_(engine_->get_memory_manager()->get_global_volatile_page_resolver()) {
94  ASSERT_ND(storage_.exists());
95 }
const GlobalVolatilePageResolver & get_global_volatile_page_resolver() const
Returns the page resolver to convert volatile page ID to page pointer.
bool exists() const
Returns whether this storage is already created.
Definition: storage.hpp:169
#define ASSERT_ND(x)
A warning-free wrapper macro of assert() that has no performance effect in release mode even when 'x'...
Definition: assert_nd.hpp:72
memory::EngineMemory * get_memory_manager() const
See Memory Manager.
Definition: engine.cpp:50

Here is the call graph for this function:

Member Function Documentation

ErrorStack foedus::storage::hash::HashComposer::compose ( const Composer::ComposeArguments args)

Definition at line 97 of file hash_composer_impl.cpp.

References foedus::storage::Composer::ComposeArguments::base_epoch_, CHECK_ERROR, foedus::debugging::StopWatch::elapsed_ms(), foedus::DefaultInitializable::initialize(), foedus::storage::hash::kHashMaxLevels, foedus::storage::kHashStorage, foedus::kRetOk, foedus::storage::Composer::ComposeArguments::log_streams_, foedus::storage::Composer::ComposeArguments::log_streams_count_, foedus::storage::Composer::ComposeArguments::previous_snapshot_files_, foedus::storage::Composer::ComposeArguments::root_info_page_, foedus::storage::Composer::ComposeArguments::snapshot_writer_, foedus::debugging::StopWatch::stop(), to_string(), foedus::DefaultInitializable::uninitialize(), and foedus::storage::Composer::ComposeArguments::work_memory_.

Referenced by foedus::storage::Composer::compose().

97  {
98  VLOG(0) << to_string() << " composing with " << args.log_streams_count_ << " streams.";
99  debugging::StopWatch stop_watch;
100 
101  snapshot::MergeSort merge_sort(
102  storage_id_,
103  kHashStorage,
104  args.base_epoch_,
105  args.log_streams_,
106  args.log_streams_count_,
108  args.work_memory_);
109  CHECK_ERROR(merge_sort.initialize());
110 
111  HashComposeContext context(
112  engine_,
113  &merge_sort,
114  args.snapshot_writer_,
115  args.previous_snapshot_files_,
116  args.root_info_page_);
117  CHECK_ERROR(context.execute());
118 
119  CHECK_ERROR(merge_sort.uninitialize()); // no need for scoped release. its destructor is safe.
120 
121  stop_watch.stop();
122  LOG(INFO) << to_string() << " done in " << stop_watch.elapsed_ms() << "ms.";
123  return kRetOk;
124 }
const uint8_t kHashMaxLevels
Max level of intermediate pages.
Definition: hash_id.hpp:104
#define CHECK_ERROR(x)
This macro calls x and checks its returned value.
const ErrorStack kRetOk
Normal return value for no-error case.

Here is the call graph for this function:

Here is the caller graph for this function:

ErrorStack foedus::storage::hash::HashComposer::construct_root ( const Composer::ConstructRootArguments args)

Definition at line 126 of file hash_composer_impl.cpp.

References ASSERT_ND, CHECK_ERROR, foedus::snapshot::SnapshotWriter::dump_pages(), foedus::memory::AlignedMemory::get_block(), foedus::Attachable< CONTROL_BLOCK >::get_control_block(), foedus::storage::hash::HashIntermediatePage::get_level(), foedus::storage::hash::HashStorage::get_levels(), foedus::storage::Storage< CONTROL_BLOCK >::get_metadata(), foedus::snapshot::SnapshotWriter::get_next_page_id(), foedus::snapshot::SnapshotWriter::get_page_base(), foedus::Engine::get_soc_count(), foedus::storage::Composer::ConstructRootArguments::gleaner_resource_, foedus::storage::hash::HashIntermediatePage::header(), foedus::storage::hash::HashIntermediatePage::initialize_snapshot_page(), foedus::storage::kPageSize, foedus::kRetOk, launch_construct_root_multi_level(), foedus::storage::Composer::ConstructRootArguments::new_root_page_pointer_, foedus::storage::PageHeader::page_id_, foedus::storage::Composer::ConstructRootArguments::previous_snapshot_files_, foedus::cache::SnapshotFileSet::read_page(), foedus::storage::Metadata::root_snapshot_page_id_, foedus::storage::Composer::ConstructRootArguments::snapshot_writer_, foedus::storage::PageHeader::storage_id_, foedus::snapshot::LogGleanerResource::tmp_root_page_memory_, to_string(), and WRAP_ERROR_CODE.

Referenced by foedus::storage::Composer::construct_root().

126  {
127  // Unlike array package, this method might do lots of I/O.
128  // Each reducer creates only data pages. They don't create intermediate pages
129  // and instead they emit linked-lists of HashComposedBinsPage as well as HashRootInfoPage
130  // pointing to them. We combine these information here, writing out new intermediate pages.
131 
132  // compose() created root_info_pages that contain pointers to fill in the root page,
133  // so we just find non-zero entry and copy it to root page.
134  const uint8_t levels = storage_.get_levels();
135 
136  HashIntermediatePage* root_page = reinterpret_cast<HashIntermediatePage*>(
137  args.gleaner_resource_->tmp_root_page_memory_.get_block());
138  SnapshotPagePointer old_root_page_id = storage_.get_metadata()->root_snapshot_page_id_;
139  if (old_root_page_id != 0) {
140  WRAP_ERROR_CODE(args.previous_snapshot_files_->read_page(old_root_page_id, root_page));
141  ASSERT_ND(root_page->header().storage_id_ == storage_id_);
142  ASSERT_ND(root_page->header().page_id_ == old_root_page_id);
143  ASSERT_ND(root_page->get_level() + 1U == storage_.get_levels());
144  root_page->header().page_id_ = 0;
145  } else {
146  root_page->initialize_snapshot_page(
147  storage_id_,
148  0, // new page ID not known at this point
149  storage_.get_levels() - 1U,
150  0);
151  }
152 
153  if (levels == 1U) {
154  // single-level hash storage's root info page directly points to data pages, so
155  // no need for parallelization. we separate the case below.
156  CHECK_ERROR(construct_root_single_level(args, root_page));
157  } else {
158  // otherwise, we parallelze the heavy lifting part, which involves raeds/writes of pages.
159  LOG(INFO) << to_string() << " construct_root() multi-level parallelized path";
160  std::vector< std::thread > threads;
161 
162  const uint16_t nodes = engine_->get_soc_count();
163  std::unique_ptr< ErrorStack[] > error_stacks(new ErrorStack[nodes]);
164  for (uint16_t numa_node = 0; numa_node < nodes; ++numa_node) {
165  threads.emplace_back(
167  this,
168  &args,
169  numa_node,
170  root_page,
171  error_stacks.get() + numa_node);
172  }
173 
174  LOG(INFO) << to_string() << " construct_root() launched " << nodes << " threads...";
175  for (uint16_t numa_node = 0; numa_node < nodes; ++numa_node) {
176  threads[numa_node].join();
177  if (error_stacks.get()[numa_node].is_error()) {
178  LOG(ERROR) << to_string() << " construct_root() observed an error in the launched"
179  << " thread-" << numa_node << ". will propagate after joining the worker threads. "
180  << "error description: " << error_stacks.get()[numa_node];
181  }
182  CHECK_ERROR(error_stacks.get()[numa_node]);
183  }
184  LOG(INFO) << to_string() << " construct_root() joined";
185  // propagate errors.
186  for (uint16_t numa_node = 0; numa_node < nodes; ++numa_node) {
187  CHECK_ERROR(error_stacks.get()[numa_node]);
188  }
189  }
190 
191  // write out root_page image
192  SnapshotPagePointer new_root_page_id = args.snapshot_writer_->get_next_page_id();
193  root_page->header().page_id_ = new_root_page_id;
194  std::memcpy(args.snapshot_writer_->get_page_base(), root_page, kPageSize);
195  WRAP_ERROR_CODE(args.snapshot_writer_->dump_pages(0, 1));
196  ASSERT_ND(args.snapshot_writer_->get_next_page_id() == new_root_page_id + 1ULL);
197 
198  *args.new_root_page_pointer_ = new_root_page_id;
199  // AFTER writing out the root page, install the pointer to new root page
200  storage_.get_control_block()->root_page_pointer_.snapshot_pointer_ = new_root_page_id;
201  storage_.get_control_block()->meta_.root_snapshot_page_id_ = new_root_page_id;
202  return kRetOk;
203 }
const Metadata * get_metadata() const
Returns the metadata of this storage.
Definition: storage.hpp:162
static void launch_construct_root_multi_level(HashComposer *pointer, const Composer::ConstructRootArguments *args, uint16_t numa_node, HashIntermediatePage *root_page, ErrorStack *out_error)
launched on its own thread.
soc::SocId get_soc_count() const
Shorthand for get_options().thread_.group_count_.
Definition: engine.cpp:74
uint64_t SnapshotPagePointer
Page ID of a snapshot page.
Definition: storage_id.hpp:79
#define CHECK_ERROR(x)
This macro calls x and checks its returned value.
const ErrorStack kRetOk
Normal return value for no-error case.
#define ASSERT_ND(x)
A warning-free wrapper macro of assert() that has no performance effect in release mode even when 'x'...
Definition: assert_nd.hpp:72
#define WRAP_ERROR_CODE(x)
Same as CHECK_ERROR(x) except it receives only an error code, thus more efficient.
CONTROL_BLOCK * get_control_block() const
Definition: attachable.hpp:97
const uint16_t kPageSize
A constant defining the page size (in bytes) of both snapshot pages and volatile pages.
Definition: storage_id.hpp:45
SnapshotPagePointer root_snapshot_page_id_
Pointer to a snapshotted page this storage is rooted at.
Definition: metadata.hpp:112

Here is the call graph for this function:

Here is the caller graph for this function:

void foedus::storage::hash::HashComposer::drop_root_volatile ( const Composer::DropVolatilesArguments args)

Definition at line 1223 of file hash_composer_impl.cpp.

References foedus::Attachable< CONTROL_BLOCK >::get_control_block(), foedus::storage::hash::HashStorage::get_hash_metadata(), foedus::storage::hash::HashStorage::get_levels(), foedus::storage::Storage< CONTROL_BLOCK >::get_name(), foedus::storage::Metadata::keeps_all_volatile_pages(), and foedus::storage::DualPagePointer::volatile_pointer_.

Referenced by foedus::storage::Composer::drop_root_volatile().

1223  {
1224  if (storage_.get_hash_metadata()->keeps_all_volatile_pages()) {
1225  LOG(INFO) << "Oh, but keep-all-volatile is on. Storage-" << storage_.get_name()
1226  << " is configured to keep all volatile pages.";
1227  return;
1228  }
1229  if (is_to_keep_volatile(storage_.get_levels() - 1U)) {
1230  LOG(INFO) << "Oh, but Storage-" << storage_.get_name() << " is configured to keep"
1231  << " the root page.";
1232  return;
1233  }
1234  DualPagePointer* root_pointer = &storage_.get_control_block()->root_page_pointer_;
1235  HashIntermediatePage* volatile_page = resolve_intermediate(root_pointer->volatile_pointer_);
1236  if (volatile_page == nullptr) {
1237  LOG(INFO) << "Oh, but root volatile page already null";
1238  return;
1239  }
1240 
1241  LOG(INFO) << "Okay, drop em all!!";
1242  drop_all_recurse(args, root_pointer);
1243 }
bool keeps_all_volatile_pages() const
Definition: metadata.hpp:98
const StorageName & get_name() const
Returns the unique name of this storage.
Definition: storage.hpp:155
const HashMetadata * get_hash_metadata() const
CONTROL_BLOCK * get_control_block() const
Definition: attachable.hpp:97

Here is the call graph for this function:

Here is the caller graph for this function:

Composer::DropResult foedus::storage::hash::HashComposer::drop_volatiles ( const Composer::DropVolatilesArguments args)

drop_volatiles and related methods

Definition at line 1165 of file hash_composer_impl.cpp.

References ASSERT_ND, foedus::storage::Composer::DropResult::dropped_all_, foedus::storage::extract_numa_node_from_snapshot_pointer(), foedus::Attachable< CONTROL_BLOCK >::get_control_block(), foedus::storage::hash::HashStorage::get_hash_metadata(), foedus::storage::hash::HashIntermediatePage::get_level(), foedus::storage::hash::HashStorage::get_levels(), foedus::storage::Storage< CONTROL_BLOCK >::get_name(), foedus::storage::hash::HashIntermediatePage::get_pointer(), foedus::storage::hash::HashStorage::get_root_children(), foedus::storage::VolatilePagePointer::is_null(), foedus::storage::Metadata::keeps_all_volatile_pages(), foedus::storage::Composer::DropVolatilesArguments::my_partition_, foedus::storage::Composer::DropVolatilesArguments::partitioned_drop_, foedus::storage::DualPagePointer::snapshot_pointer_, and foedus::storage::DualPagePointer::volatile_pointer_.

Referenced by foedus::storage::Composer::drop_volatiles().

1165  {
1166  Composer::DropResult result(args);
1167  if (storage_.get_hash_metadata()->keeps_all_volatile_pages()) {
1168  LOG(INFO) << "Keep-all-volatile: Storage-" << storage_.get_name()
1169  << " is configured to keep all volatile pages.";
1170  result.dropped_all_ = false;
1171  return result;
1172  }
1173 
1174  DualPagePointer* root_pointer = &storage_.get_control_block()->root_page_pointer_;
1175  HashIntermediatePage* volatile_page = resolve_intermediate(root_pointer->volatile_pointer_);
1176  if (volatile_page == nullptr) {
1177  LOG(INFO) << "No volatile root page. Probably while restart";
1178  return result;
1179  }
1180 
1181  // We iterate through all existing volatile pages to drop volatile pages of
1182  // level-3 or deeper (if the storage has only 2 levels, keeps all).
1183  // this "level-3 or deeper" is a configuration per storage.
1184  // Even if the volatile page is deeper than that, we keep them if it contains newer modification,
1185  // including descendants (so, probably we will keep higher levels anyways).
1186  uint16_t count = storage_.get_root_children();
1187  uint8_t root_level = volatile_page->get_level();
1188  ASSERT_ND(root_level + 1U == storage_.get_levels());
1189  for (uint16_t i = 0; i < count; ++i) {
1190  DualPagePointer& child_pointer = volatile_page->get_pointer(i);
1191  if (!child_pointer.volatile_pointer_.is_null() && child_pointer.snapshot_pointer_ != 0) {
1192  uint16_t partition = extract_numa_node_from_snapshot_pointer(child_pointer.snapshot_pointer_);
1193  if (!args.partitioned_drop_ || partition == args.my_partition_) {
1194  drop_volatiles_child(args, &child_pointer, root_level, &result);
1195  }
1196  }
1197  }
1198  // root page is kept at this point in this case. we need to check with other threads
1199  return result;
1200 }
bool keeps_all_volatile_pages() const
Definition: metadata.hpp:98
const StorageName & get_name() const
Returns the unique name of this storage.
Definition: storage.hpp:155
uint8_t extract_numa_node_from_snapshot_pointer(SnapshotPagePointer pointer)
Definition: storage_id.hpp:95
const HashMetadata * get_hash_metadata() const
#define ASSERT_ND(x)
A warning-free wrapper macro of assert() that has no performance effect in release mode even when 'x'...
Definition: assert_nd.hpp:72
CONTROL_BLOCK * get_control_block() const
Definition: attachable.hpp:97

Here is the call graph for this function:

Here is the caller graph for this function:

static void foedus::storage::hash::HashComposer::launch_construct_root_multi_level ( HashComposer pointer,
const Composer::ConstructRootArguments args,
uint16_t  numa_node,
HashIntermediatePage root_page,
ErrorStack out_error 
)
inlinestatic

launched on its own thread.

Definition at line 76 of file hash_composer_impl.hpp.

Referenced by construct_root().

81  {
82  *out_error = pointer->construct_root_multi_level(*args, numa_node, root_page);
83  }

Here is the caller graph for this function:

std::string foedus::storage::hash::HashComposer::to_string ( ) const

Definition at line 427 of file hash_composer_impl.cpp.

Referenced by compose(), and construct_root().

427  {
428  return std::string("HashComposer-") + std::to_string(storage_id_);
429 }

Here is the caller graph for this function:


The documentation for this class was generated from the following files: