libfoedus-core
FOEDUS Core Library
foedus::storage::Partitioner Class Referencefinal

Partitioning and sorting logic for one storage. More...

Detailed Description

Partitioning and sorting logic for one storage.

When the snapshot module takes snapshot, it instantiates this object for each storage that had some log. Mappers obtain the object and use it to determine which partition to send logs to. Reducers also use it to sort individual logs in the storage by key-then-ordinal order. All methods in this object are in a batched style to avoid overheads per small log entry.

Partitioning Algorithm
So far, create_partitioner() receives only the engine object and ID to design partitioning. This means we can use only information available in the status-quo of the storage in the engine, such as which node owns which volatile/snapshot page. This is so far enough and VERY simple/efficient to design partitioning, but later we might want to explore smarter partitioning that utilizes, say, log entries. (but that will be complex/expensive!)
Sorting Algorithm
Sorting algorithm may use metadata specific to the storage (not just storage type). For example, if we somehow know that every key in the storage is 8 byte, we can do something very efficient. Or, in some case the requirement of sorting itself might depend on the storage.
Shared memory, No virtual methods
Unfortunately, virtual methods are hard to use in multi-process and multi-node environment. All dynamic data of Partitioner is stored in shared memory as explained in PartitionerMetadata. Thus, Partitioner has no virtual methods. It just does switch-case on the storage type and invokes methods in individual partitioners (which aren't derived classes of this).

For more details of how individual storage types implement them, see the individual partitioners.

Definition at line 70 of file partitioner.hpp.

#include <partitioner.hpp>

Inheritance diagram for foedus::storage::Partitioner:
Collaboration diagram for foedus::storage::Partitioner:

Classes

struct  DesignPartitionArguments
 Arguments for design_partition() More...
 
struct  PartitionBatchArguments
 Arguments for partition_batch() More...
 
struct  SortBatchArguments
 Arguments for sort_batch() More...
 

Public Member Functions

 Partitioner (Engine *engine, StorageId id)
 Instantiate an instance for the given storage. More...
 
const PartitionerMetadataget_metadata () const
 Returns tiny metadata of the partitioner in shared memory. More...
 
bool is_valid () const
 whether this object is ready for partitioning. More...
 
StorageId get_storage_id () const
 
StorageType get_storage_type () const
 
bool is_partitionable ()
 returns if this storage is partitionable. More...
 
ErrorStack design_partition (const DesignPartitionArguments &args)
 Determines partitioning scheme for this storage. More...
 
void partition_batch (const PartitionBatchArguments &args)
 Identifies the partition of each log record in a batched fashion. More...
 
void sort_batch (const SortBatchArguments &args)
 Called from log reducer to sort log entries by keys. More...
 
- Public Member Functions inherited from foedus::Attachable< PartitionerMetadata >
 Attachable ()
 
 Attachable (Engine *engine)
 
 Attachable (Engine *engine, PartitionerMetadata *control_block)
 
 Attachable (PartitionerMetadata *control_block)
 
 Attachable (const Attachable &other)
 
virtual ~Attachable ()
 
Attachableoperator= (const Attachable &other)
 
virtual void attach (PartitionerMetadata *control_block)
 Attaches to the given shared memory. More...
 
bool is_attached () const
 Returns whether the object has been already attached to some shared memory. More...
 
PartitionerMetadata * get_control_block () const
 
Engineget_engine () const
 
void set_engine (Engine *engine)
 

Friends

std::ostream & operator<< (std::ostream &o, const Partitioner &v)
 

Additional Inherited Members

- Protected Attributes inherited from foedus::Attachable< PartitionerMetadata >
Engineengine_
 Most attachable object stores an engine pointer (local engine), so we define it here. More...
 
PartitionerMetadata * control_block_
 The shared data on shared memory that has been initialized in some SOC or master engine. More...
 

Constructor & Destructor Documentation

foedus::storage::Partitioner::Partitioner ( Engine engine,
StorageId  id 
)

Instantiate an instance for the given storage.

Parameters
[in]engineEngine
[in]idID of the storage

This method only attaches to a shared memory, so it has no cost. You can instantiate Partitioner anywhere you like.

Definition at line 84 of file partitioner.cpp.

References ASSERT_ND, foedus::storage::StorageManager::get_storage(), foedus::Engine::get_storage_manager(), foedus::storage::kInvalidStorage, foedus::storage::StorageControlBlock::meta_, and foedus::storage::Metadata::type_.

86  id_ = id;
87  type_ = engine->get_storage_manager()->get_storage(id)->meta_.type_;
88  ASSERT_ND(type_ != kInvalidStorage);
89 }
0 indicates invalid type.
Definition: storage_id.hpp:124
#define ASSERT_ND(x)
A warning-free wrapper macro of assert() that has no performance effect in release mode even when 'x'...
Definition: assert_nd.hpp:72
static PartitionerMetadata * get_metadata(Engine *engine, StorageId id)
Returns the shared memory for the given storage ID.
Definition: partitioner.cpp:38

Here is the call graph for this function:

Member Function Documentation

ErrorStack foedus::storage::Partitioner::design_partition ( const DesignPartitionArguments args)

Determines partitioning scheme for this storage.

This method puts the resulting data in shared memory. This method should be called only once per snapshot.

Definition at line 106 of file partitioner.cpp.

References foedus::storage::sequential::SequentialPartitioner::design_partition(), foedus::storage::hash::HashPartitioner::design_partition(), foedus::storage::array::ArrayPartitioner::design_partition(), foedus::storage::masstree::MasstreePartitioner::design_partition(), foedus::storage::kArrayStorage, foedus::storage::kHashStorage, foedus::storage::kMasstreeStorage, foedus::kRetOk, and foedus::storage::kSequentialStorage.

106  {
107  switch (type_) {
108  case kArrayStorage: return array::ArrayPartitioner(this).design_partition(args);
109  case kHashStorage: return hash::HashPartitioner(this).design_partition(args);
110  case kSequentialStorage: return sequential::SequentialPartitioner(this).design_partition(args);
111  case kMasstreeStorage: return masstree::MasstreePartitioner(this).design_partition(args);
112  default:
113  LOG(FATAL) << "Unsupported storage type:" << type_;
114  return kRetOk;
115  }
116 }
const ErrorStack kRetOk
Normal return value for no-error case.

Here is the call graph for this function:

const PartitionerMetadata & foedus::storage::Partitioner::get_metadata ( ) const

Returns tiny metadata of the partitioner in shared memory.

Definition at line 91 of file partitioner.cpp.

References foedus::Attachable< PartitionerMetadata >::control_block_.

91 { return *control_block_; }
PartitionerMetadata * control_block_
The shared data on shared memory that has been initialized in some SOC or master engine.
Definition: attachable.hpp:111
StorageId foedus::storage::Partitioner::get_storage_id ( ) const
inline

Definition at line 86 of file partitioner.hpp.

86 { return id_;}
StorageType foedus::storage::Partitioner::get_storage_type ( ) const
inline

Definition at line 87 of file partitioner.hpp.

87 { return type_; }
bool foedus::storage::Partitioner::is_partitionable ( )

returns if this storage is partitionable.

Some storage, such as a B-tree with only a single page (root=leaf), it is impossible to partition. In that case, this returns false to indicate that the caller can just assume all logs should be blindly sent to partition-0. Similarly, if there is only one NUMA node (partition), the caller also skips partitioning, but in that case the caller even skips instantiating partitioners.

Definition at line 94 of file partitioner.cpp.

References foedus::storage::sequential::SequentialPartitioner::is_partitionable(), foedus::storage::hash::HashPartitioner::is_partitionable(), foedus::storage::array::ArrayPartitioner::is_partitionable(), foedus::storage::masstree::MasstreePartitioner::is_partitionable(), foedus::storage::kArrayStorage, foedus::storage::kHashStorage, foedus::storage::kMasstreeStorage, and foedus::storage::kSequentialStorage.

94  {
95  switch (type_) {
96  case kArrayStorage: return array::ArrayPartitioner(this).is_partitionable();
97  case kHashStorage: return hash::HashPartitioner(this).is_partitionable();
98  case kMasstreeStorage: return masstree::MasstreePartitioner(this).is_partitionable();
99  case kSequentialStorage: return sequential::SequentialPartitioner(this).is_partitionable();
100  default:
101  LOG(FATAL) << "Unsupported storage type:" << type_;
102  return false;
103  }
104 }

Here is the call graph for this function:

bool foedus::storage::Partitioner::is_valid ( ) const

whether this object is ready for partitioning.

if only sorting is needed, it doesn't matter

Definition at line 92 of file partitioner.cpp.

References foedus::Attachable< PartitionerMetadata >::control_block_.

92 { return control_block_->valid_; }
PartitionerMetadata * control_block_
The shared data on shared memory that has been initialized in some SOC or master engine.
Definition: attachable.hpp:111
void foedus::storage::Partitioner::partition_batch ( const PartitionBatchArguments args)

Identifies the partition of each log record in a batched fashion.

Precondition
!is_partitionable(): in this case, it's waste of time. check it before calling this.

Each storage type implements this method based on the statistics passed to create_partitioner(). For better performance, logs_count is usually at least thousands. Assume the scale when you optimize the implementation in derived classes.

Definition at line 118 of file partitioner.cpp.

References foedus::storage::kArrayStorage, foedus::storage::kHashStorage, foedus::storage::kMasstreeStorage, foedus::storage::kSequentialStorage, foedus::storage::sequential::SequentialPartitioner::partition_batch(), foedus::storage::hash::HashPartitioner::partition_batch(), foedus::storage::array::ArrayPartitioner::partition_batch(), and foedus::storage::masstree::MasstreePartitioner::partition_batch().

118  {
119  switch (type_) {
120  case kArrayStorage: return array::ArrayPartitioner(this).partition_batch(args);
121  case kHashStorage: return hash::HashPartitioner(this).partition_batch(args);
122  case kMasstreeStorage: return masstree::MasstreePartitioner(this).partition_batch(args);
123  case kSequentialStorage: return sequential::SequentialPartitioner(this).partition_batch(args);
124  default:
125  LOG(FATAL) << "Unsupported storage type:" << type_;
126  }
127 }

Here is the call graph for this function:

void foedus::storage::Partitioner::sort_batch ( const SortBatchArguments args)

Called from log reducer to sort log entries by keys.

All log entries passed to this method are for this storage. Each storage type implements an efficient and batched way of sorting all log entries by key-and-then-ordinal. The implementation can do compaction when it is safe. For example, two ovewrite logs on the same key's same data region can be compacted to one log. In that case, written_count becomes smaller than log_positions_count_.

See also
get_required_sort_buffer_size()

Definition at line 129 of file partitioner.cpp.

References foedus::storage::kArrayStorage, foedus::storage::kHashStorage, foedus::storage::kMasstreeStorage, foedus::storage::kSequentialStorage, foedus::storage::sequential::SequentialPartitioner::sort_batch(), foedus::storage::hash::HashPartitioner::sort_batch(), foedus::storage::array::ArrayPartitioner::sort_batch(), and foedus::storage::masstree::MasstreePartitioner::sort_batch().

129  {
130  switch (type_) {
131  case kArrayStorage: return array::ArrayPartitioner(this).sort_batch(args);
132  case kHashStorage: return hash::HashPartitioner(this).sort_batch(args);
133  case kMasstreeStorage: return masstree::MasstreePartitioner(this).sort_batch(args);
134  case kSequentialStorage: return sequential::SequentialPartitioner(this).sort_batch(args);
135  default:
136  LOG(FATAL) << "Unsupported storage type:" << type_;
137  }
138 }

Here is the call graph for this function:

Friends And Related Function Documentation

std::ostream& operator<< ( std::ostream &  o,
const Partitioner v 
)
friend

Definition at line 140 of file partitioner.cpp.

140  {
141  o << "<Partitioner>"
142  << "<id>" << v.id_ << "</id>"
143  << "<type>" << to_storage_type_name(v.type_) << "</type>"
144  << *v.control_block_;
145  o << "</Partitioner>";
146  return o;
147 }
const char * to_storage_type_name(StorageType type)
Gives a string representation of StorageType.
Definition: storage_id.hpp:139

The documentation for this class was generated from the following files: