libfoedus-core
FOEDUS Core Library
foedus::storage::masstree::ReserveRecords Struct Reference  [final]

A system transaction to reserve physical record(s) in a border page.

Detailed Description

A system transaction to reserve physical record(s) in a border page.

See also
Physical-only, short-living system transactions.

A record insertion happens in two steps:

  • During Execution: Physically inserts a deleted record for the key in the border page.
  • During Commit: Logically flips the delete-bit and installs the payload.

This system transaction does the former.
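To make the execution-time step concrete, here is a minimal caller-side sketch. It assumes the sysxct is dispatched through thread::Thread::run_nested_sysxct(); treat that dispatch call and the variable names as illustrative assumptions rather than FOEDUS-verified usage.

  // Execution step only: installs a physically-present but logically-deleted
  // record. The commit-time step that flips the delete-bit and writes the
  // payload is performed later by the logical transaction (not shown).
  ReserveRecords reserve(
    context,           // thread::Thread* running the sysxct
    border_page,       // MasstreeBorderPage* that should receive the record
    slice,             // first 8-byte key slice
    remainder_length,  // length of the key remainder
    suffix,            // key suffix bytes
    payload_count,     // minimal payload space to reserve
    false,             // do not aggressively create a next layer
    0);                // hint_check_from: re-check all slots
  CHECK_ERROR_CODE(context->run_nested_sysxct(&reserve));  // assumed dispatch API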

This does nothing and returns kErrorCodeOk in the following cases:

  • The page turns out to already contain a satisfying physical record for the key.
  • The page turns out to have already been moved.
  • The page turns out to need a page-split to accommodate the record.

When the 2nd or 3rd case happens (or the 1st case with too little payload space), the caller must do something else (e.g. split or follow-foster) and retry, as sketched below.
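A hedged sketch of that retry loop follows; follow_foster() and split_border() are hypothetical stand-ins for the real foster-twin traversal and page-split machinery, and run_nested_sysxct() is an assumed dispatch API.

  for (;;) {
    ReserveRecords reserve(context, page, slice, rem_len, suffix, payload, false, 0);
    CHECK_ERROR_CODE(context->run_nested_sysxct(&reserve));
    if (page->is_moved()) {
      page = follow_foster(page, slice);  // hypothetical: descend to the foster child
      continue;
    }
    if (reserve.out_split_needed_) {
      CHECK_ERROR_CODE(split_border(context, page));  // hypothetical split helper
      continue;
    }
    break;  // the page now holds a satisfying physical record for the key
  }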

Locks taken in this sysxct (in order of acquisition):

  • Page-lock of the target page.
  • Record-lock of an existing, matching record, taken only when we have to expand the record.

So far this sysxct installs only one physical record at a time. TASK(Hideaki): Batching several records would probably help.

Definition at line 57 of file masstree_reserve_impl.hpp.

#include <masstree_reserve_impl.hpp>


Public Member Functions

 ReserveRecords (thread::Thread *context, MasstreeBorderPage *target, KeySlice slice, KeyLength remainder_length, const void *suffix, PayloadLength payload_count, bool should_aggresively_create_next_layer, SlotIndex hint_check_from)
 
virtual ErrorCode run (xct::SysxctWorkspace *sysxct_workspace) override
 Execute the system transaction.
 

Public Attributes

thread::Thread *const context_
 Thread context.
 
MasstreeBorderPage *const target_
 The page to install a new physical record.
 
const KeySlice slice_
 The slice of the key.
 
const void *const suffix_
 Suffix of the key.
 
const KeyLength remainder_length_
 Length of the remainder.
 
const PayloadLength payload_count_
 Minimal required length of the payload.
 
const SlotIndex hint_check_from_
 The in-page location from which this sysxct will look for matching records.
 
const bool should_aggresively_create_next_layer_
 When we CAN create a next layer for the new record, whether to make it a next-layer from the beginning, or make it a usual record first.
 
bool out_split_needed_
 [Out] Whether the caller must split the page and retry.
 

Constructor & Destructor Documentation

foedus::storage::masstree::ReserveRecords::ReserveRecords ( thread::Thread *  context,
MasstreeBorderPage *  target,
KeySlice  slice,
KeyLength  remainder_length,
const void *  suffix,
PayloadLength  payload_count,
bool  should_aggresively_create_next_layer,
SlotIndex  hint_check_from 
)
inline

Definition at line 91 of file masstree_reserve_impl.hpp.

  ReserveRecords(
    thread::Thread* context,
    MasstreeBorderPage* target,
    KeySlice slice,
    KeyLength remainder_length,
    const void* suffix,
    PayloadLength payload_count,
    bool should_aggresively_create_next_layer,
    SlotIndex hint_check_from)
    : xct::SysxctFunctor(),
      context_(context),
      target_(target),
      slice_(slice),
      suffix_(suffix),
      remainder_length_(remainder_length),
      payload_count_(payload_count),
      hint_check_from_(hint_check_from),
      should_aggresively_create_next_layer_(should_aggresively_create_next_layer),
      out_split_needed_(false) {
  }

Member Function Documentation

ErrorCode foedus::storage::masstree::ReserveRecords::run ( xct::SysxctWorkspace *  sysxct_workspace )
override, virtual

Execute the system transaction.

You should override this method.

Implements foedus::xct::SysxctFunctor.

Definition at line 55 of file masstree_reserve_impl.cpp.

References foedus::storage::masstree::allocate_new_border_page(), foedus::storage::masstree::MasstreeBorderPage::assert_entries(), ASSERT_ND, foedus::storage::masstree::MasstreeBorderPage::can_accomodate(), CHECK_ERROR_CODE, context_, foedus::storage::masstree::MasstreeBorderPage::does_point_to_layer(), foedus::storage::masstree::MasstreeBorderPage::find_key_for_reserve(), foedus::storage::masstree::get_initial_xid(), foedus::storage::masstree::MasstreePage::get_key_count(), foedus::storage::masstree::MasstreePage::get_layer(), foedus::thread::Thread::get_local_volatile_page_resolver(), foedus::storage::masstree::MasstreeBorderPage::get_max_payload_length(), foedus::storage::masstree::MasstreeBorderPage::get_next_layer(), foedus::thread::Thread::get_numa_node(), foedus::storage::masstree::MasstreeBorderPage::get_owner_id(), foedus::thread::Thread::get_thread_memory(), foedus::storage::masstree::MasstreePage::get_volatile_page_id(), foedus::memory::NumaCoreMemory::grab_free_volatile_page(), foedus::storage::masstree::MasstreePage::header(), hint_check_from_, foedus::storage::masstree::MasstreePage::increment_key_count(), foedus::storage::masstree::MasstreeBorderPage::initialize_as_layer_root_physical(), foedus::storage::masstree::MasstreeBorderPage::initialize_volatile_page(), foedus::storage::masstree::MasstreePage::is_locked(), foedus::storage::masstree::MasstreePage::is_moved(), foedus::storage::masstree::MasstreePage::is_retired(), foedus::storage::masstree::kBorderPageMaxSlots, foedus::storage::masstree::MasstreeBorderPage::kConflictingLocalRecord, foedus::kErrorCodeMemoryNoFreePages, foedus::kErrorCodeOk, foedus::storage::masstree::MasstreeBorderPage::kExactMatchLayerPointer, foedus::storage::masstree::MasstreeBorderPage::kExactMatchLocalRecord, foedus::storage::masstree::kInfimumSlice, foedus::storage::masstree::MasstreeBorderPage::kNotFound, foedus::storage::masstree::kSupremumSlice, foedus::assorted::memory_fence_release(), out_split_needed_, foedus::storage::PageHeader::page_id_, payload_count_, remainder_length_, foedus::storage::masstree::MasstreeBorderPage::reserve_initially_next_layer(), foedus::storage::masstree::MasstreeBorderPage::reserve_record_space(), foedus::memory::LocalPageResolver::resolve_offset_newpage(), foedus::storage::VolatilePagePointer::set(), should_aggresively_create_next_layer_, slice_, foedus::storage::PageHeader::snapshot_, foedus::storage::DualPagePointer::snapshot_pointer_, foedus::storage::PageHeader::storage_id_, suffix_, foedus::thread::Thread::sysxct_page_lock(), foedus::thread::Thread::sysxct_record_lock(), target_, foedus::storage::masstree::MasstreeBorderPage::try_expand_record_in_page_physical(), and foedus::storage::DualPagePointer::volatile_pointer_.

ErrorCode ReserveRecords::run(xct::SysxctWorkspace* sysxct_workspace) {
  out_split_needed_ = false;
  ASSERT_ND(!target_->header().snapshot_);  // we reserve records only in volatile pages
  CHECK_ERROR_CODE(context_->sysxct_page_lock(sysxct_workspace, reinterpret_cast<Page*>(target_)));

  // After page-lock, key-count and record-keys (not payloads) are fixed. Check them now!
  if (target_->is_moved()) {
    DVLOG(0) << "Interesting. this page has been split";
    return kErrorCodeOk;
  }
  const SlotIndex key_count = target_->get_key_count();
  const VolatilePagePointer page_id(target_->header().page_id_);
  MasstreeBorderPage::FindKeyForReserveResult match = target_->find_key_for_reserve(
    hint_check_from_,
    key_count,
    slice_,
    suffix_,
    remainder_length_);

  // Let's check whether we need to split this page.
  // If we can accommodate the record without splitting the page, complete it within here.
  if (match.match_type_ == MasstreeBorderPage::kExactMatchLayerPointer) {
    // Definitely done. The caller must follow the next-layer.
    ASSERT_ND(match.index_ < kBorderPageMaxSlots);
    return kErrorCodeOk;
  } else if (match.match_type_ == MasstreeBorderPage::kExactMatchLocalRecord) {
    // Is it spacious enough?
    ASSERT_ND(match.index_ < kBorderPageMaxSlots);
    if (target_->get_max_payload_length(match.index_) >= payload_count_) {
      return kErrorCodeOk;  // Yes! done.
    }

    DVLOG(2) << "Need to expand the record.";
    auto* record = target_->get_owner_id(match.index_);
    CHECK_ERROR_CODE(context_->sysxct_record_lock(sysxct_workspace, page_id, record));
    ASSERT_ND(record->is_keylocked());

    // Now the state of the record is finalized. Let's check it again.
    ASSERT_ND(!record->is_moved());  // can't be moved as target_ is not moved.
    if (target_->get_max_payload_length(match.index_) >= payload_count_) {
      return kErrorCodeOk;  // Yes! done.
    } else if (record->is_next_layer()) {
      DVLOG(0) << "Interesting. the record now points to next layer";
      return kErrorCodeOk;  // same as kExactMatchLayerPointer
    }

    bool expanded = target_->try_expand_record_in_page_physical(payload_count_, match.index_);
    if (expanded) {
      ASSERT_ND(target_->get_max_payload_length(match.index_) >= payload_count_);
      return kErrorCodeOk;
    }

    DVLOG(0) << "Ouch. need to split for allocating a space for record expansion";
  } else if (match.match_type_ == MasstreeBorderPage::kConflictingLocalRecord) {
    // We will create a next layer.
    // In this case, page-lock was actually an overkill because of key-immutability.
    // However, doing it after page-lock makes the code simpler.
    // In most cases, the caller finds the conflicting local record before calling this
    // sysxct. No need to optimize for this rare case.
    ASSERT_ND(match.index_ < kBorderPageMaxSlots);

    auto* record = target_->get_owner_id(match.index_);
    CHECK_ERROR_CODE(context_->sysxct_record_lock(sysxct_workspace, page_id, record));
    ASSERT_ND(record->is_keylocked());

    // Now the state of the record is finalized. Let's check it again.
    ASSERT_ND(!record->is_moved());  // can't be moved as target_ is not moved.
    if (record->is_next_layer()) {
      DVLOG(0) << "Interesting. the record now points to next layer";
      return kErrorCodeOk;  // same as kExactMatchLayerPointer
    }

    // Can we trivially turn this into a next-layer record?
    if (target_->get_max_payload_length(match.index_) < sizeof(DualPagePointer)) {
      DVLOG(1) << "We need to expand the record to make it a next-layer."
        " If this happens too often and is the bottleneck, "
        " you should have used physical_payload_hint when you initially inserted the record.";
      bool expanded
        = target_->try_expand_record_in_page_physical(sizeof(DualPagePointer), match.index_);
      if (expanded) {
        ASSERT_ND(target_->get_max_payload_length(match.index_) >= sizeof(DualPagePointer));
      }
    }

    if (target_->get_max_payload_length(match.index_) >= sizeof(DualPagePointer)) {
      // Turn it into a next-layer.
      memory::NumaCoreMemory* memory = context_->get_thread_memory();
      memory::PagePoolOffset offset = memory->grab_free_volatile_page();
      if (offset == 0) {
        return kErrorCodeMemoryNoFreePages;
      }
      MasstreeBorderPage* new_layer_root = reinterpret_cast<MasstreeBorderPage*>(
        context_->get_local_volatile_page_resolver().resolve_offset_newpage(offset));
      VolatilePagePointer new_page_id;
      new_page_id.set(context_->get_numa_node(), offset);
      new_layer_root->initialize_as_layer_root_physical(new_page_id, target_, match.index_);
      return kErrorCodeOk;
    }

    DVLOG(0) << "Ouch. need to split for allocating a space for next-layer";
  } else {
    ASSERT_ND(match.match_type_ == MasstreeBorderPage::kNotFound);
    if (should_aggresively_create_next_layer_ &&
      target_->can_accomodate(key_count, sizeof(KeySlice), sizeof(DualPagePointer))) {
      DVLOG(1) << "Aggressively creating a next-layer.";

      MasstreeBorderPage* root = allocate_new_border_page(context_);
      if (root == nullptr) {
        return kErrorCodeMemoryNoFreePages;
      }
      DualPagePointer pointer;
      pointer.snapshot_pointer_ = 0;
      pointer.volatile_pointer_ = root->get_volatile_page_id();

      root->initialize_volatile_page(
        target_->header().storage_id_,
        pointer.volatile_pointer_,
        target_->get_layer() + 1U,
        kInfimumSlice,    // infimum slice
        kSupremumSlice);  // high-fence is supremum
      ASSERT_ND(!root->is_locked());
      ASSERT_ND(!root->is_moved());
      ASSERT_ND(!root->is_retired());
      ASSERT_ND(root->get_key_count() == 0);

      xct::XctId initial_id = get_initial_xid();
      initial_id.set_next_layer();
      target_->reserve_initially_next_layer(key_count, initial_id, slice_, pointer);

      ASSERT_ND(target_->does_point_to_layer(key_count));
      ASSERT_ND(target_->get_next_layer(key_count)->volatile_pointer_ == pointer.volatile_pointer_);

      assorted::memory_fence_release();
      target_->increment_key_count();
      target_->assert_entries();
      return kErrorCodeOk;
    }

    if (target_->can_accomodate(key_count, remainder_length_, payload_count_)) {
      xct::XctId initial_id = get_initial_xid();
      initial_id.set_deleted();
      target_->reserve_record_space(
        key_count,
        initial_id,
        slice_,
        suffix_,
        remainder_length_,
        payload_count_);
      // we increment key count AFTER installing the key because otherwise the optimistic read
      // might see the record but find that the key doesn't match. we need a fence to prevent it.
      assorted::memory_fence_release();
      target_->increment_key_count();
      target_->assert_entries();
      return kErrorCodeOk;
    }

    DVLOG(1) << "Ouch. need to split for allocating a space for new record";
  }

  // If we are here, we need to split the page for some reason.
  out_split_needed_ = true;
  return kErrorCodeOk;
}


Member Data Documentation

thread::Thread* const foedus::storage::masstree::ReserveRecords::context_

Thread context.

Definition at line 59 of file masstree_reserve_impl.hpp.

Referenced by run().

const SlotIndex foedus::storage::masstree::ReserveRecords::hint_check_from_

The in-page location from which this sysxct will look for matching records.

The caller is sure that no record before this position can match the slice. Thanks to append-only writes in border pages, the caller can guarantee that the records it observed are final.

In most cases, this is the same as key_count after locking, thus completely avoiding the re-check.
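For illustration, a hedged sketch of how a caller might derive this hint from its own optimistic search (the variable names and surrounding flow are assumptions):

  // The caller searched slots [0, observed_count) and found no match, so the
  // sysxct only needs to re-check slots appended after that observation.
  const SlotIndex observed_count = page->get_key_count();
  // ... optimistic search over slots [0, observed_count) finds nothing ...
  ReserveRecords reserve(
    context, page, slice, rem_len, suffix, payload,
    false, observed_count /* hint_check_from */);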

Definition at line 78 of file masstree_reserve_impl.hpp.

Referenced by run().

bool foedus::storage::masstree::ReserveRecords::out_split_needed_

[Out] Set to true when the target page must be split before this sysxct can reserve the record; the caller should split the page and retry.

Referenced by run().

const PayloadLength foedus::storage::masstree::ReserveRecords::payload_count_

Minimal required length of the payload.

Definition at line 69 of file masstree_reserve_impl.hpp.

Referenced by run().

const KeyLength foedus::storage::masstree::ReserveRecords::remainder_length_

Length of the remainder.

Definition at line 67 of file masstree_reserve_impl.hpp.

Referenced by run().

const bool foedus::storage::masstree::ReserveRecords::should_aggresively_create_next_layer_

When we CAN create a next layer for the new record, whether to make it a next-layer from the beginning, or make it a usual record first.

See also
MasstreeMetadata::should_aggresively_create_next_layer()
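A hedged sketch of plumbing this flag through from the storage metadata; the call site and variable names are assumptions (the see-also confirms the metadata method's existence, not this exact signature):

  // Assumed call site: read the policy from the storage's metadata, then pass
  // it into the sysxct. Verify the real signature before relying on this.
  const bool aggressive = metadata->should_aggresively_create_next_layer(layer);
  ReserveRecords reserve(
    context, target_page, slice, rem_len, suffix, payload,
    aggressive, hint);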

Definition at line 85 of file masstree_reserve_impl.hpp.

Referenced by run().

const KeySlice foedus::storage::masstree::ReserveRecords::slice_

The slice of the key.

Definition at line 63 of file masstree_reserve_impl.hpp.

Referenced by run().

const void* const foedus::storage::masstree::ReserveRecords::suffix_

Suffix of the key.

Definition at line 65 of file masstree_reserve_impl.hpp.

Referenced by run().

MasstreeBorderPage* const foedus::storage::masstree::ReserveRecords::target_

The page to install a new physical record.

Definition at line 61 of file masstree_reserve_impl.hpp.

Referenced by run().


The documentation for this struct was generated from the following files:

  • masstree_reserve_impl.hpp
  • masstree_reserve_impl.cpp