A cursor interface to read tuples from a sequential storage.

Detailed Description

A cursor interface to read tuples from a sequential storage.

Unlike other storages, the only read-access to sequential storages is, as the name implies, a full sequential scan. This cursor interface is thus optimized for cases where we scan millions of records. This implies that, unlike masstree's cursor, we don't have to worry about infrequent overheads, such as new/delete in initialization.

Example first: Use it as follows.

   memory::AlignedMemory buffer(1 << 16, 1 << 12, kNumaAllocOnnode, 0);
   SequentialCursor cursor(context, storage, buffer.get_block(), 1 << 16);
   SequentialRecordIterator it;
   while (cursor.is_valid()) {
     CHECK_ERROR(cursor.next_batch(&it));
     while (it.is_valid()) {
       std::cout << std::string(it.get_cur_record_raw(), it.get_cur_record_length());
       ...
       it.next();
     }
   }

Safe Epoch and Unsafe Epoch: Safe epochs are epochs before the current grace epoch (current global epoch - 1). No transaction in such an epoch can insert new records any more. Thus, thanks to the append-only nature of sequential storage, reading records in safe epochs does not need any concurrency control. Unsafe epochs, on the other hand, are the current grace epoch and later. Some transaction in the grace epoch might still be in its apply phase, inserting records, and new transactions might start in the current epoch. This cursor might perform expensive synchronization if the user requests records from unsafe epochs.
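The safe/unsafe split can be pictured with a small stand-alone sketch. It is not FOEDUS code: epochs are modeled as plain integers without the wrap-around handling of foedus::Epoch, and the names EpochInt, grace_epoch_of, is_safe_epoch and range_needs_synchronization are hypothetical helpers introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for foedus::Epoch (assumption: plain counter, no wrap-around).
using EpochInt = uint32_t;

// The grace epoch is one behind the current global epoch.
inline EpochInt grace_epoch_of(EpochInt current_global_epoch) {
  assert(current_global_epoch > 0);
  return current_global_epoch - 1;
}

// Safe epochs are strictly before the grace epoch: nothing can be inserted there any more.
inline bool is_safe_epoch(EpochInt epoch, EpochInt current_global_epoch) {
  return epoch < grace_epoch_of(current_global_epoch);
}

// A scan over [from, to) touches unsafe epochs only if its exclusive end
// lies beyond the grace epoch.
inline bool range_needs_synchronization(EpochInt to_epoch_exclusive,
                                        EpochInt current_global_epoch) {
  return to_epoch_exclusive > grace_epoch_of(current_global_epoch);
}
```

With a current global epoch of 10, the grace epoch is 9, so epoch 8 is safe while epochs 9 and 10 are unsafe. A scan whose exclusive end is 9 stays within safe epochs.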

Optimistic vs pessimistic: Reading unsafe epochs should be protected by a lock (pessimistic) because 1) it happens rarely, and 2) OCC is quite likely to abort because all accesses are at the tail. So far, the implementation is OCC: it just takes the page version of the tail page and a read-set entry for the last record in the tail page. Frankly speaking, this was done to save coding effort. We should measure OCC vs. lock in this case and will most likely implement the lock. The lock must be a bit more complicated than usual because insertion threads should not take locks frequently (that would be too expensive).
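To see why optimistic reads at the tail tend to abort, consider the following single-threaded illustration. It is a deliberately simplified model, not the FOEDUS implementation: TailPageSketch, Observation, observe and validate are hypothetical names, and a real page version and read set carry much more state.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical model of a tail page: a version counter plus a record count.
struct TailPageSketch {
  std::atomic<uint64_t> version{0};
  std::atomic<uint32_t> record_count{0};

  // An insertion appends a record and bumps the page version.
  void append() {
    record_count.fetch_add(1, std::memory_order_relaxed);
    version.fetch_add(1, std::memory_order_release);
  }
};

// What an optimistic reader remembers when it starts reading the tail.
struct Observation {
  uint64_t observed_version;
  uint32_t observed_count;
};

inline Observation observe(const TailPageSketch& page) {
  Observation o;
  o.observed_version = page.version.load(std::memory_order_acquire);
  o.observed_count = page.record_count.load(std::memory_order_acquire);
  return o;
}

// Commit-time check: the read is valid only if nobody appended in between.
inline bool validate(const TailPageSketch& page, const Observation& o) {
  return page.version.load(std::memory_order_acquire) == o.observed_version;
}
```

Because every insertion lands on the tail page, any append between observe() and validate() invalidates the read. That is the abort risk the paragraph above describes.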

Definition at line 91 of file sequential_cursor.hpp.

#include <sequential_cursor.hpp>

Public Types
enum	OrderMode { kNodeFirstMode, kLooseEpochSortMode }
	The order this cursor returns tuples.

Public Member Functions
	SequentialCursor (thread::Thread *context, const sequential::SequentialStorage &storage, void *buffer, uint64_t buffer_size, OrderMode order_mode=kNodeFirstMode, Epoch from_epoch=INVALID_EPOCH, Epoch to_epoch=INVALID_EPOCH, int32_t node_filter=-1)
	Constructs a cursor to read tuples from this storage.

	~SequentialCursor ()

thread::Thread *	get_context () const

const sequential::SequentialStorage &	get_storage () const

Epoch	get_from_epoch () const

Epoch	get_to_epoch () const

ErrorCode	next_batch (SequentialRecordIterator *out)
	Returns a batch of records as an iterator.

bool	is_valid () const

bool	is_finished_snapshots () const
	The following are rather implementation details, used only from testcases.

bool	is_finished_safe_volatiles () const

bool	is_finished_unsafe_volatiles () const

Friends
std::ostream &	operator<< (std::ostream &o, const SequentialCursor &v)

Member Enumeration Documentation

enum foedus::storage::sequential::SequentialCursor::OrderMode

The order this cursor returns tuples.

Enumerator

kNodeFirstMode

Returns as many records as possible from node-0's core-0, then core-1, and so on, then does the same for node-1, ...

Note that even this mode might return records in unsafe epochs last, because we delay reading unsafe epochs as much as possible.

kLooseEpochSortMode

Returns records loosely ordered by epochs.

We don't guarantee true ordering even in this case because that would be too expensive. TASK(Hideaki) Not implemented yet.

Definition at line 94 of file sequential_cursor.hpp.

   enum OrderMode {
     kNodeFirstMode,
     kLooseEpochSortMode,
   };

Constructor & Destructor Documentation

foedus::storage::sequential::SequentialCursor::SequentialCursor	(	thread::Thread *	context,
		const sequential::SequentialStorage &	storage,
		void *	buffer,
		uint64_t	buffer_size,
		OrderMode	order_mode = `kNodeFirstMode`,
		Epoch	from_epoch = `INVALID_EPOCH`,
		Epoch	to_epoch = `INVALID_EPOCH`,
		int32_t	node_filter = `-1`
	)

Constructs a cursor to read tuples from this storage.

Parameters

[in]	context	Thread context of the transaction
[in]	storage	The sequential storage to read from
[in,out]	buffer	The buffer to read a number of snapshot pages in a batch. This buffer must be aligned for direct-IO.
[in]	buffer_size	Byte size of buffer. Must be at least one page (kPageSize, 4 KB).
[in]	order_mode	The order this cursor returns tuples
[in]	from_epoch	Inclusive beginning of epochs to read. If not specified, all epochs.
[in]	to_epoch	Exclusive end of epochs to read. To read records in unsafe epochs, specify a future epoch, larger than the current grace epoch (remember, it's exclusive end). If not specified, all safe epochs (fast, but does not return records being added).
[in]	node_filter	If specified, returns records only in the given node. A negative value reads from all nodes. This is especially useful for parallelizing a scan on a large sequential storage.

Default parameters: from_epoch defaults to the system-initial (earliest) epoch and to_epoch to the current grace epoch (thus safe_epoch_only_). Assuming this storage is used for log/archive data, this should be a quite common use case. order_mode defaults to kNodeFirstMode.
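The defaulting rules and the node filter can be summarized in a stand-alone sketch. It assumes epochs are plain integers with 0 standing in for INVALID_EPOCH; EpochInt, kInvalidEpochInt, resolve_epoch_range and node_passes_filter are hypothetical names for illustration, not FOEDUS APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

using EpochInt = uint32_t;                // simplified stand-in for foedus::Epoch
constexpr EpochInt kInvalidEpochInt = 0;  // stand-in for INVALID_EPOCH (assumption)

// Resolves the [from, to) range the way the parameter docs describe:
// an unspecified from_epoch becomes the earliest epoch, and an unspecified
// to_epoch becomes the grace epoch, so that only safe epochs are read.
inline std::pair<EpochInt, EpochInt> resolve_epoch_range(EpochInt from_epoch,
                                                         EpochInt to_epoch,
                                                         EpochInt earliest_epoch,
                                                         EpochInt grace_epoch) {
  const EpochInt resolved_from =
      (from_epoch != kInvalidEpochInt) ? from_epoch : earliest_epoch;
  const EpochInt resolved_to =
      (to_epoch != kInvalidEpochInt) ? to_epoch : grace_epoch;
  assert(resolved_from <= resolved_to);
  return {resolved_from, resolved_to};
}

// A negative filter accepts every node; otherwise only the named node passes.
inline bool node_passes_filter(int32_t node_filter, uint16_t node) {
  return node_filter < 0 || static_cast<int32_t>(node) == node_filter;
}
```

A parallel scan could then create one cursor per node, each with a different node_filter, so that the cursors never return the same record twice.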

Definition at line 50 of file sequential_cursor.cpp.

References ASSERT_ND, foedus::xct::XctManager::get_current_grace_epoch(), foedus::xct::Xct::get_isolation_level(), foedus::Engine::get_xct_manager(), foedus::Epoch::is_valid(), foedus::storage::kPageSize, and foedus::xct::kSnapshot.

   : context_(context),
     xct_(&context->get_current_xct()),
     engine_(context->get_engine()),
     resolver_(engine_->get_memory_manager()->get_global_volatile_page_resolver()),
     storage_(storage),
     from_epoch_(
       from_epoch.is_valid() ? from_epoch : engine_->get_savepoint_manager()->get_earliest_epoch()),
     to_epoch_(
       to_epoch.is_valid() ? to_epoch : engine_->get_xct_manager()->get_current_grace_epoch()),
     latest_snapshot_epoch_(engine_->get_snapshot_manager()->get_snapshot_epoch()),
     from_epoch_volatile_(max_from_epoch_snapshot_epoch(from_epoch_, latest_snapshot_epoch_)),
     node_filter_(node_filter),
     node_count_(engine_->get_soc_count()),
     order_mode_(order_mode),
     buffer_(reinterpret_cast<SequentialRecordBatch*>(buffer)),
     buffer_size_(buffer_size),
     buffer_pages_(buffer_size / kPageSize) {
   ASSERT_ND(buffer_size >= kPageSize);
   current_node_ = 0;
   finished_snapshots_ = false;
   finished_safe_volatiles_ = false;
   finished_unsafe_volatiles_ = false;
   states_.clear();
 
   grace_epoch_ = engine_->get_xct_manager()->get_current_grace_epoch();
   ASSERT_ND(from_epoch_.is_valid());
   ASSERT_ND(to_epoch_.is_valid());
   ASSERT_ND(from_epoch_ <= to_epoch_);
 
   if (xct_->get_isolation_level() == xct::kSnapshot
     || (latest_snapshot_epoch_.is_valid() && to_epoch_ <= latest_snapshot_epoch_)) {
     snapshot_only_ = true;
     safe_epoch_only_ = true;
     finished_safe_volatiles_ = true;
     finished_unsafe_volatiles_ = true;
   } else {
     snapshot_only_ = false;
     if (to_epoch_ <= grace_epoch_) {
       safe_epoch_only_ = true;
       // We do NOT rule out reading unsafe pages yet.
       // Even a safe page might be conservatively deemed as unsafe in our logic,
       // so we must make sure we go on to unsafe-volatile phase too.
       // Real-check happens in next_batch_unsafe_volatiles().
       // finished_unsafe_volatiles_ = true;
     } else {
       // only in this case, we have to take a lock
       safe_epoch_only_ = false;
     }
   }
 
   if (!latest_snapshot_epoch_.is_valid() || latest_snapshot_epoch_ < from_epoch_) {
     finished_snapshots_ = true;
   }
 }
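The branching at the end of the constructor can be restated as a pure function, which makes the possible combinations easier to check. This is a sketch under simplifying assumptions: epochs are plain integers with 0 meaning invalid, and ScanPlan and plan_scan are hypothetical names that do not exist in FOEDUS.

```cpp
#include <cassert>
#include <cstdint>

using EpochInt = uint32_t;                // simplified stand-in for foedus::Epoch
constexpr EpochInt kInvalidEpochInt = 0;  // stand-in for an invalid Epoch (assumption)

// The flags the constructor derives from its inputs.
struct ScanPlan {
  bool snapshot_only;
  bool safe_epoch_only;
  bool finished_snapshots;
  bool finished_safe_volatiles;
  bool finished_unsafe_volatiles;
};

inline ScanPlan plan_scan(bool snapshot_isolation, EpochInt from_epoch, EpochInt to_epoch,
                          EpochInt grace_epoch, EpochInt latest_snapshot_epoch) {
  assert(from_epoch <= to_epoch);
  ScanPlan plan = {false, false, false, false, false};
  const bool has_snapshot = latest_snapshot_epoch != kInvalidEpochInt;
  if (snapshot_isolation || (has_snapshot && to_epoch <= latest_snapshot_epoch)) {
    // Everything requested is already in snapshot pages: skip both volatile phases.
    plan.snapshot_only = true;
    plan.safe_epoch_only = true;
    plan.finished_safe_volatiles = true;
    plan.finished_unsafe_volatiles = true;
  } else {
    // Volatile pages are needed; unsafe epochs are involved only if the
    // exclusive end goes beyond the grace epoch.
    plan.safe_epoch_only = (to_epoch <= grace_epoch);
  }
  if (!has_snapshot || latest_snapshot_epoch < from_epoch) {
    // No snapshot page can contain a requested epoch: skip the snapshot phase.
    plan.finished_snapshots = true;
  }
  return plan;
}
```

Note that, as in the constructor, safe_epoch_only does not mark the unsafe-volatile phase as finished, because a safe page might still be conservatively treated as unsafe later.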


foedus::storage::sequential::SequentialCursor::~SequentialCursor ( )

Definition at line 114 of file sequential_cursor.cpp.

                                     {
   states_.clear();
 }

Member Function Documentation

thread::Thread* foedus::storage::sequential::SequentialCursor::get_context ( ) const

inline

Definition at line 142 of file sequential_cursor.hpp.

   { return context_; }

Epoch foedus::storage::sequential::SequentialCursor::get_from_epoch ( ) const

inline

Returns: Inclusive beginning of epochs to read.

Definition at line 146 of file sequential_cursor.hpp.

Referenced by foedus::storage::sequential::operator<<().

   { return from_epoch_; }

const sequential::SequentialStorage& foedus::storage::sequential::SequentialCursor::get_storage ( ) const

inline

Definition at line 143 of file sequential_cursor.hpp.

Referenced by foedus::storage::sequential::operator<<().

   { return storage_; }

Epoch foedus::storage::sequential::SequentialCursor::get_to_epoch ( ) const

inline

Returns: Exclusive end of epochs to read.

Definition at line 148 of file sequential_cursor.hpp.

Referenced by foedus::storage::sequential::operator<<().

   { return to_epoch_; }

bool foedus::storage::sequential::SequentialCursor::is_finished_safe_volatiles ( ) const

inline

Definition at line 170 of file sequential_cursor.hpp.

   { return finished_safe_volatiles_; }

bool foedus::storage::sequential::SequentialCursor::is_finished_snapshots ( ) const

inline

The following are rather implementation details, used only from testcases.

Definition at line 169 of file sequential_cursor.hpp.

   { return finished_snapshots_; }

bool foedus::storage::sequential::SequentialCursor::is_finished_unsafe_volatiles ( ) const

inline

Definition at line 171 of file sequential_cursor.hpp.

   { return finished_unsafe_volatiles_; }

bool foedus::storage::sequential::SequentialCursor::is_valid ( ) const

inline

Returns: false if there is no chance that this cursor returns any more records. In very rare cases, this might return true even though there are no more matching records.

Definition at line 164 of file sequential_cursor.hpp.

Referenced by next_batch().

                              {
     return !(finished_snapshots_ && finished_safe_volatiles_ && finished_unsafe_volatiles_);
   }


ErrorCode foedus::storage::sequential::SequentialCursor::next_batch ( SequentialRecordIterator * out )

Returns a batch of records as an iterator.

Parameters

[out]	out	An iterator over the returned records.

It might return an empty batch even when this cursor has more records to return. Invoke is_valid() to check whether more batches may follow. This method does nothing if is_valid() is already false. Each batch is guaranteed to be from one node, and actually from one page.

Definition at line 159 of file sequential_cursor.cpp.

References ASSERT_ND, CHECK_ERROR_CODE, is_valid(), foedus::kErrorCodeOk, and foedus::storage::sequential::SequentialRecordIterator::reset().

                                                                     {
   out->reset();
   if (states_.empty()) {
     CHECK_ERROR_CODE(init_states());
   }
 
   bool found = false;
   if (!finished_snapshots_) {
     CHECK_ERROR_CODE(next_batch_snapshot(out, &found));
     if (found) {
       return kErrorCodeOk;
     } else {
       DVLOG(1) << "Finished reading snapshot pages:";
       DVLOG(2) << *this;
       ASSERT_ND(finished_snapshots_);
       refresh_grace_epoch();
     }
   }
 
   if (!finished_safe_volatiles_) {
     CHECK_ERROR_CODE(next_batch_safe_volatiles(out, &found));
     if (found) {
       return kErrorCodeOk;
     } else {
       DVLOG(1) << "Finished reading safe volatile pages:";
       DVLOG(2) << *this;
       ASSERT_ND(finished_safe_volatiles_);
       refresh_grace_epoch();
     }
   }
 
   if (!finished_unsafe_volatiles_) {
     CHECK_ERROR_CODE(next_batch_unsafe_volatiles(out, &found));
     if (found) {
       return kErrorCodeOk;
     } else {
       DVLOG(1) << "Finished reading unsafe volatile pages:";
       DVLOG(2) << *this;
       ASSERT_ND(finished_unsafe_volatiles_);
     }
   }
 
   ASSERT_ND(!is_valid());
   return kErrorCodeOk;
 }
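The three-phase fall-through of next_batch() (snapshot pages, then safe volatile pages, then unsafe volatile pages) can be modeled with in-memory data. The following is a simplified sketch rather than the real implementation: PhasedCursorSketch and drain are hypothetical names, and batches are plain vectors of strings instead of pages.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of the phase structure: index 0 = snapshot,
// 1 = safe volatile, 2 = unsafe volatile.
class PhasedCursorSketch {
 public:
  using Batch = std::vector<std::string>;

  PhasedCursorSketch(std::vector<Batch> snapshot, std::vector<Batch> safe_volatile,
                     std::vector<Batch> unsafe_volatile) {
    phases_[0] = std::move(snapshot);
    phases_[1] = std::move(safe_volatile);
    phases_[2] = std::move(unsafe_volatile);
  }

  // Same shape as SequentialCursor::is_valid(): false only when all phases are done.
  bool is_valid() const { return !(finished_[0] && finished_[1] && finished_[2]); }

  // Returns the next batch from the first unfinished phase. When a phase has
  // nothing left, it is marked finished and the next phase is tried.
  void next_batch(Batch* out) {
    out->clear();
    for (int phase = 0; phase < 3; ++phase) {
      if (finished_[phase]) continue;
      if (pos_[phase] < phases_[phase].size()) {
        *out = phases_[phase][pos_[phase]++];
        return;
      }
      finished_[phase] = true;
    }
    assert(!is_valid());  // every phase is exhausted; *out stays empty
  }

 private:
  std::vector<Batch> phases_[3];
  std::size_t pos_[3] = {0, 0, 0};
  bool finished_[3] = {false, false, false};
};

// The caller-side loop from the class example: keep asking while is_valid().
inline std::vector<std::string> drain(PhasedCursorSketch* cursor) {
  std::vector<std::string> all;
  PhasedCursorSketch::Batch batch;
  while (cursor->is_valid()) {
    cursor->next_batch(&batch);
    all.insert(all.end(), batch.begin(), batch.end());
  }
  return all;
}
```

The last call in drain() returns an empty batch and flips is_valid() to false, which is why callers should loop on is_valid() rather than on whether a batch is empty.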


Friends And Related Function Documentation

std::ostream& operator<<	(	std::ostream &	o,
		const SequentialCursor &	v
	)

friend

Definition at line 652 of file sequential_cursor.cpp.

                                                                  {
   o << "<SequentialCursor>" << std::endl;
   o << "  " << v.get_storage() << std::endl;
   o << "  <from_epoch>" << v.get_from_epoch() << "</from_epoch>" << std::endl;
   o << "  <to_epoch>" << v.get_to_epoch() << "</to_epoch>" << std::endl;
   o << "  <order_mode>" << v.order_mode_ << "</order_mode>" << std::endl;
   o << "  <node_filter>" << v.node_filter_ << "</node_filter>" << std::endl;
   o << "  <snapshot_only_>" << v.snapshot_only_ << "</snapshot_only_>" << std::endl;
   o << "  <safe_epoch_only_>" << v.safe_epoch_only_ << "</safe_epoch_only_>" << std::endl;
   o << "  <buffer_>" << v.buffer_ << "</buffer_>" << std::endl;
   o << "  <buffer_size>" << v.buffer_size_ << "</buffer_size>" << std::endl;
   o << "  <buffer_pages_>" << v.buffer_pages_ << "</buffer_pages_>" << std::endl;
   o << "  <current_node_>" << v.current_node_ << "</current_node_>" << std::endl;
   o << "  <finished_snapshots_>" << v.finished_snapshots_ << "</finished_snapshots_>" << std::endl;
   o << "  <finished_safe_volatiles_>" << v.finished_safe_volatiles_
     << "</finished_safe_volatiles_>" << std::endl;
   o << "  <finished_unsafe_volatiles_>" << v.finished_unsafe_volatiles_
     << "</finished_unsafe_volatiles_>" << std::endl;
   o << "</SequentialCursor>";
   return o;
 }

The documentation for this class was generated from the following files:

/home/shino/foedus_code/foedus-core/include/foedus/storage/sequential/sequential_cursor.hpp
/home/shino/foedus_code/foedus-core/src/foedus/storage/sequential/sequential_cursor.cpp
