libfoedus-core
FOEDUS Core Library
Sequential Storage

Sequential Storage, an append/scan only data structure. More...

Detailed Description

Sequential Storage, an append/scan only data structure.

This data structure is also called Heap in database literature, but probably not a good naming (Heap usually means min/max Heap, which is totally different). So, we call this data structure as Sequential Storage.

Basic Structure and Supported Operations

A sequential storage is VERY simple thus efficient. The only access pattern is append and scan, which makes everything simple. Further, the scan does not provide any guarantee in ordering (similar to Heap in DBMS). It just returns records in the order the underlying storage natively stores them.

Page Hierarchy

There are only two types of pages; root pages (foedus::storage::sequential::SequentialRootPage) and data pages (foedus::storage::sequential::SequentialPage).

Root pages are merely a set of pointers to head pages, which are the beginnings of singly-linked list of data pages.

All contents of root pages are stable. They are never dynamically changed. The volatile part of Sequential Storage is maintained as a set of singly-linked list pointed directly from the storage object, so there is no root page for it.

Data pages contain records that are contiguously placed each other. They form a singly linked list starting from a head page.

For each snapshotting, we add a new set of head pages (one for each node) and pointers to them from the root page. As root pages are singly-linked list, we re-write entire root pages for every sequential storage that had some change for each snapshotting (*). Unless there are huge number of head pages, this should be a negligible overhead.

(*) We only are placing a few head page pointers at the last root page only, but that means we have to install a new version of the last root page, which requires installing a new version of the second-to-last root page just for changing the next pointer, which requires... But, again, most sequential storage's root page should be just one or two pages. A similar overhead happens to all other storage types, too.

Page Layout

Both data pages and root pages form a singly linked list of pages. The next pointer is stored in the header part.

Layout of Data Page (foedus::storage::sequential::SequentialPage)

Fix-Sized HEADER (kHeaderSize bytes)
Record Data part, which grows forward
Unused part
Record Lengthes part, which grows backward

Layout of Root Page (foedus::storage::sequential::SequentialRootPage)

Fix-Sized HEADER (kRootPageHeaderSize bytes)
Pointers to head pages..

In-memory List

As described above, the volatile part of sequential storage is a set of singly-linked list pointed directly from the storage object. The storage object maintains such in-memory list for each epoch. When snapshotting happens, it retires those lists whose epochs are upto the snapshot's "until" epoch.

Because we have to efficiently append to the in-memory list, we maintain head and tail pointers to each list. Updating the tail page and replacing tail pointers involve a few atomic operations, but no expensive locks as done in other storages. For more details, see SequentialVolatileList.

Collaboration diagram for Sequential Storage:

Files

file  fwd.hpp
 Forward declarations of classes in sequential storage package.
 
file  sequential_id.hpp
 Definitions of IDs in this package and a few related constant values.
 
file  sequential_log_types.hpp
 Declares all log types used in this storage type.
 

Classes

struct  foedus::storage::sequential::SequentialComposer::RootInfoPage
 Output of one compose() call, which are then combined in construct_root(). More...
 
class  foedus::storage::sequential::SequentialComposer
 Composer for an sequential storage. More...
 
class  foedus::storage::sequential::SequentialCursor
 A cursor interface to read tuples from a sequential storage. More...
 
struct  foedus::storage::sequential::SequentialRecordBatch
 A chunk of records returned by SequentialCursor. More...
 
class  foedus::storage::sequential::SequentialRecordIterator
 Iterator for one SequentialRecordBatch, or a page. More...
 
struct  foedus::storage::sequential::HeadPagePointer
 Each pointer to a snapshot head page comes with a bit more information to help reading. More...
 
struct  foedus::storage::sequential::SequentialCreateLogType
 Log type of CREATE SEQUENTIAL STORAGE operation. More...
 
struct  foedus::storage::sequential::SequentialTruncateLogType
 Log type of TRUNCATE SEQUENTIAL STORAGE operation. More...
 
struct  foedus::storage::sequential::SequentialMetadata
 Metadata of a sequential storage. More...
 
class  foedus::storage::sequential::SequentialPage
 Represents one data page in Sequential Storage. More...
 
class  foedus::storage::sequential::SequentialRootPage
 Represents one stable root page in Sequential Storage. More...
 
class  foedus::storage::sequential::SequentialPartitioner
 Partitioner for an sequential storage. More...
 
class  foedus::storage::sequential::SequentialStorage
 Represents an append/scan-only store. More...
 
struct  foedus::storage::sequential::SequentialStoragePimpl::PointerPage
 
class  foedus::storage::sequential::SequentialStoragePimpl
 Lock-free list of records stored in the volatile part of sequential storage. More...
 

Variables

const uint16_t foedus::storage::sequential::kMaxSlots = 1 << 15
 We have to represent the record count in 15 bits. More...
 
const uint16_t foedus::storage::sequential::kHeaderSize = 64
 Byte size of header in each data page of sequential storage. More...
 
const uint16_t foedus::storage::sequential::kDataSize = foedus::storage::kPageSize - kHeaderSize
 Byte size of data region in each data page of sequential storage. More...
 
const uint16_t foedus::storage::sequential::kMaxPayload = kDataSize
 Payload must be shorter than this length. More...
 
const uint16_t foedus::storage::sequential::kRootPageHeaderSize = 56
 Byte size of header in each root page of sequential storage. More...
 
const uint16_t foedus::storage::sequential::kRootPageMaxHeadPointers = (foedus::storage::kPageSize - kRootPageHeaderSize) / sizeof(HeadPagePointer)
 Maximum number of head pointers in one root page. More...
 
const uint16_t foedus::storage::sequential::kRootPageDataSize = foedus::storage::kPageSize - kRootPageHeaderSize
 Byte size of data region in each root page of sequential storage. More...
 
const uint16_t foedus::storage::sequential::kPointerPageCount = 1U << 6
 Each poiner page can contain 2^10 pointers (as the node is implicit, PagePoolOffset suffices) and we can have at most 2^16 cores. More...
 

Class Documentation

struct foedus::storage::sequential::SequentialComposer::RootInfoPage

Output of one compose() call, which are then combined in construct_root().

Each compose() returns just one pointer to a head page.

Definition at line 66 of file sequential_composer_impl.hpp.

Collaboration diagram for foedus::storage::sequential::SequentialComposer::RootInfoPage:
Class Members
char filler_[kPageSize-sizeof(PageHeader)-sizeof(HeadPagePointer)]
PageHeader header_
HeadPagePointer pointer_
struct foedus::storage::sequential::HeadPagePointer

Each pointer to a snapshot head page comes with a bit more information to help reading.

Definition at line 66 of file sequential_id.hpp.

Collaboration diagram for foedus::storage::sequential::HeadPagePointer:
Class Members
Epoch from_epoch_ Inclusive beginning of epochs in the pointed pages.
uint64_t page_count_ In a sequential storage, all pages from the head page is guaranteed to be contiguous (that's how SequentialComposer write them out).

This enables us to read a large number of sequential pages in one-go rather than reading one-page at a time.

SnapshotPagePointer page_id_ ID of the page that begins the linked list.
Epoch to_epoch_ Exclusive end of epochs in the pointed pages.
struct foedus::storage::sequential::SequentialStoragePimpl::PointerPage

Definition at line 141 of file sequential_storage_pimpl.hpp.

Class Members
PagePoolOffset pointers_[kPointersPerPage]

Variable Documentation

const uint16_t foedus::storage::sequential::kHeaderSize = 64

Byte size of header in each data page of sequential storage.

Definition at line 43 of file sequential_id.hpp.

const uint16_t foedus::storage::sequential::kMaxPayload = kDataSize
const uint16_t foedus::storage::sequential::kMaxSlots = 1 << 15

We have to represent the record count in 15 bits.

Definition at line 38 of file sequential_id.hpp.

Referenced by foedus::storage::sequential::SequentialPage::append_record_nosync(), and foedus::storage::sequential::SequentialPage::can_insert_record().

const uint16_t foedus::storage::sequential::kPointerPageCount = 1U << 6

Each poiner page can contain 2^10 pointers (as the node is implicit, PagePoolOffset suffices) and we can have at most 2^16 cores.

Thus we have 2^6 pointers here. This means we can waste 64*2=128 volatile pages (=512kb) per one sequential storage.. shouldn't be a big issue.

Definition at line 104 of file sequential_id.hpp.

const uint16_t foedus::storage::sequential::kRootPageDataSize = foedus::storage::kPageSize - kRootPageHeaderSize

Byte size of data region in each root page of sequential storage.

Definition at line 95 of file sequential_id.hpp.

const uint16_t foedus::storage::sequential::kRootPageHeaderSize = 56

Byte size of header in each root page of sequential storage.

Definition at line 60 of file sequential_id.hpp.

const uint16_t foedus::storage::sequential::kRootPageMaxHeadPointers = (foedus::storage::kPageSize - kRootPageHeaderSize) / sizeof(HeadPagePointer)

Maximum number of head pointers in one root page.

Definition at line 89 of file sequential_id.hpp.

Referenced by foedus::storage::sequential::SequentialComposer::construct_root(), and foedus::storage::sequential::SequentialRootPage::set_pointers().