
Contributing to TAOBench

Adding a New Adapter Layer

TAOBench uses a database interface layer to translate requests made through TAO’s API into database-specific requests. To add support for a new database, implement this interface layer (benchmark::DB) for that database. The crdb, mysqldb, and similar directories contain existing implementations to use as examples.

TAO’s API:

  • read(key): Read a record
  • read_txn(keys): Read a group of records atomically
  • write(key,[preconditions]): Write to a record, optionally with a set of preconditions
  • write_txn(keys,[preconditions]): Write to a group of records atomically, optionally with a set of preconditions

1) Create adapter directory

Put each new adapter in its own directory. Add its source files to CMakeLists.txt. Because adapter layers are compiled into the benchmark executable, build them conditionally on a CMake option.

option(WITH_CRDB "Build the CockroachDB adapter" OFF)
# ...
if(WITH_CRDB)
  include_directories(crdb)
  target_sources(benchmark PRIVATE
    crdb/crdb_db.h
    crdb/crdb_db.cc)
  target_link_libraries(benchmark -lpqxx)
  target_link_libraries(benchmark -lpq)
endif()

2) Create derived class

Create a class that extends the benchmark::DB class. The file src/db.h contains this class, with descriptions of the methods that must be implemented. Broadly, there are three functionalities to be implemented: (a) connection handling, (b) batch insert + read, and (c) queries/transactions.

For (a), implement the Init() and Cleanup() methods. These manage the state of each client thread, e.g. connection handling. You may want to define some properties in order to pass connection details; properties can be accessed from the props_ variable.
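
As a rough illustration of (a), the sketch below opens a per-thread connection in Init() and releases it in Cleanup(). The Properties and Connection types, the Get helper, and the exampledb.host / exampledb.port property names are all hypothetical stand-ins; the real type behind props_ and its accessors are defined in the TAOBench sources and may differ.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for the benchmark's property bag behind props_.
struct Properties {
  std::map<std::string, std::string> values;

  // Returns the value for key, or fallback when the key is not set.
  std::string Get(const std::string &key, const std::string &fallback) const {
    auto it = values.find(key);
    return it == values.end() ? fallback : it->second;
  }
};

// Hypothetical connection handle; a real adapter would wrap a client-library object.
struct Connection {
  std::string host;
  int port;
};

class ExampleDB {
 public:
  explicit ExampleDB(const Properties *props) : props_(props) {}

  // Sets up per-thread state: here, one connection built from properties.
  void Init() {
    std::string host = props_->Get("exampledb.host", "localhost");
    int port = std::stoi(props_->Get("exampledb.port", "5432"));
    conn_ = std::make_unique<Connection>(Connection{host, port});
  }

  // Releases the per-thread state created in Init().
  void Cleanup() { conn_.reset(); }

  bool connected() const { return conn_ != nullptr; }
  const Connection *connection() const { return conn_.get(); }

 private:
  const Properties *props_;           // not owned
  std::unique_ptr<Connection> conn_;  // owned, one per DB instance
};
```

Keeping the connection as a member of the DB instance fits the per-thread model described above, since each client thread works with its own instance.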

Methods for (b) are needed to support batch inserts during the load phase and batch reads that execute at the beginning of the run phase.
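
The batch read in (b) is documented in src/db.h as returning the first n keys, in sorted order, from an open key interval. The sketch below shows that range logic over an in-memory ordered map. It is an illustration under assumptions: Key is simplified to a single integer rather than std::vector<Field>, and BatchReadKeys is a hypothetical helper, not part of TAOBench.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical simplification: a key is one integer instead of std::vector<Field>.
using Key = long;

// Returns up to n keys strictly between floor_key and ceiling_key, ascending.
std::vector<Key> BatchReadKeys(const std::map<Key, std::string> &table,
                               Key floor_key, Key ceiling_key, int n) {
  std::vector<Key> out;
  // upper_bound skips floor_key itself, which makes the lower end open;
  // the "< ceiling_key" test makes the upper end open.
  for (auto it = table.upper_bound(floor_key);
       it != table.end() && it->first < ceiling_key &&
       static_cast<int>(out.size()) < n;
       ++it) {
    out.push_back(it->first);
  }
  return out;
}
```

With an open interval, a caller could plausibly page through a table by passing the last key returned as the next floor_key without reading that key twice. In a SQL-backed adapter the same logic would typically become a range predicate with ORDER BY and LIMIT.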

Most of the work will be done in (c), which involves point queries and transactions. A few pointers:

  • The benchmark assumes that the objects and edges tables have already been created. The database interface must support queries to both tables (the DB_Operation passed to each method indicates which table should be accessed).
  • There are four types of edges: unique, bidirectional, unique_and_bidirectional, and other. During edge inserts, requests to the database should respect the semantics of these edge types. Uniqueness means that only one id2 can exist for each id1. Bidirectionality determines whether the inverse edge (from id2 back to id1) exists.
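
To make the edge-type rules above concrete, here is a small in-memory sketch of an insert check. It encodes one reading of the uniqueness rule: an id1 that has a unique edge may not have any other outgoing edge. It does not try to enforce bidirectionality and only exposes a query for whether the inverse edge exists. EdgeStore and its methods are hypothetical; the existing adapters are the authoritative reference for the exact conditions.

```cpp
#include <map>

enum class EdgeType { Unique, Bidirectional, UniqueAndBidirectional, Other };

bool IsUnique(EdgeType t) {
  return t == EdgeType::Unique || t == EdgeType::UniqueAndBidirectional;
}

bool IsBidirectional(EdgeType t) {
  return t == EdgeType::Bidirectional || t == EdgeType::UniqueAndBidirectional;
}

// Hypothetical in-memory edge table: id1 -> (id2 -> type).
class EdgeStore {
 public:
  // Inserts the edge id1 -> id2 and returns true, or returns false when the
  // edge already exists or the insert would break uniqueness for id1.
  bool Insert(long id1, long id2, EdgeType type) {
    auto &targets = out_[id1];
    if (targets.count(id2) > 0) return false;  // duplicate edge
    if (!targets.empty()) {
      // A new unique edge cannot join existing edges from id1 ...
      if (IsUnique(type)) return false;
      // ... and nothing can join an existing unique edge from id1.
      for (const auto &entry : targets) {
        if (IsUnique(entry.second)) return false;
      }
    }
    targets.emplace(id2, type);
    return true;
  }

  // Reports whether the inverse edge id2 -> id1 is present.
  bool HasInverse(long id1, long id2) const {
    auto it = out_.find(id2);
    return it != out_.end() && it->second.count(id1) > 0;
  }

 private:
  std::map<long, std::map<long, EdgeType>> out_;
};
```

In a real adapter these checks would need to run inside the same transaction as the insert, so that concurrent clients cannot both pass the check and then violate the constraint.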

Here's a cut-down version of the DB class:

///
/// Database interface layer.
/// per-thread DB instance.
///
class DB {
  // ...

  /// Initializes any state for accessing this DB.
  virtual void Init() { }

  /// Clears any state for accessing this DB.
  virtual void Cleanup() { }

  /// Reads a record from the database.
  virtual Status Read(DataTable table, const std::vector<Field> &key,
                      std::vector<TimestampValue> &buffer) = 0;

  /// Updates a record in @param table for @param key with @param value
  virtual Status Update(DataTable table, const std::vector<Field> &key,
                        TimestampValue const &value) = 0;

  /// Inserts a record in @param table for @param key with @param value.
  virtual Status Insert(DataTable table, const std::vector<Field> &key,
                        TimestampValue const &value) = 0;

  /// Deletes a record from the database.
  virtual Status Delete(DataTable table, const std::vector<Field> &key,
                        TimestampValue const &value) = 0;

  /// Execute a single operation (READ, INSERT, UPDATE, DELETE)
  virtual Status Execute(const DB_Operation &operation,
                         std::vector<TimestampValue> &read_buffer, // for reads
                         bool txn_op = false) = 0;

  /// @param operations vector of operations to be completed as one transaction
  virtual Status ExecuteTransaction(const std::vector<DB_Operation> &operations,
                                    std::vector<TimestampValue> &read_buffer,
                                    bool read_only) = 0;

  /// Insert records in @param table for @param keys with @param values
  virtual Status BatchInsert(DataTable table,
                             const std::vector<std::vector<Field>> &keys,
                             std::vector<TimestampValue> const &values) = 0;

  /// This method reads the first @n keys (or all of them, whichever is smaller) from @param table
  /// in the OPEN interval (@param floor_key, @param ceiling_key) and writes them to @param key_buffer
  /// in sorted order.
  virtual Status BatchRead(DataTable table, const std::vector<Field> &floor_key,
                           const std::vector<Field> &ceiling_key,
                           int n, std::vector<std::vector<Field>> &key_buffer) = 0;

  /// ...
};
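
To show the shape of a derived class, the sketch below overrides two of the methods against an in-memory map. It is self-contained, so Status, DataTable, Key, TimestampValue, and the reduced DB base are hypothetical stand-ins, not the definitions in src/db.h. A real adapter would include the TAOBench headers, override every pure virtual method, and issue database requests instead.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// --- Hypothetical stand-ins for the types declared in the TAOBench sources ---
enum class Status { kOK, kNotFound, kError };
enum class DataTable { Objects = 0, Edges = 1 };
using Key = std::vector<int64_t>;  // stands in for std::vector<Field>
struct TimestampValue {
  int64_t timestamp;
  std::string value;
};

// Reduced stand-in for the DB base class, limited to two operations.
class DB {
 public:
  virtual ~DB() = default;
  virtual void Init() {}
  virtual void Cleanup() {}
  virtual Status Read(DataTable table, const Key &key,
                      std::vector<TimestampValue> &buffer) = 0;
  virtual Status Insert(DataTable table, const Key &key,
                        const TimestampValue &value) = 0;
};

// Example adapter that keeps both tables in memory.
class InMemoryDB : public DB {
 public:
  // Appends the stored record to buffer, or reports that the key is absent.
  Status Read(DataTable table, const Key &key,
              std::vector<TimestampValue> &buffer) override {
    const auto &rows = tables_[static_cast<int>(table)];
    auto it = rows.find(key);
    if (it == rows.end()) return Status::kNotFound;
    buffer.push_back(it->second);
    return Status::kOK;
  }

  // Stores a new record; inserting an existing key is reported as an error.
  Status Insert(DataTable table, const Key &key,
                const TimestampValue &value) override {
    auto &rows = tables_[static_cast<int>(table)];
    return rows.emplace(key, value).second ? Status::kOK : Status::kError;
  }

 private:
  std::map<Key, TimestampValue> tables_[2];  // indexed by DataTable
};
```

Marking each method with override lets the compiler flag any signature that drifts from the base class, which is useful when the interface has this many similar-looking methods.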

3) Register your class

Finally, ensure that the benchmark is aware of your newly created database interface. This can be done by calling DBFactory::RegisterDB and assigning its return value to a global constant in your adapter's source file.

DB *NewExampleDB() { return new ExampleDB; }

const bool registered = DBFactory::RegisterDB("exampledb", NewExampleDB);