UP | HOME

ZIO Tensors

Table of Contents

1 Motivation and requirements

In some cases, we want to easily send tensors between components. We want to do this simply and generally so that disparate endpoints can communicate without knowing details of each other but so that tensors may be faithfully transmitted. The requirements, needs and allowances are:

  • no tensor operations provided
  • faithfully transmit tensor
  • minimize memory copy both internally and at the interface with application code.
  • allow for possible compression and/or sparse representation
  • support multiple tensors per message
  • allow spreading single tensor across message
  • allow association of application metadata
  • allow other message forms to coexist with tensor message forms.

Thus, at a minimum the message representation shall include:

  • conceptual tensor metadata (eg, rank, shape, element type)
  • memory layout metadata (order, compression, sparseness)
  • the packed tensor element data itself
  • indexing of one tensor in a larger set of tensors
  • means to associate application metadata to tensors.

2 Zio TENS message form

The ZIO TENS message format attempts to provide the above requirements. A TENS message may have a ZIO message form header value of TENS however, an application may choose to include the general TENS form in other messages as long as the rest of the specification is followed. A compatible form is FLOW (which requires the form to be FLOW).

As with other forms, the zio::Message header prefix attribute label shall hold structured data as an object encoded as a JSON string. A TENS message may utilize zero or more parts of the zio::Message payload to hold packed arrays of elements, one for each given tensor.

The label object shall have a top-level attribute key TENS which holds all information required to reconstitute the tensors stored in the message except the element data itself.

The value of the TENS object shall have two attributes. Expressed as JSON they are:

{
  "TENS": {
    "tensors": [ ... ],
    "metadata": { ... },
   }
}

2.1 The "tensors" attribute

The "tensors" item holds an array of tensor description objects. For example:

{
  "tensors": [
    {"shape":[6000,800], "word":4, "dtype":"f", "part":1},
    {"shape":[6000,800], "word":4, "dtype":"f", "part":2},
    {"shape":[6000,960], "word":4, "dtype":"f", "part":0},
  ],

Thus, each tensor is described with an object that must follow the schema with attributes given below. In doing so, the description adopt the conceptual model of Boost multi array and other tensor packages. Optional attributes with defaults may be omitted.

Required:

shape
vector of integers of size \(\mathcal{N}_{dim}\), gives the size of each dimension.
word
integer, gives the number of bytes used by each element of the tensor.
dtype
string, indicate the data type of the elements (for numbers, matches the Numpy character: f, i, u, etc).

Optional:

part
integer, the message payload part index containing the packed tensor element array associated with this metadata object. Defaults to using same payload index as TENS.tensors array index.
order
vector of integers of size \(\mathcal{N}_{dim}\), gives the storage order. Defaults to C ordering.
ascend
vector of Boolean of size \(\mathcal{N}_{dim}\), gives true if dimension is ascending. Defaults to all true.
metadata
a single level object holding scalar attributes.

Reserved for possible future use:

packing
string, an optional description of how the tensor elements are packed. If omitted, the default is "dense". Future extensions may use this to indicate a sparse and/or compressed packing.
pointer
integer, optional, indicates a memory location holding the tensor array instead of it being delivered in the payload.

The application may extend these objects but must not utilize the reserved attributes described next. The TENS format may extend the list of reserved attributes in the future.

2.2 The "metadata" attribute

The "metadata" item in the "TENS" attribute may hold any arbitrary JSON object. Any interpretation is application dependent.

3 Application extension

Application may extend a TENS message form in a number of ways:

  • The four byte message form may be other than TENS.
  • the label object may have additional attributes besides TENS.
  • the TENS JSON object may be extended by novel keys other than the reserved keys listed above. Note the TENS.tensors array may not be extended.
  • message payload parts other than those referenced by TENS.tensors[].part attribute may be used for purposes other than TENS.

4 C++ API

The specification above defines a TENS message. The zio/tens.hpp C++ API provides some support for handling TENS messages. See test/test_tens.cpp in the ZIO source for an example of this API being exercised.

5 Use with Boost Multi Array

As said above, the TENS message form is sympathetic with Boost Multi Array (manual and reference). ZIO does not provide tests against Boost in order to minimize dependency but an application may try something like the following to cast a TENS part into a Boost MA:

zio::Message msg;
const zio::message_t& data = zio::tens::at(msg, 0);
auto tens = msg.label_object()["TENS"]["tensors"][0];
auto shape = tens["shape"];
assert(shape.size() == 3);
const float* fdata = (const float*)data.data();
const size_t n1=shape[0].get<size_t>();
const size_t n2=shape[1].get<size_t>();
const size_t n3=shape[2].get<size_t>();
boost::const_multi_array<float, 3> tensor(fdata, boost::extents[n1][n2][n3]);

And, likewise one can fill a message part with something like:

zio::Message msg;
std::vector<size_t> shape={n1,n2,n3};
zio::tens::append(msg, tensor.data(), shape);