Trying strange things out on neural networks.
I’ve been working on implementing this paper, with some adjustments. I first implemented it at the Python level of TensorFlow, but that produced far too many tiny operations and I couldn’t get it numerically stable. I figured it would be difficult to finally get it working, and since it wasn’t going to scale very well anyway, I decided to implement a modification of the loss as a couple of custom ops. As the TensorFlow documentation on custom ops is not very informative, I wrote this up. It’s meant to be a complement to the official documentation and to the actual implementation I wrote, so it will focus on explaining snippets from the GitHub repository.
REGISTER_OP("AverageEmbeddingMask")
.Attr("T: {float32, float64}")
.Attr("S: {int32, int64}")
.Input("embedding: T")
.Input("rle_mask: S")
.Output("swarm_loss: float32")
.SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c){
  // Shape verification goes here; examples below.
  return Status::OK();
});
This registers the op with TensorFlow. The op we’re building has two inputs and one output. The embedding input has to be either a float or a double, and the rle_mask has to be either an int32 or an int64. We also fix the output type as a float. The current description of this is in the TensorFlow documentation. This is all I’ve needed to do with the attribute system so far; if I run into more issues I will update this later. The SetShapeFn, however, is where much of the initial verification for the op takes place. It also sets the output shape for ops further down the graph.
You have to define a lambda that computes the shape using the shape inference library. The shapes are computed through the InferenceContext object that is passed to the lambda. For example, if I wanted an op whose output dimensions are all but the last dimension of an input, I would use the following:
SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
  shape_inference::ShapeHandle loss_shape;
  TF_RETURN_IF_ERROR(c->Subshape(c->input(0), 0, -1, &loss_shape));
  c->set_output(0, loss_shape);
  return Status::OK();
});
To walk through this step by step: the first line is the function handle that REGISTER_OP calls to determine the output shapes, and the rest is the lambda. c is the inference context object, holding all the information needed to determine the type and shape of the input and output tensors. It holds input and output arrays of ShapeHandles. The input arrays are populated from this op’s inputs, but we need to populate the output arrays ourselves. To do this we create an empty ShapeHandle to hold the output of Subshape.
The InferenceContext has several functions attached to it that don’t interact with its internal data but do manipulate shapes, such as Subshape. This function takes the start and end of the range that you want to keep from the op’s first input, c->input(0), and writes the result into loss_shape. It actually returns a Status (defined in lib/core/status.h). TF_RETURN_IF_ERROR is the catch macro for TensorFlow’s internal error handling: if the Status is not OK, it immediately returns it from the enclosing function. We finally set the first output to have the same shape as loss_shape and return an OK status, indicating that there have been no errors.
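To make the early-return behavior concrete, here is a minimal mock of the pattern. These are stand-in classes of my own, not the real ones from lib/core/status.h, but the macro works the same way: evaluate the expression, and if the resulting status is not OK, propagate it to the caller immediately.

```cpp
#include <cassert>
#include <string>

// Illustrative stand-in for tensorflow::Status: an empty message means OK.
struct Status {
  std::string msg;
  bool ok() const { return msg.empty(); }
  static Status OK() { return Status{}; }
  static Status Error(const std::string& m) { return Status{m}; }
};

// Mock of the TF_RETURN_IF_ERROR pattern: bail out of the enclosing
// function as soon as any step reports a non-OK status.
#define TF_RETURN_IF_ERROR_MOCK(expr) \
  do {                                \
    Status _s = (expr);               \
    if (!_s.ok()) return _s;          \
  } while (0)

Status might_fail(bool fail) {
  return fail ? Status::Error("subshape failed") : Status::OK();
}

Status shape_fn(bool fail_first) {
  TF_RETURN_IF_ERROR_MOCK(might_fail(fail_first));  // early return on error
  return Status::OK();  // only reached if every step succeeded
}
```

The real macro in TensorFlow does the same thing, with extra bookkeeping for error messages.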
SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
  ShapeHandle input;
  TF_RETURN_IF_ERROR(c->WithRank(c->input(0), 4, &input));
  ShapeHandle rle_input;
  TF_RETURN_IF_ERROR(c->WithRank(c->input(1), 4, &rle_input));
  DimensionHandle rle_end_dim;
  TF_RETURN_IF_ERROR(c->WithValue(c->Dim(rle_input, 3), 2, &rle_end_dim));
  return Status::OK();
});
The first input is meant to hold an image whose pixels have each been transformed into an embedding: a batch dimension, then an image with an X and a Y dimension, each position holding N values in the embedding space. So the input has to be a rank-4 tensor. This is verified using the WithRank function, though note that it will also return successfully if the rank of the tensor is unknown. As this implementation is targeted at convex cells, I can use run-length encoding (hence the rle), since the cells’ masks are all Y-simple. So the second input needs to have shape [BatchSize, MaxMasks, DimX, 2], and we can verify the last entry with a DimensionHandle.
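To make the encoding concrete, here is a small sketch of how one mask’s run-length rows could be expanded back into a dense mask. The helper name is mine, not from the repository, and I am assuming each pair marks a half-open [y_start, y_end) span; a Y-simple cell has exactly one such span per X column.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not from the post's repo): expand one mask's
// run-length rows into a dense DimX x DimY boolean image. Entry
// runs[x] = {y_start, y_end} is assumed to mark the half-open span
// [y_start, y_end) covered by the cell in column x.
std::vector<std::vector<bool>> ExpandRleMask(
    const std::vector<std::pair<int, int>>& runs, int dim_y) {
  std::vector<std::vector<bool>> mask(runs.size(),
                                      std::vector<bool>(dim_y, false));
  for (std::size_t x = 0; x < runs.size(); ++x) {
    for (int y = runs[x].first; y < runs[x].second && y < dim_y; ++y) {
      if (y >= 0) mask[x][y] = true;  // inside the cell's span for column x
    }
  }
  return mask;
}
```

An empty span such as {0, 0} marks a column the cell does not touch, which is presumably how masks narrower than DimX are padded.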
Both shapes and dimensions are handled through the reference objects ShapeHandle and DimensionHandle. As far as I can tell, this stage is not meant to cross-verify dimensions between tensors. More intricate error messages can be written in the op’s implementation, but I would rather do all my shape verification here, so that if I have multiple implementations (GPU and CPU) I don’t have to duplicate code.
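That said, InferenceContext does offer a Merge function that can check two dimensions (or shapes) against each other and fail if they disagree. A fragment I haven’t run, which would verify inside the shape function that both inputs agree on the batch size:

```cpp
// Untested fragment, for use inside a SetShapeFn lambda: Merge fails
// with a non-OK Status if the two dimensions are known and different,
// here cross-checking the batch dimension of both inputs.
shape_inference::DimensionHandle batch_dim;
TF_RETURN_IF_ERROR(c->Merge(c->Dim(c->input(0), 0),
                            c->Dim(c->input(1), 0), &batch_dim));
```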
template <typename Device, typename T, typename S>
class MaskedAveragesOp : public OpKernel {
public:
explicit MaskedAveragesOp(OpKernelConstruction* context) : OpKernel(context) {}
void Compute(OpKernelContext* context) override {
// Internals go here
}
};
To implement the actual op we need to extend the base class OpKernel. As this op has no internal state to worry about, we just need to define the Compute function. The only thing passed to it is the OpKernelContext object, which contains a list of inputs that are allocated and ready to use, and a list of outputs that we need to allocate in our Compute. The data types of the outputs are already filled in by the TensorFlow system, as defined in the op’s registration, so we only need to supply the shapes here.
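One piece the snippets don’t show is tying the kernel class back to the registered op. A sketch (untested, and the alias and type pairs are my assumptions based on the registration above) of what that typically looks like for a templated CPU kernel:

```cpp
// Sketch: register one kernel instantiation per allowed T/S combination.
// CPUDevice is the conventional alias for Eigen::ThreadPoolDevice.
typedef Eigen::ThreadPoolDevice CPUDevice;

#define REGISTER_CPU(T, S)                              \
  REGISTER_KERNEL_BUILDER(Name("AverageEmbeddingMask")  \
                              .Device(DEVICE_CPU)       \
                              .TypeConstraint<T>("T")   \
                              .TypeConstraint<S>("S"),  \
                          MaskedAveragesOp<CPUDevice, T, S>)

REGISTER_CPU(float, int32);
REGISTER_CPU(float, int64);
REGISTER_CPU(double, int32);
REGISTER_CPU(double, int64);
```

A GPU implementation would repeat this with DEVICE_GPU and a GPUDevice template argument, which is why keeping the shape verification in SetShapeFn avoids duplication.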
const Tensor& input_tensor = context->input(0);
const Tensor& rle_tensor = context->input(1);
TensorShape input_shape = input_tensor.shape();
TensorShape rle_shape = rle_tensor.shape();
TensorShape mean_shape = input_shape;
mean_shape.RemoveDimRange(1, -1);  // drop the spatial X and Y dims
mean_shape.InsertDim(1, rle_shape.dim_size(1));
A tensor consists of a TensorShape and a TensorBuffer. To create a tensor we determine a shape and then ask TensorFlow to allocate an appropriately sized buffer for that shape. Do not try to do your own allocation: TensorFlow has its own memory management system, backed by jemalloc. This is incompatible with malloc, and if you mix them there will be a segfault. As the numpy arrays returned by TensorFlow are wrappers around the TensorFlow buffer, this even applies when passing an output to an external library. So, inside TensorFlow you have to use the internal memory management, and when passing buffers from TensorFlow to another C/C++ library, treat them gingerly.
Now that we are in an op, the library we use to manipulate shapes is TensorShape rather than the shape inference library. It has all the pieces we need to build the tensor shape for our output. This op computes the average embedding of each mask, so the output needs to be of shape [BatchSize, MaxMasks, EmbeddingDim]. The last two lines accomplish this, slicing and gluing to make the appropriate shape.
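To sanity-check the slicing and gluing, here is the same arithmetic on a plain vector of dimension sizes. This is a mock of my own, not the real TensorShape, but it mirrors what RemoveDimRange followed by InsertDim should produce on a rank-4 input.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Mimics TensorShape::RemoveDimRange(1, -1) then InsertDim(1, max_masks)
// applied to a rank-4 shape [B, X, Y, E], using a plain vector.
std::vector<int64_t> MeanShapeOf(const std::vector<int64_t>& input_shape,
                                 int64_t max_masks) {
  // Keep only the first (batch) and last (embedding) dims, dropping X, Y.
  std::vector<int64_t> out = {input_shape.front(), input_shape.back()};
  // Glue the per-mask dimension back in at index 1: [B, MaxMasks, E].
  out.insert(out.begin() + 1, max_masks);
  return out;
}
```

For a batch of 2 images of 5x7 pixels with 3-dimensional embeddings and up to 4 masks, this yields [2, 4, 3].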
Tensor* output_tensor = NULL;
OP_REQUIRES_OK(context, context->allocate_output(0, mean_shape,
                                                 &output_tensor));
The output tensors are already set up in the OpKernelContext with the correct type, though they are not yet allocated and have no shape. The empty first output object already exists in the context object. So for the first output, index 0, we pass a shape and a reference to a pointer, giving us a means of getting at the complete, allocated tensor.
Tensor mean_tensor_temp;
OP_REQUIRES_OK(context, context->allocate_temp(input_tensor.dtype(),
                                               mean_shape, &mean_tensor_temp));
This is the temporary version of the above. The type cannot be inferred from the context, so it needs to be passed in; there’s an enumeration in types.pb.h that holds all the types, the most useful ones being DT_FLOAT and DT_INT32 (here we simply reuse the input’s dtype). Remember the Tensor object is just a reference, so this function provides the tensor with the type and the shape and then allocates the appropriate internal buffer.
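Once allocated, the buffers are not usually touched through raw pointers but through Eigen maps. A hedged fragment (flat<T> is the real Tensor accessor; zeroing the temporary before accumulating is my assumption about what the implementation does next):

```cpp
// Untested fragment: view the temporary's buffer as a flat Eigen map
// and zero it before accumulating per-mask embedding sums into it.
auto mean_flat = mean_tensor_temp.flat<T>();
mean_flat.setZero();
// Element access then goes through the map, e.g. mean_flat(i) += v;
```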
We still need to define the actual implementation of the operation, and we need to debug it. The next post is on Eigen::Tensor and the odd error messages that it returns when it doesn’t compile.