# Adding a new layer¶

This section describes how to create a new layer incorporated with tiny-dnn. Let's create simple fully-connected layer for example.

Note: This document is old, and doesn't fit to current tiny-dnn. We need to update.

## Declare class¶

Let's define your layer. All of layer operations in tiny-dnn are derived from `layer`

class.

```
// calculate y = Wx + b
class fully_connected : public layer {
public:
//todo
};
```

the `layer`

class prepares input/output data for your calculation. To do this, you must tell `layer`

's constructor what you need.

```
layer::layer(const std::vector<vector_type>& in_type,
const std::vector<vector_type>& out_type)
```

For example, consider calculating fully-connected operation: `y = Wx + b`

. In this calculation, Input (right hand of this eq) is data `x`

, weight `W`

and bias `b`

. Output is, of course `y`

. So it's constructor should pass {data,weight,bias} as input and {data} as output.

```
// calculate y = Wx + b
class fully_connected : public layer {
public:
fully_connected(size_t x_size, size_t y_size)
:layer({vector_type::data,vector_type::weight,vector_type::bias}, // x, W and b
{vector_type::data}),
x_size_(x_size),
y_size_(y_size)
{}
private:
size_t x_size_; // number of input elements
size_t y_size_; // number of output elements
};
```

the `vector_type::data`

is some input data passed by previous layer, or output data consumed by next layer. `vector_type::weight`

and `vector_type::bias`

represents trainable parameters. The only difference between them is default initialization method: `weight`

is initialized by random value, and `bias`

is initialized by zero-vector (this behaviour can be changed by network::weight_init method). If you need another vector to calculate, `vector_type::aux`

can be used.

## Implement virtual method¶

There are 5 methods to implement. In most case 3 methods are written as one-liner and remaining 2 are essential:

- layer_type
- in_shape
- out_shape
- forward_propagation
- back_propagation

### layer_type¶

Returns name of your layer.

```
std::string layer_type() const override {
return "fully-connected";
}
```

### in_shape/out_shape¶

Returns input/output shapes corresponding to inputs/outputs. Shapes is defined by [width, height, depth]. For example fully-connected layer treats input data as 1-dimensional array, so its shape is [N, 1, 1].

```
std::vector<shape3d> in_shape() const override {
// return input shapes
// order of shapes must be equal to argument of layer constructor
return { shape3d(x_size_, 1, 1), // x
shape3d(x_size_, y_size_, 1), // W
shape3d(y_size_, 1, 1) }; // b
}
std::vector<shape3d> out_shape() const override {
return { shape3d(y_size_, 1, 1) }; // y
}
```

### forward_propagation¶

Execute forward calculation in this method.

```
void forward_propagation(size_t worker_index,
const std::vector<vec_t*>& in_data,
std::vector<vec_t*>& out_data) override {
const vec_t& x = *in_data[0]; // it's size is in_shapes()[0] (=[x_size_,1,1])
const vec_t& W = *in_data[1];
const vec_t& b = *in_data[2];
vec_t& y = *out_data[0];
std::fill(y.begin(), y.end(), 0.0);
// y = Wx+b
for (size_t r = 0; r < y_size_; r++) {
for (size_t c = 0; c < x_size_; c++)
y[r] += W[r*x_size_+c]*x[c];
y[r] += b[r];
}
}
```

the `in_data/out_data`

is array of input/output data, which is ordered as you told `layer`

's constructor. The implementation is simple and straightforward, isn't it?

`worker_index`

is task-id. It is always zero if you run tiny-dnn in single thread. If some class member variables are updated while forward/backward pass, these members must be treated carefully to avoid data race. If their variables are task-independent, your class can hold just N variables and access them by worker_index (you can see this example in max_pooling_layer.h).
input/output data managed by `layer`

base class is *task-local*, so `in_data/out_data`

is treated as if it is running on single thread.

### back propagation¶

```
void back_propagation(size_t index,
const std::vector<vec_t*>& in_data,
const std::vector<vec_t*>& out_data,
std::vector<vec_t*>& out_grad,
std::vector<vec_t*>& in_grad) override {
const vec_t& curr_delta = *out_grad[0]; // dE/dy (already calculated in next layer)
const vec_t& x = *in_data[0];
const vec_t& W = *in_data[1];
vec_t& prev_delta = *in_grad[0]; // dE/dx (passed into previous layer)
vec_t& dW = *in_grad[1]; // dE/dW
vec_t& db = *in_grad[2]; // dE/db
// propagate delta to prev-layer
for (size_t c = 0; c < x_size_; c++)
for (size_t r = 0; r < y_size_; r++)
prev_delta[c] += curr_delta[r] * W[r*x_size_+c];
// accumulate weight difference
for (size_t r = 0; r < y_size_; r++)
for (size_t c = 0; c < x_size_; c++)
dW[r*x_size_+c] += curr_delta[r] * x[c];
// accumulate bias difference
for (size_t r = 0; r < y_size_; r++)
db[r] += curr_delta[r];
}
```

the `in_data/out_data`

are just same as forward_propagation, and `in_grad/out_grad`

are its gradient. Order of gradient values are same as `in_data/out_data`

.

Note: Gradient of weight/bias are collected over mini-batch and zero-cleared automatically, so you can't use assignment operator to these elements (layer will forget previous training data in mini-batch!). like this example, use`operator +=`

instead. Gradient of data (`prev_delta`

in the example) may already have meaningful values if two or more layers share this data, so you can't overwrite this value too.

## Verify backward calculation¶

It is always a good idea to check if your backward implementation is correct. `network`

class provides `gradient_check`

method for this purpose.
Let's add following lines to test/test_network.h and execute test.

```
TEST(network, gradient_check_fully_connected) {
network<sequential> net;
net << fully_connected(2, 3)
<< fully_connected(3, 2);
std::vector<tensor_t> in{ tensor_t{ 1, { 0.5, 1.0 } } };
std::vector<std::vector<label_t>> t = { std::vector<label_t>(1, {1}) };
EXPECT_TRUE(net.gradient_check<mse>(in, t, 1e-4, GRAD_CHECK_ALL));
}
```

Congratulations! Now you can use this new class as a tiny-dnn layer.