Neural network operations¶
Penne also provides functions specific to neural networks: sigmoid, tanh, hardtanh, softmax, logsoftmax, rectify (ReLU), crossentropy (log loss), and distance2 (mean-squared loss).
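Several of these (sigmoid, hardtanh, rectify, distance2) are only listed here and not documented below, so as a rough reference here is a minimal NumPy sketch of the standard definitions these names usually denote. This is illustrative only, not penne's implementation; penne's versions operate on Expressions, and its exact forms of hardtanh and distance2 may differ.

import numpy as np

def sigmoid(x):
    # Logistic sigmoid, 1 / (1 + exp(-x)), applied elementwise.
    return 1.0 / (1.0 + np.exp(-x))

def hardtanh(x):
    # Piecewise-linear "hard" tanh: clip values to [-1, 1].
    return np.clip(x, -1.0, 1.0)

def rectify(x):
    # ReLU: max(0, x) elementwise.
    return np.maximum(x, 0.0)

def distance2(x, y):
    # Squared Euclidean distance, the usual basis for a mean-squared loss.
    return np.sum((x - y) ** 2)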
- penne.softmax(arg, axis=-1)[source]¶ Softmax function, \(y_i = \exp x_i / \sum_{i'} \exp x_{i'}\).
Parameters: axis – along which to perform the softmax (default is last).
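To make the formula concrete, here is a minimal NumPy sketch of softmax along an axis (illustrative only, not penne's implementation, which operates on expressions in the computation graph):

import numpy as np

def softmax(x, axis=-1):
    # Subtracting the max does not change the result (softmax is invariant to
    # adding a constant to every x_i) but keeps exp() from overflowing.
    z = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

softmax(np.array([1.0, 2.0, 3.0]))   # -> array([0.09003057, 0.24472847, 0.66524096])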
- penne.logsoftmax(arg, axis=-1)[source]¶ Log-softmax function, \(y_i = \log \left(\exp x_i / \sum_{i'} \exp x_{i'}\right)\).
Use this instead of log(softmax(x)) for better numerical stability.
Parameters: axis – along which to perform the softmax (default is last).
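The stability claim comes from the log-sum-exp identity \(\log \sum_{i'} \exp x_{i'} = m + \log \sum_{i'} \exp (x_{i'} - m)\) with \(m = \max_{i'} x_{i'}\). A NumPy sketch of that trick (again illustrative, not penne's implementation):

import numpy as np

def logsoftmax(x, axis=-1):
    # log softmax(x)_i = x_i - logsumexp(x); shifting by the max keeps exp()
    # from overflowing, which is why this beats log(softmax(x)) numerically.
    m = np.max(x, axis=axis, keepdims=True)
    return x - (m + np.log(np.sum(np.exp(x - m), axis=axis, keepdims=True)))

x = np.array([1000.0, 0.0])
# np.log(np.exp(x) / np.exp(x).sum()) overflows here (producing nan / -inf),
# while the shifted form stays exact:
logsoftmax(x)   # -> array([    0., -1000.])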
- penne.crossentropy(logp, correct)[source]¶ Cross-entropy, a.k.a. log-loss.
Parameters:
- logp – vector of log-probabilities
- correct – observed probabilities
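Assuming the usual convention for log-loss (penne may differ in sign or reduction), a small NumPy sketch of what this computes:

import numpy as np

def crossentropy(logp, correct):
    # Cross-entropy / log-loss: -sum_i correct_i * log p_i.
    return -np.sum(correct * logp)

logp = np.log(np.array([0.7, 0.2, 0.1]))
correct = np.array([1.0, 0.0, 0.0])   # one-hot observed distribution
crossentropy(logp, correct)           # -> 0.3567 = -log 0.7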
- class penne.Dropout(p=0.5)[source]¶ Factory for dropouts.
Example usage:

d = Dropout(0.5)
y = d(x)

The reason for the extra level of indirection is so that all the dropouts can be enabled or disabled together.
Parameters: p (float) – probability of dropout
Returns: dropout function
Return type: Expression -> Expression
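To see why the factory indirection helps, here is a rough standalone NumPy sketch of the idea, not penne's implementation (which produces Expressions in the computation graph); the dropout_enabled flag is a hypothetical stand-in for whatever switch penne actually uses. Every Dropout instance consults the same switch, so one flag flips all of them between training and test behaviour.

import numpy as np

dropout_enabled = True   # one shared switch controls every dropout at once

class Dropout(object):
    def __init__(self, p=0.5):
        self.p = p
    def __call__(self, x):
        if not dropout_enabled:   # e.g. at test time, every dropout is a no-op
            return x
        # Inverted dropout: zero out units with probability p and rescale the
        # survivors so the expected value of the output is unchanged.
        mask = (np.random.random(x.shape) >= self.p) / (1.0 - self.p)
        return x * mask

With this pattern, d = Dropout(0.5); y = d(x) behaves as in the example above, and setting the shared switch to False makes every d(x) return x unchanged.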
- class penne.Layer(insize, outsize, f=<class 'penne.expr.tanh'>, gain=None, bias=0.0, model=None)[source]¶ Fully-connected layer.
Parameters:
- insize – Input size or sequence of input sizes.
  - If an input size is n > 0, then that input will expect an n-dimensional vector.
  - If an input size is n < 0, then that input will expect an integer in [0, -n), which you can think of either as a one-hot vector or as an index into a lookup table.
  - If an input size is “diag”, then that input will have a diagonal weight matrix.
- outsize – Output size.
- f – Activation function (default tanh).
- bias – Initial bias, or None for no bias.
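As a rough illustration of what a fully-connected layer computes, here is a NumPy sketch under the usual y = f(Wx + b) reading (not penne's implementation), including the lookup-table view of a negative input size:

import numpy as np

def fully_connected(x, W, b, f=np.tanh):
    # Core computation of a fully-connected layer: y = f(W x + b).
    return f(W @ x + b)

W = 0.1 * np.random.randn(3, 5)   # outsize=3, insize=5
b = np.zeros(3)
y = fully_connected(np.random.randn(5), W, b)

# With a negative input size such as insize=-10, the input is an integer i in
# [0, 10); multiplying W by the one-hot vector for i just selects column i,
# i.e. the weight matrix acts as a lookup table (embedding).
i = 4
W_lookup = 0.1 * np.random.randn(3, 10)
y_lookup = np.tanh(W_lookup[:, i] + b)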