Models
======

Parameters
----------

As a slightly more complex example, suppose we want to define the
following network:

.. math::

   h &= \tanh (V i + a) \\
   o &= \tanh (W h + b) \\
   e &= \|o - c\|^2

where :math:`i` is the input vector and :math:`c` is the correct output
vector. The parameters of the model are matrices :math:`V` and :math:`W`
and vectors :math:`a` and :math:`b`.

Parameters are created like constants, using ``parameter(value)``, where
``value`` is the initial value of the parameter::

    nh = 3
    V = parameter(numpy.random.uniform(-1., 1., (nh, 2)))
    a = parameter(numpy.zeros((nh,)))
    W = parameter(numpy.random.uniform(-1., 1., (1, nh)))
    b = parameter(numpy.zeros((1,)))

The inputs and correct outputs are "constants" whose values we will
overwrite from example to example::

    i = constant(numpy.empty((2,)))
    c = constant(numpy.empty((1,)))

Finally, define the network. This is nearly a straight copy of the
equations above::

    h = tanh(dot(V, i) + a)
    o = tanh(dot(W, h) + b)
    e = distance2(o, c)
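
To make the shapes concrete, the same forward computation can be written
in plain NumPy. This is only a sketch for checking the arithmetic by
hand: it does not use penne, the fixed weights are arbitrary illustrative
values, and ``distance2`` is assumed to be the squared Euclidean distance
given by the equation for :math:`e` above.

```python
import numpy as np

nh = 3
# Arbitrary fixed weights chosen for this sketch (not from the library).
V = np.array([[0.5, -0.5],
              [0.25, 0.75],
              [-1.0, 1.0]])       # shape (nh, 2)
a = np.zeros(nh)                  # shape (nh,)
W = np.array([[1.0, -1.0, 0.5]])  # shape (1, nh)
b = np.zeros(1)                   # shape (1,)

i = np.array([1.0, -1.0])         # input vector
c = np.array([1.0])               # correct output

h = np.tanh(V.dot(i) + a)         # hidden layer, shape (nh,)
o = np.tanh(W.dot(h) + b)         # output layer, shape (1,)
e = np.sum((o - c) ** 2)          # squared error, a scalar
```

The hidden layer has one entry per row of ``V``, and ``e`` collapses the
output error to a single number, which is what a trainer can minimize.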

Training
--------

To train the network, first create a trainer object (here ``SGD``; see
the Reference section below for other trainers). Then pass it an
expression using its ``receive`` method. Each call updates the
parameters to try to minimize that expression and returns the
expression's value.

.. code:: python

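    import random
    trainer = SGD(learning_rate=0.1)
    # XOR in a -1/+1 encoding: each row is (x, y, target), repeated 10 times.
    data = [[-1., -1., -1.],
            [-1.,  1.,  1.],
            [ 1., -1.,  1.],
            [ 1.,  1., -1.]] * 10
    for epoch in range(10):
        random.shuffle(data)
        loss = 0.
        for x, y, z in data:
            # Overwrite the constants in place with the current example.
            i.value[...] = [x, y]
            c.value[...] = [z]
            # Update the parameters and accumulate the returned loss.
            loss += trainer.receive(e)
        # Average loss over the epoch.
        print(loss/len(data))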


.. parsed-literal::

    1.08034928912
    0.98879616038
    1.00183385115
    0.951137577661
    0.840384066165
    0.314003950596
    0.0539702267511
    0.0295536827621
    0.0192921979733
    0.0140214011032
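
For intuition about what a single update involves, here is a hand-written
gradient step for the same network in plain NumPy. It is a sketch of
ordinary stochastic gradient descent under the equations above, not
penne's implementation; ``loss_and_grads`` and ``sgd_step`` are
hypothetical helper names introduced only for this illustration.

```python
import numpy as np

def loss_and_grads(V, a, W, b, i, c):
    """Forward pass plus hand-derived gradients for the network above."""
    h = np.tanh(V.dot(i) + a)
    o = np.tanh(W.dot(h) + b)
    e = np.sum((o - c) ** 2)
    # Backward pass via the chain rule; d/dx tanh(x) = 1 - tanh(x)^2.
    d_o = 2.0 * (o - c) * (1.0 - o ** 2)  # gradient w.r.t. W h + b
    d_h = W.T.dot(d_o) * (1.0 - h ** 2)   # gradient w.r.t. V i + a
    # Gradients in the order V, a, W, b.
    grads = [np.outer(d_h, i), d_h, np.outer(d_o, h), d_o]
    return e, grads

def sgd_step(params, grads, learning_rate=0.1):
    """Update each parameter in place: p <- p - learning_rate * grad."""
    for p, g in zip(params, grads):
        p -= learning_rate * g

rng = np.random.RandomState(0)
nh = 3
V = rng.uniform(-1., 1., (nh, 2))
a = np.zeros(nh)
W = rng.uniform(-1., 1., (1, nh))
b = np.zeros(1)

i, c = np.array([-1., 1.]), np.array([1.])
before, grads = loss_and_grads(V, a, W, b, i, c)
sgd_step([V, a, W, b], grads)
after, _ = loss_and_grads(V, a, W, b, i, c)
```

Comparing ``before`` and ``after`` shows the effect of one update on a
single example. A trainer repeats this over many shuffled examples, which
is why the averaged loss above falls only gradually.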

Loading and Saving
------------------

To save the model, call ``save_model(file)``, where ``file`` is a
file-like object. To load a model, first build your expressions in
*exactly the same way* as you did before saving it, then call
``load_model(file)``.
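
One plausible reason for this requirement is that saved values are
matched to parameters by the order in which the parameters were created.
That is an assumption made here for illustration, not something stated
about penne's file format. The NumPy sketch below shows the idea with two
hypothetical helpers, ``save_params`` and ``load_params``, which store
arrays without names and restore them by position.

```python
import io
import numpy as np

def save_params(file, params):
    # Store arrays positionally (arr_0, arr_1, ...); no names are recorded.
    np.savez(file, *params)

def load_params(file, params):
    # Copy stored arrays back into existing parameters, matched by position.
    with np.load(file) as stored:
        for k, p in enumerate(params):
            p[...] = stored["arr_%d" % k]

# "Model" built before saving: two parameters, created in this order.
V = np.arange(6.0).reshape(3, 2)
a = np.ones(3)
buf = io.BytesIO()
save_params(buf, [V, a])

# Rebuild the parameters in the same order, then restore their values.
V2, a2 = np.empty((3, 2)), np.empty(3)
buf.seek(0)
load_params(buf, [V2, a2])
```

Under this scheme, creating the parameters in a different order or with
different shapes before loading would pair stored values with the wrong
arrays, which is the kind of mismatch the rule above guards against.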

Reference
---------

.. autofunction:: penne.parameter

.. autoclass:: penne.optimize.StochasticGradientDescent
.. autoclass:: penne.optimize.SGD
.. autoclass:: penne.optimize.AdaGrad
.. autoclass:: penne.optimize.Adagrad
.. autoclass:: penne.optimize.AdaDelta
.. autoclass:: penne.optimize.Adadelta
.. autoclass:: penne.optimize.Momentum
.. autoclass:: penne.optimize.NesterovMomentum
.. autoclass:: penne.optimize.RMSprop
.. autoclass:: penne.optimize.Adam

.. autofunction:: penne.load_model
.. autofunction:: penne.save_model