Jun 29, 2018

by Qianli Liao; Tomaso Poggio

We discuss relations between Residual Networks (ResNet), Recurrent Neural Networks (RNNs) and the primate visual cortex. We begin with the observation that a shallow RNN is exactly equivalent to a very deep ResNet with weight sharing among the layers. A direct implementation of such an RNN, although having orders of magnitude fewer parameters, achieves performance similar to the corresponding ResNet. We propose 1) a generalization of both RNN and ResNet architectures and 2) the conjecture that...
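
The stated equivalence is easy to check numerically. Below is a minimal NumPy sketch (dimensions and the block architecture are illustrative, not the paper's exact model): iterating one shared residual block T times is exactly the same computation as a T-layer ResNet whose blocks all share weights.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 16, 10   # state dimension; unroll depth = number of residual blocks

# one residual transformation f(h) = W2 @ relu(W1 @ h), shared everywhere
W1 = rng.normal(scale=0.1, size=(d, d))
W2 = rng.normal(scale=0.1, size=(d, d))

x = rng.normal(size=d)

# (a) a shallow RNN iterated T times: h_{t+1} = h_t + f(h_t)
h_rnn = x.copy()
for _ in range(T):
    h_rnn = h_rnn + W2 @ np.maximum(W1 @ h_rnn, 0.0)

# (b) an explicit T-layer ResNet whose blocks all share the same weights
layers = [(W1, W2)] * T          # T blocks, one set of parameters
h_resnet = x.copy()
for A, B in layers:
    h_resnet = h_resnet + B @ np.maximum(A @ h_resnet, 0.0)

assert np.allclose(h_rnn, h_resnet)  # identical computations
```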

Topics: Neural and Evolutionary Computing, Computing Research Repository, Learning

Source: http://arxiv.org/abs/1604.03640

Jun 29, 2018

by Hrushikesh Mhaskar; Tomaso Poggio

The paper briefly reviews several recent results on hierarchical architectures for learning from examples that may formally explain the conditions under which Deep Convolutional Neural Networks perform much better in function approximation problems than shallow, one-hidden-layer architectures. The paper announces new results for a non-smooth activation function - the ReLU function - used in present-day neural networks, as well as for Gaussian networks. We propose a new definition of...

Topics: Mathematics, Functional Analysis, Computing Research Repository, Learning

Source: http://arxiv.org/abs/1608.03287

Jun 28, 2018

by Maximilian Nickel; Lorenzo Rosasco; Tomaso Poggio

Learning embeddings of entities and relations is an efficient and versatile method to perform machine learning on relational data such as knowledge graphs. In this work, we propose holographic embeddings (HolE) to learn compositional vector space representations of entire knowledge graphs. The proposed method is related to holographic models of associative memory in that it employs circular correlation to create compositional representations. By using correlation as the compositional operator...
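
The circular-correlation composition used by HolE can be sketched in a few lines of NumPy; a standard FFT identity makes it cheap. The scoring function below follows the general form described in the abstract (names and dimensions are illustrative):

```python
import numpy as np

def circular_correlation(a, b):
    # [a ⋆ b]_k = sum_i a_i * b_{(i+k) mod d}, computed in O(d log d) via FFT
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

def hole_score(r, s, o):
    # HolE scores a triple (subject, relation, object) as the dot product of
    # the relation embedding with the circular correlation of s and o
    return float(r @ circular_correlation(s, o))

rng = np.random.default_rng(0)
d = 8
s, o, r = rng.normal(size=(3, d))

# check the FFT trick against the naive O(d^2) definition
naive = np.array([sum(s[i] * o[(i + k) % d] for i in range(d))
                  for k in range(d)])
assert np.allclose(circular_correlation(s, o), naive)
```

Because correlation (unlike convolution) is non-commutative, the composition can model asymmetric relations while keeping the representation of the pair as compact as a single embedding.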

Topics: Learning, Statistics, Machine Learning, Artificial Intelligence, Computing Research Repository

Source: http://arxiv.org/abs/1510.04935

Jun 27, 2018

by Youssef Mroueh; Stephen Voinea; Tomaso Poggio

We analyze in this paper a random feature map based on I-theory, a recently introduced theory of invariance. More specifically, a group-invariant signal signature is obtained through cumulative distributions of group-transformed random projections. Our analysis bridges invariant feature learning with kernel methods, as we show that this feature map defines an expected Haar-integration kernel that is invariant to the specified group action. We show how this non-linear random feature map...
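
A toy version of this construction, assuming the group of cyclic shifts and using sorted projection values as a stand-in for the cumulative distribution (both carry the same information):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
templates = rng.normal(size=(5, d))  # a few random projections ("templates")

def signature(x):
    # pool <g.x, t> over all cyclic shifts g; the *distribution* of these
    # projections is invariant, so its sorted values form the signature
    orbit = np.stack([np.roll(x, k) for k in range(d)])  # the group orbit of x
    proj = orbit @ templates.T                           # group-transformed projections
    return np.sort(proj, axis=0)

x = rng.normal(size=d)
gx = np.roll(x, 3)  # a transformed version of x
assert np.allclose(signature(x), signature(gx))  # same signature
```

Shifting x merely permutes the rows of the orbit, so the per-template distribution of projections, and hence the signature, is unchanged.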

Topics: Statistics, Computer Vision and Pattern Recognition, Computing Research Repository, Learning,...

Source: http://arxiv.org/abs/1506.02544

Sep 23, 2013

by Tomaso Poggio; Stephen Voinea; Lorenzo Rosasco

In batch learning, stability together with existence and uniqueness of the solution corresponds to well-posedness of Empirical Risk Minimization (ERM) methods; recently, it was proved that CV_loo stability is necessary and sufficient for generalization and consistency of ERM. In this note, we introduce CV_on stability, which plays a similar role in online learning. We show that stochastic gradient descent (SGD) with the usual hypotheses is CV_on stable and we then discuss the implications of...

Source: http://arxiv.org/abs/1105.4701v3

Sep 23, 2013

by Silvia Villa; Lorenzo Rosasco; Tomaso Poggio

We consider the fundamental question of learnability of a hypothesis class, both in the supervised learning setting and in the general learning setting introduced by Vladimir Vapnik. We survey classic results characterizing learnability in terms of suitable notions of complexity, as well as more recent results that establish the connection between learnability and stability of a learning algorithm.

Source: http://arxiv.org/abs/1303.5976v1

Jun 29, 2018

by Hrushikesh Mhaskar; Qianli Liao; Tomaso Poggio

While the universal approximation property holds both for hierarchical and shallow networks, we prove that deep (hierarchical) networks can approximate the class of compositional functions with the same accuracy as shallow networks but with an exponentially smaller number of training parameters and lower VC-dimension. This theorem settles an old conjecture by Bengio on the role of depth in networks. We then define a general class of scalable, shift-invariant algorithms to show a simple and natural...
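
A concrete instance of a compositional function, assuming a binary-tree structure of two-variable constituent functions (the example and the rates below convey the general flavor of this line of work; the exact constants are indicative, not quoted from the paper):

```latex
f(x_1,\dots,x_8) = h_3\Big(h_{21}\big(h_{11}(x_1,x_2),\,h_{12}(x_3,x_4)\big),\;
                           h_{22}\big(h_{13}(x_5,x_6),\,h_{14}(x_7,x_8)\big)\Big)
```

For an $m$-smooth target of $n$ variables, a shallow network needs on the order of $\varepsilon^{-n/m}$ parameters to reach accuracy $\varepsilon$ (the curse of dimensionality), while a deep network whose graph matches the tree needs only on the order of $(n-1)\,\varepsilon^{-2/m}$, since each constituent depends on just two variables.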

Topics: Computing Research Repository, Learning

Source: http://arxiv.org/abs/1603.00988

Jun 27, 2018

by Fabio Anselmi; Lorenzo Rosasco; Tomaso Poggio

We discuss data representations which can be learned automatically from data, are invariant to transformations, and at the same time selective, in the sense that two points have the same representation only if one is a transformation of the other. The mathematical results here sharpen some of the key claims of i-theory -- a recent theory of feedforward processing in sensory cortex.

Topics: Learning, Computing Research Repository

Source: http://arxiv.org/abs/1503.05938

Jun 29, 2018

by Qianli Liao; Kenji Kawaguchi; Tomaso Poggio

We systematically explore a spectrum of normalization algorithms related to Batch Normalization (BN) and propose a generalized formulation that simultaneously solves two major limitations of BN: (1) online learning and (2) recurrent learning. Our proposal is simpler and more biologically plausible. Unlike previous approaches, our technique can be applied out of the box to all learning scenarios (e.g., online learning, batch learning, fully-connected, convolutional, feedforward, recurrent and...
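
The core idea of making normalization batch-independent can be illustrated with running statistics. This is a generic sketch, not the paper's exact algorithm; the class name and constants are invented for illustration:

```python
import numpy as np

class StreamingNorm:
    """Normalize one sample at a time using running moment estimates.

    Exponentially averaged statistics replace per-batch ones, so the
    layer needs no batch at all and works in online and recurrent
    settings alike (a sketch of the general idea only).
    """
    def __init__(self, dim, decay=0.99, eps=1e-5):
        self.mean = np.zeros(dim)
        self.var = np.ones(dim)
        self.decay, self.eps = decay, eps

    def __call__(self, x):
        # update running estimates from this single sample, then normalize
        self.mean = self.decay * self.mean + (1 - self.decay) * x
        self.var = self.decay * self.var + (1 - self.decay) * (x - self.mean) ** 2
        return (x - self.mean) / np.sqrt(self.var + self.eps)

rng = np.random.default_rng(0)
norm = StreamingNorm(dim=4)
outputs = [norm(5.0 + 2.0 * rng.normal(size=4)) for _ in range(2000)]
tail = np.array(outputs[-500:])  # after warm-up: roughly zero mean, unit scale
```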

Topics: Neural and Evolutionary Computing, Computing Research Repository, Learning

Source: http://arxiv.org/abs/1610.06160

Jun 30, 2018

by Tomaso Poggio; Jim Mutch; Leyla Isik

We develop a sampling extension of M-theory focused on invariance to scale and translation. Quite surprisingly, the theory predicts an architecture of early vision with increasing receptive field sizes and a high-resolution fovea -- in agreement with data about the cortical magnification factor, V1 and the retina. From the slope of the inverse of the magnification factor, M-theory predicts a cortical "fovea" in V1 on the order of $40$ by $40$ basic units at each receptive field size...

Topics: Computing Research Repository, Quantitative Biology, Learning, Neurons and Cognition

Source: http://arxiv.org/abs/1406.1770

Jun 28, 2018

by Qianli Liao; Joel Z. Leibo; Tomaso Poggio

Gradient backpropagation (BP) requires symmetric feedforward and feedback connections -- the same weights must be used for forward and backward passes. This "weight transport problem" (Grossberg 1987) is thought to be one of the main reasons to doubt BP's biological plausibility. Using 15 different classification datasets, we systematically investigate to what extent BP really depends on weight symmetry. In a study that turned out to be surprisingly similar in spirit to Lillicrap et...

Topics: Learning, Computing Research Repository

Source: http://arxiv.org/abs/1510.05067

Jun 30, 2018

by Qianli Liao; Joel Z. Leibo; Tomaso Poggio

Populations of neurons in inferotemporal cortex (IT) maintain an explicit code for object identity that also tolerates transformations of object appearance, e.g., position, scale, and viewing angle [1, 2, 3]. Though the learning rules are not known, recent results [4, 5, 6] suggest the operation of an unsupervised temporal-association-based method, e.g., Foldiak's trace rule [7]. Such methods exploit the temporal continuity of the visual world by assuming that visual experience over short timescales...
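
A minimal sketch of a trace-rule update of the kind referenced above (Foldiak-style, with an Oja-like decay term added for stability; the constants and the decay term are illustrative choices, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, eta, lam = 20, 0.05, 0.8   # illustrative constants
w = rng.normal(scale=0.1, size=n_in)
trace = 0.0                      # low-pass filtered postsynaptic activity

def trace_rule_step(x, w, trace):
    # The weight update is driven by the temporally smoothed output, so
    # inputs occurring close together in time become associated.
    y = float(w @ x)
    trace = lam * trace + (1 - lam) * y
    w = w + eta * trace * (x - trace * w)  # Oja-like decay keeps w bounded
    return w, trace

# a temporal sequence: slightly transformed views of one "object" in a row
base = rng.normal(size=n_in)
for _ in range(100):
    view = base + 0.1 * rng.normal(size=n_in)
    w, trace = trace_rule_step(view, w, trace)

# w aligns (up to sign) with the direction shared across the sequence,
# i.e. the identity-preserving part of the input
cos = abs(w @ base) / (np.linalg.norm(w) * np.linalg.norm(base))
```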

Topics: Computing Research Repository, Computer Vision and Pattern Recognition, Learning

Source: http://arxiv.org/abs/1409.3879

Sep 18, 2013

by Guillermo D. Canas; Tomaso Poggio; Lorenzo Rosasco

We study the problem of estimating a manifold from random samples. In particular, we consider piecewise constant and piecewise linear estimators induced by k-means and k-flats, and analyze their performance. We extend previous results for k-means in two separate directions. First, we provide new results for k-means reconstruction on manifolds and, secondly, we prove reconstruction bounds for higher-order approximation (k-flats), for which no known results were previously available. While the...

Source: http://arxiv.org/abs/1209.1121v4

Jun 27, 2018

by Carlo Ciliberto; Youssef Mroueh; Tomaso Poggio; Lorenzo Rosasco

Reducing the amount of human supervision is a key problem in machine learning and a natural approach is that of exploiting the relations (structure) among different tasks. This is the idea at the core of multi-task learning. In this context a fundamental question is how to incorporate the task structure in the learning problem. We tackle this question by studying a general computational framework that allows encoding a-priori knowledge of the task structure in the form of a convex penalty; in...

Topics: Learning, Computing Research Repository

Source: http://arxiv.org/abs/1504.03101

Jun 28, 2018

by Fabio Anselmi; Lorenzo Rosasco; Cheston Tan; Tomaso Poggio

In i-theory a typical layer of a hierarchical architecture consists of HW modules pooling the dot products of the inputs to the layer with the transformations of a few templates under a group. Such layers include as special cases the convolutional layers of Deep Convolutional Networks (DCNs) as well as the non-convolutional layers (when the group contains only the identity). Rectifying nonlinearities -- which are used by present-day DCNs -- are one of the several nonlinearities admitted by...

Topics: Computing Research Repository, Learning, Neural and Evolutionary Computing

Source: http://arxiv.org/abs/1508.01084

Sep 18, 2014

by Joel Z. Leibo; Qianli Liao; Fabio Anselmi; Tomaso Poggio

Is visual cortex made up of general-purpose information processing machinery, or does it consist of a collection of specialized modules? If prior knowledge, acquired from learning a set of objects, is only transferable to new objects that share properties with the old, then the recognition system's optimal organization must be one containing specialized modules for different object classes. Our analysis starts from a premise we call the invariance hypothesis: that the computational goal of the...

Topic: Neuroscience

Source: http://biorxiv.org/content/early/2014/04/24/004473

Sep 18, 2013

by Youssef Mroueh; Tomaso Poggio; Lorenzo Rosasco; Jean-Jacques Slotine

In this paper we discuss a novel framework for multiclass learning, defined by a suitable coding/decoding strategy, namely the simplex coding, that allows a relaxation approach commonly used in binary classification to be generalized to multiple classes. In this framework, a relaxation error analysis can be developed avoiding constraints on the considered hypothesis class. Moreover, we show that in this setting it is possible to derive the first provably consistent regularized method with...
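
The simplex code itself is simple to construct: c unit vectors that are maximally spread out, with pairwise inner product -1/(c-1), one per class. A NumPy sketch (the nearest-code decoding rule shown is the natural choice; implementation details are illustrative):

```python
import numpy as np

def simplex_code(c):
    # c unit vectors with pairwise inner product -1/(c-1): the vertices of
    # a regular simplex (centered standard basis, rescaled; they span a
    # (c-1)-dimensional subspace of R^c)
    V = np.eye(c) - np.full((c, c), 1.0 / c)
    return np.sqrt(c / (c - 1)) * V

def decode(V, y):
    # assign the class whose code vector has the largest inner product with y
    return int(np.argmax(V @ y))

V = simplex_code(4)
G = V @ V.T
assert np.allclose(np.diag(G), 1.0)          # unit-norm code vectors
off_diag = G[~np.eye(4, dtype=bool)]
assert np.allclose(off_diag, -1.0 / 3.0)     # maximally separated
assert decode(V, V[2]) == 2                  # nearest-code decoding
```

For c = 2 this recovers the usual {-1, +1} coding of binary classification, which is why the construction generalizes binary relaxation approaches.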

Source: http://arxiv.org/abs/1209.1360v2

Jun 30, 2018

by Chiyuan Zhang; Georgios Evangelopoulos; Stephen Voinea; Lorenzo Rosasco; Tomaso Poggio

Representations in the auditory cortex might be based on mechanisms similar to the visual ventral stream: modules for building invariance to transformations and multiple layers for compositionality and selectivity. In this paper we propose the use of such computational modules for extracting invariant and discriminative audio representations. Building on a theory of invariance in hierarchical architectures, we propose a novel, mid-level representation for acoustical signals, using the empirical...

Topics: Machine Learning, Computing Research Repository, Sound, Learning, Statistics

Source: http://arxiv.org/abs/1404.0400

Jun 30, 2018

by Georgios Evangelopoulos; Stephen Voinea; Chiyuan Zhang; Lorenzo Rosasco; Tomaso Poggio

Recognition of speech, and in particular the ability to generalize and learn from small sets of labelled examples like humans do, depends on an appropriate representation of the acoustic input. We formulate the problem of finding robust speech features for supervised learning with small sample complexity as a problem of learning representations of the signal that are maximally invariant to intraclass transformations and deformations. We propose an extension of a theory for unsupervised learning...

Topics: Computing Research Repository, Sound, Learning

Source: http://arxiv.org/abs/1406.3884

Jun 28, 2018

by Yan Luo; Xavier Boix; Gemma Roig; Tomaso Poggio; Qi Zhao

We show that adversarial examples, i.e., the visually imperceptible perturbations that cause Convolutional Neural Networks (CNNs) to fail, can be alleviated with a mechanism based on foveations---applying the CNN to different image regions. To see this, first, we report results on ImageNet that lead to a revision of the hypothesis that adversarial perturbations are a consequence of CNNs acting as a linear classifier: CNNs act locally linearly to changes in the image regions with objects...

Topics: Learning, Computing Research Repository, Computer Vision and Pattern Recognition

Source: http://arxiv.org/abs/1511.06292

Jun 29, 2018

by Olivier Morère; Jie Lin; Antoine Veillard; Vijay Chandrasekhar; Tomaso Poggio

The goal of this work is the computation of very compact binary hashes for image instance retrieval. Our approach has two novel contributions. The first one is Nested Invariance Pooling (NIP), a method inspired by i-theory, a mathematical theory for computing group-invariant transformations with feed-forward neural networks. NIP is able to produce compact and well-performing descriptors with visual representations extracted from convolutional neural networks. We specifically incorporate...

Topics: Computer Vision and Pattern Recognition, Information Retrieval, Computing Research Repository

Source: http://arxiv.org/abs/1603.04595

Jun 29, 2018

by Tomaso Poggio; Hrushikesh Mhaskar; Lorenzo Rosasco; Brando Miranda; Qianli Liao

The paper characterizes classes of functions for which deep learning can be exponentially better than shallow learning. Deep convolutional networks are a special case of these conditions, though weight sharing is not the main reason for their exponential advantage.

Topics: Computing Research Repository, Learning

Source: http://arxiv.org/abs/1611.00740

Aug 11, 2020

by Jaz Kandola; Thomas Hofmann; Tomaso Poggio; John Shawe-Taylor

Source: http://academictorrents.com/details/69af795cc5dd6387b7561fd20ff72615d802bb28

Jun 29, 2018

by Joel Z. Leibo; Qianli Liao; Winrich Freiwald; Fabio Anselmi; Tomaso Poggio

The primate brain contains a hierarchy of visual areas, dubbed the ventral stream, which rapidly computes object representations that are both specific for object identity and relatively robust against identity-preserving transformations like depth-rotations. Current computational models of object recognition, including recent deep learning networks, generate these properties through a hierarchy of alternating selectivity-increasing filtering and tolerance-increasing pooling operations, similar...

Topics: Quantitative Biology, Neural and Evolutionary Computing, Computing Research Repository, Neurons and...

Source: http://arxiv.org/abs/1606.01552

Jun 28, 2018

by Charlie Frogner; Chiyuan Zhang; Hossein Mobahi; Mauricio Araya-Polo; Tomaso Poggio

Learning to predict multi-label outputs is challenging, but in many problems there is a natural metric on the outputs that can be used to improve predictions. In this paper we develop a loss function for multi-label learning, based on the Wasserstein distance. The Wasserstein distance provides a natural notion of dissimilarity for probability measures. Although optimizing with respect to the exact Wasserstein distance is costly, recent work has described a regularized approximation that is...
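
The regularized approximation alluded to at the end can be sketched with generic Sinkhorn iterations (this is the standard entropic regularization of optimal transport, not necessarily the exact variant the paper uses; `reg` and the toy cost matrix are illustrative):

```python
import numpy as np

def sinkhorn_distance(p, q, C, reg=0.1, n_iters=200):
    """Entropy-regularized Wasserstein distance between histograms p and q
    with ground cost matrix C, via Sinkhorn scaling iterations."""
    K = np.exp(-C / reg)
    u = np.ones_like(p)
    for _ in range(n_iters):
        v = q / (K.T @ u)       # enforce the column marginal q
        u = p / (K @ v)         # enforce the row marginal p
    T = u[:, None] * K * v[None, :]   # regularized optimal transport plan
    return float(np.sum(T * C))

# three bins on a line; moving probability mass further costs more
C = np.abs(np.subtract.outer(np.arange(3.0), np.arange(3.0)))
p = np.array([1.0, 0.0, 0.0])
near = np.array([0.0, 1.0, 0.0])
far = np.array([0.0, 0.0, 1.0])

# the metric respects the geometry of the label space:
assert sinkhorn_distance(p, near, C) < sinkhorn_distance(p, far, C)
```

This is what makes the loss attractive for multi-label prediction: unlike a bin-wise loss, it penalizes predictions less when mass lands on labels that are close, under the ground metric, to the correct ones.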

Topics: Statistics, Computer Vision and Pattern Recognition, Computing Research Repository, Learning,...

Source: http://arxiv.org/abs/1506.05439

Jun 29, 2018

by Olivier Morère; Antoine Veillard; Jie Lin; Julie Petta; Vijay Chandrasekhar; Tomaso Poggio

Most image instance retrieval pipelines are based on comparison of vectors known as global image descriptors between a query image and the database images. Due to their success in large scale image classification, representations extracted from Convolutional Neural Networks (CNN) are quickly gaining ground on Fisher Vectors (FVs) as state-of-the-art global descriptors for image instance retrieval. While CNN-based descriptors are generally noted for good retrieval performance at lower...

Topics: Computer Vision and Pattern Recognition, Information Retrieval, Computing Research Repository

Source: http://arxiv.org/abs/1601.02093

Jun 30, 2018

by Vijay Chandrasekhar; Jie Lin; Qianli Liao; Olivier Morère; Antoine Veillard; Lingyu Duan; Tomaso Poggio

Image instance retrieval is the problem of retrieving images from a database which contain the same object. Convolutional Neural Network (CNN) based descriptors are becoming the dominant approach for generating {\it global image descriptors} for the instance retrieval problem. One major drawback of CNN-based {\it global descriptors} is that uncompressed deep neural network models require hundreds of megabytes of storage making them inconvenient to deploy in mobile applications or in custom...

Topics: Computing Research Repository, Computer Vision and Pattern Recognition

Source: http://arxiv.org/abs/1701.04923