Paper

No-Press Diplomacy: Modeling Multi-Agent Gameplay (Paquette et al., 2019)

Games have been a focus of AI research for decades, from Samuel’s checkers program in the 1950s, to Deep Blue playing chess in the 1990s, to AlphaGo playing Go in the 2010s. All of those are two-player…

A Large-Scale Corpus for Conversation Disentanglement (Kummerfeld et al., 2019)

This post is about my own paper, which will appear at ACL later this month. What is interesting about this paper will depend on your research interests, so that’s how I’ve broken down this blog post. A few key points first: data and code are available on GitHub, as is the paper. The general-purpose span labeling and linking annotation tool we used is also appearing at ACL. Also check out DSTC 8 Track 2, which is based on this work.

PreCo: A Large-scale Dataset in Preschool Vocabulary for Coreference Resolution (Chen et al., 2018)

The OntoNotes dataset, which is the focus of almost all coreference resolution research, involved several compromises in its development (as is the case for any dataset). Some of these are discussed in…

Evaluating the Utility of Hand-crafted Features in Sequence Labelling (Minghao Wu et al., 2018)

A common argument in favour of neural networks is that they do not require ‘feature engineering’, manually defining functions that produce useful representations of the input data (e.g. a function…
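
To make “feature engineering” concrete, here is a minimal sketch of the kind of hand-crafted feature function the paper evaluates (the specific features are my own illustration, not taken from the paper):

```python
# Illustrative hand-crafted features for sequence labelling (e.g. NER).
# Each token is mapped to indicator features of the kind the paper studies.

def token_features(tokens, i):
    """Return hand-crafted features for the token at position i."""
    word = tokens[i]
    return {
        "lower": word.lower(),                # normalised form
        "is_capitalised": word[0].isupper(),  # useful for names
        "is_digit": word.isdigit(),           # useful for dates/quantities
        "suffix3": word[-3:],                 # crude morphology
        "prev_word": tokens[i - 1].lower() if i > 0 else "<s>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }

print(token_features(["Jane", "visited", "Paris"], 2))
```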

Extending a Parser to Distant Domains Using a Few Dozen Partially Annotated Examples (Vidur Joshi et al., 2018)

Virtually all systems trained on data have trouble when applied to datasets that differ even slightly from their training data, even switching from Wall Street…

The Fine Line between Linguistic Generalization and Failure in Seq2Seq-Attention Models (Weber et al., 2018)

We know that training a neural network involves optimising over a non-convex space, but using standard evaluation methods we see that our models…

An Analysis of Neural Language Modeling at Multiple Scales (Merity et al., 2018)

Assigning a probability distribution over the next word or character in a sequence (language modeling) is a useful component of many systems…
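
As a toy illustration of what assigning a distribution over the next character means (the example is mine, not from the paper):

```python
from collections import Counter, defaultdict

# Toy character language model: estimate P(next char | previous char)
# from bigram counts in a tiny corpus.
text = "the cat sat on the mat"
counts = defaultdict(Counter)
for prev, nxt in zip(text, text[1:]):
    counts[prev][nxt] += 1

def next_char_distribution(prev):
    """Probability distribution over the character following `prev`."""
    total = sum(counts[prev].values())
    return {c: n / total for c, n in counts[prev].items()}

print(next_char_distribution("t"))  # {'h': 0.5, ' ': 0.5}
```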

Provenance for Natural Language Queries (Deutch et al., 2017)

Being able to query a database in natural language could help make data accessible…

Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning (Tsvetkov et al., 2016)

Reordering training sentences for word vectors may impact their usefulness for downstream tasks.
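
A rough sketch of the underlying idea (the features and weights below are invented for illustration; the paper selects the weighting with Bayesian optimization against downstream task performance):

```python
# Illustrative curriculum: order the training corpus by a weighted score
# of simple sentence features before training word vectors.

def sentence_score(sentence, weights=(1.0, -0.5)):
    tokens = sentence.split()
    length = len(tokens)                   # one distributional feature
    diversity = len(set(tokens)) / length  # type/token ratio
    w_len, w_div = weights
    return w_len * length + w_div * diversity

corpus = [
    "the cat sat",
    "colourless green ideas sleep furiously",
    "the the the the",
]
curriculum = sorted(corpus, key=sentence_score)
print(curriculum)  # word vectors would then be trained in this order
```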

Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations (Wieting et al., 2017)

With enough training data, the best vector representation of a sentence is the concatenation of an average over word vectors and an average over character trigram vectors.
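
Since that recipe is easy to state in code, here is a minimal sketch (the dimensions and random placeholder vectors are my own simplifications; the real vectors are trained on millions of machine-translated paraphrases):

```python
import numpy as np

# Sentence embedding = concat(average word vector, average trigram vector).
DIM = 300
rng = np.random.default_rng(0)
word_vecs, trigram_vecs = {}, {}  # stand-ins for learned embedding tables

def lookup(table, unit):
    # In the real model these vectors are learned; here they are
    # random placeholders so the sketch runs end to end.
    if unit not in table:
        table[unit] = rng.normal(size=DIM)
    return table[unit]

def sentence_embedding(sentence):
    words = sentence.split()
    word_avg = np.mean([lookup(word_vecs, w) for w in words], axis=0)
    padded = f"#{sentence}#"
    trigrams = [padded[i:i + 3] for i in range(len(padded) - 2)]
    tri_avg = np.mean([lookup(trigram_vecs, t) for t in trigrams], axis=0)
    return np.concatenate([word_avg, tri_avg])  # 2 * DIM dimensions

print(sentence_embedding("a minimal example").shape)  # (600,)
```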