Publications

2020

Crowdsourced Detection of Emotionally Manipulative Language
CHI, 2020

PDF BibTeX Abstract

Overview of the seventh Dialog System Technology Challenge: DSTC7
CSL, 2020

PDF BibTeX Abstract

2019

No-Press Diplomacy: Modeling Multi-Agent Gameplay
NeurIPS, 2019

PDF Blog Post Supplementary arXiv BibTeX Abstract

An Evaluation for Intent Classification and Out-of-Scope Prediction
EMNLP (short), 2019

PDF Dataset arXiv BibTeX Abstract

A Large-Scale Corpus for Conversation Disentanglement
ACL, 2019

PDF Blog Post Code Dataset Supplementary arXiv Poster BibTeX Abstract Citations

SLATE: A Super-Lightweight Annotation Tool for Experts
ACL (demo), 2019

PDF Code Poster BibTeX Abstract Citations

Outlier Detection for Improved Data Quality and Diversity in Dialog Systems
NAACL, 2019

PDF Dataset arXiv BibTeX Abstract

Look Who's Talking: Inferring Speaker Attributes from Personal Longitudinal Dialog
Best Student Paper - CICLing, 2019

PDF arXiv BibTeX Abstract

Learning from Personal Longitudinal Dialog Data
IEEE Intelligent Systems, 2019

PDF BibTeX Abstract Citations

2018

Improving Text-to-SQL Evaluation Methodology
ACL, 2018

PDF Code Dataset arXiv Poster BibTeX Abstract Citations

Data Collection for a Production Dialogue System: A Startup Perspective
NAACL (industry), 2018

PDF Video BibTeX Abstract Citations

Effective Crowdsourcing for a New Type of Summarization Task
NAACL (short), 2018

PDF BibTeX Abstract Citations

Factors Influencing the Surprising Instability of Word Embeddings
NAACL, 2018

PDF arXiv BibTeX Abstract Citations

World Knowledge for Abstract Meaning Representation Parsing
LREC, 2018

PDF BibTeX Abstract Citations

2017

Identifying Products in Online Cybercrime Marketplaces: A Dataset for Fine-grained Domain Adaptation
EMNLP, 2017

PDF Code Supplementary arXiv BibTeX Abstract Citations

Understanding Task Design Trade-offs in Crowdsourced Paraphrase Collection
ACL (short), 2017

PDF Dataset arXiv PDF Slides Video BibTeX Abstract Citations

Tools for Automated Analysis of Cybercriminal Markets
WWW, 2017

PDF Code BibTeX Abstract Citations

Parsing with Traces: An O($n^4$) Algorithm and a Structural Representation
TACL, 2017

PDF Code arXiv Interview Video BibTeX Abstract Citations

2016

Algorithms for Identifying Syntactic Errors and Parsing with Graph Structured Output
EECS Department, University of California, Berkeley, 2016

PDF BibTeX Abstract

2015

An Empirical Analysis of Optimization for Max-Margin NLP
EMNLP (short), 2015

PDF Code Poster BibTeX Abstract Citations

2013

Error-Driven Analysis of Challenges in Coreference Resolution
EMNLP, 2013

PDF Code Slides PDF Slides BibTeX Abstract Citations

An Empirical Examination of Challenges in Chinese Parsing
ACL (short), 2013

PDF Code Slides PDF Slides BibTeX Abstract Citations

High-velocity Clouds in the Galactic All Sky Survey. I. Catalog
The Astrophysical Journal Supplement Series, 2013

PDF arXiv BibTeX Abstract Citations

2012

Parser Showdown at the Wall Street Corral: An Empirical Investigation of Error Types in Parser Output
EMNLP, 2012

PDF Code Slides PDF Slides BibTeX Abstract Citations

Robust Conversion of CCG Derivations to Phrase Structure Trees
ACL (short), 2012

PDF Code Slides PDF Slides BibTeX Abstract Citations

2011

Mention Detection: Heuristics for the OntoNotes annotations
CoNLL Shared Task, 2011

PDF Poster BibTeX Abstract Citations

2010

Spatiotemporal Hierarchy of Relaxation Events, Dynamical Heterogeneities, and Structural Reorganization in a Supercooled Liquid
Physical Review Letters, 2010

PDF arXiv BibTeX Abstract Citations

Morphological Analysis Can Improve a CCG Parser for English
COLING, 2010

PDF BibTeX Abstract Citations

Faster Parsing by Supertagger Adaptation
ACL, 2010

PDF Code PDF Slides BibTeX Abstract Citations

2009

Faster parsing and supertagging model estimation
ALTA, 2009

PDF PDF Slides BibTeX Abstract

Adaptive Supertagging for Faster Parsing
The University of Sydney, 2009

PDF PDF Slides Poster BibTeX Abstract

2008

Classification of Verb Particle Constructions with the Google Web1T Corpus
ALTA, 2008

PDF Poster BibTeX Abstract Citations

The densest packing of AB binary hard-sphere homogeneous compounds across all size ratios
The Journal of Physical Chemistry B, 2008

PDF BibTeX Abstract Citations


Non-Archival

2020

NOESIS II: Predicting Responses, Identifying Success, and Managing Complexity in Task-Oriented Dialogue
AAAI Workshop: Dialogue System Technology Challenges, 2020

PDF BibTeX Abstract

2019

The Eighth Dialog System Technology Challenge
NeurIPS Workshop: Conversational AI: Today’s Practice and Tomorrow’s Potential, 2019

PDF arXiv BibTeX Abstract Citations

Training Data Voids: Novel Attacks Against NLP Content Moderation
CSCW Workshop: Volunteer Work: Mapping the Future of Moderation Research, 2019

PDF BibTeX

DSTC7 Task 1: Noetic End-to-End Response Selection
ACL Workshop: NLP for Conversational AI, 2019

PDF BibTeX

DSTC7 Task 1: Noetic End-to-End Response Selection
AAAI Workshop: Dialogue System Technology Challenges, 2019

PDF BibTeX Abstract Citations

2018

Dialog System Technology Challenge 7
NeurIPS Workshop: Conversational AI: Today’s Practice and Tomorrow’s Potential, 2018

PDF arXiv BibTeX Abstract Citations

2009

Large-Scale Syntactic Processing: Parsing the Web
Johns Hopkins University, 2009

PDF BibTeX Abstract Citations

Software

Colaboratory Notebook for Coreference Resolution with SpanBERT

A notebook that (1) sets up the SpanBERT code and model, and (2) runs inference on text you provide.

SLATE: A Super-Lightweight Annotation Tool for Experts

A terminal-based text annotation tool in Python.

Neural POS tagging

Implementations of a POS tagger in DyNet, PyTorch, and TensorFlow, visualised to show the overall picture and make comparisons easy.

Text to SQL Baseline

A simple LSTM-based model that uses templates and slot-filling to map questions to SQL queries.

One-Endpoint Crossing Graph Parser

A range of tools related to one-endpoint crossing graphs - parsing, format conversion, and evaluation.

Coreference Error Analysis

A tool for classifying errors in coreference resolution.

CCG to PST

A tool for converting CCG derivations into PTB-style phrase structure trees.

Parse Error Analysis

A tool for classifying mistakes in the output of parsers.

Data

IRC Disentanglement

Annotation of IRC messages with reply-to structure, which disentangles simultaneous conversations. The largest such annotated resource.

Text to SQL datasets

A collection of datasets containing questions in English paired with SQL queries for a provided database. Our version homogenises the style of the SQL and corrects errors in previous versions of the data.

IE/NER from Cybercriminal Forums

Forum posts with annotations of products.

Crowdsourced Paraphrases

Paraphrases collected while conducting experiments on factors influencing crowd performance.

Spine and Arc version of the Penn Treebank

Code to convert the standard Penn Treebank into a version where each word is assigned a spine of non-terminals, and arcs to indicate attachments from one spine to another.

Adaptive CCG Supertagging Model

A model for the C&C supertagger that gives the same results with smaller beam sizes, enabling faster parsing.

Recent Posts

Papers I’m reading and more (RSS Feed and E-mail List)

Keeping up with research is hard. I’ve previously made lists of papers I wanted to read, and then only gotten to a small fraction of them. Simply resolving to read more papers hasn’t worked for me. I’m trying out a new approach. The goals are (1) read less of more papers, and (2) read more papers that are critical to my work. Sometimes just the introduction or abstract is enough for me to get the ideas I need from the paper.

Am I getting the most out of my time at conferences? This post was a way for me to think through that question and come up with strategies.

Games have been a focus of AI research for decades, from Samuel’s checkers program in the 1950s, to Deep Blue playing Chess in the 1990s, and AlphaGo playing Go in the 2010s. All of those are two-player…

This post is about my own paper to appear at ACL later this month. What is interesting about this paper will depend on your research interests, so that’s how I’ve broken down this blog post. A few key points first: Data and code are available on GitHub. The paper is also available. The general-purpose span labeling and linking annotation tool we used is also appearing at ACL. Check out DSTC 8 Track 2, which is based on this work.

A range of services exist for collecting annotations from paid workers. This post gives an overview of a bunch of them.

Contact

  • jkummerf@umich.edu
  • 2260 Hayward Street, Ann Arbor, MI 48109, USA