Rami Al-Rfou
Publications
Machine Translation Aided Bilingual Data-to-Text Generation and Semantic Parsing
We present a system for bilingual Data-To-Text Generation and Semantic Parsing. We use a text-to-text generator to learn a single model …
Oshin Agarwal, Mihir Kale, Heming Ge, Siamak Shakeri, Rami Al-Rfou
PDF
Large Scale Knowledge Graph Based Synthetic Corpus Generation for Knowledge-Enhanced Language Model Pre-training
Prior work on Data-To-Text Generation, the task of converting knowledge graph (KG) triples into natural text, focused on …
Oshin Agarwal, Heming Ge, Siamak Shakeri, Rami Al-Rfou
PDF
mT5: A massively multilingual pre-trained text-to-text transformer
The recent “Text-to-Text Transfer Transformer” (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this paper, we introduce mT5, a multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages. We describe the design and modified training of mT5 and demonstrate its state-of-the-art performance on many multilingual benchmarks. All of the code and model checkpoints used in this work are publicly available.
Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel
PDF
Code
Wiki-40B: Multilingual Language Model Dataset
We propose a new multilingual language model benchmark that is composed of 40+ languages spanning several scripts and linguistic …
Mandy Guo, Zihang Dai, Denny Vrandečić, Rami Al-Rfou
PDF
Cite
Code
Dataset
LAReQA: Language-agnostic answer retrieval from a multilingual pool
We present LAReQA, a challenging new benchmark for language-agnostic answer retrieval from a multilingual candidate pool. Unlike …
Uma Roy, Noah Constant, Rami Al-Rfou, Aditya Barua, Aaron Phillips, Yinfei Yang
PDF
Cite
Dataset
Bridging the Gap for Tokenizer-Free Language Models
Purely character-based language models (LMs) have been lagging in quality on large scale datasets, and current state-of-the-art LMs …
Dokook Choe, Rami Al-Rfou, Mandy Guo, Heeyoung Lee, Noah Constant
PDF
Character-Level Language Modeling with Deeper Self-Attention
LSTMs and other RNN variants have shown strong performance on character-level language modeling. These models are typically trained …
Rami Al-Rfou, Dokook Choe, Noah Constant, Mandy Guo, Llion Jones
PDF
Cite
DDGK: Learning Graph Representations for Deep Divergence Graph Kernels
Can neural networks learn to compare graphs without feature engineering? In this paper, we show that it is possible to learn …
Rami Al-Rfou, Dustin Zelle, Bryan Perozzi
PDF
Cite
Code
Watch Your Step: Learning Node Embeddings via Graph Attention
Graph embedding methods represent nodes in a continuous vector space, preserving different types of relational information from the …
Sami Abu-El-Haija, Bryan Perozzi, Rami Al-Rfou, Alex Alemi
PDF
Code
Poster
Video
Efficient Natural Language Response Suggestion for Smart Reply
This paper presents a computationally efficient machine-learned method for natural language response suggestion. Feed-forward neural …
Matthew Henderson, Rami Al-Rfou, Brian Strope, Yun-hsuan Sung, Laszlo Lukacs, Ruiqi Guo, Sanjiv Kumar, Balint Miklos, Ray Kurzweil
PDF
Learning edge representations via low-rank asymmetric projections
We propose a new method for embedding graphs while preserving directed edge information. Learning such continuous-space vector …
Sami Abu-El-Haija, Bryan Perozzi, Rami Al-Rfou
PDF
Cite
Code
Conversational Contextual Cues: The Case of Personalization and History for Response Ranking
We investigate the task of modeling open-domain, multi-turn, unstructured, multi-participant, conversational dialogue. We specifically …
Rami Al-Rfou, Marc Pickett, Javier Snaider, Yun-hsuan Sung, Brian Strope, Ray Kurzweil
PDF
Statistically Significant Detection of Linguistic Change
We propose a new computational approach for tracking and detecting statistically significant linguistic shifts in the meaning and usage …
Vivek Kulkarni, Rami Al-Rfou, Bryan Perozzi, Steven Skiena
PDF
Cite
Code
Dataset
DeepWalk: Online Learning of Social Representations
We present DeepWalk, a novel approach for learning latent representations of vertices in a network. These latent representations encode …
Bryan Perozzi, Rami Al-Rfou, Steven Skiena
PDF
Cite
Code
Polyglot: Distributed Word Representations for Multilingual NLP
Rami Al-Rfou, Bryan Perozzi, Steven Skiena
PDF
Cite