Statistical Machine Translation Tutorial Reading
The following is a list of papers that I think are worth reading for our
discussion of machine translation. I've tried to give a short blurb about
each of the papers to put them in context. I've included a number of papers
that I marked "OPTIONAL" that I think are interesting, but are either
supplementary or the material is more or less covered in the other papers.
If anyone would like more information on a particular topic or would
like to discuss any of these papers, feel free to e-mail me at
dkauchak@cs.ucsd.edu.
Part 1 (Jan. 19)
A Statistical MT Tutorial Workbook. Kevin Knight. 1999.
Very good introduction to word-based statistical machine translation.
Written in an informal, understandable, tutorial-oriented style.
Automating Knowledge Acquisition for Machine Translation.
Kevin Knight. 1997.
(OPTIONAL) Another tutorial-oriented paper that steps through
how one can learn from bilingual data. Also introduces a number of
important concepts for MT.
Foundations of Statistical NLP, chapter 13. Manning and Schütze. 1999.
(OPTIONAL) Must be accessed from UCSD. Overview of statistical MT.
Spends a lot of time on sentence and word alignment of bilingual data.
Foundations of Statistical NLP, chapter 6. Manning and Schütze. 1999.
(OPTIONAL) Must be accessed from UCSD. Discusses n-gram language
modeling. Language modeling is crucial for SMT and many other natural
language applications. I won't spend much time discussing language
modeling, but for those that are interested this is a good introduction.
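For readers who want a concrete picture of n-gram language modeling before opening the chapter, here is a minimal sketch of a bigram model with add-one smoothing. It is only an illustration, not the book's code: the function names, the `<s>`/`</s>` padding symbols, and the choice of add-one smoothing are all assumptions made for brevity.

```python
from collections import Counter
import math

def train_bigram_lm(sentences):
    """Collect history and bigram counts from tokenized sentences."""
    history, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        history.update(tokens[:-1])                    # how often each word is a history
        bigrams.update(zip(tokens[:-1], tokens[1:]))   # (previous word, current word)
    # Words that can be predicted: every corpus word plus the end marker.
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return history, bigrams, vocab

def bigram_prob(prev, cur, history, bigrams, vocab):
    """Add-one smoothed estimate of P(cur | prev)."""
    return (bigrams[(prev, cur)] + 1) / (history[prev] + len(vocab))

def sentence_logprob(sentence, history, bigrams, vocab):
    """Log-probability of a whole sentence under the bigram model."""
    tokens = ["<s>"] + sentence + ["</s>"]
    return sum(math.log(bigram_prob(p, c, history, bigrams, vocab))
               for p, c in zip(tokens[:-1], tokens[1:]))

corpus = [["the", "house"], ["the", "book"]]
history, bigrams, vocab = train_bigram_lm(corpus)
fluent = sentence_logprob(["the", "house"], history, bigrams, vocab)
scrambled = sentence_logprob(["house", "the"], history, bigrams, vocab)
```

In an SMT system the language model plays exactly this role: it prefers the fluent word order (`fluent` is larger than `scrambled` here) regardless of what the translation model says.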
Part 2 (Jan. 26)
Word models:
The Mathematics of Statistical Machine Translation:
Parameter Estimation. P. F. Brown, S. A. Della Pietra,
V. J. Della Pietra and R.L. Mercer. 1993.
(OPTIONAL) All you ever wanted to know about word level
models. Describes IBM models 1-5 and parameter estimation
for these models. It's about 50 pages and contains a lot of
material for the interested reader.
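To give a feel for what parameter estimation means for the simplest of these models, the following is a rough sketch of EM training for IBM Model 1 word translation probabilities t(f | e). It is a simplification and not the paper's formulation: it omits the NULL word, and the function name, data layout, and iteration count are assumptions chosen for the example.

```python
from collections import defaultdict

def train_ibm_model1(pairs, iterations=10):
    """Estimate t(f | e) by EM from (foreign_tokens, english_tokens) pairs.

    Simplified sketch: no NULL word, uniform initialization.
    """
    f_vocab = {f for f_sent, _ in pairs for f in f_sent}
    t = defaultdict(lambda: 1.0 / len(f_vocab))   # uniform start
    for _ in range(iterations):
        count = defaultdict(float)   # expected count of f being generated by e
        total = defaultdict(float)   # expected count of e generating anything
        for f_sent, e_sent in pairs:
            for f in f_sent:
                # E-step: posterior over which English word generated f.
                z = sum(t[(f, e)] for e in e_sent)
                for e in e_sent:
                    p = t[(f, e)] / z
                    count[(f, e)] += p
                    total[e] += p
        # M-step: renormalize the expected counts for each English word.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t

pairs = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
t = train_ibm_model1(pairs)
```

On this toy corpus the probability mass for "book" moves toward "Buch" and the mass for "the" moves toward "das", even though no word alignments were given; that is the core idea the higher-numbered models build on.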
Word model decoding:
Decoding Algorithm in Statistical Machine Translation.
Ye-Yi Wang and Alex Waibel. 1997.
Early paper discussing decoding of IBM model 2. The paper
provides a fairly good introduction to word-level decoding
including multi-stack search (i.e. multiple beams) and rest
cost estimation (heuristic functions).
An Efficient A* Search Algorithm for Statistical Machine Translation.
Franz Josef Och, Nicola Ueffing, Hermann Ney. 2001.
(OPTIONAL) One of many papers on decoding with word-based SMT. They
discuss the basic idea of viewing decoding as state space search and
provide one method for doing this. They describe decoding for Model 3
and suggest a few different heuristics that are admissible, leading to few search errors.
Phrase based statistical MT:
Statistical Phrase-Based Translation.
Philipp Koehn, Franz Josef Och and Daniel Marcu. 2003.
Good, short overview of phrase-based systems. If you want more
details, see the paper below.
The Alignment Template Approach to Statistical Machine Translation.
Franz Josef Och and Hermann Ney. 2004.
(OPTIONAL) This is a journal paper discussing one phrase-based statistical system,
including decoding. This is more or less the system used at ISI and
is probably the best current system (though syntax-based systems may beat
these in the next few years). Requires Acrobat 5, and you must be at UCSD to access it.
Part 3 (Feb. 2)
Phrase-based decoding:
See the previous paper.
Syntax based translation:
What's in a Translation Rule? Galley, Hopkins, Knight and Marcu. 2004.
This is the current system being investigated at ISI and the hope is that
these syntax based systems will perform better than phrase based systems.
The paper is a bit tough to read since it's a conference paper.
A Syntax-Based Statistical Translation Model. Yamada and Knight. 2001.
(OPTIONAL) Predecessor model to Galley et al., but similar.
Syntax based decoding:
Foundations of Statistical NLP, chapter 12. Manning and Schütze. 1999.
Must be on campus. This is a chapter on parsing (not actually decoding).
However, since the above rules are very similar to PCFGs, decoding
is very similar to parsing... just with more complications.
A Decoder for Syntax-Based Statistical MT. Kenji Yamada and Kevin Knight. 2001.
(OPTIONAL) Decoder for the above Yamada and Knight model.
Part 4 (Feb. 9)
Discriminative Training:
Discriminative Training and Maximum Entropy Models for Statistical Machine Translation.
Och and Ney. 2002.
Learns how best to combine the different models (translation
model, language model, etc.) using maximum entropy parameter estimation.
This line of research is still very important and may be interesting to
many of you since it's very machine learningy.
Another paper: Minimum Error Rate Training in Statistical Machine Translation.
Och. 2003. (ACL'03)
Discriminative Reranking for Machine Translation.
Shen, Sarkar and Och. 2004. (HLT/NAACL'04)
(OPTIONAL) Given a ranked output of possible translations from the
translation system, this paper uses the perceptron algorithm to learn
a reranking of the sentences that improves the top translation.
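As a rough illustration of the discriminative idea in this section, the sketch below scores each candidate translation with a weighted sum of feature values (for example, translation model and language model log-probabilities) and learns the weights with a plain perceptron update. This is not the algorithm from any of the papers above, which use more refined variants; the function names, the feature layout, and the assumption that an "oracle" best candidate is known for each n-best list are all made up for the example.

```python
def score(weights, features):
    """Linear model: weighted sum of a candidate's feature values."""
    return sum(w * h for w, h in zip(weights, features))

def rerank(weights, nbest):
    """Index of the highest-scoring candidate in an n-best list of feature vectors."""
    return max(range(len(nbest)), key=lambda i: score(weights, nbest[i]))

def perceptron_train(data, num_features, epochs=10):
    """data: list of (nbest_feature_vectors, index_of_oracle_best_candidate)."""
    weights = [0.0] * num_features
    for _ in range(epochs):
        for nbest, oracle in data:
            guess = rerank(weights, nbest)
            if guess != oracle:
                # Move the weights toward the oracle's features and away
                # from the features of the wrongly preferred candidate.
                for k in range(num_features):
                    weights[k] += nbest[oracle][k] - nbest[guess][k]
    return weights

# Hypothetical features per candidate: [translation model logprob, language model logprob]
data = [
    ([[-1.0, -5.0], [-2.0, -1.0]], 1),
    ([[-1.0, -4.0], [-3.0, -2.0]], 1),
]
weights = perceptron_train(data, num_features=2)
```

After training on this tiny, separable example the learned weights rank the oracle candidate first in both lists; on real n-best lists the data is rarely separable, which is one reason the papers use more careful training schemes.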
MT Evaluation:
BLEU: A Method for Automatic Evaluation of Machine Translation.
Papineni, Roukos, Ward and Zhu. 2001.
Foundational method for evaluating MT systems, and still in use.
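For anyone who wants to see the shape of the metric before reading the paper, here is a minimal sketch of BLEU for a single sentence: clipped (modified) n-gram precisions combined by a geometric mean and multiplied by a brevity penalty. The paper defines the metric over a whole test corpus, so this sentence-level version, the function names, and the choice to return 0 when any n-gram order has no match are simplifying assumptions.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU sketch with uniform weights over n-gram orders."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        if not cand:
            return 0.0
        max_ref = Counter()
        for ref in references:
            max_ref |= ngrams(ref, n)   # per n-gram maximum count over references
        # Clip each candidate n-gram count by its maximum reference count.
        clipped = sum(min(c, max_ref[g]) for g, c in cand.items())
        if clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / sum(cand.values())))
    c = len(candidate)
    # Reference length closest to the candidate length (shorter one on ties).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)

reference = ["the", "cat", "is", "on", "the", "mat"]
degenerate = ["the"] * 7
unigram_score = bleu(degenerate, [reference], max_n=1)
```

The clipping is what stops the degenerate candidate from getting full credit: "the" appears seven times in the candidate but only twice in the reference, so only two of the seven unigrams count.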
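hfjiang:
It seems that only a few people associated with Och are trying this, so it should count as a fairly new direction. Also, in EBMT and RBMT, nobody seems to have tried introducing discriminative training methods yet. When we read these papers, the key thing is to look at how an idea gets modeled into an existing framework. For example, if we now want to try using discriminative training to do some work on EBMT, where is the entry point, how do we model it, and how do we run the experiments? And how do we measure performance? These are all questions that should be considered.
I hope that reading and drawing on other people's papers will provide some inspiration.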