commit a0112d59dd7031ac5df7cce68b90a967c2522c4e
parent d1f3039dc6a5e98f62fe886f5c5633043b7f80a6
Author: Dan Callaghan <djc@djc.id.au>
Date: Tue, 17 Jun 2014 23:31:54 +1000
add a README
Diffstat:
1 file changed, 9 insertions(+), 0 deletions(-)
diff --git a/README.md b/README.md
@@ -0,0 +1,9 @@
+Utilities for working with multilingual text in Lucene.
+
+``CyrillicTransliteratingFilter`` injects a Latin transliteration at the
+same position as each token containing Cyrillic characters. For example,
+this makes it possible to match the text ``Pasternak’s Повесть`` with
+the query ``pasternak's povest``.
+
+``XMLTokenizer`` tokenizes an XML document, using a different Analyzer
+for each language in the document, as identified by the ``lang`` attribute.