Error in Lucene text search
I am new to text search and am working through some Lucene examples. I found one at this link: http://javatechniques.com/blog/lucene-in-memory-text-search-example/ I tried it in my Eclipse IDE, but it gives some errors. I have also imported all the jar files.
Here is the code:
public class InMemoryExample {

    public static void main(String[] args) {
        // Construct a RAMDirectory to hold the in-memory representation
        // of the index.
        RAMDirectory idx = new RAMDirectory();

        try {
            // Make a writer to create the index
            IndexWriter writer = new IndexWriter(idx,
                    new StandardAnalyzer(Version.LUCENE_48),
                    IndexWriter.MaxFieldLength.LIMITED);

            // Add some Document objects containing quotes
            writer.addDocument(createDocument("Theodore Roosevelt",
                    "It behooves every man to remember that the work of the "
                    + "critic, is of altogether secondary importance, and that, "
                    + "in the end, progress is accomplished by the man who does "
                    + "things."));
            writer.addDocument(createDocument("Friedrich Hayek",
                    "The case for individual freedom rests largely on the "
                    + "recognition of the inevitable and universal ignorance "
                    + "of all of us concerning a great many of the factors on "
                    + "which the achievements of our ends and welfare depend."));
            writer.addDocument(createDocument("Ayn Rand",
                    "There is nothing to take a man's freedom away from "
                    + "him, save other men. To be free, a man must be free "
                    + "of his brothers."));
            writer.addDocument(createDocument("Mohandas Gandhi",
                    "Freedom is not worth having if it does not connote "
                    + "freedom to err."));

            // Optimize and close the writer to finish building the index
            writer.optimize();
            writer.close();

            // Build an IndexSearcher using the in-memory index
            Searcher searcher = new IndexSearcher(idx);

            // Run some queries
            search(searcher, "freedom");
            search(searcher, "free");
            search(searcher, "progress or achievements");

            searcher.close();
        } catch (IOException ioe) {
            // In this example we aren't really doing an I/O, so this
            // exception should never actually be thrown.
            ioe.printStackTrace();
        } catch (ParseException pe) {
            pe.printStackTrace();
        }
    }

    /**
     * Make a Document object with an un-indexed title field and an
     * indexed content field.
     */
    private static Document createDocument(String title, String content) {
        Document doc = new Document();

        // Add the title as an unindexed field...
        doc.add(new Field("title", title, Field.Store.YES, Field.Index.NO));

        // ...and the content as an indexed field. Note that indexed
        // Text fields are constructed using a Reader. Lucene can read
        // and index very large chunks of text, without storing the
        // entire content verbatim in the index. In this example we
        // can just wrap the content string in a StringReader.
        doc.add(new Field("content", content, Field.Store.YES, Field.Index.ANALYZED));

        return doc;
    }

    /**
     * Searches for the given string in the "content" field
     */
    private static void search(Searcher searcher, String queryString)
            throws ParseException, IOException {
        // Build a Query object
        // Query query = QueryParser.parse(
        QueryParser parser = new QueryParser("content",
                new StandardAnalyzer(Version.LUCENE_48));
        Query query = parser.parse(queryString);

        int hitsPerPage = 10;

        // Search for the query
        TopScoreDocCollector collector = TopScoreDocCollector.create(5 * hitsPerPage, false);
        searcher.search(query, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;

        int hitCount = collector.getTotalHits();
        System.out.println(hitCount + " total matching documents");

        // Examine the Hits object to see if there were any matches
        if (hitCount == 0) {
            System.out.println("No matches were found for \"" + queryString + "\"");
        } else {
            System.out.println("Hits for \"" + queryString
                    + "\" were found in quotes by:");

            // Iterate over the Documents in the Hits object
            for (int i = 0; i < hitCount; i++) {
                // Document doc = hits.doc(i);
                ScoreDoc scoreDoc = hits[i];
                int docId = scoreDoc.doc;
                float docScore = scoreDoc.score;
                System.out.println("docId: " + docId + "\t" + "docScore: " + docScore);

                Document doc = searcher.doc(docId);

                // Print the value that we stored in the "title" field. Note
                // that this Field was not indexed, but (unlike the
                // "contents" field) was stored verbatim and can be
                // retrieved.
                System.out.println(" " + (i + 1) + ". " + doc.get("title"));
                System.out.println("Content: " + doc.get("content"));
            }
        }
        System.out.println();
    }
}
But it shows several syntax errors on the following lines:
Error 1: MaxFieldLength is underlined in red
IndexWriter writer = new IndexWriter(idx, new StandardAnalyzer(Version.LUCENE_48), IndexWriter.MaxFieldLength.LIMITED);
Error 2: optimize() is underlined in red
writer.optimize();
Error 3: new IndexSearcher(idx) is underlined in red
Searcher searcher = new IndexSearcher(idx);
Error 4: search is underlined in red
searcher.search(query, collector);
Could you help me get rid of these errors? It would be a great help. Thanks.
Modified code:
public class InMemoryExample {

    public static void main(String[] args) throws Exception {
        // Construct a RAMDirectory to hold the in-memory representation
        // of the index.
        RAMDirectory idx = new RAMDirectory();

        // Make a writer to create the index
        IndexWriterConfig cfg = new IndexWriterConfig(Version.LUCENE_48,
                new StandardAnalyzer(Version.LUCENE_48));
        IndexWriter writer = new IndexWriter(idx, cfg);

        // Add some Document objects containing quotes
        writer.addDocument(createDocument("Theodore Roosevelt",
                "It behooves every man to remember that the work of the "
                + "critic, is of altogether secondary importance, and that, "
                + "in the end, progress is accomplished by the man who does "
                + "things."));
        writer.addDocument(createDocument("Friedrich Hayek",
                "The case for individual freedom rests largely on the "
                + "recognition of the inevitable and universal ignorance "
                + "of all of us concerning a great many of the factors on "
                + "which the achievements of our ends and welfare depend."));
        writer.addDocument(createDocument("Ayn Rand",
                "There is nothing to take a man's freedom away from "
                + "him, save other men. To be free, a man must be free "
                + "of his brothers."));
        writer.addDocument(createDocument("Mohandas Gandhi",
                "Freedom is not worth having if it does not connote "
                + "freedom to err."));

        // Commit and close the writer to finish building the index
        writer.commit();
        writer.close();

        // Build an IndexSearcher using the in-memory index
        IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(idx));

        // Run some queries
        search(searcher, "freedom");
        search(searcher, "free");
        search(searcher, "progress or achievements");

        // searcher.close();
    }

    /**
     * Make a Document object with an un-indexed title field and an
     * indexed content field.
     */
    private static Document createDocument(String title, String content) {
        Document doc = new Document();

        // Add the title as an unindexed field...
        doc.add(new Field("title", title, Field.Store.YES, Field.Index.NO));

        // ...and the content as an indexed field. Note that indexed
        // Text fields are constructed using a Reader. Lucene can read
        // and index very large chunks of text, without storing the
        // entire content verbatim in the index. In this example we
        // can just wrap the content string in a StringReader.
        doc.add(new Field("content", content, Field.Store.YES, Field.Index.ANALYZED));

        return doc;
    }

    /**
     * Searches for the given string in the "content" field
     */
    private static void search(IndexSearcher searcher, String queryString)
            throws ParseException, IOException {
        // Build a Query object
        // Query query = QueryParser.parse(
        QueryParser parser = new QueryParser("content",
                new StandardAnalyzer(Version.LUCENE_48));
        Query query = parser.parse(queryString);

        int hitsPerPage = 10;

        // Search for the query
        TopScoreDocCollector collector = TopScoreDocCollector.create(5 * hitsPerPage, false);
        searcher.search(query, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;

        int hitCount = collector.getTotalHits();
        System.out.println(hitCount + " total matching documents");

        // Examine the Hits object to see if there were any matches
        if (hitCount == 0) {
            System.out.println("No matches were found for \"" + queryString + "\"");
        } else {
            System.out.println("Hits for \"" + queryString
                    + "\" were found in quotes by:");

            // Iterate over the Documents in the Hits object
            for (int i = 0; i < hitCount; i++) {
                // Document doc = hits.doc(i);
                ScoreDoc scoreDoc = hits[i];
                int docId = scoreDoc.doc;
                float docScore = scoreDoc.score;
                System.out.println("docId: " + docId + "\t" + "docScore: " + docScore);

                Document doc = searcher.doc(docId);

                // Print the value that we stored in the "title" field. Note
                // that this Field was not indexed, but (unlike the
                // "contents" field) was stored verbatim and can be
                // retrieved.
                System.out.println(" " + (i + 1) + ". " + doc.get("title"));
                System.out.println("Content: " + doc.get("content"));
            }
        }
        System.out.println();
    }
}
and this is the output:
Exception in thread "main" java.lang.VerifyError: class org.apache.lucene.analysis.SimpleAnalyzer overrides final method tokenStream.(Ljava/lang/String;Ljava/io/Reader;)Lorg/apache/lucene/analysis/TokenStream;
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(Unknown Source)
    at java.security.SecureClassLoader.defineClass(Unknown Source)
    at java.net.URLClassLoader.defineClass(Unknown Source)
    at java.net.URLClassLoader.access$100(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at beehex.inmemeory.textsearch.InMemoryExample.search(InMemoryExample.java:98)
    at beehex.inmemeory.textsearch.InMemoryExample.main(InMemoryExample.java:58)
1 Answer
I don't see a third argument on the IndexWriter constructor. You should change the code to match the new Lucene API as follows:
IndexWriterConfig cfg = new IndexWriterConfig(Version.LUCENE_48, new StandardAnalyzer(Version.LUCENE_48));
IndexWriter writer = new IndexWriter(idx, cfg);
Also, instead of catching the exception here, I would rather have the main method throw Exception and let the program fail.
EDIT:
2) Remove the optimize call, as the IndexWriter class no longer has this method (I think commit will do the trick here).
3) Define the IndexSearcher like so:
IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(idx));
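As for the VerifyError in the output of the modified code: an "overrides final method" error raised while a class is being loaded often means that jars from different versions of the same library are mixed on the classpath. Here that could be an older Lucene jar sitting next to the 4.8 ones, but that is an assumption about your setup, not something the question confirms. The sketch below uses only the JDK to list every classpath location that contains a given class file, without loading the class. The class name ClassOrigin and the method locations are made up for illustration.

```java
import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ClassOrigin {

    /**
     * Returns every location on the classpath that contains the .class file
     * for the given fully qualified class name. The class itself is never
     * loaded, so a class that fails verification can still be located.
     */
    static List<String> locations(String className) throws IOException {
        String resource = className.replace('.', '/') + ".class";
        List<String> found = new ArrayList<>();
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        for (URL url : Collections.list(loader.getResources(resource))) {
            found.add(url.toString());
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // The second name comes from the stack trace; the first is its base class.
        String[] names = {
            "org.apache.lucene.analysis.Analyzer",
            "org.apache.lucene.analysis.SimpleAnalyzer"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locations(name));
        }
    }
}
```

Run it with the same classpath as the project. If the two classes turn out to come from jars with different Lucene version numbers, removing the older jar from the Eclipse build path would be the first thing to try.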