Lucene in 5 minutes.
from LuceneTutorial.com
Now updated for Lucene 4.0!
Lucene makes it easy to add full-text search capability to your application. In fact, it's so easy, I'm going to show you how in 5 minutes!
1. Index
For this simple case, we're going to create an in-memory index from some strings.
StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
Directory index = new RAMDirectory();
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_40, analyzer);
IndexWriter w = new IndexWriter(index, config);
addDoc(w, "Lucene in Action", "193398817");
addDoc(w, "Lucene for Dummies", "55320055Z");
addDoc(w, "Managing Gigabytes", "55063554A");
addDoc(w, "The Art of Computer Science", "9900333X");
w.close();
addDoc() is what actually adds documents to the index:
private static void addDoc(IndexWriter w, String title, String isbn) throws IOException {
    Document doc = new Document();
    doc.add(new TextField("title", title, Field.Store.YES));
    doc.add(new StringField("isbn", isbn, Field.Store.YES));
    w.addDocument(doc);
}
Note the use of TextField for content we want tokenized, and StringField for id fields and the like, which we don't want tokenized.
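To make the tokenized/untokenized distinction concrete, here is a minimal plain-Java sketch of a toy index. It is not Lucene code and not how Lucene stores anything: the class FieldIndexSketch and its methods are hypothetical, and the "analysis" is just lowercasing and splitting on whitespace, which is far simpler than what StandardAnalyzer does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical toy index (not Lucene): maps "field:term" to the ids of matching documents.
final class FieldIndexSketch {
    private final Map<String, List<Integer>> postings = new HashMap<>();
    private int nextDocId = 0;

    // Adds one document and returns its id.
    int addDoc(String title, String isbn) {
        int docId = nextDocId++;
        // "Tokenized" field: every lowercased word becomes its own searchable term.
        for (String token : title.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                post("title", token, docId);
            }
        }
        // "Untokenized" field: the whole value is indexed as one exact term.
        post("isbn", isbn, docId);
        return docId;
    }

    private void post(String field, String term, int docId) {
        List<Integer> ids = postings.computeIfAbsent(field + ":" + term, k -> new ArrayList<>());
        // Skip the id if this document was already recorded for the term.
        if (ids.isEmpty() || ids.get(ids.size() - 1) != docId) {
            ids.add(docId);
        }
    }

    // Returns the ids of documents containing exactly this term in this field.
    List<Integer> lookup(String field, String term) {
        return postings.getOrDefault(field + ":" + term, new ArrayList<>());
    }

    public static void main(String[] args) {
        FieldIndexSketch index = new FieldIndexSketch();
        index.addDoc("Lucene in Action", "193398817");
        index.addDoc("Lucene for Dummies", "55320055Z");
        System.out.println(index.lookup("title", "lucene"));   // [0, 1]
        System.out.println(index.lookup("isbn", "55320055Z")); // [1]
        System.out.println(index.lookup("isbn", "55320055"));  // []
    }
}
```

In this sketch a single word finds a title, while an isbn only matches when the whole value is given, which is the behavior you usually want for id-like fields.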
2. Query
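We read the query from the command line (falling back to "lucene" if none is given), parse it and build a Lucene Query out of it. The "title" argument is the default field, used when the query does not name a field explicitly.
String querystr = args.length > 0 ? args[0] : "lucene";
Query q = new QueryParser(Version.LUCENE_40, "title", analyzer).parse(querystr);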
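The idea of a default field can be sketched in a few lines of plain Java. This is only an illustration under simplifying assumptions, not the QueryParser grammar: the class DefaultFieldSketch and its resolve method are hypothetical, clauses are split on whitespace, and operators, phrases, and analysis are ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not Lucene): shows how a default field fills in unqualified clauses.
final class DefaultFieldSketch {
    // Returns each clause as "field:term", using defaultField where the clause names no field.
    static List<String> resolve(String defaultField, String query) {
        List<String> clauses = new ArrayList<>();
        for (String clause : query.trim().split("\\s+")) {
            if (clause.isEmpty()) {
                continue;
            }
            int colon = clause.indexOf(':');
            if (colon > 0 && colon < clause.length() - 1) {
                // The clause already has the form field:term, so keep it as written.
                clauses.add(clause);
            } else {
                // No explicit field: search the default field.
                clauses.add(defaultField + ":" + clause);
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        System.out.println(resolve("title", "lucene"));                // [title:lucene]
        System.out.println(resolve("title", "lucene isbn:193398817")); // [title:lucene, isbn:193398817]
    }
}
```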
3. Search
Using the Query, we create an IndexSearcher to search the index. Then a TopScoreDocCollector is instantiated to collect the top 10 scoring hits.
int hitsPerPage = 10;
IndexReader reader = DirectoryReader.open(index);
IndexSearcher searcher = new IndexSearcher(reader);
TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
searcher.search(q, collector);
ScoreDoc[] hits = collector.topDocs().scoreDocs;
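Collecting only the best N hits is commonly done with a small min-heap, so the searcher never has to sort every match. Here is a plain-Java sketch of that idea. It is an assumption-laden illustration, not TopScoreDocCollector itself: the class TopHitsSketch, its Hit type, and the topHits method are hypothetical, and the scores are made-up inputs.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch (not Lucene): keeps the hitsPerPage best-scoring hits using a min-heap.
final class TopHitsSketch {
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // docs[i] is a document id and scores[i] its score; returns the best hits, highest score first.
    static List<Hit> topHits(int[] docs, float[] scores, int hitsPerPage) {
        if (hitsPerPage <= 0) {
            return new ArrayList<>();
        }
        // The heap's head is the weakest hit kept so far.
        PriorityQueue<Hit> heap = new PriorityQueue<>(Comparator.comparingDouble((Hit h) -> h.score));
        for (int i = 0; i < docs.length; i++) {
            Hit hit = new Hit(docs[i], scores[i]);
            if (heap.size() < hitsPerPage) {
                heap.add(hit);
            } else if (hit.score > heap.peek().score) {
                // The new hit beats the weakest kept hit, so it takes that slot.
                heap.poll();
                heap.add(hit);
            }
        }
        List<Hit> result = new ArrayList<>(heap);
        result.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return result;
    }

    public static void main(String[] args) {
        int[] docs = {0, 1, 2, 3};
        float[] scores = {0.2f, 0.9f, 0.5f, 0.7f};
        for (Hit hit : topHits(docs, scores, 2)) {
            System.out.println("doc " + hit.doc); // doc 1, then doc 3
        }
    }
}
```

The heap never holds more than hitsPerPage entries, so memory stays small no matter how many documents match.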
4. Display
Now that we have results from our search, we display the results to the user.
for (int i = 0; i < hits.length; ++i) {
    int docId = hits[i].doc;
    Document d = searcher.doc(docId);
    System.out.println((i + 1) + ". " + d.get("isbn") + "\t" + d.get("title"));
}
Here's the app in its entirety (HelloLucene.java):
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

import java.io.IOException;

public class HelloLucene {
    public static void main(String[] args) throws IOException, ParseException {
        // 0. Specify the analyzer for tokenizing text.
        //    The same analyzer should be used for indexing and searching
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);

        // 1. create the index
        Directory index = new RAMDirectory();
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_40, analyzer);
        IndexWriter w = new IndexWriter(index, config);
        addDoc(w, "Lucene in Action", "193398817");
        addDoc(w, "Lucene for Dummies", "55320055Z");
        addDoc(w, "Managing Gigabytes", "55063554A");
        addDoc(w, "The Art of Computer Science", "9900333X");
        w.close();

        // 2. query
        String querystr = args.length > 0 ? args[0] : "lucene";
        // the "title" arg specifies the default field to use
        // when no field is explicitly specified in the query.
        Query q = new QueryParser(Version.LUCENE_40, "title", analyzer).parse(querystr);

        // 3. search
        int hitsPerPage = 10;
        IndexReader reader = DirectoryReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);
        TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
        searcher.search(q, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;

        // 4. display results
        System.out.println("Found " + hits.length + " hits.");
        for (int i = 0; i < hits.length; ++i) {
            int docId = hits[i].doc;
            Document d = searcher.doc(docId);
            System.out.println((i + 1) + ". " + d.get("isbn") + "\t" + d.get("title"));
        }

        // reader can only be closed when there
        // is no need to access the documents any more.
        reader.close();
    }

    private static void addDoc(IndexWriter w, String title, String isbn) throws IOException {
        Document doc = new Document();
        doc.add(new TextField("title", title, Field.Store.YES));
        // use a string field for isbn because we don't want it tokenized
        doc.add(new StringField("isbn", isbn, Field.Store.YES));
        w.addDocument(doc);
    }
}