Full text search
Anastasios Zouzias edited this page Sep 17, 2016 · 1 revision
A few examples of text search with LuceneRDD
Download and install Apache Spark locally.
Set your SPARK_HOME environment variable to the directory where you extracted Spark. For example, with Spark 1.5.2 extracted in your home directory, run
HOME_DIR=`echo ~`
export SPARK_HOME=${HOME_DIR}/spark-1.5.2-bin-2.6.0
./spark-shell.sh # Starts spark shell using spark-lucenerdd JAR
Now LuceneRDD is available in the Spark shell. In the Spark shell, type
scala> :load scripts/loadWords.scala
to instantiate a LuceneRDD[String] object containing the words from src/test/resources/words.txt.
To perform an exact term query, do
scala> val results = luceneRDD.termQuery("_1", "hello", 10)
scala> results.foreach(println)
SparkScoreDoc(12.393539,129848,0,Numeric fields:Text fields:_1:[hello])
...
To perform a prefix query, do
scala> val results = luceneRDD.prefixQuery("_1", "hel", 10)
scala> results.foreach(println)
SparkScoreDoc(1.0,129618,0,Numeric fields:Text fields:_1:[held])
SparkScoreDoc(1.0,129617,0,Numeric fields:Text fields:_1:[helcotic])
SparkScoreDoc(1.0,129616,0,Numeric fields:Text fields:_1:[helcosis])
...
To perform a fuzzy query, do
scala> val results = luceneRDD.fuzzyQuery("_1", "aba", 1)
scala> results.foreach(println)
SparkScoreDoc(7.155413,175248,0,Numeric fields:Text fields:_1:[yaba])
SparkScoreDoc(7.155413,33820,0,Numeric fields:Text fields:_1:[paba])
...
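To see why words such as yaba and paba match the query aba, it can help to look at edit distance directly. The following is a minimal plain-Scala sketch, independent of Spark and LuceneRDD, and not how Lucene implements fuzzy matching internally. It assumes that the third argument of fuzzyQuery above is the maximum number of allowed edits, and it uses plain Levenshtein distance (insert, delete, substitute). The object name FuzzySketch and the small word list are hypothetical, chosen only for illustration.

```scala
object FuzzySketch {
  // Levenshtein distance between a and b, computed one row at a time.
  def editDistance(a: String, b: String): Int = {
    // Row 0: turning the empty prefix of a into b's first j chars takes j insertions.
    var prev: Array[Int] = (0 to b.length).toArray
    for (i <- 1 to a.length) {
      val curr = new Array[Int](b.length + 1)
      curr(0) = i // deleting the first i chars of a
      for (j <- 1 to b.length) {
        val cost = if (a(i - 1) == b(j - 1)) 0 else 1
        val insertion = curr(j - 1) + 1
        val deletion = prev(j) + 1
        val substitution = prev(j - 1) + cost
        curr(j) = math.min(math.min(insertion, deletion), substitution)
      }
      prev = curr
    }
    prev(b.length)
  }

  // Keep only the words within maxEdits of the query (hypothetical helper).
  def fuzzyMatches(words: Seq[String], query: String, maxEdits: Int): Seq[String] =
    words.filter(w => editDistance(query, w) <= maxEdits)

  def main(args: Array[String]): Unit = {
    // Hypothetical sample; the real index holds the words from words.txt.
    val words = Seq("yaba", "paba", "hello", "held")
    fuzzyMatches(words, "aba", 1).foreach(println)
  }
}
```

With this sample list, yaba and paba are kept because each is one insertion away from aba, while hello and held are more than one edit away and are dropped.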