James McKinney edited this page Jun 20, 2014 · 37 revisions


In this tutorial, we'll go step-by-step through building the AJAX Solr demo site.

Before we start, we write the HTML to which the JavaScript widgets will attach themselves. In practice, this HTML will often be the non-JavaScript version of your search interface, which you now want to improve with unobtrusive JS. The demo uses jQuery and jQuery UI.

[What we have so far]

Next part of the tutorial: Now, let's talk to Solr!

Running the demo locally

You can skip this section and still follow the tutorial.

If you use Chef, see this recipe for deploying Solr.

If you want to run a local instance of the Solr server used in this demo, download a Solr index of the Reuters data:

Replace the data directory of your Solr instance with the data directory from one of the above tarballs. Or, you can index the data yourself.

For Solr 4, you can use the configuration files distributed with AJAX Solr. If you are not using Solr 4, or want to use your own configuration files, add the following to your schema.xml in the conf directory of your Solr instance (for example solr-home/example/solr/collection1/conf/schema.xml):

<field name="places" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
<field name="countryCodes" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
<field name="topics" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
<field name="organisations" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
<field name="exchanges" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
<field name="companies" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
<field name="allText" type="text_general" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
<copyField source="title" dest="allText"/>
<copyField source="text" dest="allText"/>
<copyField source="places" dest="allText"/>
<copyField source="topics" dest="allText"/>
<copyField source="companies" dest="allText"/>
<copyField source="exchanges" dest="allText"/>

In Solr > 3.5, replace the date field definition with the following (changes type to pdate):

<field name="date" type="pdate" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />

In Solr > 3.5, add:

<field name="dateline" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />

In Solr 4, optionally add a copyField for dateline:

<copyField source="dateline" dest="allText"/>

Building the Solr index from the Reuters data

In this example you'll use two copies of Solr. One copy, solr-4x, is a modern Solr instance with the schema.xml changes described above. The other copy, solr-js-r824380, is an older Solr instance.

These partial instructions are based on the SolrJS wiki but have been modified. The commands below download the data from the Reuters-21578 Text Categorization Collection and check out old SolrJS code. The instructions don't yet include adding the Reuters data to the Solr index, because those commands have not been tested. A starting point for that follows the commands below.

svn checkout -r 824380 http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/ solr-js-r824380
cd solr-js-r824380   # remember this as the Top Level Solr Directory
ant dist             # creates jar files used in later step

If you get an error message about get-nni, remove any reference to nni-1.0.0.jar from contrib/clustering/build.xml and try again.

The old Reuters injector code is based on Solr 3x, so we need to make a slight adjustment to it before it can inject data into Solr 4x.

Back up and edit the file solr-js-r824380/client/javascript/example/reuters/importer/java/org/apache/solr/solrjs/ReutersService.java. We'll add one line of Java code, shown as line 107 below:

104     public ReutersService(String solrUrl) {
105         try {
106             this.solrServer = new CommonsHttpSolrServer(solrUrl);
107             ((CommonsHttpSolrServer) this.solrServer).setParser( new org.apache.solr.client.solrj.impl.XMLResponseParser() );  // Remove "javabin" error
108             this.solrServer.ping();
109         } catch (Exception e) {
110             throw new RuntimeException("unable to connect to solr server: " + solrUrl, e );
111         }
112     }

This change is compiled when you run the next ant command.

Return to the top-level directory solr-js-r824380 to download and import the data:

# Run from solr-js-r824380 directory
cd client/javascript/example/reuters/testdata
curl -O http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.tar.gz
tar xf reuters21578.tar.gz
cd ../../..

You should now be back in the directory solr-js-r824380/client/javascript.
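As a quick sanity check before importing, the sketch below counts the SGML files that the archive unpacked. It assumes the tarball extracts its reut2-*.sgm files directly into the testdata directory and that the Reuters-21578 distribution contains 22 of them (reut2-000.sgm to reut2-021.sgm). The function count_sgm_files is a hypothetical helper, not part of SolrJS.

```shell
# Hypothetical helper: count the Reuters SGML files in a directory.
# Assumption: the data files are named reut2-*.sgm and sit directly in "$1".
count_sgm_files() {
  n=0
  for f in "$1"/reut2-*.sgm; do
    # If the glob matched nothing, "$f" is the literal pattern and does not exist.
    if [ -e "$f" ]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}

# Run from solr-js-r824380/client/javascript
count_sgm_files example/reuters/testdata
```

A count of 0 probably means the download or extraction failed, or that you are running the command from a different directory.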

At this point, check that your Solr 4x instance is running on your local machine (localhost) on the default Solr port, 8983, and decide whether you need to set the collection name. If you're using the Solr 4x default core, collection1, you don't need to add it to the URL, and the old 3x Reuters code should work fine. However, if you're not sure, or would like to specify a different collection or core name, you can edit the ant build.xml file to change it. (Default collections are currently defined with the defaultCoreName attribute in solr-4x/example/solr/solr.xml, but this may change or be removed in future versions.)

To change the collection name (or the machine name or TCP/IP port number of your running Solr 4x server), back up and edit the file solr-js-r824380/client/javascript/build.xml so that the java task (around lines 112 and 113 of that file) looks like this, where my_collection is the name you want to use:

<java classname="org.apache.solr.solrjs.ReutersService" fork="true" dir="example/reuters/testdata">
    <arg value="http://localhost:8983/solr/my_collection" />

We're almost ready to inject data. Make sure your Solr 4x instance is running with the modified schema.xml, and make sure the machine name, port and collection name (if not the default) have been changed in build.xml.
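One way to confirm that the URL in build.xml points at a live core is to ping it with curl. This is a minimal sketch, assuming your solrconfig.xml keeps the /admin/ping request handler from the Solr 4 example configuration. The helpers solr_core_url and solr_ping are hypothetical names, and the host, port and core are placeholders for your own values.

```shell
# Hypothetical helper: build the base URL of a Solr core.
# Defaults (localhost, 8983, collection1) mirror the defaults named in this tutorial.
solr_core_url() {
  printf 'http://%s:%s/solr/%s\n' "${1:-localhost}" "${2:-8983}" "${3:-collection1}"
}

# Hypothetical helper: succeed only if the core answers its ping handler.
# Assumption: the /admin/ping handler is enabled in solrconfig.xml.
solr_ping() {
  curl -s -f --max-time 5 -o /dev/null "$(solr_core_url "$@")/admin/ping?wt=json"
}

# Print the URL that should appear in build.xml for a core named my_collection.
solr_core_url localhost 8983 my_collection
```

For example, `solr_ping localhost 8983 my_collection && echo "Solr is reachable"` prints the message only when the server responds successfully.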

Issue the command:

# Run from solr-js-r824380/client/javascript directory
ant reuters-import

If you get errors, switch to the Solr 4x window and look at the errors there. The most common mistake is not having a field defined in the 4x schema.xml.
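Because a missing field is the usual culprit, a small script can list which of the fields from this page are absent from a schema file. This is only a sketch: missing_fields is a hypothetical helper, the default path is an assumption about where your Solr 4x schema lives, and the check assumes name is the first attribute of each field element, as in the snippets above.

```shell
# Field names taken from the schema.xml additions described in this tutorial.
REQUIRED_FIELDS="places countryCodes topics organisations exchanges companies allText date dateline"

# Hypothetical helper: print every required field that has no
# <field name="..."> entry in the schema file given as "$1".
missing_fields() {
  schema="$1"
  for field in $REQUIRED_FIELDS; do
    grep -q "<field name=\"$field\"" "$schema" || echo "$field"
  done
}

# Assumed location of the Solr 4x schema; adjust to your installation.
SCHEMA="${SCHEMA:-solr-4x/example/solr/collection1/conf/schema.xml}"
if [ -f "$SCHEMA" ]; then
  missing_fields "$SCHEMA"
fi
```

No output means every listed field has a definition. Any name printed is a candidate for the error shown in the Solr window.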

If you get an error about "javabin", make sure you've made the change to ReutersService.java discussed above. Setting the server parser to XMLResponseParser allows Solr 3x clients to talk to Solr 4x servers!

The main class being run is solr-js-r824380/client/javascript/example/reuters/importer/java/org/apache/solr/solrjs/ReutersService.java, which defines the importer that the ant reuters-import command runs. The instructions above only get the data set up.
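To see whether the import actually added documents, you can ask Solr for a document count with a match-all query. The sketch below assumes the standard /select handler is available on your core. The helpers solr_count_url and solr_num_found are hypothetical names, and the core URLs are placeholders.

```shell
# Hypothetical helper: URL of a match-all query that returns only the count
# (rows=0), for the core URL given as "$1".
solr_count_url() {
  printf '%s/select?q=*:*&rows=0&wt=json\n' "${1:-http://localhost:8983/solr/collection1}"
}

# Hypothetical helper: print the "numFound" entry from the JSON response.
solr_num_found() {
  curl -s --max-time 10 "$(solr_count_url "$@")" | grep -o '"numFound":[0-9]*'
}

# Print the query URL for a core named my_collection.
solr_count_url "http://localhost:8983/solr/my_collection"
```

Running `solr_num_found` against your core after the import should report a numFound value greater than zero if documents were indexed.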

(Attribution: The demo site is based in part on the SolrJS demo site.)