Eliminate linux/window tabs for bin/solr post. (#2579)
Also use the same style of formatting for the commands.
epugh authored Jul 22, 2024
1 parent b87242e commit a439ebf
Showing 9 changed files with 51 additions and 132 deletions.
@@ -435,7 +435,7 @@ You should get a response that looks like this:

Use `bin/solr post` to index some example documents to the SolrCloud collection created above:

[source,bash]
[source,console]
----
$ bin/solr post --solr-update-url https://localhost:8984/solr/mycollection/update example/exampledocs/*.xml
----
@@ -445,9 +445,9 @@ $ bin/solr post --solr-update-url https://localhost:8984/solr/mycollection/updat
Use curl to query the SolrCloud collection created above, from a directory containing the PEM-formatted certificate and key created above (e.g., `example/etc/`).
If you have not enabled client authentication (system property `-Djetty.ssl.clientAuth=true`), then you can remove the `-E solr-ssl.pem:secret` option:

[source,bash]
[source,console]
----
curl -E solr-ssl.pem:secret --cacert solr-ssl.pem "https://localhost:8984/solr/mycollection/select?q=*:*"
$ curl -E solr-ssl.pem:secret --cacert solr-ssl.pem "https://localhost:8984/solr/mycollection/select?q=*:*"
----
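
If client authentication is not enabled, the same query simply drops the `-E solr-ssl.pem:secret` option:

[,console]
----
$ curl --cacert solr-ssl.pem "https://localhost:8984/solr/mycollection/select?q=*:*"
----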

=== Index a Document using CloudSolrClient
@@ -222,72 +222,29 @@ Pick one of the formats and index it into the "films" collection (in each exampl
.To Index JSON Format
[tabs#index-json]
======
Linux/Mac::
+
====
[,console]
----
$ bin/solr post -c films example/films/films.json
----
====
Windows::
+
====
[,console]
----
$ bin/solr post -c films example\films\films.json
----
====
======


.To Index XML Format
[tabs#index-xml]
======
Linux/Mac::
+
====
[,console]
----
$ bin/solr post -c films example/films/films.xml
----
====
Windows::
+
====
[,console]
----
$ bin/solr post -c films example\films\films.xml
----
====
======


.To Index CSV Format
[tabs#index-csv]
======
Linux/Mac::
+
====
[,console]
----
$ bin/solr post -c films example/films/films.csv -params "f.genre.split=true&f.directed_by.split=true&f.genre.separator=|&f.directed_by.separator=|"
----
====
Windows::
+
====
[,console]
----
$ bin/solr post -c films example\films\films.csv -params "f.genre.split=true&f.directed_by.split=true&f.genre.separator=|&f.directed_by.separator=|"
----
====
======

Each command includes these main parameters:
@@ -69,26 +69,10 @@ and for the release date also to be single valued.

Now that we have updated our Schema, we need to index the sample film data or, if you have already indexed it, re-index it to take advantage of the new field definitions we added.

[tabs#index-json]
======
Linux/Mac::
+
====
[,console]
----
$ bin/solr post -c films example/films/films.json
----
====
Windows::
+
====
[,console]
----
$ bin/solr post -c films example\films\films.json
----
====
======

=== Let's get Searching!

@@ -167,18 +167,11 @@ You'll need a command shell to run some of the following examples, rooted in the
The data we will index is in the `example/exampledocs` directory.
The documents are in a mix of document formats (JSON, CSV, etc.), and fortunately we can index them all at once:

.Linux/Mac
[,console]
----
$ bin/solr post -c techproducts example/exampledocs/*
----

.Windows
[,console]
----
$ bin/solr post -c techproducts example\exampledocs\*
----

You should see output similar to the following:

[,console]
@@ -75,27 +75,10 @@ $ curl http://localhost:8983/solr/films/schema -X POST -H 'Content-type:applicat

We have the vectors embedded in our `films.json` file, so let's index that data, taking advantage of the new schema field we just defined.

[tabs#index-json]
======
Linux/Mac::
+
====
[,console]
----
$ bin/solr post -c films example/films/films.json
----
====
Windows::
+
====
[,console]
----
$ bin/solr post -c films example\films\films.json
----
====
======

=== Let's do some Vector searches
Before making the queries, we define an example target vector, simulating a person that
@@ -126,9 +126,9 @@ Note this includes the path, so if you upload a different file, always be sure t

You can also use `bin/solr post` to do the same thing:

[source,bash]
[,console]
----
bin/solr post -c gettingstarted example/exampledocs/solr-word.pdf -params "literal.id=doc1"
$ bin/solr post -c gettingstarted example/exampledocs/solr-word.pdf -params "literal.id=doc1"
----

Now you can execute a query and find that document with a request like `\http://localhost:8983/solr/gettingstarted/select?q=pdf`.
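
For example, a quick check with curl (assuming the default port and collection above):

[,console]
----
$ curl "http://localhost:8983/solr/gettingstarted/select?q=pdf"
----
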
@@ -146,9 +146,9 @@ The dynamic field `ignored_*` is good for this purpose.
For the fields you do want to map, explicitly set them using `fmap.IN=OUT` and/or ensure the field is defined in the schema.
Here's an example:

[source,bash]
[,console]
----
bin/solr post -c gettingstarted example/exampledocs/solr-word.pdf -params "literal.id=doc1&uprefix=ignored_&fmap.last_modified=last_modified_dt"
$ bin/solr post -c gettingstarted example/exampledocs/solr-word.pdf -params "literal.id=doc1&uprefix=ignored_&fmap.last_modified=last_modified_dt"
----

[NOTE]
@@ -561,18 +561,18 @@ If `literalsOverride=false`, literals will be appended as multi-value to the Tik

The command below captures `<div>` tags separately (`capture=div`), and then maps all the instances of that field to a dynamic field named `foo_t` (`fmap.div=foo_t`).

[source,bash]
[,console]
----
bin/solr post -c gettingstarted example/exampledocs/sample.html -params "literal.id=doc2&captureAttr=true&defaultField=_text_&fmap.div=foo_t&capture=div"
$ bin/solr post -c gettingstarted example/exampledocs/sample.html -params "literal.id=doc2&captureAttr=true&defaultField=_text_&fmap.div=foo_t&capture=div"
----

=== Using Literals to Define Custom Metadata

To add in your own metadata, pass in the literal parameter along with the file:

[source,bash]
[,console]
----
bin/solr post -c gettingstarted -params "literal.id=doc4&captureAttr=true&defaultField=text&capture=div&fmap.div=foo_t&literal.blah_s=Bah" example/exampledocs/sample.html
$ bin/solr post -c gettingstarted -params "literal.id=doc4&captureAttr=true&defaultField=text&capture=div&fmap.div=foo_t&literal.blah_s=Bah" example/exampledocs/sample.html
----

The parameter `literal.blah_s=Bah` will insert a field `blah_s` into every document.
@@ -582,9 +582,9 @@ Every instance of the text will be "Bah".
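
An illustrative way to confirm the literal was applied (assuming the `gettingstarted` defaults above) is to query for it:

[,console]
----
$ curl "http://localhost:8983/solr/gettingstarted/select?q=blah_s:Bah"
----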

The example below passes in an XPath expression to restrict the XHTML returned by Tika:

[source,bash]
[,console]
----
bin/solr post -c gettingstarted -params "literal.id=doc5&captureAttr=true&defaultField=text&capture=div&fmap.div=foo_t&xpath=/xhtml:html/xhtml:body/xhtml:div//node()" example/exampledocs/sample.html
$ bin/solr post -c gettingstarted -params "literal.id=doc5&captureAttr=true&defaultField=text&capture=div&fmap.div=foo_t&xpath=/xhtml:html/xhtml:body/xhtml:div//node()" example/exampledocs/sample.html
----

=== Extracting Data without Indexing
@@ -601,9 +601,9 @@ curl "http://localhost:8983/solr/gettingstarted/update/extract?&extractOnly=true

The output includes the XML generated by Tika (further escaped in Solr's XML response), shown here with the Ruby response writer (`wt=ruby`) to make it more readable (`-out yes` instructs the tool to echo Solr's output to the console):

[source,bash]
[,console]
----
bin/solr post -c gettingstarted -params "extractOnly=true&wt=ruby&indent=true" -out yes example/exampledocs/sample.html
$ bin/solr post -c gettingstarted -params "extractOnly=true&wt=ruby&indent=true" -out yes example/exampledocs/sample.html
----

=== Using Solr Cell with a POST Request
52 changes: 26 additions & 26 deletions solr/solr-ref-guide/modules/indexing-guide/pages/post-tool.adoc
@@ -110,48 +110,48 @@ This section presents several examples.

Index all JSON files into `gettingstarted`.

[source,bash]
[,console]
----
bin/solr post -url http://localhost:8983/solr/gettingstarted/update *.json
$ bin/solr post -url http://localhost:8983/solr/gettingstarted/update *.json
----

=== Indexing XML

Add all documents with file extension `.xml` to the collection named `gettingstarted`.

[source,bash]
[,console]
----
bin/solr post -url http://localhost:8983/solr/gettingstarted/update *.xml
$ bin/solr post -url http://localhost:8983/solr/gettingstarted/update *.xml
----

Add all documents starting with `article` with file extension `.xml` to the `gettingstarted` collection on Solr running on port `8984`.

[source,bash]
[,console]
----
bin/solr post -url http://localhost:8984/solr/gettingstarted/update article*.xml
$ bin/solr post -url http://localhost:8984/solr/gettingstarted/update article*.xml
----

Send XML arguments to delete a document from `gettingstarted`.

[source,bash]
[,console]
----
bin/solr post -url http://localhost:8983/solr/gettingstarted/update -mode args -type application/xml '<delete><id>42</id></delete>'
$ bin/solr post -url http://localhost:8983/solr/gettingstarted/update -mode args -type application/xml '<delete><id>42</id></delete>'
----
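
The JSON equivalent is a small variation; as a sketch (not one of the stock examples), the same flags work with Solr's JSON delete syntax:

[,console]
----
$ bin/solr post -url http://localhost:8983/solr/gettingstarted/update -mode args -type application/json '{"delete":{"id":"42"}}'
----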

=== Indexing CSV and JSON

Index all CSV and JSON files into `gettingstarted` from current directory:

[source,bash]
[,console]
----
bin/solr post -c gettingstarted -filetypes json,csv .
$ bin/solr post -c gettingstarted -filetypes json,csv .
----

Index a tab-separated file into `gettingstarted`:

[source,bash]
[,console]
----
bin/solr post -url http://localhost:8984/solr/signals/update -params "separator=%09" -type text/csv data.tsv
$ bin/solr post -url http://localhost:8984/solr/signals/update -params "separator=%09" -type text/csv data.tsv
----

The content type (`-type`) parameter is required so the file is treated as the proper type; otherwise it will be ignored and a WARNING logged, since the tool does not know what type of content a `.tsv` file is.
@@ -161,32 +161,32 @@ The xref:indexing-with-update-handlers.adoc#csv-formatted-index-updates[CSV hand

Index a PDF file into `gettingstarted`.

[source,bash]
[,console]
----
bin/solr post -url http://localhost:8983/solr/gettingstarted/update a.pdf
$ bin/solr post -url http://localhost:8983/solr/gettingstarted/update a.pdf
----

Automatically detect content types in a folder, and recursively scan it for documents to index into `gettingstarted`.

[source,bash]
[,console]
----
bin/solr post -url http://localhost:8983/solr/gettingstarted/update afolder/
$ bin/solr post -url http://localhost:8983/solr/gettingstarted/update afolder/
----

Automatically detect content types in a folder, but limit the scan to PPT and HTML files, and index them into `gettingstarted`.

[source,bash]
[,console]
----
bin/solr post -url http://localhost:8983/solr/gettingstarted/update -filetypes ppt,html afolder/
$ bin/solr post -url http://localhost:8983/solr/gettingstarted/update -filetypes ppt,html afolder/
----

=== Indexing to a Password Protected Solr (Basic Auth)

Index a PDF as the user "solr" with password "SolrRocks":

[source,bash]
[,console]
----
bin/solr post -u solr:SolrRocks -url http://localhost:8983/solr/gettingstarted/update a.pdf
$ bin/solr post -u solr:SolrRocks -url http://localhost:8983/solr/gettingstarted/update a.pdf
----

=== Crawling a Website to Index Documents
Expand All @@ -195,26 +195,26 @@ Crawl the Apache Solr website going one layer deep and indexing the pages into S

See xref:indexing-with-tika.adoc#trying-out-solr-cell[Trying Out Solr Cell] to learn more about setting up Solr for extracting content from web pages.

[source,bash]
[,console]
----
bin/solr post -mode web -c gettingstarted -recursive 1 -delay 1 https://solr.apache.org/
$ bin/solr post -mode web -c gettingstarted -recursive 1 -delay 1 https://solr.apache.org/
----

=== Standard Input as Source for Indexing

You can use the standard input as your source for data to index.
Notice the `-out` option, which echoes Solr's raw responses to the console.

[source,bash]
[,console]
----
echo '{commit: {}}' | bin/solr post -mode stdin -url http://localhost:8983/my_collection/update -out
$ echo '{commit: {}}' | bin/solr post -mode stdin -url http://localhost:8983/my_collection/update -out
----

=== Raw Data as Source for Indexing

Provide the raw document as a string for indexing.

[source,bash]
[,console]
----
bin/solr post -url http://localhost:8983/signals/update -mode args -type text/csv -out $'id,value\n1,0.47'
$ bin/solr post -url http://localhost:8983/signals/update -mode args -type text/csv -out $'id,value\n1,0.47'
----
@@ -70,8 +70,10 @@ However, it's much bulkier than the raw coordinates for such simple data.

Using the `bin/solr post` tool:

[source,text]
bin/solr post -type "application/json" -url "http://localhost:8983/solr/mycollection/update?format=geojson" /path/to/geojson.file
[,console]
----
$ bin/solr post -type "application/json" -url "http://localhost:8983/solr/mycollection/update?format=geojson" /path/to/geojson.file
----

The key parameter to pass in with your request is:
