Separate document corpus definition from indices #366

danielmitterdorfer · 2017-11-17T13:15:53Z

Currently the data file definition is tied to an index type. E.g.:

  "indices": [
    {
      "name": "geonames",
      "types": [
        {
          "name": "type",
          "mapping": "mappings.json",
          "documents": "documents.json.bz2",
          "document-count": 11396505,
          "compressed-bytes": 264698741,
          "uncompressed-bytes": 3547614383
        }
      ]
    }
  ]

To allow for more flexibility, we should separate the two, e.g. by defining the documents in a separate corpora block. This would also make it easier to support other formats (e.g. snapshots, see #341):

"corpora": [
  {
    "name": "geonames",
    "documents": [
      {
        "source-file": "documents.json.bz2",
        "document-count": 11396505,
        "compressed-bytes": 264698741,
        "uncompressed-bytes": 3547614383
      }
    ]
  }
]
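
As a rough illustration of the mapping between the two layouts, here is a minimal sketch that derives a corpora block from the existing per-type properties. The helper name `extract_corpora` and the rule "one corpus per index, named after the index" are assumptions made for this example, not Rally's actual implementation. The input is not modified, so the old properties stay in place.

```python
# Document-related keys that currently live on each index type (see the first example).
DOC_KEYS = ("document-count", "compressed-bytes", "uncompressed-bytes")


def extract_corpora(track):
    """Build a "corpora" list from the legacy per-type document properties.

    Hypothetical helper: the name and the one-corpus-per-index rule are
    assumptions for illustration. The input track is left untouched.
    """
    corpora = []
    for index in track.get("indices", []):
        documents = []
        for doc_type in index.get("types", []):
            if "documents" not in doc_type:
                continue  # this type has no data file attached
            entry = {"source-file": doc_type["documents"]}
            for key in DOC_KEYS:
                if key in doc_type:
                    entry[key] = doc_type[key]
            documents.append(entry)
        if documents:
            corpora.append({"name": index["name"], "documents": documents})
    return corpora


# The geonames definition from the first example, as a Python dict.
legacy_track = {
    "indices": [
        {
            "name": "geonames",
            "types": [
                {
                    "name": "type",
                    "mapping": "mappings.json",
                    "documents": "documents.json.bz2",
                    "document-count": 11396505,
                    "compressed-bytes": 264698741,
                    "uncompressed-bytes": 3547614383,
                }
            ],
        }
    ]
}

corpora = extract_corpora(legacy_track)
```

For the geonames example this yields a single corpus whose one document entry carries the same file name, count and sizes as the proposed block above.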

We will also add a parameter (called corpora) to the bulk index runner that tells it which document corpus to use.
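
A minimal sketch of how such a parameter could be resolved against the corpora block. Everything here is an assumption for illustration: the function name `select_corpora`, accepting either a single name or a list of names, and falling back to all corpora when the parameter is omitted are not taken from Rally's code.

```python
def select_corpora(corpora, names=None):
    """Return the corpora a bulk operation should read from.

    `names` stands in for the proposed "corpora" parameter (hypothetical
    semantics): None selects every corpus, a string selects one corpus,
    and a list selects several, in the order given.
    """
    if names is None:
        return list(corpora)
    if isinstance(names, str):
        names = [names]
    by_name = {corpus["name"]: corpus for corpus in corpora}
    missing = [name for name in names if name not in by_name]
    if missing:
        raise ValueError(f"Unknown corpora: {missing}")
    return [by_name[name] for name in names]


# Two illustrative corpora; "other" is a made-up second entry.
available = [
    {"name": "geonames", "documents": [{"source-file": "documents.json.bz2"}]},
    {"name": "other", "documents": [{"source-file": "other.json.bz2"}]},
]

selected = select_corpora(available, "geonames")
```

Failing loudly on an unknown name is a design choice in this sketch; it surfaces typos in a track definition before any data is indexed.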

Note: For backwards-compatibility we should not remove the old document-related properties from our standard tracks yet.

danielmitterdorfer · 2017-12-08T13:58:35Z

Note: Ensure that the scenario described in #325 is supported by our implementation.

danielmitterdorfer added the :Track Management (new operations, changes in the track format, track download changes and the like) and enhancement (improves the status quo) labels Nov 17, 2017

danielmitterdorfer added this to the 0.9.0 milestone Nov 17, 2017

danielmitterdorfer mentioned this issue Dec 8, 2017

Better support for index template(s) + multiple indices #325

Closed

danielmitterdorfer mentioned this issue Jan 2, 2018

Make challenges composable #206

Closed

danielmitterdorfer closed this as completed in ec59dbf Jan 9, 2018

danielmitterdorfer added a commit that referenced this issue Jan 9, 2018

Allow to filter by target-index in bulk operation

52e1d12

Relates #366

danielmitterdorfer added a commit to elastic/rally-tracks that referenced this issue Jan 10, 2018

Add document corpus definitions

84cc5da

Relates elastic/rally#366

danielmitterdorfer added a commit to elastic/rally-tracks that referenced this issue Jan 10, 2018

Add document corpus definitions

955b685

Relates elastic/rally#366

danielmitterdorfer added a commit to elastic/rally-tracks that referenced this issue Jan 10, 2018

Add document corpus definitions

9d64ff2

Relates elastic/rally#366

danielmitterdorfer added a commit to elastic/rally-tracks that referenced this issue Jan 10, 2018

Add document corpus definitions

e821b1b

Relates elastic/rally#366
