Add analysis components and mapping types to the usage API. #51031

Merged
merged 3 commits into from
Jan 15, 2020

Conversation

jpountz
Contributor

@jpountz jpountz commented Jan 15, 2020

Knowing about used analysis components and mapping types would be incredibly
useful in order to know which ones may be deprecated or should get more love.

Some field types also act as a proxy to know about feature usage of some APIs
like the `percolator` or `completion` field types for percolation and the
completion suggester, respectively.

@jpountz jpountz added the >enhancement and :Data Management/Stats labels Jan 15, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (:Core/Features/Stats)

Contributor

@jimczi jimczi left a comment

I left some minor comments regarding the extracted components, but it looks great!

for (IndexMetaData indexMetaData : state.metaData()) {
    MappingMetaData mappingMetaData = indexMetaData.mapping();
    if (mappingMetaData != null) {
        populateFieldTypesFromObject(mappingMetaData.sourceAsMap(), usedFieldTypes);

Should we also extract the analyzers set on the fields? This could be useful for checking the usage of the pre-built analyzers.
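
For readers following the thread, here is a minimal sketch of the kind of mapping traversal being discussed. It is not the code from this PR: the class and method names (`FieldTypeUsageSketch`, `collectFromObject`, `collectFromFields`) are hypothetical, and it assumes the mapping source is a plain nested `Map` with `properties`, `type`, and `fields` keys, and that an untyped field with sub-properties should be counted as `object`.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; names and behavior are assumptions, not the PR's code.
class FieldTypeUsageSketch {

    /** Collects the field types declared under the "properties" of a mapping or object field. */
    static void collectFromObject(Map<String, ?> mapping, Set<String> usedFieldTypes) {
        Object properties = mapping.get("properties");
        if (properties instanceof Map) {
            collectFromFields((Map<?, ?>) properties, usedFieldTypes);
        }
    }

    /** Visits each field definition, records its type, then descends into multi-fields and sub-objects. */
    private static void collectFromFields(Map<?, ?> fields, Set<String> usedFieldTypes) {
        for (Object value : fields.values()) {
            if (!(value instanceof Map)) {
                continue; // skip anything that is not a field definition
            }
            @SuppressWarnings("unchecked")
            Map<String, ?> field = (Map<String, ?>) value;
            Object type = field.get("type");
            if (type != null) {
                usedFieldTypes.add(type.toString());
            } else if (field.containsKey("properties")) {
                // Assumption: a field without an explicit type but with sub-properties is an object.
                usedFieldTypes.add("object");
            }
            Object multiFields = field.get("fields");
            if (multiFields instanceof Map) {
                collectFromFields((Map<?, ?>) multiFields, usedFieldTypes);
            }
            collectFromObject(field, usedFieldTypes);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> mapping = Map.of("properties", Map.of(
            "title", Map.of("type", "text", "fields", Map.of("raw", Map.of("type", "keyword"))),
            "query", Map.of("type", "percolator")));
        Set<String> used = new TreeSet<>();
        collectFromObject(mapping, used);
        System.out.println(used);
    }
}
```

Collecting into a set only answers "is this type used anywhere"; counting occurrences per type would need a map from type name to count instead.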

aggregateAnalysisTypes(tokenFilterSettings.values(), usedTokenFilters);

Map<String, Settings> analyzerSettings = indexSettings.getGroups("index.analysis.analyzer");
aggregateAnalysisTypes(analyzerSettings.values(), usedAnalyzers);

Should we also extract the pre-built tokenizers, filters and char_filters from custom analyzers?
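
To make the suggestion concrete, here is a rough sketch of both ideas: aggregating the `type` of each configured component, and picking out token filters that an analyzer references by name without defining them in the index settings. It is an illustration under stated assumptions, not this PR's implementation: plain `Map`s stand in for the `Settings` groups, the names `AnalysisUsageSketch`, `aggregateTypes`, and `aggregateBuiltInFilters` are hypothetical, and treating "referenced but not defined in the index" as "built-in" is an assumption made for the example.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; plain maps stand in for Elasticsearch Settings groups.
class AnalysisUsageSketch {

    /** Adds the "type" of every component definition (analyzer, tokenizer, filter, ...) to the set. */
    static void aggregateTypes(Collection<? extends Map<String, ?>> components, Set<String> usedTypes) {
        for (Map<String, ?> component : components) {
            Object type = component.get("type");
            if (type != null) {
                usedTypes.add(type.toString());
            }
        }
    }

    /**
     * Adds every token filter that an analyzer references by name but that is not
     * defined in the index settings. Assumption: such names refer to built-in filters.
     */
    static void aggregateBuiltInFilters(Collection<? extends Map<String, ?>> analyzers,
                                        Set<String> customFilterNames,
                                        Set<String> usedBuiltInFilters) {
        for (Map<String, ?> analyzer : analyzers) {
            Object filters = analyzer.get("filter");
            if (filters instanceof List) {
                for (Object name : (List<?>) filters) {
                    if (!customFilterNames.contains(name.toString())) {
                        usedBuiltInFilters.add(name.toString());
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> filters = Map.of("my_stop", Map.of("type", "stop"));
        Map<String, Map<String, Object>> analyzers = Map.of("my_analyzer",
            Map.of("type", "custom", "tokenizer", "standard", "filter", List.of("lowercase", "my_stop")));

        Set<String> usedFilterTypes = new TreeSet<>();
        aggregateTypes(filters.values(), usedFilterTypes);

        Set<String> usedBuiltInFilters = new TreeSet<>();
        aggregateBuiltInFilters(analyzers.values(), filters.keySet(), usedBuiltInFilters);

        System.out.println(usedFilterTypes + " " + usedBuiltInFilters);
    }
}
```

The same pattern would apply to the `tokenizer` and `char_filter` entries of a custom analyzer.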

Member

@martijnvg martijnvg left a comment

+1 great to have this kind of information available!

@jpountz
Contributor Author

jpountz commented Jan 15, 2020

@jimczi I added built-in analysis components.

Contributor

@jimczi jimczi left a comment

> I added built-in analysis components.

LGTM, thanks!

@jpountz jpountz merged commit af59ad7 into elastic:master Jan 15, 2020
@jpountz jpountz deleted the usage_mappings_analysis branch January 15, 2020 17:07
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jan 17, 2020
We added tracking of index feature usage in elastic#51031 but due to some copy
and paste errors the test fails on some seeds. This fixes those errors.
nik9000 added a commit that referenced this pull request Jan 17, 2020
We added tracking of index feature usage in #51031 but due to some copy
and paste errors the test fails on some seeds. This fixes those errors.
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this pull request Jan 23, 2020
…51031)
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this pull request Jan 23, 2020
We added tracking of index feature usage in elastic#51031 but due to some copy
and paste errors the test fails on some seeds. This fixes those errors.
@codebrain
Contributor

codebrain commented Apr 16, 2020

This doesn't appear to be in 7.7.0. Running against the latest snapshot (build-50109232), I get the following:

  1. Create an index with the following settings:
PUT https://127.0.0.1:9200/text-index
{"settings":{"analysis":{"char_filter":{"c":{"mappings":["a => b"],"type":"mapping"}}}}}
  2. Then calling GET https://127.0.0.1:9200/_xpack/usage gives the following:
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 7569

{
  "flattened" : {
    "available" : true,
    "enabled" : true,
    "field_count" : 2
  },
  "security" : {
    "available" : true,
    "enabled" : true,
    "realms" : {
      "file" : {
        "name" : [
          "file1"
        ],
        "available" : true,
        "cache" : [
          {
            "size" : 1
          }
        ],
        "size" : [
          2
        ],
        "enabled" : true,
        "order" : [
          0
        ]
      },
      "ldap" : {
        "available" : false,
        "enabled" : false
      },
      "native" : {
        "name" : [
          "native1"
        ],
        "available" : true,
        "cache" : [
          {
            "size" : 0
          }
        ],
        "size" : [
          0
        ],
        "enabled" : true,
        "order" : [
          2
        ]
      },
      "saml" : {
        "available" : false,
        "enabled" : false
      },
      "kerberos" : {
        "available" : false,
        "enabled" : false
      },
      "oidc" : {
        "available" : false,
        "enabled" : false
      },
      "active_directory" : {
        "available" : false,
        "enabled" : false
      },
      "pki" : {
        "available" : false,
        "enabled" : false
      }
    },
    "roles" : {
      "native" : {
        "size" : 0,
        "fls" : false,
        "dls" : false
      },
      "dls" : {
        "bit_set_cache" : {
          "count" : 0,
          "memory" : "0b",
          "memory_in_bytes" : 0
        }
      },
      "file" : {
        "size" : 3,
        "fls" : false,
        "dls" : false
      }
    },
    "role_mapping" : {
      "native" : {
        "size" : 0,
        "enabled" : 0
      }
    },
    "ssl" : {
      "http" : {
        "enabled" : true
      },
      "transport" : {
        "enabled" : true
      }
    },
    "token_service" : {
      "enabled" : true
    },
    "api_key_service" : {
      "enabled" : true
    },
    "audit" : {
      "enabled" : false
    },
    "ipfilter" : {
      "http" : false,
      "transport" : false
    },
    "anonymous" : {
      "enabled" : false
    },
    "fips_140" : {
      "enabled" : false
    }
  },
  "monitoring" : {
    "available" : true,
    "enabled" : true,
    "collection_enabled" : false,
    "enabled_exporters" : {
      "local" : 1
    }
  },
  "vectors" : {
    "available" : true,
    "enabled" : true,
    "dense_vector_fields_count" : 0,
    "sparse_vector_fields_count" : 0,
    "dense_vector_dims_avg_count" : 0
  },
  "watcher" : {
    "available" : false,
    "enabled" : true,
    "execution" : {
      "actions" : {
        "_all" : {
          "total" : 0,
          "total_time_in_ms" : 0
        }
      }
    },
    "watch" : {
      "input" : {
        "_all" : {
          "total" : 0,
          "active" : 0
        }
      },
      "trigger" : {
        "_all" : {
          "total" : 0,
          "active" : 0
        }
      }
    },
    "count" : {
      "total" : 0,
      "active" : 0
    }
  },
  "analytics" : {
    "available" : true,
    "enabled" : true,
    "stats" : [
      {
        "boxplot_usage" : 0,
        "cumulative_cardinality_usage" : 0,
        "string_stats_usage" : 0,
        "top_metrics_usage" : 0
      }
    ]
  },
  "ml" : {
    "available" : false,
    "enabled" : true,
    "jobs" : {
      "_all" : {
        "count" : 0,
        "detectors" : {
          "total" : 0.0,
          "min" : 0.0,
          "avg" : 0.0,
          "max" : 0.0
        },
        "created_by" : { },
        "model_size" : {
          "total" : 0.0,
          "min" : 0.0,
          "avg" : 0.0,
          "max" : 0.0
        },
        "forecasts" : {
          "total" : 0,
          "forecasted_jobs" : 0
        }
      }
    },
    "datafeeds" : {
      "_all" : {
        "count" : 0
      }
    },
    "data_frame_analytics_jobs" : {
      "_all" : {
        "count" : 0
      }
    },
    "inference" : {
      "ingest_processors" : {
        "_all" : {
          "num_docs_processed" : {
            "max" : 0,
            "sum" : 0,
            "min" : 0
          },
          "pipelines" : {
            "count" : 0
          },
          "num_failures" : {
            "max" : 0,
            "sum" : 0,
            "min" : 0
          },
          "time_ms" : {
            "max" : 0,
            "sum" : 0,
            "min" : 0
          }
        }
      },
      "trained_models" : {
        "_all" : {
          "count" : 0
        }
      }
    },
    "node_count" : 1
  },
  "ilm" : {
    "policy_count" : 4,
    "policy_stats" : [
      {
        "phases" : {
          "hot" : {
            "min_age" : 0,
            "actions" : [
              "rollover"
            ]
          },
          "delete" : {
            "min_age" : 7776000000,
            "actions" : [
              "delete"
            ]
          }
        },
        "indices_managed" : 0
      },
      {
        "phases" : {
          "delete" : {
            "min_age" : 604800000,
            "actions" : [
              "delete"
            ]
          }
        },
        "indices_managed" : 0
      },
      {
        "phases" : {
          "hot" : {
            "min_age" : 0,
            "actions" : [
              "rollover"
            ]
          }
        },
        "indices_managed" : 0
      },
      {
        "phases" : {
          "hot" : {
            "min_age" : 0,
            "actions" : [
              "rollover"
            ]
          },
          "delete" : {
            "min_age" : 7776000000,
            "actions" : [
              "delete"
            ]
          }
        },
        "indices_managed" : 0
      }
    ]
  },
  "slm" : {
    "available" : true,
    "enabled" : true
  },
  "voting_only" : {
    "available" : true,
    "enabled" : true
  },
  "frozen_indices" : {
    "available" : true,
    "enabled" : true,
    "indices_count" : 0
  },
  "logstash" : {
    "available" : false,
    "enabled" : true
  },
  "ccr" : {
    "available" : false,
    "enabled" : true,
    "follower_indices_count" : 0,
    "auto_follow_patterns_count" : 0
  },
  "graph" : {
    "available" : false,
    "enabled" : true
  },
  "sql" : {
    "available" : true,
    "enabled" : true,
    "features" : {
      "having" : 0,
      "subselect" : 0,
      "limit" : 0,
      "orderby" : 0,
      "where" : 0,
      "groupby" : 0,
      "join" : 0,
      "command" : 0,
      "local" : 0
    },
    "queries" : {
      "cli" : {
        "total" : 0,
        "paging" : 0,
        "failed" : 0
      },
      "rest" : {
        "total" : 0,
        "paging" : 0,
        "failed" : 0
      },
      "canvas" : {
        "total" : 0,
        "paging" : 0,
        "failed" : 0
      },
      "odbc" : {
        "total" : 0,
        "paging" : 0,
        "failed" : 0
      },
      "jdbc" : {
        "total" : 0,
        "paging" : 0,
        "failed" : 0
      },
      "odbc32" : {
        "total" : 0,
        "paging" : 0,
        "failed" : 0
      },
      "odbc64" : {
        "total" : 0,
        "paging" : 0,
        "failed" : 0
      },
      "translate" : {
        "count" : 0
      },
      "_all" : {
        "total" : 0,
        "paging" : 0,
        "failed" : 0
      }
    }
  },
  "enrich" : {
    "available" : true,
    "enabled" : true
  },
  "transform" : {
    "available" : true,
    "enabled" : true
  },
  "spatial" : {
    "available" : true,
    "enabled" : true
  },
  "rollup" : {
    "available" : true,
    "enabled" : true
  },
  "eql" : {
    "available" : true,
    "enabled" : false
  }
}

Notice there is no index property in the response JSON.

@jpountz
Contributor Author

jpountz commented Apr 16, 2020

@codebrain Good catch, it's been superseded by #51138. I removed the version labels and will remove it from the release notes tomorrow.

jpountz added a commit to jpountz/elasticsearch that referenced this pull request Apr 17, 2020
@jpountz
Contributor Author

jpountz commented Apr 17, 2020

Here is the PR #55387.

jpountz added a commit that referenced this pull request Apr 18, 2020
Labels
:Data Management/Stats Statistics tracking and retrieval APIs >enhancement
5 participants