[ML] Data frame analytics results grid should use index pattern field format if one exists #60892

Closed
Winterflower opened this issue Mar 23, 2020 · 3 comments · Fixed by #61709
Labels: bug, Feature:Data Frame Analytics, :ml, v7.7.0

Comments

Winterflower commented Mar 23, 2020

Kibana version:
STACK_BUILD=2227
STACK_VERSION=7.7.0

Elasticsearch version:
STACK_BUILD=2227
STACK_VERSION=7.7.0

Server OS version:
N/A
Browser version:
Chrome
Browser OS version:
Mac OS X

Steps to reproduce:

  1. Restore the seeds dataset and run the multiclass ML job shown in the "ML Job Configuration" section below.
  2. After the job completes (it should be fast, since this dataset has only 210 data points), click "View" and look at the data in the table.
  3. Most of the entries in the data table are rounded to three decimal places, while some display many more. This extra precision does not match what you see for the same data in Discover.

Screen Shot 2020-03-23 at 11 15 32 AM

There are two issues here:

  1. Inconsistent rounding among data points in the DF Analytics results table (see screenshot above)
  2. The results table shows more decimal places than the source data

Source Data in the Discover tab

Screen Shot 2020-03-23 at 11 26 16 AM

Same data point in the ML DF Analytics Results UI

Screen Shot 2020-03-23 at 11 25 28 AM

ML Job Configuration

PUT _ml/data_frame/analytics/seeds
{
  "source": {
    "index": "seeds"
  },
  "dest": {
    "index": "seeds_results"
  },
  "model_memory_limit": "2gb",
  "analysis": {
    "classification": {
      "num_top_classes": 2,
      "dependent_variable": "seed_class",
      "training_percent": 80
    }
  }
}
Winterflower added the :ml and Feature:Data Frame Analytics labels Mar 23, 2020
@elasticmachine
Contributor

Pinging @elastic/ml-ui (:ml)

peteharverson removed their assignment Mar 23, 2020
peteharverson added the v7.7.0 and bug labels Mar 23, 2020
@alvarezmelissa87
Contributor

Hey @Winterflower 😄

This isn't a UI issue as far as I can tell. The UI doesn't round values of type number when displaying them in the data table; numbers are displayed exactly as stored.

These are the values stored in the job's results index, which is the index being queried.

The original seeds index viewed in discover for this particular record:

image

The same record stored in the classification job's result index:

image

The values must be changing somewhere on the back end.
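
One plausible explanation, which this thread does not confirm, is that a field stored in single precision (32-bit) is widened to double precision (64-bit) before it reaches the results index. The sketch below only illustrates how such a widening adds decimal digits; the value `14.88` is hypothetical and is not taken from the seeds dataset.

```typescript
// Assumption: the source field is stored as a 32-bit float and is widened
// to a 64-bit double somewhere before display. Not confirmed in this thread.
const original = 14.88; // hypothetical value, not from the seeds dataset

// Math.fround rounds to the nearest 32-bit float and returns the result
// as an ordinary 64-bit JavaScript number.
const widened = Math.fround(original);

console.log(original); // the short decimal as written
console.log(widened); // the same float, now printed with many more digits
console.log(widened.toFixed(3)); // rounding to three places hides the difference
```

If this is what happens, both values are the same number at single precision; only the printed representation differs.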

alvarezmelissa87 self-assigned this Mar 25, 2020
peteharverson changed the title from "[ML] DF Analytics UI floating point mismatch between results table data and source data and inconsistent rounding" to "[ML] Data frame analytics results grid should use index pattern field format if one exists" Mar 26, 2020
peteharverson added the enhancement and bug labels and removed the bug and enhancement labels Mar 26, 2020
@peteharverson
Contributor

If an index pattern exists for the destination index, the values in the results grid should be formatted using the field format from that index pattern. For the case above, the data grid would then format values to three decimal places, matching Discover. If no index pattern exists for the destination index, we should probably check for an index pattern for the source index instead.
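
The fallback order described above could be sketched roughly as follows. This is a minimal illustration, not the code from the fix: `IndexPatternLike`, `findIndexPattern`, `formatCell`, and the field name `area` are all hypothetical, and `toFixed(3)` merely stands in for whatever number format the index pattern defines.

```typescript
// Hypothetical, simplified shapes; these are NOT Kibana's real index pattern types.
type Formatter = (value: unknown) => string;

interface IndexPatternLike {
  title: string; // index name the index pattern was created for
  fieldFormats: Record<string, Formatter>; // field name -> formatter
}

// Assumption: match on exact title only (no wildcard handling) to keep this short.
function findIndexPattern(
  patterns: IndexPatternLike[],
  indexName: string
): IndexPatternLike | undefined {
  return patterns.find((p) => p.title === indexName);
}

// Prefer the destination index's pattern, then the source index's pattern;
// if neither exists (or has no formatter for the field), show the raw value.
function formatCell(
  patterns: IndexPatternLike[],
  destIndex: string,
  sourceIndex: string,
  field: string,
  value: unknown
): string {
  const pattern =
    findIndexPattern(patterns, destIndex) ?? findIndexPattern(patterns, sourceIndex);
  const formatter = pattern?.fieldFormats[field];
  return formatter ? formatter(value) : String(value);
}

// Stand-in formatter: numbers are rounded to three decimal places.
const threeDecimals: Formatter = (v) =>
  typeof v === "number" ? v.toFixed(3) : String(v);

// Only the source index "seeds" has an index pattern here, so it is used as the fallback.
const examplePatterns: IndexPatternLike[] = [
  { title: "seeds", fieldFormats: { area: threeDecimals } },
];
console.log(formatCell(examplePatterns, "seeds_results", "seeds", "area", 14.880000114440918));
```

With no matching index pattern at all, `formatCell` returns the value unformatted, which corresponds to the grid behaviour reported in this issue.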
