Support fetching facets from value edge lists #4081

robsws · 2019-09-27T10:30:58Z

It seems that currently, if facets are added on value edge lists (with type [string] for example), it is impossible to retrieve any facets added on edges within the list. Here is a toy example where I have two names recorded for Alice, each with it's own since facet associated:

{
  set {
    _:alice <name> "Alice Smith" (since=1990-01-01) .
    _:alice <name> "Alice Baker" (since=2019-01-01) .
  }
}

The response for this mutation shows a success, so one can assume that the facets were added successfully:

{
  "data": {
    "code": "Success",
    "message": "Done",
    "uids": {
      "alice": "0x1"
    }
  }
}

If you try and query the facet data, only the values of the edges are returned and not their facets:

{
  query(func: has(name)) {
    name @facets
  }
}

Response:

{
  "data": {
    "query": [
      {
        "name": [
          "Alice Smith",
          "Alice Baker"
        ]
      }
    ]
  }
}

Would it be possible to implement the ability to fetch these facets?

Secondly, it would be good to combine this with issue #4034 and allow filtering on these facets as well.

The text was updated successfully, but these errors were encountered:

campoy · 2019-09-27T16:55:10Z

Yes, this is indeed the part that I mentioned would not be necessarily fixed as part of #4034.

The hard part here is basically agreeing on how it should be formatted.

One option would be:

{
  "data": {
    "query": [
      {
        "name": [
          "Alice Smith",
          "Alice Baker"
        ],
        "name|since": [
          "1990-01-01",
          "2019-01-01",
        ]
      }
    ]
  }
}

This approach breaks if some of the edges are missing that facet.

Another option:

{
  "data": {
    "query": [
      {
        "name": [
          {
            "@value": "Alice Smith",
            "since": "1990-01-01"
          },
          {
            "@value": "Alice Baker",
            "since": "2019-01-01"
          }
        ]
      }
    ]
  }
}

This one has the slight issue that when you ask for facets, even if none are returned, strings are reified into objects. Also, we'll need to find a field name for @value that minimizes collisions.

Do you have any opinions on this, @robsws ?

robsws · 2019-09-30T09:35:43Z

For our particular use case, I think either approach would work. In my own opinion, the first approach is probably more in line with the current facet response format and I think you could solve the problem of facets missing by adding nulls to the array where the facets don't exist.

manishrjain · 2019-10-09T13:36:13Z

We could potentially encode the facets in the name itself.

“Name”: “Alice Smith (since = yymmdd, weight = 0.3)”

Might get tricky if the facet value is also a string.

Another approach is we allow data to be returned in RDF format. That would make this trivial.

campoy · 2019-10-16T18:42:25Z

We've decided to fix this issue by unifying the way we support facets in Dgraph both for value and uid predicates.

In the proposed solution, we will handle facets the same way no matter the type of predicate.

Facets are returned in a different field with a name composed of predicate name, followed by pipe character (|), and

Facets on a singular value predicate

This behavior remains unchanged with v1.1.0.
A new field named "predicate_name|facet_name" is added as a sibling of the requested predicate containing the associated value (if any).

Schema:

<name>: string .

Data:

<0x1> <name> "Francesc" (kind="first") .

Query:

{
   q(func: has(name)) {
      name @facets
   }
}

Result:

{
  "data": {
    "q": [
      {
        "name|kind": "first",
        "name": "Francesc"
      }
    ]
  }
}

Facets on lists of value predicates

This behavior was not supported in v1.1.0, follows the same idea as the one above.
A new field named "predicate_name|facet_name" is added as a sibling of the requested predicate containing the associated values.
To avoid losing what facet values correspond to what predicate, we could use null for missing values so each position in the predicate field and the facets are all related.
Instead of doing this, we decided to provide a map from the index of the corresponding predicate value to the corresponding facet value.

Schema:

<nickname>: [string] .

Data:

<0x1> <nickname> "Francesc" (kind="official") .
<0x1> <nickname> "Tete" .
<0x1> <nickname> "Cesc" (kind="friends") .

Query:

{
   q(func: has(nickname)) {
      nickname @facets
   }
}

Result:

{
  "data": {
    "q": [
      {
        "nickname": [
          "Cesc",
          "Francesc",
          "Tete"
        ],
        "nickname|kind": {
          "0": "friends",
          "1": "official"
        }
      }
    ]
  }
}

This solution makes it so parsing nickname is exactly the same no matter whether one is asking for facets or not.

Facets on a singular object predicate

This behavior will change to become more similar to the way we handle value predicates.

Schema:

<name>: string .
<state>: uid .

Data:

<0x1> <name> "San Francisco" .
<0x1> <state> <0x2> (capital=false) .
<0x2> <name> "California" .

Query:

{
   q(func: has(state)) {
      name
      state @facets {
         name
      }
   }
}

The current behavior moves the value of the facet into the object pointed by the predicate.

{
  "data": {
    "q": [
      {
        "name": "San Francisco",
        "state": [
          {
            "name": "California",
            "state|capital": false
          }
        ]
      }
    ]
  }
}

Instead, we propose to keep the facet value at the same level as the predicate as we do for value predicates.

{
  "data": {
    "q": [
      {
        "name": "San Francisco",
        "state": [
          {
            "name": "California"
          },
          "state|capital": false
        ]
      }
    ]
  }
}

Facets on a list of object predicates

This behavior will also change to become more similar to the way we handle value predicates.

Schema:

<name>: string .
<speaks>: [uid] .

Data:

<0x1> <name> "Francesc" .
<0x1> <speaks> <0x2> (fluent=true) .
<0x1> <speaks> <0x3> (fluent=false) .
<0x2> <name> "Spanish" .
<0x3> <name> "Chinese" .

Query:

{
  q(func: uid(0x1)) {
    name
    speaks @facets {
      name
    }
  }
}

The current behavior moves the value of the facet into the object pointed by the predicate.

{
  "data": {
    "q": [
      {
        "name": "Francesc",
        "speaks": [
          {
            "name": "Spanish",
            "speaks|fluent": true
          },
          {
            "name": "Chinese",
            "speaks|fluent": false
          }
        ]
      }
    ]
  }
}

Instead, we propose to keep the facet value at the same level as the predicate as we do for value predicates.

{
  "data": {
    "q": [
      {
        "name": "Francesc",
        "speaks": [
          {
            "name": "Spanish"
          },
          {
            "name": "Chinese"
          }
        ],
        "speaks|fluent": {
           "0": true,
           "1": false
        }
      }
    ]
  }
}

MichelDiz · 2019-10-16T23:08:56Z

The result you’ve shared feels not ideal. The mutation you have a “order” (the kind is linked to the value)

<0x1> <nickname> "Francesc" (kind="official") .
<0x1> <nickname> "Tete" .
<0x1> <nickname> "Cesc" (kind="friends") .

but the response isn’t ordered.
I would expect the response order to be in sync as we intend to be using indexed positioning:

{
  "data": {
    "q": [
      {
        "nickname": [
          "Cesc",
          "Francesc",
          "Tete"
        ],
        "nickname|kind": {
          "0": "friends",
          "1": "official",
          "2": "null"
        }
      }
    ]
  }
}

We need a way to know “who owns the facet”. Because the answer needs to be predictable.

We could also apply index positioning to the values of the list.

{
  "data": {
    "q": [
      {
        "nickname": {
          "0": "Cesc",
          "1": "Francesc",
          "2": "Tete"
       },
        "nickname|kind": {
          "0": "friends",
          "1": "official",
          "2": "null"
        }
      }
    ]
  }
}

Or

{
  "data": {
    "q": [
      {
        "nickname": {
          "0": "Cesc",
          "0|kind": "friends",
          "1": "Francesc",
          "1|kind": "official",
          "2": "Tete"
        },
      }
    ]
  }
}

campoy · 2019-10-17T16:54:31Z

There is no order in the data, but there's an order in the result we send back since it's an array.
I do not care whether the values appear in one order or the other, as long as the keys in the facet map correspond to what we encode.

That is exactly why we use a map, not an array, for the facet results. It allows us to keep it in sync while avoiding having potentially many nulls.

MichelDiz · 2019-10-17T17:21:27Z

{
  "data": {
    "q": [
      {
        "nickname": [
          "Cesc",
          "Francesc",
          "Tete"
        ],
        "nickname": {
          "0": "official",
          "2": "friends"
        }
      }
    ]
  }
}

So, by the response, you've shared. How do we can tell what belongs to what? There's no relation with the name and its facet in that response. I would assume that 0 belongs to "Cesc" (which is wrong, cuz "official" belongs to "Francesc") and 2 belongs to "Tete" (which is also wrong, cuz "friends" belongs to "Cesc").

If facets come unsorted, there's no point in using it. But I can assume now after your response that the response example is slightly wrong. And everything should be alright. Only the example took my attention.

BTW, I think we should use "nickname|kind" as we do in object to avoid duplicate keys. If we have more facets like "since, points, tag" it would be a mess. So"nickname|since", "nickname|points", "nickname|tag"

campoy · 2019-10-19T00:10:17Z

I think the misunderstanding was based on my typos, sorry about that!

The actual response should be:

{
  "data": {
    "q": [
      {
        "nickname": [
          "Cesc",
          "Francesc",
          "Tete"
        ],
        "nickname|kind": {
          "0": "friends",
          "1": "official"
        }
      }
    ]
  }
}

Note that indeed the field name with the facets is nickname|kind and the key in the map points to the corresponding field in the predicates.

animesh2049 · 2019-10-29T17:21:24Z

In case of Facets on lists of value predicates we are putting all the facets for all the values into the same facetList. Later there is no way to find out the mapping between values and facets. Here is a sample test case for the same.
schema

 <nickname>: [string] .

mutatoin

{
    set {
      _:0x1 <nickname> "Francesc" .
      _:0x1 <nickname> "Tete" (kind="official") .
      _:0x1 <nickname> "Cesc" (kind="friends", real=true) .
    }
}

query

{
  me(func: has(nickname)) {
    nickname @facets
  }
}

Right now on master the query doesn't return any facet, but we can look into facetmatrix to verify this.

animesh2049 · 2019-10-29T17:34:03Z

In case of Facets on a list of object predicates , we return the facet even in case of empty node.
Consider the sample query

{
    me(func: uid(1)) {
        friend @facets(orderasc:since) {
	    name
	}
    }
}

sample response according to current behaviour

{
  "data": {
    "me": [
      {
        "friend": [
          {
            "name": "Glenn Rhee",
            "friend|since": "2004-05-02T15:04:05Z"
          },
          {
            "friend|since": "2005-05-02T15:04:05Z" # Facet shows up even if the node is empty
          },
          {
            "name": "Rick Grimes",
            "friend|since": "2006-01-02T15:04:05Z"
          },
          {
            "name": "Andrea",
            "friend|since": "2006-01-02T15:04:05Z"
          },
          {
            "name": "Daryl Dixon",
            "friend|since": "2007-05-02T15:04:05Z"
          }
        ]
      }
    ]
  }
}

but if we try to index facets with values we will have the following response

{
  "data": {
    "me": [
      {
        "friend|since": {
          "1": "2004-05-02T15:04:05Z",
          "2": "2005-05-02T15:04:05Z",
          "3": "2006-01-02T15:04:05Z",
          "4": "2006-01-02T15:04:05Z",
          "5": "2007-05-02T15:04:05Z"
        },
        "friend": [
          {
            "name": "Glenn Rhee"
          },
          {
            "name": "Rick Grimes"
          },
          {
            "name": "Andrea"
          },
          {
            "name": "Daryl Dixon"
          }
        ]
      }
    ]
  }
}

The response above is incorrect because the empty node now has been discarded in values. I think we should eliminate all the empty nodes and their facets in response. A facet without any value anyway doesn't make sense.
cc: @campoy

campoy · 2019-11-12T23:45:22Z

Yes, agreed. Only the facets for the nodes that appear in the response should be encoded.

pepoospina · 2019-11-29T15:39:34Z

Hi @campoy, this actually breaks our whole strategy 😟. Without access to the facets of a list of predicates, for us, it's not possible to reconstruct the persisted data.

Is there a way to get around this issue and somehow get the facet values? I mean, in the meanwhile, until this new feature is released?

campoy · 2019-12-11T00:53:47Z

Wait, so this will help you with your problem. Right, @pepoospina?

I just want to make sure I understand correctly.

pepoospina · 2019-12-11T15:20:54Z

Yes @campoy. The solution in this PR should fix our issue.

Fixes #4081 This PR covers following things: * Support for fetching facets from value edge list. * Response format change in case of facets to have uniform response format across uid/value/list edges. Co-authored-by: Animesh Chandra Pathak <animesh@dgraph.io>

sleto-it · 2020-01-13T17:11:58Z

Hi all,
This ticket was fixed by PR #4267, which has now been merged

The fix will be delivered in v1.2

Please comment if more help is needed, and we will reopen

Many thanks,

campoy added this to the Dgraph v1.2 milestone Sep 27, 2019

campoy added priority/P1 Serious issue that requires eventual attention (can wait a bit) and removed priority/P2 Somehow important but would not block a release. labels Oct 16, 2019

ashish-goswami mentioned this issue Nov 13, 2019

[BREAKING] Support for fetching facets from value edge list #4267

Merged

ashish-goswami closed this as completed in #4267 Jan 13, 2020

emile-tawfik mentioned this issue Jan 15, 2020

JSON mutation, Support facets for value edge lists #4581

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support fetching facets from value edge lists #4081

Support fetching facets from value edge lists #4081

robsws commented Sep 27, 2019

campoy commented Sep 27, 2019 •

edited

Loading

robsws commented Sep 30, 2019

manishrjain commented Oct 9, 2019

campoy commented Oct 16, 2019 •

edited

Loading

MichelDiz commented Oct 16, 2019 •

edited

Loading

campoy commented Oct 17, 2019

MichelDiz commented Oct 17, 2019 •

edited

Loading

campoy commented Oct 19, 2019

animesh2049 commented Oct 29, 2019

animesh2049 commented Oct 29, 2019

campoy commented Nov 12, 2019

pepoospina commented Nov 29, 2019

campoy commented Dec 11, 2019

pepoospina commented Dec 11, 2019 •

edited

Loading

sleto-it commented Jan 13, 2020

Support fetching facets from value edge lists #4081

Support fetching facets from value edge lists #4081

Comments

robsws commented Sep 27, 2019

campoy commented Sep 27, 2019 • edited Loading

robsws commented Sep 30, 2019

manishrjain commented Oct 9, 2019

campoy commented Oct 16, 2019 • edited Loading

Facets on a singular value predicate

Facets on lists of value predicates

Facets on a singular object predicate

Facets on a list of object predicates

MichelDiz commented Oct 16, 2019 • edited Loading

campoy commented Oct 17, 2019

MichelDiz commented Oct 17, 2019 • edited Loading

campoy commented Oct 19, 2019

animesh2049 commented Oct 29, 2019

animesh2049 commented Oct 29, 2019

campoy commented Nov 12, 2019

pepoospina commented Nov 29, 2019

campoy commented Dec 11, 2019

pepoospina commented Dec 11, 2019 • edited Loading

sleto-it commented Jan 13, 2020

campoy commented Sep 27, 2019 •

edited

Loading

campoy commented Oct 16, 2019 •

edited

Loading

MichelDiz commented Oct 16, 2019 •

edited

Loading

MichelDiz commented Oct 17, 2019 •

edited

Loading

pepoospina commented Dec 11, 2019 •

edited

Loading