Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reconsider DB schema #93

Open
4 tasks
nilsnolde opened this issue Mar 19, 2020 · 4 comments
Open
4 tasks

Reconsider DB schema #93

nilsnolde opened this issue Mar 19, 2020 · 4 comments
Assignees
Labels
enhancement | suggestion New feature or request

Comments

@nilsnolde
Copy link
Contributor

Things to think about:

  • The tags are 1:1 with OSM objects, so they can be merged into one table
  • uuid is mostly useless it seems, it's not the PK in most tables
  • categories are extracted and stored for each OSM object, duplicating data millions of times
  • allow (simplified) MultiPolygon in DB. Would need some serious performance testing though for the intersects query stage. Would be nicer than only querying for centroids, which is often meaningless for big areas like parks or even true MultiPolygon objects.
@nilsnolde nilsnolde added the enhancement | suggestion New feature or request label Mar 19, 2020
@nilsnolde nilsnolde self-assigned this Mar 19, 2020
@TimMcCauley
Copy link
Contributor

From the top of my head: the tags are in a separate table by design. It's because the information can be optionally added and its columns can be individually queried. Having everything in one table would remove this functionality.

@nilsnolde
Copy link
Contributor Author

Ah ok, then you'd have to basically duplicate a lot of POIs if they have a lot of additional tags, right? Yeah, that'd be stupid, true. OK thanks for the hint!

@TimMcCauley
Copy link
Contributor

TimMcCauley commented Mar 19, 2020

It is like this:

 public | ops_planet_pois            | table | admin
 public | ops_planet_pois_categories | table | admin
 public | ops_planet_pois_tags       | table | admin

ops_planet_pois

gis=# \d ops_planet_pois
                  Table "public.ops_planet_pois"
  Column  |         Type          | Collation | Nullable | Default
----------+-----------------------+-----------+----------+---------
 uuid     | bytea                 |           | not null |
 osm_id   | bigint                |           | not null |
 osm_type | integer               |           | not null |
 geom     | geography(Point,4326) |           | not null |
Indexes:
    "ops_planet_pois_pkey" PRIMARY KEY, btree (uuid)
    "idx_ops_planet_pois_geom" gist (geom)
    "ix_ops_planet_pois_osm_id" btree (osm_id)
Referenced by:
    TABLE "ops_planet_pois_categories" CONSTRAINT "ops_planet_pois_categories_uuid_fkey" FOREIGN KEY (uuid) REFERENCES ops_planet_pois(uuid)
    TABLE "ops_planet_pois_tags" CONSTRAINT "ops_planet_pois_tags_uuid_fkey" FOREIGN KEY (uuid) REFERENCES ops_planet_pois(uuid)

ops_planet_pois_categories

gis=# \d ops_planet_pois_categories
                             Table "public.ops_planet_pois_categories"
  Column  |  Type   | Collation | Nullable |                        Default
----------+---------+-----------+----------+--------------------------------------------------------
 id       | integer |           | not null | nextval('ops_planet_pois_categories_id_seq'::regclass)
 uuid     | bytea   |           | not null |
 category | integer |           | not null |
Indexes:
    "ops_planet_pois_categories_pkey" PRIMARY KEY, btree (id)
    "ix_ops_planet_pois_categories_category" btree (category)
    "ix_ops_planet_pois_categories_uuid" btree (uuid)
Foreign-key constraints:
    "ops_planet_pois_categories_uuid_fkey" FOREIGN KEY (uuid) REFERENCES ops_planet_pois(uuid)

ops_planet_pois_tags

gis=# \d ops_planet_pois_tags
                            Table "public.ops_planet_pois_tags"
 Column |  Type   | Collation | Nullable |                     Default
--------+---------+-----------+----------+--------------------------------------------------
 id     | integer |           | not null | nextval('ops_planet_pois_tags_id_seq'::regclass)
 uuid   | bytea   |           | not null |
 osm_id | bigint  |           | not null |
 key    | text    |           |          |
 value  | text    |           |          |
Indexes:
    "ops_planet_pois_tags_pkey" PRIMARY KEY, btree (id)
    "ix_ops_planet_pois_tags_key" btree (key)
    "ix_ops_planet_pois_tags_uuid" btree (uuid)
    "ix_ops_planet_pois_tags_value" btree (value)
Foreign-key constraints:
    "ops_planet_pois_tags_uuid_fkey" FOREIGN KEY (uuid) REFERENCES ops_planet_pois(uuid)

Example:

gis=# SELECT * FROM ops_planet_pois LIMIT 5;
                uuid                |  osm_id   | osm_type |                        geom
------------------------------------+-----------+----------+----------------------------------------------------
 \x25f48d87398c4d2fa3e59af27ea307ff | 110247790 |        1 | 0101000020E6100000C8330D2F2DB84EC006AB8F2C05E92840
 \x56b20dd5d3e340d7ba663d12b0655848 | 108493272 |        1 | 0101000020E6100000317D4E8B0BB84EC07052C6095D0F2940
 \xb6e64021629a43528d96874b4301e879 |   9829772 |        1 | 0101000020E61000007433A31F8D9D51C05BE48D8296C83340
 \x92fe63810b3243c7a9a3c4d05c6f0c67 |  26241064 |        1 | 0101000020E6100000C4F59441DA9854C0ED707495EE203740
 \x50f7f61cb96b4644bf97b00723aa0cbc |  26242036 |        1 | 0101000020E610000051488D646A9954C070C328081E033740
(5 rows)

...

gis=# SELECT * FROM ops_planet_pois_tags ORDER BY uuid LIMIT 5;
    id    |                uuid                |   osm_id   |    key     |       value
----------+------------------------------------+------------+------------+-------------------
  9445969 | \x000000d29f844abf9fb85b2d6d75290e | 4766271454 | name       | Nasi Katok Adik
  2371439 | \x0000022702df4af3a763b5aa140ab7ae | 1188977786 | wheelchair | yes
  2371438 | \x0000022702df4af3a763b5aa140ab7ae | 1188977786 | name       | Bistro Augenblick
 12345299 | \x00000267e601418a98fc80bba38c5eeb | 1821133462 | wheelchair | yes
 12345298 | \x00000267e601418a98fc80bba38c5eeb | 1821133462 | name       | Simonet
(5 rows)

...

gis=# SELECT * FROM ops_planet_pois_categories ORDER BY uuid LIMIT 5;
    id    |                uuid                | category
----------+------------------------------------+----------
 30877690 | \x0000001817504b0c87c3de5826105582 |      335
 29785850 | \x0000008e91234c678cb229495fa55f6b |      565
 18003787 | \x000000d29f844abf9fb85b2d6d75290e |      566
 26977347 | \x000001c9f2ce4296842239e7e5cb67cf |      292
  6606057 | \x0000022702df4af3a763b5aa140ab7ae |      570
(5 rows)

The uuid is the FK for the other tables. Maybe there exists a more intuitive way but nothing came to our mind back then at least.

@TimMcCauley
Copy link
Contributor

TimMcCauley commented Mar 19, 2020

categories are extracted and stored for each OSM object, duplicating data millions of times - each OSM object has one category or more than one, there are no duplicates in there.

e.g.

gis=# SELECT count(*) FROM ops_planet_pois_categories;
  count
----------
 38365931
(1 row)

gis=# SELECT count(*) FROM ops_planet_pois;
  count
----------
 38117690
(1 row)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement | suggestion New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants