Commit

README
joocer committed May 25, 2024
1 parent f2d8c78 commit 211ed5b
Showing 1 changed file with 2 additions and 10 deletions.
12 changes: 2 additions & 10 deletions README.md
@@ -5,26 +5,20 @@ Terminology

 - **Catalog** - A collection of tables.
 - **Data File** - Files that contain the rows of the table.
-- **Index** - A structure that improves data retrieval speed.
 - **Manifest** - Files that list and describe data files in the table.
 - **Metadata** - Information used to manage and describe tables.
 - **Schema** - The structure defining the columns of the table.
 - **Snapshot** - The state of the table at a specific point in time.
-- **Statistics** - Summary information about the data in columns.
 - **Table** - A dataset stored in a structured and managed way.


 ~~~python
 table/
 |- metadata
-# |  |- indexes/
-# |  |  +- index-0000-0000.index
 |  |- manifests/
 |  |  +- manifest-0000-0000.avro
-|  |- snapshots/
-|  |  +- snapshot-0000-0000.json
-# |  +- statistics/
-# |     +- statistics-0000-0000.json
+|  +- snapshots/
+|     +- snapshot-0000-0000.json
 +- data/
    +- year=2000
       +- month=01
@@ -41,8 +35,6 @@ flowchart TD
 CATALOG --> SCHEMA(Schema)
 SNAPSHOT --> SCHEMA
 SNAPSHOT --> MANIFEST(Manifest)
-SNAPSHOT --> INDEX(Indexes)
-SNAPSHOT --> STATS(Statistics)
 MANIFEST --> DATA(Data Files)
~~~
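
A minimal Python sketch of the relationships the flowchart keeps once the index and statistics edges are gone: Snapshot to Schema, Snapshot to Manifest, and Manifest to Data Files. The class names follow the README's terminology, but every field, the `data_file_paths` helper, and the example file name are assumptions made for illustration, not part of the repository. The Catalog edge is left out because the diff shows too little of it to model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Schema:
    # Assumption: columns are modelled as column name -> type name.
    columns: Dict[str, str]


@dataclass
class DataFile:
    # Path of a file that contains rows of the table.
    path: str


@dataclass
class Manifest:
    # A manifest lists and describes the data files in the table.
    data_files: List[DataFile] = field(default_factory=list)


@dataclass
class Snapshot:
    # State of the table at one point in time; it points at a schema
    # and at a manifest (SNAPSHOT --> SCHEMA, SNAPSHOT --> MANIFEST).
    schema: Schema
    manifest: Manifest

    def data_file_paths(self) -> List[str]:
        # Follow SNAPSHOT --> MANIFEST --> DATA to collect file paths.
        return [data_file.path for data_file in self.manifest.data_files]


# Hypothetical example values; the file name is invented for illustration.
schema = Schema(columns={"id": "INTEGER", "name": "VARCHAR"})
manifest = Manifest(
    data_files=[DataFile("data/year=2000/month=01/example.parquet")]
)
snapshot = Snapshot(schema=schema, manifest=manifest)

print(snapshot.data_file_paths())
```

Resolving data files through the snapshot rather than listing the `data/` directory mirrors the flowchart: the snapshot is the entry point, and the manifest decides which files belong to that state of the table.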
