Skip to content
clsn edited this page Feb 4, 2011 · 7 revisions

MySQLFUSE: View a MySQL Database as a Filesystem

This project was my first semi-serious attempt at playing with FUSE for Linux, using the Python FUSE bindings.

Here’s the basic concept involved: You mount a MySQL database (will it work with other SQLs? I don’t know, and don’t count on it) under FUSE, and then when you look into the filesystem, the top-level directories are the names of the tables in the database. Within each table, the directory tree’s layers alternate between key-names and key-values (in the current, latest release. Earlier ones had the names and values adjoined in the same directory-name, separated by a colon, which conceivably we might go back to.)

In order to get that explanation to make sense, let’s concretize it a little. Say you have a database with a table called GreatBigBook:


   CREATE TABLE `GreatBigBook` (
     `volume` int(11) NOT NULL default '0',
     `chapter` int(11) NOT NULL default '0',
     `page` int(11) NOT NULL default '0',
     `data` text,
     PRIMARY KEY  (`volume`,`chapter`,`page`)
   )

It’s a book, indexed on its volume, chapter, and page (all numbers, though in general they need not be). These are the primary keys, in combination: one volume will have more than one page and chapter, one chapter will appear in more than one volume and have more than one page, one page will appear in more than one volume and in more than one chapter, but the combination of (volume, chapter, page), as a tuple, is unique for each row (we’ll ignore for the moment the fact that usually the volume and page suffice for uniqueness. Let’s say the chapters all start page numbers all over again). When you cd into mnt-point/GreatBigBook, you will see three directories: volume, chapter, and page:


  $ cd GreatBigBook
  $ ls
  chapter/        page/        volume/

Then we can cd into the volume directory, say:


  $ cd volume
  $ ls
  1/            2/            3/            4/            5/           6/          ...

All the extant values of the volume key. Change into …/GreatBigBook/volume/1/:


  $ cd 1
  $ ls
  chapter/         page/

These are the elements of the key remaining for you to choose. Go into …/GreatBigBook/volume/1/chapter/ and you’ll see all the chapters that are in volume 1. Then you can cd into, say, chapter 7, and you’d see only page (which I suppose is redundant, but it’s consistent). Inside …/GreatBigBook/volume/1/chapter/7/page/ would be found all the pages in that volume and chapter, still as directories.


  $ cd chapter
  $ ls
  1/            2/             3/             4/            5/             6/           ... # all the chapters in this volume
  $ cd 7
  $ ls
  page/
  $ cd page
  $ ls
  (directories for all the pages in vol.1, chap.7)

Change into …/GreatBigBook/volume/1/chapter/7/page/120/ and you’ll see a file, data (if there were more non-key fields in this table, there would be more files). That file contains the contents of its like-named field in the database, which can be read and written with the usual tools of file-access.


  $ cd 120
  $ pwd
  /home/blah/mountpoint/GreatBigBook/volume/1/chapter/7/page/120
  $ ls
  data
  $ vi data    # and you get to edit the page contents here.

You could also have gone into …/GreatBigBook/chapter/ first and seen all the extant chapters, and in …/GreatBigBook/chapter/7/ you could decide if you wanted to go down into volume or page next, and so forth, until you found the exact same data as before in …/GreatBigBook/chapter/7/page/120/volume/1/data.

The older version had …/GreatBigBook/volume:1/chapter:7/page:120/ and only presented things in the order the keys were found in the index. Is that better? The alternating directories can be confusing, but I think it’s more flexible the new way.

Limitations:

  • You can only insert things into the table by using mkdir. Depending on where in the hierarchy you do this, you will wind up with various dummy entries, since a “directory” only exists when there are rows for it.
  • Similarly, the only way to remove things is with rmdir. In ordinary filesystems, rmdir only works on empty directories, so it’s safe to do things like rmdir *. Don’t do that.
  • At this point, it only works on primary keys, not any other indices you may have. Which means that if this is one of those databases where you threw in a unique id field just to make things unique, you’re going to have a very flat and unhelpful directory structure.
  • It doesn’t seem to handle idleness well; need to get it to refresh its DB connection.
  • I’m kind of amazed it works at all.
Clone this wiki locally