Code Documentation
The parser reads the file line by line and uses recursion to build the bookmark folder tree. The creator does the reverse: it walks the tree and writes the file.
There are three modules, plus the exception module: __init__.py contains all the classes, parser.py contains the parser, and creator.py contains the creator.
The classes are in __init__.py. The module also has the variable non_parsed, a dictionary containing all lines that were ignored by the parser. It is kept in sync with the variable of the same name in the NetscapeBookmarksFile class.
The NetscapeBookmarksFile class represents the Netscape Bookmark File.
Variables:
html # the bookmark file in a string
bookmarks # BookmarkFolder object containing the bookmark tree
non_parsed # lines of the file that haven't been parsed, synced with the global variable
doc_type # header info of the bookmark file
http_equiv_meta # header info of the bookmark file
content_meta # header info of the bookmark file
title # header info of the bookmark file
Without any other module imported, this class doesn't have any methods. When the parser module is imported, the method parse() is added to the class, and when the creator module is imported, the method create_file() is added to the class.
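The import-time attachment described above can be pictured with a small sketch. The names BookmarksFileSketch, parse_sketch and attach_parse are hypothetical stand-ins, not the library's code. The sketch only assumes that a plain function is assigned to the class as an attribute.

```python
class BookmarksFileSketch:
    """Hypothetical stand-in for the bookmark file class; it starts with no methods."""

    def __init__(self, html=""):
        self.html = html
        self.title = ""


def parse_sketch(self):
    """Toy parser: only extracts the text of the TITLE tag, if present."""
    start = self.html.find("<TITLE>")
    end = self.html.find("</TITLE>")
    if start != -1 and end != -1:
        self.title = self.html[start + len("<TITLE>"):end]
    return self


def attach_parse():
    # Assigning a function to a class attribute makes it a method
    # of every instance of that class.
    BookmarksFileSketch.parse = parse_sketch


# In a real module this call would sit at module level, so it runs on import.
attach_parse()

parsed = BookmarksFileSketch("<TITLE>Bookmarks</TITLE>").parse()
print(parsed.title)  # Bookmarks
```

The trade-off of this design is that the class has no parse() until the module that attaches it has been imported.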
The BookmarkItem class represents an item in the bookmarks. An item can be a folder or a shortcut (or a feed or a web slice). Its attributes are:
num # the position of the item in the folder it's in
add_date_unix # the creation date of the item in unix time
last_modified_unix # the last modification date of the item in unix time
parent # the parent folder of the item. Only the root folder has this set to None
name # name of the item
BookmarkItem is a data class (data classes were introduced in Python 3.7). It doesn't have any methods.
The BookmarkFolder class represents a folder in the bookmarks. It's a subclass of BookmarkItem and also a data class.
Variables:
personal_toolbar # true if the folder is the bookmarks toolbar
items # list that contains all items inside this folder
children # list that contains all subfolders inside this folder
shortcuts # list that contains all shortcuts inside this folder
Methods:
sync_items() # clears and fills the items list with the contents of the children and shortcuts lists
split_items() # splits the items list between the children and shortcuts
sort_items() # sorts the items list
sort_children_and_shortcuts() # sorts the children and shortcuts lists
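As a rough illustration of how sync_items() and split_items() relate the three lists, here is a minimal data class sketch. The class names ItemSketch, ShortcutSketch and FolderSketch are hypothetical, the parent attribute is left out, and the order used by sync_items (children first, then shortcuts) is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ItemSketch:
    # Hypothetical, reduced stand-in for BookmarkItem.
    num: int = 0
    name: str = ""


@dataclass
class ShortcutSketch(ItemSketch):
    href: str = ""


@dataclass
class FolderSketch(ItemSketch):
    items: List[ItemSketch] = field(default_factory=list)
    children: List["FolderSketch"] = field(default_factory=list)
    shortcuts: List[ShortcutSketch] = field(default_factory=list)

    def sync_items(self):
        # Clear items and refill it from children and shortcuts
        # (assumed order: children first, then shortcuts).
        self.items.clear()
        self.items.extend(self.children)
        self.items.extend(self.shortcuts)

    def split_items(self):
        # Rebuild children and shortcuts from the contents of items.
        self.children = [i for i in self.items if isinstance(i, FolderSketch)]
        self.shortcuts = [i for i in self.items if isinstance(i, ShortcutSketch)]


folder = FolderSketch(name="Folder")
folder.items = [
    ShortcutSketch(num=0, name="Google", href="http://www.google.com"),
    FolderSketch(num=1, name="Sub"),
]
folder.split_items()
print(len(folder.children), len(folder.shortcuts))  # 1 1
folder.sync_items()
print([item.name for item in folder.items])  # ['Sub', 'Google']
```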
The BookmarkShortcut class represents a shortcut in the bookmarks. It's a subclass of BookmarkItem and also a data class.
Variables:
href # link to the web page (or anything alike) of the shortcut
last_visit_unix # date when the web page was last visited, in unix time
private # equal to the PRIVATE attribute
tags # tags of this shortcut, if present
icon_url_fake # true if the ICON_URI attribute starts with fake-favicon-uri.
icon_url # the favicon url if icon_url_fake is false and the attribute ICON_URI is present
icon_base64 # the format/encoding of the favicon and the favicon encoded data. Commonly a PNG image encoded with base64. The string here can be really big
feed # true if the attribute FEED is present. Legacy support for feeds
web_slice # true if the attribute WEBSLICE is present. Legacy support for web slices
comment # comment of the shortcut if present
shortcut_url # the shortcut keyword associated with the shortcut, if set. Used by Firefox (see #8)
Doesn't have any methods.
The BookmarkFeed class represents a feed in the bookmarks. It's a subclass of BookmarkShortcut and also a data class.
Variables:
feed # overrides super and its value is True
feed_url # feed url
Doesn't have any methods. This is for legacy support
Represents a Web Slice in the bookmarks. It's a subclass of BookmarkShortcut and also a data class.
Variables:
web_slice # overrides super and its value is True
is_live_preview # same value as the ISLIVEPREVIEW attribute
preview_size # the preview size in the attribute PREVIEWSIZE. It's a string
Doesn't have any methods. This is for legacy support
The parser module contains the functions that parse a Netscape Bookmarks File.
attribute_finder() gets the attributes and their values from the inside of a tag (the attributes only) and returns them in a dictionary.
Ex:
tag = '<A ATTRIBUTE="value">text</A>'
inside = ' ATTRIBUTE="value"'
attribute_finder(inside) -> {'ATTRIBUTE': 'value'}
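One way to get this behaviour is a regular expression over NAME="value" pairs. The sketch below is an assumption about the approach, not the library's implementation, and the name find_attributes is hypothetical. Attributes without a value are not handled here.

```python
import re

# NAME="value" pairs; names may contain letters, digits, underscores and hyphens.
ATTRIBUTE_PATTERN = re.compile(r'([\w-]+)="([^"]*)"')


def find_attributes(inside):
    """Return the attributes found in the inside of an opening tag as a dict."""
    return {name: value for name, value in ATTRIBUTE_PATTERN.findall(inside)}


print(find_attributes(' HREF="http://www.google.com" ADD_DATE="1471007115"'))
# {'HREF': 'http://www.google.com', 'ADD_DATE': '1471007115'}
```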
doc_type_extractor() handles the !DOCTYPE tag, verifying whether it matches the expected one. It prints a warning if it doesn't. Returns the content of !DOCTYPE.
Ex:
tag = '<!DOCTYPE NETSCAPE-Bookmark-file-1>'
doc_type_extractor(tag) -> 'NETSCAPE-Bookmark-file-1'
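A minimal sketch of this check, assuming the expected content is NETSCAPE-Bookmark-file-1 as in the example above. The name extract_doc_type is hypothetical, and the sketch uses the warnings module where the description only says that a warning is printed.

```python
import warnings

# Assumed expected value, taken from the example above.
EXPECTED_DOC_TYPE = "NETSCAPE-Bookmark-file-1"


def extract_doc_type(tag):
    """Return the content of a <!DOCTYPE ...> tag, warning if it is unexpected."""
    content = tag.strip()
    prefix = "<!DOCTYPE"
    if content.upper().startswith(prefix) and content.endswith(">"):
        # Drop the leading "<!DOCTYPE" and the trailing ">".
        content = content[len(prefix):-1].strip()
    if content != EXPECTED_DOC_TYPE:
        warnings.warn("Unexpected DOCTYPE: " + content)
    return content


print(extract_doc_type("<!DOCTYPE NETSCAPE-Bookmark-file-1>"))  # NETSCAPE-Bookmark-file-1
```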
folder_tag_extractor() handles H3 tags, which represent folders. It creates a BookmarkFolder object and fills some variables with the H3 tag's info, like the name and add date. Returns the BookmarkFolder created.
Ex:
tag = '<DT><H3 ADD_DATE="1530184751">Folder</H3>'
folder_tag_extractor(tag) -> BookmarkFolder(add_date_unix=1530184751, name='Folder')
shortcut_tag_extractor() handles A tags, which commonly represent shortcuts, and their comments. It creates a BookmarkShortcut (or, rarely, one of its subclasses, BookmarkFeed or BookmarkWebSlices) and fills some variables with the A tag's info, like name, href and comment. Returns the created BookmarkShortcut (or subclass).
Ex:
tag = '<DT><A HREF="http://www.google.com" ADD_DATE="1471007115">Google</A>'
comment = 'Google!'
shortcut_tag_extractor(tag, comment) -> BookmarkShortcut(href='http://www.google.com', add_date_unix=1471007115, name='Google', comment='Google!')
shortcut_handler() handles shortcuts and their comments, the A and DD tags respectively. It verifies whether a_tag contains both the opening and the closing of the A tag and prints a warning if it doesn't. It then extracts the comment from the DD tag and calls shortcut_tag_extractor(), returning the object that call returns. The line argument is used for the warning.
Ex:
line = 56
a_tag = '<DT><A HREF="http://www.google.com" ADD_DATE="1471007115">Google</A>'
dd_tag = '<DD>Google!'
shortcut_handler(line, a_tag, dd_tag) -> BookmarkShortcut(href='http://www.google.com', add_date_unix=1471007115, name='Google', comment='Google!')
folder_handler() handles folders and their trees. It verifies that the H3 tag has its opening and closing, calls folder(h3_tag) and processes the body. The body processing is recursive. Shortcuts inside the folder are turned into BookmarkShortcut objects (or a subclass) by calling shortcut_handler. For each subfolder, the body is checked for the closing tag </DL><p>. If it isn't found, an exception is raised, because the bookmarks file doesn't have the same number of <DL><p> and </DL><p> tags, which wrap a folder body. After that check, the function calls itself recursively on the subfolder. It's responsible for filling num, parent, items, shortcuts and children in the BookmarkShortcut and BookmarkFolder objects. Returns a BookmarkFolder object with every possible variable filled and the folder tree inside the items, shortcuts and children lists.
Ex:
line = 52
h3_tag = '<DT><H3 ADD_DATE="1530184751">Folder</H3>'
body = ['<DL><p>', '<DT><A HREF="http://www.google.com" ADD_DATE="1471007115">Google</A>', '</DL><p>']
folder_handler(line, h3_tag, body) -> BookmarkFolder(add_date_unix=1530184751, name='Folder', items=[BookmarkShortcut(href='http://www.google.com', add_date_unix=1471007115, name='Google')], shortcuts=[BookmarkShortcut(href='http://www.google.com', add_date_unix=1471007115, name='Google')])
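The recursion over folder bodies can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: it builds plain dictionaries instead of BookmarkFolder objects, ignores DD comments, and the names TagNotPresentSketch, find_body_end and parse_folder are hypothetical.

```python
import re


class TagNotPresentSketch(Exception):
    """Hypothetical stand-in for the exception raised when a required tag is missing."""


def find_body_end(lines, start):
    """Return the index of the </DL><p> that closes the <DL><p> at lines[start]."""
    depth = 0
    for index in range(start, len(lines)):
        stripped = lines[index].strip()
        if stripped.startswith("<DL>"):
            depth += 1
        elif stripped.startswith("</DL>"):
            depth -= 1
            if depth == 0:
                return index
    raise TagNotPresentSketch("no </DL><p> closes the body starting at index %d" % start)


def parse_folder(name, body):
    """Build a nested dict from a body that starts with <DL><p> and ends with </DL><p>."""
    folder = {"name": name, "items": []}
    index = 1  # skip the opening <DL><p>
    while index < len(body) - 1:  # stop before the closing </DL><p>
        line = body[index].strip()
        h3 = re.search(r"<H3[^>]*>(.*?)</H3>", line)
        if h3:
            end = find_body_end(body, index + 1)
            # Recursion: the subfolder body is parsed by this same function.
            folder["items"].append(parse_folder(h3.group(1), body[index + 1:end + 1]))
            index = end + 1
            continue
        a = re.search(r'<A[^>]*HREF="([^"]*)"[^>]*>(.*?)</A>', line)
        if a:
            folder["items"].append({"name": a.group(2), "href": a.group(1)})
        index += 1
    return folder


body = [
    "<DL><p>",
    '<DT><A HREF="http://www.google.com" ADD_DATE="1471007115">Google</A>',
    "<DT><H3>Sub</H3>",
    "<DL><p>",
    '<DT><A HREF="https://duckduckgo.com">Duck Duck Go</A>',
    "</DL><p>",
    "</DL><p>",
]
tree = parse_folder("Folder", body)
print([item["name"] for item in tree["items"]])  # ['Google', 'Sub']
```

Counting nesting depth is what lets the sketch detect a file whose <DL><p> and </DL><p> tags do not balance.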
Responsible for starting the parsing process. It also gets the H1 tag's content and makes a fake H3 tag with it, then calls folder_handler(), passing the fake H3 tag and the entire body of the root bookmarks folder. Returns the parsed NetscapeBookmarksFile. This function is added to the NetscapeBookmarksFile class at import time.
Responsible for adding the parse() method to the NetscapeBookmarksFile class. It's executed when the module is imported.
The creator module contains the functions needed to create a Netscape Bookmark File. It's almost like the parser in reverse: many functions do the opposite of their parser counterparts.
Verifies whether the url has http:// or https:// at the start. If it does, returns the url unchanged; if it doesn't, puts http:// at the start. This function was created because the shortcut's HREF attribute must start with http:// or https://.
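The check described above fits in a few lines. The function name ensure_http_scheme is hypothetical, and the comparison is case-sensitive in this sketch.

```python
def ensure_http_scheme(url):
    """Return url unchanged if it starts with http:// or https://, else prefix http://."""
    if url.startswith(("http://", "https://")):
        return url
    return "http://" + url


print(ensure_http_scheme("duckduckgo.com"))          # http://duckduckgo.com
print(ensure_http_scheme("https://duckduckgo.com"))  # https://duckduckgo.com
```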
attribute_printer() creates a string with the attributes and their values from the dictionary received. If an attribute doesn't have a value, it'll be printed without the = and the value.
Ex:
attribute_printer({'ATTRIBUTE': 'value'}) -> 'ATTRIBUTE="value"'
attribute_printer({'ATTRIBUTE': ''}) -> 'ATTRIBUTE'
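A sketch that reproduces the two examples above. The name print_attributes is hypothetical, and joining several attributes with a single space is an assumption.

```python
def print_attributes(attributes):
    """Render a dict as NAME="value" pairs; empty values are rendered as NAME only."""
    parts = []
    for name, value in attributes.items():
        if value == "":
            parts.append(name)
        else:
            parts.append('%s="%s"' % (name, value))
    # Assumption: attributes are separated by one space.
    return " ".join(parts)


print(print_attributes({"ATTRIBUTE": "value"}))  # ATTRIBUTE="value"
print(print_attributes({"ATTRIBUTE": ""}))       # ATTRIBUTE
```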
meta_creator() creates the meta (start) of the file. h1 is the name of the root bookmark folder. If no argument is passed, a default meta is created.
Ex:
out = creator.meta_creator('x', ['y', 'z'], 'B', 'b')
out == '''<!DOCTYPE x>
<!-- This is an automatically generated file.
It will be read and overwritten.
DO NOT EDIT! -->
<META HTTP-EQUIV="y" CONTENT="z">
<TITLE>B</TITLE>
<H1>b</H1>
'''
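The header layout shown in the example can be reproduced with a short sketch. The name build_meta, the parameter names and the default values are assumptions for illustration; the defaults follow the header commonly seen in Netscape bookmark files and are not confirmed by the library.

```python
def build_meta(doc_type="NETSCAPE-Bookmark-file-1",
               meta=("Content-Type", "text/html; charset=UTF-8"),
               title="Bookmarks",
               h1="Bookmarks"):
    """Return the header of a bookmark file as one string ending with a line break."""
    lines = [
        "<!DOCTYPE %s>" % doc_type,
        "<!-- This is an automatically generated file.",
        "It will be read and overwritten.",
        "DO NOT EDIT! -->",
        '<META HTTP-EQUIV="%s" CONTENT="%s">' % (meta[0], meta[1]),
        "<TITLE>%s</TITLE>" % title,
        "<H1>%s</H1>" % h1,
    ]
    return "\n".join(lines) + "\n"


print(build_meta("x", ["y", "z"], "B", "b"))
```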
shortcut_creator() creates a shortcut A tag from the BookmarkShortcut. If a value in the object is the default, that attribute won't be printed. A list is returned, with the first element being the A tag and the second, if present, the DD tag.
Ex:
arg = Classes.BookmarkShortcut()
arg.href = 'https://duckduckgo.com'
arg.name = 'Duck Duck Go'
out = creator.shortcut_creator(arg)
out == ['<DT><A HREF="https://duckduckgo.com">Duck Duck Go</A>']
arg = Classes.BookmarkShortcut()
arg.href = 'https://duckduckgo.com'
arg.name = 'Duck Duck Go'
arg.add_date_unix = 1515
out = creator.shortcut_creator(arg)
out == ['<DT><A HREF="https://duckduckgo.com" ADD_DATE="1515">Duck Duck Go</A>']
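The rule that default values are left out can be sketched like this. The function name create_shortcut_lines is hypothetical, it takes plain arguments instead of a BookmarkShortcut, and the defaults (0 for the date, an empty string for the comment) are assumptions.

```python
def create_shortcut_lines(name, href, add_date_unix=0, comment=""):
    """Return [A tag] or [A tag, DD tag]; attributes left at their default are skipped."""
    attributes = ['HREF="%s"' % href]
    if add_date_unix != 0:  # assumed default: 0 means "not set"
        attributes.append('ADD_DATE="%d"' % add_date_unix)
    lines = ["<DT><A %s>%s</A>" % (" ".join(attributes), name)]
    if comment:  # the DD tag is only emitted when there is a comment
        lines.append("<DD>" + comment)
    return lines


print(create_shortcut_lines("Duck Duck Go", "https://duckduckgo.com", add_date_unix=1515))
# ['<DT><A HREF="https://duckduckgo.com" ADD_DATE="1515">Duck Duck Go</A>']
```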
folder_creator() creates a folder H3 tag from the BookmarkFolder. If a value in the object is the default, that attribute won't be printed.
Ex:
arg = creator.BookmarkFolder()
arg.name = 'Folder'
shortcut = creator.BookmarkShortcut()
shortcut.name = 'Duck Duck Go'
shortcut.href = 'https://duckduckgo.com'
arg.items.append(shortcut)
out = creator.folder_creator(arg)
out == ['<DT><H3>Folder</H3>',
'<DL><p>',
' <DT><A HREF="https://duckduckgo.com">Duck Duck Go</A>',
'</DL><p>']
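The folder creation is the parser's recursion in reverse. A minimal sketch, using plain dictionaries instead of BookmarkFolder objects; the name create_folder_lines and the four-space indentation are assumptions.

```python
def create_folder_lines(folder, indent="    "):
    """Return the lines for a folder dict: H3 tag, <DL><p>, indented items, </DL><p>."""
    lines = ["<DT><H3>%s</H3>" % folder["name"], "<DL><p>"]
    for item in folder["items"]:
        if "items" in item:
            # Recursion: a subfolder produces its own block of lines.
            inner = create_folder_lines(item, indent)
        else:
            inner = ['<DT><A HREF="%s">%s</A>' % (item["href"], item["name"])]
        lines.extend(indent + line for line in inner)
    lines.append("</DL><p>")
    return lines


folder = {
    "name": "Folder",
    "items": [{"name": "Duck Duck Go", "href": "https://duckduckgo.com"}],
}
for line in create_folder_lines(folder):
    print(line)
```

Each level of nesting adds one more indent, so the depth of a folder is visible in the output.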
Responsible for creating the html file. It creates the meta (if print_meta == True) and starts the folder creation recursion. The file is written to netscape_bookmarks_file.html and the function returns the lines of the file, without line breaks. This function is added to the NetscapeBookmarksFile class at import time.
Responsible for adding the create_file() method to the NetscapeBookmarksFile class. It's run at import time.
The exception module has the only exception that the parser can raise, TagNotPresentException(), raised when a required tag isn't found.