IO numbers of functions using DB stuff #2216

Closed
lfdversluis opened this issue May 18, 2016 · 3 comments

@lfdversluis

Check this for the three database managers and aggregate the numbers.

Including:

  • How many functions use cursors
  • How many of these call e.g. dispersy or GUI items
@lfdversluis

Investigated sqlitecachedb and checked the callers of the functions that return database items. The results indicate that no Dispersy or GUI operations are mixed into loops that use the database cursor. I think I got all cases, but if I missed one please let me know. I will investigate the Dispersy database manager today as well.

SQLiteCacheDB

25 functions in Tribler call SQLiteCacheDB functions that return database objects.
4/25 make use of cursor objects being returned = 16%
zero of those four are doing dispersy calls or GUI calls.

The functions that return database objects are listed with their callers below.

get_cursor function

sqlitecachedbhandler.getSearchSuggestion -> uses cursor in function

execute function

tracker_manager.add_tracker -> uses cursor in function
tracker_manager.initialize -> uses cursor in function - Can be refactored to fetchall
tracker_manager.update_tracker_info -> does an execute, but ignores the cursor. The DB manager can have such a function that just returns None.

db_upgrader.UPGRADE_FUNCTIONS (5) that execute -> besides the one mentioned below, the other 5 ignore the cursor, can be handled in the same way as update_tracker_info above.

db_upgrader._upgrade_22_to_23 -> loops over a cursor two times; one loop can be replaced by fetchall, the other probably cannot.

test_sqlitedb.test_create_db -> Does not use the cursor, so can be the same solution as update_tracker_info

executemany function

sqlitecachedbhandler.channelcastdbhandler.on_torrents_from_dispersy -> two times executemany, both ignore the cursor
sqlitecachedbhandler.channelcastdbhandler.update_nr_torrents -> two times executemany, both ignore the cursor.

sqlitecachedbhandler.torrentdbhandler.addExternalTorrentNoDef -> ignores the cursor
idem.addOrGetTorrentIDSReturn -> ignores the cursor
idem.addTorrentTrackerMappingInBatch -> ignores the cursor
idem.addTrackerInfoInBatch -> ignores the cursor
idem.freeSpace -> ignores the cursor
idem.on_search_response -> loops over two cursors; in one of the loops the _indexTorrent function is called, which does DB operations.
idem.on_torrent_collect_response -> ignores the cursor
idem.updateTrackerInfo -> ignores the cursor

sqlitecachedbhandler.votecastdbhandler._flush_to_database -> ignores the cursor
idem.on_remove_votes_from_dispersy -> ignores the cursor
idem.on_votes_from_dispersy -> ignores the cursor

db_upgrader._upgrade_22_to_23 -> ignores the cursor

execute_read function

Can be removed. It only calls execute and is only used internally by sqlitecachedb. Does not return a cursor.
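
To make the three refactoring targets above concrete (a plain execute that returns None, a fetchone, and a fetchall), here is a minimal sketch of a manager that never hands a cursor to its caller. The class name `DatabaseManager`, the `tracker` table and its columns are assumptions for illustration only, not the actual Tribler code.

```python
import sqlite3


class DatabaseManager(object):
    """Hypothetical wrapper that never returns a cursor to the caller."""

    def __init__(self, path=":memory:"):
        self._connection = sqlite3.connect(path)

    def execute(self, sql, args=()):
        # For callers that ignore the cursor: run the statement, return None.
        self._connection.execute(sql, args)

    def executemany(self, sql, seq_of_args):
        # Same idea for batch statements.
        self._connection.executemany(sql, seq_of_args)

    def fetchone(self, sql, args=()):
        # Returns the first row as a tuple, or None when there is no row.
        return self._connection.execute(sql, args).fetchone()

    def fetchall(self, sql, args=()):
        # Returns all rows as a list, so callers loop over data, not a cursor.
        return self._connection.execute(sql, args).fetchall()


db = DatabaseManager()
db.execute("CREATE TABLE tracker (url TEXT, failures INTEGER)")
db.executemany("INSERT INTO tracker VALUES (?, ?)",
               [("udp://a.example:80", 0), ("udp://b.example:80", 3)])
rows = db.fetchall("SELECT url, failures FROM tracker ORDER BY url")
first = db.fetchone("SELECT COUNT(*) FROM tracker")
```

With this shape, the "ignores the cursor" callers map onto `execute`/`executemany`, and the cursor loops map onto iterating over the list from `fetchall`.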

@lfdversluis

lfdversluis commented May 22, 2016

Multichain database

A subclass of Database in Dispersy, uses a different multichain.db file.

12 callers that get a database object (cursor).
3/12 lines that do a call ignore the cursor (25%).

execute function

add_block -> ignores the cursor.
update_block_with_responder -> ignores the cursor
get_latest_hash -> fetches the result of the cursor, can be refactored by a fetchone.
get_by_hash_requester -> fetches the first result of the cursor. Can be refactored by a fetchone (note: it uses a fetchone in the current code but in this case I mean the fetchone of the new database manager).
get_by_hash -> idem as get_by_hash_requester
get_by_public_key_and_sequence_number -> idem as get_by_hash_requester
get_blocks_since -> Loops over all the results, can be refactored by a fetchall
get_all_hash_requester -> Loops over all the results, can be refactored by returning a fetchall from the new db manager
contains -> grabs the first result, can be refactored by a fetchone
get_latest_sequence_number -> idem as contains
get_total -> idem as contains

executescript

check_database -> ignores the cursor returned.

Question for @pimveldhuisen and @pimotte: How much refactoring would it take if these functions became asynchronous and returned Deferreds, so that their callers would have to be modified to handle this?
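
To give a feel for the caller-side change this question is about, here is a rough sketch. It uses `concurrent.futures.Future` from the standard library as a stand-in for a Twisted Deferred, purely so the example runs without extra dependencies; with Twisted the caller would attach a callback with `addCallback` instead of calling `result()`. The class `AsyncBlockStore` and the `block` table are hypothetical and much simpler than the real multichain schema.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


class AsyncBlockStore(object):
    """Hypothetical store whose methods return a Future instead of a row."""

    def __init__(self):
        # A single worker serialises all access to the shared connection.
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._connection = sqlite3.connect(":memory:",
                                           check_same_thread=False)
        self._connection.execute(
            "CREATE TABLE block (hash TEXT, sequence_number INTEGER)")

    def _run(self, sql, args):
        # Executed on the worker thread; returns the first row or None.
        return self._connection.execute(sql, args).fetchone()

    def add_block(self, block_hash, sequence_number):
        return self._executor.submit(
            self._run, "INSERT INTO block VALUES (?, ?)",
            (block_hash, sequence_number))

    def get_latest_sequence_number(self):
        return self._executor.submit(
            self._run, "SELECT MAX(sequence_number) FROM block", ())


store = AsyncBlockStore()
store.add_block("aa", 1)
store.add_block("bb", 2)

# A synchronous caller used the returned row directly. An asynchronous
# caller gets a Future and must wait for it or attach a callback.
future = store.get_latest_sequence_number()
latest = future.result()[0]
```

The point of the sketch is that every caller of such a function has to change from using a value to handling a pending result, which is where most of the refactoring effort would go.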

@lfdversluis

lfdversluis commented May 22, 2016

Dispersy/database.py

execute function

community.py.Community._dispersy_claim_sync_bloom_filter_modulo -> uses the execute function three times. One can be refactored by a fetchone probably, the other two by fetchall
idem._download_master_member_identity -> can be refactored by a fetchone
idem._initialize_timeline -> loops over cursor, can be replaced by fetchall
idem._select_and_fix -> loops over cursor. Can be replaced by fetchall
idem.dispersy_auto_load -> ignores cursor
idem.dispersy_auto_load -> can be refactored by using fetchone
idem.get_master_members -> weird function that applies a reference to the execute function to a return statement. Quite ugly/complex return statement too.
idem.get_member -> can be replaced by a fetchone
idem.initialize -> several usages of execute, can be refactored with fetchone, fetchall and db manager functions that return None (just execute the query).
idem.update_global_time -> ignores the cursor

community.HardKilledCommunity.initialize -> can be refactored by using a fetchone and return that value from the dbmanager

dispersy.Dispersy._check_full_sync_distribution_batch -> also gets a reference to the execute function and applies it in a loop. Will require refactoring, using a DeferredList or something similar.
idem._is_duplicate_sync_message -> two are fetching data from the cursor and 1 line ignores it
idem._store -> 1 ignore case, 2 fetchone, 2 fetchall and a loop that calls execute which also does a fetchone.
idem.check_double_member_and_global_time -> one fetchall and one that gets ignored
idem.get_community -> fetches tuple from cursor, can be refactored with fetchone
idem.get_last_message -> fetches number from tuple, can be refactored using fetchone
idem.get_member -> 3 lines that fetch items, 2 that ignore the cursor
idem.get_member_from_database_id -> fetches number from tuple, can be refactored using fetchone
idem.get_message -> fetches data from tuple, can be refactored using fetchone
idem.load_message -> fetches data from tuple, can be refactored using fetchone
idem.load_message_by_packetid -> fetches data from tuple, can be refactored using fetchone
idem.reclassify_community -> 1 line that ignores the cursor and one that fetches data
idem.sanity_check -> 2 loops that will need fetchall, 3 lines that fetch a tuple and 2 lines that ignore the cursor
idem.sanity_check.select -> a line that can be refactored by using fetchall

dispersydatabase.DispersyDatabase.check_community_database -> 2 fetchone cases, one line that ignores the cursor and 4 fetchall that loop over data
idem.check_database -> 1 loop, 2 fetch data from tuple and 1 line that ignores the cursor.

allchannelcommunity.community._get_packet_from_dispersy_id -> fetches data from a cursor tuple, can be refactored by a fetchone
idem.check_votecast -> performs the execute in a loop, ignores the cursor

Channelcommunity.community._get_latest_channel_message -> fetch data from a tuple, can be refactored with a fetchone
idem._get_packet_id -> fetches data from a tuple, can be refactored by a fetchone
channelcommunity.channelconversion._get_message -> fetches data from a tuple, can be refactored by a fetchone

Searchcommunity.community.Searchcommunity._get_packet_from_dispersy_id -> fetches data from a tuple, can be refactored by a fetchone

Community.community._claim_master_member_sequence_number -> fetches data from a tuple, can be refactored by a fetchone
idem._get_packets_for_bloomfilters -> interesting case as it returns a generator that gets items from a cursor. Can probably be refactored by using a fetchall and yielding one item at a time.
idem._update_timerange -> loops over elements in a cursor.
idem.check_undo -> 2 lines that fetch data from a tuple and one that loops over a cursor
idem.create_undo -> 1 line that fetches a tuple and one that loops over a cursor
idem.fetch_packets -> loops over a cursor
idem.initialize -> ignores the cursor from execute
idem.on_destroy_community -> 2 lines that fetch a tuple, 2 lines that ignore the cursor
idem.on_missing_identity -> 2 loops over a cursor
idem.on_missing_message -> one line that fetches a tuple
idem.on_missing_proof -> one line that fetches a tuple.
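
The generator case noted for _get_packets_for_bloomfilters could look roughly like the sketch below. The function names, the `sync` table and its columns are assumptions for illustration; the real function is more involved.

```python
import sqlite3


def packets_lazy(connection):
    # Current shape: the generator holds a live cursor between yields.
    for (packet,) in connection.execute(
            "SELECT packet FROM sync ORDER BY id"):
        yield packet


def packets_eager(connection):
    # Refactored shape: on the first next() call all rows are fetched,
    # after which items are yielded one at a time from the list.
    rows = connection.execute(
        "SELECT packet FROM sync ORDER BY id").fetchall()
    for (packet,) in rows:
        yield packet


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE sync (id INTEGER PRIMARY KEY, packet TEXT)")
connection.executemany("INSERT INTO sync (packet) VALUES (?)",
                       [("p1",), ("p2",), ("p3",)])
lazy = list(packets_lazy(connection))
eager = list(packets_eager(connection))
```

Callers see the same items in the same order. The trade-off is that the eager version holds the full result set in memory, which may matter for large sync tables.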

community.Trackercommunity.dispersy_cleanup_community -> uses the execute in a loop where a tuple is being fetched with next().

test_torrent_upgrade_63_64 -> two lines that convert a cursor to a list.

note
There are also 63 probable usages indicated by my PyCharm IDE. I did not check these.
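
A small script could help triage such unchecked usages. The sketch below is not how the numbers in this thread were obtained; it is one possible approach using Python's `ast` module, and it only separates calls whose result is discarded from calls whose result is used in some way. The sample source and function names are made up for the example.

```python
import ast

SOURCE = '''
def update_tracker_info(db):
    db.execute("UPDATE tracker SET failures = 0")

def get_latest_hash(db):
    return db.execute("SELECT hash FROM block").fetchone()

def get_blocks_since(db):
    for row in db.execute("SELECT * FROM block"):
        yield row
'''

DB_METHODS = {"execute", "executemany", "executescript"}


def is_db_call(node):
    # Matches <something>.execute(...), .executemany(...), .executescript(...)
    return (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in DB_METHODS)


def classify(source):
    """Count DB calls whose result is discarded versus used."""
    tree = ast.parse(source)
    ignored = set()
    for node in ast.walk(tree):
        # A call that forms a whole expression statement discards its result.
        if isinstance(node, ast.Expr) and is_db_call(node.value):
            ignored.add(id(node.value))
    counts = {"ignored": 0, "used": 0}
    for node in ast.walk(tree):
        if is_db_call(node):
            counts["ignored" if id(node) in ignored else "used"] += 1
    return counts


counts = classify(SOURCE)
```

Matching on the method name alone will also catch unrelated objects that happen to have an `execute` method, so the output would still need a manual pass.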

Executemany function

All 11 callers ignore the cursor being returned.

(dispersy)
community.community.initialize -> two lines that ignore the cursor

(tribler)
idem._update_timerange -> two lines that ignore the cursor
idem.on_undo -> one line that ignores the cursor.

Dispersydatabase.check_community_database -> 6 lines that ignore the cursor

executescript function

25 calls to this function, 25/25 ignore the cursor (100%).

Bartercast.StatisticsDatabase.check_database -> ignores the cursor
idem.cleanup -> ignores the cursor

Dispersydatabase.check_database -> 23 cases, all ignore the cursor.

@whirm whirm added this to the V6.6 WX3 milestone May 26, 2016