
Throughput Test #7

Open

LiXizhi opened this issue Jun 13, 2017 · 7 comments

@LiXizhi
Contributor

LiXizhi commented Jun 13, 2017

No description provided.

@liuluheng
Collaborator

liuluheng commented Jun 28, 2017

Phase 1

  • InsertThroughputNoIndex takes about 6 min for 10000 records with 100 concurrent jobs (roughly 28 inserts per second).
  • This is a little slow.

To boost the throughput, see Plan for Phase 2.

@liuluheng liuluheng self-assigned this Jun 28, 2017
@LiXizhi
Contributor Author

LiXizhi commented Jun 29, 2017

If it is that slow, I think we need to find the real spot that slows the system down before moving on to other plans.

How did you test the throughput? Is each record a transaction? We can also send the next query without waiting for the previous one to return; what is the throughput in that case? That is closer to the real situation, where all requests are processed independently of each other in parallel.

@liuluheng
Collaborator

liuluheng commented Jun 29, 2017

Is each record a transaction?

Yes, each record is inserted with the insertOne method.

The code is as below. It seems that much of the time is spent waiting for the reply between roundtrips (10 records each). I'll retest with a lower timeout and more concurrent jobs to see if it gets better.

function TestInsertThroughputNoIndex()
	NPL.load("(gl)script/ide/System/Database/TableDatabase.lua");
	local TableDatabase = commonlib.gettable("System.Database.TableDatabase");

	-- this will start both the db client and the db server if they are not running yet.
	local db = TableDatabase:new():connect("temp/test_raft_database/");
	
	db.insertNoIndex:makeEmpty({});
	db.insertNoIndex:flush({});
		
	NPL.load("(gl)script/ide/Debugger/NPLProfiler.lua");
	local npl_profiler = commonlib.gettable("commonlib.npl_profiler");
	npl_profiler.perf_reset();

	npl_profiler.perf_begin("tableDB_BlockingAPILatency", true)
	local total_times = 10000; -- 10000 non-indexed insert operation
	local max_jobs = 10; -- concurrent jobs count
	NPL.load("(gl)script/ide/System/Concurrent/Parallel.lua");
	local Parallel = commonlib.gettable("System.Concurrent.Parallel");
	local p = Parallel:new():init()
	p:RunManyTimes(function(count)
		db.insertNoIndex:insertOne(nil, {count=count, data=math.random()}, function(err, data)
			if(err) then
				-- echo({err, data});
			end
			p:Next();
		end)
	end, total_times, max_jobs):OnFinished(function(total)
		npl_profiler.perf_end("tableDB_BlockingAPILatency", true)
		log(commonlib.serialize(npl_profiler.perf_get(), true));			
	end);
end

@LiXizhi
Contributor Author

LiXizhi commented Jun 30, 2017

OK. The round trip is also taking too much time; figure this out.

@liuluheng
Collaborator

liuluheng commented Jun 30, 2017

After fixing the memory leak in RPC and raising the concurrent jobs to 100, InsertThroughputNoIndex with 10000 records now takes about 6 minutes.

@liuluheng
Collaborator

liuluheng commented Jul 2, 2017

Roundtrip

The roundtrip is about 21.96 ms (10000 records / 219.625 s). The main cost is the redundant log messages and the log consistency check caused by replication. If we remove the term check in Raft, the roundtrip can drop to 7.61 ms (10000 records / 76.124 s).
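
For clarity, the quoted averages are just the total elapsed time divided by the record count; a quick check using the totals from the numbers above:

local function avgRoundtripMs(totalSeconds, records)
	return totalSeconds / records * 1000;
end
print(avgRoundtripMs(219.625, 10000));  -- ~21.96 ms with the term check
print(avgRoundtripMs(76.124, 10000));   -- ~7.61 ms without the term check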

Reason

As described above, the redundant log messages and the log consistency check carry a large performance penalty. They can also make the nodes in the cluster lag behind in consistency, which makes the performance even worse over time.

A little about the Raft implementation

Log messages are passed from the leader to the followers only. This happens at each client request and at each heartbeat interval. On each pass the leader tries to send as many log entries as possible, starting from the lastLogIndex memorized for each peer.
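
For illustration, a minimal sketch of that replication step is below; all names (peers, log, lastLogIndex, sendAppendEntries, MAX_ENTRIES_PER_MSG) are assumptions for explanation, not the identifiers used in this repository.

local MAX_ENTRIES_PER_MSG = 100;  -- assumed cap on entries per message

-- called on every client request and on every heartbeat interval
local function replicateToPeers(leader, sendAppendEntries)
	for _, peer in ipairs(leader.peers) do
		-- send as many entries as possible, starting right after the
		-- lastLogIndex memorized for this peer
		local startIndex = peer.lastLogIndex + 1;
		local entries = {};
		local lastIndex = math.min(#leader.log, startIndex + MAX_ENTRIES_PER_MSG - 1);
		for i = startIndex, lastIndex do
			entries[#entries + 1] = leader.log[i];
		end
		-- prevLogIndex/prevLogTerm are what the follower uses for the log consistency check
		local prev = leader.log[startIndex - 1];
		sendAppendEntries(peer, {
			term = leader.currentTerm,
			prevLogIndex = startIndex - 1,
			prevLogTerm = prev and prev.term or 0,
			entries = entries,
			leaderCommit = leader.commitIndex,
		});
	end
end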

Improvement

The log consistency check is essential and necessary, but the redundant log messages can be alleviated if we use a WAL as the Raft log. A WAL works like a batch of log entries on top of the current implementation.
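
A minimal sketch of that batching idea, with assumed names (walBatch, WAL_BATCH_SIZE, flushWAL); it only shows the shape of the change, not the actual WAL code:

local WAL_BATCH_SIZE = 64;  -- assumed batch threshold
local walBatch = {};        -- pending log entries

local function appendToWAL(entry, flushWAL)
	walBatch[#walBatch + 1] = entry;
	if(#walBatch >= WAL_BATCH_SIZE) then
		-- one disk write and one replication message for the whole batch,
		-- instead of one message per entry
		flushWAL(walBatch);
		walBatch = {};
	end
end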

Throughput

Besides the reasons above, the LogBuffer size affects the throughput a lot: with a buffer size of 1000, the run for 10000 records with 10 concurrent jobs takes about 6 min, but with a size of 10000 the time drops to 1-2 min. This is also caused by the single-threaded implementation: all logic, including reading the log file, is handled in the main thread, and the file reads can be very slow, which stalls the network message processing.
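
An illustrative sketch of the buffer behaviour described here (not the repository's actual LogBuffer code): entries still in memory are served immediately, while anything already evicted must be re-read from the log file on the same main thread.

local LogBuffer = {};
LogBuffer.__index = LogBuffer;

function LogBuffer.new(size)
	return setmetatable({size = size, entries = {}, firstIndex = 1}, LogBuffer);
end

function LogBuffer:append(index, entry)
	self.entries[index] = entry;
	-- evict the oldest entry once the buffer is full
	if(index - self.firstIndex + 1 > self.size) then
		self.entries[self.firstIndex] = nil;
		self.firstIndex = self.firstIndex + 1;
	end
end

function LogBuffer:get(index, readFromLogFile)
	local entry = self.entries[index];
	if(entry) then
		return entry;               -- fast path: served from memory
	end
	return readFromLogFile(index);  -- slow path: blocks the single main thread
end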

Because the node state in the cluster lags behind, sending requests in parallel (RunManyTimes) is slower than sending them sequentially.
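
For comparison, a sequential baseline could look like the sketch below. It reuses the insertOne call from the test code above; RunSequentialInserts itself is a hypothetical helper, and the elapsed time divided by total_times then gives the average roundtrip.

local function RunSequentialInserts(db, total_times, onFinished)
	local count = 0;
	local function insertNext()
		count = count + 1;
		if(count > total_times) then
			return onFinished(total_times);
		end
		db.insertNoIndex:insertOne(nil, {count = count, data = math.random()}, function(err, data)
			insertNext();  -- issue the next request only after this reply arrives
		end)
	end
	insertNext();
end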

@LiXizhi

@LiXizhi
Contributor Author

LiXizhi commented Jul 3, 2017

OK.

  • Roundtrip is the average over sequential queries, i.e. the average length of each individual query. So it is usually not affected by how many queries are run in parallel.
  • You need to move the IO to another thread when testing, as IO can be the really slow part (see the sketch below the list).
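
A minimal sketch of that idea, assuming NPL's CreateRuntimeState/activate API; the file names (LogFileReader.lua), the callback target, and the ReadEntriesFromFile helper are hypothetical:

-- main thread: create a worker runtime state and hand the slow read off to it
local worker = NPL.CreateRuntimeState("io_worker", 0);
worker:Start();
NPL.activate("(io_worker)script/test/LogFileReader.lua", {
	logfile = "temp/test_raft_database/raft.log",  -- assumed log file path
	fromIndex = 1000,
	callback_file = "(main)script/test/TestInsertThroughput.lua",  -- assumed caller
});

-- script/test/LogFileReader.lua (runs in the io_worker thread)
NPL.this(function()
	local msg = msg;
	-- read the requested entries off the main thread, then post the result back
	local entries = ReadEntriesFromFile(msg.logfile, msg.fromIndex);  -- assumed helper
	NPL.activate(msg.callback_file, {entries = entries});
end);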

@liuluheng

@LiXizhi LiXizhi added this to the Alpha1 milestone Jul 3, 2017