
Throughput Test #7

Open

LiXizhi opened this issue Jun 13, 2017 · 7 comments

@LiXizhi
Contributor

LiXizhi commented Jun 13, 2017

No description provided.

@liuluheng
Collaborator

liuluheng commented Jun 28, 2017

Phase 1

  • InsertThroughputNoIndex takes about 6 min for 10000 records with 100 concurrent jobs (roughly 28 inserts per second).
  • This is a little slow.

To boost the throughput, see Plan for Phase 2.

@liuluheng liuluheng self-assigned this Jun 28, 2017
@LiXizhi
Contributor Author

LiXizhi commented Jun 29, 2017

If it is that slow, I think we need to find the real spot that slows the system down before moving on to other plans.

How did you test the throughput? Is each record a transaction? We can also send the next query without waiting for the previous one to return; what is the throughput in that case? That is closer to the real situation, where all requests are processed independently of each other in parallel.

@liuluheng
Collaborator

liuluheng commented Jun 29, 2017

Is each record a transaction?

Yes, each record is inserted with the insertOne method.

The code is as below. It seems that much of the time is spent waiting for the reply between roundtrips (10 records each). I'll retest with a lower timeout and more concurrent jobs to see if it gets better.

function TestInsertThroughputNoIndex()
	NPL.load("(gl)script/ide/System/Database/TableDatabase.lua");
	local TableDatabase = commonlib.gettable("System.Database.TableDatabase");

	-- this will start both the db client and the db server if they are not running yet.
	local db = TableDatabase:new():connect("temp/test_raft_database/");
	
	db.insertNoIndex:makeEmpty({});
	db.insertNoIndex:flush({});
		
	NPL.load("(gl)script/ide/Debugger/NPLProfiler.lua");
	local npl_profiler = commonlib.gettable("commonlib.npl_profiler");
	npl_profiler.perf_reset();

	npl_profiler.perf_begin("tableDB_BlockingAPILatency", true)
	local total_times = 10000; -- 10000 non-indexed insert operation
	local max_jobs = 10; -- concurrent jobs count
	NPL.load("(gl)script/ide/System/Concurrent/Parallel.lua");
	local Parallel = commonlib.gettable("System.Concurrent.Parallel");
	local p = Parallel:new():init()
	p:RunManyTimes(function(count)
		db.insertNoIndex:insertOne(nil, {count=count, data=math.random()}, function(err, data)
			if(err) then
				-- echo({err, data});
			end
			p:Next();
		end)
	end, total_times, max_jobs):OnFinished(function(total)
		npl_profiler.perf_end("tableDB_BlockingAPILatency", true)
		log(commonlib.serialize(npl_profiler.perf_get(), true));			
	end);
end

@LiXizhi
Contributor Author

LiXizhi commented Jun 30, 2017

OK. The round trip is also taking too much time; figure this out.

@liuluheng
Collaborator

liuluheng commented Jun 30, 2017

After fixing the memory leak in RPC and raising the concurrent jobs to 100, InsertThroughputNoIndex with 10000 records now takes about 6 minutes.

@liuluheng
Collaborator

liuluheng commented Jul 2, 2017

Roundtrip

The roundtrip is about 21.96 ms (10000 records / 219.625 s). The main cost is the redundant log messages and the log consistency check caused by replication. If we remove the term check in Raft, the roundtrip can drop to 7.61 ms (10000 records / 76.124 s).
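
For clarity, the quoted averages are just the total elapsed time divided by the record count; a quick check using the totals from the numbers above:

local function avgRoundtripMs(totalSeconds, records)
	return totalSeconds / records * 1000;
end
print(avgRoundtripMs(219.625, 10000));  -- ~21.96 ms with the term check
print(avgRoundtripMs(76.124, 10000));   -- ~7.61 ms without the term check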

Reason

As described above, the redundant log messages and the log consistency check carry a large performance penalty. They can also make the nodes in the cluster lag behind in consistency, which makes the performance even worse over time.

A little about the Raft implementation

Log messages are passed from the leader to the followers only. This happens at each client request and at each heartbeat interval. On each pass the leader tries to send as many log entries as possible, starting from the lastLogIndex memorized for each peer.
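
For illustration, a minimal sketch of that replication step is below; all names (peers, log, lastLogIndex, sendAppendEntries, MAX_ENTRIES_PER_MSG) are assumptions for explanation, not the identifiers used in this repository.

local MAX_ENTRIES_PER_MSG = 100;  -- assumed cap on entries per message

-- called on every client request and on every heartbeat interval
local function replicateToPeers(leader, sendAppendEntries)
	for _, peer in ipairs(leader.peers) do
		-- send as many entries as possible, starting right after the
		-- lastLogIndex memorized for this peer
		local startIndex = peer.lastLogIndex + 1;
		local entries = {};
		local lastIndex = math.min(#leader.log, startIndex + MAX_ENTRIES_PER_MSG - 1);
		for i = startIndex, lastIndex do
			entries[#entries + 1] = leader.log[i];
		end
		-- prevLogIndex/prevLogTerm are what the follower uses for the log consistency check
		local prev = leader.log[startIndex - 1];
		sendAppendEntries(peer, {
			term = leader.currentTerm,
			prevLogIndex = startIndex - 1,
			prevLogTerm = prev and prev.term or 0,
			entries = entries,
			leaderCommit = leader.commitIndex,
		});
	end
end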

Improvement

The log consistency check is essential and necessary, but the redundant log messages can be alleviated if we use a WAL as the Raft log. A WAL works like a batch of log entries on top of the current implementation.
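
A minimal sketch of that batching idea, with assumed names (walBatch, WAL_BATCH_SIZE, flushWAL); it only shows the shape of the change, not the actual WAL code:

local WAL_BATCH_SIZE = 64;  -- assumed batch threshold
local walBatch = {};        -- pending log entries

local function appendToWAL(entry, flushWAL)
	walBatch[#walBatch + 1] = entry;
	if(#walBatch >= WAL_BATCH_SIZE) then
		-- one disk write and one replication message for the whole batch,
		-- instead of one message per entry
		flushWAL(walBatch);
		walBatch = {};
	end
end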

Throughput

Besides the reasons above, the LogBuffer size affects the throughput a lot: with a buffer size of 1000, the run for 10000 records with 10 concurrent jobs takes about 6 min, but with a size of 10000 the time drops to 1-2 min. This is also caused by the single-threaded implementation: all logic, including reading the log file, is handled in the main thread, and the file reads can be very slow, which stalls the network message processing.
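
An illustrative sketch of the buffer behaviour described here (not the repository's actual LogBuffer code): entries still in memory are served immediately, while anything already evicted must be re-read from the log file on the same main thread.

local LogBuffer = {};
LogBuffer.__index = LogBuffer;

function LogBuffer.new(size)
	return setmetatable({size = size, entries = {}, firstIndex = 1}, LogBuffer);
end

function LogBuffer:append(index, entry)
	self.entries[index] = entry;
	-- evict the oldest entry once the buffer is full
	if(index - self.firstIndex + 1 > self.size) then
		self.entries[self.firstIndex] = nil;
		self.firstIndex = self.firstIndex + 1;
	end
end

function LogBuffer:get(index, readFromLogFile)
	local entry = self.entries[index];
	if(entry) then
		return entry;               -- fast path: served from memory
	end
	return readFromLogFile(index);  -- slow path: blocks the single main thread
end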

Because the node state in the cluster lags behind, sending requests in parallel (RunManyTimes) is slower than sending them sequentially.
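
For comparison, a sequential baseline could look like the sketch below. It reuses the insertOne call from the test code above; RunSequentialInserts itself is a hypothetical helper, and the elapsed time divided by total_times then gives the average roundtrip.

local function RunSequentialInserts(db, total_times, onFinished)
	local count = 0;
	local function insertNext()
		count = count + 1;
		if(count > total_times) then
			return onFinished(total_times);
		end
		db.insertNoIndex:insertOne(nil, {count = count, data = math.random()}, function(err, data)
			insertNext();  -- issue the next request only after this reply arrives
		end)
	end
	insertNext();
end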

@LiXizhi

@LiXizhi
Contributor Author

LiXizhi commented Jul 3, 2017

OK.

  • Roundtrip is the average over sequential queries, i.e. the average length of each individual query. So it is usually not affected by how many queries are run in parallel.
  • You need to move the IO to another thread when testing, as IO can be the really slow part (see the sketch below the list).
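
A minimal sketch of that idea, assuming NPL's CreateRuntimeState/activate API; the file names (LogFileReader.lua), the callback target, and the ReadEntriesFromFile helper are hypothetical:

-- main thread: create a worker runtime state and hand the slow read off to it
local worker = NPL.CreateRuntimeState("io_worker", 0);
worker:Start();
NPL.activate("(io_worker)script/test/LogFileReader.lua", {
	logfile = "temp/test_raft_database/raft.log",  -- assumed log file path
	fromIndex = 1000,
	callback_file = "(main)script/test/TestInsertThroughput.lua",  -- assumed caller
});

-- script/test/LogFileReader.lua (runs in the io_worker thread)
NPL.this(function()
	local msg = msg;
	-- read the requested entries off the main thread, then post the result back
	local entries = ReadEntriesFromFile(msg.logfile, msg.fromIndex);  -- assumed helper
	NPL.activate(msg.callback_file, {entries = entries});
end);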

@liuluheng

@LiXizhi LiXizhi added this to the Alpha1 milestone Jul 3, 2017