
add PID rate controller #30

Merged: 1 commit merged into hyperledger-caliper:master from the basic-pid branch on Apr 25, 2018
Conversation

nklincoln
Contributor

I have been using this controller to avoid timeouts during transactions; I suspect it will be useful for others too!

The controller acts to maintain a fixed backlog of unfinished transactions, using a basic PID control loop to adjust the driven transaction rate until the backlog settles at the target. A user must prescribe the control gains, as well as a starting TPS to seed the controller. There is an option to print the controller variables to assist with tuning the controller for a specific system.
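
For readers unfamiliar with the approach, the sketch below shows the general idea in plain JavaScript; it is a minimal illustration under stated assumptions, not the exact code in this PR. The backlog error (unfinished transactions minus targetLoad) is turned into an adjustment of the sleep time between submissions; names such as createPidSleepController, update, unfinished and dtSeconds are illustrative only.

    // Minimal PID sketch (illustrative, not the PR's exact implementation).
    // Gains (kp, ki, kd) and targetLoad mirror the options described above.
    function createPidSleepController(targetLoad, initialTPS, kp, ki, kd) {
        let sleepTime = 1000 / initialTPS;   // ms between submissions, seeded from initialTPS
        let integral = 0;
        let previousError = 0;

        return function update(unfinished, dtSeconds) {
            const error = unfinished - targetLoad;            // positive => backlog too large
            integral += error * dtSeconds;
            const derivative = (error - previousError) / dtSeconds;
            previousError = error;

            // A positive error means we are ahead of the SUT, so sleep longer (drive TPS down).
            sleepTime += kp * error + ki * integral + kd * derivative;
            sleepTime = Math.max(0, sleepTime);
            return sleepTime;
        };
    }

    // Example: target a backlog of 5 unfinished transactions, seeded at 100 TPS.
    const update = createPidSleepController(5, 100, 0.2, 0.0001, 0.1);
    console.log(update(12, 1));   // 12 unfinished txs observed over a 1 s interval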

Contribution includes:

  • New rate controller with header description
  • Update of the fixed-rate controller to add a header description
  • Docs modification to indicate availability of the controller

Signed-off-by: Nick Lincoln [email protected]

@haojun
Contributor

haojun commented Apr 19, 2018

Hi @nklincoln, I still don't understand how to set targetLoad or the P/I/D gains, and I'm sorry I don't have time to investigate the code.
I tried to run the 'simple' test with a rate control of {"type": "pid-rate", "opts": {"targetLoad": 5, "initialTPS": 100, "proportional": 0.2, "integral": 0.0001, "derrivative": 0.1, "showVars": true}} for 5000 txs, and got a result of 42 TPS. Why is the actual TPS that low?

@nklincoln
Contributor Author

Hi @haojun,
That is what I would expect, and I get the same results running locally on my machine. The aim of the control mechanism is to adjust the TPS so as to maintain a specified (and steady) number of unfinished transactions per client.

When I run the simple test with a fixed rate, the transaction backlog grows to a peak of approximately 450. While this means that the maximum possible throughput will be reached, if I were to run the test for 500000 txn there would be a timeout at some point.

Bringing down the maximum number of backlogged transactions prevents the timeouts, but it does reduce the overall TPS, since there is a deliberate limit on how quickly new transactions are issued.
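
As an illustration of how this observation can guide the option choice: if a fixed-rate run shows the backlog peaking around 450 before timeouts become a risk, targetLoad can be set well below that peak. The values below are purely illustrative, reusing the option keys shown earlier in this thread; they are not recommended defaults.

    // Illustrative pid-rate configuration (values are examples, not recommendations).
    const rateControl = {
        type: "pid-rate",
        opts: {
            targetLoad: 50,        // hold ~50 unfinished txs per client, well below the ~450 peak
            initialTPS: 100,       // seed rate before the controller takes over
            proportional: 0.2,
            integral: 0.0001,
            derrivative: 0.1,      // spelled as in the option key used earlier in this thread
            showVars: true         // print P/I/D values and sleep times to help with tuning
        }
    };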

@haojun
Contributor

haojun commented Apr 20, 2018

OK, I see. But what is the purpose of restricting the size of the backlog?
There are several factors that may affect the backlog:

  1. The SUT's processing capability. If the TPS exceeds that capability, the backlog grows continuously. In this case the TPS should be lowered.
  2. Latency (network latency and processing latency). For example, if transactions are submitted at 200 TPS while the end-to-end latency of each transaction is 1 second, the backlog will grow to about 200 even if the server's actual processing capability is above 200 TPS (and the backlog should then fluctuate around 200). In this case a high backlog (200) looks normal; see the short calculation below.
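
The steady-state relationship in item 2 is just Little's Law (backlog ≈ arrival rate × latency); a quick check of those numbers in plain JavaScript:

    // Little's Law sanity check: expected backlog = submission rate * end-to-end latency.
    const submitTps = 200;        // transactions submitted per second
    const e2eLatencySec = 1;      // end-to-end latency per transaction
    const expectedBacklog = submitTps * e2eLatencySec;
    console.log(expectedBacklog); // 200 unfinished transactions at steady state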

@aklenik
Contributor

aklenik commented Apr 22, 2018

Maybe maintaining a given e2e latency would be more appropriate here, if it's possible.

@nklincoln
Contributor Author

Sorry for the delayed response.

I'm not sure I understand your point in item 2) above, but item 1) is exactly why this rate controller should be used, especially for long-running tests. For instance, when a timeout is hit, how does a user know what TPS to lower the rate to, other than 'less than before'?

As you mention, there are many factors impacting SUT performance, and together they produce a highly non-linear system. This can be overcome with a systematic approach:

  • By running the tests with varying target load, the output is a curve of maximum TPS against load, which should be highly repeatable.
  • By looking at the resulting curve, the highest achievable TPS can be determined (note that the highest throughput is not necessarily at the highest sustainable load); a sketch of this selection step follows the list.
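
A minimal sketch of that selection step, assuming the sweep results have already been collected into (targetLoad, measured TPS) pairs; the data points below are invented purely for illustration:

    // Pick the target load that gave the best measured throughput from a sweep.
    const sweep = [
        { targetLoad: 5,   tps: 42 },
        { targetLoad: 20,  tps: 95 },
        { targetLoad: 50,  tps: 130 },
        { targetLoad: 100, tps: 128 },
        { targetLoad: 200, tps: 110 }   // sustainable, but past the throughput peak
    ];
    const best = sweep.reduce((a, b) => (b.tps > a.tps ? b : a));
    console.log(`Best throughput ${best.tps} TPS at targetLoad ${best.targetLoad}`);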

There is an additional benefit: since the controller makes it possible to run performance tests for an extended time (which a pure TPS limit does not guarantee), it would also support a possible future extension of Caliper into a combined performance and reliability tool.

haojun merged commit 442fea9 into hyperledger-caliper:master on Apr 25, 2018
@JulienGuo

Hi @nklincoln,
I used pid-rate.
According to the console logs, the controller first prints the PID values and the computed sleep times, like this:
Current P value: -23.6
Current I value: -0.8427249989642626
Current I value: -2.873340714337511
Current D value: -0.00005253275121500009
New sleep time: 1928.0171704039135
Current D value: -0.000028708512638230602
New sleep time: 3520.160757388461
Current load error: -169
Current P value: -33.800000000000004
Current I value: -2.814124828742096
Current D value: -0.000029013484368045606
New sleep time: 3483.287387965611

and then it reports the transaction progress, like this:
[Transaction Info] - Submitted: 20000 Succ: 10457 Fail:0 Unfinished:9543
[Transaction Info] - Submitted: 20000 Succ: 10489 Fail:0 Unfinished:9511
[Transaction Info] - Submitted: 20000 Succ: 10510 Fail:0 Unfinished:9490
[Transaction Info] - Submitted: 20000 Succ: 10521 Fail:0 Unfinished:9479
[Transaction Info] - Submitted: 20000 Succ: 10550 Fail:0 Unfinished:9450
[Transaction Info] - Submitted: 20000 Succ: 10597 Fail:0 Unfinished:9403
[Transaction Info] - Submitted: 20000 Succ: 10685 Fail:0 Unfinished:9315
[Transaction Info] - Submitted: 20000 Succ: 11200 Fail:0 Unfinished:8800
[Transaction Info] - Submitted: 20000 Succ: 12066 Fail:0 Unfinished:7934
[Transaction Info] - Submitted: 20000 Succ: 13108 Fail:0 Unfinished:6892
[Transaction Info] - Submitted: 20000 Succ: 13976 Fail:0 Unfinished:6024
[Transaction Info] - Submitted: 20000 Succ: 14487 Fail:0 Unfinished:5513

I am confused: isn't the pid-rate controller supposed to change the sleep time according to the transaction results?
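
One way to read the showVars output above: the P term appears to be the proportional gain multiplied by the load error, which can be checked directly. The gain of 0.2 used below is inferred from the logged numbers, not taken from JulienGuo's configuration, which is not shown.

    // Sanity-check the logged P term against P = proportional gain * load error.
    const proportionalGain = 0.2;                 // inferred; may differ per configuration
    const loadError = -169;                       // "Current load error: -169"
    console.log(proportionalGain * loadError);    // -33.800000000000004, matching the logged P value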

@nklincoln
Contributor Author

Hi, yes, it should. There have been a few upstream changes that broke the Composer plugin; these should be resolved by the latest PR that has been merged. I would recommend trying again, making sure that #122 is included in your code base.

@JulienGuo

Hi @nklincoln, #122 may resolve the problem in Composer, but my problem occurs with Fabric.

@JulienGuo

@nklincoln I am using the latest code and the problem still happens. I am using a Fabric network.

nklincoln deleted the basic-pid branch on July 17, 2018
faustovanin pushed a commit to faustovanin/caliper that referenced this pull request on Oct 27, 2023: fix goLang metadata paths in network files