Introduction
Pewpew is an HTTP load test tool designed for ease of use and high performance. Pewpew requires a minimal amount of resources to generate the maximum amount of load.
To get started:
- Learn about Pewpew's design and unique concepts.
- Download Pewpew.
- Create a config file.
- Execute a test.
- View the results.
Pewpew design overview
The primary objective of an HTTP load test is to generate traffic against web services. A user of pewpew describes what HTTP endpoints to target, any data flows and the load patterns for the test within a YAML config file.
Data flow
Some endpoints only require a simple request, relying on no other source of data. In most cases, however, endpoints require particular data as part of the request--think ids or tokens. This is where providers come in. Providers act as a FIFO (first in, first out) queue for sending and receiving data. A provider can be fed with data from a file, a static list, a range of numbers and/or from HTTP responses.
The data within a provider is used to build HTTP requests. Here's a diagram of how data flows:
As an example, here's a visualization of a test which hits a single endpoint on a fictitious pepper service. (Note this diagram does not reflect the structure of the YAML config file, but merely demonstrates the logical flow of data out from and back into a provider):
On the left there is a single provider named "name" which has some predefined values. In the "endpoint definition" the top box shows a template used to build a request. Because the template references the "name" provider, a single value is popped off the "name" provider's queue for every request sent. After the response is received, the "after response action" puts the previously fetched value back onto the queue. In this example the very first request would look like:
GET /pepper/cayenne
host: 127.0.0.1
user-agent: pewpew
After the response is received, the value "cayenne" would be pushed back into the "name" provider.
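The pop-and-return cycle described above can be modelled with an ordinary queue. The following Python sketch only illustrates the data flow and is not Pewpew's implementation; the values other than "cayenne" and the helper name send_request are hypothetical.

```python
from collections import deque

# Hypothetical predefined values; only "cayenne" appears in the example above.
name_provider = deque(["cayenne", "habanero", "jalapeno"])


def send_request(provider):
    """Pop a value, build the request line, then return the value to the queue."""
    value = provider.popleft()  # taken from the front of the queue (FIFO)
    request_line = f"GET /pepper/{value}"
    # ... the request would be sent and the response awaited here ...
    provider.append(value)  # "after response action": push the value back
    return request_line


first = send_request(name_provider)
print(first)                # GET /pepper/cayenne
print(list(name_provider))  # ['habanero', 'jalapeno', 'cayenne']
```

Because the value is pushed to the end of the queue, the next request uses the next predefined value rather than "cayenne" again.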
Peak loads and load patterns
In most load tests it is desirable to not only generate considerable traffic but to do so in a defined pattern. Pewpew accomplishes this through the use of peak loads and load patterns. A peak load is defined for an endpoint and a load pattern can be defined test wide or on individual endpoints.
peak load - how much load an endpoint should generate at "100%". Peak loads are defined in terms of "hits per minute" (HPM) or "hits per second" (HPS).
load pattern - the pattern that generated traffic will follow over time. Load patterns are termed in percentages over some duration.
For example, suppose we were creating a load test against the pepper service. After some research we determine that the peak load should be 50HPM (50 hits per minute) and that the load pattern should increase from 0% (no traffic) to 50% (25HPM) over 20 minutes, remain steady for 40 minutes, then go up to 100% (50HPM) over 10 minutes and stay there for 50 minutes.
Here's what the load pattern would look like charted out over time:
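As a rough numerical check of the pattern described above, the sketch below computes the expected traffic at any minute of the test. It assumes simple linear interpolation within each segment; the names SEGMENTS and load_at are illustrative and not part of Pewpew.

```python
PEAK_LOAD_HPM = 50

# (target percentage, duration in minutes) for each segment of the example.
SEGMENTS = [(50, 20), (50, 40), (100, 10), (100, 50)]


def load_at(minute):
    """Return the hits per minute expected `minute` minutes into the test."""
    start_pct = 0.0  # the first segment starts from 0%
    elapsed = 0.0
    for to_pct, duration in SEGMENTS:
        if minute <= elapsed + duration:
            fraction = (minute - elapsed) / duration
            pct = start_pct + (to_pct - start_pct) * fraction
            return PEAK_LOAD_HPM * pct / 100
        start_pct = to_pct  # the next segment starts where this one ended
        elapsed += duration
    return 0.0  # the test has finished


for minute in (10, 30, 65, 100):
    print(minute, load_at(minute))
```

Halfway through the first ramp (minute 10) the endpoint is at 25% of peak, or 12.5 HPM; during the first plateau it stays at 25 HPM.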
Config file
A config file is a YAML file which defines everything needed for Pewpew to execute a load test. This includes which HTTP endpoints are part of the test, how load should fluctuate over the duration of a test, how data "flows" in a test and more.
Key concepts
Before creating a config file there are a few key concepts which are helpful to understand.
- Everything in an HTTP load test is centered around endpoints, rather than "transactions".
- Whenever some piece of data is needed to build an HTTP request, that data flows through a provider. Similarly, when an HTTP response provides data needed for another request that data goes through a provider.
- The amount of load generated is determined on a per-endpoint-basis termed in "hits per minute" or "hits per second", rather than number of "users".
- Because a config file is used rather than an API with a scripting language, Pewpew includes a minimal, built-in "language" which allows the execution of very simple expressions.
Framing a load test with these concepts enables Pewpew to accomplish one of its goals of allowing a tester to create and maintain load tests with ease.
Sections of a config file
A config file has six main sections, though not all are required:
- config - Allows customization of various test options.
- load_pattern - Specifies how load fluctuates during a test.
- vars - Declares static variables which can be used in expressions.
- providers - Declares providers which are used to manage the flow of data needed for a test.
- loggers - Declares loggers which, as their name suggests, provide a means of logging data.
- endpoints - Specifies the HTTP endpoints which are part of a test and various parameters to build each request.
Example
Here's a simple example config file:
load_pattern:
  - linear:
      to: 100%
      over: 5m
  - linear:
      to: 100%
      over: 2m
endpoints:
  - method: GET
    url: http://localhost/foo
    peak_load: 42hpm
    headers:
      Accept: text/plain
  - method: GET
    url: http://localhost/bar
    headers:
      Accept-Language: en-us
      Accept: application/json
    peak_load: 15hps
Har to Yaml Converter
If you are attempting to load test a specific web page or the resources on a web page, you can use the Har to Yaml Converter. First you need to create a Har File from the page load, then use the Converter to generate a Yaml Config file.
config section
config:
  client:
    [request_timeout: duration]
    [headers: headers]
    [keepalive: duration]
  general:
    [auto_buffer_start_size: unsigned integer]
    [bucket_size: duration]
    [log_provider_stats: boolean]
    [watch_transition_time: duration]
The config section provides a means of customizing different parameters for the test. Parameters are divided into two subsections: client, which pertains to customizations for the HTTP client, and general, which holds other miscellaneous settings for the test.
client
- request_timeout - Optional. A duration signifying how long a request will wait for a response before it times out. Defaults to 60 seconds.
- headers - Optional. Headers which will be sent in every request. A header specified in an endpoint will override a header specified here with the same key.
- keepalive - Optional. The keepalive duration that will be used on TCP socket connections. This is different from the Keep-Alive HTTP header. Defaults to 90 seconds.
general
- auto_buffer_start_size - Optional. The starting size for provider buffers which are auto sized. Defaults to 5.
- bucket_size - Optional. A duration specifying how big each bucket should be for endpoints' aggregated stats. This also affects how often summary stats will be printed to the console. Defaults to 60 seconds.
- log_provider_stats - Optional. A boolean that enables or disables logging provider stats to the console. Stats include the number of items in the provider, the limit of the provider, how many tasks are waiting to send into the provider and how many endpoints are waiting to receive from the provider. Data is logged at the bucket_size interval. Set to false to turn off logging of provider stats. Defaults to true.
- watch_transition_time - Optional. A duration specifying how long of a transition there should be when going from an old load_pattern to a new load_pattern. This option only has an effect when pewpew is running a load test with the --watch command-line flag enabled. If this is not specified there will be no transition when load_patterns change.
load_pattern section
load_pattern:
  - load_pattern_type
      [parameters]
* If a root level load_pattern is not specified then each endpoint must specify its own load_pattern.
The load_pattern section defines the "shape" that the generated traffic will take over the course of the test. Individual endpoints can choose to specify their own load_pattern (see the endpoints section).
load_pattern is an array of load_pattern_types specifying how generated traffic for a segment of the test will scale up, down or remain steady. Currently the only load_pattern_type is linear.
Example:
load_pattern:
  - linear:
      to: 100%
      over: 5m
  - linear:
      to: 100%
      over: 2m
linear
The linear load_pattern_type allows generated traffic to increase or decrease linearly. There are three parameters which can be specified for each linear segment:
- from - Optional. A template indicating the starting point to scale from, specified as a percentage. Defaults to 0% if the current segment is the first entry in load_pattern, or the to value of the previous segment otherwise. Only variables defined in the vars section can be interpolated.
  A valid percentage is any unsigned number, integer or decimal, immediately followed by the percent symbol (%). Percentages can exceed 100% but cannot be negative. For example 15.25% or 150%.
- to - A template indicating the end point to scale to, specified as a percentage. Only variables defined in the vars section can be interpolated.
- over - The duration for how long the current segment should last.
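The percentage rules and the default for from can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Pewpew's parser: the regular expression is one reading of "unsigned number, integer or decimal" (it rejects forms such as ".5%"), and the names parse_percent and resolve_segments are hypothetical.

```python
import re

# Assumed grammar: digits, an optional decimal part, then "%".
PERCENT_RE = re.compile(r"(\d+(?:\.\d+)?)%")


def parse_percent(text):
    """Parse a percentage such as "15.25%" into a float, rejecting negatives."""
    match = PERCENT_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid percentage: {text!r}")
    return float(match.group(1))


def resolve_segments(segments):
    """Fill in each segment's starting percentage using the documented defaults."""
    resolved = []
    previous_to = 0.0  # the first segment defaults to 0%
    for segment in segments:
        if "from" in segment:
            start = parse_percent(segment["from"])
        else:
            start = previous_to  # default: the previous segment's `to`
        end = parse_percent(segment["to"])
        resolved.append((start, end, segment["over"]))
        previous_to = end
    return resolved


print(resolve_segments([
    {"to": "100%", "over": "5m"},
    {"to": "100%", "over": "2m"},
]))
```

For the example load_pattern in this section, the first segment ramps from 0% to 100% and the second holds at 100%, because its from defaults to the previous to.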
vars section
vars:
  variable_name: definition
Variables are used where a single pre-defined value is needed in a test. The variable definition can be any valid YAML type where any strings will be interpreted as a template. In variable templates only environment variables can be interpolated in expressions.
Examples:
vars:
  foo: bar
creates a single variable named foo where the value is the string "bar".
More complex values are automatically interpreted as JSON so the following:
vars:
  bar:
    a: 1
    b: 2
    c: 3
creates a variable named bar where the value is equivalent to the JSON {"a": 1, "b": 2, "c": 3}.
As noted above, environment variables can be interpolated in the string templates. So, the following:
vars:
  password: ${PASSWORD}
would create a variable named password where the value comes from the environment variable PASSWORD.
providers section
providers:
  provider_name:
    provider_type:
      [parameters]
Providers are the means of providing data to an endpoint, including using data from the response of one endpoint in the request of another. The way providers handle data can be thought of as a FIFO queue--when an endpoint uses data from a provider it "pops" a value from the beginning of the queue and when an endpoint provides data to a provider it is "pushed" to the end of the queue. Every provider has an internal buffer which has a soft limit on how many items can be stored.
A provider_name is any string except for "request", "response", "stats" and "for_each", which are reserved.
Example:
providers:
  session:
    response:
      auto_return: force
  username:
    file:
      path: "usernames.csv"
      repeat: true
There are four provider_types: file, response, list and range.
file
The file provider_type reads data from a file. By default every line in the file is read as a value; other formats can be selected with the format parameter. A file provider has the following parameters:
- path - A template value indicating the path to the file on the file system. Unlike templates used elsewhere, only variables defined in the vars section can be interpolated. When a relative path is specified it is interpreted as relative to the config file. Absolute paths are supported though discouraged as they prevent the config file from being platform agnostic.
- repeat - Optional. A boolean value which when true indicates that when the file provider gets to the end of the file it should start back at the beginning. Defaults to false.
- unique - Optional. A boolean value which when true makes the provider a "unique" provider--meaning each item within the provider will be a unique JSON value without duplicates. Defaults to false.
- auto_return - Optional. This parameter specifies that when this provider is used by a request, after a response is received the value is automatically returned to the provider. Valid options for this parameter are block, force, and if_not_full. See the send parameter under the endpoints.provides subsection for details on the effect of these options.
- buffer - Optional. Specifies the soft limit for a provider's buffer. This can be indicated with an integer greater than zero or the value auto. The value auto indicates that the soft limit can increase as needed. This happens after a provider is full then later becomes empty. Defaults to auto.
- format - Optional. Specifies the format for the file. The format can be one of line (the default), json, or csv.
  The line format will read the file one line at a time with each line ending in a newline (\n) or a carriage return and a newline (\r\n). Pewpew attempts to parse every line as JSON; a line which is not valid JSON is read as a string. Note that a JSON object which spans multiple lines in the file, for example, will not parse into a single object.
  The json format will read the file as a stream of JSON values. Every JSON value must be self-delineating (an object, array or string), or must be separated by whitespace or a self-delineating value. For example, the following:
    {"a":1}{"foo":"bar"}47[1,2,3]"some text"true 56
  would parse into separate JSON values of {"a": 1}, {"foo": "bar"}, 47, [1, 2, 3], "some text", true, and 56.
  The csv format will read the file as a CSV file. Pewpew attempts to parse every non-header column as JSON; a column which is not valid JSON is read as a string. The csv parameter allows customization over how the file should be parsed.
- csv - Optional. When parsing a file using the csv format, this parameter provides extra customization on how the file should be parsed. This parameter is in the format of an object with key/value pairs. If the format is not csv this property will be ignored. The following sub-parameters are available:
  - comment - Optional. Specifies a single-byte character which will mark a CSV record as a comment (ex. #). When not specified, no character is treated as a comment.
  - delimiter - Optional. Specifies a single-byte character used to separate columns in a record. Defaults to comma (,).
  - double_quote - Optional. A boolean that when enabled makes it so two quote characters can be used to escape quotes within a column. Defaults to true.
  - escape - Optional. Specifies a single-byte character which will be used to escape nested quote characters (ex. \). When not specified, escapes are disabled.
  - headers - Optional. Can be either a boolean value or a string. When a boolean, it indicates whether the first row in the file should be interpreted as column headers. When a string, the specified string is interpreted as a CSV record which is used for the column headers.
    When headers are specified, each record served from the file will use the headers as keys for each column. When no headers are specified (the default), then each record will be returned as an array of values.
    For example, with the following CSV file:
      id,name
      0,Fred
      1,Wilma
      2,Pebbles
    If headers was true then the following values would be provided (shown in JSON syntax): {"id": 0, "name": "Fred"}, {"id": 1, "name": "Wilma"}, and {"id": 2, "name": "Pebbles"}.
    If headers was false then the following values would be provided: [0, "Fred"], [1, "Wilma"], and [2, "Pebbles"].
    If headers was foo,bar then the following values would be provided: {"foo": "id", "bar": "name"}, {"foo": 0, "bar": "Fred"}, {"foo": 1, "bar": "Wilma"}, and {"foo": 2, "bar": "Pebbles"}.
  - terminator - Optional. Specifies a single-byte character used to terminate each record in the CSV file. Defaults to a special value where \r, \n, and \r\n are all accepted as terminators. When specified, Pewpew becomes self-aware, unfolding a series of events which will ultimately lead to the end of the human race.
  - quote - Optional. Specifies a single-byte character that will be used to quote CSV columns. Defaults to the double-quote character (").
- random - Optional. A boolean indicating that each record in the file should be returned in random order. Defaults to false.
  When enabled there is no sense of "fairness" in the randomization. Any record in the file could be used more than once before other records are used.
Example:
providers:
  username:
    file:
      path: "usernames.csv"
      repeat: true
      random: true
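The difference between the line and json formats described above can be illustrated with Python's standard json module. This is only a sketch of the documented parsing rules, not Pewpew's reader; the function names are hypothetical, and the handling of a trailing newline is an assumption.

```python
import json


def read_line_format(text):
    """One value per line: parsed as JSON when valid, otherwise kept as a string."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # assumption: a trailing newline does not produce a value
    values = []
    for line in lines:
        if line.endswith("\r"):
            line = line[:-1]  # accept \r\n line endings
        try:
            values.append(json.loads(line))
        except json.JSONDecodeError:
            values.append(line)
    return values


def read_json_format(text):
    """Read a stream of JSON values which may or may not be whitespace separated."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while True:
        while pos < len(text) and text[pos].isspace():
            pos += 1  # skip whitespace between values
        if pos >= len(text):
            return values
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)


print(read_line_format('{"a": 1}\nhello\r\n42\n'))
print(read_json_format('{"a":1}{"foo":"bar"}47[1,2,3]"some text"true 56'))
```

The second call parses the stream from the json format example into seven separate values, while the first shows a non-JSON line ("hello") falling back to a plain string.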
response
Unlike other provider_types, response does not automatically receive data from a source. Instead a response provider is available to be a "sink" for data originating from an HTTP response. The response provider has the following parameters.
- auto_return - Optional. This parameter specifies that when this provider is used and an individual endpoint call concludes, the value it got from this provider should be sent back to the provider. Valid options for this parameter are block, force, and if_not_full. See the send parameter under the endpoints.provides subsection for details on the effect of these options.
- buffer - Optional. Specifies the soft limit for a provider's buffer. This can be indicated with an integer greater than zero or the value auto. The value auto indicates that if the provider's buffer becomes empty it will automatically increase the buffer size to help prevent the provider from becoming empty again in the future. Defaults to auto.
- unique - Optional. A boolean value which when true makes the provider a "unique" provider--meaning each item within the provider will be a unique JSON value without duplicates. Defaults to false.
Example:
providers:
  session:
    response:
      buffer: 1000
      auto_return: if_not_full
list
The list provider_type creates a means of specifying an array of static values to be used as a provider.
A list provider can be specified in two forms, either implicitly or explicitly. The explicit form has the following parameters:
- random - Optional. A boolean indicating that entries in the values array should be provided in random order. When combined with repeat there is no sense of "fairness" in the randomization. Defaults to false.
- repeat - Optional. A boolean indicating that the array should repeat infinitely. Defaults to true.
- values - An array of JSON values.
- unique - Optional. A boolean value which when true makes the provider a "unique" provider--meaning each item within the provider will be a unique JSON value without duplicates. Defaults to false.
Example, the following:
providers:
  foo:
    list:
      - 123
      - 456
      - 789
is an example of an implicit list provider. It creates a list provider named foo where the first value provided will be 123, the second 456, the third 789, and then subsequent values start over at the beginning.
Example, the following:
providers:
  foo:
    list:
      values:
        - 123
        - 456
        - 789
      random: true
is an example of an explicit list provider. It creates a list provider named foo where the value provided will be randomized between the values listed.
range
The range provider_type provides an incrementing sequence of numbers in a given range. A range provider takes the following optional parameters.
- start - Optional. A whole number in the range of [-9223372036854775808, 9223372036854775807]. This indicates what the starting number should be for the range. Defaults to 0.
- end - Optional. A whole number in the range of [-9223372036854775808, 9223372036854775807]. This indicates what the end number should be for the range. This number is included in the range. Defaults to 9223372036854775807.
- step - Optional. A whole number in the range of [1, 65535]. This indicates how big each "step" in the range will be. Defaults to 1.
- repeat - Optional. A boolean which causes the range to repeat infinitely. Defaults to false.
- unique - Optional. A boolean value which when true makes the provider a "unique" provider--meaning each item within the provider will be a unique JSON value without duplicates. Defaults to false.
Examples:
providers:
  foo:
    range: {}
Will use the default settings, and foo will provide the values 0, 1, 2, etc. until it yields the end number (9223372036854775807).
providers:
  foo:
    range:
      start: -50
      end: 100
      step: 2
In this case foo will provide the values -50, -48, -46, etc. until it yields 100.
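The sequence a range provider yields can be approximated with Python's built-in range. The sketch below is an illustration only and not Pewpew's implementation; the function name range_provider is hypothetical, and it mirrors the documented defaults and the inclusive end.

```python
from itertools import cycle, islice

I64_MAX = 9223372036854775807  # documented default for `end`


def range_provider(start=0, end=I64_MAX, step=1, repeat=False):
    """Yield start, start + step, ... up to and including `end`."""
    values = range(start, end + 1, step)  # +1 because `end` is inclusive
    return cycle(values) if repeat else iter(values)


values = list(range_provider(start=-50, end=100, step=2))
print(values[:3], values[-1])  # first three values and the last one

# With repeat enabled the sequence starts over after reaching `end`.
print(list(islice(range_provider(start=1, end=3, repeat=True), 7)))
```

Note that itertools.cycle keeps a copy of every value it has seen, so the repeat branch of this sketch is only suitable for small ranges.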
loggers section
loggers:
  logger_name:
    [select: select]
    [for_each: for_each]
    [where: expression]
    to: template | stderr | stdout
    [pretty: boolean]
    [limit: integer]
    [kill: boolean]
Loggers provide a means of logging data to a file, stderr or stdout. Any string can be used for logger_name.
There are two types of loggers: plain loggers which have data logged to them by explicitly referencing them within an endpoints.logs subsection, and global loggers which are evaluated for every HTTP response.
In addition to the special variables "request", "response" and "stats", a logger also has access to a variable "error" which represents an error which happens during the test. It can be helpful to log such errors along with the request or response (if they are available) when diagnosing problems.
Loggers support the following parameters:
- select - Optional. When specified, the logger becomes a global logger. See the endpoints.provides subsection for details on how to define a select.
- for_each - Optional. Used in conjunction with select on global loggers. See the endpoints.provides subsection for details on how to define a for_each.
- where - Optional. Used in conjunction with select on global loggers. See the endpoints.provides subsection for details on how to define a where expression.
- to - A template specifying where this logger will send its data. Unlike templates which can be used elsewhere, only variables defined in the vars section can be interpolated. Values of "stderr" and "stdout" will log data to the respective process streams and any other string will log to a file with that name. When a file is specified, the file will be created if it does not exist or will be truncated if it already exists. When a relative path is specified it is interpreted as relative to the config file. Absolute paths are supported though discouraged as they prevent the config file from being platform agnostic.
- pretty - Optional. A boolean that indicates the value logged will have added whitespace for readability. Defaults to false.
- limit - Optional. An unsigned integer which indicates the logger will only log the first n values sent to it.
- kill - Optional. A boolean that indicates the test will end when the limit is reached, or, if there is no limit, on the first message logged.
Example:
loggers:
  httpErrors:
    select:
      request:
        - request["start-line"]
        - request.headers
        - request.body
      response:
        - response["start-line"]
        - response.headers
        - response.body
    where: response.status >= 400
    limit: 5
    to: http_err.log
    pretty: true
Creates a global logger named "httpErrors" which will log to the file "http_err.log" the request and response of the first five requests which have an HTTP status of 400 or greater.
endpoints section
endpoints:
  - [declare: declare_subsection]
    [headers: headers]
    [body: body]
    [load_pattern: load_pattern_subsection]
    [method: method]
    [peak_load: peak_load]
    [tags: tags]
    url: template
    [provides: provides_subsection]
    [on_demand: boolean]
    [logs: logs_subsection]
    [max_parallel_requests: unsigned integer]
    [no_auto_returns: boolean]
    [request_timeout: duration]
The endpoints section declares what HTTP endpoints will be called during a test.
- declare - Optional. See the declare subsection.
- headers - Optional. See headers.
- body - Optional. See the body subsection.
- load_pattern - Optional. See the load_pattern section.
- method - Optional. A string representation for a valid HTTP method verb. Defaults to GET.
- peak_load - Optional*. A template representing what the "peak load" for this endpoint should be. The term "peak load" represents how much traffic is generated for this endpoint when the load_pattern reaches 100%. A load_pattern can go higher than 100%, so a load_pattern of 200%, for example, would generate double the defined peak_load. Only variables defined in the vars section can be interpolated.
  * While peak_load is marked as optional, that is only true if the current endpoint has a provides_subsection, and in that case this endpoint is called only as frequently as needed to keep the buffers of the providers it feeds full.
  A valid peak_load is a number--integer or decimal--followed by an optional space and the string "hpm" (meaning "hits per minute") or "hps" (meaning "hits per second").
  Examples:
  50hpm - 50 hits per minute
  300 hps - 300 hits per second
- tags - Optional. Key/value string/template pairs.
  Tags are a series of key/value pairs used to distinguish each endpoint. Tags can be used to include certain endpoints in a try run, and also make it possible for a single endpoint to have its results statistics aggregated in multiple groups. Because tag values are templates, only tags which can be resolved statically at the beginning of a test can be used with the include flag of a try run. A reference to a provider can cause a single endpoint to have multiple groups of tags. Each one of these groups will have its own statistics in the results. For example, if an endpoint had the following tags:
    tags:
      name: Subscribe
      status: ${response.status}
  A new group of aggregated stats will be created for every status code returned by the endpoint.
  All endpoints have the following implicitly defined tags:
  - method - The HTTP method for the endpoint.
  - url - The endpoint's url with any dynamic pieces being replaced with an asterisk.
  - _id - The index of this endpoint in the list of endpoints, starting with 0.
  Of the implicitly defined tags only url can be overwritten, which is helpful in cases such as when an entire url is dynamically generated and it would otherwise show up as *.
- url - A template specifying the fully qualified url to the endpoint which will be requested.
- provides - Optional. See the provides subsection.
- on_demand - Optional. A boolean which indicates that this endpoint should only be called when another endpoint first needs data that this endpoint provides. If the endpoint has no provides it has no effect.
- logs - Optional. See the logs subsection.
- max_parallel_requests - Optional. Limits how many requests can be "open" at any point for the endpoint. WARNING: this can cause coordinated omission, invalidating the test statistics.
- no_auto_returns - Optional. A boolean which indicates that any auto_return providers referenced within this endpoint will have auto_return disabled--meaning values pulled from those providers will not be automatically pushed back to the provider after a response is received. Defaults to false.
- request_timeout - Optional. A duration signifying how long a request will wait for a response before it times out. When not specified, the value from the client config will be used.
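The peak_load format and its relationship to the load_pattern percentage, as described for the peak_load parameter above, can be worked through with a short Python sketch. It is illustrative only: the regular expression is one reading of the documented format (lowercase "hpm" or "hps" is assumed), and the function name hits_per_second is hypothetical.

```python
import re

# Assumed grammar: a number, an optional single space, then "hpm" or "hps".
PEAK_LOAD_RE = re.compile(r"(\d+(?:\.\d+)?) ?(hpm|hps)")


def hits_per_second(peak_load, load_pattern_percent=100.0):
    """Convert a peak_load string to hits per second at a given load percentage."""
    match = PEAK_LOAD_RE.fullmatch(peak_load)
    if match is None:
        raise ValueError(f"invalid peak_load: {peak_load!r}")
    amount = float(match.group(1))
    per_second = amount / 60 if match.group(2) == "hpm" else amount
    return per_second * load_pattern_percent / 100


print(hits_per_second("300 hps"))     # at 100% of the load pattern
print(hits_per_second("60hpm", 200))  # a 200% load pattern doubles the peak load
```

A peak_load of 60hpm is one hit per second at 100%, so a load_pattern of 200% yields two hits per second.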
Using providers to build a request
Providers can be referenced anywhere templates can be used and also in the declare subsection.
body subsection
body: template

body:
  file: template

body:
  multipart:
    field_name:
      [headers: headers]
      body: template
    field_name:
      [headers: headers]
      body:
        file: template
A request body can be in one of three formats: a template to send a string as the body, a file which will send the contents of a file as the body, or a multipart body.
To send the contents of a file the body parameter should be an object with a single key of file and the value being a template. Relative paths resolve relative to the config file used to execute pewpew.
To send a multipart body, the body parameter should be an object with a single key of multipart and the value being an object of key/value pairs, where each key/value pair represents a piece of the multipart body. The keys represent the field_names used in an HTML form and the values are objects with the following properties:
- headers - Optional. Headers that will be included with this piece of the multipart body. For example, it is not uncommon to include a content-type header with a piece of a multipart body which includes a file.
- body - Either a template which will send a string value or an object with a single key of file and the value being a template--which will send the contents of a file.
When a multipart body is used for an endpoint each request will have the content-type header added with the value multipart/form-data and the necessary boundary. If there is already a content-type header set for the request it will be overwritten unless it starts with multipart/--then the necessary boundary will be appended. If a multipart/... content-type is manually set with the request, make sure to not include a boundary parameter.
For any request which has a content-type of multipart/form-data, a Content-Disposition header will be added to each piece in the multipart body with a value of form-data; name="field_name" (where field_name is substituted with the piece's field_name). If a Content-Disposition header is explicitly specified for a piece it will not be overwritten.
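The header rules above can be summarized as two small decisions. The Python sketch below models them under stated assumptions; it is not Pewpew's code, the function names are hypothetical, and the boundary value used in the usage lines is a made-up placeholder.

```python
def multipart_content_type(existing, boundary):
    """Return the content-type header value for a request with a multipart body."""
    if existing is not None and existing.startswith("multipart/"):
        # A manually set multipart/... type is kept and the boundary appended.
        return f"{existing}; boundary={boundary}"
    # Any other (or missing) content-type is replaced.
    return f"multipart/form-data; boundary={boundary}"


def content_disposition(field_name, explicit=None):
    """Return the Content-Disposition value for one piece of the body."""
    if explicit is not None:
        return explicit  # an explicitly specified header is not overwritten
    return f'form-data; name="{field_name}"'


print(multipart_content_type("application/json", "abc"))
print(multipart_content_type("multipart/mixed", "abc"))
print(content_disposition("foo"))
```

This also shows why a manually set multipart/... content-type should not carry its own boundary parameter: a second one would be appended.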
File example:
body:
  file: a_file.txt
Multipart example:
body:
  multipart:
    foo:
      headers:
        Content-Type: image/jpeg
      body:
        file: foo.jpg
    bar:
      body: some text
declare subsection
declare:
  name: expression
A declare_subsection provides the ability to select multiple values from a single provider. Without using a declare_subsection, multiple references to a provider will only select a single value. For example, in:
endpoints:
  - method: PUT
    url: https://localhost/ship/${shipId}/speed
    body: '{"shipId":"${shipId}","kesselRunTime":75}'
both references to the provider shipId will resolve to the same value, which in many cases is desired.
The declare_subsection is in the format of key/value pairs where the value is an expression. Every key can function as a provider and can be interpolated just as a provider would be.
Example 1
endpoints:
  - declare:
      shipIds: collect(shipId, 3, 5)
    method: DELETE
    url: https://localhost/ships
    body: '{"shipIds":${shipIds}}'
Calls the endpoint DELETE /ships where the body is interpolated with an array of ship ids. shipIds will have a length between three and five.
Example 2
endpoints:
  - declare:
      destroyedShipId: shipId
    method: PUT
    url: https://localhost/ship/${shipId}/destroys/${destroyedShipId}
Calls PUT on an endpoint where shipId and destroyedShipId are interpolated to different values.
provides subsection
provides:
  provider_name:
    select: select
    [for_each: for_each]
    [where: expression]
    [send: block | force | if_not_full]
The provides_subsection is how data can be sent to a provider from an HTTP response. provider_name is a reference to a provider which must be declared in the root providers section. For every HTTP response that is received, zero or more values can be sent to the provider based upon the conditions specified.
Sending data to a provider is done with a SQL-like syntax. The select, for_each and where sections use expressions to reference providers in addition to the special variables "request", "response" and "stats". "request" provides a means of accessing data that was sent with the request, "response" provides a means of accessing data returned with the response and "stats" gives access to measurements about the request (currently only rtt, meaning round-trip time).
The request object has the properties start-line, method, url, headers, headers_all and body which provide access to the respective sections in the HTTP request. Similarly, the response object has the properties start-line, headers, headers_all and body in addition to status, which indicates the HTTP response status code. See this MDN article on HTTP messages for more details on the structure of HTTP requests and responses.
start-line is a string and headers is represented as a JSON object with key/value string pairs. In the event where a request or response has multiple headers with the same name, the headers_all property can be used, which is a JSON object where the header name is the key and the value is an array of header values. Currently, body in the request is always a string and body in the response is parsed as a JSON value when possible, otherwise it is a string. status is a number. method is a string and url is an object with the same properties as the web URL object (see this MDN article).
- select - Determines the shape of the data sent to the provider. select is interpreted as a JSON object where any string value is evaluated as an expression.
- for_each Optional - Evaluates select for each element in an array or arrays. This is specified as an array of expressions. Expressions can evaluate to any JSON data type, but those which evaluate to an array will have each of their elements iterated over and select is evaluated for each. When multiple expressions evaluate to an array then the cartesian product of the arrays is produced.
  The select and where parameters can access the elements provided by for_each through the value for_each just like accessing a value from a provider. Because a for_each can iterate over multiple arrays, each element can be accessed by indexing into the array. For example for_each[1] would access the element from the second array (indexes are zero-based, so 0 represents the element in the first array).
- where Optional - Allows conditionally sending data to a provider based on a predicate. This is an expression which evaluates to a boolean value, indicating whether select should be evaluated for the current data set.
- send Optional - Specify the behavior that should be used when sending data to a provider. Valid options for this parameter are block, force, and if_not_full. Defaults to if_not_full if the endpoint has a peak_load, otherwise block.
  block indicates that if the provider's buffer is full, further endpoint calls will be blocked until there's room in the provider's buffer for the value. If an endpoint has multiple provides which are block, then the blocking will only wait for at least one of the providers' buffers to have room.
  force indicates that the value will be sent to the provider regardless of whether its buffer is "full". This can make a provider's buffer exceed its soft limit.
  if_not_full indicates that the value will be sent to the provider only if the provider is not full.
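The three send modes can be pictured with a small Python model. This is only a sketch of the behavior described above, not Pewpew's implementation: the function name try_send, the soft_limit argument and the returned status strings are all made up for illustration.

```python
from collections import deque

# Hypothetical model of the three `send` modes; not Pewpew's actual code.
MODES = ("block", "force", "if_not_full")

def try_send(buffer, soft_limit, value, mode):
    """Return "sent", "wait" or "dropped" for one attempted send."""
    if mode not in MODES:
        raise ValueError(f"unknown send mode: {mode!r}")
    full = len(buffer) >= soft_limit
    if mode == "force" or not full:
        # `force` ignores the soft limit; the other modes only send when there is room.
        buffer.append(value)
        return "sent"
    if mode == "block":
        # The endpoint would pause here until the buffer has room again.
        return "wait"
    # if_not_full: the value is simply not sent.
    return "dropped"

buffer = deque(["a"])
print(try_send(buffer, 1, "b", "if_not_full"))  # dropped
print(try_send(buffer, 1, "b", "force"))        # sent
```

In this model only force lets the buffer grow past its soft limit, which matches the description of force above.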
Example 1
With an HTTP response with the following body:
{ "session": "abc123" }
and a provides section defined as:
provides:
  session:
    select: response.body.session
    where: response.status < 400
the session provider would be given the value "abc123" if the status code was less than 400; otherwise nothing would be sent to the session provider.
Example 2
With an HTTP response with the following body:
{
  "characters": [
    {
      "type": "Human",
      "id": "1000",
      "name": "Luke Skywalker",
      "friends": ["1002", "1003", "2000", "2001"],
      "appearsIn": [4, 5, 6],
      "homePlanet": "Tatooine"
    },
    {
      "type": "Human",
      "id": "1001",
      "name": "Darth Vader",
      "friends": ["1004"],
      "appearsIn": [4, 5, 6],
      "homePlanet": "Tatooine"
    },
    {
      "type": "Droid",
      "id": "2001",
      "name": "R2-D2",
      "friends": ["1000", "1002", "1003"],
      "appearsIn": [4, 5, 6],
      "primaryFunction": "Astromech"
    }
  ]
}
and our provides section is defined as:
provides:
  names:
    select:
      name: for_each[0].name
    for_each:
      - response.body.characters
The names provider would be sent the following values: { "name": "Luke Skywalker" }, { "name": "Darth Vader" }, { "name": "R2-D2" }.
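The way for_each, where and select combine can be approximated in Python. The sketch below is a rough model under stated assumptions: the provide function is hypothetical, Python callables stand in for Pewpew expressions, and the ordering of the cartesian product follows itertools.product, which may differ from Pewpew's ordering.

```python
from itertools import product

def provide(for_each, select, where=lambda row: True):
    """Model of a provides entry: one selected value per for_each combination."""
    # Arrays are iterated element by element; any other value acts as a single element.
    axes = [value if isinstance(value, list) else [value] for value in for_each]
    # `row` plays the role of the for_each value, so row[0] is like for_each[0].
    return [select(row) for row in product(*axes) if where(row)]

characters = [
    {"name": "Luke Skywalker"},
    {"name": "Darth Vader"},
    {"name": "R2-D2"},
]
names = provide([characters], lambda row: {"name": row[0]["name"]})
print(names)
```

Passing two arrays, for example provide([[1, 2], ["a", "b"]], lambda row: list(row)), yields four values, one per pair.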
Example 3
It is also possible to access the length of an array by accessing the length property.
Using the same response data from example 2, with a provides section defined as:
provides:
  friendsCount:
    select:
      id: for_each[0].id
      count: for_each[0].friends.length
    for_each:
      - response.body.characters
The friendsCount provider would be sent the following values: { "id": "1000", "count": 4 }, { "id": "1001", "count": 1 }, { "id": "2001", "count": 3 }.
logs subsection
logs:
  logger_name:
    select: select
    [for_each: for_each]
    [where: expression]
The logs subsection provides a means of sending data to a logger based on the result of an HTTP response. logger_name is a reference to a logger which must be declared in the root loggers section. It is structured in the same way as the provides subsection except there is no explicit send parameter. When data is sent to a logger it has the same behavior as send: block, which means logging data can potentially block further requests if a logger gets "backed up". This is unlikely to be a problem unless a large amount of data is consistently logged. It is also possible to log to the same logger multiple times in a single endpoint by repeating the logger_name with a new select.
- select - Determines the shape of the data sent into the logger.
- for_each Optional - Evaluates select for each element in an array or arrays.
- where Optional - Allows conditionally sending data into a logger based on a predicate.
Common types
Duration
A duration is an integer followed by an optional space and a string value indicating the time unit. Days can be specified with "d", "day" or "days", hours with "h", "hr", "hrs", "hour" or "hours", minutes with "m", "min", "mins", "minute" or "minutes", and seconds with "s", "sec", "secs", "second" or "seconds". Durations are templates, but can only be interpolated with variables defined in the vars section.
Examples:
1h = 1 hour
30 minutes = 30 minutes
Multiple duration pieces can be chained together to form more complex durations.
Examples:
1h45m30s = 1 hour, 45 minutes and 30 seconds
4 hrs 15 mins = 4 hours and 15 minutes
As seen above an optional space can be used to delimit the individual duration pieces.
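The duration grammar described above can be sketched as a small Python parser. This is an illustration only: parse_duration is a hypothetical helper, it assumes lowercase unit names, and it does not handle template interpolation.

```python
import re

# Unit names taken from the description above, mapped to seconds.
UNIT_SECONDS = {}
for names, seconds in (
    (("d", "day", "days"), 86400),
    (("h", "hr", "hrs", "hour", "hours"), 3600),
    (("m", "min", "mins", "minute", "minutes"), 60),
    (("s", "sec", "secs", "second", "seconds"), 1),
):
    for name in names:
        UNIT_SECONDS[name] = seconds

# One piece: an integer, an optional space, then a unit name.
PIECE = re.compile(r"\s*(\d+)\s?([a-z]+)")

def parse_duration(text):
    """Return the total number of seconds described by a duration string."""
    text = text.strip()
    if not text:
        raise ValueError("empty duration")
    position, total = 0, 0
    while position < len(text):
        match = PIECE.match(text, position)
        if match is None or match.group(2) not in UNIT_SECONDS:
            raise ValueError(f"invalid duration: {text!r}")
        total += int(match.group(1)) * UNIT_SECONDS[match.group(2)]
        position = match.end()
    return total

print(parse_duration("1h45m30s"))       # 6330
print(parse_duration("4 hrs 15 mins"))  # 15300
```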
Headers
Key/value pairs where the key is a string and the value is a template, specifying the headers which will be sent with a request. Note that the host and content-length headers are added automatically to requests and any headers with the same name will be overwritten.
In an endpoint's headers sub-section, a YAML null can be specified as the value, which will unset any global header with that name. Because HTTP specs allow a header to be specified multiple times in a request, to override a global header it is necessary to specify the header twice in the endpoint's headers sub-section, once with a null value and once with the new value. Not including the null value will mean the request will have the header specified twice.
For example:
endpoints:
  - url: https://localhost/foo/bar
    headers:
      Authorization: Bearer ${sessionId}
specifies that an "Authorization" header will be sent with the request with a value of "Bearer " followed by a value coming from a provider named "sessionId".
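The interaction between global headers and an endpoint's headers, including the null "unset" rule described above, can be modeled in Python. This is a sketch under assumptions: merge_headers is a hypothetical helper, headers are represented as a list of (name, value) pairs so a name can repeat, names are compared case-insensitively, and entries are applied in order.

```python
def merge_headers(global_headers, endpoint_headers):
    """Combine global and endpoint headers; a None value removes earlier headers of that name."""
    merged = list(global_headers)
    for name, value in endpoint_headers:
        if value is None:
            # A null value unsets every header with this name seen so far.
            merged = [(n, v) for n, v in merged if n.lower() != name.lower()]
        else:
            merged.append((name, value))
    return merged

global_headers = [("Accept", "application/json")]

# Null first, then the new value: the global header is replaced.
print(merge_headers(global_headers, [("Accept", None), ("Accept", "text/plain")]))

# Without the null entry the header is sent twice.
print(merge_headers(global_headers, [("Accept", "text/plain")]))
```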
Templates
Templates are special string values which can be interpolated with expressions. Interpolation is done by enclosing the expression in ${ }. For example: ${foo}-bar creates a string where a value from a provider named "foo" is interpolated before the string value -bar. ${join(baz, ".")} uses the join helper to create a string value derived from a value coming from the provider "baz".
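As a rough picture of interpolation, the following Python sketch substitutes ${name} placeholders with values from a dictionary. It is deliberately minimal and rests on assumptions: render is a hypothetical function, it only handles bare provider names, and it does not evaluate full expressions or helper calls such as join.

```python
import re

# Matches ${name} where name is a simple identifier.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")

def render(template, values):
    """Replace each ${name} with the string form of values[name]."""
    return PLACEHOLDER.sub(lambda match: str(values[match.group(1)]), template)

print(render("${foo}-bar", {"foo": "abc"}))                       # abc-bar
print(render("Bearer ${sessionId}", {"sessionId": "abc123"}))     # Bearer abc123
```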
Expressions
Expressions are like a mini-scripting language embedded within Pewpew. Expressions only deal with very limited data types--the JSON types--strings, numbers, booleans, null values, arrays and objects.
Expressions are most commonly used to access data from a provider (via templates) or to transform data from an HTTP response to be sent into a provider. Expressions allow the traversal of object and array structures, evaluating boolean logic and basic mathematic operators. Helper functions extend the functionality of expressions further.
Operators
Operator | Description |
---|---|
== | Equal. Checks that two values are equal to each other and produces a boolean. |
!= | Not equal. Checks that two values are not equal to each other and produces a boolean. |
> | Greater than. Checks that the left value is greater than the right and produces a boolean. |
< | Less than. Checks that the left value is less than the right and produces a boolean. |
>= | Greater than or equal to. Checks that the left value is greater than or equal to the right and produces a boolean. |
<= | Less than or equal to. Checks that the left value is less than or equal to the right and produces a boolean. |
&& | And. Checks that two values are true and produces a boolean. |
|| | Or. Checks that one of two values is true and produces a boolean. |
+ | Add. Adds two numbers together producing a number. |
- | Subtract. Subtracts two numbers producing a number. |
* | Multiply. Multiplies two numbers producing a number. |
/ | Divide. Divides two numbers producing a number. |
% | Remainder. Provides the remainder after dividing two numbers. |
Helper functions
Function | Description |
---|---|
or
|
When used in a endpoints.declare subsection When used outside a declare subsection, See the endpoints.declare subsection for an example. |
encode(value, encoding)
|
Encode a string with the given encoding. value - any expression. The result of the expression will be coerced to a string if needed and then encoded with the specified encoding.
Example: with the value |
end_pad(value, min_length, pad_string)
|
Pads a string or number to be minimum length. Any added padding will be added to the end of the string. value - An expression whose value will be coerced to a string if needed. Example: with the value |
entries(value)
|
Returns the "entries" which make up value. For an object this will yield the object's key/value pairs. For an array it yields the array's indices and elements. For a string it yields the indices and the characters. For boolean and null types it yields back those same values. Examples With the value
would return
would return
would return
would return |
epoch(unit)
|
Returns time since the unix epoch. unit - A string literal of |
if(check, true_value, false_value) |
Does a boolean check against the first argument, if true the second argument is returned otherwise the third argument is returned. check - An expression which will be coerced to a boolean if needed. Example: |
or
|
Turns an array of values into a string or turns an object into a string. value - any expression. When the expression resolves to an array, the elements of the array are coerced to a string if needed and are then joined together to a single string using the specified separator. When the value resolves to an object and the three argument variant is used then the object will be turned into a string with the specified separators. In any other case value is coerced to a string and returned. Examples With the value With the value
or for an alternative, json-ified view: |
json_path(query)
|
Provides the ability to execute a json path query against an object and returns an array of values. The query must be a string literal. Example: |
match(string, regex)
|
Allows matching a string against a regex. Returns an object with the matches from the regex. Named matches are supported though any unnamed matches will be a number based on their position. Match If the first parameter is not a string it will be coerced into a string. Regex look arounds are not supported. Example: If a response body were the following:
Then the following expression:
Would return:
|
|
Selects the largest number out of a sequence of numbers. Each argument should be an expression which resolves to a number otherwise it will not be considered in determining the max. If no arguments are provided, or if none of the arguments resolve to a number, then |
|
Selects the smallest number out of a sequence of numbers. Each argument should be an expression which resolves to a number otherwise it will not be considered in determining the min. If no arguments are provided, or if none of the arguments resolve to a number, then |
|
Generates a random number between start (inclusive) and end (exclusive). Both start and end must be number literals. If both numbers are integers only integers will be generated within the specified range. If either number is a floating point number then a floating point number will be generated within the specified range. |
|
Creates an array of numeric values in the specified range. start - any expression resolving to a whole number. Represents the starting number for the range (inclusive). end - any expression resolving to a whole number. Represents the end number for the range (exclusive). Examples:
|
or
|
Creates an array of Example: |
start_pad(value, min_length, pad_string)
|
Pads a string or number to be minimum length. Any added padding will be added to the start of the string. value - an expression whose value will be coerced to a string if needed. Example: with the value |
replace(needle, haystack, replacer)
|
Replaces any instance of a string (needle) within a JSON value (haystack) with another string (replacer). This function will recursively check the JSON for any string value of needle and replace it with replacer. This includes checking within a nested object's key and value pairs, within arrays and within strings. needle - an expression whose value will be coerced to a string if needed. Example: with the value |
parseInt(value)
|
Converts a string or other value into an integer ( value - any expression. The result of the expression will be coerced to a string if needed and then converted. |
parseFloat(value)
|
Converts a string or other value into a floating point number ( value - any expression. The result of the expression will be coerced to a string if needed and then converted. |
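The recursive behavior described for the replace helper can be sketched in Python. This is a model of the description in the table, not Pewpew's implementation; replace_json is a hypothetical name and the handling of non-string scalars (left untouched) is an assumption.

```python
def replace_json(needle, haystack, replacer):
    """Recursively replace needle with replacer in every string of a JSON-like value."""
    needle = str(needle)  # the description says needle is coerced to a string
    if isinstance(haystack, str):
        return haystack.replace(needle, replacer)
    if isinstance(haystack, list):
        return [replace_json(needle, item, replacer) for item in haystack]
    if isinstance(haystack, dict):
        # Both keys and values are checked.
        return {
            replace_json(needle, key, replacer): replace_json(needle, value, replacer)
            for key, value in haystack.items()
        }
    # Numbers, booleans and null are returned unchanged in this sketch.
    return haystack

print(replace_json("foo", {"foo": ["a foo", 1, {"k": "foo"}]}, "bar"))
```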
Command-line options
There are two ways that Pewpew can execute: either a full load test or a try run. For reference here's the output of pewpew --help:
The HTTP load test tool https://familysearch.github.io/pewpew
Usage: pewpew <COMMAND>
Commands:
  run   Runs a full load test
  try   Runs the specified endpoint(s) a single time for testing purposes
  help  Print this message or the help of the given subcommand(s)
Options:
  -h, --help     Prints help information
  -V, --version  Prints version information
As signified in the above help output, there are two subcommands: run and try.
Here's the output of pewpew run --help:
Usage: pewpew run [OPTIONS] <CONFIG>
Arguments:
  <CONFIG>  Load test config file to use
Options:
  -f, --output-format <FORMAT>         Formatting for stats printed to stdout [default: human]
                                       [possible values: human, json]
  -d, --results-directory <DIRECTORY>  Directory to store results and logs
  -t, --start-at <START_AT>            Specify the time the test should start at
  -o, --stats-file <STATS_FILE>        Specify the filename for the stats file
  -s, --stats-file-format <FORMAT>     Format for the stats file [default: json] [possible values:
                                       json]
  -w, --watch                          Watch the config file for changes and update the test
                                       accordingly
  -h, --help                           Prints help information
The -f, --output-format parameter allows changing the formatting of the stats which are printed to stdout.
The -d, --results-directory parameter will store the results file and any output logs in the specified directory. If the directory does not exist it is created.
The -w, --watch parameter makes pewpew watch the config file for changes. The watch_transition_time general config option allows specifying a transition time for switching to the new load_patterns and peak_loads.
While any part of a test can be updated, special care should be taken when modifying or removing endpoints. This is because statistics are aggregated based on the numerical index of where each endpoint appears in the config file. If, for example, the first endpoint is no longer needed and it is simply removed from the test, what was the second endpoint becomes the first and all of its statistics will begin aggregating with the original first endpoint's statistics. An alternative to removing the endpoint is to set the peak_load on the first endpoint to 0hpm.
Here's the output of pewpew try --help:
Usage: pewpew try [OPTIONS] <CONFIG>
Arguments:
  <CONFIG>  Load test config file to use
Options:
  -o, --file <FILE>                    Send results to the specified file instead of stdout
  -f, --format <FORMAT>                Specify the format for the try run output [default: human]
                                       [possible values: human, json]
  -i, --include <INCLUDE>              Filter which endpoints are included in the try run. Filters
                                       work based on an endpoint's tags. Filters are specified in
                                       the format "key=value" where "*" is a wildcard. Any
                                       endpoint matching the filter is included in the test
  -l, --loggers                        Enable loggers defined in the config file
  -d, --results-directory <DIRECTORY>  Directory to store logs (if enabled with --loggers)
  -k, --skip-response-body             Skips reponse body from output (try command)
  -K, --skip-request-body              Skips request body from output (try command)
  -h, --help                           Prints help information
A try run will run one or more endpoints a single time and print out the raw HTTP requests and responses to stdout. By default all endpoints are included in the try run. This is useful for testing out a config file before running a full load test. When the --include parameter is used, pewpew will automatically include any other endpoints needed to provide data for the explicitly included endpoints.
The -i, --include parameter allows the filtering of which endpoints are included in the try run. Filtering works based on an endpoint's tags (see the tags parameter in the endpoints section). The INCLUDE pattern is specified in the format key=value or key!=value and an asterisk * can be used as a wildcard. This parameter can be used multiple times to specify multiple patterns. An endpoint which matches any of the patterns is included in the try run.
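The filtering rule can be pictured with a short Python model. It is a sketch built on assumptions rather than Pewpew's actual matching code: the functions matches and included are hypothetical, a missing tag is treated as satisfying a key!=value pattern, and * is the only wildcard character handled.

```python
import re

def _wildcard(pattern):
    """Compile a pattern where * matches any run of characters."""
    return re.compile(".*".join(re.escape(part) for part in pattern.split("*")))

def matches(tags, include):
    """Check one key=value or key!=value pattern against an endpoint's tags."""
    if "!=" in include:
        key, value = include.split("!=", 1)
        negate = True
    else:
        key, value = include.split("=", 1)
        negate = False
    hit = key in tags and _wildcard(value).fullmatch(tags[key]) is not None
    return hit != negate

def included(tags, patterns):
    """With no patterns every endpoint is included; otherwise any match suffices."""
    return not patterns or any(matches(tags, pattern) for pattern in patterns)

print(included({"type": "login"}, ["type=log*"]))      # True
print(included({"type": "login"}, ["type!=login"]))    # False
```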
The -l, --loggers flag specifies that any loggers defined in the config file should be enabled. By default, during a try run, loggers are disabled.
The -d, --results-directory parameter will store any log files (if the --loggers flag is used) in the specified directory. If the directory does not exist it is created.
The -k, --skip-response-body parameter ensures that during a try run the response bodies aren't displayed. This can be particularly useful when debugging responses whose body is very long and not crucial to the debugging process.
The -K, --skip-request-body parameter ensures that during a try run the request bodies aren't displayed. This can be particularly useful when debugging requests whose body is very long and not crucial to the debugging process.
In both the run and try subcommands a config file is required.
Environment variables
While most environment variables are passed on to the vars section of the config file, there are a few that affect the pewpew executable.
- RUST_BACKTRACE Optional - Enable display of the stack backtrace on errors. Providing any value (other than a falsey value such as 0) will enable this. Examples: RUST_BACKTRACE=1 or RUST_BACKTRACE=full.
- RUST_LOG Optional - A LevelFilter specifying what level pewpew should log at. Allowed values are Off, Error, Warn, Info, Debug, and Trace. Default is Error. See Enable Logging for more complex options for RUST_LOG.
Viewing Results
At the end of every test, Pewpew creates a stats-*.json file with aggregated statistics from the test. To view the stats:
- Go here to open the results viewer.
- Drag the stats-*.json file onto the page.
Tuning a machine for a load test
Tuning a Linux machine
To get maximum throughput on Linux consider the following tweaks. NOTE: These tweaks have been tested in Ubuntu 18.04 and may be different in other distributions.
Append the following to /etc/sysctl.conf:
fs.file-max = 999999
net.ipv4.tcp_rmem = 4096 4096 16777216
net.ipv4.tcp_wmem = 4096 4096 16777216
net.ipv4.ip_local_port_range = 1024 65535
Append the following to /etc/security/limits.conf:
* - nofile 999999
Tuning a Windows machine
Using the registry editor, navigate to the following path:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\
Add (or edit if it exists) the entry MaxUserPort as a DWORD type and set the value to 65534 (decimal).
Alternatively, save the following as port.reg and run the file:
Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
"MaxUserPort"=dword:0000fffe
Found a bug?
Before filing a new issue on GitHub, it is helpful to create a reproducible test case. To do that, please do the following:
- Remove all endpoints, providers, loggers, vars, load_patterns and config options not needed to reproduce the issue.
- When possible, replace file providers with a variable (vars) or a concise list provider. If the file provider is required to replicate the issue, make the file as small as possible.
- Replace all references to environment variables with actual values in vars.
- Change the remaining endpoints to run against the Pewpew test server (see below).
- Reduce peak_loads and load_patterns as much as possible while still reproducing the issue.
Using the Pewpew test server
The Pewpew test server provides a way to reproduce problems without generating load against others' servers. The test server is a simple, locally run HTTP server which is usually run from the same machine that Pewpew runs from.
To run the test server first download the latest test server binaries here, extract the archive and run the executable from the command-line. You should then see a message like:
Listening on port 2073
The port the test server uses can be configured by setting the PORT environment variable. Here's an example run in bash:
$ PORT=8080 ./test-server
Listening on port 8080
The test server provides a single HTTP endpoint:
- / - this endpoint acts as an "echo server" and will return within the response body any data that was sent to it. This endpoint should only ever return a 200 or 204 status code. It accepts all HTTP methods though only GET, POST and PUT can echo data back in the response. For the GET method to echo data back, specify the echo data in the echo query parameter. For POST and PUT simply put the data to be echoed back in the request body. The response will use the same Content-Type header from the request when specified, otherwise it will use text/plain.
  There is also an optional wait query parameter which defines a delay (specified in milliseconds) for how long the server should wait before responding.
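As an illustration of the echo and wait query parameters, the Python sketch below builds a GET URL for the test server. The helper name echo_url, the localhost host and the choice of percent-encoding are assumptions made for the example; only the echo and wait parameter names and the default port 2073 come from the description above.

```python
from urllib.parse import quote, urlencode

def echo_url(data, wait_ms=None, port=2073, host="localhost"):
    """Build a GET URL asking the test server to echo `data`, optionally after a delay."""
    params = {"echo": data}
    if wait_ms is not None:
        # Delay in milliseconds before the server responds.
        params["wait"] = str(wait_ms)
    return f"http://{host}:{port}/?{urlencode(params, quote_via=quote)}"

print(echo_url("hello world", wait_ms=250))
# http://localhost:2073/?echo=hello%20world&wait=250
```

With the test server running locally, the resulting URL could be fetched with any HTTP client, for example urllib.request.urlopen.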