Intermediate Representation (IR)

Overview

OPA can compile policy queries into planned evaluation paths suitable for further compilation or interpretation. This document explains the structure and semantics of the intermediate representation (IR) used to represent these planned evaluation paths. Read this document if you want to write a compiler or interpreter for Rego.

Structure

This section explains the structure of policies compiled into the IR.

Policy

The root object emitted by the compiler is a Policy and contains the following top-level keys:

  • static is an object containing static data used by the compiled plans and functions.
  • plans is an object containing entrypoints to compiled evaluation paths.
  • funcs is an object containing functions supporting the compiled evaluation paths.

Static

The Static object contains static data required by the plans and functions, along with metadata that does not affect the semantics of the policy. It contains the following top-level keys:

  • strings is an array of string constants referenced by compiled statements in the plans and functions.
  • builtin_funcs is an array of function declarations representing built-in functions required by the compiled statements.
  • files is used for debugging purposes only. It is an array of filenames that were used during compilation.

Strings

The Strings array is a collection of string objects referenced by compiled statements in the policy. Strings are referenced by their index in the collection. Each string object contains the following fields:

  • value is the string constant value. The string may be any valid JSON string.
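As a minimal sketch of index-based lookup, assume the static object has been decoded from JSON into a Python dict. The sample values and the helper name string_at are hypothetical and chosen only for illustration.

```python
# Hypothetical static object; the string values are made up for illustration.
static = {"strings": [{"value": "allow"}, {"value": "admin"}]}


def string_at(static, index):
    """Return the string constant stored at the given index."""
    return static["strings"][index]["value"]


print(string_at(static, 1))  # prints "admin"
```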

Built-in Functions

The Built-in Functions array is a collection of built-in function declarations. Each declaration represents a function that must be provided by the environment where the policy is eventually executed. Each built-in function contains the following fields:

  • name is the name of the function that must be provided.
  • decl is the type definition of the function.

Files

The Files array is a collection of static strings representing names of source files used during compilation. Filenames are referred to by their index in the files array.

Plans

The Plans object contains a collection of planned evaluation paths representing entrypoints to the policy. When users compile policies they supply the queries to expose as entrypoints. Each plan contains the following fields:

  • name is the entrypoint identifier, typically set to the path of the policy decision (e.g., authz/allow).
  • blocks is a collection of Block objects representing the compiled statements that define the entrypoint.

Functions

The Functions object contains a collection of function definitions that represent functions supporting the plans. Functions can be invoked by name inside of plans and other functions. Each function contains the following fields:

  • name is the function identifier referenced by call statements.
  • path is the function identifier referenced by dynamic call statements.
  • params is an ordered list of local variable identifiers representing function parameters. The parameters can be referenced inside of the blocks that define the function.
  • return is the local variable containing the return value of the function.
  • blocks is a collection of Block objects representing the compiled statements that define the function.

Blocks

The Block object contains a sequence of Statements that must be executed in order until a statement terminating block execution is encountered or the end of the block is reached. Each block contains the following fields:

  • stmts is an array of Statement objects.

Statements

The Statement object represents an operation performed by the policy (e.g., function invocation, lookup, iteration, or comparison). The structure is specific to each statement type, but every statement contains the following fields:

  • type is a string value that identifies the type of the statement.
  • stmt is an object containing statement-specific fields.
  • file is the index, in the files array, of the source file where this statement originated.
  • row is the row in the source file where this statement originated.
  • col is the column in the source file where this statement originated.

See the Statement Definitions section for an explanation of the supported statement types.
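The location fields can be combined with the files array to report where a statement came from. The sketch below assumes the files array decodes to a list of plain filename strings and that file, row, and col sit next to type and stmt, as listed above. The JSON that OPA actually emits may nest these differently, and the filename and helper name are hypothetical.

```python
# Hypothetical data: a files array of plain filename strings and one statement
# carrying the common fields listed above.
files = ["authz.rego"]
statement = {
    "type": "IsDefinedStmt",
    "stmt": {"source": {"type": "local", "value": 2}},
    "file": 0,
    "row": 3,
    "col": 5,
}


def source_location(files, statement):
    """Format the origin of a statement as filename:row:col."""
    filename = files[statement["file"]]
    return f"{filename}:{statement['row']}:{statement['col']}"


print(source_location(files, statement))  # prints "authz.rego:3:5"
```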

Execution

This section explains the execution model for compiled policies.

Plan Execution

Compiled policies consist of one or more plans. Any plan can be invoked by name. If no name is supplied, the first plan in the policy should be executed. Plans consist of one or more Blocks that are executed in order. Statements inside the blocks of a plan have implicit access to two local variables, 0 and 1, which represent the input and data documents respectively. The final statement in every block inside of a plan is a ResultSetAddStmt statement that adds an object to an implicit result set. The object contains the key-value bindings representing the values of variables in the original query. If no ResultSetAddStmt statements are executed, the implicit result set is empty.

Function Execution

Compiled policies may contain zero or more functions. Any function can be invoked by name via the CallStmt statement or dynamically via the CallDynamicStmt statement. All functions are defined with two or more positional arguments. The first positional argument is a local variable representing the input document. The second positional argument is a local variable representing the data document. Function execution terminates when a ReturnLocalStmt statement is encountered. All functions include a final block that includes a ReturnLocalStmt.
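A rough sketch of this calling convention follows. It is not OPA's implementation: functions are indexed by name in a dict for convenience, arguments are passed as already-resolved values rather than locals, only two statement types are handled, and undefined handling is left out. The function example.identity is hypothetical.

```python
def call_function(funcs, name, args):
    """Invoke a function by name and return its result."""
    func = funcs[name]
    # Bind the positional arguments to the function's parameter locals.
    local_vars = dict(zip(func["params"], args))
    for block in func["blocks"]:
        for statement in block["stmts"]:
            kind, fields = statement["type"], statement["stmt"]
            if kind == "AssignVarStmt":
                # Only local operands are handled in this sketch.
                local_vars[fields["target"]] = local_vars[fields["source"]["value"]]
            elif kind == "ReturnLocalStmt":
                # Function execution terminates here.
                return local_vars[fields["source"]]
            else:
                raise NotImplementedError(kind)
    return None


# Hypothetical function: locals 0 and 1 are input and data, local 2 is the
# first caller-supplied argument, and local 3 holds the return value.
funcs = {
    "example.identity": {
        "name": "example.identity",
        "params": [0, 1, 2],
        "return": 3,
        "blocks": [
            {"stmts": [{"type": "AssignVarStmt",
                        "stmt": {"source": {"type": "local", "value": 2}, "target": 3}}]},
            {"stmts": [{"type": "ReturnLocalStmt", "stmt": {"source": 3}}]},
        ],
    }
}

print(call_function(funcs, "example.identity", [{}, {}, "hello"]))  # prints "hello"
```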

Block Execution

Blocks are sequences of statements that are executed in order. A statement can be executed only if all of its input parameters are defined. If any input parameter is undefined, the statement is undefined. The Statement Definitions section below indicates when a statement may be undefined. When a statement is undefined, execution breaks to the end of the current block and resumes at the statement immediately following the block (which may be the beginning of another block). When a statement is defined, all of its output parameters are defined. Execution halts if a statement raises an exception.
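The execution model can be illustrated with a very small interpreter. This is a sketch under stated assumptions, not OPA's implementation: it handles only DotStmt, EqualStmt, and ResultSetAddStmt, models "undefined" as a Python exception, collects results in a list, and adds raw values to the result set instead of binding objects. The plan, strings, and helper names are hypothetical.

```python
class Undefined(Exception):
    """Raised when a statement is undefined."""


def resolve(operand, local_vars, strings):
    """Return the value of a local, bool, or string_index operand."""
    if operand["type"] == "local":
        if operand["value"] not in local_vars:
            raise Undefined()
        return local_vars[operand["value"]]
    if operand["type"] == "string_index":
        return strings[operand["value"]]["value"]
    return operand["value"]  # bool constant


def execute(statement, local_vars, strings, results):
    kind, fields = statement["type"], statement["stmt"]
    if kind == "DotStmt":
        source = resolve(fields["source"], local_vars, strings)
        key = resolve(fields["key"], local_vars, strings)
        if not isinstance(source, dict) or key not in source:
            raise Undefined()
        local_vars[fields["target"]] = source[key]
    elif kind == "EqualStmt":
        a = resolve(fields["a"], local_vars, strings)
        b = resolve(fields["b"], local_vars, strings)
        if a != b:
            raise Undefined()
    elif kind == "ResultSetAddStmt":
        results.append(local_vars[fields["value"]])
    else:
        raise NotImplementedError(kind)


def run_plan(plan, strings, input_doc, data_doc):
    local_vars = {0: input_doc, 1: data_doc}  # implicit locals 0 and 1
    results = []
    for block in plan["blocks"]:
        try:
            for statement in block["stmts"]:
                execute(statement, local_vars, strings, results)
        except Undefined:
            continue  # break to the end of this block, resume at the next one
    return results


def local(i):
    return {"type": "local", "value": i}


def string(i):
    return {"type": "string_index", "value": i}


strings = [{"value": "user"}, {"value": "alice"}, {"value": "role"}]
plan = {
    "name": "example/plan",
    "blocks": [
        {"stmts": [
            {"type": "DotStmt", "stmt": {"source": local(0), "key": string(0), "target": 2}},
            {"type": "EqualStmt", "stmt": {"a": local(2), "b": string(1)}},
            {"type": "ResultSetAddStmt", "stmt": {"value": 2}},
        ]},
        {"stmts": [
            {"type": "DotStmt", "stmt": {"source": local(0), "key": string(2), "target": 3}},
            {"type": "ResultSetAddStmt", "stmt": {"value": 3}},
        ]},
    ],
}

print(run_plan(plan, strings, {"user": "bob", "role": "dev"}, {}))  # prints ['dev']
print(run_plan(plan, strings, {}, {}))  # prints []
```

In the first call the EqualStmt in the first block is undefined, so its ResultSetAddStmt is skipped and execution resumes at the second block. In the second call every block is undefined and the result set stays empty.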

Statement Definitions

This section defines the statements that can be contained in plans and functions and explains the input and output parameters that each statement accepts. The valid parameter types are:

  • local is a 32-bit integer representing a local variable.
  • int32 is a 32-bit integer.
  • int64 is a 64-bit integer.
  • uint32 is a 32-bit unsigned integer.
  • string is an arbitrary-length Unicode string.
  • array[...] represents a sequence of ... values.

In addition, parameters may be of type operand. The operand type represents a tagged union that can refer to a local variable, boolean constant, or string constant index:

{
    "type": "local" | "bool" | "string_index"
    "value": number | boolean | number
}

Local variables refer to values. A value may be any JSON value (i.e., null, true, false, number, string, array, or object) or a set (an unordered collection of values).
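Resolving an operand is a dispatch on its type tag. The following sketch assumes locals are kept in a dict keyed by local index and that the strings array has been decoded into a list of {"value": ...} objects. The function name resolve_operand and the sample data are hypothetical.

```python
def resolve_operand(operand, local_vars, strings):
    """Return the value that an operand refers to."""
    kind, value = operand["type"], operand["value"]
    if kind == "local":
        return local_vars[value]        # value is a local variable index
    if kind == "bool":
        return value                    # value is the boolean constant itself
    if kind == "string_index":
        return strings[value]["value"]  # value indexes the strings array
    raise ValueError(f"unknown operand type: {kind!r}")


local_vars = {2: [1, 2, 3]}
strings = [{"value": "admin"}]

print(resolve_operand({"type": "local", "value": 2}, local_vars, strings))         # prints [1, 2, 3]
print(resolve_operand({"type": "bool", "value": True}, local_vars, strings))       # prints True
print(resolve_operand({"type": "string_index", "value": 0}, local_vars, strings))  # prints admin
```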

ArrayAppendStmt

Parameter | Input/Output | Type | Description
array | input | local | The array to append a value to.
value | input | operand | The value to append to the array.

AssignIntStmt

Parameter | Input/Output | Type | Description
value | input | int64 | The integer value to assign to the target.
target | output | local | The local variable to assign the integer to.

AssignVarOnceStmt

Parameter | Input/Output | Type | Description
source | input | operand | The value to assign to the target.
target | output | local | The local variable to assign the operand to.

This statement raises an exception if the target is already assigned.

AssignVarStmt

Parameter | Input/Output | Type | Description
source | input | operand | The value to assign to the target.
target | output | local | The local variable to assign the operand to.

BlockStmt

Parameter | Input/Output | Type | Description
blocks | input | Blocks | The nested blocks to execute.

BreakStmt

Parameter | Input/Output | Type | Description
index | input | uint32 | The index of the block to jump out of, starting with zero for the current block and incrementing by one for each outer block.
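The index counting can be sketched with an exception that unwinds one block per level. This reflects one reading of the description above and is not OPA's implementation. TraceStmt is a hypothetical statement type that is not part of the IR and exists only to make the order of execution visible. The helper names are also hypothetical.

```python
class Break(Exception):
    """Carries the number of enclosing blocks still to be exited."""

    def __init__(self, index):
        super().__init__(index)
        self.index = index


def run_block(block, trace):
    try:
        for statement in block["stmts"]:
            kind, fields = statement["type"], statement["stmt"]
            if kind == "BreakStmt":
                raise Break(fields["index"])
            if kind == "BlockStmt":
                for nested in fields["blocks"]:
                    run_block(nested, trace)
            elif kind == "TraceStmt":  # hypothetical, not part of the IR
                trace.append(fields["label"])
    except Break as signal:
        if signal.index > 0:
            # The target is an outer block, so keep unwinding.
            raise Break(signal.index - 1) from None


def trace_stmt(label):
    return {"type": "TraceStmt", "stmt": {"label": label}}


def program(break_index):
    inner = {"stmts": [trace_stmt("b"),
                       {"type": "BreakStmt", "stmt": {"index": break_index}},
                       trace_stmt("c")]}
    sibling = {"stmts": [trace_stmt("d")]}
    return {"stmts": [trace_stmt("a"),
                      {"type": "BlockStmt", "stmt": {"blocks": [inner, sibling]}},
                      trace_stmt("e")]}


for index in (0, 1):
    trace = []
    run_block(program(index), trace)
    print(index, trace)
# prints: 0 ['a', 'b', 'd', 'e']
#         1 ['a', 'b']
```

With index 0 only the innermost block is abandoned, so its sibling block and the rest of the outer block still run. With index 1 the outer block is abandoned as well.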

CallDynamicStmt

Parameter | Input/Output | Type | Description
path | input | array[operand] | The path of the function to invoke.
args | input | array[local] | The positional arguments to pass to the function.
result | output | local | The local variable to assign the function return value to.

CallStmt

Parameter | Input/Output | Type | Description
func | input | string | The name of the function to invoke.
args | input | array[local] | The positional arguments to pass to the function.
result | output | local | The local variable to assign the function return value to.

DotStmt

Parameter | Input/Output | Type | Description
source | input | operand | The value to perform a lookup operation on.
key | input | operand | The key to look up in the source.
target | output | local | The local variable to assign the result to.

This statement is undefined if the key does not exist in the source value.

EqualStmt

Parameter | Input/Output | Type | Description
a | input | operand | The first value to compare.
b | input | operand | The second value to compare.

This statement is undefined if a does not equal b.

IsArrayStmt

Parameter | Input/Output | Type | Description
source | input | operand | The value to check.

This statement is undefined if source is not an array.

IsDefinedStmt

Parameter | Input/Output | Type | Description
source | input | operand | The value to check.

This statement is undefined if source is undefined.

IsObjectStmt

Parameter | Input/Output | Type | Description
source | input | operand | The value to check.

This statement is undefined if source is not an object.

IsUndefinedStmt

Parameter | Input/Output | Type | Description
source | input | operand | The value to check.

This statement is undefined if source is not undefined.

LenStmt

Parameter | Input/Output | Type | Description
source | input | operand | The value to compute the length for.
target | output | local | The local variable to assign the length to.

MakeArrayStmt

Parameter | Input/Output | Type | Description
capacity | input | int32 | The initial size of the array to pre-allocate.
target | output | local | The local variable to assign the array value to.

MakeNullStmt

Parameter | Input/Output | Type | Description
target | output | local | The local variable to assign the null value to.

MakeNumberIntStmt

Parameter | Input/Output | Type | Description
value | input | int64 | The integer value to initialize the target with.
target | output | local | The local variable to assign the number to.

MakeNumberRefStmt

Parameter | Input/Output | Type | Description
index | input | int32 | The index of the string constant to construct the number with.
target | output | local | The local variable to assign the number to.

MakeObjectStmt

Parameter | Input/Output | Type | Description
target | output | local | The local variable to assign the object to.

MakeSetStmt

Parameter | Input/Output | Type | Description
target | output | local | The local variable to assign the set to.

NopStmt

This statement is only used for debugging purposes.

NotEqualStmt

Parameter | Input/Output | Type | Description
a | input | operand | The first value to compare.
b | input | operand | The second value to compare.

This statement is undefined if a is equal to b.

NotStmt

Parameter | Input/Output | Type | Description
block | input | Block | The negated statement to execute.

This statement is undefined if the contained block is not undefined.
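Negation inverts the definedness of the nested block. In the sketch below, "undefined" is modeled as a Python exception and a block is modeled as a list of zero-argument callables. Both are simplifications for illustration, and the helper names are hypothetical.

```python
class Undefined(Exception):
    """Raised when a statement is undefined."""


def equal(a, b):
    """Stand-in for EqualStmt on already-resolved values."""
    if a != b:
        raise Undefined()


def not_stmt(block):
    """Run a block, given as a list of zero-argument callables, under negation."""
    try:
        for statement in block:
            statement()
    except Undefined:
        return  # the nested block was undefined, so the NotStmt is defined
    raise Undefined()  # the nested block was defined, so the NotStmt is undefined


not_stmt([lambda: equal("a", "b")])  # defined: returns normally
try:
    not_stmt([lambda: equal("a", "a")])
except Undefined:
    print("undefined")  # prints "undefined"
```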

ObjectInsertOnceStmt

Parameter | Input/Output | Type | Description
key | input | operand | The key to insert into the object.
value | input | operand | The value to insert into the object.
object | input | local | The object to insert the key-value pair into.

This statement raises an exception if the object contains an existing key with a different value.
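The conflict check can be sketched as follows. Python dict semantics and Python equality stand in for IR objects and IR value equality here, which is an approximation. The exception type and function name are hypothetical.

```python
class InsertConflict(Exception):
    """Raised when a key is already present with a different value."""


def object_insert_once(obj, key, value):
    # Python equality stands in for IR value equality in this sketch.
    if key in obj and obj[key] != value:
        raise InsertConflict(f"conflicting values for key {key!r}")
    obj[key] = value


roles = {}
object_insert_once(roles, "alice", "admin")
object_insert_once(roles, "alice", "admin")  # same value again: accepted
try:
    object_insert_once(roles, "alice", "guest")
except InsertConflict as err:
    print(err)  # prints "conflicting values for key 'alice'"
print(roles)  # prints {'alice': 'admin'}
```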

ObjectInsertStmt

Parameter | Input/Output | Type | Description
key | input | operand | The key to insert into the object.
value | input | operand | The value to insert into the object.
object | input | local | The object to insert the key-value pair into.

ObjectMergeStmt

Parameter | Input/Output | Type | Description
a | input | local | The object to merge into.
b | input | local | The object to merge from.
target | output | local | The local variable to assign the merged object to.

ResetLocalStmt

Parameter | Input/Output | Type | Description
target | output | local | The local variable to reset.

ResultSetAddStmt

Parameter | Input/Output | Type | Description
value | input | local | The value to add to the result set.

ReturnLocalStmt

Parameter | Input/Output | Type | Description
source | input | local | The value to return from the function.

ScanStmt

Parameter | Input/Output | Type | Description
source | input | local | The value to scan.
key | output | local | The local variable to assign keys to before executing the nested block.
value | output | local | The local variable to assign values to before executing the nested block.
block | input | Block | The nested block to execute repeatedly for each element in the collection.

This statement is undefined if source is a scalar value or empty collection.
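A sketch of the iteration follows, with several assumptions: objects, arrays, and sets are represented by Python dicts, lists, and sets; array keys are taken to be element indices; set elements are passed as both key and value; and an undefined nested block simply moves on to the next element. The nested block is modeled as a Python callback, and the helper name scan is hypothetical.

```python
class Undefined(Exception):
    """Raised when a statement is undefined."""


def scan(source, body):
    """Call body(key, value) once for each element of a collection."""
    if isinstance(source, dict):
        pairs = list(source.items())
    elif isinstance(source, list):
        pairs = list(enumerate(source))  # assumed: array keys are indices
    elif isinstance(source, (set, frozenset)):
        pairs = [(element, element) for element in source]  # assumed: key == value
    else:
        raise Undefined()  # scalar value
    if not pairs:
        raise Undefined()  # empty collection
    for key, value in pairs:
        try:
            body(key, value)
        except Undefined:
            continue  # assumed: an undefined body skips to the next element


pairs = []
scan({"a": 1, "b": 2}, lambda key, value: pairs.append((key, value)))
print(pairs)  # prints [('a', 1), ('b', 2)]
```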

SetAddStmt

Parameter | Input/Output | Type | Description
value | input | operand | The value to insert into the set.
set | input | local | The set to insert the value into.

WithStmt

Parameter | Input/Output | Type | Description
local | input | local | The value to mutate in the context of the nested block.
path | input | array[int32] | The path of the nested document to replace with the value, represented as an array of string constant indices.
value | input | operand | The value to upsert.
block | input | Block | The nested block to execute in the context of the mutation.
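One way to read these parameters is as a scoped upsert: the local is patched at the given path, the nested block runs, and the original value is put back. The sketch below follows that reading and is an assumption rather than a statement of OPA's behavior. The value is passed already resolved, the nested block is modeled as a Python callback, and the helper names are hypothetical.

```python
def upsert(document, keys, value):
    """Return a copy of document with value placed at the given key path."""
    if not keys:
        return value
    result = dict(document) if isinstance(document, dict) else {}
    result[keys[0]] = upsert(result.get(keys[0]), keys[1:], value)
    return result


def with_stmt(local_vars, strings, local, path, value, block):
    """Run block(local_vars) with local_vars[local] temporarily patched."""
    keys = [strings[index]["value"] for index in path]
    original = local_vars[local]
    local_vars[local] = upsert(original, keys, value)
    try:
        block(local_vars)
    finally:
        local_vars[local] = original  # assumed: the mutation is scoped to the block


strings = [{"value": "user"}, {"value": "name"}]
local_vars = {0: {"user": {"name": "bob", "role": "dev"}}}
seen = []
with_stmt(local_vars, strings, 0, [0, 1], "alice", lambda lv: seen.append(lv[0]["user"]))
print(seen)           # prints [{'name': 'alice', 'role': 'dev'}]
print(local_vars[0])  # prints {'user': {'name': 'bob', 'role': 'dev'}}
```

Building new dicts along the path, instead of mutating in place, keeps the original document intact so that it can be restored after the block finishes.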

Test Suite

The OPA repository contains a test suite that is used internally to validate both the Go interpreter and the Wasm compiler. If you are implementing your own compiler or interpreter we highly recommend integrating the test suite into your own development environment so that your implementation can be verified to conform with OPA’s.

The test suite consists of a set of YAML files that each contain a set of test cases. Each test case specifies a query, a set of modules, data values, and expected outputs or expected error conditions.

To get started with the test suite, see the Hello World example.

The following examples show how the test suite is used internally: