---
title: Policy Language
sidebar_position: 3
---

OPA is purpose-built for policy evaluation and uses its declarative language Rego
to reason about structured data like API requests, infrastructure-as-code files,
and configuration data. Rego lets you express desired rules and decisions as code,
and is designed to be easy to read and write while being optimized for fast policy evaluation.

Rego queries are assertions on data that can be used to define policies and make decisions
about whether data violates the expected state of your system. Rego was inspired by
[Datalog](https://en.wikipedia.org/wiki/Datalog) and extends it to support structured
document models such as JSON.

## Why use Rego?

Use Rego for defining policy that is easy to read and write.

Rego focuses on providing powerful support for referencing nested documents and
ensuring that queries are correct and unambiguous.

Rego is declarative so policy authors can focus on what queries should return
rather than how queries should be executed. These queries are simpler and more
concise than the equivalent in an imperative language.

Like other applications which support declarative query languages, OPA is able
to optimize queries to improve performance.

## Learning Rego

While reviewing the examples below, you might find it helpful to follow along
using the online [OPA playground](http://play.openpolicyagent.org). The
playground also allows sharing of examples via URL which can be helpful when
asking questions on the [OPA Slack](https://slack.openpolicyagent.org).
In addition to these official resources, you may also want to explore the
<EcosystemFeatureLink feature="learning-rego">community learning materials and
tools</EcosystemFeatureLink>.

## The Basics

This section introduces the main aspects of Rego.

The simplest rule is a single expression and is defined in terms of a
[scalar value](#scalar-values). This `example` [package](#packages) defines a rule
called `pi` that contains the value of pi:

```rego
package example

pi := 3.14159
```

<RunSnippet command="data.example"/>

Rules can also be defined in terms of [composite values](#composite-values):

```rego
package example

rect := {"width": 2, "height": 4}
```

<RunSnippet id="rect.rego" command="data.example"/>

You can [compare](#equality-comparison-and-unification) two scalar or composite values, and when you do so you are
checking if the two values are the same JSON value.

```rego
package example

result := rect == {"width": 2, "height": 4}
```

<RunSnippet files="#rect.rego" command="data.example.result"/>

You can define a new concept using a rule. For example, `v` below is true if the
equality expression is true.
If we evaluate `v`, the result is `undefined` because the body of the rule never
evaluates to `true`. As a result, the document generated by the rule is not
defined.

```rego
package example

v if "hello" == "world"
```

<RunSnippet command="data.example.v"/>

Expressions that refer to undefined values are also undefined. This includes comparisons such as `!=`.

```rego
package example

v if "hello" == "world"

# also undefined
w if v != true
```

<RunSnippet command="data.example.w"/>
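Because undefined values propagate, a common way to react to an undefined rule is negation with the `not` keyword. The following is a minimal sketch, not part of the original examples; the package name `undefined_example` and the rule `v_is_undefined` are hypothetical names chosen for illustration.

```rego
package undefined_example

v if "hello" == "world"

# `not v` holds because `v` is undefined, so this rule is `true`.
v_is_undefined if not v
```

Here the query `data.undefined_example.v` is still undefined, while `data.undefined_example.v_is_undefined` is `true`.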

We can define rules in terms of [variables](#variables) as well:

```rego
package example

t if {
    x := 42
    y := 41
    x > y
}
```

<RunSnippet command="data.example.t"/>

When evaluating rule bodies, OPA searches for variable bindings that make all of
the expressions true. There may be multiple sets of bindings that make the rule
body true. The rule body can be understood intuitively as:

```
expression-1 AND expression-2 AND ... AND expression-N
```

The rule itself can be understood intuitively as:

```
rule-name IS value IF body
```

If the **value** is not specified, it defaults to the boolean value of **true**.
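As a small illustration of both readings, the sketch below defines one rule without a value and one with an explicit value. The package and rule names (`rule_shape`, `age`, `points`, `is_adult`, `tier`) are hypothetical and only serve the example.

```rego
package rule_shape

age := 30

points := 1200

# rule-name IS true IF body (no value given, so it defaults to true).
is_adult if age >= 18

# rule-name IS "gold" IF expression-1 AND expression-2.
tier := "gold" if {
    is_adult
    points > 1000
}
```

If either expression in the body of `tier` failed, `tier` would be undefined rather than `false`.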

Rego [references](#references) help you refer to nested documents.
The rule `prod_exists` asserts that there exists (at least) one document
within `sites` where the `name` attribute equals `"prod"` using the [`some` keyword](#some-keyword).

```rego
package sites

sites := [{"name": "prod"}, {"name": "smoke1"}, {"name": "dev"}]

prod_exists if {
    some site in sites
    site.name == "prod"
}
```

<RunSnippet id="sites.rego" command="data.sites.prod_exists"/>

We can generalize the example above with a rule that defines a set document
instead of a boolean value. Here `site_names` is a set of all the sites' `name`
values.

```rego
package sites

site_names contains name if {
    some site in sites
    name := site.name
}
```

<RunSnippet files="#sites.rego" command="data.sites.site_names"/>

This section introduced the main aspects of Rego. The rest of this document
walks those new to Rego through other important aspects of the language.
Please review the [Policy Reference](./policy-reference) for more detailed
information about the Rego language.

## Scalar Values

Scalar values are the simplest type of term in Rego. Scalar values can be [strings](#strings), numbers, booleans, or null.

Documents can be defined solely in terms of scalar values. This is useful for defining constants that are referenced in multiple places. For example:

```rego
package scalars

greeting   := "Hello"
max_height := 42
pi         := 3.14159
allowed    := true
location   := null
```

<RunSnippet command="data.scalars"/>

## Strings

Rego supports two different types of syntax for declaring strings. The first is likely to be the most familiar: characters surrounded by double quotes.
In such strings, certain characters must be escaped to appear in the string, such as double quotes themselves, backslashes, etc. See the [Policy Reference](./policy-reference/#grammar) for a formal definition.

The other type of string declaration is a raw string declaration. These are made of characters surrounded by backticks (`` ` ``), with the exception
that raw strings may not contain backticks themselves. Raw strings are what they sound like: escape sequences are not interpreted, but instead taken
as the literal text inside the backticks. For example, the raw string `` `hello\there` `` will be the text "hello\there", not "hello" and "here"
separated by a tab. Raw strings are particularly useful when constructing regular expressions for matching, because they eliminate the need to
double-escape special characters.

A simple example is a regex to match a valid Rego variable. With a regular string, the regex is `"[a-zA-Z_]\\w*"`, but with raw strings, it becomes `` `[a-zA-Z_]\w*` ``.
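The sketch below compares the two spellings and uses the pattern with the `regex.match` built-in. The package name and rule names are hypothetical, and the `^` and `$` anchors are an addition for this example so that the whole string must match.

```rego
package strings_example

# The same pattern written as a regular string and as a raw string.
escaped := "^[a-zA-Z_]\\w*$"

raw := `^[a-zA-Z_]\w*$`

# Both declarations produce the same string value.
same_pattern := escaped == raw

# True when the candidate looks like a variable name.
valid_name if regex.match(raw, "max_height")

# Undefined, because the candidate starts with a digit.
invalid_name if regex.match(raw, "9lives")
```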

## Composite Values

Composite values define collections. In simple cases, composite values can be treated as constants like [scalar values](#scalar-values):

```rego
package composite

cuboid := {"width": 3, "height": 4, "depth": 5}
```

<RunSnippet command="data.composite"/>

Composite values can also be defined in terms of [variables](#variables) or [references](#references). For example:

```rego
package composite_variables

a := 42
b := false
c := null
d := {"a": a, "x": [b, c]}
```

<RunSnippet command="data.composite_variables"/>

By defining composite values in terms of variables and references, rules can define abstractions over raw data and other rules.

### Arrays

Arrays are ordered collections of values. Arrays in Rego are zero-indexed, and may contain any value, including
variable references.

```rego
package arrays

pi := 3.14
arr := [1, "two", pi*2]
last := arr[2]
```

<RunSnippet command="data.arrays.last"/>

Use arrays when order matters or when duplicate values are required.
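A minimal sketch of both properties, using hypothetical names (`readings`, `total`, `same_order`):

```rego
package arrays_example

readings := [3, 1, 3]

# Duplicates are kept, so the array has three elements.
total := count(readings)

# Order is significant: these two arrays are different values.
same_order := [1, 2] == [2, 1]
```

In this sketch `total` is `3` and `same_order` is `false`.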

### Objects

Objects are unordered key-value collections. In Rego, any value type can be
used as an object key. For example, the following assignment maps port **numbers**
to a list of IP addresses (represented as strings).

```rego
package objects

ips_by_port := {
    80: ["10.0.0.1", "10.10.10.1"],
    443: ["10.1.1.1"],
}

result := ips_by_port[80]
```

<RunSnippet id="objects.rego" command="data.objects.result"/>

When Rego values are converted to JSON, non-string object keys are marshalled
as strings (because JSON does not support non-string object keys).

```rego
package objects

# when queried, this will be converted to JSON
json := ips_by_port
```

<RunSnippet files="#objects.rego" command="data.objects.json"/>

### Sets

In addition to arrays and objects, Rego supports set values. Sets are unordered
collections of unique values. Just like other composite values, sets can be
defined in terms of scalars, variables, references, and other composite values.
For example:

```rego
package sets

s1 := {1,2,3}
s2 := {3,2,1}

sets_equal := s1 == s2
```

<RunSnippet command="data.sets"/>

:::warning
Set documents are collections of values without keys or order. OPA represents
sets as arrays when serializing to JSON or other formats that do not support a
set data type. The important distinction between sets and arrays or objects is
that sets are unkeyed while arrays and objects are keyed, i.e., you cannot refer
to the index of an element within a set.
:::
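Because sets are unkeyed, elements are usually tested for membership or combined with set operators rather than looked up by position. The following sketch uses hypothetical names (`s`, `has_b`, `member`, `both`, `either`) to show this.

```rego
package sets_example

s := {"a", "b", "c"}

# Membership test with the `in` keyword.
has_b if "b" in s

# Referring into a set with one of its members returns that member.
member := s["b"]

# Intersection and union of sets.
both := s & {"b", "z"}

either := s | {"z"}
```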

Sets share their curly-brace syntax with objects. Because `{}` defines an empty
object, an empty set has to be constructed with a different syntax, `set()`:

```rego
package sets

empty := count(set())
not_empty :=  count({1, 2, 3})
empty_object := count({})
not_equal := {} == {e| some e in []}
```

<RunSnippet command="data.sets"/>

:::warning
The [built-in function](#built-in-functions) `count({})` will still return `0` because `{}` is an empty object. However,
since `{}` is not a set, it will not equal `set()` or something that evaluates
to an empty set.
:::

## Variables

Variables are another kind of term in Rego. They appear in both the head and body of rules.

Variables appearing in the head of a rule can be thought of as input and output of the rule. Unlike many programming languages, where a variable is either an input or an output, in Rego a variable is simultaneously an input and an output. If a query supplies a value for a variable, that variable is an input, and if the query does not supply a value for a variable, that variable is an output.

For example:

```rego
package variables

sites := [
    {"name": "prod"},
    {"name": "smoke1"},
    {"name": "dev"}
]

# name is a var in the head and body
q contains name if {
    # site is a var only used in the body
    some site in sites
    name := site.name
}
```

<RunSnippet id="vars.rego" command="data.variables.q"/>

In the following example, we evaluate `q` with a variable `x` (which is not bound to a value). As a result, the query returns all of the values for `x` and all of the values for `q[x]`, which are always the same because `q` is a set.

```rego
package variables

result := { x | q[x] }
```

<RunSnippet files="#vars.rego" command="data.variables.result"/>

On the other hand, if we evaluate `q` with an input value for `name` we can determine whether `name` exists in the document defined by `q`:

```rego
package variables

result := q["dev"]
```

<RunSnippet files="#vars.rego" command="data.variables.result"/>

Variables appearing in the head of a rule must also appear in a non-negated equality expression within the same rule. This property ensures that if the rule is evaluated and all of the expressions evaluate to true for some set of variable bindings, the variable in the head of the rule will be defined.

## References

References are used to access nested documents.

<details>

<summary>The examples that follow use some data defined in `data.example.*` here</summary>

```rego
package example

sites := [
    {
        "region": "east",
        "name": "prod",
        "servers": [
            {
                "name": "web-0",
                "hostname": "hydrogen"
            },
            {
                "name": "web-1",
                "hostname": "helium"
            },
            {
                "name": "db-0",
                "hostname": "lithium"
            }
        ]
    },
    {
        "region": "west",
        "name": "smoke",
        "servers": [
            {
                "name": "web-1000",
                "hostname": "beryllium"
            },
            {
                "name": "web-1001",
                "hostname": "boron"
            },
            {
                "name": "db-1000",
                "hostname": "carbon"
            }
        ]
    },
    {
        "region": "west",
        "name": "dev",
        "servers": [
            {
                "name": "web-dev",
                "hostname": "nitrogen"
            },
            {
                "name": "db-dev",
                "hostname": "oxygen"
            }
        ]
    }
]

apps := [
    {
        "name": "web",
        "servers": ["web-0", "web-1", "web-1000", "web-1001", "web-dev"]
    },
    {
        "name": "mysql",
        "servers": ["db-0", "db-1000"]
    },
    {
        "name": "mongodb",
        "servers": ["db-dev"]
    }
]

containers := [
    {
        "image": "redis",
        "ipaddress": "10.0.0.1",
        "name": "big_stallman"
    },
    {
        "image": "nginx",
        "ipaddress": "10.0.0.2",
        "name": "cranky_euclid"
    }
]
```

<RunSnippet id="example_data.rego"/>

</details>

The simplest reference contains no variables. For example, the following reference returns the hostname of the second server in the first site document from our example data:

```rego
package references

import data.example.sites

result := sites[0].servers[1].hostname
```

<RunSnippet files="#example_data.rego" command="data.references.result"/>

References are typically written using the “dot-access” style. The canonical form does away with `.` and closely resembles dictionary lookup in a language such as Python:

```rego
package references

import data.example.sites

result := sites[0]["servers"][1]["hostname"]
```

<RunSnippet files="#example_data.rego" command="data.references.result"/>

Both forms are valid; however, the dot-access style is typically more readable. Note that there are four cases where brackets must be used:

1. String keys containing characters other than `[a-z]`, `[A-Z]`, `[0-9]`, or `_` (underscore).
2. Non-string keys such as numbers, booleans, and null.
3. Variable keys which are described later.
4. Composite keys which are described later.

The prefix of a reference identifies the root document for that reference. In
the example above this is `sites`. The root document may be:

- a local variable inside a rule.
- a rule inside the same package.
- a document stored in OPA.
- a document temporarily provided to OPA as part of a transaction.
- an array, object or set, e.g. `[1, 2, 3][0]`.
- a function call, e.g. `split("a.b.c", ".")[1]`.
- a [comprehension](#comprehensions).
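Several of these root document kinds can be sketched in one self-contained module. The package and rule names below are hypothetical and exist only for this illustration.

```rego
package ref_roots

# Root is an array literal.
from_array := [1, 2, 3][0]

# Root is a function call.
from_call := split("a.b.c", ".")[1]

# Root is an array comprehension.
from_comprehension := [x | some x in [10, 20, 30]; x > 10][0]

# Root is a local variable inside a rule.
from_local := v if {
    point := {"x": 1, "y": 2}
    v := point.y
}
```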

### Variable Keys

References can include variables as keys. References written this way are used to select a value from every element in a collection.

The following reference will select the hostnames of all the servers in our
example data:

```rego
package references

import data.example.sites

result := {h| h := sites[i].servers[j].hostname}
```

<RunSnippet files="#example_data.rego" command="data.references.result"/>

Conceptually, this is the same as the following imperative code:

```python
def hostnames(sites):
    result = set()

    for site in sites:
        for server in site["servers"]:
            result.add(server["hostname"])

    return result
```

In the reference above, we effectively used variables named `i` and `j` to iterate over the collections. If the variables are unused outside the reference, we prefer to replace them with an underscore (`_`) character. The reference above can be rewritten as:

```rego
sites[_].servers[_].hostname
```

The underscore is special because it cannot be referred to by other parts of the rule, e.g., the other side of the expression, another expression, etc. The underscore can be thought of as a special iterator. Each time an underscore is specified, a new iterator is instantiated.

:::info
Under the hood, OPA translates the `_` character to a unique variable name that does not conflict with variables and rules that are in scope.
:::
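The difference between independent underscores and a shared named variable can be seen in a small sketch. The names `xs`, `all_pairs`, and `same_index` are hypothetical.

```rego
package wildcard_example

xs := [1, 2]

# Each `_` is a separate iterator, so every element is paired with every element.
all_pairs := {[a, b] | a := xs[_]; b := xs[_]}

# `i` is one variable, so both references must use the same index.
same_index := {[a, b] | some i; a := xs[i]; b := xs[i]}
```

With this data, `all_pairs` contains four pairs while `same_index` contains only `[1, 1]` and `[2, 2]`.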

### Composite Keys

References can include [composite values](#composite-values) as keys if the key is being used to refer into a set. Composite keys may not be used in refs
for base data documents; they are only valid for references into virtual documents.

This is useful for checking for the presence of composite values within a set, or extracting all values within a set matching some pattern.
For example:

```rego
package composite_key

s := {[1, 2], [1, 4], [2, 6]}

result := {
    "exists": {e | e := s[[1, 2]]},
    "matching": {e | e := s[[1, _]]}
}
```

<RunSnippet command="data.composite_key.result"/>

### Multiple Expressions

Rules are often written in terms of multiple expressions that contain references to documents. In the following example, the rule defines a set of arrays where each array contains an application name and a hostname of a server where the application is deployed.

```rego
package multiple_exprs

import data.example.apps
import data.example.sites

apps_and_hostnames contains [name, hostname] if {
	some i, j, k
	name := apps[i].name
	server := apps[i].servers[_]
	sites[j].servers[k].name == server
	hostname := sites[j].servers[k].hostname
}
```

<RunSnippet files="#example_data.rego" command="data.multiple_exprs.apps_and_hostnames"/>

Don't worry about understanding everything in this example right now. There are just two important points:

1. Several variables appear more than once in the body. When a variable is used in multiple locations, OPA will only produce documents for the rule with the variable bound to the same value in all expressions.
2. The rule is joining the `apps` and `sites` documents implicitly. In Rego (and other languages based on Datalog), joins are implicit.
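To isolate these two points, here is a smaller join over hypothetical data (`users`, `teams`) that does not depend on the example documents. It is a sketch for illustration only.

```rego
package join_example

users := [
    {"name": "alice", "team": "infra"},
    {"name": "bob", "team": "web"},
]

teams := [
    {"id": "infra", "oncall": true},
    {"id": "web", "oncall": false},
]

# `user` and `team` each appear in several expressions. A result is only
# produced for bindings that satisfy all of the expressions together.
oncall_users contains user.name if {
    some user in users
    some team in teams
    team.id == user.team
    team.oncall
}
```

Only `"alice"` ends up in `oncall_users`, because `bob`'s team has `oncall` set to `false`.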

### Self-Joins

Using a different key on the same array or object provides the equivalent of self-join in SQL. For example, the following rule defines a document containing apps deployed on the same site as `"mysql"`:

```rego
package multiple_exprs

import data.example.apps
import data.example.sites

same_site contains apps[k].name if {
	some i, j, k
	apps[i].name == "mysql"

	server := apps[i].servers[_]
	server == sites[j].servers[_].name

	other_server := sites[j].servers[_].name
	server != other_server

	other_server == apps[k].servers[_]
}
```

<RunSnippet files="#example_data.rego" command="data.multiple_exprs.same_site"/>

## Comprehensions

Comprehensions provide a concise way of building [composite values](#composite-values) from sub-queries.

Like [rules](#rules), comprehensions consist of a head and a body. The body of a comprehension can be understood in exactly the same way as the body of a rule, that is, one or more expressions that must all be true in order for the overall body to be true. When the body evaluates to true, the head of the comprehension is evaluated to produce an element in the result.

The body of a comprehension is able to refer to variables defined in the outer body. For example:

```rego
package comprehensions

import data.example.apps
import data.example.sites

region := "west"
names := [name | sites[i].region == region; name := sites[i].name]
```

<RunSnippet files="#example_data.rego" command="data.comprehensions.names"/>

In the above query, the second expression contains an [array comprehension](#array-comprehensions) that refers to the `region` variable. The region variable will be bound in the outer body.

> When a comprehension refers to a variable in an outer body, OPA will reorder expressions in the outer body so that variables referred to in the comprehension are bound by the time the comprehension is evaluated.

Comprehensions are similar to the same constructs found in other languages like Python. For example, we could write the above comprehension in Python as follows:

```python
# Python equivalent of Rego comprehension shown above.
names = [site["name"] for site in sites if site["region"] == "west"]
```

Comprehensions are often used to group elements by some key. A common use case for comprehensions is to assist in computing aggregate values (e.g., the number of containers running on a host).
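As a sketch of that aggregate use case, the module below counts containers per host. The data and names (`containers`, `hosts`, `containers_per_host`) are hypothetical and separate from the example data used elsewhere in this document.

```rego
package aggregate_example

containers := [
    {"name": "c1", "host": "h1"},
    {"name": "c2", "host": "h1"},
    {"name": "c3", "host": "h2"},
]

# The distinct hosts that appear in the data.
hosts := {c.host | some c in containers}

# For each host, count the containers whose `host` field matches.
containers_per_host := {host: n |
    some host in hosts
    n := count([c | some c in containers; c.host == host])
}
```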

### Array Comprehensions

Array comprehensions build array values out of sub-queries. Array comprehensions have the form:

```
[ <term> | <body> ]
```

For example, the following rule defines an object where the keys are application names and the values are hostnames of servers where the application is deployed. The hostnames of servers are represented as an array.

```rego
package comprehensions

import data.example.apps
import data.example.sites

app_to_hostnames[app_name] := hostnames if {
    app := apps[_]
    app_name := app.name
    hostnames := [hostname | name := app.servers[_]
                            s := sites[_].servers[_]
                            s.name == name
                            hostname := s.hostname]
}
```

<RunSnippet files="#example_data.rego" command="data.comprehensions.app_to_hostnames"/>

### Object Comprehensions

Object comprehensions build object values out of sub-queries. Object comprehensions have the form:

```
{ <key>: <term> | <body> }
```

We can use object comprehensions to write the rule from above as a comprehension instead:

```rego
package comprehensions

import data.example.apps
import data.example.sites

app_to_hostnames := {app.name: hostnames |
    app := apps[_]
    hostnames := [hostname |
                    name := app.servers[_]
                    s := sites[_].servers[_]
                    s.name == name
                    hostname := s.hostname]
}
```

<RunSnippet files="#example_data.rego" command="data.comprehensions.app_to_hostnames"/>

Object comprehensions are not allowed to have conflicting entries, similar to rules:

```rego
package comprehensions

conflicting := { "foo": i |
    some i in [1, 2]
}
```

<RunSnippet command="data.comprehensions.conflicting"/>

### Set Comprehensions

Set comprehensions build set values out of sub-queries. Set comprehensions have
the following form, where terms are selected from the body to be set members:

```
{ <term> | <body> }
```

For example, to construct a set from an array, we can use `e` where `e` is an
element in the array:

```rego
package comprehensions

my_array := [1, 1, 2, 2, 3, 3]
my_set := {e | some e in my_array}
```

<RunSnippet command="data.comprehensions.my_set"/>

## Rules

Rules define the content of [virtual documents](./philosophy#how-does-opa-work) in
OPA. When OPA evaluates a rule, we say OPA _generates_ the content of the
document that is defined by the rule.

The sample code in this section makes use of the data defined in [References](#references).

### Generating Sets

The following rule defines a set containing the hostnames of all servers in the
example data:

```rego
package sets

import data.example.sites

hostnames contains name if {
    name := sites[_].servers[_].hostname
}
```

<RunSnippet files="#example_data.rego" command="data.sets.hostnames"/>

When we query for the content of our new `hostnames` rule we see the same data
as we would if we queried using the `sites[_].servers[_].hostname` reference
directly.

This example introduces a few important aspects of Rego.

First, the rule defines a set document where the contents are defined by the
variable `name`. We know this rule defines a set document because the head only
includes a key. All rules have the following form (where key, value, and body
are all optional):

```
<name> <key>? <value>? <body>?
```

:::tip
If the value had been set, this would create an object instead.

For a more formal definition of the rule syntax, see the [Policy Reference](./policy-reference/#grammar) document.
:::

Second, the `sites[_].servers[_].hostname` fragment selects the `hostname`
attribute from all the objects in the `servers` collection. From reading the
fragment in isolation we cannot tell whether the fragment refers to arrays or
objects. We only know that it refers to a collection of values.

Third, the `name := sites[_].servers[_].hostname` expression binds the value of the `hostname` attribute to the variable `name`, which is also declared in the head of the rule.
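The optional parts of the rule form described in this section can be combined in several ways. The sketch below shows one rule per combination; all names (`limit`, `small`, `evens`, `squares`) are hypothetical.

```rego
package rule_forms

# <name> <value>: a complete rule with no body.
limit := 10

# <name> <body>: the value defaults to true.
small if 3 < limit

# <name> <key> <body>: a rule that defines a set.
evens contains n if {
    some n in [1, 2, 3, 4]
    n % 2 == 0
}

# <name> <key> <value> <body>: a rule that defines an object.
squares[n] := sq if {
    some n in [1, 2, 3]
    sq := n * n
}
```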

### Generating Objects

Rules that define objects are very similar to rules that define sets. Note that
object rules have a key and a value in the head of the rule.

```rego
package objects

import data.example.apps
import data.example.sites

apps_by_hostname[hostname] := app if {
    some i
    server := sites[_].servers[_]
    hostname := server.hostname
    apps[i].servers[_] == server.name
    app := apps[i].name
}
```

<RunSnippet files="#example_data.rego" command="data.objects.apps_by_hostname"/>

The rule above defines an object that maps hostnames to app names. The main difference between this rule and one which defines a set is the rule head: in addition to declaring a key, the rule head also declares a value for the document.

### Incremental Definitions

A rule may be defined multiple times with the same name. When a rule is defined
this way, we refer to the rule definition as _incremental_ because each
definition is additive. The document produced by incrementally defined rules is
the union of the documents produced by each individual rule.

An incrementally defined rule can be intuitively understood as `<rule-1> OR <rule-2> OR ... OR <rule-N>`.

For example, we can write a rule that abstracts over our `servers` and
`containers` data as `instances`:

```rego
package incremental

import data.example.sites
import data.example.containers

instances contains instance if {
    server := sites[_].servers[_]
    instance := {"address": server.hostname, "name": server.name}
}

instances contains instance if {
    some container in containers
    instance := {"address": container.ipaddress, "name": container.name}
}
```

<RunSnippet files="#example_data.rego" command="data.incremental.instances"/>
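The same OR reading applies when several definitions of a rule each produce `true`. The following is a minimal sketch with a hypothetical `request` document and `allow` rule.

```rego
package incremental_or

request := {"method": "GET", "user": "alice"}

# `allow` is true if the first definition OR the second definition holds.
allow if request.method == "GET"

allow if request.user == "admin"
```

Here `allow` is `true` because the first definition holds, even though the second does not.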

### Complete Definitions

In addition to rules that _partially_ define sets and objects, Rego also
supports so-called _complete_ definitions of any type of document. Rules provide
a complete definition by omitting the key in the head. Complete definitions are
commonly used for constants:

```rego
pi := 3.14159
```

:::info
Rego allows authors to omit the body of rules. If the body is omitted, it defaults to true.
:::

Documents produced by rules with complete definitions can only have one value at
a time. If evaluation produces multiple values for the same document, an error
will be returned.

For example:

```rego showLineNumbers=true
package complete

# Define user "bob" for test input.
user := "bob"

# Define two sets of users: power users and restricted users. Accidentally
# include "bob" in both.
power_users := {"alice", "bob", "fred"}
restricted_users := {"bob", "kim"}

# Power users get 32GB memory.
max_memory := 32 if power_users[user]

# Restricted users get 4GB memory.
max_memory := 4 if restricted_users[user]
```

<RunSnippet id="complete.rego" command="data.complete"/>

OPA returns an error in this case because the rule definitions are in _conflict_.
The value produced by `max_memory` cannot be 32 and 4 **at the same time**.

The documents produced by rules with complete definitions may still be undefined:

```rego
package undefined

import data.complete.max_memory

result := m if {
    m := max_memory with data.complete.user as "johnson"
}
```

<RunSnippet files="#complete.rego" command="data.undefined.result"/>

In some cases, having an undefined result for a document is not desirable. In
those cases, policies can use the [`default` keyword](#default-keyword) to
provide a fallback value.
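A minimal sketch of such a fallback is shown here. The package name and the fallback value of `1` are assumptions made for this illustration.

```rego
package default_example

user := "johnson"

power_users := {"alice", "fred"}

# Used whenever no other definition of `max_memory` produces a value.
default max_memory := 1

# Power users get 32GB memory.
max_memory := 32 if user in power_users
```

Since `"johnson"` is not a power user, `max_memory` evaluates to `1` instead of being undefined.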

### Rule Heads containing References

As a shorthand for defining nested rule structures, it's valid to use references as rule heads.
This module defines _two complete rules_, `data.rule_refs.fruit.apple.seeds` and `data.rule_refs.fruit.orange.color`:

```rego
package rule_refs

fruit.apple.seeds := 12

fruit.orange.color := "orange"
```

<RunSnippet command="data.rule_refs"/>
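To see what the shorthand produces, the sketch below compares rules with reference heads against the same document written as one object literal. The package name `rule_refs_example` and the rules `fruit_literal` and `same_document` are hypothetical.

```rego
package rule_refs_example

fruit.apple.seeds := 12

fruit.orange.color := "orange"

# The same nested document written as a single complete rule.
fruit_literal := {"apple": {"seeds": 12}, "orange": {"color": "orange"}}

# Compares the document generated by the two rules above with the literal.
same_document := data.rule_refs_example.fruit == fruit_literal
```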

#### Variables in Rule Head References

Any term, except the very first, in a rule head's reference can be a variable.
These variables can be assigned within the rule, just as for any other partial
rule, to dynamically construct a nested collection of objects.

```json title="input.json"
{
  "users": [
    {
      "id": "alice",
      "role": "employee",
      "country": "USA"
    },
    {
      "id": "bob",
      "role": "customer",
      "country": "USA"
    },
    {
      "id": "dora",
      "role": "admin",
      "country": "Sweden"
    }
  ],
  "admins": [
    {
      "id": "charlie"
    }
  ]
}
```

<RunSnippet id="input.user-admin.json"/>

```rego
package roles

# A partial object rule that converts a list of users to a mapping by "role" and then "id".
users_by_role[role][id] := user if {
	some user in input.users
	id := user.id
	role := user.role
}

# Partial rule with an explicit "admin" key override
users_by_role.admin[id] := user if {
	some user in input.admins
	id := user.id
}

# Leaf entries can be partial sets
users_by_country[country] contains user.id if {
	some user in input.users
	country := user.country
}
```

<RunSnippet files="#input.user-admin.json" command="data.roles"/>

##### Conflicts

The first variable declared in a rule head's reference divides the reference into
a leading constant portion and a trailing dynamic portion. Other rules are
allowed to overlap with the dynamic portion (dynamic extent) without causing a
compile-time conflict.

```rego showLineNumbers=true
package example

# R1
p[x].r := y if {
	x := "q"
	y := 1
}

# R2
p.q.r := 2
```

<RunSnippet command="data.example"/>

In the above example, rule `R2` overlaps with the dynamic portion of rule `R1`'s
reference (`[x].r`), which is allowed at compile-time, as these rules aren't
guaranteed to produce conflicting output.
However, as `R1` defines `x` as `"q"` and `y` as `1`, a conflict will be
reported at evaluation-time.

Conflicts are detected at compile-time, where possible, between rules even if
they are within the dynamic extent of another rule.

```rego showLineNumbers=true
package example

# R1
p[x].r := y if {
	x := "foo"
	y := 1
}

# R2
p.q.r := 2

# R3
p.q.r.s := 3
```

<RunSnippet command="data.example"/>

Above, `R2` and `R3` are within the dynamic extent of `R1`, but are in conflict
with each other, which is detected at compile-time (note the `rego_type_error`,
rather than `eval_conflict_error` seen above).

Rules are also not allowed to overlap with object values of other rules:

```rego showLineNumbers=true
package example

# R1
p.q.r := {"s": 1}

# R2
p[x].r.t := 2 if {
	x := "q"
}
```

<RunSnippet command="data.example"/>

In the above example, `R1` is within the dynamic extent of `R2` and a conflict
cannot be detected at compile-time. However, at evaluation-time `R2` will
attempt to inject a value under key `t` in an object value defined by `R1`. This
is a conflict, as rules are not allowed to modify or replace values defined by
other rules.
We won't get a conflict if we update the policy to the following:

```rego
package example

# R1
p.q.r.s := 1

# R2
p[x].r.t := 2 if {
	x := "q"
}
```

<RunSnippet command="data.example"/>

There is no conflict here because `R1` now defines a value within the dynamic extent of `R2`'s reference, which is allowed.

### Functions

Rego supports user-defined functions that can be called with the same semantics as [built-in functions](#built-in-functions). They have access to both [the data document](./philosophy/#the-opa-document-model) and [the input document](./philosophy/#the-opa-document-model).

For example, the following function will return the result of trimming the spaces from a string and then splitting it by periods.

```rego
package functions

trim_and_split(s) := x if {
     t := trim(s, " ")
     x := split(t, ".")
}

result := trim_and_split("   foo.bar ")
```

<RunSnippet command="data.functions.result"/>

Functions may have an arbitrary number of inputs, but exactly one output. Function arguments may be any kind of term. For example, suppose we have the following function:

```rego
package functions

foo([x, {"bar": y}]) := z if {
    z := {x: y}
}
```

The following calls would produce the logical mappings given:

| Call                                                  | `x`    | `y`                         |
| ----------------------------------------------------- | ------ | --------------------------- |
| `z := foo(a)`                                         | `a[0]` | `a[1].bar`                  |
| `z := foo(["5", {"bar": "hello"}])`                   | `"5"`  | `"hello"`                   |
| `z := foo(["5", {"bar": [1, 2, 3, ["foo", "bar"]]}])` | `"5"`  | `[1, 2, 3, ["foo", "bar"]]` |
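
As a rough sketch of the pattern matching described in the table, the module below calls `foo` once with a literal argument and once with a value taken from `input`. The package name `functions_patterns` and the `input.pair` field are assumptions made for this illustration.

```rego
package functions_patterns

# Same function as above: the single argument must be a two-element array
# whose second element is an object with the key "bar".
foo([x, {"bar": y}]) := z if {
    z := {x: y}
}

# Literal argument: x is bound to "5" and y to "hello".
from_literal := foo(["5", {"bar": "hello"}])

# Argument taken from input (hypothetical field): the call is undefined
# when the value does not have the shape the function expects.
from_input := foo(input.pair)
```

With these definitions, `from_literal` evaluates to `{"5": "hello"}`, and `from_input` is only defined when `input.pair` matches the argument pattern.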

If you need multiple outputs, write your functions so that the output is an array, object or set
containing your results. If the output term is omitted, it is equivalent to having the output term
be the literal `true`. Furthermore, `if` can be used to write shorter definitions. That is, the
function declarations below are equivalent:

```rego
package functions

f(x) if { x == "foo" }
f(x) if x == "foo"

f(x) := true if { x == "foo" }
f(x) := true if x == "foo"
```
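
The advice above about returning multiple outputs can be sketched as follows. The package name `functions_multi` and the helper `div_mod` are hypothetical names chosen for this illustration; the function wraps its two results in a single object.

```rego
package functions_multi

# Hypothetical helper: returns two results (quotient and remainder)
# wrapped in one object, since a function has exactly one output.
div_mod(a, b) := {"quotient": q, "remainder": r} if {
    b != 0
    r := a % b
    q := (a - r) / b
}

result := div_mod(17, 5)
```

Here `result` evaluates to `{"quotient": 3, "remainder": 2}`, and callers can pick out the field they need.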

The outputs of user functions have some additional limitations, namely that they must resolve to a single value. If you write a function that has multiple possible bindings for an output variable, you will get a conflict error:

```rego showLineNumbers=true
package functions

p(x) := y if {
    y := x[_]
}

result := p([1, 2, 3])
```

<RunSnippet command="data.functions.result"/>

Functions can be defined incrementally. Defining a function more than once lets you select, based on the arguments, which definition is evaluated:

```rego
package incremental

q("single", x) := y if {
    y := x
}

q("double", x) := y if {
    y := x*2
}
```

<RunSnippet id="incremental.rego"/>

```rego
package incremental

result := q("single", 2)
```

<RunSnippet files="#incremental.rego" command="data.incremental.result"/>

```rego
package incremental

result := q("double", 2)
```

<RunSnippet files="#incremental.rego" command="data.incremental.result"/>

A given function call will execute all functions that match the signature given. If a call matches multiple functions, they must produce the same output, or else a conflict error will occur:

```rego showLineNumbers=true
package incremental

r(1, x) := y if {
    y := x
}

r(x, 2) := y if {
    y := x*4
}

result := r(1, 2)
```

<RunSnippet command="data.incremental.result"/>

On the other hand, if a call matches no functions, then the result is undefined.

```rego
package incremental

s(x, 2) := y if {
    y := x * 4
}

result := s(5, 3)
```

<RunSnippet command="data.incremental.result"/>

#### Function overloading

Rego does not support the overloading of functions by the number of
parameters. If two function definitions are given with the same function name
but different numbers of parameters, a compile-time type error is generated.

```rego showLineNumbers=true
package function_overloading_error

r(x) := result if {
    result := 2*x
}

r(x, y) := result if {
    result := 2*x + 3*y
}
```

<RunSnippet command="data.function_overloading_error"/>

In the unusual case that it is critical to use the same name, the function could
be made to take the list of parameters as a single array. However, this approach
is not generally recommended because it sacrifices some helpful compile-time
checking and can be quite error-prone.

```rego
package function_overloading_array

r(params) := result if {
    count(params) == 1
    result := 2*params[0]
}

r(params) := result if {
    count(params) == 2
    result := 2*params[0] + 3*params[1]
}

result := [r([10]), r([10, 1])]
```

<RunSnippet command="data.function_overloading_array.result"/>

## Negation

To generate the content of a [virtual document](./philosophy#how-does-opa-work), OPA attempts to bind variables in the body of the rule such that all expressions in the rule evaluate to `true`.

This generates the correct result when the expressions represent assertions about what states should exist in the data stored in OPA. In some cases, you want to express that certain states _should not_ exist in the data stored in OPA. In these cases, negation must be used.

For safety, a variable appearing in a negated expression must also appear in another non-negated equality expression in the rule.

> OPA will reorder expressions to ensure that negated expressions are evaluated after other non-negated expressions with the same variables. OPA will reject rules containing negated expressions that do not meet the safety criteria described above.
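
The safety requirement can be illustrated with a minimal sketch. The package name `negation_safety` and the `blocked` set are assumptions made for this example. In the first rule, `user` is bound by a non-negated expression, so the negated expression is safe. The commented-out rule has no such binding, so it would be expected to fail compilation with an unsafe variable error.

```rego
package negation_safety

# Hypothetical data for this sketch.
blocked := {"mallory", "eve"}

# Safe: `user` is bound by the non-negated `some ... in` expression,
# so it can appear in the negated expression.
allowed_users contains user if {
    some user in ["alice", "eve", "bob"]
    not blocked[user]
}

# Unsafe: `user` appears only in a negated expression, so nothing binds it.
# Uncommenting this rule is expected to make the compiler reject the module.
#
# unsafe_users contains user if {
#     not blocked[user]
# }
```

In this sketch, `allowed_users` evaluates to the set containing `"alice"` and `"bob"`.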

The simplest use of negation involves only scalar values or variables and is equivalent to complementing the operator:

```rego
package negation

t if {
    greeting := "hello"
    not greeting == "goodbye"
}
```

<RunSnippet command="data.negation.t"/>

Negation is required to check whether some value _does not_ exist in a collection: `not p["foo"]`. This is not the same as complementing the `==` operator in the expression `p[_] == "foo"`, which yields `p[_] != "foo"`.
The complemented expression is true if any item in `p` is not `"foo"`. See more details [here](/projects/regal/rules/bugs/not-equals-in-loop).
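
A minimal sketch of the difference, using a hypothetical package `negation_membership` and a small set `p` invented for this illustration:

```rego
package negation_membership

# Hypothetical collection for this sketch.
p := {"foo", "bar"}

# Defined only when "foo" is absent from p.
# "foo" is present here, so this rule is undefined.
foo_missing if not p["foo"]

# Defined when SOME element of p differs from "foo".
# "bar" differs, so this rule is true even though "foo" is in p.
some_item_not_foo if p[_] != "foo"
```

The two rules answer different questions: `foo_missing` is undefined while `some_item_not_foo` is `true` for the same data.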

For example, we can write a rule that defines a document containing names of
apps not deployed on the `"prod"` site:

```rego
package negation

import data.example.apps
import data.example.sites

prod_servers contains name if {
    some site in sites
    site.name == "prod"
    some server in site.servers
    name := server.name
}

apps_in_prod contains name if {
    some app in apps
    name := app.name
    some server in app.servers
    prod_servers[server]
}

# Click evaluate to see the result
apps_not_in_prod contains name if {
    some app in apps
    name := app.name
    not apps_in_prod[name]
}
```

<RunSnippet files="#example_data.rego" command="data.negation.apps_not_in_prod"/>

:::info
Logical OR/AND in Rego is structured differently from other languages you might
be familiar with. See the notes on [logical OR](../docs/#logical-or) and
[logical AND](../docs/#basic-syntax) for more details.
:::

:::tip
Have a look at the other examples for
[`not`](./policy-reference/keywords/not) in the examples section to learn more
about using this keyword.
:::

## Universal Quantification (FOR ALL)

Rego allows for several ways to express universal quantification.

For example, imagine you want to express a policy that says in natural language:

```
There must be no apps named "bitcoin-miner".
```

The most expressive way to state this in Rego is using the [`every` keyword](#every-keyword):

```rego
no_bitcoin_miners_using_every if {
    every app in apps {
        app.name != "bitcoin-miner"
    }
}
```

Variables in Rego are _existentially quantified_ by default. When you write

```rego
array := ["one", "two", "three"]
array[i] == "three"
```

the query will be satisfied **if there is an `i`** such that the query's
expressions are simultaneously satisfied.

Therefore, there are other ways to express the desired policy.

For this policy, you can also define a rule that finds if there exists a bitcoin-mining
app (which is easy using the [`some` keyword](#some-keyword)). And then you use negation to check
that there is NO bitcoin-mining app. Technically, you're using a [negation](#negation) and
an [existential quantifier](#in-keyword), which is logically the same as a universal
quantifier.

For example:

```rego
package negation

import data.example.apps

no_bitcoin_miners_using_negation if not any_bitcoin_miners

any_bitcoin_miners if {
    some app in apps
    app.name == "bitcoin-miner"
}
```

<RunSnippet id="bitcoin.rego"/>

```rego
package negation

result := true if {
    no_bitcoin_miners_using_negation
        with data.example.apps as [{"name": "web"}]
}
```

<RunSnippet files="#bitcoin.rego" command="data.negation.result"/>

```rego
package negation

result := true if {
    no_bitcoin_miners_using_negation
        with data.example.apps as [{"name": "bitcoin-miner"}, {"name": "web"}]
}
```

<RunSnippet files="#bitcoin.rego" command="data.negation.result"/>

:::info
The `undefined` result above is expected because we did not define a default
value for `no_bitcoin_miners_using_negation`. Since the body of the rule fails
to match, there is no value generated.
:::

A common mistake is to try encoding the policy with a rule named `no_bitcoin_miners`
like so:

```rego
no_bitcoin_miners if {
    app := apps[_]
    app.name != "bitcoin-miner" # THIS IS NOT CORRECT.
}
```

It becomes clear that this is incorrect when you use the [`some`](#some-keyword)
keyword, because the rule is true whenever there is SOME app that is not a
bitcoin-miner:

```rego
no_bitcoin_miners if {
    some app in apps
    app.name != "bitcoin-miner" # THIS IS NOT CORRECT.
}
```

The reason the rule is incorrect is that variables in Rego are _existentially
quantified_. This means that rule bodies and queries express FOR ANY and not FOR
ALL. To express FOR ALL in Rego complement the logic in the rule body (e.g.,
`!=` becomes `==`) and then complement the check using negation (e.g.,
`no_bitcoin_miners` becomes `not any_bitcoin_miners`).

Alternatively, we can implement the same kind of logic inside a single rule
using [comprehensions](#comprehensions).

```rego
no_bitcoin_miners_using_comprehension if {
    bitcoin_miners := {app | some app in apps; app.name == "bitcoin-miner"}
    count(bitcoin_miners) == 0
}
```

:::info
Whether you use negation, comprehensions, or `every` to express FOR ALL is up to you.
The [`every` keyword](#every-keyword) should lend itself nicely to a rule formulation that closely
follows how requirements are stated, and thus enhances your policy's readability.

The comprehension version is more concise than the negation variant and does not
require a helper rule. The negation version is more verbose, but it is a bit simpler
and allows for more complex ORs.
:::

:::tip
Have a look at the other examples for
[`some`](./policy-reference/keywords/some) and
[`every`](./policy-reference/keywords/every) in the examples section.
:::

## Modules

In Rego, policies are defined inside _modules_. Modules consist of:

- Exactly one [package](#packages) declaration.
- Zero or more [import](#imports) statements.
- Zero or more [rule](#rules) definitions.

Modules are typically represented in Unicode text and encoded in UTF-8.

### Comments

Comments begin with the `#` character and continue until the end of the line.

### Packages

Packages group the rules defined in one or more modules into a particular namespace. Because rules are namespaced they can be safely shared across projects.

Modules contributing to the same package do not have to be located in the same directory.

The rules defined in a module are automatically exported. That is, they can be queried under OPA’s [Data API](./rest-api#data-api) provided the appropriate package is given. For example, given the following module:

```rego
package opa.examples

pi := 3.14159
```

The `pi` document can be queried via the Data API:

```http
GET https://example.com/v1/data/opa/examples/pi HTTP/1.1
```

Valid package names are variables or references that only contain string operands. For example, these are all valid package names:

```rego
package foo
package foo.bar
package foo.bar.baz
package foo["bar.baz"].qux
```

These are invalid package names:

```rego
package 1foo        # not a variable
package foo[1].bar  # contains non-string operand
```

For more details see the language [grammar](./policy-reference/#grammar).

### Imports

Import statements declare dependencies that modules have on documents defined outside the package. By importing a
document, the identifiers exported by that document can be referenced within the current module.

All modules contain implicit statements which import the `data` and `input` documents.

Modules use the same syntax to declare dependencies on [base and virtual documents](./philosophy#how-does-opa-work).

For example, the following document can be imported and used as follows:

```rego
package example

servers := [
    {
        "id": "app",
        "protocols": ["https", "ssh"]
    },
    {
        "id": "db",
        "protocols": ["mysql"]
    },
    {
        "id": "ci",
        "protocols": ["http"]
    }
]
```

```rego
package opa.examples

import data.example.servers

http_servers contains server if {
    some server in servers
    "http" in server.protocols
}
```

Similarly, modules can declare dependencies on query arguments by specifying an import path that starts with `input`.

```json title="input.json"
{
  "user": "paul",
  "method": "GET"
}
```

```rego
package examples

import input.user
import input.method

# allow alice to perform any operation.
allow if user == "alice"

# allow bob to perform read-only operations.
allow if {
    user == "bob"
    method == "GET"
}

# allows users assigned a "dev" role to perform read-only operations.
allow if {
    method == "GET"
    input.user in data.roles["dev"]
}

# allows user catherine access on Saturday and Sunday
allow if {
    user == "catherine"
    day := time.weekday(time.now_ns())
    day in ["Saturday", "Sunday"]
}
```

<RunSnippet id="imports.rego"/>

Imports can include an optional `as` keyword to resolve namespacing conflicts:

```rego
package opa.examples

import data.example.servers as my_servers

http_servers contains server if {
    some server in my_servers
    "http" in server.protocols
}
```

## In Keyword

The `in` keyword provides more expressive membership checks and existential quantification:

```json title="input.json"
{ "roles": ["denylisted-role", "another-role"] }
```

```rego
deny if {
    some x in input.roles # iteration
    x == "denylisted-role"
}

deny if {
    "denylisted-role" in input.roles # membership check
}
```

See [Membership and iteration: `in`](#membership-and-iteration-in) for details.

## If Keyword

This keyword allows more expressive rule heads:

```json title="input.json"
{
  "token": "secret"
}
```

```rego
deny if input.token != "secret"
```

## Contains Keyword

This keyword allows more expressive rule heads for partial set rules:

```rego
deny contains msg if { msg := "forbidden" }
```
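
As a slightly larger sketch, several rules with the same `contains` head each contribute elements to one set. The package name `contains_example`, the input fields, and the messages below are assumptions made for this illustration.

```rego
package contains_example

# Each rule adds its own message to the `deny` set when its body holds.
deny contains msg if {
    input.user == "guest"
    msg := "guests are not allowed"
}

deny contains msg if {
    input.method != "GET"
    msg := sprintf("method %v is not allowed", [input.method])
}
```

For the input `{"user": "guest", "method": "POST"}`, `deny` contains both messages. For `{"user": "alice", "method": "GET"}`, it is the empty set.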

## Some Keyword

The `some` keyword allows queries to explicitly declare local variables. Use the
`some` keyword in rules that contain unification statements or references with
variable operands **if** variables contained in those statements are not
declared using `:=`.

| Statement                        | Example                          | Variables   |
| -------------------------------- | -------------------------------- | ----------- |
| Unification                      | `input.a = [["b", x], [y, "c"]]` | `x` and `y` |
| Reference with variable operands | `data.foo[i].bar[j]`             | `i` and `j` |

For example, the following rule generates tuples of array indices for servers in
the "west" region that contain "db" in their name. The first element in the
tuple is the site index and the second element is the server index.

```rego
package tuples

import data.example.sites

tuples contains [i, j] if {
    some i, j
    sites[i].region == "west"
    server := sites[i].servers[j] # note: 'server' is local because it's declared with :=
    contains(server.name, "db")
}
```

<RunSnippet files="#example_data.rego" command="data.tuples.tuples"/>

If we query for the tuples we get two results.
Since we have declared `i`, `j`, and `server` to be local, we can introduce
rules in the same package without affecting the result above:

```rego
# Define a rule called 'i', has no impact on the tuples rule
i := 1
```

If we had not declared `i` with the `some` keyword, introducing the `i` rule
above would have changed the result of `tuples` because the `i` symbol in the
body would capture the global value. Try removing `some i, j` and see what happens!

The `some` keyword is not required but it's recommended to avoid situations like
the one above where introduction of a rule inside a package could change
behaviour of other rules.
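
The capture behaviour described above can be sketched in a self-contained module. The package name `some_capture` and the inline `sites` data are assumptions made for this illustration; they are not the example data used elsewhere on this page.

```rego
package some_capture

# Hypothetical data for this sketch.
sites := [{"region": "east"}, {"region": "west"}, {"region": "west"}]

# A rule that happens to share its name with a variable used below.
i := 1

# `some i` declares a local variable, so the rule `i` above is ignored
# and every index is considered.
west_local contains i if {
    some i
    sites[i].region == "west"
}

# Without a declaration, `i` refers to the rule `i := 1`,
# so only index 1 is checked.
west_captured contains i if {
    sites[i].region == "west"
}
```

In this sketch, `west_local` evaluates to `{1, 2}` while `west_captured` evaluates to `{1}`.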

For using the `some` keyword with iteration, see
[the documentation of the `in` operator](#membership-and-iteration-in).

## Every Keyword

```rego
package example

import data.example.sites

names_with_dev if {
    some site in sites
    site.name == "dev"

    every server in site.servers {
        endswith(server.name, "-dev")
    }
}
```

<RunSnippet files="#example_data.rego" command="data.example.names_with_dev"/>

The `every` keyword takes an (optional) key argument, a value argument, a domain, and a
block of further queries, its "body".

The keyword is used to explicitly assert that its body is true for _every element in the domain_.
It will iterate over the domain, bind its variables, and check that the body holds
for those bindings.
If one of the bindings does not yield a successful evaluation of the body, the overall
statement is undefined.

If the domain is empty, the overall statement is true.

Evaluating `every` does **not** introduce new bindings into the rule evaluation.

When used with a key argument, the index (for arrays) or property name (for objects) is
also in scope within the body:

```rego
package example

array_domain if {
    every i, x in [1, 2, 3] { x-i == 1 } # array domain
}

object_domain if {
    every k, v in {"foo": "bar", "fox": "baz" } { # object domain
        startswith(k, "f")
        startswith(v, "b")
    }
}

set_domain if {
    every x in {1, 2, 3} { x != 4 } # set domain
}
```

<RunSnippet command="data.example"/>

Negating `every` is forbidden. If you need to express `not every x in xs { p(x) }`
please use `some x in xs; not p(x)` instead.
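
A minimal sketch of the empty-domain behaviour and of the suggested rewrite for a negated `every`. The package name `every_example` and the helper functions are hypothetical names chosen for this illustration.

```rego
package every_example

# Hypothetical predicate used by both functions below.
is_even(x) if x % 2 == 0

# True when every element is even, and also when xs is empty.
all_even(xs) if {
    every x in xs { is_even(x) }
}

# Stands in for `not every x in xs { is_even(x) }`:
# true when some element is not even.
not_all_even(xs) if {
    some x in xs
    not is_even(x)
}
```

With these definitions, `all_even([])` and `all_even([2, 4])` are true, `all_even([2, 3])` is undefined, and `not_all_even([2, 3])` is true.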

## With Keyword

The `with` keyword allows queries to programmatically specify values nested
under the [input document](./philosophy/#the-opa-document-model) or the
[data document](./philosophy/#the-opa-document-model), or to replace functions, including [built-in functions](#built-in-functions).

For example, given the simple authorization policy in the [imports](#imports)
section, we can write a query that checks whether a particular request would be
allowed:

```rego
package authz

import data.examples.allow

result := true if {
    allow with input as {"user": "alice", "method": "POST"}
}
```

<RunSnippet files="#imports.rego" command="data.authz.result"/>

```rego
package authz

import data.examples.allow

result := true if {
    allow with input as {"user": "bob", "method": "GET"}
}
```

<RunSnippet files="#imports.rego" command="data.authz.result"/>

```rego
package authz

import data.examples.allow

result := true if {
    not allow with input as {"user": "bob", "method": "DELETE"}
}
```

<RunSnippet files="#imports.rego" command="data.authz.result"/>

It's also possible to use `with` multiple times in the same query. In the example
below, the `dev` role allows `GET`, even for a user that our policy does not mention by name.

```rego
package authz

import data.examples.allow

result := true if {
    allow with input as {"user": "charlie", "method": "GET"}
          with data.roles as {"dev": ["charlie"]}
}
```

<RunSnippet files="#imports.rego" command="data.authz.result"/>

The user `catherine` is only allowed access on weekends. The following query uses `with` to
test this functionality by replacing the built-in function `time.weekday` with a fixed value:

```rego
package authz

import data.examples.allow

result := true if {
    allow with input as {"user": "catherine", "method": "GET"}
        with data.roles as {"dev": ["bob"]}
        with time.weekday as "Sunday"
}
```

<RunSnippet files="#imports.rego" command="data.authz.result"/>

The `with` keyword acts as a modifier on expressions. A single expression is
allowed to have zero or more `with` modifiers. The `with` keyword has the
following syntax:

```
<expr> with <target-1> as <value-1> [with <target-2> as <value-2> [...]]
```

The `<target>`s must be references to values in the input document (or the input
document itself) or data document, or references to functions (built-in or not).

:::info
When applied to the `data` document, the `<target>` must not attempt to
partially define virtual documents. For example, given a virtual document at
path `data.foo.bar`, the compiler will generate an error if the policy
attempts to replace `data.foo.bar.baz`.
:::

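The `with` keyword only affects the attached expression. Subsequent expressions
will see the unmodified value. The exception to this rule is when multiple
`with` keywords are in scope, as in the example below, where the modifications are
combined: `inner` sees `input.foo` as `100` (set in `middle`) and `input.bar` as `300`
(set in `outer`), while `b` sees the input supplied by `outer` unchanged.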

```rego
inner := [x, y] if {
    x := input.foo
    y := input.bar
}

middle := [a, b] if {
    a := inner with input.foo as 100
    b := input
}

outer := result if {
    result := middle with input as {"foo": 200, "bar": 300}
}
```

When `<target>` is a reference to a function, like `http.send`, then
its `<value>` can be any of the following:

1. a value: `with http.send as {"body": {"success": true }}`
2. a reference to another function: `with http.send as mock_http_send`
3. a reference to another (possibly custom) built-in function: `with custom_builtin as less_strict_custom_builtin`
4. a reference to a rule that will be used as the _value_.

When the replacement value is a function, its arity needs to match the replaced
function's arity; and the types must be compatible.
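
The first form in the list above, replacing a function with a plain value, can be sketched as follows. The package name `mock_value`, the `healthy` rule, and the URL are assumptions made for this illustration; the URL is a placeholder and is not contacted when `http.send` is replaced.

```rego
package mock_value

# Hypothetical health check that depends on an HTTP call.
healthy if {
    resp := http.send({"method": "GET", "url": "https://example.com/health"})
    resp.body.success == true
}

# Replace http.send with a fixed value, so no request is made
# while evaluating `healthy`.
result := true if {
    healthy with http.send as {"body": {"success": true}}
}
```

Because the replacement is a value rather than a function, every call to `http.send` made while evaluating `healthy` yields that value, regardless of the arguments.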

Replacement functions can call the function they're replacing **without causing
recursion**.
See the following example:

```rego
package mock

f(x) := count(x)

mock_count(x) := 0 if "x" in x
mock_count(x) := count(x) if not "x" in x

result := v if {
    v := f(["x", 2, 3]) with count as mock_count
}
```

<RunSnippet command="data.mock.result"/>

Each replacement function evaluation will start a new scope: it's valid to use
`with <builtin1> as ...` in the body of the replacement function -- for example:

```rego
package mocks

f(x) := count(x) if {
    rule_using_concat with concat as "foo,bar"
}
```

Note that function replacement via `with` does not affect the evaluation of the
function arguments: if running `f(input.x)`, and `input.x` is undefined, the replacement of `concat` does not change the result of the evaluation.

## Default Keyword

The `default` keyword allows policies to define a default value for documents
produced by rules with [complete definitions](#complete-definitions). The
default value is used when all the rules sharing the same name are undefined.

For example:

```rego
package example

default allow := false

allow if {
    input.user == "bob"
    input.method == "GET"
}
```

<RunSnippet command="data.example.allow"/>

If we instead run the same policy with the following input, the rule body matches and `allow` is `true`:

```json
{
  "user": "bob",
  "method": "GET"
}
```

<RunSnippet id="input.bob.json"/>

```rego
package example

default allow := false

allow if {
    input.user == "bob"
    input.method == "GET"
}
```

<RunSnippet files="#input.bob.json" command="data.example.allow"/>

Without the default definition, the `allow` document would simply be undefined whenever the rule body does not match, as in the first query above, which supplied no input.

When the `default` keyword is used, the rule syntax is restricted to:

```rego
default <name> := <term>
```

The term may be any scalar, composite, or comprehension value but it may not be
a variable or reference. If the value is a composite then it may not contain
variables or references. Comprehensions however may, as the result of a
comprehension is never undefined.

Similar to rules, the `default` keyword can be applied to functions as well. For
example:

```rego
default clamp_positive(_) := 0

clamp_positive(x) := x if {
    x > 0
}
```

When `clamp_positive` is queried, the return value will be either the argument provided to the function or `0`.

The value of a `default` function follows the same conditions as that of a `default` rule. In addition, a `default`
function satisfies the following properties:

- same arity as other functions with the same name
- arguments should only be plain variables, i.e., no composite values
- argument names should not be repeated

:::info
A `default` function will still fail (as in not evaluate, even to the default value) if any of the arguments provided in
the call are **undefined**. The reason for this is that the arguments are evaluated before the function is even called,
and an undefined argument halts evaluation at that point.
:::
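
The behaviour of a `default` function, including the undefined-argument case noted above, can be sketched as follows. The package name `default_fn` and the `input.missing` field are assumptions made for this illustration.

```rego
package default_fn

default clamp_positive(_) := 0

clamp_positive(x) := x if {
    x > 0
}

# The rule body matches, so the argument is returned.
from_rule := clamp_positive(7)

# The rule body does not match, so the default value is used.
from_default := clamp_positive(-5)

# If input.missing is undefined, the call is never evaluated and this
# rule is undefined as well; the default value does not apply.
from_undefined := clamp_positive(input.missing)
```

Here `from_rule` is `7` and `from_default` is `0`, while `from_undefined` has no value when `input.missing` is absent.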

:::tip
Have a look at the other examples for
[`default`](./policy-reference/keywords/default) in the examples section to learn more.
:::

## Else Keyword

The `else` keyword is a basic control flow construct that gives you control
over rule evaluation order.

Rules grouped together with the `else` keyword are evaluated until a match is
found. Once a match is found, rule evaluation does not proceed to rules further
in the chain.

The `else` keyword is useful if you are porting policies into Rego from an
order-sensitive system like IPTables.

```rego
package else_example

authorize := "allow" if {
    input.user == "superuser"           # allow 'superuser' to perform any operation.
} else := "deny" if {
    input.path[0] == "admin"            # disallow 'admin' operations...
    input.source_network == "external"  # from external networks.
} # ... more rules
```

<RunSnippet id="else.rego"/>

In the example below, evaluation stops immediately after the first rule even
though the input matches the second rule as well.

```json
{
  "path": [
    "admin",
    "exec_shell"
  ],
  "source_network": "external",
  "user": "superuser"
}
```

<RunSnippet id="input.superuser.json"/>

```rego
package else_example

superuser_result := authorize
```

<RunSnippet files="#else.rego #input.superuser.json" command="data.else_example.superuser_result"/>

In the next example, the input matches the second rule (but not the first) so
evaluation continues to the second rule before stopping.

```json
{
  "path": [
    "admin",
    "exec_shell"
  ],
  "source_network": "external",
  "user": "alice"
}
```

<RunSnippet id="input.alice.json"/>

```rego
package else_example

alice_result := authorize
```

<RunSnippet files="#else.rego #input.alice.json" command="data.else_example.alice_result"/>

The `else` keyword may be used repeatedly on the same rule and there is no
limit imposed on the number of `else` clauses on a rule. However, it is
recommended that policy authors use the `else` keyword sparingly to avoid
tightly coupled rules.
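
A minimal sketch of a rule with more than one `else` clause, ending in an unconditional fallback. The package name `else_chain`, the `tier` rule, and the `input.points` field are assumptions made for this illustration.

```rego
package else_chain

# Clauses are tried in order; the first one whose body holds
# determines the value.
tier := "gold" if {
    input.points >= 1000
} else := "silver" if {
    input.points >= 500
} else := "bronze"
```

With `{"points": 1200}` as input, `tier` is `"gold"`; with `{"points": 600}` it is `"silver"`; and with `{"points": 10}` it falls through to `"bronze"`.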

## Operators

### Membership and iteration: `in`

The membership operator `in` lets you check if an element is part of a collection (array, set, or object). It always evaluates to `true` or `false`:

```rego
package example

result := {
    "array": 3 in [1, 2, 3],
    "set": 3 in {1, 2, 3},
    "object": 3 in {"foo": 1, "bar": 3},
    "object_key": "foo" in {"foo": 1, "bar": 3}, # false, see below
}
```

<RunSnippet command="data.example.result"/>

When providing two arguments on the left-hand side of the `in` operator,
and an object or an array on the right-hand side, the first argument is
taken to be the key (object) or index (array), respectively:

```rego
package example

result.object := "foo", "bar" in {"foo": "bar"} # key, val with object
result.array := 2, "baz" in ["foo", "bar", "baz"] # key, val with array
```

<RunSnippet command="data.example.result"/>

**Note** that in list contexts, like set or array definitions and function
arguments, parentheses are required to use the form with two left-hand side
arguments -- compare:

```rego
package list_in

p := x if {
    x := [ 0, 2 in [2] ]
}
q := x if {
    x := [ (0, 2 in [2]) ]
}
w := x if {
    x := g((0, 2 in [2]))
}
z := x if {
    x := f(0, 2 in [2])
}

f(x, y) := sprintf("two function arguments: %v, %v", [x, y])
g(x) := sprintf("one function argument: %v", [x])
```

<RunSnippet command="data.list_in"/>

Combined with `not`, the operator can be handy when asserting that an element is _not_
a member of an array:

```rego
package not_in

deny if not "admin" in input.user.roles

# Click evaluate to see the result
test_deny if {
    deny with input.user.roles as ["operator", "user"]
}
```

<RunSnippet command="data.not_in.test_deny"/>

**Note** that expressions using the `in` operator _always return `true` or `false`_, even
when the right-hand side is not a collection:

```rego
package boolean_in

q := x if {
    x := 3 in "three"
}
```

<RunSnippet command="data.boolean_in"/>

Using the `some` variant, it can be used to introduce new variables based on a collection's items:

```rego
package some_in

p contains x if {
	some x in ["a", "r", "r", "a", "y"]
}

q contains x if {
	some x in {"s", "e", "t"}
}

r contains x if {
	some x in {"foo": "bar", "baz": "quz"}
}
```

<RunSnippet command="data.some_in"/>

Furthermore, passing a second argument allows you to work with _object keys_ and _array indices_:

```rego
package some_in

p contains x if {
	some x, "r" in ["a", "r", "r", "a", "y"] # key variable, value constant
}

q[x] := y if {
	some x, y in ["a", "r", "r", "a", "y"] # both variables
}

r[y] := x if {
	some x, y in {"foo": "bar", "baz": "quz"}
}
```

<RunSnippet command="data.some_in"/>

Any argument to the `some` variant can be a composite, non-ground value:

```rego
package some_in

p[x] := y if {
    some x, {"foo": y} in [{"foo": 100}, {"bar": 200}]
}

p[x] := y if {
    some {"bar": x}, {"foo": y} in {{"bar": "b"}: {"foo": "f"}}
}
```

<RunSnippet command="data.some_in"/>

:::info Non-ground values
A "non-ground value" is a value that contains variables - like `{"foo": y}`
where `y` is a variable that gets bound during evaluation. This is the opposite
of a "ground value" which contains no variables. For a formal definition, see
[ground term](https://en.wikipedia.org/wiki/Ground_expression#ground_term).
:::

### Assignment (`:=`)

The assignment operator `:=` is used to assign values to variables. Variables assigned inside a rule are locally scoped to that rule and shadow global variables.

```rego
package assignment

x := 100

p if {
    x := 1     # declare local variable 'x' and assign value 1
    x != 100   # true because 'x' refers to local variable
}
```

<RunSnippet command="data.assignment"/>

Assigned variables are not allowed to appear before the assignment in the
query. For example, the following policy will not compile:

```rego showLineNumbers=true
package assignment

p if {
    x != 100
    x := 1     # error because x appears earlier in the query.
}

q if {
    x := 1
    x := 2     # error because x is assigned twice.
}
```

<RunSnippet command="data.assignment"/>

A simple form of destructuring can be used to unpack values from arrays and assign them to variables:

```rego
package assignment

address := ["3 Abbey Road", "NW8 9AY", "London", "England"]

in_london if {
    [_, _, city, country] := address
    city == "London"
    country == "England"
}
```

<RunSnippet command="data.assignment"/>

### Equality: Comparison, and Unification

Rego supports two kinds of equality: comparison (`==`) and unification (`=`).
Generally, to test equality, using `==` for the comparison is recommended.
The unification operator `=` can be thought of as a combination of `:=` and
`==`, and is generally suited to some more advanced use cases.

#### Comparison `==`

Comparison checks if two values are equal within a rule. If the left or right hand side contains a variable that has not been assigned a value, the compiler throws an error.

```rego
package comparison

p if {
    x := 100
    x == 100   # true because x refers to the local variable
}

y := 100

q if {
    y == 100   # true because y refers to the global variable
}
```

<RunSnippet command="data.comparison"/>

Values used in comparison must be assigned before the comparison is made. For
example, the following policy will not compile:

```rego showLineNumbers=true
package comparison

p if {
    z == 100 # error because z is not assigned
}
```

<RunSnippet command="data.comparison"/>

#### Unification `=`

Unification (`=`) combines assignment and comparison. Rego will assign variables to values that make the comparison true. Unification lets you ask for values for variables that make an expression true.

```rego
package unification

# Find values for x and y that make the equality true
result := [x, y] if {
    [x, "world"] = ["hello", y]
}
```

<RunSnippet command="data.unification"/>

```rego
package unification

import data.example.sites
import data.example.apps

# find all the servers running apps
result contains sites[i].servers[j].name if {
    sites[i].servers[j].name = apps[k].servers[m]
}
```

<RunSnippet files="#example_data.rego" command="data.unification"/>

Unlike when assignment (`:=`) is used, the order of unification expressions in a rule does not affect the document’s content.

```rego
package unification

s if {
    x > y
    y = 41
    x = 42
}
```

<RunSnippet command="data.unification.s"/>

#### Best Practices for Equality and Assignment

Best practice is to use assignment `:=` and comparison `==` unless you know you
need to use unification.
The additional compiler checks help avoid errors when writing policy, and the
additional syntax helps make the intent clearer when reading policy.

| Operator | Compiler Errors              | Use Case        |
| -------- | ---------------------------- | --------------- |
| `:=`     | Var already assigned         | Assign variable |
| `==`     | Var not assigned             | Compare values  |
| `=`      | Values would not be computed | Express query   |
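
As a rough illustration of the table above, the following sketch uses all three operators side by side. The package name, the `user` document, and the rule names are made up for this example.

```rego
package equality_demo

# Hypothetical data used only for this illustration.
user := {"name": "alice", "roles": ["admin", "dev"]}

# `:=` declares the local variable `roles`; `==` then compares
# values without binding anything.
is_admin if {
    roles := user.roles
    roles[0] == "admin"
}

# `=` binds and compares in one step: `first` is bound to the first
# element, and the match only succeeds if the second element is "dev".
first_role := first if {
    [first, "dev"] = user.roles
}
```

Evaluating `data.equality_demo` should yield `is_admin` as `true` and `first_role` as `"admin"`.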

:::tip Further Reading
There are some Regal rules to help authors make the right decisions:

- [`use-assignment-operator`](/projects/regal/rules/style/use-assignment-operator)
- [`prefer-equals-comparison`](/projects/regal/rules/idiomatic/prefer-equals-comparison)

Under the hood `:=` and `==` are syntactic sugar for `=`, local variable creation, and additional compiler checks.
:::

### Comparison Operators

The following comparison operators are supported:

```rego
a  ==  b  #  `a` is equal to `b`.
a  !=  b  #  `a` is not equal to `b`.
a  <   b  #  `a` is less than `b`.
a  <=  b  #  `a` is less than or equal to `b`.
a  >   b  #  `a` is greater than `b`.
a  >=  b  #  `a` is greater than or equal to `b`.
```

None of these operators bind variables contained
in the expression. As a result, if either operand is a variable, the variable
must appear in another expression in the same rule that would cause the
variable to be bound, i.e., an equality expression or the target position of
a built-in function.
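
A minimal sketch of this rule, using a hypothetical `scores` document: the `some ... in` expression binds the variables, and the comparison only tests them.

```rego
package operators_demo

# Hypothetical scores used only for this illustration.
scores := {"alice": 72, "bob": 48}

passed contains name if {
    some name, score in scores   # binds `name` and `score`
    score >= 50                  # only compares; binds nothing
}
```

With this data, `passed` should contain only `"alice"`. Without the first expression, the compiler would be expected to reject the rule because `name` and `score` would be left unbound.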

## Built-in Functions

In some cases, rules must perform simple arithmetic, aggregation, and so on.
Rego provides a number of built-in functions (or “built-ins”) for performing
these tasks.

Built-ins can be easily recognized by their syntax. All built-ins have the
following form:

```
<name>(<arg-1>, <arg-2>, ..., <arg-n>)
```

Built-ins usually take one or more input values and produce one output
value. Unless stated otherwise, all built-ins accept values or variables as
output arguments.

If a built-in function is invoked with a variable as input, the variable must
be _safe_, i.e., it must be assigned elsewhere in the query.

Built-ins can include "." characters in the name. This allows them to be
namespaced. If you are adding custom built-ins to OPA, consider namespacing
them to avoid naming conflicts, e.g., `org.example.special_func`.
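
The points above can be sketched with a few standard built-ins. The package name, the `names` array, and the rule names are assumptions made for this example.

```rego
package builtins_demo

# Hypothetical data used only for this illustration.
names := ["alice", "bob"]

# A plain built-in: one input value, one output value.
total := count(names)

# A namespaced built-in: the "." is part of the function name.
encoded := json.marshal(names)

# `n` is safe here because it is assigned before it is used as an input.
summary := s if {
    n := count(names)
    s := sprintf("%d users", [n])
}
```

Here `total` should evaluate to `2` and `summary` to `"2 users"`.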

See the [Policy Reference](./policy-reference#built-in-functions) document for
details on each built-in function.

### Errors

By default, built-in function calls that encounter runtime errors evaluate to
undefined (which can usually be treated as `false`) and do not halt policy
evaluation. This ensures that built-in functions can be called with invalid
inputs without causing the entire policy to stop evaluating.

In most cases, policies do not have to implement any kind of error handling
logic. If error handling is required, the built-in function call can be negated
to test for undefined. For example:

```json title="input.json"
{
  "token": "a poorly formatted token"
}
```

<RunSnippet id="input.errors.json"/>

```rego
package errors

allow if {
    io.jwt.verify_hs256(input.token, "secret")
    [_, payload, _] := io.jwt.decode(input.token)
    payload.role == "admin"
}

reason contains "invalid JWT supplied as input" if {
    not io.jwt.decode(input.token)
}
```

<RunSnippet files="#input.errors.json" command="data.errors.reason"/>
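
Because a failing built-in call is simply undefined, another option is to pair it with a `default` rule. The sketch below assumes a hypothetical `input.port` field that may hold a non-numeric string; it is an illustration rather than a recommended pattern for every case.

```rego
package errors_demo

# Fallback used whenever the rule below is undefined.
default port := 8080

# `to_number` raises a built-in error for non-numeric strings such as "abc".
# With the default (non-strict) error handling the call is undefined,
# so `port` falls back to 8080.
port := to_number(input.port)
```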

To disable this behaviour and instead treat built-in function call errors as
exceptions that halt policy evaluation, enable "strict built-in errors" in the
caller:

| API                   | Flag                                    |
| --------------------- | --------------------------------------- |
| `POST v1/data` (HTTP) | `strict-builtin-errors` query parameter |
| `GET v1/data` (HTTP)  | `strict-builtin-errors` query parameter |
| `opa eval` (CLI)      | `--strict-builtin-errors`               |
| `opa run` (REPL)      | `> strict-builtin-errors`               |
| `rego` Go module      | `rego.StrictBuiltinErrors(true)` option |
| Wasm                  | Not Available                           |

## Metadata

The package and individual rules in a module can be annotated with a rich set of metadata.

```rego
package metadata

# METADATA
# title: My rule
# description: A rule that determines if x is allowed.
# authors:
# - John Doe <john@example.com>
# entrypoint: true
allow if {
  ...
}
```

Annotations are grouped within a _metadata block_, and must be specified as YAML within a comment block that **must** start with `# METADATA`.
Every line in the comment block containing the annotation **must** also start at Column 1 in the module/file; otherwise, the annotations will be ignored.

:::danger
OPA will attempt to parse the YAML document in comments following the
initial `# METADATA` comment. If the YAML document cannot be parsed, OPA will
return an error. If you need to include additional comments between the
comment block and the next statement, include a blank line immediately after
the comment block containing the YAML document. This tells OPA that the
comment block containing the YAML document is finished.
:::

### Annotations

| Name              | Type                                                        | Description                                                                                                        |
| ----------------- | ----------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
| scope             | string; one of `package`, `rule`, `document`, `subpackages` | The scope for which the metadata applies. Read more [here](#metadata-scope).                                       |
| title             | string                                                      | A human-readable name for the annotation target. Read more [here](#metadata-title).                                |
| description       | string                                                      | A description of the annotation target. Read more [here](#metadata-description).                                   |
| related_resources | list of URLs                                                | A list of URLs pointing to related resources/documentation. Read more [here](#metadata-related_resources).         |
| authors           | list of strings                                             | A list of authors for the annotation target. Read more [here](#metadata-authors).                                  |
| organizations     | list of strings                                             | A list of organizations related to the annotation target. Read more [here](#metadata-organizations).               |
| schemas           | list of objects                                             | A list of associations between value paths and schema definitions. Read more [here](#metadata-schemas).            |
| entrypoint        | boolean                                                     | Whether or not the annotation target is to be used as a policy entrypoint. Read more [here](#metadata-entrypoint). |
| custom            | mapping of arbitrary data                                   | A custom mapping of named parameters holding arbitrary data. Read more [here](#metadata-custom).                   |

### Metadata `Scope`

Annotations can be defined at the rule or package level. The `scope` annotation in
a metadata block determines how that metadata block will be applied. If the
`scope` field is omitted, it defaults to the scope for the statement that
immediately follows the annotation. The `scope` values that are currently
supported are:

- `rule` - applies to the individual rule statement (within the same file). Default, when metadata block precedes rule.
- `document` - applies to all of the rules with the same name in the same package (across multiple files)
- `package` - applies to all of the rules in the package (across multiple files). Default, when metadata block precedes package.
- `subpackages` - applies to all of the rules in the package and all subpackages (recursively, across multiple files)

Since the `document` scope annotation applies to all rules with the same name in the same package
and the `package` and `subpackages` scope annotations apply to all packages with a matching path, metadata blocks with
these scopes are applied over all files with applicable package- and rule paths.
As there is no ordering across files in the same package, the `document`, `package`, and `subpackages` scope annotations
can only be specified **once** per path. The `document` scope annotation can be applied to any rule in the set (i.e.,
ordering does not matter.)

An `entrypoint` annotation implies a `scope` of either `package` or `document`. When `entrypoint` is set to `true` on a
rule, the `scope` is automatically set to `document` if not explicitly provided. Setting the `scope` to `rule` will
result in an error, as an entrypoint always applies to the whole document.

#### Example Policy with Metadata

```rego
package metadata

# METADATA
# scope: document
# description: A set of rules that determines if x is allowed.

# METADATA
# title: Allow Ones
allow if {
    x == 1
}

# METADATA
# title: Allow Twos
allow if {
    x == 2
}

# METADATA
# entrypoint: true
# description: |
#   `scope` annotation automatically set to `document`
#   as that is required for entrypoints
message := "welcome!" if allow
```

### Metadata `title`

The `title` annotation is a string value giving a human-readable name to the annotation target.

```rego
# METADATA
# title: Allow Ones
allow if {
  x == 1
}

# METADATA
# title: Allow Twos
allow if {
  x == 2
}
```

### Metadata `description`

The `description` annotation is a string value describing the annotation target, such as its purpose.

```rego
# METADATA
# description: |
#  The 'allow' rule...
#  Is about allowing things.
#  Not denying them.
allow if {
  ...
}
```

### Metadata `related_resources`

The `related_resources` annotation is a list of _related-resource_ entries, where each entry links to a related external resource, such as an RFC or other reading material.
A _related-resource_ entry can either be an object or a short-form string holding a single URL.

#### Object Related-resource Format

When a _related-resource_ entry is presented as an object, it has two fields:

- `ref`: a URL pointing to the resource (required).
- `description`: a text describing the resource.

#### String Related-resource Format

When a _related-resource_ entry is presented as a string, it needs to be a valid URL.

#### Examples

```rego
# METADATA
# related_resources:
# - ref: https://example.com
# ...
# - ref: https://example.com/foo
#   description: A text describing this resource
allow if {
  ...
}
```

```rego
# METADATA
# related_resources:
# - https://example.com/foo
# ...
# - https://example.com/bar
allow if {
  ...
}
```

### Metadata `authors`

The `authors` annotation is a list of author entries, where each entry denotes an _author_.
An _author_ entry can either be an object or a short-form string.

#### Object Author Format

When an _author_ entry is presented as an object, it has two fields:

- `name`: the name of the author
- `email`: the email of the author

At least one of the above fields is required for a valid `author` entry.

#### String Author Format

When an _author_ entry is presented as a string, it has the format `{ name } [ "<" email ">"]`,
where the name of the author is a sequence of whitespace-separated words.
Optionally, the last word may represent an email, if enclosed with `<>`.

#### Examples

```rego
# METADATA
# authors:
# - name: John Doe
# ...
# - name: Jane Doe
#   email: jane@example.com
allow if {
  ...
}
```

```rego
# METADATA
# authors:
# - John Doe
# ...
# - Jane Doe <jane@example.com>
allow if {
  ...
}
```

### Metadata `organizations`

The `organizations` annotation is a list of string values representing the organizations associated with the annotation target.

#### Example

```rego
# METADATA
# organizations:
# - Acme Corp.
# ...
# - Tyrell Corp.
allow if {
  ...
}
```

### Metadata `schemas`

The `schemas` annotation is a list of key value pairs, associating schemas to data values.
In-depth information on this topic can be found [here](#schema-annotations).

#### Schema Reference Format

Schema files can be referenced by path, where each path starts with the `schema` namespace, and trailing components specify
the path of the schema file (sans file-ending) relative to the root directory specified by the `--schema` flag on applicable commands.
If the `--schema` flag is not present, referenced schemas are ignored during type checking.

```rego
# METADATA
# schemas:
#   - input: schema.input
#   - data.acl: schema["acl-schema"]
allow if {
    access := data.acl["alice"]
    access[_] == input.operation
}
```

#### Inlined Schema Format

Schema definitions can be inlined by specifying the schema structure as a YAML or JSON map.
Inlined schemas are always used to inform type checking for the `eval`, `check`, and `test` commands;
in contrast to [by-reference schema annotations](#schema-reference-format), which require the `--schema` flag to be present in order to be evaluated.

```rego
# METADATA
# schemas:
#   - input.x: {type: number}
allow if {
    input.x == 42
}
```

### Metadata `entrypoint`

The `entrypoint` annotation is a boolean used to mark rules and packages that should be used as entrypoints for a policy.
This value is false by default, and can only be used at `document` or `package` scope. When used on a rule with no
explicit `scope` set, the presence of an `entrypoint` annotation will automatically set the scope to `document`.

The `build` and `eval` CLI commands will automatically pick up annotated entrypoints; you do not have to specify them with
[`--entrypoint`](./cli/#eval).

:::info
Unless the `--prune-unused` flag is used, any rule transitively referring to a
package or rule declared as an entrypoint will also be enumerated as an entrypoint.
:::

### Metadata `custom`

The `custom` annotation is a mapping of user-defined data, mapping string keys to arbitrarily typed values.

#### Example

```rego
# METADATA
# custom:
#  my_int: 42
#  my_string: Some text
#  my_bool: true
#  my_list:
#   - a
#   - b
#  my_map:
#   a: 1
#   b: 2
allow if {
  ...
}
```

### Accessing annotations

Information in metadata blocks can be accessed in a number of ways.

#### From Rego Rules

In the example below, you can see how to access an annotation from within a policy.

```json title="input.json"
{
  "number": 11
}
```

<RunSnippet id="input.metadata.json"/>

The following policy uses the `rego.metadata.rule()` function to access the metadata
from the rule to show in the output message.

```rego
package example

# METADATA
# title: Deny invalid numbers
# description: Numbers may not be higher than 5
# custom:
#  severity: MEDIUM
output := decision if {
	input.number > 5

	annotation := rego.metadata.rule()
	decision := {
		"severity": annotation.custom.severity,
		"message": annotation.description,
	}
}
```

<RunSnippet files="#input.metadata.json" command="data.example.output"/>

More examples and information are available in the [Rego](./policy-reference/builtins/rego) policy reference.
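
Besides `rego.metadata.rule()`, the `rego.metadata.chain()` built-in returns the annotations for the rule together with those of its path ancestry. The following is a minimal sketch under assumed names: the `chain_example` package, the `team` custom field, and the `deny` rule are all hypothetical.

```rego
# METADATA
# custom:
#  team: platform
package chain_example

# METADATA
# title: Deny invalid numbers
deny contains msg if {
    input.number > 5

    # Each entry in the chain has a `path` and an `annotations` object,
    # ordered from the rule outward to its package ancestry.
    some link in rego.metadata.chain()
    team := link.annotations.custom.team
    msg := sprintf("number too high (owner: %s)", [team])
}
```

Only the package-level entry carries `custom.team`, so the rule-level entry is skipped and the message is built from the package annotation.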

#### From the `inspect` command

Annotations can be listed through the `inspect` command by using the `-a` flag:

```shell
opa inspect -a
```

#### From the Go API

The `ast.AnnotationSet` is a collection of all `ast.Annotations` declared in a set of modules.
An `ast.AnnotationSet` can be created from a slice of compiled modules:

```go
var modules []*ast.Module
...
as, err := ast.BuildAnnotationSet(modules)
if err != nil {
    // Handle error.
}
```

or can be retrieved from an `ast.Compiler` instance:

```go
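var modules map[string]*ast.Module // Compile expects modules keyed by name.
...
compiler := ast.NewCompiler()
compiler.Compile(modules)
if compiler.Failed() {
    // Handle compiler.Errors.
}
as := compiler.GetAnnotationSet()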
```

The `ast.AnnotationSet` can be flattened into a slice of `ast.AnnotationsRef`, which is a complete, sorted list of all
annotations, grouped by the path and location of their targeted package or rule.

```go
flattened := as.Flatten()
for _, entry := range flattened {
    fmt.Printf("%v at %v has annotations %v\n",
        entry.Path,
        entry.Location,
        entry.Annotations)
}

// Output:
// data.foo at foo.rego:5 has annotations {"scope":"subpackages","organizations":["Acme Corp."]}
// data.foo.bar at mod:3 has annotations {"scope":"package","description":"A couple of useful rules"}
// data.foo.bar.p at mod:7 has annotations {"scope":"rule","title":"My Rule P"}
//
// For modules:
// # METADATA
// # scope: subpackages
// # organizations:
// # - Acme Corp.
// package foo
// ---
// # METADATA
// # description: A couple of useful rules
// package foo.bar
//
// # METADATA
// # title: My Rule P
// p := 7
```

Given an `ast.Rule`, the `ast.AnnotationSet` can return the chain of annotations declared for that rule, and its path ancestry.
The returned slice is ordered starting with the annotations for the rule, going outward to the farthest node with declared annotations
in the rule's path ancestry.

```go
var rule *ast.Rule
...
chain := as.Chain(rule) // as is an ast.AnnotationSet, built as shown earlier
for _, link := range chain {
    fmt.Printf("%v at %v has annotations %v\n",
        link.Path,
        link.Location,
        link.Annotations)
}

// Output:
// data.foo.bar.p at mod:7 has annotations {"scope":"rule","title":"My Rule P"}
// data.foo.bar at mod:3 has annotations {"scope":"package","description":"A couple of useful rules"}
// data.foo at foo.rego:5 has annotations {"scope":"subpackages","organizations":["Acme Corp."]}
//
// For modules:
// # METADATA
// # scope: subpackages
// # organizations:
// # - Acme Corp.
// package foo
// ---
// # METADATA
// # description: A couple of useful rules
// package foo.bar
//
// # METADATA
// # title: My Rule P
// p := 7
```

## Schema

### Using schemas to enhance the Rego type checker

You can provide one or more input schema files and/or data schema files to `opa eval` to improve static type checking and get more precise error reports as you develop Rego code.

Schemas can be provided to OPA in two main ways: by supplying external JSON Schema files using the `-s` command-line flag (explained below), or by embedding schema definitions directly within your Rego files using [schema annotations](#schema-annotations) (detailed further down in this document). Both methods help improve static type checking.

The `-s` flag can be used to load schemas for input and data documents in JSON Schema format. You can either load a single JSON Schema file for the input document or a directory of schema files.

```
-s, --schema string set schema file path or directory path
```

#### Passing a single file with -s

When a single file is passed, it is a schema file associated with the input document globally. This means that for all rules in all packages, the `input` has a type derived from that schema. There is no constraint on the name of the file; it can be anything.

Example:

```
opa eval data.envoy.authz.allow -i opa-schema-examples/envoy/input.json -d opa-schema-examples/envoy/policy.rego -s opa-schema-examples/envoy/schemas/my-schema.json
```

#### Passing a directory with -s

When a directory path is passed, annotations will be used in the code to indicate what expressions map to what schemas (see below).
Both input schema files and data schema files can be provided in the same directory, with different names. The directory of schemas may have any sub-directories. Notice that when a directory is passed the input document does not have a schema associated with it globally. This must also
be indicated via an annotation.

Example:

```
opa eval data.kubernetes.admission -i opa-schema-examples/kubernetes/input.json -d opa-schema-examples/kubernetes/policy.rego -s opa-schema-examples/kubernetes/schemas
```

Schemas can also be provided for policy and data files loaded via `opa eval --bundle`.

Example:

```
opa eval data.kubernetes.admission -i opa-schema-examples/kubernetes/input.json -b opa-schema-examples/bundle.tar.gz -s opa-schema-examples/kubernetes/schemas
```

Samples provided at: [`github.com/aavarghese/opa-schema-examples`](https://github.com/aavarghese/opa-schema-examples/).

### Usage scenario with a single schema file

Consider the following Rego code, which assumes as input a Kubernetes admission review. For resources that are Pods, it checks that the image name
starts with a specific prefix.

```rego title="pod.rego"
package kubernetes.admission

deny contains msg if {
	input.request.kind.kinds == "Pod"
	image := input.request.object.spec.containers[_].image
	not startswith(image, "hooli.com/")
	msg := sprintf("image '%v' comes from untrusted registry", [image])
}
```

Notice that this code has a typo in it: `input.request.kind.kinds` is undefined and should have been `input.request.kind.kind`.

Consider the following input document:

```json title="input.json"
{
  "kind": "AdmissionReview",
  "request": {
    "kind": {
      "kind": "Pod",
      "version": "v1"
    },
    "object": {
      "metadata": {
        "name": "myapp"
      },
      "spec": {
        "containers": [
          {
            "image": "nginx",
            "name": "nginx-frontend"
          },
          {
            "image": "mysql",
            "name": "mysql-backend"
          }
        ]
      }
    }
  }
}
```

Clearly there are 2 image names that are in violation of the policy. However, when we evaluate the erroneous Rego code against this input we obtain:

```shell
$ opa eval data.kubernetes.admission --format pretty -i opa-schema-examples/kubernetes/input.json -d opa-schema-examples/kubernetes/policy.rego
[]
```

The empty value returned is indistinguishable from a situation where the input did not violate the policy. This error is therefore causing the policy not to catch violating inputs appropriately.

If we fix the Rego code and change `input.request.kind.kinds` to `input.request.kind.kind`, then we obtain the expected result:

```json
[
  "image 'nginx' comes from untrusted registry",
  "image 'mysql' comes from untrusted registry"
]
```

With schema support, it is possible to pass a schema written in JSON Schema to `opa eval`. Consider the admission review schema provided at
[`schemas/input.json`](https://github.com/aavarghese/opa-schema-examples/blob/main/kubernetes/schemas/input.json).

We can pass this schema to the evaluator as follows:

```shell
$ opa eval data.kubernetes.admission --format pretty -i opa-schema-examples/kubernetes/input.json -d opa-schema-examples/kubernetes/policy.rego -s opa-schema-examples/kubernetes/schemas/input.json
```

With the erroneous Rego code, we now obtain the following type error:

```shell
1 error occurred: ../../aavarghese/opa-schema-examples/kubernetes/policy.rego:5: rego_type_error: undefined ref: input.request.kind.kinds
input.request.kind.kinds
                  ^
                  have: "kinds"
                  want (one of): ["kind" "version"]
```

This indicates the error to the Rego developer right away, without the need to observe the results of runs on actual data, thereby improving productivity.

### Schema annotations

When passing a directory of schemas to `opa eval`, schema annotations come in handy for associating a Rego expression with a corresponding schema within a given scope:

```rego
# METADATA
# schemas:
#   - <path-to-value>:<path-to-schema>
#   ...
#   - <path-to-value>:<path-to-schema>
allow if {
  ...
}
```

See the [annotations documentation](./policy-language/#annotations) for general information relating to annotations.

The `schemas` field specifies an array associating schemas to data values. Paths must start with `input` or `data` (i.e., they must be fully-qualified.)

The type checker derives a Rego Object type for the schema and an appropriate entry is added to the type environment before type checking the rule. This entry is removed upon exit from the rule.

Example:

Consider the following Rego code which checks if an operation is allowed by a user, given an ACL data document:

```rego
package policy

import data.acl

default allow := false

# METADATA
# schemas:
#   - input: schema.input
#   - data.acl: schema["acl-schema"]
allow if {
	access := data.acl.alice
	access[_] == input.operation
}

allow if {
	access := data.acl.bob
	access[_] == input.operation
}
```

Consider a directory named `mySchemasDir` with the following structure, provided via `opa eval --schema opa-schema-examples/mySchemasDir`

```shell
$ tree mySchemasDir/
mySchemasDir/
├── input.json
└── acl-schema.json
```

See here for [code samples](https://github.com/aavarghese/opa-schema-examples/tree/main/acl).

In the first `allow` rule above, the input document has the schema `input.json`, and `data.acl` has the schema `acl-schema.json`. Note that we use the relative path inside the `mySchemasDir` directory to identify a schema, omit the `.json` suffix, and use the global variable `schema` to stand for the top-level of the directory.
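Schemas in annotations are proper Rego references, so `schema.input` and `schema["input"]` are both valid. `schema.acl-schema` is not, because a hyphenated name cannot be used with dot notation; write `schema["acl-schema"]` instead.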

If we had the expression `data.acl.foo` in this rule, it would result in a type error because the schema contained in `acl-schema.json` only defines object properties `"alice"` and `"bob"` in the ACL data document.

On the other hand, this annotation does not constrain other paths under `data`. What it says is that we know the type of `data.acl` statically, but not that of other paths. So for example, `data.foo` is not a type error and gets assigned the type `Any`.

Note that the second `allow` rule doesn't have a METADATA comment block attached to it, and hence will not be type checked with any schemas.

On a different note, schema annotations can also be added to policy files that are part of a bundle loaded via `opa eval --bundle`, along with the `--schema` parameter, to type check a set of `*.rego` policy files.

The _scope_ of the `schema` annotation can be controlled through the [scope](./policy-language/#annotations) annotation.

In case of overlap, schema annotations override each other as follows:

- `rule` overrides `document`
- `document` overrides `package`
- `package` overrides `subpackages`

The following sections explain how the different scopes affect `schema` annotation
overriding for type checking.

#### Rule and Document Scopes

In the example above, the second rule does not include an annotation so type
checking of the second rule would not take schemas into account. To enable type
checking on the second (or other rules in the same file) we could specify the
annotation multiple times:

```rego
# METADATA
# scope: rule
# schemas:
#   - input: schema.input
#   - data.acl: schema["acl-schema"]
allow if {
    access := data.acl["alice"]
    access[_] == input.operation
}

# METADATA
# scope: rule
# schemas:
#   - input: schema.input
#   - data.acl: schema["acl-schema"]
allow if {
    access := data.acl["bob"]
    access[_] == input.operation
}
```

This is obviously redundant and error-prone. To avoid this problem, we can
define the annotation once on a rule with scope `document`:

```rego
# METADATA
# scope: document
# schemas:
#   - input: schema.input
#   - data.acl: schema["acl-schema"]
allow if {
    access := data.acl["alice"]
    access[_] == input.operation
}

allow if {
    access := data.acl["bob"]
    access[_] == input.operation
}
```

In this example, the annotation with `document` scope has the same effect as the
two `rule` scoped annotations in the previous example.

#### Package and Subpackage Scopes

Annotations can be defined at the `package` level and then applied to all rules
within the package:

```rego
# METADATA
# scope: package
# schemas:
#   - input: schema.input
#   - data.acl: schema["acl-schema"]
package example

allow if {
    access := data.acl["alice"]
    access[_] == input.operation
}

allow if {
    access := data.acl["bob"]
    access[_] == input.operation
}
```

`package` scoped schema annotations are useful when all rules in the same
package operate on the same input structure. In some cases, when policies are
organized into many sub-packages, it is useful to declare schemas recursively
for them using the `subpackages` scope. For example:

```rego
# METADATA
# scope: subpackages
# schemas:
# - input: schema.input
package kubernetes.admission
```

This snippet would declare the top-level schema for `input` for the
`kubernetes.admission` package as well as all subpackages. If admission control
rules were defined inside packages like `kubernetes.admission.workloads.pods`,
they would be able to pick up that one schema declaration.
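
As a sketch of what such a subpackage might look like, the hypothetical module below carries no schema annotation of its own. It assumes it is loaded together with a `kubernetes.admission` module annotated with `subpackages` scope, and that a schema directory is passed via `--schema`.

```rego
package kubernetes.admission.workloads.pods

# No schema annotation is needed here: a `subpackages` annotation on
# `kubernetes.admission` also covers this package.
deny contains msg if {
    input.request.kind.kind == "Pod"
    image := input.request.object.spec.containers[_].image
    not startswith(image, "hooli.com/")
    msg := sprintf("image '%v' comes from untrusted registry", [image])
}
```

Under those assumptions, a mistyped reference such as `input.request.kind.kinds` in this module would be expected to surface as a type error, just as it would in the parent package.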

### Overriding

JSON Schemas are often incomplete specifications of the format of data. For example, a Kubernetes Admission Review resource has a field `object` which can contain any other Kubernetes resource. A schema for Admission Review has a generic type `object` for that field that has no further specification. To allow more precise type checking in such cases, we support overriding existing schemas.

Consider the following example:

```rego
package kubernetes.admission

# METADATA
# scope: rule
# schemas:
# - input: schema.input
# - input.request.object: schema.kubernetes.pod
deny contains msg if {
	input.request.kind.kind == "Pod"
	image := input.request.object.spec.containers[_].image
	not startswith(image, "hooli.com/")
	msg := sprintf("image '%v' comes from untrusted registry", [image])
}
```

In this example, the `input` is associated with an Admission Review schema, and furthermore `input.request.object` is set to have the schema of a Kubernetes Pod. In effect, the second schema annotation overrides the first one. Overriding is a schema transformation feature and combines existing schemas. In this case, we are combining the Admission Review schema with that of a Pod.

Notice that the order of schema annotations matters for overriding to work correctly.

Given a schema annotation, if a prefix of the path already has a type in the environment, then the annotation has the effect of merging and overriding the existing type with the type derived from the schema. In the example above, the prefix `input` already has a type in the type environment, so the second annotation overrides this existing type. Overriding affects the type of the longest prefix that already has a type. If no such prefix exists, the new path and type are added to the type environment for the scope of the rule.

In general, consider the existing Rego type:

```
object{a: object{b: object{c: C, d: D, e: E}}}
```

If we override this type with the following type (derived from a schema annotation of the form `a.b.e: schema-for-E1`):

```
object{a: object{b: object{e: E1}}}
```

It results in the following type:

```
object{a: object{b: object{c: C, d: D, e: E1}}}
```

Notice that `b` still has its fields `c` and `d`, so overriding has a merging effect as well. Moreover, the type of expression `a.b.e` is now `E1` instead of `E`.

We can also use overriding to add new paths to an existing type, so if we override the initial type with the following:

```
object{a: object{b: object{f: F}}}
```

We obtain the following type:

```
object{a: object{b: object{c: C, d: D, e: E, f: F}}}
```
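
The merging behavior above can be mimicked on ordinary JSON values with the built-in `object.union`, which merges nested objects recursively and lets the right-hand operand win on conflicts. This is only an analogy, under the assumption that plain strings stand in for the types `C`, `D`, `E`, `E1`, and `F`; the type checker works on types rather than values, and the package and rule names below are hypothetical.

```rego
package example.override

# Stand-in for the existing type object{a: object{b: object{c: C, d: D, e: E}}}.
existing := {"a": {"b": {"c": "C", "d": "D", "e": "E"}}}

# Overriding the path a.b.e keeps c and d and replaces e.
override_e := object.union(existing, {"a": {"b": {"e": "E1"}}})
# => {"a": {"b": {"c": "C", "d": "D", "e": "E1"}}}

# Overriding with the new path a.b.f adds f next to the existing fields.
add_f := object.union(existing, {"a": {"b": {"f": "F"}}})
# => {"a": {"b": {"c": "C", "d": "D", "e": "E", "f": "F"}}}
```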

We use schemas to enhance the type checking capability of OPA, and not to validate the input and data documents against desired schemas. This burden is still on the user and care must be taken when using overriding to ensure that the input and data provided are sensible and validated against the transformed schemas.

### Multiple input schemas

It is sometimes useful to have different input schemas for different rules in the same package. This can be achieved as illustrated by the following example:

```rego
package policy

import data.acl

default allow := false

# METADATA
# scope: rule
# schemas:
#  - input: schema["input"]
#  - data.acl: schema["acl-schema"]
allow if {
	access := data.acl[input.user]
	access[_] == input.operation
}

# METADATA for whocan rule
# scope: rule
# schemas:
#   - input: schema["whocan-input-schema"]
#   - data.acl: schema["acl-schema"]
whocan contains user if {
	access := acl[user]
	access[_] == input.operation
}
```

The directory that is passed to `opa eval` is the following:

```shell
$ tree mySchemasDir/
mySchemasDir/
├── input.json
├── acl-schema.json
└── whocan-input-schema.json
```

In this example, we associate the schema `input.json` with the input document in the rule `allow`, and the schema `whocan-input-schema.json`
with the input document for the rule `whocan`.

### Translating schemas to Rego types and dynamicity

Rego has a gradual type system, meaning that types can be partially known statically. For example, an object could have certain fields whose types are known and others that are unknown statically. OPA type checks what it knows statically and leaves the unknown parts to be type checked at runtime. An OPA object type has two parts: a static part with the type information known statically, and a dynamic part, which is either nil (meaning everything is known statically) or non-nil, indicating what is unknown.

When we derive a type from a schema, we try to match what is known and unknown in the schema. For example, an `object` that has no specified fields becomes the Rego type `Object{Any: Any}`. However, currently `additionalProperties` and `additionalItems` are ignored. When a schema is fully specified, we derive a type with its dynamic part set to nil, meaning that we take a strict interpretation in order to get the most out of static type checking. This is the case even if `additionalProperties` is set to `true` in the schema. In the future, we will take this feature into account when deriving Rego types.

When overriding existing types, the dynamicity of the overridden prefix is preserved.
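
As a minimal sketch of the strict interpretation described above, the rule below attaches a fully specified schema to `input`. It assumes that an inline schema map is accepted for the whole `input` document; the package name and the fields `name` and `nickname` are hypothetical. If the behavior described above applies to your OPA version, the reference to the undeclared field is expected to be reported as a type error, even though `additionalProperties` is `true`.

```rego
package example.dynamicity

# METADATA
# scope: rule
# schemas:
#   - input: {type: object, properties: {name: {type: string}}, additionalProperties: true}
allow if {
	# Declared in the schema as a string, so this comparison type checks.
	input.name == "alice"

	# Not declared in the schema. Because the derived type has no dynamic
	# part, this reference is expected to be rejected by the type checker.
	input.nickname == "al"
}
```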

### Supporting JSON Schema composition keywords

JSON Schema provides keywords such as `anyOf` and `allOf` to structure a complex schema. For `anyOf`, at least one of the subschemas must be true, and for `allOf`, all subschemas must be true. The type checker is able to identify such keywords and derive a more robust Rego type through more complex schemas.

#### `anyOf`

Specifically, `anyOf` acts as a Rego Or type where at least one (and possibly more than one) of the subschemas is true. Consider the following Rego and schema file containing `anyOf`:

```rego title="policy-anyOf.rego"
package kubernetes.admission

# METADATA
# scope: rule
# schemas:
#   - input: schema["input-anyOf"]
deny if {
	input.request.servers.versions == "Pod"
}
```

```json title="input-anyOf.json"
{
  "$schema": "http://json-schema.org/draft-07/schema",
  "type": "object",
  "properties": {
    "kind": { "type": "string" },
    "request": {
      "type": "object",
      "anyOf": [
        {
          "properties": {
            "kind": {
              "type": "object",
              "properties": {
                "kind": { "type": "string" },
                "version": { "type": "string" }
              }
            }
          }
        },
        {
          "properties": {
            "server": {
              "type": "object",
              "properties": {
                "accessNum": { "type": "integer" },
                "version": { "type": "string" }
              }
            }
          }
        }
      ]
    }
  }
}
```

We can see that `request` is an object with two options as indicated by the choices under `anyOf`:

- contains property `kind`, which has properties `kind` and `version`
- contains property `server`, which has properties `accessNum` and `version`

The type checker finds the first error in the Rego code, suggesting that `servers` should be either `kind` or `server`.

```
input.request.servers.versions
              ^
              have: "servers"
              want (one of): ["kind" "server"]
```

Once this is fixed, the second typo is highlighted, prompting the user to choose between `accessNum` and `version`.

```
input.request.server.versions
                     ^
                     have: "versions"
                     want (one of): ["accessNum" "version"]
```
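
For reference, here is a sketch of the rule with both typos corrected. It refers to `server.version`, which is declared in the second `anyOf` branch of `input-anyOf.json`, so it should pass type checking against that schema.

```rego title="policy-anyOf.rego"
package kubernetes.admission

# METADATA
# scope: rule
# schemas:
#   - input: schema["input-anyOf"]
deny if {
	# `server` and `version` both exist in the second anyOf branch.
	input.request.server.version == "Pod"
}
```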

#### `allOf`

The `allOf` keyword implies that all conditions under `allOf` within a schema must be met by the given data. `allOf` is implemented by merging the types from all of the JSON subschemas listed under `allOf` before parsing the result to convert it to a Rego type. Merging essentially combines the given subschemas based on the types they contain. Consider the following Rego and schema file containing `allOf`:

```rego title="policy-allOf.rego"
package kubernetes.admission

# METADATA
# scope: rule
# schemas:
#   - input: schema["input-allOf"]
deny if {
	input.request.servers.versions == "Pod"
}
```

```json title="input-allOf.json"
{
  "$schema": "http://json-schema.org/draft-07/schema",
  "type": "object",
  "properties": {
    "kind": { "type": "string" },
    "request": {
      "type": "object",
      "allOf": [
        {
          "properties": {
            "kind": {
              "type": "object",
              "properties": {
                "kind": { "type": "string" },
                "version": { "type": "string" }
              }
            }
          }
        },
        {
          "properties": {
            "server": {
              "type": "object",
              "properties": {
                "accessNum": { "type": "integer" },
                "version": { "type": "string" }
              }
            }
          }
        }
      ]
    }
  }
}
```

We can see that `request` is an object with properties as indicated by the elements listed under `allOf`:

- contains property `kind`, which has properties `kind` and `version`
- contains property `server`, which has properties `accessNum` and `version`

The type checker finds the first error in the Rego code, suggesting that `servers` should be either `kind` or `server`.

```
input.request.servers.versions
              ^
              have: "servers"
              want (one of): ["kind" "server"]
```

Once this is fixed, the second typo is highlighted, informing the user that `versions` should be one of `accessNum` or `version`.

```
input.request.server.versions
                     ^
                     have: "versions"
                     want (one of): ["accessNum" "version"]
```

Because the properties `kind`, `version`, and `accessNum` all appear under the `allOf` keyword, the resulting schema that the data is checked against contains the types of these properties' children (string and integer).
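
Since the `allOf` subschemas are merged into a single type, a rule can refer to properties from both subschemas at once. The following is a sketch under that assumption; the threshold `3` is an arbitrary value chosen for illustration.

```rego title="policy-allOf.rego"
package kubernetes.admission

# METADATA
# scope: rule
# schemas:
#   - input: schema["input-allOf"]
deny if {
	# `kind.kind` comes from the first allOf subschema (a string).
	input.request.kind.kind == "Pod"

	# `server.accessNum` comes from the second allOf subschema (an integer).
	input.request.server.accessNum > 3
}
```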

### Remote references in JSON schemas

It is valid for JSON schemas to reference other JSON schemas via URLs, like this:

```json
{
  "description": "Pod is a collection of containers that can run on a host.",
  "type": "object",
  "properties": {
    "metadata": {
      "$ref": "https://kubernetesjsonschema.dev/v1.14.0/_definitions.json#/definitions/io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta",
      "description": "Standard object's metadata. More info: https://git.k8s.io/community/contributors/devel/api-conventions.md#metadata"
    }
  }
}
```

OPA's type checker will fetch these remote references by default.
To control the remote hosts schemas will be fetched from, pass a capabilities
file to your `opa eval` or `opa check` call.

Starting from the capabilities.json of your OPA version (which can be found [in the repository](https://github.com/open-policy-agent/opa/tree/main/capabilities)), add
an `allow_net` key to it: its values are the IP addresses or host names that OPA is
supposed to connect to for retrieving remote schemas.

```json
{
  "builtins": [ ... ],
  "allow_net": [ "kubernetesjsonschema.dev" ]
}
```

#### Note

- To forbid all network access in schema checking, set `allow_net` to `[]`.
- Host names are checked against the list as-is, so adding `127.0.0.1` to `allow_net`,
  and referencing a schema from `http://localhost/` will _fail_.
- Metaschemas for different JSON Schema draft versions are not subject to this
  constraint, as they are already provided by OPA's schema checker without requiring
  network access. These are:

  - `http://json-schema.org/draft-04/schema`
  - `http://json-schema.org/draft-06/schema`
  - `http://json-schema.org/draft-07/schema`

### Limitations

Currently this feature admits schemas written in JSON Schema but does not support every feature available in this format.
In particular the following features are not yet supported:

- additional properties for objects
- pattern properties for objects
- additional items for arrays
- contains for arrays
- oneOf, not
- enum
- if/then/else

A note of caution: overriding is a powerful capability that must be used carefully. For example, the user is allowed to write:

```
# METADATA
# scope: rule
# schemas:
#  - data: schema["some-schema"]
```

In this case, we are overriding the root of all documents to have some schema. Since all Rego code lives under `data` as virtual documents, this in practice renders all of them inaccessible (resulting in type errors). Similarly, assigning a schema to a package name is not a good idea and can cause problems. Care must also be taken when defining overrides so that the transformation of schemas is sensible and data can be validated against the transformed schema.

### References

For more examples, see the [opa-schema-examples repository](https://github.com/aavarghese/opa-schema-examples).
It contains samples for Envoy, Kubernetes, and Terraform, including the corresponding JSON Schemas.

See here for the [JSON Schema Reference](https://docs.solo.io/gloo-edge/latest/guides/security/auth/extauth/opa/).

For a tool that generates JSON Schema from JSON samples,
[please see here](https://app.quicktype.io/#l=schema)
([Other Tools](https://json-schema.org/tools?query=&sortBy=name&sortOrder=ascending&groupBy=toolingTypes&licenses=&languages=&drafts=&toolingTypes=data-to-schema&environments=&showObsolete=false&supportsBowtie=false)).

## Strict Mode

The Rego compiler supports `strict mode`, where additional constraints and safety checks are enforced during compilation.
Compiler strict mode is supported by the `check` command, and can be enabled through the `--strict`/`-S` flag.

```
-S, --strict enable compiler strict mode
```

### Strict Mode Constraints and Checks

| Name                     | Description                                                                                                                               |
| ------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------- |
| Unused local assignments | Unused arguments or [assignments](./policy-reference/#assignment-and-equality) local to a rule, function or comprehension are prohibited. |
| Unused imports           | Unused [imports](./policy-language/#imports) are prohibited.                                                                              |
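
As an illustration of the two checks in the table, the hypothetical module below compiles in the default mode but is expected to be rejected by `opa check --strict`. The package name, the `data.acl` import, and the `user` variable are made up for this sketch.

```rego
package example.strict

# Never referenced in this module: reported as an unused import in strict mode.
import data.acl

allow if {
	# Assigned but never read: reported as an unused local assignment in strict mode.
	user := input.user

	input.operation == "read"
}
```

Removing the import and the `user` assignment, or actually using them in the rule body, should make the module pass the strict checks.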

## Ecosystem Projects

<EcosystemEmbed feature="learning-rego">
Here are some projects that can help you learn Rego:
</EcosystemEmbed>
