Hapi

Hapi (short for Hierarchical Access Policy Implementation), is a specification language that lets users define access policies. In a distributed system, an access policy defines the actions that each actor can perform on the available resources. Such policies are important because they ensure the preservation of the integrity and the privacy of personal information that users might store into remotely accessed databases. As an example, the specification below indicates that Bob can read all resources, Interns can read emails and credit card numbers, but Alice, who is an intern cannot read emails.:

main = 
  DENY
  EXCEPT
    ALLOW {
      Actor = Bob
      Actions = Reads
      Resources
    }
    ALLOW {
      Actor = Intern
      Actions = Reads
      Resources = EMAIL, CCN
    }
    EXCEPT
      DENY {
        Actor = Alice
        Actions = Reads
        Resources = EMAIL
      }

Hapi is strongly based on Sen et al.'s Legalease. Legalease is a domain specific language, that supports the specification of hierarchies of access policies. Actors might have their rights either increased or decreased as we move down this hierarchy. Syntactically, the position of actors within this hierarchy is given by indentation; which like in Python, has semantic meaning in Hapi. Access hierarchies determine partial orders between actors and groups of resources. Thus, partial ordering emerges as a powerful metaphor to specify the access rights of groups and subgroups of users.

Hapi extends Legalease in several different ways. Mostly, the original specification of Legalease was not sufficiently explicit about the language's grammar and usability. Hapi fills a number of omissions along these directions. In this book, we provide the full grammar specification of Hapi, as well as a brief description of its semantics. Additionally, we provide an assortment of examples that will let users better comprehend how policies can be specified and enforced. Hapi also extends Legalease with a number of practical functionalities, namely, the ability to compose policies in multiple files, and a more concise syntax to specify conjunctions of access rights.

The Project Hapi is sponsored by Cyral Inc and developed at UFMG's Compilers Lab, which exists as part of the Department of Computer Science. This project is fully open source, and the interested users can find all our code base in publicly available repositories.

What is the project about?

Who's working on it?

Hapi is a project from -LAC-UFMG (The Compilers Lab). Currently, the project is developed by the following contributors:

  • Fernando Magno Quintão Pereira - Professor at UFMG;
  • Haniel Barbosa Rodrigues - Professor at UFMG;
  • Celso Junio Simões de Oliveira Santos - Undergraduate research assistant;
  • Flavio Lucio - Undergraduate research assistant
  • Vinicíus Julião - Undergraduate research assistant.

What are our goals?

The main goal of Hapi is to be an easy-to-use policy specification language, that offers users a simple and clear syntax to define the possible interactions between actors, actions and resources. Currently, access policies are programmed in data interchange formats such as JSON, YAML or XML. Such formats are employed in systems of vast importance, such as Amazon's AWS CloudFormation, Microsoft's Azure and Google's Firebase. Yet, it is our understanding that such formats, not having been conceived to this particular domain, could be replaced with more expressive specification languages. Such is the goal of the Hapi Project.

Technologies

Hapi is currently implemented as a combination of several different technologies. Hapi programs are translated to YAML specifications, which can be read and processed by different cloud computing systems. Parsing and translation is implemented in Kotlin. Together with this machinery, we distribute a testing framework, mostly implemented in Python. The policies described in Hapi are amenable to automatic verification. We have a number of such properties verified automatically through CVC4, or through customized algorithms implemented in Python.

Language

This section contains a definition of Hapi. This presentation is organized in three parts: syntactic specification, semantic specification and examples. Hapi is not Turing Complete; hence, it is not meant to describe general computations. Instead, Hapi is a declarative formal language that lets users specify access policies. A program (or an access specification, as programs are called) determines what are the actions that each user can perform on each resource. These three data types that constitute a program: actors, actions and resources, are defined via an special structure called a concept lattice. Access specifications are organized hierarchically: lower level specifications might either increase or reduce the accesses of upper levels.

The Concept Lattice

A Hapi specification manipulates three data types:

  • Actors: entities, like users or groups of users, who can perform actions on resources.
  • Actions: the different ways in which actors can actuate on resources.
  • Resources: the data that can be accessed by different actors via actions.

These types are organized in a data-structure called a concept lattice. A concept lattice is a lattice, that is, an abstract data structure formed by a set, plus a partial order between the elements of the sets. We define a partial order as follows:

  • Given a set S, a partial order is a reflexive, transitive and antisymmetric relation that exists between some pairs of S.

With this in mind, we can now define our concept lattice:

Lattices have another property: every pair of elements has a minimum upper bound, and a maximum lower bound within the lattice. If there are two elements in this set, say, E1 and E2, then the lattice also contains an element E that is greater than E1 and E2. The smallest of these elements is called the least upper bound of E1 and E2. There is also an element E' that is less than E1 and E2. The greatest of these elements is called the greatest lower bound of E1 and E2.

It is possible to visualize lattices as graphs. Below we show three lattices that we use to define data types:

Lattices are fundamental to Hapi because they support the representation of orderings between the values of each attribute that is used in a program. If an attribute A1 is less than another attribute A2, then A1 either expands or constraints the access requirements of A1.

Data Dependency Graph

HAPI policies are checked at each node in the data dependency graph. Each graph node is labeled with the domain-specific attribute name and set of lattice values. Informally, an ALLOW clause permits graph nodes labeled with any subset of the attribute values listed in the clause, and a DENY clause forbids graph nodes labeled with any set that overlaps with the attribute values in the clause. The layering of clauses determines the context within which each clause is checked. For example, the graph below models a scenario where the actor Analyst is reading the resource IP.

[{'Actors': {"Analyst"}, 'Resources': {"IP"}, "Actions": {"Reads}}]

Grammar

The specification bellow is HAPI's full grammar defined in Extended Backus-Naur Form (EBNF):

program: library | executable ;

library
  : importStmt* exportStmt
  | exportStmt
  ;

executable
  : importStmt* letStmt+
  | letStmt+
  ;

letStmt: ID '=' NEWLINE? INDENT? policyExpr DEDENT? ;

exportStmt: 'export' ID 'where' letStmt+;

importStmt: 'import' ID ;

policyExpr: denyExpr | allowExpr ;

denyExpr
  : 'DENY' exceptAllow NEWLINE?
  | 'DENY' attributeExpr exceptAllow? NEWLINE? ;

allowExpr
  : 'ALLOW' exceptDeny NEWLINE?
  | 'ALLOW' attributeExpr exceptDeny? NEWLINE? ;

exceptAllow: NEWLINE 'EXCEPT' NEWLINE INDENT allowExpr+ DEDENT ;

exceptDeny: NEWLINE 'EXCEPT' NEWLINE INDENT denyExpr+ DEDENT ;

attributeExpr
  : '{' attribute+ '}'
  | literalExpr
  ;

literalExpr: ID ('::' ID)? ;

attribute
  : ID '=' value (',' value)*
  | ID
  ;

value: ID;

// <TOKENS>

ID:    [a-zA-Z0-9]+ ;

NEWLINE
 : ( '\r'? '\n' | '\r' | '\f' ) SPACES?
 ;

SKIP_
 : ( SPACES | COMMENT )
 ;

SPACES
 : [ \t]+
 ;

COMMENT
 : '//' ~[\r\n\f]*
 ;

A program's entry point is the program rule which may either define a library or an executable specification. These two keywords distinguish the programs that are executable from programs that just exports policies to be used somewhere else. The former contains a main clause to start the evaluation, whereas the latter contains an export statement at the top level.

Examples

This section provides examples of Hapi specifications that illustrate how the language's syntactic constructs work in practice.

Basic Syntax

The program below is supposed to illustrate a simple policy that denies all accesses but allows for a specific group called Analyst to perform Reads in the Email and IP resources.

main =  
  DENY
  EXCEPT
    ALLOW {
      Actors = Analyst
      Resources = Email, IP
      Actions = Reads
    }

Indentation

The EXCEPT clauses are indentation sensitive. The next example illustrates this property:

1   main =  
2     DENY
3     EXCEPT
4       ALLOW { ... }
5       EXCEPT
6         DENY { ... }
7         EXCEPT
8           ALLOW { ... }
9     EXCEPT
10      ALLOW { ... }

In the policy above, the EXCEPT clause at line 9 belongs to the DENY clause at line 2 instead of to that at line 6. We might as well change it if we want the opposite:

1   main =  
2     DENY
3     EXCEPT
4       ALLOW { ... }
5       EXCEPT
6         DENY { ... }
7         EXCEPT
8           ALLOW { ... }
9         EXCEPT
10          ALLOW { ... }

Exporting and Importing

The next example demonstrates how we can export policies in a file and then incorporte them in another one.

// Main.lgl

import Privacy

main =  
  DENY
  EXCEPT
    ALLOW Privacy::analystActions


// Privacy.lgl

export Privacy where

analystActions =
  ALLOW {
      Actors = Analyst
      Resources
      Actions
    }

Each file defines a namespace. If a file doesn't export a module it is then declared (implicitly) as the Main module. On the other hand, if it is marked as export ... where statement, then it export a module defining a namespace. File names must be the same as module names. Also, they must be in the same folder of the file that imports them. So, in the example above we have two files ("Main.lgl" and "Privacy.lgl") and they must be in the same folder (say "src" for instance).

Unicity and Totality

These are two important properties to our language. They will be better explained in the verification section, but here we intend to give a general intuition on how these properties are guaranteed by the syntax of our language.

  • Totality: For each vector of sets of lattice elements T , and policy clause C, C allows T or C denies T.
  • Unicity: For each vector of sets of lattice elements T , and policy clause C, C allows T and C denies T , are not both true.

The first one is guaranteed by the restriction that every main clause must have a "default" policy clause at the top.

// example 1
main =  
  DENY
  EXCEPT
    ...

// example 2
main =  
  ALLOW
  EXCEPT
    ...

This policy tells us that by default everything is denied (example 1) or allowed (example 2). And then we can specify further rules in the EXCEPT clause.

As for unicity, suppose we have two policies denyAnalyst and allowAnalyst. These policies alone does not really mean anything, we need to combine them in the program's main execution. So we define a main clause to be our entry point.

denyAnalyst = DENY {
                Actors = Analyst
                Resources
                Actions
              }

allowAnalyst = ALLOW {
                Actors = Analyst
                Resources
                Actions
              }

main= ALLOW
      EXCEPT DENY denyAnalyst
      EXCEPT
        DENY denyAnalyst
        EXCEPT ALLOW allowAnalyst

In this example I chose (without loss of generality) it to be a default ALLOW clause. Its semantics says that its conditions must be valid and all of its EXCEPT DENY clauses are valid as well (this will be better explained in the semantics section). There's only one way for the policy EXCEPT DENY denyAnalyst to be valid and that is by denying all Analyst's access. You might now wonder "What about the other one? Isn't it supposed to be valid as well?". In this case, it will be valid because of the semantics of the DENY clause. This is going to be further discussed in the semantics section, but for now, all we need to know is that this structure of nested verification ensures unicity by construction (for a formal proof see verification section).

As you may have noticed, these examples are simple, yet an explanatory way to describe HAPI's grammar. In the next section we're going to dive into the semantics of the language together with the implementation details.

Semantics

We now present the formal semantics of our language:

TC and TG are vector of sets of lattice elements, representing respectively a clause during policy evaluation and the label of a node in the data dependency graph. As we've seen before, the data dependency graph is just a graph to show the relation between the values of lattice elements (things will be clearer in the examples).

The partial order TG ⊑ TC is defined as:

  • \(TG ⊑ TC = ∀ x . TG_x ⊑_x TC_x\)
  • \(TG_x ⊑_x TC_x = ∀ v ∈ TG_x ∃ v' ∈ TC_x . v ≤_x v'\)

Where \(≤_x\) is the partial order associated with the lattice of attribute \(x\).

We also define an intersection operator \(⨅_x\):

  • \(TG_x ⨅_x TC_x = { ∀ v ∈ TG_x v ∧_x v' | v' ∈ TC_x }\)

Where \(∧_x\) is the meet operator for the attribute \(x\).

Last we define the notation:

  • \(⊥ ∈ TG ⨅ TC = ∃ x . ⊥ ∈ TG_x ⨅_x TC_x\)

Basically we want to check if there's an attribute \(x\) such that \(v ∧_x v' = ⊥\) for every \(v ∈ TG_x\) and \(v' ∈ TC_x\).

This whole thing might sound too abstract, so let’s look at some examples to clarify.

Examples

This is a walk-through example over the following policy:

main = 
  DENY
  EXCEPT
    ALLOW {
      Actors = Analyst
      Resources = EMAIL
      Actions = Reads
    }
    EXCEPT
      DENY {
        Actors = Bob
        Resources = EMAIL
        Actions = Reads
      }

It will be represented as a triple (operation, attributes, except_list):

clauses = [
   ("DENY", {'Actors': {"Top"}, 'Resources': {"Top"}, 'Actions': {"Top"}}, [
       ("ALLOW", {'Actors': {"Analyst"}, 'Resources': {"ip"}, "Actions": {"reads"}}, [
           ("DENY", {'Actors': {"Bob"}, 'Resources': {"ip"}, "Actions": {"reads"}}, [])
       ])
   ])
]

And the following data dependency graph with only one node:

[{Actors: {“Bob”}, Actions: {“Reads”}, Resources: {“EMAIL”}}]

That means, we want to answer the following query: can Bob perform Reads operation in the EMAIL resource?

Well, the first clause is an empty DENY (denies everything), so we must look if at least one of its except clauses allow our node.

For the EXCEPT ALLOW clause, Bob ≤ Analyst, EMAIL ≤ EMAIL and Reads ≤ Reads (look at the lattices), so OK! Then it is allowed right? No, we have to check if all of its deny clauses are valid as well.

For the last EXCEPT DENY clause, Bob ∧ Bob = Bob, EMAIL ∧ EMAIL = EMAIL, Reads ∧ Reads = Reads, therefore \(⊥ ∈ TG ⨅ TC\) is false and our policy result denies our node.

In other words, Bob can’t read emails!

Let’s try another node now, what about this one:

[{Actors: {“Alice”}, Actions: {“Reads”}, Resources: {“EMAIL”}}]

That means can “Alice” perform “Reads” operation in the “EMAIL” resource?

For the first DENY it’ll be the same.

In the EXCEPT ALLOW clause, Alice ≤ Analyst, EMAIL ≤ EMAIL and Reads ≤ Reads, OK! So, let’s check our DENY clause.

EXCEPT DENY, Alice ∧ Bob = ⊥ … wait a moment, we already have it! I doesn’t really matter the other attributes because jus by checking this one we know that \(⊥ ∈ TG ⨅ TC\). So our last clause is also valid in our node.

That means that Alice can read emails!

Examples

In this section we aim to present some examples of policies defined in natural language and how they can be translated into HAPI.

Read-only access

"Bob can read all resources, Alice can read and update emails, but only read credit card numbers."

main = 
  DENY
  EXCEPT
    ALLOW {
      Actor = Bob
      Resources
      Actions = Reads
    }
    ALLOW {
      Actor = Alice
      Resources = EMAIL, CCN
      Actions = Reads
    }
    ALLOW {
      Actor = Alice
      Resources = EMAIL
      Actions = Updates
    }

Add a default rule

"Everyone can read emails. Bob can read and delete all data but only updates credit card numbers. Alice can perform all actions in emails"

main = 
  DENY
  EXCEPT
    ALLOW {
      Actor = Bob
      Resources
      Actions = Reads, Deletes
    }
    ALLOW {
      Actor = Bob
      Resources = CCN
      Actions = Updates
    }
    ALLOW {
      Actor
      Resources = EMAIL
      Actions = Reads
    }
    ALLOW {
      Actor = Alice
      Resources = EMAIL
      Actions
    }

Apply the same rules to a group of users

"Bob, Alex and Jeff can read all resources, but Alex can't read emails."

alexCantReadEmails =
  DENY {
        Actors = Alex
        Resources = EMAIL
        Actions = Reads
      }

main = 
  DENY
  EXCEPT
    ALLOW {
      Actor = Bob, Alex, Jeff
      Resources
      Actions = Reads
    }
    EXCEPT
      DENY alexCantReadEmails

Apply rule to a service/group identified by its name

"Operating Cost Predictor service can read all the data."

main = 
  DENY
  EXCEPT
    ALLOW {
      Actor = OperatingCostPredictor
      Resources
      Actions = Reads
    }

Here we're applying rules to a group of values in our lattice. In this case, the service OperatingCostPredictor may be composed of members such as Alice, Bob and Jeff. Then the rules applied to OperatingCostPredictor will also be applied to the three of them.

Combining rules to services/groups with intersections

"Operating Cost Predictor service can read all the data but Intern can't read sensitive ones."

internDontAccessSensitiveData =
  DENY {
        Actors = Intern
        Resources = Sensitive
        Actions
      }

main = 
  DENY
  EXCEPT
    ALLOW {
      Actor = OperatingCostPredictor
      Resources
      Actions = Reads
    }
    EXCEPT
      DENY internDontAccessSensitiveData

In this context, let's say that the group Intern is made of Bob and Jeff. And the OperatingCostPredictor service contains Alice and Jeff. Therefore, Alice will be able to read all data, but Jeff, because he is an intern, won't be able to read sensitive data.

Verification

This section is divided in two subsections. They aim to answer three simple questions about the properties verification integrated into the Hapi tool: What, Why and How.

We present what are the checks made by the Hapi tool, as well as why we decided that such checks should be made in an Access Policy system. A simple example is given motivating the application of the verification.

We also explain how the checks are made in our tool, giving a brief description of them, an use case example and the implementation of the code to perform such verification, providing at the end the result of the execution of the verification, based on in the example given.

The current properties that can be verified so far are:

  • Collision
  • Permissivenes

The configuration files and the tools needed to perform these checks with Hapi, are all described in the Testing Section.

What should be verified?

We want the policies implemented with the Hapi tool to be Consistent and Secure. By Consistency we mean that we want the policies to be correct, running without errors; and by Security we mean that we want the policies to represent the real intentions of the user who implements them.

In order to ensure that the policies implemented in Hapi comply with the above, we verify two properties: collision and permissiveness, defined below:

Definitions:

  • Collision: Each sensitive data must be governed by a single policy, (i.e., no two policies should overlap over sensitive data).

  • Permissivenes: given two similar policies, they are compared to determine the relationship and degree of permissiveness between them, that is, which policy allows a greater number of users to access certain resources.


Consistency

Consistency is one of the most important properties that an Access Policy System must have, and with it we want to ensure that the policies created here are running, individually or together, correctly. For that, the policies need to be in agreement with each other, without Conflict restrictions, or Redundancy.

To be consistent, access policies need to satisfy 3 properties: Totality, Unicity and Collision. As explained in Section 2.2 Grammar , the Hapi language ensures the properties of 'Totality' and 'Unicity' by construction. Therefore, regarding consistency, it remains to verify the property of Collision.

The verification of Collision aims to ensure that two different policies (in separeted files) will deal with different sensitive datas, thus avoiding redundancy or conflicts between different policies. For example, in a case where access to a resource could be guaranteed in one policy but is denied in another:

#Policy_A                               |#Policy_B
----------------------------------------|---------------------------------------
data: [EMAIL, CCN]                      |data: [EMAIL, CCN]
rules:                                  |rules:
  - identities:                         |  - identities:
      users: Alice                      |      users: Alice
      Updates:                          |      Updates:
        data: [CCN, EMAIL]              |        data: [CCN]
----------------------------------------|---------------------------------------
Note that in Policy_A Alice can Update the Resource EMAIL, but in Policy_B, she can't.

This scenario cannot exist. And even if the policies did not conflict, i.e., both policies allowing Alice to perform the Updates action under Resource EMAIL, this would be redundant, which should also not occur in a consistent set of access policies.

Moreover, this is a property that our client, Cyral, determined that should be guaranteed, as can be seen in the following note:

Cyral Note about 'Collision'


Security

Although access policies are developed precisely for the security of a system's data, the user who creates these policies may, at any given time, make a mistake or may lose control of a policy as time passes and the number of associated users to your system increases.

In order to assist the user when it comes to remodeling a policy aiming at a more restrictive system, we have incorporated into the Hapi tool, a Permissiveness check between two Policies.

The verification of Permissiveness seeks to warn the user if an access Policy_A is less-equally Permissive than an access Policy_B:

#Policy_A                               |#Policy_B
----------------------------------------|---------------------------------------
main =                                  |main =
  DENY                                  |  DENY
  EXCEPT                                |  EXCEPT
    ALLOW {                             |    ALLOW {
      Actors = Professor, Assistants    |      Actors
      Resources                         |      Resources
      Actions                           |      Actions
    }                                   |    }
    ALLOW {                             |    EXCEPT
      Actors = Students                 |      DENY {
      Resources = Exam.pdf              |        Actors = Students
      Actions = Reads                   |        Resources = Answers.pdf
    }                                   |      }
----------------------------------------|---------------------------------------
Note that Policy_B prohibits Students from accessing Exam Answers (`Answers.pdf`)
however, they can access any other sensitive data that is in that System. In
addition to the fact that access was granted to all actors in the system
(`Actors <null>`), which can give access to users other than teachers,
assistants and students.

When comparing these Policies, it will be returned that Policy_A is less permissive than Policy_B and, therefore Policy_B does not represent the real intentions of the user who creates the Policies. Policy_A being the least permissive, it is the ideal model for a Restrictive system; that in the item Security, is what is expected.

Helping the user to create a policy system as restrictive as possible is the ultimate goal of creating this verification.

How is it being verified?

All of our checks are integrated into our Tests framework. Currently they are executed manually: the user needs to determine, via command line, the file under which verification will be performed. Automating this process is a work in progress.

We use the CVC4 SMT Solver for the Permissivenes check. The check has the following code snippet in its implementation:

    task_command = ["gradle", "generateCVC4",
    "--args='../../test/fixtures/legalease/configs/cvc4_translator.config.yaml'"]
    subprocess.check_call(task_command, shell=True, cwd='../src/legalease'))*

This code snippet corresponds to the command given to the system so that our translator can be executed over the Legalease policy files, passed as verification parameters. In addition, every CVC4 verification check needs two special imports:

  • PyCVC4 - A python API for the CVC4 SMT solver.
  • CVC4_Statements - A file containing general utilities (definitions of actors, resources, actions ...) to deal with access policies created in Legalease.

How does it happen?

To better explain how verification occurs, we will present examples of use cases for each property being verified and then describe how the code will be executed for this example, according to the implementation of the code.

Note: To run the "Pre-builted execution tests" you need to be on the '../hapi/test' folder

Permissiveness:

Given two policies (A and B), all possible input queries are determined, with the help of the CVC4 SMT solver. Each set of allowed queries is then added to a corresponding list. Both lists are compared according to their items. It follows that: if all items in policy A are contained in policy B, but not all items in policy B are contained in policy A; A is less permissive than B.

To simulate an use-case to this check, imagine the scenario of a fast-growing company:

Every day new employees are hired, new actors are created and added to groups of the company and, consequently, these new actors receive access to various files of the Company, thanks to their association with such groups.

With such rapid growth, regulating and controlling that each added actor has the proper permissions can be a complicated task and therefore failures can occur in the process.

Thinking mainly of a hierarchical class system: New actors can be added to Subgroup C (red circle), which in turn will be associated with a Subgroup B (gray circle), which is associated with group A (black circle).

Exemplo_1

#Alpha_policy
...
ALLOW {
  Actors = Group_A
  Actions = Reads, Updates, Deletes
}

These new actors will inherit all permissions given to group A, which can cause a leak of sensitive data.

Identifying this situation, the user of the HAPI Tool can reformulate the policy that grants access to group A:

#Beta_policy
...
ALLOW {
  Actors = Group_A
  Actions = Reads, Updates, Deletes
}
EXCEPT
  DENY {
    Actors = Group_B
    Actions = Reads, Updates, Deletes
  }

When performing the test using as parameters: 'Alpha Policy (old) -> Beta Policy (new)', it will be reported that the Beta Policy is less permissive than the Alpha Policy:

Example_Permissivenes_Result

That is, Alpha -> Beta = False (Alpha does not implies Beta) but, Beta -> Alpha = True (Beta imply Alpha), which was expected. The user will then choose to use the Beta Policy at the end.

Permissiveness verification is done using two auxiliary functions that were created within the verification file itself. They are:

  • The function Get_list(Policy):
def Get_list(Policy):
 slv.assertFormula(Policy)
 Models_list = []

 while slv.checkSat()._name == 'sat':
   Models_list.append((slv.getValue(Actors), slv.getValue(Action), slv.getValue(Resources)))
   slv.assertFormula(slv.mkTerm(kinds.Or,
                   Actors.eqTerm(slv.getValue(Actors)).notTerm(),
                   Action.eqTerm(slv.getValue(Action)).notTerm(),
                   Resources.eqTerm(slv.getValue(Resources)).notTerm())
             )
 slv.resetAssertions()
 return Models_list

Function that receives an access policy defined with the CVC4 API and obtains all combinations of Actor, Resource, and Action that satisfy the policy, that is, all accepted input queries; returning a list.

  • The function Check_permissiveness(list1, list2):
# check if list2 contains all elements of list1
   # list1 -> list2?
   implication_1 =  all(elem in list2  for elem in list1)

   # check if list1 contains all elements of list2
   # list2 -> list1?
   implication_2 =  all(elem in list1  for elem in list2)

   #Conclusion:
   if (implication_1 == True) and (implication_2 == False):
       return "The First Policy is less permissive than the Second"
   elif (implication_1 == False) and (implication_2 == True):
       return "The Second Policy is less permissive than the First"
   elif (implication_1 == True) and (implication_2 == True):
       return "The two Policies are equally permissive"
   else:
       return "The test doens't aply. There's different accepted models in each Policy"

The one that receives the two lists obtained from the policies to be compared and checks whether list_1 implies list2, that is, whether all elements of list_1 are contained in list_2, and the opposite too (list_2 implies list_1).

In the case of our example, Alpha -> Beta, we will have that the implication_1 will be false so, not all accesses allowed by Alfa are allowed by Beta. And the implication_2 will be true (all accesses allowed by Beta are allowed by Alpha). The end result is that "The Second Policy(Beta) is less permissive than the First(Alpha)".

Pre-built execution test: python main.py -cvc4_permissivenes_test None

Collision:

All the policies, that are already created, needs to be specified to execute this verification. Then, the datas treated in each policy are stored in sets. We then check whether the sets are disjoint or not. If they are not, there will be an intersection between different sets, and respective policies. There is our Collision. Actually theres only one way to verify the Collision, that is manually. Making this verification occurs automatically, where for each new Policy created in Hapi the test is executed, it is a future project.

User X was given the task of defining Resource access: ID, PRODUCTS, EMAIL for Actors Alice and Bob. In user X's system there are hundreds of defined Policies that work perfectly. A Policy A that defines Resources: SSN, EMAIL and CCN already exists in the system, expressed as follows:

data: [SSN, EMAIL, CCN]
rules:
  - identities:
      users: Alice
      Updates:
        data: [CCN, EMAIL]
      Deletes:
        data: [CCN, EMAIL]
  - identities:
      users: Bob
      Updates:
        data: [CCN, EMAIL]
      Deletes:
        data: [EMAIL]

However, not knowing about the existence of such a policy, and that it already governed what happened with Resource EMAIL, user X decides to create a new Policy B to define what happens with all the Resources requested in the task. This new policy being expressed as follows:

data: [ID, PRODUCTS, EMAIL]
rules:
  - identities:
      users: Alice
      Updates:
        data: [PRODUCTS, ID]
      Deletes:
        data: [PRODUCTS, ID]
  - identities:
      users: Bob
      Updates:
        data: [PRODUCTS, EMAIL]
      Deletes:
        data: [PRODUCTS]

As you can see, both Policies A and B define different access restrictions under Resource EMAIL, which can generate errors in the system since the Policies overlap. With the collision verification being performed, the user will receive a warning as it follows:

Collision_result_example

Indicating the conflicting Resource: EMAIL; and also the two Policies in which this Resource is defined: Policies A and B. Then he will need to correct this error if he wants to maintain the Consitency of the system.

The verification of the Collision property occurs with multiple YAML Policy files being passed as a reference in the Verification Call.

Within the plugin, the framework reads every policy and stores all Resources (data) that are treated in each one in different Sets (One set for each Policy):

#...While there is policies...

with open("fixtures/yaml/first-deny.yaml", "r") as ymlfile: #Opening the policy
      cfg = yaml.load(ymlfile)

  Policy_name_1 = set(cfg["data"]) #Getting the defined Resources

Finally, all the Sets containing the Resources that are treated in each policy, are added to a list. In this way, we obtain a list of Sets.

In the second moment, we used two Loops while, to compare all the sets stored in the List. If two sets are 'disjunct', that is, without similar elements, the loop advances to compare the next set:

while iterator < len(list_of_policiess):
    aux = iterator + 1
    while aux < len(list_of_policiess):

        if not list_of_policiess[iterator].isdisjoint(list_of_policiess[aux]):
            overlapped_resources = list_of_policiess[iterator].intersection(list_of_policiess[iterator+1])
            collision_boolean = True
            break
        aux += 1

    if collision_boolean == True:
        break
    iterator += 1

If there are similar elements in two different sets: the loops are closed, the names of the Policies corresponding to each Set are stored in an auxiliary variable, and these two sets are intersected, that will return the similar elements.

After the loop, an auxiliary variable called collision_boolean is used to indicate whether a 'collision' has occurred or not. Being 'True', the test returns the Conflict Policies and Resources governed by both simultaneously. Being 'False', the test passes, telling the user that there are no conflicting policies.

Pre-built execution test: python main.py -collision_verifier None

Updates and improvements are being made so that the use of these features will be easy and intuitive, for users of the Hapi tool.

Testing

Hapi is distributed together with a test framework that eases its development and its usage. This framework lets Hapi users and developers validate if implemented functionalities are behaving as expected. The framework includes a suite of default tests that take a parametrization file to configure the execution environment. New testing patterns can be added to the framework by its users. These new tests can either be implemented from scratch or they can reuse libraries already distributed with Hapi.

Hapi's testing framework is strongly inspired by the LLVM Testing Regression Infrastructure which offers an user interface as plugins. Each plugin has an initial implementation to configure the system attributes and to output test results.

What is supposed to be tested?

Tests are used to validate the tools distributed with Hapi. In other words, most of the tools and services available within the Hapi system are accompanied by a corresponding testing pattern. Hapi's development is strongly based on a test-driven methodology. Currently, the Testing Framework checks the following tools:

  1. Hapi -> YAML translator.
  2. Permissiveness validator
  3. Verification algorithm.

Translation is tested by calling the methods that translate Hapi specs into YAML. The YAML follows the schema requirements for policy files internally adopted at Cyral Inc. Verification of this phase happens through a linter, made available by Cyral Inc.

Permissiveness is verified through CVC4. This formal verifier is used to compare two different policies. In this case, what is really tested are the Python files automatically produced by CVC4 to check some property of policy. To this end, the testing framework relies on the PyCVC4 library to read and process CVC4 specifications.

The last default test distributed with Hapi checks its interpreter. The inputs, in this case, are the policy rules and a query that will be checked against these rules. A query checks the possibility of a given Actor X performing an certain action Y on some resource Z. Testing returns a binary response reporting if the access is allowed or denied.

Install

Hapi provides an auto install script, which checks if the requirements to use the test framework are present. When that is not the case, some applications will be installed automatically. To install the testing framework, once in the hapi/test/ directory type:

foo@bar:~/hapi/test$ sudo sh autoinstall.sh

How does it work?

Since the Testing Framework was written in Python, the use of this structure consists of creating scripts which have a concise pattern based on execution and printing result. Therefore every plugin should implement two mandatory methods that are called by the main function of our system. Moreover, given the need to parametrize the test, it is also possible to catch a file in JSON or YAML format which will be instantiated as a dict or a list in Python code.

Workflow

The test execution is made by:

foo@bar:~/hapi/test$ python3 main.py -<plugin_name> <config_file>

The following image shows how the execution flow of a test works with its plugin. Exemplo_1

Some steps of the flow requires attention:

  1. In this point, it is checked if the plugin will receive a parametrization or not. When the plugin does not demand a configuration, so the <config_file> should be "None". Else the Test superclass will read the JSON/YAML file making the data available.
  2. The run() method is responsible to call the system and execute the tools previously developed by the HAPI team. All response messages are taken when the calls are running.
  3. Once the messages are stored, now the plugin should manipulate the messages and print them.

YAML Lint

This is a plugin previously added in our structure. The main functionality is to check whether an YAML file containing policy rules is according to the Cyral schema. To do that the plugin uses [PAJV][1] validator which, given a mount schema, checks if an JSON/YAML follow the schema specification. This script test requires a configuration file as shown below:

schema: ../datamap/schemas/policy.json
policies:
  - policy-1.yaml
  - policy-2.yaml 

It is good o remember that all file paths in the configuration file are relative to the <config_file>. Using the workflow to explain this script, we have run calling the PAJV for each policy file with the same schema. In each iteration the response is stored in a class attribute and at the end the result method formats and prints the response message in the standard output. [Here][2] is shown how to execute this test.

[1]: https://github.com/json-schema-everywhere/pajv/ [2]: http://cuda.dcc.ufmg.br/hapidoc/testing/examples.html#yaml-lint

Examples

In this point we show execution examples of the kinds of tests that our system has. In general, the "How does it works" section already shows the operation of each plugin, so here we will focus on how to produce a valid test. In short, since the configuration files are correctly written, the test will be executes successfully.

YAML Lint

To exemplify the test operation we will consider the following directory structure:

hapi
└── test
    ├── plugins/
    ├── main.py
    └── fixtures
        ├── configs
        │   └── config-file.yaml
        ├── schemas
        │   └── policy-schema.json
        ├── valid-policy.yaml
        ├── invalid-policy.yaml
        └── not-yaml-policy.yaml

Now, with the config file hapi/test/fixtures/configs/config-file.yaml it is valid to say that all file references in config-file.yaml is a relative path. In other words: How the config-file.yaml is in hapi/test/fixtures/configs/, then the paths specified by this file, must be considering the base directory is the same as the config. Thus, our parametrization file is:

schema: ../schemas/policy-schema.json
policies:
  - ../valid-policy.yaml
  - ../invalid-policy.yaml
  - ../not-yaml-policy.yaml

To execute this plugin, type:

foo@bar:~/hapi/test$ python3 main.py -cyral_yaml fixtures/configs/config-file.yaml

To improve the processing of the test results, the output after the execution is given in YAML format. In this way the output has a good legibility for persons and machines.

How to implement a new test?

Since the Testing Framework requires the existence of a plugin and possibly a parametrization file, here we will show how to create two different kinds of plugins. Both plugins are just an "Hello World" example where the first does not demand configuration file, otherwise the second needs a valid file for parameterization.

Hello World

In this first example, the config file is not needed and the presented code is only a demo to show the requisites and functionalities for implementing a new plugin. We must create a new file named as helloworld.py inside the hapi/test/plugins directory.

from testable import Testable
from testable import Status

class Test(Testable):
  def __init__(self):
    Testable.__init__(self)

  def run(self):
    self._state['result'] = "Hello World"
    self._state['status'] = Status.VALID

  def result(self):
    print(self._state)

To execute helloworld.py, in the hapi/test/ directory type:

foo@bar:~/hapi/test$ python3 main.py -helloworld None

The None word denotes that the plugin does not use a configuration file.

There are three mandatory steps on the new plugin development that not demands an initial file to parametrize the test:

  1. Extend the testable.Testable class with a new class named Test.
  2. Implement the abstract method run().
  3. Implement the abstract method result().

The goal is to implement completely the test responses made in the run() stage and the result() will be used just to manipulate the output.

Hello Folks

In the second plugin, we show how to use a configuration file for structuring a more elaborate test script. Here a YAML file (config.yml) is used, however the Testing Framework also supports JSON format.

Once config.yml is in hapi/test/fixtures/configs and contains an array with some names, the response of our test will be given as Hello <name> for each item listed:

- Anna
- Bernard
- Curry
- Daniel
- Ernest

Our plugin is also inside hapi/test/plugins and this one is named as hellofolks.py as well as the helloworld.py. So we have the following code:

from testable import Testable
from testable import Status
import yaml, sys

class Test(Testable):
  def __init__(self):
    Testable.__init__(self)

  def run(self):
    self._state['result'] = {}
 
    # self._config: config file
    for name in self._config:
        local_result = {
            'message': 'Hello '+name,
            'status': Status.VALID
        }
        self._state['result'][name] = local_result

    self._state['status'] = Status.VALID

  def result(self):
    yaml.dump(self._state['result'], sys.stdout, default_flow_style=False)

Two main point that we should pay attention:

  1. The testable.Testable._config attribute contains the config.yaml data, then as the hellofolks plugin extends this class, it can catch this data accessing self._config.
  2. In result() the formatting is optional by the developer side, so in this example we have decided to print the result with the YAML format.

Lastly the execution is given by:

foo@bar:~/hapi/test$ python3 main.py -hellofolks fixtures/configs/config.yml

Contributors

Celso Junio Simões de Oliveira Santos - celso.simoes@dcc.ufmg.br

Fernando Magno Quintão - fernando@dcc.ufmg.br

Flavio Lucio Correa - flavio.correa@dcc.ufmg.br

João Saffran - <>@dcc.ufmg.br

Haniel Barbosa Rodrigues - hbarbosa@dcc.ufmg.br

Vinicius Julião - viniciusjuliao@dcc.ufmg.br