Authentication and authorization procedures for Grid and Virtual Observatory. Definitions and Guideline
This is a live document!
We define and describe the concepts and implementation mechanisms of authentication and authorization (A&A). We present the state of the art regarding digital certificates and certificate authorities, and the Grid point of view on A&A. We identify the key points for building an interoperable system and propose a prototype A&A architecture for the Euro-VO DCA.
A&A mechanisms provide a way for a user to connect to and use a resource.
It is easy to confuse the mechanism of authentication with that of authorization. In many host-based systems and client/server systems, the two mechanisms are performed by the same physical hardware and, in some cases, the same software. However, they are two distinct mechanisms and may be performed by separate systems.
Authentication focuses on establishing a person's identity, based on the reliability of the credential he or she offers. Authentication systems depend on some unique piece of information known (or available) only to the individual being authenticated and to the authentication system: a shared secret. Such information may be a classical password, a digital certificate, or some biometric property. To verify the identity of a user, the authenticating system typically challenges the user to provide this unique information (password, certificate, etc.). If the authenticating system can verify that the shared secret was presented correctly, the user is considered authenticated.
Authorization focuses on what actions that identity, at that level of assurance, is permitted to perform. For example, a UNIX system may be configured to allow some users to read and write only a particular disk area or to execute only one application (e.g. a bastion host may be configured so that users may only execute ssh).
Authorization systems answer the following questions:
- Is user A authorized to access resource X?
- Is user A authorized to perform operation Y?
- Is user A authorized to perform operation Y on resource X?
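The three questions above can be sketched as lookups against a single table of grants. This is only an illustration: the `Policy` class, the user name and the resource path are hypothetical and do not correspond to any Grid or VO middleware.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Toy policy: maps (user, resource) to the set of permitted operations."""
    grants: dict = field(default_factory=dict)

    def allow(self, user, resource, operation):
        self.grants.setdefault((user, resource), set()).add(operation)

    def can_access(self, user, resource):
        # Is user A authorized to access resource X (in any way)?
        return bool(self.grants.get((user, resource)))

    def can_perform(self, user, operation):
        # Is user A authorized to perform operation Y (on any resource)?
        return any(u == user and operation in ops
                   for (u, _), ops in self.grants.items())

    def can_perform_on(self, user, operation, resource):
        # Is user A authorized to perform operation Y on resource X?
        return operation in self.grants.get((user, resource), set())


policy = Policy()
policy.allow("alice", "/data/archive", "read")  # hypothetical grant
```

Keeping the three checks separate mirrors the text: a service may only need the coarse question, while a fine-grained service asks the third one.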
Authentication and authorization are tightly coupled mechanisms. Decisions concerning authorization are, and should remain, the purview of the owner of the electronic service process.
Authentication with passwords
Conceptually, the simplest way to achieve single sign-on (SSO) is for all services in the VO to use the same password file. Each message for which authentication is needed then carries the user's password to the service. This scheme is attractive at first sight but has three major drawbacks.
- The administration of the shared password file is unwelcome when a few sites are connected and infeasible when 100 or more are involved.
- The compromise of one node exposes the credentials of the whole VO.
- Shared passwords are extremely hard to achieve and manage when different authentication mechanisms are involved.
Therefore, shared password systems are not really suitable for the VO.
Authentication with Digital Certificates
Digital certificates are digital files that certify the identity of an individual or institution seeking access to computer-based information. In enabling such access, they serve the same purpose as a driver's license or library card. The digital certificate links the identifier of an individual or institution to a digital public key.
The public key infrastructure
The combination of standards, protocols, and software that support digital certificates is called a public key infrastructure, or PKI. The software that supports this infrastructure generates sets of public-private key pairs. Public-private key pairs are codes that are related to one another through a complex mathematical algorithm. The algorithms are such that it is computationally infeasible to derive the private key from the public one.
The key pairs can reside on one's computer or on hardware devices. Individuals or organizations must ensure the security of their private keys. However, the public keys that correspond to their private keys can be sent across the network or shared.
Message security is the protection of the integrity, and sometimes the privacy, of messages from clients to services, using only credentials carried in the messages themselves.
It does not refer directly to the authentication of systems or between services (for example the Secure Sockets Layer - SSL).
Digital signatures are an example of message security.
Message security protects the integrity of messages by signing them digitally, such that any change to a message by an attacker invalidates the signature. Message security can also protect the privacy of a message by encrypting it. Practical signature and encryption methods typically use PKI cryptography.
To do digital signatures, a user's agent must have access to a public/private key pair. The agent signs with the private key and the receiving service checks the signature with the public key. The latter key can be included in the message; the service does not need to have the key beforehand in order to check a signature.
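The sign-with-private, check-with-public flow described above can be sketched with textbook RSA. This is a toy under stated assumptions: the key (p=61, q=53) is far too small to be secure, no padding is used, and the function names are illustrative only; a real agent would rely on a cryptographic library.

```python
import hashlib

# Textbook RSA parameters, for illustration only (never use keys this small).
N = 61 * 53   # public modulus (3233)
E = 17        # public exponent
D = 2753      # private exponent: 17 * 2753 = 1 (mod 60 * 52)


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """The user's agent signs the digest with the private key."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """The receiving service checks the signature with the public key only."""
    return pow(signature, E, N) == digest(message)


message = b"query: cone search"
signature = sign(message)
```

Here `verify(message, signature)` succeeds, while any altered signature value fails the check, which is the integrity property that message security relies on.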
Certificate as users and services identity warrants
Users and services (hosts, applications, etc.) identify themselves to a service on the basis of the credential they offer.
To "vouch for" the sender of a message, a third party may construct and sign a warrant of identity that ties together the sender's identity and public key. If the receiver of the message trusts the third party and can verify the signature, any message bound to the warrant authenticates the identity stated in the warrant. In this case "bound to the warrant" means that the message is signed with the private key corresponding to the public key quoted in the warrant.
Two kinds of warrants of identity are commonly used:
- X.509 identity certificates;
- Security Assertion Markup Language (SAML) assertions.
Both kinds could reasonably be called "identity certificates", but that term is commonly used to mean specifically X.509 certificates.
X.509 and SAML are equivalent forms for the limited case of asserting a user's identity. Both forms allow more advanced uses but these are not relevant to the current case. X.509 is an older standard and is used more widely. For example, the Globus Toolkit requires X.509 certificates. The European production grid (EGEE-III) also uses X.509 certificates. SAML is a newer standard. For example, the Shibboleth system for controlling access to web sites requires SAML.
In this document we focus on the use of X.509 certificates, which are the standard for the Grid communities and in particular for EGEE-III.
Digital certificates are issued by certificate authorities just as state governments issue driver's licenses. There are several public companies in the business of issuing certificates. Certificate authorities (CAs) are responsible for managing the life cycle of certificates, including their revocation.
A CA must check a user's identity before issuing a certificate, using an authentication scheme different from the one in which the warrant is used. That is, the user must pre-register with the CA before using the services to which the CA's warrants grant access. The CA issues long-term warrants to users and services; typically they are valid for one year.
In Europe a set of national CAs for e-science (research and education) has been created, which are federated via the EUGridPMA (Policy Management Authority) [cite]. The EUGridPMA is the international organization that coordinates the trust fabric for e-Science grid authentication in Europe. It collaborates with its regional peers, the APGridPMA for the Asia-Pacific and the Americas Grid PMA, in the International Grid Trust Federation.
Each institution or university negotiates with its national CA to create a Registration Authority (RA). The RA mediates between the CA and the user who asks for a certificate for himself or for a service he is going to manage. Typically, the person requesting the certificate is required to be physically present at the RA when a certificate is issued. The RA is in charge of verifying his identity.
In practice, the e-science communities (e.g. EGEE-III) agree to trust all the CAs that are trusted by the EUGridPMA. Not trusting a CA may prevent a user from accessing some resources or services, or a service provider from joining a particular Grid service.
The warrants policies
The major difference in authentication schemes consists in how the warrants are passed out and communicated to the services that need them. Three schemes are relevant here.
Long-lived warrants held by users
The CA issues long-lived warrants (valid for a year) after carefully authenticating the user with physical credentials such as a passport. Users are given their warrants to keep and use as they wish. The CA is not a part of the runtime SSO system. The CA holds and maintains a certificate revocation list.
Temporary warrants held by user agents
The CA registers users as in the case of long-lived warrants and arranges an SSO password with each user. This password authenticates the user to the on-line services representing the CA, but not to general services. No long-lived warrants are issued to the user; instead the CA issues short-lived warrants (valid for perhaps a day) when the user invokes the CA service and authenticates with the SSO password. The CA is a part of the runtime system but is not involved in the authentication of each message; once the user's agent has the warrant for a session, the CA need not be consulted again in that session, either by the agent or by the services that the agent uses.
Warrants supplied by referee
The CA registers users with a password as in the case of temporary warrants, above. At the start of each session, the user logs in to the CA service with the password, again as in the temporary-warrant scheme. However, the CA does not supply a warrant to the user's agent. Instead, the agent states the endpoint address of the CA service in each message. When authenticating the message, a service invokes the CA service to get the warrant.
In each scheme, the service requiring authentication has to trust the CA to issue warrants relating only to "proper" users. This means that the CA:
- must never issue a warrant for the same identity to two different users;
- must never issue a warrant for a falsely held identity;
- where warrants are long-lived, must revoke all warrants for which the cryptographic credentials are compromised.
The Shibboleth [cita] system for controlling access to web sites uses the referee scheme. Users log in to a referee service (e.g. using a password). The referee then issues warrants (SAML assertions) that the user is logged in when asked by services needing authentication.
The warrants policies for the EGEE-II and Globus based Grids: proxy certificates
The scheme used for many computational grids, and implemented in the Globus toolkit [cite], is a hybrid of long-lived and temporary warrants. Users (identified by the RA) receive long-lived X.509 certificates from a CA. Using standard SSL it is possible to create a derivative certificate from the original one. This new certificate has a short duration (typically 24 hours) and is called a "proxy certificate". Proxy certificates are issued by an "end entity" (typically a user or a service), either directly, with the "end entity" certificate as the issuing certificate, or by extension, through an already issued proxy certificate. They are used to extend rights to some other entity (typically a computer process, or sometimes the user itself), so that it can perform operations in the name of the owner of the user or service certificate. This mechanism is called delegation.
A warning about proxy certificates
The use of proxy certificates also presents some limitations and possible problems.
(Extracted from OpenSSL Documentation)
No one seems to have tested proxy certificates with security in mind. Basically, to this date, it seems that proxy certificates have only been used in a world that's highly aware of them. What would happen if an unsuspecting application is to validate a chain of certificates that contains proxy certificates? It would usually consider the leaf to be the certificate to check for authorization data, and since proxy certificates are controlled by the end entity certificate owner alone, it would be normal to consider what the end entity certificate owner could do with them. subjectAltName and issuerAltName are forbidden in proxy certificates, and this is enforced in OpenSSL. The subject must be the same as the issuer, with one commonName added on.
Possible threats are, as far as has been imagined so far:
- impersonation through commonName (think server certificates).
- use of additional extensions, possibly non-standard ones used in certain environments, that would grant extra or different authorization rights.
For this reason, OpenSSL requires that the use of proxy certificates be explicitly allowed. Currently, this can be done using the following methods:
- if the application calls X509_verify_cert() itself, it can do the following prior to that call (ctx is the pointer passed in the call to X509_verify_cert()): X509_STORE_CTX_set_flags(ctx, X509_V_FLAG_ALLOW_PROXY_CERTS);
- in all other cases, proxy certificate validation can be enabled before starting the application by setting the environment variable OPENSSL_ALLOW_PROXY with some non-empty value.
There are thoughts to allow proxy certificates with a line in the default openssl.cnf, but that's still in the future.
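The naming rule quoted above (the proxy subject equals the issuer's subject with one commonName appended) can be sketched as a simple check on one-line distinguished names. This is a hedged illustration only: it is not the OpenSSL validation code, the helper names and example DNs are hypothetical, and the parser assumes that no DN component contains a "/" character.

```python
def parse_dn(dn):
    """Split a one-line DN such as '/C=IT/O=INAF/CN=Jane Doe' into (key, value) pairs.

    Simplifying assumption: no component value contains '/'.
    """
    return [tuple(part.split("=", 1)) for part in dn.strip("/").split("/")]


def is_proxy_subject(issuer_dn, subject_dn):
    """True if subject_dn is issuer_dn with exactly one extra CN component."""
    issuer, subject = parse_dn(issuer_dn), parse_dn(subject_dn)
    return (len(subject) == len(issuer) + 1
            and subject[:-1] == issuer
            and subject[-1][0] == "CN")


user_dn = "/C=IT/O=INAF/CN=Jane Doe"    # hypothetical end entity
proxy_dn = user_dn + "/CN=proxy"         # hypothetical proxy subject
```

A chain of delegated proxies can be checked by applying `is_proxy_subject` to each consecutive pair of certificates in the chain.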
Certificate revocation lists
One of the duties of a CA is to maintain a certificate revocation list (CRL): a list of the serial numbers of certificates that have been revoked, are no longer valid, and should not be relied on by any system user.
The revocation reasons are well defined [cite rfc3280]:
- the CA had improperly issued a certificate;
- a private key is thought to have been compromised;
- failure of the identified entity to adhere to policy requirements;
- violation of any other policy specified by the CA operator or its customer.
A certificate may also be placed on hold: this reversible status can be used to signal the temporary invalidity of the certificate. The certificate serial number will be removed from the CRL after the security warning has been clarified.
The CRL is regenerated periodically, after a clearly defined time interval, and also immediately after a certificate has been revoked. The interval is defined by the CA and, in Globus-like grid environments, it is published in the CA setup. If the CRL has expired, the "end entity" may not be used (in the case of services) or may not access resources (in the case of users). To prevent spoofing or denial-of-service attacks, CRLs are usually signed by the issuing CA and therefore carry a digital signature. To validate a specific CRL before relying on it, the certificate of its corresponding CA is needed, which can usually be found in a (possibly public) directory.
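The checks described above (the CRL must come from the certificate's issuer, must not have expired, and must not list the certificate's serial number) can be sketched as follows. This is a minimal sketch under stated assumptions: the `Crl` structure, the status strings and the example issuer name are hypothetical, and verification of the CRL's own signature is deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Crl:
    """Toy CRL: issuer name, expiry time and the revoked serial numbers."""
    issuer: str
    next_update: datetime
    revoked_serials: frozenset


def check_certificate(serial, issuer, crl, now):
    """Return a status string for a certificate checked against one CRL."""
    if crl.issuer != issuer:
        return "no-crl"        # this CRL says nothing about this issuer
    if now > crl.next_update:
        return "crl-expired"   # an expired CRL must not be relied on
    if serial in crl.revoked_serials:
        return "revoked"
    return "ok"


now = datetime(2008, 8, 12, 12, 0)
crl = Crl("/C=IT/O=Example CA", now + timedelta(hours=7), frozenset({1001}))
```

Treating an expired CRL as a hard failure matches the behaviour described in the text, where an end entity cannot be used once the CRL has expired.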
CRLs or other certificate validation techniques are a necessary part of any properly operated SSO system.
Best practices on CRL
To use an X.509 PKI effectively, one must have access to current CRLs (i.e. Internet access in the case of a PKI). This requirement of on-line validation negates one of the original major advantages of PKI over symmetric cryptography protocols, namely that the certificate is "self authenticating".
The existence of a CRL implies the need for someone (or some organization) to enforce policy and revoke certificates deemed counter to operational policy. If a certificate is mistakenly revoked, significant problems can arise. It is not mandatory that CAs be responsible for determining if and when revocation is appropriate by interpreting the operational policy; however, this is the case in the EUGridPMA.
The necessity of consulting a CRL, or other certificate status service, prior to accepting a certificate raises a potential denial-of-service attack against the X.509 PKI.
In EGEE-III the CRLs are downloaded automatically from the web every seven hours, which should prevent an outdated CRL from being used. When a DNS is in place anyway an updated CRL is already available. This, of course, does not remove the security issues of CRLs.
No comprehensive solution to these problems is known, though there are multiple workarounds for various aspects of them, some of which have proven acceptable in practice.
An alternative to CRLs, which is especially useful for software clients, is the Online Certificate Status Protocol (OCSP) [cite RFC 2560].
OCSP has the primary benefit of requiring less network bandwidth, thus enabling real-time and near-real-time status checks for high-volume or high-value operations. Messages encoded in Abstract Syntax Notation One (ASN.1) are exchanged over HTTP.
An OCSP server (the responder) is queried by an end entity (A) to verify the status of the certificate presented by another end entity (B): B contacts A, and A contacts the responder, sending an identifier of B's certificate (its serial number together with hashes of the issuer's name and key). The responder may return a signed response with the status of the certificate specified in the request. Possible statuses are 'good', 'revoked' or 'unknown'. If the responder cannot process the request, it may return an error code.
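The three-way status returned by a responder can be sketched as below. This models only the decision logic, not the OCSP wire protocol: the `ToyResponder` class and `accept_peer` function are hypothetical, certificates are identified by serial number alone, and response signing is omitted.

```python
from enum import Enum


class CertStatus(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    UNKNOWN = "unknown"


class ToyResponder:
    """In-memory stand-in for an OCSP responder serving a single CA."""

    def __init__(self, issued, revoked):
        self.issued = set(issued)
        self.revoked = set(revoked)

    def query(self, serial):
        if serial in self.revoked:
            return CertStatus.REVOKED
        if serial in self.issued:
            return CertStatus.GOOD
        return CertStatus.UNKNOWN  # the responder has no record of this certificate


def accept_peer(responder, serial):
    """Entity A accepts entity B only when B's certificate is reported 'good'."""
    return responder.query(serial) is CertStatus.GOOD


responder = ToyResponder(issued={1, 2, 3}, revoked={2})
```

Rejecting 'unknown' as well as 'revoked' is a conservative design choice made for this sketch; a deployment may decide differently.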
OCSP can support more than one level of CA. OCSP requests may be chained between peer responders to query the issuing CA appropriate for the subject certificate, with responders validating each other's responses against the root CA using their own OCSP requests.
An OCSP responder may be queried for revocation information by delegated path validation (DPV) servers. OCSP does not, by itself, perform any DPV of supplied certificates.
The use of OCSP in EGEE-III is planned but not implemented [cite].
The authorization mechanisms
In the Grid context Virtual Organizations (VOs) play a central role.
A VO represents a distributed community willing to share its resources in order to achieve common goals. A VO is a set of resources, users and rules governing the sharing of the resources. The sharing must be arranged in a controlled, secure, and flexible way, usually for a limited period of time.
Security requirements within the Grid environment are driven by the need to support scalable, dynamic, distributed VOs.
The lowest authorization level in Grid environments is based on the VO concept: people belonging to the same VO may access the same resources. In the real world, this level of authorization is not fully satisfactory. Fine-grained authorization is necessary both for the needs of the VOs and to match the local authorization levels with the Grid ones. The resource administrator should be able to apply to grid users (i.e. users that access from the Grid environment) the same authorization rules that apply to local users.
Authorization in practice
We present the most common authorization mechanisms for the Grid Infrastructures.
The authorization mechanisms used by grid environments provide two "security" levels: between VOs and intra-VO.
In Globus-based grids the Grid Security Infrastructure (GSI) supports authorization on both the server side and the client side. Server-side authorization is based on the "gridmap": a list of authorized users, akin to an ACL (Access Control List), that specifies which users have access to a service. The gridmap file contains a list of X.509 certificate distinguished names (DNs), and each certificate is mapped to a local user on the basis of its DN. Authorization profiles are then assigned to the local user. For scalability reasons, users may be mapped to pool accounts on the basis of their VO membership; in this case it is common to define authorization profiles on the basis of group assignment.
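The DN-to-local-account mapping described above can be sketched with a small parser. The sketch assumes the common gridmap layout of one quoted DN followed by a local account name per line; the DNs and account names are hypothetical, and the leading dot used here to mark a pool account is a convention of some middleware that is assumed for illustration.

```python
import shlex

SAMPLE_GRIDMAP = '''
# quoted certificate DN, then the local account it maps to
"/C=IT/O=INAF/CN=Jane Doe" jdoe
"/C=FR/O=CNRS/CN=Jean Dupont" .eurovo
'''


def parse_gridmap(text):
    """Return a dict mapping certificate DNs to local account names."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        dn, account = shlex.split(line)  # shlex honours the quotes around the DN
        mapping[dn] = account
    return mapping


def local_account(mapping, dn):
    """Return the local account for a DN, or None if the DN is not authorized."""
    return mapping.get(dn)


gridmap = parse_gridmap(SAMPLE_GRIDMAP)
```

A DN that is absent from the map yields `None`, which corresponds to the coarse-grained "not in the list, no access" behaviour of gridmap authorization.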
To use gridmap authorization effectively, a service must have access to current gridmap files (i.e. Internet access to an LDAP-based gridmap repository). This requirement of on-line validation negates once again (see CRL) one of the original major advantages of PKI over symmetric cryptography protocols, namely that the certificate is a self-consistent authorization and authentication token.
The existence of a gridmap implies the need for someone (or some organization) to enforce policy and create the proper user mapping. If a certificate is mistakenly mapped to a user significant problems can arise. The VO manager is in charge of maintaining the gridmap.
Problems related to the use of gridmap authorization:
- coarse-grained authorization
- heterogeneous authorization: common rules between similar services may not be automatically applied/assured.
- on line availability problems of the gridmap. This raises a potential denial-of-service attack and other security issues based on the freshness of the gridmapping.
Virtual Organization Membership Service
VOMS is a system for managing authorization data within multi-institutional collaborations. VOMS provides a database of groups and user roles and capabilities and a set of tools for accessing and manipulating the database and using the database contents to generate Grid credentials for users when needed.
The VOMS database contains authorization data that defines specific capabilities and general roles for specific users. A suite of administrative tools allows administrators to assign roles to users and manipulate capability information. VOMS administration is done via a web server. A command-line tool (voms-proxy-init) allows users to generate a local proxy credential based on the contents of the VOMS database (VOMS also provides C++ and Java APIs to initialize a VOMS-enabled certificate).
This credential includes the basic authentication information that standard Grid proxy credentials contain, but it also includes role and capability information from the VOMS server. Standard Grid applications can use the credential without using the VOMS data, whereas VOMS-aware applications can use the VOMS data to make authorization decisions regarding user requests.
The VOMS server is contacted only by the user, when he or she initializes the certificate. This operation produces a "pseudo-certificate" that contains the user's and the VOMS server's credentials and the validity period. All the data is signed with the VOMS server certificate. The pseudo-certificate is then inserted into the user's proxy certificate as a non-critical extension.
VOMS allows distributed collaborations to centrally manage user roles and capabilities. The VOMS user credentials provide additional role and capability data to application service providers that can then be used to make more fully-informed authorization decisions.
VOMS may be integrated with local (site-level) authorization databases to assign users local credentials on the basis of VO assertions and of local policies.
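How a site might combine VOMS group and role data with a local policy can be sketched as follows. The sketch assumes the attribute string form commonly used by VOMS, `/vo/group/Role=role/Capability=NULL`; the VO name, group, role and the `SITE_POLICY` table are hypothetical.

```python
def parse_fqan(fqan):
    """Split an attribute string into (group path, role or None)."""
    groups, role = [], None
    for part in fqan.strip("/").split("/"):
        if part.startswith("Role="):
            role = part[len("Role="):]
        elif part.startswith("Capability="):
            continue  # capabilities are ignored in this sketch
        else:
            groups.append(part)
    if role == "NULL":
        role = None
    return "/" + "/".join(groups), role


# Hypothetical site-level policy: (group, role) -> operations allowed locally.
SITE_POLICY = {
    ("/eurovo", None): {"read"},
    ("/eurovo/dca", "production"): {"read", "submit-job"},
}


def permitted_operations(fqans):
    """Union of the operations granted by every attribute the user presents."""
    allowed = set()
    for fqan in fqans:
        allowed |= SITE_POLICY.get(parse_fqan(fqan), set())
    return allowed
```

The VO decides which groups and roles a user holds, while the site decides what those memberships allow locally, which is the division of responsibility described in the text.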
The VOMS server does not add any security issues at user level, since it performs the usual GSI security controls on the user's certificate before granting rights (it must be signed by a "trusted" CA, be valid and not revoked). On the other hand, even compromising the VOMS server itself would not be enough to grant illegal access to resources, since the authorization data must be inserted in a user proxy certificate (i.e. countersigned by the user himself). The only possible large-scale vulnerabilities are denial-of-service attacks (e.g. to prevent VO users from getting their authorization credentials).
Malicious users who compromise the VOMS server and who hold a valid X.509 certificate may, however, impersonate roles, capabilities and VO memberships.
The EGEE-II infrastructure uses VOMS as its authorization method. A VOMS server that implements the SAML (version 2) capabilities is under development [cite].
Community Authorization Service
Community Authorization Service (CAS) allows a VO to express policy regarding resources distributed across a number of sites. A CAS server issues assertions to the VO users, granting them fine-grained access rights to resources. Servers recognize and enforce the assertions. CAS is designed to be extensible to multiple services and is currently supported by the GridFTP server. CAS is a Globus Toolkit component.
The CAS essentially has users, actions, objects and policies governing the user's access to the objects for the purpose of performing specific actions. To better serve the requirements of a VO, the server allows grouping of users, actions and objects. This also facilitates specifying policies about them. The CAS server can be thought of as the frontend to a database that maintains state about such community permissions. The effect of each CAS request is either to modify this state or query it.
The CAS architecture is similar to the VOMS one. In both, the policy about which memberships a user has is centralized in the CAS or VOMS server. However, in the VOMS architecture the policy regarding exactly what rights those memberships grant is distributed among the sites, whereas CAS assertions provide the rights directly and do not need interpretation by the resource. Complete centralization of policy can achieve better consistency but reduces the control left to the local resource administrator.
Privilege and Role Management Infrastructure Standards Validation
Privilege and Role Management Infrastructure Standards Validation (PERMIS) software is an authorization infrastructure that can realize Role Based Access Control (RBAC). It is implemented as a Java-based API which decides whether or not access to a particular resource is valid. PERMIS uses XML policies which define the rules specifying the access rights and associated actions that can be invoked on resources within the VO. These policies include the definitions of roles (and their hierarchy), the Sources of Authority which are trusted to assign these roles, and the resource targets and actions which are governed by the policy. Roles are issued to users in the form of X.509 Attribute Certificates (ACs) stored in a dedicated LDAP repository. A decision request (based on the VO policy and role requested) currently contains: the user name, the target name, and the requested action.
The PERMIS architecture may be summarized as follows:
- the user contacts a resource providing identity, attributes and action;
- the resource requests a Credential Validation Server to validate the request;
- the Credential Validation Server makes a decision on the basis of a Credential Provider database and returns the decision to the resource;
- the resource enforces the decision.
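The four steps above can be sketched as a role-based decision function called by the resource. This is not the PERMIS API: the request structure follows the three fields named in the text, while the role tables, function names and example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionRequest:
    user: str
    target: str
    action: str


# Hypothetical stand-ins for the credential database and the VO policy.
ROLE_ASSIGNMENTS = {"alice": {"astronomer"}, "bob": {"guest"}}
ROLE_PERMISSIONS = {
    "astronomer": {("image-archive", "query"), ("image-archive", "download")},
    "guest": {("image-archive", "query")},
}


def decide(request):
    """Validation side: grant if any role held by the user allows the action."""
    roles = ROLE_ASSIGNMENTS.get(request.user, set())
    return any((request.target, request.action) in ROLE_PERMISSIONS.get(role, set())
               for role in roles)


def enforce(request):
    """Resource side: ask for a decision, then enforce it."""
    if not decide(request):
        raise PermissionError(f"{request.user} may not {request.action} {request.target}")
    return f"{request.action} on {request.target} granted to {request.user}"
```

Keeping `decide` separate from `enforce` reflects the split between the validation server, which makes the decision, and the resource, which enforces it.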
Shibboleth
Shibboleth provides a set of network services that support a federated authorization and authentication model. Designed with universities, corporations, and government agencies in mind, Shibboleth allows organizations to participate in the authentication and authorization of their individual members (e.g., faculty, students, employees) when those members use services provided by external agencies (e.g., commercial or government services).
Shibboleth makes use of local authentication systems at "home institutions" (the organization where an individual user works or goes to school) in cooperation with local Shibboleth services to inform remote services of the validity of requests by local users to use the services.
Shibboleth services on remote web servers intercept user requests and (if the user is not recognized as a known user) work with the user to determine their home institution. They then interact with the home institution's Shibboleth services to obtain a "handle" for the user that contains any identification information that the home institution chooses to make available as well as "attributes" that describe the role(s) that the user has in the institution. This information is used by the remote service to determine whether to give the user access to the service or not.
Originally geared for Web browser-based services, Shibboleth is currently being extended to support services that use other interfaces, such as Web services and WSRF interfaces.
- Relieves remote service providers from having to manage user lists for every institution that uses their services
- Allows "home institutions" to protect the identities of their users from remote service providers
- Leverages existing authentication systems at home institutions
- Flexible, distributed architecture supports a variety of usage scenarios
We make an overview of the most common A&A protocols and services.
European Grid projects (see the census proposed by WP5 [cite]) have mainly adopted an authentication method based on GSI X.509 certificates. Authorization is based on VOMS or VOMS-like mechanisms. Activity is ongoing in the framework of the EGEE project to allow the main global Grid infrastructures (EGEE, OSG, NAREGI) to interoperate. The first focus is on SSO and data exchange.
Having a consistent and common authentication and authorization system between the European VObs and the Grids will enable EuroVO data centers to:
- reduce the number and types of electronic credentials that EuroVO users and Data Centers need to conduct activities with/between Data Centers
- reduce the effort required of users to access resources (e.g. needing more than one certificate or registration method to use EuroVO and Grid environments).
- reduce the authentication system development and acquisition costs.
- Make consistent authorization decisions.
- (is there something more.. or less to say?)
In the framework of the IVO, four possible authentication policies have been approved to ensure trustworthy electronic transactions and to fulfill IVO information security requirements:
- No authentication required.
- Digital signature of messages.
- Transport Layer Security (TLS) with client certificates.
- Transport Layer Security (TLS) with passwords.
The choice of an authentication policy should follow from an SSO risk assessment. Guidance on the definition of the risk level is recommended.
This document does not directly address the definition of risk levels. However, we note that while accessing a public data repository implies a minimal risk, a service that allows users to interact with a Grid resource requires a high assurance level.
We recommend to:
- Reduce the number and types of A&A mechanisms to simplify the access to EuroVO and Grid systems
- Use a TLS certificate authentication system using X.509 certificates signed by EUGridPMA CAs.
- Create one Registration Authority for each Data Center
- Provide a VOMS based authorization system or provide a VOMS-to-IVO authentication system interface.
- Identify the Risk level for the service exposed (see usage scenarios section for details);
- Ensure a similar level of security at all sites: with distributed, interoperating resources, if one is hacked we are all hacked.
There are some typical usage scenarios for a VObs service that uses also Grid resources (data and jobs):
- Demo usage: a user who wants to test a service before proceeding with the full registration procedure (certificate request and so on). Username and password authentication with e-mail verification is suggested. A demo certificate with a short lifetime (30-60 minutes) may be generated automatically by the Data Center (application proxy certificate). Authorization is granted on a small amount of resources, tuned by the VOMS attributes. EUGridPMA CAs are now issuing robot certificates, which are used for Internet services and robots that do not require direct human interaction; they may be used successfully for this purpose.
- Weak SSO profile: a user who occasionally accesses VObs resources. Username and password authentication with e-mail verification is suggested.
- Full usage: Strong SSO profile. A user who wants full interaction between VObs resources and Grid ones. A user certificate signed by an EUGridPMA CA is required. To access grid resources from a VObs application a "Delegation Server" is required. The delegation procedure should support VOMS initialization.
- Application certificates: when the use of the Grid is delegated to the application. No user interaction is required. The user just wants the software to run, no matter where and how. The use of an Application Certificate is suggested. The certificate DN must specify the application type and the Data Center name. Robot certificates are suggested.
- Something else?
Blueprint for a delegation service
- Internet X.509 Public Key Infrastructure (PKI) Proxy Certificate Profile
- Online Certificate Status Protocol
- Internet X.509 Public Key Infrastructure Certificate Policy and Certification Practices Framework
- EGEE-II interoperability issues
- EGEE Global Security Architecture
- Security Assertion Markup Language
- European Policy Management Authority for Grid Authentication
- Globus Toolkit
- gLite grid middleware
- Virtual Organization Membership Service
- VOMS SAML service
- ROLE BASED ACCESS CONTROL
- 12 Aug 2008