gateways for developers
The Role of the Developer: Best Practices
An XSEDE Science Gateway is a web or application portal sponsored by a principal investigator (PI) who has an allocation to use data storage and compute resources provided by XSEDE. The gateway provides access to tools customized to meet the needs of a specific community of researchers and is connected to XSEDE resources. For an overview of the steps a PI must take to start an XSEDE Science Gateway, see the Gateways for PIs section.
A portal developer or development team may be brought into an XSEDE Science Gateway project during the initial planning by the PI—before any decisions have been made for implementation—or may join the project after many requirements have been defined and an allocation has already been obtained. If the developers are involved early in the planning process, they can make valuable contributions to decisions that will affect the community in the future.
An XSEDE Science Gateway developer will want to incorporate the recommended best practices described below and to meet the specific requirements that fulfill XSEDE standards.
Building the Gateway
Types of Gateways
The first steps in creating a gateway include deciding on the interface type and on the back-end services to which the gateway will connect. You may choose to build a web-based user interface or a desktop application that is installed directly on end users' workstations. On the back end, the gateway may connect to only XSEDE services, or it may serve as a bridge to both XSEDE services and other community grids.
Best Practices for Planning and Design
The practices below apply to the design of any web application. They are worth mentioning here so that the ease of use of your gateway is considered along with its scientific objectives.
- Create a precise list of requirements your gateway must meet
- Choose technologies based on resources and time
- Select a development team with user interface (UI) experience
- Plan for the long term, i.e., for the lifetime of the gateway
- Use formal design principles to avoid confusing presentation:
  - Structured layouts
  - Focused and uncluttered user interface
  - Easy identification of information categories and relationships
- Develop in stages
- Involve end-users in the design process
- Use mockups to perform usability testing
Desirable Gateway Characteristics
Gateway characteristics are the intrinsic qualities of the portal technologies that will lead to a robust and maintainable system.
- Universal, secure access
- Airtight security
- Based on open standards (JSR 168/286, OGSA, etc.)
- Modular, reusable design (use portlets)
- Technologies with a rich API/Abstraction Layer
- Platform independence (web, Java, XML, etc.)
- Ease of integration into existing infrastructure
- Use of commodity software
- Extensibility
- Maintainability
- Scalability
Software and Sample Codes
Gateway developers have contributed their recommendations for software that they have found helpful for developing gateways and connecting them to XSEDE resources.
Connecting to XSEDE
Community Accounts
To address scalability issues, many gateways provide access to XSEDE resources through a community account rather than setting up unique XSEDE accounts for each gateway user.
A community account has the following characteristics:
- Only a single community user account (i.e., an XSEDE username/password) is created.
- The Science Gateway uses the single XSEDE community user account to launch jobs on XSEDE.
- The gateway user running under the community account typically has privileges to run only a limited set of applications.
The chief difference between an individual and a community account is that a community account is essentially a single username on XSEDE shared by many (human) users. While this eliminates the need for individual gateway users to request their own XSEDE accounts, it places additional accounting and security burdens on the gateway developers. To distinguish one gateway user from another, the gateway developer has to institute a user registry and gateway authentication mechanism.
The gateway developer may create individual logins to the gateway itself. However, after logging in, the user will be unaware that they are running applications on XSEDE through a shared community account. Because the gateway maintains control of the XSEDE allocation, it is the gateway PI who is responsible for ensuring that the NSF computational resources are used in a manner consistent with policies and that reasonable measures and tools are in place to ensure appropriate usage, including monitoring of all usage of the gateway by the community. Developers will want to develop usage tracking mechanisms that allow them to attribute XSEDE resource usage to individual gateway users in the case of a security incident.
A gateway may wish to distinguish between identification mechanisms and capabilities for different types of users. For example, lightweight identification mechanisms may be appropriate for K-12 users making small demonstration runs. More substantial identification and justification for resources may be required for senior researchers using large fractions of a gateway allocation. XSEDE Resource Provider sites may choose to restrict community accounts in a variety of ways, for example chroot jails or non-shell, role-based accounts which allow the execution of only selected commands or commands located in specific directories.
If a gateway provides services for use by individuals with their own allocations, users may be able to upload their own credentials and make use of gateway tools, but charge the runs on XSEDE resources to their own individual allocation.
Security and accounting requirements are below.
To request a community account, the PI can log on to the XSEDE User Portal and select "Community Accounts" from the My XSEDE tab.
Connecting to HPC Resources
- Job Submission for Science Gateways
- XSEDE Resource Catalog (XSEDE User Support)
- Computing & Running Jobs Overview (XSEDE User Support)
Data Resources and File Spaces
Data storage on XSEDE is categorized by its purpose and its location.
- Allocated storage space for archiving or publishing data collections in databases, on disk or on tape
- Temporary or long-term storage associated with a compute allocation
- File space for sharing libraries and codes, either through the PI's home directory, a community account home directory, or a community software area that is available by special request
Operations and Maintenance Practices
Once your gateway is operational, good operations and maintenance practices ensure continued, optimum integration with XSEDE resources.
- Implement new technologies as needed to keep the gateway up to date
- Monitor filesystem usage
- Monitor job load
- Keep content current and relevant
- Keep security and accounting functionalities current with XSEDE requirements
- Back up your gateway routinely
- Make sure your OS and applications are properly patched
- Put a contingency plan in place for complete server loss or security incident
- Keep logs and contact the XSEDE Help Desk for troubleshooting
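As one illustration of the monitoring items in the list above, filesystem usage on the gateway server can be checked with a few lines of Python. This is a minimal sketch, not an XSEDE-provided tool: the function names and the 90% warning threshold are assumptions chosen for the example.

```python
import shutil


def filesystem_usage_fraction(path: str) -> float:
    """Return the fraction (0.0 to 1.0) of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report a size of zero; treat them as empty.
        return 0.0
    return usage.used / usage.total


def filesystem_needs_attention(path: str, warn_fraction: float = 0.9) -> bool:
    """Return True when usage is at or above the (assumed) warning threshold."""
    return filesystem_usage_fraction(path) >= warn_fraction


if __name__ == "__main__":
    # Hypothetical usage: check the filesystem that holds the current directory.
    print(f"usage: {filesystem_usage_fraction('.'):.1%}")
```

A check like this could be run periodically (for example from cron) and wired to whatever alerting the gateway team already uses.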
All gateway developers will want to build their gateways using best practices for portal and web application development. In addition, accurate accounting practices will provide statistics to help justify requests by the gateway PI in subsequent proposals.
Security and Accounting for XSEDE Gateways
XSEDE has specific security and accounting requirements and recommendations for gateways that connect to its resources. Following them prepares your gateway for the prevention and triage of security incidents or inadvertent misuse.
Security and Accounting Requirements and Recommendations
- Required: Notify the XSEDE Help Desk (help at xsede dot org, 1-866-907-2383) immediately if you suspect the gateway or its community account may be compromised. XSEDE reserves the right to disable a community account in the event of a security incident.
- Required: Keep Science Gateway contact info up to date on the gateway list in case XSEDE staff should need to contact you
- Required: Institute a user registry
- Devise a credential management strategy
- Required: Include User Attributes in Community Credentials (GridShib SAML Tools)
- Collect Accounting Statistics
- Maintain an audit trail (keep a gateway log)
- Provide the ability to restrict job submissions on a per user basis
- Safeguard and validate programs, scripts, and input
- Protect passwords locally and over the network
- Use proper precautions for passwordless ssh keys (not recommended; if they are stolen, anyone can use them)
- Perform Risk and Vulnerability Assessment
- Back up your gateway routinely
- Develop an incident response plan for your gateway; review and update it regularly
- Put a contingency plan in place to prepare for a disaster or security event that could cause the total loss or lock down of the server
- Monitor changes to critical system files, such as those for SSH, with a tool such as Tripwire or Samhain (open source)
- Make sure your OS and applications are properly patched, and run a vulnerability scanner such as Nessus against them
- Make use of community accounts rather than individual accounts
What to Do in a Security Incident
Whether a threat is confirmed or suspected, quick action and immediate communication with the XSEDE Security Working Group are essential. Contact: help at xsede dot org; 1-866-907-2383
Make Use of Community Accounts
Community Accounts are described at length on the main page of the developers section; they are the most common account strategy for XSEDE Science Gateways. Community Accounts present specific security and accounting challenges for the gateway developer, because many end users share access to XSEDE resources through the shared account. Consequently, the developer will typically want to restrict the account's privileges so that it can run only a limited set of applications.
To distinguish one gateway user from another, the gateway developer has to institute a user registry and gateway authentication mechanism. A gateway may wish to distinguish between identification mechanisms and capabilities for different types of users. For example, lightweight identification mechanisms may be appropriate for K-12 users making small demonstration runs. More substantial identification and justification for resources may be required for senior researchers using large fractions of a gateway allocation. XSEDE Resource Provider sites may choose to restrict community accounts in a variety of ways, for example chroot jails or non-shell, role-based accounts which allow the execution of only selected commands or commands located in specific directories. Some of the techniques for managing community accounts are described below.
Institute a User Registry
Science gateways must implement a user registry that contains contact information for all users accessing XSEDE resources through the gateway. Gateways may provide access to their XSEDE allocated resources for demonstration or class accounts. In cases such as these, capabilities would be very limited.
Collect resource usage information for each registered user.
Provide the ability to restrict job submissions on a per-user basis. This is optional, but it can protect the gateway from the shutdown of an entire community account in the event of a security incident. Furthermore, identification of researchers who have benefited from the services offered by the Gateway will be a fundamental part of future requests for XSEDE resources and may also be useful to the Gateway's own funding efforts.
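The registry requirements above (contact information, per-user usage, and optional per-user submission limits) can be sketched as a small in-memory data structure. This is only an illustration under stated assumptions: the class and field names (`GatewayUser`, `UserRegistry`, `su_limit`) are hypothetical, and a real gateway would keep this data in a persistent database.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class GatewayUser:
    username: str
    email: str                       # contact information for the user
    su_used: float = 0.0             # service units charged so far
    su_limit: Optional[float] = None  # None means no per-user cap
    enabled: bool = True             # set to False to block one user only


class UserRegistry:
    """Hypothetical in-memory registry of gateway users."""

    def __init__(self) -> None:
        self._users: Dict[str, GatewayUser] = {}

    def register(self, username: str, email: str,
                 su_limit: Optional[float] = None) -> GatewayUser:
        if username in self._users:
            raise ValueError(f"user already registered: {username}")
        user = GatewayUser(username=username, email=email, su_limit=su_limit)
        self._users[username] = user
        return user

    def may_submit(self, username: str, estimated_su: float) -> bool:
        """Allow a job only for known, enabled users within their cap."""
        user = self._users.get(username)
        if user is None or not user.enabled:
            return False
        if user.su_limit is None:
            return True
        return user.su_used + estimated_su <= user.su_limit

    def record_usage(self, username: str, su: float) -> None:
        self._users[username].su_used += su

    def disable(self, username: str) -> None:
        self._users[username].enabled = False
```

Disabling a single user through `disable` is the kind of per-user restriction that lets a gateway respond to an incident without losing the whole community account.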
Devise a Credential Strategy
Gateways require X.509 credentials for accessing XSEDE resources securely. Developers need to plan a strategy to fit their credential management scenarios. Users access XSEDE resources via science gateways using either individual credentials (i.e., issued to a single user who is known to the XSEDE Central Database) or community credentials (i.e., issued to the gateway which is responsible for per-user tracking). Individual credentials allow XSEDE resource providers (RPs) to track per-person resource usage using standard account-based techniques, based on the XSEDE allocations process. Community credentials provide a more scalable approach, allowing user registration to be outsourced to the gateway, but XSEDE RPs still require the ability to track per-person resource usage for accounting and security reasons.
Gateways can combine both approaches, allowing registered XSEDE users to access resources with their individual credentials and others to access resources via a community credential. For more about scenarios and strategies, see Science Gateway Credential Management in the XSEDE Wiki.
Maintain an Audit Trail: Keep a Gateway Log
Keeping an audit trail enables traceback to a user engaged in abusive or suspicious activity or a serious security breach. The audit trail consists of a log of all user login and job activity. The audit trail is now being managed by attribute-based authentication, which sends expanded information, including unique user identifiers, in job submission records. These records will be recorded in the XSEDE Central Database (TGCDB) and available for individual management of security. See Attribute-based Authentication below.
Log Authentication and Authorization Activity
Log entries for all login activity, including attempts, successful logins, and further authorization, should include the following data:
- the requesting IP address
- date stamp (Coordinated Universal Time, UTC)
- username
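The three fields listed above can be captured in one structured log line per event. The following is a minimal sketch using only the Python standard library; the logger name `gateway.auth`, the event labels, and the JSON layout are assumptions for illustration, not an XSEDE-mandated format.

```python
import json
import logging
from datetime import datetime, timezone
from typing import Optional

auth_log = logging.getLogger("gateway.auth")  # hypothetical logger name


def format_auth_event(event: str, username: str, ip_address: str,
                      when: Optional[datetime] = None) -> str:
    """Build one JSON log line with the IP address, UTC timestamp, and username."""
    if when is None:
        when = datetime.now(timezone.utc)
    record = {
        # astimezone() converts aware datetimes to UTC; naive ones are
        # interpreted as local time, so pass aware datetimes when possible.
        "timestamp": when.astimezone(timezone.utc).isoformat(),
        "event": event,            # e.g. "login_attempt", "login_success"
        "username": username,
        "ip": ip_address,
    }
    return json.dumps(record, sort_keys=True)


def log_auth_event(event: str, username: str, ip_address: str) -> str:
    """Write the event to the authentication log and return the line."""
    line = format_auth_event(event, username, ip_address)
    auth_log.info(line)
    return line
```

Keeping each event on a single machine-readable line makes it easier to search the log during an incident investigation.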
Map User Activity to Jobs Run on XSEDE Resources
With the adoption of attribute-based authentication, it will no longer be necessary to maintain job information; however, it may be helpful in policing your gateway in case of a security infraction.
GRAM jobs:
- GRAM job handle (which usually looks like https://<execHost>:49xxx//xxxxx/xxxxxxxx) for each job submitted by a gateway user, for Globus-based gateways. The mapping between this job handle and the scheduler local jobID on the XSEDE compute resource will be provided via an auditing Web Service.
- RSL for each job.
Non-GRAM job submissions:
- the remote command(s) executed via SSH, scripts, Web Services, or some other method to initiate a job on an XSEDE compute resource.
- scheduler local jobID to gateway user mapping
For both GRAM and non-GRAM:
- XSEDE resource to which the job was submitted.
- Application(s) launched (especially for canned apps).
- Timestamp for each job launch and termination.
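The items above amount to one record per job that ties a gateway user to a scheduler job. A minimal sketch of such a record follows, assuming the scheduler local job ID is known when the record is created; the class and field names (`JobRecord`, `JobAuditTrail`) are hypothetical, and a production gateway would store these records durably rather than in memory.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


@dataclass
class JobRecord:
    gateway_user: str
    resource: str                          # XSEDE resource the job was sent to
    application: str                       # application launched
    local_job_id: str                      # scheduler job ID on that resource
    launched_at: datetime
    terminated_at: Optional[datetime] = None
    gram_job_handle: Optional[str] = None  # GRAM jobs only
    rsl: Optional[str] = None              # GRAM jobs only
    remote_command: Optional[str] = None   # non-GRAM jobs only


class JobAuditTrail:
    """Hypothetical in-memory map from (resource, job ID) to job record."""

    def __init__(self) -> None:
        self._jobs: Dict[Tuple[str, str], JobRecord] = {}

    def record_launch(self, record: JobRecord) -> None:
        self._jobs[(record.resource, record.local_job_id)] = record

    def record_termination(self, resource: str, local_job_id: str,
                           when: datetime) -> None:
        self._jobs[(resource, local_job_id)].terminated_at = when

    def user_for_job(self, resource: str, local_job_id: str) -> Optional[str]:
        """Trace a scheduler job back to the gateway user who submitted it."""
        record = self._jobs.get((resource, local_job_id))
        return record.gateway_user if record else None
```

The `user_for_job` lookup is the traceback step: given a job ID reported by a resource provider, it returns the gateway user responsible.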
Use Attribute-based Authentication (GridShib/SAML) in Community Credentials
Attribute-based authentication is required for all gateways submitting jobs via a community account. This fulfills the NSF requirement that XSEDE provide the number of unique end users running jobs on the systems on a quarterly basis.
For gateways submitting jobs using gsi-ssh, a script is run after each job submission that attaches attributes to that job. More information on this process is available at:
http://www.xsedeforum.org/mediawiki/index.php?title=Gateway-Submit-Attributes.
Gateways using Globus for job submission will need to install the GridShib SAML Tools package and use the XSEDE Science Gateway SAML Extension to bind a SAML token to a proxy certificate that is signed by a gateway's community credential. This software is available at:
http://www.xsedeforum.org/mediawiki/index.php?title=Science_Gateway_Credential_with_Attributes.
Safeguard and Validate Programs, Scripts, and Input
Developers will want to consider the security of codes running on XSEDE resources. To the extent possible, gateways must implement safeguards (to be determined jointly with the XSEDE Gateways and Security Working Groups) to protect against intentional or unintentional abuse by programs, scripts, and input. For example, input entered through a web form must not cause buffer overflows when a code is run using the resulting input files. Gateways should not let anonymous users upload executable files.
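One common way to guard against abusive web-form input is to validate every field against an explicit allowlist before it is written to an input file. The sketch below assumes a hypothetical form with two fields, `job_name` and `num_steps`; the field names, length limits, and numeric range are illustrative and would need to match what the scientific code actually accepts.

```python
import re

# Allowlist patterns: anything not matching is rejected outright.
_JOB_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")
_INTEGER_RE = re.compile(r"[0-9]{1,7}")

MAX_STEPS = 1_000_000  # assumed upper bound for this example


def validate_job_form(form: dict) -> dict:
    """Return cleaned values, or raise ValueError if any field is unacceptable."""
    name = str(form.get("job_name", ""))
    if not _JOB_NAME_RE.fullmatch(name):
        raise ValueError("job_name must be 1-64 letters, digits, '_' or '-'")

    raw_steps = str(form.get("num_steps", ""))
    if not _INTEGER_RE.fullmatch(raw_steps):
        raise ValueError("num_steps must be a short non-negative integer")
    steps = int(raw_steps)
    if not 1 <= steps <= MAX_STEPS:
        raise ValueError(f"num_steps must be between 1 and {MAX_STEPS}")

    return {"job_name": name, "num_steps": steps}
```

Bounding both the length and the character set of each field keeps oversized or shell-significant input from ever reaching the input file or the command line.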
Perform Risk and Vulnerability Assessment
In March, 2008, the TeraGrid Security Working Group began a Gateway Risk Assessment Survey to help determine the baseline level of security for the gateways, and to document the kinds of questions that need to be asked for ongoing security evaluation. It was designed to provide the information needed by XSEDE for incident prevention and response and to evaluate security threats, to suggest possible mitigations, and to determine whether the unmitigated risk can be accepted as a cost of doing business.
Gateways may want to participate in a risk and vulnerability assessment in collaboration with the XSEDE Security Working Group annually. See the Vulnerability Assessment spreadsheet from the Wiki. E-mail questions and suggestions to Jim Rome, jar at ornl dot gov, and Jeff Rosendale, jeffr at ncsa dot uiuc dot edu.
Protect Portal Passwords Locally and Over the Network
Gateways should implement a strong password enforcement mechanism for their users. They must also use SSL or some other encryption mechanism to protect gateway user passwords from being transmitted in the clear.
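For the local side of password protection, a gateway can store salted, slow hashes instead of the passwords themselves. The sketch below uses PBKDF2 from the Python standard library; the iteration count, the minimum length, and the very simple strength rule are assumptions for illustration, not XSEDE policy.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

PBKDF2_ITERATIONS = 600_000  # assumed work factor; tune for your hardware
MIN_PASSWORD_LENGTH = 12     # assumed policy value


def is_strong_enough(password: str) -> bool:
    """Very small policy check: long enough and not purely letters or digits."""
    return (len(password) >= MIN_PASSWORD_LENGTH
            and not password.isalpha()
            and not password.isdigit())


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); only these values are stored, never the password."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, PBKDF2_ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

This covers storage only; protection over the network still depends on serving the login form over SSL/TLS as described above.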
Collect Accounting (and Other) Statistics
XSEDE must report several important metrics about the gateways it supports:
- Each gateway user's total SU usage
- Science success stories (annually)
- Number of gateway end users (quarterly)
Gateways must record each gateway user's total SU usage. This will be useful for final reports for your project, for requests for future allocations, and for the Security or Accounting Working Groups should they need to investigate abuse or overuse. Once your gateway implements attribute-based authentication, this will be done automatically by XSEDE.
Science successes may be more challenging for a gateway to collect because of the disparate nature of the user community. Nevertheless, science successes due to the use of the gateway are important, both for XSEDE resource requests and likely for gateway funding requests as well. Published papers provide a supporting statistic.
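The per-user SU totals and the quarterly count of end users described in this section can both be derived from a simple list of usage records. The sketch below assumes each record is a `(username, date, su)` tuple, which is a hypothetical layout chosen for the example; the sample data in the usage note is invented.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Sequence, Tuple

UsageRecord = Tuple[str, date, float]  # (gateway username, job date, SUs charged)


def quarter_of(d: date) -> Tuple[int, int]:
    """Map a date to (year, quarter), with quarters numbered 1 to 4."""
    return (d.year, (d.month - 1) // 3 + 1)


def su_by_user(records: Sequence[UsageRecord]) -> Dict[str, float]:
    """Total SU usage for each gateway user."""
    totals: Dict[str, float] = defaultdict(float)
    for username, _, su in records:
        totals[username] += su
    return dict(totals)


def unique_users_by_quarter(records: Sequence[UsageRecord]) -> Dict[Tuple[int, int], int]:
    """Number of distinct gateway users who ran jobs in each quarter."""
    users = defaultdict(set)
    for username, day, _ in records:
        users[quarter_of(day)].add(username)
    return {quarter: len(names) for quarter, names in users.items()}
```

For example, with the invented records `[("alice", date(2020, 1, 15), 10.0), ("bob", date(2020, 2, 1), 5.0), ("alice", date(2020, 4, 3), 2.5)]`, `su_by_user` gives alice 12.5 SUs and bob 5.0 SUs, and `unique_users_by_quarter` reports two users in the first quarter of 2020 and one in the second.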
Tips for gateway PIs on successfully requesting resources through the NSF allocation process are described in How to Write a Winning Gateway Proposal. Justification of the resources requested and papers published because of the use of the resources are key factors in a successful proposal.