The CloudLab Manual

The CloudLab Team

CloudLab is a "meta-cloud"—that is, it is not a cloud itself; rather, it is a facility for building clouds. It provides bare-metal access and control over a substantial set of computing, storage, and networking resources; on top of this platform, users can install standard cloud software stacks, modify them, or create entirely new ones.

The current CloudLab deployment consists of more than 25,000 cores distributed across three sites at the University of Wisconsin, Clemson University, and the University of Utah. CloudLab interoperates with existing testbeds, including GENI and Emulab, to take advantage of hardware at dozens of sites around the world.

The control software for CloudLab is open source, and is built on the foundation established for Emulab, GENI, and Apt. Pointers to the details of this control system can be found on CloudLab’s technology page.

Get started!

    1 CloudLab Status Notes

    2 Getting Started

      2.1 Next Steps

    3 CloudLab Users

      3.1 GENI Users

      3.2 Register for an Account

        3.2.1 Join an existing project

        3.2.2 Create a new project

    4 CloudLab and Repeatable Research

    5 Creating Profiles

      5.1 Creating a profile from an existing one

        5.1.1 Preparation and precautions

        5.1.2 Cloning a Profile

        5.1.3 Copying a Profile

        5.1.4 Creating the Profile

        5.1.5 Updating a profile

      5.2 Creating a profile with a GUI

      5.3 Repository-Based Profiles

        5.3.1 Updating Repository-Based Profiles

        5.3.2 Branches and Tags in Repository-Based Profiles

      5.4 Creating a profile from scratch

      5.5 Sharing Profiles

      5.6 Versioned Profiles

    6 Basic Concepts

      6.1 Profiles

        6.1.1 On-demand Profiles

        6.1.2 Persistent Profiles

      6.2 Experiments

        6.2.1 Extending Experiments

      6.3 Projects

      6.4 Physical Machines

      6.5 Virtual Machines and Containers

    7 Resource Reservations

      7.1 What Reservations Guarantee

      7.2 How Reservations May Affect You

      7.3 Making a Reservation

      7.4 Using a Reservation

    8 Describing a profile with python and geni-lib

      8.1 A single XEN VM node

      8.2 A single physical host

      8.3 Two XenVM nodes with a link between them

      8.4 Two ARM64 servers in a LAN

      8.5 A VM with a custom size

      8.6 Set a specific IP address on each node

      8.7 Specify an operating system and set install and execute scripts

      8.8 Profiles with user-specified parameters

      8.9 Add temporary local disk space to a node

      8.10 Creating a reusable dataset

      8.11 Debugging geni-lib profile scripts

    9 Virtual Machines and Containers

      9.1 Xen VMs

        9.1.1 Controlling CPU and Memory

        9.1.2 Controlling Disk Space

        9.1.3 Setting HVM Mode

      9.2 Docker Containers

        9.2.1 Basic Examples

        9.2.2 Disk Images

        9.2.3 External Images

        9.2.4 Dockerfiles

        9.2.5 Augmented Disk Images

        9.2.6 Remote Access

        9.2.7 Console

        9.2.8 ENTRYPOINT and CMD

        9.2.9 Shared Containers

        9.2.10 Privileged Containers

        9.2.11 Remote Blockstores

        9.2.12 Temporary Block Storage

        9.2.13 DockerContainer Member Variables

    10 Advanced Topics

      10.1 Disk Images

      10.2 RSpecs

      10.3 Public IP Access

        10.3.1 Dynamic Public IP Addresses

      10.4 Markdown

      10.5 Introspection

        10.5.1 Client ID

        10.5.2 Control MAC

        10.5.3 Manifest

        10.5.4 Private key

        10.5.5 Profile parameters

      10.6 User-controlled switches and layer-1 topologies

    11 Hardware

      11.1 CloudLab Utah

      11.2 CloudLab Wisconsin

      11.3 CloudLab Clemson

      11.4 Apt Cluster

      11.5 IG-DDC Cluster

    12 Planned Features

      12.1 Improved Physical Resource Descriptions

    13 CloudLab OpenStack Tutorial

      13.1 Objectives

      13.2 Prerequisites

      13.3 Logging In

        13.3.1 Using a CloudLab Account

        13.3.2 Using a GENI Account

      13.4 Building Your Own OpenStack Cloud

      13.5 Exploring Your Experiment

        13.5.1 Experiment Status

        13.5.2 Profile Instructions

        13.5.3 Topology View

        13.5.4 List View

        13.5.5 Manifest View

        13.5.6 Graphs View

        13.5.7 Actions

        13.5.8 Web-based Shell

        13.5.9 Serial Console

      13.6 Bringing up Instances in OpenStack

      13.7 Administering OpenStack

        13.7.1 Log Into The Control Nodes

        13.7.2 Reboot the Compute Node

      13.8 Terminating the Experiment

      13.9 Taking Next Steps

    14 CloudLab Chef Tutorial

      14.1 Objectives

      14.2 Motivation

      14.3 Prerequisites

      14.4 Logging In

        14.4.1 Using a CloudLab Account

        14.4.2 Using a GENI Account

      14.5 Launching Chef Experiments

      14.6 Exploring Your Experiment

        14.6.1 Experiment Status

        14.6.2 Profile Instructions

        14.6.3 Topology View

        14.6.4 List View

        14.6.5 Manifest View

        14.6.6 Actions

      14.7 Brief Introduction to Chef

      14.8 Logging in to the Chef Web Console

        14.8.1 Web-based Shell

        14.8.2 Chef Web Console

      14.9 Configuring NFS

        14.9.1 Exploring The Structure

      14.10 Apache Web Server and ApacheBench Benchmarking tool

        14.10.1 Understanding the Internals

      14.11 Final Remarks about Chef on CloudLab

      14.12 Terminating Your Experiment

      14.13 Future Steps

    15 Getting Help

1 CloudLab Status Notes

CloudLab is open! The hardware that is currently deployed comprises our Phase I deployment and most of Phase II, and we are constantly adding new features.

2 Getting Started

This chapter will walk you through a simple experiment on CloudLab and introduce you to some of its basic concepts.

Start by pointing your browser at https://www.cloudlab.us/.

  1. Log in
    You’ll need an account to use CloudLab. If you already have an account on Emulab.net, you may use that username and password. Or, if you have an account at the GENI portal, you may use the “GENI User” button to log in using that account. If not, you can apply to start a new project at https://www.cloudlab.us/signup.php by taking the "Start New Project" option. See the chapter about CloudLab users for more details about user accounts.
    screenshots/clab/log-in.png

  2. Start Experiment
    From the top menu, click “Experiments” and then “Start Experiment” to begin.

  3. Experiment Wizard
    Experiments must be configured before they can be instantiated. A short wizard guides you through the process. The first step is to pick a profile for your experiment. A profile describes a set of resources (both hardware and software) that will be used to start your experiment. On the hardware side, the profile will control whether you get virtual machines or physical ones, how many there are, and what the network between them looks like. On the software side, the profile specifies the operating system and installed software.
    Profiles come from two sources. Some of them are provided by CloudLab itself, and provide standard installations of popular operating systems, software stacks, etc. Others are created by other researchers and may contain research software, artifacts and data used to gather published results, etc. Profiles represent a powerful way to enable repeatable research.
    Clicking the “Change Profile” button will let you select the profile that your experiment will be built from.
    screenshots/clab/begin-experiment.png

  4. Select a profile
    On the left side is the profile selector which lists the profiles you can choose. The list contains both globally accessible profiles and profiles accessible to the projects you are part of.
    The large display in this dialog box shows the network topology of the profile, and a short description sits below the topology view.
    The OpenStack profile will give you a small OpenStack installation with one master node and one compute node. It provides a simple example of how complex software stacks can be packaged up within CloudLab. If you’d prefer to start from bare metal, look for one of the profiles that installs a stock operating system on physical machines.
    screenshots/clab/select-profile.png

  5. Choose Parameters
    Some profiles are simple and provide the same topology every time they are instantiated. But others, like the OpenStack profile, are parameterized and allow users to make choices about how they are instantiated. The OpenStack profile allows you to pick the number of compute nodes, the hardware to use, and many more options. The creator of the profile chooses which options to allow and provides information on what those options mean. Just mouse over a blue ’?’ to see a description of an option. For now, stick with the default options and click “Next” to continue.
    screenshots/clab/choose-parameters.png

  6. Pick a cluster
    CloudLab can instantiate profiles on several different backend clusters. The cluster selector is located right above the “Create” button; the cluster most suited to the profile you’ve chosen will be selected by default.
    screenshots/clab/pick-datacenter.png

  7. Click Create!
    When you click the “Create” button, CloudLab will start preparing your experiment by selecting nodes, installing software, etc. as described in the profile. What’s going on behind the scenes is that on one (or more) of the machines in one of the CloudLab clusters, a disk is being imaged, VMs and/or physical machines booted, accounts created for you, etc. This process usually takes a couple of minutes.
    screenshots/clab/please-wait.png

  8. Use your experiment
    When your experiment is ready to use, the progress bar will be complete, and you’ll be given a lot of new options at the bottom of the screen.
    screenshots/clab/node-list.png
    The “Topology View” shows the network topology of your experiment (which may be as simple as a single node). Clicking on a node in this view brings up a terminal in your browser that gives you a shell on the node. The “List View” lists all nodes in the topology, and in addition to the in-browser shell, gives you the ssh command to log in to the node (if you provided a public key). The “Manifest” tab shows you the technical details of the resources allocated to your experiment. Any open terminals you have to the nodes show up as tabs on this page.
    Clicking on the “Profile Instructions” link (if present) will show instructions provided by the profile’s creator regarding its use.
    Your experiment is yours alone, and you have full “root” access (via the sudo command). No one else has access to the nodes in your experiment, and you may do anything at all inside of it, up to and including making radical changes to the operating system itself. We’ll clean it all up when you’re done!
    If you used our default OpenStack profile, the instructions will contain a link to the OpenStack web interface. The instructions will also give you a username and password to use.
    screenshots/clab/openstack-admin.png
    Since you gave CloudLab an ssh public key as part of account creation, you can log in using the ssh client on your laptop or desktop. The controller node is a good place to start, since you can poke around with the OpenStack admin commands. Go to the "list view" on the experiment page to get a full command line for the ssh command.
    screenshots/clab/openstack-shell.png
    Your experiment will terminate automatically after a few hours. When the experiment terminates, you will lose anything on disk on the nodes, so be sure to copy off anything important early and often. You can use the “Extend” button to submit a request to hold it longer, or the “Terminate” button to end it early.

2.1 Next Steps

3 CloudLab Users

Registering for an account is quick and easy. Registering doesn’t cost anything; it’s simply for accountability. We just ask that if you’re going to use CloudLab for anything other than light use, you tell us a bit more about who you are and what you want to use CloudLab for.

Users in CloudLab are grouped into projects: a project is a (loosely-defined) group of people working together on some common goal, whether that be a research project, a class, etc. CloudLab places a lot of trust in project leaders, including the ability to authorize others to use CloudLab. We therefore require that project leaders be faculty, senior research staff, or others in relatively senior positions.

3.1 GENI Users

If you already have a GENI account, you may use it instead of creating a new CloudLab account. On the login page, select the “GENI User” button. You will be taken to a page like the one below to select where you normally log into your GENI account.

screenshots/clab/pick-ma.png

From here, you will be taken to the login page of your GENI federate; for example, the login page for the GENI portal is shown below.

screenshots/clab/geni-portal-login.png

After you log in, you will be asked to authorize the CloudLab portal to use this account on your behalf. If your certificate at your GENI aggregate has a passphrase on it, you may be asked to enter that passphrase; if not (as is the case with the GENI portal), you will simply see an “authorize” button as below:

screenshots/clab/trusted-signer.png

That’s it! When you log in a second time, some of these steps may be skipped, as your browser has them cached.

3.2 Register for an Account

To get an account on CloudLab, you either join an existing project or create a new one. In general, if you are a student, you should join a project led by a faculty member with whom you’re working.

If you already have an account on Emulab.net, you don’t need to sign up for a new account on CloudLab—simply log in with your Emulab username and password.

3.2.1 Join an existing project

screenshots/clab/join-project.png

To join an existing project, simply use the “Sign Up” button found on every CloudLab page. The form will ask you a few basic questions about yourself and the institution you’re affiliated with.

An SSH public key is required; if you’re unfamiliar with creating and using ssh keypairs, we recommend taking a look at the first few steps in GitHub’s guide to generating SSH keys. (Obviously, the steps about how to upload the keypair into GitHub don’t apply to CloudLab.)

CloudLab will send you email to confirm your address—watch for it (it might end up in your spam folder), as your request won’t be processed until you’ve confirmed your address.

You’ll be asked to enter the project ID for the project you are asking to join; you should get this from the leader of the project, likely your advisor or your class instructor. (If they don’t already have a project on CloudLab, you can ask them to create one.) The leader of your project is responsible for approving your account.

3.2.2 Create a new project

screenshots/clab/create-project.png

You should only start a new project if you are a faculty member, senior research staff, or in some other senior position. Students should ask their advisor or course instructor to create a new project.

To start a new project, use the “Sign Up” button found on every CloudLab page. In addition to basic information about yourself, the form will ask you a few questions about how you intend to use CloudLab. The application will be reviewed by our staff, so please provide enough information for us to understand the research or educational value of your project. The review process may take a few days, and you will receive mail informing you of the outcome.

Every person working in your project needs to have their own account. You get to approve these additional users yourself (you will receive email when anyone applies to join.) It is common, for example, for a faculty member to create a project which is primarily used by his or her students, who are the ones who run experiments. We still require that the project leader be the faculty member, as we require that there is someone in a position of authority we can contact if there are questions about the activities of the project.

Note that projects in CloudLab are publicly listed: a page that allows users to see a list of all projects and search through them does not exist yet, but it will in the future.

4 CloudLab and Repeatable Research

One of CloudLab’s key goals is to enable repeatable research: we aim to make it easier for researchers to get the same software and hardware environment so that they can repeat or build upon each other’s work.

CloudLab is designed as a scientific instrument. It gives full visibility into every aspect of the facility, and it’s designed to minimize the impact that simultaneous slices have on each other. This means that researchers using CloudLab can fully understand why their systems behave the way they do, and can have confidence that the results that they gather are not artifacts of competition for shared hardware resources. CloudLab profiles can also be published, giving other researchers the exact same environment—hardware and software—on which to repeat experiments and compare results.

CloudLab gives exclusive access to compute resources to one experiment at a time. (That experiment may involve re-exporting those resources to other users, for example, by running cloud services.) Storage resources attached to them (e.g., local disk) are also used by a single experiment at a time, and it is possible to run experiments that have exclusive access to switches.

5 Creating Profiles

In CloudLab, a profile captures an entire cloud environment—the software needed to run a particular cloud, plus a description of the hardware (including network topology) that the cloud software stack should run on.

When you create a new profile, you are creating a new RSpec, and, usually, creating one or more disk images that are referenced by that RSpec. When someone uses your profile, they will get their own experiment that boots up the resources (virtual or physical) described by the RSpec. It is common to create the resource specification for profiles using a GUI or by writing a Python script rather than dealing with RSpecs directly.

5.1 Creating a profile from an existing one

The easiest way to create a new profile is by cloning or copying an existing one and customizing it to your needs. The basic steps are described below.

When you clone an experiment, you are taking an existing experiment, including a snapshot of the disk, and creating a new profile based on it. The new profile will be identical to the profile that experiment was based on in all other respects. Cloning only works on experiments with a single node.

If you copy a profile, you are creating a new profile that is identical in every way to an existing profile. You may or may not have a running experiment using the source profile; if you do, it does not affect the copy. After copying a profile, you can modify it for your own use, and if you instantiate the copy, you can take snapshots of its disk images and use them in future versions of your copy. Any profile that you have access to may be copied.

5.1.1 Preparation and precautions

To create profiles, you need to be a registered user.

Cloning a profile can take a while, so we recommend that you extend your experiment while creating it, and contact us if you are worried your experiment might expire before you’re done creating your profile. We also strongly recommend testing your profile fully before terminating the experiment you’re creating it from.

When cloning, your home directory is not included in the disk image snapshot! You will need to install your code and data elsewhere in the image. We recommend /local/. Keep in mind that others who use your profile are going to have their own accounts, so make sure that nothing in your image makes assumptions about the username, home directory, etc. of the user running it.

When cloning, be aware that only the contents of the disk (not running processes, etc.) are stored as part of the profile, and as part of the creation process, your node(s) will be rebooted in order to take consistent snapshots of the disk.

For the time being, cloning only works for single-node profiles; we will add support for multi-node profiles in the future.

When copying a profile, remember that the disk images of a currently running experiment are not saved. If you want to customize the disk images using copy, you must copy the profile first, then instantiate your copy, then take snapshots of the modified disk image in your experiment.

5.1.2 Cloning a Profile
  1. Create an experiment
    Create an experiment using the profile that is most similar to the one you want to build. Usually, this will be one of our facility-provided profiles with a generic installation of Linux.

  2. Set up the node the way you want it
    Log into the node and install your software, datasets, packages, etc. Note the caveat above that it needs to be installed somewhere outside of your home directory, and should not be tied to your user account.

  3. Clone the experiment to create a new profile
    While you are logged in, the experiment page for your active experiments will have a “clone” button. Clicking this button will create a new profile based on your running experiment.
    Specifically, the button creates a copy of the RSpec used to create the experiment, passes it to the form used to create new profiles, and helps you create a disk image from your running experiment.
    screenshots/clab/clone-button.png

  4. Create Profile
    You will be taken to a complete profile form and should fill it out as described below.

5.1.3 Copying a Profile
  1. Choose a profile
    Find the profile you wish to copy using the “Start Experiment” selector. Then you can click “Show Profile” to copy the profile directly, or you can instantiate the profile if you wish to create an experiment first. Both profiles themselves and active experiments can be copied.
    screenshots/clab/begin-experiment.png

  2. Copy the profile or experiment
    While logged in, both your experiment page and the show profile page will have a copy button. Clicking this button will create a profile based on that profile or experiment.
    This button only copies the RSpec or geni-lib script; no state in the active experiment is preserved.
    screenshots/clab/copy-button.png

  3. Create Profile
    You will be taken to a complete profile form and should fill it out as described below.

5.1.4 Creating the Profile

After copying or cloning a profile (see above) or selecting the menu option to create a new profile from scratch, you will need to fill out the profile creation form in order to complete the creation process.

  1. Fill out information for the new profile
    After clicking the “clone” or “copy” button, you will see a form that allows you to view and edit the basic information associated with your profile.
    screenshots/clab/create-profile-form.png
    Each profile must be associated with a project. If you’re a member of more than one project, you’ll need to select which one you want the profile to belong to.
    Make sure to edit the profile’s Description and Instructions.
    The “Description” is the text that users will see when your profile is listed in CloudLab, when the user is selecting which profile to use. It is also displayed when following a direct link to your profile. It should give the reader a brief description of what they will get if they create an experiment with this profile. If the profile is associated with a paper or other publication, this is a good place to mention that. Markdown markup, including hyperlinks, is allowed in the profile description.
    The “Instructions” text is displayed on the experiment page after the user has created an experiment using the profile. This is a good place to tell them where the code and data can be found, what scripts they might want to run, etc. Again, Markdown is allowed.
    The “Steps” section allows you to create a “tour” of your profile, which is displayed after a user creates an experiment with it. This feature is mostly useful if your profile contains more than one node, and you wish to explain to the user what the purpose of each node is.
    You have the option of making your profile usable by anyone, by registered CloudLab users only, or by members of your project. Regardless of the setting you choose here, CloudLab will also give you a direct link that you can use to share your profile with others of your choosing.

  2. Click “Create”
    When you click the “Create” button, your node will be rebooted, so that we can take a consistent snapshot of the disk contents. This process can take several minutes or longer, depending on the size of your disk image. You can watch the progress on this page. When the progress bar reaches the “Ready” stage, your new profile is ready! It will now show up in your “My Profiles” list.
    screenshots/clab/image-creating.png

  3. Test your profile
    Before terminating your experiment (or letting it expire), we strongly recommend testing out the new profile. If you elected to make it publicly visible, it will be listed in the profile selection dialog on the front page of https://www.cloudlab.us/. If not, you can instantiate it from the listing in your “My Profiles” page. If the profile will be used by guest users, we recommend testing it as one yourself: log out, and instantiate it using a different username (you will also have to use an alternate email address).

  4. Share your profile
    Now that your profile is working, you can share it with others by sending them direct links, putting links on your webpage or in papers, etc. See “Sharing Profiles” for more details.

5.1.5 Updating a profile

You can update the metadata associated with a profile at any time by going to the “My Profiles” page and clicking on the name of the profile to go to the profile page. On this page, you can edit any of the text fields (Description, Instructions, etc.), change the permissions, etc.

As with cloning a profile, this snapshot feature currently only works with single-node profiles.

If you need to update the contents of the disk image in the profile, simply create a new experiment from the profile. Once your experiment is ready, you will see a “Snapshot” button on the experiment page. (You will only see this button on experiments created from profiles that you own.) Log into your node, get the disk changed the way you want, and click the button.

screenshots/clab/snapshot-button.png

This button kicks off the same image creation process that occurs during cloning a profile. Once it’s finished, any new experiments created from the profile will use the new image.

As with creating a new profile, we recommend testing the profile before letting your experiment expire. If something goes wrong, we do keep one previous image file for each profile; currently, the only way to get access to this backup is to contact us.

5.2 Creating a profile with a GUI

CloudLab embeds the Jacks GUI for simple creation of small profiles. Jacks can be accessed by clicking the “topology” button on the profile creation or editing page. Jacks is designed to be simple, and to ensure that the topologies drawn can be instantiated on the hardware available. Thus, after making certain choices (such as picking an operating system image) you may find that other choices (such as the node type) become limited.

screenshots/clab/jacks-blank.png

Jacks has a “palette” on the left side, giving the set of node types (such as physical or virtual machines) that are available. Dragging a node from this palette onto the larger canvas area on the right adds it to the topology. To create a link between nodes, move the mouse near the first node, and a small black line will appear. Click and drag to the second node to complete the link. To create a LAN (multi-endpoint link), create a link between two nodes, then drag links from other nodes to the small grey box that appears in the middle of the original link.

screenshots/clab/jacks-properties.png

To edit the properties of a node or link, select it by clicking on its icon on the canvas. The panel on the left side will be replaced by a property editor that allows you to set the disk image for the node, set commands to be run when the node boots, etc. To unselect the current node or link, and return to the palette on the left, simply click a blank area of the canvas.

5.3 Repository-Based Profiles

You can turn any public git repository (including those hosted on GitHub) into a CloudLab profile. Simply place a geni-lib script named profile.py into the top-level directory of your repository. When you create a new profile, you can provide the URL for your repository. The URL needs to be an http:// or https:// URL, and the CloudLab portal needs to be able to clone the repository without authentication.

Note that CloudLab is not a git hosting service; while we do keep a cache of your repository, we don’t guarantee that the profile will continue to work if the original repository becomes unavailable. We also have limits on the size of the repositories that we will clone.

When you instantiate a repository-based profile, the repository will be cloned into the directory /local/repository on all nodes in the experiment. This means that you can keep source code, startup scripts, etc. in your repository and reference them from profile.py. CloudLab sets the pull URL for all of these clones to be your “upstream” repository, and attempts to set a suitable push URL for it (it assumes that the hosting service uses ssh for pushes, and uses the git@<hostname>:user/repo convention). As a result, git pull and git push should be connected to your repository.
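
As a concrete illustration, a minimal repository-based profile.py might look like the sketch below. The setup.sh script name is hypothetical; the point is that anything checked in next to profile.py can be referenced under /local/repository.

"""A hypothetical repository-based profile that runs a setup script kept in
the same repository.

Instructions:
Wait for the node to boot; the setup script runs automatically at startup.
"""

import geni.portal as portal
import geni.rspec.pg as rspec

request = portal.context.makeRequestRSpec()

node = request.RawPC("node")

# The repository is cloned to /local/repository on every node, so files
# checked in alongside profile.py can be referenced from there.
# (setup.sh is an illustrative name; use whatever script your repository contains.)
node.addService(rspec.Execute(shell="bash", command="/local/repository/setup.sh"))

portal.context.printRequestRSpec()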

There is an example repository on GitHub at https://github.com/emulab/my-profile; if you don’t already have a git repository created, a good way to get started is to fork this one and create a new profile pointing at your fork.

Pushing to your repository is still governed by the authentication and permissions of your git hosting service, so others using your profile will not be able to push to your repository.

5.3.1 Updating Repository-Based Profiles

By default, the CloudLab profile does not automatically update whenever you push to your upstream repository; this means that people instantiating your profile see the repository as it existed at the time CloudLab last pulled from it.

You can manually cause CloudLab to pull from your repository using the “Update” button on the profile management page.

You can also set up CloudLab to automatically pull from your repository whenever it is updated. To do so, you will need to set up a “web hook” on the service that hosts your git repository. CloudLab currently supports webhooks for GitHub.com, BitBucket.org, and sites hosted using GitLab (including both GitLab.com and self-hosted GitLab installations.) See the “push URL” in the Repository Info panel on the left side of the profile page for the webhook URL, and use the “information” icon next to it to get specific instructions for setting up the webhook on each service. Once you have set the webhook up, every time you push to your repository, your hosting service will let CloudLab know that it should automatically initiate a pull. (This will not be instantaneous, but should complete quickly in most cases.)

5.3.2 Branches and Tags in Repository-Based Profiles

By default, repository-based profiles will be instantiated from the master branch. At the bottom of the profile page, you will also find a list of all branches and tags in the repository, and can instantiate the version contained in any of them. Branches can be used for development work that is not yet ready to become the master (default) version of the profile, and tags can be used to mark specific versions of the profiles that were used for specific papers or course assignments, for example.

5.4 Creating a profile from scratch

CloudLab profiles are described by GENI RSpecs. You can create a profile directly from an RSpec by using the “Create Profile” option from the “Actions” menu. Note that you cannot edit the text fields until you upload an RSpec, as these fields edit (in-browser) fields in the RSpec.

5.5 Sharing Profiles

If you chose to make your profile publicly visible, it will show up in the main “Select Profile” list on https://www.cloudlab.us/. CloudLab also gives you direct links to your profiles so that you can share them with others, post them on your website, publish them in papers, etc. The link can be found on the profile’s detail page, which is linked from your “My Profiles” page. If you chose to make your profile accessible to anyone, the link will take the form https://www.cloudlab.us/p/<project-id>/<profile-id>. If you didn’t make the profile public, the URL will have the form https://www.cloudlab.us/p/<UUID>, where UUID is a 128-bit number so that the URL is not guessable. You can still share this URL with anyone you want to have access to the profile—for example, to give it to a collaborator to try out your work before publishing.

5.6 Versioned Profiles

Profiles are versioned to capture the evolution of a profile over time. When updating profiles, the result is a new version that does not (entirely) replace the profile being updated.

When sharing a profile, you are given two links to share. One link will take the user to the most recent version of the profile that exists at the time they click the link. This is the most appropriate option in most cases. There is also a link that takes one to a specific version of the profile. This link is most useful for publication in papers or other cases in which reproducibility with the exact same environment is a concern.

6 Basic Concepts

This chapter covers the basic concepts that you’ll need to understand in order to use CloudLab.

6.1 Profiles

A profile encapsulates everything needed to run an experiment. It consists of two main parts: a description of the resources (hardware, storage, network, etc.) needed to run the experiment, and the software artifacts that run on those resources.

The resource specification is in the RSpec format. The RSpec describes an entire topology: this includes the nodes (hosts) that the software will run on, the storage that they are attached to, and the network that connects them. The nodes may be virtual machines or physical servers. The RSpec can specify the properties of these nodes, such as how much RAM they should have, how many cores, etc., or can directly reference a specific class of hardware available in one of CloudLab’s clusters. The network topology can include point-to-point links, LANs, etc., and may be built from either Ethernet or InfiniBand.

The primary way that software is associated with a profile is through disk images. A disk image (often just called an “image”) is a block-level snapshot of the contents of a real or virtual disk—and it can be loaded onto either. A disk image in CloudLab typically has an installation of a full operating system on it, plus additional packages, research software, data files, etc. that comprise the software environment of the profile. Each node in the RSpec is associated with a disk image, which it boots from; more than one node in a given profile may reference the same disk image, and more than one profile may use the same disk image as well.
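
In a geni-lib profile (see the chapter on describing a profile with python and geni-lib), these associations appear as attributes on a node. The sketch below is purely illustrative; the image URN and hardware type shown are example values, not recommendations.

import geni.portal as portal
import geni.rspec.pg as rspec

request = portal.context.makeRequestRSpec()

node = request.RawPC("node")

# Boot this node from a specific disk image (example URN; substitute the
# image you actually want).
node.disk_image = "urn:publicid:IDN+emulab.net+image+emulab-ops//UBUNTU16-64-STD"

# Request a specific hardware type available in one of the clusters
# ("c220g2" is just an example type).
node.hardware_type = "c220g2"

portal.context.printRequestRSpec()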

Profiles come from two sources: some are provided by CloudLab itself; these tend to be standard installations of popular operating systems and software stacks. Profiles may also be provided by CloudLab’s users, as a way for communities to share research artifacts.

6.1.1 On-demand Profiles

Profiles in CloudLab may be on-demand profiles, which means that they are designed to be instantiated for a relatively short period of time (hours or days). Each person instantiating the profile gets their own experiment, so everyone using the profile is doing so independently on their own set of resources.

6.1.2 Persistent Profiles

CloudLab also supports persistent profiles, which are longer-lived (weeks or months) and are set up to be shared by multiple users. A persistent profile can be thought of as a “testbed within a testbed”—a testbed facility built on top of CloudLab’s hardware and provisioning capabilities. Examples of persistent profiles include:

A persistent profile may offer its own user interface, and its users may not necessarily be aware that they are using CloudLab. For example, a cloud-style profile might directly offer its own API for provisioning virtual machines. Or, an HPC-style persistent profile might run a standard cluster scheduler, which users interact with rather than the CloudLab website.

6.2 Experiments

See the chapter on repeatability for more information on repeatable experimentation in CloudLab.

An experiment is an instantiation of a profile. An experiment uses resources, virtual or physical, on one or more of the clusters that CloudLab has access to. In most cases, the resources used by an experiment are devoted to the individual use of the user who instantiates the experiment. This means that no one else has an account, access to the filesystems, etc. In the case of experiments using solely physical machines, this also means strong performance isolation from all other CloudLab users. (The exceptions to this rule are persistent profiles, which may offer resources to many users.)

Experiments on CloudLab consume real resources, which are limited. We ask that you not hold on to experiments when you are not actively using them. If you are holding on to experiments because getting your working environment set up takes time, consider creating a profile.

The contents of local disk on nodes in an experiment are considered ephemeral; that is, when the experiment ends (either by being explicitly terminated by the user or by expiring), the contents of the disk are lost. So, you should copy important data off before the experiment ends. A simple way to do this is through scp or sftp. You may also create a profile, which captures the contents of the disk in a disk image.

All experiments have an expiration time. By default, the expiration time is short (a few hours), but users can use the “Extend” button on the experiment page to request an extension. A request for an extension must be accompanied by a short description that explains the reason for requesting an extension, which will be reviewed by CloudLab staff. You will receive email a few hours before your experiment expires reminding you to copy your data off or request an extension.

6.2.1 Extending Experiments

If you need more time to run an experiment, you may use the “Extend” button on the experiment’s page. You will be presented with a dialog that allows you to select how much longer you need the experiment for. Longer time periods require more extensive approval processes: short extensions are auto-approved, while longer ones require the intervention of CloudLab staff or, in the case of indefinite extensions, the steering committee.

screenshots/clab/extend-experiment.png

6.3 Projects

Users are grouped into projects. A project is, roughly speaking, a group of people working together on a common research or educational goal. This may be people in a particular research lab, a distributed set of collaborators, instructors and students in a class, etc.

A project is headed by a project leader. We require that project leaders be faculty, senior research staff, or others in an authoritative position. This is because we trust the project leader to approve other members into the project, ultimately making them responsible for the conduct of the users they approve. If CloudLab staff have questions about a project’s activities, its use of resources, etc., these questions will be directed to the project leader. Some project leaders run a lot of experiments themselves, while some choose to approve accounts for others in the project, who run most of the experiments. Either style works just fine in CloudLab.

Permissions for some operations / objects depend on the project that they belong to. Currently, the only such permission is the ability to make a profile visible only to the owning project. We expect to introduce more project-specific permissions features in the future.

6.4 Physical Machines

Users of CloudLab may get exclusive, root-level control over physical machines. When allocated this way, no layers of virtualization or indirection get in the way of performance, and users can be sure that no other users have access to the machines at the same time. This is an ideal situation for repeatable research.

Physical machines are re-imaged between users, so you can be sure that your physical machines don’t have any state left around from the previous user. You can find descriptions of the hardware in CloudLab’s clusters in the hardware chapter.

6.5 Virtual Machines and Containers

While CloudLab does have the ability to provision virtual machines (using the Xen hypervisor) and containers (using Docker), we expect that the dominant use of CloudLab will be users provisioning physical machines. Users (or the cloud software stacks that they run) may build their own virtual machines on these physical nodes using whatever hypervisor they wish. However, if your experiment could still benefit from the use of virtual machines or containers (e.g., to form a scalable pool of clients issuing requests to your cloud software stack), you can find more detail in the advanced topics section.

7 Resource Reservations

CloudLab supports reservations that allow you to request resources ahead of time. This can be useful for tutorials, classes, and to run larger experiments than are typically possible on a first-come, first-served basis.

Reservations in CloudLab are per-cluster and per-type. They are tied to a project: any experiment belonging to that project may use the reserved nodes. Reservations are not tied to specific nodes; this gives CloudLab maximum flexibility to do late binding of nodes to reservations, which makes them minimally intrusive on other users of the testbed.

7.1 What Reservations Guarantee

Having a reservation guarantees that, at minimum, the specified quantity of nodes of the specified type will be available for use by the project during the specified time window.

Having a reservation does not automatically start an experiment at that time: it ensures that the specified number of nodes are available for use by any experiments that you or your fellow project members start.

More than one experiment may use nodes from the reservation; for example, a tutorial in which 40 students each will run an experiment having a single node may be handled as a single 40-node reservation. You may also start and terminate multiple experiments in series over the course of the reservation: reserved nodes will not be returned to general use until your reservation ends.

A reservation guarantees the minimum number of nodes that will be available; you may use more so long as they are not used by other experiments or reservations.

Experiments run during a reservation do not automatically terminate at the end of the reservation; they simply become subject to the normal resource usage policies, and, for example, may become non-extendable due to other reservations that start after yours.

Important caveats include:

7.2 How Reservations May Affect You

Reservations held by others may affect your experiments in two ways: they may prevent you from creating new experiments or may prevent you from extending existing experiments. This “admission control system” is how we ensure that nodes are available for those that have them reserved.

If there is an ongoing or upcoming reservation by another project, you may encounter an “admission control” failure when trying to create a new experiment. This means that, although there are enough nodes that are not currently allocated to a particular experiment, some or all of those nodes are required in order to fulfill a reservation. Note that the admission control system assumes that your experiment will last for the full default experiment duration when making this calculation. For example, if the default experiment duration is 24 hours, and a large reservation will start in 10 hours, your experiment may fail to be created due to the admission control system. If the large reservation starts in 30 hours, you will be able to create the experiment, but you may not be able to extend it.

Reservations can also prevent you from extending existing experiments, if that extension would cause too few nodes to be available to satisfy a reservation. A message will appear on the experiment’s status page warning you when this situation will occur in the near future, and the reservation request dialog will limit the length of reservation that you can request. If this happens, be sure to save all of your work, as the administrators cannot grant extensions that would interfere with reservations.

7.3 Making a Reservation

To request a reservation, use the “Reserve Nodes” item from the “Actions” menu.

screenshots/apt/reservation-form.png

After filling out the number and type of nodes and the time period, use the check button to see whether the reservation is possible. If your request is satisfiable, you will get a dialog box that lets you submit the request.

screenshots/apt/reservation-submit.png

If your request is not satisfiable, you will be given a chance to modify the request and “check” again. In this case, the time when there will not be enough nodes is shown, as is the number of nodes by which the request exceeds available resources. To make your reservation fit, try asking for a different type of nodes, a smaller number, or a time further in the future.

Not all reservation requests are automatically accepted. Your request will be shown as “pending” while it is being reviewed by the CloudLab administrators. Requesting smaller numbers of nodes, or shorter periods of time, will maximize the chances that your request is accepted. Be sure to include meaningful text in the “Reason” field, as administrators will use this to determine whether to grant your reservation.

You may have more than one reservation at a time; if you need resources of more than one type, or on different clusters, you can get this by requesting multiple reservations.

7.4 Using a Reservation

To use a reservation, simply create experiments as normal. Experiments run during the duration of the reservation (even those begun before its start time) are automatically counted towards the reservation. Experiments run during reservations have expiration times as do normal experiments, so be sure to extend them if necessary.

Since reservations are per-project, if you belong to more than one, make sure to create the experiment under the correct project.

Experiments are not automatically terminated at the conclusion of a reservation (though it may not be possible to extend them due to other reservations). Remember to terminate your experiments when you are done with them, as you would do normally.

8 Describing a profile with python and geni-lib

geni-lib is a tool that allows users to generate RSpec files from Python code. CloudLab offers the ability to use geni-lib scripts as the definition of a profile, rather than the more primitive RSpec format. When you supply a geni-lib script on the Create Profile page, your script is uploaded to the server so that it can be executed in the geni-lib environment. This allows the script to be verified for correctness, and also produces the equivalent RSpec representation that you can view if you so desire.

screenshots/clab/create-geni-lib-empty.png

When you provide a geni-lib script, you will see a slightly different set of buttons on the Create Profile page; next to the “Source” button there is an “XML” button that will pop up the RSpec XML for you to look at. The XML is read-only; if you want to change the profile, you will need to change the python source code that is displayed when you click on the “Source” button. Each time you change the python source code, the script is uploaded to the server and processed. Be sure to save your changes if you are updating an existing profile.

The following examples demonstrate basic geni-lib usage. More information about geni-lib, and additional examples, can be found in the geni-lib repository. The current version of geni-lib used by CloudLab can be found in the 0.9-EMULAB branch. Its full documentation is online as part of this manual.

8.1 A single XEN VM node

"""An example of constructing a profile with a single Xen VM.

Instructions:
Wait for the profile instance to start, and then log in to the VM via the
ssh port specified below.  (Note that in this case, you will need to access
the VM through a high port on the physical host, since we have not requested
a public IP address for the VM itself.)
"""

# Import the Portal object.
import geni.portal as portal
import geni.rspec.pg as rspec

# Create a Request object to start building the RSpec.
request = portal.context.makeRequestRSpec()

# Add a XenVM (named "node") to the request
node = request.XenVM("node")

# Write the request in RSpec format
portal.context.printRequestRSpec()

Open this profile on CloudLab

This example demonstrates the two most important objects: the portal context (accessed through the portal.context object in the geni.portal module), and the request RSpec created by calling makeRequestRSpec() on it. These fundamental objects are central to essentially all CloudLab geni-lib profiles.

Another way to create a Request RSpec object is to call its constructor, geni.rspec.pg.Request, directly. We ask the Context to create it for us so that it is "bound" to the context and does not need to be explicitly passed to other functions on the context.
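
The following is a minimal sketch of this alternative construction; it assumes (as the text above implies) that the context’s printRequestRSpec() accepts an explicitly constructed Request object, so treat the exact call as an assumption and prefer makeRequestRSpec() in practice.

import geni.portal as portal
import geni.rspec.pg as pg

# Construct the Request directly rather than asking the portal context for one.
request = pg.Request()

# Because this Request is not bound to the context, resources are created
# explicitly and added with addResource().
node = pg.RawPC("node")
request.addResource(node)

# The request must be passed explicitly when printing (assumed signature;
# see the geni-lib portal documentation).
portal.context.printRequestRSpec(request)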

Once the request object has been created, resources may be added to it by calling methods on it like RawPC() or rspec.pg.LAN. In this example, just a single node (created with the XenVM() constructor, asking for a single VM identified by the name "node") is requested.

Most functions called on Request objects are not directly members of that class. Rather, they are loaded as "extensions" by modules such as geni.rspec.emulab.

The final action the geni-lib script performs is to generate the XML representation of the request RSpec, with the printRequestRSpec() call on the last line. This has the effect of communicating the description of all the resources requested by the profile back to CloudLab.

You will also notice that the profile begins with a string literal (to be precise, it is a Python docstring). The initial text will also be used as the profile description; the text following the Instructions: line will be used as the corresponding instructions. This documentation is so important that adding the description to the profile is mandatory. (Using a docstring like this is not the only way to produce the description and instructions, although it is the most convenient.)

This simple example has now demonstrated all the important elements of a geni-lib profile. The portal context and request RSpec objects, the final printRequestRSpec() call, and the docstring description and instructions are “boilerplate” constructions, and you will probably include similar or identical versions of them in every geni-lib profile you create unless you are doing something quite unusual.

8.2 A single physical host

"""An example of constructing a profile with a single raw PC.

Instructions:
Wait for the profile instance to start, and then log in to the host via the
ssh port specified below.
"""

import geni.portal as portal
import geni.rspec.pg as rspec

# Create a Request object to start building the RSpec.
request = portal.context.makeRequestRSpec()

# Create a raw PC
node = request.RawPC("node")

# Print the RSpec to the enclosing page.
portal.context.printRequestRSpec()

Open this profile on CloudLab

As mentioned above, most of these simple examples consist of boilerplate geni-lib fragments, and indeed the portal context and request RSpec operations are unchanged from the previous script. The big difference (other than the updated documentation) is that in this case the RawPC() method is invoked on the Request object instead of XenVM(). As you might expect, the new profile will request a physical host instead of a virtual one. (A side effect of using a real machine is that it automatically comes with a unique public IP address, whereas the VM used in the earlier example did not. Profiles can request public IP addresses for VMs too, though it does not happen by default.)

8.3 Two XenVM nodes with a link between them

"""An example of constructing a profile with two VMs connected by a LAN. Instructions: Wait for the profile instance to start, and then log in to either VM via the ssh ports specified below. """ import geni.portal as portal import geni.rspec.pg as rspec request = portal.context.makeRequestRSpec() # Create two XenVM nodes. node1 = request.XenVM("node1") node2 = request.XenVM("node2") # Create a link between them link1 = request.Link(members = [node1,node2]) portal.context.printRequestRSpec() Open this profile on CloudLab"""An example of constructing a profile with two VMs connected by a LAN. Instructions: Wait for the profile instance to start, and then log in to either VM via the ssh ports specified below. """ import geni.portal as portal import geni.rspec.pg as rspec request = portal.context.makeRequestRSpec() # Create two XenVM nodes. node1 = request.XenVM("node1") node2 = request.XenVM("node2") # Create a link between them link1 = request.Link(members = [node1,node2]) portal.context.printRequestRSpec()

This example demonstrates two important geni-lib concepts: first, adding more than a single node to the request (which is a relatively straightforward matter of calling more than one node object constructor, being careful to use a different name each time). It also shows how to add links between nodes. It is possible to construct links and LANs in a more complicated manner (such as explicitly creating Interface objects to control interfaces), but the simplest case is to supply the member nodes at the time the link is created.
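
For comparison, here is a sketch of the more explicit style mentioned above, in which Interface objects are created by hand and then added to the link (the same approach used by the IP-address example in a later subsection):

import geni.portal as portal
import geni.rspec.pg as rspec

request = portal.context.makeRequestRSpec()
node1 = request.XenVM("node1")
node2 = request.XenVM("node2")

# Create Interface objects explicitly instead of passing members= to Link().
iface1 = node1.addInterface("if1")
iface2 = node2.addInterface("if2")

link1 = request.Link("link1")
link1.addInterface(iface1)
link1.addInterface(iface2)

portal.context.printRequestRSpec()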

8.4 Two ARM64 servers in a LAN

"""An example of constructing a profile with two ARM64 nodes connected by a LAN. Instructions: Wait for the profile instance to start, and then log in to either host via the ssh ports specified below. """ import geni.portal as portal import geni.rspec.pg as rspec request = portal.context.makeRequestRSpec() # Create two raw "PC" nodes node1 = request.RawPC("node1") node2 = request.RawPC("node2") # Set each of the two to specifically request "m400" nodes, which in CloudLab, are ARM node1.hardware_type = "m400" node2.hardware_type = "m400" # Create a link between them link1 = request.Link(members = [node1, node2]) portal.context.printRequestRSpec() Open this profile on CloudLab"""An example of constructing a profile with two ARM64 nodes connected by a LAN. Instructions: Wait for the profile instance to start, and then log in to either host via the ssh ports specified below. """ import geni.portal as portal import geni.rspec.pg as rspec request = portal.context.makeRequestRSpec() # Create two raw "PC" nodes node1 = request.RawPC("node1") node2 = request.RawPC("node2") # Set each of the two to specifically request "m400" nodes, which in CloudLab, are ARM node1.hardware_type = "m400" node2.hardware_type = "m400" # Create a link between them link1 = request.Link(members = [node1, node2]) portal.context.printRequestRSpec()

This example demonstrates how to request particular properties of nodes—until now, all nodes have been either Xen VMs or raw PCs, and nothing further was said about them. geni-lib allows the user to specify various details about the nodes, and this example makes use of the hardware_type property. The hardware_type can be set to a string describing the type of physical machine onto which the logical node can be mapped: in this case, the string is "m400", which refers to a ProLiant Moonshot m400 host (an ARM64 server). Obviously, such a profile cannot be instantiated on a cluster without a sufficient quantity of appropriate machines! (This profile was written with the Utah CloudLab cluster in mind.) CloudLab will indicate a list of suitable clusters when the user attempts to instantiate the profile, so there is no need to find one by trial and error.

8.5 A VM with a custom size

"""An example of constructing a profile with a single Xen VM. Instructions: Wait for the profile instance to start, and then log in to the VM via the ssh port specified below. (Note that in this case, you will need to access the VM through a high port on the physical host, since we have not requested a public IP address for the VM itself.) """ import geni.portal as portal import geni.rspec.pg as rspec # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Create a XenVM node = request.XenVM("node") # Ask for two cores node.cores = 2 # Ask for 2GB of ram node.ram = 2048 # Add an extra 8GB of space on the primary disk. # NOTE: Use fdisk, the extra space is in the 4th DOS partition, # you will need to create a filesystem and mount it. node.disk = 8 # Alternate method; request an ephemeral blockstore mounted at /mydata. # NOTE: Comment out the above line (node.disk) if you do it this way. #bs = node.Blockstore("bs", "/mydata") #bs.size = "8GB" #bs.placement = "nonsysvol" # Print the RSpec to the enclosing page. portal.context.printRequestRSpec() Open this profile on CloudLab"""An example of constructing a profile with a single Xen VM. Instructions: Wait for the profile instance to start, and then log in to the VM via the ssh port specified below. (Note that in this case, you will need to access the VM through a high port on the physical host, since we have not requested a public IP address for the VM itself.) """ import geni.portal as portal import geni.rspec.pg as rspec # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Create a XenVM node = request.XenVM("node") # Ask for two cores node.cores = 2 # Ask for 2GB of ram node.ram = 2048 # Add an extra 8GB of space on the primary disk. # NOTE: Use fdisk, the extra space is in the 4th DOS partition, # you will need to create a filesystem and mount it. node.disk = 8 # Alternate method; request an ephemeral blockstore mounted at /mydata. # NOTE: Comment out the above line (node.disk) if you do it this way. #bs = node.Blockstore("bs", "/mydata") #bs.size = "8GB" #bs.placement = "nonsysvol" # Print the RSpec to the enclosing page. portal.context.printRequestRSpec()

The earlier examples requesting VMs used the default number of cores, quantity of RAM, and disk size. It is also possible to customize these values, as this example does by setting the cores, ram, and disk properties of the XenVM class (which is a subclass of rspec.pg.Node).

8.6 Set a specific IP address on each node

"""An example of constructing a profile with node IP addresses specified manually. Instructions: Wait for the profile instance to start, and then log in to either VM via the ssh ports specified below. (Note that even though the EXPERIMENTAL data plane interfaces will use the addresses given in the profile, you will still connect over the control plane interfaces using addresses given by the testbed. The data plane addresses are for intra-experiment communication only.) """ import geni.portal as portal import geni.rspec.pg as rspec request = portal.context.makeRequestRSpec() node1 = request.XenVM("node1") iface1 = node1.addInterface("if1") # Specify the component id and the IPv4 address iface1.component_id = "eth1" iface1.addAddress(rspec.IPv4Address("192.168.1.1", "255.255.255.0")) node2 = request.XenVM("node2") iface2 = node2.addInterface("if2") # Specify the component id and the IPv4 address iface2.component_id = "eth2" iface2.addAddress(rspec.IPv4Address("192.168.1.2", "255.255.255.0")) link = request.LAN("lan") link.addInterface(iface1) link.addInterface(iface2) portal.context.printRequestRSpec() Open this profile on CloudLab"""An example of constructing a profile with node IP addresses specified manually. Instructions: Wait for the profile instance to start, and then log in to either VM via the ssh ports specified below. (Note that even though the EXPERIMENTAL data plane interfaces will use the addresses given in the profile, you will still connect over the control plane interfaces using addresses given by the testbed. The data plane addresses are for intra-experiment communication only.) """ import geni.portal as portal import geni.rspec.pg as rspec request = portal.context.makeRequestRSpec() node1 = request.XenVM("node1") iface1 = node1.addInterface("if1") # Specify the component id and the IPv4 address iface1.component_id = "eth1" iface1.addAddress(rspec.IPv4Address("192.168.1.1", "255.255.255.0")) node2 = request.XenVM("node2") iface2 = node2.addInterface("if2") # Specify the component id and the IPv4 address iface2.component_id = "eth2" iface2.addAddress(rspec.IPv4Address("192.168.1.2", "255.255.255.0")) link = request.LAN("lan") link.addInterface(iface1) link.addInterface(iface2) portal.context.printRequestRSpec()

This code sample assigns specific IP addresses to interfaces on the nodes it requests.

Some of the available qualifiers on requested nodes are specified by manipulating attributes of the node (or interface) object directly. The hardware_type in the previous example is one such case, as is the component_id here. (Note that the component_id in this example is applied to an interface; component_ids can also be specified on nodes to request a particular physical host.)
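
For example, a hedged sketch of the node-level case just described might look like the following; the host name is a placeholder, and you would substitute the identifier of an actual machine in the target cluster:

import geni.portal as portal
import geni.rspec.pg as rspec

request = portal.context.makeRequestRSpec()
node = request.RawPC("node")
# Bind this request to one particular physical host (placeholder name).
node.component_id = "pc123"
portal.context.printRequestRSpec()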

Other modifications to requests require dedicated methods. For instance, see the addAddress() calls made on each of the two interfaces above. In each case, an IPv4Address object is obtained from the appropriate constructor (the parameters are the address and the netmask, respectively), and then added to the corresponding interface.

8.7 Specify an operating system and set install and execute scripts

"""An example of constructing a profile with install and execute services. Instructions: Wait for the profile instance to start, then click on the node in the topology and choose the `shell` menu item. The install and execute services are handled automatically during profile instantiation, with no manual intervention required. """ # Import the Portal object. import geni.portal as portal # Import the ProtoGENI library. import geni.rspec.pg as rspec # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Add a raw PC to the request. node = request.RawPC("node") # Install and execute scripts on the node. THIS TAR FILE DOES NOT ACTUALLY EXIST! node.addService(rspec.Install(url="http://example.org/sample.tar.gz", path="/local")) node.addService(rspec.Execute(shell="bash", command="/local/example.sh")) portal.context.printRequestRSpec()Open this profile on CloudLab"""An example of constructing a profile with install and execute services. Instructions: Wait for the profile instance to start, then click on the node in the topology and choose the `shell` menu item. The install and execute services are handled automatically during profile instantiation, with no manual intervention required. """ # Import the Portal object. import geni.portal as portal # Import the ProtoGENI library. import geni.rspec.pg as rspec # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Add a raw PC to the request. node = request.RawPC("node") # Install and execute scripts on the node. THIS TAR FILE DOES NOT ACTUALLY EXIST! node.addService(rspec.Install(url="http://example.org/sample.tar.gz", path="/local")) node.addService(rspec.Execute(shell="bash", command="/local/example.sh")) portal.context.printRequestRSpec()

This example demonstrates how to request services for a node, where CloudLab will automate some task as part of the profile instance setup procedure. In this case, two services are described (an install and an execute). This is a very common pair of services to request together: the Install object describes a service which retrieves a tarball from the location given in the url parameter and installs it into the local filesystem as specified by path. (The installation occurs during node setup, upon the first boot after the disk image has been loaded.) The second service, described by the Execute object, invokes a shell process to run the given command. In this example (as is common), the command refers directly to a file saved by the immediately preceding Install service. This works because CloudLab guarantees that all Install services complete before any Execute services are started. The command executes every time the node boots, so you can use it to start daemons, etc., that are necessary for your experiment.

8.8 Profiles with user-specified parameters

"""An example of using parameters to construct a profile with a variable number of nodes. Instructions: Wait for the profile instance to start, and then log in to one or more of the VMs via the ssh port(s) specified below. """ # Import the Portal object. import geni.portal as portal # Import the ProtoGENI library. import geni.rspec.pg as rspec # Describe the parameter(s) this profile script can accept. portal.context.defineParameter( "n", "Number of VMs", portal.ParameterType.INTEGER, 1 ) # Retrieve the values the user specifies during instantiation. params = portal.context.bindParameters() # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Check parameter validity. if params.n < 1 or params.n > 8: portal.context.reportError( portal.ParameterError( "You must choose at least 1 and no more than 8 VMs." ) ) for i in range( params.n ): # Create a XenVM and add it to the RSpec. node = request.XenVM( "node" + str( i ) ) # Print the RSpec to the enclosing page. portal.context.printRequestRSpec() Open this profile on CloudLab"""An example of using parameters to construct a profile with a variable number of nodes. Instructions: Wait for the profile instance to start, and then log in to one or more of the VMs via the ssh port(s) specified below. """ # Import the Portal object. import geni.portal as portal # Import the ProtoGENI library. import geni.rspec.pg as rspec # Describe the parameter(s) this profile script can accept. portal.context.defineParameter( "n", "Number of VMs", portal.ParameterType.INTEGER, 1 ) # Retrieve the values the user specifies during instantiation. params = portal.context.bindParameters() # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Check parameter validity. if params.n < 1 or params.n > 8: portal.context.reportError( portal.ParameterError( "You must choose at least 1 and no more than 8 VMs." ) ) for i in range( params.n ): # Create a XenVM and add it to the RSpec. node = request.XenVM( "node" + str( i ) ) # Print the RSpec to the enclosing page. portal.context.printRequestRSpec()

Until now, all of the geni-lib scripts have described profiles which could also have been generated with the Jacks GUI, or even by writing a raw XML RSpec directly. However, geni-lib profiles offer an important feature unavailable by the other methods: the ability to describe not a static request, but a request “template” which is dynamically constructed based on a user’s choices at the time the profile is instantiated. The mechanism for constructing such profiles relies on profile parameters; the geni-lib script describes the set of parameters it will accept, and then retrieves the corresponding values at instantiation time and is free to respond by constructing arbitrarily different resource requests based on that input.

The profile above accepts exactly one parameter—the number of VMs it will instantiate. You can see that the parameter is described via the portal.context object, using the defineParameter() call shown for the first time in this example. defineParameter() must be invoked once per profile parameter, and requires the parameter symbol, parameter description, type, and default value, respectively. The parameter symbol ("n" in this example) must be unique within the profile, and is used to retrieve the parameter’s value during script execution. The description ("Number of VMs", in this case) will be shown to prompt the user to supply a corresponding value when the profile is instantiated. The type is used partly to constrain the parameters to valid values, and partly to assist the instantiating user by suggesting appropriate choices. The list of valid types is:

portal.ParameterType.INTEGER: Simple integer
portal.ParameterType.STRING: Arbitrary (uninterpreted) string
portal.ParameterType.BOOLEAN: True or False
portal.ParameterType.IMAGE: URN to a disk image
portal.ParameterType.AGGREGATE: URN of a GENI Aggregate Manager
portal.ParameterType.NODETYPE: String specifying a type of node
portal.ParameterType.BANDWIDTH: Floating-point number specifying bandwidth in kbps
portal.ParameterType.LATENCY: Floating-point number specifying delay in ms
portal.ParameterType.SIZE: Integer used for memory or disk size (e.g., MB, GB, etc.)
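
Before moving on to the remaining defineParameter() fields, here is a short sketch combining several of the types above in a single profile; the parameter names are illustrative, not taken from an existing profile:

import geni.portal as portal
import geni.rspec.pg as rspec

# Define parameters of several different types.
portal.context.defineParameter("nodecount", "Number of nodes",
                               portal.ParameterType.INTEGER, 2)
portal.context.defineParameter("usevms", "Use Xen VMs instead of raw PCs",
                               portal.ParameterType.BOOLEAN, False)
portal.context.defineParameter("hwtype", "Hardware type for raw PCs",
                               portal.ParameterType.NODETYPE, "m400")
params = portal.context.bindParameters()

request = portal.context.makeRequestRSpec()
for i in range(params.nodecount):
    if params.usevms:
        node = request.XenVM("node" + str(i))
    else:
        node = request.RawPC("node" + str(i))
        node.hardware_type = params.hwtype
portal.context.printRequestRSpec()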

The last field is the default value of the parameter, and is required: not only must the field itself contain a valid value, but the set of all parameters must be valid when each of them assumes the default value. (This is partly so that the portal can construct a default topology for the profile without any manual intervention, and partly so that unprivileged users, who may lack permission to supply their own values, might still be able to instantiate the profile.)

After all parameters have been defined, the profile script may retrieve the runtime values with the bindParameters() method. This will return a Python class instance with one attribute for each parameter (with the name supplied during the appropriate defineParameter() call). In the example, the instance was assigned to params, and therefore the only parameter (which was called "n") is accessible as params.n.

Of course, it may be possible for the user to specify nonsensical values for a parameter, or perhaps give a set of parameters whose combination is invalid. A profile should detect error cases like these, and respond by constructing a portal.ParameterError object, which can be passed to the portal context’s reportError() method to abort generation of the RSpec.

8.9 Add temporary local disk space to a node

"""This profile demonstrates how to add some extra *local* disk space on your node. In general nodes have much more disk space then what you see with `df` when you log in. That extra space is in unallocated partitions or additional disk drives. An *ephemeral blockstore* is how you ask for some of that space to be allocated and mounted as a **temporary** filesystem (temporary means it will be lost when you terminate your experiment). Instructions: Log into your node, your **temporary** file system in mounted at `/mydata`. """ # Import the Portal object. import geni.portal as portal # Import the ProtoGENI library. import geni.rspec.pg as rspec # Import the emulab extensions library. import geni.rspec.emulab # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Allocate a node and ask for a 30GB file system mounted at /mydata node = request.RawPC("node") bs = node.Blockstore("bs", "/mydata") bs.size = "30GB" # Print the RSpec to the enclosing page. portal.context.printRequestRSpec() Open this profile on CloudLab"""This profile demonstrates how to add some extra *local* disk space on your node. In general nodes have much more disk space then what you see with `df` when you log in. That extra space is in unallocated partitions or additional disk drives. An *ephemeral blockstore* is how you ask for some of that space to be allocated and mounted as a **temporary** filesystem (temporary means it will be lost when you terminate your experiment). Instructions: Log into your node, your **temporary** file system in mounted at `/mydata`. """ # Import the Portal object. import geni.portal as portal # Import the ProtoGENI library. import geni.rspec.pg as rspec # Import the emulab extensions library. import geni.rspec.emulab # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Allocate a node and ask for a 30GB file system mounted at /mydata node = request.RawPC("node") bs = node.Blockstore("bs", "/mydata") bs.size = "30GB" # Print the RSpec to the enclosing page. portal.context.printRequestRSpec()

This example demonstrates how to request extra temporary disk space on a node. The extra disk space is allocated from unused disk partitions and is mounted at a directory of your choosing. In the example code above, we are asking for a 30GB file system mounted at "/mydata". Anything you store in this file system is temporary, and will be lost when your experiment is terminated.

The total size of the file systems you can ask for on a node is obviously limited by the amount of unused disk space available. The system does its best to find nodes with enough space to fulfill the request, but in general you are limited to temporary file systems in the tens of GB to a few hundred GB.

8.10 Creating a reusable dataset

"""This profile demonstrates how to use a remote dataset on your node, either a long term dataset or a short term dataset, created via the Portal. Instructions: Log into your node, your dataset file system in mounted at `/mydata`. """ # Import the Portal object. import geni.portal as portal # Import the ProtoGENI library. import geni.rspec.pg as rspec # Import the emulab extensions library. import geni.rspec.emulab # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Add a node to the request. node = request.RawPC("node") # We need a link to talk to the remote file system, so make an interface. iface = node.addInterface() # The remote file system is represented by special node. fsnode = request.RemoteBlockstore("fsnode", "/mydata") # This URN is displayed in the web interfaace for your dataset. fsnode.dataset = "urn:publicid:IDN+emulab.net:portalprofiles+ltdataset+DemoDataset" # # The "rwclone" attribute allows you to map a writable copy of the # indicated SAN-based dataset. In this way, multiple nodes can map # the same dataset simultaneously. In many situations, this is more # useful than a "readonly" mapping. For example, a dataset # containing a Linux source tree could be mapped into multiple # nodes, each of which could do its own independent, # non-conflicting configure and build in their respective copies. # Currently, rwclones are "ephemeral" in that any changes made are # lost when the experiment mapping the clone is terminated. # #fsnode.rwclone = True # # The "readonly" attribute, like the rwclone attribute, allows you to # map a dataset onto multiple nodes simultaneously. But with readonly, # those mappings will only allow read access (duh!) and any filesystem # (/mydata in this example) will thus be mounted read-only. Currently, # readonly mappings are implemented as clones that are exported # allowing just read access, so there are minimal efficiency reasons to # use a readonly mapping rather than a clone. The main reason to use a # readonly mapping is to avoid a situation in which you forget that # changes to a clone dataset are ephemeral, and then lose some # important changes when you terminate the experiment. # #fsnode.readonly = True # Now we add the link between the node and the special node fslink = request.Link("fslink") fslink.addInterface(iface) fslink.addInterface(fsnode.interface) # Special attributes for this link that we must use. fslink.best_effort = True fslink.vlan_tagging = True # Print the RSpec to the enclosing page. portal.context.printRequestRSpec()Open this profile on CloudLab"""This profile demonstrates how to use a remote dataset on your node, either a long term dataset or a short term dataset, created via the Portal. Instructions: Log into your node, your dataset file system in mounted at `/mydata`. """ # Import the Portal object. import geni.portal as portal # Import the ProtoGENI library. import geni.rspec.pg as rspec # Import the emulab extensions library. import geni.rspec.emulab # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Add a node to the request. node = request.RawPC("node") # We need a link to talk to the remote file system, so make an interface. iface = node.addInterface() # The remote file system is represented by special node. fsnode = request.RemoteBlockstore("fsnode", "/mydata") # This URN is displayed in the web interfaace for your dataset. 
fsnode.dataset = "urn:publicid:IDN+emulab.net:portalprofiles+ltdataset+DemoDataset" # # The "rwclone" attribute allows you to map a writable copy of the # indicated SAN-based dataset. In this way, multiple nodes can map # the same dataset simultaneously. In many situations, this is more # useful than a "readonly" mapping. For example, a dataset # containing a Linux source tree could be mapped into multiple # nodes, each of which could do its own independent, # non-conflicting configure and build in their respective copies. # Currently, rwclones are "ephemeral" in that any changes made are # lost when the experiment mapping the clone is terminated. # #fsnode.rwclone = True # # The "readonly" attribute, like the rwclone attribute, allows you to # map a dataset onto multiple nodes simultaneously. But with readonly, # those mappings will only allow read access (duh!) and any filesystem # (/mydata in this example) will thus be mounted read-only. Currently, # readonly mappings are implemented as clones that are exported # allowing just read access, so there are minimal efficiency reasons to # use a readonly mapping rather than a clone. The main reason to use a # readonly mapping is to avoid a situation in which you forget that # changes to a clone dataset are ephemeral, and then lose some # important changes when you terminate the experiment. # #fsnode.readonly = True # Now we add the link between the node and the special node fslink = request.Link("fslink") fslink.addInterface(iface) fslink.addInterface(fsnode.interface) # Special attributes for this link that we must use. fslink.best_effort = True fslink.vlan_tagging = True # Print the RSpec to the enclosing page. portal.context.printRequestRSpec()

In this example, we demonstrate how to create and use a dataset. A dataset is simply a snapshot of a temporary file system (see the previous example) that has been saved to permanent storage, and reloaded on a node (or nodes) in a different experiment. This type of dataset must be explicitly saved (more on this below) in order to make changes permanent (and available later). In the example code above, the temporary file system will be loaded with the dataset specified by the URN.

But before you can use a dataset, you first have to create one using the following steps:

  1. Create an experiment
    Create an experiment using the local diskspace example above.

  2. Add your data
    Populate the file system mounted at /mydata with the data you wish to use in other experiments.

  3. Fill out the Create Dataset form
    Click on the "Create Dataset" option in the Actions menu. This will bring up the form to create a new dataset. Choose a name for your dataset and optionally the project the dataset should be associated with. Be sure to select Image Backed for the type. Then choose the the experiment, which node in the experiment, and which blockstore on the node.

  4. Click “Create”
    When you click the “Create” button, the file system on your node will be unmounted so that we can take a consistent snapshot of the contents. This process can take several minutes or longer, depending on the size of the file system. You can watch the progress on this page. When the progress bar reaches the “Ready” stage, your new dataset is ready! It will now show up in your “List Datasets” list, and can be used in new experiments.

  5. Use your dataset
    To use your new dataset, you will need to reference it in your geni-lib script (see the example code above). The name of your dataset is a URN, and can be found on the information page for the dataset. From the Actions menu, click on "List Datasets", find the name of your dataset in the list, and click on it.

  6. Update your dataset
    If you need to make changes to your dataset, simply start an experiment that uses your dataset. Make the changes you need to the file system mounted at /mydata, and then use the "Modify" button as shown in the previous step.

8.11 Debugging geni-lib profile scripts

It is not necessary to instantiate the profile via the portal web interface to test it. Properly written profile scripts should work perfectly well independent of the normal portal—the same geni-lib objects will behave sensibly when invoked from the command line. As long as geni-lib is installed, then invoking the Python interpreter on the profile script should simply write the corresponding RSpec to standard output. (Parameters, if any, will assume their default values.) For instance, if the script in the previous example is saved as geni-lib-parameters.py, then the command:

python geni-lib-parameters.py

will produce an RSpec containing a single node (the default value of n). It is also possible to override the defaults on the command line by giving the parameter name as an option, followed by the desired value:

python geni-lib-parameters.py --n 4

The --help option will list the available parameters and their descriptions.

9 Virtual Machines and Containers

A CloudLab virtual node is a virtual machine or container running on top of a regular operating system. CloudLab virtual nodes are based on the Xen hypervisor or on Docker containers. Both types of virtualization allow groups of processes to be isolated from each other while running on the same physical machine. CloudLab virtual nodes provide isolation of the filesystem, process, network, and account namespaces. Thus, each virtual node has its own private filesystem, process hierarchy, network interfaces and IP addresses, and set of users and groups. This level of virtualization allows unmodified applications to run as though they were on a real machine. Virtual network interfaces support an arbitrary number of virtual network links. These links may be individually shaped according to user-specified link parameters, and may be multiplexed over physical links or used to connect to virtual nodes within a single physical node.

There are a few specific differences between virtual and physical nodes. First, CloudLab physical nodes have a routable, public IPv4 address allowing direct remote access (unless the CloudLab installation has been configured to use unroutable control network IP addresses, which is very rare). However, virtual nodes are assigned control network IP addresses on a private network (typically the 172.16/12 subnet) and are remotely accessible over ssh via DNAT (destination network-address translation) to the physical host’s public control network IP address, to a high-numbered port. Depending on local configuration, it may be possible to request routable IP addresses for specific virtual nodes to enable direct remote access. Note that virtual nodes are always able to access the public Internet via SNAT (source network-address translation; nearly identical to masquerading).

Second, virtual nodes and their virtual network interfaces are connected by virtual links built atop physical links and physical interfaces. The virtualization of a physical device/link decreases the fidelity of the network emulation. Moreover, several virtual links may share the same physical links via multiplexing. Individual links are isolated at layer 2, but they are not isolated in terms of performance. If you request a specific bandwidth for a given set of links, our resource mapper will ensure that if multiple virtual links are mapped to a single physical link, the sum of the bandwidths of the virtual links will not exceed the capacity of the physical link (unless you also specify that this constraint can be ignored by setting the best_effort link parameter to True). For example, no more than ten 1Gbps virtual links can be mapped to a 10Gbps physical link.
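
As a hedged sketch of how these two knobs appear in a profile (the shaping values are illustrative, and the LAN bandwidth attribute is assumed to behave as in standard geni-lib), a LAN's bandwidth can be set in kbps and the mapping constraint relaxed with best_effort, which the dataset example earlier also sets on its link:

import geni.portal as portal
import geni.rspec.pg as rspec

request = portal.context.makeRequestRSpec()
node1 = request.XenVM("node1")
node2 = request.XenVM("node2")

lan = request.LAN("lan")
lan.addInterface(node1.addInterface("if1"))
lan.addInterface(node2.addInterface("if2"))

lan.bandwidth = 1000000   # illustrative: 1 Gbps, expressed in kbps
lan.best_effort = True    # allow virtual links to share physical link capacity

portal.context.printRequestRSpec()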

Finally, when you allocate virtual nodes, you can specify the amount of CPU and RAM (and, for Xen VMs, virtual disk space) each node will be allocated. CloudLab’s resource assigner will not oversubscribe these quantities.

9.1 Xen VMs

The examples in Section 8 show the basics of allocating Xen VMs: a single Xen VM node, two Xen VM nodes with a link between them, and a Xen VM with a custom size. In the sections below, we discuss advanced Xen VM allocation features.

9.1.1 Controlling CPU and Memory

You can control the number of cores and the amount of memory allocated to each VM by setting the cores and ram instance variables of a XenVM object, as shown in the following example:

"""An example of constructing a profile with a single Xen VM. Instructions: Wait for the profile instance to start, and then log in to the VM via the ssh port specified below. (Note that in this case, you will need to access the VM through a high port on the physical host, since we have not requested a public IP address for the VM itself.) """ import geni.portal as portal import geni.rspec.pg as rspec # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Create a XenVM node = request.XenVM("node") # Request a specific number of VCPUs. node.cores = 4 # Request a specific amount of memory (in GB). node.ram = 4096 # Print the RSpec to the enclosing page. portal.context.printRequestRSpec() """An example of constructing a profile with a single Xen VM. Instructions: Wait for the profile instance to start, and then log in to the VM via the ssh port specified below. (Note that in this case, you will need to access the VM through a high port on the physical host, since we have not requested a public IP address for the VM itself.) """ import geni.portal as portal import geni.rspec.pg as rspec # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Create a XenVM node = request.XenVM("node") # Request a specific number of VCPUs. node.cores = 4 # Request a specific amount of memory (in GB). node.ram = 4096 # Print the RSpec to the enclosing page. portal.context.printRequestRSpec()

9.1.2 Controlling Disk Space

Each Xen VM is given enough disk space to hold the requested image. Most CloudLab images are built with a 16 GB root partition, typically with about 25% of the disk space used by the operating system. If the remaining space is not enough for your needs, you can request additional disk space by setting a XEN_EXTRAFS node attribute, as shown in the following example.

"""An example of constructing a profile with a single Xen VM with extra fs space. Instructions: Wait for the profile instance to start, and then log in to the VM via the ssh port specified below. (Note that in this case, you will need to access the VM through a high port on the physical host, since we have not requested a public IP address for the VM itself.) """ import geni.portal as portal import geni.rspec.pg as rspec # Import Emulab-specific extensions so we can set node attributes. import geni.rspec.emulab as emulab # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Create a XenVM node = request.XenVM("node") # Set the XEN_EXTRAFS to request 8GB of extra space in the 4th partition. node.Attribute('XEN_EXTRAFS','8') # Print the RSpec to the enclosing page. portal.context.printRequestRSpec() """An example of constructing a profile with a single Xen VM with extra fs space. Instructions: Wait for the profile instance to start, and then log in to the VM via the ssh port specified below. (Note that in this case, you will need to access the VM through a high port on the physical host, since we have not requested a public IP address for the VM itself.) """ import geni.portal as portal import geni.rspec.pg as rspec # Import Emulab-specific extensions so we can set node attributes. import geni.rspec.emulab as emulab # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Create a XenVM node = request.XenVM("node") # Set the XEN_EXTRAFS to request 8GB of extra space in the 4th partition. node.Attribute('XEN_EXTRAFS','8') # Print the RSpec to the enclosing page. portal.context.printRequestRSpec()

This attribute is specified in GB. As with CloudLab physical nodes, the extra disk space will appear in the fourth partition of your VM’s disk. You can turn this extra space into a usable file system by logging into your VM and doing:

mynode> sudo mkdir /dirname
mynode> sudo /usr/local/etc/emulab/mkextrafs.pl /dirname

where /dirname is the directory at which you want your newly formatted file system to be mounted.

"""An example of constructing a profile with a single Xen VM. Instructions: Wait for the profile instance to start, and then log in to the VM via the ssh port specified below. (Note that in this case, you will need to access the VM through a high port on the physical host, since we have not requested a public IP address for the VM itself.) """ import geni.portal as portal import geni.rspec.pg as rspec # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Create a XenVM node = request.XenVM("node") # Request a specific number of VCPUs. node.cores = 4 # Request a specific amount of memory (in GB). node.ram = 4096 # Print the RSpec to the enclosing page. portal.context.printRequestRSpec() """An example of constructing a profile with a single Xen VM. Instructions: Wait for the profile instance to start, and then log in to the VM via the ssh port specified below. (Note that in this case, you will need to access the VM through a high port on the physical host, since we have not requested a public IP address for the VM itself.) """ import geni.portal as portal import geni.rspec.pg as rspec # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Create a XenVM node = request.XenVM("node") # Request a specific number of VCPUs. node.cores = 4 # Request a specific amount of memory (in GB). node.ram = 4096 # Print the RSpec to the enclosing page. portal.context.printRequestRSpec()

9.1.3 Setting HVM Mode

By default, all Xen VMs are paravirtualized. If you need hardware virtualization instead, you must set a XEN_FORCE_HVM node attribute, as shown in this example:

"""An example of constructing a profile with a single Xen VM in HVM mode. Instructions: Wait for the profile instance to start, and then log in to the VM via the ssh port specified below. (Note that in this case, you will need to access the VM through a high port on the physical host, since we have not requested a public IP address for the VM itself.) """ import geni.portal as portal import geni.rspec.pg as rspec # Import Emulab-specific extensions so we can set node attributes. import geni.rspec.emulab as emulab # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Create a XenVM node = request.XenVM("node") # Set the XEN_FORCE_HVM custom node attribute to 1 to enable HVM mode: node.Attribute('XEN_FORCE_HVM','1') # Print the RSpec to the enclosing page. portal.context.printRequestRSpec() """An example of constructing a profile with a single Xen VM in HVM mode. Instructions: Wait for the profile instance to start, and then log in to the VM via the ssh port specified below. (Note that in this case, you will need to access the VM through a high port on the physical host, since we have not requested a public IP address for the VM itself.) """ import geni.portal as portal import geni.rspec.pg as rspec # Import Emulab-specific extensions so we can set node attributes. import geni.rspec.emulab as emulab # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Create a XenVM node = request.XenVM("node") # Set the XEN_FORCE_HVM custom node attribute to 1 to enable HVM mode: node.Attribute('XEN_FORCE_HVM','1') # Print the RSpec to the enclosing page. portal.context.printRequestRSpec()

You can set this attribute only for dedicated-mode VMs. Shared VMs are available only in paravirtualized mode.

9.2 Docker Containers

CloudLab supports experiments that use Docker containers as virtual nodes. In this section, we first describe how to build simple profiles that create Docker containers, and then demonstrate more advanced features. The CloudLab-Docker container integration has been designed to enable easy image onboarding, and to allow users to continue to work naturally with the standard Docker API or CLI. However, because CloudLab is itself an orchestration engine, it does not support any of the Docker orchestration tools or platforms, such as Docker Swarm.

You can request a CloudLab Docker container in a geni-lib script like this:

import geni.portal as portal
import geni.rspec.pg as rspec
request = portal.context.makeRequestRSpec()
node = request.DockerContainer("node")

You can use the returned node object (a DockerContainer instance) similarly to other kinds of node objects, like RawPC or XenVM. However, Docker nodes have several custom member variables you can set to control their behavior and Docker-specific features. We demonstrate the usage of these member variables in the following subsections and summarize them at the end of this section.

9.2.1 Basic Examples

"""An example of constructing a profile with a single Docker container. Instructions: Wait for the profile instance to start, and then log in to the container via the ssh port specified below. By default, your container will run a standard Ubuntu image with the Emulab software preinstalled. """ import geni.portal as portal import geni.rspec.pg as rspec # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Create a Docker container. node = request.DockerContainer("node") # Request a container hosted on a shared container host; you will not # have access to the underlying physical host, and your container will # not be privileged. Note that if there are no shared hosts available, # your experiment will be assigned a physical machine to host your container. node.exclusive = True # Print the RSpec to the enclosing page. portal.context.printRequestRSpec() """An example of constructing a profile with a single Docker container. Instructions: Wait for the profile instance to start, and then log in to the container via the ssh port specified below. By default, your container will run a standard Ubuntu image with the Emulab software preinstalled. """ import geni.portal as portal import geni.rspec.pg as rspec # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Create a Docker container. node = request.DockerContainer("node") # Request a container hosted on a shared container host; you will not # have access to the underlying physical host, and your container will # not be privileged. Note that if there are no shared hosts available, # your experiment will be assigned a physical machine to host your container. node.exclusive = True # Print the RSpec to the enclosing page. portal.context.printRequestRSpec()

It is easy to extend this profile slightly to allocate ten containers in a LAN and to switch them to dedicated mode. Note that in this case the exclusive member variable is not specified, so it simply takes its default value:

"""An example of constructing a profile with ten Docker containers in a LAN. Instructions: Wait for the profile instance to start, and then log in to the container via the ssh port specified below. By default, your container will run a standard Ubuntu image with the Emulab software preinstalled. """ import geni.portal as portal import geni.rspec.pg as rspec # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Create a LAN to put containers into. lan = request.LAN("lan") # Create ten Docker containers. for i in range(0,10): node = request.DockerContainer("node-%d" % (i)) # Create an interface. iface = node.addInterface("if1") # Add the interface to the LAN. lan.addInterface(iface) # Print the RSpec to the enclosing page. portal.context.printRequestRSpec() """An example of constructing a profile with ten Docker containers in a LAN. Instructions: Wait for the profile instance to start, and then log in to the container via the ssh port specified below. By default, your container will run a standard Ubuntu image with the Emulab software preinstalled. """ import geni.portal as portal import geni.rspec.pg as rspec # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Create a LAN to put containers into. lan = request.LAN("lan") # Create ten Docker containers. for i in range(0,10): node = request.DockerContainer("node-%d" % (i)) # Create an interface. iface = node.addInterface("if1") # Add the interface to the LAN. lan.addInterface(iface) # Print the RSpec to the enclosing page. portal.context.printRequestRSpec()

Here is a more complex profile that creates 20 containers, binds 10 of them to a physical host machine of a particular type, and binds the other 10 to a second machine of the same type:

"""An example of constructing a profile with 20 Docker containers in a LAN, divided across two container hosts. Instructions: Wait for the profile instance to start, and then log in to the container via the ssh port specified below. By default, your container will run a standard Ubuntu image with the Emulab software preinstalled. """ import geni.portal as portal import geni.rspec.pg as rspec # Import the Emulab specific extensions. import geni.rspec.emulab as emulab # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Create a LAN to put containers into. lan = request.LAN("lan") # Create two container hosts, each with ten Docker containers. for j in range(0,2): # Create a container host. host = request.RawPC("host-%d" % (j)) # Select a specific hardware type for the container host. host.hardware_type = "d430" for i in range(0,10): # Create a container. node = request.DockerContainer("node-%d-%d" % (j,i)) # Create an interface. iface = node.addInterface("if1") # Add the interface to the LAN. lan.addInterface(iface) # Set this container to be instantiated on the host created in # the outer loop. node.InstantiateOn(host.client_id) # Print the RSpec to the enclosing page. portal.context.printRequestRSpec() """An example of constructing a profile with 20 Docker containers in a LAN, divided across two container hosts. Instructions: Wait for the profile instance to start, and then log in to the container via the ssh port specified below. By default, your container will run a standard Ubuntu image with the Emulab software preinstalled. """ import geni.portal as portal import geni.rspec.pg as rspec # Import the Emulab specific extensions. import geni.rspec.emulab as emulab # Create a Request object to start building the RSpec. request = portal.context.makeRequestRSpec() # Create a LAN to put containers into. lan = request.LAN("lan") # Create two container hosts, each with ten Docker containers. for j in range(0,2): # Create a container host. host = request.RawPC("host-%d" % (j)) # Select a specific hardware type for the container host. host.hardware_type = "d430" for i in range(0,10): # Create a container. node = request.DockerContainer("node-%d-%d" % (j,i)) # Create an interface. iface = node.addInterface("if1") # Add the interface to the LAN. lan.addInterface(iface) # Set this container to be instantiated on the host created in # the outer loop. node.InstantiateOn(host.client_id) # Print the RSpec to the enclosing page. portal.context.printRequestRSpec()

9.2.2 Disk Images

Docker containers use a different disk image format than CloudLab physical machines or Xen virtual machines, which means that you cannot use the same images on both a container and a raw PC. However, CloudLab supports native Docker images in several modes and workflows. CloudLab hosts a private Docker registry, and the standard CloudLab image-deployment and -capture mechanisms support capturing container disk images into it. CloudLab also supports the use of externally hosted, unmodified Docker images and Dockerfiles for image onboarding and dynamic image creation. Finally, since some CloudLab features require in-container support (e.g., user accounts, SSH pubkeys, syslog, scripted program execution), we also provide an optional automated process, called augmentation, through which an external image can be customized with the CloudLab software and dependencies.

CloudLab supports both augmented and unmodified Docker images, but some features require augmentation (e.g., that the CloudLab client-side software is installed and running in the container). Unmodified images support these CloudLab features: network links, link shaping, remote access, remote storage (e.g. remote block stores), and image capture. Unmodified images do not support user accounts, SSH pubkeys, or scripted program execution.

CloudLab’s disk image naming and versioning scheme is slightly different from Docker’s content-addressable model. A CloudLab disk image is identified by a project and name tuple (typically encoded as a URN), or by a UUID, as well as a version number that starts at 0. Each time you capture a new version of an image, the image’s version number is incremented by one. CloudLab does not support the use of arbitrary alphanumeric tags to identify image versions, as Docker does.

Thus, when you capture a CloudLab disk image of a CloudLab Docker container and give it a name, the local CloudLab registry will contain an image (repository) of that name, in the project (and group—nearly always the project name) your experiment was created within—and thus the full image name is <project-name>/<group-name>/<image-name>. The tags within that repository correspond to the integer version numbers of the CloudLab disk image. For example, if you have created a CloudLab image named docker-my-research in project myproject, and you have created three versions (0, 1, 2) and want to pull the latest version (2) to your computer, you could run this command:

docker pull ops.emulab.net:5080/myproject/myproject/docker-my-research:2

You will be prompted for username and password; use your CloudLab credentials.

The following code fragment creates a Docker container that uses a standard CloudLab Docker disk image, docker-ubuntu16-std. This image is based on the ubuntu:16.04 Docker image, with the Emulab client software installed (meaning it is augmented) along with dependencies and other utilities.

"""An example of a Docker container running a standard, augmented system image.""" import geni.portal as portal import geni.rspec.pg as rspec request = portal.context.makeRequestRSpec() node = request.DockerContainer("node") node.disk_image = "urn:publicid:IDN+emulab.net+image+emulab-ops//docker-ubuntu16-std" portal.context.printRequestRSpec() """An example of a Docker container running a standard, augmented system image.""" import geni.portal as portal import geni.rspec.pg as rspec request = portal.context.makeRequestRSpec() node = request.DockerContainer("node") node.disk_image = "urn:publicid:IDN+emulab.net+image+emulab-ops//docker-ubuntu16-std" portal.context.printRequestRSpec()

9.2.3 External Images

CloudLab supports the use of publicly accessible Docker images in other registries. It does not currently support username/password access to images. By default, if you simply specify a repository and tag, as in the example below, CloudLab assumes the image is in the standard Docker registry; but you can instead specify a complete URL pointing to a different registry.

"""An example of a Docker container running an external, unmodified image.""" import geni.portal as portal import geni.rspec.pg as rspec request = portal.context.makeRequestRSpec() node = request.DockerContainer("node") node.docker_extimage = "ubuntu:16.04" portal.context.printRequestRSpec() """An example of a Docker container running an external, unmodified image.""" import geni.portal as portal import geni.rspec.pg as rspec request = portal.context.makeRequestRSpec() node = request.DockerContainer("node") node.docker_extimage = "ubuntu:16.04" portal.context.printRequestRSpec()

By default, CloudLab assumes that an external, non-augmented image does not run its own sshd to support remote login. Instead, it facilitates remote access to a container by running an alternate sshd on the container host and executing a shell (by default /bin/sh) in the container when a connection arrives on the port associated with that container (the port in the ssh URL shown on your experiment’s page). See the section on remote access below for more detail.

9.2.4 Dockerfiles

You can also create images dynamically (at experiment runtime) by specifying a Dockerfile for each container. Note that if multiple containers hosted on the same physical machine reference the same Dockerfile, the dynamically built image will only be built once. Here is a simple example of a Dockerfile that builds httpd from source:

"""An example of a Docker container running an external, unmodified image.""" import geni.portal as portal import geni.rspec.pg as rspec request = portal.context.makeRequestRSpec() node = request.DockerContainer("node") node.docker_dockerfile = "https://github.com/docker-library/httpd/raw/38842a5d4cdd44ff4888e8540c0da99009790d01/2.4/Dockerfile" portal.context.printRequestRSpec() """An example of a Docker container running an external, unmodified image.""" import geni.portal as portal import geni.rspec.pg as rspec request = portal.context.makeRequestRSpec() node = request.DockerContainer("node") node.docker_dockerfile = "https://github.com/docker-library/httpd/raw/38842a5d4cdd44ff4888e8540c0da99009790d01/2.4/Dockerfile" portal.context.printRequestRSpec()

You should not assume you have access to the image build environment (you only do if your container(s) are running in dedicated mode)—you should test your Dockerfile on your local machine first to ensure it works. Moreover, your Dockerfile will have an empty context directory, so any file resources it requires must be downloaded during the image build, from instructions in the Dockerfile.

9.2.5 Augmented Disk Images

The primary use case that Docker supports involves a large collection of containers, potentially spanning many machines, in which each container runs a single application (one or more processes launched by a single parent) that provides a specific service. In this model, container images may be tailored to support only their single service, unencumbered by extra software. Many Docker images do not include an init daemon to launch and monitor processes, or even a basic set of user-space tools and libraries.

CloudLab-based experimentation requires more than the ability to run a single service per node. Within an experiment, it is beneficial for each node to run basic OS services (e.g., syslogd and sshd) to support activities such as debugging, interactive exploration, logging, and more. It is beneficial for each node to run a suite of CloudLab-specific services to configure the node, conduct in-node monitoring, and automate experiment deployment and control (e.g., launch programs or dynamically emulate link failures). This environment requires a full-featured init daemon and a common, basic set of user-space software and libraries.

To enable experiments that use existing Docker images and seamlessly support the CloudLab environment, CloudLab supports automatic augmentation of those images. When requested, CloudLab pulls an existing image, builds both runit (a simple, minimal init daemon) and the CloudLab client toolchain against it in temporary containers, and creates a new image with the just-built runit and CloudLab binaries, scripts, and dependencies. Augmented images are necessary to fully support some CloudLab features (specifically, the creation of user accounts, installation of SSH pubkeys, and execution of startup scripts). Other CloudLab features (such as experiment network links and block storage) are configured outside the container at startup and do not require augmentation.

During augmentation, CloudLab modifies Docker images to run an init system rather than a single application. CloudLab builds and packages runit in temporary containers that include a build toolchain, installs it into the final augmented image, and sets it as that image’s ENTRYPOINT. CloudLab creates runit services to emulate existing ENTRYPOINT and/or CMD instructions from the original image. The CloudLab client-side software is also built and installed.
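For example, a minimal geni-lib sketch that asks for full augmentation of an external image might look like the following (the image name is only illustrative, and the exact behavior of docker_tbaugmentation_update is described in the member-variable table later in this chapter):

"""A sketch: request full augmentation of an external Docker image."""
import geni.portal as portal

request = portal.context.makeRequestRSpec()
node = request.DockerContainer("node")
# "ubuntu:16.04" is just an illustrative external image.
node.docker_extimage = "ubuntu:16.04"
# Ask CloudLab to augment the image with runit and the client toolchain.
node.docker_tbaugmentation = "full"
# Refresh the augmentation if the image was already augmented (assumption:
# boolean value, per the member-variable table below).
node.docker_tbaugmentation_update = True
portal.context.printRequestRSpec()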

9.2.6 Remote Access

CloudLab provides remote access via ssh to each node in an experiment. If your container runs an sshd inside, then ssh access is straightforward: CloudLab allocates a high-numbered port on the container host machine and redirects (DNATs) that port to the private, unrouted control network IP address assigned to the container. (CloudLab containers typically receive private addresses that are only routed on the CloudLab control network, to increase experiment scalability.) We refer to this style of ssh remote login as direct.

However, most unmodified, unaugmented Docker images do not run sshd. To trivially allow remote access to containers running such images, CloudLab runs a proxy sshd in the host context on the high-numbered ports assigned to each container, and turns successfully authenticated incoming connections into invocations of docker exec <container> /bin/sh. We refer to this style of ssh remote login as exec.

You can force the ssh style you prefer on a per-container basis—and if using the exec style, you can also choose a specific shell to execute inside the container. For instance:

"""An example of a Docker container running an external, unmodified image, and customizing its remote access.""" import geni.portal as portal import geni.rspec.pg as rspec request = portal.context.makeRequestRSpec() node = request.DockerContainer("node") node.docker_extimage = "ubuntu:16.04" node.docker_ssh_style = "exec" node.docker_exec_shell = "/bin/bash" portal.context.printRequestRSpec() """An example of a Docker container running an external, unmodified image, and customizing its remote access.""" import geni.portal as portal import geni.rspec.pg as rspec request = portal.context.makeRequestRSpec() node = request.DockerContainer("node") node.docker_extimage = "ubuntu:16.04" node.docker_ssh_style = "exec" node.docker_exec_shell = "/bin/bash" portal.context.printRequestRSpec()

9.2.7 Console

Although Docker containers do not have a serial console, as most CloudLab physical machines do, they can provide a pty (pseudoterminal) for interaction with the main process running in the container. CloudLab attaches to this pty via the Docker daemon and proxies it into the CloudLab console mechanism. Thus, if you click the "Console" link for a container on your experiment status page, you will see output from the container and be able to send it input. (Each time you close and reopen the console for a container, you will see the entire console log kept by the Docker daemon, so there may be a flood of initial output if your container has been running for a long time and producing output.)

9.2.8 ENTRYPOINT and CMD

If you run an augmented Docker image in a container, recall that the augmentation process overrides the ENTRYPOINT and/or CMD instruction(s) the image contained, if any. Changing the ENTRYPOINT means that the augmented image will not run the command that was specified for the original image. Moreover, runit must run as root, but the image creator may have specified a different USER for processes that execute in the container. To fix these problems, CloudLab emulates the original (or runtime-specified) ENTRYPOINT as a runit service and handles several related Dockerfile settings as well: CMD, WORKDIR, ENV, and USER. The emulation preserves the semantics of these settings (see Docker’s reference on the interaction between ENTRYPOINT and CMD), with the exception that the user-specified ENTRYPOINT or CMD is not executed as PID 1. Only the ENTRYPOINT and CMD processes run as USER; processes started from outside the container via docker exec run as root.

You can customize the ENTRYPOINT and CMD for each container you create by setting the docker_entrypoint and the docker_cmd member variables of a geni-lib DockerContainer object instance.
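For example, a minimal sketch that overrides both (assuming plain string values are accepted; the image and command shown are only illustrative):

"""A sketch: override ENTRYPOINT and CMD for a container."""
import geni.portal as portal

request = portal.context.makeRequestRSpec()
node = request.DockerContainer("node")
node.docker_extimage = "ubuntu:16.04"
# Run a simple long-lived command via the emulated ENTRYPOINT/CMD handling.
node.docker_entrypoint = "/bin/sh -c"
node.docker_cmd = "while true; do date; sleep 60; done"
portal.context.printRequestRSpec()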

The runit service that performs this emulation is named dockerentrypoint. If this service exits, it is not automatically restarted, unlike most runit services. You can start it again by running sv start dockerentrypoint within a container. You can also browse the runit documentation for more examples of interacting with runit.

This emulation process is complicated, so if you suspect problems, there are log files written in each container that may help. The stdout and stderr output from the ENTRYPOINT and CMD emulation is logged to /var/log/entrypoint.log, and additional debugging information is logged to /var/log/entrypoint-debug.log.

9.2.9 Shared Containers

In CloudLab, Docker containers can be created in dedicated or shared mode. In dedicated mode, containers run on physical nodes that are reserved to a particular experiment, and you have root-level access to the underlying physical machine. In shared mode, containers run on physical machines that host containers from potentially many experiments, and users do not have access to the underlying physical machine.

9.2.10 Privileged Containers

Docker allows containers to be privileged or unprivileged: a privileged container has administrative access to the underlying host. CloudLab allows you to spawn privileged containers, but only on dedicated container hosts. Moreover, you should only do this when absolutely necessary, because a hacked privileged container can effectively take over the physical host, and access and control other containers. To make a container privileged, set the docker_privileged member variable of a geni-lib DockerContainer object instance.
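For example, a minimal sketch that marks a container as privileged (how the container ends up on a dedicated container host is not shown here):

"""A sketch: request a privileged container (dedicated container hosts only)."""
import geni.portal as portal

request = portal.context.makeRequestRSpec()
node = request.DockerContainer("node")
# Privileged containers are only allowed on dedicated container hosts.
node.docker_privileged = True
portal.context.printRequestRSpec()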

9.2.11 Remote Blockstores

You can mount CloudLab remote blockstores in Docker containers, provided they were formatted with a Linux-mountable filesystem. Here is an example:

"""An example of a Docker container that mounts a remote blockstore.""" import geni.portal as portal import geni.rspec.pg as rspec import geni.rspec.igext as ig request = portal.context.makeRequestRSpec() node = request.DockerContainer("node") # Create an interface to connect to the link from the container to the # blockstore host. myintf = # Create the blockstore host. bsnode = ig.RemoteBlockstore("bsnode","/mnt/blockstore") # Map your remote blockstore to the blockstore host bsnode.dataset = \ "urn:publicid:IDN+emulab.net:emulab-ops+ltdataset+johnsond-bs-foo" bsnode.readonly = False # Connect the blockstore host to the container. bslink = pg.Link("bslink") bslink.addInterface(node.addInterface("ifbs0")) bslink.addInterface(bsnode.interface) portal.context.printRequestRSpec() """An example of a Docker container that mounts a remote blockstore.""" import geni.portal as portal import geni.rspec.pg as rspec import geni.rspec.igext as ig request = portal.context.makeRequestRSpec() node = request.DockerContainer("node") # Create an interface to connect to the link from the container to the # blockstore host. myintf = # Create the blockstore host. bsnode = ig.RemoteBlockstore("bsnode","/mnt/blockstore") # Map your remote blockstore to the blockstore host bsnode.dataset = \ "urn:publicid:IDN+emulab.net:emulab-ops+ltdataset+johnsond-bs-foo" bsnode.readonly = False # Connect the blockstore host to the container. bslink = pg.Link("bslink") bslink.addInterface(node.addInterface("ifbs0")) bslink.addInterface(bsnode.interface) portal.context.printRequestRSpec()

CloudLab does not support mapping raw devices (or, in the case of CloudLab remote blockstores, virtual iSCSI-backed devices) into containers.

9.2.12 Temporary Block Storage

You can mount CloudLab temporary blockstores in Docker containers. Here is an example that places an 8 GB filesystem in a container at /mnt/tmp:

"""An example of a Docker container that mounts a remote blockstore.""" import geni.portal as portal import geni.rspec.pg as rspec import geni.rspec.igext as ig request = portal.context.makeRequestRSpec() node = request.DockerContainer("node") bs = node.Blockstore("temp-bs","/mnt/tmp") bs.size = "8GB" bs.placement "any" portal.context.printRequestRSpec() """An example of a Docker container that mounts a remote blockstore.""" import geni.portal as portal import geni.rspec.pg as rspec import geni.rspec.igext as ig request = portal.context.makeRequestRSpec() node = request.DockerContainer("node") bs = node.Blockstore("temp-bs","/mnt/tmp") bs.size = "8GB" bs.placement "any" portal.context.printRequestRSpec()

9.2.13 DockerContainer Member Variables

The following table summarizes the DockerContainer class member variables. You can find more detail either in the sections above, or by browsing the source code documentation.

cores
    The number (integer) of virtual CPUs this container should receive; approximated using Docker’s CpuShares and CpuPeriod options.

ram
    The amount (integer) of memory this container should receive.

disk_image
    The disk image this node should run. See the section on Docker disk images.

docker_ptype
    The physical node type on which to instantiate the container. Types are cluster-specific; see the hardware chapter.

docker_extimage
    An external Docker image (repo:tag) to load on the container; see the section on external Docker images.

docker_dockerfile
    A URL that points to a Dockerfile from which an image for this node will be created; see the section on Dockerfiles.

docker_tbaugmentation
    The requested testbed augmentation level: must be one of full, buildenv, core, basic, or none. See the section on Docker image augmentation.

docker_tbaugmentation_update
    If the image has already been augmented, whether to update the augmentation (True) or not (False).

docker_ssh_style
    What happens when you ssh to your node; must be either direct or exec. See the section on remote access.

docker_exec_shell
    The shell to run if the value of docker_ssh_style is exec; ignored otherwise.

docker_entrypoint
    Change or set the Docker ENTRYPOINT option for this container; see the section on Docker ENTRYPOINT and CMD handling.

docker_cmd
    Change or set the Docker CMD option for this container; see the section on Docker ENTRYPOINT and CMD handling.

docker_env
    Add or override environment variables in a container. The value of this attribute should be either a newline-separated list of variable assignments, or one or more variable assignments on a single line. If the former, escaped newlines are not supported, unlike the Docker ENV instruction.

docker_privileged
    Set this member variable to True to make the container privileged. See the section on Docker privileged containers.
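For example, a minimal sketch that sets two environment variables via docker_env (the variable names, values, and image are only illustrative):

"""A sketch: set environment variables in a container via docker_env."""
import geni.portal as portal

request = portal.context.makeRequestRSpec()
node = request.DockerContainer("node")
node.docker_extimage = "ubuntu:16.04"
# Newline-separated variable assignments (escaped newlines are not supported).
node.docker_env = "APP_MODE=test\nAPP_WORKERS=4"
portal.context.printRequestRSpec()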

10 Advanced Topics

10.1 Disk Images

Most disk images in CloudLab are stored and distributed in the Frisbee disk image format. They are stored at block level, meaning that, in theory, any filesystem can be used. In practice, Frisbee’s filesystem-aware compression is used, which causes the image snapshotting and installation processes to parse the filesystem and skip free blocks; this provides large performance benefits and space savings. Frisbee has support for filesystems that are variants of the BSD UFS/FFS, Linux ext, and Windows NTFS formats. The disk images created by Frisbee are bit-for-bit identical with the original image, with the caveat that free blocks are skipped and may contain leftover data from previous users.

Disk images in CloudLab are typically created by starting with one of CloudLab’s supplied images, customizing the contents, and taking a snapshot of the resulting disk. The snapshotting process reboots the node, as it boots into an MFS to ensure a quiescent disk. If you wish to bring in an image from outside of CloudLab or create a new one from scratch, please contact us for help; if this is a common request, we may add features to make it easier.

CloudLab has a default disk image for each node type; after a node is freed by one experimenter, it is re-loaded with the default image before being released back into the free pool. As a result, profiles that use the default disk images typically instantiate faster than those that use custom images, as no disk loading occurs.

Frisbee loads disk images using a custom multicast protocol, so loading large numbers of nodes typically does not slow down the instantiation process much.

Images may be referred to in requests in three ways: by URN, by an unqualified name, and by URL. URNs refer to a specific image that may be hosted on any of the CloudLab-affiliated clusters. An unqualified name refers to the version of the image hosted on the cluster on which the experiment is instantiated. If you have large images that CloudLab cannot store due to space constraints, you may host them yourself on a webserver and put the URL for the image into the profile. CloudLab will fetch your image on demand, and cache it for some period of time for efficient distribution.

Images in CloudLab are versioned, and CloudLab records the provenance of images. That is, when you create an image by snapshotting a disk that was previously loaded with another image, CloudLab records which image was used as a base for the new image, and stores only the blocks that differ between the two. Image URLs and URNs can contain version numbers, in which case they refer to that specific version of the image, or they may omit the version number, in which case they refer to the latest version of the image at the time an experiment is instantiated.
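For example, a minimal geni-lib sketch that selects an image by URN might look like the following (the exact URN shown is only illustrative; use a URN from your own cluster or image):

"""A sketch: select a disk image by URN (the URN shown is illustrative)."""
import geni.portal as portal

request = portal.context.makeRequestRSpec()
node = request.RawPC("node")
# Without a version number, the latest version of the image at
# instantiation time is used; URNs may also carry an explicit version.
node.disk_image = "urn:publicid:IDN+emulab.net+image+emulab-ops//UBUNTU18-64-STD"
portal.context.printRequestRSpec()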

10.2 RSpecs

The resources (nodes, links, etc.) that define a profile are expressed in the RSpec format from the GENI project. In general, RSpec should be thought of as a sort of “assembly language”: something it’s best not to edit yourself, but to manipulate with other tools or to generate as a “compiled” target from a higher-level language.

10.3 Public IP Access

CloudLab treats publicly-routable IP addresses as an allocatable resource.

By default, all physical hosts are given a public IP address. This IP address is determined by the host, rather than the experiment. There are two DNS names that point to this public address: a static one that is the node’s permanent hostname (such as apt042.apt.emulab.net), and a dynamic one that is assigned based on the experiment; this one may look like <vname>.<eid>.<proj>.apt.emulab.net, where <vname> is the name assigned in the request RSpec, <eid> is the name assigned to the experiment, and <proj> is the project that the experiment belongs to. This name is predictable regardless of the physical nodes assigned.

By default, virtual machines are not given public IP addresses; basic remote access is provided through an ssh server running on a non-standard port, using the physical host’s IP address. This port can be discovered through the manifest of an instantiated experiment, or on the “list view” of the experiment page. If a public IP address is required for a virtual machine (for example, to host a webserver on it), a public address can be requested on a per-VM basis. If using the Jacks GUI, each VM has a checkbox to request a public address. If using python scripts and geni-lib, setting the routable_control_ip property of a node accomplishes the same effect. Different clusters will have different numbers of public addresses available for allocation in this manner.
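For example, a minimal geni-lib sketch that requests a public address for a Xen VM (assuming a boolean assignment to routable_control_ip) might look like:

"""A sketch: request a routable (public) control IP address for a Xen VM."""
import geni.portal as portal

request = portal.context.makeRequestRSpec()
node = request.XenVM("vm0")
# Ask for a public control-network address for this VM.
node.routable_control_ip = True
portal.context.printRequestRSpec()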

10.3.1 Dynamic Public IP Addresses

In some cases, users would like to create their own virtual machines, and would like to give them public IP addresses. We also allow profiles to request a pool of dynamic addresses; VMs brought up by the user can then run DHCP to be assigned one of these addresses.

Profiles using python scripts and geni-lib can request dynamic IP address pools by constructing an AddressPool object (defined in the geni.rspec.igext module), as in the following example:

# Request a pool of 3 dynamic IP addresses
pool = AddressPool( "poolname", 3 )
rspec.addResource( pool )

The addresses assigned to the pool are found in the experiment manifest.

10.4 Markdown

CloudLab supports Markdown in the major text fields in RSpecs. Markdown is a simple formatting syntax with a straightforward translation to basic HTML elements such as headers, lists, and pre-formatted text. You may find this useful in the description and instructions attached to your profile.

While editing a profile, you can preview the Markdown rendering of the Instructions or Description field by double-clicking within the text box.

screenshots/clab/markdown-preview.png

You will probably find the Markdown manual to be useful.

10.5 Introspection

CloudLab implements the GENI APIs, and in particular the geni-get command. geni-get is a generic means for nodes to query their own configuration and metadata, and is pre-installed on all facility-provided disk images. (If you are supplying your own disk image built from scratch, you can add the geni-get client from its repository.)

While geni-get supports many options, there are five commands most useful in the CloudLab context.

10.5.1 Client ID

Invoking geni-get client_id will print a single line to standard output showing the identifier specified in the profile corresponding to the node on which it is run. This is a particularly useful feature in execute services, where a script might want to vary its behaviour between different nodes.
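As a rough illustration (not part of the manual’s examples), a Python startup script could branch on the node’s client_id; the node name "head" below is only illustrative:

"""A sketch: vary a script's behaviour by node, using geni-get client_id."""
import subprocess

# Ask the node for the client_id assigned to it in the profile.
client_id = subprocess.check_output(["geni-get", "client_id"]).decode().strip()

if client_id == "head":
    print("running head-node setup")
else:
    print("running worker setup on", client_id)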

10.5.2 Control MAC

The command geni-get control_mac will print the MAC address of the control interface (as a string of 12 hexadecimal digits with no punctuation). In some circumstances this can be a useful means to determine which interface is attached to the control network, as OSes are not necessarily consistent in assigning identifiers to network interfaces.

10.5.3 Manifest

To retrieve the manifest RSpec for the instance, you can use the command geni-get manifest. It will print the manifest to standard output, including any annotations added during instantiation. For instance, this is an appropriate technique to use to query the allocation of a dynamic public IP address pool.
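As a rough illustration, a script could fetch the manifest and list the node client_ids it contains; this sketch assumes the manifest uses the GENI v3 RSpec namespace shown below:

"""A sketch: fetch the manifest with geni-get and list node client_ids."""
import subprocess
import xml.etree.ElementTree as ET

manifest = subprocess.check_output(["geni-get", "manifest"])
root = ET.fromstring(manifest)

# Assumption: GENI v3 RSpec namespace.
NS = "{http://www.geni.net/resources/rspec/3}"
for node in root.iter(NS + "node"):
    print(node.get("client_id"))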

10.5.4 Private key

As a convenience, CloudLab will automatically generate an RSA private key unique to each profile instance. geni-get key will retrieve the private half of the keypair, which makes it a useful command for profiles bootstrapping an authenticated channel. For instance:

#!/bin/sh
# Create the user SSH directory, just in case.
mkdir $HOME/.ssh && chmod 700 $HOME/.ssh
# Retrieve the server-generated RSA private key.
geni-get key > $HOME/.ssh/id_rsa
chmod 600 $HOME/.ssh/id_rsa
# Derive the corresponding public key portion.
ssh-keygen -y -f $HOME/.ssh/id_rsa > $HOME/.ssh/id_rsa.pub
# If you want to permit login authenticated by the auto-generated key,
# then append the public half to the authorized_keys file:
grep -q -f $HOME/.ssh/id_rsa.pub $HOME/.ssh/authorized_keys || cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

Please note that the private key will be accessible to any user who can invoke geni-get from within the profile instance. Therefore, it is NOT suitable as an authentication mechanism for granting privilege within a multi-user instance!

10.5.5 Profile parameters

When executing within the context of a profile instantiated with user-specified parameters, geni-get allows the retrieval of any of those parameters. The proper syntax is geni-get "param name", where name is the parameter name as specified in the geni-lib script defineParameter call. For example, geni-get "param n" would retrieve the number of nodes in an instance of the profile shown in the geni-lib parameter section.

10.6 User-controlled switches and layer-1 topologies

Some experiments require exclusive access to Ethernet switches and/or the ability for users to reconfigure those switches. One example of a good use case for this feature is to enable or tune QoS features that cannot be enabled on CloudLab’s shared infrastructure switches.

User-allocated switches are treated similarly to the way CloudLab treats servers: the switches appear as nodes in your topology, and you ’wire’ them to PCs and to each other using point-to-point layer-1 links. When one of these switches is allocated to an experiment, that experiment is the exclusive user, just as it is for a raw PC, and the user has ssh access with full administrative control. This means that users are free to enable and disable features, tweak parameters, reconfigure at will, etc. Users are given the switches in a ’clean’ state (we do little configuration on them), and can reload and reboot them just as they would a server.

The list of available switches is found in our hardware chapter, and the following example shows how to request a simple topology using geni-lib.

"""This profile allocates two bare metal nodes and connects them together via a Dell switch with layer1 links. Instructions: Click on any node in the topology and choose the `shell` menu item. When your shell window appears, use `ping` to test the link. You will be able to ping the other node through the switch fabric. We have installed a minimal configuration on your switches that enables the ports that are in use, and turns on spanning-tree (RSTP) in case you inadvertently created a loop with your topology. All unused ports are disabled. The ports are in Vlan 1, which effectively gives a single broadcast domain. If you want anything fancier, you will need to open up a shell window to your switches and configure them yourself. If your topology has more then a single switch, and you have links between your switches, we will enable those ports too, but we do not put them into switchport mode or bond them into a single channel, you will need to do that yourself. If you make any changes to the switch configuration, be sure to write those changes to memory. We will wipe the switches clean and restore a default configuration when your experiment ends.""" # Import the Portal object. import geni.portal as portal # Import the ProtoGENI library. import geni.rspec.pg as pg # Import the Emulab specific extensions. import geni.rspec.emulab as emulab # Create a portal context. pc = portal.Context() # Create a Request object to start building the RSpec. request = pc.makeRequestRSpec() # Do not run snmpit #request.skipVlans() # Add a raw PC to the request and give it an interface. node1 = request.RawPC("node1") iface1 = node1.addInterface() # Specify the IPv4 address iface1.addAddress(pg.IPv4Address("192.168.1.1", "255.255.255.0")) # Add Switch to the request and give it a couple of interfaces mysw = request.Switch("mysw"); mysw.hardware_type = "dell-s4048" swiface1 = mysw.addInterface() swiface2 = mysw.addInterface() # Add another raw PC to the request and give it an interface. node2 = request.RawPC("node2") iface2 = node2.addInterface() # Specify the IPv4 address iface2.addAddress(pg.IPv4Address("192.168.1.2", "255.255.255.0")) # Add L1 link from node1 to mysw link1 = request.L1Link("link1") link1.addInterface(iface1) link1.addInterface(swiface1) # Add L1 link from node2 to mysw link2 = request.L1Link("link2") link2.addInterface(iface2) link2.addInterface(swiface2) # Print the RSpec to the enclosing page. pc.printRequestRSpec(request) Open this profile on CloudLab"""This profile allocates two bare metal nodes and connects them together via a Dell switch with layer1 links. Instructions: Click on any node in the topology and choose the `shell` menu item. When your shell window appears, use `ping` to test the link. You will be able to ping the other node through the switch fabric. We have installed a minimal configuration on your switches that enables the ports that are in use, and turns on spanning-tree (RSTP) in case you inadvertently created a loop with your topology. All unused ports are disabled. The ports are in Vlan 1, which effectively gives a single broadcast domain. If you want anything fancier, you will need to open up a shell window to your switches and configure them yourself. If your topology has more then a single switch, and you have links between your switches, we will enable those ports too, but we do not put them into switchport mode or bond them into a single channel, you will need to do that yourself. 
If you make any changes to the switch configuration, be sure to write those changes to memory. We will wipe the switches clean and restore a default configuration when your experiment ends.""" # Import the Portal object. import geni.portal as portal # Import the ProtoGENI library. import geni.rspec.pg as pg # Import the Emulab specific extensions. import geni.rspec.emulab as emulab # Create a portal context. pc = portal.Context() # Create a Request object to start building the RSpec. request = pc.makeRequestRSpec() # Do not run snmpit #request.skipVlans() # Add a raw PC to the request and give it an interface. node1 = request.RawPC("node1") iface1 = node1.addInterface() # Specify the IPv4 address iface1.addAddress(pg.IPv4Address("192.168.1.1", "255.255.255.0")) # Add Switch to the request and give it a couple of interfaces mysw = request.Switch("mysw"); mysw.hardware_type = "dell-s4048" swiface1 = mysw.addInterface() swiface2 = mysw.addInterface() # Add another raw PC to the request and give it an interface. node2 = request.RawPC("node2") iface2 = node2.addInterface() # Specify the IPv4 address iface2.addAddress(pg.IPv4Address("192.168.1.2", "255.255.255.0")) # Add L1 link from node1 to mysw link1 = request.L1Link("link1") link1.addInterface(iface1) link1.addInterface(swiface1) # Add L1 link from node2 to mysw link2 = request.L1Link("link2") link2.addInterface(iface2) link2.addInterface(swiface2) # Print the RSpec to the enclosing page. pc.printRequestRSpec(request)

This feature is implemented using a set of layer-1 switches between some servers and Ethernet switches. These switches act as “patch panels,” allowing us to “wire” servers to switches with no intervening Ethernet packet processing and minimal impact on latency. This feature can also be used to “wire” servers directly to one another, and to create links between switches, as seen in the following two examples.

"""This profile allocates two bare metal nodes and connects them directly together via a layer1 link. Instructions: Click on any node in the topology and choose the `shell` menu item. When your shell window appears, use `ping` to test the link.""" # Import the Portal object. import geni.portal as portal # Import the ProtoGENI library. import geni.rspec.pg as pg # Import the Emulab specific extensions. import geni.rspec.emulab as emulab # Create a portal context. pc = portal.Context() # Create a Request object to start building the RSpec. request = pc.makeRequestRSpec() # Do not run snmpit #request.skipVlans() # Add a raw PC to the request and give it an interface. node1 = request.RawPC("node1") iface1 = node1.addInterface() # Specify the IPv4 address iface1.addAddress(pg.IPv4Address("192.168.1.1", "255.255.255.0")) # Add another raw PC to the request and give it an interface. node2 = request.RawPC("node2") iface2 = node2.addInterface() # Specify the IPv4 address iface2.addAddress(pg.IPv4Address("192.168.1.2", "255.255.255.0")) # Add L1 link from node1 to node2 link1 = request.L1Link("link1") link1.addInterface(iface1) link1.addInterface(iface2) # Print the RSpec to the enclosing page. pc.printRequestRSpec(request) Open this profile on CloudLab"""This profile allocates two bare metal nodes and connects them directly together via a layer1 link. Instructions: Click on any node in the topology and choose the `shell` menu item. When your shell window appears, use `ping` to test the link.""" # Import the Portal object. import geni.portal as portal # Import the ProtoGENI library. import geni.rspec.pg as pg # Import the Emulab specific extensions. import geni.rspec.emulab as emulab # Create a portal context. pc = portal.Context() # Create a Request object to start building the RSpec. request = pc.makeRequestRSpec() # Do not run snmpit #request.skipVlans() # Add a raw PC to the request and give it an interface. node1 = request.RawPC("node1") iface1 = node1.addInterface() # Specify the IPv4 address iface1.addAddress(pg.IPv4Address("192.168.1.1", "255.255.255.0")) # Add another raw PC to the request and give it an interface. node2 = request.RawPC("node2") iface2 = node2.addInterface() # Specify the IPv4 address iface2.addAddress(pg.IPv4Address("192.168.1.2", "255.255.255.0")) # Add L1 link from node1 to node2 link1 = request.L1Link("link1") link1.addInterface(iface1) link1.addInterface(iface2) # Print the RSpec to the enclosing page. pc.printRequestRSpec(request)

"""This profile allocates two bare metal nodes and connects them together via two Dell switches with layer1 links. Instructions: Click on any node in the topology and choose the `shell` menu item. When your shell window appears, use `ping` to test the link. You will be able to ping the other node through the switch fabric. We have installed a minimal configuration on your switches that enables the ports that are in use, and turns on spanning-tree (RSTP) in case you inadvertently created a loop with your topology. All unused ports are disabled. The ports are in Vlan 1, which effectively gives a single broadcast domain. If you want anything fancier, you will need to open up a shell window to your switches and configure them yourself. If your topology has more then a single switch, and you have links between your switches, we will enable those ports too, but we do not put them into switchport mode or bond them into a single channel, you will need to do that yourself. If you make any changes to the switch configuration, be sure to write those changes to memory. We will wipe the switches clean and restore a default configuration when your experiment ends.""" # Import the Portal object. import geni.portal as portal # Import the ProtoGENI library. import geni.rspec.pg as pg # Import the Emulab specific extensions. import geni.rspec.emulab as emulab # Create a portal context. pc = portal.Context() # Create a Request object to start building the RSpec. request = pc.makeRequestRSpec() # Do not run snmpit #request.skipVlans() # Add a raw PC to the request and give it an interface. node1 = request.RawPC("node1") iface1 = node1.addInterface() # Specify the IPv4 address iface1.addAddress(pg.IPv4Address("192.168.1.1", "255.255.255.0")) # Add first switch to the request and give it a couple of interfaces mysw1 = request.Switch("mysw1"); mysw1.hardware_type = "dell-s4048" sw1iface1 = mysw1.addInterface() sw1iface2 = mysw1.addInterface() # Add second switch to the request and give it a couple of interfaces mysw2 = request.Switch("mysw2"); mysw2.hardware_type = "dell-s4048" sw2iface1 = mysw2.addInterface() sw2iface2 = mysw2.addInterface() # Add another raw PC to the request and give it an interface. node2 = request.RawPC("node2") iface2 = node2.addInterface() # Specify the IPv4 address iface2.addAddress(pg.IPv4Address("192.168.1.2", "255.255.255.0")) # Add L1 link from node1 to mysw1 link1 = request.L1Link("link1") link1.addInterface(iface1) link1.addInterface(sw1iface1) # Add L1 link from mysw1 to mysw2 trunk = request.L1Link("trunk") trunk.addInterface(sw1iface2) trunk.addInterface(sw2iface2) # Add L1 link from node2 to mysw2 link2 = request.L1Link("link2") link2.addInterface(iface2) link2.addInterface(sw2iface1) # Print the RSpec to the enclosing page. pc.printRequestRSpec(request) Open this profile on CloudLab"""This profile allocates two bare metal nodes and connects them together via two Dell switches with layer1 links. Instructions: Click on any node in the topology and choose the `shell` menu item. When your shell window appears, use `ping` to test the link. You will be able to ping the other node through the switch fabric. We have installed a minimal configuration on your switches that enables the ports that are in use, and turns on spanning-tree (RSTP) in case you inadvertently created a loop with your topology. All unused ports are disabled. The ports are in Vlan 1, which effectively gives a single broadcast domain. 
If you want anything fancier, you will need to open up a shell window to your switches and configure them yourself. If your topology has more then a single switch, and you have links between your switches, we will enable those ports too, but we do not put them into switchport mode or bond them into a single channel, you will need to do that yourself. If you make any changes to the switch configuration, be sure to write those changes to memory. We will wipe the switches clean and restore a default configuration when your experiment ends.""" # Import the Portal object. import geni.portal as portal # Import the ProtoGENI library. import geni.rspec.pg as pg # Import the Emulab specific extensions. import geni.rspec.emulab as emulab # Create a portal context. pc = portal.Context() # Create a Request object to start building the RSpec. request = pc.makeRequestRSpec() # Do not run snmpit #request.skipVlans() # Add a raw PC to the request and give it an interface. node1 = request.RawPC("node1") iface1 = node1.addInterface() # Specify the IPv4 address iface1.addAddress(pg.IPv4Address("192.168.1.1", "255.255.255.0")) # Add first switch to the request and give it a couple of interfaces mysw1 = request.Switch("mysw1"); mysw1.hardware_type = "dell-s4048" sw1iface1 = mysw1.addInterface() sw1iface2 = mysw1.addInterface() # Add second switch to the request and give it a couple of interfaces mysw2 = request.Switch("mysw2"); mysw2.hardware_type = "dell-s4048" sw2iface1 = mysw2.addInterface() sw2iface2 = mysw2.addInterface() # Add another raw PC to the request and give it an interface. node2 = request.RawPC("node2") iface2 = node2.addInterface() # Specify the IPv4 address iface2.addAddress(pg.IPv4Address("192.168.1.2", "255.255.255.0")) # Add L1 link from node1 to mysw1 link1 = request.L1Link("link1") link1.addInterface(iface1) link1.addInterface(sw1iface1) # Add L1 link from mysw1 to mysw2 trunk = request.L1Link("trunk") trunk.addInterface(sw1iface2) trunk.addInterface(sw2iface2) # Add L1 link from node2 to mysw2 link2 = request.L1Link("link2") link2.addInterface(iface2) link2.addInterface(sw2iface1) # Print the RSpec to the enclosing page. pc.printRequestRSpec(request)

11 Hardware

CloudLab can allocate experiments on any one of several clusters: three that belong to CloudLab itself, plus several more that belong to federated projects.

Additional hardware expansions are planned, and descriptions of them can be found at https://www.cloudlab.us/hardware.php

11.1 CloudLab Utah

The CloudLab cluster at the University of Utah is being built in partnership with HP. It consists of 200 Intel Xeon E5 servers, 315 64-bit ARM servers, and 270 Intel Xeon-D servers, for a total of 6,680 cores. The cluster is housed in the University of Utah’s Downtown Data Center in Salt Lake City.

m400: 315 nodes (64-bit ARM)
    CPU:  Eight 64-bit ARMv8 (Atlas/A57) cores at 2.4 GHz (APM X-GENE)
    RAM:  64 GB ECC Memory (8x 8 GB DDR3-1600 SO-DIMMs)
    Disk: 120 GB of flash (SATA3 / M.2, Micron M500)
    NIC:  Dual-port Mellanox ConnectX-3 10 Gb NIC (PCIe v3.0, 8 lanes)

m510: 270 nodes (Intel Xeon-D)
    CPU:  Eight-core Intel Xeon D-1548 at 2.0 GHz
    RAM:  64 GB ECC Memory (4x 16 GB DDR4-2133 SO-DIMMs)
    Disk: 256 GB NVMe flash storage
    NIC:  Dual-port Mellanox ConnectX-3 10 Gb NIC (PCIe v3.0, 8 lanes)

There are 45 nodes in a chassis, and this cluster consists of thirteen chassis. Each chassis has two 45XGc switches; each node is connected to both switches, and each chassis switch has four 40Gbps uplinks, for a total of 320Gbps of uplink capacity from each chassis. One switch is used for control traffic, connecting to the Internet, etc. The other is used to build experiment topologies, and should be used for most experimental purposes.

All chassis are interconnected through a large HP FlexFabric 12910 switch which has full bisection bandwidth internally.

We have plans to enable some users to allocate entire chassis; when allocated in this mode, it will be possible to have complete administrator control over the switches in addition to the nodes.

In phase two we added 50 Apollo R2200 chassis each with four HPE ProLiant XL170r server modules. Each server has 10 cores for a total of 2000 cores.

xl170: 200 nodes (Intel Broadwell, 10 core, 1 disk)
    CPU:  Ten-core Intel E5-2640v4 at 2.4 GHz
    RAM:  64 GB ECC Memory (4x 16 GB DDR4-2400 DIMMs)
    Disk: Intel DC S3520 480 GB 6G SATA SSD
    NIC:  Two dual-port Mellanox ConnectX-4 25 Gb NICs (PCIe v3.0, 8 lanes)

Each server is connected via a 10Gbps control link (Dell switches) and a 25Gbps experimental link to Mellanox 2410 switches in groups of 40 servers. Each of the five groups’ experimental switches is connected to a Mellanox 2700 spine switch at 5x100Gbps. That switch in turn interconnects with the rest of the Utah CloudLab cluster via 6x40Gbps uplinks to the HP FlexFabric 12910 switch.

A unique feature of the phase two nodes is the addition of eight ONIE bootable "user allocatable" switches that can run a variety of Open Network OSes: six Dell S4048-ONs and two Mellanox MSN2410-BB2Fs. These switches and all 200 nodes are connected to two NetScout 3903 layer-1 switches, allowing flexible combinations of nodes and switches in an experiment.

11.2 CloudLab Wisconsin

The CloudLab cluster at the University of Wisconsin is built in partnership with Cisco, Seagate, and HP. The cluster, which is in Madison, Wisconsin, has 270 servers with a total of 5,000 cores connected in a CLOS topology with full bisection bandwidth. It has 1,070 TB of storage, including SSDs on every node.

More technical details can be found at https://www.cloudlab.us/hardware.php#wisconsin

c220g1: 90 nodes (Haswell, 16 core, 3 disks)
    CPU:  Two Intel E5-2630 v3 8-core CPUs at 2.40 GHz (Haswell w/ EM64T)
    RAM:  128 GB ECC Memory (8x 16 GB DDR4 1866 MHz dual rank RDIMMs)
    Disk: Two 1.2 TB 10K RPM 6G SAS SFF HDDs
    Disk: One Intel DC S3500 480 GB 6G SATA SSD
    NIC:  Dual-port Intel X520-DA2 10Gb NIC (PCIe v3.0, 8 lanes)
    NIC:  Onboard Intel i350 1Gb

c240g1: 10 nodes (Haswell, 16 core, 14 disks)
    CPU:  Two Intel E5-2630 v3 8-core CPUs at 2.40 GHz (Haswell w/ EM64T)
    RAM:  128 GB ECC Memory (8x 16 GB DDR4 1866 MHz dual rank RDIMMs)
    Disk: Two Intel DC S3500 480 GB 6G SATA SSDs
    Disk: Twelve 3 TB 3.5" HDDs donated by Seagate
    NIC:  Dual-port Intel X520-DA2 10Gb NIC (PCIe v3.0, 8 lanes)
    NIC:  Onboard Intel i350 1Gb

c220g2: 163 nodes (Haswell, 20 core, 3 disks)
    CPU:  Two Intel E5-2660 v3 10-core CPUs at 2.60 GHz (Haswell EP)
    RAM:  160 GB ECC Memory (10x 16 GB DDR4 2133 MHz dual rank RDIMMs)
    Disk: One Intel DC S3500 480 GB 6G SATA SSD
    Disk: Two 1.2 TB 10K RPM 6G SAS SFF HDDs
    NIC:  Dual-port Intel X520 10Gb NIC (PCIe v3.0, 8 lanes)
    NIC:  Onboard Intel i350 1Gb

c240g2: 4 nodes (Haswell, 20 core, 8 disks)
    CPU:  Two Intel E5-2660 v3 10-core CPUs at 2.60 GHz (Haswell EP)
    RAM:  160 GB ECC Memory (10x 16 GB DDR4 2133 MHz dual rank RDIMMs)
    Disk: Two Intel DC S3500 480 GB 6G SATA SSDs
    Disk: Two 1 TB HDDs
    Disk: Four 3 TB HDDs
    NIC:  Dual-port Intel X520 10Gb NIC (PCIe v3.0, 8 lanes)
    NIC:  Onboard Intel i350 1Gb

All nodes are connected to two networks:

The experiment network at Wisconsin is transitioning to using HP switches in order to provide OpenFlow 1.3 support.

Phase II added 260 new nodes, 36 with one or more GPUs:

c220g5: 224 nodes (Intel Skylake, 20 core, 2 disks)
    CPU:  Two Intel Xeon Silver 4114 10-core CPUs at 2.20 GHz
    RAM:  192 GB ECC DDR4-2666 Memory
    Disk: One 1 TB 7200 RPM 6G SAS HD
    Disk: One Intel DC S3500 480 GB 6G SATA SSD
    NIC:  Dual-port Intel X520-DA2 10Gb NIC (PCIe v3.0, 8 lanes)
    NIC:  Onboard Intel i350 1Gb

c240g5: 32 nodes (Intel Skylake, 20 core, 2 disks, GPU)
    CPU:  Two Intel Xeon Silver 4114 10-core CPUs at 2.20 GHz
    RAM:  192 GB ECC DDR4-2666 Memory
    Disk: One 1 TB 7200 RPM 6G SAS HD
    Disk: One Intel DC S3500 480 GB 6G SATA SSD
    GPU:  One NVIDIA 12 GB PCI P100 GPU
    NIC:  Dual-port Intel X520-DA2 10Gb NIC (PCIe v3.0, 8 lanes)
    NIC:  Onboard Intel i350 1Gb

c4130: 4 nodes (Intel Broadwell, 16 core, 2 disks, 4 GPUs)
    CPU:  Two Intel Xeon E5-2667 8-core CPUs at 3.20 GHz
    RAM:  128 GB ECC Memory
    Disk: Two 960 GB 6G SATA SSDs
    GPU:  Four NVIDIA 16 GB Tesla V100 SMX2 GPUs

11.3 CloudLab Clemson

The CloudLab cluster at Clemson University has been built in partnership with Dell. The cluster so far has 260 servers with a total of 6,736 cores, 1,272 TB of disk space, and 73 TB of RAM. All nodes have 10 Gb Ethernet and most have QDR Infiniband. It is located in Clemson, South Carolina.

More technical details can be found at https://www.cloudlab.us/hardware.php#clemson

c8220: 96 nodes (Ivy Bridge, 20 core)
    CPU:  Two Intel E5-2660 v2 10-core CPUs at 2.20 GHz (Ivy Bridge)
    RAM:  256 GB ECC Memory (16x 16 GB DDR4 1600 MT/s dual rank RDIMMs)
    Disk: Two 1 TB 7.2K RPM 3G SATA HDDs
    NIC:  Dual-port Intel 10GbE NIC (PCIe v3.0, 8 lanes)
    NIC:  Qlogic QLE 7340 40 Gb/s Infiniband HCA (PCIe v3.0, 8 lanes)

c8220x: 4 nodes (Ivy Bridge, 20 core, 20 disks)
    CPU:  Two Intel E5-2660 v2 10-core CPUs at 2.20 GHz (Ivy Bridge)
    RAM:  256 GB ECC Memory (16x 16 GB DDR4 1600 MT/s dual rank RDIMMs)
    Disk: Eight 1 TB 7.2K RPM 3G SATA HDDs
    Disk: Twelve 4 TB 7.2K RPM 3G SATA HDDs
    NIC:  Dual-port Intel 10GbE NIC (PCIe v3.0, 8 lanes)
    NIC:  Qlogic QLE 7340 40 Gb/s Infiniband HCA (PCIe v3.0, 8 lanes)

c6320: 84 nodes (Haswell, 28 core)
    CPU:  Two Intel E5-2683 v3 14-core CPUs at 2.00 GHz (Haswell)
    RAM:  256 GB ECC Memory
    Disk: Two 1 TB 7.2K RPM 3G SATA HDDs
    NIC:  Dual-port Intel 10GbE NIC (X520)
    NIC:  Qlogic QLE 7340 40 Gb/s Infiniband HCA (PCIe v3.0, 8 lanes)

c4130: 2 nodes (Haswell, 28 core, two GPUs)
    CPU:  Two Intel E5-2680 v3 12-core processors at 2.50 GHz (Haswell)
    RAM:  256 GB ECC Memory
    Disk: Two 1 TB 7.2K RPM 3G SATA HDDs
    GPU:  Two Tesla K40m GPUs
    NIC:  Dual-port Intel 1GbE NIC (i350)
    NIC:  Dual-port Intel 10GbE NIC (X710)
    NIC:  Qlogic QLE 7340 40 Gb/s Infiniband HCA (PCIe v3.0, 8 lanes)

There are also two storage-intensive (270 TB each!) nodes that should be used only if you need a huge amount of volatile storage. These nodes have only 10 Gb Ethernet.

dss7500: 2 nodes (Haswell, 12 core, 270 TB disk)
    CPU:  Two Intel E5-2620 v3 6-core CPUs at 2.40 GHz (Haswell)
    RAM:  128 GB ECC Memory
    Disk: Two 120 GB 6 Gbps SATA SSDs
    Disk: Forty-five 6 TB 7.2K RPM 6 Gbps SATA HDDs
    NIC:  Dual-port Intel 10GbE NIC (X520)

There are three networks at the Clemson site:

Phase two added 18 Dell C6420 chassis each with four dual-socket Skylake-based servers. Each of the 72 servers has 32 cores for a total of 2304 cores.

c6420: 72 nodes (Intel Skylake, 32 core, 2 disks)
    CPU:  Two sixteen-core Intel Xeon Gold 6142 CPUs at 2.6 GHz
    RAM:  384 GB ECC DDR4-2666 Memory
    Disk: Two Seagate 1 TB 7200 RPM 6G SATA HDs
    NIC:  Dual-port Intel X710 10GbE NIC

Each server is connected via a 1Gbps control link (Dell D3048 switches) and a 10Gbps experimental link (Dell S5048 switches).

These Phase II machines do not include Infiniband.

11.4 Apt Cluster

The main Apt cluster is housed in the University of Utah’s Downtown Data Center in Salt Lake City, Utah. It contains two classes of nodes:

r320: 128 nodes (Sandy Bridge, 8 cores)
    CPU:   1x Xeon E5-2450 processor (8 cores, 2.1 GHz)
    RAM:   16 GB Memory (4 x 2 GB RDIMMs, 1.6 GHz)
    Disks: 4 x 500 GB 7.2K SATA Drives (RAID5)
    NIC:   1 GbE dual-port embedded NIC (Broadcom)
    NIC:   1 x Mellanox MX354A dual-port FDR CX3 adapter w/ 1 x QSA adapter

c6220: 64 nodes (Ivy Bridge, 16 cores)
    CPU:   2 x Xeon E5-2650v2 processors (8 cores each, 2.6 GHz)
    RAM:   64 GB Memory (8 x 8 GB DDR-3 RDIMMs, 1.86 GHz)
    Disks: 2 x 1 TB SATA 3.5” 7.2K RPM hard drives
    NIC:   4 x 1 GbE embedded Ethernet ports (Broadcom)
    NIC:   1 x Intel X520 PCIe dual-port 10 Gb Ethernet NIC
    NIC:   1 x Mellanox FDR CX3 single-port mezzanine card

All nodes are connected to three networks with one interface each:

11.5 IG-DDC Cluster

This cluster is not owned by CloudLab, but is federated and available to CloudLab users.

This small cluster is an InstaGENI Rack housed in the University of Utah’s Downtown Data Center. It has nodes of only a single type:

dl360: 33 nodes (Sandy Bridge, 16 cores)
    CPU:  2x Xeon E5-2450 processors (8 cores each, 2.1 GHz)
    RAM:  48 GB Memory (6 x 8 GB RDIMMs, 1.6 GHz)
    Disk: 1 x 1 TB 7.2K SATA Drive
    NIC:  1 GbE 4-port embedded NIC

It has two network fabrics:

12 Planned Features

This chapter describes features that are planned for CloudLab or under development: please contact us if you have any feedback or suggestions!

12.1 Improved Physical Resource Descriptions

As part of the process of reserving resources on CloudLab, a type of RSpec called a manifest is created. The manifest gives a detailed description of the hardware allocated, including the specifics of network topology. Currently, CloudLab does not directly export this information to the user. In the interest of improving transparency and repeatable research, CloudLab will develop interfaces to expose, explore, and export this information.

13 CloudLab OpenStack Tutorial

This tutorial will walk you through the process of creating a small cloud on CloudLab using OpenStack. Your copy of OpenStack will run on bare-metal machines that are dedicated for your use for the duration of your experiment. You will have complete administrative access to these machines, meaning that you have full ability to customize and/or configure your installation of OpenStack.

13.1 Objectives

In the process of taking this tutorial, you will learn to:

13.2 Prerequisites

This tutorial assumes that:

14.4 Logging In

The first step is to log in to CloudLab. CloudLab is available to all researchers and educators who work in cloud computing. If you have an account at one of its federated facilities, like Emulab or GENI, then you already have an account at CloudLab.

14.4.1 Using a CloudLab Account

If you have signed up for an account at the CloudLab website, simply open https://www.cloudlab.us/ in your browser, click the “Log In” button, enter your username and password, and skip to the next part of the tutorial.

screenshots/clab/tutorial/front-page.png

14.4.2 Using a GENI Account

If you have an account at the GENI portal, you will use the single sign-on features of the CloudLab and GENI portals.

  1. Open the web interface
    To log in to CloudLab using a GENI account, start by visiting https://www.cloudlab.us/ in your browser and using the “Log In” button in the upper right.
    screenshots/clab/tutorial/front-page.png

  2. Click the "GENI User" button
    On the login page that appears, you will use the “GENI User” button. This will start the GENI authorization tool, which lets you connect your GENI account with CloudLab.
    screenshots/clab/tutorial/login-page.png

  3. Select the GENI portal
    You will be asked which facility your account is with. Select the GENI icon, which will take you to the GENI portal. There are several other facilities that are federated with CloudLab, which can be found in the drop-down list.
    screenshots/clab/tutorial/authorization-tool.png

  4. Log into the GENI portal
    You will be taken to the GENI portal, and will need to select the institution that is your identity provider. Usually, this is the university you are affiliated with. If your university is not in the list, you may need to log in using the “GENI Project Office” identity provider. If you have ever logged in to the GENI portal before, your identity provider should be pre-selected; if you are not familiar with this login page, there is a good chance that you don’t have a GENI account and need to apply for one.
    screenshots/clab/tutorial/geni-login.png

  5. Log into your institution
    This will bring you to your institution’s standard login page, which you likely use to access many on-campus services. (Yours will look different from this one.)
    screenshots/clab/tutorial/utah-login.png

  6. Authorize CloudLab to use your GENI account

    What’s happening in this step is that your browser uses your GENI user certificate (which it obtained from the GENI portal) to cryptographically sign a “speaks-for” credential that authorizes the CloudLab portal to make GENI API calls on your behalf.

    Click the “Authorize” button: this will create a signed statement authorizing the CloudLab portal to speak on your behalf. This authorization is time-limited (see “Show Advanced” for the details), and all actions the CloudLab portal takes on your behalf are clearly traceable.

    If you’d like, you can click “Remember This Decision” to skip this step in the future (you will still be asked to log in to the GENI portal).
    screenshots/clab/tutorial/authorize.png

  7. Set up an SSH keypair in the GENI portal
    Though not strictly required, many parts of this tutorial will work better if you have an ssh public key associated with your GENI Portal account. To upload one, visit portal.geni.net and use the “SSH Keys” item on the menu that drops down under your name.
    If you are not familiar with ssh keys, you may skip this step. You may find GitHub’s ssh key page helpful in learning how to generate and use them.

    When you create a new experiment, CloudLab grabs the set of ssh keys you have set up at the time; any ssh keys that you add later are not added to experiments that are already running.

    screenshots/clab/tutorial/portal-keypair.png

13.4 Building Your Own OpenStack Cloud

Once you have logged in to CloudLab, you will “instantiate” a “profile” to create an experiment. (An experiment in CloudLab is similar to a “slice” in GENI.) Profiles are CloudLab’s way of packaging up configurations and experiments so that they can be shared with others. Each experiment is separate: the experiment that you create for this tutorial will be an instance of a profile provided by the facility, but running on resources that are dedicated to you, which you have complete control over. This profile uses local disk space on the nodes, so anything you store there will be lost when the experiment terminates.

The OpenStack cloud we will build in this tutorial is very small, but CloudLab has additional hardware that can be used for larger-scale experiments.

For this tutorial, we will use a basic profile that brings up a small OpenStack cloud. The CloudLab staff have built this profile by capturing disk images of a partially-completed OpenStack installation and scripting the remainder of the install (customizing it for the specific machines that will get allocated, the user that created it, etc.) See this manual’s section on profiles for more information about how they work.

  1. Start Experiment
    screenshots/clab/tutorial/start-experiment-menu.png
    After logging in, you are taken to your main status dashboard. Select “Start Experiment” from the “Experiments” menu.

  2. Select a profile
    screenshots/clab/tutorial/start-experiment.png
    The “Start an Experiment” page is where you will select a profile to instantiate. We will use the OpenStack profile; if it is not selected, follow this link or click the “Change Profile” button, and select “OpenStack” from the list on the left.
    Once you have the correct profile selected, click “Next”
    screenshots/clab/tutorial/click-next.png

  3. Set parameters
    Profiles in CloudLab can have parameters that affect how they are configured; for example, this profile has parameters that allow you to set the size of your cloud, spread it across multiple clusters, and turn on and off many OpenStack options.
    For this tutorial, we will leave all parameters at their defaults and just click “next”.
    screenshots/clab/tutorial/set-parameters.png

  4. Select a cluster
    CloudLab has multiple clusters available to it. Some profiles can run on any cluster, some can only run on specific ones due to specific hardware constraints, etc.
    Note: If you are at an in-person tutorial, the instructor will tell you which cluster to select. Otherwise, you may select any cluster.

    The dropdown menu for the clusters shows you both the health (outer ring) and available resources (inner dot) of each cluster. The “Check Cluster Status” link opens a page (in a new tab) showing the current utilization of all CloudLab clusters.

    screenshots/clab/tutorial/pick-cluster.png

  5. Click Finish!
    When you click the “finish” button, CloudLab will start provisioning the resources that you requested on the cluster that you selected.

    You may optionally give your experiment a name—this can be useful if you have many experiments running at once.

    screenshots/clab/tutorial/click-create.png

  6. CloudLab instantiates your profile
    CloudLab will take a few minutes to bring up your copy of OpenStack, as many things happen at this stage, including selecting suitable hardware, loading disk images on local storage, booting bare-metal machines, re-configuring the network topology, etc. While this is happening, you will see this status page:
    screenshots/clab/tutorial/status-waiting.png

    Provisioning is done using the GENI APIs; it is possible for advanced users to bypass the CloudLab portal and call these provisioning APIs from their own code. A good way to do this is to use the geni-lib library for Python.

    As soon as a set of resources have been assigned to you, you will see details about them at the bottom of the page (though you will not be able to log in until they have gone through the process of imaging and booting.) While you are waiting for your resources to become available, you may want to have a look at the CloudLab user manual, or use the “Sliver” button to watch the logs of the resources (“slivers”) being provisioned and booting.

  7. Your cloud is ready!
    When the web interface reports the state as “Booted”, your cloud is provisioned, and you can proceed to the next section.
    Important: A “Booted” status indicates that resources are provisioned and booted; this particular profile runs scripts to complete the OpenStack setup, and it will be a few more minutes before OpenStack is fully ready to log in and create virtual machine instances. You will be able to tell that this has finished when the status changes from “Booted” to “Ready”. For now, don’t attempt to log in to OpenStack, we will explore the CloudLab experiment first.
    screenshots/clab/tutorial/status-ready.png

13.5 Exploring Your Experiment

Now that your experiment is ready, take a few minutes to look at various parts of the CloudLab status page to help you understand what resources you’ve got and what you can do with them.

13.5.1 Experiment Status

The panel at the top of the page shows the status of your experiment—you can see which profile it was launched with, when it will expire, etc. The buttons in this area let you make a copy of the profile (so that you can customize it), ask to hold on to the resources for longer, or release them immediately.

screenshots/clab/tutorial/experiment-status.png

Note that the default lifetime for experiments on CloudLab is less than a day; after this time, the resources will be reclaimed and their disk contents will be lost. If you need to use them for longer, you can use the “Extend” button and provide a description of why they are needed. Longer extensions require higher levels of approval from CloudLab staff. You might also consider creating a profile of your own if you might need to run a customized environment multiple times or want to share it with others.

You can click the title of the panel to expand or collapse it.

13.5.2 Profile Instructions

Profiles may contain written instructions for their use. Clicking on the title of the “Profile Instructions” panel will expand (or collapse) it; in this case, the instructions provide a link to the administrative interface of OpenStack, and give you passwords to use to log in. (Don’t log into OpenStack yet—for now, let’s keep exploring the CloudLab interface.)

screenshots/clab/tutorial/experiment-instructions.png

13.5.3 Topology View

At the bottom of the page, you can see the topology of your experiment. This profile has three nodes connected by a LAN, which is represented by a gray box in the middle of the topology. The names given for each node are the names assigned as part of the profile; this way, every time you instantiate a profile, you can refer to the nodes using the same names, regardless of which physical hardware was assigned to them. The green boxes around each node indicate that they are up; click the “Refresh Status” button to initiate a fresh check.

screenshots/clab/tutorial/topology-view.png

If an experiment has “startup services” (programs that run at the beginning of the experiment to set it up), their status is indicated by a small icon in the upper right corner of the node. You can mouse over this icon to see a description of the current status. In this profile, the startup services on the compute node(s) typically complete quickly, but the control node may take much longer.

It is important to note that every node in CloudLab has at least two network interfaces: one “control network” that carries public IP connectivity, and one “experiment network” that is isolated from the Internet and all other experiments. It is the experiment net that is shown in this topology view. You will use the control network to ssh into your nodes, interact with their web interfaces, etc. This separation gives you more freedom and control in the private experiment network, and sets up a clean environment for repeatable research.

13.5.4 List View

The list view tab shows similar information to the topology view, but in a different format. It shows the identities of the nodes you have been assigned, and the full ssh command lines to connect to them. In some browsers (those that support the ssh:// URL scheme), you can click on the SSH commands to automatically open a new session. On others, you may need to cut and paste this command into a terminal window. Note that only public-key authentication is supported, and you must have set up an ssh keypair on your account before starting the experiment in order for authentication to work.

screenshots/clab/tutorial/experiment-list.png

13.5.5 Manifest View

The third default tab shows a manifest detailing the hardware that has been assigned to you. This is the “request” RSpec that is used to define the profile, annotated with details of the hardware that was chosen to instantiate your request. This information is available on the nodes themselves using the geni-get command, enabling you to do rich scripting that is fully aware of both the requested topology and assigned resources.

Most of the information displayed on the CloudLab status page comes directly from this manifest; it is parsed and laid out in-browser.

screenshots/clab/tutorial/experiment-manifest.png

13.5.6 Graphs View

The final default tab shows a page of CPU load and network traffic graphs for the nodes in your experiment. On a freshly-created experiment, it may take several minutes for the first data to appear. After clicking on the “Graphs” tab the first time, a small reload icon will appear on the tab, which you can click to refresh the data and regenerate the graphs. For instance, here is the load average graph for an OpenStack experiment running this profile for over 6 hours. Scroll past this screenshot to see the control and experiment network traffic graphs. In your experiment, you’ll want to wait 20-30 minutes before expecting to see anything interesting.

screenshots/clab/tutorial/experiment-graphs.png

Here are the control network and experiment network packet graphs for the same period. The spikes at the beginning are produced by OpenStack setup and configuration, as well as the simple OpenStack tasks you’ll perform later in this profile, like adding a VM.

screenshots/clab/tutorial/experiment-graphs-nets.png

13.5.7 Actions

In both the topology and list views, you have access to several actions that you may take on individual nodes. In the topology view, click on the node to access this menu; in the list view, it is accessed through the icon in the “Actions” column. Available actions include rebooting (power cycling) a node, and re-loading it with a fresh copy of its disk image (destroying all data on the node). While nodes are in the process of rebooting or re-imaging, they will turn yellow in the topology view. When they have completed, they will become green again. The shell and console actions are described in more detail below.

screenshots/clab/tutorial/experiment-actions.png

13.5.8 Web-based Shell

CloudLab provides a browser-based shell for logging into your nodes, which is accessed through the action menu described above. While this shell is functional, it is most suited to light, quick tasks; if you are going to do serious work on your nodes, we recommend using a standard terminal and ssh program.

This shell can be used even if you did not establish an ssh keypair with your account.

Two things of note:

screenshots/clab/tutorial/experiment-shell.png

13.5.9 Serial Console

CloudLab provides serial console access for all nodes, which can be used in the event that normal IP or ssh access gets intentionally or unintentionally broken. Like the browser-based shell, it is launched through the action menu, and the same caveats listed above apply as well. In addition:

screenshots/clab/tutorial/experiment-console.png

13.6 Bringing up Instances in OpenStack

Now that you have your own copy of OpenStack running, you can use it just like you would any other OpenStack cloud, with the important property that you have full root access to every machine in the cloud and can modify them however you’d like. In this part of the tutorial, we’ll go through the process of bringing up a new VM instance in your cloud.

We’ll be doing all of the work in this section using the Horizon web GUI for OpenStack, but you could also ssh into the machines directly and use the command line interfaces or other APIs as well.

  1. Check to see if OpenStack is ready to log in
    As mentioned earlier, this profile runs several scripts to complete the installation of OpenStack. These scripts do things such as finalize package installation, customize the installation for the specific set of hardware assigned to your experiment, import cloud images, and bring up the hypervisors on the compute node(s).
    If exploring the CloudLab experiment took you more than ten minutes, these scripts are probably done. You can make sure by checking that all nodes have completed their startup scripts (indicated by a checkmark on the Topology view); when this happens, the experiment state will also change from “Booting” to “Ready”.
    screenshots/clab/tutorial/status-scriptsdone.png
    If you continue without verifying that the setup scripts are complete, be aware that you may see temporary errors from the OpenStack web interface. These errors, and the method for dealing with them, are generally noted in the text below.

  2. Visit the OpenStack Horizon web interface
    On the status page for your experiment, in the “Instructions” panel (click to expand it if it’s collapsed), you’ll find a link to the web interface running on the ctl node. Open this link (we recommend opening it in a new tab, since you will still need information from the CloudLab web interface).
    screenshots/clab/tutorial/experiment-password.png

  3. Log in to the OpenStack web interface
    Log in using the username admin and the password shown in the instructions for the profile. Use the domain name default if prompted for a domain.

    This profile generates a new (random) password for every experiment.

    Important: if the web interface rejects your password, wait a few minutes and try again. If it gives you another type of error, you may need to wait a minute and reload the page to get login working properly.
    screenshots/clab/tutorial/os-login.png

  4. Launch a new VM instance
    Click the “Launch Instance” button on the “Instances” page of the web interface.
    screenshots/clab/tutorial/os-launch.png

  5. Set Basic Settings For the Instance
    There are a few settings you will need to make in order to launch your instance. These instructions are for the launch wizard in the OpenStack Mitaka release, but the wizards in other releases are similar in function and required information. Use the tabs in the column on the left of the launch wizard to add the required information (i.e., Details, Source, Flavor, Networks, and Key Pair), as shown in the following screenshots:
    screenshots/clab/tutorial/os-launch-basic.png
    • Pick any “Instance Name” you wish

    screenshots/clab/tutorial/os-launch-source.png
    • For the “Image Name”, select “trusty-server”

    • For the “Instance Boot Source”, select “Boot from image”

    Important: If you do not see any available images, the image import script may not have finished yet; wait a few minutes, reload the page, and try again.
    screenshots/clab/tutorial/os-launch-flavor.png
    • Set the “Flavor” to m1.small; the disk for the default m1.tiny instance is too small for the image we will be using, and since we have only one compute node, we want to avoid using up too many of its resources.

  6. Add a Network to the Instance
    In order to be able to access your instance, you will need to give it a network. On the “Networking” tab, add the tun0-net to the list of selected networks by clicking the “+” button or dragging it up to the blue region. This will set up an internal tunneled connection within your cloud; later, we will assign a public IP address to it.

    The tun0-net consists of EGRE tunnels on the CloudLab experiment network.

    Important: If you are missing the Networking tab, you may have logged into the OpenStack web interface before all services were started. Unfortunately, reloading does not solve this; you will need to log out and back in.
    screenshots/clab/tutorial/os-launch-net.png

  7. Set an SSH Keypair
    On the “Key Pair” tab (or “Access & Security” in previous versions), you will add an ssh keypair to log into your node. If you configured an ssh key in your GENI account, you should find it as one of the options in this list. You can filter the list by typing your CloudLab username, or a portion of it, into the filter box to reduce the size of the list. (By default, we load in public keys for all users in the project in which you created your experiment, for convenience – thus the list can be long.) If you don’t see your keypair, you can add a new one with the red button to the right of the list. Alternatively, you can skip this step and use a password for login later.
    screenshots/clab/tutorial/os-launch-finish.png

  8. Launch, and Wait For Your Instance to Boot
    Click the “Launch” button on the “Key Pair” wizard page, and wait for the status on the instances page to show up as “Active”.

  9. Add a Public IP Address
    screenshots/clab/tutorial/os-launch-associate.png
    At this point, your instance is up, but it has no connection to the public Internet. From the menu on the right, select “Associate Floating IP”.

    Profiles may request to have public IP addresses allocated to them; this profile requests four (two of which are used by OpenStack itself.)

    On the following screen, you will need to:
    1. Press the red button to set up a new IP address

    2. Click the “Allocate IP” button—this allocates a new address for you from the public pool available to your cloud.

    3. Click the “Associate” button—this associates the newly-allocated address with this instance.

    The public address is tunneled by the ctl (controller) node from the control network to the experiment network. (In older OpenStack profile versions, or depending on the profile parameters specified to the current profile version, the public address may instead be tunneled by the nm (network manager) node, in a split controller/network manager OpenStack deployment.)

    You will now see your instance’s public address on the “Instances” page, and should be able to ping this address from your laptop or desktop.
    screenshots/clab/tutorial/os-instance-publicip.png

  10. Log in to Your Instance
    You can ssh to this IP address. Use the username ubuntu; if you provided a public key earlier, use your private ssh key to connect. If you did not set up an ssh key, use the same password you used to log in to the OpenStack web interface (shown in the profile instructions.) Run a few commands to check out the VM you have launched.

13.7 Administering OpenStack

Now that you’ve done some basic tasks in OpenStack, we’ll do a few things that you would not be able to do as a user of someone else’s OpenStack installation. These just scratch the surface—you can upgrade, downgrade, modify or replace any piece of your own copy of OpenStack.

13.7.1 Log Into The Control Nodes

If you want to watch what’s going on with your copy of OpenStack, you can use ssh to log into any of the hosts as described above in the List View or Web-based Shell sections. Don’t be shy about viewing or modifying things; no one else is using your cloud, and if you break this one, you can easily get another.

Some things to try:

13.7.2 Reboot the Compute Node

Since you have this cloud to yourself, you can do things like reboot or re-image nodes whenever you wish. We’ll reboot the cp1 (compute) node and watch it through the serial console.

  1. Open the serial console for cp1
    On your experiment’s CloudLab status page, use the action menu as described in Actions to launch the console. Remember that you may have to click to focus the console window and hit enter a few times to see activity on the console.

  2. Reboot the node
    On your experiment’s CloudLab status page, use the action menu as described in Actions to reboot cp1. Note that in the topology display, the box around cp1 will turn yellow while it is rebooting, then green again when it has booted.

  3. Watch the node boot on the console
    Switch back to the console tab you opened earlier, and you should see the node starting its reboot process.
    screenshots/clab/tutorial/reboot-console.png

  4. Check the node status in OpenStack
    You can also watch the node’s status from the OpenStack Horizon web interface. In Horizon, select “Hypervisors” under the “System” menu, and switch to the “Compute Host” tab.
    screenshots/clab/tutorial/os-node-down.png
    Note: This display can take a few minutes to notice that the node has gone down, and to notice when it comes back up. Your instances will not come back up automatically; you can bring them up with a “Hard Reboot” from the “Admin -> System -> Instances” page.

13.8 Terminating the Experiment

Resources that you hold in CloudLab are real, physical machines and are therefore limited and in high demand. When you are done, you should release them for use by other experimenters. Do this via the “Terminate” button on the CloudLab experiment status page.

screenshots/clab/tutorial/status-terminate.png

Note: When you terminate an experiment, all data on the nodes is lost, so make sure to copy off any data you may need before terminating.

If you were doing a real experiment, you might need to hold onto the nodes for longer than the default expiration time. You would request more time by using the “Extend” button on the status page. You would need to provide a written justification for holding onto your resources for a longer period of time.

13.9 Taking Next Steps

Now that you’ve got a feel for what CloudLab can do, there are several things you might try next:

14 CloudLab Chef Tutorial

This tutorial will walk you through the process of creating and using an instance of the Chef configuration management system on CloudLab.

Chef is both the name of a company and the name of a popular modern configuration management system written in Ruby and Erlang. A large variety of tutorials, articles, technical docs, and training opportunities is available at the official Chef website. Also, refer to the Customer Stories page to see how Chef is used in production environments, including very large installations (e.g., at Facebook, Bloomberg, and Yahoo!).

14.1 Objectives

In the process of taking this tutorial, you will learn to:

14.2 Motivation

This tutorial will demonstrate how experiments can be managed on CloudLab, as well as show how experiment resources can be administered using Chef. By following the instructions provided below, you will learn how to take advantage of the powerful features of Chef for configuring multi-node software environments. The exercises included in this tutorial are built around simple but realistic configurations. In the process of recreating these configurations on nodes running default images, you will explore individual components of Chef and follow the configuration management workflow applicable to more complex configurations and experiments.

14.3 Prerequisites

This tutorial assumes that:

14.4 Logging In

The first step is to log in to CloudLab. CloudLab is available to all researchers and educators who work in cloud computing. If you have an account at one of its federated facilities, like Emulab or GENI, then you already have an account at CloudLab.

14.4.1 Using a CloudLab Account

If you have signed up for an account at the CloudLab website, simply open https://www.cloudlab.us/ in your browser, click the “Log In” button, enter your username and password, and skip to the next part of the tutorial.

screenshots/clab/tutorial/front-page.png

14.4.2 Using a GENI Account

If you have an account at the GENI portal, you will use the single-sign-on features of the CloudLab and GENI portal.

  1. Open the web interface
    To log in to CloudLab using a GENI account, start by visiting https://www.cloudlab.us/ in your browser and using the “Log In” button in the upper right.
    screenshots/clab/tutorial/front-page.png

  2. Click the "GENI User" button
    On the login page that appears, you will use the “GENI User” button. This will start the GENI authorization tool, which lets you connect your GENI account with CloudLab.
    screenshots/clab/tutorial/login-page.png

  3. Select the GENI portal
    You will be asked which facility your account is with. Select the GENI icon, which will take you to the GENI portal. There are several other facilities that are federated with CloudLab, which can be found in the drop-down list.
    screenshots/clab/tutorial/authorization-tool.png

  4. Log into the GENI portal
    You will be taken to the GENI portal, and will need to select the institution that is your identity provider. Usually, this is the university you are affiliated with. If your university is not in the list, you may need to log in using the “GENI Project Office” identity provider. If you have ever logged in to the GENI portal before, your identity provider should be pre-selected; if you are not familiar with this login page, there is a good chance that you don’t have a GENI account and need to apply for one.
    screenshots/clab/tutorial/geni-login.png

  5. Log into your institution
    This will bring you to your institution’s standard login page, which you likely use to access many on-campus services. (Yours will look different from this one.)
    screenshots/clab/tutorial/utah-login.png

  6. Authorize CloudLab to use your GENI account

    What’s happening in this step is that your browser uses your GENI user certificate (which it obtained from the GENI portal) to cryptographically sign a “speaks-for” credential that authorizes the CloudLab portal to make GENI API calls on your behalf.

    Click the “Authorize” button: this will create a signed statement authorizing the CloudLab portal to speak on your behalf. This authorization is time-limited (see “Show Advanced” for the details), and all actions the CloudLab portal takes on your behalf are clearly traceable.

    If you’d like, you can click “Remember This Decision” to skip this step in the future (you will still be asked to log in to the GENI portal).
    screenshots/clab/tutorial/authorize.png

  7. Set up an SSH keypair in the GENI portal
    Though not strictly required, many parts of this tutorial will work better if you have an ssh public key associated with your GENI Portal account. To upload one, visit portal.geni.net and use the “SSH Keys” item on the menu that drops down under your name.
    If you are not familiar with ssh keys, you may skip this step. You may find GitHub’s ssh key page helpful in learning how to generate and use them.

    When you create a new experiment, CloudLab grabs the set of ssh keys you have set up at the time; any ssh keys that you add later are not added to experiments that are already running.

    screenshots/clab/tutorial/portal-keypair.png

14.5 Launching Chef Experiments

Once you have logged in to CloudLab, you will “instantiate” a “profile” to create an experiment. (An experiment in CloudLab is similar to a “slice” in GENI.) Profiles are CloudLab’s way of packaging up configurations and experiments so that they can be shared with others. Each experiment is separate: the experiment that you create for this tutorial will be an instance of a profile provided by the facility, but running on resources that are dedicated to you, which you have complete control over. This profile uses local disk space on the nodes, so anything you store there will be lost when the experiment terminates.

The Chef cluster we are building in this tutorial is very small, but CloudLab has large clusters that can be used for larger-scale experiments.

For this tutorial, we will use a profile that launches a Chef cluster — a set of interconnected nodes running Chef components such as the Chef server, a workstation, and clients. The CloudLab staff have built this profile by scripting the installation procedures for these components. These scripts run on the experiment nodes after they boot and customize them (creating necessary user accounts, installing packages, establishing authentication between the nodes, etc.) to produce a fully functional Chef cluster with a multi-node, production-like structure.

See this manual’s section on profiles for more information about how they work.

  1. Start Experiment
    screenshots/clab/tutorial/start-experiment-menu.png
    After logging in, you are taken to your main status dashboard. Select “Start Experiment” from the “Experiments” menu.

  2. Select a profile
    By default, the “Start an Experiment” page suggests launching the OpenStack profile, which is described in detail in the OpenStack tutorial.
    Go to the list of available profiles by clicking “Change Profile”:
    screenshots/clab/chef-tutorial/change-profile.png
    Find the profile by typing ChefCluster in the search bar. Then, select the profile with the specified name in the list displayed below the search bar. A 2-node preview should now be shown along with high-level profile information. Click “Select Profile” at the bottom of the page:
    screenshots/clab/chef-tutorial/find-chefcluster.png
    After you select the correct profile, click “Next”:
    screenshots/clab/chef-tutorial/instantiate-next.png

  3. Set parameters
    Profiles in CloudLab can have parameters that affect how they are configured; for example, this profile has parameters that allow you to set the number of client nodes, specify the repository with the infrastructure code we plan to use, obtain copies of the application-specific infrastructure code developed by the global community of Chef developers, etc.
    For this tutorial, we will leave all parameters at their defaults and just click “Next”.
    screenshots/clab/chef-tutorial/params-next.png

  4. Choose experiment name
    You may optionally give your experiment a meaningful name, e.g., “chefdemo”. This is useful if you have many experiments running at once.
    screenshots/clab/chef-tutorial/experiment-name.png

  5. Select a cluster
    CloudLab has multiple clusters available to it. Some profiles can run on any cluster; others can run only on specific ones due to hardware constraints. ChefCluster can run only on the x86-based clusters, which excludes the CloudLab Utah cluster built on ARMv8 nodes. Refer to the Hardware section for more information.
    Note: If you are at an in-person tutorial, the instructor will tell you which cluster to select. Otherwise, you may select any compatible and available cluster.

    The dropdown menu for the clusters shows you both the health (outer ring) and available resources (inner dot) of each cluster. The “Check Cluster Status” link opens a page (in a new tab) showing the current utilization of all CloudLab clusters.

    screenshots/clab/chef-tutorial/select-cluster.png

  6. Click Finish!
    When you click the “Finish” button, CloudLab will start provisioning the resources that you requested on the cluster that you selected.

  7. CloudLab instantiates your profile
    CloudLab will take a few minutes to bring up your experiment, as many things happen at this stage, including selecting suitable hardware, loading disk images on local storage, booting bare-metal machines, re-configuring the network topology, etc. While this is happening, you will see the topology with yellow node icons:
    screenshots/clab/chef-tutorial/booting.png

    Provisioning is done using the GENI APIs; it is possible for advanced users to bypass the CloudLab portal and call these provisioning APIs from their own code. A good way to do this is to use the geni-lib library for Python.

    As soon as a set of resources has been assigned to you, you will see details about them if you switch to the "List View" tab (though you will not be able to log in until they have gone through the process of imaging and booting). While you are waiting for your resources to become available, you may want to have a look at the CloudLab user manual, or use the “Sliver” button to watch the logs of the resources (“slivers”) being provisioned and booting.

  8. Your resources are ready!
    Shortly, the web interface will report the state as “Booted”.
    screenshots/clab/chef-tutorial/booted.png
    Important: A “Booted” status indicates that resources are provisioned and booted; this particular profile runs scripts to complete the Chef setup, and it will be a few more minutes before Chef becomes fully functional. You will be able to tell that the configuration process has finished when the status changes from “Booted” to “Ready”. Until then, you will likely see that the startup scripts (i.e., programs that run at the beginning of the experiment to set it up) run longer on head than on node-0, since a lot more work is required to install the Chef server than the client. In the Topology View tab, mouse over the circle in head’s icon to confirm the current state of the node.
    screenshots/clab/chef-tutorial/head-running.png

14.6 Exploring Your Experiment

While the startup scripts are still running, you will have a few minutes to look at various parts of the CloudLab experiment page and learn what resources you now have access to and what you can do with them.

14.6.1 Experiment Status

The panel at the top of the page shows the status of your experiment — you can see which profile it was launched with, when it will expire, etc. The buttons in this area let you make a copy of the profile (so that you can customize it), ask to hold on to the resources for longer, or release them immediately.

screenshots/clab/chef-tutorial/experiment-status.png

Note that the default lifetime for experiments on CloudLab is less than a day; after this time, the resources will be reclaimed and their disk contents will be lost. If you need to use them for longer, you can use the “Extend” button and provide a description of why they are needed. Longer extensions require higher levels of approval from CloudLab staff. You might also consider creating a profile of your own if you might need to run a customized environment multiple times or want to share it with others.

You can click on the title of the panel to expand or collapse it.

14.6.2 Profile Instructions

Profiles may contain written instructions for their use. Clicking on the title of the “Profile Instructions” panel will expand or collapse it. In this case, the instructions provide a link to the Chef server web console. (Don’t click on the link yet — it is likely that Chef is still being configured; for now, let’s keep exploring the CloudLab interface.)

Also, notice the list of private and public hostnames of the experiment nodes included in the instructions. You will use the public hostnames (shown in bold) in the exercise with the Apache web server and the ApacheBench benchmarking tool.

screenshots/clab/chef-tutorial/experiment-instructions.png

14.6.3 Topology View

We have already used the topology viewer to see the node status; let’s take a closer look at the topology of this experiment. This profile has two nodes connected by a 10 Gigabit LAN, which is represented by a gray box in the middle of the topology. The names given for each node are the names assigned as part of the profile; this way, every time you instantiate a profile, you can refer to the nodes using the same names, regardless of which physical hardware was assigned to them. The green boxes around each node indicate that they are up; click the “Refresh Status” button to initiate a fresh check.

You can also run several networking tests by clicking the “Run Linktest” button at the bottom of the page. The available tests include:
  • Level 1 — Connectivity and Latency

  • Level 2 — Plus Static Routing

  • Level 3 — Plus Loss

  • Level 4 — Plus Bandwidth

Higher levels will take longer to complete and require patience.

screenshots/clab/chef-tutorial/topology-view.png

If an experiment has startup services, their statuses are indicated by small icons in the upper right corners of the node icons. You can mouse over this icon to see a description of the current status. In this profile, the startup services on the client node(s), such as node-0, typically complete quickly, but the scripts on head take much longer. The screenshot above shows the state of the experiment when all startup scripts complete.

It is important to note that every node in CloudLab has at least two network interfaces: one “control network” that carries public IP connectivity, and one “experiment network” that is isolated from the Internet and all other experiments. It is the experiment net that is shown in this topology view. You will use the control network to ssh into your nodes, interact with their web interfaces, etc. This separation gives you more freedom and control in the private experiment network, and sets up a clean environment for repeatable research.

14.6.4 List View

The list view tab shows similar information to the topology view, but in a different format. It shows the identities of the nodes you have been assigned, and the full ssh command lines to connect to them. In some browsers (those that support the ssh:// URL scheme), you can click on the SSH commands to automatically open a new session. On others, you may need to cut and paste this command into a terminal window. Note that only public-key authentication is supported, and you must have set up an ssh keypair on your account before starting the experiment in order for authentication to work.

screenshots/clab/chef-tutorial/list-view.png

14.6.5 Manifest View

The final default tab shows a manifest detailing the hardware that has been assigned to you. This is the “request” RSpec that is used to define the profile, annotated with details of the hardware that was chosen to instantiate your request. This information is available on the nodes themselves using the geni-get command, enabling you to do rich scripting that is fully aware of both the requested topology and assigned resources.

Most of the information displayed on the CloudLab status page comes directly from this manifest; it is parsed and laid out in-browser.

screenshots/clab/chef-tutorial/manifest-view.png

14.6.6 Actions

In both the topology and list views, you have access to several actions that you may take on individual nodes. In the topology view, click on the node to access this menu; in the list view, it is accessed through the icon in the “Actions” column. Available actions include rebooting (power cycling) a node, and re-loading it with a fresh copy of its disk image (destroying all data on the node). While nodes are in the process of rebooting or re-imaging, they will turn yellow in the topology view. When they have completed, they will become green again. The shell action is described in more detail below and will be used as the main method for issuing commands on the experiment nodes throughout this tutorial.

screenshots/clab/chef-tutorial/experiment-actions.png

14.7 Brief Introduction to Chef

While the startup scripts are running, let’s take a quick look at the architecture of a typical Chef installation. The diagram provided at the official Overview of Chef page demonstrates the relationships between individual Chef components. While there are a number of optional, advanced components that we don’t use in this tutorial, it is worth noting the following:

In the experiment we have launched, head runs all three: the server, the workstation, and the client (there is no conflict between these components), while node-0 runs only the client. If this profile is instantiated with an arbitrary number N of clients, there will be N “client-only” nodes named node-0, node-1, ..., node-(N-1). Another profile parameter allows choosing the repository from which chef-repo is cloned; by default, it points to emulab/chef-repo. The startup scripts in this profile also obtain and install copies of public cookbooks hosted at the repository called Supermarket. Specifically, the exercises in this tutorial rely on the nfs and apache2 cookbooks — both should be installed on your experiment now in accordance with the default value of the corresponding profile parameter we have used.

14.8 Logging in to the Chef Web Console

As soon as the startup scripts complete, you will likely receive an email confirming that Chef is installed and operational.

Dear User,

Chef server and workstataion should now be
installed on head.chefdemo.utahstud.emulab.net.

To explore the web management console, copy
this hostname and paste it into your browser.
Installation log can be found at /var/log/init-chef.log
on the server node.

To authenticate, use the unique credentials
saved in /root/.chefauth on the server node.

Below are several Chef commands which detail the launched experiment:

# chef -v
Chef Development Kit Version: 0.7.0
chef-client version: 12.4.1
berks version: 3.2.4
kitchen version: 1.4.2

# knife cookbook list
apache2               3.1.0
apt                   2.9.2
emulab-apachebench    1.0.0
emulab-nfs            0.1.0
...
nfs                   2.2.6

# knife node list
head
node-0

# knife role list
apache2
apachebench
...
nfs

# knife status -r
1 minute  ago, head, [], ubuntu 14.04.
0 minutes ago, node-0, [], ubuntu 14.04.

Happy automation with Chef!

In some cases, you may not see this email — email filters (e.g., if you are using a university email account) may classify it as spam. This is not a problem, since the email provides useful but noncritical information. You can still access the Chef web console using the link included in the profile instructions and also obtain the credentials as described in the instructions.

When you receive this email or see in the topology viewer that the startup scripts completed, you can proceed to the following step.

14.8.1 Web-based Shell

CloudLab provides a browser-based shell for logging into your nodes, which is accessed through the action menu described above. While this shell is functional, it is most suited to light, quick tasks; if you are going to do serious work on your nodes, we recommend using a standard terminal and ssh program.

This shell can be used even if you did not establish an ssh keypair with your account.

Three things of note:

Create a shell tab for head by choosing “Shell” in the Actions menu for head in either the topology or the list view. The new tab will be labeled “head”. It will allow typing shell commands and executing them on the node. All shell commands used throughout this tutorial should be issued using this shell tab.

While you are here, switch to the root user by typing:

sudo su -

screenshots/clab/chef-tutorial/experiment-shell.png

Your prompt, the string which is printed on every line before the cursor, should change to root@head. All commands in the rest of the tutorial should be executed under root.

14.8.2 Chef Web Console

Type the following to print Chef credentials:

cat /root/.chefauth

You should see the unique login and password that have been generated by the startup scripts specifically for your instance of Chef:

screenshots/clab/chef-tutorial/shell-chefauth.png

Expand the “Profile Instructions” panel and click on the “Chef web console” link.

Warning: When your browser first points to this link, you will see a warning about using a self-signed SSL certificate. Using self-signed certificates is not a good option in production scenarios, but it is totally acceptable in this short-term experimentation environment. We will ignore this warning and proceed to using the Chef console. The sequence of steps for ignoring it depends on your browser. In Chrome, you will see a message saying "Your connection is not private". Click “Advanced” at the bottom of the page and choose “Proceed to <hostname> (unsafe)”. In Firefox, the process is slightly different: click “Advanced”, choose “Add Exception”, and click “Confirm Security Exception”.

When the login page is finally displayed, use the credentials you have printed in the shell:

screenshots/clab/chef-tutorial/chef-wui-login.png
screenshots/clab/chef-tutorial/chef-wui-nodelist.png

If you see a console like the one above, with both head and node-0 listed on the “Nodes” tab, you have a working Chef cluster! You can now proceed to managing your nodes using Chef recipes, cookbooks, and roles.

In the rest of the tutorial, we demonstrate how you can use several pre-defined cookbooks and roles. Development of cookbooks and roles is a subject of another tutorial (many such tutorials can be found online). Our goal in the following sections is to walk you through the process of using existing cookbooks and roles — the process which is the same for simple and more complex configurations. You will learn how to modify run lists and apply cookbooks and roles to your nodes, run them, check their status, and explore individual components of cookbooks and roles.

14.9 Configuring NFS

Now that you have a working instance of Chef where both head and node-0 can be controlled using Chef, let’s start modifying the software stacks on these nodes by installing NFS and exporting a directory from head to node-0.

  1. Modify the run list on head
    Click on the row for head in the node list (so the entire row is highlighted in orange), and then click "Edit" in the Run List panel in the bottom right corner of the page:
    screenshots/clab/chef-tutorial/head-edit-runlist.png

  2. Apply nfs role:
    In the popup window, find the role called nfs in the Available Roles list on the left, drag and drop it into the "Current Run List" field on the right. When this is done, click "Save Run List".
    screenshots/clab/chef-tutorial/head-nfs-drag.png

  3. Repeat the last two steps for node-0:
    screenshots/clab/chef-tutorial/node0-edit-runlist.png
    screenshots/clab/chef-tutorial/node0-nfs-drag.png
    At this point, the nfs role is assigned to both nodes, but nothing has executed yet.

  4. Check the status from the shell
    Before proceeding to applying these updates, let’s check the role assignment from the shell by typing:

    knife status -r

    The run lists for your nodes should be printed inside square brackets:
    screenshots/clab/chef-tutorial/shell-knife-status.png
    The output of this command also conveniently shows when Chef applied changes to your nodes last (first column) and what operating systems run on the nodes (last column).

  5. Trigger the updates

    Run the assigned role on head by typing:

    chef-client

    screenshots/clab/chef-tutorial/head-chef-client.png
    As a part of the startup procedure, passwordless ssh connections are enabled between the nodes. Therefore, you can use ssh to run commands remotely, on the nodes other than head, from the same shell. Execute the same command on node-0 by typing:

    ssh node-0 chef-client

    screenshots/clab/chef-tutorial/node0-chef-client.png
    When nodes execute the “chef-client” command, they contact the server and request the cookbooks, recipes, and roles that have been assigned to them in their run lists. The server responds with those artifacts, and the nodes execute them in the specified order.

  6. Verify that NFS is working
    After updating both nodes, you should have a working NFS configuration in which the /exp-share directory is exported from head and mounted on node-0.

    The name of the NFS directory is one of the attributes that is set in the nfs role. To explore other attributes and see how this role is implemented, take a look at the /chef-repo/roles/nfs.rb file on head.

    The simplest way to test that this configuration is functioning properly is to create a file on one node and check that this file becomes available on the other node. Follow these commands to test it (the lines that start with the # sign are comments):

    # List files in the NFS directory /exp-share on head; should be empty
    ls /exp-share/

    # List the same directory remotely on node-0; also empty
    ssh node-0 ls /exp-share/

    # Create an empty file in this directory locally
    touch /exp-share/NFS-is-configured-by-Chef

    # Find the same file on node-0
    ssh node-0 ls /exp-share/

    screenshots/clab/chef-tutorial/nfs-is-working.png
    If you can see the created file on node-0, your NFS is working as expected. Now you can easily move files between your nodes using the /exp-share directory.

Summary: You have just installed and configured NFS by assigning a Chef role and running it on your nodes. You can create much more complex software environments by repeating these simple steps and installing more software components on your nodes. Additionally, installing NFS on a set of nodes is usually not the subject of a research experiment but rather an infrastructure prerequisite for many distributed systems. By automating such installation and configuration procedures with a system like Chef, you can save a lot of time whenever you need to periodically recreate these environments or create multiple instances of them.

14.9.1 Exploring The Structure

It is worth looking at how the role you have just used is implemented. Stored at /chef-repo/roles/nfs.rb on head, it should include the following code:

#
# This role depends on the emulab-nfs cookbook;
# emulab-nfs depends on the nfs cookbook
# available at: https://supermarket.chef.io/cookbooks/nfs
# Make sure it is installed; If it is not, try: knife cookbook site install nfs
#
name "nfs"
description "Role applied to all NFS nodes - server and client"
override_attributes(
  "nfs" => {
    "server" => "head",
    "dir" => "/exp-share",
    "export" => {
      "network" => "10.0.0.0/8",
      "writeable" => true
    }
  }
)
run_list [ "emulab-nfs" ]

You can see a number of domain-specific Ruby attributes in this file. name and description are self-describing. The override_attributes block allows you to control high-level configuration parameters, including (but not limited to) the name of the node running the NFS server, the directory being exported and mounted, the network in which this directory is shared, and whether “write” permission is granted on this directory. The run_list attribute includes a single cookbook called emulab-nfs.
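For illustration, here is a minimal sketch (not the actual emulab-nfs code) of how a recipe can consume the values set in override_attributes through the node object:

# Minimal, hypothetical sketch: reading the role's override_attributes
# via the node object (the real emulab-nfs recipes are organized differently).
directory node["nfs"]["dir"] do        # create the directory named in the role
  owner  "root"
  mode   "0777"
  action :create
end

log "nfs-settings" do                  # record the configured values in the Chef log
  message "Exporting #{node['nfs']['dir']} to #{node['nfs']['export']['network']} " \
          "(writeable: #{node['nfs']['export']['writeable']}) from #{node['nfs']['server']}"
  level :info
end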

You are probably wondering how the same cookbook can perform different actions on different nodes. Indeed, the emulab-nfs cookbook has just installed the NFS server on head and the NFS client on node-0. Take a look at default.rb, the default recipe in the emulab-nfs cookbook, which is called when the cookbook is executed. You should find the following code at /chef-repo/cookbooks/emulab-nfs/recipes/default.rb:

#
# Cookbook Name:: emulab-nfs
# Recipe:: default
#

if node["hostname"] == node["nfs"]["server"]
  include_recipe "emulab-nfs::export"
else
  include_recipe "emulab-nfs::mount"
end

The if statement above compares the hostname of the node on which the cookbook is running with one of the attributes specified in the role. Based on this comparison, the recipe takes the appropriate action by calling the code from one of two other recipes in this cookbook. Optionally, you can explore /chef-repo/cookbooks/emulab-nfs/recipes/export.rb and /chef-repo/cookbooks/emulab-nfs/recipes/mount.rb to see how these recipes are implemented.

Obviously, this is not the only possible structure for this configuration. You could alternatively create two roles, such as nfs-server and nfs-client, that call the corresponding recipes directly, without the need for the comparison described above. Since no single model fits all scenarios perfectly, Chef gives the developer enough flexibility to organize the code into structures that match specific environment and application needs.
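As a rough illustration of that alternative (these files are not part of the shipped chef-repo), the two roles might look like the sketch below, each calling one specific recipe; the attribute values used by the recipes would still need to be set, as in nfs.rb above:

# --- roles/nfs-server.rb (hypothetical) ---
name "nfs-server"
description "Export the shared directory from the server node"
# override_attributes(...) with "dir", "export", etc. would still be needed here
run_list [ "recipe[emulab-nfs::export]" ]

# --- roles/nfs-client.rb (hypothetical, a separate file) ---
name "nfs-client"
description "Mount the exported directory on client nodes"
# override_attributes(...) with "server" and "dir" would still be needed here
run_list [ "recipe[emulab-nfs::mount]" ]

You would upload such roles with “knife role from file” and assign them with “knife node run_list add”, just as shown elsewhere in this tutorial.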

14.10 Apache Web Server and ApacheBench Benchmarking Tool

In this exercise, we are going to switch gears and experiment with different software — apache2, the Apache HTTP web server, and the ab benchmarking tool. You will explore more Chef capabilities and perform administrative tasks, primarily from the command line.

We recommend that you continue using the shell tab for head (again, make sure that your commands are executed as root). Run the commands described below in that shell.

  1. Add a role to the head’s run list
    Issue the command in bold (the rest is the expected output):

    knife node run_list add head "role[apache2]"
    head:
      run_list:
        role[nfs]
        role[apache2]

  2. Add two roles to the node-0’s run list
    Run the two knife commands listed below (also on head) in order to assign two roles to node-0:

    knife node run_list add node-0 "role[apache2]"
    node-0:
      run_list:
        role[nfs]
        role[apache2]

    knife node run_list add node-0 "role[apachebench]"
    node-0:
      run_list:
        role[nfs]
        role[apache2]
        role[apachebench]

    Notice that we did not exclude the nfs role from the run lists. Configuration updates in Chef are idempotent, which means that an update can be applied to a node multiple times, and every time it will yield an identical configuration (regardless of the node’s previous state). Thus, this time, when you run “chef-client” again, Chef will do the right thing: the state of each individual component will be inspected, and all NFS-related tasks will be silently skipped because NFS is already installed.

  3. Trigger updates on all nodes
    In the previous section, we updated one node at a time. Try the “batch” update by running a single command that triggers configuration procedures on all nodes at once:

    knife ssh "name:*" chef-client
    apt155.apt.emulab.net resolving cookbooks for run list:
      ["emulab-nfs", "apache2", "apache2::mod_autoindex", "emulab-apachebench"]
    apt154.apt.emulab.net resolving cookbooks for run list:
      ["emulab-nfs", "apache2", "apache2::mod_autoindex"]
    apt155.apt.emulab.net Synchronizing Cookbooks:
    ....
    apt155.apt.emulab.net Chef Client finished, 5/123 resources updated in 04 seconds

    You should see interleaved output from the “chef-client” command running on both nodes at the same time.
    The last command uses a node search based on the “name:*” criterion. As a result, every node will execute “chef-client”. Where necessary, you can use more specific search strings, such as “name:head” and “name:node-*”.

  4. Check that the webservers are running
    One of the attributes in the apache2 role configures the web server such that it runs on port 8080. Let’s check that the web servers are running by checking the status of that port:

    knife ssh "name:*" "netstat -ntpl | grep 8080"
    pc765.emulab.net tcp6 0 0 :::8080 :::* LISTEN 9415/apache2
    pc768.emulab.net tcp6 0 0 :::8080 :::* LISTEN 30248/apache2

    Note that this command sends an arbitrary command (unrelated to Chef) to a group of nodes. With this functionality, the knife command-line utility can be viewed as an orchestration tool for managing groups of nodes, capable of replacing pdsh, the Parallel Distributed Shell utility often used on computing clusters.
    Output similar to the one above indicates that both apache2 web servers are running. The command you have just issued uses netstat, a command-line tool that displays network connections, routing tables, interface statistics, etc. Using knife, you have run netstat in combination with a Linux pipe and a grep command to filter the output and display only the information for port 8080.

  5. Check benchmarking results on node-0

    Many different actions have taken place on node-0, including:
    • apache2 has been installed and configured

    • ab, a benchmarking tool from the apache2-utils package, has been installed and executed

    • benchmarking results have been saved, and plots have been created using the gnuplot utility

    • the plots have been made available using the installed web server

    To see how the last three tasks are performed using Chef, you can take a look at the default.rb recipe inside the emulab-apachebench cookbook. In the in-person tutorial, we will proceed to the following steps and leave the discussion of this specific recipe, and of the abstractions used in Chef recipes in general, to the end of the tutorial.

    Obtain node-0’s public hostname (refer to the list included in the profile instructions), and construct the following URL:

    http://<node-0's public hostname>:8080

    With the right hostname, copy and paste this URL into your browser to access the web server running on node-0 and serving the benchmarking graphs. You should see a directory listing like this:

    screenshots/clab/chef-tutorial/apache-dir-listing-0.png
    You can explore the benchmarking graphs by clicking on the listed links. So far, ab has only run against the local host (i.e., node-0), so the results may not be very interesting. Do not close this browser window, since one of the following steps will ask you to go back to it to see more results.

  6. Modify a role
    Just like the nfs role described in the previous section, apachebench has a number of attributes that define its behavior. We will use the Chef web console to update the value for one of its attributes.
    In the console, click on the Policy tab at the top of the page, choose “Roles” in the panel on the left, and select “apachebench” in the displayed list of roles. Select the Attributes tab in the middle of the page. At the bottom of the page, you should now see a panel called “Override Attributes”. Click the “Edit” button inside that panel:
    screenshots/clab/chef-tutorial/console-apachebench-edit.png
    In the popup window, change the target_host attribute from null to head’s public hostname (refer to the hostnames listed in the profile instructions). Don’t forget to use double quotes around the hostname. Also, let’s modify the list of ports against which we will run the benchmark. Change 80 to 8080, since nothing interesting is running on port 80, while the apache2 server you have just installed is listening on port 8080. Leave port 443 in the list — this is the port on which the Chef web console is running. Here is an example of the recommended changes:
    screenshots/clab/chef-tutorial/console-apachebench-change-attr.png
    When these changes are complete, click "Save Attributes".
    Alternative: You don’t have to use the Chef web console to modify this role. Take a look at the two steps described below to see how to modify it from the shell (a sketch of the edited role follows the steps), or skip ahead to running the benchmark.

    This is what you need to make the described changes without using the web console:
    1. Edit the file /chef-repo/roles/apachebench.rb on head using a text editor of your choice (e.g., vi)

    2. After making the suggested changes, “push” the updated role to the server by typing: knife role from file /chef-repo/roles/apachebench.rb
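    For reference, an edited apachebench.rb might look roughly like the sketch below; the attribute nesting and the "target_ports" name are assumptions made for illustration, and the hostname is a placeholder:

    # Hypothetical sketch of /chef-repo/roles/apachebench.rb after editing;
    # the real role may define additional attributes.
    name "apachebench"
    description "Run ab against the configured target host and ports"
    override_attributes(
      "apachebench" => {
        "target_host"  => "pcNNN.emulab.net",   # placeholder: use head's public hostname
        "target_ports" => [ 8080, 443 ]         # assumed attribute name for the port list
      }
    )
    run_list [ "emulab-apachebench" ]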

  7. Run the benchmark against head
    The updated version of the role is available on the server now. You can run it on node-0 using the following command:

    knife ssh "role:apachebench" chef-client

    This command, unlike the knife commands you issued in the previous steps, uses a search based on the assigned roles. In other words, it will find the nodes to which the apachebench role has been assigned by now (in our case, only node-0) and execute “chef-client” on them.

  8. Check benchmarking results again
    Go back to your browser window and update the page to see the directory listing with new benchmarking results.
    screenshots/clab/chef-tutorial/apache-dir-listing-1.png
    The first two (most recent) graphs represent the results of benchmarking the web services running on head, performed from node-0. Among the interesting facts revealed by these graphs, you will see that the response time is much higher on port 443 than on port 8080.
    screenshots/clab/chef-tutorial/bench-graphs.png

Summary: You have just performed an experiment with apache2 and ab in a very automated manner. The Chef role you have used performed many tasks, including setting up the infrastructure, running a benchmark, saving and processing results, as well as making them easily accessible. The following section will shed some light on how these tasks were accomplished and also how they can be customized.

14.10.1 Understanding the Internals

Below we describe some of the key points about the pieces of infrastructure code you have just used.

The apache2 role is stored at roles/apache2.rb as part of the emulab/chef-repo. Note that the run list specified in this role includes the apache2 cookbook (whose default.rb recipe is used) and also the mod_autoindex recipe from the same cookbook. This cookbook is one of the cookbooks that have been obtained from Chef Supermarket. All administrative work for installing and configuring the web server is performed by the recipes in this cookbook, while the role demonstrates how that cookbook can be “wrapped” into a specification that satisfies specific application needs.
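For orientation, a role with that structure might look roughly like the sketch below; this is not the actual roles/apache2.rb, and the "listen" attribute name is borrowed from the apache2 community cookbook as an assumption:

# Hypothetical sketch of roles/apache2.rb; the real file may use different
# attribute names. Port 8080 matches the value described earlier in this tutorial.
name "apache2"
description "Install and configure the Apache web server on port 8080"
override_attributes(
  "apache" => {
    "listen" => [ "*:8080" ]   # assumed attribute name from the apache2 cookbook
  }
)
run_list [ "recipe[apache2]", "recipe[apache2::mod_autoindex]" ]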

The apachebench role is stored at roles/apachebench.rb. It specifies values for several attributes and adds a custom cookbook, emulab-apachebench, to the run list. We use the “emulab” prefix in the names of the cookbooks developed by the CloudLab staff to emphasize that they are developed for Emulab and derived testbeds such as CloudLab. This also distinguishes them from the Supermarket cookbooks, which are installed in the same directory on head.

Inside the default.rb recipe in emulab-apachebench, you can find many keywords, including package, log, file, execute, template, etc. These are called Chef resources. Resources, which let you define fine-grained configuration tasks and actions, are available for many common administrative needs. You can refer to the list of supported Chef resources and see examples of how they can be used.
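The snippet below gives a few generic examples of such resources; they are illustrative only and are not taken from emulab-apachebench:

package "apache2-utils"            # install the package that provides the ab tool

log "bench-note" do                # write a message to the Chef client log
  message "Starting an ApacheBench run"
  level :info
end

execute "run-ab" do                # run an arbitrary shell command on the node
  command "ab -n 100 -c 10 http://localhost:8080/"
end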

Another item worth mentioning is the “ignore_failure true” attribute used in some of the resources. It allows the recipe to continue execution even when something does not go as expected (a shell command fails, necessary files do not exist, etc.). Also, when relying on specific files in your recipes, you can augment resources with guards like “only_if{::File.exists?(<file name>)}” and “not_if{::File.exists?(<file name>)}” to make your infrastructure code more reliable and repeatable (this relates to the notion of idempotent code mentioned earlier).
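For example, a guarded resource might look like the following sketch (the gnuplot script path here is hypothetical):

execute "plot-results" do
  command "gnuplot /tmp/ab-plot.gp"
  only_if { ::File.exists?("/tmp/ab-plot.gp") }   # run only if the gnuplot script exists
  ignore_failure true                             # keep the Chef run going even if gnuplot fails
end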

14.11 Final Remarks about Chef on CloudLab

In this tutorial, you had a chance to explore the Chef configuration management system and used some of its powerful features for configuration management in a multi-node experiment on CloudLab. Even though the instructions walked you through the process of configuring only two nodes, you can use the demonstrated code artifacts, such as roles, cookbooks, recipes, and resources, and apply them to the infrastructure in larger experiments. With knife commands like the ones shown above, you can “orchestrate” diverse and complex applications and environments.

When establishing such orchestration in your experiments, you can take advantage of the relevant pieces of infrastructure code, e.g., those available through the Chef Supermarket. In cases where you have to develop your own custom code, you may adopt the structure and the abstractions supported by Chef and aim to develop infrastructure code that is modular, flexible, and easy to use.

The ChefCluster profile is available to all users on CloudLab. If you are interested in learning more about Chef and developing custom infrastructure code for your experiments, this profile will spare you the need to set up the necessary components every time and allow you to run a fully functional Chef installation for your specific needs.

14.12 Terminating Your Experiment

Resources that you hold in CloudLab are real, physical machines and are therefore limited and in high demand. When you are done, you should release them for use by other experimenters. Do this via the “Terminate” button on the CloudLab experiment status page.

screenshots/clab/chef-tutorial/terminate.png

Note: When you terminate an experiment, all data on the nodes is lost, so make sure to copy off any data you may need before terminating.

If you were doing a real experiment, you might need to hold onto the nodes for longer than the default expiration time. You would request more time by using the “Extend” button on the status page. You would need to provide a written justification for holding onto your resources for a longer period of time.

14.13 Future Steps

Now that you’ve got a feel for how Chef can help manage experiment resources, there are several things you might try next:

15 Getting Help

The help forum is the main place to ask questions about CloudLab, its use, its default profiles, etc. Users are strongly encouraged to participate in answering other users’ questions. The help forum is handled through Google Groups, which, by default, sends all messages to the group to your email address. You may change your settings to receive only digests, or to turn email off completely.

The forum is searchable, and we recommend doing a search before asking a question.

The URL for joining or searching the forum is: https://groups.google.com/forum/#!forum/cloudlab-users

If you have any questions that you do not think are suitable for a public forum, you may direct them to support@cloudlab.us