Cloudstack Simulation Generator

A working prototype intended to help generate scenarios that can be run against the Apache CloudStack management server, specifically when running in simulator mode.

View the Project on GitHub chipchilders/cloudstack-sim-gen

cloudstack-sim-gen

Fair warning - these scripts and files are likely filled with bugs and are likely to need editing to remove things like hardcoded paths that are specific to my environment. ;-)

The Process

First, the cloudstack-sim-gen repo has a set of files used to construct, generate and run the test scenarios. Here's a walkthrough of the files (listed in the order of their involvement in the process):

Digging into the Scenario Definition

The example scenario that I'm using is relatively small, and the features available are still limited. However, it has served as a great way to debug the overall process.

Let's look at the inputsample file's contents section by section:

The first section includes the high-level rules for the scenario: the number of simulated "days" in the scenario, a virtual machine growth (per day) equation, and two dispersion weighting arrays that help the generator select the appropriate offerings and accounts to use when creating the VMs.

    "number_of_days": 30,
    "vm_growth": "10+(x*0.2)+pow(x*0.2,0.5)",
    "offering_dispersion": [7,3],
    "account_dispersion": [5,5],

The scenario model uses "days" as logical units of time, within which VMs will be created. The vm_growth equation is evaluated for each day from 1 to the maximum number of days specified. Any valid Python mathematical expression should work here, as long as "x" (the day number) is the only variable used. If you wanted a simple N VMs per day, you would just put that number in the field. In the scenario I'm testing with, I used the equation above to get a gradual increase from a starting point of 11.

Hopefully the use of Python eval() in that setting will provide plenty of flexibility in how growth can be modeled.
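To make the evaluation concrete, here's a minimal sketch of how the generator might evaluate the vm_growth expression for each day. The function name and rounding behavior are illustrative assumptions, not the actual cloudstack-sim-gen code:

```python
import math

# Hypothetical sketch: evaluate the vm_growth expression once per day.
# "x" (the day number) is the only variable the expression may use;
# pow() and the math module are exposed to eval() for convenience.
def vms_per_day(growth_expr, number_of_days):
    counts = []
    for x in range(1, number_of_days + 1):
        value = eval(growth_expr, {"pow": pow, "math": math, "x": x})
        counts.append(int(round(value)))
    return counts

# Day 1 of the sample scenario: 10 + 0.2 + pow(0.2, 0.5) ≈ 10.65 → 11 VMs
print(vms_per_day("10+(x*0.2)+pow(x*0.2,0.5)", 3))  # → [11, 11, 11]
```

A constant expression like "5" also works, yielding a flat five VMs per day.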

The dispersion settings are passed into a weighted dispersion method of the generator as each VM is created. The offering dispersion weights the selection from among the defined compute service offerings, while the account dispersion does the same for the defined accounts. Be sure that each dispersion list has the same number of elements as the corresponding definitions.
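The weighted selection idea can be sketched as follows. This is an assumption about how such a method could work, not the generator's actual implementation:

```python
import random

# Illustrative weighted selection over a dispersion array. With
# offering_dispersion = [7, 3], index 0 is chosen ~70% of the time.
def weighted_choice(dispersion, rng=random):
    total = sum(dispersion)
    pick = rng.uniform(0, total)
    running = 0
    for index, weight in enumerate(dispersion):
        running += weight
        if pick <= running:
            return index
    return len(dispersion) - 1

offerings = ["1", "2"]  # names from the service_offerings definitions
chosen = offerings[weighted_choice([7, 3])]
```

Each created VM would call this once against the offering dispersion and once against the account dispersion.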

To me, one of the more interesting aspects of modeling scenarios is including capacity growth for the cloud itself. The capacity_increase_rules dictionary is where you describe the rules to follow during the scenario run.

    "capacity_increase_rules": {
        "target_resource": "memory",
        "threashold": 85,
        "cluster_size": 8
    },

Three attributes are available, although only two are used right now. target_resource is reserved for future use (not yet implemented). threashold is the percentage of the target resource (currently memory only) at the zone level that triggers the addition of a new cluster to the simulator's environment. cluster_size defines the number of hosts added within each new cluster.
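A rough sketch of how these rules could be applied during a run is shown below. The field names come straight from the scenario file (including the "threashold" spelling), but the function and the add_cluster callback are hypothetical:

```python
# Hypothetical check run during the scenario: if zone memory usage
# crosses the threshold, add a cluster of cluster_size hosts.
def check_capacity(rules, zone_memory_used_pct, add_cluster):
    # Only memory is implemented today; target_resource is reserved.
    if rules["target_resource"] != "memory":
        return False
    if zone_memory_used_pct >= rules["threashold"]:
        add_cluster(hosts=rules["cluster_size"])
        return True
    return False

rules = {"target_resource": "memory", "threashold": 85, "cluster_size": 8}
```

With the sample rules above, a zone at 90% memory usage would trigger one new 8-host cluster.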

That leads to the definitions of accounts. Note that only the username field needs to be unique:

    "accounts": [
    {
            "email": "test@test.com",
            "firstname": "Test",
            "lastname": "User",
            "username": "1",
            "password": "password"
        },
    {
            "email": "test@test.com",
            "firstname": "Test",
            "lastname": "User",
            "username": "2",
            "password": "password"
        }
    ],

Similar to the account definitions, the service offerings are defined with:

    "service_offerings": [
        {
            "name": "1",
            "displaytext": "FirstFitPlanner Small",
            "cpunumber": 1,
            "cpuspeed": 1000,
            "memory": 1024,
            "deploymentplanner": "FirstFitPlanner"
        },
        {
            "name": "2",
            "displaytext": "FirstFitPlanner Large",
            "cpunumber": 1,
            "cpuspeed": 2000,
            "memory": 2048,
            "deploymentplanner": "FirstFitPlanner"
        }
    ]

The service offerings provide a bit more flexibility. Name must be unique, but the value of defining multiple offerings is the ability to vary the CPU, RAM and deployment planner for each.

Running the Scenario

To run the scenario, you'll need to move the test_scenario.py file into your cloudstack/test/integration/smoke folder. You'll also have to edit that script to point to the location of the generated scenario JSON file and to name an appropriate output file for the collected statistics.

  1. Simulator Configuration: Copy the contents of this gist into a file named advanced-32host.cfg within the setup/dev folder.
  2. Compile and Configure: run the following: mvn -Pdeveloper -Dsimulator clean install; mvn -Pdeveloper -pl developer -Ddeploydb; mvn -Pdeveloper -pl developer -Ddeploydb-simulator; mvn -pl client jetty:run
  3. Setup Zone: Once the mgmt server is running, in a new terminal run the following: mvn -Pdeveloper,marvin.setup -Dmarvin.config=setup/dev/advanced-32host.cfg -pl :cloud-marvin integration-test
  4. Restart Mgmt Server: In order for the new global settings to take effect, press CTRL-C in the mgmt server's terminal, wait for it to stop, and then run the following to restart it: mvn -pl client jetty:run
  5. Run the Scenario: In another terminal window, run: nosetests --with-marvin --marvin-config=setup/dev/advanced-32host.cfg --load test/integration/smoke/test_scenario.py

Collected Data

Once the scenario is complete, you should have the specified statistics output file filled with useful data. The format of the output file is JSON, and is structured as follows:

The top-level object is "datapoints", an array of dictionaries containing capacity data collected after each new VM is created.

It's important to note that some of the collected data isn't accurately shown when running against the simulator. In particular, the CPU host numbers are empty right now and the "used" values will always be 0 (because we are simulating!). Since memory is, in my experience, typically the major capacity concern for hosts, I've ignored CPU so far and focused on memory for all aspects of the modeling / data collection process. As the collected data improves, I'll include better scenario planning around more and more capacity elements.