Python tools for working with environment files and spatial data in Avida

emilydolson/avida-spatial-tools

Avida Spatial Tools

Python tools for working with environment files and spatial data generated by Avida

Dependencies:

Tutorial:

The visualization functions in Avida Spatial Tools are designed to be mixed and matched to achieve the analysis you want. To make this easier, data processing is done in four steps: reading the data in, transforming the data, aggregating the data (occasionally it is appropriate to reverse the order of these two), and visualizing the data. At each step, there are a variety of options:

  • Parse the data: To start out, you probably have an environment file and some spatial data files recording things that happened in that environment. Basic functions are provided for pulling this data into your script: one for parsing environment files and one for parsing spatial data files. Parsing an environment file with parse_environment_file("environment.cfg") will return an EnvironmentFile object, which is basically a 2D array of sets indicating which resources are where, plus some handy metadata. To parse multiple environment files, you can use parse_environment_file_list(["environment1.cfg", "environment2.cfg"]), which will return a list of EnvironmentFile objects. To parse any number of spatial data files, you can use load_grid_data(["grid_task.1.dat", "grid_task.2.dat"]). This will return a 3D array representing an Avida grid, with a list at each cell containing all of the values that were in that cell across the files that you loaded in. By default, load_grid_data assumes your spatial data is bitstrings encoded as decimal numbers, as it is in grid_task.*.dat files. To load data of a different type, pass the desired type ("int", "string", or "float") as the second argument to load_grid_data. Note: Loading a non-default type will throw off some of the default color settings.

    Data parsing functions:

    • One environment file: parse_environment_file()
    • Multiple environment files: parse_environment_file_list()
    • One or more grid_task files: load_grid_data()
  • Transform the data (optional): Now that you've got your data loaded in, you might want to put it into a different form. For instance, bitstrings representing phenotypes aren't that easy to do math on. If you need to aggregate across a variety of environments, it might be useful to convert phenotypes to integers representing the number of tasks they do with make_count_grid. Alternatively, you can use make_optimal_phenotype_grid to generate a grid of numbers representing the deviation between the phenotype in each location and the resources present in that cell. Or maybe you want to plot the distribution of phenotypes, but you have too many different phenotypes to do that. In this case, assign_ranks_by_cluster can be used to group similar phenotypes together.

    Data transformation functions:

    • Convert values to counts of tasks performed or resources available: make_count_grid()
    • Convert phenotype values to numbers representing deviations from the optimal phenotype: make_optimal_phenotype_grid()
    • Convert values to ranks indicating the complexity of a phenotype or resource set relative to others in the environment: assign_ranks_by_cluster()
    • Convert values to lists representing the percentage of organisms in that cell doing each task: task_percentages()
  • Aggregate the data: Unless you're making an animation, all of the visualization functions want a 2D array. But load_grid_data gives you a 3D array. To collapse your 3D array to a 2D array, you can use the agg_grid function. It takes a grid and another function as arguments and applies that function to every cell of the grid. Common aggregation functions to pass include mean, mode, and median. If you only loaded one data file, you don't need to worry too much about this: the default aggregation is mode, which will just return the one item in your list. If you're actually aggregating across multiple files, then it's more important to think about what form of averaging is appropriate.

    Data aggregation functions:

    • agg_grid()
  • Visualize the data: At last! You can turn your data into a pretty picture! There are two ways in which colors can be assigned to plot elements: 1) If your data grid contains numbers, you can make a heat map based on their values, 2) If your data grid contains bitstrings, you can mix colors associated with each index together.

    Data visualization functions:

    • Color grid
    • Color grid by hue mixing
    • Overlay circles representing phenotypes on colored grid representing environment
      • Represent phenotypes as circles with single color
      • Represent phenotypes as concentric circles representing tasks that individual can do
    • Make a movie showing overlaid phenotypes changing over time
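
The transform and aggregate steps above can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation: the helpers count_tasks, mode, and aggregate_cells and the toy 2x2 grid are hypothetical, and the values are assumed to be bitstrings encoded as decimal numbers, as described for grid_task.*.dat files.

```python
from collections import Counter


def count_tasks(value):
    """Count the 1 bits in a decimal-encoded bitstring (hypothetical helper)."""
    return bin(value).count("1")


def mode(values):
    """Return the most common value; ties go to the value seen first."""
    return Counter(values).most_common(1)[0][0]


def aggregate_cells(grid, func):
    """Apply func to the list in every cell, collapsing a 3D grid to 2D."""
    return [[func(cell) for cell in row] for row in grid]


# Toy 2x2 grid: each cell lists the value seen in that cell in three files.
grid_3d = [
    [[5, 5, 1], [0, 0, 0]],
    [[7, 3, 7], [2, 2, 6]],
]

# Transform: replace each phenotype with the number of tasks it performs.
counts_3d = [[[count_tasks(v) for v in cell] for cell in row] for row in grid_3d]

# Aggregate: collapse each cell's list to a single number.
grid_2d = aggregate_cells(counts_3d, mode)
print(grid_2d)
```

Swapping mode for a mean or median function changes how disagreement between files is summarized, which is why the choice of aggregation matters once more than one file is loaded.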

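Coloring by hue mixing can be pictured with a similar sketch. The approach below (evenly spaced hues, averaged in RGB) is an assumption made for illustration and may differ from how the library actually mixes colors; task_colors and mix_color are hypothetical names, and the cells are written as bitstring text for readability.

```python
import colorsys


def task_colors(n_tasks):
    """One RGB color per bit index, with hues evenly spaced (an assumption)."""
    return [colorsys.hsv_to_rgb(i / n_tasks, 1.0, 1.0) for i in range(n_tasks)]


def mix_color(bitstring, colors, empty=(0.0, 0.0, 0.0)):
    """Average the colors of every index whose bit is '1'."""
    active = [colors[i] for i, bit in enumerate(bitstring) if bit == "1"]
    if not active:
        return empty  # nothing set in this cell, so fall back to a fixed color
    return tuple(sum(channel) / len(active) for channel in zip(*active))


colors = task_colors(3)  # hues at 0, 1/3, and 2/3 of the color wheel
grid = [["100", "110"], ["000", "111"]]

# Build a 2D grid of RGB tuples, one per cell.
color_grid = [[mix_color(cell, colors) for cell in row] for row in grid]
```

A grid of RGB tuples like color_grid is the kind of input most image-plotting routines accept, while a purely numeric grid is better suited to a heat map.
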
Building from source:

If you want to use the bleeding edge development version from github, rather than the last stable release from PyPI, you can install it with the following commands:

git clone git@github.com:emilydolson/avida-spatial-tools.git
cd avida-spatial-tools
python setup.py build
python setup.py install

You will need to re-execute the last two commands after making or pulling changes to the repository.

Development:

If you have requests for new features, submit an issue or e-mail me - I'm happy to add things! Or, if you feel so inclined, feel free to implement them yourself and send me a pull request. I've tried to keep things pretty modular, so it shouldn't be too hard. I use py.test for testing because it was the easiest testing framework to get working with image regression tests.

Development dependencies:
