SDG: Search-based Synthetic Data Generator

Many testing activities, such as usage-based statistical testing, require synthetic test data that can be used to build confidence in the reliability of the system under test. Generating such data is not trivial, because the underlying data schemas are usually large, complex, and subject to numerous domain-related logical constraints. The goal of the SDG tool is to generate such data automatically.

System Requirements

  • Eclipse IDE (Mars or higher) [link].

  • Java Development Kit (JDK) 1.8.0 (or higher) [link].

  • All other required third-party libraries are included in the installation package.

  • We also recommend using the Papyrus modeling environment for building and managing models [link].


Demonstration Material

  • Profile for expressing the statistical characteristics of the test data [link].

  • Example of a domain model annotated with statistical information (TaxCard) [link].

  • OCL constraints expressing the logical validity of the data [link].

  • Example of a valid and representative test data sample generated using SDG [link].


Installation Material for SDG

  • The SDG tool can be found [here].

  • Installation and usage instructions can be found [here].


Relevant Publications

  • G. Soltana, M. Sabetzadeh, and L. C. Briand, "Synthetic Data Generation for Statistical Testing", in Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE 2017), Illinois, USA, October 30 - November 3, 2017.


Contact Information

Ghanem Soltana
Interdisciplinary Centre for Security, Reliability and Trust
29, Avenue John Fitzgerald Kennedy
L-1855, Luxembourg
E-mail: ghanem(dot)soltana(at)uni(dot)lu