Data Collection ToC
1. Overview
2. Create a new dataset
2.1 Copy an existing Dataset
2.2 Content
2.3 Messages
3. Upload Data
3.1. Structured Data
3.1.1. Select File
3.1.2. Get File Information
3.2. Specify Dataset
3.3. Define Primary Key
3.4. Validation
3.5. Summary
3.6. Unstructured Data
4. Push big files to server
5. Import metadata structure
5.1. Select File
5.2. Read Source
5.3. Set Parameters
5.4. Summary
The Data Collection Module provides tools to create new datasets, enter metadata, upload data to the system, and import metadata structures (i.e. schemas). Three workflows are available under the Collect tab and one under Setup:
· Create a dataset (Collect)
· Upload data to a dataset (Collect)
· Push big files to server (Collect)
· Import metadata structure (Setup)
This wizard will assist you in creating a new dataset in BExIS++. The wizard is very flexible and is built up differently depending on the selected metadata structure. Therefore, we describe only the basic functions here.
The first step is to generate either an empty dataset or a copy of an existing dataset, based on your selection of the two mandatory elements: data structure and metadata structure.
The next stage is determined by the selected metadata structure.
Use the Select button to choose a dataset to copy. For the selected dataset you can choose a data structure, but there is only one related metadata structure.
You can keep the predefined content or change the fields as needed.
Remove an attribute.
Change the order of the attributes.
When an input is faulty, the input field is highlighted in red. If you hover the mouse over the field, you get information about what is wrong.
To upload your data, please go to Collect > Upload Data in the main menu. This wizard will assist you in uploading data into the BExIS++ repository. A dataset can be structured or unstructured.
The term "Tabular data" is used for all datasets where the internal structure of the data is "known" to the system. For example, in a data table the header, which defines the columns (i.e. variables), is the structure of the data. Before uploading/importing data to the system, the data structure needs to be created with the Data Structure Manager of the Data Planning module.
In the first step, an existing file containing your data needs to be selected. You can either select a file from your local computer or a file that has been uploaded to the server prior to starting the Upload Wizard. The second option is designed for files larger than 4 MB, which may take several minutes to transfer. The wizard supports Microsoft Excel (*.xlsm) and ASCII (*.txt, *.csv) files. Microsoft Excel files are required to use a template created with the Data Planning Module (Plan) of BEXIS 2 (refer to the Data Planning User Guide for more details). Once a file has been successfully selected, click the Next button to proceed to the next step.
For all Microsoft Excel files using a BExIS++ template, the file information and data structure are automatically extracted and this step is omitted. Please refer to the Data Planning User Guide for more details on how to create such a template.
For all ASCII files users need to provide information on the file structure and formatting.
First, please choose the separator that is used to separate data values from each other in your ASCII file.
Depending on the language, a different character is used as the decimal separator. Please choose the one present in your ASCII file.
Next please specify whether the orientation of your data is column-wise or row-wise (see figure below).
Data sets may contain empty rows or columns on top or to the left before the header and the actual data values start. Please specify this offset in number of columns or rows.
Further, your data file may contain a header defining variable names, types etc. The row/column where this header starts needs to be specified (see figure below).
Finally, the row/column where the actual data values start needs to be specified.
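The file settings described above can be pictured with a small Python sketch. It is not the BExIS++ implementation; the function names, parameter names, and the way the offset is applied are assumptions made only for illustration. It shows how a separator, a decimal character, an orientation, an offset, and the header and data start positions together determine how an ASCII file is read.

```python
import csv
import io


def to_number(cell, decimal="."):
    """Return the cell as a float if it parses as a number, else as stripped text."""
    text = cell.strip()
    try:
        return float(text.replace(decimal, "."))
    except ValueError:
        return text


def read_ascii_table(text, separator=",", decimal=".", orientation="column-wise",
                     offset=0, header_row=1, data_row=2):
    """Parse delimited text into (header, records).

    header_row and data_row are 1-based positions, as a user would enter them.
    All parameter names are hypothetical and only mirror the settings in the text.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=separator))
    if orientation == "row-wise":
        # Variables run along rows, so transpose into the column-wise layout.
        rows = [list(cells) for cells in zip(*rows)]
    # Assumption: the offset is the number of leading empty cells to skip.
    rows = [row[offset:] for row in rows]
    header = [name.strip() for name in rows[header_row - 1]]
    records = [[to_number(cell, decimal) for cell in row]
               for row in rows[data_row - 1:] if row]
    return header, records


# Semicolon-separated values with a comma as decimal character.
sample = "id;height\n1;2,5\n2;3,75\n"
header, records = read_ascii_table(sample, separator=";", decimal=",")
print(header, records)  # ['id', 'height'] [[1.0, 2.5], [2.0, 3.75]]
```

Choosing the wrong separator or decimal character in such a reader would leave numeric values as text, which is why these settings need to match the file exactly.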
In BExIS++ your data is stored and managed as part of a dataset. A dataset may contain one or more of your data files, but all data files within one dataset must have the same data structure, i.e. the number of variables and their properties must be identical in each file. To upload your data to the system, please select one of the existing datasets from the dropdown list.
When adding data to an existing dataset, you need to specify a unique identifier (e.g. primary key) for your dataset. If your dataset already contains a variable with such a key, please select it. Otherwise, a primary key can be created by combining available variables. Please click the Check button to verify whether the selected combination is unique. If you go back and change something in the process of uploading, you need to check the primary key again.
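What it means for a combination of variables to be unique can be illustrated with a short Python sketch. This is not the code behind the Check button; the function name and the sample records are made up for the example.

```python
from collections import Counter


def find_duplicate_keys(records, key_columns):
    """Return the key tuples that occur more than once for the chosen columns."""
    counts = Counter(tuple(record[column] for column in key_columns)
                     for record in records)
    return [key for key, count in counts.items() if count > 1]


# Hypothetical sample data: neither "plot" nor "date" is unique on its own.
records = [
    {"plot": "A1", "date": "2014-05-01", "height": 2.5},
    {"plot": "A1", "date": "2014-05-02", "height": 2.7},
    {"plot": "B2", "date": "2014-05-01", "height": 3.1},
]

print(find_duplicate_keys(records, ["plot"]))          # [('A1',)]
print(find_duplicate_keys(records, ["plot", "date"]))  # []
```

An empty result means the selected combination identifies every row uniquely and could serve as a primary key.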
In this step, the selected data file is validated against the selected data structure. Both the structure of the data (e.g. variable properties) and whether the data values fit the specified structure (e.g. data type, value range) are evaluated.
Click the Validate button to validate the data file.
If you go back and change something in the process of uploading, you need to validate the file again.
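The two kinds of checks mentioned above (structure and values) can be sketched in Python as follows. The `Variable` class, the `validate` function, and the error messages are hypothetical and only illustrate the idea; the actual validation rules in BExIS++ may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variable:
    """One variable of a data structure (illustrative, not the BExIS++ model)."""
    name: str
    dtype: type
    minimum: Optional[float] = None  # range limits are meant for numeric variables
    maximum: Optional[float] = None


def validate(header, rows, structure):
    """Return a list of error messages; an empty list means the file is valid."""
    expected = [variable.name for variable in structure]
    if list(header) != expected:
        # Structural check: the variables must match the data structure.
        return [f"header {list(header)} does not match expected {expected}"]
    errors = []
    for row_no, row in enumerate(rows, start=1):
        for variable, cell in zip(structure, row):
            try:
                value = variable.dtype(cell)  # data type check
            except (TypeError, ValueError):
                errors.append(f"row {row_no}, {variable.name}: "
                              f"{cell!r} is not a valid {variable.dtype.__name__}")
                continue
            if variable.minimum is not None and value < variable.minimum:
                errors.append(f"row {row_no}, {variable.name}: "
                              f"{value} is below the minimum {variable.minimum}")
            if variable.maximum is not None and value > variable.maximum:
                errors.append(f"row {row_no}, {variable.name}: "
                              f"{value} is above the maximum {variable.maximum}")
    return errors


STRUCTURE = [Variable("id", int), Variable("height", float, minimum=0, maximum=100)]
for message in validate(["id", "height"], [["1", "2.5"], ["x", "150"]], STRUCTURE):
    print(message)
```

Because the result depends on both the file and the data structure, any change to either one makes an earlier validation result obsolete.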
With this final step a summary of your uploaded data file is provided. Please check the information and click the Finish button to confirm and finalize the upload.
An unstructured data file can either be selected from your local computer or be a file that has already been uploaded to the server. In the case of unstructured data, the system does not read the contents of the files. The files are copied to the server and linked to the dataset.
The BEXIS 2 application supports many file formats, such as (*.avi) (*.bmp) (*.csv) (*.doc) (*.docx) (*.gif) (*.jpg) (*.mp3) (*.mp4) (*.pdf) (*.png) (*.shp) (*.tif) (*.txt) (*.xls) (*.xlsm) (*.xsd) (*.zip).
The maximum accepted file size is currently 1024 MB.
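A file can be checked against these limits before uploading, as in the following Python sketch. It treats the formats listed above as a fixed allow-list and assumes 1 MB = 1024 × 1024 bytes; both are assumptions, and the function name is hypothetical.

```python
from pathlib import Path

# Assumption: the formats named in the text are used as a complete allow-list.
ALLOWED_EXTENSIONS = {
    ".avi", ".bmp", ".csv", ".doc", ".docx", ".gif", ".jpg", ".mp3", ".mp4",
    ".pdf", ".png", ".shp", ".tif", ".txt", ".xls", ".xlsm", ".xsd", ".zip",
}
MAX_SIZE_BYTES = 1024 * 1024 * 1024  # 1024 MB, assuming 1 MB = 1024 * 1024 bytes


def check_upload(filename, size_bytes):
    """Return a list of problems that would likely prevent the upload."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"file format {extension or '(none)'} is not in the list")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 1024 MB")
    return problems


print(check_upload("plots.SHP", 5_000_000))  # []
print(check_upload("model.exe", 5_000_000))
```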
Each user has a personal folder on the server where files are stored temporarily. On this page you can see the uploaded files. You can delete a file by clicking on the X, or use the files later when you want to upload data to a dataset.
Metadata structures (also called schemas or profiles) are typically created and imported by a data manager or administrator of the system. Thus, this import function is available under Setup > Import Metadata Structure. The wizard will assist you in importing your metadata structure into BExIS++. A metadata structure must be defined in an XSD schema file.
When importing a metadata schema into BExIS++, each element of the XSD file(s) is analyzed for its type, name, annotations, attributes, data types, constraints etc. Based on this information, a form is created automatically. For example, if an element is of data type Date, a date picker UI component will be used in the form. All names and descriptions are also used exactly as they are in the XSD file(s).
NOTE: There are metadata standards available for almost any domain or type of data. It is good practice to follow one of them in order to ensure interoperability later on. However, although it is technically possible, most standards are very complex and should not be used as a whole. Users would be overwhelmed and may need only a small selection of elements to describe their data. Standards are designed to cover a wide range of use cases, and data managers (in collaboration with their community) should make the effort to define a feasible set of metadata elements in a profile (XSD file).
IMPORTANT: Please check whether the XSD schema files have any dependencies on other files. You can find the dependencies in the import or include tags.
The current BExIS++ system requires all referenced files to be locally available on the server (no URL to an external resource). So you may need to store all referenced files in a local folder first, change the schema location path in every file (e.g. ./fileName.xsd), and then upload all files to the server (see 4. Push big files to server).
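To find the dependencies of a schema file without reading it by hand, a short Python script can list the import and include tags and their schemaLocation values. This is a minimal sketch using only the Python standard library; the function names and the sample schema (including the example.org URL and file names) are made up for illustration.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"  # XML Schema namespace


def list_schema_dependencies(xsd_text):
    """Return (tag, schemaLocation) pairs for all import and include tags."""
    root = ET.fromstring(xsd_text)
    dependencies = []
    for tag in ("import", "include"):
        for node in root.iter(XS + tag):
            location = node.get("schemaLocation")
            if location:
                dependencies.append((tag, location))
    return dependencies


def is_remote(location):
    """True if the location is a URL, i.e. it must be replaced by a local path."""
    return location.lower().startswith(("http://", "https://"))


# Hypothetical schema with one remote and one local dependency.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:import namespace="http://example.org/types"
             schemaLocation="http://example.org/types.xsd"/>
  <xs:include schemaLocation="./common.xsd"/>
</xs:schema>"""

for tag, location in list_schema_dependencies(SAMPLE_XSD):
    note = "download and change to a local path" if is_remote(location) else "ok"
    print(f"{tag}: {location} ({note})")
```

Every location reported as remote would need to be downloaded and its path changed before the files are uploaded together.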
In the first step, an existing XSD file containing your metadata structure needs to be selected. You can either select an XSD file from your local computer or a file that has been uploaded to the server prior to starting the wizard. You may use the "Push big files to server" function in the Collect menu to upload multiple related XSD files.
Note: Please upload a valid XSD file. BExIS++ does not check whether the XSD is valid.
Please specify a name (i.e. display name) for the new metadata structure. You may also enter a root node if only a part of the XSD is to be used (optional).
To find the root node, open the XSD schema file and look at the element tags. In the example of ABCD it looks like this.
If no root node is specified, the wizard automatically selects the first element that is a complex type. It is also possible to define the element "DataSet" as the root node, in which case the metadata structure starts from this element. The name of a metadata structure must be unique, and the root node must exist.
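The default rule (first element that is a complex type) can be approximated with the following Python sketch. It only looks at top-level elements and is an interpretation of the rule, not the wizard's actual code; the function name and the sample schema are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"  # XML Schema namespace


def first_complex_element(xsd_text):
    """Return the name of the first top-level element with a complex type, or None."""
    root = ET.fromstring(xsd_text)
    # Names of complex types that are declared at the top level of the schema.
    named_complex = {node.get("name") for node in root.findall(XS + "complexType")}
    for element in root.findall(XS + "element"):
        has_inline_type = element.find(XS + "complexType") is not None
        type_name = (element.get("type") or "").split(":")[-1]
        if has_inline_type or type_name in named_complex:
            return element.get("name")
    return None


# Hypothetical schema: "Title" is a simple element, "DataSet" is complex.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Title" type="xs:string"/>
  <xs:element name="DataSet">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Description" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

print(first_complex_element(SAMPLE_XSD))  # DataSet
```

If the element found this way is not the one you want the form to start from, enter the desired element name as the root node explicitly.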
For the system to handle a dataset, at least a title and a description are needed. In this step these two elements, which are typically available in all metadata structures, should be identified and made explicit to the system.
The Summary page provides an overview of the created metadata structure.