ZEMCH 2015 - International Conference Proceedings | Page 246

Finally, the variety of sensor types across the two datasets (D1 is based on five different types of sensors, while D2 relies mainly on motion sensors) makes it possible to contrast the performance of the models when fed with data of different sensor natures.

3.1.1 Dataset 1

Data was collected from three scenarios with different characteristics: House A, House B and House C. D1 consists of data divided into days and collected by means of sensors deployed in the three house scenarios. The sensor types used are passive infra-red (PIR) motion sensors, reed switches, pressure mats, mercury contact sensors and float sensors. House A, for example, has a total of 14 sensors, and its labels were annotated by the occupants using a Bluetooth voice-detection device, unlike the other scenarios, in which the activities were simply written down on paper. Further information can be found at https://sites.google.com/site/tim0306/datasets.

D1 uses two sets of annotations. The first set records sensor events in four columns: the start time, the end time, the sensor ID and the sensor value. Note that, as all the sensors in this dataset are binary, the recorded value is always 1 (ON), since each entry simply marks the interval during which the sensor was active. The second set likewise records a start and end time, but for the activity performed in that interval, together with the ID of that activity. The number of annotated activities is 10 in House A, 13 in House B and 16 in House C. Figure 1 shows a section of this dataset.

Figure 1: Example of annotations for Dataset 1. The first rows are the sensor annotations and the remaining rows are the activity labels.

3.1.2 Dataset 2

This dataset was collected from 27 motion sensors deployed in a domestic scenario over 56 days. However, many sensor events were not associated with any activity and consequently cannot be used to train and test our models.
In spite of this, the data available for supervised learning is still of sufficient quality for evaluating model performance. Unlike D1, this dataset contains only motion sensor events. Nonetheless, as discussed in later sections, this motion sensor data proves to have the potential to provide enough information to perform activity recognition accurately under certain considerations. This dataset has only one set of annotations (covering both sensor readings and labels), with seven columns stating the time, the sensor ID and the sensor value as in D1, which in this case can be either ‘ON’ or ‘OFF’. In addition to the sensor event, the same line may also indicate when an activity starts (denoted by the activity label followed by ‘begin’) or ends (denoted by the label followed by ‘end’). This dataset has a total of 10 activities. For further information visit: http://ailab.wsu.edu/casas/datasets/. An example line would thus take the form: [time] [sensor ID] [ON/OFF] ([activity label] begin/end).
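To make the D2 annotation scheme concrete, the following sketch parses lines of the kind described above and attaches the currently active activity label to each sensor event. The exact field layout (date and time as the first two tokens, then sensor ID, value, and an optional activity label with a ‘begin’/‘end’ marker) is an assumption based on the description in the text, not the dataset's official specification, and the sample lines are illustrative only.

```python
# Hypothetical sketch of parsing D2-style annotation lines.
# Field layout is assumed from the textual description:
#   [date] [time] [sensor ID] [ON/OFF] ([activity label] [begin|end])

def parse_lines(lines):
    """Yield (timestamp, sensor_id, value, activity, marker) per line."""
    for line in lines:
        parts = line.split()
        timestamp = " ".join(parts[0:2])          # date + time
        sensor_id, value = parts[2], parts[3]     # e.g. 'M001', 'ON'
        activity = marker = None
        if len(parts) >= 6:                       # optional activity annotation
            activity, marker = parts[4], parts[5]  # e.g. 'Sleeping', 'begin'
        yield timestamp, sensor_id, value, activity, marker

def label_events(events):
    """Attach the activity active at the time of each sensor event."""
    current = None
    labelled = []
    for ts, sid, val, act, marker in events:
        if marker == "begin":                     # activity interval opens
            current = act
        labelled.append((ts, sid, val, current))
        if marker == "end":                       # activity interval closes
            current = None
    return labelled

# Illustrative (fabricated) sample in the assumed format:
sample = [
    "2010-01-01 08:00:00.000 M001 ON Sleeping begin",
    "2010-01-01 08:05:00.000 M002 OFF",
    "2010-01-01 08:10:00.000 M001 OFF Sleeping end",
]
labelled = label_events(parse_lines(sample))
```

Events falling outside any begin/end interval keep a `None` label, which corresponds to the unannotated sensor events mentioned above that cannot be used for supervised training.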