ZEMCH 2015 - International Conference Proceedings | Page 246
Finally, the variety of sensor types across the two datasets (D1 is based on five different types of sensors, while D2 is based almost entirely on motion sensors) makes it possible to contrast the performance of the models when fed with data from different kinds of sensors.
3.1.1 Dataset 1
Data was collected from three scenarios with different characteristics: House A, House B and House C. D1 consists of data divided into days and collected by means of sensors deployed in the three houses. The types of sensors used are passive infra-red (PIR) motion sensors, reed switches, pressure mats, mercury contact sensors and float sensors. For example, House A has a total of 14 sensors, and its labels were annotated by the occupants using a Bluetooth voice detection device, unlike the other scenarios, in which the activities were simply written down on paper. Further information can be found at https://sites.google.com/site/tim0306/datasets.
D1 uses two sets of annotations. The first set refers to sensor events, recorded using four columns stating the start time, the end time, the sensor ID and the sensor value. Note that, as all the sensors included in this dataset are binary, the value is always 1 (ON), since each row only reflects the interval during which the sensor was active. The second set also indicates a start and an end time, but for the activity performed in that interval, together with the ID of the activity being performed. The number of activities is 10 in House A, 13 in House B and 16 in House C. Figure 1 contains a section of this dataset.
Figure 1: Example of annotation for Dataset 1. The first rows are the sensor annotations and the remaining rows are the activity labels.
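The pairing of the two D1 annotation sets can be sketched as follows. This is a minimal illustration, not the exact file format: the tab-separated layout, timestamp format and example values below are assumptions, and each sensor event is simply labelled with the activity whose interval contains its start time.

```python
from datetime import datetime

# Hypothetical rows mimicking the two D1 annotation sets
# (layout and values are illustrative assumptions).
SENSOR_ROWS = [
    "25-Feb-2008 00:20:14\t25-Feb-2008 00:22:57\t24\t1",
    "25-Feb-2008 09:33:41\t25-Feb-2008 09:33:42\t5\t1",
]
ACTIVITY_ROWS = [
    "25-Feb-2008 00:20:00\t25-Feb-2008 00:25:00\t10",
]

FMT = "%d-%b-%Y %H:%M:%S"

def parse_sensor(row):
    # start time, end time, sensor ID, binary value (always 1/ON)
    start, end, sid, value = row.split("\t")
    return (datetime.strptime(start, FMT), datetime.strptime(end, FMT),
            int(sid), int(value))

def parse_activity(row):
    # start time, end time, activity ID
    start, end, aid = row.split("\t")
    return (datetime.strptime(start, FMT), datetime.strptime(end, FMT),
            int(aid))

def label_events(sensor_rows, activity_rows):
    """Attach to each sensor event the ID of the activity whose
    interval contains the event's start time (None if no match)."""
    activities = [parse_activity(r) for r in activity_rows]
    labelled = []
    for start, end, sid, value in map(parse_sensor, sensor_rows):
        label = next((aid for (a_start, a_end, aid) in activities
                      if a_start <= start <= a_end), None)
        labelled.append((start, end, sid, value, label))
    return labelled

events = label_events(SENSOR_ROWS, ACTIVITY_ROWS)
```

With these example rows, the first sensor event falls inside the single activity interval and receives its ID, while the second receives no label.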
3.1.2 Dataset 2
This dataset was collected from 27 motion sensors deployed in a domestic scenario over 56 days. However, many sensor events were not associated with any activity and therefore cannot be used to train and test our models. In spite of this, the data available for supervised learning is still of sufficient quality for model performance evaluation. Unlike D1, this dataset contains only motion sensor events. Nonetheless, as discussed in later sections, this motion sensor data proves to have the potential to provide enough information to perform activity recognition accurately under certain considerations.
This dataset has only one set of annotations (including sensor readings and labels), which contains 7 columns stating the time, the sensor ID and the sensor value as in D1, although in this case the value can be either ‘ON’ or ‘OFF’. In addition to the sensor event, the same line can also indicate when an activity starts (denoted by the activity label plus ‘begin’) or ends (denoted by the label plus ‘end’). This dataset has a total of 10 activities. For further information visit http://ailab.wsu.edu/casas/datasets/. An example would be as follows:
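The begin/end labelling scheme described above can be sketched as follows. This is an illustrative parser under assumed conventions, not the exact D2 file layout: the sample lines, sensor names and activity label are invented, and whitespace-separated fields are assumed, with an optional trailing activity label and ‘begin’/‘end’ marker.

```python
# Hypothetical D2-style lines: date, time, sensor ID, value,
# and optionally an activity label with a 'begin'/'end' marker.
RAW_LINES = [
    "2010-01-04 08:00:01.000000 M017 ON Sleeping begin",
    "2010-01-04 08:00:05.000000 M017 OFF",
    "2010-01-04 08:00:09.000000 M021 ON Sleeping end",
    "2010-01-04 09:15:00.000000 M003 ON",  # outside any activity
]

def labelled_events(lines):
    """Yield (timestamp, sensor, value, activity) for events falling
    inside an annotated interval; unlabelled events are skipped, since
    they cannot be used for supervised learning."""
    current = None  # activity currently in progress, if any
    for line in lines:
        parts = line.split()
        date, time, sensor, value = parts[:4]
        has_marker = len(parts) == 6
        if has_marker and parts[5] == "begin":
            current = parts[4]          # activity starts on this event
        if current is not None:
            yield (f"{date} {time}", sensor, value, current)
        if has_marker and parts[5] == "end":
            current = None              # activity ends after this event

events = list(labelled_events(RAW_LINES))
```

With these sample lines, the first three events are labelled ‘Sleeping’ and the final event, which occurs outside any annotated interval, is discarded.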
ZEMCH 2015 | International Conference | Bari - Lecce, Italy