Synchronising data from multiple Android sensor files into one file

The OpenSignals mobile application (Google Play link) lets you acquire data from the sensors that are built into an Android smartphone. When acquiring data from multiple Android sensors, the data of each sensor is saved into an individual ".txt" file. To put the data of the sensors into a common context, the files need to be synchronised. In this Jupyter Notebook we will look at how to synchronise the sensors and how to write their data into a single ".txt" file.

For this Jupyter Notebook we will synchronise the data from five sensors: the accelerometer, GPS, light, proximity, and significant motion sensors. We chose these sensors because they cover a wide range of acquisition types. However, the procedures shown here can be applied to any number of Android sensors that you record using the OpenSignals mobile application.

In this Jupyter Notebook we will guide you through all essential steps for synchronising the Android sensors, which gives you full control over the entire synchronisation process. In the last section we will present a function integrated into our biosignalsnotebooks package that conveniently handles all these steps for you.

In case this is your first time working with Android sensors, we highly recommend reading the Introduction to Android sensors notebook with general information on Android sensors.

If you want to have a look at the implementation of all these functions, you are welcome to visit our GitHub biosignalsnotebooks repository. We will provide direct links to each presented function throughout this notebook.

1 - Package imports

First, let's import the biosignalsnotebooks package. All functions we will need for the synchronisation steps are part of this package. Furthermore, we are going to import the os package.

# biosignalsnotebooks package
import biosignalsnotebooks as bsnb

# package for using operating system dependent functionality
import os

2 - Loading the data and gathering information on the signals

Before starting with the actual synchronisation, we will load the sensor data and gather some useful information about the signals. We will do this with the load_android_data(...) function.

This function takes two inputs:

in_path (list of strings or string): String or list of strings containing the path(s) to the files that are supposed to be loaded

print_report (boolean): Boolean indicating to print the report that is generated while loading the data

The function returns the sensor data in a list and a dictionary with the following information about the signals:

names: The names of the sensors.

number of samples: The number of samples each sensor recorded.

starting times: The timestamps when the sensors started recording.

stopping times: The timestamps when the sensors stopped recording.

avg. sampling rates: The average sampling rate of each sensor (*).

min: The minimum sampling rate.

max: The maximum sampling rate.

mean: The mean of the sampling rates.

std: The standard deviation of the sampling rates.

starting order: Order in which the sensors started recording, from first to last.

stopping order: Order in which the sensors stopped recording, from first to last.

In order to use this function we first have to make a list containing the paths of all the files we want to synchronise. For simplicity we put all our files into one folder, which makes creating the file list much more straightforward.

When we run the function with the parameter print_report=True, a report is printed. The printed report shows that the accelerometer samples at the highest rate, while the GPS, light, and proximity sensors sample at much lower rates. Furthermore, we see that the significant motion sensor detected only one motion that it labelled as significant. Since a single sample does not define a rate, its sampling rate is set to zero. The report also shows that the proximity sensor was the first to start recording, while the significant motion sensor was the last to start recording. The accelerometer was the last to stop recording and the significant motion sensor the first to stop recording.

(*) If you are wondering why an average sampling rate is displayed, have a look at the notebook about resampling signals recorded with Android sensors.

# set file path
path = '../../images/other/android_file_sync/'

# get a sorted list with all the files within that folder
# (os.listdir returns the entries in arbitrary order)
file_list = sorted(os.listdir(path))

# make full path for each file
file_list = [os.path.join(path, file) for file in file_list]

# load the files
sensor_data, report = bsnb.load_android_data(file_list)

names: ['Acc', 'GPS', 'Light', 'Proximity', 'SigMotion']

number of samples: [17661, 132, 178, 18, 1]

starting times: [188488175020937.0, 188490065043301.0, 188488121593000.0, 188488102522000.0, 188502049742000.0]

stopping times: [188669138104000.0, 188668271596298.0, 188668132960000.0, 188636399533000.0, 188502049742000.0]

avg. sampling rates: [97.58896511423767, 0.7351020363555616, 0.9832712397545429, 0.11463481216084659, 0]

min. sampling rate: 0.0

max. sampling rate: 97.58896511423767

mean sampling rate: 19.88439464050172

std. sampling rate: 38.85403633973379

starting order: ['Proximity', 'Light', 'Acc', 'GPS', 'SigMotion']

stopping order: ['SigMotion', 'Proximity', 'Light', 'GPS', 'Acc']
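
The average sampling rates in the report are consistent with dividing the number of sampling intervals (number of samples minus one) by the recording duration, with the timestamps interpreted as nanoseconds. The following sketch reproduces the reported values from the printed numbers under that assumption. It is not the implementation used inside load_android_data, and the helper name avg_sampling_rate is hypothetical.

```python
import numpy as np

# values copied from the report printed above
names = ['Acc', 'GPS', 'Light', 'Proximity', 'SigMotion']
n_samples = np.array([17661, 132, 178, 18, 1])
start = np.array([188488175020937.0, 188490065043301.0, 188488121593000.0,
                  188488102522000.0, 188502049742000.0])
stop = np.array([188669138104000.0, 188668271596298.0, 188668132960000.0,
                 188636399533000.0, 188502049742000.0])


def avg_sampling_rate(n, t_start, t_stop):
    """Hypothetical helper: (samples - 1) intervals per second of recording.

    Assumes timestamps in nanoseconds; returns 0 for a zero-length recording.
    """
    duration = (t_stop - t_start) * 1e-9  # nanoseconds -> seconds
    return (n - 1) / duration if duration > 0 else 0.0


rates = [avg_sampling_rate(n, a, b) for n, a, b in zip(n_samples, start, stop)]

for name, rate in zip(names, rates):
    print('{}: {:.4f} Hz'.format(name, rate))

# summary statistics over all sensors (np.std is the population standard deviation)
print('mean: {:.4f} Hz, std: {:.4f} Hz'.format(np.mean(rates), np.std(rates)))
```

The guard for a zero-length recording covers the significant motion sensor, which holds a single sample and therefore has identical starting and stopping times.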

3 - Padding all signals to the same length

Since the signals are going to be synchronised and written into the same file, all signals have to be of the same length. However, the length to which the signals should be padded depends on which parts of the signals we want to include in the synchronised file. To make this a little clearer, we will have a look at two "toy" signals.

For these two signals, there are four possible ways to choose which parts of the signals to include in the synchronisation. The graphs are shown below.

# imports for bokeh plotting
from bokeh.layouts import gridplot
from bokeh.plotting import figure, show
import numpy as np

# define signals
t1 = np.arange(0, 21, 2)
t2 = np.arange(3, 22, 3)

x1 = [0.5, 2, 1, 3, 2.5, 2, 2.5, 1.5, 1.5, 3, 2]
x2 = [0.5, 2, 1.5, 3.5, 1, 2, 0.5]

# define color and alpha
c = 'white'
alpha = 0.7
color1 = bsnb.opensignals_color_pallet()
color2 = bsnb.opensignals_color_pallet()

# ------ figure 1 -------
p1 = figure(**bsnb.opensignals_kwargs("figure"))
# rectangle for highlighting
p1.rect(10.5, 2, 21, 3, color=c, fill_alpha = alpha)

# x1
p1.line(t1, x1, color=color1, line_width=1, legend_label='x_1') # draw lines
p1.circle(t1, x1, color=color1, size=10) # draw circles
# x2
p1.line(t2, x2, color=color2, line_width=1, legend_label='x_2') # draw lines
p1.circle(t2, x2, color=color2, size=10) # draw circles
p1.xaxis.axis_label = 'Time (s)'
p1.title.text = 'Entire recording time'
p1.title.align = 'center'
p1.title.text_font_size = "15px"
p1.legend.location = "top_left"
bsnb.opensignals_style([p1]) 
#show(p1)

# ------ figure 2 ------
p2 = figure(**bsnb.opensignals_kwargs("figure"))
# rectangle for highlighting the recording time of x_1
p2.rect(10, 2, 20, 3, color=c, fill_alpha = alpha)

# x1
p2.line(t1, x1, color=color1, line_width=1, legend_label='x_1') # draw lines
p2.circle(t1, x1, color=color1, size=10) # draw circles
# x2
p2.line(t2, x2, color=color2, line_width=1, legend_label='x_2') # draw lines
p2.circle(t2, x2, color=color2, size=10) # draw circles
p2.xaxis.axis_label = 'Time (s)'
p2.title.text = 'Recording time of x_1'
p2.title.align = 'center'
p2.title.text_font_size = "15px"
p2.legend.location = "top_left"
bsnb.opensignals_style([p2]) 
#show(p2)

# ------ figure 3 -------
p3 = figure(**bsnb.opensignals_kwargs("figure"))
# rectangle for highlighting the recording time of x_2
p3.rect(12, 2, 18, 3, color=c, fill_alpha = alpha)

# x1
p3.line(t1, x1, color=color1, line_width=1, legend_label='x_1') # draw lines
p3.circle(t1, x1, color=color1, size=10) # draw circles
# x2
p3.line(t2, x2, color=color2, line_width=1, legend_label='x_2') # draw lines
p3.circle(t2, x2, color=color2, size=10) # draw circles
p3.xaxis.axis_label = 'Time (s)'
p3.title.text = 'Recording time of x_2'
p3.title.align = 'center'
p3.title.text_font_size = "15px"
p3.legend.location = "top_left"
bsnb.opensignals_style([p3]) 
#show(p3)

# ------ figure 4 -------
p4 = figure(**bsnb.opensignals_kwargs("figure"))
# rectangle for highlighting the time in which both signals are recording
p4.rect(11.5, 2, 17, 3, color=c, fill_alpha = alpha)

# x1
p4.line(t1, x1, color=color1, line_width=1, legend_label='x_1') # draw lines
p4.circle(t1, x1, color=color1, size=10) # draw circles
# x2
p4.line(t2, x2, color=color2, line_width=1, legend_label='x_2') # draw lines
p4.circle(t2, x2, color=color2, size=10) # draw circles
p4.xaxis.axis_label = 'Time (s)'
p4.title.text = 'x_1 and x_2 recording at the same time'
p4.title.align = 'center'
p4.title.text_font_size = "15px"
p4.legend.location = "top_left"
bsnb.opensignals_style([p4]) 
#show(p4)

# ------ Grid plot -------
grid = gridplot([[p1, p2], [p3, p4]], **bsnb.opensignals_kwargs("gridplot"))
grid.sizing_mode = 'scale_width'
#bsnb.opensignals_style([grid]) 

show(grid)
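
The four highlighted windows can also be derived directly from the time axes of the two toy signals. The following sketch computes the start and stop of each option; the dictionary name windows is only used for illustration.

```python
import numpy as np

# time axes of the two toy signals (same values as in the plot code)
t1 = np.arange(0, 21, 2)   # x_1 records from 0 s to 20 s
t2 = np.arange(3, 22, 3)   # x_2 records from 3 s to 21 s

starts = [int(t1[0]), int(t2[0])]
stops = [int(t1[-1]), int(t2[-1])]

# (start, stop) in seconds for each of the four options
windows = {
    'entire recording time': (min(starts), max(stops)),
    'recording time of x_1': (starts[0], stops[0]),
    'recording time of x_2': (starts[1], stops[1]),
    'x_1 and x_2 recording at the same time': (max(starts), min(stops)),
}

for label, (start, stop) in windows.items():
    print('{}: {} s to {} s'.format(label, start, stop))
```

The first option requires padding both signals, the second and third require padding one signal and cropping the other, and the last option only crops.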

In order to have the freedom to explore all possible options, the biosignalsnotebooks package provides a function that allows setting when to start and when to end the synchronisation. This, of course, means that all signals will either be padded or cropped to the defined start and end points. To provide an intuitive usage, the start and end points are defined by the sensor names, thus, making it possible to easily choose from our set of recorded signals.

Additionally, the function lets you set the type of padding to be used. There are two exceptions, because an arbitrary padding doesn't make sense for all sensor types: the GPS always uses a padding of type "same", thus mimicking that the phone stays at a fixed location, and the significant motion sensor is always padded with zeros.

The function is called pad_android_data(...) and it takes the following inputs:

sensor_data (list): A list containing the data of the sensors to be synchronised.

report (dict): The report returned by the load_android_data function.

start_with (string, optional): The sensor that indicates when the synchronisation should be started. If not specified, the sensor that started latest is chosen.

end_with (string, optional): The sensor that indicates when the synchronisation should be stopped. If not specified, the sensor that stopped earliest is chosen.

padding_type (string, optional): The padding type used for padding the signal. Options are either "same" or "zero". If not specified, "same" is used.

The function returns the padded sensor data within a list.

For the purpose of this Jupyter Notebook we will be using the entire recording time (start when the proximity sensor starts recording and end when the accelerometer stops recording) and we will pad using the padding type "same". This means that the values at the start and end of each recording are repeated.

padded_sensor_data = bsnb.pad_android_data(sensor_data, report, start_with='Proximity', end_with='Acc', padding_type='same')
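
To make the difference between the two padding types concrete, the sketch below pads a short toy signal with NumPy. It only illustrates the idea of repeating the edge values ("same") versus filling with zeros ("zero"). It is not the implementation used inside pad_android_data, and the pad lengths are chosen arbitrarily.

```python
import numpy as np

signal = np.array([0.5, 2.0, 1.5])

# "same": repeat the first value before the start and the last value after the end
padded_same = np.pad(signal, (2, 3), mode='edge')

# "zero": fill the added samples with zeros
padded_zero = np.pad(signal, (2, 3), mode='constant', constant_values=0)

print(padded_same)
print(padded_zero)
```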

4 - Resampling all signals to the same sampling rate

In this next step, we will resample all our signals to the same sampling rate. This has to be done in order to ensure that all signal columns are of the same length. We will be using the function we developed in the notebook focused on resampling signals recorded with Android sensors.

For each sensor we are going to resample the data to a sampling rate of 100 Hz, which is approximately the sampling rate of the accelerometer according to the report we generated above. In addition, we will shift the time axis to start at zero, display it in seconds, and use the interpolation type "previous". The results of the resampling are then saved into two lists: one for holding the resampled signal data and the other for holding the resampled time axes.

# list for holding the resampled data
re_sampled_data = []
    
# list for holding the time axes of each sensor
re_sampled_time = []

# cycle over the signals
for data in padded_sensor_data:
    
    # resample the data (column 0 holds the time axis, the remaining columns the sensor values)
    # the returned sampling_rate is used again later when creating the header
    re_time, re_data, sampling_rate = bsnb.re_sample_data(data[:,0], data[:,1:], shift_time_axis=True, sampling_rate=100, kind_interp='previous')
    
    # add the time and data to the lists
    re_sampled_time.append(re_time)
    re_sampled_data.append(re_data)
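
The interpolation type "previous" holds the most recently recorded value until the next sample arrives. The following NumPy-only sketch shows that idea on a small toy signal. The function name resample_previous is hypothetical, and this is an illustration of the technique rather than the code inside re_sample_data.

```python
import numpy as np


def resample_previous(time, values, sampling_rate):
    """Hypothetical helper: resample onto a regular grid, holding the previous value."""
    time = np.asarray(time, dtype=float)
    values = np.asarray(values)

    # regular time axis from the first to the last recorded timestamp
    n_samples = int(np.floor((time[-1] - time[0]) * sampling_rate)) + 1
    new_time = time[0] + np.arange(n_samples) / sampling_rate

    # index of the last original sample at or before each new time point
    idx = np.searchsorted(time, new_time, side='right') - 1

    return new_time, values[idx]


# irregularly sampled toy signal (time in seconds)
t = [0.0, 1.0, 2.5]
x = [10, 20, 30]

new_t, new_x = resample_previous(t, x, sampling_rate=2)
print(new_t)
print(new_x)
```

With this kind of interpolation no new values are invented between samples, so every value in the resampled signal was actually recorded by the sensor.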

Since we resampled all of our signals to the same sampling rate, all time axes should be equal and the data of each sensor should be of the same length. We can easily check this by doing the following:

print('Checking for number of samples in each sensor')
# cycle through the data list
for i,data in enumerate(re_sampled_data):
    
    # get the sensor name
    name = report['names'][i]
    
    # print the first axis of the data 
    print('{}: {}'.format(name,data.shape[0]))

# get the unique time axes (axis=0 compares entire time axes instead of single time values)
unique_axes = np.unique(re_sampled_time, axis=0)

print('\nNumber of unique time axes: {}'.format(unique_axes.shape[0]))

Checking for number of samples in each sensor
Acc: 18104
GPS: 18104
Light: 18104
Proximity: 18104
SigMotion: 18104

Number of unique time axes: 1

5 - Creating a new header

Next, we are going to create a new header that will be written to the file in which we are going to store all our data. This can be done using the create_android_sync_header(...) function.

The function takes two inputs:

in_path (list of strings or string): List containing the paths to the files that are supposed to be synchronised

sampling_rate (int): The sampling rate to which the signals are going to be synchronised

It returns the header as a string.

# create header
header = bsnb.create_android_sync_header(file_list, sampling_rate)

# print the edited header string
print(header)

# OpenSignals Text File Format
# {"internal sensors": {"sensor": ["xAcc", "yAcc", "zAcc", "Latitude", "Longitude", "Altitude", "Light", "distance", "SigMotion"], "device name": "internal sensors", "column": ["nSeq", "xAcc", "yAcc", "zAcc", "Latitude", "Longitude", "Altitude", "Light", "distance", "SigMotion"], "sync interval": 2, "time": "14:55:57", "comments": "", "keywords": "", "device connection": "UNKNOWNinternal sensors", "channels": [0, 1, 2, 3, 4, 5, 6, 7, 8], "date": "2020-07-17", "mode": 0, "digital IO": [], "firmware version": 0, "device": "android", "position": 0, "sampling rate": 100, "label": ["xAcc", "yAcc", "zAcc", "Latitude", "Longitude", "Altitude", "Light", "distance", "SigMotion"], "resolution": [1, 1, 1, 1, 1, 1, 1, 1, 1], "special": [{}, {}, {}, {}, {}, {}, {}, {}, {}], "sleeve color": ["UNKNOWN", "UNKNOWN", "UNKNOWN", "UNKNOWN", "UNKNOWN", "UNKNOWN", "UNKNOWN", "UNKNOWN", "UNKNOWN"]}}
# EndOfHeader
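
Because the second line of the header is a JSON object prefixed with "# ", it can be parsed with Python's json module to inspect individual fields. The sketch below uses a shortened, hand-written header that contains only a few of the keys printed above, so that it runs without the recorded files.

```python
import json

# shortened example header (only a subset of the keys printed above)
example_header = (
    '# OpenSignals Text File Format\n'
    '# {"internal sensors": {"sampling rate": 100, '
    '"column": ["nSeq", "xAcc", "yAcc", "zAcc"], "device": "android"}}\n'
    '# EndOfHeader\n'
)

# the JSON object sits on the second line, after the leading "# "
json_line = example_header.splitlines()[1]
header_dict = json.loads(json_line[2:])

# the outer key is the device name, the inner dict holds the metadata
device_info = header_dict['internal sensors']
print(device_info['sampling rate'])
print(device_info['column'])
```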

6 - Writing the synchronised data to a new file

The last step that we need to conclude the synchronisation is to write our data to a new file. For this, you can use the save_synchronised_android_data(...) function.

The function takes the following inputs:

time_axis (1D array): The time axis after padding and resampling the sensor data.

data (list of arrays or list of lists): List containing the padded and resampled sensor signal arrays or lists. The length of each signal array/list along the 0-axis has to be the same as the length of time_axis.

header (string): A string containing the header that is supposed to be added to the file.

path (string): A string with the location where the file should be saved.

name (string, optional): The name of the file. If not specified, the file is named "android_synchronised.txt".

The function returns the path where the file was saved. We will save the file with the synchronised data to our current working directory. You can of course choose any other valid directory.

# get the current path
save_path = os.path.abspath(os.getcwd())

# save the synchronised data 
bsnb.save_synchronised_android_data(re_sampled_time[0], re_sampled_data, header, save_path)

'C:\\Users\\gui_s\\Documents\\biosignalsnotebooks_org\\header_footer\\biosignalsnotebooks_environment\\categories\\Other\\android_synchroinsed.txt'

Thus, our step by step synchronisation process is concluded. Following these steps gives you full control over the entire synchronisation process and all data that is returned during that process.
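
If you want to read the synchronised file back into Python, one option is np.loadtxt, which skips lines starting with "#" by default. The sketch below assumes that the file consists of the three header lines followed by tab-separated numeric columns with the time axis in the first column. It writes a tiny stand-in file to a temporary folder first, so it does not depend on the recorded data; the file name example_sync.txt and the values are made up.

```python
import os
import tempfile
import numpy as np

# write a tiny stand-in file: three header lines followed by tab-separated columns
example_rows = np.array([[0.00, 0.1, 9.8],
                         [0.01, 0.2, 9.7],
                         [0.02, 0.1, 9.9]])
example_path = os.path.join(tempfile.mkdtemp(), 'example_sync.txt')
with open(example_path, 'w') as f:
    f.write('# OpenSignals Text File Format\n')
    f.write('# {"internal sensors": {"sampling rate": 100}}\n')
    f.write('# EndOfHeader\n')
    np.savetxt(f, example_rows, delimiter='\t')

# np.loadtxt skips lines starting with '#' by default, so only the data is read
loaded = np.loadtxt(example_path)
time_axis = loaded[:, 0]
signals = loaded[:, 1:]
print(loaded.shape)
```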

7 - Synchronisation with a single function call

In case you want to use a function that conveniently handles everything for you, you can use the sync_android_files(...) function.

The function takes the following inputs:

in_path (list of strings): List of strings that contain the paths that point to the files that are supposed to be synchronised.

out_path (string): The path where the synchronised file is supposed to be saved.

sync_file_name (string, optional): The name of the new file. If not provided, the name will be set to "android_synchronised.txt".

automatic_sync (boolean, optional): Boolean for setting the mode of the function. If not provided it will be set to True.

As the parameter automatic_sync indicates, the function can be run in two different modes:

If automatic_sync=True, the function will do a fully automatic synchronisation of the files. The synchronisation will only take place in the time window in which all sensors are running simultaneously. The rest of the data is cropped accordingly. Furthermore, the sampling rate for resampling the signals is set to the highest sampling rate present, rounded to the nearest multiple of ten (e.g. 43 Hz -> 40 Hz | 98 Hz -> 100 Hz). Sampling rates below 5 Hz are set to 1 Hz. The interpolation type "previous" is always used. In this mode, the function will give feedback on what it is doing and how it is setting the values.

If automatic_sync=False, the function will run in an interactive mode. It will guide you through the entire synchronisation process step by step and prompt you for the inputs that are needed to set certain parameters.
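
The rounding rule of the automatic mode can be written down in a few lines. The following is a minimal sketch of the rule as described above, not the code used inside sync_android_files. The function name choose_sampling_rate is hypothetical, and the handling of exact ties (for example 45 Hz) is an assumption, since the text does not specify it.

```python
def choose_sampling_rate(max_rate):
    """Hypothetical helper mirroring the rounding rule described above."""
    # rates below 5 Hz are set to 1 Hz
    if max_rate < 5:
        return 1
    # round to the nearest multiple of ten (assumption: ties round up)
    return int(max_rate / 10 + 0.5) * 10


for rate in [43, 98, 97.58896511423767, 0.98]:
    print('{} Hz -> {} Hz'.format(rate, choose_sampling_rate(rate)))
```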

Below, we run the function in the automatic synchronisation mode. Feel free to set automatic_sync=False and try out the function in the interactive mode.

bsnb.sync_android_files(file_list, save_path, sync_file_name='automatic_android_sync', automatic_sync=True)

names: ['Acc', 'GPS', 'Light', 'Proximity', 'SigMotion']

number of samples: [17661, 132, 178, 18, 1]

starting times: [188488175020937.0, 188490065043301.0, 188488121593000.0, 188488102522000.0, 188502049742000.0]

stopping times: [188669138104000.0, 188668271596298.0, 188668132960000.0, 188636399533000.0, 188502049742000.0]

avg. sampling rates: [97.58896511423767, 0.7351020363555616, 0.9832712397545429, 0.11463481216084659, 0]

min. sampling rate: 0.0

max. sampling rate: 97.58896511423767

mean sampling rate: 19.88439464050172

std. sampling rate: 38.85403633973379

starting order: ['Proximity', 'Light', 'Acc', 'GPS', 'SigMotion']

stopping order: ['SigMotion', 'Proximity', 'Light', 'GPS', 'Acc']


---- DATA PADDING ----

Synchronizing from start of SigMotion sensor until end of SigMotion sensor.
Using padding type: 'same'.
Warning: Start and end at same time...using next sensor that stopped earliest instead

---- DATA RE-SAMPLING ----

The signals will be re-sampled to:  100.0 Hz.
Shifting the time axis to start at zero and converting to seconds.
Using interpolation type: 'previous'.

---- Saving Data to file ----

The file has been saved to: C:\Users\gui_s\Documents\biosignalsnotebooks_org\header_footer\biosignalsnotebooks_environment\categories\Other\automatic_android_sync.txt
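
The padding messages above follow from the report: SigMotion is both the sensor that started latest and the one that stopped earliest, and since it holds a single sample the window would have zero length. The warning indicates that the next sensor in the stopping order is used instead, which according to the report would be the proximity sensor. A sketch of that selection logic, using the values from the report and assuming nanosecond timestamps, could look as follows. It is an illustration rather than the code inside sync_android_files.

```python
# values copied from the report printed above
names = ['Acc', 'GPS', 'Light', 'Proximity', 'SigMotion']
starting_times = [188488175020937.0, 188490065043301.0, 188488121593000.0,
                  188488102522000.0, 188502049742000.0]
stopping_times = [188669138104000.0, 188668271596298.0, 188668132960000.0,
                  188636399533000.0, 188502049742000.0]

# the overlap of all sensors starts with the latest start ...
start_time = max(starting_times)
start_sensor = names[starting_times.index(start_time)]

# ... and ends with the earliest stop that lies after that start
later_stops = [t for t in stopping_times if t > start_time]
stop_time = min(later_stops)
stop_sensor = names[stopping_times.index(stop_time)]

# window length in seconds (assumption: timestamps are in nanoseconds)
window_s = (stop_time - start_time) * 1e-9

print('start: {}, stop: {}, window: {:.2f} s'.format(start_sensor, stop_sensor, window_s))
```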

In this Jupyter Notebook we learned how to synchronise multiple Android sensors and write them to a single file. Additionally, we saw how the entire synchronisation process can be done with a single function call.

We hope that you have enjoyed this guide. biosignalsnotebooks is an environment in continuous expansion, so don't stop your journey and learn more with the remaining Notebooks.
