Predict your activity using your Android, Cassandra and Spark


Lately, I took up a new sport: running. When you run, you get really curious about acceleration, distance, elevation and the other metrics you can analyse when you practise this kind of sport. As a runner, I started using a phone application (RunKeeper), and recently I bought a Garmin watch so I can get more information about my running sessions.
But how do these applications analyse data and compute all these metrics?
Let’s focus on one metric: proper acceleration.

What is proper acceleration?

Proper acceleration, or physical acceleration, is the acceleration an object experiences relative to free fall; it is the acceleration felt by people and objects. It is measured by an accelerometer.
The accelerometer data consists of successive measurements made over a time interval. That’s what we call a time series.

[Figure: time series]
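
To make the definition concrete, here is a minimal Java sketch that computes the magnitude of the proper acceleration from the three axis values. The class and method names (`ProperAcceleration`, `magnitude`) are illustrative assumptions and are not taken from the Basic Accelerometer source. A phone lying still is expected to report a magnitude close to standard gravity, while a phone in free fall ideally reports a value close to zero.

```java
// Sketch only: names are hypothetical, not part of the original app.
final class ProperAcceleration {
    /** Standard gravity in m/s², roughly what a motionless phone reports. */
    static final double STANDARD_GRAVITY = 9.80665;

    /** Euclidean norm of an (x, y, z) accelerometer reading, in m/s². */
    static double magnitude(double x, double y, double z) {
        return Math.sqrt(x * x + y * y + z * z);
    }

    public static void main(String[] args) {
        // Phone lying flat and motionless: gravity shows up on the z axis.
        System.out.println(magnitude(0.0, 0.0, STANDARD_GRAVITY));
        // Phone in free fall: the sensor ideally reads zero on every axis.
        System.out.println(magnitude(0.0, 0.0, 0.0));
    }
}
```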

How can I get an accelerometer?

Luckily, most smartphones contain an accelerometer.
The sensor measures 3 values, one for each of 3 axes, as shown in the picture below:
[Figure: accelerometer axes schema]

As an Android fan, I implemented an Android app, Basic Accelerometer, which shows the value along each axis and the current date as a timestamp.

Let’s create the Basic Accelerometer Android app!

All the source code is available on my GitHub repository here.
As a first step, I implemented the start activity:

After creating the starting menu, I have to collect the sensor values in a new activity: “AccelerometerActivity”.
To use the sensor, the activity class must implement SensorEventListener.

Now, I’m able to get information from the sensor and post it to an online REST service.
I used Retrofit, a REST client for Android and Java:

After that, I added an asynchronous task to post the sensor values on each sensor update:
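
As a rough illustration of posting without blocking the caller, here is a sketch that uses the JDK's `java.net.http.HttpClient` (Java 11+) rather than Retrofit and an Android asynchronous task, so it is not the app's actual code. The class name `AccelerationPosterSketch` and the `/acceleration` resource path are assumptions; the base URL is whatever address your REST service listens on.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

// Sketch only: plain-JDK stand-in for the Retrofit client used in the app.
final class AccelerationPosterSketch {
    private final HttpClient client = HttpClient.newHttpClient();
    private final URI endpoint;

    AccelerationPosterSketch(String baseUrl) {
        // "/acceleration" is a hypothetical resource path.
        this.endpoint = URI.create(baseUrl + "/acceleration");
    }

    /** Builds a POST request carrying one JSON-encoded acceleration. */
    HttpRequest buildRequest(String jsonBody) {
        return HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    /** Sends the request in the background; the future completes with the HTTP status code. */
    CompletableFuture<Integer> postAsync(String jsonBody) {
        return client
                .sendAsync(buildRequest(jsonBody), HttpResponse.BodyHandlers.discarding())
                .thenApply(HttpResponse::statusCode);
    }
}
```

Calling `postAsync` from the sensor callback returns immediately, which is the point of doing the upload asynchronously: the sensor listener is never held up by network latency.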

Now we’re able to launch our app!

How to install the app on your phone?

  • Download Android Studio
  • Clone the BasicAccelerometer project and open it in Android Studio
  • Activate developer mode on your Android phone (Android 4.0.3 or above is required).
  • Plug in your phone, run the app and choose your phone as the target.

The application will start automatically on your phone and you will see the screen below:

[Screenshot: Basic Accelerometer, screen 1]

Now that the application is running, let’s focus on the REST service.

REST Service and Cassandra DB

The Android app is ready to send us real-time data: a time series of our acceleration.
As you may have noticed, I used an acceleration bean in my Android app:

The acceleration is posted to a REST service.
The REST API receives the accelerometer data and stores it in Cassandra. Each acceleration contains:

  • acceleration capture date as a timestamp (e.g. 1428773040488)
  • acceleration force along the x axis (unit is m/s²)
  • acceleration force along the y axis (unit is m/s²)
  • acceleration force along the z axis (unit is m/s²)
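
The four fields above could be modelled as a small immutable Java class. This is a sketch under assumptions: the class name `AccelerationSample`, the field names and the hand-rolled JSON layout are illustrative and may differ from the bean in the repository.

```java
// Sketch only: field names and JSON layout are assumptions.
final class AccelerationSample {
    final long timestamp; // capture date, milliseconds since the epoch
    final double x;       // acceleration force along the x axis, m/s²
    final double y;       // acceleration force along the y axis, m/s²
    final double z;       // acceleration force along the z axis, m/s²

    AccelerationSample(long timestamp, double x, double y, double z) {
        this.timestamp = timestamp;
        this.x = x;
        this.y = y;
        this.z = z;
    }

    /** Serialises the sample as a flat JSON object without any library. */
    String toJson() {
        return "{\"timestamp\":" + timestamp
                + ",\"x\":" + x
                + ",\"y\":" + y
                + ",\"z\":" + z + "}";
    }

    public static void main(String[] args) {
        AccelerationSample sample =
                new AccelerationSample(1428773040488L, 0.5, -1.25, 9.81);
        System.out.println(sample.toJson());
    }
}
```

In a real client a JSON library would normally handle the serialisation; the manual string building here only keeps the sketch free of dependencies.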

The REST API sources are available on my GitHub here. All data is saved in a Cassandra database.

Apache Cassandra is a NoSQL database. When writing data to Cassandra, data is sorted and written sequentially to disk. When retrieving data by row key and then by range, you get a fast and efficient access pattern due to minimal disk seeks. Time series data is an excellent fit for this type of pattern.
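
The "row key, then range" access pattern can be pictured with an in-memory analogy. The sketch below is not the Cassandra API: it only mimics the idea with a map from row key (here a user id) to measurements kept sorted by timestamp, so that a time range for one user is a contiguous slice. All names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory analogy only: this is NOT how you talk to Cassandra.
final class TimeSeriesStoreSketch {
    // Row key (user id) -> measurements kept sorted by timestamp.
    private final Map<String, NavigableMap<Long, double[]>> rows = new HashMap<>();

    void insert(String userId, long timestamp, double x, double y, double z) {
        rows.computeIfAbsent(userId, key -> new TreeMap<>())
            .put(timestamp, new double[] {x, y, z});
    }

    /** Measurements of one user with from <= timestamp <= to, in time order. */
    NavigableMap<Long, double[]> range(String userId, long from, long to) {
        NavigableMap<Long, double[]> row = rows.get(userId);
        if (row == null) {
            return new TreeMap<>();
        }
        // Sorted storage makes the range a contiguous slice of the row.
        return row.subMap(from, true, to, true);
    }

    public static void main(String[] args) {
        TimeSeriesStoreSketch store = new TimeSeriesStoreSketch();
        store.insert("TEST_USER", 2000L, 0.2, 0.0, 9.8);
        store.insert("TEST_USER", 1000L, 0.1, 0.0, 9.8);
        System.out.println(store.range("TEST_USER", 0L, 1500L).keySet());
    }
}
```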

To start the Cassandra database:

  • download the Cassandra 2.1.4 archive
  • extract it
  • execute this command from the folder containing the extracted archive: sh apache-cassandra-2.1.4/bin/cassandra

In the REST application, I used Spring Data Cassandra, which uses the DataStax Java driver, so I can easily interact with Cassandra to write, read, update or delete data.
Spring Data Cassandra helps to configure the Cassandra cluster and create my keyspace:

After configuration, I created my application model:

We are storing historical data, so I used a compound key (user_id and timestamp), since together they are unique:

Then, I added the REST controller.
The controller receives a POST request with an acceleration and inserts the values into Cassandra.

The acceleration bean used in the controller is the same as defined for the Android app with an extra attribute: userID (I’ll explain the usage later).

After defining the REST controller and the Cassandra configuration, we’re able to run the application:

With the dependencies below, Spring Boot starts a Jetty server: the default Tomcat starter is excluded and the Jetty starter is added. To use Tomcat instead, you’ll have to update the project dependencies:

dependencies {
    compile("org.springframework.boot:spring-boot-starter-web") {
        exclude module: "spring-boot-starter-tomcat"
    }
    compile("org.springframework.boot:spring-boot-starter-jetty")
    compile("org.springframework.boot:spring-boot-starter-actuator")
    compile("org.springframework.data:spring-data-cassandra:1.2.0.RELEASE")
    testCompile("junit:junit")
}

Let’s launch the REST app!

Our Android app is ready. Now we run the REST app, then add the REST app URL to the Android app:
http://myLocalIp:8080/accelerometer-api

As soon as you tap the Start button, the Basic Accelerometer app begins to send acceleration data to the REST service:

[Screenshot: Basic Accelerometer, screen 2]

And then we start to see insertion logs in the REST app:

[Screenshot: insertion logs in the REST app]

To check the data in Cassandra, we launch a CQL terminal and run some queries:
sh apache-cassandra-2.1.4/bin/cqlsh

Here’s what the data looks like in Cassandra:
[Screenshot: CQL terminal showing the Cassandra data]

At this point, we collect data from the accelerometer and store it in a Cassandra database.

How to analyse accelerometer data?

Remember, we aim to analyse our acceleration data. We need some reference data to be able to create a decision tree model.

Luckily, this model already exists, and there is an interesting article here explaining how to create a decision tree model based on acceleration data using Cassandra, Spark and MLlib.

Apache Spark is a fast and general engine for large-scale data processing. MLlib is a standard component of Spark that provides machine learning primitives: common algorithms as well as basic statistics and feature extraction functions.

The source code to do prediction with an existing model is available on my GitHub here. [Update: this is the latest version with Scala]

We now want to guess, just by analysing our acceleration, whether we are walking, jogging, standing up, sitting down, or going up or down stairs.
The decision tree model is trained on a Resilient Distributed Dataset (RDD) of labeled points built from some features.

A Resilient Distributed Dataset (RDD) is the basic abstraction in Spark. It represents an immutable, partitioned collection of elements that can be operated on in parallel.

The features include different values:

So, to analyse the data collected by the Basic Accelerometer application, we have to compute the features as defined in our decision tree model.
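
To give an idea of what computing features from a window of accelerometer data can look like, here is a plain Java sketch. The choice of statistics (per-axis mean, population variance and average absolute difference from the mean) is an assumption made for illustration; the exact features expected by the model are those defined in the referenced article and repository, and the class name `WindowFeatures` is hypothetical.

```java
import java.util.Arrays;

// Sketch only: the feature set is an illustrative assumption.
final class WindowFeatures {
    static double mean(double[] values) {
        requireNonEmpty(values);
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Population variance: mean of squared deviations from the mean. */
    static double variance(double[] values) {
        double m = mean(values);
        double sumSquares = 0.0;
        for (double v : values) {
            sumSquares += (v - m) * (v - m);
        }
        return sumSquares / values.length;
    }

    /** Mean of the absolute deviations from the mean. */
    static double averageAbsoluteDifference(double[] values) {
        double m = mean(values);
        double sumAbs = 0.0;
        for (double v : values) {
            sumAbs += Math.abs(v - m);
        }
        return sumAbs / values.length;
    }

    /** Nine features: the three statistics for x, then y, then z. */
    static double[] featureVector(double[] xs, double[] ys, double[] zs) {
        double[] features = new double[9];
        int i = 0;
        for (double[] axis : new double[][] {xs, ys, zs}) {
            features[i++] = mean(axis);
            features[i++] = variance(axis);
            features[i++] = averageAbsoluteDifference(axis);
        }
        return features;
    }

    private static void requireNonEmpty(double[] values) {
        if (values == null || values.length == 0) {
            throw new IllegalArgumentException("window must contain at least one value");
        }
    }

    public static void main(String[] args) {
        double[] xs = {1.0, 2.0, 3.0, 4.0};
        double[] ys = {0.0, 0.0, 0.0, 0.0};
        double[] zs = {9.7, 9.8, 9.9, 9.8};
        System.out.println(Arrays.toString(featureVector(xs, ys, zs)));
    }
}
```

With Spark, an array like this would typically be wrapped in an MLlib vector before being passed to the model.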
We initialise our Spark context:

Then, we read data from the Cassandra database (the user ID “TEST_USER” is hard-coded in the REST service application; you can update it or add it to the Android app).

The Spark Cassandra Connector enables lightning-fast cluster computing with Spark and Cassandra. This library lets you expose Cassandra tables as Spark RDDs, write Spark RDDs to Cassandra tables, and execute arbitrary CQL queries in your Spark applications.

The connector transforms data stored in Cassandra into Spark RDDs:

After computing our features and turning them into vectors, we can call the model and try to predict our activity.
You must use spark-cassandra-connector-java-assembly-1.3.0-SNAPSHOT or above to be able to save and load models:

Final step: prediction

Now we can launch our prediction to see if we can predict the activity based on acceleration:

  1. Launch the REST application
  2. Start the Android app with the REST application URL
  3. Do an activity for 30 seconds (sitting, standing up, walking, jogging, or going up or down stairs) while holding the phone in one hand.
  4. Stop the Android app
  5. Launch the prediction activity:

Then you will see the predicted activity as a result:

[Screenshot: predicted activity]

Conclusion

We’ve seen how to use a connected object (a smartphone) to collect time series data and store it in Cassandra.
Then we used the Spark Cassandra Connector to transform the data into RDDs, and we analysed those RDDs using a decision tree model created with Spark.
This is just a small sample of the infinite possibilities we have nowadays with connected devices.


Amira LAKHAL is a Java Champion and a board member of Duchess France. She currently works at Actyx, where she helps build solutions for the industrial sector. She is passionate about agility and functional languages.

Comments

  • C says:

    Is it also possible to run MLlib code directly on Android?
    That is, I collect data on Android, after some time a model is generated on Android from this data, and that model can be used later on Android to get predictions?
    Instead of how it’s done in this tutorial: collect data on Android, send this data to a REST service and get predictions based on a model that was created beforehand?

  • Sen says:

    Greetings, I am new to Android… How do I get started with the REST API? I am stuck at that part.

  • Sen says:

    Currently I am stuck at creating the database.
    I downloaded Cassandra and tried to run sh /bin/cassandra from the command prompt, but it seems it does not work on my Windows laptop.
    I tried to solve it using shell.w32-ix86 but it still wouldn’t work.
    I installed the DataStax Distribution v3.9 for Windows, but the shell wouldn’t work: it keeps saying it cannot connect to localhost and the port, then disappears each time I try to open it.
    Could you kindly tell me what steps I should follow next?
    Thanks

    • Amira Lakhal says:

      Sorry, I’m not familiar with Windows. The sh script is intended for Debian-based systems.
      You may follow the steps in the DataStax tutorial to launch the Cassandra CQL shell on a Windows system.

    • Sen says:

      I tried this solution before but ran into an IP and port problem; the shell keeps disappearing every time I launch it.
      I tried to google it but can’t solve it.
      Anyway, thank you for your effort.

      • Sen says:

        I was able to install Cassandra on my Windows machine already.
        However, I am stuck at the REST app URL: where should I put the REST app if I want to run Cassandra on localhost, and how do I get the REST app URL?
        Thank you
