Data Integration

Acing the Fitbit API: Talend Job of the Week, part 2

Talend Team

Welcome back for another Job of the Week! 

This week we will be picking up where we left off in the last episode, where we showed you how to generate a refreshable Fitbit OAuth 2 authentication token and use it to gather Fitbit data. This simple example will return stats on Fitbit activities over a period of time. In this case, I will be focusing on tennis activities, but you will be able to alter the job to look at whatever activities you carry out with your Fitbit device and even extend it if you want. 

As before, let's start with the https://dev.fitbit.com/apps web page, which shows us the Data Collector app we created last time. We don't need to discuss this, as this was covered in the last episode. Take a look again if you want to refresh your memory. 

Now move to the Fitbit web API page at https://dev.fitbit.com/build/reference/web-api/explore/. Remember to include the trailing forward slash! 

Here we are going to be looking at the Get Activity log list function. First, let's look at the parameters: 

  • beforeDate 

  • afterDate 

  • sort 

  • offset (the number of records to skip when making multiple calls) 

  • limit (the number of records in each batch; when we're making multiple calls, the max is 100) 

Here are the responses: 

  • 200 for success 

  • 400 for bad request 

  • 401 for bad authentication 

  • 409 for issues regarding subscription IDs and user combinations — we shouldn't be seeing this in this example 

Let's test this now, as we did in the previous episode, and start by authenticating our user to use this web page. We go to the Authorize button, click on it, and authorize here. As we can see, I don't need to do this now, because it hasn't been long since I last authorized. (I actually filmed this just after the previous example, so we're still valid.) 

Now we will click Try it out to test this function: 

  • Set the afterDate to be 2022-01-01 — so just records from the beginning of this year 

  • Set the sort order to be Ascending 

  • Set the offset to be 0 

  • Set the limit to 100 — the maximum for each call 

Now let's click Execute. 

Here we can see the curl command used to make this call, along with the request URL. We can see it was successful because of the 200 response code, and here we can see the results. 

Let's look at this JSON using https://jsonpathfinder.com/

I've copied the JSON we saw on the previous page here. Here we can see there are two elements at the root level of the JSON: activities and pagination. 

Pagination shows the following: 

  • afterDate 

  • limit 

  • next URL to call, with the limit and the offset recalculated 

  • offset 

  • previous URL (if there is one) 

  • sort order 

Next, let's look at the activities. Here we have an array of all activities that have been returned in this batch. Let's pick a random one to look at: #8. Here’s what we can see: 

  • activeDuration —  the period of the activity in milliseconds 

  • activeZoneMinutes — this contains minutesInHeartRateZones, which is an array of calculated heart rate zones, and the totalMinutes, which is a calculated minute score 

  • activityLevel, an array of activity levels and minutes spent at each level 

  • activityName  — as we can see, I was playing tennis here 

  • activityTypeId 

  • averageHeartRate 

  • calories 

  • caloriesLink, a URL to get more detailed calorie data 

  • distance 

  • distanceUnit, which shows that the distance was recorded in kilometers 

  • duration 

  • elevationGain 

  • hasActiveZoneMinutes 

  • heartRateLink, which gives more details on heart rate data 

  • heartRateZones, an array containing data on min and max heart rates at each zone, how many minutes were spent in each zone, and calories used at that zone 

  • lastModified 

  • logId 

  • logType 

  • manualValuesSpecified 

  • originalDuration 

  • originalStartTime 

  • source, which shows the name of the device that collected the data (you can see I used the Fitbit Sense) and trackedFeatures, which shows the features that have been tracked or recorded 

  • speed 

  • startTime 

  • steps 

  • tcxLink, which maps heart rate to location (used more for running) 

So let's take a look at the job. As you can see, this isn't as big as the previous job, but you could build a lot more into this if you wish. 

First, let's look at the Get Fitbit Contexts subjob. This is where we are using the job we built in the previous episode to generate the OAuth 2 token. We can see the Fitbit OAuth 2 generate job here. You just need to drag this job in as a component. 

See that I’ve clicked on the Transmit whole context option here. I don't point this out, but be aware that I've also clicked on the Copy child job schema button to copy the schema of the tBuffer output component used in that job to the output schema of this component. 

We can see here that the context variables of this job are the same as the context variables used for the Fitbit OAuth 2 generate job. Notice that I provide the same initial values, which are mostly empty, and also put in the path of the context file. They aren't all necessary here; I've kept them identical just for ease of setup. So the Fitbit OAuth 2 generate job will run and pass the context values to the tContextLoad component, which loads the context variables ready for this job to use. 

The next component is the Set initial URL tJava component. This uses the code present here to build our URL with parameters for the Get Activity log list function shown previously. 

Notice that we are hard-coding the afterDate, the offset, and the limit here. These can be set dynamically if you want, but I've gone with this method as it suits my purposes and is easy. The resulting URL is saved under the URL key in the global map. 
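To make the idea concrete, here is a minimal sketch of what a URL-building step like this might look like in plain Java. The endpoint path and the sort value "asc" are my assumptions (check them against the request URL shown in the API explorer), and the HashMap is only a stand-in for the globalMap that Talend makes available inside a tJava component.

```java
import java.util.HashMap;
import java.util.Map;

class InitialUrlSketch {
    // Assumed endpoint for Get Activity log list; verify against the API explorer's request URL.
    static final String BASE_URL = "https://api.fitbit.com/1/user/-/activities/list.json";

    // Builds the first URL from the hard-coded parameters described above.
    static String buildInitialUrl(String afterDate, String sort, int offset, int limit) {
        return BASE_URL
                + "?afterDate=" + afterDate
                + "&sort=" + sort
                + "&offset=" + offset
                + "&limit=" + limit;
    }

    public static void main(String[] args) {
        // Stand-in for Talend's globalMap (hypothetical outside of a Talend job).
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("URL", buildInitialUrl("2022-01-01", "asc", 0, 100));
        System.out.println(globalMap.get("URL"));
    }
}
```

Keeping the parameters as method arguments would make it easy to swap the hard-coded values for context variables later.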

The next component is a tLoop. This drives our multiple calls to the API endpoint. 

We set this to a While loop. The declaration and iteration values can be ignored here, since all of our logic is set in the Condition field. This essentially checks that the URL stored in the global map is not empty. 
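The check itself is small. As a rough sketch, assuming the URL is stored as a String under the key "URL" (the method name hasUrl is mine, not something from the job), the condition could be expressed like this:

```java
import java.util.HashMap;
import java.util.Map;

class LoopConditionSketch {
    // True while there is a non-empty URL left to call; a missing key counts as empty.
    static boolean hasUrl(Map<String, Object> globalMap) {
        Object url = globalMap.get("URL");
        return url != null && !url.toString().trim().isEmpty();
    }

    public static void main(String[] args) {
        // Stand-in for Talend's globalMap.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("URL", "https://example.com/page1");
        System.out.println(hasUrl(globalMap));
        globalMap.put("URL", "");
        System.out.println(hasUrl(globalMap));
    }
}
```

In a tLoop Condition field this would be written as a single boolean expression rather than a method.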

The next tJava is a dummy component, used only because the following tRESTClient cannot be connected with an Iterate link. 

From this dummy component we use an OnComponentOk link to connect to the tRESTClient. The tRESTClient component is configured with a URL, which is set to the value held under the URL key in the global map. This value will be updated later in this subjob. The HTTP Method is set to GET, the Accept Type is set to JSON, and there are no parameters set here. 

If we go to the Advanced Settings, we can see that the Content-Type is set in the same way as it is in the Fitbit OAuth 2 generate job. The Authorization header is set to “Bearer ” with the access token following. Convert Response To DOM Document is also unticked. 
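For readers who want to see the same request outside Talend, here is a hedged sketch using the standard java.net.http API (Java 11 or later). It only builds the request and does not send it. The class and method names are mine, and I've left out the Content-Type header because its value comes from the previous episode's job.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class RestRequestSketch {
    // Builds a GET request that accepts JSON and carries the OAuth 2 access token.
    static HttpRequest buildRequest(String url, String accessToken) {
        return HttpRequest.newBuilder()
                .uri(URI.create(url))
                .header("Accept", "application/json")
                // "Bearer " followed by the access token, as in the Advanced Settings.
                .header("Authorization", "Bearer " + accessToken)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Placeholder URL and token, for illustration only.
        HttpRequest request = buildRequest("https://example.com/list.json", "my-access-token");
        System.out.println(request.method() + " " + request.uri());
    }
}
```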

The first tExtractJSONFields component we can see here is used to extract the activities element and the pagination.next element. We can see these here in the JSON Path Finder. If we expand the pagination element, we can see afterDate, limit, next, and offset. Next is regenerated for every run, and the offset and the limit are recalculated according to the parameters initially set. 

Back to the job and the tExtractJSONFields component. We can see the Loop JSONPath query set to $, activities is set to activities, and pagination — possibly badly named — is set to pagination.next. 
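If you want to experiment with the pagination.next value outside the job, the sketch below pulls it out of a response string. To keep it free of dependencies it uses a regular expression, which is a deliberate shortcut on my part and not a real JSON parser; in the job itself, tExtractJSONFields does this work. The sample JSON in main is made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NextUrlSketch {
    // Matches "next": "<value>"; a simplification that assumes no escaped quotes in the value.
    private static final Pattern NEXT = Pattern.compile("\"next\"\\s*:\\s*\"([^\"]*)\"");

    // Returns the next URL, or an empty string when there is none.
    static String extractNext(String json) {
        Matcher m = NEXT.matcher(json);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        String sample = "{\"activities\":[],\"pagination\":{\"limit\":100,"
                + "\"next\":\"https://example.com/page2\",\"offset\":0}}";
        System.out.println(extractNext(sample));
    }
}
```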

These strings are sent to the next component, which is a tJavaFlex component. We can see that the code is all in the main section. 

Here we update the global map value, which is stored under the URL key. This is used to drive our tLoop. We also print this URL out to the output window so that we can see what is happening when we run the job. You'll notice that the Data Auto Propagate tick box is set here, to automatically pass on the schema values to the next component. 
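Putting the pieces together, the loop behaves roughly like the sketch below. The REST call is replaced by a hypothetical lookup table that maps each URL to the "next" URL of that page, so the example runs without network access. The class and method names are mine.

```java
import java.util.HashMap;
import java.util.Map;

class PaginationLoopSketch {
    // nextByUrl is a stand-in for the API: it maps each page URL to that page's "next" URL.
    static int countCalls(Map<String, String> nextByUrl, String firstUrl) {
        // Stand-in for Talend's globalMap.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("URL", firstUrl);
        int calls = 0;
        // Same idea as the tLoop condition: keep going while a URL is stored.
        while (globalMap.get("URL") != null && !((String) globalMap.get("URL")).isEmpty()) {
            String current = (String) globalMap.get("URL");
            calls++; // one simulated REST call per iteration
            String next = nextByUrl.get(current);
            // Update the stored URL; an empty value ends the loop.
            globalMap.put("URL", next == null ? "" : next);
            System.out.println("Next URL: " + globalMap.get("URL"));
        }
        return calls;
    }

    public static void main(String[] args) {
        Map<String, String> pages = new HashMap<>();
        pages.put("page1", "page2");
        pages.put("page2", "");
        System.out.println("Calls made: " + countCalls(pages, "page1"));
    }
}
```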

The next component is another tExtractJSONFields component. This is used to get the remaining values out of the activities element of the JSON. Here we extract the following: 

  • activeDuration 

  • activityName 

  • activityTypeId

  • averageHeartRate 

  • calories

  • distance

  • distanceUnit

  • startTime

  • steps

  • heartRateLink 

I didn't end up using all of these, but I left them in. 

You can see here that the JSON Field is set to activities and the Loop JSONPath query is set to $[*]. This is used to iterate over the array. 

The final component of this subjob is the tHashOutput component. We use this to store the values in memory. You'll see that the Append tick box is ticked to store every record that is returned from the looping subjob. 

Now to the last subjob. In this section I'm doing some basic analysis of the data returned. 

The first component is a tHashInput. This is linked to the tHashOutput used in the previous subjob. The schema has been copied from the tHashOutput. 

I have added some simple data processing and filtering into this job using a tMap. This includes the input data, the output data, our output filter, and some tMap variables used to process some of the data. 

Let's look at the variables: 

  • The first one is to calculate minutes from milliseconds. We can see the code says if activeDuration is not null, then divide it by 1000 and then divide the result by 60. Otherwise, return zero.

  • The next calculates the caloriesPerMinute. If both calories and activeDuration are not null, then divide calories by the result of the minutes calculation; otherwise, set to zero.

  • The final variable is the stepsPerMinute variable. If steps is not null, divide steps by the result of the minutes calculation. 
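Written out as plain Java, the three variables could look something like the sketch below. The field types (Long for activeDuration, Integer for calories and steps) are my assumptions, and I've used floating-point division so that fractions of a minute are not lost. The guard against a zero-minute activity, and the zero returned when steps is null, are also my additions to keep the division safe.

```java
class TMapVariablesSketch {
    // Milliseconds to minutes: divide by 1000, then by 60; zero if the duration is null.
    static double minutes(Long activeDurationMs) {
        return activeDurationMs != null ? activeDurationMs / 1000.0 / 60.0 : 0.0;
    }

    // Calories divided by minutes; zero if either input is null (or the duration is zero).
    static double caloriesPerMinute(Integer calories, Long activeDurationMs) {
        double mins = minutes(activeDurationMs);
        return (calories != null && activeDurationMs != null && mins > 0) ? calories / mins : 0.0;
    }

    // Steps divided by minutes; zero if steps is null (or the duration is zero).
    static double stepsPerMinute(Integer steps, Long activeDurationMs) {
        double mins = minutes(activeDurationMs);
        return (steps != null && mins > 0) ? steps / mins : 0.0;
    }

    public static void main(String[] args) {
        // A made-up one-hour activity: 600 calories and 6000 steps.
        System.out.println(minutes(3_600_000L));
        System.out.println(caloriesPerMinute(600, 3_600_000L));
        System.out.println(stepsPerMinute(6000, 3_600_000L));
    }
}
```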

In the output schema, we can see the data we are outputting and where it comes from: 

  • minutes is set to var.minutes 

  • averageHeartRate is set to row9.averageHeartRate

  • calories is set to row9.calories

  • caloriesPerMinute is set to var.caloriesPerMinute

  • distance is set to row9.distance

  • startTime is set to row9.startTime

  • steps is set to row9.steps

  • stepsPerMinute is set to var.stepsPerMinute 

The final section to look at is the filtering. Here I am filtering the tennis activities where the minutes are greater than or equal to 50 and less than or equal to 120. And the steps need to be greater than zero. This is done to remove incomplete data. 
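As a sketch, the filter boils down to one boolean expression. I'm assuming here that the activity name comes back as the string "Tennis"; check the activityName values in your own data before relying on that.

```java
class ActivityFilterSketch {
    // Keeps tennis sessions of 50 to 120 minutes (inclusive) that have a step count above zero.
    static boolean keep(String activityName, double minutes, Integer steps) {
        return "Tennis".equals(activityName) // assumed activity name
                && minutes >= 50 && minutes <= 120
                && steps != null && steps > 0;
    }

    public static void main(String[] args) {
        System.out.println(keep("Tennis", 75.0, 5400));
        System.out.println(keep("Tennis", 30.0, 2000));
    }
}
```

To look at a different activity, you would only need to change the name being compared and, if it makes sense, the duration bounds.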

After this, we have a tSort component. I set this to sort by stepsPerMinute, descending (desc). 
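The equivalent ordering in plain Java would be a descending comparator on stepsPerMinute. The Row class below is a hypothetical, cut-down version of the output schema, just enough to show the sort.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SortSketch {
    // Hypothetical, minimal row: only the fields needed to show the ordering.
    static class Row {
        final String startTime;
        final double stepsPerMinute;

        Row(String startTime, double stepsPerMinute) {
            this.startTime = startTime;
            this.stepsPerMinute = stepsPerMinute;
        }
    }

    // Returns a new list ordered by stepsPerMinute, highest first.
    static List<Row> sortByStepsPerMinuteDesc(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingDouble((Row r) -> r.stepsPerMinute).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("2022-01-05", 61.5));
        rows.add(new Row("2022-01-12", 74.2));
        for (Row r : sortByStepsPerMinuteDesc(rows)) {
            System.out.println(r.startTime + " " + r.stepsPerMinute);
        }
    }
}
```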

The final component is just the tLogRow, set to Table mode. I just want to output to the output window in order to show the results. You can add whatever you like here. 

Now all that’s left is to run the job.  

The OAuth 2 generator has fired and set our context variables, including the access token. The tLoop is firing, and for every iteration we get a new URL representing the next URL to be called by the tRESTClient. The subjob with the tRESTClient is being fired again and again, bringing in the data. 

In the results for the last subjob, the tHashInput has returned the rows collected in the subjob above. It then sends them to the tMap, which processes and filters them. The tSort component sorts them and they are output using the tLogRow. 

If we look here, we can see that I have played a lot of tennis since the beginning of the year. We can see the minutes, average heart rate, calories, calories per minute, distance, start time, steps, and steps per minute. 

This was a very basic use of the API, and it took me considerably longer to film this than to produce the job. The job took about 15 minutes to build once I had the Fitbit OAuth 2 generate job in place to set the OAuth 2 access token. 

Please feel free to download a copy and play with this. If you have any comments or questions, please feel free to get in touch either on Twitter (@Talend) or LinkedIn.
