Created: 16 Oct 2013, last rebuilt: 07 Dec 2015

Scraping the Fitbit website

This is the script I use to fetch high-resolution step data from the Fitbit website. You may know that their official API exposes daily stats only; the Fitbit site, however, shows data with 5-minute granularity. A quick inspection shows that the widget on the page calls an undocumented API and retrieves the data as XML. Since this is unofficial, we’ll have to pretend to be the browser and authorize using the browser cookie.

The easiest way to do this is to go to the Fitbit site, log in and open the developer tools; there you can copy the cookie and also sniff out the user id.

If you inspect network connections, you’ll see something like this:

URL:http://www.fitbit.com/graph/getGraphData?userId=XXXXX&type=intradayAltitude&dataVersion=41518&version=amchart&dateFrom=2013-10-16&dateTo=2013-10-16&ts=1381951807278&chart_type=column2d

There it shows your “userId”, which is what you have to paste into the first script, together with the cookie.
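If you would rather not pick the value out of the URL by eye, a minimal sketch like the one below can do it for you. The helper name query_param is made up for this illustration, and the URL is a shortened copy of the one above with the same XXXXX placeholder.

```ruby
require 'uri'

# Hypothetical helper (not part of the original scripts): return the value of
# one query parameter from a request URL copied out of the network tab.
def query_param(url, name)
  query = URI.parse(url).query || ""
  URI.decode_www_form(query).to_h[name]
end

# Shortened copy of the request URL shown above; XXXXX stands for your own id.
copied = "http://www.fitbit.com/graph/getGraphData?userId=XXXXX&type=intradayAltitude&dateFrom=2013-10-16&dateTo=2013-10-16"
puts query_param(copied, "userId")
```

The same helper works for any other parameter in the query string, such as type or dateFrom.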

The script then simply loops through all dates between today and your entered start date (when you started using Fitbit trackers) and fetches the data into a pile of XML files (one per day). It ignores the dates for which the file already exists, throttles the requests with random pauses and spoofs the user agent as well. When you need to update the data you just have to refresh the cookie and re-run the script.

#!/usr/bin/ruby
require 'open-uri'
require 'date'

#functions =====================
def fetchData(url, cookie)
    #pretend to be a browser: spoof the user agent and send the session cookie
    ua = "Opera/9.52 (Windows NT 6.0; U; en)"
    begin
        #URI.open is needed on Ruby 3+, where plain open() no longer accepts URLs
        return URI.open(url, "User-Agent" => ua, "Cookie" => cookie).read
    rescue StandardError => e
        puts "Fetch failed: #{e.message}"
        return nil
    end
end

def craftURL(df, user_id)
    url = "http://www.fitbit.com/graph/getGraphData?userId=#{user_id}&type=intradaySteps&dataVersion=24063"
    url += "&version=amchart&dateFrom=#{df}&dateTo=#{df}&ts=1354279667701&chart_type=column2d" 
    return url
end

#settings =======================
cookie = "### LOG IN WITH BROWSER AND PASTE YOUR COOKIE HERE ###"
user_id = "### COPY THIS FROM _INSPECT NETWORK_ TAB"
starttime = Date.new(2011,3,15)

#never download today, it's not yet complete! 
#Yesterday might also be today in different TZ, so go 2 days back just to be sure.
endtime = Date.today-2


#main loop ======================
endtime.downto(starttime) do |d|
    dt_str = d.strftime("%Y-%m-%d")
    filename = "#{dt_str}.xml"
    if File.exist?(filename)
        next
    end    
    puts "-------------------"
    puts filename
    url = craftURL(dt_str, user_id)
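    xml = fetchData(url, cookie)
    #skip failed downloads, otherwise an empty file would hide the missing day on the next run
    File.open(filename, 'w') {|f| f.write(xml) } unless xml.nil?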
    sleep 10 + 5*rand
end
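Before re-running the fetcher with a fresh cookie, it can be handy to see how many days are still missing. The following is a small sketch of such a dry run; missing_dates is a hypothetical helper, and it assumes the files are named YYYY-MM-DD.xml as in the script above.

```ruby
require 'date'

# Hypothetical helper: list the dates between first and last (inclusive)
# that do not yet have a YYYY-MM-DD.xml file in the given directory.
def missing_dates(first, last, dir = ".")
  (first..last).reject do |d|
    File.exist?(File.join(dir, "#{d.strftime('%Y-%m-%d')}.xml"))
  end
end

# Same date range as the fetch script: start date up to two days ago.
todo = missing_dates(Date.new(2011, 3, 15), Date.today - 2)
puts "#{todo.size} days still to fetch"
```

With the 10 to 15 second pause between requests, the count gives a rough idea of how long the next run will take.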

The following script then walks through a directory full of XML files and spits out a CSV, ready to be imported into Excel. Every day is one line, every 5 minutes is one column. You can play with it in anything that reads CSV, but Excel’s pretty fast if you want to do some conditional formatting.

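#file names are ISO dates, so sorting them gives one row per day in chronological order
Dir.glob('*.xml').sort.each do |f|
    steps_arr = []
    #File.foreach closes the file once all lines are read
    File.foreach(f) do |l|
        if l =~ /(\d+) steps taken/
            steps_arr << $1
        end
    end
    puts steps_arr.join(",")
end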
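The rows above carry no date, so you have to rely on the row order once the data is in a spreadsheet. A possible variation, sketched below, puts the date in the first column. The helper steps_row is hypothetical; it assumes the files are named YYYY-MM-DD.xml and that each interval sits on a line containing "N steps taken", which is all the original regex relies on.

```ruby
# Hypothetical helper: build one CSV row for a day, with the date (taken from
# the file name) in the first column followed by the per-interval step counts.
def steps_row(filename, lines)
  date = File.basename(filename, ".xml")
  # Keep only the lines that mention a step count; drop everything else.
  steps = lines.map { |l| l[/(\d+) steps taken/, 1] }.compact
  ([date] + steps).join(",")
end

# One row per file, in chronological order because the names are ISO dates.
Dir.glob("*.xml").sort.each do |f|
  puts steps_row(f, File.foreach(f))
end
```

Redirect the output to a file (for example ruby steps.rb > steps.csv, where steps.rb is whatever you named the script) to get something Excel can open directly.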

This is a sample of what I got.

950 days of steps