Part I. A Guided Tour of the Social Web Prelude
2. Mining Facebook: Analyzing Fan Pages, Examining Friendships, and More
2.2. Exploring Facebook’s Social Graph API
2.2.1. Understanding the Social Graph API
As its name implies, Facebook’s Social Graph is a massive graph data structure representing social interactions and consisting of nodes and connections between the nodes.
The Graph API provides the primary means of interacting with the Social Graph, and the best way to get acquainted with the Graph API is to spend a few minutes tinkering around with the Graph API Explorer.
It is important to note that the Graph API Explorer is not a special tool. Aside from being able to prepopulate and debug your access token, it is an ordinary Facebook app that uses the same developer APIs that any other developer application would use. In fact, the Graph API Explorer is handy when you have a particular OAuth token that’s associated with a specific set of authorizations for an application that you are developing and you want to run some queries as part of an exploratory development effort or debug cycle. We’ll revisit this general idea shortly as we programmatically access the Graph API.
Figures 2-1 through 2-4 illustrate a progressive series of Graph API queries that result from clicking on the plus (+) symbol and adding connections and fields. There are a few items to note about this particular query:
Access token
The access token that appears in the application is an OAuth token that is provided as a courtesy for the logged-in user; it is the same OAuth token that your application would need to access the data in question. We’ll opt to use this access token throughout this chapter, but you can consult Appendix B for a brief overview of OAuth, including details on implementing an OAuth flow for Facebook in order to retrieve an access token. As mentioned in Chapter 1, if this is your first encounter with OAuth, it’s probably sufficient at this point to know that the protocol is a social web standard that stands for Open Authorization. In short, OAuth is a means of allowing users to authorize third-party applications to access their account data without needing to share sensitive information like a password.
See Appendix B for details on implementing an OAuth 2.0 flow that you would need to build an application that requires an arbitrary user to authorize it to access account data.
Node IDs
The basis of the query is a node with an ID (identifier) of “644382747,” corresponding to a person named “Matthew A. Russell,” who is preloaded as the currently logged-in user for the Graph API Explorer. The “id” and “name” values for the node are called fields. The basis of the query could just as easily have been any other node, and as we’ll soon see, it’s very natural to “walk” or traverse the graph and query other nodes (which may be people or things such as books or TV shows).
Connection constraints
You can modify the original query with a “friends” connection, as shown in Figure 2-2, by clicking on the + and then scrolling to “friends” in the “connections” pop-up menu. The “friends” connections that appear in the console represent nodes that are connected to the original query node. At this point, you could click on any of the blue ID fields in these nodes and initiate a query with that particular node as the basis. In network science terminology, we now have what is called an ego graph, because it has an actor (or ego) as its focal point or logical center, which is connected to other nodes around it. An ego graph would resemble a hub and spokes if you were to draw it.
Likes constraints
A further modification to the original query is to add “likes” connections for each of your friends, as shown in Figure 2-3. Before you can retrieve likes connections for your friends, however, you must authorize the Graph API Explorer application to explicitly access your friends’ likes by updating the access token that it uses and then approve this access, as shown in Figure 2-4. The Graph API Explorer allows you to easily authorize it by clicking on the Get Access Token button and checking the “friends_likes” box on the Friends Data Permissions tab. In network science terminology, we still have an ego graph, but it’s potentially much more complex at this point because of the many additional nodes and connections that could exist among them.
Debugging
The Debug button can be useful for troubleshooting queries that you think should be returning data but aren’t doing so based on the authorizations associated with the access token.
JSON response format
The results of a Graph API query are returned in a convenient JSON format that can be easily manipulated and processed.
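To make the JSON format and the ego graph idea concrete, the following minimal sketch parses a response and collects the friends into a hub-and-spokes structure. The response text is hypothetical: its shape (a "friends" field wrapping a "data" list) is an assumption about what the id/name/friends query returns, and the friend names and IDs are invented for illustration.

```python
import json

# Hypothetical response text, shaped like the id/name/friends query
# described above. The friend names and IDs are invented.
sample_response = '''
{
  "id": "644382747",
  "name": "Matthew A. Russell",
  "friends": {
    "data": [
      {"id": "1001", "name": "Friend One"},
      {"id": "1002", "name": "Friend Two"}
    ]
  }
}
'''

# Convert the JSON text into ordinary Python dicts and lists
content = json.loads(sample_response)

# The ego is the node that the query started from
ego = content['name']

# Each friend is a spoke connected to the hub (the ego)
friends = [f['name'] for f in content.get('friends', {}).get('data', [])]
ego_graph = {ego: friends}

print(json.dumps(ego_graph, indent=1))
```

Because the parsed result is plain Python data, adding "likes" for each friend would simply mean one more level of nesting to walk in the same way.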
Figure 2-1. Using the Graph API Explorer application to progressively build up a query for friends’ interests: a query for a node in the Social Graph
Figure 2-2. Using the Graph API Explorer application to progressively build up a query for friends’ interests: a query for a node and connections to friends
Figure 2-3. Using the Graph API Explorer application to progressively build up a query for friends’ interests: a query for a node, connections to friends, and likes for those friends
Figure 2-4. Facebook applications must explicitly request authorization to access a user’s account data. Top: The Graph API Explorer permissions panel. Bottom: A Facebook dialog requesting authorization for the Graph API Explorer application to access friends’ likes data.
Facebook Query Language
In addition to the Graph API, Facebook Query Language (FQL) provides a fine alternative for querying Facebook’s Social Graph and has a SQL-inspired syntax that most developers find intuitive. It seems to be the case that any data you could query with the Graph API, you could also query via FQL, and although it may be true that some advanced queries that are possible with FQL may not be possible with the Graph API, it appears that Facebook’s longer-term plan is to ensure that the Graph API is at full parity with FQL. For example, some recent investments in the Graph API resulted in a number of powerful new features, such as field expansion and nesting. If you’re interested in learning more about FQL, consult the FQL Reference, and try out a query with the FQL Query console that’s available as an alternate option from the Graph API Explorer. For example, you could query the first and last names of your friends in the FQL Query tab of the Graph API Explorer with the following FQL query:
select first_name, last_name from user
where uid in ( select uid2 from friend where uid1 = me() )
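As a rough sketch, the same FQL query could be packaged into a URL for an HTTP request. The /fql endpoint with a q parameter is an assumption that is not stated in this section, so check it against the FQL Reference before relying on it. ACCESS_TOKEN is a placeholder, and the sketch only builds and prints the URL rather than sending a request.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

# Placeholder: paste the access token shown in the Graph API Explorer
ACCESS_TOKEN = ''

fql = ('select first_name, last_name from user '
       'where uid in (select uid2 from friend where uid1 = me())')

# Assumed endpoint: the FQL query is passed as the q parameter of /fql.
# urlencode escapes spaces, commas, and parentheses so that the query
# survives being embedded in a URL.
fql_url = 'https://graph.facebook.com/fql?' + urlencode(
    [('q', fql), ('access_token', ACCESS_TOKEN)])

print(fql_url)
```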
Although we’ll programmatically explore the Graph API with a Python package later in this chapter, you could opt to make Graph API queries more directly over HTTP yourself by mimicking the request that you see in the Graph API Explorer. For example, Example 2-1 uses the requests package to simplify the process of making an HTTP request (as opposed to using a much more cumbersome package from Python’s standard library, such as urllib2) for fetching your friends and their likes. You can install this package in a terminal with the predictable pip install requests command. The query is driven by the values in the fields parameter and is the same as what would be built up interactively in the Graph API Explorer. Of particular interest is that the friends.limit(10).fields(likes.limit(10)) syntax uses a relatively new feature of the Graph API called field expansion that is designed to make and parameterize multiple queries in a single API call.
Example 2-1. Making Graph API requests over HTTP
import requests # pip install requests
import json

# Placeholder: paste the access token shown in the Graph API Explorer
ACCESS_TOKEN = ''

base_url = 'https://graph.facebook.com/me'

# Get 10 likes for 10 friends
fields = 'id,name,friends.limit(10).fields(likes.limit(10))'

url = '%s?fields=%s&access_token=%s' % \
    (base_url, fields, ACCESS_TOKEN,)

# This API is HTTP-based and could be requested in the browser,
# with a command line utility like curl, or using just about
# any programming language by making a request to the URL.
# Click the hyperlink that appears in your notebook output
# when you execute this code cell to see for yourself...
print(url)

# Interpret the response as JSON and convert back
# to Python data structures
content = requests.get(url).json()

# Pretty-print the JSON and display it
print(json.dumps(content, indent=1))
If you attempt to run a query for all of your friends’ likes by setting fields = 'id,name,friends.fields(likes)', and the script appears to hang, it is probably because you have a lot of friends who have a lot of likes. If this happens, you may need to add limits and offsets to the fields in the query, as described in Facebook’s field expansion documentation. However, the facebook package that you’ll learn about later in this chapter handles some of these issues, so it’s recommended that you hold off and try it out first. This initial example is just to illustrate that Facebook’s API is built on top of HTTP. A few field limit/offset examples that illustrate the possibilities with field selectors follow:
# Get all likes for 10 friends
fields = 'id,name,friends.limit(10).fields(likes)'
# Get all likes for 10 more friends
fields = 'id,name,friends.offset(10).limit(10).fields(likes)'
# Get 10 likes for all friends
fields = 'id,name,friends.fields(likes.limit(10))'
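If you find yourself writing many variations of these selectors, a small helper can assemble them for you. The expand function below is a hypothetical convenience for illustration and is not part of any Facebook library; it only concatenates strings in the connection.offset(n).limit(n).fields(...) pattern shown in the examples above.

```python
def expand(connection, limit=None, offset=None, subfields=None):
    """Build one field-expansion term such as friends.limit(10).fields(likes)."""
    term = connection
    if offset is not None:
        term += '.offset(%d)' % offset
    if limit is not None:
        term += '.limit(%d)' % limit
    if subfields:
        term += '.fields(%s)' % ','.join(subfields)
    return term

# Get all likes for 10 more friends
fields = ','.join(['id', 'name',
                   expand('friends', offset=10, limit=10, subfields=['likes'])])
print(fields)

# Get 10 likes for all friends
fields = ','.join(['id', 'name',
                   expand('friends', subfields=[expand('likes', limit=10)])])
print(fields)
```

Nesting one expand call inside another mirrors the way field expansion itself nests, which keeps the parentheses balanced without manual bookkeeping.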
It appears as though the default limit for queries at the time of this writing is to return up to 5,000 items. It’s possible but somewhat unlikely that you’ll be making Graph API queries that could return more than 5,000 items; if you do, consult the pagination documentation for information on how to navigate through the “pages” of results.
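As a minimal sketch of what navigating pages might look like, the generator below keeps following a "next" link until none remains. It assumes that each page of results carries its items in a "data" list and the URL of the following page under paging.next; confirm those names against the pagination documentation. The fetch argument and the two in-memory pages are hypothetical stand-ins so that the sketch runs without a network connection.

```python
def iter_items(first_page, fetch):
    """Yield items from a paged response, following paging.next links.

    fetch is any callable that takes a URL and returns the parsed JSON
    for that page, e.g. lambda u: requests.get(u).json().
    """
    page = first_page
    while page:
        for item in page.get('data', []):
            yield item
        # Stop when the current page does not point to another one
        next_url = page.get('paging', {}).get('next')
        page = fetch(next_url) if next_url else None

# Hypothetical two-page result used in place of real network calls
pages = {
    'page2': {'data': [{'id': '3'}]},
}
first = {'data': [{'id': '1'}, {'id': '2'}], 'paging': {'next': 'page2'}}

ids = [item['id'] for item in iter_items(first, pages.get)]
print(ids)
```

Passing the fetch function in as an argument keeps the paging logic separate from the HTTP library, so the same loop works with requests or with canned data in a test.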