Talk:Data Store API documentation
From Facebook Developers Wiki
Questions
... Question content moved to bugzilla entry: http://bugs.developers.facebook.com/show_bug.cgi?id=1828 ... --652369465 01:58, 25 March 2008 (PDT)
It would be very nice to see some examples of using FQL with the Data Store API. The only useful query I have been able to get working is selecting objects by the reserved _id attribute, which exactly duplicates data.getObject(s) and so isn't really an enhancement. There must be some way of interacting with associations in FQL, which would really help reduce the number of API calls required (e.g. get the associated IDs in a subquery, then get the data for those IDs: one fql.query call versus two calls with getAssociatedObjects/getObjects). It seems to be a matter of simply documenting the various reserved attributes like _id, since association names can already be used in FQL queries (i.e. 'select [whatever] from app.[assocname] where [whatever]' is valid; the problem is that we don't have anything valid to put in the [whatever]s).
--736245603 23:08, 27 December 2007 (PST)
At the time you're setting a property of a hashed object you can get its fbid back from setHashValue(). However, there doesn't seem to be any other way to recover the fbid if you only know the hash (for example to look up the object in an association). So it seems like it is necessary to create a bogus property that can be set just to get the fbid back as a side effect. Is there a cleaner way to do this? - Dec 18, 2007
I create an object using setHashValue. For use with associations, I have to know its uid. Given an object's hash, how do I get its uid?
- You should be able to get it through an internal property "_id". The reason that setHashValue() returns the id directly is to avoid a round-trip.
- But how does one retrieve that internal property? Using data.getHashValue and trying to retrieve the _id property results in "Exception Thrown: FacebookRestClientException Code: 100, Message: Invalid parameter: property name: _id". Any other method seems to require having the id already (for example, making an fql query). Is there a way to, for example, construct an fql query using hash keys? --736245603 14:46, 26 December 2007 (PST)
--533160967 13:47, 25 November 2007 (PST)
- We just modified setHashValue() to directly return object's fbid. Hope this helps.
I would like users of my application to set preferences that limit their visibility to friends and other Facebook users. For a good solution, I would need FQL to intersect other users' data with their own preferences. For example, suppose A, B and C are users. User A requests, through my application, a list of users in which both B and C are candidates to appear, but depending on B's and C's preferences they might not appear.
Is there some way to do that now?
Thanks, --Brian J. Cardiff 08:03, 22 November 2007 (PST)
- Hmm, we are not exposing other users' preferences to the currently logged-on user, since this is potentially a big security problem. You will have to create an association between two users and maintain it based on user preferences. For example, say A->B means A can see B's info. When B changes the preference, remove this association. Then when A logs in, A cannot see B's info, since the association is missing. In other words, we have to build a shared table between different users rather than directly querying another user's preference.
- But there is no nice way to maintain that info, since a user could later add new friends. Is there any alternative? I think shared data between users of an application would be a nice feature. It could open up a bunch of opportunities.
--Brian J. Cardiff 12:50, 22 November 2007 (PST)
- Maybe I misunderstood the question. Object types an application created can always be shared between different users. I was only talking about User Preferences table.
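- For anyone trying the association approach suggested above, here is a minimal Python sketch of the idea. It uses an in-memory set in place of the data store, and the class and method names (VisibilityStore, allow, revoke, visible_to) are invented for illustration; they are not Data Store API calls.

```python
class VisibilityStore:
    """In-memory stand-in for a 'can see' association (hypothetical, not the real API)."""

    def __init__(self):
        # Each pair (viewer_id, owner_id) means "viewer may see owner's info".
        self._can_see = set()

    def allow(self, viewer_id, owner_id):
        self._can_see.add((viewer_id, owner_id))

    def revoke(self, viewer_id, owner_id):
        # Called when the owner changes a preference; a missing pair is ignored.
        self._can_see.discard((viewer_id, owner_id))

    def visible_to(self, viewer_id, candidates):
        # Keep only the candidates for which the association still exists.
        return [c for c in candidates if (viewer_id, c) in self._can_see]


store = VisibilityStore()
store.allow("A", "B")
store.allow("A", "C")
store.revoke("A", "C")  # C changes a preference and hides from A
print(store.visible_to("A", ["B", "C"]))
```

The point raised above still applies: the application has to update these pairs itself whenever preferences or friend lists change.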
Should I be able to query my object types with FQL? Some comments here seem to imply that. When I try it, I get: "Unknown table: mytype".
- Whoops, problem solved. I was forgetting to use the "app." prefix. So "select firstname from app.mytype..." seemed to work.
Wish List
A setAssociation implementation that takes hash value created using setHashValue instead of object id --584586303 19:19, 11 October 2007 (PDT)
- We just modified setHashValue() to directly return object's fbid. Hope this helps.
Either an updateAssociation or a setAssociation (and their plurals) that allows the _data and _time fields to be updated / touched without having to call removeAssociation and then setAssociation again. The workaround right now would be to use the BATCH RUN API, but not all client libs support this. Either way, the BATCH RUN option is still a kludge for this. --652369465 06:07, 23 February 2008 (PST)
Can we get the plural of createObject - createObjects and have it return an array of IDs that I can then use to create associations? I can't really batch it as I need the ID from the create object to create my associations. --709271019 17:01, 27 February 2008 (PST)
Access to the object types when a user is not logged in would be useful. We try and allow a user to navigate our app without having to be logged in or have it installed - It is fine with our external database server, but I can't get access to the data store for the app. So I am left with having preloaded FQL and data store access for the static data for logged in users and the external MySQL database for those not logged in. Defeated the point of me starting down the data store path --709271019 17:01, 27 February 2008 (PST)
Archived
Please post new questions under "Questions", not here. Thanks!!
What is the recommended way to take a facebook user id (e.g. the current session's user), and query that user's objects and associated objects in the data store? [one solution would be to have a 'user' object type, with a single instance per user, and store the FB Object ID of that instance in a user preference -- but there's a less convoluted way, I assume] --719255867 21:39, 24 October 2007 (PDT)
- You may just define your object type say "user_data", then use setHashValue() and getHashValue() with user id as the key.
What is the alias for Object Identifier for complex assoc_info1 in Data.defineAssociation? The wiki says:" Describes object identifier 1 in an association. This is a data structure that has:
* alias: name of object identifier 1. This alias needs to be a valid identifier, which is no longer than 32 characters, starting with a letter (a-z) and consisting of only small letters (a-z), numbers (0-9) and/or underscores.
* object_type: Optional - object type of object identifier 1.
* unique: Optional - Default to false. Whether each unique object identifier 1 can only appear once in all associations of this type. "
Is this alias the object type? I don't understand where to get it from. Also, when getting associated objects, how can we limit the number of rows returned? Is this possible?
--15720046 21:06, 31 October 2007 (PDT)
- "alias" is just a name you come up with. Currently there is no way to set a limit, since it seems to be normally used together with some criteria. We are adding "limit" and "order by" in FQL. If you feel you need a simple "limit" that cuts off the list arbitrarily, please file a feature request in our bugzilla so we can pick it up. Thanks.
- When will there be PHP coding examples, and a PHP API library?
-- David Jones
- We are working on coding examples and client libraries. Sorry this is heavily delayed.
- Client lib is coming out soon. We're doing some final testing. Expect to see it early next week. Thanks for pointing it out!
- if you're impatient and can't wait to try it out i wrote this client lib that extends the default facebook client adding the datastore methods. it should integrate seamlessly into your code, just create a new Facebook_datastore object and you'll be off and running. --Paul Wells
- What kind of database administration is available for this? Specifically for user preferences, how can i select all settings from all users in the database such that I can provide a backup? Also how can i insert multiple settings for multiple users? Is it possible to delete settings for users?
- Our current plan is to provide authenticated, user-based data access for real-time applications to consume. We will seriously consider the ideas raised in your questions, although backup is done implicitly on the Facebook side.
- Cool, thanks a lot. I mentioned backup because I'm really just looking for a way to mass export/import data (for db migration, statistics generation, etc).
-- Larry Gadea
cool! thanks for the sneak peek. it seems like this is a wrapper around SQL, which explains a lot of the design choices. still, i'm curious about a few things.
(update: i've posted a longer writeup on my site.)
- is the fbid namespace global? or per app?
- It's a GUID, a globally unique identifier. Well, it's actually a UUID, universally unique identifier, in case Mars has life and computer.
- the descriptions of the properties params for createObject and updateObject seem to imply that an object can have only a subset of its properties defined. is that true?
- The question is not clear to me. Please elaborate.
- sure. createObject's properties parameter is optional. if it's not provided, i assume that means the object is created without any of its properties set. from that point, you can set some or all of its properties with updateObject, setObjectProperty, etc. so, each of an object's properties can be set or unset?
- Yes.
- in most places - setObjectProperty, set/getHashValue, etc. - properties are sent and returned as strings. i take it that means we have to serialize int properties manually. :(
- We were facing a decision between a strongly typed model vs. a weakly typed model in handling return values.
- makes sense. sounds like you went for the weakly typed model? so we will have to serialize integer properties ourselves?
- Yes and no, depending on who "we" are, because theoretically this can be wrapped by a client library. There are also languages like JavaScript that don't really care (at least not much) what data type an object's property is.
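- To illustrate the "wrapped by a client library" point, here is a small Python sketch of how a client could hide the string-only properties behind typed values. The schema dictionary and both helper functions are hypothetical and are not part of any official client library.

```python
def encode_properties(props):
    """Turn every value into a string before it is sent to the store."""
    return {name: str(value) for name, value in props.items()}


def decode_properties(raw, schema):
    """Convert stored strings back, using a name -> type map kept by the client."""
    return {name: schema.get(name, str)(value) for name, value in raw.items()}


# Hypothetical client-side schema; the store itself only ever sees strings.
SCHEMA = {"score": int, "nickname": str}

stored = encode_properties({"score": 42, "nickname": "bob"})
restored = decode_properties(stored, SCHEMA)
print(stored)
print(restored)
```

The schema lives entirely on the client here, which is one possible design; nothing in this thread says how an official library would handle it.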
- it's nice to have incHashValue, but it'd be even better to have more general transaction capabilities. any plans for that?
- Not in consideration currently, due to distributed nature of data objects.
- general-purpose queries based on property values are an important feature in most datastores, but they're conspicuously absent here. any plans to add them?
- In fact, you can, through FQL. In FQL, any property can appear in the WHERE clause.
- i'm unclear on the hash value that can be used to look up objects. is it just a string encoding of the fbid for use in e.g. URLs?
- It's an arbitrary string that the application defines. So, yes, it can be a URL, if an application elects to do so.
- associations are great! they seem like a wrapper around foreign keys, but they're set per-object instead of per-object-type, which is a really interesting difference from standard SQL foreign keys. i can definitely see how they were inspired by social networking use cases.
it is interesting that they're handled separately from properties, though, and that i have to set them on individual objects. it'd be nice to optionally be able to apply them to all objects of a given type, automatically.
- on a related note, what happens if i create an invalid association, e.g. one that violates a uniqueness constraint? the setAssociation(s) calls don't have return values, and there aren't any association-specific error codes.
- It returns an invalid operation error, although we may consider special-casing the error code.
- are the association counting methods efficient? ie, are they roughly constant time? or linear? i can see how they'd be very useful for large associations, but if they're no faster than calling getAssociatedObjects() and counting the results, that makes them less useful.
- Constant time. This should be documented. There is one more difference from calling getAssociatedObjects() and then counting: we have an upper limit on how many associated object ids are returned. This is missing from the documentation, too.
- what are the practical limits on scaling? how many objects could an app realistically store? how many object types? properties per object type? associations? etc.
- The Facebook Data Store API is designed to be scalable, although we will impose resource limits to ensure fair use between different applications. Quotas will be documented when we have them.
thanks! and of course, thanks for a fun platform.
- What will the length limit be for object properties (assumed to be strings)?
- 255
- Will there be FBML-level access to some of these preferences?
- Possibly.
- Is there any way to retrieve sorted and/or paginated and/or filtered subsets of objects from the store? I see you can query for a list of objects by IDs, but how about "give me all objects with property Foo greater than 100, and order them by property Bar, but only give me the first 20 of them" (i.e. SELECT * FROM mytbl WHERE Foo>100 ORDER BY Bar LIMIT 20). Is this supported or not an intended use?
- The problem with that query is that it is not scalable. When we have just a few objects, it is trivial to run a query like that. As an application grows, the object count may reach millions or even billions, a typical pattern with platform applications. We want to avoid people writing queries like that against distributed databases. Not only does it take a lot of resources on our side, it also makes applications slow. Therefore, we would normally ask our developers to think twice before forming such a query. We do see the need to support this in a sandboxed way, so we may come up with special table types that are guaranteed to hold a small number of objects, allowing useful queries like this. For now, you may approach the problem by keeping a separate list of the objects that meet your criteria, updated at object creation and modification time. For example, I can create a dedicated table to store the top 20 of all object ids that have Foo>100. Then the query would become SELECT * FROM top20_of_mytbl. This way, we add a penalty on the data-writing side, which is normally okay, while getting a tremendous boost on the data-reading side.
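- A rough Python sketch of the "maintain the list at write time" workaround described above, for the example query (Foo > 100, ordered by Bar, first N). Everything here is an in-memory illustration; PrecomputedList and on_write are invented names, and a real application would store the list in its own table instead.

```python
import bisect


class PrecomputedList:
    """Keeps the first `limit` qualifying object ids, ordered by Bar (hypothetical)."""

    def __init__(self, threshold=100, limit=20):
        self.threshold = threshold
        self.limit = limit
        self._entries = []  # sorted list of (bar, object_id)

    def on_write(self, object_id, foo, bar):
        # Drop any stale entry for this object, then re-add it if it qualifies.
        self._entries = [e for e in self._entries if e[1] != object_id]
        if foo > self.threshold:
            bisect.insort(self._entries, (bar, object_id))
            del self._entries[self.limit:]  # keep only the first `limit` entries

    def read(self):
        # The read path is a plain lookup: no filtering or sorting needed.
        return [object_id for _, object_id in self._entries]


top = PrecomputedList(threshold=100, limit=2)
top.on_write(1, foo=150, bar=30)
top.on_write(2, foo=50, bar=10)   # does not qualify (Foo <= 100)
top.on_write(3, foo=200, bar=5)
top.on_write(4, foo=101, bar=20)  # pushes object 1 out of the top 2
print(top.read())
```

One caveat of this simple version: once an object has been pushed out of the list it is forgotten, so if a listed object later stops qualifying, the list can end up shorter than the limit until it is rebuilt.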
- It is not clear from the documentation how the value for map parameter needs to be specified for createObject method. In fact, while trying different combinations via API test console, I have almost reached the limit for use of console. It'll be great if somebody could clarify on it.
- The test console needs to be updated with complex data structure support, a bug we need to fix. But createObject() simply takes name-value pairs, i.e. a map of string to string.
- It'll help if somebody can provide an example of using name-value pair, map and list until API documentation itself is updated to be more comprehensive. I noticed these are the new types other than complex type introduced with new Data API.
- Apologies for not being clear. The Test Console is updated and we changed to use JSON for all complex types for easier encoding. Please read the documentation index page; we put a notice up top with input examples. We will update the PHP coding examples and individual documentation pages to make this clear as well, once we have the PHP client library ready.
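- Until the official examples are updated, here is a small Python sketch of JSON-encoding a map and a list. It only shows the encoding step; the property names and ids are made up, and the exact shape each API parameter expects should be checked against the notice on the documentation index page.

```python
import json

# A map of string to string, e.g. properties for a new object (names are invented).
properties = {"firstname": "Ann", "city": "Palo Alto"}

# A list, e.g. several object ids (values are invented).
object_ids = [101, 102, 103]

encoded_map = json.dumps(properties)
encoded_list = json.dumps(object_ids)

print(encoded_map)
print(encoded_list)
```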
--584586303 07:53, 23 September 2007 (PDT)
Probably I'm just being dense... But how do we determine or define the hash keys for use with data.getHashValue, etc? Everything seems to assume that some hash mapping between hash keys and object IDs is already in place, but I can't see where it gets set up. Thanks.
- There is no need to set up a mapping. When setHashValue() is called, an internal object ID is created, if the hash key is a new one. An internal mapping is created. From then on, you may either access the object by the hash key you came up with, or the object id that's returned from getObject().
- I'm unsure of what you mean by this. Could you perhaps give an example of a workflow using the API commands? --Jason Reich 15:25, 25 September 2007 (PDT)
- I (OP) also don't understand what this means. Are you suggesting that accessing objects by hash key and accessing objects by object ID are actually two totally separate and distinct sets of objects with no overlap? I suppose this would make your answer make sense, but it's certainly not implied by the API docs or even particularly logical.
- Maybe I didn't understand the question. A hash key is also decided by the client. If you perceive your table as a hash table, you will use hash keys to access data. Then you would never need to use object ids at all. If you perceive your table as a normal database table, you will use object ids to access data. Then you would never need to use hash keys. If you want to use hash keys AND object ids, you could, although I personally don't see why you would.
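- Since a workflow example was requested, here is an in-memory Python model of the behaviour described in the replies above: the first write under a new hash key creates an object id and an internal mapping, and afterwards the same object can be reached by either. The class is a hypothetical illustration, not the real client; returning the id from the set call follows the setHashValue() change noted elsewhere on this page.

```python
import itertools


class HashedTable:
    """Toy model of one object type accessed by hash key or object id."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._id_by_key = {}  # hash key -> object id (the internal mapping)
        self._objects = {}    # object id -> {property: value}

    def set_hash_value(self, key, prop, value):
        object_id = self._id_by_key.get(key)
        if object_id is None:
            # New hash key: create the object and remember the mapping.
            object_id = next(self._next_id)
            self._id_by_key[key] = object_id
            self._objects[object_id] = {}
        self._objects[object_id][prop] = value
        return object_id

    def get_hash_value(self, key, prop):
        return self._objects[self._id_by_key[key]][prop]

    def get_object(self, object_id):
        return dict(self._objects[object_id])


table = HashedTable()
oid = table.set_hash_value("user:1234", "color", "blue")
same = table.set_hash_value("user:1234", "size", "large")
print(oid == same)
print(table.get_hash_value("user:1234", "color"))
print(table.get_object(oid))
```

In this model the hash key and the object id are two handles on the same object, not two separate sets of objects.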
How do you format this 'complex' data type use in data.defineAssociation? Also, is there any way to make columns form an index so that they can be used to make a query indexable.
- Ahh, we actually have a bug with complex data types. NOTE: We found a bug with our complex data structure handling, and currently they may not work as expected. We will post an update when this is fixed. Apologies for any inconvenience.
- The bug is fixed now. We changed to use JSON for easier encoding. Apologies for not being clear on this.
- As to indexing columns, it's only meaningful for a traditional database. With a distributed one, unless the data are partitioned, there is no real performance gain from having an index, because the vast majority of query time is spent connecting to the right database to find your data. Therefore, we may add indexes on columns in the future, but most likely NOT on distributed tables. Object associations happen to solve a subset of the data indexing problems of distributed databases. By setting up associations, some index types are supported by querying the right associated objects. For example, if my query is like "ZIPCODE = 12345", we can perceive an association between 12345 and my objects. Then finding objects matching the query is just finding the objects that are associated with 12345.
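- The ZIPCODE example above can be sketched in a few lines of Python. This models the association as an in-memory map from a zip code to a set of object ids; ZipIndex and its methods are invented names and do not correspond to Data Store API calls.

```python
from collections import defaultdict


class ZipIndex:
    """Association from a zip code to the objects that carry it (hypothetical)."""

    def __init__(self):
        self._by_zip = defaultdict(set)

    def associate(self, zipcode, object_id):
        # Done at write time, whenever an object's zip code is set.
        self._by_zip[zipcode].add(object_id)

    def dissociate(self, zipcode, object_id):
        # Done when an object's zip code changes or the object is deleted.
        self._by_zip[zipcode].discard(object_id)

    def find(self, zipcode):
        # Equivalent of "WHERE ZIPCODE = ...": read the associated ids directly.
        return sorted(self._by_zip.get(zipcode, ()))


index = ZipIndex()
index.associate("12345", 7)
index.associate("12345", 3)
index.associate("94301", 9)
print(index.find("12345"))
print(index.find("00000"))
```

This covers equality lookups only; range conditions such as "ZIPCODE > 12345" would need a different arrangement.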
--Jason Reich 15:25, 25 September 2007 (PDT)
General question: This new data store API is a neat concept, but does it make sense to use it right now practically? So it offloads data storage to FB servers (at the cost of development complexity), but apps still need to process each request on their own servers. And now those formerly 50ms SQL calls will turn into 500ms API calls, which means the CGI threads on the web server are tied up for that much longer (possibly orders of magnitude), which means more memory is required, which means each server can possibly handle even LESS traffic than before. And we're not in control of our data any more either. Given these issues, what is the upside to using this API at the present time?
- These are very good questions. We are at an early stage of deploying data storage solutions to our developers, so the advantages of using it are not obvious to people yet. At the very minimum, we want to provide a scalable solution that not all of our platform developers can afford at a very early stage of application development. What we don't want to see is an application written to run on one or more of its own local servers that, once it becomes popular, responds slowly due to unexpected load. In the long run, we will integrate this service into the application platform more deeply, to solve the problems you pointed out (500ms vs. 50ms). In any case, you have full control of your data, and there is no way we would share your data with anyone else without your permission.
- It will be great to have a setAssociation implementation that takes hash value created using setHashValue instead of object id. Is there a plan to add something like that?
- This is an interesting idea. We will consider this. The only concern right now is that it might make the interface a bit messier.
- It seems from an earlier response that the hash key is implemented as an additional index into the object. If that's the case, then this will need a parallel set of interfaces for the "Association Data Access API" with "Hash Key (string object identifier)" instead of "Object Id" as input(s). Or maybe it just seems that simple because I don't know enough about the underlying implementation. In spite of all complaints etc., thanks for the wonderful platform.
- You are right. It does make a lot of sense to use hash keys for associations. We are facing a choice of either coming up with a parallel set of interfaces, which is messy in terms of larger API surface, or augmenting existing association data structure with an additional "hash key" field, which may be confusing to people who don't use it. Since right now one can always grab object ids even from a hashed object, we are waiting for more input from other people to find out whether this is something most people want.
- I think if you don't introduce hash key based associations, most people might end up not using associations. The reason is that in order to use associations, the developer will have to store mapping from object ids to actual value locally. In that case, it might not add much value. In fact, I think it makes more sense to support associations using hash key rather than object id. Although that could be because I am missing something here.
- Actually, I think one very useful use case would be an association between user id and a string hash key.
--584586303 17:13, 29 September 2007 (PDT)
I'm trying to use the data API, and when I try to use data.defineAssociation, I get "invalid parameter: alias1". But there is no alias1 anywhere. Has anyone successfully used the data API like this? What am I doing wrong? Here's my code (DATA_TWOASYM is a constant I've defined as 3):
$facebook->api_client->data_defineAssociation("listitems", DATA_TWOASYM, array('alias' => "list"), array('alias' => "item"), "container");
I've also tried this:
$facebook->api_client->data_defineAssociation("listitems", DATA_TWOASYM, array('alias1' => "list"), array('alias2' => "item"), "container");
But I get the same thing. --42904324 17:16, 12 October 2007 (PDT)
- It seems to me it's related to complex data encoding, last minute change that might be buggy. We are looking at it right now.
- Thanks! I've filed this as bug 365 so we can keep track of it better. --42904324 05:02, 15 October 2007 (PDT)
- This should be fixed now. Thanks for reporting this!
Good in theory, but not in practice
While I appreciate the idea of trying to lower the barrier of entry for new developers by allowing them to have their database schema defined through the Facebook platform API and hosted on Facebook servers, I don't see how the current approach will ever really be viable/useful. By trying to provide a sort of "poor man's RDBMS" that can be used in a flexible/generic way, you've created an API that is extremely clunky and obtuse. The problem is that any developer clever enough to understand it and make use of it effectively is also going to be clever enough to set up their own RDBMS, if they don't already have one anyways (which they probably already do), and also clever enough to grasp that there are quite a large number of benefits inherent in doing so (from easier setup, to not having to deal with network latencies whenever making a request, to *far* greater flexibility in terms of implementing queries to retrieve the data, to the various built-in SQL functions that the Data API does not provide analogs for, and so on).
Another rather severe limitation is that the Facebook client (at least at present, as far as I can tell) has no way to remember the database schema created by the developer, which can make for a maintenance nightmare over time. Ideally the developer is able to define custom object types to correspond to the table entries being used, but if using the Data API, there's no way to guarantee that those custom objects will correctly jibe with the schema that is hosted on Facebook's end, apart from just hoping that everything has been done correctly (which is something that you can rely on sometimes, but only if you're *really* good).
The one exception is the user-preference API. I think this subset of the Data API is clean and straightforward enough that it could actually be useful to developers. It provides simple functionality that is probably generally useful to most apps, and doesn't try to overreach itself. I think a better approach for the Data API might be to just extend this subset (first by extending the namespace used for the pref_id to allow for much more than just 200 things, and also preferably by making it a string instead of a number), and add similar functionality to cover other areas (for example, there could be the existing API that allows for per-user named properties, and also an API that allows for per-application named properties, etc.). Then, developers who want to build simple apps without having their own application server can do so, using the 'Properties' API to store their application data. It doesn't support the kind of schema that the current Data API does, but at the same time it's much more straightforward to use. And besides, once an application is complex enough that it needs the more advanced kind of schema features supported by the current API, it's complex enough that it should really be using a locally hosted RDBMS anyways, and not one that is hosted on Facebook's servers.
If Facebook really intends for application developers to host complex database schemas on their servers, then I suggest just providing full-fledged application hosting, with shell account access and the whole nine yards. That way developers can just log in, define their schema the normal way, implement their application (complete with custom queries and ORM entities and everything else), and have the entire thing hosted and running on Facebook, without having to worry about the costs of setting up an application server, or the costs of having to scale things up should the application ever become popular. I think that would solve the stated problem of reducing entry costs and scalability issues for developers better than the current version of the Data API might, as it would provide better performance in general (the code and database would live on the same machine, and not only that, but presumably the API server would just be a short network hop away, providing a benefit to all other parts of the application that involve Facebook API calls as well), be simpler for people to use/configure, and also solve the problem that even with the Data API, a developer still needs an application server to host the app itself, and they could still run into scalability issues if their app became popular (the database isn't the only source of scalability problems).
Java Client Methods for Data Store
I am trying to access the Data Store methods from a Java client via the Facebook Java Client Library. Because the Pair class in that library has package-level access only, I can't extend the FacebookXmlRestClient in my own application package to use the callMethod(IFacebookMethod method, Pair<String, CharSequence>... paramPairs) for accessing the Data Store. When will a new Java library with Data Store access methods appear?
- Please file this as a feature request, then we may sort this out to have someone working on it. Thanks.
C# / .NET Client API for Facebook Data Store
I've just released a (basic, but better than nothing) wrapper for the API. The source code is up on Codeplex, and you can download it at the following URL:
http://www.codeplex.com/facebookdatastoreapi/
Feel free to leave feedback, feature requests and issues/bugs.
Cheers,
joel
