File System/List/TEP

From WebOS

These TEPs are very experimental, and are to be based on the research at File System/List.

They may be implemented in code to see how practical they are.

Stephen Paul Weber

Notes that lead to TEP

Box.net's XML nests files inside folders. That structure cannot represent flat tagging, whereas a flat-tagging format can still represent nested folders.

Box.net returns the following information: id, file_name, keyword, shared, size, created, updated, tags (folders have id, name, shared) [1]

del.icio.us public XML API is RSS 1.0. Dublin Core is used for much of the metadata (as is normal in RSS 1.0). Data provided: title, link, description, creator, date (created), tags as text, tags as taxo:topics. Tags are never returned as items. [2]

YouOS returns both files and directories, filtered by directory or tag. OPML-like XML structure. Attributes on file or directory elements: mimetype, isdir (True or False), tags (comma-separated), filename, last_updated_date, path, size. [3] [4]

Omnidrive returns both files and directories. Directory elements have this data: name (attribute, end of path), fullname (path), created, modified, accesstype (permissions?), live (is a 'live folder'), publicurl. File elements have this data: name (attribute, end of path), fullname (path), size, created, modified, filetype, fileextension, accesstype (permissions?), live, publicurl [5]

Google Docs provides partial data in an RSS 2.0 feed. Files only. Data provided: title, link, description, guid, pubDate [6]

Zoho provides documents only in XML. Data provided: document_name, document_id, version, document_name_url, author_name, created_date, document_trashed, lastmodifiedby, shared_users, document_locked, document_blogged [7]

Common set of fields:

  • ID (in all but del.icio.us)
  • name/title
  • created date

Useful among full storage services:

  • type
  • permissions
  • updated date
  • size?

Other:

  • some way to tell if an item is file or folder

Observation: many other Web 2.0 (/webOS?) services use RSS or Atom for full or partial listings, for example Flickr (and relatives), YouTube (and relatives), Ning, Google's GData, and Craigslist.

Main Proposal

URL format GET : <endpoint>/<directory_or_tag_id>

A directory ID may be any string; in essence, it can be any URL. Hierarchy MAY be represented by /. Query (GET) strings are not allowed. This works because a parser receives the full ID of every sub-element in a hierarchy and can request any of them directly, without relying on a particular URL structure.

GET : <endpoint> MUST work. It MUST return a listing of all items in root (be that files, folders, files and folders, virtual folders of usernames, or just a blank list)

Redirections (HTTP Location headers) MUST be followed.
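The URL rules above can be sketched in Python. This is a minimal illustration, not part of the proposal: the function name build_list_url is hypothetical, and percent-encoding the ID while leaving / unescaped is an assumption about how a client would keep hierarchy intact.

```python
from urllib.parse import quote


def build_list_url(endpoint: str, item_id: str = "") -> str:
    """Join a list endpoint and a directory or tag ID into a GET URL.

    Hypothetical helper: the TEP only fixes the <endpoint>/<id> shape.
    """
    if "?" in item_id:
        # The proposal forbids query (GET) strings in IDs.
        raise ValueError("query strings are not allowed in a directory or tag ID")
    if not item_id:
        # The bare endpoint MUST work and lists the root.
        return endpoint
    # Assumption: keep "/" unescaped so a hierarchical ID stays hierarchical.
    return endpoint.rstrip("/") + "/" + quote(item_id.lstrip("/"), safe="/")
```

A client fetching the resulting URL with urllib.request.urlopen gets redirect handling for free, since that function follows HTTP redirects by default.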

It is easy enough to make certain tag names immaterial to a parser. The root tag can be named anything. The root element has no required children besides items. Items can be either files or folders. Item tag names can also be anything. If the root tag and item tags are chosen properly, this XML structure is understandable to many RSS 1.0 parsers.

Dublin Core has created a standard metadata-in-XML representation. It is unnecessary to duplicate work.

RSS 1.0 and 2.0 are both used for data listings in many places. As date-based feed formats they are inappropriate for this API, but compatibility should be maintained, if possible, for reading them.

Fields that items MUST have (with the dc: prefix when using Dublin Core): dc:identifier, title, dc:created

Fields that items MAY have: mime, permissions (UNIX-style number, e.g., 644), dc:modified, size

mime defaults to inode/file if empty. Tags/directories are inode/directory

Permissions: the owner is the authenticated user (if there is one), the group is everyone invited to share the item (if the system supports sharing), and other is everyone else. If permissions are not specified, they are inherited from the parent directory. If nothing up the chain specifies them (or the item is the root), they default to 600 (owner may read and write; no access for anyone else).
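The inheritance rule can be sketched as a walk from the item up to the root. This is an illustrative sketch only: the function names and the representation of the chain as a list (item first, root last, None where nothing is specified) are assumptions, not part of the TEP.

```python
from typing import Optional, Sequence

DEFAULT_MODE = 0o600  # owner read and write, nothing for group or other


def effective_permissions(chain: Sequence[Optional[str]]) -> int:
    """Return the mode of an item given its permissions chain.

    `chain` holds the permissions field of the item first, then of each
    parent directory up to the root; None or "" means "not specified".
    """
    for value in chain:
        if value:
            # UNIX-style numbers such as "644" are octal.
            return int(value, 8)
    return DEFAULT_MODE


def can_read(mode: int, who: str) -> bool:
    """Check the read bit for 'owner', 'group', or 'other'."""
    shift = {"owner": 6, "group": 3, "other": 0}[who]
    return bool((mode >> shift) & 0o4)
```

For example, an item with no permissions of its own inside a directory marked 644 resolves to 644, so other can read it; with nothing specified anywhere, only the owner can.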

Any Dublin Core data MAY be present.

Dublin Core data MAY be mapped onto other fields (dc:date to dc:created, dc:title to title, etc.)

For cross-compatibility, a link item MAY be present. If it is, everything after the last / SHOULD be the dc:identifier and before it MAY be used as the File System/Download endpoint (if such is not known otherwise). The link item MUST be a direct download URL for the item.
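As a rough sketch of the link rule, a parser could split a link at its last /. The function name split_link is hypothetical. Note that under this rule the recovered identifier can never itself contain a /, so it is only a fallback for when dc:identifier is absent.

```python
from typing import Tuple


def split_link(link: str) -> Tuple[str, str]:
    """Split a link into (candidate download endpoint, identifier).

    Everything after the last "/" is taken as the identifier; everything
    before it may serve as the File System/Download endpoint.
    """
    endpoint, _, identifier = link.rpartition("/")
    return endpoint, identifier
```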

For cross-compatibility, any node that does not contain dc:identifier, or something that implies it, SHOULD be ignored.

For cross-compatibility, if there is only one child node to the root node, and that node contains grand-children, then that node SHOULD be treated as the root node (think RSS 2.0).

For cross-compatibility, pubDate MAY be treated as dc:created.

For cross-compatibility, enclosure MAY be treated as link.

RSS 1.0 and 2.0 should mostly work as-is with this spec, as merely messy implementations of it. Translators for Atom, hAtom, and other feed formats may be beneficial.
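The cross-compatibility rules above can be combined into a small parser. The following is a sketch under stated assumptions, not a reference implementation: parse_listing and field_name are hypothetical names; "contains grand-children" is read as "the single child's own children have children" (so a listing with exactly one item is not unwrapped by mistake); "something that implies" an identifier is read as a link whose last path segment can stand in for it; and an enclosure is assumed to carry its URL in a url attribute, as in RSS 2.0.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"


def field_name(tag: str) -> str:
    """Map an ElementTree tag to 'dc:name' for Dublin Core, else its local name."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return "dc:" + local if namespace == DC_NS else local
    return tag


def parse_listing(xml_text: str) -> list:
    """Return a list of item dicts from a TEP, RSS 1.0, or RSS 2.0 listing."""
    root = ET.fromstring(xml_text)
    children = list(root)
    # A lone child whose children have children is treated as the real root
    # (think <rss><channel>...</channel></rss>).
    if len(children) == 1 and any(len(grandchild) for grandchild in children[0]):
        root = children[0]

    items = []
    for node in root:
        fields = {}
        for child in node:
            name = field_name(child.tag)
            value = (child.text or "").strip()
            if name == "enclosure":
                # enclosure MAY be treated as link (URL assumed in "url" attribute).
                name, value = "link", child.get("url", value)
            fields.setdefault(name, value)

        identifier = fields.get("dc:identifier")
        if not identifier and fields.get("link"):
            # Fall back to the part of the link after the last "/".
            identifier = fields["link"].rpartition("/")[2]
        if not identifier:
            continue  # nodes without an identifier SHOULD be ignored

        items.append({
            "identifier": identifier,
            "title": fields.get("title") or fields.get("dc:title"),
            "created": (fields.get("dc:created") or fields.get("dc:date")
                        or fields.get("pubDate")),
            "link": fields.get("link"),
            "mime": fields.get("mime") or "inode/file",
        })
    return items
```

Channel-level elements in an RSS 2.0 feed (title, description, and so on) have no children and therefore no identifier, so the "ignore" rule drops them without special handling.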

Example Data

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
   <item>
      <dc:identifier>/dir/filename.ext</dc:identifier>
      <title>filename.ext</title>
      <dc:created>DATE OF CREATION</dc:created>
   </item>
</rdf:RDF>

Database

Database records may be listed using this TEP. Any attributes not defined in the TEP itself SHOULD be stored by the parser for use by the application. Database fields may then be encoded at item level. To prevent namespace conflicts, a db: namespace should be used, like so:

<list xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rec xmlns:db="http://webos.singpolyma.net/File_System/List/TEP#Database">
     <dc:identifier>id</dc:identifier>
     <title>title</title>
     <dc:created>date</dc:created>
     ...snip...
     <db:field1>data</db:field1>
     <db:field2>data</db:field2>
     ...snip...
  </rec>
</list>

No third-level node (ie, child of item node) may contain XML. More complex data should be encoded in an XML-safe string.
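A parser might collect the db: fields of one record as follows. This is a sketch: the function name database_fields is hypothetical, and raising an error when a field contains nested XML is one possible way to enforce the rule above (a lenient parser could skip the field instead).

```python
import xml.etree.ElementTree as ET

DB_NS = "http://webos.singpolyma.net/File_System/List/TEP#Database"


def database_fields(record: ET.Element) -> dict:
    """Return {field name: text} for every db: child of a record element."""
    prefix = "{" + DB_NS + "}"
    fields = {}
    for child in record:
        if not child.tag.startswith(prefix):
            continue  # not a database field
        if len(child):
            # Third-level nodes may not contain XML.
            raise ValueError("field %r contains nested XML" % child.tag)
        fields[child.tag[len(prefix):]] = child.text or ""
    return fields
```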

'Virtual folders' whose names are query-type strings (up to, and including, SQL SELECT statements and the like) are quite legal, and give a full database-query usability to this. There is nothing 'standard' about such a practice at this point. All folders/tags are assumed to be only themselves, but may be nested, unioned, intersected, etc without problem. At least a basic query TEP may eventually be added as an extension to this.

If content is not in the appropriate element, it may always be accessed at the File System/Download endpoint or the link element endpoint.

Extensions

Adding a file-extension-like ending to the URL MAY result in data transformation. For example:

http://example.com/list/directory/ << returns XML as above
http://example.com/list/directory.json << returns JSON, supporting ?callback=

If there is a directory named directory.json, use directory.json.json

If there is both a directory named directory and one named directory.json, a request for directory.json is ambiguous and this scheme fails.
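The extension rules can be sketched from the server's point of view. Everything named here is an assumption for illustration: the function resolve_request, the set of known directory names passed in, "xml" as the label for the untransformed default, and reporting the ambiguous case as an error.

```python
from typing import Collection, Tuple


def resolve_request(path: str, directories: Collection[str],
                    formats: Tuple[str, ...] = ("json",)) -> Tuple[str, str]:
    """Map a requested path to (directory name, output format)."""
    path = path.rstrip("/")
    stem, dot, ext = path.rpartition(".")
    is_plain = path in directories
    is_transformed = bool(dot) and ext in formats and stem in directories
    if is_plain and is_transformed:
        # e.g. both "directory" and "directory.json" exist
        raise ValueError("ambiguous request: %r" % path)
    if is_transformed:
        return stem, ext
    return path, "xml"
```

With only a directory named directory.json present, a request for directory.json returns that directory as XML, and directory.json.json returns it as JSON, matching the rule above.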

There is another suggestion at XRDS

-- singpolyma 20:28, 24 March 2007 (PDT)

Stephen Paul Weber - 2

Dublin Core TEP

All appropriate data can be encoded using the Dublin Core standard, which could be used directly as an XML format. See my first TEP above.

-- singpolyma 20:28, 24 March 2007 (PDT)