Web File-Manager

A proposal of a web application

A web file-manager is proposed, mostly in the form of a JavaScript demo-application. A simple naming structure is designed, utilizing the concept of web symbolic links. The demo-application focuses on the request-response logic of naming structure manipulation. Six types of operation are presented: the usual F5–F8 copy, move, mkdir and delete operations and the special, link-related, mklink and edlink.

Operation logic is described via characteristic cases which are organized in specially tailored tables.

The demo-application can be considered as part of the design of a real web application.

Online demo

The demo application is best accessed via project tables, see Projects. Button-like links are provided below.
Author
Ondřej Pavlata
Jablonec nad Nisou
Czech Republic
Document date
Initial release: July 15, 2011
Last major release: July 15, 2011
Last update: July 15, 2011
Warning
  1. This document has been created without any prepublication review other than the author's own.
  2. The author is a habitual user of Total Commander [].


Introduction

Advocacy of hierarchical filesystems
Over the last 30 years, the hierarchical filesystem has been the most comprehensible and the most widespread, perhaps even the most used, model of data storage. The model can be understood as an abstraction of spatial containment. As in the physical world, a directory might mean either of the following:
  1. a container,
  2. a container plus its contents.
Note that this ambiguity also applies to the semantics of words like box [], or pack [] (presumably, most of the human languages contain ambiguities of this sort).

Operational semantics follows the physical world conventions:

Note: The Linux Virtual Filesystem contains 3 features that might break the above rules: multiple hardlinks, bind-mounts and symbolic links. However, there is an underlying low-level structure which is not affected by the latter two features, so that only multiple hardlinks can be regarded as hierarchy defects. We consider such defects to be of low importance: they can be avoided by simply not using multiple hardlinks.

Limitations of hierarchical filesystems have been increasingly recognized, in particular the single classification constraint: users are forced to choose just one of possibly many classifications for the filed data. The problem is illustrated in the following example.

Example: Consider a hierarchy, part of which looks like A or B.

A: 1-dimensional structure (sibling directories)

  agency-x       Things related to agency X
  causa-rabbit   Things related to the Rabbit causa
  course-en      Things related to my English course
  my-work        Things written by myself
  info           Information (web) resources

B: 2-dimensional structure (Source → across, Topic ↓ down)

  Topic ↓ / Source →   my-work      info
  agency-x             report-12
  causa-rabbit
  course-en

Suppose that a report has to be worked out as part of a collaboration with agency X. Suppose further that this is a complex report that does not fit well into a single filesystem object: it is rather a directory, say report-12, containing multiple files. In case A, the problem is that both agency-x and my-work are candidates for where report-12 should be put. In case B, the problem is that an artificial order must be imposed on the multiple criteria (here Source and Topic), so that the report is stored either in agency-x/my-work or in my-work/agency-x.

Numerous solutions have been proposed to support multiple classification within file systems, giving rise to so-called semantic file systems. Two main approaches can be distinguished [], []:

Experiences in the field of semantic file systems show that the idea of bypassing hierarchical filesystems is not very reasonable. Here is a quotation from the conclusion made by Yoann Padioleau, one of the researchers behind the Logic File System []:
Trying to replace hierarchical filesystems by a Logic File System was a noble research idea, but a bad idea. Using LFS as an additional filesystem, an additional way to access your actual hierarchical filesystem is far better; less ambitious, but better.

We can draw our conclusion as follows:

Conclusion: Despite the limitations, there are reasons to assume that hierarchical filesystems will remain in widespread use over the next decades.

Filesystem of the Web
There are 3 main alternatives for what can be understood by the term hierarchy:
We can write (A) ⊂ (B) ⊂ (C), meaning that (A) is the strictest sense.

If we use the (B) interpretation we can arguably claim that a significant portion of the World Wide Web is constituted of (or is expressible as) a hierarchical filesystem. A global filesystem – just like the Web itself is global. The Web can be considered as a huge storage medium consisting of web servers. Most servers host hierarchical filesystems or at least contain data that can be meaningfully expressed (and manipulated) as hierarchical filesystems.

To discover how the concept of the Web as a filesystem is currently supported we pose the following questions:

  1. Which standards (documents, research papers, concepts) describe the filesystem view of the Web?
  2. Which web application provides global filesystem access to the Web? (Give the URL of such an application.)

As of 2011, the answers to these questions seem to be: 1. None, 2. None.

Note: The WebDAV method-set [] declares itself as a Network File System for the Internet []. However, WebDAV only supports communication to/from a single web server. It does not provide a unified view of the Web as a global filesystem. Speaking in Linux/Unix/POSIX terms, WebDAV is just a low-level filesystem. Here, however, we are concerned with the high-level filesystem, with something that can be considered the Virtual File System of the Web.

Main goals
This proposal aims at providing a filesystem view of the Web. It is considered natural that such a view is provided by a web application – a Web File Manager accessible via a web browser. Main goals for the file manager can be summarized as follows.

 

Data model

Data is categorized into layers and spheres. This yields a 2D partition as follows.
  Layers ↓ / Spheres →        Session data (local)    World-Wide-Web
  Account layer (accounts)
  Base layer (base nodes)
The base layer
The base layer is a huge web-wide forest of base nodes (or just nodes). Thus, it can be written as a structure (N, ), where
Three basic types of base nodes are recognized:
As usual, non-directory nodes are not allowed to have descendants (i.e. they have to be among the leaves of the forest).
The account layer
The account layer is a web-wide set of hosts and accounts, denoted H and A, respectively. There is the usual containment relation between hosts and accounts: each account is contained in exactly one host. A forest structure (A, ) might be assumed, with per-host components, but in comparison with the base forest, the descendancy of accounts is supposed to be quite shallow and less important.

Accounts are partially mapped to directories of the base forest via the target-directory function .t : A → N, which is a partial function: there can be accounts with the target directory undefined.

There is an initial account a for which .t is always defined.

Most accounts have a connection state: they can be connected or disconnected.

Contexted nodes
We introduce the set Ѧ of (account-)contexted nodes by i.e. a contexted node (a, n) is a base node n together with the context of an account a such that n belongs to the tree rooted at a's target directory. Obviously, the structure (Ѧ, ) where is a component-wise finite forest.
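
The membership condition for contexted nodes can be illustrated with a minimal sketch. It assumes a hypothetical in-memory model in which every base node keeps a reference to its parent and every account carries an optional `target` field standing in for .t; the field names and the sample account names are assumptions and are not part of the proposal.

```javascript
// Hypothetical in-memory model; the field names (parent, target) are assumptions.
function makeNode(name, parent = null) {
  return { name, parent };
}

// True if `node` lies in the tree rooted at `root` (the root itself included).
function isInTree(node, root) {
  for (let n = node; n !== null; n = n.parent) {
    if (n === root) return true;
  }
  return false;
}

// A contexted node (a, n) is admitted only if a.t is defined
// and n belongs to the tree rooted at a.t; otherwise null is returned.
function contexted(account, node) {
  if (account.target === undefined || !isInTree(node, account.target)) {
    return null;
  }
  return { account, node };
}

// Sample data (hypothetical names).
const home = makeNode("home");
const docs = makeNode("docs", home);
const other = makeNode("other");
const alice = { name: "example.org::alice", target: home };
const guest = { name: "example.org::guest", target: undefined };
```

Here `contexted(alice, docs)` yields a pair, whereas `contexted(alice, other)` and `contexted(guest, docs)` yield `null`, because `other` lies outside the tree of `alice` and `guest` has no target directory.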
Names
Accounts and base nodes are labelled with component names or just names (or also components). The naming domain Σ consists of three mutually disjoint domains of component names:
The following conditions apply to name labelling:
  1. Each non-root base node (i.e. any node which has a parent directory) has a local name which is unique within its parent directory.
  2. Each account has a unique global name.
  3. The initial account is named by ::.
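
As a rough illustration of condition 1, together with the rule that non-directory nodes must be leaves, the following sketch keeps the children of a directory in a map keyed by local name. The type labels "dir", "file" and "link" and all function names are assumptions made for this example only.

```javascript
// Hypothetical node model; the type labels "dir", "file" and "link" are assumptions.
function makeRoot() {
  return { type: "dir", name: null, parent: null, children: new Map() };
}

// Attach a new child node, enforcing the two structural rules:
// only directories have descendants, and local names are unique per directory.
function addChild(parent, name, type) {
  if (parent.type !== "dir") {
    throw new Error("only directories may have descendants");
  }
  if (parent.children.has(name)) {
    throw new Error(`name already used in this directory: ${name}`);
  }
  const node = { type, name, parent, children: type === "dir" ? new Map() : null };
  parent.children.set(name, node);
  return node;
}

const root = makeRoot();
const reports = addChild(root, "reports", "dir");
const readme = addChild(root, "readme.txt", "file");
```

A second `addChild(root, "reports", "dir")` throws, as does any attempt to add a child under `readme`.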
Global names
Global names are of the form
<host name>::<user name>
where
Lookup
Navigation between contexted nodes is established using lookup operators .ℓ1(), .ℓ2() : Ѧ × Σ → Ѧ as follows.

Notes:

Pathnames
Pathnames, or paths for short, are just (finite) sequences of component names; the set of all pathnames is denoted Σ*. There is a usual nomenclature of pathnames, in particular,
Pathname resolution
Paths are used to address nodes by path resolution, which is a function based on the composition of .ℓ2() lookups.

There are several kinds of path resolution. The most important criterion is how links are involved. Similarly to POSIX symlinks, any link node n provides a path, n.lpath. If a link is encountered, this path is potentially subject to the resolution process. The following modes of link resolution (of path resolution) are considered:

See appendix for details.

Resolvability of absolute paths is independent of the directory where the resolution starts.
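
A minimal sketch of path resolution as a composition of single-step lookups is given below. It is not the proposal's definition of .ℓ1() or .ℓ2(): the node layout, the `followLinks` switch, the recursion limit of 8 and the POSIX-like choice of resolving a link's lpath from the link's parent directory are all assumptions made for illustration. The strict treatment of '..' (a root has no parent) follows the document.

```javascript
// Hypothetical node constructor; field names are assumptions.
function mkNode(type, name, parent, lpath = null) {
  const node = { type, name, parent, lpath, children: new Map() };
  if (parent) parent.children.set(name, node);
  return node;
}

// One lookup step: ".." is strict (a root has no parent), any other
// name selects a child of a directory. Returns null on a fault.
function lookup(node, name) {
  if (name === "..") return node.parent; // null at a root
  if (node.type !== "dir") return null;
  return node.children.get(name) ?? null;
}

// Resolve `path` (an array of component names) starting at `start`.
// Stops at the first fault and reports the unresolved tail.
function resolve(start, path, followLinks = true, depth = 0) {
  let node = start;
  for (let i = 0; i < path.length; i++) {
    let next = lookup(node, path[i]);
    if (next !== null && next.type === "link" && followLinks) {
      // Assumption (POSIX-like): the lpath is resolved from the link's parent.
      const r = depth < 8 ? resolve(next.parent, next.lpath, true, depth + 1) : null;
      next = r !== null && r.tail.length === 0 ? r.node : null;
    }
    if (next === null) return { node, tail: path.slice(i) };
    node = next;
  }
  return { node, tail: [] };
}

// Sample tree: root/a/b plus a link root/to-b pointing to a/b.
const root = mkNode("dir", null, null);
const a = mkNode("dir", "a", root);
const b = mkNode("dir", "b", a);
mkNode("link", "to-b", root, ["a", "b"]);
```

With this tree, `resolve(root, ["to-b"])` ends at `b`, `resolve(root, ["to-b"], false)` ends at the link node itself, and `resolve(root, ["a", "x", "y"])` stops at `a` with the tail `["x", "y"]`.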

The logical forest
The logical forest (or logical view, cf. [], []) corresponds to canonical paths resolvable in normal link resolution, i.e. it is a structure (Ͼ, ) where
Links
Links are similar to POSIX symbolic links. The main differences are as follows:
There are two basic kinds of links: plain and non-plain.

A plain link just contains the lpath. The content of a non-plain link is roughly of the form

<global name> + <connection parameters> + <subsequent path> .
The components <global name> and <subsequent path> determine the lpath. Connection parameters provide data to accomplish global name resolution pertinent to <global name>. In particular, they may contain
Note that the only way in which the connection parameters can affect path resolution is (un)restriction.

Non-plain links provide a persistent counterpart to a login form. Everything but the password is persistent: by default, the password is not stored with the link and is only used for establishing the account connection.
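
The shape of a non-plain link might be sketched as follows. The record layout, the `protocol` parameter, the sample values and the rule that the lpath is the global name followed by the subsequent path are assumptions for illustration; only the three-part content and the default of not storing the password come from the text above.

```javascript
// Hypothetical record layout; field names are assumptions.
function makeNonPlainLink(globalName, connection, subsequentPath) {
  return { globalName, connection, subsequentPath };
}

// Assumption: the lpath is the global name followed by the subsequent path.
function lpathOf(link) {
  return [link.globalName, ...link.subsequentPath];
}

// Serialize the link for storage, leaving the password out by default.
function toStored(link, keepPassword = false) {
  const { password, ...rest } = link.connection;
  const connection = keepPassword ? { ...rest, password } : rest;
  return JSON.stringify({ ...link, connection });
}

// Sample link (hypothetical account, parameter and path).
const link = makeNonPlainLink(
  "example.org::alice",
  { protocol: "webdav", password: "secret" },
  ["docs", "report-12"]
);
```

Calling `toStored(link)` produces a record without the password, so a later connection would have to ask for it again, much like a login form with the other fields pre-filled.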

VFS spheres
The VFS operates on world-wide-web data. However, part of the data structure is session data that is pertinent only to the VFS instance. In particular, the root account (named by ::) and the relevant base node tree are transient.
Data manipulation (VFS methods)
The VFS provides methods to view and modify its data.

The following table lists modification methods which are considered mandatory for the manager.

Method name   HTTP / WebDAV analogue   POSIX utility analogue []   Brief description
delete        DELETE                   rm                          Delete items from the VFS. For each item, resolve it to a tree of the base forest, and delete the tree, node after node.
copy          COPY                     cp                          Copy items to a given location. For each item, resolve it to a tree of the base forest, and copy the tree, node after node, to the requested location.
move          MOVE                     mv                          Rename or move items. For each item, resolve it to a tree of the base forest, and either rename the root of the tree or move the tree, node after node, to the requested location, or both.
mkdir         MKCOL                    mkdir                       Create a directory or directories with the requested (path)name.
mklink        (none)                   ln -s (*)                   Create a new link and store it in the requested location.
edlink        (none)                   (none)                      Update a link: edit the content of an existing link.

Notes:

 

WFM operation

WFM versus VFS
The Virtual Filesystem can be considered a sub-application of the Web File Manager. The WFM can be understood in both the broad and narrow sense:
WFM (broad sense)
WFM (narrow sense)
VFS
The file manager allows a user to view and/or modify the VFS. The manager and the VFS act in the client-server manner. The VFS receives requests from the manager, processes them and returns responses.
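
The client-server relationship can be sketched with a toy VFS that handles a single method. Everything here is hypothetical: the request and response fields, the slash-separated paths and the flat set of directory names that stands in for the base forest are assumptions chosen to keep the example short.

```javascript
// Hypothetical VFS stub: a flat set of directory paths stands in for the base forest.
function createVfs() {
  const dirs = new Set(["/"]);
  return {
    // Process one request and return a response object.
    handle(request) {
      if (request.method !== "mkdir") {
        return { ok: false, error: "unsupported method" };
      }
      if (dirs.has(request.path)) {
        return { ok: false, error: "name conflict" };
      }
      dirs.add(request.path);
      return { ok: true, created: request.path };
    },
  };
}

// The manager side: send a request and hand the response back to the user interface.
function sendRequest(vfs, request) {
  return vfs.handle(request);
}

const vfs = createVfs();
const first = sendRequest(vfs, { method: "mkdir", path: "/reports" });
const second = sendRequest(vfs, { method: "mkdir", path: "/reports" });
```

The first request succeeds, while the second returns a "name conflict" response that the manager could present to the user instead of failing silently.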
Tasks and turns
The manager provides VFS manipulation via manipulation tasks, or simply tasks. A task can be considered an interactive counterpart to a VFS method. The interactivity is achieved by the following:
Task operation is illustrated by the following diagram:
[Diagram: the File Manager (Directory View: select items, start a task; Dialog) exchanges Request and Response messages with the VFS, which processes each request by applying a VFS method.]
Task iteration cycles are called turns. The VFS works on a turn basis: the performance of a VFS method corresponds to a part of a single turn (green-bordered boxes). The WFM (in the narrow sense) is responsible for

 

Demo application

Main goals
Technical parameters
Browser support
The application has been developed for use in the newest (as of 2011) Gecko or WebKit based browsers. The following provides browser compatibility status according to tests.
Browser             Version     Status
Firefox             3.6, 4.0
Google Chrome       (12.0)
Safari              5.0
Opera               11.11       Usable, with deficiencies (most notably, there are screen update problems).
Internet Explorer   7.0, 8.0    Usable, with layout deficiencies.

Projects

A project is a data entry containing complete VFS + manager state. The demo-application is equipped with a set of projects such that each project exhibits some characteristics of operational logic. Projects are organized in a naming hierarchy, and can be accessed / applied via project tables which are tailored for reasoning about operational logic.
Path resolution
PathResol > * > *
The main table contains the following items:
PathResol > * > * > Tail
The table refers to cases which can occur when path resolution stops (the conditions are required to hold simultaneously):
The unresolved rest of the path is called the path tail.
PathResol > * > * > RootUp
The table refers to parent lookup faults.

Note: The WFM proposal interprets '..' in the strict sense – roots have no parents.

PathResol > * > * > RootName
The table refers to the very unusual cases when the path (directly) contains a root-name.
Link manipulation
Link > * > *
The table refers to cases of link creation / modification.
Operation
Oper > *
Oper > * > Overwrite
The table refers to cases of name conflicts encountered in copy/move operations.
Oper > * > StartPos
The table refers to cases of start position within the first item's (source) tree.
Oper > * > StartPos > Point
The table refers to cases of visit logic. During tree deletion, copy or move, tree nodes are both pre-visited and post-visited. Node copy occurs on pre-visit, node deletion occurs on post-visit.
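
The visit logic can be illustrated with a small tree move in which each node is copied on pre-visit and deleted on post-visit. This is a sketch only: the node layout, the function names and the log format are assumptions, and name conflicts at the destination are not handled.

```javascript
// Hypothetical node constructor; field names are assumptions.
function mkNode(name, parent = null) {
  const node = { name, parent, children: new Map() };
  if (parent) parent.children.set(name, node);
  return node;
}

// Move the tree rooted at `src` into directory `destDir`, node after node.
// Each node is copied on pre-visit (before its children are handled)
// and deleted on post-visit (after its children are gone).
function moveTree(src, destDir, log = []) {
  const copy = mkNode(src.name, destDir); // pre-visit: copy the node
  log.push(`copy ${src.name}`);
  for (const child of [...src.children.values()]) {
    moveTree(child, copy, log);
  }
  if (src.parent) src.parent.children.delete(src.name); // post-visit: delete the node
  src.parent = null;
  log.push(`delete ${src.name}`);
  return log;
}

// Sample data: move root/report-12 (with two parts) into root/archive.
const root = mkNode("root");
const srcDir = mkNode("report-12", root);
mkNode("part-1", srcDir);
mkNode("part-2", srcDir);
const archive = mkNode("archive", root);
const log = moveTree(srcDir, archive);
```

The resulting log shows that a directory is copied before its children and deleted only after them, which is why an interrupted move never leaves a deleted source directory with children that were not yet copied.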
Basic tables
For the most part, the following tables present a re-grouping of already described cases.
Delete > Basic
MkDir > Basic
The table refers to cases of directory creation.
* > Basic
* > Overwrite
The following can be considered further examples of already described cases.
* > Links
Default
MultiTask
DirView
The table refers to cases of directory view state.

 

Bibliographic references
Stephan Bloehdorn, Max Völkel, TagFS – Tag Semantics for Hierarchical File Systems, 2006
Lisa Dusseault, WebDAV: Next-Generation Collaborative Web Authoring, Prentice Hall PTR, 2003
L.M. Dusseault (Ed.), HTTP Extensions for Web Distributed Authoring and Versioning (WebDAV), RFC 4918, 2007, http://www.webdav.org/specs/rfc4918.html
Methods: DELETE , COPY , MOVE , MKCOL
Farlex, Inc., The Free Dictionary, http://www.thefreedictionary.com/
R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee, Hypertext Transfer Protocol -- HTTP/1.1, RFC 2616, 1999, http://labs.apache.org/webarch/http/draft-fielding-http/rfc2616.html
Methods: DELETE
Christian Ghisler, Total Commander, http://www.ghisler.com/
Burra Gopal, Udi Manber, Integrating Content-Based Access Mechanisms with Hierarchical File Systems, 1999, http://www.cis.upenn.edu/~bcpierce/courses/dd/papers/gopal.pdf
IEEE and The Open Group, IEEE Std. 1003.1-2008: Portable Operating System Interface (POSIX) Base Specifications, Issue 7, 2008, http://pubs.opengroup.org/onlinepubs/9699919799/
Utilities: cd , rm , cp , mv , mkdir , ln
Georges N'dou Kouame, Kouassi N'goran, Semantic file Systems, 2005, http://stromboli.it-sudparis.eu/~bernard/ASR/04-05/projets/systeme-fichiers-semantique/systeme-fichiers-semantique.pdf
David Ingram, Insight: A Semantic File System, 2008, http://www.dmi.me.uk/code/insight/final-report.pdf
Microsoft Corporation, TechNet, http://technet.microsoft.com/
Object Services and Consulting, Inc., Semantic File Systems, 1997, http://www.objs.com/survey/OFSExt.htm
Yoann Padioleau, Logic File System, (wiki), 2008, http://padator.org/wiki/wiki-LFS/doku.php
Yoann Padioleau, Olivier Ridoux, A Logic File System, 2003, http://www.usenix.org/event/usenix03/tech/full_papers/full_papers/padioleau/padioleau.pdf
Ondřej Pavlata, The Linux VFS Model: Naming structure, 2011, http://www.atalon.cz/vfs-m/linux-vfs-model/
Chet Ramey, Bash - the GNU shell, ;login: the USENIX Association newsletter, December 1994, http://tiswww.case.edu/php/chet/bash/article.pdf
Tx0, Tagsistant, 2010, http://www.tagsistant.net/
J. Whitehead, G. Clemm, J. Reschke, Web Distributed Authoring and Versioning (WebDAV) Redirect Reference Resources, RFC 4437, 2006, http://greenbytes.de/tech/webdav/rfc4437.html
R. Wille, Restructuring lattice theory: an approach based on hierarchies of concepts, 1982
Epilogue
Don't deceive yourself. You did know it – you have always known it.
George Orwell, 1984
License
This work is licensed under a Creative Commons Attribution 3.0 License.