Infinote protocol

Please note: This document is by no means complete. Comments appreciated.

TODO: Session example(s)

TODO: List error domains and codes

TODO: Max-Total-Log-Size

The basic Infinote protocol uses XML and can be used over any transport that can deliver XML messages between hosts, such as XMPP.

Terminology

  • Host: A machine within a network.
  • Directory: Instead of just a set of documents (as in obby), Infinote supports a hierarchical directory structure of documents. This structure is referred to when we talk about the 'Directory'.
  • Node: A 'node' is a subdirectory or document in the directory.
  • Explore: A subdirectory node can be explored by a client. When a client explores a node, the server sends it the nodes the subdirectory contains and notifies it of changes if nodes are later added or removed in that subdirectory.
  • Session: A currently running editing session in which a document is (collaboratively) modified. A session is related to a document node in the directory. Note that in a session, only a single document can be edited. However, it is possible to subscribe (see below) to multiple sessions.
  • Synchronization: The process of copying a session's state from one host to another is called Synchronization.
  • Subscription: A host is called subscribed to a session when it receives session updates when users change the document.
  • User: When a host is subscribed, it can join a User into the session. Only users can issue requests (see below)
  • Request: This has actually two different meanings, based on the context:
    • When a user modifies the document, it sends a request to all subscribed hosts to inform them about the change of the document.
    • A client may make a request to a server, waiting for a response. In all such requests the client may set the seq attribute to some integer value. The server will answer the request with the same seq.

TODO: Distinguish between those two meanings of "request". I thought of calling the document-modifying request a "record" (which would however conflict with the record that can be made of a session, and be replayed later).

Infinote allows especially some perhaps not-so-intuitive scenarios, such as:

  • Client-to-Server-Synchronization: A client synchronizes a session to the server to be stored in its directory.
  • Synchronization-Only: A session can be synchronized to a client without the client subscribing to the session.
  • No-User-Subscription: A host can subscribe to the session without joining a user (and therefore unable to modify the document). This is read-only access, in fact.
  • Multiple-User-Subscription: When subscribed, a host can join multiple users into the session. This allows a client to act as a proxy for more clients that only can connect to the proxy but not to the "real" server.

Groups

All communication is performed within so-called groups. A group is identified by its name and publisher (see below). Hosts can be member of a group. Groups are basically used to abstract away connections between all subscribed users of a session. A group guarantees that it is possible for each member to send a message to all others members, but it depends on the Communication method on how the message is actually delivered. This can be a simple client/server architecture where the server relays all the client's messages, or it can be some form of peer-to-peer or multi-user chat. See more on communication methods below.

<group name="InfDirectory" publisher="armin@0x539.de">
 <more-content />
</group>

The receiver of the message needs to process the message, and decide about the scope of the message. The scope of a message is either "point-to-point", or "group". "point-to-point" means that the message was exclusively for the receiving entity, not for the whole group. If the scope is "group", however, then the host needs to make sure that all group members receive the message (again, how this is actually done depends on the communication method).

Each group has a publisher which is the host that opened the group. For each network that the host is part of, the host needs a unique identification string. For example, a host could be part of both a TCP/IP and a Jabber network. Then, the publisher ID could be its JID on the jabber network and its IP and Port number (in a normalized form) on the TCP/IP network. The term network here refers to the fact that members of a network cannot technically communicate with members of another network. So, if a message contains a given publisher string, all potential recipients can identify the corresponding host.

If the publisher field is not given, then the sending host is refered to. Also, the publisher field can be the special values "you" or "me" to refer to the sending or receiving host, respectively.

The following networks are currently defined:

  • TCP/IP: For TCP/IP communication, "$ip_address:$port" are used as publisher strings. Examples are "127.0.0.1:6523", or "[::1]:6523". Before sending XML messages between hosts an XMPP stream including encrypting the communication via TLS and authenticating via SASL is performed, as described in RFC 3920.
  • Jabber: JIDs are used as publisher strings.

Communication Methods

A communication method defines how a message from a group member is delivered to the other group members. Each method is identified by a name. Currently, the following methods are used:

  • central: The central method sends all group-related messages to the publisher, and the publisher relays the message to all clients. This means that when the publisher has lost its connection to the network, then group messages can no longer be delivered.
  • more to follow (peer-to-peer, groupchat, ...)

Enhanced Child Text

Some XML nodes allow child text which contains <uchar codepoint="uuuu" /> children, with uuuu being a unicode character point in decimal notation. This is to be interpreted as if the child text contained the given unicode character at the position of the <uchar /> node. This allows transmission of characters that would otherwise be invalid XML, such as null bytes. Below is explicitely stated whether a message allows such enhanced child text or not.

Directory messages

This section describes the messages an infinote client or server need to handle for the directory of documents. With server, we mean a host that exposes its directory and allows others to edit the contained documents collaboratively. The client is the remote counterpart, it can browse the server's directory and decide to edit documents hosted on the remote server.

In general, any node within the directory is associated a unique ID which is an unsigned integral number. The ID can be used to refer to a specific node in the tree. The root node always has ID 0. Also, any node has a type. Nodes of type "Subdirectory" are simply containers for more nodes, nodes with other types are document nodes. For example, nodes representing a plain text document have type "InfText?". Other node types might be implemented in the future.

Each directory can also have a global chat session associated with it that clients may subscribe to. Note however that this is a very basic chat only. If people need anything more fancy they are adviced to use IRC or IM instead.

Directory handling is done within the 'InfDirectory?' group using the 'central' method. All directory messages are of point-to-point scope.

Client Side

The following messages are defined on the client side, meaning an infinote server needs to handle these. Apart from the explained messages the server replies with, the server might reply with <request-failed> to any request (see Server Side below for a more detailed explanation of that message).

<explore-node id="node_id" seq="seq_id" />
  • id, Integer: The node ID to explore. The ID needs to refer to a subdirectory type of node.
  • seq, Integer, optional: An optional sequence number for the request. The server reply will have the same seq set.
  • Scope: point-to-point

Explores the node with the given ID. This means that the client requests a list of all nodes contained in the refered subdirectory node. The server replies with single <explore-begin> message, then an arbitrary amount of <add-node> messages and a final <explore-end> message. All of these messages have seq_id for the sequence number set, if seq was provided.

It is only allowed to explore a node once. However, when having explored a node, then the server notifies the client about about changes such as node additions or removals within that subdirectory.

<add-node parent="node_id" type="Type" name="Name" seq="seq_id"><sync-in /><subscribe /></add-node>
  • parent, Integer: The node ID of the parent node. This must refer to a subdirectory node.
  • type, NodeType?: The type of the node to create.
  • name, String: The name of the new node. Must not contain the '/' character.
  • seq, Integer, optional: An optional sequence number for the request. The server reply will have the same seq set.
  • <sync-in />, optional: If <sync-in /> is set, then the client will requests to provide initial content for the document, otherwise the server creates an empty document. This is only valid if type is not "InfSubdirectory?".
  • <subscribe />, optional: If <subscribe /> is set, then the client requests to initially be subscribed to the session the new node represents. This is only valid if type is not "InfSubdirectory?".
  • Scope: point-to-point

Adds a new node to the server's directory. The server replies with <add-node> or <sync-in>, depending on whether <sync-in /> is set or not.

<remove-node id="node_id" seq="seq_id" />
  • id, Integer: The node ID to remove. If it is a subdirectory, all children are removed recursively.
  • seq, Integer, optional: An optional sequence number for the request. The server reply will have the same seq set.
  • Scope: point-to-point

Removes a node from the server's directory. The server replies with <remove-node>

<subscribe-session id="node_id" seq="seq_id" />
  • id, Integer: The node ID of the document whose session to join. Must not refer to a subdirectory node.
  • seq, Integer, optional: An optional sequence number for the request. The server reply will have the same seq set.
  • Scope: point-to-point

Requests subscription to the session refered to by node_id. The server replies with <subscribe-session>

<subscribe-chat seq="seq_id" />
  • seq, Integer, optional: An optional sequence number for the request. The server reply will have the same seq set.
  • Scope: point-to-point

Requests subscription to the server's global chat. The server replies with <subscribe-chat>.

<subscribe-ack id="node_id" />
  • id, Integer, Optional: The node ID of the document whose session to join. Must not refer to a subdirectory node. The field is omitted when this refers to a chat subscription.
  • Scope: point-to-point

If the server requested the client to subscribe to a session, via <subscribe-session> or <subscribe-chat>, then this is the client's reply if it accepts the subscription. The client is considered subscribed after having sent <subscribe-ack />.

<subscribe-nack id="node_id" />
  • id, Integer, Optional: The node ID of the document whose session to join. Must not refer to a subdirectory node. The field is omitted when this refers to a chat subscription.
  • Scope: point-to-point

If the server requested the client to subscribe to a session, via <subscribe-session> or <subscribe-chat>, then this is the client's reply if it rejects the subscription. This might be the case when the client does not support the communication method for the session.

Server Side

<request-failed domain="error_domain" code="error_code" seq="seq_id">
 <text>Human-readable text</text>
</request-failed>
  • domain, String: The domain where the error came from. The most important domain is "INF_DIRECTORY_ERROR".
  • code, Integer: A numerical error code identifying what went wrong. The meaning depends on the error domain.
  • seq, Integer: The seq_id the client had set in the original request that failed, if any.
  • text, optional: Human-readable error message in the server's language. Does not allow enhanced child text.
  • Scope: point-to-point

This is sent if a client request could not be processed, for example because it refered to a directory node that does not exist.

TODO: List all possible error domains and codes somewhere.

<explore-begin total="num" seq="seq_id" />
  • total, Integer: The number of nodes in the explored directory. This can be used to show a progress bar on the client.
  • seq, Integer: The seq_id the client had set in the original <explore-node> request, if any.
  • Scope: point-to-point

Server reply to a <explore-node> request. This is followed by total <add-node> and one <explore-end> message.

<explore-end seq="seq_id" />
  • seq, Integer: The seq_id the client had set in the original <explore-node> request, if any.
  • Scope: point-to-point

This is sent when all <add-node> messages that belong to the original <explore-node> request have been sent.

<add-node id="node_id" parent="node_id" type="Type" name="Name" seq="seq_id">
 <subscribe group="group_name" method="method_name" />
</add-node>
  • id, Integer: The node ID of the added node
  • parent, Integer: The node ID of the parent node (which is a subdirectory)
  • type, NodeType?: The type of the new node.
  • name, String: The name of the new node.
  • seq, Integer: The seq_id the client had set in the original <explore-node> or <add-node> request, if any.
  • <subscribe />, optional: If set, then the client is considered subscribed to the newly created session (in which case type must not be "InfSubdirectory?").
    • group, String: The group name of the subscription group for the session. The subscription group is the group for all connections subscribed to a session. The publisher of the subscription group is always the same as the one of the InfDirectory? group.
    • method, String: The communication method that is used to deliver messages in that group.
  • Scope: point-to-point

The server notifies the client that a new node has been added to the directory, within a subdirectory that the client already explored. Also used when a client currently explores a subdirectory.

If <subscribe /> is set, then the client is subscribed to the session that the new node refers to. This is normally only used if the client explicitely requested subscription in its <add-node> request. Therefore the session is _not_ synchronized to the client, since both sides know the content already anyway (the session is initially empty). If <subscribe /> is set, then the client needs to reply with <subscribe-ack /> or <subscribe-nack />.

<sync-in id="node_id" parent="node_id" type="Type" name="Name" method="method_name" group="group_name" seq="seq_id">
 <subscribe group="group_name" method="method_name" />
</sync-in>
  • id, Integer: The node ID of the added node
  • parent, Integer: The node ID of the parent node (which is a subdirectory)
  • type, NodeType?: The type of the new node (this is never "InfSubdirectory?")
  • name, String: The name of the new node
  • seq, Integer: The seq_id the client had set in the original <add-node> request, if any.
  • group, String: The group name of the synchronization group. The synchronization group is used to synchronize the initial session content from the client to the server. The publisher of the synchronization group is always the same as the one of the InfDirectory? group.
  • method, String: The communication method used to deliver messages within the synchronization group.
  • <subscribe />, optional: If set, then the client is considered subscribed to the newly created session.
    • group, String: The group name of the subscription group for the session. The subscription group is the group for all connections subscribed to a session. The publisher of the subscription group is always the same as the one of the InfDirectory? group.
    • method, String: The communication method that is used to deliver messages in that group.
  • Scope: point-to-point

This can only be sent in response to a <add-node> request from a client with <sync-in /> set. It also notifies the client about a new node creation, and tells the client the group name and method for the synchronization group. The client then synchronizes the initial session content within that group. See more on this below, in the Session messages section. If the synchronization fails, then the node is not created.

If <subscribe /> is set, then the client is subscribed to the session that the new node refers to. This is normally only used if the client explicitely requested subscription in its <add-node> request. Normally, the subscription group and method will be the same as the synchronization group and method, but it is allowed that they are different. However, if the group name is the same, then the method needs to be the same as well (because it refers to the same group). If <subscribe /> is set, then the client needs to reply with <subscribe-ack /> or <subscribe-nack />.

<remove-node id="node_id" seq="seq_id" />
  • id, Integer: The node ID of the removed node.
  • seq, Integer: The seq_id the client had set in the original <remove-node> request, if any.
  • Scope: point-to-point

The server notifies the client that a node has been removed from the directory, within a subdirectory that the client already explored.

<subscribe-session id="node_id" group="group_name" method="method_name" seq="seq_id" />
  • id, Integer: The ID of the document whose session the client is subscribed to. Must not refer to a subdirectory node.
  • group, String: The group name of the subscription group. The subscription group is the group for all connections subscribed to a session. The publisher of the subscription group is always the same as the one of the InfDirectory? group.
  • method, String: The communication method used to deliver messages within the subscription group.
  • seq, Integer: The seq_id the client had set in the original <subscribe-session> request, if any.
  • Scope: point-to-point

Subscribes the client to a session. The client needs to reply with <subscribe-ack /> or <subscribe-nack />. When the client acknowledged the subscription, then the server will start to synchronize the session's content to the client within the subscription group.

<subscribe-chat group="group_name" method="method_name" seq="seq_id" />
  • group, String: The group name of the subscription group. The subscription group is the group for all connections subscribed to a session. The publisher of the subscription group is always the same as the one of the InfDirectory? group.
  • method, String: The communication method used to deliver messages within the subscription group.
  • seq, Integer: The seq_id the client had set in the original <subscribe-chat> request, if any.
  • Scope: point-to-point

Subscribes the client to the global server chat. The client needs to reply with <subscribe-ack /> or <subscribe-nack />. When the client acknowledged the subscription, then the server will start to synchronize the session's content to the client within the subscription group.

Example

<group name="InfDirectory" publisher="armin@0x539.de">
 <explore-node seq="0" id="0"/>
</group>

Some client wants to explore the root node of the directory of the server with publisher id armin@….

<group name="InfDirectory" publisher="armin@0x539.de">
 <explore-begin total="3" seq="0"/>
</group>
<group name="InfDirectory" publisher="armin@0x539.de">
 <add-node id="3" parent="0" name="first" type="InfSubdirectory" seq="0"/>
</group>
<group name="InfDirectory" publisher="armin@0x539.de">
 <add-node id="2" parent="0" name="third" type="InfSubdirectory" seq="0"/>
</group>
<group name="InfDirectory" publisher="armin@0x539.de">
 <add-node id="1" parent="0" name="second" type="InfSubdirectory" seq="0"/>
</group>
<group name="InfDirectory" publisher="armin@0x539.de">
 <explore-end seq="0"/>
</group>

Server reply. This could also be packed within a single <group> message, but it would then be sent as a single message and could not be sent in chunks. A single, big message could block traffic in other groups that go through the same connection this way.

<group name="InfDirectory" publisher="armin@0x539.de">
 <explore-node seq="1" id="1"/>
</group>

The client wants to explore another node.

<group name="InfDirectory" publisher="armin@0x539.de">
 <explore-begin total="2" seq="1"/>
</group>
<group name="InfDirectory" publisher="armin@0x539.de">
 <add-node id="5" parent="1" name="bar" type="InfSubdirectory" seq="1"/>
</group>
<group name="InfDirectory" publisher="armin@0x539.de">
 <add-node id="4" parent="1" name="foo" type="InfSubdirectory" seq="1"/>
</group>
<group name="InfDirectory" publisher="armin@0x539.de">
 <explore-end seq="1"/>
</group>

That is probably self-explanatory.

<group name="InfDirectory" publisher="armin@0x539.de">
 <add-node seq="2" parent="1" type="InfSubdirectory" name="baz"/>
</group>

The client wants to create a new subdirectory called "baz" within the "second" subdirectory.

<group name="InfDirectory" publisher="armin@0x539.de">
 <add-node id="6" parent="1" name="baz" type="InfSubdirectory" seq="2"/>
</group>

Alright, subdirectory created with ID 6.

<group name="InfDirectory" publisher="armin@0x539.de">
 <add-node seq="3" parent="1" type="InfSubdirectory" name="baz"/>
</group>

This tries to create another subdirectory within the same parent also called "baz".

<group name="InfDirectory" publisher="armin@0x539.de">
 <request-failed code="0" domain="INF_DIRECTORY_ERROR" seq="3"/>
</group>

But this does not work. Error code 0 in the INF_DIRECTORY_ERROR domain means "Node with that name already exists".

Synchronization

Synchronization means the transmission of a session's current state from one host (synchronization server) to another host (synchronization client). Synchronization messages are sent within a synchronization group, which both hosts must be member of (there might be other members in that group, for example the subscription group of the session can be used as synchronization group). All synchronization-messages have scope "point-to-point".

Synchronization client

These messages can be sent by the synchronization client during the synchronization.

<sync-error domain="domain" code="code">
 <text>Human readable</text>
</sync-error>
  • domain, String: A domain from which the error comes. This is most likely "INF_SESSION_SYNCHRONIZATION_ERROR".
  • code, Integer: A numerical error code explaining what went wrong, depending on domain.
  • text, String, optional: A human-readable error message, in the synchronization client's language. Does not allow enhanced child text.
  • Scope: point-to-point

Sent when a message could not be processed during synchronization. This cancels the synchronization, and possibly an upcoming subscription.

<sync-ack />
  • Scope: point-to-point

Tells the other end that all synchronization messages have been received and processed. The synchronization was successful.

Synchronization server

These messages can be sent by the synchronization server during the synchronization. Note that these messages are independant from the session type. Concrete session types, such as text sessions, may add more messages. These are explained below, in the InfText? session messages section.

<sync-cancel />
  • Scope: point-to-point

Cancels the synchronization. This is possible as long as <sync-end /> has not yet been sent (otherwise the synchronization client has already received all data anyway, so there isn't anything to cancel anymore). No further synchronization messages follow this one.

<sync-begin num-messages="num" />
  • num-messages, Integer: The number of messages until the synchronization is complete, including <sync-begin> and <sync-end>.
  • Scope: point-to-point

The first message of the synchronization. It tells the client the number of messages in the synchronization, so it can show progress.

<sync-end />
  • Scope: point-to-point

The last message of the synchronization. The synchronization can still fail when this message has been sent because the client might still fail to process a previous message. Therefore, the synchronization is not considered complete after having sent <sync-end />. Instead, the client sends <sync-ack /> when it has processed all messages (which results in a complete synchronization) or <sync-error /> if there was an error (resulting in a failed synchronization).

<sync-user id="id" name="name" status="status" />
  • id, Integer: A unique user ID within the session.
  • name, String: The user's name.
  • status, UserStatus?: The status of this user. See below in the "Session messages" section to learn what a user status is.
  • Scope: point-to-point

Tells about a user within the session. If status is unavailable, the user had joined the session some time ago, but has then left the session. Again, concrete session types may add more attributes for users.

Session messages

Each session has a so-called subscription group, that is a group of which all subscribed connections are a member. The group name for that group is specified by the publisher (which is the host that contains the node corresponding to the session in its directory).

Users

To modify the document, a user needs to be joined into the session. This can currently only be performed by sending a corresponding <user-join /> request to the publisher. This means that, when the connection to the publisher is no longer available, then no new users can be joined.

Each user has a status, and some attributes require to contain such a UserStatus?. Allowed user status are:

  • active: The user has joined the session.
  • inactive: The user has joined the session, but is currently not actively modifying or viewing the document.
  • unavailable: The user has left the session. This can happen when a host with joined users unsubscribes from a session, when such a connection goes down or when the connection explicitely sets the status of the user to unavailable (via <user-status-change />). Unavailable users can not modify the document, but they can be re-joined into the session via a <user-join /> request.

Client Side

These messages can be sent from any host to the publisher. For every request that allows a "seq" attribute the server can additionally reply with <request-failed /> (see below for possible Server messages) in addition to the possible server replies that are explained for the request.

<user-join name="name" seq="seq_id" />
  • name, String: A unique user name within the session.
  • seq, Integer, optional: An optional sequence number for the request. The server reply will have the same seq set.
  • Scope: point-to-point

Requests a user join with the given name. The server will reply with <user-join> or <user-rejoin>, depending on whether an unavailable user with the same name exists already or not. Again, more user attributes may be defined for concrete note types.

<session-unsubscribe />

Unsubscribes from the session. The status of all users joined from this host will be set to unavailable. The host must also leave the subscription group (this depends on the method used, but for some methods the host will no longer receive group messages anyway).

Server Side

These messages can be sent from the publisher to all other group members, but not from one group member to another one.

<request-failed domain="error_domain" code="error_code" seq="seq_id">
 <text>Human readable</text>
</request-failed>
  • domain, String: The domain the error comes from. This is most likely INF_SESSION_ERROR or INF_USER_ERROR.
  • code, Integer: A numerical error ID specifying what went wrong, depending on domain.
  • seq, Integer: The seq that the client set for the original request that failed, if any.
  • text, String, optional: Human-readable error message in the server's language. Does not allow enhanced child text.
  • Scope: point-to-point

Tells the client that a request it made could not be processed.

<user-join id="user_id" name="name" status="status" seq="seq_id" />
  • id, Integer: The (unique) ID of the joined user.
  • name, String: The user name of the joined user.
  • status, UserStatus?: The status of the joined user. This must not be "unavailable".
  • seq, Integer, optional: The seq that the client set for the original <user-join /> request, if any.
  • Scope: group

Tells the client that a new user has joined the session. Again, more user attributes may be defined for concrete note types.

<user-rejoin id="user_id" name="name" status="status" seq="seq_id" />
  • id, Integer: The (unique) ID of the rejoined user.
  • name, String, optional: The user name of the rejoined user. If not set, then the old name is kept.
  • status, UserStatus?: The status of the rejoined user. This must not be "unavailable".
  • seq, Integer, optional: The seq that the client set for the original <user-join> request, if any.
  • Scope: group

Tells the client that an already existing, but unavailable, user rejoined the session. The client should already know of that user, for example because it has received a <user-status-change /> before, or the user was already unavailable during synchronization. Again, more user attributes may be defined for concrete note types.

<session-close />
  • Scope: group

Tells the client that the session has been closed. This means, no more changes can be made. The client can try still to resubscribe if the document still exists in the directory.

Group messages

These messages can be sent from group members to other group members. More messages may be defined for concrete note types, see below for text sessions (type "InfText?").

<user-status-change id="user_id" status="status" />
  • id, Integer: The user ID of the user that changes status.
  • status, UserStatus?: The new status of the user.
  • Scope: group

Changes status of the user with the given ID. The user must be joined from the same connection the request comes from.

InfChat? messages

This section describes additional messages that are used for chat sessions.

Synchronization

These are additional messages sent during synchronization:

<sync-message type="msg_type" user="user_id" time="time">
  Text
</sync-message>
  • type, Optional: Type of the message. Can be "normal", "emote", "userjoin" or "userpart". If it is omitted, then it default to "normal".
  • user, Integer: The user that has sent the message.
  • time, Integer: A Unix timestamp (number of seconds since 01/01/1970 00:00 UTC) of when the message was sent.
  • Text, Optional: The text of the message. Messages of type "userjoin" and "userpart" don't have text. Does not allow enhanced child text.

This synchronizes a message that was sent earlier to a client. It provides the client with a history of earlier chat messages.

Group messages

These messages can be sent from each group member to other group members.

<message type="msg_type" user="user_id">
  Text
</message>
  • type, Optional: Type of the message. Can be "normal" or "emote". If it is omitted, then it defaults to normal.
  • user, Integer: User ID of the user sending the message. The user must have joined via the connection the message comes from.
  • Text: The text of the message. Does not allow enhanced child text.

Sent when a user sends a message to the chat.

InfText? messages

This section describes additional messages that are used for text sessions.

State vectors

The state of a document is completely defined by a so-called state vector specifying how many operations each user did. For example, if user A made 2 requests and user B made 3 requests, the document has a certain state. All users that processed these 5 requests are in the same state. Note that it does not matter in which order the requests are processed (as long as the causal order is kept), which is the key concept behind the adOPTed algorithm.

Each user maintains a state vector for each other user that describes in which state that user is. This vector is updated every time a request is received from a particular user. This requires every user to send no-op requests if it has been inactive for some time, so that others still know the state of this user.

Also, each user maintains a request log in which all requests the user made are stored. This is required for transforming incoming requests from other users that were made in a different state of the document. Details to the adOPTed algorithm and how these transformations are performed can be found in the papers at http://portal.acm.org/citation.cfm?id=240305 and http://portal.acm.org/citation.cfm?doid=320297.320312. Some trickier aspects are also mentioned below in the Concurrency Control section.

Some attributes of the following messages describe such a state vector. The state vector has the form "user_id:processed_requests;user_id:processed_requests;[...]" where user_id is the ID of a user and processed_requests describe the number of requests that this user is guaranteed to have already processed.

State Vector Diffs

State Vectors can also be sent as a "Diff", which means relative to a state vector sent previously.

For example, if such a diff contains the value "2:3;1:1", this means that the value 3 needs to be added to the processed requests of user 2 of the reference state vector, and 1 to the requests of user 1.

Operations

Some fields of the InfText? messages use operations to describe a change that has been made to the document. This section defines what operations are allowed.

There are two types of operations. Such operations that modify the document and such that do not. For example, inserting a character into the document does modify it, but moving the cursor of a user does not. If the operation of a request does not modify the document, the request is not recorded into the request log (and cannot be undone).

<insert pos="pos">text</insert>
<insert-caret pos="pos">text</insert-caret>
  • pos, Integer: The character offset at which to insert text.
  • text, String: The text to insert.

An operation that inserts new text into the document. The <insert-caret> version additionally places the cursor of the user that made the request behind the inserted text.

<delete pos="pos" len="len" />
<delete-caret pos="pos" len="len" />
  • pos, Integer: The character offset at which to start deleting characters
  • len, Integer: The number of characters to delete.

Deletes text from the buffer. This operation is only used in <request /> messages. Again, <delete-caret /> additionally places the cursor of the user that made the request to the place where characters have been deleted.

<delete pos="pos"><segment author="user_id">text</segment>[...]</delete>
  • pos, Integer: The character offset at which to start deleting characters.
  • segments: The text that was deleted, including author information, i.e. which user wrote what text. Does allow enhanced child text.

Deletes text from the buffer and specifies what text is deleted. This operation is only used in <sync-request /> messages. The synchronization client cannot deduce what text was actually deleted, but must be able to compute the reverse operation in case someone undoes the request. In a normal <request /> message, other users can deduce what text was deleted by having a look at the document and which transformations were required to transform the request to the current state before the operation is actually executed.

<no-op />

Does nothing, and does especially not modify the document (see above). This is used when a user was inactive for some time to report its current state vector to the other users.

<move caret="pos" selection="len" />
  • pos, Integer: The character offset to place the cursor to.
  • len, Integer: The length of the selection area, in characters. Negative values mean selection towards the beginning of the document.

Changes the position of the user's cursor position and selection area. This operation does not modify the document.

<undo />
<undo-caret />

Undoes the previous request of the user. The <undo-caret /> version also places the user's cursor to the position where the operation was performed.

<redo />
<redo-caret />

Redoes the previous request of the user. The <redo-caret /> version also places the user's cursor to the position where the operation was performed.

User attributes

The following attributes are used in the <sync-user /> and <join-user /> requests for text sessions:

  • time, StateVector?: The current state vector of the user
  • caret, Integer: The position of the user's cursor, in characters, from the document beginning
  • selection, Integer: The number of characters selected, starting at the character caret. Negative means selection towards the beginning of the document.
  • hue, Float: A floating point number between 0.0 (red) and 1.0 (red), specifying the user's color.

TODO: Viewport

Synchronization

These are additional messages sent during synchronization:

<sync-request user="user_id" time="time">
 <operation />
</sync-request>
  • user, Integer: The ID of the user that made the request.
  • time, State Vector: The state at which the request was made.
  • operation, Operation: The operation performed by the request, see below.
  • Scope: point-to-point

Tells that at the given time a user made a request. The request is already included in the document content sent via <sync-segment /> messages. However, the request should be added to the request log and might be necessary to transform incoming operations or to compute Undo operations.

<sync-segment author="user_id">Text</sync-segment>
  • author, optional: The user ID of the user that wrote the text.
  • Text: The text that user wrote. Does allow enhanced child text.
  • Scope: point-to-point

Synchronizes a continous part of the initial document that was written by a single user. The <sync-segment> messages must be sent in-order so the document can be reconstructed.

Group messages

These messages can be sent from each group member to other group members.

<request user="user_id" time="time">
 <operation />
</request>
  • user, Integer: The ID of the user that made the request.
  • time, State Vector Diff: Describes the document state at which the request was made.
  • operation, Operation: The operation performed by the request, see below.
  • Scope: group

Whenever a user issues a request the <request> message is sent. The user must have joined via the connection the request message comes from. <request> is sent to the whole subscription group. The time is a State Vector Diff, with the previous time sent by that user being the reference state vector. Therefore, it requires the requests from some user to arrive in-order, but state vector components from inactive users (or ones that have left the session) need not to be transmitted all the time. Note that the value for the user itself is therefore always zero, because there cannot be any requests inbetween them.

Concurrency Control

When a request is received at a state v previous to the current document state, then the request must be transformed to the current state before it can be applied. As already pointed out, infinote uses the adOPTed algorithm for concurrency control and group undo as described in the papers at http://portal.acm.org/citation.cfm?id=240305 and http://portal.acm.org/citation.cfm?doid=320297.320312.

The following sections use pseudo code to explain algorithms used by infinote. When there is a request r in such code, then r.v means the state the request can be applied at, r.user refers to the user that issued the request and r.op yields the operation of the request. I hope, everything else is self-explanatory. When v is a state vector, then v[i] means the i-th component of it.

Transformation rules

Infinote supports stringwise insert and delete operations. The transformation rules are mostly straight forward, but there are two things to point out. First, let me show you the definition of the transformation function in pseudo code:

transform(ins(pos1, str1), ins(pos2, str2), cid):
  if(pos1 < pos2 || (pos1 == pos2 && cid == other)):
    return ins(pos1, str1)
  if(pos1 > pos2 || (pos1 == pos2 && cid == self)):
    return ins(pos1 + str2.length, str1)

transform(ins(pos1, str1), del(pos2, len2), cid):
  if(pos1 >= pos2 + len2):
    return ins(pos1 - len2, str1)
  if(pos1 < pos2)
    return ins(pos1, str1);
  if(pos1 >= pos2 && pos1 < pos2 + len2)
    return ins(pos2, str1);

transform(del(pos1, len1), ins(pos2, str2), cid):
  if(pos2 >= pos1 + len1)
    return del(pos1, len1);
  if(pos2 <= pos1)
    return del(pos1 + str2.length, len1)
  if(pos2 > pos1 && pos2 < pos1 + len1)
    return split(del(pos1, pos2 - pos1), del(pos2 + str.length(), len1 - (pos2 - pos1)))

transform(del(pos1, len1), del(pos2, len2), cid):
  if(pos1 + len1 <= pos2)
    return del(pos1, len1)
  if(pos1 >= pos2 + len2)
    return del(pos1 - len2, len1)
  if(pos2 <= pos1 && pos2 + len2 >= pos1 + len1)
    return del(pos2, 0)
  if(pos2 <= pos1 && pos2 + len2  < pos1 + len1)
    return del(pos2, len1 - (pos2 + len2 - pos1))
  if(pos2  > pos1 && pos2 + len2 >= pos1 + len1)
    return del(pos1, pos2 - pos1)
  if(pos2  > pos1 && pos2 + len2  < pos1 + len1)
    return del(pos1, len1 - len2)

Least common successor

The least common successor of two state vectors v and w is defined as the state vector u so that both v <= u and w <= u hold and that there is no other state vector k so that v <= k and w <= k and k <= u (otherwise k would be the lowest common successor, and not u). It is easy to show that the least common successor of two given state vectors is unique. It can be computed as follows:

lcs(v, w):
  for every user i:
    u[i] = max(v[i], w[i])
  return u

Concurrency IDs

The first thing that comes to mind when having a look at the transformation function is that transform does not take two arguments, but three. cid means concurrency ID, and it can have one of the values "self", "other" or "none". It is used to decide which operation to transform when two insert operations that insert text at the same position need to be transformed against each other. "self" means the first operation needs to be shifted to the right, and "other" means that the second operation needs to be shifted.

To compute the concurrency ID between two operations, we define another function:

cid(ins(pos1, len2), ins(pos2, str2)):
  if(pos1 < pos2)
    return other;
  if(pos1 > pos2)
    return self;
  if(pos1 == pos2)
    return none;

Now, whenever we encounter the situation that we need to transform two requests that carry insert operations with the same position against each other, we use the cid() function to determine a concurrency ID. However, it would not make sense to use the cid() function on the same two operations, since this would yield "none", and calling transform() with two insert operations at the same position and concurrency ID "none" is not defined.

Instead, we call the function on the operations of the same requests at a previous state. If r1 and r2 denote the two requests as they were generated on the respective sites, then lcs(r1.v, r2.v) is the earliest state to which both requests can be transformed (because of definition of the least common successor). Therefore, we transform both r1 and r2 to the state lcs(r1.v, r2.v) and call cid() on them to find a concurrency ID.

The idea behind this approach is that, during transformation, two insert operations can be transformed so that their insertion position fall together (at the same state), but it cannot happen that at one state, the first operation inserts text behind the position of the second operation, and vice-versa at another state. This means that the order of the inserted text cannot change, and thus the order in an arbitrary state is sufficient to know the order in all other states.

Of course, it can still happen that the cid function yields "none" for the two operations at the lcs. In this case, no order is defined between the two requests, for example because two users inserted text at the same position at the same time. In this case, it does not matter which of the two requests is shifted to the right, but we need to guarantee that the same request is used at every site. Infinote uses the user ID of the issuing request to decide this. We can define a cid function for requests that incorporates this behaviour:

cid(req1, req2):
  c = cid(req1.op, req2.op)
  if(c != none) return c

  if(req1.user > req2.user)
    return other;
  if(req1.user < req2.user)
    return self;

Again, note that the algorithm works equally well if we swap "other" and "self" here, or use another criteria to decide which request to shift. However, this is what infinote does, and what other implementations that want to be compatible with infinote need to do.

Split operations

You may also have noticed the "split" in the above pseudo code for the transformation function. split denotes a helper operations that is a wrapper around two operations. This is required since in one specific case, a delete operation gets splitted so that the text deleted is no longer continuous (namely if text is inserted within the deletion range). However, this new split operation might also need to be transformed against other operations, so we need to define the transformation function for it. Luckily, this is straightforward:

transform(split(first, second), op2, cid):
  return split(transform(first, op2 cid), transform(second, op2, cid))

transform(op1, split(first, second), cid):
  tmp = transform(op1, first, cid)
  nsecond = transform(second, first, none)
  return transform(tmp, nsecond, cid)

adOPTed implementation

There are two noteworthinesses to point out in infinotes adOPTed implementation. They are not clear in the adOPTed papers (perhaps because they do not cover redo) but I needed to apply them for some tests to pass:

  • The time field of undo and redo requests is only used to know when the request is ready to be executed. When the request is recorded into the request log, its state vector is changed to the one of the original request (the original request is the original DO request the undo or redo is undo/redoing, respectively) expect the component of the user issuing the request. This is because Undo/Redo is always computed starting from the original request. This, in turn, is required to apply the mirror operation as late as possible (after having performed transformations).
  • I had to extend the definition of the Reachable function of the adOPTed algorithm to take into account Do/Undo pairs. The extended version looks like this in pseudo code:
    reachable(v):
      for every user u:
        if not reachable_user(v, u):
          return false;
      return true;
    
    reachable_user(v, i):
      n := v[i];
      forever:
        if(n == 0) return true;
        r := Request(u, n - 1);
    
        if r is a DO-request:
          return r.w() <= v;
        else:
          n := associated_request(r).v[i];
    
    The function associated_request returns request the given request r undoes (which is a DO or REDO request if r is a UNDO request, or an UNDO request if r is a REDO request). r.w() means the state vector of r (r.v) with the component representing the user which issued the request increased by one (same convention as in the adOPTed paper referenced above). In the following diagram DO operations at (0,0), (0,1) and an UNDO operation at (1,1) have been issued. Note that the point (2,0) is reachable by folding from (0,0). The translation path of the blue request from (0,0) to (2,1) shows how it may be used. Although there exists another path in this example where (2,0) does not need to be reachable, the tests described in test/std/test-24.xml and test/session/test-51.xml fail with the original definition of the Reachable function.

http://infinote.0x539.de/illustrations/extended-reachable.png

Discussion

Please ask if things remain unclear.

Armin: Things I did not yet think about (too much): Access restriction, xmlns, viewport.