This document describes a server-side implementation of RFC 4533 in RHDS. It is implemented as a plugin for two reasons
First the concepts of content synchronization are presented, then the design of the plugin implementation is explained, and finally open topics are discussed.
Although RFC 4533 specifies the protocol, it still leaves room for interpretation and design choices. This section presents the core elements and concepts of the RFC and how they will be used in the implementation. The RFC specifies a mechanism for a client to synchronize its copy of a database with the changing content of a directory server. It can be considered a consumer-initiated replication, where the server has no knowledge about the clients.
In contrast to replication it is not change oriented, but entry centric. Complete entries (subject to access control and requested attributes) are always sent or, in update phases, only the DNs (or UUIDs) of entries that are unchanged or deleted since the last session.
To achieve this, the RFC introduces several new LDAP elements (messages and controls), specifies their content, and defines how client and server exchange information.
Digression: relation to persistent searches
Reading the RFC the similarity with persistent searches is obvious, but there are two important differences:
* intermediate messages
A persistent search (not changes only) sends the full results of the search and then switches to persistent mode, but the client is not notified when the first phase is completed
* session management
Each persistent search starts by retrieving all entries. There is no state information that would allow a new persistent search to simply extend the results of a previous search.
These shortcomings are addressed by RFC4533, but the implementation can build on what is available with persistent searches.
There are three new controls: one to initiate a synchronization session, a state control to be sent with each entry returned to the client, and a done control to be sent with the final result.
sync request control
controlType is 1.3.6.1.4.1.4203.1.9.1.1
syncRequestValue ::= SEQUENCE {
    mode ENUMERATED {
        -- 0 unused
        refreshOnly       (1),
        -- 2 reserved
        refreshAndPersist (3)
    },
    cookie     syncCookie OPTIONAL,
    reloadHint BOOLEAN DEFAULT FALSE
}
This initiates the synchronization session and selects one of two modes: refreshOnly or refreshAndPersist. The optional cookie can be used to continue a previous synchronization session.
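To make the layout of the control value concrete, the following is a minimal sketch of how a client could BER-encode a syncRequestValue. It assumes that syncCookie is an OCTET STRING (as in RFC 4533) and uses hand-written helper names (`_tlv`, `encode_sync_request`) that are purely illustrative and not part of any RHDS API.

```python
from typing import Optional

REFRESH_ONLY = 1
REFRESH_AND_PERSIST = 3


def _tlv(tag: int, payload: bytes) -> bytes:
    """Encode one BER element (tag, definite length, value)."""
    length = len(payload)
    if length < 0x80:
        header = bytes([tag, length])
    else:
        size = length.to_bytes((length.bit_length() + 7) // 8, "big")
        header = bytes([tag, 0x80 | len(size)]) + size
    return header + payload


def encode_sync_request(mode: int, cookie: Optional[bytes] = None,
                        reload_hint: bool = False) -> bytes:
    """Build the syncRequestValue SEQUENCE (illustrative helper)."""
    if mode not in (REFRESH_ONLY, REFRESH_AND_PERSIST):
        raise ValueError("mode must be refreshOnly (1) or refreshAndPersist (3)")
    body = _tlv(0x0A, bytes([mode]))      # mode ENUMERATED
    if cookie is not None:
        body += _tlv(0x04, cookie)        # cookie OCTET STRING, OPTIONAL
    if reload_hint:
        body += _tlv(0x01, b"\xff")       # reloadHint BOOLEAN, omitted when FALSE
    return _tlv(0x30, body)               # enclosing SEQUENCE
```

A request without a cookie starts a new session; passing the cookie from an earlier session asks the server to continue it.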
sync state control
syncStateValue ::= SEQUENCE {
    state ENUMERATED {
        present (0),
        add     (1),
        modify  (2),
        delete  (3)
    },
    entryUUID syncUUID,
    cookie    syncCookie OPTIONAL
}
It has to be sent with each result entry. It is attached to entries in the refresh phase and in the persist phase when updates are sent. The entryUUID is essential: the client must use it to identify entries uniquely, because the DN might have changed. Note that there is no specific state for modrdn operations; it is the client's task to handle renames based on the entryUUID.
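The consequence for a client can be sketched as follows: the local copy is keyed by entryUUID rather than by DN, so a rename simply shows up as a new DN for a known UUID. The function name and the shape of the store are assumptions made for illustration only.

```python
from typing import Optional


def apply_sync_state(store: dict, state: str, entry_uuid: str,
                     dn: Optional[str] = None,
                     attrs: Optional[dict] = None) -> None:
    """Apply one sync state notification to a client copy keyed by entryUUID."""
    if state == "delete":
        store.pop(entry_uuid, None)
    elif state in ("add", "modify"):
        # A modrdn has no state of its own: it arrives with the same
        # entryUUID and a different DN, which overwrites the old record.
        store[entry_uuid] = {"dn": dn, "attrs": dict(attrs or {})}
    elif state == "present":
        pass  # entry is unchanged, nothing to update
    else:
        raise ValueError("unknown sync state: " + state)
```

Keying by DN instead would leave a stale record under the old name after every rename.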
sync done control
syncDoneValue ::= SEQUENCE {
    cookie         syncCookie OPTIONAL,
    refreshDeletes BOOLEAN DEFAULT FALSE
}
This control is sent with a search result done message and is only sent to complete a refreshOnly session. It should provide a cookie to the client to be used in further update sessions.
One new message is introduced. It indicates the completion of the refresh phase when the client requested a refreshAndPersist session, or separates the two phases within the refresh phase. It defines the responseName and responseValue of an LDAP intermediate response message as defined in RFC 4511.
responseName is 1.3.6.1.4.1.4203.1.9.1.4 and
responseValue contains a BER-encoded syncInfoValue. The criticality is FALSE (and hence absent).
syncInfoValue ::= CHOICE {
    newcookie      [0] syncCookie,
    refreshDelete  [1] SEQUENCE {
        cookie      syncCookie OPTIONAL,
        refreshDone BOOLEAN DEFAULT TRUE
    },
    refreshPresent [2] SEQUENCE {
        cookie      syncCookie OPTIONAL,
        refreshDone BOOLEAN DEFAULT TRUE
    },
    syncIdSet      [3] SEQUENCE {
        cookie         syncCookie OPTIONAL,
        refreshDeletes BOOLEAN DEFAULT FALSE,
        syncUUIDs      SET OF syncUUID
    }
}
Since the DN of an entry can change, each entry needs an identifier that is stable and globally unique. RHDS uses nsuniqueid for this purpose, and it will be used for implementing this RFC.
The purpose of a cookie is threefold:
* server identification
A client should continue a synchronization session only with the server it received the cookie from, to guarantee consistency. In a replicated topology this could probably be relaxed. The fully qualified domain name and LDAP port should be sufficient to identify a server.
* client and request identification
Continuing a synchronization session only makes sense if the request uses the same search base, the same credentials (determined by the client DN), the same search filter and the same list of requested attributes (maybe not mandatory): clientdn:searchbase:searchfilter[:list of req attrs]. This could become quite long, so hashing that string may be an option.
* state information
When a client provides a cookie, the server should only return information about changes since the time indicated by the state in the cookie. This state information could be a timestamp, a CSN, a changenumber, or a combination of several of these. In the first implementation the updates to send are determined using the retro changelog, so the current highest changenumber is used.
So the cookie implemented is:
<nsslapd-localhost>:<nsslapd-port> # <client dn>:<searchbase>:<search filter> # <changenumber>
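As an illustration of this layout, the sketch below builds and parses such a cookie. It assumes that `#` separates the three parts and that surrounding whitespace is insignificant; the names `SyncCookie`, `build_cookie` and `parse_cookie` are hypothetical and do not come from the plugin source. Because a DN or a search filter may itself contain `:` or `#`, the parser splits only on the first and the last `#`.

```python
from typing import NamedTuple


class SyncCookie(NamedTuple):
    server: str        # "<nsslapd-localhost>:<nsslapd-port>"
    client: str        # "<client dn>:<searchbase>:<search filter>"
    changenumber: int  # highest changenumber known to the client


def build_cookie(host: str, port: int, client_dn: str,
                 base: str, search_filter: str, changenumber: int) -> str:
    """Assemble the three cookie parts into one string."""
    return f"{host}:{port}#{client_dn}:{base}:{search_filter}#{changenumber}"


def parse_cookie(text: str) -> SyncCookie:
    """Split on the first and last '#', so a '#' inside the filter survives."""
    server, sep1, rest = text.partition("#")
    client, sep2, number = rest.rpartition("#")
    if not sep1 or not sep2:
        raise ValueError("cookie must contain three '#'-separated parts")
    try:
        changenumber = int(number)
    except ValueError:
        raise ValueError("cookie changenumber is not an integer") from None
    return SyncCookie(server.strip(), client.strip(), changenumber)
```

A server would then compare the `server` and `client` parts with the current request before trusting the changenumber; a mismatch means the session cannot be continued.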
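There are two modes defined in the SyncRequestControl and both can have a cookie or not, so there are four different scenarios for a synchronization session:
* refreshOnly without cookie
* refreshOnly with cookie
* refreshAndPersist without cookie
* refreshAndPersist with cookie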
These scenarios will be detailed now
In a refresh phase with a cookie, the server has to determine what has changed since the state provided in the cookie. This has to be done reliably and efficiently, and depends on the choice of state information used in the cookie and on the change history available in the server. In this first implementation, change determination uses the retro changelog, because it is:
The content synchronization is implemented as a plugin for reasons explained above. This requires that the plugin controls the sending of entries, the appending of controls to entries or messages, and whether a result message is sent.
So far, however, not everything could be implemented completely in the plugin; there are two exceptions
The sync info message is a specific case of an LDAP intermediate response message, which is defined in RFC 4511 but is not supported in RHDS so far. It is implemented in result.c, in parallel to the functions sending entries, results and referrals, and is exposed in slapi-plugin.h for use by plugins. Since it is a standard LDAP message it should be in the core server. If it had to be implemented completely in the plugin, one would have to rely on methods not available in the API, e.g. get the connection, guess the address of the SockBuffer, encode the message and call ber_flush directly. This could work and will have to be tried when porting.
The plugin requires calls in the preoperation phase for search, entry and result plugin entry points.
The plugin requires calls in the postoperation phase for all modify operations and for searches.
To communicate between plugin calls at different entry points an object extension is used to provide state information
The presearch plugin is the central point to trigger all synchronization events,
the next steps depend on the scenario defined by mode and cookie
If the initial entries are sent in the core server search operation the sync state control needs to be added to the result message. This is the case if there was no cookie specified and the initial content is sent. The plugin gets the object extension and checks the sync flags to determine what to do. Eventually it creates a sync state control by
In persist mode a sync info message is sent instead of an LDAP result message. The only way to prevent the sending of a result message is to return an error from the PRE_RESULT plugin.
If the entries have been sent by performing a normal search, this is the place where a sync done control can be added to the pblock to be sent with the result message.
If the session is in persist mode and the entries have been sent by the normal search process, this is the place to send the sync info message; sending of the result message will be prevented in the pre_result plugin.
Because of the need to handle nested operations (see queue and pending list), the callback sync_update_persist_betxn_pre_op adds each nested operation to the end of the per-thread pending list.
These callbacks used to be postop callbacks; they were moved to BETXN callbacks to prevent the following scenario (seen in #51190):
In this scenario update A is applied before update B in the retroCL, but update B is enqueued before A. So the persistent search will send B and possibly skip A.
For each change operation a change info node is created, containing the
and it is handed over to the persistent subsystem by
This is modeled after the persistent search implementation
If a cookie has been provided and could be decoded and is valid, the changenumber CNUMBER from the cookie is used to determine the changes applied to the database since the cookie was issued.
An internal search is performed with the search base of “cn=changelog” and a search filter “(changenumber>=CNUMBER)”.
An entry handler function is passed to the internal search and this function is called for each changelog entry and its behavior is dependent on the changetype:
in all cases add a sync state control with the nsuniqueid
If an entry cannot be retrieved based on its dn or nsuniqueid, this means that it was deleted or renamed by a change not yet processed, so this changelog entry is skipped
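The lookup-and-skip rule can be sketched over an in-memory stand-in for the retro changelog. Everything here is an assumption made for illustration: the record keys mirror the retro changelog attributes (`changenumber`, `changetype`, `targetdn`, plus `targetuniqueid` as configured for the plugin), `lookup` is a hypothetical callback that returns the current entry or `None`, and delete changes are assumed to be reported by nsuniqueid alone without a lookup.

```python
def changes_since(changelog, cnumber):
    """Stand-in for the internal search with filter (changenumber>=CNUMBER)."""
    selected = [c for c in changelog if c["changenumber"] >= cnumber]
    return sorted(selected, key=lambda c: c["changenumber"])


def refresh_updates(changelog, cnumber, lookup):
    """Return (changetype, nsuniqueid, entry) tuples to send to the client."""
    updates = []
    for change in changes_since(changelog, cnumber):
        uniqueid = change["targetuniqueid"]
        if change["changetype"] == "delete":
            # Assumption: a delete is reported with the nsuniqueid only.
            updates.append(("delete", uniqueid, None))
            continue
        entry = lookup(change["targetdn"], uniqueid)
        if entry is None:
            # Deleted or renamed by a change not yet processed: skip it.
            continue
        updates.append((change["changetype"], uniqueid, entry))
    return updates
```

In this sketch a modify followed by a delete of the same entry yields only the delete, because the modified entry can no longer be retrieved.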
During that phase a dedicated thread sends the updated entries to the client. To know which entries to send, the thread reads the information it needs from a queue. The post change plugins are responsible for writing the information to the queue.
For a simple update, the information identifying the entry is written to the queue and there is no difficulty. If the update is nested it is more complex. For example U1: ADD userA; automember then adds U2: userA in Grp1; memberof then applies U3: update userA to make it memberof Grp1; then automember adds U4: userA to Grp2, and so on. So we have an ordered list of updates U1, U2, U3, U4. RetroCL registers those updates in that order, but the post change plugins register them in the queue in the order U4, U3, U2, U1. There is a risk of sending updates in an invalid order, skipping updates, or being unable to identify the next changenumber to set in the cookie.
The solution is to implement a pending list of updates. The pending list is a per-thread structure, thread_primary_op. A thread running a nested operation registers the operation at the end of the pending list. When all operations are completed and successful, the updates are written to the queue in the same order as in the pending list.
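The ordering guarantee of the pending list can be illustrated with a small model. This is a sketch under stated assumptions, not the plugin code: `betxn_pre_op`, `post_op` and `update_queue` are hypothetical names, thread-local storage stands in for thread_primary_op, and the whole batch is assumed to be dropped if any operation in it fails.

```python
import threading
from collections import deque

update_queue = deque()       # consumed by the persist thread(s)
_local = threading.local()   # holds one pending list per thread


def _pending():
    if not hasattr(_local, "ops"):
        _local.ops = []
    return _local.ops


def betxn_pre_op(op_id):
    """Register a primary or nested operation at the end of the pending list."""
    _pending().append({"id": op_id, "done": False, "ok": False})


def post_op(op_id, success):
    """Mark an operation as finished; flush once the whole batch is finished."""
    pending = _pending()
    for op in pending:
        if op["id"] == op_id:
            op["done"], op["ok"] = True, success
    if all(op["done"] for op in pending):
        if all(op["ok"] for op in pending):
            # Enqueue in registration order, not in completion order.
            update_queue.extend(op["id"] for op in pending)
        pending.clear()
```

Even though the nested operations complete in the order U4, U3, U2, U1, the queue receives them as U1, U2, U3, U4, which matches the order recorded by RetroCL.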
If a SyncRequestControl is decoded and the mode is refreshAndPersist, the following actions are performed:
run the thread until shutdown or search is abandoned
The plugin can prevent sending a result to the client by returning an error from the PRE_RESULT plugin, but it cannot prevent the call to send_ldap_result() in any case. So the o_status is set to RESULT_SENT and the operation is skipped when operations are abandoned in disconnect_server(). This is currently handled by abandoning an operation if OP_FLAG_PS is set
In the persist phase the deletion of an entry requires sending the entry without attributes. This is done by explicitly setting the attributes in the call to send_ldap_search_entry to “1.1”, but the original pblock is used to send the entry, so it accesses the original attribute list and crashes in send_specific_attrs. This can be avoided if the call to send_specific_attrs is skipped when “noattrs” is detected. Another option would be to generate a dummy entry from the DN without attributes and send this.
For each entry its nsuniqueid has to be sent in the sync state control, but the retro changelog does not make it directly available for deleted or renamed entries. The Retro Changelog has to be configured to log the nsuniqueid by adding the following line to the retro changelog config entry:
nsslapd-attribute: nsuniqueid:targetUniqueId
The persistent search code increments the connection reference count when the persistent thread is started and decrements it when it is terminated. This requires direct access to the connection data structure and is not available from the plugin.
In the initial tests the thread could be terminated properly and the server could also be shut down cleanly, so it is not clear whether this direct handling of the connection reference count is required. This needs further investigation.
These test scenarios should be run to verify the functionality
There are a few features in the RFC which are not yet implemented
The RFC requires that the server supports the LDAP cancel operation (RFC 3909). This is an abandon operation which the server has to acknowledge with a response message. This can be added if clients need it.
So far the implementation only handles ordinary entries, but the same mechanism should apply to referral entries. This will not be supported in the first version.
This will only be possible if the Retro Changelog can be configured to store the full deleted entry