SPD Server Distributed Locking

Overview: SPD Server Distributed Locking

SPD Server 5.3 supports distributed locking for data stored in HDFS. Distributed locking provides synchronization and group coordination services to clients over a network connection. As its service provider, SPD Server uses the Apache ZooKeeper coordination service, specifically the implementation of the Shared Lock recipe that is provided by Apache Curator.
Distributed locking provides the following benefits:
  • The lock server maintains the lock state information in memory and does not require Write permission to any client or data library disk storage locations.
  • A process that requests a lock on a table that is already locked can choose to wait for the table to become available, rather than have the lock request fail immediately.
  • If a process terminates abnormally while holding locks on tables, the lock server automatically drops all locks that the process was holding, which eliminates the possibility of leftover lock files.
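The difference between a lock request that fails immediately and one that waits can be sketched with a local analogy. The Python example below uses an in-process threading.Lock in place of the ZooKeeper-backed table lock. It is not SPD Server or Curator code, and the helper name try_lock and the wait times are assumptions chosen for illustration.

```python
import threading
from typing import Optional


def try_lock(lock: threading.Lock, wait_seconds: Optional[float]) -> bool:
    """Return True if the lock was acquired.

    wait_seconds=None models a request that fails immediately when the
    table is locked; a number models a request that waits up to that long.
    """
    if wait_seconds is None:
        return lock.acquire(blocking=False)
    return lock.acquire(timeout=wait_seconds)


# Stand-in for a table lock that another process already holds.
table_lock = threading.Lock()
table_lock.acquire()

# Fail-fast request: the table is locked, so the request is refused at once.
immediate = try_lock(table_lock, None)

# Waiting request: the holder releases the lock after 0.2 seconds,
# which is well inside the 2-second wait, so the request succeeds.
release_timer = threading.Timer(0.2, table_lock.release)
release_timer.start()
waited = try_lock(table_lock, 2.0)
release_timer.join()

print("fail-fast request acquired the lock:", immediate)
print("waiting request acquired the lock:", waited)
```

In this analogy, the wait time plays the role that the lock wait timeout plays for distributed locking: the request succeeds if the table becomes available within the wait, and fails otherwise.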

Understanding the Service Provider

Apache ZooKeeper is an open-source server that provides reliable coordination for distributed client applications over a network. ZooKeeper safely coordinates access to resources that are shared among applications or processes. At its core, ZooKeeper is a fault-tolerant, multi-machine server that maintains a virtual hierarchy of data nodes that store coordination data. For more information about ZooKeeper and the ZooKeeper data nodes, see Apache ZooKeeper.
Apache Curator is a high-level API that simplifies using ZooKeeper. Curator adds many features that build on ZooKeeper and handle the complexity of managing connections to the ZooKeeper cluster. For more information about Curator, see Curator.
SPD Server accesses the Curator API to provide the locking services.

Requirements for SPD Server Distributed Locking

SPD Server distributed locking has the following requirements:
  • ZooKeeper 3.4.0 or later must be downloaded, installed, and running on the Hadoop cluster. The zookeeper JAR file is required.
  • Curator 2.7.0 or later must be downloaded to the Hadoop cluster. The following Curator JAR files are required:
    • curator-client
    • curator-framework
    • curator-recipes

Requesting Distributed Locking

By default, SPD Server uses the standard SPD Server member-level locking. To request distributed locking, you must specify the following options in the spdsserv.parm parameter file. These options provide the information that SPD Server needs to communicate with ZooKeeper:
  • ZKPR_QUORUM= to specify the list of quorum machines.
  • ZKPR_PORT= to specify the I/O port to service requests.
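A ZooKeeper client is normally given a connection string of comma-separated host:port pairs. The Python sketch below shows one plausible way that a quorum list and a single port could be combined into such a string. It assumes that the quorum is written as a comma-separated list of host names, which this document does not state. The function name zk_connect_string and the example host names are hypothetical. Port 2181 is the conventional ZooKeeper client port.

```python
def zk_connect_string(quorum: str, port: int) -> str:
    """Combine a comma-separated host list and one port into host:port pairs."""
    # Assumption: the quorum value is a comma-separated list of host names.
    hosts = [host.strip() for host in quorum.split(",") if host.strip()]
    if not hosts:
        raise ValueError("the quorum must name at least one host")
    return ",".join(f"{host}:{port}" for host in hosts)


# Hypothetical three-machine quorum that listens on port 2181.
connect = zk_connect_string(
    "zk1.example.com, zk2.example.com, zk3.example.com", 2181
)
print(connect)
```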
In addition, these parameter file options can be included in the spdsserv.parm parameter file to change default values:
  • ZKPR_CTIMEOUT= to specify the connection timeout.
  • ZKPR_LTIMEOUT= to specify the lock wait timeout.
  • ZKPR_MAXRETRY= to specify the number of times that Curator attempts to connect to ZooKeeper before failing.
  • ZKPR_RETRYSLEEP= to specify the amount of time that Curator sleeps between attempts to connect to ZooKeeper.
  • ZKPR_RPRTHRESH= to specify the wait time before deleting an empty ZooKeeper server node.
  • ZKPR_STIMEOUT= to specify the wait time before the session is considered expired.
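The interaction between a retry count and a sleep interval, as described for ZKPR_MAXRETRY= and ZKPR_RETRYSLEEP=, can be sketched as a bounded retry loop. The Python example below is a minimal illustration and not Curator's implementation. It assumes that the retry count is the total number of connection attempts, and the function name connect_with_retry, the simulated connection, and the 0.5 value are hypothetical.

```python
import time
from typing import Callable


def connect_with_retry(
    try_connect: Callable[[], bool],
    max_retry: int,
    retry_sleep: float,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Attempt to connect up to max_retry times, sleeping between attempts.

    Returns the number of attempts used. Raises ConnectionError if every
    attempt fails.
    """
    for attempt in range(1, max_retry + 1):
        if try_connect():
            return attempt
        if attempt < max_retry:
            # Sleep only when another attempt will follow.
            sleep(retry_sleep)
    raise ConnectionError(f"gave up after {max_retry} attempts")


# Simulated connection that fails twice and then succeeds.
outcomes = iter([False, False, True])
sleeps = []  # records each requested sleep instead of actually sleeping
attempts_used = connect_with_retry(
    lambda: next(outcomes), max_retry=5, retry_sleep=0.5, sleep=sleeps.append
)
print("attempts used:", attempts_used)
print("sleeps requested:", sleeps)
```

Passing the sleep function as a parameter keeps the sketch fast to run and makes the number of sleeps visible. A larger retry count or sleep interval extends the time that a client spends trying to reach ZooKeeper before the connection is reported as failed.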