Fancy helping out? Want to get your feet wet with Erlang? Contribute to CouchDB!

This document is out of date. Although some of it still applies, you should seek clarification from the mailing list or IRC before starting work on anything.


See [WWW] for more about Erlang. A quick "What's Erlang" is provided in the [WWW] FAQ.

Erlang Application Design

Erlang applications typically share a similar design. As Erlang was developed, a lot of best practices were established. It is a good idea to follow them if you want to benefit from all of Erlang's (cool) features.

The Erlang VM makes creating new threads (called processes in Erlang) extremely cheap in terms of CPU time. It also implements highly efficient message passing between threads. It does a good job of distributing threads over all available CPUs and cores of a system, and it can even distribute them over a network onto other machines.
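As a minimal sketch of what cheap threads and message passing look like in practice, the module below spawns a process, sends it a message and waits for the reply. The module and function names (ping_sketch, run, echo) are made up for illustration and are not part of CouchDB.

```erlang
-module(ping_sketch).
-export([run/0]).

%% Spawn a new process, send it a message and wait for its answer.
run() ->
    Pid = spawn(fun echo/0),          % spawn/1 starts a lightweight process
    Pid ! {ping, self()},             % "!" sends a message to a process
    receive
        {pong, Pid} -> ok             % reply from the process we spawned
    after 1000 ->
        timeout                       % give up after one second
    end.

%% The spawned process answers a single ping and then terminates.
echo() ->
    receive
        {ping, From} -> From ! {pong, self()}
    end.
```

Calling ping_sketch:run() from the Erlang shell should return ok once the reply has arrived.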

Writing an Erlang program requires you to think about its design, in particular about where concurrency can (or should) happen. It is essential to identify the parts that don't need tight coupling (of function calls and data structures). You then create modules that correspond to the parts of your system. Modules are a bit like classes in that they include all the interfaces, functions and data structures needed for a specific task in your program, but they are not classes in the OO sense. More on CouchDB's modules in a bit.
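To make the comparison with classes a little more concrete, here is a small sketch of a module that bundles a data structure with the functions that operate on it. The module name and its functions (counter_sketch, new, increment, value) are invented for this example.

```erlang
-module(counter_sketch).
-export([new/0, increment/1, value/1]).   % the module's public interface

%% The record definition stays inside the module; callers only use
%% the exported functions and never touch the fields directly.
-record(counter, {n = 0}).

%% Create a counter starting at zero.
new() -> #counter{}.

%% Return a new counter with the count raised by one.
%% Data is immutable, so the old counter is left unchanged.
increment(#counter{n = N} = Counter) -> Counter#counter{n = N + 1}.

%% Read the current count.
value(#counter{n = N}) -> N.
```

Unlike an object, the counter carries no hidden mutable state: every call returns a new value, which is part of what makes it safe to pass between threads.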

Another significant concept in Erlang application design is the deliberate absence of defensive error and exception handling. In a function, you only handle the cases you are interested in, not every problem that could occur. If a function encounters something it does not expect, the thread running it is simply terminated and the calling thread is notified. Since the calling thread usually does not expect the termination either, it dies as well. This propagates all the way back up the caller chain to the original caller. This might sound weird at first, but it really helps to keep code concise and maintainable.
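The following sketch shows the idea under simple assumptions: a function that handles only the inputs it cares about, and a caller that is notified when the thread running it dies. The names (crash_sketch, area, run) are hypothetical. The example uses a monitor so that the caller survives and can report what happened; with a link (spawn_link) the caller would be terminated as well, as described above.

```erlang
-module(crash_sketch).
-export([area/1, run/0]).

%% Only the shapes we care about are handled. There is no catch-all
%% clause, so any other input raises a function_clause error.
area({square, Side}) -> Side * Side;
area({circle, Radius}) -> math:pi() * Radius * Radius.

%% Run area/1 with an unexpected shape in a separate process.
run() ->
    %% spawn_monitor/1 starts the process and asks for a 'DOWN'
    %% message when it terminates.
    {Pid, Ref} = spawn_monitor(fun() -> area({triangle, 1, 2}) end),
    receive
        {'DOWN', Ref, process, Pid, {function_clause, _Stack}} ->
            worker_crashed
    after 1000 ->
        timeout
    end.
```

The crashing process also causes an error report to be printed on the console, which is normal.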

Since Erlang is built to create server software, all the modules you create that do actual work are supervised by a supervisor thread that makes sure the worker threads are always alive. If a module terminates because of an error, the supervisor just respawns it.
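A rough sketch of such a supervisor, using the supervisor behaviour that ships with Erlang/OTP, could look like this. The module name, the registered name sketch_worker and the trivial worker loop are assumptions made for the example; a real worker would normally be a gen_server or similar.

```erlang
-module(sup_sketch).
-behaviour(supervisor).
-export([start_link/0, init/1, start_worker/0]).

%% Start the supervisor and register it under the module name.
start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Supervisor callback: describe the children and the restart policy.
init([]) ->
    %% Restart only the child that died; give up if more than
    %% 5 restarts happen within 10 seconds.
    RestartStrategy = {one_for_one, 5, 10},
    Worker = {worker, {?MODULE, start_worker, []},
              permanent, 5000, worker, [?MODULE]},
    {ok, {RestartStrategy, [Worker]}}.

%% Start the worker, linked to the supervisor that calls this function.
start_worker() ->
    Pid = spawn_link(fun worker_loop/0),
    register(sketch_worker, Pid),
    {ok, Pid}.

%% A placeholder worker that just waits for messages forever.
worker_loop() ->
    receive
        _Any -> worker_loop()
    end.
```

After sup_sketch:start_link(), killing the worker with exit(whereis(sketch_worker), kill) should result in a new process being registered under the same name shortly afterwards.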

CouchDB's Modules

CouchDB consists of a small set of modules that build up its functionality. To the outside, CouchDB provides a REST interface via an HTTP server, so there is a module that handles incoming HTTP requests.

The HTTP server module then passes requests on to the couch module. The couch module looks at the request and invokes the correct functions to serve it. If a request asks for a document in a database, the couch module asks the database module to get the data. The database module checks whether the document and database actually exist and then, in turn, asks the storage module to read the data from disk.
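The layering described above can be sketched in a few lines. This is not CouchDB's actual code: the module name, the function names and the request tuple are all invented, and a list lookup stands in for reading from disk.

```erlang
-module(flow_sketch).
-export([handle_request/2]).

%% "couch" layer: look at the request and pick the function to serve it.
%% Dbs is a list of {DbName, [{DocId, Body}]} tuples.
handle_request({get, DbName, DocId}, Dbs) ->
    get_doc(DbName, DocId, Dbs).

%% "database" layer: check that the database and the document exist.
get_doc(DbName, DocId, Dbs) ->
    case lists:keyfind(DbName, 1, Dbs) of
        false ->
            {error, no_such_database};
        {DbName, Docs} ->
            case lists:keymember(DocId, 1, Docs) of
                false -> {error, not_found};
                true  -> read(DocId, Docs)
            end
    end.

%% "storage" layer: fetch the document body.
read(DocId, Docs) ->
    {DocId, Body} = lists:keyfind(DocId, 1, Docs),
    {ok, Body}.
```

Each layer only knows about the one directly below it, which is the loose coupling the design section asks for.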

CouchDB allows you to create views on your data. If you query a view, the couch module asks the view module to handle all the dirty work. The view module is, in fact, a standalone program written in C and based on the Mozilla SpiderMonkey engine, which CouchDB just launches and talks to via standard IO. The view server is also managed by a supervisor thread as a daemon process in your operating system.
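Talking to an external program over standard IO is done with a port in Erlang. The sketch below assumes a Unix-like system and uses cat as a stand-in for the external program, since it simply echoes every line it receives; the protocol actually spoken with the view server is not shown here. The module and function names are made up.

```erlang
-module(port_sketch).
-export([ask/1]).

%% Launch an external OS process, send it one line on its standard
%% input and read one line back from its standard output.
ask(Line) ->
    %% {line, N} delivers output one line at a time (up to N bytes);
    %% binary delivers the data as a binary instead of a list.
    Port = open_port({spawn, "cat"}, [{line, 4096}, binary]),
    port_command(Port, [Line, "\n"]),
    Reply = receive
                {Port, {data, {eol, Data}}} -> {ok, Data}
            after 2000 ->
                timeout
            end,
    port_close(Port),                 % closes the program's standard input
    Reply.
```

Because the external program runs in its own operating system process, a crash inside it cannot take the Erlang VM down with it.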

Fulltext searching is implemented in a very similar way. It is handled by two independent daemons: one takes care of writing a search index and the other of querying it.

Help Wanted

Here's a short list of areas that can benefit from your help.

Security and Authentication

CouchDB currently lacks any security. We want to introduce a super-flexible permission system with users and groups and read and write permissions that can be enforced on documents and databases. Please see the [WWW] technical overview and this [WWW] blog post for some info on what is planned.

The prerequisite to Security is Identity. The proposal is to use LDAP as the directory of users and groups. Once a user is authenticated, the server will know the distinguished name of the current user. It may have a data structure representing the full LDAP entry of the current user, which it can pass to JavaScript functions.

The JavaScript security function may live in a design document. There might be several security functions per database, perhaps one for each document type. There could perhaps also be security functions on the data documents themselves.

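For example, the function below allows everyone to read, but only the creator of the document may update or delete it. Only the return statements of the original example survive on this page; the signature and the field names (user.dn, doc.creator) are assumptions added for illustration.

    function (user, doc, operation) {             // hypothetical signature
      if (operation == "read") return true;       // everyone may read
      if (user.dn == doc.creator) return true;    // the creator may update or delete
      return false;                               // everyone else is denied
    }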

Database Partitioning

To handle vast amounts of data, databases need to be partitioned over multiple servers. CouchDB will make this super-easy for you, once you've implemented it. @@ add more details
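How partitioning will work is not decided on this page. Purely as an illustration of one possible approach, the sketch below picks a server for a document by hashing its id. The module and function names are hypothetical, and a real design would also have to deal with servers being added or removed.

```erlang
-module(partition_sketch).
-export([node_for/2]).

%% Choose the server responsible for a document by hashing its id.
%% Nodes is a non-empty list of server names.
node_for(DocId, Nodes) when Nodes =/= [] ->
    %% erlang:phash2/2 returns an integer between 0 and Range - 1.
    Index = erlang:phash2(DocId, length(Nodes)),
    lists:nth(Index + 1, Nodes).      % lists:nth/2 counts from 1
```

The same document id always maps to the same server as long as the list of servers does not change.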


Currently being worked on.

The fulltext search implementation only does a bare minimum for now. You can improve that by adding more flexible configuration options. For example, it'd be nice to be able to restrict fulltext searching only to a specific field or set of fields in a document, or a restriction based on a field value.


Currently being worked on.

Sphinx is a standalone full-text search engine, meant to provide fast, size-efficient and relevant fulltext search functions to other applications. Right now it supports reading data from MySQL, PostgreSQL, or XML. Hopefully CouchDB will be another possible data source soon. Information on [WWW] @@ add more details


Currently being worked on.

Unit Tests

We're looking to develop a build-time unit test framework similar to the tests currently available at run time in the WWW admin console. Perhaps you have an idea of how this might be implemented.
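One possible starting point, offered only as a suggestion, is EUnit, the unit testing framework that ships with Erlang/OTP. The sketch below tests a made-up function; nothing here is taken from CouchDB's code, and the project may well settle on something else.

```erlang
-module(math_sketch_tests).
-include_lib("eunit/include/eunit.hrl").

%% The function under test, defined here to keep the sketch
%% self-contained.
double(X) -> 2 * X.

%% EUnit picks up every function whose name ends in _test.
double_test() ->
    ?assertEqual(4, double(2)).

double_zero_test() ->
    ?assertEqual(0, double(0)).
```

Once compiled, the tests can be run with eunit:test(math_sketch_tests), which could be hooked into the build.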

Running CouchDB From the Source Directory

This is documented on the page Running_Couchdb_in_Dev_Mode.

last edited 2009-01-26 22:23:54 by ChrisAnderson