Right now, locating and retrieving data on the web depends on URLs: we use search engines to find URLs, fetch chunks of data, and follow hypermedia links. No doubt, these ideas are deeply embedded in our thinking and day-to-day use. But there are alternative models, such as content-centric networking, which introduces the possibility of greatly increasing network efficiency and security by addressing the content itself, not URLs.
The idea of retrieving data by content in a network is analogous to retrieving data by content in a single computer using associative memory. With standard computer memory, you supply an address and get back the value stored at that address. Content-addressable memory (CAM), also known as associative memory, provides an additional mechanism: you supply data and get back the address(es) where matching data has been stored. Naturally, this mechanism requires extra circuitry, so this sort of memory is expensive. Furthermore, present hardware can only deal with exact matches of relatively short data words.
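To make the contrast concrete, here is a minimal Python sketch of the two lookup directions. It is only a software illustration: the `ram` list and the function names `read` and `cam_search` are invented for this example, and real CAM hardware compares all cells in parallel rather than scanning them one by one.

```python
# Illustrative only: a tiny "memory" of four words (hypothetical contents).
ram = ["cat", "dog", "cat", "emu"]


def read(address):
    """Standard memory: supply an address, get back the stored value."""
    return ram[address]


def cam_search(word):
    """Associative lookup: supply data, get back every matching address.

    Only exact matches are found, mirroring the hardware limitation
    described above. This loop is sequential; hardware CAM is not.
    """
    return [address for address, stored in enumerate(ram) if stored == word]


print(read(1))            # dog
print(cam_search("cat"))  # [0, 2]
print(cam_search("yak"))  # []
```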
In a content-centric network, a user would request data blocks by name, not by connection, so any server on the network holding the named data could respond. This would require a major expansion of capability in network hardware. It is not the same as recovering data from a network cache, which still relies on the address of the data's origin.
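The request-by-name idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any real content-centric network is implemented: the `Node` class, the `request` function, and the name `/example/report/v1` are all hypothetical, and the "network" is just a list of in-process objects asked in turn.

```python
class Node:
    """A hypothetical server that stores data blocks under content names."""

    def __init__(self, label):
        self.label = label
        self.store = {}  # content name -> data bytes

    def publish(self, name, data):
        self.store[name] = data

    def answer(self, name):
        """Return the named data if this node holds it, else None."""
        return self.store.get(name)


def request(nodes, name):
    """Ask for data by name only; the first node holding it responds.

    The caller never says which node to contact. Returns a
    (node label, data) pair, or None if no node holds the name.
    """
    for node in nodes:
        data = node.answer(name)
        if data is not None:
            return node.label, data
    return None


a, b, c = Node("a"), Node("b"), Node("c")
b.publish("/example/report/v1", b"hello")
c.publish("/example/report/v1", b"hello")  # a second copy elsewhere

print(request([a, b, c], "/example/report/v1"))  # ('b', b'hello')
print(request([c, a], "/example/report/v1"))     # ('c', b'hello')
```

The requester gets the same bytes whichever node answers, which is the property that distinguishes this model from fetching a URL tied to one origin.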
The closest that current net technology comes to this approach is the BitTorrent large-scale data sharing system. BitTorrent requires the use of shared metadata "torrent" descriptor files, plus additional communication among clients receiving data blocks, to coordinate distribution over many connections. The extra work of setting up torrents and coordinating clients means that torrents are only feasible for large files.
The CCNx project
Right now, Project CCNx, a PARC-sponsored open-source project, seems to be the most active attempt to realize a content-centric network. In the CCNx vision, each block of data would be self-sufficient, carrying name and version information uniquely identifying the block, so that any server holding a block could recognize a request and respond without additional traffic. In theory, data in high demand would become widely distributed so that requests could be satisfied closer to the client. Authenticity would be established by cryptographic signing and public keys.
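As a rough illustration of a self-sufficient block, the Python sketch below bundles a name, a version, a payload, and an authenticity tag into one object. Two caveats apply. First, none of this is the CCNx wire format or API; the `Block` type and the helper names are invented here. Second, the sketch uses an HMAC with a shared secret purely as a stand-in, because Python's standard library has no public-key signatures; the CCNx vision calls for real public-key signing, which lets anyone verify a block without holding a secret.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical self-describing data block (not the CCNx format)."""
    name: str
    version: int
    payload: bytes
    signature: bytes  # HMAC tag here; a public-key signature in a real design


def _tag(key, name, version, payload):
    # The tag covers name and version as well as the payload, so a block
    # cannot be relabeled without invalidating it.
    message = f"{name}|{version}|".encode() + payload
    return hmac.new(key, message, hashlib.sha256).digest()


def make_block(key, name, version, payload):
    return Block(name, version, payload, _tag(key, name, version, payload))


def verify(key, block):
    """Check the block itself, regardless of which server delivered it."""
    expected = _tag(key, block.name, block.version, block.payload)
    return hmac.compare_digest(expected, block.signature)


key = b"demo-key"  # stand-in secret, for illustration only
block = make_block(key, "/example/report", 1, b"hello")
tampered = Block(block.name, block.version, b"HELLO", block.signature)

print(verify(key, block))     # True
print(verify(key, tampered))  # False
```

Because the check depends only on the block's own contents, the client does not need to trust the server that happened to hold the copy.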
The CCNx project recognizes that the development of the concept is at a very early stage. The releases of specifications and reference software are intended for an audience of researchers and experimenters, to stimulate ideas and conversations. Eventual wide use of CCNx would require the creation of very specialized hardware, but it is now possible to experiment with the concepts in software. The current specification and software release (0.2.0, December 2009) uses C and Java 1.5 or 1.6 on Linux, Unix, or MacOS operating systems. Using the sample software, you can experiment with a named data repository, simple client sample applications, and monitoring tools on your local LAN.
It seems to me that the biggest problem with the CCNx vision is that it would take a major overhaul of the way the current network hardware and browsers work to make it useful. Content-centric networking faces big obstacles if it is to find wide application. On the other hand, as I shall discuss in the next article, the concepts to support the Semantic Web are being developed and applied incrementally in small areas of the net.