Java Protocol Handler
Introduction
Until Java 2, a real-world project had a lot of infrastructure to build: class loaders, security, distributed lookup and communications, and so on. Between Java(TM) 1.1 and Java 2, Sun fleshed out concrete implementations of what had been merely interesting abstractions: security managers based on a concrete java.lang.SecurityManager implementation; secure class loading based on java.lang.SecureClassLoader; and a rich GUI framework provided by Swing. The motley crew of distributed application tools provided by Java 1.1 has become a parade of frameworks in Java 2. It is difficult for a single programmer to even get a handle on the differences between Jini, Enterprise JavaBeans(TM), JMX, FMA/Jiro, Java Naming and Directory Interface(TM), JMS, and so on.

Interestingly, almost all of these powerful APIs and implementations have strong dependencies on URL objects. HTTP and FTP are good, general-purpose protocols, definitely enough for general stream-based resource access. For fine-grained resource types like internationalizable resource strings, Java class file resources, and security policy resources, the workhorse HTTP and FTP protocols are long in the tooth and short on configurable performance. For faster, more efficient data flow you often need protocols designed for specific applications.

Based on a set of easily configurable factory patterns, the java.net.URL architecture allows you to implement a custom protocol handler easily and slip it into an application's java.net.URL class structure. This architecture has hardly changed since 1.0. Only now, with all the dependencies on URLs in Java 2's Core APIs and extensions, is it really worth your while to study this architecture and maybe even build your own custom protocol handlers. The really cool part is that your custom protocol handler is instantly usable by all of Java 2's advanced API implementations.
For example, you can use a java.net.URLClassLoader to load classes and other resources not just from "file:", "http:", and "ftp:" URLs, but also through your own custom protocols: DB-based protocols, protocols with complex encryption and security, or XML-based protocols (SOAP, XML-RPC, and so forth). In addition, custom protocols and URLs fit right into Java 2's configurable code-based security. A Java policy file that refers to your custom protocols will correctly sandbox code loaded using those protocols. The myriad systems from major vendors that are based on migrating code, such as Jini, JMX, FMA/Jiro, and Java Embedded Server, are also based, ultimately, on the lowly java.net.URL class and its architecture of pluggable protocol handlers.

This paper describes everything you need to know to exploit the custom protocol plug-in architecture. After reading this paper, you'll have a good understanding of the java.net.URL architecture and how different java.net package objects interact to resolve URLs into resource streams. Knowing the architecture shows you how to build your own protocol handlers, which involves (at a minimum) providing a concrete implementation of two important java.net package classes: java.net.URLStreamHandler and java.net.URLConnection.

Once you've designed and created a custom protocol handler, you'll also need to know some tricks about deploying it in real-world Java execution environments. For most deployment scenarios, just placing a protocol handler on the classpath isn't enough. This paper details everything you need to know about deployment. Once you understand the architecture, know how to implement custom protocol handlers, and know how to deploy them, you'll be able to integrate them into your Java-based applications with ease.
http://java.sun.com/jsp_utils/PrintPage.jsp?url=https%3A%2F%2Fsummer-heart-0930.chufeiyun1688.workers.dev%3A443%2Fhttp%2Fjava.sun.com%2Fdeveloper%2FonlineTraining%2Fprotocolhandlers%2F
9/7/2011
Adding this grant into the current Java policy file would cause all code loaded from the CVS server to be trusted. Anywhere you use a URL in Java you can install and use a custom protocol handler.
java.net.URL Architecture
It was surprising to me to realize how minimal the java.net.URL class implementation really is. A java.net.URL object instance is used to represent a URL string, where a URL string usually follows this pattern:
protocol://host:port/filepath#ref
The public accessor methods of the URL class basically reflect the private fields of a URL object, and they also provide access to the individual parts of a URL string (see Figure 1).
Figure 1: A URL object stores the contents of a URL string, parsed into several fields
Most of the code in the java.net.URL class is dedicated to just the data representation in Figure 1: accessor method implementations, an equals() implementation to compare two URL objects for equivalency, a hashCode() implementation so URLs can easily be key objects in Maps, and a sameFile() method that compares two URL objects to see if everything save the "ref" field is the same.
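The accessors and comparison methods just listed can be tried out with a few lines of code. The following is a minimal sketch, not part of the original paper; the class name UrlPartsDemo and the 127.0.0.1 address are assumptions chosen so that the comparison methods need no name lookup.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlPartsDemo {
    /** Formats the fields that a URL object parsed out of its string form. */
    static String describe(URL url) {
        return "protocol=" + url.getProtocol()
            + " host=" + url.getHost()
            + " port=" + url.getPort()
            + " file=" + url.getFile()
            + " ref=" + url.getRef();
    }

    public static void main(String[] args) throws MalformedURLException {
        // Two URLs that differ only in their "ref" part.
        URL intro = new URL("http://127.0.0.1:8080/docs/index.html#intro");
        URL usage = new URL("http://127.0.0.1:8080/docs/index.html#usage");

        System.out.println(describe(intro));
        // sameFile() ignores the ref field; equals() takes it into account.
        System.out.println("sameFile: " + intro.sameFile(usage));
        System.out.println("equals:   " + intro.equals(usage));
    }
}
```

Note that recent JDK releases deprecate the URL constructors in favor of going through java.net.URI; the constructor is used here because it is the API this paper discusses.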
Figure 2: Sequence of interactions between java.net objects to resolve a URL into a stream
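import java.net.MalformedURLException;
import java.net.URL;

public class BadProtoExample {
    public static void main(String[] args) {
        try {
            // No handler is registered for the "bogus" protocol,
            // so the URL constructor itself fails.
            URL url = new URL("bogus://www.microsoft.com");
            System.out.println("The URL is: " + url);
        } catch (MalformedURLException mue) {
            System.err.println(mue);
        }
    }
}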
This example program, when run on a typical Java Virtual Machine [1], produces this output on the process's standard error stream:
java.net.MalformedURLException: unknown protocol: bogus
Figure 3: Flow of execution while an application creates, then resolves, a URL into an InputStream
Figure 3 shows the URL constructor using the URLStreamHandlerFactory to obtain a URLStreamHandler. When client code later calls openStream(), the URLStreamHandler is itself used as a factory to create a URLConnection. Most of the code that establishes a connection to a server and downloads a resource stream is located in the URLConnection. The URLConnection is a single-use object whose whole life's purpose is to generate an InputStream for the client code to consume.
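The flow described above can be observed from client code by taking the two steps separately instead of calling openStream(). The following is a small sketch using the JDK's built-in file: handler so that it runs without a network; the class name ResolveUrlDemo and the temporary file are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ResolveUrlDemo {
    /** Resolves a URL step by step: handler -> URLConnection -> InputStream. */
    static String readAll(URL url) throws IOException {
        // openConnection() asks the URL's handler to create a URLConnection.
        URLConnection connection = url.openConnection();
        // getInputStream() connects if necessary and yields the resource stream.
        try (InputStream in = connection.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("resolve-demo", ".txt");
        try {
            Files.write(file, "hello from a file: URL".getBytes(StandardCharsets.UTF_8));
            URL url = file.toUri().toURL();   // a file: URL, served by a built-in handler
            System.out.println(url.getProtocol() + " -> " + readAll(url));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Calling url.openStream() would give the same bytes in one step; splitting the call only makes the intermediate URLConnection visible.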
Class Relationships
The URL class is a state storage class. The URL object stores information about the parts of a URL string, and manages other objects in order to resolve that URL string into a resource stream when asked to do so. Figure 4 shows the static relationship between classes in the java.net.URL architecture.
Figure 4: Static relationship between classes and implementations in UML
The URL object uses a URLStreamHandlerFactory to create a URLStreamHandler. URLStreamHandlerFactory is in fact an interface. Java provides the default implementation of the factory interface, but you can replace it with your own. In certain deployment scenarios discussed later in this paper, you must provide your own factory implementation.

The URL object maintains an immutable reference to its URLStreamHandler. When asked to resolve itself into a resource stream, the URL object uses its URLStreamHandler to create a URLConnection, then as part of the same code block the URL completes its use of the URLConnection. The URLConnection just represents a temporary, synchronous URLStreamHandler session. It's a place for the URLStreamHandler to place URL-object-specific state, which is necessary for stateless URLStreamHandlers.

The URLConnection is an interesting beast. Given the system as I've described it so far, the behavior of a URLConnection is only parameterizable by the URL object's state. That is, the only variables whose values differentiate the behavior of two URLConnection objects created by the same URLStreamHandler would be the field values of the URL objects (the host names, port numbers, filepaths, and ref strings stored internally by the URL objects) that create the two URLConnections. A request represented as a URL could not be augmented with any further information (such as HTTP POST data), especially because the URL class is final, so subclasses can't add state members.

I'll describe URLConnection objects in more depth later. But I'll tell you now that the URL class defines a public method openConnection() (as opposed to openStream()) whose implementation returns the URLConnection object itself back to controlling code, rather than an InputStream. This allows client code to parameterize a resource request beyond what the state members of the URL object can represent.
For example, a URLConnection created by an http: URL generates an HTTP POST request when its public interface is used in a particular way. The contents of a single http: URL string alone can't do that.
And for any reader who is outwardly hostile to anything Microsoft-related, such as a Windows registry, I can only offer that the Windows registry's hierarchical arrangement is so close to the implicitly hierarchical arrangement of resources implied by URL string syntax that a win32registry: protocol handler is a great example of the power of custom protocol handlers. Normally you can only access a Windows registry in standard Java with native methods, because Microsoft only provides accessor functions in C-callable DLLs. With the win32registry: protocol handler, the protocol handler class implements all of the native code. Code that wants to access data in the registry need only reference the data with a win32registry: URL.

For example, if you wanted to know whether or not Sun's JRE was installed on the local system, you could easily find out by querying the Windows registry. JavaSoft adds registry entries storing the configuration of the JRE during JRE installation. Under the registry key HKEY_LOCAL_MACHINE\SOFTWARE\JavaSoft\Java Runtime Environment\1.2 are all the JRE configuration parameters, such as the installation directory of the JRE (stored in the "JavaHome" subkey). Using the win32registry: protocol handler, you can look up the value of this key with just a few lines of code:
URL url = new URL("win32registry:///HKEY_LOCAL_MACHINE/"
    + "SOFTWARE/JavaSoft/Java Runtime Environment/1.2/JavaHome");
InputStream is = url.openStream();
DataInputStream dis = new DataInputStream(is);
String strJavaHome = dis.readUTF();
The code example shows that string-valued keys are returned as UTF-encoded strings, readable with DataInputStream.readUTF(). Integer-valued keys will result in streams that include a 4-byte integer in network byte order. Binary-valued keys will result in streams that just include the binary data of the key. [2]
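To make the stream formats concrete, the sketch below encodes and decodes a string value and an integer value with java.io.DataOutputStream and java.io.DataInputStream, whose writeInt()/readInt() use big-endian (network) byte order. It works on in-memory byte arrays rather than a real registry; the class name RegistryStreamFormats and the sample values are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class RegistryStreamFormats {
    /** Encodes a string value in the form that readUTF() expects. */
    static byte[] encodeString(String value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        new DataOutputStream(bytes).writeUTF(value);
        return bytes.toByteArray();
    }

    /** Encodes an integer value as 4 bytes, most significant byte first. */
    static byte[] encodeInt(int value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        new DataOutputStream(bytes).writeInt(value);
        return bytes.toByteArray();
    }

    /** Decodes a stream that carries a string-valued key. */
    static String decodeString(byte[] stream) throws IOException {
        return new DataInputStream(new ByteArrayInputStream(stream)).readUTF();
    }

    /** Decodes a stream that carries an integer-valued key. */
    static int decodeInt(byte[] stream) throws IOException {
        return new DataInputStream(new ByteArrayInputStream(stream)).readInt();
    }

    public static void main(String[] args) throws IOException {
        byte[] text = encodeString("C:\\jre1.2");
        byte[] number = encodeInt(258);
        System.out.println(decodeString(text));
        System.out.println(number.length + " bytes -> " + decodeInt(number));
    }
}
```

A binary-valued key needs no decoding step at all: the client simply reads the raw bytes from the InputStream.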
Implementing URLStreamHandler
The first thing to implement is a concrete URLStreamHandler subclass. The URLStreamHandler implementation actually only has two relatively simple tasks to perform:
1. When a new win32registry: URL is created, the URLStreamHandler must parse the URL string into the various field values host, port, file, and ref.
2. Create and return a new URLConnection instance when the URLStreamHandler's openConnection() method is called.
Task 1 is actually quite important. Whenever a new URL object is created, the URL constructor first obtains a reference to the appropriate URLStreamHandler (using the URLStreamHandlerFactory, as previously described), and second asks the URLStreamHandler to parse the URL string into host, port, file and ref field values. Figure 5 illustrates this two-part construction process.
Figure 5: Two-part construction of a URL object. The handler populates the URL's fields.
The URL constructor calls handler.parseURL(). Your implementation of parseURL() is supposed to parse the URL string and use the embedded values to populate the URL object's host, port, file, and ref fields. Rather than every protocol handler designer rewriting the same code to parse a URL into its constituent parts, the default implementation of parseURL() assumes the URL is in the normal protocol://host:port/file#ref format. So, if your handler expects URLs to be of that form, you don't actually have to implement parsing code. [4] The win32registry: protocol handler expects URLs to be of the normal form, so there's no need to override the inherited parseURL() method implementation.
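The inherited parsing behavior can be checked with a handler that overrides nothing but the mandatory openConnection() method. This is a minimal sketch under stated assumptions: ParseOnlyHandler and the sample URL are hypothetical, and the handler is passed directly to the three-argument URL constructor so that no factory lookup is involved.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

class DefaultParseDemo {
    /** Hypothetical handler that inherits parseURL() and cannot open connections. */
    static class ParseOnlyHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL u) throws IOException {
            throw new IOException("this sketch only demonstrates parsing");
        }
    }

    /** Builds a URL whose fields are filled in by the inherited parseURL(). */
    static URL parse(String spec) throws MalformedURLException {
        return new URL(null, spec, new ParseOnlyHandler());
    }

    public static void main(String[] args) throws MalformedURLException {
        URL url = parse("win32registry://localhost:99/HKEY_LOCAL_MACHINE/SOFTWARE#JavaHome");
        System.out.println("host=" + url.getHost());
        System.out.println("port=" + url.getPort());
        System.out.println("file=" + url.getFile());
        System.out.println("ref=" + url.getRef());
    }
}
```

A handler for a protocol whose URLs do not follow the normal form would override parseURL() instead and populate the fields itself.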
public abstract class URLStreamHandler {
    ...
    protected abstract URLConnection openConnection(URL u) throws IOException;
    ...
}
The handler's openConnection() method is called directly by the URL class's openStream() implementation. An openConnection() call is a request to the handler that it create an object that can resolve the URL into a resource stream. The URLConnection object is that object. The name, URLConnection, indicates that the java.net.URL architecture developers expected that the only way to resolve a URL object into a stream is actually to open a connection to some server, like an HTTP or an FTP server.

My win32registry: handler actually doesn't have to open a network connection to any other server. Still, I must provide a URLConnection implementation that effects the same behavior: my connection implementation won't contact a different server but will instead contact the Windows operating system through a series of native calls. A deep explanation of the win32registry: URLConnection implementation follows in a moment. To complete the win32registry: handler's openConnection() implementation, all I have to do is create and return my custom URLConnection instance:
    protected URLConnection openConnection(URL u) {
        // ...create and return a custom URLConnection initialized
        // with a reference to the target URL object...
        return new RegistryURLConnection(u);
    }
}
In fact, the aforementioned code is just about my entire URLStreamHandler implementation; I left most of the implementation for the RegistryURLConnection class.
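Because the registry handler depends on native code, the same handler-as-factory shape is easier to try with a toy protocol. The sketch below is not the author's implementation: the memo: protocol, MemoConnection, and Handler are hypothetical names, and the "resource" is simply the URL's own path text.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

class MemoProtocol {
    /** Hypothetical connection whose resource is the URL's path text. */
    static class MemoConnection extends URLConnection {
        MemoConnection(URL url) {
            super(url);
        }

        @Override
        public void connect() {
            connected = true;   // nothing to open; the data lives in the URL itself
        }

        @Override
        public InputStream getInputStream() {
            connect();
            byte[] body = url.getPath().getBytes(StandardCharsets.UTF_8);
            return new ByteArrayInputStream(body);
        }
    }

    /** The handler's only job here is to act as a factory for connections. */
    static class Handler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL u) {
            return new MemoConnection(u);
        }
    }

    /** Creates a memo: URL with an explicit handler and reads its stream. */
    static String fetch(String spec) throws IOException {
        URL url = new URL(null, spec, new Handler());
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetch("memo:/notes/today"));
    }
}
```

As in the registry handler, almost all of the behavior sits in the URLConnection subclass; the handler stays a few lines long.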
Figure 6: The URLStreamHandlerFactory creates the URLStreamHandler using reflection, first loading the appropriate class and then creating an instance.
Expected URLStreamHandler class name.
The java.protocol.handler.pkgs system property may contain a list of package prefixes, rather than just a single package prefix name. The list is pipe-character ('|') separated. The URLStreamHandlerFactory evaluates each prefix in left-to-right order, stopping when it finds the first generated fully qualified class name matching a class on the classpath. Note also that the URLStreamHandlerFactory always appends the default protocol handler package prefix, sun.net.www.protocol, to this system property. That is, Sun's default handlers are used only if a handler can't be found in a different package first. The win32registry: handler must be in an appropriately named package:
The VM must be started up with an appropriate value for the java.protocol.handler.pkgs system property (type the following on one line):
java -Djava.protocol.handler.pkgs=com.develop.protocols ...
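The naming convention can be illustrated by generating the class names that such a lookup would consider. The sketch below is not the JDK's own code: it assumes the prefixes are tried in the order listed, with the default sun.net.www.protocol prefix last, and the org.example.handlers prefix is a made-up second entry.

```java
import java.util.ArrayList;
import java.util.List;

class HandlerNameCandidates {
    static final String DEFAULT_PREFIX = "sun.net.www.protocol";

    /** Lists the class names implied by the naming convention, in assumed search order. */
    static List<String> candidates(String pkgsProperty, String protocol) {
        List<String> names = new ArrayList<>();
        if (pkgsProperty != null) {
            // The property value is a '|'-separated list of package prefixes.
            for (String prefix : pkgsProperty.split("\\|")) {
                String trimmed = prefix.trim();
                if (!trimmed.isEmpty()) {
                    names.add(trimmed + "." + protocol + ".Handler");
                }
            }
        }
        // The default prefix is always considered after the configured ones.
        names.add(DEFAULT_PREFIX + "." + protocol + ".Handler");
        return names;
    }

    public static void main(String[] args) {
        String property = "com.develop.protocols|org.example.handlers";
        for (String name : candidates(property, "win32registry")) {
            System.out.println(name);
        }
    }
}
```

Under this convention, a handler for win32registry: found through the com.develop.protocols prefix would be the class com.develop.protocols.win32registry.Handler.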
Implementing URLStreamHandlerFactory
An alternative to using the XYZ.protocol.Handler package and class naming convention (along with the java.protocol.handler.pkgs system property) previously described is implementing your own URLStreamHandlerFactory. The default URLStreamHandlerFactory pays attention to the java.protocol.handler.pkgs system property and applies the package and class naming conventions. Your alternative implementation of the URLStreamHandlerFactory interface could get a handler implementation from wherever it wanted. There are a few situations where you might consider using your own factory implementation. Foremost is when you don't have access to set the java.protocol.handler.pkgs system property. This property must be set at VM start-up time, and is usually set by a command-line parameter or a start-up properties file that the VM uses. If you want to use your custom protocol as part of a component, such as a servlet, Jini service or EJBean implementation running within a container VM, then you will probably need to include a URLStreamHandlerFactory implementation as part of your component installation. The URLStreamHandlerFactory interface is very simple:
Get a URL object to use a handler from your custom factory implementation instead of the default by using an overloaded version of the URL class constructor. This constructor takes a URLStreamHandler, so ask your factory for one, as in this example:
URL url = new URL(null, "cvs://server/project/folder#version",
    new MyURLStreamHandlerFactoryImpl().createURLStreamHandler("cvs"));
This URL object will use your implementation instead of a handler from the default factory. Note, however, that Java 2 security requires a NetPermission named specifyStreamHandler to be granted to the calling context in order to use this constructor. That is, code without that NetPermission grant, running within a Java 2 VM with a default SecurityManager, will receive an AccessControlException. The following java.policy file grant demonstrates how to assign such a permission to code signed by "MyCompany":
...
grant signedBy "MyCompany" {
    permission java.net.NetPermission "specifyStreamHandler";
};
...
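The URLStreamHandlerFactory interface discussed in this section declares a single method, createURLStreamHandler(String protocol). The following is a minimal sketch of an implementation; CvsHandlerFactory and CvsHandler are hypothetical names, and the handler is a placeholder that refuses to open connections.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;

class CvsHandlerFactory implements URLStreamHandlerFactory {
    /** Placeholder handler; a real one would return a working URLConnection. */
    static class CvsHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL u) throws IOException {
            throw new IOException("cvs: connections are not implemented in this sketch");
        }
    }

    @Override
    public URLStreamHandler createURLStreamHandler(String protocol) {
        // Returning null signals that this factory does not know the protocol.
        return "cvs".equalsIgnoreCase(protocol) ? new CvsHandler() : null;
    }

    public static void main(String[] args) throws IOException {
        URLStreamHandlerFactory factory = new CvsHandlerFactory();
        // Hand the factory's handler to the URL explicitly.
        URL url = new URL(null, "cvs://server/project/folder#version",
                factory.createURLStreamHandler("cvs"));
        System.out.println(url.getHost() + " " + url.getPath() + " " + url.getRef());
    }
}
```

A factory can also be installed for the whole VM with URL.setURLStreamHandlerFactory(), which may be called at most once per VM; that is why a component running inside a container often cannot rely on it and passes handlers explicitly instead.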
Implementing URLConnection
A URLConnection object manages the translation of a URL object into a resource stream. URLConnections in general can handle both interactive protocols, such as HTTP and FTP, and non-interactive protocols, such as "jar:" and the win32registry: protocol. That is, the URLConnection subclass used to make HTTP requests is able to handle interactive request/response dialogs with a server. The RegistryURLConnection implementation only needs to make certain local system calls to translate a URL object into a resource stream; there is no complex, multi-part request/response dialog between the RegistryURLConnection and the native O/S.

The URLConnection class contract must be able to handle both types of resource request models. In addition, it must do so in a very generic fashion: the interface must be easily applied to many different types of client/server interactions. Consequently, the URLConnection class's design is somewhat abstract and rather complicated. I will explain URLConnections in general before describing my implementation of the RegistryURLConnection class in particular.
URLConnection back to the caller. This way, code that created a URL object may have direct access to its URLConnection, and that controlling code can then manipulate the request the URLConnection is to make. The following example code shows how to perform an HTTP POST request in Java using an http: URL's URLConnection.
// Application (a.k.a. "controlling") code.
URL url = new URL("http://someserver/file");
HttpURLConnection urlconn = (HttpURLConnection) url.openConnection();
// ...turn the request into a POST request. HttpURLConnection exposes
// the request method directly rather than through a header...
urlconn.setRequestMethod("POST");
urlconn.setDoOutput(true);  // required before calling getOutputStream()
// ...indicate the content type of the POST data by setting
// the "Content-Type" request header...
urlconn.setRequestProperty("Content-Type",
    "application/x-www-form-urlencoded");
// ...open a connection with the server...
urlconn.connect();
// ...get the output stream to send request data...
OutputStream os = urlconn.getOutputStream();
// ...write POSTed form data...
// ...get the InputStream, which completes the request...
InputStream response = urlconn.getInputStream();
Once the controlling code has initialized all necessary request headers, it calls the URLConnection's connect() method. This method has a nebulous definition in general. Basically, it means that the request headers are fully formed and that the URLConnection should go ahead and make a connection to the server/service/library that is servicing the request. An HTTP URLConnection implementation would obviously make a TCP connection and send request headers in response to this method being called. The win32registry: protocol handler goes through native methods to actually get the target registry key value in response to this method. Note that the getOutputStream() and getInputStream() methods both call connect() initially if it hasn't been called yet.

When connect() returns, the response headers should be initialized to values indicating the result of the request. Controlling code expecting a response stream would call getInputStream() at this point. The contents of the stream and the response header values indicate the result of the URL request.

Again, the URLConnection base class has a rather generic definition, meant to be a broad umbrella under which the concepts of parameterized requests and responses in general can fit. When implementing a protocol handler you must understand how your protocol fits into this generic model so that you can implement a meaningful URLConnection class. I recommend you read the JavaDocs for the java.net.URLConnection class. There are several convenience and hook methods in the API I left out or glossed over in order to discuss the essential points.
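The connect() contract described above can be sketched with a toy connection that does all of its work in connect() and lets getInputStream() connect implicitly. Everything named here (the greet: protocol, GreetingConnection, the "Name" request property) is an assumption made for illustration, not part of any real handler.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

class ConnectLifecycleDemo {
    /** Hypothetical connection that builds its response when connect() runs. */
    static class GreetingConnection extends URLConnection {
        private byte[] response;

        GreetingConnection(URL url) {
            super(url);
        }

        @Override
        public void connect() {
            if (connected) {
                return;   // a second call must not repeat the request
            }
            // The request is fully formed now; read its parameters.
            String name = getRequestProperty("Name");
            String text = "hello " + (name == null ? "world" : name);
            response = text.getBytes(StandardCharsets.UTF_8);
            connected = true;
        }

        @Override
        public InputStream getInputStream() {
            connect();    // implicit connect if the caller skipped it
            return new ByteArrayInputStream(response);
        }
    }

    /** Opens a greet: URL, optionally parameterizes the request, and reads the reply. */
    static String request(String name) throws IOException {
        URLStreamHandler handler = new URLStreamHandler() {
            @Override
            protected URLConnection openConnection(URL u) {
                return new GreetingConnection(u);
            }
        };
        URLConnection conn = new URL(null, "greet:/", handler).openConnection();
        if (name != null) {
            conn.setRequestProperty("Name", name);   // must happen before connect()
        }
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(request("registry"));
        System.out.println(request(null));
    }
}
```

The base class enforces the ordering: once the connected flag is set, URLConnection.setRequestProperty() throws an IllegalStateException, so request parameters have to be supplied before connect() or any method that calls it.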
Meaning of RegistryURLConnection's Response Headers
My implementation does have to do a little cognitive mapping to squeeze a registry value response into the URLConnection's response headers and InputStream. First of all, RegistryURLConnection.connect() sets a single response header called "Exists" to indicate whether or not the target registry key exists. The valid values are "True" and "False".

Next, if a registry key does exist, that doesn't mean it has a value. For example, the HKEY_LOCAL_MACHINE\SOFTWARE\JavaSoft\Java Runtime Environment\1.2 key doesn't have a value; it's just a parent node to sub-nodes that actually do have values. A request for this node would result in a value of "True" for the "Exists" response header, but a value of "application/x-null" for the "Content-type" response header. The other valid "Content-type" header field value, "application/x-java-DataStream", is used if the key does exist and it does have a value associated with it. In this case, another header field, "DataType", indicates the type of the registry key's value. This response header will have one of three values: UTF, int, or binary. In the case of binary, the Content-length header field indicates the length of the response stream.

The implementations of all the getXYZ() methods that access response headers forward the call to the getHeaderField() method. The getHeaderField() implementation just accesses values in a Map object. This is a minimal implementation of response headers that supports actually having response headers (as opposed to a null implementation, used by protocol handlers that don't use response headers at all).

Where to Find the win32registry: Protocol Handler Implementation
The full source code, including examples you can run to test the protocol handler, is available by sending e-mail to: javaprotocol@develop.com.
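The Map-backed response headers described for RegistryURLConnection can be sketched as follows. This is not the author's source code: FakeRegistryConnection replaces the native registry calls with an in-memory Map (a null entry stands for a key without a value), it models string values only, and all class names are assumptions.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class HeaderMapDemo {
    /** Connection that answers from an in-memory map instead of native calls. */
    static class FakeRegistryConnection extends URLConnection {
        // Case-insensitive, so "Content-type" and "content-type" both match.
        private final Map<String, String> headers =
                new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        private final Map<String, String> registry;

        FakeRegistryConnection(URL url, Map<String, String> registry) {
            super(url);
            this.registry = registry;
        }

        @Override
        public void connect() {
            if (connected) {
                return;
            }
            String key = url.getPath();
            boolean exists = registry.containsKey(key);
            headers.put("Exists", exists ? "True" : "False");
            if (exists) {
                boolean hasValue = registry.get(key) != null;
                headers.put("Content-type",
                        hasValue ? "application/x-java-DataStream" : "application/x-null");
                if (hasValue) {
                    headers.put("DataType", "UTF");   // string values only in this sketch
                }
            }
            connected = true;
        }

        @Override
        public String getHeaderField(String name) {
            connect();   // headers are only meaningful after connecting
            return headers.get(name);
        }
    }

    /** Opens a connection for the given key path against the fake registry. */
    static URLConnection open(String keyPath, Map<String, String> registry)
            throws IOException {
        URLStreamHandler handler = new URLStreamHandler() {
            @Override
            protected URLConnection openConnection(URL u) {
                return new FakeRegistryConnection(u, registry);
            }
        };
        return new URL(null, "win32registry://" + keyPath, handler).openConnection();
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> registry = new HashMap<>();
        registry.put("/SOFTWARE/JavaSoft", null);                // parent node, no value
        registry.put("/SOFTWARE/JavaSoft/JavaHome", "C:\\jre");  // string value
        String[] keys = {"/SOFTWARE/JavaSoft", "/SOFTWARE/JavaSoft/JavaHome", "/SOFTWARE/Missing"};
        for (String key : keys) {
            URLConnection c = open(key, registry);
            System.out.println(key + ": Exists=" + c.getHeaderField("Exists")
                    + ", Content-type=" + c.getContentType());
        }
    }
}
```

Only getHeaderField(String) is overridden; the inherited getContentType() reaches the map because the base class looks up the "content-type" header through that same method.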
Summary
The Java 2 Core API depends heavily on the URL class. A deep understanding of the internals of the java.net.URL architecture allows you to exploit those dependencies by creating reusable, plug-in protocol handlers. Once you build a protocol handler you can use it in any Java VM, creating URL objects that refer to your new protocol. It takes very little work at deployment time to enable your handler: just let the VM know about your handler by defining the java.protocol.handler.pkgs system property, and make sure your handler implementation is available on the classpath.

Using a custom protocol handler can often save you vast amounts of time and effort when designing and implementing Java architectures to access stream-based resources. The java.net.URL architecture is already built, already a part of all Java 1.x+ installations, and used by many parts of the Java Core API, Extension APIs, and third-party packages. There's no reason not to exploit this architecture whenever possible.