The rapid growth of the Internet and the falling costs of cloud services have led to a rise in innovative web services such as Google Docs, PayPal and Amazon.com. Even internal company services can rely on a private network to facilitate client-server communications. In many cases, the clients and servers operate in heterogeneous environments, meaning that the machine architecture or operating system may differ between a client and a server. Modern client machines are typically not as powerful as their server counterparts. This is because a client machine is designed to address a different set of concerns. For example, clients are designed to serve a single user, whereas a server has to provide computing power for multiple concurrent users. Conversely, a client machine is highly concerned with providing exceptional usability, and therefore its operating system is designed to be intuitive to use and to learn. Servers are typically used by system administrators or other information technology professionals and may have a user interface that requires more expertise to use. Given that the machine and operating environments are different between clients and servers, how do they communicate with each other? Furthermore, how would they communicate over a network? In this lesson, we will explore the idea of middleware, which is a type of architecture used to facilitate the communication of available services, and of requests for those services, between two applications that operate on environmentally different systems. In addition, we will look at Remote Procedure Calls, or RPC, which are the basis for modern-day middleware using network technologies. As we've noted before, modern systems rarely operate in isolation. Network connectivity is not only common, it is expected. New software systems are designed to be able to communicate with other systems over a network.
Legacy systems are either updated, redesigned or rewritten in order to be able to utilize network connectivity. Computer networks have enabled the growth of distributed computing: a system architecture in which computers on a network are able to communicate and coordinate their actions by passing messages through the network. This has enabled developers to design tiered architectures, with each layer focusing on specific aspects of the overall system. Since clients and servers can become more specialized, software and hardware developers are able to create software environments and machine architectures that serve to enhance these specialized characteristics. As a result, a situation is created where the client and server are no longer able to communicate because they do not understand each other. So how do we solve this? We use a component known as middleware, which manages the communication between heterogeneous components by providing a common interface to all the clients and servers. This architecture resembles the mediator design pattern. It is conceptually very similar, but on a much larger scale. Here, the middleware component X is the mediator. Whereas the mediator pattern coordinates objects within a single application, middleware is used to facilitate communication on a large scale by providing a common interface between entire distributed systems. This allows developers to access functionalities of the system without having to implement an entire tier of subsystems in their architecture. The complexity of modern systems has created the need for more sophisticated middleware systems. As systems move towards n-tier architectures, middleware needs to be able to encapsulate more functionalities, from business logic and distribution of client requests to handling authentication and authorization. Therefore, we need to either design new middleware architectures or be able to extend existing ones.
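To make the mediator comparison concrete, here is a minimal in-process sketch of a middleware layer that offers one common interface, checks authorization, and routes requests to registered services. Everything in it is a hypothetical name chosen for illustration (the Middleware class, register_service, grant, request, and the convert_currency service); real middleware would sit on a network rather than inside one program.

```python
from typing import Any, Callable, Dict, Set


class Middleware:
    """Hypothetical in-process stand-in for a middleware layer (the mediator)."""

    def __init__(self) -> None:
        self._services: Dict[str, Callable[[dict], Any]] = {}
        self._tokens: Set[str] = set()

    def register_service(self, name: str, handler: Callable[[dict], Any]) -> None:
        # Servers advertise what they offer under a common name.
        self._services[name] = handler

    def grant(self, token: str) -> None:
        # Simplified authorization: a token is either known or not.
        self._tokens.add(token)

    def request(self, token: str, service: str, payload: dict) -> dict:
        # Clients talk only to the middleware, never to a server directly.
        if token not in self._tokens:
            return {"ok": False, "error": "unauthorized"}
        handler = self._services.get(service)
        if handler is None:
            return {"ok": False, "error": "unknown service"}
        return {"ok": True, "result": handler(payload)}


def convert_currency(payload: dict) -> float:
    """Hypothetical server-side service."""
    return round(payload["amount"] * payload["rate"], 2)


middleware = Middleware()
middleware.register_service("convert", convert_currency)
middleware.grant("client-token")

reply = middleware.request("client-token", "convert", {"amount": 10.0, "rate": 1.5})
denied = middleware.request("bad-token", "convert", {"amount": 10.0, "rate": 1.5})
```

The client only needs to know the service name and the message shape, which is the sense in which the middleware provides a common interface.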
While there are many examples of middleware, we will focus on Remote Procedure Call, or RPC, because it is the basis for middleware systems used for certain web services. RPC was designed specifically as a middleware component for networked systems and has been extended to improve its flexibility and usability. As the name implies, a remote procedure call allows clients to invoke procedures that are implemented on a server. Here, the client and server are either on completely separate machines or are different virtual instances on the same machine. How are these two cases different? If the client and server are on different machines, the physical address spaces of the two are clearly different. That means that the client does not know the physical memory address of the procedure that it wants to call on the server, since they do not share the same physical memory. When the client and server are different virtual instances on the same machine, the virtual address space is not shared between the two. It is up to the operating system to manage each individual virtual instance and find the correct virtual address for the procedure being invoked. In both cases, the client does not have direct access to the procedure that it is calling. One of the primary functions of RPC is to facilitate that call between the client and the server. Before we delve further, let's step back and take a brief look at the history of RPC to understand why it was developed and why it's important to modern web services. Developed and introduced by Birrell and Nelson in the 1980s, remote procedure calls were created in order to provide a transparent method of calling procedures that were located on a different machine. In their initial design, remote procedure calls consist of three primary components: A client, which is the caller; it is the component that is making the remote call.
A server, which is the callee; it is the component that implements the procedure that is being invoked. And an interface definition language, or IDL, which is the language through which the client and server communicate. In addition, RPC could include name and directory services and binding methods to allow clients to connect to various servers. Remote procedure calls became successful because they used the concept of procedures, which developers were already familiar with. They did not require developers to learn a new language or programming paradigm. This enabled developers to design and implement distributed systems more efficiently. RPC did not start out as a cornerstone of middleware-based architecture; instead, it was a simple collection of libraries that developers could include in their applications. These libraries contained all the functionality required for systems to make remote procedure calls. In today's systems, remote procedure calls are used in a variety of different configurations. You may recognize them as stored procedure calls in a database system or in XML messages for web services. Remote procedure calls began as a means to facilitate procedure calls between different machines and have grown to play a larger role in modern systems. As computer networks and distributed systems became more important, RPC was extended from a collection of libraries to support large middleware systems. So how does a remote procedure call work? The three primary components of a remote procedure call are the client, the server and the interface definition language, or IDL. The IDL is the first component that is implemented because it is used to define what procedures on the server are available to the client. It also describes the input parameters as well as the returned response. The IDL is essentially the specification for remote procedure calls.
It tells the client what remote services are available, how they are accessed and what the server will respond with. In RPC, the heavy lifting is performed by the client and server stubs and the interface headers. These are produced once the IDL is compiled. The client stub acts as a proxy for the procedure call. It is responsible for establishing the connection with the server through a process called binding, formatting the data into a standardized message structure such as XML, sending the remote procedure call and receiving the server stub's response. The server stub receives the call and invokes the desired procedure. It contains the code for receiving the remote call, translating the standardized message into a data format the server recognizes and sending the server's response back to the client stub. Each stub is compiled and linked directly to the client or server component. This means that when a client makes a remote procedure call, it looks like a local procedure call to the client, because the client stub is in the same address space. Complex stubs can be created so that developers can create client components in environments that are different from the server's. For example, complex stubs can allow client components operating on a Windows machine to communicate with server components that are running on Unix. Given the number of programming languages and operating platforms, manual development of client-server stub pairs that can communicate with each other is inefficient. By using an IDL, stubs can be generated automatically, because the IDL maps the concrete programming language to the intermediate representation in the stub. The interface headers are a collection of code templates and references that are used to define what procedures are available at compile time. These files are used in the development of the client and server components.
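The relationship between an interface description and the stubs generated from it can be sketched as follows. This is an illustration under stated assumptions, not the output of any real IDL compiler: the INTERFACE dictionary is a hypothetical stand-in for a compiled IDL file, JSON is used instead of XML only to keep the example short, and a direct function call stands in for the network.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical interface description standing in for a compiled IDL file:
# procedure name -> ordered parameter names.
INTERFACE: Dict[str, Dict[str, list]] = {
    "add": {"params": ["a", "b"]},
    "greet": {"params": ["name"]},
}


def make_server_stub(
    interface: dict, implementation: Dict[str, Callable[..., Any]]
) -> Callable[[bytes], bytes]:
    """Build a server stub: unmarshal the request, invoke, marshal the reply."""

    def handle(message: bytes) -> bytes:
        request = json.loads(message.decode("utf-8"))
        name = request["procedure"]
        if name not in interface:
            reply = {"error": f"unknown procedure: {name}"}
        else:
            reply = {"result": implementation[name](*request["args"])}
        return json.dumps(reply).encode("utf-8")

    return handle


class ClientStub:
    """Proxy whose methods are derived from the interface description."""

    def __init__(self, interface: dict, send: Callable[[bytes], bytes]) -> None:
        self._interface = interface
        self._send = send  # assumed transport: takes a request, returns a reply

    def __getattr__(self, name: str) -> Callable[..., Any]:
        # Only called for names not found normally, i.e. remote procedures.
        if name.startswith("_") or name not in self._interface:
            raise AttributeError(name)
        expected = len(self._interface[name]["params"])

        def proxy(*args: Any) -> Any:
            if len(args) != expected:
                raise TypeError(f"{name} expects {expected} argument(s)")
            message = json.dumps({"procedure": name, "args": list(args)})
            reply = json.loads(self._send(message.encode("utf-8")).decode("utf-8"))
            if "error" in reply:
                raise RuntimeError(reply["error"])
            return reply["result"]

        return proxy


server_stub = make_server_stub(
    INTERFACE,
    {"add": lambda a, b: a + b, "greet": lambda name: f"Hello, {name}"},
)
client = ClientStub(INTERFACE, send=server_stub)  # direct call replaces the network

total = client.add(2, 3)
greeting = client.greet("Ada")
```

From the caller's point of view, client.add(2, 3) reads like a local call, which is the transparency the stubs are meant to provide.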
The basic types of interfaces provided to make a remote procedure call are: procedure registration, which tells the client what procedures are remotely accessible on the server; the procedure call, which is the actual procedure that is being invoked on the server; and the procedure call by broadcast, which is the same as a procedure call except that the procedure is invoked by broadcast. The stubs abstract all the networking details, which can be kept hidden from developers so that they don't have to worry about them. If the developer wants more control over the details of the client-server connection, RPC does allow configuring the connection settings. Next, let's look at how remote procedure calls are performed, which involves seven steps. Remote procedure calls start at the client component. It makes the procedure call and passes the arguments to the client stub. Since the client stub is linked to the client, no network connection needs to be made. This step is exactly like a standard procedure call. Next, the client stub converts the parameters to a standardized message format and copies them into the message through a process called marshalling. Here, the parameter data is transformed into a format that is suitable for communicating with another application. The format that is used is determined by the IDL that is used to compile the stub. The client stub sends the message to the server using the binding information that it is given. Binding is the process by which a client connects to a server. This can be done statically or dynamically. Static binding uses hardcoded binding information. For example, you can tell your client component the IP address and the port of the server you want to connect to. It is a simple and efficient method of connecting, but the client is coupled to the server. If the server goes offline, or if its connection information changes, then the client will not be able to establish a connection.
Static binding also does not allow for server redundancy, since clients will always connect to one specific server. Dynamic binding is more complex and adds another layer to your tiered architecture. This added layer is referred to as the name and directory server. It is responsible for keeping track of which servers have been bound and balancing the load between all servers. The dynamic binding layer can also keep track of servers and change the binding information if the servers change. With either method of binding, the developer of the client component does not need to worry about binding, because the client stub is responsible for the static binding or for communicating with the dynamic binding layer. When the message arrives at the server, it is received by the server stub. Since the arguments have been marshalled, they must be translated back to a format that is usable for the server-side procedure call. This is done in a process called unmarshalling. Once the arguments have been converted back to the proper format, the server stub invokes the procedure in the server component and passes it the arguments. Just like on the client side, this procedure call looks like a normal procedure call to the component, because the server stub is linked to the server-side component. Once the procedure finishes execution, the server component returns the results to the server stub. In order for the results to be returned to the client, the server stub needs to marshal the results into a standardized format. The server stub does not need to bind to the client because the connection has already been established by the client. All the server stub needs to do is return the message through the connection. Just like the server stub, the client stub unmarshals the results in the message and then returns them to the client component. The client can now close the connection to the server. RPC was originally designed to be synchronous.
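The call sequence and the two binding styles described above can be sketched in one small program. This is a simplified illustration: the Directory class is a hypothetical name and directory server that balances load by round robin, server stubs are plain callables standing in for network endpoints, JSON stands in for the standardized message format, and the step numbers in the comments are one reading of the seven steps.

```python
import json
from typing import Callable, Dict, List

Endpoint = Callable[[bytes], bytes]


def square(x: int) -> int:
    """The procedure implemented in the server component."""
    return x * x


def make_server_stub(server_name: str, log: List[str]) -> Endpoint:
    def server_stub(message: bytes) -> bytes:
        args = json.loads(message.decode("utf-8"))["args"]  # step 4: unmarshal
        log.append(server_name)  # record which server handled the call
        result = square(*args)  # step 5: invoke the local procedure
        return json.dumps({"result": result}).encode("utf-8")  # step 6: marshal

    return server_stub


class Directory:
    """Hypothetical name and directory server with round-robin balancing."""

    def __init__(self) -> None:
        self._servers: Dict[str, List[Endpoint]] = {}
        self._next: Dict[str, int] = {}

    def register(self, service: str, endpoint: Endpoint) -> None:
        self._servers.setdefault(service, []).append(endpoint)

    def lookup(self, service: str) -> Endpoint:
        endpoints = self._servers[service]
        index = self._next.get(service, 0) % len(endpoints)
        self._next[service] = index + 1
        return endpoints[index]


def make_client_stub(bind: Callable[[], Endpoint]) -> Callable[[int], int]:
    def square_stub(x: int) -> int:  # step 1: the client calls the stub locally
        message = json.dumps({"procedure": "square", "args": [x]})  # step 2: marshal
        endpoint = bind()  # step 3: resolve the binding and send
        reply = endpoint(message.encode("utf-8"))
        return json.loads(reply.decode("utf-8"))["result"]  # step 7: unmarshal

    return square_stub


log: List[str] = []
server_a = make_server_stub("server-a", log)
server_b = make_server_stub("server-b", log)

directory = Directory()
directory.register("square", server_a)
directory.register("square", server_b)

static_square = make_client_stub(lambda: server_a)  # hardcoded server
dynamic_square = make_client_stub(lambda: directory.lookup("square"))

static_results = [static_square(3), static_square(4)]
dynamic_results = [dynamic_square(5), dynamic_square(6)]
```

With static binding every call lands on the same server, while the directory spreads the dynamic calls across both; the client code that calls the stub is identical in either case.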
During a remote procedure call, the client component pauses its execution while it waits for a response. This is also known as blocking, since the client cannot perform any other task until the server returns a result. While this is a simple design to implement, it is not always practical. Having to wait for a response introduces a number of issues. What if the server never returns a response? The client ends up waiting indefinitely. How long do we wait for the server to respond? Some procedures may take longer than others, or the server may take longer to create a response under a heavy load. Do we retransmit a remote procedure call? This would depend on the procedure. For instance, you do not want to retransmit a remote procedure call for a bank transaction, because it could cause the system to perform that transaction multiple times. There are a number of cases that you need to examine in order to determine how your client component should handle timeouts, retransmission, and server exception messages. Modern systems need to be able to handle remote procedure calls in an asynchronous manner. Asynchronous systems are considered non-blocking, because clients do not need to wait for a server response before moving on to another task. This is an important idea, which allows different components of a distributed system to work more independently of each other. Systems are also able to perform different tasks in parallel, because they do not need to wait for one task to end before starting another. Asynchronous behavior adds more complexity to your system, because you need to manage how your system will allocate resources for various pending tasks. Keep in mind that overloading your system with asynchronous tasks can also reduce your system's overall performance. For instance, improper coordination of asynchronous tasks between threads can cause a bottleneck if work on one thread requires input from another thread.
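The trade-offs above (timeouts, retransmission, and non-blocking calls) can be sketched with Python's standard concurrent.futures module. The remote procedures here are simulated with time.sleep, and the names slow_remote_call, transfer, and call_with_timeout are hypothetical; the point is only that a client may retransmit a call it knows to be safe to repeat, but should send something like a bank transfer at most once.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

call_count = {"transfer": 0}


def slow_remote_call(x: int, delay: float) -> int:
    """Simulated remote procedure that is safe to repeat."""
    time.sleep(delay)
    return x * 2


def transfer(amount: int, delay: float) -> int:
    """Simulated bank transfer: repeating it would move the money twice."""
    call_count["transfer"] += 1
    time.sleep(delay)
    return amount


def call_with_timeout(pool, func, args, timeout, retries, safe_to_repeat):
    # Retransmit only when the caller declares the procedure safe to repeat.
    attempts = retries + 1 if safe_to_repeat else 1
    for _ in range(attempts):
        future = pool.submit(func, *args)
        try:
            return future.result(timeout=timeout)  # block, but not forever
        except FutureTimeout:
            future.cancel()  # no effect if the call is already running
    raise TimeoutError(f"no response after {attempts} attempt(s)")


with ThreadPoolExecutor(max_workers=4) as pool:
    # Asynchronous style: both calls are in flight at the same time, and the
    # client is free to do other work before collecting the results.
    first = pool.submit(slow_remote_call, 1, 0.2)
    second = pool.submit(slow_remote_call, 2, 0.2)
    results = [first.result(), second.result()]

    # Synchronous style with a timeout: the server answers in time.
    quick = call_with_timeout(
        pool, slow_remote_call, (5, 0.01), timeout=1.0, retries=2, safe_to_repeat=True
    )

    # The transfer times out, and is deliberately not retransmitted.
    try:
        call_with_timeout(
            pool, transfer, (100, 0.5), timeout=0.05, retries=2, safe_to_repeat=False
        )
        timed_out = False
    except TimeoutError:
        timed_out = True
```

Note that a timeout tells the client nothing about whether the server actually executed the call; in this sketch the transfer still completes on the worker thread after the client has given up, which is exactly why blind retransmission is unsafe for such procedures.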
Remote procedure calls, and middleware in general, play a key role in some service-oriented systems. With the growth of networks and the increasing prevalence of distributed systems, being able to access services implemented by procedures located on a different machine or virtual instance is very important.