Federbit

Friday, December 30, 2016

Web Application Architecture: Separating Client and Server

A web application is built upon a client-server architecture, a 3-tier architecture to be precise. As we’ve seen in the previous article, the code we write is separated into a client part and a server part. In this article, we will take a closer look at this separation.

More power to you!

A web browser is the key component in a web application’s 1st tier (the client). When the World Wide Web was still in its infancy, web browsers and client devices were not as powerful as they are today. They were only powerful enough to render simple HTML and execute some simple Javascript. But as client devices became more and more powerful, web browsers followed. HTML documents have become more complex, with extensive CSS and Javascript.

Today, it’s not uncommon to see web applications with user interfaces as complex and beautiful as those of desktop applications. The capability of web browsers has evolved from simply rendering HTML documents into performing application-like operations. We can use this to our advantage when we’re designing our web app's architecture, not only in terms of user experience such as interactivity and ease of use, but also in the internal structure of the application. This can affect maintainability and scalability, which are important to us developers. By tapping into the power of modern browsers and client devices, we can ease the server’s (and hopefully also the developer’s) burden slightly, if not considerably.

Roads to Rome

If we build our web application the way we build websites, each HTTP request would result in a response containing a ready-to-render HTML document. It would be the server’s (2nd tier) responsibility to cook (generate) a well-done HTML document. All the browser has to do is serve (render) the HTML for the user to view.

However, there’s another way. We could leave the cooking part to the browser. All the server has to do is give the recipe (HTML view template & Javascript) and the ingredients (data) to the browser. It would be the browser’s responsibility to cook the ingredients based on the recipe (fill the template with data) and serve the result to the user.

This way, the server does not deliver a blend of data and view (display format, layout, etc). Instead, it delivers each of them independently through different request-response cycles. The view will then be incorporated by the browser to become a rich presentation tier that’s ready to request and display data on the user’s behalf. Since we’ve separated the data and the view, we can create another view (e.g. a native mobile / desktop app) to request the same data. The data can even be requested and consumed by another server (2nd tier), which is what a web service is all about.

Reading between the lines

If we decide to go the website way, we will be generating dynamic HTML quite extensively. This is not to say that we would otherwise not be generating any dynamic HTML at all. It’s just that the extent of it would be far less if we decide to separate the view from the data. This is because, as explained above, we have moved the responsibility of data-filling (injecting data into the view) to the client.

Mind you though, even if we have separated the view from the data, the view template itself sometimes needs to be dynamic. For example, we may have a view template that can fetch data from different service endpoints (e.g. URLs), depending on the interaction context of when / where we display the view. We would then need to dynamically put the service endpoint information into the view. But this is far less complicated than the task of blending view and data.

Data in the Raw

Building a server that generates dynamic HTML is a task that’s familiar to those who have ever built a website with dynamic content. We can accomplish this easily using server-side scripting languages such as PHP/JSP/ASP. But building a server that generates raw data might not sound familiar to some. First, we need to determine the format of the data that we will exchange between the client and the server. Remember that the flow of data is bidirectional. The client may need to send some data along with the request it sends to the server.

Common options for the data format are XML and JSON. The former is more elaborate and flexible, while the latter is more concise and compatible. This is due to the fact that JSON is native to Javascript. Secondly, we must handle the HTTP request-response at the server at a lower level. This is because we need to do some things to the response, such as attaching some data and setting the appropriate content type, instead of simply returning an HTML page. How to do this largely depends on the technology / language that we use for the 2nd tier. For example, if we’re using Java, we could use JSP to serve the view, and a Servlet to serve the data.
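As a rough sketch of what "attaching data and setting the content type" can look like, the example below serves a small JSON document. It is only an illustration: it uses the JDK's built-in com.sun.net.httpserver.HttpServer in place of a Servlet container, and the /data/product path, the class name, and the payload are all made up for this example.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

final class DataEndpoint {

    // Hypothetical payload; a real handler would build this from a database query.
    static final String PRODUCT_JSON = "{\"code\":\"P-001\",\"name\":\"Widget\",\"qty\":12}";

    // Starts a server on the given port (0 = any free port) with one data endpoint.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        server.createContext("/data/product", DataEndpoint::handleProduct);
        server.start();
        return server;
    }

    // Returns raw data instead of an HTML page: set the content type, then write the body.
    private static void handleProduct(HttpExchange exchange) throws IOException {
        byte[] body = PRODUCT_JSON.getBytes(StandardCharsets.UTF_8);
        exchange.getResponseHeaders().set("Content-Type", "application/json; charset=utf-8");
        exchange.sendResponseHeaders(200, body.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(body);
        }
    }
}
```

A Servlet would do the same two things through its own API: set the content type on the response, then write the JSON text to the response body.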

About Time

One more thing to note is the synchronicity of the request-response cycle from the point of view of the client. If we go the website way, our request-response cycles will be synchronous. This means that these cycles happen in sequence, one after another. Each response to a request will refresh the page, and only after this can we issue another request.

But if we decide to split the view and the data, our request-response cycles will mostly be asynchronous. This means that these cycles can happen independently of one another. We can issue new requests even before we receive the response to the previous request. We will be informed (via a callback function) when the responses arrive. Thus, a response does not necessarily refresh the page. We don’t want to display XML/JSON data from the response as-is, but instead inject that data into the view. We can achieve this using Javascript’s XMLHttpRequest, which is the heart of AJAX / AJAJ.
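The callback model itself is not specific to the browser. The sketch below imitates it in Java with CompletableFuture; it does not make real HTTP calls. The request() method is a stand-in that simply waits and then returns some JSON text, and all names and delays are invented for illustration.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;

final class AsyncRequests {

    // Stand-in for a network call: waits a bit, then "responds" with JSON text.
    static CompletableFuture<String> request(String path, long delayMillis) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                TimeUnit.MILLISECONDS.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return "{\"path\":\"" + path + "\"}";
        });
    }

    // Issues two requests back to back; each callback runs when its own response arrives.
    static List<String> demo() {
        List<String> arrivals = new CopyOnWriteArrayList<>();
        CompletableFuture<Void> slow = request("/data/slow", 200).thenAccept(arrivals::add);
        CompletableFuture<Void> fast = request("/data/fast", 20).thenAccept(arrivals::add);
        // Only this demo waits for both; a UI would simply carry on.
        CompletableFuture.allOf(slow, fast).join();
        return arrivals;
    }
}
```

The second request is issued before the first one has answered, and the response that arrives first is normally handled first, regardless of the order in which the requests were sent.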

Jump on the Mobilewagon

Building a web application is more complicated than building a website. This is especially true if we decide to implement an architecture that separates the view from the data. We might end up with more source files, more Javascript, and more functions at the server side. But this complexity is not in vain. In fact, our code will be easier to maintain because it handles different concerns (view & data) separately.

Another advantage is that we can now create different types of clients without any change to the server part. For example, we can create a native mobile application to perform the same task as our web application. This will be relatively easy because we can reuse the server part, in which most of our code resides. Our native mobile application will only consist of minimal elements: the UI (views), an HTTP client (to request data from the server), and some event-handling and data-binding code. This is essentially what the client part of our web application contains.

Web Application Guide: Welcome to the Jungle

If you’re new to the web application development world, you might be bewildered by the myriad technical terms and acronyms out there. The good news is, we don’t have to understand each and every one of them to start building our own web app. All we need to do is understand the basic concepts and terminology, and the rest will make sense as we advance. Hopefully, this guide will give beginners a solid foundation of knowledge to survive the jungle of web application development.

Where things fall into place

First things first, we should understand that the way a web application works is basically something that’s called a client-server architecture, or to be more specific, a 3-tier architecture. Now, stop here, and read the previous sentence again, slowly, and make a mental note of it. Most of us may have heard about the 3-tier architecture, only to forget it in the middle of our struggle to become a web programmer.

The 1st tier is what we call the presentation tier, or the client, or the front end. This is the web browser running on each user’s device, whether it is a PC, a laptop, or a mobile phone. The role of this tier is, as its name suggests, to present views of data to the user. We use primarily HTML, and additionally CSS & Javascript, to achieve this.

The 2nd tier is the business logic tier. This is the Web (HTTP) Server listening for connections from the client’s browser. This Web Server is usually running in a centralized location, whether it is shared hosting, a VPS, or dedicated server hardware. The role of this tier is to connect the 1st and the 3rd tiers, and to provide the core functionality of the application, which may involve authentication, authorization, validation, calculation, etc. The tools we use to achieve this depend on the type of the Web Server. To name a few, there’s PHP, ASP, JSP, Java Servlet, and (recently) Javascript.

The 3rd tier is the data tier. This is the DBMS listening for connections from the Web Server, which is usually, but not necessarily, running on the same machine as the Web Server. The role of this tier is to provide data storage. Some examples of the tools in this tier are PostgreSQL, MySQL, Oracle, Microsoft SQL Server, etc. Together, the 2nd and the 3rd tiers constitute the “server” part of the client-server architecture.

Now we're talking

Since we’ve put things into perspective, let’s dive in a bit deeper. To be able to function as an application as a whole, those tiers need a way to communicate with each other. It’s the communication between the 1st and the 2nd tier that we’d like to take a closer look at. It’s basically HTTP, with its methods such as GET, POST, PUT, etc., which follows a request-response cycle. In short, the browser issues an HTTP command to the Web Server (request), and the Web Server gives an answer (response).

With each request / response, we can attach some data, which can be in many different formats (or “content types”). The most obvious one is, of course, HTML/CSS/Javascript, which is the part of the 1st tier that the Web Server delivers to the browser by attaching it to a response. For the request, common ways to send data are the URL query string and HTML forms. Strictly speaking, data in a query string is not attached to the request body, but embedded into the URL.

Normally, with each request-response cycle, the browser will refresh the page with the new content (HTML/CSS/Javascript) that comes with the response. But this is not the case if we’re making the request using AJAX, which is asynchronous in nature. This means that when the request is made, the browser won’t wait for a response to arrive and display its contents, but instead leaves it to us to handle the response when it comes. A variation of AJAX is AJAJ, which uses JSON instead of XML to represent the data.

Tools of the trade

Building a web application is not a simple task, even in the 1st tier. This is especially true if we’re building an application that requires complex user interfaces with multiple forms, tables, lists, etc. To build a great application, we must ensure that, despite its complex requirements, the application is both user friendly and programmer friendly. This means that the user interface must be easy to use, consistent, and intuitive, while the source code stays clean, consistent, and therefore easy to maintain. This is where frameworks and libraries come to the rescue. Please note that, while frameworks and libraries also exist for the 2nd tier, we’ll only focus on the 1st tier for now.

The words “framework” and “library” are sometimes used interchangeably, but actually they’re quite different. A framework usually requires us to provide pieces of code in a predefined structure that it will put together, while a library provides pieces of code for us to incorporate into our own code as we see fit. One of the most popular libraries is jQuery, which helps to simplify the way we access/modify the DOM (Document Object Model, Javascript’s perspective of an HTML document), handle events, create CSS animations, and make AJAX calls. Two other popular libraries are React and Knockout, which provide data binding, a way to synchronize UI elements with our variables. As for frameworks, there are Angular, Durandal, Ember, etc, which come with their own strengths and weaknesses.

Conclusion

Understanding the components involved and the interaction between them is crucial in building a great web application, and any application for that matter. It will serve as a guide for beginners in understanding the bigger picture. What may have seemed like a daunting pile of technical terms at first can now be seen as merely a collection of building blocks. We can, at our own discretion, tackle them individually based on our specific needs as our study progresses.

Web App Design: Client - Server API

There are several ways to design a web application in terms of client-server architecture. We have learned one of them from the previous article. To summarize, we separate view templates from raw data so that we can access the same data using several different clients. The server only needs to deliver view templates and raw data as requested by the browser. It will be the browser’s job to put them together to make a usable UI.

While this design offers simplicity at the server side and flexibility at the client side, it does come at a cost. We now need a more elaborate server API (Application Programming Interface). This comes from the fact that the server needs to handle view and data requests separately. We may need to define several functions to handle the data requests of even a single view. Overall, there will be more client-server interaction, and we need to design it thoughtfully to ensure the security, reliability, and performance of our web application.

Look who’s here!

Whether we choose to mix or separate view and data, a basic security principle applies. We need to authenticate and authorize users that access our web application. Authentication deals with validating the identity of a user (“who” is the user?). Authorization deals with enforcing access control for an authenticated user (“what” can the user do?). We must verify whether a request comes from an authenticated user. When it does, we must also verify whether that user is authorized to make such a request. This applies to each and every request the server receives, whether it’s for a view or data. In other words, we must consider the authentication / authorization aspect of each service / function at the server.

A common way to implement authentication is using a token (session ID) delivered as an HTTP Cookie. When a user successfully signs in, a session ID is generated and sent to the browser as a Cookie. For each subsequent request, the browser will send this Cookie along with the request. The server will then use the session ID to verify whether the session is still valid, and also to determine “who” is making the request. After identifying the user, the server can then determine whether that user is allowed to access the requested view or data. To ensure better security, it’s advisable to mark the cookie as HTTP-only, and send it via HTTPS. This will reduce the risk of a session ID being “stolen” by a malicious script (XSS attack) or by a network sniffer (MITM attack).
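The sign-in flow described above might be sketched as follows. This is a minimal illustration, not a complete session manager: the class name, the SESSIONID cookie name, and the in-memory map standing in for a session store are assumptions made for this example, and session expiry is left out.

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SessionCookies {

    private static final SecureRandom RANDOM = new SecureRandom();

    // Session ID -> user name. An in-memory stand-in for a real session store.
    private final Map<String, String> sessions = new ConcurrentHashMap<>();

    // Called after a successful sign-in: creates a session and returns its ID.
    String signIn(String userName) {
        byte[] bytes = new byte[32];
        RANDOM.nextBytes(bytes);
        String sessionId = Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
        sessions.put(sessionId, userName);
        return sessionId;
    }

    // Value for the Set-Cookie response header.
    // HttpOnly hides the cookie from scripts; Secure restricts it to HTTPS.
    static String setCookieHeader(String sessionId) {
        return "SESSIONID=" + sessionId + "; Path=/; HttpOnly; Secure";
    }

    // Returns the user that owns the session, or null if the session is unknown.
    String authenticate(String sessionId) {
        return sessionId == null ? null : sessions.get(sessionId);
    }

    // Invalidates the session, e.g. when the user signs out.
    void signOut(String sessionId) {
        sessions.remove(sessionId);
    }
}
```

Every request handler would call authenticate() with the cookie value first, and only then decide whether the identified user is authorized for the requested view or data.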

State of the art

The next thing to consider is the statefulness of the server. Before we talk about the difference between a stateful and a stateless server, let’s talk about application states. The word “state” in an application can have one of two different meanings. The first is the interaction state. This state is short-lived (only valid during a single transaction), and usually stored in volatile memory (variables). The second one is the resource state. This state is long-lived (valid across different transactions), and usually stored in persistent storage (file / database). The state we’re interested in is the interaction state.

By being stateful, the server is responsible for tracking the interaction state of each client. This means that the server must “remember” what each user has been doing, or what data they have submitted for the current transaction. The saved state / data at the server will affect how the server handles the next request. This means that the server must allocate some resources to save those states, and implement some kind of algorithm to manage them. This does not scale well as the number of users or the number of services provided increases. It will impact the server in terms of complexity and the amount of resources needed.

When a server is stateless, it will be the client’s responsibility to manage the interaction state. The server doesn’t care about the previous interactions / request-response cycles. It only cares about the data submitted by the client for the current request. Please remember that this is in the context of an interaction state. In the context of a resource state (file / database), the server still regards previously persisted states (saved files / committed database transactions). Resource state management falls under the 2nd-to-3rd tier communication, not the 1st-to-2nd tier we’re currently considering.

In a stateless server, the interaction state of the current transaction resides at the client side. The server knows the state of a certain client because that client sends the state information along with a request. Therefore, the client must implement some algorithm and allocate some resources to manage its own state. The algorithm will be much simpler since it only handles the state of a single client. And since the amount of available resources is proportional to the number of clients, this model will scale better.
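To make the distinction concrete, here is a small sketch of a stateless operation, using a shopping cart as a made-up example. The cart (interaction state) arrives in full with the request, while the price list (resource state) lives on the server; the map of prices is a stand-in for a database table, and all names are hypothetical.

```java
import java.util.Map;

final class StatelessCheckout {

    // Resource state: unit prices in cents, standing in for a database table.
    private static final Map<String, Integer> PRICES = Map.of("P-001", 250, "P-002", 1000);

    // The whole cart (product code -> quantity) arrives with the request,
    // so the server keeps nothing about this client between requests.
    static int totalCents(Map<String, Integer> cartFromRequest) {
        int total = 0;
        for (Map.Entry<String, Integer> line : cartFromRequest.entrySet()) {
            Integer price = PRICES.get(line.getKey());
            if (price == null) {
                throw new IllegalArgumentException("Unknown product: " + line.getKey());
            }
            total += price * line.getValue();
        }
        return total;
    }
}
```

A stateful variant would instead keep one cart per session on the server and accept requests such as "add this item", which is exactly the per-client bookkeeping described above.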

Face to face

When we’ve taken care of the security and state management aspect, it’s time to design our API. In general, we will have 2 types of service, one that returns a view template (HTML + CSS + Javascript), and one that returns data (JSON/XML). The view service API will be easier to design because it represents the UI that users interact with during a transaction. We can easily identify the necessary functions from the UI flow of a certain transaction. We can even implement a single function to handle all view requests of our web application, although this might not be a good idea. This is possible because all a view service needs to do is fetch the requested view template, and send it to the client.

The data service API on the other hand, is more complicated and needs a fine-grained design. This is because data services can vary in terms of returned values and parameters. For example, a data service may return a single value, an object, an array, etc. A value itself can be a string, number, date, binary, etc. A data service may also need parameters of various length and data types.

A good place to start is a list of user actions that result in a process requiring communication between the 2nd and 3rd tiers. For example, a user clicks a search button to find a product. This would require database access, and therefore must be handled by server-side code. We could translate this to the server API as something like a find_product function. Such a function would require a part of the product name as a parameter, and return an object as the result. We could further define the returned object as having properties such as product code, product name, available qty, etc.
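In Java, the find_product idea could be sketched roughly like this. The Product fields follow the properties listed above; everything else is an assumption for illustration: the class names, the sample rows, the in-memory list standing in for the database, and the choice to return a list, since a partial name can match more than one product.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class ProductService {

    // The object returned to the client (typically serialized to JSON or XML).
    static final class Product {
        final String code;
        final String name;
        final int availableQty;

        Product(String code, String name, int availableQty) {
            this.code = code;
            this.name = name;
            this.availableQty = availableQty;
        }
    }

    // Stand-in for the 3rd tier; a real service would query the database here.
    private final List<Product> table = List.of(
            new Product("P-001", "Blue Widget", 12),
            new Product("P-002", "Red Widget", 0),
            new Product("P-003", "Gadget", 7));

    // Finds products whose name contains the given text, ignoring case.
    List<Product> findProduct(String namePart) {
        String needle = namePart.toLowerCase(Locale.ROOT);
        List<Product> matches = new ArrayList<>();
        for (Product product : table) {
            if (product.name.toLowerCase(Locale.ROOT).contains(needle)) {
                matches.add(product);
            }
        }
        return matches;
    }
}
```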

Another place where we can identify the necessity of a server API is when it involves some business logic or complex calculations. Implementing business logic at the server-side keeps our client-side lightweight. It also avoids code duplication when we decide to access the same server using several different client implementations (e.g. native mobile / desktop application).

All sewn up

Another important part of UI interaction is data validation. We can implement data validation at client-side, server-side, or both. The advantage of client-side validation is of course, responsiveness. This is because any calculation involved is done locally by the browser, and no network connection is necessary. The downside is, it’s not secure. We need to remember that the client side of a web application (HTML + CSS + Javascript) is basically “open source”. Of course we could minify and obfuscate things, but a determined user might still be able to decipher it and bypass any client-side validation we may have implemented. Therefore, it’s important to view any client-side data validation as a user experience enhancement rather than a security / data integrity measure.

Furthermore, there are cases where a validation involves the 3rd tier. In these cases, the validation must be done by the 2nd tier. For example, let’s consider an inventory application in which a user can add new products. We may need to check the database to see whether a product already exists, and inform the user accordingly. This is one of the advantages of server-side validation: it has access to the 3rd tier. Another important advantage is security, since the code resides on the server. The downside is latency and bandwidth consumption, since the client needs to send the data to the server, and wait for the server’s response to determine whether or not the data is valid.

Performing validation both client-side and server-side gives us the best of both worlds. We can implement client-side validation as a preliminary check to show invalid input to the user. Again, this is only for convenience, to allow a user to notice invalid input and fix it quickly instead of waiting for a request-response cycle. We still need to perform the same validation (and maybe even more, involving databases, etc) at the server side to ensure data integrity. Only after an input passes the client-side validation will it be sent to the server for server-side validation and further processing. This way we can achieve a good user experience without sacrificing data integrity.
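For the inventory example, the server-side half might look something like the sketch below. The product-code format, the error messages, and the in-memory set standing in for a database lookup are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class NewProductValidator {

    // Product codes already stored; a stand-in for a database lookup.
    private final Set<String> existingCodes = new HashSet<>(Set.of("P-001", "P-002"));

    // Returns a list of error messages; an empty list means the input is valid.
    List<String> validate(String code, String name) {
        List<String> errors = new ArrayList<>();

        // Checks the client may already have run. They are repeated here
        // because client-side validation can be bypassed.
        if (code == null || !code.matches("P-\\d{3}")) {
            errors.add("Product code must look like P-123");
        }
        if (name == null || name.trim().isEmpty()) {
            errors.add("Product name is required");
        }

        // A check that needs the 3rd tier, so only the server can do it.
        if (code != null && existingCodes.contains(code)) {
            errors.add("Product already exists");
        }
        return errors;
    }
}
```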

Practical Usage of Java Maps

One interesting and useful class in core Java (not Javascript) is the class java.util.HashMap. It’s an implementation of the java.util.Map interface which represents a Map data structure. A Map (sometimes also called associative array) is a data structure that contains “mappings” of a “key” to a corresponding “value”. Unlike arrays, we don’t access the elements by their position (index), but rather by a previously associated object. The object can be as simple as a String, or as complex as a JavaBean. Another implementation of the Map interface is the class java.util.TreeMap, which we’ll also take a look at since it’s just as interesting and useful.

On the lookout

One way we can utilize a HashMap is by using it as a lookup table. A lookup table is useful when we need to combine data from two or more data sources based on a certain category. Let’s say we have a sales database and an inventory database. The sales database contains the sales forecast of each product for next month. The inventory database contains the quantity of each product currently available. We are then asked to create a report that compares next month’s forecast against the currently available quantity of each product.

On the fly

There are several ways to do this. The simplest (and probably the worst) one would be something like:

1. Query the forecast database for all products, then iterate the result.
2. For each iteration, query the inventory database for the corresponding product.

The algorithm above would execute n+1 queries against the database, where n is the number of products. Not a very efficient one, considering that a database query has quite an expensive latency overhead.

Once and for all

A better strategy would be something like:

1. Query the inventory database for all products, keep the result for later use.
2. Query the forecast database for all products, then iterate the result.
3. For each iteration of the forecast data, find the quantity of the corresponding product by iterating the inventory data we got in step 1.

This algorithm will only execute 2 queries against the database, but it will execute as many as n? (the termial function of n, i.e. 1+2+...+n) comparisons in the if statement that finds the corresponding product. That figure assumes both result sets contain the same products and that we stop searching at the first match; without the early exit it would be n × n. Let’s say we have 100 products identified by a String product code. The above algorithm will call java.lang.String’s equals() 5050 times (100? = 1+2+...+100 = 5050). Although such calls happen in local memory (as opposed to going through the network stack as in the case of database queries), this is still inefficient. It will impair scalability, and in certain cases could impose a considerable performance penalty.
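The comparison count can be checked with a small experiment. The sketch below leaves the database out entirely: the two lists of product codes stand in for the two query results, the class and method names are made up, and the inner loop stops at the first match.

```java
import java.util.ArrayList;
import java.util.List;

final class NestedLoopLookup {

    // Counts equals() calls when each forecast code is searched for in the
    // inventory list, stopping at the first match.
    static long countComparisons(List<String> forecastCodes, List<String> inventoryCodes) {
        long comparisons = 0;
        for (String forecastCode : forecastCodes) {
            for (String inventoryCode : inventoryCodes) {
                comparisons++;
                if (forecastCode.equals(inventoryCode)) {
                    break;
                }
            }
        }
        return comparisons;
    }

    // Generates n distinct product codes: P-1, P-2, ..., P-n.
    static List<String> codes(int n) {
        List<String> codes = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            codes.add("P-" + i);
        }
        return codes;
    }
}
```

Calling countComparisons(codes(100), codes(100)) gives 5050: each code is found at a different position in the inventory list, so the positions 1 to 100 are each visited exactly once as the stopping point.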

Put it on the map

Another way to create such report would be using a HashMap:

1. Query the inventory database for all products, and iterate the result to put each element into a HashMap, with the product’s code as the key.
2. Query the forecast database for all products, then iterate the result.
3. For each iteration of the forecast data, find the quantity of the corresponding product by “getting” it from the HashMap.

With this algorithm, we have 2 queries, plus n put() calls and n get() calls on the HashMap (for a total of 200 calls for 100 products). According to the Javadoc, HashMap’s get() and put() methods have constant-time performance, assuming the hash function disperses the elements properly among the buckets. This means that given a properly configured HashMap, each get() and put() takes roughly the same (relatively short) amount of time, no matter how many elements the map holds. Therefore, this algorithm will perform (much) better than the previous two, and it will scale well.
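The three steps above can be sketched as follows. The two lists of (product code, quantity) pairs stand in for the two query results; the class name, the report format, and the choice to report 0 for a product missing from the inventory are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ForecastReport {

    // Each row is a (product code, quantity) pair, as if read from a query result.
    static List<String> build(List<Map.Entry<String, Integer>> inventoryRows,
                              List<Map.Entry<String, Integer>> forecastRows) {
        // Step 1: index the inventory rows by product code.
        Map<String, Integer> availableByCode = new HashMap<>();
        for (Map.Entry<String, Integer> row : inventoryRows) {
            availableByCode.put(row.getKey(), row.getValue());
        }

        // Steps 2 and 3: one get() per forecast row instead of a nested loop.
        List<String> report = new ArrayList<>();
        for (Map.Entry<String, Integer> row : forecastRows) {
            int available = availableByCode.getOrDefault(row.getKey(), 0);
            report.add(row.getKey() + ": forecast " + row.getValue() + ", available " + available);
        }
        return report;
    }
}
```

The report lines come out in the order of the forecast rows, since the HashMap is only used for lookups and never iterated.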

Sum and substance

Another situation where a HashMap would be useful is when we need to perform some sort of aggregation. Let’s say we have a multinational company that keeps its sales data in separate databases for each country. We are then asked to create a report that shows the worldwide sales of each product.

We can achieve that using something like this:

1. Pick one (any) database, query the sales data, then iterate and put it into a HashMap with the product’s code as the key.
2. For each of the other databases, query the sales data, and iterate the result.
3. For each iteration of the sales data, get the (accumulated) sales value of the corresponding product from the HashMap. Add that to the current sales value, and put it back into the HashMap (overwriting the previous value).

When this algorithm finishes, we will have the value of worldwide sales for each product in the HashMap. We can retrieve the product codes together with their totals using the entrySet() method, or just the totals using the values() method.
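A compact way to write this accumulation is Map.merge(), which combines the "get, add, put back" of step 3 into one call. In the sketch below, each inner map stands in for one country's query result (product code to sales value); the names are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SalesAggregation {

    // Each element of salesPerCountry maps product code -> sales value for one country.
    static Map<String, Long> worldwideSales(List<Map<String, Long>> salesPerCountry) {
        Map<String, Long> totals = new HashMap<>();
        for (Map<String, Long> countrySales : salesPerCountry) {
            for (Map.Entry<String, Long> row : countrySales.entrySet()) {
                // Stores the value if the key is new, otherwise adds it to the running total.
                totals.merge(row.getKey(), row.getValue(), Long::sum);
            }
        }
        return totals;
    }
}
```

Because merge() also handles a key that is not in the map yet, the first database does not need special treatment here.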

The missing link

A HashMap makes no guarantee as to the ordering of its elements. This means that the entrySet(), keySet() and values() methods may return a collection of data (mappings / keys / values) that’s in a different order from the order in which we put them in the first place. The ordering is unpredictable, and it is not guaranteed to stay the same over time as the map is modified. If we need a HashMap with a predictable ordering of its elements, we could use java.util.LinkedHashMap.

We can configure a LinkedHashMap to use either insertion-ordering or access-ordering. With insertion-ordering, the elements are ordered according to the order they were put into the Map for the first time. With access-ordering, the elements are ordered according to the order they were last accessed. Please note that the “access” operation covers the get(), put(), and putAll() methods (and, since Java 8, methods such as getOrDefault() and merge() as well).
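The two modes are selected with the third constructor argument of LinkedHashMap. The small sketch below (class and method names are made up) fills a map with three keys, reads the first one, and returns the resulting key order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OrderingDemo {

    // accessOrder = false gives insertion-ordering, true gives access-ordering.
    static List<String> keysAfterAccess(boolean accessOrder) {
        // 16 and 0.75f are the usual default initial capacity and load factor.
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a"); // moves "a" to the end only in access-order mode
        return new ArrayList<>(map.keySet());
    }
}
```

With insertion-ordering the keys come back as a, b, c; with access-ordering they come back as b, c, a, because "a" was the most recently accessed entry.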

Let’s sort it out

As mentioned at the beginning of this article, there’s another interesting implementation of the Map interface, the TreeMap. A TreeMap is also predictable in the ordering of its elements, but unlike a LinkedHashMap, a TreeMap maintains the ordering according to its keys. The ordering can be based upon the natural ordering, or upon a Comparator provided at construction time.

Let’s modify the multinational company example above a little bit. This time, each country can sell different sets of products. The set of products may overlap between two or more countries. This means that we must not assume that sales data queried from any database contains a complete set of products. In other words, we might add new products as we’re processing each country. We’re asked to create the same report, but this time with the data sorted by product codes.

The algorithm looks something like this:

1. Pick one (any) database, query the sales data, then iterate and put it into a TreeMap with the product’s code as the key.
2. For each of the other databases, query the sales data, and iterate the result.
3. For each iteration of the sales data, check whether we already have an entry in the TreeMap for the corresponding product.
   - If we do, get the (accumulated) sales value, add that to the current sales value, and put it back into the TreeMap (overwriting the previous value).
   - If we don’t, put the current sales value into the TreeMap with the product’s code as the key.

When this algorithm finishes, we will have the value of worldwide sales for each product in the TreeMap. Since we use the product code as the Map’s key, the data returned by the values() method (and also the keySet() and entrySet() methods) will be in the order of the product code. This is because the TreeMap always maintains the ordering of its elements as we modify it.
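The steps above can be sketched as follows, with the explicit "do we already have an entry?" check from step 3. As before, each inner map stands in for one country's query result, and the class and method names are hypothetical. Product codes are plain Strings, so the TreeMap uses their natural (lexicographic) ordering.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class SortedSalesAggregation {

    // Each element of salesPerCountry maps product code -> sales value for one country.
    static TreeMap<String, Long> worldwideSales(List<Map<String, Long>> salesPerCountry) {
        TreeMap<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> countrySales : salesPerCountry) {
            for (Map.Entry<String, Long> row : countrySales.entrySet()) {
                Long soFar = totals.get(row.getKey());
                if (soFar == null) {
                    // First time we see this product: start a new total.
                    totals.put(row.getKey(), row.getValue());
                } else {
                    // Known product: add to the accumulated total.
                    totals.put(row.getKey(), soFar + row.getValue());
                }
            }
        }
        return totals;
    }
}
```

Iterating the returned map (or its keySet() or values()) visits the products in ascending order of product code, no matter in which order the countries were processed.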