A Basic Understanding Of Selenium WebDriver Architecture

Today, companies want not only to test their software adequately but also to get that work done as quickly and thoroughly as possible. To accomplish this goal, organizations are turning to automated testing.

Automated testing not only reduces the effort spent on manual testing but also helps find defects that manual testing cannot expose. It is especially useful where manual testing is time-consuming and therefore error-prone.

Automated testing draws on a web of techniques and technologies, but do you know which is the best of them all? Well, the answer is Selenium WebDriver!

But before we look at Selenium WebDriver, we need to know about Selenium itself.

What Is Selenium?

Selenium is a free automated testing suite used to automate web applications across different browsers and platforms. It supports various programming languages such as Java, C# (.NET), PHP, Python, Perl, and Ruby, and various browsers such as Mozilla Firefox, Google Chrome, Safari, and Internet Explorer.

The four major components of Selenium are:

  1. Selenium IDE
  2. Selenium RC
  3. WebDriver
  4. Selenium Grid


Selenium IDE

Selenium IDE is a Mozilla Firefox add-on for recording, editing, and debugging tests. It was previously known as Selenium Recorder.

It can record tests and play them back from within the same plugin. The original Selenium IDE supports Firefox only up to version 54 and does not work on later versions.

Selenium RC

Selenium Remote Control (RC) is a server that accepts commands for the browser via HTTP. It overcomes the limitations of Selenium IDE and supports various programming languages.

With Selenium IDE, we can record and run a script, but only in the Firefox browser. With Selenium RC, we can run the same recorded script in any browser. The cost is that we have to start the RC server before executing the test scripts and stop it afterwards.

Selenium Grid

Selenium Grid is the part of the Selenium suite that specializes in running multiple tests across different browsers, operating systems, and machines. With Selenium Grid, one server acts as the hub and the others act as nodes.

The hub is the central point where we load our tests; it also acts as the server that controls the network of test machines. A node is a test machine that connects to the hub.
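To make the hub and node roles concrete, here is a minimal sketch in plain Java. It is a toy model only: the `Hub` and `Node` classes, the `register` and `route` methods, and the host names are all made up for illustration and are not part of Selenium's API. It assumes the hub simply picks the first registered node that offers the requested browser and operating system.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Toy model of a Selenium Grid hub and its nodes.
// Every type and method here is hypothetical; none of it is Selenium's API.
class GridSketch {

    // A node is a test machine offering one browser on one operating system.
    static final class Node {
        final String host;
        final String browser;
        final String os;

        Node(String host, String browser, String os) {
            this.host = host;
            this.browser = browser;
            this.os = os;
        }
    }

    // The hub is the central point: nodes register with it, tests are sent to it.
    static final class Hub {
        private final List<Node> nodes = new ArrayList<>();

        void register(Node node) {
            nodes.add(node);
        }

        // Pick the first registered node that offers the requested browser and OS.
        Optional<Node> route(String browser, String os) {
            return nodes.stream()
                    .filter(n -> n.browser.equals(browser) && n.os.equals(os))
                    .findFirst();
        }
    }

    public static void main(String[] args) {
        Hub hub = new Hub();
        hub.register(new Node("node-1", "firefox", "linux"));
        hub.register(new Node("node-2", "chrome", "windows"));

        // A test that asks for Chrome on Windows is routed to node-2.
        System.out.println(hub.route("chrome", "windows").map(n -> n.host).orElse("no matching node"));
        // No node offers Safari on macOS, so the hub cannot place this test.
        System.out.println(hub.route("safari", "macos").map(n -> n.host).orElse("no matching node"));
    }
}
```

The point of the sketch is the division of labor: tests only ever talk to the hub, and the hub decides which machine runs them.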

What Is Selenium WebDriver?

Selenium WebDriver is one of the most powerful and popular tools of the Selenium toolkit. Unlike Selenium IDE, WebDriver allows you to execute your tests against different browsers.

It is the successor to Selenium RC. It aims to provide a friendly API that is easier to explore, understand, and use than the Selenium RC API, which in turn makes your test scripts easier to read and maintain.

WebDriver also removes the separate server that Selenium RC required, so the server no longer becomes a performance bottleneck. In simple terms, you write your code and it communicates with the browser through that browser's driver.

The Architecture Of Selenium WebDriver

[Image: The Architecture Of Selenium WebDriver]

As shown in the image above, FirefoxDriver (like the other browser drivers) extends the RemoteWebDriver class, and RemoteWebDriver implements the WebDriver interface.

The FirefoxDriver

FirefoxDriver is a class written specifically for the Firefox browser. It provides implementations of the methods declared in the WebDriver interface and can be instantiated, so it can perform all of those operations on Firefox.

The Remote WebDriver

Remote WebDriver is an implementation class of the WebDriver interface that an automation test engineer can use to execute their test scripts via the Remote WebDriver server on a remote machine.

The WebDriver as an Interface

WebDriver is an interface provided by Selenium WebDriver. As we know, an interface in Java is a collection of constants and abstract methods (methods without any implementation). The WebDriver interface serves as a contract that each browser-specific implementation, such as ChromeDriver or FirefoxDriver, must follow.
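The relationship between these three types can be sketched in a few lines of plain Java. The classes below are simplified stand-ins that only borrow the names used in this article; the real Selenium types have many more methods. The `browserName` hook, the `sentCommands` list, and the `openHomePage` helper are assumptions added for illustration, while `get` and `getCurrentUrl` mirror methods that the real WebDriver interface declares.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the Selenium types of the same names.
// This sketch does not use the Selenium library.
class HierarchySketch {

    // The contract every browser-specific driver must follow.
    interface WebDriver {
        void get(String url);
        String getCurrentUrl();
    }

    // Implements the contract once, in terms of generic commands.
    static class RemoteWebDriver implements WebDriver {
        // Hypothetical log of the commands this driver would send to its browser.
        protected final List<String> sentCommands = new ArrayList<>();
        private String currentUrl = "about:blank";

        @Override
        public void get(String url) {
            sentCommands.add(browserName() + ": navigate to " + url);
            currentUrl = url;
        }

        @Override
        public String getCurrentUrl() {
            return currentUrl;
        }

        // Hypothetical hook that lets each subclass identify its browser.
        protected String browserName() {
            return "remote";
        }
    }

    // Browser-specific drivers extend RemoteWebDriver, as described above.
    static class FirefoxDriver extends RemoteWebDriver {
        @Override
        protected String browserName() {
            return "firefox";
        }
    }

    static class ChromeDriver extends RemoteWebDriver {
        @Override
        protected String browserName() {
            return "chrome";
        }
    }

    // Test code depends only on the interface, so any driver can be passed in.
    static String openHomePage(WebDriver driver) {
        driver.get("https://example.com");
        return driver.getCurrentUrl();
    }

    public static void main(String[] args) {
        System.out.println(openHomePage(new FirefoxDriver()));
        System.out.println(openHomePage(new ChromeDriver()));
    }
}
```

Because `openHomePage` accepts the interface type, switching browsers means changing only the line that creates the driver, which is the practical benefit of the contract.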

The Selenium WebDriver architecture consists of 3 layers:

  1. Language Binding
  2. WebDriver API
  3. Browser Drivers

Language Binding

Language bindings were developed so that Selenium can be used from multiple programming languages.

A language binding is the client library for one particular language. It is the piece your test framework uses to interact with Selenium WebDriver and, through it, with the various browsers and devices.

For example, if you want to use a browser driver from Java, you need the Java bindings for Selenium WebDriver. If you want to use it from C#, Ruby, or Python, use the bindings for that language.

Selenium WebDriver API

This API is the medium of communication between the programming languages and the browsers.

It takes the commands issued through the language bindings, interprets them, and sends them to the respective driver. In short, the WebDriver API is a common library that sends commands to the respective drivers.
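The article does not say what these commands look like on the wire. Assuming a driver that speaks the W3C WebDriver protocol (HTTP requests with JSON bodies), the translation from a method call to a command might be sketched as below. The `Command` class and the `navigateTo` and `getTitle` helpers are made up for this example, and the session id is a placeholder; only the request paths are taken from the protocol.

```java
// Toy translation of WebDriver calls into the HTTP requests a browser driver receives.
// Assumption: the driver speaks the W3C WebDriver protocol. The helper names are hypothetical.
class CommandSketch {

    // One command: an HTTP method, a path, and an optional JSON body.
    static final class Command {
        final String method;
        final String path;
        final String body;

        Command(String method, String path, String body) {
            this.method = method;
            this.path = path;
            this.body = body;
        }

        @Override
        public String toString() {
            return body.isEmpty() ? method + " " + path : method + " " + path + " " + body;
        }
    }

    // What a call like driver.get(url) could become: a POST to the session's /url endpoint.
    // The JSON is built by hand without escaping, which is good enough for a sketch.
    static Command navigateTo(String sessionId, String url) {
        return new Command("POST", "/session/" + sessionId + "/url", "{\"url\": \"" + url + "\"}");
    }

    // What a call like driver.getTitle() could become: a GET with no body.
    static Command getTitle(String sessionId) {
        return new Command("GET", "/session/" + sessionId + "/title", "");
    }

    public static void main(String[] args) {
        String sessionId = "abc123"; // placeholder; a real id is issued when the session starts
        System.out.println(navigateTo(sessionId, "https://example.com"));
        System.out.println(getTitle(sessionId));
    }
}
```

Whatever the language binding, the driver sees the same kind of request, which is why one driver can serve Java, Python, or C# test code alike.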

Browser Drivers

Browser drivers handle communication with the browser. A driver receives commands from the WebDriver API, passed along by RemoteWebDriver, and performs the corresponding actions in its browser.

There are many browsers, such as Mozilla Firefox, Chrome, Opera, and Internet Explorer. Each browser has its own driver, and each driver knows how to drive the browser it corresponds to.

For example, the Chrome driver knows how to handle the details of the Chrome browser and drive it to do things like clicking a button, navigating to a page, or getting data from the browser itself. The same applies to Firefox, IE, and so on.

Pros & Cons Of the Selenium WebDriver

Like any tool, Selenium WebDriver has its pros and cons, so let's dig into both.

Advantages of the Selenium WebDriver

1. Selenium is an open-source, freeware, and portable tool.

2. It supports various languages, including Java, Python, C#, and JavaScript, plus others such as Perl and VBScript through community-maintained bindings.

3. Selenium supports many operating systems, such as Windows, macOS, Linux, and Unix.

4. Selenium supports many browsers like Internet Explorer, Chrome, Firefox, Opera, Safari, etc.

5. WebDriver is faster than Selenium RC.

6. Unlike with RC, you don't have to start a separate server to use WebDriver.

7. You can simulate the movement of a mouse using Selenium.

8. It allows you to simulate keyboard keypress events using different classes.

9. You can find the coordinates of any object easily using WebDriver.

10. Integration with testing frameworks such as JUnit or TestNG is very easy with WebDriver.

Disadvantages of the Selenium WebDriver

1. Selenium WebDriver does not provide a built-in IDE for script generation; you need a separate IDE such as Eclipse to write scripts.

2. Selenium users have no dedicated, reliable support channel for the problems they face and must rely on community help.

3. It supports web-based applications only.

4. Audio- and video-related test cases cannot be automated with Selenium WebDriver.

5. It has no built-in reporting facility.

Final Words

With this overview of Selenium WebDriver, you can decide whether it is the right fit for your project or idea. I am sure you have made up your mind by now.

Let me know what you decided in the comments section below!
