How To Find Broken Links Using Selenium WebDriver
We often come across broken hyperlinks on websites. In this post, we learn how to find broken links using Selenium WebDriver. Before getting into that, let's cover some fundamentals.
What are Broken Links?
A broken link (also known as a dead link or link rot) is a link on a website that doesn't work (i.e., it doesn't lead to the page it is meant to) for one or more of the following reasons.
- The destination webpage is no longer available (offline or permanently moved).
- The destination webpage has been moved without a redirect being added.
- The URL structure (permalinks) of the webpage has changed.
- The source webpage contains an invalid URL (misspelled, mistyped, etc.).
- A firewall or geolocation restriction blocks access to the destination.
A URL that returns a 2xx HTTP status code is valid, whereas URLs that return 4xx or 5xx HTTP status codes are invalid. A 4xx status code indicates a client-side error, and a 5xx status code indicates a server-side error.
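The rule above can be written as a small helper. This is only a sketch: the class name LinkStatus and the methods classify and isBroken are hypothetical and are not part of Selenium or the JDK. Codes outside the 2xx, 4xx and 5xx ranges (redirects, for instance) are reported as "other" because the rule above does not cover them.

```java
// Hypothetical helper: maps an HTTP status code to the categories described above.
class LinkStatus {

    // Returns "valid" for 2xx, "client error" for 4xx, "server error" for 5xx, "other" otherwise.
    static String classify(int statusCode) {
        if (statusCode >= 200 && statusCode < 300) {
            return "valid";
        }
        if (statusCode >= 400 && statusCode < 500) {
            return "client error";
        }
        if (statusCode >= 500 && statusCode < 600) {
            return "server error";
        }
        return "other";
    }

    // A link counts as broken when its status code is in the 4xx or 5xx range.
    static boolean isBroken(int statusCode) {
        return statusCode >= 400 && statusCode < 600;
    }

    public static void main(String[] args) {
        int[] samples = {200, 404, 500};
        for (int code : samples) {
            System.out.println(code + " -> " + classify(code) + ", broken: " + isBroken(code));
        }
    }
}
```

Keeping the classification separate from the HTTP request makes it easy to change what counts as broken later, for example if you decide to treat some redirects as failures.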
Why check for Broken Links?
Web servers return an error when a user tries to access a broken link, so the user lands on an error page instead of the content they expected. This leads to a bad user experience. We have to check for broken links continuously and remove any that exist on our website. We could do this manually, but most websites have hundreds or thousands of links, and testing all of them by hand takes a huge amount of time, resources, and effort. Instead of inspecting links manually, we can leverage Selenium WebDriver to test for broken links.
How to verify the Broken Links and images
Follow the steps below to verify broken links.
- Links and images on a web page are marked up with the <a> and <img> tags. Collect all the elements with these tags and read their URLs (the href attribute of <a>, the src attribute of <img>).
- Send an HTTP request to each URL and read the HTTP response code.
This way you can find out whether the link is valid or invalid based on response codes.
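As a rough sketch of the second step, the snippet below sends a request to each URL and keeps the ones that fail. It assumes the URLs have already been collected from the <a> and <img> elements. The class name LinkChecker and the methods fetchStatus and findBroken are hypothetical names chosen for illustration. It uses a HEAD request to avoid downloading page bodies, which is an assumption on my part: some servers reject HEAD, in which case a GET request may be needed.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical helper class; the names are illustrative only.
class LinkChecker {

    // Sends an HTTP HEAD request and returns the response code, or -1 if the request fails.
    static int fetchStatus(String url) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestMethod("HEAD");
            conn.setConnectTimeout(2000);
            conn.setReadTimeout(2000);
            return conn.getResponseCode();
        } catch (IOException | ClassCastException e) {
            // Malformed URL, non-HTTP protocol, timeout, or connection failure.
            return -1;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    // Returns the URLs whose status is 4xx/5xx or that could not be reached at all.
    static List<String> findBroken(List<String> urls, ToIntFunction<String> statusOf) {
        List<String> broken = new ArrayList<>();
        for (String url : urls) {
            int status = statusOf.applyAsInt(url);
            if (status == -1 || status >= 400) {
                broken.add(url);
            }
        }
        return broken;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList("https://www.softwaretestingmaterial.com");
        System.out.println("Broken links: " + findBroken(urls, LinkChecker::fetchStatus));
    }
}
```

Passing the status lookup in as a function keeps findBroken independent of the network, so it can be exercised with a stub in unit tests.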
Find Broken Links Using Selenium WebDriver
One of the key test cases is to find broken links on a webpage. Broken links damage your website's reputation and can have a negative impact on your business, so it is important to find and fix all of them before a release. When a link is not working, we typically see a message such as 404 Page Not Found.
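Let's see some of the common HTTP status codes.
- 200 OK: the request succeeded.
- 301 Moved Permanently: the resource has a new permanent URL.
- 400 Bad Request: the server could not understand the request.
- 403 Forbidden: the server refuses to authorize the request.
- 404 Not Found: the requested resource does not exist.
- 500 Internal Server Error: the server failed while handling the request.
- 503 Service Unavailable: the server is temporarily unable to handle the request.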
The code below fetches all the links on a given web page (here, the SoftwareTestingMaterial.com home page) using WebDriver commands and reads the status of each href link with the help of the HttpURLConnection class.
The comments within the program explain each step. Please go through them to understand the flow.
package softwareTestingMaterial;

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;
import java.util.concurrent.TimeUnit;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class BrokenLinks {

    public static void main(String[] args) throws InterruptedException {
        // Instantiate FirefoxDriver
        System.setProperty("webdriver.gecko.driver", "D:\\Selenium Environment\\Drivers\\geckodriver.exe");
        WebDriver driver = new FirefoxDriver();
        // Maximize the browser
        driver.manage().window().maximize();
        // Implicit wait of 10 seconds
        driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
        // Launch softwaretestingmaterial.com
        driver.get("https://www.softwaretestingmaterial.com");
        // Wait for 5 seconds
        Thread.sleep(5000);
        // Use the tagName locator to collect all elements with the tag name "a".
        // findElements returns a list of all matching web elements on the current page,
        // or an empty list if nothing matches.
        List<WebElement> links = driver.findElements(By.tagName("a"));
        // Print the total number of links
        System.out.println("Total links are " + links.size());
        // Loop over every link and check it
        for (int i = 0; i < links.size(); i++) {
            WebElement element = links.get(i);
            // The "href" attribute holds the URL of the link (null if the anchor has no href)
            String url = element.getAttribute("href");
            // Pass the URL collected above to verifyLink(); see the method below for details
            verifyLink(url);
        }
        // Close the browser once all links have been checked
        driver.quit();
    }

    // verifyLink(String urlLink) requests the given URL and prints the server's response status.
    public static void verifyLink(String urlLink) {
        // new URL(...) can throw java.net.MalformedURLException (for example, for a missing
        // or "javascript:" href). Keep the code in a try-catch block so that the broken link
        // analysis continues with the next link.
        try {
            // Create a URL object from the link
            URL link = new URL(urlLink);
            // Create a connection using the URL object
            HttpURLConnection httpConn = (HttpURLConnection) link.openConnection();
            // Set the connect timeout to 2 seconds
            httpConn.setConnectTimeout(2000);
            // Open the connection
            httpConn.connect();
            // getResponseCode() returns the HTTP status code; it throws IOException
            // if an error occurred while connecting to the server
            int responseCode = httpConn.getResponseCode();
            String status = urlLink + " - " + responseCode + " " + httpConn.getResponseMessage();
            // 4xx and 5xx status codes mean the link is broken
            System.out.println(responseCode >= 400 ? status + " (broken)" : status);
            httpConn.disconnect();
        } catch (Exception e) {
            // Report the failure instead of silently ignoring it
            System.out.println(urlLink + " - could not be checked: " + e.getMessage());
        }
    }
}
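Not every href collected from a page is worth requesting. An anchor without an href yields null from getAttribute, and values such as mailto: or javascript: links are not HTTP URLs. The same URL may also appear many times on one page. A minimal sketch of a pre-filter, assuming you first copy the href values into a list of strings, is shown below. The class name HrefFilter and the method checkableUrls are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: keeps only distinct http/https URLs, in their original order.
class HrefFilter {

    static List<String> checkableUrls(List<String> hrefs) {
        // LinkedHashSet drops duplicates while preserving insertion order.
        Set<String> unique = new LinkedHashSet<>();
        for (String href : hrefs) {
            if (href == null) {
                continue; // anchor without an href attribute
            }
            String trimmed = href.trim();
            String lower = trimmed.toLowerCase(Locale.ROOT);
            if (lower.startsWith("http://") || lower.startsWith("https://")) {
                unique.add(trimmed);
            }
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<String> hrefs = Arrays.asList(
                "https://www.softwaretestingmaterial.com",
                null,
                "mailto:someone@example.com",
                "https://www.softwaretestingmaterial.com");
        System.out.println(checkableUrls(hrefs));
    }
}
```

Filtering before the HTTP check keeps the output free of avoidable exceptions and avoids requesting the same URL more than once.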
Bonus tip: Another way to check for broken links on your website is to use an automated SEO tool. For example, one such tool is Website Crawler by Sitechecker. This tool will automatically scan your site and find all the broken links and other types of problems. You’ll also get a list of tips to fix any problems you find.
I want to do the same with TestNG. Please help.
Hi Pravash, convert the class into a TestNG class. Try it and let me know if you need any help.