False positives
False positives are URLs that are flagged as broken even though they work when a user visits them. This page describes the common causes and how to mitigate them.
Causes
HTTP status codes are used to decide whether a URL is working or broken. Some websites return an erroneous status code even though the URL works as expected when an end user visits it. The most common reasons are described in the following sections.
Authentication
Some websites require authentication to view the content behind a URL. These websites should return the status code 401 Unauthorized to indicate that authentication is required when no credentials have been provided.
Currently, the Easy Link Checker has no mechanism for providing authentication credentials. We are, however, planning to support such mechanisms in the future. For now, the solution is to exclude the affected links or to accept the returned status code, as explained in the Mitigation section.
Rate limiting
Some websites have mechanisms that prevent excessive access to their resources. These mechanisms, referred to as rate limiting, restrict the number of requests a user or IP address can make within a specified time frame. During link checking, a large number of requests may be sent to the same website, which can cause the requests to be rate limited. If this happens, the website should return the status code 429 Too Many Requests. However, some websites return a different status code or no data at all, which causes a timeout.
To work around rate limiting, exclude the affected links from link checking or accept the returned status code. See the Mitigation section for more information.
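As an illustration of how a rate-limited response can be recognised, the following Python sketch inspects a status code and the standard Retry-After header. It is not how the Easy Link Checker works internally; the function name retry_delay and its parameters are hypothetical, and the header mapping is assumed to use the exact key "Retry-After".

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str],
                now: Optional[datetime] = None) -> Optional[float]:
    """Return the seconds to wait before retrying, or None if not rate limited.

    Hypothetical helper: only 429 is treated as rate limiting here, although
    some websites signal it with other status codes or with no response.
    """
    if status != 429:
        return None
    value = headers.get("Retry-After", "").strip()
    if not value:
        return 0.0  # rate limited, but the server gave no hint
    if value.isdigit():
        return float(value)  # delay given in seconds
    try:
        when = parsedate_to_datetime(value)  # delay given as an HTTP date
    except (TypeError, ValueError):
        return 0.0  # unparseable header, treat as "no hint"
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

A response that yields a delay here is a candidate false positive: the URL may work fine once the website stops limiting requests.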
Slow websites and timeouts
Websites can also be slow or not respond at all. Currently, a timeout of 5 seconds is configured. If a website does not respond within this time frame, the corresponding URL is considered broken. At the moment, the timeout value is not configurable, but we will try to change that in an upcoming release.
To mitigate this problem, exclude the affected URLs from link checking.
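To show why a slow website ends up flagged as broken, here is a minimal Python sketch of a check with a 5-second timeout. It is an assumption-laden illustration and not the Easy Link Checker's implementation: check_url, TIMEOUT_SECONDS and the opener parameter are hypothetical names, and treating every status below 400 as working is an assumption.

```python
import socket
import urllib.error
import urllib.request

TIMEOUT_SECONDS = 5  # value from the text; the constant name is hypothetical


def check_url(url, timeout=TIMEOUT_SECONDS, opener=urllib.request.urlopen):
    """Return 'working', 'broken' or 'timeout' for a single URL.

    A 'timeout' result would be reported as broken, even though the page
    might load for a patient user. `opener` exists only to allow testing.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return "working" if response.status < 400 else "broken"
    except urllib.error.HTTPError:
        return "broken"  # the server answered with an error status code
    except urllib.error.URLError as exc:
        # urllib wraps connection-level timeouts in URLError
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "timeout"
        return "broken"
    except (socket.timeout, TimeoutError):
        return "timeout"  # the server stopped responding mid-request
```

Keeping the timeout outcome separate from other failures makes it easier to spot URLs that are merely slow and are therefore good candidates for exclusion.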
Mitigation
Link exclusion
Links can be included in or excluded from the scanning process as described in the Filtering links section on the Further configuration page. Excluded links are listed in a separate category in the result table.
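The effect of exclusion can be pictured as splitting the link list into two groups before any request is sent. The Python sketch below assumes shell-style wildcard patterns purely for illustration; the actual filter syntax is the one documented in the Filtering links section, and partition_links is a hypothetical name.

```python
from fnmatch import fnmatchcase


def partition_links(urls, exclude_patterns):
    """Split URLs into (to_check, excluded) using wildcard patterns.

    Assumption: patterns use shell-style wildcards such as '*'. The real
    filter syntax of the link checker may differ.
    """
    to_check, excluded = [], []
    for url in urls:
        if any(fnmatchcase(url, pattern) for pattern in exclude_patterns):
            excluded.append(url)  # reported separately, never requested
        else:
            to_check.append(url)
    return to_check, excluded
```

For example, excluding "https://intranet.example.com/*" keeps every link to that host out of the check, which avoids 401 responses and rate limiting for that website.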
Status code
Erroneous status codes can be accepted with the Accepted status codes option on the Further configuration page.
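Conceptually, accepting a status code adds it to the set of responses that count as working. The following Python sketch shows that idea under two assumptions that are not taken from the Easy Link Checker: that codes from 200 to 399 are treated as working by default, and that the hypothetical function is_broken receives the accepted codes as a set.

```python
def is_broken(status, accepted=frozenset()):
    """Return True if a status code should be reported as broken.

    Assumption: 2xx and 3xx count as working by default; `accepted`
    holds additional codes the user has chosen to accept.
    """
    if status in accepted:
        return False
    return not (200 <= status < 400)
```

With an empty set, a 401 response is reported as broken; with {401, 429} accepted, the same response is treated as working, which removes false positives caused by authentication and rate limiting.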
HTTP vs HTTPS
Most websites today use HTTPS (https://) and might no longer support HTTP (http://). It is common to redirect requests from an HTTP URL to the corresponding HTTPS URL using HTTP redirection. If this is the case, the Easy Link Checker should classify the links correctly. However, some websites return erroneous status codes for all HTTP URLs, and others use different redirection mechanisms. In any case, it is advisable to change your URLs to use HTTPS instead of HTTP if the website offers it.
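When many links need updating, the rewrite from HTTP to HTTPS can be scripted. The Python sketch below is one possible way to do it and is independent of the Easy Link Checker; to_https is a hypothetical helper, and it assumes the website really serves the same content over HTTPS, which should be verified before the links are changed.

```python
from urllib.parse import urlsplit, urlunsplit


def to_https(url):
    """Return the HTTPS form of an http:// URL; leave other URLs unchanged."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return url  # already HTTPS, or not a web URL at all
    netloc = parts.netloc
    if netloc.endswith(":80"):
        netloc = netloc[:-3]  # the default HTTP port makes no sense for HTTPS
    return urlunsplit(("https", netloc, parts.path, parts.query, parts.fragment))
```

The path, query string and fragment are carried over unchanged, so only the scheme (and an explicit port 80, if present) differs from the original link.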