Preventing Form Spam

Although your site visitors never see it, form spam is a headache for site owners – especially if you get email or text notifications every time a visitor completes a form on your site. Preventing form spam helps alleviate this headache. But what is form spam exactly?

Most form spam occurs when bots discover your form and attempt to use it as a gateway to submit undesirable content such as abusive language, links to malware or phishing websites, and links to other illegal activities. Once a bot identifies a vulnerability, your form can be abused many thousands of times in a short space of time. This can result in your domain name being blacklisted, SMTP services being locked out due to abuse, degraded website performance, and significant work to resolve the situation. (So maybe it’s more of a migraine than a headache.)

Regardless of whether your form was coded by hand or created with a WordPress form plugin, it is susceptible to such attacks unless you implement additional strategies to protect it from malicious use.

The best approach is to implement spam prevention measures when the form is originally created to avoid problems later on.

The preventative measures discussed in this article are:

  1. Invisible captchas
  2. Filtering disposable email addresses
  3. Honeypot
  4. WordPress anti-spam plugins

Invisible Captchas

We’re all familiar with the conventional captcha challenge. You check the box to confirm you are a human and then you may have to click additional boxes that contain a certain object such as a traffic signal or motorbike. The captcha is intended to distinguish human from machine input. There are various providers of captcha technologies including the most popular, reCAPTCHA, provided by Google. Another popular captcha with a focus on privacy is hCaptcha.

Captchas provide one of the most effective ways of preventing form spam, not least because they require a form of human interaction that is very difficult for spam bots to circumvent. A downside, however, is that they can be frustrating for someone completing the form, particularly if the captcha does not accept their response and requires retries. They can also be problematic from an accessibility perspective.

To overcome this problem, some captcha systems now provide invisible versions. The captcha technology sits invisibly on your website and monitors events on your page to determine if a human is present. For example, it may monitor mouse, touch, or keystroke events.

Invisible captchas provide a frictionless way of ensuring a human visitor is submitting forms on your website. They are also more accessible than their conventional counterpart, because clicking on squares containing out-of-focus images can pose accessibility challenges.

Implementing captchas on a site usually involves client and server side components. The client component is typically a lightweight JavaScript library which will issue a token once it has determined whether or not a human is present. The token is a unique identifier that is then used to retrieve information about the result of the captcha by a server side script.

Let’s use the Google reCAPTCHA system as an example.

The first step is to create reCAPTCHA v3 keys. You can do that by visiting: https://www.google.com/recaptcha/admin/create. Two keys are created, a site key (which is a public key) and a secret key which is only used server-side when checking the token.

Once you have your keys you need to add some simple JavaScript to the page containing your form. A simplified example of this script would be:

<!-- Load the Google reCAPTCHA JavaScript library -->
<script src="https://www.google.com/recaptcha/api.js"></script>

<!-- Submit the form when the reCaptcha is successful -->
<script>

    function on_submit(token) {
        document.getElementById("demo-form").submit();
    }

</script>

<!-- Form containing a modified submit button -->
<form id="demo-form" method="post">

<button class="g-recaptcha" 
        data-sitekey="reCAPTCHA_site_key" 
        data-callback="on_submit" 
        data-action="submit">Submit</button>

</form>

Note that the traditional submit button has been replaced with a custom button element. In this example, when the button is clicked, the reCAPTCHA process will begin and if successful it will call the on_submit function which will submit the form. The form submission POST request will include an additional variable called g-recaptcha-response which contains a token.

The token is then checked server side as part of the PHP script that processes the form POST request. An example script for checking the response is shown below:

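// Get the reCAPTCHA response from the POST request
$token = isset( $_POST[ 'g-recaptcha-response' ] ) ? $_POST[ 'g-recaptcha-response' ] : '';

// Validate reCAPTCHA response (letters, digits, underscores and hyphens only)
if( !preg_match( '/^[\w-]+$/', $token ) ) {

	// reCAPTCHA response is missing or invalid
	exit();
}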

// Make a POST request to verify reCaptcha token
$response = wp_remote_post(

	'https://www.google.com/recaptcha/api/siteverify',

	array(

		'body' => array(

			'secret' => 'reCAPTCHA_secret_key',
			'response' => $token
		)
	)
);

// Check for errors from the wp_remote_post function
if ( is_wp_error( $response ) ) {

	// Handle error
	$error_message = $response->get_error_message();

} else {

	// Decode the JSON body of the response
	$response_decoded = json_decode( wp_remote_retrieve_body( $response ) );

	// Check for a successful reCAPTCHA response
	if( !is_null( $response_decoded ) && $response_decoded->success ) {

		// reCAPTCHA was successful, process the form submission

	} else {

		// reCAPTCHA failed, disregard the form submission
		exit();

	}
}
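
reCAPTCHA v3 also returns a score between 0.0 and 1.0 and the name of the action that was verified, so checking success alone may let low-scoring traffic through. The sketch below shows one way the decoded response could be evaluated. The function name is hypothetical, and the 0.5 threshold is only a starting point that you would tune for your own site.

```php
<?php
// Sketch only: decide whether a decoded siteverify result is acceptable.
// The function name and the default threshold are illustrative assumptions.
function recaptcha_result_is_acceptable( $result, $expected_action = 'submit', $min_score = 0.5 ) {

	// json_decode() returns null for malformed JSON, so reject non-objects
	if ( ! is_object( $result ) || empty( $result->success ) ) {
		return false;
	}

	// The action should match the data-action attribute set on the button
	if ( ! isset( $result->action ) || $result->action !== $expected_action ) {
		return false;
	}

	// Scores range from 0.0 (likely a bot) to 1.0 (likely a human)
	return isset( $result->score ) && (float) $result->score >= $min_score;
}

// Example with a hand-written response body (not real API output)
$sample = json_decode( '{"success":true,"score":0.9,"action":"submit"}' );
var_dump( recaptcha_result_is_acceptable( $sample ) ); // bool(true)
```

In the script above, this check would take the place of the plain success test on the decoded response.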

Using a form plugin can greatly simplify this process as the client and server components will be handled for you. For example, by using the WS Form form plugin, you simply drag a reCAPTCHA field to your form and configure a few settings to secure your form.

Adding reCAPTCHA V3 to a form using WS Form

Using an invisible captcha on your form is a non-intrusive way of providing a high level of protection against spam bots.

Filtering disposable email addresses

Disposable email addressing, also known as DEA, refers to unique email addresses that are set up for a limited number of uses to hide the identity of the sender. These are often used quite legitimately to protect one’s privacy, but spam bots often use disposable email addresses when probing your forms for weaknesses. You might also receive unwanted form submissions that use disposable email addresses to solicit business or with malicious intent.

You may, therefore, wish to filter disposable email addresses.

Filtering out disposable email addresses can be achieved by checking the submitted email address in the PHP script that processes your form POST request.

First you’re going to need a list of domain names used by disposable email addresses. Several Git repositories containing such lists are available online and can be found with a simple Google search. For the purpose of this tutorial we’ll save the list of domains to a file called domains.conf, with each domain on a separate line. For example:

0-mail.com
027168.com
0815.ru
0815.ry
0815.su
0845.ru
0box.eu
0clickemail.com
...

Next you will create a PHP script to check for those domains.

// Get the email address from the POST request
$email = sanitize_email( $_POST[ 'email' ] );

// Check if the email address is valid
if( filter_var( $email, FILTER_VALIDATE_EMAIL ) ) {

	// Split email address by @
	$email_array = explode( '@', $email );

	// Extract domain
	$domain = strtolower( array_pop( $email_array ) );

	// Load blocked domains file into array
	$domains_blocked = file( 'domains.conf', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES );

	// Check if domain exists in list of blocked domains
	if( !in_array( $domain, $domains_blocked ) ) {

		// Email address is good, process the form submission
	} else {

		// Disposable email address found, disregard the form submission
		exit();
	}
}
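
The exact match above will miss an address on a subdomain of a listed domain, such as someone@mail.0box.eu. As a rough sketch, the check could be wrapped in a function that also tests each parent domain. The function name here is hypothetical, and the domain list is passed in as an array so the example runs without domains.conf.

```php
<?php
// Sketch only: the function name is hypothetical and not part of any plugin.
// It reports whether the address uses a blocked domain or one of its subdomains.
// Validating the address format is assumed to happen separately.
function is_disposable_email( $email, array $blocked_domains ) {

	// Take everything after the last @ as the domain
	$at = strrpos( $email, '@' );
	if ( false === $at ) {
		return false;
	}
	$domain = strtolower( substr( $email, $at + 1 ) );

	// Flip the list so each lookup is a key check rather than a full scan
	$blocked = array_flip( array_map( 'strtolower', $blocked_domains ) );

	// Test the full domain first, then each parent domain in turn
	$labels = explode( '.', $domain );
	while ( count( $labels ) >= 2 ) {
		if ( isset( $blocked[ implode( '.', $labels ) ] ) ) {
			return true;
		}
		array_shift( $labels );
	}

	return false;
}

var_dump( is_disposable_email( 'someone@mail.0box.eu', array( '0-mail.com', '0box.eu' ) ) ); // bool(true)
```

In practice the second argument would be the array returned by file() for domains.conf, as in the script above.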

Using a form plugin, you can simplify this process. For example, WS Form is integrated with WordPress plugins such as Clearout, which provide online services for detecting valid email addresses.

Many form plugins also offer WordPress filter hooks that allow you to add your own filters for detecting disposable email addresses.

Honeypot

Honeypot spam protection involves adding an extra field to your form that is hidden from visitors but visible to spam bots. Bots that fill in every field they find will give this field a value, whereas genuine visitors will leave it empty.

Testing for spam with a honeypot is incredibly simple. If a value is present in the field, the submission is rejected. If no value is found, the submission is accepted.

Honeypot provides an effective method of preventing form spam, and from a user perspective, offers a frictionless experience.
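
For a hand-coded form, a honeypot might look something like the sketch below. The field name website_url and both function names are hypothetical, and the inline CSS is just one of several ways to keep the field out of sight.

```php
<?php
// Sketch only: the field name "website_url" and both function names are hypothetical.

// Markup for the decoy field. It is moved off screen with CSS, removed from
// the tab order, and hidden from assistive technology so real visitors skip it.
function honeypot_field_html( $name = 'website_url' ) {
	return '<div style="position:absolute;left:-9999px" aria-hidden="true">'
		. '<label>Leave this field empty '
		. '<input type="text" name="' . htmlspecialchars( $name, ENT_QUOTES ) . '" tabindex="-1" autocomplete="off">'
		. '</label></div>';
}

// Returns true when the decoy field was filled in, which suggests a bot.
function honeypot_triggered( array $post, $name = 'website_url' ) {

	// Field not submitted at all: nothing to judge here
	if ( ! isset( $post[ $name ] ) ) {
		return false;
	}

	// A browser sends a string for a text input, so treat anything else as suspicious
	return ! is_string( $post[ $name ] ) || '' !== trim( $post[ $name ] );
}

var_dump( honeypot_triggered( array( 'email' => 'a@example.com', 'website_url' => '' ) ) ); // bool(false)
var_dump( honeypot_triggered( array( 'website_url' => 'http://spam.example' ) ) );          // bool(true)
```

In a real form handler you would echo honeypot_field_html() inside the form and pass $_POST to honeypot_triggered() before processing the submission.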

Many form plugins offer an option for enabling Honeypot protection by simply checking a box in the form settings, as shown below for WS Form.

Enabling honeypot spam protection in WS Form

WordPress anti-spam plugins

In addition to the methodologies outlined above, there are a variety of WordPress plugins that can be used to help prevent form spam, such as Akismet, Cleantalk, Jetpack Anti-Spam and WordPress Zero Spam. These plugins integrate with popular form plugins to provide protection for your website.

Check out the WordPress plugin directory to find more!

In conclusion

Form spam doesn’t have to be a headache for the site owner, and avoiding it doesn’t have to mean extra steps for the person completing the form. With a little forethought and a few measures in place, your form can deliver qualified submissions.