Correcting Typos in Email Addresses with mailcheck.js
For many of the sites developed at The Plant, the main way of staying in contact with users is by e-mail. In fact, e-mail is used not only for communication but also for verifying user accounts. However, an annoying problem crops up when users sign up with an invalid e-mail address: they are unable to access the site that they signed up to. This commonly happens when a user mistypes their own e-mail address, and it should be prevented where possible.
So for one of our biggest sites we have started using mailcheck.js to offer corrections when there is a typo in the domain part of an e-mail address. When an e-mail address is entered and the domain part is spelt slightly differently to one of many known domains, mailcheck.js offers this known domain as a suggestion. We present this suggestion to the user, who need only click it to make the suggested correction.
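The matching step can be sketched as follows. This is a minimal illustration and not mailcheck.js's own implementation: it uses plain Levenshtein edit distance, and the names KNOWN_DOMAINS, MAX_DISTANCE and suggestDomain, the three-domain list and the cut-off of 2 are all assumptions made for the example.

```javascript
// Hypothetical short list; the site described here checks against its top 175 domains.
const KNOWN_DOMAINS = ["gmail.com", "yahoo.com", "hotmail.com"];

// Assumed cut-off: domains further away than this are never suggested.
const MAX_DISTANCE = 2;

// Classic Levenshtein edit distance between two strings.
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns a corrected address, or null when the domain is already known
// or nothing in the list is close enough.
function suggestDomain(email) {
  const at = email.lastIndexOf("@");
  if (at < 1 || at === email.length - 1) return null;
  const local = email.slice(0, at);
  const domain = email.slice(at + 1).toLowerCase();
  if (KNOWN_DOMAINS.includes(domain)) return null;

  let best = null;
  let bestDistance = MAX_DISTANCE + 1;
  for (const known of KNOWN_DOMAINS) {
    const d = editDistance(domain, known);
    if (d < bestDistance) {
      best = known;
      bestDistance = d;
    }
  }
  return best === null ? null : `${local}@${best}`;
}
```

With this list, suggestDomain("user@gmial.com") returns "user@gmail.com", while an address at a known domain or at a domain far from every entry returns null. Any close-but-legitimate domain that is missing from the list will also be "corrected", so the quality of the suggestions depends heavily on the list.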
Checking against a list of all domains would take too long, but the top 175 domains account for 72% of the site's users. Therefore, suggestions can be offered in around 70% of cases where the domain is misspelt. Being able to offer corrections for around 70% of domain misspellings sounds good, but does this translate to 70% fewer accounts being left unverified? Are users going to use corrections if they are offered? Does this new feature make a difference? These are the questions that we were interested in answering.
For many of our sites we use Mixpanel to track user-generated events. By tracking when suggestions are made and when these suggestions are used, we started to get an idea of how much this new feature was being used. Weekly averages since introducing the feature are as follows:
- Suggestions are being offered to 16% of users signing up.
- Users are clicking these suggestions to correct the spelling of their e-mail address in 8% of the cases where a suggestion is offered.
- This translates to just over 1% of users signing up to the site being offered a correction which they then use.
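The arithmetic behind these three figures can be reproduced with a short calculation. The function name suggestionFunnel, the field names and the event counts below are hypothetical; the counts were picked only so that they reproduce the reported percentages, and they are not the site's real data.

```javascript
// Derives the three rates from raw event counts for one period.
// Assumes signups and suggestionsOffered are both greater than zero.
function suggestionFunnel({ signups, suggestionsOffered, suggestionsUsed }) {
  return {
    offeredRate: suggestionsOffered / signups,
    usedWhenOfferedRate: suggestionsUsed / suggestionsOffered,
    usedOverallRate: suggestionsUsed / signups,
  };
}

// Hypothetical counts chosen to match 16% offered and 8% used when offered.
const exampleWeek = suggestionFunnel({
  signups: 10000,
  suggestionsOffered: 1600,
  suggestionsUsed: 128,
});

// 16% x 8% = 1.28% of all signups; prints "1.28%"
console.log((exampleWeek.usedOverallRate * 100).toFixed(2) + "%");
```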
Such low uptake of this feature was not anticipated but can be explained by several scenarios:
- The user notices the domain spelling correction but ignores it because it is a false positive. (For example, the user’s e-mail is hosted at “hmail.com” but this domain is not included in the list of known domains. In this case, the domain “gmail.com” is suggested even though it is incorrect to do so.)
- The user notices the domain spelling correction but chooses to ignore it because they are not interested in having a verified account. (Many invalid e-mail addresses were at the domain “test.com”.)
- The user notices the domain spelling correction but corrects the address manually, by typing it out.
- The user does not notice the domain spelling correction.
To gain a clearer idea of which of these scenarios was occurring, the data from Mixpanel was matched by week to our own data on the number of users signing up and verifying their accounts each week. Additionally, for all users that signed up each week, mailcheck.js was used to determine if a correction would have been suggested.
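This retrospective check can be sketched as a weekly aggregation. The record shape ({ week, email, bounced }), the function name flaggedDeliveryByWeek and the suggest parameter are assumptions for illustration; suggest stands in for whatever wrapper around mailcheck.js returns a suggested address, or null when no correction would be offered.

```javascript
// signups: array of { week, email, bounced } records (hypothetical shape).
// suggest: function that returns a suggested address, or null if none.
function flaggedDeliveryByWeek(signups, suggest) {
  const weeks = new Map();
  for (const { week, email, bounced } of signups) {
    if (suggest(email) === null) continue; // no correction would have been offered
    const stats = weeks.get(week) || { flagged: 0, delivered: 0 };
    stats.flagged += 1;
    if (!bounced) stats.delivered += 1;
    weeks.set(week, stats);
  }

  // Only weeks with at least one flagged address appear in the result,
  // so the division below never sees a zero denominator.
  const result = {};
  for (const [week, { flagged, delivered }] of weeks) {
    result[week] = { flagged, delivered, deliveredShare: delivered / flagged };
  }
  return result;
}
```

A high deliveredShare means that most addresses a correction would have been suggested for were in fact valid, which is one way to gauge how often suggestions are false positives.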
Before introduction of the feature: for more than 90% of the e-mail addresses that would have been flagged to the user as incorrect, e-mails did not bounce when sent to these addresses. This indicates that the effect of false positives is significant.
The effect of users intentionally signing up with invalid e-mail addresses was fairly minimal. After introduction of the feature: for more than 97% of users signing up with e-mail addresses that would have been flagged as incorrect, e-mails did not bounce when sent to these addresses. This can be seen in the following graph.
The difference between the number of corrections being used (as reported by Mixpanel) and the number of corrections that would have been suggested was less than 3 each week on average. This indicates that when a correction is offered, a user rarely makes the correction by hand.
There was no strong evidence that users are failing to notice domain spelling corrections, and on average users have been confirming their accounts slightly less often since this feature was introduced. Thus, domain corrections are being used, but false positives are perhaps occurring too frequently and the list of known domains needs to be altered.