Monday, February 8, 2010

Better email validation with Tapestry

Recently I implemented a small form for creating user accounts with Tapestry. I used all the standard validators available, hoping that they would pass the manual tests. Well, most of them did :-), with one exception: the 'Email' validator.

If you look at the source of the Tapestry (5.1.0.5) email validator, you may spot the problem.

//...
private static final String ATOM = "[^\\x00-\\x1F^\\(^\\)^\\<^\\>^\\@^\\,^\\;^\\:^\\\\^\\\"^\\.^\\[^\\]^\\s]";

private static final String DOMAIN = "(" + ATOM + "+(\\." + ATOM + "+)*";

private static final String IP_DOMAIN = "\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\]";

private static final Pattern PATTERN = Pattern
.compile("^" + ATOM + "+(\\." + ATOM + "+)*@" + DOMAIN + "|" + IP_DOMAIN + ")$", Pattern.CASE_INSENSITIVE);

//...

public void validate(Field field, Void constraintValue, MessageFormatter formatter, String value)
throws ValidationException
{
if (!PATTERN.matcher(value).matches()) throw new ValidationException(buildMessage(formatter, field));
}
//...


Can you see it?

What about the user "dajdajda" having an account at "dhadadhahjad.edhadja.dads"? What about a Polish guy trying "adaś@ćma.pl" or a German one trying "thomas.müller@öäüß.de"? The pattern only excludes a handful of special characters, so all of these pass. Are these really valid email addresses? Do you want to accept them? I would rather reject them right at the beginning.
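To check this without starting a Tapestry application, the pattern can be copied into a small standalone class. This is only a sketch: the class name TapestryEmailPatternDemo and the method accepts are made up for illustration, and the non-ASCII letters of the sample addresses are written as Unicode escapes so the file compiles regardless of source encoding.

```java
import java.util.regex.Pattern;

// Standalone copy of the pattern quoted above, for experimenting outside Tapestry.
// The class and method names are hypothetical; only the regex comes from the post.
final class TapestryEmailPatternDemo {
    private static final String ATOM = "[^\\x00-\\x1F^\\(^\\)^\\<^\\>^\\@^\\,^\\;^\\:^\\\\^\\\"^\\.^\\[^\\]^\\s]";
    private static final String DOMAIN = "(" + ATOM + "+(\\." + ATOM + "+)*";
    private static final String IP_DOMAIN = "\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\]";
    private static final Pattern PATTERN = Pattern.compile(
            "^" + ATOM + "+(\\." + ATOM + "+)*@" + DOMAIN + "|" + IP_DOMAIN + ")$",
            Pattern.CASE_INSENSITIVE);

    // Same check as in validate(), minus the Tapestry types.
    static boolean accepts(String value) {
        return PATTERN.matcher(value).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "dajdajda@dhadadhahjad.edhadja.dads",          // gibberish domain
            "ada\u015b@\u0107ma.pl",                        // the Polish example
            "thomas.m\u00fcller@\u00f6\u00e4\u00fc\u00df.de", // the German example
            "no-at-sign"                                    // no '@' at all
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + accepts(sample));
        }
    }
}
```

The first three samples are accepted because ATOM is a negated character class: anything that is not a control character, whitespace or one of the listed special characters counts as a legal atom.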

To me, full SMTP validation is a little overkill (because of performance, complexity and the many additional problems that make your life complicated). However, a hybrid of regex checks and DNS validation seems to be an acceptable solution.

I have created my own email validator based on the simple DNS check code samples from http://www.rgagnon.com/javadetails/java-0452.html.
Also, for regex validation I used org.apache.commons.validator.EmailValidator, which in my opinion offers a better regex pattern than the one defined in Tapestry 5.1.0.5 (see code above).

If you agree with me, then please feel free to use my code shown below (free of charge ;-)). You can create your DNSEmailValidator in any package you want, then contribute it in your AppModule.java and simply use it within your @Validate or <t:.. validate="dnsEmail"/> statements.


//...

import java.util.Hashtable;

import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

import org.apache.commons.validator.EmailValidator;
import org.apache.tapestry5.Field;
import org.apache.tapestry5.MarkupWriter;
import org.apache.tapestry5.ValidationException;
import org.apache.tapestry5.ioc.MessageFormatter;
import org.apache.tapestry5.services.FormSupport;
import org.apache.tapestry5.validator.AbstractValidator;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DNSEmailValidator extends AbstractValidator<Void, String> {
private final static Logger _logger = LoggerFactory
.getLogger(DNSEmailValidator.class);

public DNSEmailValidator() {
super(null, String.class, "invalid-email");
}

public void validate(Field field, Void constraintValue,
MessageFormatter formatter, String value)
throws ValidationException {

// validate the syntax
final EmailValidator validator = EmailValidator.getInstance();
if (!validator.isValid(value))
throw new ValidationException(buildMessage(formatter, field));

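// validate the DNS: the domain is everything after the last '@'
final String domain = value.substring(value.lastIndexOf('@') + 1);
try {
int servers = doLookup(domain);
_logger.info(domain + " has " + servers + " mail servers.");
// a domain that resolves but publishes no MX records is rejected too
if (servers == 0)
throw new ValidationException(buildMessage(formatter, field));
} catch (NamingException e) {
// the domain does not exist or the DNS lookup failed
throw new ValidationException(buildMessage(formatter, field));
}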
}

private String buildMessage(MessageFormatter formatter, Field field) {
return formatter.format(field.getLabel());
}

public void render(Field field, Void constraintValue,
MessageFormatter formatter, MarkupWriter writer,
FormSupport formSupport) {
formSupport.addValidation(field, "dnsEmail", buildMessage(formatter,
field), null);
}

static int doLookup(String hostName) throws NamingException {
Hashtable<String, String> env = new Hashtable<String, String>();
env.put("java.naming.factory.initial",
"com.sun.jndi.dns.DnsContextFactory");
DirContext ictx = new InitialDirContext(env);
try {
// ask the DNS for the MX (mail exchanger) records of the domain
Attributes attrs = ictx.getAttributes(hostName, new String[] { "MX" });
Attribute attr = attrs.get("MX");
return (attr == null) ? 0 : attr.size();
} finally {
ictx.close(); // release the JNDI context
}
}
}


Contribution code:


//...
public static void contributeFieldValidatorSource(
MappedConfiguration<String, Validator<Void, String>> configuration) {
configuration.add("dnsEmail", new DNSEmailValidator());
}
//...


Of course, this DNSEmailValidator does not make sure that the mail account really exists. It only proves that mail servers are registered for the given domain name. Checking the existence of the mail account itself may take a little longer and sometimes fails (e.g. when greylisting is enabled). The cool thing is that Tapestry allows you to create any validator you want, so you may implement both and use the one that satisfies the current requirements of your project.
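The MX lookup itself needs nothing but the JDK, so it can be tried on its own before wiring it into a validator. The following is a minimal sketch under a few assumptions: the class MxLookupSketch and its methods are hypothetical names, the timeout values are arbitrary, and the two timeout properties are those of the JDK's built-in JNDI DNS provider. Since performance is one of the worries with network-based validation, the sketch lowers the provider's timeouts instead of relying on its defaults.

```java
import java.util.Hashtable;

import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

// Hypothetical helper, not part of Tapestry or of the validator above.
final class MxLookupSketch {

    // Returns the part after the last '@', or null if there is no usable domain.
    static String domainOf(String email) {
        int at = email.lastIndexOf('@');
        if (at < 0 || at == email.length() - 1) {
            return null;
        }
        return email.substring(at + 1);
    }

    // Counts the MX records of a domain; throws NamingException if the lookup fails.
    static int countMxRecords(String domain) throws NamingException {
        Hashtable<String, String> env = new Hashtable<String, String>();
        env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
        // Timeout settings of the JDK DNS provider; the values are arbitrary assumptions.
        env.put("com.sun.jndi.dns.timeout.initial", "2000");
        env.put("com.sun.jndi.dns.timeout.retries", "1");
        DirContext ctx = new InitialDirContext(env);
        try {
            Attribute mx = ctx.getAttributes(domain, new String[] { "MX" }).get("MX");
            return mx == null ? 0 : mx.size();
        } finally {
            ctx.close(); // always release the JNDI context
        }
    }

    public static void main(String[] args) {
        String email = args.length > 0 ? args[0] : "someone@example.org";
        String domain = domainOf(email);
        if (domain == null) {
            System.out.println("No domain part in: " + email);
            return;
        }
        try {
            System.out.println(domain + " has " + countMxRecords(domain) + " MX record(s).");
        } catch (NamingException e) {
            System.out.println("Lookup failed for " + domain + ": " + e.getMessage());
        }
    }
}
```

Run it with an address as the only argument. A domain without MX records reports 0, while a domain that does not exist ends up in the catch branch.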

Wednesday, February 3, 2010

After interview

Yesterday I had a job interview for a Software Engineering (at least I thought so) position in one of the departments of my current company.

I am an interview newbie (who has done like 5-6 of them in his life), but I think that's enough to notice the difference between a good and a poorly prepared one.

Well, since the company I work for was never (and probably will never be) particularly good at interviewing software engineers, I kind of expected only pointless and stupid questions to be asked.

The fact that I usually like to read a lot on "what's the difference between a good and bad software professional", "how to filter people when you recruit them", etc. only made it worse.

I think that the guy asking the questions didn't really know what he would like to hear and know about me. How would you answer a question going something like this:
"How much time do you think you would need to get up to speed with our project?". Well, it sounds like an important question, doesn't it? ... no doubt about it... but how the hell am I supposed to answer it if I know very little about the project, its size, the quality of the code and documentation, or the people (e.g. whether they are helpful or not)? I couldn't say it out loud, but I got really pissed off and thought: "This is the question you should ask yourself after getting to know me a little bit better!! That's why I am at the f.... interview, isn't it?". Some people would probably answer "As fast as possible", and I bet that this would satisfy that guy (who, I bet, loves "standard answers"). The problem is that I am a creative engineer and do not like routine too much. When interviewed, I like to THINK.

I usually have nothing against dummy questions like "Describe the Visitor Pattern" or "What's the difference between heap and stack?". They prove that I read books from time to time, or that I happened to work on a project where people actually used design patterns. On the other hand, I do not favor this kind of "check" because it does not prove whether I am a "THINKING PERSON". I may be an idiot who knows patterns by heart but has no idea how to solve complex problems with or without them. Software Engineering is more like art. There are many ways to build software, some nice and some less elegant (but still correct). We cannot say that the "Gang of Four" defined the best ever solutions for designing and implementing applications. Patterns help to solve common problems and establish a communication layer between developers who know what they are. However, they do not always solve our custom problems in the best possible way (there is no such thing as "the best way" anyway). That's why I would test people not on their knowledge but rather on their ability to THINK CREATIVELY.

We will see if I get the job :-) The project is very interesting but one of the interviewers did a really good job in demotivating me.

At the end he also said something about my experience: that he does not like the fact that I have done Java development for almost my entire professional career (he would be really happy if it had been C++ instead). I mean... honestly, he could have kept it to himself, because I personally do not regret this fact AT ALL!! A language is just a tool, and as long as people do not get that it is JUST A TOOL, they will not be able to hire the really passionate software developers.

I think I suck, but my interviewer sucks much, much more! (and gets paid much, much better).

The guy did one good thing though. He motivated me to spend even more time on my own project. I can't wait to finally launch it. It had better be a success!

Tuesday, February 2, 2010

Want to learn C++ for free?

I found a very nice and free C++ tutorial (which kind of looks like a huge electronic book to me).

I find section 7.9, "The stack and the heap", particularly well written. The author uses simple analogies (e.g. the plate and the mailbox analogy) to explain how the stack and the heap work. I have always liked this kind of approach when learning new stuff. Unfortunately, not many book authors use it in their works. Too bad!

1:0 for the free electronic book.

Monday, February 1, 2010

Can call myself a C++ programmer now!

Yeah, I've finally finished the "C++ from the ground up" book today (including the last appendix).

The author himself calls everyone who managed to go through all the chapters of the book a C++ programmer. I guess he is right. I totally feel like I am able to not only understand but also code some pretty advanced apps in this language. Of course, advanced does not mean perfect and bug free in this case. Every newbie has to start with some shitty code to become a master. The important point is to write the junk code at home and not contribute it to the latest version of the application produced together with your colleagues at work.

The book is really awesome and I highly recommend it to everyone willing to learn Standard C++.

The dark side of the book is that it does not cover some of the subjects commonly seen in professionally written C++ applications. The author suggests his other book (isn't it funny? "... to know more about XXX refer to MY other book ...") to learn about e.g. "function objects" or some advanced STL data types and algorithms. Also, the author does not give any hints about how a C++ app should be structured (the ".h" files, dir structure of a project, dll's (or so's), etc.), which I think is a pity. I would expect at least a small, mini chapter covering this topic. I guess the book is good if you are willing to learn a new language (and only the language), but not the way to use it efficiently when building a large application.

Well, I am happy that I've read this book and will probably come back to it from time to time to refresh my knowledge of certain language-specific subjects.

Hmm... and maybe I will buy Herbert's second book "C++ The Complete Reference" to make sure I did not miss anything important. The templates, I definitely need to refresh my knowledge on templates right now! :-)