Just to be clear, the only requirement is a scanner/lexer/tokenizer (whatever you want to call it), correct?
If that's the case, this project lies squarely in my wheelhouse. I recently wrote my own C lexer->preprocessor->parser->compiler pipeline, and it's all still fresh in my mind, so I should be able to crank out a custom lexer in no time.
Also, as a note: you say that the lexical specifications are listed, but they are not in the description.