
PWNtcha - captcha decoder

Note to /. readers: this article comes in a quite untimely fashion. Though I have been making tremendous progress with many other captchas, I have not updated this webpage for months! Please come back in a few days when I have had time to make a nicer webpage.

Many thanks to the VideoLAN project for hosting my page during the /. effect, to the GNAA for providing me with a massive amount of Captcha samples, and of course to every other contributor to this project.

PWNtcha stands for "Pretend We’re Not a Turing Computer but a Human Antagonist", as well as PWN capTCHAs. This project’s goal is to demonstrate how ineffective many captcha implementations are.

For an overview of why visual captchas are a bad idea, see Matt May’s excellent presentation, Escape from CAPTCHA, as well as the W3C’s Inaccessibility of Visually-Oriented Anti-Robot Tests working draft.

FAQ

Please read this FAQ attentively before making hasty assumptions.

  • Q. Does this mean that captchas are dead?
  • A. No, of course not. There are many very difficult captchas here and there. PWNtcha does not decode them and probably never will.
  • Q. Why don’t you list captcha <foo>?
  • A. Maybe because I was not aware of it. Please send me more information about it.
  • Q. Where is the code?
  • A. No code is available yet. I am still weighing whether it is wise to release the code into the wild. The good old full-disclosure debate... If you think I should release the code for PWNtcha, feel free to explain your arguments to me.
  • Q. Please give me a copy of PWNtcha so that I can test it on my own CAPTCHA and see how efficient it is!
  • A. PWNtcha does not work that way. It is not an intelligent program that tries to decode an arbitrary CAPTCHA. Such a program would be nearly impossible to write. PWNtcha is simply a toolkit of image manipulation functions, plus a list of known CAPTCHAs, each with the sequence of image operations to apply in order to decode it. If I have never seen your CAPTCHA, then PWNtcha does not know about it, and there is absolutely no way it could decode it.
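
The "toolkit plus list of known CAPTCHAs" design described in the last answer can be illustrated with a minimal sketch. Since the PWNtcha code is not available, everything here is an assumption made for illustration: the operation names (`threshold`, `invert`, `crop_border`), the `RECIPES` table and its two example entries are hypothetical and are not taken from PWNtcha.

```python
from typing import Callable, Dict, List

# An image is modelled as rows of grayscale pixel values (0 = black, 255 = white).
Image = List[List[int]]
Operation = Callable[[Image], Image]


def threshold(level: int) -> Operation:
    """Return an operation mapping each pixel to pure black or pure white."""
    def apply(img: Image) -> Image:
        return [[0 if px < level else 255 for px in row] for row in img]
    return apply


def invert(img: Image) -> Image:
    """Swap dark and light pixels."""
    return [[255 - px for px in row] for row in img]


def crop_border(margin: int) -> Operation:
    """Return an operation that removes `margin` pixels from every edge."""
    def apply(img: Image) -> Image:
        return [row[margin:len(row) - margin] for row in img[margin:len(img) - margin]]
    return apply


# Hypothetical registry: one hand-written recipe per known captcha.
RECIPES: Dict[str, List[Operation]] = {
    "example-light-on-dark": [invert, threshold(128)],
    "example-framed": [crop_border(1), threshold(128)],
}


def preprocess(name: str, img: Image) -> Image:
    """Apply the recipe registered for `name`; unknown captchas are rejected."""
    if name not in RECIPES:
        raise KeyError(f"no recipe for captcha {name!r}")
    for op in RECIPES[name]:
        img = op(img)
    return img
```

A captcha with no entry in the table raises `KeyError`, which mirrors the point of the answer: without a hand-written recipe, a tool built this way has nothing to apply.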

Defeated captchas

PWNtcha is able to detect and decode the following captchas:

Origin | Samples | PWNtcha efficiency | Comments
Authimage | | 100% | Vendor site: http://www.gudlyf.com/index.php?p=376 Weaknesses: constant font, aligned glyphs, constant glyph position, constant rotation, no deformation, non-textured background, constant colours, no perturbation.
Clubic | | 100% | Weaknesses: constant font, no rotation, no deformation, aligned glyphs, constant background, weak colour variation, weak perturbation.
linuxfr.org | | 100% | Weaknesses: constant font, aligned glyphs, no rotation, no deformation, non-textured background, weak colour variation, weak perturbation.
LiveJournal | | 99% | Weaknesses: constant font, constant character position.
lmt.lv | | 98% | Weaknesses: constant font, almost aligned glyphs, no rotation, no deformation, constant background, no colour variation, weak perturbation.
Ourcolony | | 100% | Weaknesses: constant font, no rotation, no deformation, no colour variation, no perturbation.
Paypal | | 88% | Weaknesses: constant font, almost aligned glyphs, no rotation, no deformation, constant background, no colour variation, no additional perturbation.
phpBB | | 97% | Vendor site: http://www.phpbb.com/ Weaknesses: constant font, no rotation, no deformation, constant colours, weak perturbation.
SCode and derivatives | | 100% | Vendor site: http://james.seng.cc/archives/000145.html Weaknesses: at most 3 different fonts, no rotation, no deformation, weak colour variation, useless perturbation (separate colour key).
Slashdot | | 89% | Weaknesses: constant font, no deformation, constant colours, weak perturbation.
vBulletin | | 100% | Vendor site: http://www.vbulletin.com/ Weaknesses: constant font, fixed glyph position, no rotation, no deformation, almost constant colours, weak perturbation.
Xanga | | 49% | Weaknesses: fixed horizontal glyph position, no rotation, no deformation, constant colours, insufficient perturbation.

Captchas being worked on

I am working on defeating the following captchas:

Origin | Samples | Comments
Drupal | |
Trencaspammers | |
Xanga (2) | |

Other captchas and hard captchas

PWNtcha cannot currently defeat these captchas. Note, however, that this is not an acknowledgement of their effectiveness; for instance, EZ-Gimpy can be easily defeated by other projects. The Passport/Yahoo and CFXCaptcha captchas, on the other hand, are probably going to last for a long time.

Origin | Samples | Comments
20six | | Extremely weak captcha, easily removed perturbation.
Authimage (3) | | A very good captcha, but not always human-solvable.
CFXCaptcha | | A very good captcha.
Clearscreen | | Weak perturbation, but interesting use of non-alphanumeric characters.
Cwazymail | | An excellent idea, but a critically buggy implementation.
EZ-Gimpy (e.g. Yahoo! Briefcase) | | Already defeated by another project.
Hoke | | A very weak captcha.
.Mac | | A weak captcha that is not always human-solvable.
ICQ | | A pretty good captcha that uses a wide variety of backgrounds and fonts.
IMDb | | A very good captcha with a very well thought-out implementation, but a small dictionary.
MS MVPS | | Already defeated by another project.
MVN Forum | | A pretty good captcha that uses a wide variety of backgrounds and fonts.
Passport | | A very good captcha, but not always human-solvable.
Pichan | | An apparently weak captcha.
Screenname | | A pretty good captcha that uses a wide variety of backgrounds, fonts and deformations.
Yahoo! | | A very good captcha, but not always human-solvable.

Links