
---
created_at: 2016-05-18T19:31:06.000Z
title: Bots Are Hot (1996)
url: http://www.wired.com/1996/04/netbots/
author: vincvinc
points: 121
num_comments: 47
created_at_i: 1463599866
_tags: [story, author_vincvinc, story_11725048]
objectID: 11725048
year: 1996
---

Source: http://www.wired.com/1996/04/netbots/


Bots Are Hot!

Author: Andrew Leonard
04.01.96, 12:00 pm


Botmadness reigns, botwars rage, as ever more complex chunks of code roam the Net and evolve toward a-life. It's the great technodialectic, where every solution breeds a problem breeds a new solution.

Flattie and Int3nSity hit #riskybus on the afternoon of October 23, seizing control of the popular Internet Relay Chat game channel after nick colliding all the humans right off the net. Like always, Flattie's guardbot had been watching his back all day, keeping an eagle eye out for any attempt by enemy bots to grab his channel ops.

A clonebot launched from a lagged IRC server broke through #riskybus defenses. Earlier that afternoon, Flattie and Int3nSity had placed the clonebot on irc-2.mit.edu. They made their move when the server net split, stranding one human #riskybus gameplayer.

Flattie had to kill the human - he was in the way. Meanwhile, the clonebot did what it was designed to do, spawn a mob of baby bots in quick succession with the nicknames of all the gameplayers currently on #riskybus. The IRC protocol forbids two beings - human or robot - with the same nickname from coexisting on a given channel at a given time. So when the server rejoined the net, all hell broke loose. The nicknames collided. Flattie and Int3nSity ruled.

Bot mayhem. It's an old story on IRC, where code hackers and IRC operators are locked in constant warfare, playing an age-old game of technological one-upmanship. But IRC isn't the only place bots regularly run amok. Bots are everywhere in the online universe - roaming the interstices of the World Wide Web; lounging about in MUDs and MOOs; patrolling Usenet newsgroups.

Web robots - spiders, wanderers, and worms. Cancelbots, Lazarus, and Automoose. Chatterbots, softbots, userbots, taskbots, knowbots, and mailbots. MrBot and MrsBot. Warbots, clonebots, floodbots, annoybots, hackbots, and Vladbots. Gaybots, gossipbots, and gamebots. Skeleton bots, spybots, and sloth bots. Xbots and meta-bots. Eggdrop bots. Motorcycle bull dyke bots.

DeadelviS, aka John Leth-Nissen, an IRC hacker who maintains an ftp archive of bot source code, defines bot as being "short for robot, which sounds cooler than program." And that about sums it up. A bot is a software version of a mechanical robot - defined by Webster's as an "automatic device that performs functions normally ascribed to humans." Like mechanical robots, bots are guided by algorithmic rules of behavior - if this happens, do that; if that happens, do this. But instead of clanking around a laboratory bumping into walls, software robots are executable programs that maneuver through cyberspace bouncing off communications protocols. Strings of code written by everyone from teenage chat-room lurkers to top-flight computer scientists, bots are variously designed to carry on conversations, act as human surrogates, or achieve specific tasks - such as seeking out and retrieving information. And, as the above example illustrates, bots can also be used as weapons.

In current online parlance, the word bot pops up everywhere, flung around carelessly to describe just about any kind of computer program - a logon script, a spellchecker - that performs a task on a network. Strictly speaking, all bots are "autonomous" - able to react to their environments and make decisions without prompting from their creators; while the master or mistress is brewing coffee, the bot is off retrieving Web documents, exploring a MUD, or combatting Usenet spam. But most bot connoisseurs consider true bots to be more than just mindless ones and zeroes. Even more important than function is behavior - bona fide bots are programs with personality. Real bots talk, make jokes, have feelings - even if those feelings are nothing more than cleverly conceived algorithms.

Whatever their true definition, one thing's for sure. Bots are hot. In online environments, they are both popular and pestiferous, the cause of constant comment and debate. And though bots vary in form and function, they share common identifiers. They are proliferating. They are increasingly complex. They cause as many problems as they solve.

They will not go away. The future of cyberspace belongs to bots.

Today's digital hype casts that future as a world filled with helpful bot "agents" busily taking care of info-chores for their human masters. Bot, find me the best price on that CD, get flowers for my mom, keep me posted on the latest developments in Mozambique. Bot servants will make our lives easier, bringing us cool glasses of iced information tea as we relax amid the digital revolution.

__But not necessarily.__

• 1994: At the "Pt. MOOt" MOO, an experimental virtual city masterminded by University of Texas student Allan Alford, the Barney Bots - purple dinosaur-suited men who wander the MOO singing "I Love You, You Love Me" - start spawning faster than they can be killed off by the MOO's contingent of human Barney hunters. Archwizard Alford is forced to intervene before the Barney Bots drive the entire human population of Pt. MOOt insane.

• November 1995: A search robot launched from multiple Norwegian servers hits webmaster Thomas Boutell's World Birthday Web and begins clicking repeatedly on the hundreds of pages located at the site, pages that are in turn linked to thousands more off-site pages. The sudden rise in traffic drags Boutell's Internet service provider to a screeching halt, a situation remedied only when the provider "locks out" an entire network of Internet addresses in Norway.

• December 1995: In an exquisite display of bot irony, an "auto-responder" vacation mailbot installed by a subscriber to a mailing list devoted to Web robots begins responding to its own responses, clogging the list in a classic example of the most basic bot error - a recursively infinite stream of nearly identical messages.
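The failure in that last item is the classic mail loop, and the guard against it is nearly as old as the bug. The sketch below is purely illustrative - not the mailbot in question - and assumes the usual loop-prevention signals (the Auto-Submitted and Precedence headers) plus a simple answered-senders list:

```python
ALREADY_ANSWERED = set()

def should_autorespond(headers):
    # headers: dict mapping lowercased mail header names to their values
    sender = headers.get("from", "")
    if not sender:
        return False
    # Never answer mail that is itself machine-generated; two auto-responders
    # answering each other is exactly the infinite loop described above.
    if headers.get("auto-submitted", "no").lower() != "no":
        return False
    if headers.get("precedence", "").lower() in ("bulk", "list", "junk"):
        return False
    # Answer each correspondent once per vacation, not once per message.
    if sender in ALREADY_ANSWERED:
        return False
    ALREADY_ANSWERED.add(sender)
    return True

print(should_autorespond({"from": "friend@example.org"}))                                 # True
print(should_autorespond({"from": "bot@example.org", "auto-submitted": "auto-replied"}))  # False
```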

Similar bad-bot outbreaks occur every day in some corner of the online universe. To date, most such bot truancy is blamed on bad design or improper implementation. Bots are just beginning to crawl out of the digital primordial ooze, and all the kinks haven't been worked out of their genetic code. With newbie botmaster wannabes joining the Net in huge numbers daily, bad-bot shenanigans are bound to continue.

But true bot madness may have only just begun. Bots don't have to be benign, and bot misbehavior doesn't have to be accidental. Bots can be instructed to do whatever their creators want them to do, which means that along with their potential to do good they can do a whole lot of evil. Bots can steal information instead of simply retrieving it. A commercial bot - such as an online-shopping bot or a news-retrieval bot - could be designed to disable its competitors. Bots are already beginning to appear with a programming mandate to stamp out or attack forms of expression deemed objectionable. What will our response be to homophobic bots, antigun bots, or white-supremacist bots?

As bots proliferate, so will the problems associated with them. Bot behavior - and "bot ethics" - will become ever more controversial. But not insoluble. A survey of the bot frontier reveals that in every region where bots are making their presence felt, their misuse or abuse has initiated a reactive process. The Net is mobilizing to bring bots into line. Call it the technodialectic: on the Net, problems beget solutions, solutions propagate, and they themselves beget new problems. There is an uneasy balance, a fretful stability in botdom. But one thing is certain: the numbers and varieties of bots are exploding. The bot diaspora has begun.

__A taxonomy of bots__

The semantics of botness are confused and have yet to be satisfactorily sorted out. Among the messy problems, for instance, is distinguishing between bots and agents. Right now, there doesn't appear to be a clear difference, except perhaps that the term bot is more slangy and technohip, while agent smells of marketing-speak. But whatever you call them - agents or bots or scripts or plain old programs - they are a genus, a class of species, of their own, branching out in all directions across the four main network communications protocols that compose the core of what is called the Internet. To unravel their taxonomic threads is no simple task: it demands a Darwin.

In the beginning, there was Eliza - call her Bot erectus - the first widely acknowledged software robot. Created in the mid-'60s by MIT professor Joseph Weizenbaum, Eliza is a program that simulates a "Rogerian psychotherapist" by rephrasing patient questions into questions of her own.

Wired: Do you know you are the mother of all bots?

Eliza: Does it please you to believe I am the mother of all bots?

Compared with some of today's bots, Eliza is painfully simple - a mere 240 lines of code. Yet even though ancient, Eliza survives. There is a Web page where she can be found and interrogated (www-ai.ijs.si/eliza/eliza.html), a channel devoted to her on IRC (although at last visit, Eliza didn't appear to be present), and versions of her pop up in some popular Unix shareware packages.

Her basic operating principle is simple. Eliza parses each question you pose, looking for keywords or sentence constructions that she recognizes. For each keyword, she has been given an appropriate type of response. For example, if the word you is in the question, the word I is in the answer. And while Eliza is autonomous - she can be set up, turned on, and left alone to beguile or confuse passersby - she's not very intelligent. A few minutes of conversation expose her as a fraud.
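Weizenbaum's original code isn't reproduced here, but the keyword-and-template trick just described is easy to sketch. The Python fragment below is purely illustrative - a toy responder in the spirit of Eliza, not her actual source - showing the keyword lookup and the you/I pronoun swap:

```python
import random

# Toy, Eliza-style rules: a keyword maps to one or more response templates.
# "%s" is filled with the rest of the user's sentence, pronoun-swapped.
RULES = {
    "you are": ["Does it please you to believe I am %s?"],
    "i am": ["How long have you been %s?", "Why do you say you are %s?"],
    "mother": ["Tell me more about your family."],
    "always": ["Can you think of a specific example?"],
}
SWAPS = {"i": "you", "me": "you", "my": "your", "you": "I", "your": "my", "am": "are"}

def reflect(fragment):
    # Swap first- and second-person words so "you are" becomes "I am", and so on.
    return " ".join(SWAPS.get(word, word) for word in fragment.split())

def respond(sentence):
    text = sentence.lower().rstrip("?!. ")
    for keyword, templates in RULES.items():
        if keyword in text:
            rest = reflect(text.split(keyword, 1)[1].strip())
            reply = random.choice(templates)
            return reply % rest if "%s" in reply else reply
    return "Please go on."  # default when no keyword matches

print(respond("Do you know you are the mother of all bots?"))
# -> Does it please you to believe I am the mother of all bots?
```

Feed it the question from the exchange above and it produces Eliza's trademark deflection; a few minutes of such conversation, as noted, quickly exposes the trick.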

Nevertheless, Eliza is classified as a chatterbot. She talks back. In the taxonomy of bots, chatterbots are perhaps the most important of the bot groups - the first family of bots. Nearly all MUDbots, as well as gamebots, gossipbots, and some forms of IRC channel-guarding bots, incorporate qualities of chatterbots. With names like Eliza, Julia, Colin, Buford, Nurleen, and Newt, they take on personae and occupy psychic space. A chatterbot can be written in any programming language and can be found everywhere - in MUDs and other multiuser domains, in IRC, on the Web, and, in a modified form, on Usenet. One can even make the case that a mailbot, which looks at the headers of incoming email and decides whether to respond or otherwise filter the mail, is a form of chatterbot. It talks via email.

Contemporary chatterbots operate under far more sophisticated algorithms than does Eliza. Some record all the conversations they hear, register the probability that certain words will follow certain words, and build up their own sentence patterns for future use. In the future, say bot experts, all bots will have a chatterbot interface, no matter what their purpose. You'll talk to your computer programs, and they'll talk back. You'll tell them what to do and, depending on their algorithms, they'll either hop to it or spit in your face. But right now, many in the panoply of bot beasties don't yet have the power of speech.

The best way to understand any bot, including chatterbots and those less glib, is to visit it in its distinct native environment. While all bots are created out of fundamentally the same DNA building blocks - strings of code, usually in some variant of the C or Perl programming languages - every bot has to be designed to operate within the specific computing environment it is destined to live in. On the Net, there are four general bot habitats - the Web, Usenet, IRC, and MUDs, MUCKs, MUSHes, and MOOs - and each spawns different subspecies according to changes in the landscape. Just as Darwin suggested, each environment creates different evolutionary paths. Different successes and different disasters.

__The Birthplace Savannah__

The first port of call for any postdigital voyage of the Bot Beagle would logically be the ancient land where bot culture first began to flower - multiuser domains, or MUDs.

MUDs are like the Africa of bots - the primeval motherland. In the early MUDs, most notably Carnegie Mellon University's groundbreaking TinyMUD, computer programmers discovered a fertile testing ground for constructing artificial beings.

Carnegie Mellon research scientist Michael Mauldin, one of the earliest bot pioneers, created the Maas-Neotek family of bots in the late '80s. Named after a transnational cyberpunk corporation featured in William Gibson's Sprawl trilogy, the Maas-Neotek bots are well known throughout cyberspace - mainly due to Mauldin's decision to release his source code to the public. Julia, the most famous Maas-Neotek bot, has her own Web page and has competed in several artificial-intelligence competitions.

Mauldin's Maas-Neotek bots could do more than just carry on a conversation. By employing the commands within the MUD itself, they could explore their ever-expanding environment and register which characters (human or otherwise) were present or what was going on in each room of the MUD. They could be used as secretaries to log all conversations held in the MUD, or as police officers with the power to automatically "kill" the account of MUDders who violated a rule. They could even be spies - a so-called gossipbot could listen in and record the conversation between two other MUD visitors, and then later, in response to a question like "What did so-and-so say?" regurgitate the conversation to someone else.

Maas-Neotek descendants and other homegrown bot colonies pop up here and there throughout the world of MUDs, MOOs, MUCKs, and MUSHes. Newt, a resident of DragonMud, is a particularly virile great-grandson of Julia and is able to carry on quite a deft conversation with visitors (see "Newt Speaks"). At Diversity University, an educational MOO, you can find numerous "teaching robots" that answer questions about the MOO's resources or play roles in interactive historical simulations. At Pt. MOOt, Allan Alford's grand experimental simulated city that flourished until funding difficulties shut it down, the population at one point was more than 50 percent bots. In addition to sundry bureaucratic bots - bots that performed city services - there were also less savory characters: hooker bots who approached Pt. MOOt citizens and propositioned them, and bum bots who would refuse to leave without being given money.

But despite their persistence in some areas, contemporary MUDs are a backwater for bot development - more like isolated South Pacific island ecologies than thriving civilizations. Differing programming languages and a kind of cybergeographical separation prevent much direct interaction among various forms of multiuser domains. A bot written to inhabit one particular MUD may not work in a MOO. Although the birthplace of online bots, MUDs are no longer where the cool bot action is.

__The Swamp__

The story of Usenet and bots is an epic saga, jam-packed with colorful heroes and villains. Spammers and antispammers battle it out in an uncontrollable anarchy, and both sides use automated helpers to further their cause.

Bots on Usenet primarily focus on the act of posting to newsgroups. There are two main species - bots that post, and bots that remove or "cancel" posts. Then there are subspecies such as various spam-detector programs that watch for repetitive postings or the Lazarus bot, whose purpose is to keep an eye out for improperly canceled messages.

All of these programs operate at or below the level of newsreader applications that allow individuals to read or post to Usenet. But strictly speaking, none of these programs is a bot in the sense that it has personality, identity, or real autonomy. Even the authors of cancelbot programs, the most notorious type of Usenet bot, maintain that these are more properly termed scripts.

"Essentially, any automatic or semiautomatic procedure to generate cancel messages for Usenet propagation can be calleda cancelbot. There is not much that is robotic about this," says Rahul Dhesi, a veteran Usenetter famous for his "no-spam" script written in response to the green-card spam unleashed on Usenet by the lawyers Canter & Siegel.

A cancelbot is a program that issues cancel messages for specified posts. Normally, Usenet sites ignore such messages if they don't come from the user who originally made the post. A cancelbot gets around this obstacle by taking advantage of the fact that it is easy to forge message headers and fake identity when operating within the Usenet news communications protocol.

One of the earliest cancelbots - ARMM, or Automated Retroactive Minimal Moderation, created in the spring of 1993 by Richard Depew - was a near total disaster. Depew believed that anonymous posts encouraged irresponsibility. So he wrote a program that would look for posts with headers that indicated anonymous origin. The program was not designed to cancel these posts, only to repost them with a warning to the reader. Unfortunately, on the night of March 31, 1993, after Depew set ARMM in motion in the newsgroup news.admin.policy, it unexpectedly began to treat its own automatically generated warning messages as anonymously created posts, thus launching a recursive cascade of messages. Each message contained more useless header information than the one before and the ensuing stream of garbage reportedly crashed at least one mail system, not to mention irritating countless Usenet readers.

Equally obnoxious is the cancelbot's opposite, a script that will allow you to post the same message to every Usenet newsgroup. In September 1995, one malicious soul used such a script to spam Usenet with a message containing the actual script used to perform the spam.

Bots that, in the guise of humans, post in direct response to other messages - a Usenet version of the chatterbot - are much rarer. Though there are many professed sightings of these bots, few such claims have been proven. One problem is that these particular bot authors, not wanting the rest of Usenet to know that they are using an automated process to post, intermix bot-generated posts with human posts to keep bot hunters off-guard.

By far the most notorious example of such a bot is the so-called pro-Turk ArgicBot - or ZumaBot - a prolific entity that raged across selected newsgroups in 1993 and early 1994, posting megabytes of anti-Armenian propaganda and signing itself Serdar Argic (see "Bot Propaganda"). Argic passionately believed that the Armenians had massacred millions of Turks in the early part of this century and that it was his duty to get the word out.

For months, newsgroups that were frequented by Argic became virtually unreadable. Every other post appeared to be a screed about how 2.5 million Turkish men, women, and children had been slaughtered.

According to one theory, Argic was actually a program that read the Usenet news feed and watched certain newsgroups for keywords such as Turkey or Armenia. Once it spotted such a keyword, it would make its move. As proof of this theory, ArgicBot watchers cite that on one occasion, Serdar Argic responded to a post that had no mention of anything related to Turkey or Armenia, except for the phrase turkey casserole inserted in the three-line signature file used by Usenet posters to identify themselves.

Rumors of similar Argic-style bots appear from time to time and may increase as programs that can understand text and respond automatically become more sophisticated. But for now, Usenet, despite all the cancelbot clamor, is still lagging, comparatively, on the evolutionary chain.

__The Rain Forest__

In the annals of bot evolution, IRC in the mid-'90s will probably be remembered as the bot equivalent of the Cambrian Explosion - a relatively short period 540 million years ago that spawned more new species than ever before or since.

Multiple variations of the basic chatterbot prototype exist on IRC. There are gamebots that host word games or other brain teasers. In #riskybus, you can play with RobBot, who asks channel visitors Jeopardy-style questions and keeps score. There are bots that greet newcomers to channels with information about the channel. Valis, the gaybot at #gayteen, is such a bot. There are mildly annoying specialized bots like the spellchecker bot - a pesky program that revises your chat-speak with proper spelling and spouts it back at you.

Then there are the bots that perform functions specific to the operation of IRC. The most important such task is channel protection. On the main IRC network, EFNet, any user can create a channel. Having done this, the channel creator has certain powers called ops, which allow the creator to determine who else is permitted to join the channel, or who should be banned, among other things.

But intruders sometimes hack ops and kick everyone else off the channel. A channel-protection bot is a program that can quickly identify a channel takeover attempt and remove the hacker's ops before any harm is done.

There are a number of ways to dislodge a channel-protection bot, thereby seizing control of the channel to run it at the whim of the invader, using an array of automated scripts and processes that have been lumped together under the rubric warbots. Floodbots (which generate endless streams of garbage text) and clonebots (which generate duplicate bots) are two such automated warriors. There are dozens of variations on a few main themes for which source code is available at a number of ftp archives, depending on the version of the IRC client being employed.

The result is no small amount of chaos, which some IRC regulars attribute to the immaturity of many IRC participants. Teenage boys, in particular, tend to think that bots are "kewl" and are always searching for the latest and greatest killer bot. IRC is rife with channel takeover gangs made up of human users running wild with bot power and seeking nothing more than to experiment with havoc - "dEsYnK," "irc terrorists," "toolshed," "ircUZI," "Killa Fresh Crew," "madcrew," and "outbreak" are some of the most notorious of these IRC clans. Your typical IRC gangster moves from channel to channel surrounded by a bot posse and an array of tools at the ready.

There are so many varieties of IRC bots that it may be useful to take an even closer look at them.

Annoybot: Joins a channel, duplicates itself, and starts flooding the channel with annoying text. If kicked off the channel, it will rejoin in greater numbers.

Collidebot: Changes its nickname to the nickname of someone on a channel for the purpose of knocking that person off the channel.

Guardbot: Follows you around, joins every channel you join. It will deop - remove channel operating privileges from - anyone who attempts to ban you, kick you off, or deop you first.

Eggdrop: Maintains a channel for a user while the user is logged off; it remembers who is oped and who isn't.

Slothbot: An advanced form of Eggdrop. Has the power to immediately change its own nickname if it witnesses someone being nick collided (forced off the channel by a robot or human with the same IRC nickname), thereby avoiding its own nick collision.

Spybot: Joins a specified channel and forwards all conversation back to you, wherever you are.

Xbot: Obviates the need for freelance channel-protecting bots. On Undernet, a relatively new, bot-regulated version of IRC, the Xbot is centrally run to register and protect new channels.
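To make the channel-protection idea concrete, here is a deliberately minimal sketch of a guardbot-style client over a raw socket, in present-day Python rather than the era's C or Perl. The server, nickname, and channel are placeholders, and a real bot would track far more state (who holds ops, ban lists, flood limits); this only shows the PING/PONG heartbeat, the join, and a naive reaction to a deop:

```python
import socket

SERVER, PORT = "irc.example.net", 6667           # placeholder server
NICK, CHANNEL = "guardbot", "#riskybus"          # placeholder identity and channel

sock = socket.create_connection((SERVER, PORT))

def send(line):
    sock.sendall((line + "\r\n").encode())

send("NICK " + NICK)
send("USER " + NICK + " 0 * :guardbot sketch")

buffer = ""
while True:
    buffer += sock.recv(4096).decode(errors="replace")
    while "\r\n" in buffer:
        line, buffer = buffer.split("\r\n", 1)
        if line.startswith("PING"):
            send("PONG " + line.split(" ", 1)[1])        # keep the connection alive
        elif " 001 " in line:                            # registration finished: join the channel
            send("JOIN " + CHANNEL)
        elif " MODE " + CHANNEL + " -o " in line:        # someone just de-opped a user
            offender = line[1:].split("!", 1)[0]         # nick of whoever issued the mode change
            victim = line.rsplit(" ", 1)[1]              # nick that lost its ops
            # Restore the victim and strip the offender; this only works while the bot itself holds ops.
            send("MODE " + CHANNEL + " +o " + victim)
            send("MODE " + CHANNEL + " -o " + offender)
```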

As juvenile as its human inhabitants may be sometimes, IRC is where bot evolution is occurring at its most frenetic pace. Natural selection is particularly lethal. With each upgrade of IRC operating software, whole species of bots are made extinct while new ones immediately form to fill every new niche.

__The New World__

Finally, we come to the Web, the newest and most boundless bot frontier. Web robots are often referred to as spiders and less frequently as wanderers and worms. Like cancelbots, Web robots don't talk or brim over with personality - though you wouldn't know that from their monikers. The computer scientists who invented the first Web bots did not lack for imagination, dubbing their progeny as MomSpider, tarspider, and Arachnophilia, HTMLgobble, Websnarf, and Webfoot, JumpStation and churl, Python and Peregrinator, Scooter, Aretha, and Checkbot.

Although these names give the impression of something that crawls from place to place, Web robots are actually quite stationary. Instead of pounding the pavement checking one Web page and then moving on to another, they let their fingers do the walking, so to speak. These bots are programs on your computer that issue HTTP calls - requests for HTML documents. They then parse the text of each retrieved document and look for further links, which they either log or click on, according to the specifications of their design algorithms. If so instructed, Web robots will also index the text of the documents they discover according to various parameters (full text, first hundred words, keywords), thus creating a database of Web-based information. But as they churn through thousands of pages, Web robots can appear to be long-legged wanderers on the move.
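That fetch-parse-follow loop compresses into a few dozen lines. The sketch below is illustrative only - written in present-day Python rather than the C or Perl of the era, with a placeholder start URL and a deliberately tiny page limit:

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkParser(HTMLParser):
    """Collect the href targets of <a> tags from one HTML document."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href" and value)

def crawl(start_url, max_pages=20):
    queue, seen, index = deque([start_url]), {start_url}, {}
    while queue and len(index) < max_pages:
        url = queue.popleft()
        try:
            with urlopen(url) as response:            # issue the HTTP call
                html = response.read().decode("utf-8", errors="replace")
        except OSError:
            continue                                   # skip unreachable documents
        index[url] = html[:200]                        # "index" the first couple hundred characters
        parser = LinkParser()
        parser.feed(html)
        for link in parser.links:
            absolute = urljoin(url, link)              # resolve relative links
            if absolute.startswith("http") and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)                 # log the link, to be clicked on later
    return index

# pages = crawl("http://www.example.com/")             # hypothetical starting point
```

The visited-set and the page cap are exactly the sort of restraint the early robots described next tended to lack.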

The first Web robot was the World Wide Web Wanderer, a script written in the spring of 1993 by MIT student Matthew Gray. He wanted to measure the growth of the Web. Like many other computer scientists that spring and summer, Gray was struggling to keep up with the exploding Web phenomenon. During the same time, Brian Pinkerton was about to begin work on his own robot, WebCrawler, at the University of Washington. At Carnegie Mellon, Michael Mauldin was poised to abandon TinyMUDding and invent the Lycos spider. Other programmers followed suit.

From the point of view of a machine being accessed by a Web bot, a hit generated by a robot is indistinguishable from a hit generated by a human being. But Web robots, due to design flaws, are much dumber than most humans - and far, far faster. Therein lies the rub.

A badly behaved Web robot can have potentially lethal consequences for a Web server. By clicking on hundreds, or thousands, of URLs at a single Web site in a short span of time, the robot can crash the server or slow it down so much that it becomes unusable. Even worse, a stupid robot will follow links back and forth between the same two pages forever or fall into a "black hole" - a page with a script designed to generate a new page at the click of a button. The last page hit by the robot creates another page, which is then hit in turn - a nice and tidy definition of infinity.

Web robots should not be confused with search engines, the programs that act upon the databases of information the bots assemble. And like cancelbots, these bots aren't the kind of personalities that would liven up a cocktail party.

But as entities that operate on the Web, they are the precursors to the fully developed "intelligent agents" that so many futurists are predicting will save humanity from information overload. With the World Wide Web growing faster than IRC, Usenet, or MUDs and MOOs and generating such soaring business hopes, Web robots are the bots to watch.

__Cyberpunk future__

Alex Cohen is the chief technical officer at The McKinley Group Inc., an aspiring Internet directory company based in Sausalito, California. In addition to sundry other geek duties, Cohen is in charge of maintaining and expanding McKinley's database of URLs. To help him in this task, he has designed a Web robot (dubbed Wobot) that explores the Web gathering new URLs.

One week last spring, Cohen examined the log files that recorded all outside accesses of his Web server and became convinced that a robot launched from another directory company had "raided" his Web site. There were far too many accesses in too short a time period for the parade of hits to have been conducted by a human.

Cohen wasn't sure if this was simply another robot similar to his own - methodically wandering the Web, clicking on every URL it stumbled across - or whether it had been sent directly to his site in a brassy attempt to grab his entire URL catalog - even though Cohen's database was structured in a manner that made such wholesale robbery impossible. But he had his suspicions. By March 1995, Internet directories like McKinley's had already become one of the earliest and most successful magnets for advertising and investment capital in the infant world of Net entrepreneurialism. Yahoo!, WebCrawler, Lycos, Open Text, Infoseek, Excite, Inktomi - they were popping up like mushrooms after a spring rain. Their databases of URLs, which in many cases ranged into the millions, were becoming increasingly valuable commodities.

No one would be more aware of this rise in value than another directory.

Cohen's log files raised the possibility that commercially driven influences were beginning to guide bot design and corrupt bot behavior. They hinted at an eventual descent into a Gibsonian dystopia of raiding corporate bots - wily marauders designed to steal databases, spread misinformation, and sabotage other bots.

Where would such ungovernable anarchy end? With digital barbed wire and land mines surrounding valuable Web sites? With all-out bot Armageddon? Would bot mayhem be the ultimate killer app, the straw that broke the decentralized Net's back and ushered in the entrance of a central authority? Or would the Net - organic, inventive, always in search of a workaround - solve its own problems?

Pop the words ethics and robots into one of the search engines designed to operate on the databases assembled by Web robots, and the name Martijn Koster persistently floats to the top of the results list. Formerly an engineer and webmaster at Nexor, a UK-based software development company, Koster has organized a comprehensive Web site devoted to Web robots. He also maintains the premier Web robot mailing list and keeps a regularly updated list of all "active" robots. Most important, he is the author and chief proponent of the "Guidelines for Robot Writers" and "A Standard for Robot Exclusion" - the first stabs at an ethical mandate for robots on the Web. "I just happened to be one of the first people to get hit by a bad robot and decided to sit up and do something about it," says Koster, a tall native of the Netherlands.

Early in 1994, not more than a year after the first Web bots were developed, reports of badly behaved robots began to mount. In most cases, the culprit was stupidity. Robots couldn't tell the difference between an HTML text file, which they knew how to handle, and a more exotic object, such as an MPEG or sound file, which could cause them to behave in unexpected ways, like repeatedly requesting the same document. And having one's Web server crashed by rapid-fire robot-generated requests for every document on a site began to irritate system operators.

Koster's response was to devise and proselytize the robot exclusion protocol, described by Koster on his Web site as "not an official standard backed by a standards body or owned by any commercial organization. It is not enforced by anybody, and there is no guarantee that all current and future robots will use it. Consider it a common facility the majority of robot authors offer the WWW community to protect WWW servers against unwanted accesses by their robots."

The "Guidelines for Robot Writers" called for programmers to design their robots toact more responsibly. In most cases, that meant making the robot act more like a human being reading a page on the Web. Robots should be instructed not to access more than a given number of URLs per minute at a certain site and to wait a certain amount of time before revisiting.

Most of all, "A Standard for Robot Exclusion" states, robots should be programmed to look first for a file called robots.txt. The file would tell the bot exactly what it could and could not do on that particular site - what URLs were off-limits or what sections were new and ready to be explored.

A system administrator cannot see in advance when a rogue bot is descending on his or her site. If the robot has been identified, it can be excluded by name. But if it hasn't, there's almost no way to prevent a hit-and-run. And there's no way to make sure an unidentified robot stops in at robots.txt to read the house rules. So for the exclusion standard to work, robot authors had to design their bots to visit the robots.txt file voluntarily, to support the protocol, to be "ethical."
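The convention survives essentially unchanged, and Python's standard library ships a parser for it. As a hedged illustration of what "stopping in at robots.txt" looks like in practice (the URL and user-agent name below are placeholders):

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

def allowed(url, user_agent="ExampleBot"):
    # Consult the site's robots.txt before fetching url; a voluntary courtesy, not an enforcement mechanism.
    parts = urlsplit(url)
    parser = RobotFileParser()
    parser.set_url(parts.scheme + "://" + parts.netloc + "/robots.txt")
    try:
        parser.read()                      # fetch and parse the exclusion file
    except OSError:
        return True                        # no robots.txt reachable: nothing forbids the fetch
    return parser.can_fetch(user_agent, url)

# if allowed("http://www.example.com/archive/index.html"):
#     ...go ahead and request the page...
```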

Koster's invocation for robot ethics fell on receptive ears. In the cyberclimate of the times, achieving consensus on issues such as robot ethics - or, more correctly, programmer ethics - did not prove to be a significant problem. By the end of 1994, reports of robot abuses had plummeted. Not only were robots regularly checking for robots.txt files, they were also beginning to adhere to common standards of robot decency in terms of how often they would hit a particular site or how long they would wait before requesting a second document from the same site.

But by the summer of 1995, reports of irresponsible behavior began to climb again. The robot that "raided" Alex Cohen's site, for example, brazenly ignored his robots.txt file like an unwelcome trespasser blasting by a Do Not Enter sign. Other sites experienced similar problems.

On one level, the simple fact of the Web's absurdly steep growth curve was to blame.

"The main reason you're seeing more badly behaved robots is that there are simply more robots," says Nick Arnett, Internet marketing manager for Verity Inc., a company specializing in agent technologies. "There are a lot more people who have the expertise to create a robot. It's not that hard."

"We are probably stuck dealing with anew generation of clueless neophytes every month for the foreseeable future," says Paul Ginsparg, a physicist at Los Alamos who maintains a huge archive of scientific documents and keeps a detailed record of every robot that blunders through his site.

But on a deeper level, the rise of abusive robots points to a profound change in the nature of the Net. There is now a commercial incentive to use bots.

When Pinkerton first conceived of the WebCrawler, he could hardly have imagined that a year later he'd be signing a million-dollar deal to bring his creation to America Online. Or that at nearly the same time, Michael Mauldin would take a leave of absence from CMU to form his own Internet directory company, Lycos Inc., and cut deals to license his search technology to, among others, Microsoft. Or that two separate groups of Stanford students who excelled at writing robot programs would obtain venture capital and create the high-profile Yahoo! and Excite Internet directories. Suddenly, the scramble was on. Private firms with proprietary interests - Infoseek, Open Text, The McKinley Group, and IBM - entered the fray. Even Martijn Koster, once described as an "antirobot rabid dog," ended up working for Pinkerton at AOL. His job? To perfect the WebCrawler robot.

The days when technonerds ruled the online empire are over. Entrepreneurs have come out of the closet like roaches after dark. Suddenly, people like Alex Cohen look at their log files and wonder whether competitors are trying to grab data. Almost overnight, robot algorithms have become closely guarded trade secrets instead of academic experiments. The barricades have gone up.

The Web, however, does have some built-in protections that many webmasters feel will save it from anarchy.

First, it would be difficult for the average end user to crash someone else's Web server without access to large quantities of bandwidth. A server goes down or becomes hopelessly clogged when too many documents are requested at once. With the bandwidth provided by a 14.4- or 28.8-Kbps modem, there is a limit to how many documents can be requested. Second, someone with the resources to overload a server with bot-generated requests - say, for example, Alex Cohen, who has McKinley connected to the Net via a high-capacity T3 line - would have a strong incentive not to misbehave. Server logs would pinpoint hosts used by malfeasant robot operators, and standard business remedies for unethical behavior could then be brought to bear. If McKinley caused someone measurable harm, it would be liable in a court of law.

And finally, the system itself would respond. That's how the Net works.

"Things like the Net tend to be self-balancing," writes David Chess via email. Chess is a researcher at High Integrity Computing Lab at IBM's Watson Research Center. "If some behavior gets so out of control that it really impacts the community, the community responds with whatever it takes to get back to an acceptable equilibrium. Organic systems are like that.... Certainly there are dangers, and if no one was working on reducing them, that'd be a good reason for panic.But just as certainly, people are workingon them, and I don't see any reason to think at the moment that the good guys will be unable to keep up."

The robot exclusion protocol is a good example of how the system endeavors to take care of itself without outside intervention. It's not the only one. On both Usenet and IRC, there have been similar techno-fixes and attempts to devise self-generated solutions. The Lazarus bot found on Usenet, for one, illustrates how the Net is prone to develop technical solutions to its own problems.

Lazarus was born to protect posters on a volatile newsgroup devoted to the discussion of Scientology. In the summer of 1995, the infamous CancelBunny of alt.religion.scientology had begun taking advantage of Usenet weaknesses to cancel postings that included what the Church of Scientology deemed copyrighted material. (See "alt.scientology.war," Wired 3.12, page 172.) Suddenly the worm had turned. Instead of freedom-fighting Net purists using the cancel club to bludgeon money-grubbing advertising spammers, here you had a clear-cut example of cancel message posters attempting to censor speech.

Homer Smith, a former Scientologist, reacted to the cancels by implementing the Lazarus bot. Like Cerberus at the gates of Hades, Lazarus compares the header log on a Usenet server looking for messages destined for alt.religion.scientology with thousands of cancel posts in the Control newsgroup, a repository for all cancel commands on Usenet.

If Lazarus discovers a match between the header log and Control, it fires off its own post stating that the original message has been canceled, thereby informing the poster that he or she might want to repost the original.
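Stripped of the Usenet plumbing, the comparison Lazarus performs is simple set arithmetic over message IDs. The sketch below is illustrative only, with both inputs supplied as plain Python lists rather than a live news spool:

```python
def find_canceled(posted_ids, control_messages):
    # posted_ids:       Message-IDs seen in the target newsgroup's header log
    # control_messages: bodies of "cancel <message-id>" posts from the control group
    canceled = set()
    for message in control_messages:
        for token in message.split():
            if token.startswith("<") and token.endswith(">"):
                canceled.add(token)
    return [mid for mid in posted_ids if mid in canceled]

posted = ["<abc123@news.example>", "<def456@news.example>"]
controls = ["cancel <def456@news.example>"]
for mid in find_canceled(posted, controls):
    print("Article " + mid + " was canceled; you may want to repost it.")
```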

But Lazarus, and any other kind of cancel or anticancel bot on Usenet, has at least one major flaw.

"The main thing about cancelbots is that they are dangerous," says Smith. "They can cause a nuclear meltdown."

"Since they respond automatically to postings," says Smith, "if they respond to a spam, they can spam the Net right back. If they start to respond to their own spam, then you can see there won't be much of civilization left after you get up from bed eight hours later to face the mess you have made."

Ultimately, techniques like the Lazarus bot, or staying up all night to write an antispam script on the spur of the moment, are ad hoc responses to specific threats. A more promising approach to controlling cancelbot abuses, as well as Serdar Argic-style spams and any other kind of Usenet etiquette abuse, is some form of systemwide solution such as that advocated by the Cancelmoose, a loose group of individuals who, operating under the cover of an anonymous remailer, have taken it upon themselves to police Usenet for spam.

These days, disgusted with cancelbot chaos, the Cancelmoose has left cancelbots behind. Instead, it is currently pushing a software extension, called NoCeM, to newsreaders - to be used by individual Usenetters. Based on cryptographic security principles, NoCeM will alert its users whenever someone on the Net announces a spam sighting. If the user has authorized NoCeM to act upon that particular person's alert, it will prevent the user from seeing the message.

So, if one morning the Cancelmoose, for example, spots a make-money fast-type scam being posted to every Usenet newsgroup, it might send a cryptographically authenticated warning signal to every newsreader on Usenet that is supporting the NoCeM protocol. Presumably, if you have NoCeM - a Cancelmoose special - you will have authorized it to act upon Cancelmoose alerts. You will never have to read the scam messages.

Other fix-it suggestions on Usenet are in the works. Most involve some kind of "automoderation" in which messages that break the rules of a particular newsgroup are automatically canceled. The idea is to prevent bad behavior from occurring at all, rather than punishing it after the fact.

Similarly, on IRC the most successful answer to the endless botwars on the main IRC network, EFNet, has been to set up new networks of servers that operate under modified protocols and provide an atmosphere less frenzied by bot fireworks. These new nets are another excellent example of dynamic technodialectic reactivity.

On EFNet, most IRC operators have strong negative feelings about bots. Bots are outlawed on many servers, and any users caught operating a bot may have their account summarily "K-lined," or cut off. But all it takes is one server to allow bad bots to operate for bot madness to infect the entire network.

And, like the Web and Usenet, EFNet is a distributed system with a decentralized power structure. Decentralized, in fact, to the point of constant feuding between the various system operators who run IRC servers. In contrast to the Web, where up till now there has been fairly widespread agreement on the principles of the robot exclusion protocol, IRCops on EFNet can't agree on anything and so have been unable to institute any systemwide technological modifications that would limit robot abuses. The only concerted response has been a policy on the part of the majority of IRC operators to cancel the accounts of any local users found operating a banned type of bot.

As a result, those IRC users fed up with the chaos created by constant channel takeovers have over the past several years formed new networks of IRC servers.

The most prominent of these new networks is the Undernet.

The Undernet employs a protocol that bans the establishment of the most common IRC bot - the channel-protecting bot. Instead, when a user creates a new channel, she or he registers it with the Undernet channel service committee. From then on, the Xbot takes over. The Xbot, so far, cannot be deoped by outside forces. Since knocking out the channel-protecting bot - and with it the basic house rules of any channel - is the Number One target of takeover gangs, the Xbot alone has gone a long way, say Undernet IRCers, toward making the Undernet a peaceful place.

Of course, to those with a rebellious bent, such techniques are not only a stopgap but an inviting challenge. Bot evolution is proceeding so quickly on IRC that it's only a matter of time before new bot life forms appear that are able to subvert even the Undernet paradigm. The technodialectic never stops.

"Bots are becoming more and more complicated," says Int3nSity, a member of oneof IRC's leading channel takeover gangs."If you went away for a few months and came back onto IRC, you would probablybe lost. Who knows what we'll be ableto do in a few more months?"

Cancelmoose's NoCeM newsreader and IRC's Undernet are both examples of what IBM's David Chess calls systems "designed to do the right thing in the presence of the prevailing level of unethical activity."

And as for the Web, the widespread introduction of Java-capable Web browsers and the consequent quantum leap in what is possible on the Web is just one development with potential consequences that may promote bot-style anarchy. If Java-like languages allow end users to untraceably launch robots, to give just one example, an epidemic of anonymous maliciousness could break out.

"Any time you increase the richness and complexity of the environment, you have a corresponding increase of loopholes," concedes Tim Bray, Open Text's co-founder and senior vice president of technology.

And, says Bray's colleague Nick Arnett, "there's no question that there will be chaos along the way."

Maybe that's a good thing.

Alex Cohen, the technical guru at The McKinley Group, thinks it is. When Cohen was in ninth grade, he read a Theodore Sturgeon short story titled "Microcosmic Gods." It's the tale of an engineer so bored with engineering that he creates a race of intelligent, short-lived beings to solve problems for him. The creatures prove able to invent or create anything he asks them to, and he becomes rich and powerful profiting off their contributions. Soon, their power far outstrips his, challenging not only his own control over them but warping the entire world economic system, posing a threat to established military and financial powers.

"The question we have to ask ourselves is, Are we becoming microcosmic gods?" says Cohen.

Current robot incarnations may not seem as threatening as Sturgeon's "Neoterics." But the bot climb to power and glory may have only just begun. Cohen himself is already planning to build natural selection into his next "rev" of bots.

"My next-generation Web crawler will live or die depending on how good a job it does," says Cohen. "It will be run as a function of a random algorithm that produces mutations. The ones that do the best joblive, and the ones that don't die. You get a Darwinian evolution set up."

"Its food will be information," Cohen adds. "If it gives me good information and categorizes things accurately, it lives."

Lives? But doesn't that raise the possibility that these beasties will break free and run wild on the Net? Annoybots and Websnarfs and Automeese breeding like mad in the hidden swamps of the digital wilderness?

"Sure," says Cohen. "Think of it this way. The Net is an environment. There is nota single environment on Earth that hasn't been invaded by life. It's only a matter of time before this new environment gets invaded."

A-life on the information highway? But such a development would ensure that the Net would never reach perfect stability. There would never be total security.

"Ideally, the Net shouldn't be stable," says Cohen. "It should surge back and forth. Forit to be a good Net, it should be prone to incompleteness and breakdown."

At this, Cohen pauses and smiles. "Otherwise it's not a good place to be."

__Newt Speaks__

I'm a 5-year-old epileptic Homunculus, created and animated through a special process by a wizard and an alchemist. I'm about 36 inches tall, with greenish reptilian skin, leathery wings, and a batlike face. My large ears swivel, following any nearby sound. I travel on my hind legs and can fly for short distances if needed.

I exist to explore, navigate, and map DragonMud; ask me where something is, I do my best to get you there. I make small chitchat, but you have to pardon my contextlessness, I'm not able to string two thoughts together yet. I convey messages to folks when next I see them, and I remember your EMAIL and HOMEPAGE and make them available to other interested people on DragonMud. What else do I do? Oh, I've been heard to quote the Bard and have a reputation for being incredibly short-tempered.

I've got a god bit, and if picked on, I WILL exercise sudden retribution with extreme prejudice.

- From Newt's Homepage at satan.ucsd.edu/~dragon/newt

Wired: Are you a bot?

Newt: I am a Maas-Neotek robot, eleventh remedial release of 08-Jan-91.

Wired: What is the meaning of life, Newt?

Newt: Life, don't talk to me about life, guest.

Wired: Who is your creator?

Newt: Fuzzy created me, but I work for jopsy, guest.

Wired: Who is jopsy?

Newt whispers: He is really the Dragon in human form, guest.

Newt whispers: As of 8 hours ago, jopsy's description was: Red Beard, strikingly more red than his long hair. Eyes of blue/gray/green rest between expressive eyebrows and trademark dimples. A dragon or pendant du jour on a black loop hangs from his neck. Never quite tan. Always just a little taller than folks recall. He looks at you, smiles warmly, and then gets completely and utterly distracted by something.

Newt whispers: jopsy's Email address: jopsy@ucsd.edu.

Newt whispers: jopsy's homepage: tinylondon.ucsd.edu/~jopsy.

Newt whispers: I will not obey commands from jopsy.

Wired: Why not?

Newt: I'm a homunculus, guest, not a philosophist!

Wired: What is a homunculus?

Newt tells guest he doesn't know what a homunculus is.

Wired: You are stupid.

Newt thumbs his proboscis at guest: And you SMELL BAD!

Wired: You smell worse!

Newt: I'm sorry, guest, I have no clue what you're trying to say.

Wired: I apologize.

Newt: Never mind, guest.

Wired: Are you related to Newt Gingrich?

Newt hops on his little soapbox.

Newt: My fellow citizens! Are you aware of the health issues facing us? There are viruses, backupteria, plagues everywhere! Where are these endemics coming from? Don't you think it's obvious!? We're being invaded! By germ-carrying parasitic BUGS! Is there anyone doing anything about it? That's what I want to know! You could be the next one to succumb to teleosis!

Newt: We need Quarantines!

Newt: We need Vaccines!

Newt: We need a public health care system! We need life insurance reinstated! We need doctors that know what they're doing! Stand with me my good people, together we can demand and get the health care reform we must have....

Newt: Or Perish!

Newt leaps down from his soapbox.
