The Globe and Mail's Shane Dingman describes an unexpected use for the Wattpad corpus.
If you are one of the 40 million people who enjoy reading or writing the mostly romantic werewolf, superhero or historical fiction stories found on Canadian startup Wattpad, you may also be contributing to the development of the next generation of artificial intelligence.
In a new paper called Augur: Mining Human Behaviors from Fiction to Power Interactive Systems, a group of Stanford University computer science researchers revealed that they used the Wattpad “corpus” – a collection of almost two billion words (or 600,000 chapters) written by regular people – to help a computer understand the world around it. The team intends to make the program they built, Augur, into an open-source tool that other researchers can build on.
“The basic idea is that it’s very difficult to program computers to understand the broad range of things that people do,” says fourth-year PhD student Ethan Fast, co-author of the paper (published as part of the upcoming Computer Human Interaction conference) and a member of Stanford’s Human-Computer Interaction Group. “Fiction has a lot of useful things to say about the world, and if you have enough of it, you can model it in much more depth than you could hope to manually.”
Until recently, Toronto-based Wattpad, founded in 2006, didn’t make its data available to researchers, and it may not have happened in this case if it weren’t for the intervention of co-founder Ivan Yuen, who knows members of the Stanford team. More than 200 million uploads (some stories, some just chapters) have been shared on Wattpad; the majority of its users are under 30, and they spend 13 billion minutes a month on the service. So far, the company, which has 112 employees, has raised more than $66-million (U.S.) in venture capital financing.
“When we started this in 2014, we knew there was value in the corpus, but we hadn’t really explored it too much,” Wattpad’s head of engineering, Jordan Christensen, says. “As we started working with the Stanford guys, it really opened our eyes a bit and now … through our own internal research and with partners, we are really starting to change the way we think about Wattpad.”