User-generated content is rife with risk and opportunity.
The opportunity to deliver remarkable images is made clear on an almost daily basis, whether in the midst of a crisis like the Boston Marathon bombings or Hurricane Sandy, or when someone snaps a notable shot at a local event.
The risk is that images are easily faked, scraped and manipulated.
News organizations and others seeking to source images and information from the crowd therefore have no choice but to push forward with new methods of verification — and to make existing methods quicker and more accurate. So it’s no surprise that we’re seeing initial moves towards automating aspects of the verification process.
The Guardian and Scoopshot both recently unveiled new initiatives to bring an element of automation to verification. In both cases a human element is still essential. But as I noted previously, it’s important to see how much machines can help us deal with the challenge of verifying large amounts of content more quickly.
Authenticity scoring
Scoopshot is a crowdsourced photography service that enables news organizations to source (and assign) photographs from their community and from users around the world. Niko Ruokosuo, the CEO of Scoopshot, detailed his company’s new initiative in a recent announcement.
Ruokosuo said, “We’ve developed a new tool within the Scoopshot ecosystem that instantly and graphically shows media companies the authenticity level of any user-submitted image. Our system basically substitutes an inherently flawed manual process that may take an hour per image for a highly automated, intelligent programme that takes seconds.”
Scoopshot now delivers an authenticity score for each photo, calculated from data about the image, such as whether it was taken with the Scoopshot mobile app and whether its metadata is intact.
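Scoopshot hasn’t published how the score is computed, but the basic idea of combining provenance signals into a simple rating can be sketched in a few lines of Python. The signals and the equal weighting below are illustrative assumptions, not Scoopshot’s actual algorithm:

```python
# A hypothetical, minimal scoring function. The signals and the
# equal weighting are illustrative assumptions; Scoopshot has not
# published its actual algorithm.

def authenticity_score(taken_in_app: bool,
                       has_exif: bool,
                       capture_time_plausible: bool) -> int:
    """Combine simple provenance signals into a 0-3 'bar' score."""
    signals = [taken_in_app, has_exif, capture_time_plausible]
    return sum(signals)  # each satisfied signal adds one bar


# A photo taken in-app with intact metadata but an implausible
# timestamp would score two bars out of three.
print(authenticity_score(True, True, False))  # -> 2
```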
Similarly, the new GuardianWitness initiative, which enables its community to easily contribute images via the web or mobile apps, offers built-in functionality to gather a submission’s metadata, helping automate one aspect of verification.
Both efforts rely at least partly on EXIF data, which can tell you basic information about a digital image, such as the type of camera used, the exposure information, and other details.
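Extracting that kind of metadata is straightforward to automate. For instance, a few lines of Python with the Pillow imaging library will surface a JPEG’s EXIF fields (the filename here is a placeholder):

```python
# Reading EXIF metadata with Pillow, a widely used Python imaging
# library. "photo.jpg" is a placeholder path.
from PIL import Image
from PIL.ExifTags import TAGS

exif = Image.open("photo.jpg").getexif()
for tag_id, value in exif.items():
    name = TAGS.get(tag_id, tag_id)  # map numeric tag IDs to readable names
    print(f"{name}: {value}")
# Typical fields include Make, Model, DateTime, and Software.
```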
“We wanted at least a basic level of verification to be applied before something was published on GuardianWitness,” Joanna Geary, the Guardian’s digital development editor, told me by email. “We are, however, sensitive to different types of content potentially requiring different levels of verification. So, for example, we might do some very basic copyright checks on a picture of a dog, but would go into much, much more detail for a picture from Syria.”
Along with automating the examination of EXIF data, the Guardian and Scoopshot both use native apps to make it easier to authenticate aspects of an image. Having photographers work in a controlled setting, such as an app for taking pictures, can help answer questions about how a photo was created, according to Samaruddin Stewart, a Knight Fellow at Stanford University who is researching “the use of image forensic tools to identify manipulation in potential news photographs.”
“In this route you can oversee the chain of custody and also layer in additional information that today’s smartphones are great at capturing,” he told me.
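Neither company has detailed how it implements that chain of custody, but one plausible approach is for the capture app to sign the image bytes and capture metadata the moment a photo is taken, so the server can detect any later modification. Here is a minimal sketch, assuming a per-device secret provisioned by the app; the key handling is deliberately simplified:

```python
# Hypothetical chain-of-custody sketch: the app signs the image bytes
# plus capture metadata with a device secret, and the server recomputes
# the signature to detect tampering. Neither Scoopshot nor the Guardian
# has published its approach; key handling here is deliberately simplified.
import hashlib
import hmac
import json
import time

APP_SECRET = b"demo-key-provisioned-per-device"  # assumption, not a real key


def sign_capture(image_bytes: bytes, lat: float, lon: float) -> dict:
    """Produce capture metadata with an HMAC over the image and metadata."""
    meta = {"ts": int(time.time()), "lat": lat, "lon": lon}
    payload = image_bytes + json.dumps(meta, sort_keys=True).encode()
    meta["sig"] = hmac.new(APP_SECRET, payload, hashlib.sha256).hexdigest()
    return meta


def verify_capture(image_bytes: bytes, meta: dict) -> bool:
    """Server-side check: does the signature match the bytes received?"""
    unsigned = {k: v for k, v in meta.items() if k != "sig"}
    payload = image_bytes + json.dumps(unsigned, sort_keys=True).encode()
    expected = hmac.new(APP_SECRET, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(meta.get("sig", ""), expected)
```

A photo imported from the camera roll would bypass the signing step entirely, which is exactly the broken chain Stewart describes.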
But Stewart also noted some of this approach’s limitations.
The biggest limitation, Stewart said, is the need to change user behavior, such as launching a specialized app to capture a photo or video instead of simply using a standard camera app, as users do “99% of the time.” Users can import visuals into an app from a camera roll, he noted, but this “heightens the risk of manipulations since the chain is broken.”
Economic incentives for automation
That’s why Scoopshot offers a score instead of a guarantee that an image is real. In the end, it’s up to the journalists accessing the system to decide whether a high score is enough, or if they need to dig deeper into how an image was created. Speaking about the bars that signal authentication on Scoopshot, Ruokosuo told my colleague Andrew Beaujon that a news organization can “feel pretty good” about a three-bar photo.
A recent Journalism.co.uk article about Scoopshot’s scoring system reported that it enabled a Dutch newspaper to publish “verified images from Scoopshot users within six minutes of asking for submissions.”
The article also noted that the company’s CEO “insisted that some agencies may still manually check images should they wish to, arguing that the software indicates risk rather than complete legitimacy.”
In Scoopshot’s case, automation is aimed at reducing the risk while increasing speed. The faster its clients can use images, the more it might be able to sell.
“Figuring out how to best source and vet these visuals at scale will likely determine who can ultimately grow engagement, differentiation, and likely revenue,” said Stewart.
Now that there are clear economic incentives for helping speed up and perfect this process, we’re likely to see further innovation. That means more tools to help with manipulation detection, analysis and other aspects of photo verification.
One company that’s already working on that is Fourandsix. It offers FourMatch, an extension for Photoshop that “instantly analyzes any open JPEG image to determine whether it is an untouched original from a digital camera.”
I spoke with co-founder Kevin Connor last year about the prospect of achieving 100 percent accuracy in image-manipulation detection and verification.
“There’s a temptation to want to have some magic bullet or magic algorithm that will tell you whether an image is real or not, and we quickly realized that’s just not going to work,” he told me. “What you have to do is approach it as a detective and examine all the various clues in the image itself and the file that contains the image.”
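In code terms, that detective approach looks less like a single verdict and more like an accumulation of clues. Here is a hedged sketch of a few simple checks a newsroom tool might run, again using Pillow; these are basic illustrations, not FourMatch’s actual analysis:

```python
# A sketch of the "detective" approach: no single test proves
# authenticity; each check just adds or removes confidence. These
# checks are simple illustrations, not FourMatch's actual analysis.
from PIL import Image
from PIL.ExifTags import TAGS

EDITOR_HINTS = ("photoshop", "gimp", "lightroom")  # assumed editor names


def gather_clues(path: str) -> list:
    """Return human-readable clues about a JPEG's provenance."""
    clues = []
    exif = Image.open(path).getexif()
    named = {TAGS.get(k, k): v for k, v in exif.items()}
    if not named:
        clues.append("no EXIF data: possibly stripped, scraped, or re-saved")
    software = str(named.get("Software", "")).lower()
    if any(hint in software for hint in EDITOR_HINTS):
        clues.append(f"Software tag names an editing tool: {software!r}")
    if "DateTime" in named and "Model" not in named:
        clues.append("timestamp present but no camera model recorded")
    return clues  # a journalist, not the script, weighs these clues
```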
For the Guardian, the lack of a magic bullet has required a large-scale training effort in the newsroom. As Geary told me, GuardianWitness verification mixes human and machine elements, but it’s “predominantly human.”
“When we built the back-end tools we made it a requirement to pull in some basic information (e.g. EXIF data) and make it visible to our team,” she said. “Then there are other checks they will do — some of which move into investigative work … Online verification can actually be quite a substantial act of journalism.”
In conjunction with the launch of GuardianWitness, the organization gave roughly 100 of its journalists training in verification by working with Storyful, a social-media news service that sources and verifies user-generated video for use by news organizations. (Disclosure: Spundge, the company where I’m a partner, continues to have discussions with Storyful about finding ways to work together.)
“I’m quite proud that we have taken so many through verification training, but I also recognize that it’s never enough and you can’t stop there,” Geary said. “This is a rapidly changing field and — in some cases — an outright fight to avoid spreading misinformation. As with all changing skills, different people pick it up at a different pace dependent on need and on understanding. We’d like to look into being able to keep up with training but to do this in a way that recognizes the demands of a newsroom and help people to learn on the job when they need to.”
Stewart and others say there will never be a Holy Grail of automated photo verification — the human element will always be necessary.
“I do not however think that we’ll have full automation any time soon or that we even should,” he said. “I think editorial scrutiny will always play a role.” But, he added, if he were running or planning a desk for user-generated visuals, “pursuing technical tests” for verification would certainly be a priority.
Stewart provided a good motto for the efforts to automate aspects of verification: “Launch and iterate is a far better strategy than ignore.”