Not Everything is Verifiable, But That’s OK: Lessons From a Failed Geolocation

Most published open source research consists of user-generated content which has been verified, with analysis corroborating the date, time and location of capture. Once solved and explained, the process of geolocation can seem intuitive. Yet what’s less visible and less widely discussed are the moments of doubt during any challenging geolocation task and the instances in which content cannot ultimately be verified.

When key components of successful geolocation include engaging a critical mindset, patience and persistence, at what point do you leave a piece of content unverified and move on? And when skilful open source research relies on individual techniques to get the most from the tools and content, how do you know when you’ve exhausted all possible approaches?

Leaving content unverified is common but deciding to do so is rarely straightforward. In this article, I highlight the process of deciding to move on, and underscore its importance. In the hopes of making this decision easier, I share the lessons I have learnt about communication, keeping track of your findings and evaluating the context.

Case study

Most recently, I faced this problem when trying to verify a set of photographs purportedly taken at a port in Viet Nam. The images show wood which other lines of Amnesty’s investigation suggested may have been logged in protected forests in Cambodia and then illegally exported. There were other ways to make the overall case, but colleagues and I were keen to see if we could verify this step in the supply chain.

From the group of a dozen images posted to Facebook, only one photograph, showing the exterior of a freight container, stood out as potentially verifiable. The container had a visible tracking number and there were logos in the background. I also knew that the company exporting the timber only shipped to two cities in Viet Nam: Hai Phong and Ho Chi Minh City. Based on this information, I felt confident I could find the location.

The piece of content I was working on. Source: Facebook.

When reverse image search didn’t return any results, I began geolocating the content. I started with one of the numbers written on the container: GMDU 815298. A Google search indicated that the prefix ‘GMDU’ referred to the shipping company Gemadept. However, when I looked up the container number on freely available freight container tracking sites, they didn’t return any results. So I went straight to the source – the Gemadept tracking website. Here I found that the company requires a login to track their containers.
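As a side note on how container numbers work: they follow the ISO 6346 standard (four owner/category letters, a six-digit serial and a final check digit), so a candidate number can be sanity-checked before feeding it to a tracking site. The sketch below illustrates the standard’s check-digit calculation; it is not part of the original investigation, and the example number CSQU3054383 is a commonly cited valid code, not the container from the image.

```python
# Sketch of the ISO 6346 check-digit calculation for freight container
# numbers (illustration only, not part of the original investigation).

def iso6346_check_digit(code10: str) -> int:
    """Compute the check digit for the first 10 characters (e.g. 'CSQU305438')."""
    def char_value(ch: str) -> int:
        if ch.isdigit():
            return int(ch)
        v = ord(ch) - ord("A") + 10
        # ISO 6346 skips multiples of 11, so A=10, B=12, ..., K=21, L=23, ...
        for skipped in (11, 22, 33):
            if v >= skipped:
                v += 1
        return v

    # Each position is weighted by 2^i; the sum mod 11 (then mod 10) is the check digit
    total = sum(char_value(ch) * 2**i for i, ch in enumerate(code10.upper()))
    return total % 11 % 10

# The widely used example CSQU3054383 has check digit 3:
assert iso6346_check_digit("CSQU305438") == 3
```

A mismatch between a computed and a printed check digit is a quick way to spot a misread character before spending time on tracking sites.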

With tracking the specific container ruled out, I tried to narrow down the location in other ways. I knew I had the option to manually comb through the satellite imagery of ports in both cities to try and match the ground type. However, before committing to this time-consuming task, I followed up on any other clues that might help narrow it down.

For contextual information, I searched other content posted on the same Facebook group. I looked for other posts from the same location or details in captions and comments referencing locations. I found a post from 1 December 2020 which showed images of freight containers, but only of the interior. The comments on both posts were just negotiations about price and quantity, and other content on the group showed timber in a warehouse rather than a port.

I returned to the image and closely re-examined it for any other details. There was a logo on the container which I didn’t recognise, so I did a Yandex reverse image search on it. The results pointed to the Bureau Veritas certification mark which indicates a shipping company’s compliance with marine standards. This certification isn’t location-specific and so I pivoted again.

I turned to the company logos on containers in the background of the image. Their container numbers were too pixelated to track, so instead I looked to identify the container companies and the locations each company shipped to. By comparing this data for the five companies pictured, some of which I hadn’t heard of, I thought I might be able to rule out certain ports or narrow the search down to one city.

The original content with the container number, certification mark and other freight containers’ logos highlighted. Source: Facebook.

The companies Maersk and Heung-A were clearly legible and a quick Google search of the visible letters on the other two containers placed them as Nippon Yusen Kaisha (NYK) and Compañía Sudamericana de Vapores (CSAV). To keep track when comparing ports, I added pins to Google Earth for each location each company shipped to. After an hour of searching, I ultimately found that all five companies shipped to both cities.

With all of the contextual clues in the image not helping narrow down the location, I finally turned to systematically searching satellite imagery of both cities’ ports.

From the start, I ruled out trying to compare the pattern and colour configuration of the containers as there was conflicting information about the date the image was captured. Containers are unloaded and collected daily in these ports, so temporal information is key. However, whilst the image was uploaded to Facebook on 29 November 2020, it was posted alongside photographs of the container packing lists, which were dated 11, 12 and 13 November 2020. I couldn’t narrow it down to one of these dates because weather data in both cities showed cloud cover the whole week, matching the weather in the image. Without knowing the date of image capture I couldn’t access satellite imagery from the exact day and attempt to match the containers’ patterns. Plus, given the cloud cover over those days, it was unlikely that any satellite imagery captured would clearly show the ground.

My second option was to compare the images using the dusty, unpaved ground. I picked the closest date on Google Earth Pro’s historical imagery selector, 11 December 2020, and began searching. First, one city at a time, I zeroed in on port areas that had containers present, marking them on Google Earth with pins to keep track of my search zone. I then worked through each pin. I used the locations with Google Photos attached to get a sense of what different ground materials looked like on satellite imagery. Once I’d learnt this, I could quickly rule out paved areas and narrow down locations that could match. Sometimes I switched between satellite imagery dates to find a better image. After a few hours of searching, I had a dozen possible locations across two cities with matching ground types.

By this stage, I felt I had used all the techniques I knew and the tools that I had access to. Yet still, the decision to move on and leave the image unverified felt like a failure. As a final step, I got a second opinion on my process and decision, and we decided that geolocating the content with the information currently available was not going to be possible.

Screenshot showing potential and ruled out locations in one area of Hai Phong, Viet Nam. Source: Google Earth Pro.

Lessons

1. Evaluate the context

Context is key. Had this content been essential to a research output, I would likely have persisted. Painstakingly searching through satellite imagery is an approach I often return to when geolocating content, and this persistence often pays off. Yet the process can be lengthy, so the time invested in a piece of content must be proportional to its relevance and use. This can be challenging in situations where the output isn’t yet concrete, but taking the time to evaluate what the content actually shows is essential before and throughout a difficult geolocation task.

2. Keep track of your findings

When verifying content, you are working at the intersection of your own skills and the tools at your disposal. And throughout any challenging geolocation task, there are multiple decision points where you evaluate your options using the tools available to you. Because this was a difficult geolocation, I kept track throughout of what I tried in the spreadsheet where I stored the content. I also used pins and polygons in Google Earth Pro to mark the areas I’d searched. By keeping track of the strategies you try and the results they yield, you can:

  • Avoid repeating yourself (especially when working on a piece of content over an extended period)
  • Collaborate more easily
  • Evaluate and reflect on your methodology
  • Have more confidence in your decision to move on
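The pin-tracking described above can also be kept as a file on disk. As a minimal sketch (the pin names and coordinates below are made-up examples, not locations from the investigation), candidate and ruled-out locations can be written out as KML, which Google Earth Pro opens directly:

```python
# Minimal sketch: export search locations as a KML file that Google Earth
# Pro can open, so searched areas stay on record and are easy to share.
# The pins below are hypothetical examples, not real candidate locations.
import xml.etree.ElementTree as ET

def pins_to_kml(pins, path):
    """Write (name, latitude, longitude) tuples as KML placemarks."""
    kml = ET.Element("kml", xmlns="http://www.opengis.net/kml/2.2")
    doc = ET.SubElement(kml, "Document")
    for name, lat, lon in pins:
        pm = ET.SubElement(doc, "Placemark")
        ET.SubElement(pm, "name").text = name
        # KML coordinate order is longitude,latitude,altitude
        point = ET.SubElement(pm, "Point")
        ET.SubElement(point, "coordinates").text = f"{lon},{lat},0"
    ET.ElementTree(kml).write(path, xml_declaration=True, encoding="UTF-8")

pins_to_kml([("Candidate A - unpaved ground", 20.85, 106.68),
             ("Candidate B - ruled out", 10.77, 106.70)], "search_log.kml")
```

A file like this doubles as a search log for collaborators: a colleague can open it and immediately see which areas have already been covered.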

3. Communicate and collaborate

Finally, I learnt that getting a second opinion on both the content and your methodology can be invaluable when working on a challenging geolocation task. A fresh set of eyes can contribute a new observation or strategy, or confirm your decision to move on. In this case, bringing in a second opinion and collaborating earlier would have saved even more time. It was ultimately the second opinion from a colleague that gave me the confidence to finally move on.

Conclusion

What began as a photograph that I felt confident I could geolocate gradually evolved into a challenging task which tested my patience, confidence and skills. Evaluating the context, tracking my findings and communicating with colleagues to get a second opinion helped me conclude that I couldn’t geolocate the content with the information available within a reasonable timeframe. With human rights abuses constantly occurring, and evidence of them being shared in ways that I can verify, in this case I decided to move on.

This conclusion was reached gradually, and it is not irreversible: if more information arises, I can easily return to the photo and all of my background research. If it never does, I have at least learnt how complex geolocating a photograph of a freight container can be, which will inform future research.

Ultimately, although it felt like an admission of failure, leaving the content unverified in this instance exercised a central, often underappreciated skill in open source research: knowing when to move on.