Monthly Archives: March 2013

Using Crowd Wisdom to Annotate the Web

Since January I’ve been intensively researching citation practices in academia as part of the HighWire MRes ‘Special Topics’ module: I’m really intrigued by it all. As is often the way with these things I’ve probably gone about it the wrong way, starting with an idea of a solution (I was thinking about applications of linked/semantic data…) before properly understanding the problem. Thankfully, learning this kind of thing is why I’m doing an MRes.

I’m actually consciously trying to resist being consumed by my obsession with this topic, yet at the same time trying to master my own feeling of completeness as regards my knowledge of the subject. Part of the issue is that most (not all) of the academics I’ve spoken to about my concerns to do with citation practices react quickly and deeply, suggesting that this is ‘the way things are’, inherently political, unbounded… and generally a difficult area to work in. They’re probably right, but it doesn’t mean I shouldn’t go there. I’ve already written a 2,000-word literature review titled ‘Impact Metrics: Lies, Damned Lies, Statistics’, and a mock EPSRC funding proposal, ‘Expressing Research Output Through Linked Data’, on these subjects, so I won’t elaborate here… however, my thinking in this area did lead me to consider annotation as a method of making various practices on the web more transparent, and potentially a way of mitigating the Matthew effect.

The Matthew effect suggests that ‘the rich get richer and the poor get poorer’. It applies in academia too: a highly cited paper is far more likely to gain citations quickly than a paper that has no citations. Thomson Reuters actually run a ‘Highly Cited’ service. On their front page they state:

“Once achieved, the highly cited designation is retained. With each new list, we add highly cited individuals, departments and laboratories to this elite community.”

I don’t want to appear objectionable, but it is quite a scary proposition. They’re saying that once they (Thomson Reuters) have awarded this accolade, it is enshrined forever… thus the ‘elite’ community is created. This touches on my issues with impact measures per se. It is impossible to explain the nuances of a lot of literature, knowledge, or learning, and to express why or how it is valuable, by way of a number. The content of academic literature (excepting tables, figures, etc.) is qualitative. Regardless of the field there’s a qualitative element. So why don’t we discuss it in qualitative terms? Plain English…! “This is relevant because…” or “I disagree because…”

I don’t think we should ignore statistically based metrics. I don’t think we should ignore citation counts. I do think that being highly cited (whether or not Thomson Reuters invite you into their club) is usually a great thing, helping both authors and researchers that need to access relevant literature. However, we’re missing out on the subjective. And the subjective has value. Even worse, if we’re counting citations and making a judgement on them, we assume that they’re quite an objective thing, which is a total fallacy. Why one paper receives citations and another doesn’t could come down to any number of reasons: being friends with somebody, the typeface, an artefact of the indexing process, or simply the keywords chosen to describe the paper.

I had a really great lecture from Wolfgang Emmerich. Although the lecture was really about the agile software development methods used at Wolfgang’s company Zuehlke, we also tested out the wisdom of the crowd. Wolfgang had each of us guess the weight of a motorcycle. We revealed our first guesses, discussed, then re-guessed. We took the mean of the second guesses, and the ‘crowd’ (only about 10 of us) was within 5% of the correct weight. Pretty impressive, I thought.
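As a rough illustration of the arithmetic behind that exercise, here is a minimal Python sketch. The guesses and the ‘true’ weight are made-up numbers (I didn’t record the real ones from the lecture), and the function names are purely illustrative.

```python
from statistics import mean


def crowd_estimate(guesses):
    """Combine individual guesses into one estimate by taking their mean."""
    return mean(guesses)


def relative_error(estimate, actual):
    """Return how far an estimate is from the actual value, as a fraction of the actual value."""
    return abs(estimate - actual) / actual


# Hypothetical second-round guesses in kg (NOT the numbers from the lecture).
guesses = [150, 180, 210, 190, 230, 170, 200, 220, 160, 240]
# Hypothetical true weight in kg, chosen only for illustration.
actual_weight = 200

estimate = crowd_estimate(guesses)
crowd_error = relative_error(estimate, actual_weight)

# Compare the crowd against each individual guesser.
individual_errors = [relative_error(g, actual_weight) for g in guesses]
beaten = sum(1 for e in individual_errors if e > crowd_error)

print(f"Crowd estimate: {estimate} kg (error {crowd_error:.1%})")
print(f"The crowd beat {beaten} of {len(guesses)} individual guesses")
```

With these invented numbers the individual guesses are off by up to 25%, yet the mean lands much closer, because the over- and under-estimates largely cancel out. That only works when the errors aren’t all biased in the same direction.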

So… I postulate that the wisdom of the crowd, combined with an open annotation system, could be a massively important tool for adding extra value to things like, for example, citations. On an exploratory punt at working with Mendeley to explore this further through my summer project, William Gunn (head of Academic Outreach at Mendeley) pointed me toward a project whose developers are building an open annotation system, relying on crowd-sourced and reputation-based data… to annotate everything. After watching this short video (below) I had one of those terrible yet affirming moments. The thought running through my head was “Again?!?! Again!!!? Why does every idea I have seem to have already been had by somebody infinitely more able to deliver on it than myself?” I had had this idea before, but kind of wrote it off as being ‘too big’, eventually sanitising it down so much that I was just thinking about annotating citations. As it is, I think the ambition embodied by the project potentially has the power to transform the web. It’s also a reminder to me to ignore those authority figures that suggest that maybe the area of interest is ‘too big’ or ‘too political’ or ‘just the way things are’ – sometimes you’ve got to throw caution to the wind, just like Mario Capecchi did.

Spesh To’ics: Presents for all

The special topics module has three marked components:

I did the presentation today, so I thought it makes sense to write some reflection about it. Firstly, I don’t think it went that well! It certainly wasn’t a complete failure, but at the same time I would have preferred things to go better.

This is pretty much unrelated, but it was a video I used in a presentation while studying Interactive Arts… I needed to sum up how the last few months had been, so I simply put all of the images I’d taken during that period into a very fast video that showed them all in chronological order. It uses an early ‘prototype’ version of the Joe Galen track Interlude to get stuff.

I was actually quite pleased with the slides I put together to support the presentation, and therefore the planned structure of my talk. I used a metaphor of various periods in history to relate to the ‘narrative’ I tried to put together: I thought it was quite neat. I also thought the content I included, and references, were the most relevant and would contextualise the research questions that concluded the presentation. 

Unfortunately I was stung by two quite fundamental issues. Firstly, and I knew this before I started, my talk was too long. I needed about 12 minutes to get through everything and the allotted time was only 10 minutes. This resulted in the zero-hour decision to skip some of the early information I intended to talk about. Secondly, I hadn’t memorised my ‘script’ and I hadn’t printed the notes, so I was left to read them off my computer screen. Sadly this meant that I was in a rather unnatural position: if this were a stage show it wouldn’t have gotten good reviews. A final basic/schoolboy error was that I used my new MacBook to give the presentation, and I had underestimated quite how small the text would be (due to the insanely high resolution): I could hardly read the notes!

In the end I totally failed to present the final slides, so didn’t have a chance to explain, in any detail, the direction I want to take the research in. I was quite upset that I didn’t manage to get that in, as this was really the only ‘new’ part of the presentation. I did manage to refer to some of the missing content in the questions at the end of the talk, but not in the manner that I’d have liked, and the link to my narrative structure was missing.

So… these are my takeaways:

  • Be conservative with timing
  • Print notes
  • Test with the ‘live’ kit (to avoid the tiny-text problem)
  • Learn your script

But…. like I said….. it could’ve been a whole lot worse, so not all bad. There isn’t that much information on the slides themselves (it was supposed to be contained in the talk) but here they are anyway!

Spesh To’ics Week {unsure}; The revelations of research

Well, in hindsight, I’m quite impressed with myself that I managed to do a weekly blog for four weeks running. It’s been… *goes to check dates*… precisely one month since I last did a special topics reflective blog. I was insistent that I’d do it regularly… and, well, although there’s been a several-week break, I don’t think that this is a bad thing on this occasion! Between my last update and now I’ve had conflicting priorities. A few assignments. A few distractions. A deep dive week. All the while, however, I’ve been thinking about this piece of work, and my area of research for special topics. The one annoyance, and something that in terms of reflective practice might be useful, is that I haven’t managed to do as much reading as I’d have liked. Fortunately, however, the deadline has been extended for the written piece for special topics, so that means more time for reading (and procrastination).

I’ve become so excited about this topic that I’ve been thinking ahead and envisaging continuing the research for my HighWire summer project (the summer project runs for somewhere in the region of three months, and the expected outcome is a paper, or other artefact). In support of that I met with Jon Whittle and Matthew Rowe at Lancaster’s InfoLab21 to discuss it. It was quite an enlightening discussion. Firstly, I realised I still hadn’t narrowed my thinking to anything specific enough to begin a “proper” project around, and secondly, I was pointed in the direction of some really interesting work by Matthew.

The most significant thing Matthew pointed me toward was altmetrics, a movement started a couple of years ago. Their work hinges around a manifesto, and broadly speaking this movement encompasses all of what I’ve been thinking about. The very fact they’ve termed it a manifesto is indicative of the size of the problem. A “normal” paper wouldn’t manage to communicate the problem they are trying to address. So those behind the altmetrics movement and I are both hinting that a wholesale change is necessary to resolve the engrained issues in the way “impact” is measured, not to mention a whole host of inter-related complications. While it’s always reassuring to find somebody has had the same kind of thoughts as yourself, it’s also daunting and worrying to understand quite how large the scale of the issue is. Going down the altmetrics rabbit hole, there is no sign of the depth abating. There’s a lot of stuff down there, mostly juicy, the occasional dropping. The occasional juicy dropping.

I think this is all part of the Science 2.0 movement, in some way or other. What is it with me and tending toward “2.0” stuff? My project Prayer 2.0, my undergraduate dissertation (Web 2.0: Web as canvas), and one of my recent assignments for HighWire (Do we need Smartphone 2.0?). I dunno. Maybe I just tend toward buzzwords. I hope not though.