(alt)Metrics and Philosophy of IT

I’m currently reading The Whale and the Reactor by Langdon Winner (1986). It was recommended to me by a colleague’s husband when discussing philosophy and IT. I’m into the 3rd chapter and it’s interesting; I would recommend it (at least, thus far). My interest in philosophy of IT stems from my early areas of focus, including HCI/UX, social informatics, and digital humanities. While I’m not performing research in these areas, the readings still stay with me and I find myself trying to utilize terminology and methodologies from articles I read some time ago as I examine social media metrics (altmetrics), scholarly communication, and the academic reward system.

During my Ph.D. years, I had the great opportunity to take a course at Indiana University Bloomington where another doctoral student and I performed close readings of Martin Heidegger, Gilles Deleuze and Félix Guattari, and other prominent scholars discussing philosophy and technology under the supervision of Dr. Ron Day and Dr. Hamid Ekbia. It was a great time and I learned a lot about myself and the way I approach philosophy of IT. While I haven’t had the opportunity to pursue this interest in philosophy of IT, I have continued to shuffle the ideas around in my head and apply what I learned from the course and readings since this time to the way in which I think about scholarly communication, altmetrics, and the academic reward system.

Our Reading List:

Social media metrics (or altmetrics) is a relatively new area of study examining the distribution of scholarly articles within primarily social media contexts (which I’ve discussed here before). Of interest are both metrics relating to how often an article is shared in these environments and who/why/how the agents using the online platforms disseminate and consume this information. We, as social media metrics scholars, first used traditional bibliometric measures to examine counts of social media acts and tried to determine if these counts correlated with an increase in citations of said articles. More recently, social media metrics scholars have begun to utilize theories and methods from other disciplines, including psychology, sociology, and linguistics, to examine these acts. I was very excited to be part of a book chapter (https://arxiv.org/abs/1502.05701) that discussed applying theories from other domains to the study of altmetrics.

Access, Appraise, Apply – Engagement

 

When examining the larger picture of the academic reward system, social media metrics could be considered a fourth leg of the stool. We have traditionally considered authorship, citations, and acknowledgements as part of the academic reward system. Yet, the ability to track social media acts relating to scholarly documents has introduced a new means of capturing the consumption and dissemination of these documents. While we do not claim that these acts equate to authorship, citations, or acknowledgments, these activities do represent some form of engagement with scholarly work.

Academic Reward System Four-Legged Stool

This change in a reward system after many years of relative consistency has brought about much discussion. Part of the driving force behind adding social media metrics to the academic reward system is the notion of “societal impact.” These days, many funding agencies, universities, government entities, and (some) tenure committees are asking scholars to provide some evidence of how their research has had impact outside of academia. One way in which scholars can provide evidence of societal impact is to utilize altmetric-related counts. But, this notion of societal impact is highly contested and there are many different definitions of “societal impact” available.

What I’m now thinking about and trying to consider is a way to discuss these changes in scholarly communication and the academic reward system utilizing what authors have discussed in the philosophy of IT literature. I believe that useful insights and vocabulary from the philosophy of IT literature can allow us to think about the academic reward system from a new perspective and critically discuss the impact social media and technology have had on the actors and acts performed within the academic system. Going back to The Whale and the Reactor, Winner seems to focus (so far) on technology and power, which I have a feeling will provide useful insight into how I think about social media metrics (altmetrics) and the academic reward system. It brings to mind Latour’s Actor-Network Theory and the agency of technology in a system.

More to come soon…

 

Lessons from 2:AM

Last week I was lucky enough to attend the 2:AM Conference in Amsterdam. The conference was focused on altmetrics, a type of metric that is typically calculated based on scholarly communication events captured in online contexts (e.g., events in Twitter, Mendeley, Wikipedia, etc.). For some time I’ve been critical of the term “altmetrics” because I had taken it to mean “alternative to citations,” but after this conference I’m not so confident in my previous position. Altmetrics is an umbrella term that we use to help describe the type of research we are doing (at least those of us who research these things); it is a buzzword that others use to talk about scholarly communication in online contexts; it is a term that the media has used; it is currently used in organizations, libraries, universities, and companies to promote scientific work; and it has become a term that somehow represents the potential for measuring impact outside of the academic machine (other than scientific impact). While it has been criticised many times in the past for being the wrong term, I am not sure there is a more appropriate term… and that is fine. We have had suggestions including social media metrics (Haustein, Larivière, Thelwall, Amyot, & Peters, 2014), complimetrics (complimentary metrics) (Adie, 2014), influmetrics (influence metrics) (Cronin & Weaver, 1995; Rousseau & Ye, 2013), and more traditionally webometrics (Almind & Ingwersen, 1997), to name just a few, but these do not seem to be any better and also do not seem to possess that something that “alt”metrics seems to possess.

I dabble in linguistics and I believe that words are of vital importance to our ability to understand and discuss the same phenomenon (especially in science), which is why I was so adamant that “altmetrics” was the wrong term to be using. But then I took another look at the altmetrics manifesto (the 5th anniversary of this important object was celebrated at the conference) and reevaluated my own position based on my accumulated knowledge in the field, what I learned at this conference, and a closer inspection of the manifesto to come to the realization that altmetrics is fine when you think of it as an “alternative means of measuring scholarly communication.”

The conference venue was great, as we were housed at the Amsterdam Science Park, a sprawling complex on the eastern side of Amsterdam. There were quite a few attendees and the presentations and workshop were informative and thought-provoking. Many of the primary data providers, publishing companies, metrics providers, and others in this field sent representatives, including Jason Priem (impactstory.org), Euan Adie (altmetric.com), William Gunn (mendeley.com), Greg Gordon (ssrn.com), Martin Fenner (niso.org), and Geoff Bilder (crossref.org). In addition, the four authors of the altmetrics manifesto were in attendance to celebrate its 5th anniversary: Jason Priem, Dario Taraborelli, Paul Groth, and Cameron Neylon. I was able to speak with both Jason and Cameron and they were engaging, down-to-earth people who are great scholars and excited by the future of scholarly communication (I wasn’t able to speak with Paul or Dario at such length).

What I gleaned from 2:AM was that there was an ongoing discussion from multiple perspectives taking place regarding the ability of altmetrics to measure impact, the types of impact there might be for scholarly communication, and the importance of trust when considering the reasons behind altmetric events. In addition, I am looking forward to being part of a group (formed at the “theories” conference breakout session) that will write a white paper describing and defining common terms used in altmetric research for the purpose of allowing others outside of our community to understand and contribute to the ongoing work in the field. I also learned that many in the field had read our book chapter (arXiv:1502.05701) on applying citation and social theories to the understanding of altmetric events; they were very supportive of our efforts to put forth this first attempt at developing a framework for understanding altmetric events. Yet we all know that much more work needs to be done and hopefully this white paper will be a nice step in that direction.

What I also learned from listening to Jason Priem, Dario Taraborelli, Paul Groth, and Cameron Neylon was that our group has somewhat ignored an important component of the manifesto, which talks about altmetric events as a type of “filter” for scholarly research and communication:

No one can read everything. We rely on filters to make sense of the scholarly literature, but the narrow, traditional filters are being swamped. However, the growth of new, online scholarly tools allows us to make new filters; these altmetrics reflect the broad, rapid impact of scholarship in this burgeoning ecosystem. We call for more tools and research based on altmetrics. (Priem, Taraborelli, Groth, & Neylon, 2010, para. 1)

This is an important aspect that I too simply took for granted and something I need to reflect on as my understanding of these phenomena continues to grow and change.

 

References

Priem, J., Taraborelli, D., Groth, P., & Neylon, C. (2010). Altmetrics: A manifesto. October 26, 2010. Retrieved from http://altmetrics.org/manifesto

 

 

Boundaries

The current state of scholarly communication is in flux as various avenues for the consumption and dissemination of ideas, discussions, and research continue to be developed and then adopted by scholars. These contexts offer different affordances (Gibson, 1977), or possibilities for action provided by the platform, and offer different types of networks with which a scholar can view and interact. Affordances found in these environments typically include the ability to share information in a particular way (e.g. tweet, Facebook post, blog, comment, post links or media, hashtags, etc.), consume information, create a profile (public, private, or mixed), and connect with other users of the platform. There are various types of networks represented across these platforms, including blogs (external facing networks), Facebook and Twitter (social networks), and Wikipedia (interconnected networks). A scholar can present herself on these online platforms along a continuum ranging from personal to professional.

Problems can arise from interacting within these online contexts as the information is disseminated to a vast unknown audience, it is archivable, it is searchable, and it can be copied and removed from the context in which it was originally published (boyd, 2006). This can prove damaging to the reputation of a scholar and can lead to shame, punishment, or dismissal as seen from recent examples. In one example, a scholar who had been offered a tenure-track position at the University of Illinois, Urbana-Champaign, had this same offer rescinded after several tweets made by the individual were deemed anti-semitic in nature by the university board (Jaschik, 2014). In another example, a professor from the University of New Mexico was put on probation and given counselling after tweeting an offensive remark about Ph.D. applicants (Ingeno, 2013). There have been other examples of these types of infractions from Facebook and from blogging.

Before the rise of these massive online networks, scholars already found it difficult to manage the boundaries between their personal and professional lives. The introduction of online contexts in which a person can interact with vast audiences exacerbates the situation for scholars, as they (often) are already maintaining a tenuous balance between their personal and professional identities from their time spent mentoring and teaching students in and out of the classroom. The boundaries between personal and professional are changing; what were once considered personal interactions outside the classroom have now been thrust into the spotlight, partially because of the new networks in which scholars interact. This relationship between the changing personal and professional boundaries of self-presentation and the size of the network and proximity of the nodes has not been adequately discussed.

Goffman (1959) discussed the acts of self-presentation and impression management in his social research as acting out a particular role for an audience and maintaining that role across time. These acts rely on various aspects including social norms, rules, and context to be effective. You could interpret Goffman’s writing in a way that suggests he considered the network and its significance to people in their day-to-day lives, as he (Goffman, 1961, p. 127) noted later that “[w]hen seen up close, the individual, bringing together in various ways all the connections that he has in life, becomes a blur.” He knew that boundary maintenance was a crucial component of self-presentation and impression management, as he divided the act of self-presentation into three different regions: front stage, back stage, and the outside region. What he did not directly speak to was the actual size of the network and the influence this would have on the boundaries between these regions.

A graphical representation of Goffman’s Self-Presentation framework

Related to this, Mehra, Kilduff, and Brass (2001, p. 131) argued that a large network “can enable the individual to access numerous others for information and other resources,” but warned that “[p]eople who interact with numerous others in organizations run the risk of running short of time and other resources.” In addition to the time and resources used to maintain large networks, scholars run the risk of further blurring the boundaries between their personal and professional selves. I want to further investigate this relationship between networks, self-presentation and impression management, and the blurring between personal and professional.

 

References

boyd, d. (2006). Friends, Friendsters, and MySpace Top 8: Writing Community Into Being on Social Network Sites. First Monday, 11(12), 1–15. Retrieved from http://www.firstmonday.org/issues/issue11_12/boyd/index.html

Gibson, J. J. (1977). The Theory of Affordances. In R. Shaw & J. Bransford (Eds.), Perceiving, Acting, and Knowing: Toward an Ecological Psychology (pp. 127–143). Hillsdale, NJ: Lawrence Erlbaum.

Goffman, E. (1959). The Presentation of Self in Everyday Life. New York: Anchor.

Goffman, E. (1961). Encounters: Two studies in the sociology of interaction. Indianapolis: The Bobbs-Merrill Company, Inc.

Ingeno, L. (2013, June 4). Outrage over professor’s Twitter post on obese students. Inside Higher Ed. Retrieved from https://www.insidehighered.com/news/2013/06/04/outrage-over-professors-twitter-post-obese-students

Jaschik, S. (2014, August 6). Out of a job. Inside Higher Ed. Retrieved from https://www.insidehighered.com/news/2014/08/06/u-illinois-apparently-revokes-job-offer-controversial-scholar

Mehra, A., Kilduff, M., & Brass, D. J. (2001). The social networks of high and low self-monitors: Implications for workplace performance. Administrative Science Quarterly, 46(1), 121–146.

The ecosystem of science

I’ve been thinking a lot about what Science really means to me and what the philosophers of science have said about the system of science. I love Newton’s famous notion about “standing on the shoulders of giants,” but I don’t necessarily see it in that way… especially in my line of research investigating altmetrics and scholarly communication.

It’s a blustery evening in Finland and I am watching the trees bend and shed leaves in the strong breeze while thinking about this. It seems to me that the system of science resembles an ecosystem in which we try to make our lives meaningful and to shed light on our surroundings. We do, of course, use the work of others to view things through their eyes, but I don’t see myself standing on their shoulders and reaching for the stars. Instead I see myself as a small sapling, struggling for nourishment in a vast forest. At the same time, I view those before me, especially those marvelous minds from which I borrow, as large trees that shade me from the sun and break the harsh winds blowing over me. I see the trees of Goffman and Gibson, of Heidegger and Kant, and on and on, in my part of the forest. These solid, long standing trees protect me and nourish me, allowing me to grow and to become a tree myself.

As scholarly communication and science have changed, so too has the ecosystem. We are no longer simply trying to aspire to being the trees that provide the root system of science; we are also trying to spread and have an impact outside our forests. I feel like we are now flowering trees, making pollen that can be carried away to the farthest fields with hopes of having an impact on our surroundings. We have evolved to make use of the technologies that have become a part of our world, to attract the attention of others so that they can carry our pollen away. A large part of this new technology and ecosystem is the internet, specifically social media and other online sources of information. Social media users are the bees that we need to spread our pollen, our information, outside of our isolated forests. What the bees are doing with this information, we don’t yet know. But what we do know is that they can spread it faster and farther than ever before.

Through my work I hope we can figure out where our information is being spread and what kinds of impact we are having on society.

It. Is. Done.

I have finally finished my Ph.D. Yay. I graduated from the School of Informatics and Computing,  Indiana University, Bloomington at the end of July, 2015.

After seven years of contemplating social structures, norms, behaviors, communication, and the ways in which people use the affordances of social media, I was able to successfully defend my thesis in front of four of my peers and a handful of students in May, 2015 and make the required minor revisions and formatting changes to submit the final version of the document to the graduate school at the beginning of July, 2015.

It has been a long, rewarding journey and I am happy that I completed it. I have been able to travel around the world, move to two countries, and meet some extraordinary scholars, travelers, and neighbors. It’s been quite an adventure, one which I hope continues as I progress in my career as an academic. Thank you to everyone for the support and love throughout this process.

I’m now in Finland working with great scholars and looking to improve my abilities as a scholar, researcher, teacher, and coworker.

Kiitos!

Scholarly Communication, ‘Altmetrics’, and Social Theory

In a recent book chapter (that is currently under review), my colleagues and I discuss the application of citation theories and social theories to popular media and social media metrics (so-called altmetrics) being collected by sites like Altmetric.com, ImpactStory.org, and Plum Analytics. These metrics are being used by organizations such as libraries, publishers, universities, and others to measure scholarly impact. It is an interesting area of research in that it helps us understand how scholarly work is being consumed and disseminated in social media (and thus presumably to an audience outside of the academy).

I come to this research having dabbled in many different areas of study, beginning with neuropsychology (as an undergraduate), human-computer interaction, information architecture, and web design (as a master’s student), and finally social informatics (at the beginning of my Ph.D.), digital humanities (middle of Ph.D.), and scholarly communication and sociology (thesis work). I believe this indirect path has allowed me to consider research questions from different perspectives and allows me to apply various theoretical and methodological lenses to the same problem (as is the case for many Information Science graduates). It’s also a path that has allowed me to contribute to the data collection aspect of this work, as I’ve written several programs that have assisted in the collection and storage of huge amounts of data (hundreds of millions of tweets, publication records, etc.) on scholarly (and other) activities. These experiences have allowed me to contribute to the book chapter mentioned above and to several articles and presentations, and they continue to allow me to contribute to understanding scholarly communication in social and popular media venues.

I’m looking forward to finalizing my thesis and to continuing to examine these social and scholarly communication issues in my current research position at UdeM and in a permanent faculty position with future colleagues.

Harvesting Images from Site for Study

Recently I needed to write code that would allow me to harvest images from a site where the images were displayed 10 per page over n number of pages. I wanted to set it up so that I could start it and let it run over time and harvest images. This immediately meant I’d be working with PHP and jQuery using AJAX. I’ve written another post titled Web Scraping Using PHP and jQuery about this type of AJAX script and I needed to use what I’d learned there to implement this new scraping engine.

The reason I’m writing this code is so that we can start a few studies on selfies.

I ended up using a bookmarklet, which is a JavaScript snippet that you can add to a browser as a bookmark. We had assistants visit the pages where the images were stored and click on the bookmark to harvest images and metadata associated with each image. While it is a bit cumbersome, it was the easiest and quickest way to start collecting the images for our project. I wrote the code such that the JavaScript would speak to the PHP code and the PHP code would handle all the heavy lifting (saving image and scraping page for metadata). The images and text files were automatically saved to a Dropbox folder using the Dropbox API.
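The browser side of a bookmarklet like this can be sketched in a few lines of JavaScript. This is only a rough illustration and not the code we ran: the endpoint URL, the function names, and the shape of the payload are all hypothetical, and it gathers nothing more than each image's address and alt text. The collection step takes a document-like object so it can be tried outside a browser.

```javascript
// Hypothetical endpoint; the post does not name the PHP script that did the saving.
const HARVEST_ENDPOINT = "https://example.org/harvest.php";

// Collect the address and alt text of every <img> in a document-like object.
function collectImageRecords(doc) {
  return Array.from(doc.querySelectorAll("img")).map((img) => ({
    src: img.src,
    alt: img.alt || "",
  }));
}

// Build the JSON body that a server-side script could receive and process.
function buildPayload(pageUrl, records) {
  return JSON.stringify({ page: pageUrl, images: records });
}

// In a browser, the bookmarklet body could then send the payload, for example:
//   fetch(HARVEST_ENDPOINT, {
//     method: "POST",
//     headers: { "Content-Type": "application/json" },
//     body: buildPayload(location.href, collectImageRecords(document)),
//   });
```

Keeping the browser code this thin matches the split described above: the bookmarklet only reports what is on the page, and the server does the saving and scraping.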

PHP + jQuery + Twitter API => Performing a Basic Search Using OAuth and the REST API v1.1

INTRO

To get started, you’ll need access to PHP 5.x on a web server.  I’m currently working on an Apache server with the default installation of PHP 5.3.  This should be available on most hosting services, especially those setups featuring open source software (as opposed to Microsoft’s .NET framework). In addition, I’m using a Postgres database on the back end to store the information I’m scraping and extracting (you can just as easily use MySQL).  If you want to run this code on your local machine, download WAMP, MAMP, XAMPP, or another flavor of server/language/database package.

TWITTER API, OAuth, & PHP twitteroauth Library

First, familiarize yourself with the Twitter Developer Website.  If you want to skip right to the API, check out the REST API v1.1 documentation.  To test a search, go to the Twitter Search page and type in a search term; try typing #BigData in the query field to search for the BigData hashtag.  You’ll be presented with a GUI version of the results.  If you want to try doing the same thing programmatically and return data in JSON format, you’ll need to use the REST API search query… and you must be authenticated to do this.  To create credentials to use the search query, you must create an OAuth profile, so visit https://dev.twitter.com/docs/auth/tokens-devtwittercom to retrieve your ACCESS TOKEN and ACCESS SECRET.  Luckily we can use the PHP twitteroauth library to connect to Twitter’s API and start writing code (here’s an example of the code you’ll need:  https://dev.twitter.com/docs/auth/oauth/single-user-with-examples#php).  At this point you’ll need to set up your OAuth profile with Twitter, download the PHP twitteroauth library, edit the proper information to add your TOKEN and SECRET to the PHP twitteroauth library, and ensure all the files are on your web server in the appropriate place.

PERFORMING A SEARCH & RETRIEVING JSON DATA

I’m assuming you have set up the OAuth profile on Twitter and that you’ve downloaded the PHP twitteroauth library.  I like to create an “app_tokens.php” file containing my CONSUMER_KEY, CONSUMER_SECRET, USER_TOKEN, and USER_SECRET information assigned to variables; this way I can include it anywhere I need it.

Now that we have our authorization credentials we are ready to use tmhOAuth as the middle man to send a request to Twitter’s API.  Let’s say we want to perform the same search we did above, but this time we don’t want a GUI version of the data… instead we want JSON data back so that we can easily add it to a database.  We need to find out what command the Twitter API expects and pass it a value; for our example, the Twitter API search query is simply:   https://api.twitter.com/1.1/search/tweets.json We can pass it several different parameters, but we’ll start with the most basic and use the q query parameter.  We want to pass the parameter the value “#BigData”, but we need to convert the pound sign (#) to a URL encoded version => %23… Our code then looks like this:
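As a minimal illustration of the encoding step only (a JavaScript sketch, not the PHP request itself), the query string for this search could be assembled as below. The function name is hypothetical, the count of 30 is taken from the results-per-request figure used later in this post, and the result_type value of "recent" is an assumption; the OAuth signing that the library handles is left out entirely.

```javascript
const SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json";

// Build the full search URL, percent-encoding each key and value.
function buildSearchUrl(term, count, resultType) {
  const params = [
    ["q", term],
    ["count", String(count)],
    ["result_type", resultType], // assumed value; see the API docs for the options
  ];
  const query = params
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  return `${SEARCH_URL}?${query}`;
}

const exampleUrl = buildSearchUrl("#BigData", 30, "recent");
console.log(exampleUrl);
// https://api.twitter.com/1.1/search/tweets.json?q=%23BigData&count=30&result_type=recent
```

Letting encodeURIComponent do the conversion means the pound sign becomes %23 without any hand editing, which is the same substitution described above.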

This request will use the REST API v1.1 and return JSON data.  We are passing the search a parameter of q=>’%23BigData’ which translates to searching for the hashtag “#BigData” (without the quotes).  We are also passing the ‘count’ and ‘result_type’ parameters (for more info on the other parameters, see the documentation).  Lastly, we need to get the response back from Twitter and output it; if we have an error, we need to output that too.  Using the twitteroauth library’s examples, I know I need to have the following code:

The above code receives two pieces of data from the Twitter API:  the response code and the response data. The response code indicates if we have errors.  The response data holds the JSON data that we received from the query.  The first result of my JSON data (yours won’t contain the same information, but it will contain a similar structure) looks like this:
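To make the shape of that handling concrete, here is a small JavaScript sketch of checking the response code and pulling the tweet text out of the response data. The sample body is a hypothetical, heavily trimmed stand-in that I made up for illustration (real results carry many more fields per tweet), and the function name is my own; only the "statuses" array and the "text" key are taken from the structure of the search response.

```javascript
// Hypothetical, heavily trimmed stand-in for a search response body.
const sampleBody = JSON.stringify({
  statuses: [
    { id_str: "1", text: "An example tweet mentioning #bigdata" },
  ],
  search_metadata: { count: 30, query: "%23BigData" },
});

// Return the text of each tweet, or raise an error if the request failed.
function extractTweetTexts(responseCode, responseBody) {
  if (responseCode !== 200) {
    throw new Error(`Twitter API request failed with HTTP ${responseCode}`);
  }
  const data = JSON.parse(responseBody);
  return (data.statuses || []).map((status) => status.text);
}

const tweetTexts = extractTweetTexts(200, sampleBody);
console.log(tweetTexts);
```

Checking the code before parsing keeps error responses from being mistaken for an empty result set.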

If you look at the JSON data above, you’ll see a key titled “text” and the value assigned to it; this is the content of the tweet and you can clearly see that it contains the hashtag #bigdata.  So we now know the code works and we can programmatically query Twitter.  When you examine the Twitter API you will find that we can make 450 requests every 15 minutes;  this will of course not get us ALL the tweets using the hashtag “#bigdata”, but it will give us a useful sample at 30 results per request == 13,500 tweets every 15 minutes.
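The arithmetic behind that ceiling is easy to check, and the same numbers tell you how far apart to space requests. A quick JavaScript sketch, using only the 450-request, 15-minute, and 30-result figures from this post (the variable names are my own):

```javascript
const REQUESTS_PER_WINDOW = 450; // requests allowed per rate-limit window
const WINDOW_MINUTES = 15;       // length of the window
const RESULTS_PER_REQUEST = 30;  // the 'count' value used in the search

// Upper bound on tweets collected in one window and in one hour.
const tweetsPerWindow = REQUESTS_PER_WINDOW * RESULTS_PER_REQUEST;
const tweetsPerHour = tweetsPerWindow * (60 / WINDOW_MINUTES);

// Even spacing between requests that stays within the limit.
const secondsBetweenRequests = (WINDOW_MINUTES * 60) / REQUESTS_PER_WINDOW;

console.log(tweetsPerWindow, tweetsPerHour, secondsBetweenRequests);
// 13500 54000 2
```

These are upper bounds: they assume every request returns a full page of results.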

Cheers.