Microsoft PowerPoint - the NCRM EPrints Repository


Virtual Knowledge Studio (VKS)

What is Webometrics?

Mike Thelwall
Statistical Cybermetrics Research Group
University of Wolverhampton, UK
Information Studies

1. Introduction
□ Webometrics is concerned with gathering data on and measuring aspects of the Web
  □ web sites
  □ web pages
  □ hyperlinks
  □ web search engine results
  □ YouTube video commenter networks
  □ MySpace Friend networks
□ …for very varied social science purposes

New problems: Web-based phenomena
□ Webometrics can be applied to understanding web-based phenomena

□ Why do web sites interlink?

□ Which web sites interlink?

□ What interlinking patterns exist?

□ What topics are frequently blogged about?

Old problems: Offline phenomena reflected online
□ Some offline phenomena have measurable online reflections
  □ International communication
  □ Inter-university collaboration
  □ University-business collaboration
  □ The impact or spread of ideas
  □ Public opinion

2. Examples

Blog searching - blogpulse.com

Example: Identifying and tracking public science concerns in blogs
□ Over 100,000 blogs and other sources tracked daily via RSS feeds
□ Objective: to identify and track public concerns about science
□ E.g., “Schiavo” identified and tracked as a potential public science concern
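The tracking idea on this slide can be illustrated with a minimal Python sketch. It assumes the feed items have already been collected as (date, text) pairs; the function name `daily_term_share`, the sample posts, and their dates are invented for illustration and are not taken from the project described on the slide.

```python
from collections import Counter
from datetime import date


def daily_term_share(posts, term):
    """Return {day: fraction of that day's posts that mention `term`}.

    `posts` is an iterable of (date, text) pairs. Matching is a simple
    case-insensitive substring test, which is an assumption made here.
    """
    totals = Counter()  # posts seen per day
    hits = Counter()    # posts mentioning the term per day
    needle = term.lower()
    for day, text in posts:
        totals[day] += 1
        if needle in text.lower():
            hits[day] += 1
    return {day: hits[day] / totals[day] for day in sorted(totals)}


# Hypothetical sample data, invented for this sketch.
sample_posts = [
    (date(2005, 3, 18), "Thoughts on the Schiavo case and medical ethics"),
    (date(2005, 3, 18), "My weekend gardening notes"),
    (date(2005, 3, 19), "Schiavo ruling discussed again"),
    (date(2005, 3, 19), "More on Schiavo and the courts"),
]
shares = daily_term_share(sample_posts, "Schiavo")
```

Using the share of posts per day, rather than a raw count, keeps the series comparable when the number of tracked sources changes from day to day.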

Example: The online impact of research groups (NetReAct)

Example: Links between EU universities
[Figure: network diagram of normalised linking between countries, smallest countries removed. Label: Geopolitical connected. Countries shown: Austria, Switzerland, Belgium, Germany, Norway, France, UK, Spain, Finland, Sweden, Poland, Italy, NL]
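The slide does not say how the linking was normalised. One possible approach, shown here purely as an assumption, is to divide each raw inter-country link count by the product of the two countries' sizes (for example, their number of universities). The function name `normalised_links` and all numbers below are hypothetical.

```python
def normalised_links(link_counts, sizes):
    """Scale raw link counts by the product of source and target sizes.

    link_counts: {(source, target): raw number of links}
    sizes:       {country: size measure, e.g. number of universities}
    This normalisation is an assumption; the slide does not define one.
    """
    return {
        (src, dst): count / (sizes[src] * sizes[dst])
        for (src, dst), count in link_counts.items()
        if src != dst  # ignore links within a single country
    }


# Hypothetical counts and sizes, invented for this sketch.
raw = {("UK", "Germany"): 600, ("Germany", "UK"): 300, ("UK", "NL"): 100}
sizes = {"UK": 100, "Germany": 60, "NL": 10}
norm = normalised_links(raw, sizes)
```

With these made-up numbers, the UK to NL score equals the UK to Germany score even though the raw count is six times smaller, which is the kind of size effect normalisation is meant to remove.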

International biofuels research network

Example: MySpace age profiles

Percentage of profiles containing swearing
Groups: US males 16-19, US females 16-19, UK males 16-19, UK females 16-19
Swearing strength: moderate, strong, very strong
Percentages (cell positions not recoverable from the transcript): 10%, 11%, 33%, 18%, 47%, 38%, 33%, 38%, 2%, 2%, 8%, 3%
Sample sizes: 1,530; 1,287; 171; 130 (typical sample size 20-148 for non-web swearing research)

emphatic adverb/adjective OR adverbial booster OR premodifying intensifying negative adjective (36% of swearing)
□ and we r guna go to town again n make a ryt fuckin nyt of it again lol
□ see look i'm fucking commenting u back lol
□ and stop fucking tickleing me!!
□ Thanks for the party last night it was fucking good and you are great hosts.
□ That 50's rock and roll weekender was fucking mint!
□ Fuckin my space, my arse
□ 1/2 d ppl cudnt even speak fuckin english!
□ yeah so me and sarah broke up and everythings fucking shit

YouTube – Video poster ages

YouTube friend network

Online impact - Keywords in web pages mentioning IWRM

Data Gathering/Processing Tools
□ Blogpulse.com – blog network diagrams
□ LexiURL Searcher – links, web text, YouTube, Flickr, Technorati
□ Issue Crawler, Google TouchGraph – links

Discussion points for online data
□ Validity – is the underlying meaning of the text/video/picture readily apparent to the researcher?
  □ Possibly not to any great degree for teenagers’ MySpace comments or very personal YouTube videos
□ Reliability – are search engines accurate/good at returning the correct results?
  □ Google blog search shows unreliability – very variable over time
  □ Researchers can triangulate different similar search engines or over time to test reliability
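A reliability check of the kind described above could be sketched as follows. The slides do not name a statistic, so the coefficient of variation (standard deviation divided by mean) is an assumption chosen for illustration, and the engine names and hit counts are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(counts):
    """Return population standard deviation divided by the mean.

    Values near 0 mean the hit counts were stable across repeated
    queries; larger values mean they varied a lot.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean hit count is zero; variation is undefined")
    return pstdev(counts) / m


# Hypothetical hit counts for the same query repeated on three dates.
hit_counts = {
    "engine_a": [1000, 1000, 1000],
    "engine_b": [500, 1500, 1000],
}
variation = {
    name: coefficient_of_variation(counts)
    for name, counts in hit_counts.items()
}
```

The same function can be applied across engines for a single date instead of across dates for a single engine, which corresponds to the two triangulation options named on the slide.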

Discussion points for online data
□ Coverage – to what extent are all the phenomena of interest covered by the source (e.g., search engine) used?
□ Sample bias – are certain types of people over-represented? (e.g., the more literate, the more vocal, the more politically active, youth, educated, creative types…)

Summary

□ The web contains a wide variety of interesting web and “web 2.0” content posted by many different people in many different formats
□ Webometric methods can give insights into this data

Books

□ Thelwall, M. (2009). Introduction to webometrics: Quantitative web research for the social sciences. New York: Morgan & Claypool.
□ Rogers, R. (2005). Information politics on the Web. Massachusetts: MIT Press.
□ http://lexiurl.wlv.ac.uk
  http://webometrics.wlv.ac.uk
  http://www.issuecrawler.net

Important considerations

□ Data accuracy
□ Data cleaning
□ Context to help interpret results
□ Report results carefully

Example: Analysis of the accuracy of search engine results
Live Search results analysis