Ranking Web of repositories:
End users’ point of view?
Isidro F. Aguillo
Editor of the Rankings Web, Cybermetrics Lab – CSIC, Spain
Agenda
• A classification of repositories
– A common portal or different websites?
• Institutional repositories
– A new role, with focus on added value
• Good practices: end user point of view
– Citing the resources “correctly”
• Ranking Web (Webometrics) of Repositories
– Background, objectives and methodology
– Brazilian results: Preliminary analysis
– Future developments
A classification of repositories
• By provider
– Personal (or group) homepages
– Institutional repositories
– Subject repositories
– Portals of e-journals
• By content
– Metadata (no full text)
– Preprints/postprints
– Theses/MS theses
– Formal plus informal contents (raw data?)
– Learning objects
– Digitised archives (all formats)
• Metarepositories
– Directories
– Harvesters
One portal or different websites?
• Different contents, different objectives, different treatment
– The one-stop-shop concept is supported only by technical reasons (common management software)
– Very confusing for end users: preservation (theses, archives), evaluation (papers), dissemination (journals), teaching/research (multimedia objects, raw data)
• Formal scholarly communication
– Requires specific treatment to provide profiles for use in academic evaluation
– Policy-relevant bridge to CRIS (Current Research Information Systems)
• Educational supporting material
– A common source is a “totum revolutum” (a jumble) without links to specific courses and professors
• Local e-journals portal
– Involves contributions by authors from external institutions
• Harvesting: Sharing or stealing?
– Branding and moral rights (intellectual property) in danger
An example of CRIS
euroCRIS.org
Redalyc journals
Institutional repositories
• The tangible and intangible “treasure” of the institution
– Under the full control and management of the institution
– Open Access (no restrictions)
– Intellectual property, but also brand and moral rights, to be preserved
– Librarians in charge, but not owned by the library
– High-level position in the web domain: http://repository.domain.tld/
• Not only a catalog
– Metadata are important, but not the most important thing
– Emphasis on full-text records
– Rich environment
• A short list of suggested added-value extras
– News
– Personal (groups, departments, faculties, schools) profiles
– Reports (focusing on contents)
– Links to/from third parties (CRIS, Social web, Citation databases)
– Statistics
A few examples
Independent projects
Added-value extras
Profiling (I)
Profiling (II)
Usage, Citations and Mentions
UT statistics with eprints’ IRStats
Exploiting combined resources
Good practices …
• Permanent URLs: Is it really a good idea?
– Technical management of internal DNS is pretty easy: pURLs are needed only because of laziness, lack of professionalism or misunderstanding in the ICT departments
– Permanent systems are under external (foreign, private) control
– pURLs do not identify institutions, authors or titles
– DOIs are linked to journals (and editors)
• The end users of repositories are mainly other authors
– Depositing papers can increase visibility and probably impact (citations) if personal and institutional authorship is clearly and unambiguously attributed
• Items in repositories should be citable
– Citable items are the full-text files, not the metadata webpage
– The URLs should be easy to use, avoiding long strings of meaningless characters or numbers
– The web domain of the institution, the author name(s) and other semantically valuable information should be provided in the URL
– The repository name should be clearly and explicitly included in the host name
• Branding and moral rights are very relevant aspects of intellectual property
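The URL recommendations above can be turned into a rough automated check. The following is only a minimal sketch under stated assumptions: the function name `url_issues`, the list of host-name hints, the suffix list and the digit-run threshold are all hypothetical choices made for illustration, not part of the ranking methodology.

```python
import re
from urllib.parse import urlparse, unquote

# All three constants are illustrative assumptions, not official criteria.
RICH_SUFFIXES = (".pdf", ".doc", ".docx", ".ppt", ".pptx", ".ps", ".eps")
REPOSITORY_HINTS = ("repositor", "eprints", "dspace", "archive")
MAX_DIGIT_RUN = 5  # digit runs of this length or longer count as "meaningless"


def url_issues(url):
    """Return a list of good-practice problems found in a full-text URL."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    path = unquote(parts.path).lower()
    issues = []
    # The citable item should be a file with an explicit suffix such as .pdf.
    if not path.endswith(RICH_SUFFIXES):
        issues.append("no explicit file suffix")
    # The repository should be named explicitly in the host name.
    if not any(hint in host for hint in REPOSITORY_HINTS):
        issues.append("repository not named in host")
    # Long runs of digits carry no meaning for a human reader.
    if re.search(r"\d{%d,}" % MAX_DIGIT_RUN, path):
        issues.append("long run of digits in path")
    return issues
```

For instance, a hypothetical address such as `http://repository.example.edu/aguillo/ranking-web-2012.pdf` passes all three checks, while `http://www.example.edu/bitstream/handle/123456/789` fails each of them.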
Citing a record
www.lume.ufrgs.br
Citing a record!
repositorio.ufpa.br
Permanent(?)URLs
www.doi.org
www.handle.net
Institutional (?) repository
aut.researchgateway.ac.nz
http://aut.researchgateway.ac.nz/bitstream/handle/10292/3173/Giddings%20JNR%20Manuscript%20Jan%205th%202006.pdf?sequence=11
“Hosted” Institutional “repository”
Ranking Web of Repositories
• Background and objectives
– Inspired by the Ranking Web (Webometrics) of Universities, the Ranking of Repositories started in 2008
– The aim is to support Open Access initiatives in universities and research centers. A secondary objective is to promote good practices
• Current situation
– The ranking is published twice a year (January and July editions)
– Conditions: An independent web domain or subdomain and a focus on the research mission
– The current edition (July 2012) analyzes 1611 repositories (including 1438 institutional ones and 111 “portals”)
• Methodology
– The composite indicator is evolving to better reflect the performance of repositories, while keeping the 1:1 ratio between the weights of activity and impact indicators
• Size: Number of web pages (by Google), excluding the rich files (10%)
• Visibility: Combination of external inlinks and referring domains, according to the two major providers of link data: Majestic SEO and ahrefs (50%)
• Rich files: Total files of these types (by Google): pdf, doc+docx, ppt+pptx and ps+eps (10%)
• Scholar: The total number of papers in Google Scholar for the 5-year period 2007-2011 (30%)
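The weighting scheme above can be illustrated with a small weighted sum. This is a sketch only: it assumes each indicator has already been normalized to the range 0 to 1 (the slides do not describe the normalization), it assumes that Visibility is the impact indicator and the other three are the activity indicators, and the names `WEIGHTS` and `composite_score` are hypothetical.

```python
# Weights in percent, as listed on the slide.
WEIGHTS = {"size": 10, "visibility": 50, "rich_files": 10, "scholar": 30}

# Assumed grouping: visibility measures impact, the rest measure activity.
ACTIVITY = ("size", "rich_files", "scholar")
IMPACT = ("visibility",)

ACTIVITY_WEIGHT = sum(WEIGHTS[name] for name in ACTIVITY)
IMPACT_WEIGHT = sum(WEIGHTS[name] for name in IMPACT)


def composite_score(indicators):
    """Weighted sum of indicators assumed to be pre-normalized to [0, 1]."""
    return sum(WEIGHTS[name] * indicators[name] for name in WEIGHTS) / 100
```

Under this grouping both `ACTIVITY_WEIGHT` and `IMPACT_WEIGHT` equal 50, which matches the stated 1:1 ratio. A repository with maximum values on every normalized indicator would score 1.0.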
Ranking webometrics
Brazilian repositories
• Leaders in Latin America
– 35 institutional repositories (Brazil has over 200 universities, plus several hundred other higher education institutions)
– 4 Brazilian repositories in the Top 10 of the Region
– SciELO, the most important portal in the world
• Current problems
– Most of the contents are theses and dissertations
– Google Scholar indexing below the expected coverage
– Servers slow or down (frequently?)
– Lack of explicit suffixes (e.g. .pdf for Acrobat files)
– Criteria for naming and citing records and all-in-one strategies should be discussed
– Technical developments regarding added-value services are badly needed
Brazil in the Ranking
repositories.webometrics.info/en1/Latin_America/Brazil
Final comments
• A personal view
– The “best practices” presented here are only the personal view of an author of papers who is not strongly tied to librarian orthodoxy
– But repository managers should be aware of some of the problems discussed and their impact on databases populated by the authors themselves
• Future developments in the ranking
– The composite indicator will be rebuilt, giving more weight to Google Scholar and to correctly named full-text files (explicit suffixes like .pdf)
– The current classification does not yet reflect the diversity of repositories, but we are considering a wider definition of “portal”, under which many institutional repositories would be candidates for transfer
Thank you!
Questions?
Contact Info
Isidro F. Aguillo, HcDr
Cybermetrics Lab. CSIC. Madrid. Spain
isidro.aguillo@csic.es