Saturday, November 20, 2010

Readings for Wk #11

Web Search Engines, Pts. 1 & 2 / David Hawking

These articles explain how search engines use crawling algorithms to search and index the web. In Part 1, Hawking describes how crawling machines are assigned to specific URLs via hashing. If a crawler comes across a URL that is not assigned to it, it forwards that URL to the crawler responsible for it. Indexers first scan documents for the words and phrases they contain and then sort those entries so that each term points to the documents where it appears. Like crawlers, indexers are also assigned specific portions of the collection to manage the volume of documents that will be analyzed.
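To make the division of labor concrete, here's a rough sketch in Python of the two ideas as I understand them. The cluster size, function names, and the choice of MD5 are mine, not Hawking's: hash a URL's host to decide which crawler owns it, and build an index by scanning documents for terms and then sorting the entries.

```python
# Minimal sketch, not Hawking's actual code: URL-to-crawler assignment by hashing,
# plus a toy scan-then-sort index build. NUM_CRAWLERS and all names are hypothetical.
import hashlib
from urllib.parse import urlparse

NUM_CRAWLERS = 8  # hypothetical cluster size

def assigned_crawler(url: str) -> int:
    """Hash the host name so every machine agrees on which crawler owns this URL."""
    host = urlparse(url).netloc
    return int(hashlib.md5(host.encode("utf-8")).hexdigest(), 16) % NUM_CRAWLERS

def forward_if_foreign(url: str, my_id: int, outbox: list) -> bool:
    """A crawler that discovers a URL outside its slice sends it to the correct crawler."""
    owner = assigned_crawler(url)
    if owner != my_id:
        outbox.append((owner, url))   # hand the URL off instead of fetching it
        return True
    return False                      # it's ours; fetch and parse locally

def build_index(docs: dict) -> dict:
    """Scan each document for terms, then sort the (term, doc) pairs to group postings."""
    pairs = [(term, doc_id)
             for doc_id, text in docs.items()
             for term in set(text.lower().split())]
    index = {}
    for term, doc_id in sorted(pairs):
        index.setdefault(term, []).append(doc_id)
    return index
```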

Current Developments and Future Trends for the OAI Protocol for Metadata Harvesting / Shreeves, Habing, et al.


OAI-PMH was created to facilitate access to online archives via shared metadata standards. These shared standards allow users from different organizations, or users of different systems, to easily share resources. The participating repositories expose their records in common metadata formats such as Dublin Core, exchanged as XML. In the future, OAI-PMH's community will work towards making its repository registry more searchable and towards providing better descriptions of the collections it lists.
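Since OAI-PMH is just HTTP plus XML, a single harvest request is easy to sketch. The base URL below is made up, but verb=ListRecords and metadataPrefix=oai_dc are the standard protocol parameters for pulling Dublin Core records:

```python
# Rough sketch of one OAI-PMH ListRecords request; the repository endpoint is hypothetical.
from urllib.request import urlopen
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "http://repository.example.edu/oai"          # made-up repository endpoint
OAI = "{http://www.openarchives.org/OAI/2.0/}"          # OAI-PMH XML namespace

params = urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})
with urlopen(f"{BASE_URL}?{params}") as response:
    tree = ET.parse(response)

# Each <record> carries a header plus its Dublin Core metadata.
for record in tree.iter(f"{OAI}record"):
    identifier = record.find(f"{OAI}header/{OAI}identifier")
    print(identifier.text if identifier is not None else "(no identifier)")
```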




The Deep Web: Surfacing Hidden Value / Bergman

When performing an internet search, a typical user is only scratching the surface of the web. According to Bergman, most are getting just 0.03% of what is actually available. The "deep web" is the other 99.97%. Many of these sites are made up of company/business intranets, specialized databases, archives, & repositories. This article is ten years old and I wonder how much of this information has changed because of the sophistication of current search tools. I do believe parts of the web remain "hidden" but they're not as inaccessible as they once were.


Comments from Wk #11


http://nrampsblog.blogspot.com/2010/11/unit-11-web-search-and-oai-protocol.html?showComment=1290312661547#c4758384399719463169

Saturday, November 13, 2010

Comments from Wk #10

https://lis2060notes.wordpress.com/2010/11/06/reading-nov-15/#comment-20

http://maj66.blogspot.com/2010/11/week-10-readings.html?showComment=1289710758965#c2196093730956100326

Readings for Wk #10

Digital Libraries: Challenges & Influential Work / Mischo

(This article really made me appreciate how far we've come in digital libraries technology. When I started college in 1991, I had to do a research paper for sociology class. I spent 3 days researching my subject in two different libraries and then had to make a 15 minute appointment to have access to a specific database. It's amazing what has happened in just 20 years.)
Mischo gives a brief history of digital library projects and why they were developed. Digital libraries were created out of the need to make large amounts of information housed in several different places/systems easily accessible via simpler portals. Like most projects of this scale involving and affecting several fields, it was primarily funded by the government and launched at a few select university libraries. The most surprising thing to me was that the early stages of this project were undertaken during the early days of the WWW. Thanks to this group of developers, programmers, engineers, and librarians, anyone can just visit ProQuest, Muse, or Google Scholar to download books, articles, etc. on almost any subject instead of visiting 3 different libraries to use specialized machines or databases.


Dewey Meets Turing: Librarians, Computer Scientists & DLI / Paepcke, Garcia-Molina, Wesley


This article explores the mostly harmonious relationship between librarians and computer scientists in the context of the Digital Library Initiative. Working together on this project made sense in so many ways initially because both understood the need to build collections that could be "search[ed], organiz[ed], and brows[ed]." However, with the rise of the Web, both groups had to adjust their thinking on how to implement many of their goals. Computer scientists were naturally drawn to the breakthroughs made possible by the Web (machine learning, links reaching everywhere rather than just locally, etc.), while librarians had to grapple with higher prices for online journal content. As this relationship has evolved since the early DLI projects, librarians and computer scientists have been able to learn from each other: computer scientists have collected websites on similar topics into hubs, and librarians can now help those computer scientists manage their scholarly publications online.


Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age / Lynch


Institutional repositories are gaining popularity for several reasons: metadata standards have been implemented, online storage is cheap, serial prices remain high, and a repository promotes scholarship at its institution. Lynch points to MIT's DSpace as a model repository that used open source software and a corporate partnership (in this case with Hewlett-Packard). While creating their own repositories can lower costs for institutions/libraries (cutting out contracts with other firms to handle digital storage), Lynch warns them to stay on mission. First, don't use the repository to control or impose ownership over students', faculty's, or researchers' intellectual property; Lynch states that successful repositories "are responsive to the needs ...and advance the interests of campus communities and of scholarship broadly." Second, he says that repositories can't be slowed down or burdened by heavy policies. Libraries, faculty, and researchers must cooperate on making policies that don't advance one group's agenda over the others'. Third, institutions must be committed to maintaining & funding the repository after it's established.

Sunday, November 7, 2010

Muddiest Point - Wk #9

I think I understand the reasoning behind using XML, but I'm not following the difference between DTDs & XML schemas. Why are schemas better than DTDs?

Saturday, November 6, 2010

Readings for Wk #9

Introducing the Extensible Markup Language (XML) / Bryan, Extending Your Markup / Bergholz, A Survey of XML Standards / Ogbuji, XML Schema Tutorial / W3

Extensible Markup Language (XML) is a "subset of Standard Generalized Markup Language (SGML)" made to carry & store data. It is more flexible than HTML because users can define their own tags, which makes it easier to share data across different languages & fields. Users don't need a particular version of software to create documents in XML. XML governs the structure of data and not, as HTML does, what that data will look like. XML is expressed in documents, which are made up of entities containing elements; elements, in turn, can carry attributes. For some reason, this breakdown of the structure of XML documents makes me think of second grade grammar lessons when we learned sentence diagramming. These articles (except W3's) were difficult to get through because they assume the reader has some experience/background with SGML. I'll have to re-read & go through the tutorials a few more times.
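To keep the document → entity → element → attribute breakdown straight in my head, here's a toy example parsed with Python's standard library. The <library> and <book> tags are made up for this sketch, since XML lets you invent your own:

```python
# Toy illustration of XML structure; the tags are invented for this example.
import xml.etree.ElementTree as ET

record = """<library>
  <book id="b1">
    <title>Solo Library Management</title>
    <year>2010</year>
  </book>
</library>"""

root = ET.fromstring(record)        # the document's root element
book = root.find("book")
print(book.get("id"))               # attributes hang off elements: prints "b1"
print(book.find("title").text)      # child elements hold the data itself
```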

Comments from Wk #9


http://marypatelhattab.blogspot.com/2010/11/week-9-readings.html?showComment=1289098825453#c5930458660809599655

Tuesday, November 2, 2010

Assignment #5

Here's the link to my Koha book shelf:
http://upitt01-staff.kwc.kohalibrary.com/cgi-bin/koha/virtualshelves/shelves.pl?viewshelf=79
It's titled "Solo Library Management" & covers a few of the books that have been a tremendous help with my job & some books I've been meaning to read. My user name is TRW37.

Monday, November 1, 2010

Muddiest Point - Wk #8

Is it necessary to know HTML any more since it seems as if most web editing software has HTML built-in?

Readings for Wk #8

W3Schools HTML Tutorial / W3Schools.com

HTML is the markup language used to describe websites. Before reading this, I knew enough about HTML to recognize it when I see it & to do some basic edits (change fonts/sizes, make lists, bold, etc) to an existing website. This site was very easy to follow - easier to follow than the many manuals & seminars I've read/attended on building websites. I especially liked how it gave space for real-time practice. 


HTML Cheat Sheet / Webmonkey


I've bookmarked this page for future reference. It'll be very handy for any web editing projects because although HTML is a good thing to be familiar with, no one is going to remember all the correct tags.




W3Schools CSS Tutorial / W3Schools.com

Cascading Style Sheets (CSS) allows you to apply styles to many web pages at once, saving a huge amount of time and effort. (I'm a little more familiar with this because of a web project at work [which ate my life] last year.) CSS is perfect for large-scale projects, like creating a new website. Once you've decided on the basic format & the elements you'll need for each of the pages on the site (i.e. background colors, fonts, sizes, etc.), you can save those styles in a .css file & use it to style your pages.


 Beyond HTML / Goans & Leach


This article discusses one library's adoption of a content management system (CMS) to manage their web guides. A CMS can make websites easier to manage and edit because prior knowledge of HTML is not necessary. For this particular library, the CMS also allowed them to be more creative in tagging and customizing information to better meet the needs of their users. It was interesting to see that when the authors surveyed librarians and liaisons on their use of the CMS, most respondents indicated that "ease of use" was the deciding factor in choosing it.